<!DOCTYPE html> <html > <head> <meta charset="utf-8"> <meta rel="search" type="application/opensearchdescription+xml" href="/open_search.xml" title="Academia.edu"> <meta content="width=device-width, initial-scale=1" name="viewport"> <meta name="google-site-verification" content="bKJMBZA7E43xhDOopFZkssMMkBRjvYERV-NaN4R6mrs"> <meta name="csrf-param" content="authenticity_token" /> <meta name="csrf-token" content="rTNZBj5+PXVhtkNb/ErpAPsD20XWyqxgMqaVkQs2K0zD4AjO/txkl+BPSljkqMSFl8JGH+UDgE7/7ahkjV/+AA==" /> <meta name="citation_title" content="A Survey on Dialog Management: Recent Advances and Challenges" /> <meta name="citation_publication_date" content="2020/01/01" /> <meta name="citation_journal_title" content="ArXiv" /> <meta name="citation_author" content="chengguang tang" /> <meta name="twitter:card" content="summary" /> <meta name="twitter:url" content="https://www.academia.edu/64391635/A_Survey_on_Dialog_Management_Recent_Advances_and_Challenges" /> <meta name="twitter:title" content="A Survey on Dialog Management: Recent Advances and Challenges" /> <meta name="twitter:description" content="Dialog management (DM) is a crucial component in a task-oriented dialog system. Given the dialog history, DM predicts the dialog state and decides the next action that the dialog agent should take. 
Recently, dialog policy learning has been widely" /> <meta name="twitter:image" content="https://0.academia-photos.com/164963742/70220139/58634006/s200_chengguang.tang.jpeg" /> <meta property="fb:app_id" content="2369844204" /> <meta property="og:type" content="article" /> <meta property="og:url" content="https://www.academia.edu/64391635/A_Survey_on_Dialog_Management_Recent_Advances_and_Challenges" /> <meta property="og:title" content="A Survey on Dialog Management: Recent Advances and Challenges" /> <meta property="og:image" content="http://a.academia-assets.com/images/open-graph-icons/fb-paper.gif" /> <meta property="og:description" content="Dialog management (DM) is a crucial component in a task-oriented dialog system. Given the dialog history, DM predicts the dialog state and decides the next action that the dialog agent should take. Recently, dialog policy learning has been widely" /> <meta property="article:author" content="https://independent.academia.edu/chengguangtang" /> <meta name="description" content="Dialog management (DM) is a crucial component in a task-oriented dialog system. Given the dialog history, DM predicts the dialog state and decides the next action that the dialog agent should take. 
Recently, dialog policy learning has been widely" /> <title>(PDF) A Survey on Dialog Management: Recent Advances and Challenges | chengguang tang - Academia.edu</title> <link rel="canonical" href="https://www.academia.edu/64391635/A_Survey_on_Dialog_Management_Recent_Advances_and_Challenges" /> <script async src="https://www.googletagmanager.com/gtag/js?id=G-5VKX33P2DS"></script> <script> window.dataLayer = window.dataLayer || []; function gtag(){dataLayer.push(arguments);} gtag('js', new Date()); gtag('config', 'G-5VKX33P2DS', { cookie_domain: 'academia.edu', send_page_view: false, }); gtag('event', 'page_view', { 'controller': "single_work", 'action': "show", 'controller_action': 'single_work#show', 'logged_in': 'false', 'edge': 'unknown', // Send nil if there is no A/B test bucket, in case some records get logged // with missing data - that way we can distinguish between the two cases. // ab_test_bucket should be of the form <ab_test_name>:<bucket> 'ab_test_bucket': null, }) </script> <script> var $controller_name = 'single_work'; var $action_name = "show"; var $rails_env = 'production'; var $app_rev = '49879c2402910372f4abc62630a427bbe033d190'; var $domain = 'academia.edu'; var $app_host = "academia.edu"; var $asset_host = "academia-assets.com"; var $start_time = new Date().getTime(); var $recaptcha_key = "6LdxlRMTAAAAADnu_zyLhLg0YF9uACwz78shpjJB"; var $recaptcha_invisible_key = "6Lf3KHUUAAAAACggoMpmGJdQDtiyrjVlvGJ6BbAj"; var $disableClientRecordHit = false; </script> <script> window.require = { config: function() { return function() {} } } </script> <script> window.Aedu = window.Aedu || {}; window.Aedu.hit_data = null; window.Aedu.serverRenderTime = new Date(1732456818000); window.Aedu.timeDifference = new Date().getTime() - 1732456818000; </script> <script type="application/ld+json">{"@context":"https://schema.org","@type":"ScholarlyArticle","abstract":"Dialog management (DM) is a crucial component in a task-oriented dialog system. 
Given the dialog history, DM predicts the dialog state and decides the next action that the dialog agent should take. Recently, dialog policy learning has been widely formulated as a Reinforcement Learning (RL) problem, and more works focus on the applicability of DM. In this paper, we survey recent advances and challenges within three critical topics for DM: (1) improving model scalability to facilitate dialog system modeling in new scenarios, (2) dealing with the data scarcity problem for dialog policy learning, and (3) enhancing the training efficiency to achieve better task-completion performance . We believe that this survey can shed a light on future research in dialog management.","author":[{"@context":"https://schema.org","@type":"Person","name":"chengguang tang"}],"contributor":[],"dateCreated":"2021-12-15","dateModified":null,"datePublished":"2020-01-01","headline":"A Survey on Dialog Management: Recent Advances and Challenges","inLanguage":"en","keywords":["Computer Science","arXiv"],"locationCreated":null,"publication":"ArXiv","publisher":{"@context":"https://schema.org","@type":"Organization","name":"ArXiv"},"image":null,"thumbnailUrl":null,"url":"https://www.academia.edu/64391635/A_Survey_on_Dialog_Management_Recent_Advances_and_Challenges","sourceOrganization":[{"@context":"https://schema.org","@type":"EducationalOrganization","name":null}]}</script><link rel="stylesheet" media="all" href="//a.academia-assets.com/assets/single_work_page/loswp-352e32ba4e89304dc0b4fa5b3952eef2198174c54cdb79066bc62e91c68a1a91.css" /><link rel="stylesheet" media="all" href="//a.academia-assets.com/assets/design_system/body-8d679e925718b5e8e4b18e9a4fab37f7eaa99e43386459376559080ac8f2856a.css" /><link rel="stylesheet" media="all" href="//a.academia-assets.com/assets/design_system/button-3cea6e0ad4715ed965c49bfb15dedfc632787b32ff6d8c3a474182b231146ab7.css" /><link rel="stylesheet" media="all" 
href="//a.academia-assets.com/assets/design_system/text_button-73590134e40cdb49f9abdc8e796cc00dc362693f3f0f6137d6cf9bb78c318ce7.css" /><link crossorigin="" href="https://fonts.gstatic.com/" rel="preconnect" /><link href="https://fonts.googleapis.com/css2?family=DM+Sans:ital,opsz,wght@0,9..40,100..1000;1,9..40,100..1000&family=Gupter:wght@400;500;700&family=IBM+Plex+Mono:wght@300;400&family=Material+Symbols+Outlined:opsz,wght,FILL,GRAD@20,400,0,0&display=swap" rel="stylesheet" /><link rel="stylesheet" media="all" href="//a.academia-assets.com/assets/design_system/common-10fa40af19d25203774df2d4a03b9b5771b45109c2304968038e88a81d1215c5.css" /> </head> <body> <div id='react-modal'></div> <div class="js-upgrade-ie-banner" style="display: none; text-align: center; padding: 8px 0; background-color: #ebe480;"><p style="color: #000; font-size: 12px; margin: 0 0 4px;">Academia.edu no longer supports Internet Explorer.</p><p style="color: #000; font-size: 12px; margin: 0;">To browse Academia.edu and the wider internet faster and more securely, please take a few seconds to <a href="https://www.academia.edu/upgrade-browser">upgrade your browser</a>.</p></div><script>// Show this banner for all versions of IE if (!!window.MSInputMethodContext || /(MSIE)/.test(navigator.userAgent)) { document.querySelector('.js-upgrade-ie-banner').style.display = 'block'; }</script> <div class="bootstrap login"><div class="modal fade login-modal" id="login-modal"><div class="login-modal-dialog modal-dialog"><div class="modal-content"><div class="modal-header"><button class="close close" data-dismiss="modal" type="button"><span aria-hidden="true">×</span><span class="sr-only">Close</span></button><h4 class="modal-title text-center"><strong>Log In</strong></h4></div><div class="modal-body"><div class="row"><div class="col-xs-10 col-xs-offset-1"><button class="btn btn-fb btn-lg btn-block btn-v-center-content" id="login-facebook-oauth-button"><svg style="float: left; width: 19px; line-height: 1em; 
margin-right: .3em;" aria-hidden="true" focusable="false" data-prefix="fab" data-icon="facebook-square" class="svg-inline--fa fa-facebook-square fa-w-14" role="img" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 448 512"><path fill="currentColor" d="M400 32H48A48 48 0 0 0 0 80v352a48 48 0 0 0 48 48h137.25V327.69h-63V256h63v-54.64c0-62.15 37-96.48 93.67-96.48 27.14 0 55.52 4.84 55.52 4.84v61h-31.27c-30.81 0-40.42 19.12-40.42 38.73V256h68.78l-11 71.69h-57.78V480H400a48 48 0 0 0 48-48V80a48 48 0 0 0-48-48z"></path></svg><small><strong>Log in</strong> with <strong>Facebook</strong></small></button><br /><button class="btn btn-google btn-lg btn-block btn-v-center-content" id="login-google-oauth-button"><svg style="float: left; width: 22px; line-height: 1em; margin-right: .3em;" aria-hidden="true" focusable="false" data-prefix="fab" data-icon="google-plus" class="svg-inline--fa fa-google-plus fa-w-16" role="img" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512"><path fill="currentColor" d="M256,8C119.1,8,8,119.1,8,256S119.1,504,256,504,504,392.9,504,256,392.9,8,256,8ZM185.3,380a124,124,0,0,1,0-248c31.3,0,60.1,11,83,32.3l-33.6,32.6c-13.2-12.9-31.3-19.1-49.4-19.1-42.9,0-77.2,35.5-77.2,78.1S142.3,334,185.3,334c32.6,0,64.9-19.1,70.1-53.3H185.3V238.1H302.2a109.2,109.2,0,0,1,1.9,20.7c0,70.8-47.5,121.2-118.8,121.2ZM415.5,273.8v35.5H380V273.8H344.5V238.3H380V202.8h35.5v35.5h35.2v35.5Z"></path></svg><small><strong>Log in</strong> with <strong>Google</strong></small></button><br /><style type="text/css">.sign-in-with-apple-button { width: 100%; height: 52px; border-radius: 3px; border: 1px solid black; cursor: pointer; }</style><script src="https://appleid.cdn-apple.com/appleauth/static/jsapi/appleid/1/en_US/appleid.auth.js" type="text/javascript"></script><div class="sign-in-with-apple-button" data-border="false" data-color="white" id="appleid-signin"><span ="Sign Up with Apple" class="u-fs11"></span></div><script>AppleID.auth.init({ clientId: 'edu.academia.applesignon', 
scope: 'name email', redirectURI: 'https://www.academia.edu/sessions', state: "89d2f66d6302484d7348c477b5ffa8d78b0ec8a2860f06b53ea86db3ba2dc67e", });</script><script>// Hacky way of checking if on fast loswp if (window.loswp == null) { (function() { const Google = window?.Aedu?.Auth?.OauthButton?.Login?.Google; const Facebook = window?.Aedu?.Auth?.OauthButton?.Login?.Facebook; if (Google) { new Google({ el: '#login-google-oauth-button', rememberMeCheckboxId: 'remember_me', track: null }); } if (Facebook) { new Facebook({ el: '#login-facebook-oauth-button', rememberMeCheckboxId: 'remember_me', track: null }); } })(); }</script></div></div></div><div class="modal-body"><div class="row"><div class="col-xs-10 col-xs-offset-1"><div class="hr-heading login-hr-heading"><span class="hr-heading-text">or</span></div></div></div></div><div class="modal-body"><div class="row"><div class="col-xs-10 col-xs-offset-1"><form class="js-login-form" action="https://www.academia.edu/sessions" accept-charset="UTF-8" method="post"><input name="utf8" type="hidden" value="✓" autocomplete="off" /><input type="hidden" name="authenticity_token" value="tFI2GB4XuCwSxoVDRtkxrIki6u7b0ZwzAuT8I6VAwjfagWfQ3rXhzpM/jEBeOxwp5eN3tOgYsB3Pr8HWIykXew==" autocomplete="off" /><div class="form-group"><label class="control-label" for="login-modal-email-input" style="font-size: 14px;">Email</label><input class="form-control" id="login-modal-email-input" name="login" type="email" /></div><div class="form-group"><label class="control-label" for="login-modal-password-input" style="font-size: 14px;">Password</label><input class="form-control" id="login-modal-password-input" name="password" type="password" /></div><input type="hidden" name="post_login_redirect_url" id="post_login_redirect_url" value="https://www.academia.edu/64391635/A_Survey_on_Dialog_Management_Recent_Advances_and_Challenges" autocomplete="off" /><div class="checkbox"><label><input type="checkbox" name="remember_me" id="remember_me" value="1" 
checked="checked" /><small style="font-size: 12px; margin-top: 2px; display: inline-block;">Remember me on this computer</small></label></div><br><input type="submit" name="commit" value="Log In" class="btn btn-primary btn-block btn-lg js-login-submit" data-disable-with="Log In" /></br></form><script>typeof window?.Aedu?.recaptchaManagedForm === 'function' && window.Aedu.recaptchaManagedForm( document.querySelector('.js-login-form'), document.querySelector('.js-login-submit') );</script><small style="font-size: 12px;"><br />or <a data-target="#login-modal-reset-password-container" data-toggle="collapse" href="javascript:void(0)">reset password</a></small><div class="collapse" id="login-modal-reset-password-container"><br /><div class="well margin-0x"><form class="js-password-reset-form" action="https://www.academia.edu/reset_password" accept-charset="UTF-8" method="post"><input name="utf8" type="hidden" value="✓" autocomplete="off" /><input type="hidden" name="authenticity_token" value="DqIP2//5nu0IPyOmKQrjRTMgGGeCMg6y+Mb7C3asmNJgcV4TP1vHD4nGKqUx6M7AX+GFPbH7Ipw1jcb+8MVNng==" autocomplete="off" /><p>Enter the email address you signed up with and we'll email you a reset link.</p><div class="form-group"><input class="form-control" name="email" type="email" /></div><input class="btn btn-primary btn-block g-recaptcha js-password-reset-submit" data-sitekey="6Lf3KHUUAAAAACggoMpmGJdQDtiyrjVlvGJ6BbAj" type="submit" value="Email me a link" /></form></div></div><script> require.config({ waitSeconds: 90 })(["https://a.academia-assets.com/assets/collapse-45805421cf446ca5adf7aaa1935b08a3a8d1d9a6cc5d91a62a2a3a00b20b3e6a.js"], function() { // from javascript_helper.rb $("#login-modal-reset-password-container").on("shown.bs.collapse", function() { $(this).find("input[type=email]").focus(); }); }); </script> </div></div></div><div class="modal-footer"><div class="text-center"><small style="font-size: 12px;">Need an account? 
<a rel="nofollow" href="https://www.academia.edu/signup">Click here to sign up</a></small></div></div></div></div></div></div><script>// If we are on subdomain or non-bootstrapped page, redirect to login page instead of showing modal (function(){ if (typeof $ === 'undefined') return; var host = window.location.hostname; if ((host === $domain || host === "www."+$domain) && (typeof $().modal === 'function')) { $("#nav_log_in").click(function(e) { // Don't follow the link and open the modal e.preventDefault(); $("#login-modal").on('shown.bs.modal', function() { $(this).find("#login-modal-email-input").focus() }).modal('show'); }); } })()</script> <div id="fb-root"></div><script>window.fbAsyncInit = function() { FB.init({ appId: "2369844204", version: "v8.0", status: true, cookie: true, xfbml: true }); // Additional initialization code. if (window.InitFacebook) { // facebook.ts already loaded, set it up. window.InitFacebook(); } else { // Set a flag for facebook.ts to find when it loads. window.academiaAuthReadyFacebook = true; } };</script> <div id="google-root"></div><script>window.loadGoogle = function() { if (window.InitGoogle) { // google.ts already loaded, set it up. window.InitGoogle("331998490334-rsn3chp12mbkiqhl6e7lu2q0mlbu0f1b"); } else { // Set a flag for google.ts to use when it loads. 
window.GoogleClientID = "331998490334-rsn3chp12mbkiqhl6e7lu2q0mlbu0f1b"; } };</script> <div class="header--container" id="main-header-container"><div class="header--inner-container header--inner-container-ds2"><div class="header-ds2--left-wrapper"><div class="header-ds2--left-wrapper-inner"><a data-main-header-link-target="logo_home" href="https://www.academia.edu/"><img class="hide-on-desktop-redesign" style="height: 24px; width: 24px;" alt="Academia.edu" src="//a.academia-assets.com/images/academia-logo-redesign-2015-A.svg" width="24" height="24" /><img width="145.2" height="18" class="hide-on-mobile-redesign" style="height: 24px;" alt="Academia.edu" src="//a.academia-assets.com/images/academia-logo-redesign-2015.svg" /></a><div class="header--search-container header--search-container-ds2"><form class="js-SiteSearch-form select2-no-default-pills" action="https://www.academia.edu/search" accept-charset="UTF-8" method="get"><input name="utf8" type="hidden" value="✓" autocomplete="off" /><svg style="width: 14px; height: 14px;" aria-hidden="true" focusable="false" data-prefix="fas" data-icon="search" class="header--search-icon svg-inline--fa fa-search fa-w-16" role="img" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512"><path fill="currentColor" d="M505 442.7L405.3 343c-4.5-4.5-10.6-7-17-7H372c27.6-35.3 44-79.7 44-128C416 93.1 322.9 0 208 0S0 93.1 0 208s93.1 208 208 208c48.3 0 92.7-16.4 128-44v16.3c0 6.4 2.5 12.5 7 17l99.7 99.7c9.4 9.4 24.6 9.4 33.9 0l28.3-28.3c9.4-9.4 9.4-24.6.1-34zM208 336c-70.7 0-128-57.2-128-128 0-70.7 57.2-128 128-128 70.7 0 128 57.2 128 128 0 70.7-57.2 128-128 128z"></path></svg><input class="header--search-input header--search-input-ds2 js-SiteSearch-form-input" data-main-header-click-target="search_input" name="q" placeholder="Search" type="text" /></form></div></div></div><nav class="header--nav-buttons header--nav-buttons-ds2 js-main-nav"><a class="ds2-5-button ds2-5-button--secondary js-header-login-url header-button-ds2 
header-login-ds2 hide-on-mobile-redesign" href="https://www.academia.edu/login" rel="nofollow">Log In</a><a class="ds2-5-button ds2-5-button--secondary header-button-ds2 hide-on-mobile-redesign" href="https://www.academia.edu/signup" rel="nofollow">Sign Up</a><button class="header--hamburger-button header--hamburger-button-ds2 hide-on-desktop-redesign js-header-hamburger-button"><div class="icon-bar"></div><div class="icon-bar" style="margin-top: 4px;"></div><div class="icon-bar" style="margin-top: 4px;"></div></button></nav></div><ul class="header--dropdown-container js-header-dropdown"><li class="header--dropdown-row"><a class="header--dropdown-link" href="https://www.academia.edu/login" rel="nofollow">Log In</a></li><li class="header--dropdown-row"><a class="header--dropdown-link" href="https://www.academia.edu/signup" rel="nofollow">Sign Up</a></li><li class="header--dropdown-row js-header-dropdown-expand-button"><button class="header--dropdown-button">more<svg aria-hidden="true" focusable="false" data-prefix="fas" data-icon="caret-down" class="header--dropdown-button-icon svg-inline--fa fa-caret-down fa-w-10" role="img" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512"><path fill="currentColor" d="M31.3 192h257.3c17.8 0 26.7 21.5 14.1 34.1L174.1 354.8c-7.8 7.8-20.5 7.8-28.3 0L17.2 226.1C4.6 213.5 13.5 192 31.3 192z"></path></svg></button></li><li><ul class="header--expanded-dropdown-container"><li class="header--dropdown-row"><a class="header--dropdown-link" href="https://www.academia.edu/about">About</a></li><li class="header--dropdown-row"><a class="header--dropdown-link" href="https://www.academia.edu/press">Press</a></li><li class="header--dropdown-row"><a class="header--dropdown-link" href="https://medium.com/@academia">Blog</a></li><li class="header--dropdown-row"><a class="header--dropdown-link" href="https://www.academia.edu/documents">Papers</a></li><li class="header--dropdown-row"><a class="header--dropdown-link" 
href="https://www.academia.edu/terms">Terms</a></li><li class="header--dropdown-row"><a class="header--dropdown-link" href="https://www.academia.edu/privacy">Privacy</a></li><li class="header--dropdown-row"><a class="header--dropdown-link" href="https://www.academia.edu/copyright">Copyright</a></li><li class="header--dropdown-row"><a class="header--dropdown-link" href="https://www.academia.edu/hiring"><svg aria-hidden="true" focusable="false" data-prefix="fas" data-icon="briefcase" class="header--dropdown-row-icon svg-inline--fa fa-briefcase fa-w-16" role="img" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512"><path fill="currentColor" d="M320 336c0 8.84-7.16 16-16 16h-96c-8.84 0-16-7.16-16-16v-48H0v144c0 25.6 22.4 48 48 48h416c25.6 0 48-22.4 48-48V288H320v48zm144-208h-80V80c0-25.6-22.4-48-48-48H176c-25.6 0-48 22.4-48 48v48H48c-25.6 0-48 22.4-48 48v80h512v-80c0-25.6-22.4-48-48-48zm-144 0H192V96h128v32z"></path></svg>We're Hiring!</a></li><li class="header--dropdown-row"><a class="header--dropdown-link" href="https://support.academia.edu/"><svg aria-hidden="true" focusable="false" data-prefix="fas" data-icon="question-circle" class="header--dropdown-row-icon svg-inline--fa fa-question-circle fa-w-16" role="img" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 512 512"><path fill="currentColor" d="M504 256c0 136.997-111.043 248-248 248S8 392.997 8 256C8 119.083 119.043 8 256 8s248 111.083 248 248zM262.655 90c-54.497 0-89.255 22.957-116.549 63.758-3.536 5.286-2.353 12.415 2.715 16.258l34.699 26.31c5.205 3.947 12.621 3.008 16.665-2.122 17.864-22.658 30.113-35.797 57.303-35.797 20.429 0 45.698 13.148 45.698 32.958 0 14.976-12.363 22.667-32.534 33.976C247.128 238.528 216 254.941 216 296v4c0 6.627 5.373 12 12 12h56c6.627 0 12-5.373 12-12v-1.333c0-28.462 83.186-29.647 83.186-106.667 0-58.002-60.165-102-116.531-102zM256 338c-25.365 0-46 20.635-46 46 0 25.364 20.635 46 46 46s46-20.636 46-46c0-25.365-20.635-46-46-46z"></path></svg>Help Center</a></li><li 
class="header--dropdown-row js-header-dropdown-collapse-button"><button class="header--dropdown-button">less<svg aria-hidden="true" focusable="false" data-prefix="fas" data-icon="caret-up" class="header--dropdown-button-icon svg-inline--fa fa-caret-up fa-w-10" role="img" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 320 512"><path fill="currentColor" d="M288.662 352H31.338c-17.818 0-26.741-21.543-14.142-34.142l128.662-128.662c7.81-7.81 20.474-7.81 28.284 0l128.662 128.662c12.6 12.599 3.676 34.142-14.142 34.142z"></path></svg></button></li></ul></li></ul></div> <script src="//a.academia-assets.com/assets/webpack_bundles/fast_loswp-bundle-bf3d831cde46cd0e142f29f81a3fc4ce5ab45a404c10c12a480e83de68aff851.js" defer="defer"></script><script>window.loswp = {}; window.loswp.author = 164963742; window.loswp.bulkDownloadFilterCounts = {}; window.loswp.hasDownloadableAttachment = true; window.loswp.hasViewableAttachments = true; // TODO: just use routes for this window.loswp.loginUrl = "https://www.academia.edu/login?post_login_redirect_url=https%3A%2F%2Fwww.academia.edu%2F64391635%2FA_Survey_on_Dialog_Management_Recent_Advances_and_Challenges%3Fauto%3Ddownload"; window.loswp.translateUrl = "https://www.academia.edu/login?post_login_redirect_url=https%3A%2F%2Fwww.academia.edu%2F64391635%2FA_Survey_on_Dialog_Management_Recent_Advances_and_Challenges%3Fshow_translation%3Dtrue"; window.loswp.previewableAttachments = [{"id":76449794,"identifier":"Attachment_76449794","shouldShowBulkDownload":false}]; window.loswp.shouldDetectTimezone = true; window.loswp.shouldShowBulkDownload = true; window.loswp.showSignupCaptcha = false window.loswp.willEdgeCache = false; window.loswp.work = {"work":{"id":64391635,"created_at":"2021-12-15T18:52:05.649-08:00","from_world_paper_id":187254692,"updated_at":"2021-12-15T19:15:09.425-08:00","_data":{"abstract":"Dialog management (DM) is a crucial component in a task-oriented dialog system. 
Given the dialog history, DM predicts the dialog state and decides the next action that the dialog agent should take. Recently, dialog policy learning has been widely formulated as a Reinforcement Learning (RL) problem, and more works focus on the applicability of DM. In this paper, we survey recent advances and challenges within three critical topics for DM: (1) improving model scalability to facilitate dialog system modeling in new scenarios, (2) dealing with the data scarcity problem for dialog policy learning, and (3) enhancing the training efficiency to achieve better task-completion performance . We believe that this survey can shed a light on future research in dialog management.","publisher":"ArXiv","publication_date":"2020,,","publication_name":"ArXiv"},"document_type":"paper","pre_hit_view_count_baseline":null,"quality":"high","language":"en","title":"A Survey on Dialog Management: Recent Advances and Challenges","broadcastable":false,"draft":null,"has_indexable_attachment":true,"indexable":true}}["work"]; window.loswp.workCoauthors = [164963742]; window.loswp.locale = "en"; window.loswp.countryCode = "SG"; window.loswp.cwvAbTestBucket = ""; window.loswp.designVariant = "ds_vanilla"; window.loswp.fullPageMobileSutdModalVariant = "full_page_mobile_sutd_modal"; window.loswp.useOptimizedScribd4genScript = false; window.loswp.appleClientId = 'edu.academia.applesignon';</script><script defer="" src="https://accounts.google.com/gsi/client"></script><div class="ds-loswp-container"><div class="ds-work-card--grid-container"><div class="ds-work-card--container js-loswp-work-card"><div class="ds-work-card--cover"><div class="ds-work-cover--wrapper"><div class="ds-work-cover--container"><button class="ds-work-cover--clickable js-swp-download-button" data-signup-modal="{"location":"swp-splash-paper-cover","attachmentId":76449794,"attachmentType":"pdf"}"><img alt="First page of “A Survey on Dialog Management: Recent Advances and Challenges”" 
class="ds-work-cover--cover-thumbnail" src="https://0.academia-photos.com/attachment_thumbnails/76449794/mini_magick20211215-5900-19ogpsq.png?1639623235" /><img alt="PDF Icon" class="ds-work-cover--file-icon" src="//a.academia-assets.com/assets/single_work_splash/adobe.icon-574afd46eb6b03a77a153a647fb47e30546f9215c0ee6a25df597a779717f9ef.svg" /><div class="ds-work-cover--hover-container"><span class="material-symbols-outlined" style="font-size: 20px" translate="no">download</span><p>Download Free PDF</p></div><div class="ds-work-cover--ribbon-container">Download Free PDF</div><div class="ds-work-cover--ribbon-triangle"></div></button></div></div></div><div class="ds-work-card--work-information"><h1 class="ds-work-card--work-title">A Survey on Dialog Management: Recent Advances and Challenges</h1><div class="ds-work-card--work-authors ds-work-card--detail"><a class="ds-work-card--author js-wsj-grid-card-author ds2-5-body-md ds2-5-body-link" data-author-id="164963742" href="https://independent.academia.edu/chengguangtang"><img alt="Profile image of chengguang tang" class="ds-work-card--author-avatar" src="https://0.academia-photos.com/164963742/70220139/58634006/s65_chengguang.tang.jpeg" />chengguang tang</a></div><p class="ds-work-card--detail ds2-5-body-sm">2020, ArXiv</p><p class="ds-work-card--work-abstract ds-work-card--detail ds2-5-body-md">Dialog management (DM) is a crucial component in a task-oriented dialog system. Given the dialog history, DM predicts the dialog state and decides the next action that the dialog agent should take. Recently, dialog policy learning has been widely formulated as a Reinforcement Learning (RL) problem, and more works focus on the applicability of DM. 
In this paper, we survey recent advances and challenges within three critical topics for DM: (1) improving model scalability to facilitate dialog system modeling in new scenarios, (2) dealing with the data scarcity problem for dialog policy learning, and (3) enhancing the training efficiency to achieve better task-completion performance. We believe that this survey can shed light on future research in dialog management.</p><div class="ds-work-card--button-container"><button class="ds2-5-button js-swp-download-button" data-signup-modal="{"location":"continue-reading-button--work-card","attachmentId":76449794,"attachmentType":"pdf","workUrl":"https://www.academia.edu/64391635/A_Survey_on_Dialog_Management_Recent_Advances_and_Challenges"}">See full PDF</button><button class="ds2-5-button ds2-5-button--secondary js-swp-download-button" data-signup-modal="{"location":"download-pdf-button--work-card","attachmentId":76449794,"attachmentType":"pdf","workUrl":"https://www.academia.edu/64391635/A_Survey_on_Dialog_Management_Recent_Advances_and_Challenges"}"><span class="material-symbols-outlined" style="font-size: 20px" translate="no">download</span>Download PDF</button></div></div></div></div><div data-auto_select="false" data-client_id="331998490334-rsn3chp12mbkiqhl6e7lu2q0mlbu0f1b" data-doc_id="76449794" data-landing_url="https://www.academia.edu/64391635/A_Survey_on_Dialog_Management_Recent_Advances_and_Challenges" data-login_uri="https://www.academia.edu/registrations/google_one_tap" data-moment_callback="onGoogleOneTapEvent" id="g_id_onload"></div><div class="ds-top-related-works--grid-container"><div class="ds-related-content--container ds-top-related-works--container"><h2 class="ds-related-content--heading">Related papers</h2><div class="ds-related-work--container js-wsj-grid-card" data-collection-position="0" data-entity-id="30524414" data-sort-order="default"><a class="ds-related-work--title js-wsj-grid-card-title ds2-5-body-md ds2-5-body-link" 
href="https://www.academia.edu/30524414/Optimizing_dialogue_management_with_reinforcement_learning_Experiments_">Optimizing dialogue management with reinforcement learning: Experiments …</a><div class="ds-related-work--metadata"><a class="js-wsj-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="34843701" href="https://independent.academia.edu/MarilynWalker5">Marilyn Walker</a></div><p class="ds-related-work--metadata ds2-5-body-xs">Journal of Artificial …</p><p class="ds-related-work--abstract ds2-5-body-sm">Designing the dialogue policy of a spoken dialogue system involves many nontrivial choices. This paper presents a reinforcement learning approach for automatically optimiz- ing a dialogue policy, which addresses the technical challenges in applying reinforcement ...</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Optimizing dialogue management with reinforcement learning: Experiments …","attachmentId":50968493,"attachmentType":"pdf","work_url":"https://www.academia.edu/30524414/Optimizing_dialogue_management_with_reinforcement_learning_Experiments_","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-wsj-grid-card-view-pdf" href="https://www.academia.edu/30524414/Optimizing_dialogue_management_with_reinforcement_learning_Experiments_"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-wsj-grid-card" data-collection-position="1" data-entity-id="64024556" data-sort-order="default"><a class="ds-related-work--title js-wsj-grid-card-title ds2-5-body-md 
ds2-5-body-link" href="https://www.academia.edu/64024556/Dialog_policy_optimization_for_low_resource_setting_using_Self_play_and_Reward_based_Sampling">Dialog policy optimization for low resource setting using Self-play and Reward based Sampling</a><div class="ds-related-work--metadata"><a class="js-wsj-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="61028408" href="https://independent.academia.edu/durashilangappuli">durashi langappuli</a></div><p class="ds-related-work--metadata ds2-5-body-xs">2020</p><p class="ds-related-work--abstract ds2-5-body-sm">Reinforcement Learning is considered the state-of-the-art approach for dialogue policy optimization in task-oriented dialogue systems. However, these models demand a large corpus of dialogues to learn effectively. Training a Reinforcement Learning agent with a small amount of data tends to overfit the agent. Although synthesizing dialogue agendas with dialogue Self-play using rule-based agents and crowdsourcing has demonstrated promising results with a small number of samples, these methods have limitations. For instance, rule-based agents acquire specific domain and language while crowdsourcing demands a high price and domain experts, especially in local languages. In this paper, we address these limitations by proposing a novel approach for synthetic agenda generation by acknowledging the underlying probability distribution of the user agendas and a reward-based sampling method that prioritizes failed dialogue acts. 
Evaluations conducted show leveraged performance without overfitting, c...</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Dialog policy optimization for low resource setting using Self-play and Reward based Sampling","attachmentId":76253808,"attachmentType":"pdf","work_url":"https://www.academia.edu/64024556/Dialog_policy_optimization_for_low_resource_setting_using_Self_play_and_Reward_based_Sampling","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-wsj-grid-card-view-pdf" href="https://www.academia.edu/64024556/Dialog_policy_optimization_for_low_resource_setting_using_Self_play_and_Reward_based_Sampling"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-wsj-grid-card" data-collection-position="2" data-entity-id="92048809" data-sort-order="default"><a class="ds-related-work--title js-wsj-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/92048809/Predictable_and_Adaptive_Goal_oriented_Dialog_Policy_Generation">Predictable and Adaptive Goal-oriented Dialog Policy Generation</a><div class="ds-related-work--metadata"><a class="js-wsj-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="248226883" href="https://independent.academia.edu/NhatLe161">Nhat Le</a></div><p class="ds-related-work--metadata ds2-5-body-xs">2021 IEEE 15th International Conference on Semantic Computing (ICSC)</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" 
data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Predictable and Adaptive Goal-oriented Dialog Policy Generation","attachmentId":95163659,"attachmentType":"pdf","work_url":"https://www.academia.edu/92048809/Predictable_and_Adaptive_Goal_oriented_Dialog_Policy_Generation","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-wsj-grid-card-view-pdf" href="https://www.academia.edu/92048809/Predictable_and_Adaptive_Goal_oriented_Dialog_Policy_Generation"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-wsj-grid-card" data-collection-position="3" data-entity-id="110026802" data-sort-order="default"><a class="ds-related-work--title js-wsj-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/110026802/Deep_Reinforcement_Learning_for_Dialogue_Systems_with_Dynamic_User_Goals">Deep Reinforcement Learning for Dialogue Systems with Dynamic User Goals</a><div class="ds-related-work--metadata"><a class="js-wsj-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="166390836" href="https://retired.academia.edu/glenchandler">glen chandler</a></div><p class="ds-related-work--metadata ds2-5-body-xs">2020</p><p class="ds-related-work--abstract ds2-5-body-sm">Dialogue systems have recently become a widely used system across the world. Some of the functionality offered includes application user interfacing, social conversation, data interaction, and task completion. Most recently, dialogue systems have been developed to autonomously and intelligently interact with users to complete complex tasks in diverse operational spaces. 
This kind of dialogue system can interact with users to complete tasks such as making a phone call, ordering items online, searching the internet for a question, and more. These systems are typically created by training a machine learning model with example conversational data. One of the existing problems with training these systems is that they require large amounts of realistic user data, which can be challenging to collect and label in large quantities. Our research focuses on modifications to user simulators that &quot;change their mind&quot; mid-episode with the goal of training more robust dialogue agents. We ...</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Deep Reinforcement Learning for Dialogue Systems with Dynamic User Goals","attachmentId":107972837,"attachmentType":"pdf","work_url":"https://www.academia.edu/110026802/Deep_Reinforcement_Learning_for_Dialogue_Systems_with_Dynamic_User_Goals","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-wsj-grid-card-view-pdf" href="https://www.academia.edu/110026802/Deep_Reinforcement_Learning_for_Dialogue_Systems_with_Dynamic_User_Goals"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-wsj-grid-card" data-collection-position="4" data-entity-id="121528854" data-sort-order="default"><a class="ds-related-work--title js-wsj-grid-card-title ds2-5-body-md ds2-5-body-link" 
href="https://www.academia.edu/121528854/Conversation_Learner_A_Machine_Teaching_Tool_for_Building_Dialog_Managers_for_Task_Oriented_Dialog_Systems">Conversation Learner - A Machine Teaching Tool for Building Dialog Managers for Task-Oriented Dialog Systems</a><div class="ds-related-work--metadata"><a class="js-wsj-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="310442875" href="https://independent.academia.edu/LarsLiden">Lars Liden</a></div><p class="ds-related-work--metadata ds2-5-body-xs">Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2020</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Conversation Learner - A Machine Teaching Tool for Building Dialog Managers for Task-Oriented Dialog Systems","attachmentId":116380144,"attachmentType":"pdf","work_url":"https://www.academia.edu/121528854/Conversation_Learner_A_Machine_Teaching_Tool_for_Building_Dialog_Managers_for_Task_Oriented_Dialog_Systems","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-wsj-grid-card-view-pdf" href="https://www.academia.edu/121528854/Conversation_Learner_A_Machine_Teaching_Tool_for_Building_Dialog_Managers_for_Task_Oriented_Dialog_Systems"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-wsj-grid-card" data-collection-position="5" data-entity-id="91518249" data-sort-order="default"><a class="ds-related-work--title js-wsj-grid-card-title ds2-5-body-md ds2-5-body-link" 
href="https://www.academia.edu/91518249/SUMBT_LaRL_End_to_end_Neural_Task_oriented_Dialog_System_with_Reinforcement_Learning">SUMBT+LaRL: End-to-end Neural Task-oriented Dialog System with Reinforcement Learning</a><div class="ds-related-work--metadata"><a class="js-wsj-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="72898483" href="https://independent.academia.edu/SeokhwanJo">Seokhwan Jo</a></div><p class="ds-related-work--metadata ds2-5-body-xs">ArXiv, 2020</p><p class="ds-related-work--abstract ds2-5-body-sm">The recent advent of neural approaches for developing each dialog component in task-oriented dialog systems has remarkably improved, yet optimizing the overall system performance remains a challenge. In this paper, we propose an end-to-end trainable neural dialog system with reinforcement learning, named SUMBT+LaRL. The SUMBT+ estimates user-acts as well as dialog belief states, and the LaRL models latent system action spaces and generates responses given the estimated contexts. We experimentally demonstrate that the training framework in which the SUMBT+ and LaRL are separately pretrained and then the entire system is fine-tuned significantly increases dialog success rates. We propose new success criteria for reinforcement learning to the end-to-end dialog system as well as provide experimental analysis on a different result aspect depending on the success criteria and evaluation methods. 
Consequently, our model achieved the new state-of-the-art success rate of 85.4% on corpus-base...</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"SUMBT+LaRL: End-to-end Neural Task-oriented Dialog System with Reinforcement Learning","attachmentId":94783550,"attachmentType":"pdf","work_url":"https://www.academia.edu/91518249/SUMBT_LaRL_End_to_end_Neural_Task_oriented_Dialog_System_with_Reinforcement_Learning","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-wsj-grid-card-view-pdf" href="https://www.academia.edu/91518249/SUMBT_LaRL_End_to_end_Neural_Task_oriented_Dialog_System_with_Reinforcement_Learning"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-wsj-grid-card" data-collection-position="6" data-entity-id="26513438" data-sort-order="default"><a class="ds-related-work--title js-wsj-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/26513438/Using_reinforcement_learning_to_build_a_better_model_of_dialogue_state">Using reinforcement learning to build a better model of dialogue state</a><div class="ds-related-work--metadata"><a class="js-wsj-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="50446196" href="https://independent.academia.edu/JoelTetreault">Joel Tetreault</a></div><p class="ds-related-work--metadata ds2-5-body-xs">2006</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" 
data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Using reinforcement learning to build a better model of dialogue state","attachmentId":46809839,"attachmentType":"pdf","work_url":"https://www.academia.edu/26513438/Using_reinforcement_learning_to_build_a_better_model_of_dialogue_state","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-wsj-grid-card-view-pdf" href="https://www.academia.edu/26513438/Using_reinforcement_learning_to_build_a_better_model_of_dialogue_state"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-wsj-grid-card" data-collection-position="7" data-entity-id="14976651" data-sort-order="default"><a class="ds-related-work--title js-wsj-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/14976651/Hybrid_Reinforcement_Supervised_Learning_of_Dialogue_Policies_from_Fixed_Data_Sets">Hybrid Reinforcement/Supervised Learning of Dialogue Policies from Fixed Data Sets</a><div class="ds-related-work--metadata"><a class="js-wsj-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="33971246" href="https://independent.academia.edu/JamesHenderson28">James Henderson</a></div><p class="ds-related-work--metadata ds2-5-body-xs">Computational Linguistics, 2008</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Hybrid Reinforcement/Supervised Learning of Dialogue Policies from Fixed Data 
Sets","attachmentId":43681476,"attachmentType":"pdf","work_url":"https://www.academia.edu/14976651/Hybrid_Reinforcement_Supervised_Learning_of_Dialogue_Policies_from_Fixed_Data_Sets","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-wsj-grid-card-view-pdf" href="https://www.academia.edu/14976651/Hybrid_Reinforcement_Supervised_Learning_of_Dialogue_Policies_from_Fixed_Data_Sets"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-wsj-grid-card" data-collection-position="8" data-entity-id="79321656" data-sort-order="default"><a class="ds-related-work--title js-wsj-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/79321656/A_Survey_on_Reinforcement_Learning_for_Dialogue_Systems">A Survey on Reinforcement Learning for Dialogue Systems</a><div class="ds-related-work--metadata"><a class="js-wsj-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="191316872" href="https://independent.academia.edu/AmanSoni105">Aman Soni</a></div><p class="ds-related-work--metadata ds2-5-body-xs">viXra, 2019</p><p class="ds-related-work--abstract ds2-5-body-sm">Dialogue systems are computer systems which communicate with humans using natural language. The goal is not just to imitate human communication but to learn from these interactions and improve the system’s behaviour over time. Therefore, different machine learning approaches can be implemented with Reinforcement Learning being one of the most promising techniques to generate a contextually and semantically appropriate response. 
This paper outlines the current state-of-the-art methods and algorithms for integration of Reinforcement Learning techniques into dialogue systems.</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"A Survey on Reinforcement Learning for Dialogue Systems","attachmentId":86075644,"attachmentType":"pdf","work_url":"https://www.academia.edu/79321656/A_Survey_on_Reinforcement_Learning_for_Dialogue_Systems","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-wsj-grid-card-view-pdf" href="https://www.academia.edu/79321656/A_Survey_on_Reinforcement_Learning_for_Dialogue_Systems"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-wsj-grid-card" data-collection-position="9" data-entity-id="76839136" data-sort-order="default"><a class="ds-related-work--title js-wsj-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/76839136/Reinforcement_Learning_With_Simulated_User_For_Automatic_Dialog_Strategy_Optimization">Reinforcement Learning With Simulated User For Automatic Dialog Strategy Optimization</a><div class="ds-related-work--metadata"><a class="js-wsj-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="129160128" href="https://ugh.academia.edu/MinhQuangNguyen">Minh Quang Nguyen</a></div><p class="ds-related-work--abstract ds2-5-body-sm">In this paper, we propose a solution to the problem of formulating strategies for a spoken dialog system. 
Our approach is based on reinforcement learning with the help of a simulated user in order to identify an optimal dialog strategy. Our method considers the Markov decision process to be a framework for representation of speech dialog in which the states represent history and discourse context, the actions are dialog acts and the transition strategies are decisions on actions to take between states. We present our reinforcement learning architecture with a novel objective function that is based on dialog quality rather than its duration.</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Reinforcement Learning With Simulated User For Automatic Dialog Strategy Optimization","attachmentId":84413805,"attachmentType":"pdf","work_url":"https://www.academia.edu/76839136/Reinforcement_Learning_With_Simulated_User_For_Automatic_Dialog_Strategy_Optimization","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-wsj-grid-card-view-pdf" href="https://www.academia.edu/76839136/Reinforcement_Learning_With_Simulated_User_For_Automatic_Dialog_Strategy_Optimization"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div></div></div><div class="ds-sticky-ctas--wrapper js-loswp-sticky-ctas hidden"><div class="ds-sticky-ctas--grid-container"><div class="ds-sticky-ctas--container"><button class="ds2-5-button js-swp-download-button" data-signup-modal="{"location":"continue-reading-button--sticky-ctas","attachmentId":76449794,"attachmentType":"pdf","workUrl":null}">See full PDF</button><button class="ds2-5-button ds2-5-button--secondary 
js-swp-download-button" data-signup-modal="{"location":"download-pdf-button--sticky-ctas","attachmentId":76449794,"attachmentType":"pdf","workUrl":null}"><span class="material-symbols-outlined" style="font-size: 20px" translate="no">download</span>Download PDF</button></div></div></div><div class="ds-below-fold--grid-container"><div class="ds-work--container js-loswp-embedded-document"><div class="attachment_preview" data-attachment="Attachment_76449794" style="display: none"><div class="js-scribd-document-container"><div class="scribd--document-loading js-scribd-document-loader" style="display: block;"><img alt="Loading..." src="//a.academia-assets.com/images/loaders/paper-load.gif" /><p>Loading Preview</p></div></div><div style="text-align: center;"><div class="scribd--no-preview-alert js-preview-unavailable"><p>Sorry, preview is currently unavailable. You can download the paper by clicking the button above.</p></div></div></div></div><div class="ds-sidebar--container js-work-sidebar"><div class="ds-related-content--container"><h2 class="ds-related-content--heading">Related papers</h2><div class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="0" data-entity-id="89333028" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/89333028/Temporal_supervised_learning_for_inferring_a_dialog_policy_from_example_conversations">Temporal supervised learning for inferring a dialog policy from example conversations</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="242415505" href="https://independent.academia.edu/HeHe279">He He</a></div><p class="ds-related-work--metadata ds2-5-body-xs">2014 IEEE Spoken Language Technology Workshop (SLT), 2014</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" 
data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Temporal supervised learning for inferring a dialog policy from example conversations","attachmentId":93151785,"attachmentType":"pdf","work_url":"https://www.academia.edu/89333028/Temporal_supervised_learning_for_inferring_a_dialog_policy_from_example_conversations","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/89333028/Temporal_supervised_learning_for_inferring_a_dialog_policy_from_example_conversations"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="1" data-entity-id="30524642" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/30524642/Optimizing_Dialogue_Management_with_Reinforcement_Learning_Experiments_with_the_NJFun_System">Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun System</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="34843701" href="https://independent.academia.edu/MarilynWalker5">Marilyn Walker</a></div><p class="ds-related-work--metadata ds2-5-body-xs">2011</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Optimizing Dialogue Management with Reinforcement Learning: Experiments with the NJFun 
System","attachmentId":50968471,"attachmentType":"pdf","work_url":"https://www.academia.edu/30524642/Optimizing_Dialogue_Management_with_Reinforcement_Learning_Experiments_with_the_NJFun_System","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/30524642/Optimizing_Dialogue_Management_with_Reinforcement_Learning_Experiments_with_the_NJFun_System"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="2" data-entity-id="118442233" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/118442233/Learning_Robust_Dialog_Policies_in_Noisy_Environments">Learning Robust Dialog Policies in Noisy Environments</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="304788585" href="https://independent.academia.edu/AGeramifard">Alborz Geramifard</a></div><p class="ds-related-work--metadata ds2-5-body-xs">arXiv (Cornell University), 2017</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Learning Robust Dialog Policies in Noisy Environments","attachmentId":114068292,"attachmentType":"pdf","work_url":"https://www.academia.edu/118442233/Learning_Robust_Dialog_Policies_in_Noisy_Environments","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" 
translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/118442233/Learning_Robust_Dialog_Policies_in_Noisy_Environments"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="3" data-entity-id="14976622" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/14976622/Hybrid_reinforcement_supervised_learning_for_dialogue_policies_from_communicator_data">Hybrid reinforcement/supervised learning for dialogue policies from communicator data</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="33971246" href="https://independent.academia.edu/JamesHenderson28">James Henderson</a></div><p class="ds-related-work--metadata ds2-5-body-xs">IJCAI workshop on Knowledge and Reasoning in Practical Dialogue Systems, 2005</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Hybrid reinforcement/supervised learning for dialogue policies from communicator data","attachmentId":38494029,"attachmentType":"pdf","work_url":"https://www.academia.edu/14976622/Hybrid_reinforcement_supervised_learning_for_dialogue_policies_from_communicator_data","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline 
js-related-work-grid-card-view-pdf" href="https://www.academia.edu/14976622/Hybrid_reinforcement_supervised_learning_for_dialogue_policies_from_communicator_data"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="4" data-entity-id="30524515" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/30524515/Automatic_optimization_of_dialogue_management">Automatic optimization of dialogue management</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="34843701" href="https://independent.academia.edu/MarilynWalker5">Marilyn Walker</a></div><p class="ds-related-work--metadata ds2-5-body-xs">Proceedings of the 18th conference on Computational linguistics -, 2000</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Automatic optimization of dialogue management","attachmentId":50968646,"attachmentType":"pdf","work_url":"https://www.academia.edu/30524515/Automatic_optimization_of_dialogue_management","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/30524515/Automatic_optimization_of_dialogue_management"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div 
class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="5" data-entity-id="89599918" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/89599918/Learning_End_to_End_Goal_Oriented_Dialog_with_Maximal_User_Task_Success_and_Minimal_Human_Agent_Use">Learning End-to-End Goal-Oriented Dialog with Maximal User Task Success and Minimal Human Agent Use</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="8648970" href="https://independent.academia.edu/LPolymenakos">Lazaros Polymenakos</a></div><p class="ds-related-work--metadata ds2-5-body-xs">Transactions of the Association for Computational Linguistics, 2019</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Learning End-to-End Goal-Oriented Dialog with Maximal User Task Success and Minimal Human Agent Use","attachmentId":93373617,"attachmentType":"pdf","work_url":"https://www.academia.edu/89599918/Learning_End_to_End_Goal_Oriented_Dialog_with_Maximal_User_Task_Success_and_Minimal_Human_Agent_Use","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/89599918/Learning_End_to_End_Goal_Oriented_Dialog_with_Maximal_User_Task_Success_and_Minimal_Human_Agent_Use"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-related-work-sidebar-card" 
data-collection-position="6" data-entity-id="82646762" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/82646762/Sample_Efficient_Deep_Reinforcement_Learning_for_Dialogue_Systems_With_Large_Action_Spaces">Sample Efficient Deep Reinforcement Learning for Dialogue Systems With Large Action Spaces</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="12420031" href="https://independent.academia.edu/ThabetMohammad">Mohammad Thabet</a></div><p class="ds-related-work--metadata ds2-5-body-xs">IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2018</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Sample Efficient Deep Reinforcement Learning for Dialogue Systems With Large Action Spaces","attachmentId":88285257,"attachmentType":"pdf","work_url":"https://www.academia.edu/82646762/Sample_Efficient_Deep_Reinforcement_Learning_for_Dialogue_Systems_With_Large_Action_Spaces","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/82646762/Sample_Efficient_Deep_Reinforcement_Learning_for_Dialogue_Systems_With_Large_Action_Spaces"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="7" data-entity-id="85474617" data-sort-order="default"><a class="ds-related-work--title 
js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/85474617/Experience_Replay_based_Deep_Reinforcement_Learning_for_Dialogue_Management_Optimisation">Experience Replay-based Deep Reinforcement Learning for Dialogue Management Optimisation</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="208546" href="https://iiita.academia.edu/umashankertiwary">uma shanker tiwary</a></div><p class="ds-related-work--metadata ds2-5-body-xs">ACM Transactions on Asian and Low-Resource Language Information Processing</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Experience Replay-based Deep Reinforcement Learning for Dialogue Management Optimisation","attachmentId":90162149,"attachmentType":"pdf","work_url":"https://www.academia.edu/85474617/Experience_Replay_based_Deep_Reinforcement_Learning_for_Dialogue_Management_Optimisation","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/85474617/Experience_Replay_based_Deep_Reinforcement_Learning_for_Dialogue_Management_Optimisation"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="8" data-entity-id="118442226" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" 
href="https://www.academia.edu/118442226/Resource_Constrained_Dialog_Policy_Learning_Via_Differentiable_Inductive_Logic_Programming">Resource Constrained Dialog Policy Learning Via Differentiable Inductive Logic Programming</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="304788585" href="https://independent.academia.edu/AGeramifard">Alborz Geramifard</a></div><p class="ds-related-work--metadata ds2-5-body-xs">Proceedings of the 28th International Conference on Computational Linguistics, 2020</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Resource Constrained Dialog Policy Learning Via Differentiable Inductive Logic Programming","attachmentId":114068315,"attachmentType":"pdf","work_url":"https://www.academia.edu/118442226/Resource_Constrained_Dialog_Policy_Learning_Via_Differentiable_Inductive_Logic_Programming","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/118442226/Resource_Constrained_Dialog_Policy_Learning_Via_Differentiable_Inductive_Logic_Programming"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="9" data-entity-id="58188605" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" 
href="https://www.academia.edu/58188605/AgentGraph_Toward_Universal_Dialogue_Management_With_Structured_Deep_Reinforcement_Learning">AgentGraph: Toward Universal Dialogue Management With Structured Deep Reinforcement Learning</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="165636886" href="https://cornell.academia.edu/SishanLong">Sishan Long</a></div><p class="ds-related-work--metadata ds2-5-body-xs">IEEE/ACM Transactions on Audio, Speech, and Language Processing</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"AgentGraph: Toward Universal Dialogue Management With Structured Deep Reinforcement Learning","attachmentId":72720419,"attachmentType":"pdf","work_url":"https://www.academia.edu/58188605/AgentGraph_Toward_Universal_Dialogue_Management_With_Structured_Deep_Reinforcement_Learning","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/58188605/AgentGraph_Toward_Universal_Dialogue_Management_With_Structured_Deep_Reinforcement_Learning"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="10" data-entity-id="14976659" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/14976659/An_ISU_dialogue_system_exhibiting_reinforcement_learning_of_dialogue_policies">An ISU dialogue 
system exhibiting reinforcement learning of dialogue policies</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="33971246" href="https://independent.academia.edu/JamesHenderson28">James Henderson</a></div><p class="ds-related-work--metadata ds2-5-body-xs">Proceedings of the Eleventh Conference of the European Chapter of the Association for Computational Linguistics: Posters & Demonstrations on - EACL '06, 2006</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"An ISU dialogue system exhibiting reinforcement learning of dialogue policies","attachmentId":43681663,"attachmentType":"pdf","work_url":"https://www.academia.edu/14976659/An_ISU_dialogue_system_exhibiting_reinforcement_learning_of_dialogue_policies","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/14976659/An_ISU_dialogue_system_exhibiting_reinforcement_learning_of_dialogue_policies"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="11" data-entity-id="88142876" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/88142876/A_dynamic_goal_adapted_task_oriented_dialogue_agent">A dynamic goal adapted task oriented dialogue agent</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author 
ds2-5-body-sm ds2-5-body-link" data-author-id="4646101" href="https://iimcal.academia.edu/ShubhashisSengupta">Shubhashis Sengupta</a></div><p class="ds-related-work--metadata ds2-5-body-xs">PLOS ONE, 2021</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"A dynamic goal adapted task oriented dialogue agent","attachmentId":92175969,"attachmentType":"pdf","work_url":"https://www.academia.edu/88142876/A_dynamic_goal_adapted_task_oriented_dialogue_agent","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/88142876/A_dynamic_goal_adapted_task_oriented_dialogue_agent"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="12" data-entity-id="100687500" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/100687500/Enhancing_Designer_Knowledge_to_Dialogue_Management_A_Comparison_between_Supervised_and_Reinforcement_Learning_Approaches">Enhancing Designer Knowledge to Dialogue Management: A Comparison between Supervised and Reinforcement Learning Approaches</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="42508" href="https://ufscar.academia.edu/ViniciusCarida">Vinicius Caridá</a></div><p class="ds-related-work--metadata ds2-5-body-xs">Anais do XIX Encontro Nacional de Inteligência 
Artificial e Computacional (ENIAC 2022)</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Enhancing Designer Knowledge to Dialogue Management: A Comparison between Supervised and Reinforcement Learning Approaches","attachmentId":101439561,"attachmentType":"pdf","work_url":"https://www.academia.edu/100687500/Enhancing_Designer_Knowledge_to_Dialogue_Management_A_Comparison_between_Supervised_and_Reinforcement_Learning_Approaches","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/100687500/Enhancing_Designer_Knowledge_to_Dialogue_Management_A_Comparison_between_Supervised_and_Reinforcement_Learning_Approaches"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="13" data-entity-id="91518217" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/91518217/SUMBT_LaRL_Effective_Multi_Domain_End_to_End_Neural_Task_Oriented_Dialog_System">SUMBT+LaRL: Effective Multi-Domain End-to-End Neural Task-Oriented Dialog System</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="72898483" href="https://independent.academia.edu/SeokhwanJo">Seokhwan Jo</a></div><p class="ds-related-work--metadata ds2-5-body-xs">IEEE Access, 2021</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link 
ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"SUMBT+LaRL: Effective Multi-Domain End-to-End Neural Task-Oriented Dialog System","attachmentId":94783488,"attachmentType":"pdf","work_url":"https://www.academia.edu/91518217/SUMBT_LaRL_Effective_Multi_Domain_End_to_End_Neural_Task_Oriented_Dialog_System","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/91518217/SUMBT_LaRL_Effective_Multi_Domain_End_to_End_Neural_Task_Oriented_Dialog_System"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="14" data-entity-id="30524415" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/30524415/Reinforcement_learning_for_spoken_dialogue_systems">Reinforcement learning for spoken dialogue systems</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="34843701" href="https://independent.academia.edu/MarilynWalker5">Marilyn Walker</a></div><p class="ds-related-work--metadata ds2-5-body-xs">Proc. 
NIPS99</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Reinforcement learning for spoken dialogue systems","attachmentId":50968496,"attachmentType":"pdf","work_url":"https://www.academia.edu/30524415/Reinforcement_learning_for_spoken_dialogue_systems","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/30524415/Reinforcement_learning_for_spoken_dialogue_systems"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="15" data-entity-id="65059580" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/65059580/Dialog_Simulation_with_Realistic_Variations_for_Training_Goal_Oriented_Conversational_Systems">Dialog Simulation with Realistic Variations for Training Goal-Oriented Conversational Systems</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="54474309" href="https://independent.academia.edu/NehalBelgamwar">Nehal Belgamwar</a></div><p class="ds-related-work--metadata ds2-5-body-xs">2020</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Dialog Simulation with Realistic Variations for Training Goal-Oriented Conversational 
Systems","attachmentId":76811224,"attachmentType":"pdf","work_url":"https://www.academia.edu/65059580/Dialog_Simulation_with_Realistic_Variations_for_Training_Goal_Oriented_Conversational_Systems","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/65059580/Dialog_Simulation_with_Realistic_Variations_for_Training_Goal_Oriented_Conversational_Systems"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="16" data-entity-id="79127805" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/79127805/Dialogue_Systems_Domain_Interaction_Using_Reinforcement_Learning">Dialogue Systems Domain Interaction Using Reinforcement Learning</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="175342787" href="https://independent.academia.edu/PauloAra%C3%BAjo74">Paulo Araújo</a></div><p class="ds-related-work--metadata ds2-5-body-xs">2008</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Dialogue Systems Domain Interaction Using Reinforcement Learning","attachmentId":85953549,"attachmentType":"pdf","work_url":"https://www.academia.edu/79127805/Dialogue_Systems_Domain_Interaction_Using_Reinforcement_Learning","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" 
translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/79127805/Dialogue_Systems_Domain_Interaction_Using_Reinforcement_Learning"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="17" data-entity-id="30536525" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/30536525/A_stochastic_model_of_human_machine_interaction_for_learning_dialog_strategies">A stochastic model of human-machine interaction for learning dialog strategies</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="58204826" href="https://independent.academia.edu/WielandEckert">Wieland Eckert</a></div><p class="ds-related-work--metadata ds2-5-body-xs">IEEE Transactions on Speech and Audio Processing, 2000</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"A stochastic model of human-machine interaction for learning dialog strategies","attachmentId":50980325,"attachmentType":"pdf","work_url":"https://www.academia.edu/30536525/A_stochastic_model_of_human_machine_interaction_for_learning_dialog_strategies","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" 
href="https://www.academia.edu/30536525/A_stochastic_model_of_human_machine_interaction_for_learning_dialog_strategies"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="18" data-entity-id="30524655" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/30524655/An_Application_of_Reinforcement_Learning_to_Dialogue_Strategy_Selection_in_a_Spoken_Dialogue_System">An Application of Reinforcement Learning to Dialogue Strategy Selection in a Spoken Dialogue System</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="34843701" href="https://independent.academia.edu/MarilynWalker5">Marilyn Walker</a></div><p class="ds-related-work--metadata ds2-5-body-xs">2002</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"An Application of Reinforcement Learning to Dialogue Strategy Selection in a Spoken Dialogue System","attachmentId":50968645,"attachmentType":"pdf","work_url":"https://www.academia.edu/30524655/An_Application_of_Reinforcement_Learning_to_Dialogue_Strategy_Selection_in_a_Spoken_Dialogue_System","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/30524655/An_Application_of_Reinforcement_Learning_to_Dialogue_Strategy_Selection_in_a_Spoken_Dialogue_System"><span 
class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="19" data-entity-id="76839146" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/76839146/A_Quality_Focused_Spoken_Dialog_System_With_Reinforcement_Learning_And_Simulated_User">A Quality-Focused Spoken Dialog System With Reinforcement Learning And Simulated User</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="129160128" href="https://ugh.academia.edu/MinhQuangNguyen">Minh Quang Nguyen</a></div><p class="ds-related-work--metadata ds2-5-body-xs">2008</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"A Quality-Focused Spoken Dialog System With Reinforcement Learning And Simulated User","attachmentId":84401951,"attachmentType":"pdf","work_url":"https://www.academia.edu/76839146/A_Quality_Focused_Spoken_Dialog_System_With_Reinforcement_Learning_And_Simulated_User","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/76839146/A_Quality_Focused_Spoken_Dialog_System_With_Reinforcement_Learning_And_Simulated_User"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container 
js-related-work-sidebar-card" data-collection-position="20" data-entity-id="111812076" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/111812076/Scaling_up_deep_reinforcement_learning_for_multi_domain_dialogue_systems">Scaling up deep reinforcement learning for multi-domain dialogue systems</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="53453208" href="https://independent.academia.edu/JacobCarse">Jacob Carse</a></div><p class="ds-related-work--metadata ds2-5-body-xs">2017</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Scaling up deep reinforcement learning for multi-domain dialogue systems","attachmentId":109236676,"attachmentType":"pdf","work_url":"https://www.academia.edu/111812076/Scaling_up_deep_reinforcement_learning_for_multi_domain_dialogue_systems","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/111812076/Scaling_up_deep_reinforcement_learning_for_multi_domain_dialogue_systems"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="21" data-entity-id="60598699" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" 
href="https://www.academia.edu/60598699/Learning_dialogue_policies_using_state_aggregation_in_reinforcement_learning">Learning dialogue policies using state aggregation in reinforcement learning</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="43344964" href="https://independent.academia.edu/MatthiasDenecke">Matthias Denecke</a></div><p class="ds-related-work--metadata ds2-5-body-xs">2004</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Learning dialogue policies using state aggregation in reinforcement learning","attachmentId":73970768,"attachmentType":"pdf","work_url":"https://www.academia.edu/60598699/Learning_dialogue_policies_using_state_aggregation_in_reinforcement_learning","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/60598699/Learning_dialogue_policies_using_state_aggregation_in_reinforcement_learning"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="22" data-entity-id="91279132" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/91279132/MAS_Architectural_Model_for_Dialog_Systems_with_Advancing_Conversations">MAS Architectural Model for Dialog Systems with Advancing Conversations</a><div class="ds-related-work--metadata"><a 
class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="237590476" href="https://independent.academia.edu/hokoyo">henry okoyo</a></div><p class="ds-related-work--metadata ds2-5-body-xs">International Journal of Scientific Research in Computer Science, Engineering and Information Technology, 2018</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"MAS Architectural Model for Dialog Systems with Advancing Conversations","attachmentId":94610745,"attachmentType":"pdf","work_url":"https://www.academia.edu/91279132/MAS_Architectural_Model_for_Dialog_Systems_with_Advancing_Conversations","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/91279132/MAS_Architectural_Model_for_Dialog_Systems_with_Advancing_Conversations"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="23" data-entity-id="68351703" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/68351703/Learning_Optimal_Dialogue_Management_Rules_by_Using_Reinforcement_Learning_and_Inductive_Logic_Programming">Learning Optimal Dialogue Management Rules by Using Reinforcement Learning and Inductive Logic Programming</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="40691165" 
href="https://independent.academia.edu/RenaudLecoeuche">Renaud Lecoeuche</a></div><p class="ds-related-work--metadata ds2-5-body-xs">2001</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Learning Optimal Dialogue Management Rules by Using Reinforcement Learning and Inductive Logic Programming","attachmentId":78856257,"attachmentType":"pdf","work_url":"https://www.academia.edu/68351703/Learning_Optimal_Dialogue_Management_Rules_by_Using_Reinforcement_Learning_and_Inductive_Logic_Programming","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/68351703/Learning_Optimal_Dialogue_Management_Rules_by_Using_Reinforcement_Learning_and_Inductive_Logic_Programming"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div><div class="ds-related-work--container js-related-work-sidebar-card" data-collection-position="24" data-entity-id="5571837" data-sort-order="default"><a class="ds-related-work--title js-related-work-grid-card-title ds2-5-body-md ds2-5-body-link" href="https://www.academia.edu/5571837/Learning_multi_goal_dialogue_strategies_using_reinforcement_learning_with_reduced_state_action_spaces">Learning multi-goal dialogue strategies using reinforcement learning with reduced state-action spaces</a><div class="ds-related-work--metadata"><a class="js-related-work-grid-card-author ds2-5-body-sm ds2-5-body-link" data-author-id="7929936" href="https://edinburgh.academia.edu/HiroshiShimodaira">Hiroshi Shimodaira</a></div><p class="ds-related-work--metadata 
ds2-5-body-xs">2006</p><div class="ds-related-work--ctas"><button class="ds2-5-text-link ds2-5-text-link--inline js-swp-download-button" data-signup-modal="{"location":"wsj-grid-card-download-pdf-modal","work_title":"Learning multi-goal dialogue strategies using reinforcement learning with reduced state-action spaces","attachmentId":32660008,"attachmentType":"pdf","work_url":"https://www.academia.edu/5571837/Learning_multi_goal_dialogue_strategies_using_reinforcement_learning_with_reduced_state_action_spaces","alternativeTracking":true}"><span class="material-symbols-outlined" style="font-size: 18px" translate="no">download</span><span class="ds2-5-text-link__content">Download free PDF</span></button><a class="ds2-5-text-link ds2-5-text-link--inline js-related-work-grid-card-view-pdf" href="https://www.academia.edu/5571837/Learning_multi_goal_dialogue_strategies_using_reinforcement_learning_with_reduced_state_action_spaces"><span class="ds2-5-text-link__content">View PDF</span><span class="material-symbols-outlined" style="font-size: 18px" translate="no">chevron_right</span></a></div></div></div><div class="ds-related-content--container"><h2 class="ds-related-content--heading">Related topics</h2><div class="ds-research-interests--pills-container"><a class="js-related-research-interest ds-research-interests--pill" data-entity-id="422" href="https://www.academia.edu/Documents/in/Computer_Science">Computer Science</a><a class="js-related-research-interest ds-research-interests--pill" data-entity-id="3193313" href="https://www.academia.edu/Documents/in/arXiv">arXiv</a></div></div></div></div></div> </body> </html>