{"title":"On Dialogue Systems Based on Deep Learning","authors":"Yifan Fan, Xudong Luo, Pingping Lin","volume":168,"journal":"International Journal of Computer and Information Engineering","pagesStart":525,"pagesEnd":534,"ISSN":"1307-6892","URL":"https:\/\/publications.waset.org\/pdf\/10011653","abstract":"Dialogue systems are increasingly becoming the way humans access many computer systems, allowing them to interact with computers in natural language. A dialogue system consists of three parts: understanding what humans say in natural language, managing the dialogue, and generating responses in natural language. In this paper, we survey deep-learning-based methods for dialogue management, response generation, and dialogue evaluation. Specifically, these methods are based on neural networks, long short-term memory networks, deep reinforcement learning, pre-training, and generative adversarial networks. We compare these methods and point out directions for further research.","publisher":"World Academy of Science, Engineering and Technology","index":"Open Science Index 168, 2020"}