<!DOCTYPE html> <html lang="en"> <head> <meta content="text/html; charset=utf-8" http-equiv="content-type"/> <title>The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims</title> <!--Generated on Thu Nov 21 12:24:34 2024 by LaTeXML (version 0.8.8) http://dlmf.nist.gov/LaTeXML/.--> <meta content="width=device-width, initial-scale=1, shrink-to-fit=no" name="viewport"/> <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/ar5iv.0.7.9.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/ar5iv-fonts.0.7.9.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/latexml_styles.css" rel="stylesheet" type="text/css"/> <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/js/bootstrap.bundle.min.js"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/html2canvas/1.3.3/html2canvas.min.js"></script> <script src="/static/browse/0.3.4/js/addons_new.js"></script> <script src="/static/browse/0.3.4/js/feedbackOverlay.js"></script> <meta content="patent text abstract generation master-slave encoder." 
lang="en" name="keywords"/> <base href="/html/2411.14072v1/"/></head> <body> <nav class="ltx_page_navbar"> <nav class="ltx_TOC"> <ol class="ltx_toclist"> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S1" title="In The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">1 </span>Introduction</span></a></li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S2" title="In The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">2 </span>Related work</span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S2.SS1" title="In 2 Related work ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">2.1 </span>Patent Text Abstract Generation</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S2.SS2" title="In 2 Related work ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">2.2 </span>Out-of-vocabulary Problem</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S2.SS3" title="In 2 Related work ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span 
class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">2.3 </span>Repeated Generation Issues</span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S3" title="In The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">3 </span>Methodology</span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S3.SS1" title="In 3 Methodology ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">3.1 </span>Master Encoder</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S3.SS2" title="In 3 Methodology ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">3.2 </span>Slave Encoder</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S3.SS3" title="In 3 Methodology ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">3.3 </span>Decoder</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S3.SS4" title="In 3 Methodology ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_title"><span 
class="ltx_tag ltx_tag_ref">3.4 </span>Pointer Network</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S3.SS5" title="In 3 Methodology ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">3.5 </span>Enhanced Repetition Suppression Mechanism</span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S4" title="In The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4 </span>Experiments</span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S4.SS1" title="In 4 Experiments ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.1 </span>Data Source and Preprocessing</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S4.SS2" title="In 4 Experiments ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.2 </span>Model parameter settings and Metrics</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S4.SS3" title="In 4 Experiments ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text 
ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.3 </span>Baselines</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S4.SS4" title="In 4 Experiments ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.4 </span>Comparison</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S4.SS5" title="In 4 Experiments ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.5 </span>Analysis of Differences between Using Specifications and Claims Text in Patents</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S4.SS6" title="In 4 Experiments ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.6 </span>Sensitivity Analysis Under Different Decoding Lengths</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S4.SS7" title="In 4 Experiments ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.7 </span>Sensitivity Analysis of Hidden Layers</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S4.SS8" title="In 4 Experiments ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining 
Specifications and Claims"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.8 </span>Ablation Study</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S4.SS9" title="In 4 Experiments ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.9 </span>Enhanced Repetition Suppression Mechanism Results Analysis</span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S5" title="In The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">5 </span>Conclusions</span></a></li> </ol></nav> </nav> <div class="ltx_page_main"> <div class="ltx_page_content"> <article class="ltx_document ltx_authors_1line"><span class="ltx_note ltx_role_institutetext" id="id1"><sup class="ltx_note_mark">1</sup><span class="ltx_note_outer"><span class="ltx_note_content"><sup class="ltx_note_mark">1</sup><span class="ltx_note_type">institutetext: </span>School of Information Management, Nanjing University, China </span></span></span><span class="ltx_note ltx_role_institutetext" id="id2"><sup class="ltx_note_mark">2</sup><span class="ltx_note_outer"><span class="ltx_note_content"><sup class="ltx_note_mark">2</sup><span class="ltx_note_type">institutetext: </span>Key Laboratory of Data Engineering and Knowledge Services in Jiangsu Provincial Universities, Nanjing University, China <br class="ltx_break"/><span class="ltx_note ltx_role_email" id="id2.1"><sup class="ltx_note_mark">2</sup><span class="ltx_note_outer"><span class="ltx_note_content"><sup class="ltx_note_mark">2</sup><span class="ltx_note_type">email: </span>shuzhou@smail.nju.edu.cn; 
ywhaowang@nju.edu.cn</span></span></span> <br class="ltx_break"/></span></span></span><span class="ltx_note ltx_role_institutetext" id="id3"><sup class="ltx_note_mark">3</sup><span class="ltx_note_outer"><span class="ltx_note_content"><sup class="ltx_note_mark">3</sup><span class="ltx_note_type">institutetext: </span>Baidu, Inc., Beijing, China <br class="ltx_break"/></span></span></span> <h1 class="ltx_title ltx_title_document">The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims</h1> <div class="ltx_authors"> <span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Shu Zhou </span><span class="ltx_author_notes">These authors contributed equally.1122</span></span> <span class="ltx_author_before">  </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Xin Wang </span><span class="ltx_author_notes">33</span></span> <span class="ltx_author_before">  </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Zhengda Zhou<sup class="ltx_sup" id="id3.2.id1"><span class="ltx_text ltx_font_italic" id="id3.2.id1.1">⋆</span></sup> </span><span class="ltx_author_notes">1122</span></span> <span class="ltx_author_before">  </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Haohan Yi<sup class="ltx_sup" id="id4.2.id1"><span class="ltx_text ltx_font_italic" id="id4.2.id1.1">⋆</span></sup> </span><span class="ltx_author_notes">1122</span></span> <span class="ltx_author_before">  </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Xuhui Zheng </span><span class="ltx_author_notes">1122</span></span> <span class="ltx_author_before">  </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Hao Wang </span><span class="ltx_author_notes">Corresponding author. This paper is supported by the National Natural Science Foundation of China under contract No. 
72074108, Jiangsu Young Talents in Social Sciences, and Tang Scholar of Nanjing University.1122</span></span> </div> <div class="ltx_abstract"> <h6 class="ltx_title ltx_title_abstract">Abstract</h6> <p class="ltx_p" id="id5.id1">Patent text abstract generation faces three problems: insufficient generation quality, because traditional models draw only on patent specifications; out-of-vocabulary (OOV) new terminology, caused by rapid patent updates; and information redundancy, caused by insufficient consideration of the high professionalism, accuracy, and uniqueness of patent texts. To solve them, we propose a patent text abstract generation model based on a master-slave encoder architecture (MSEA). First, the MSEA model designs a master-slave encoder that takes the specifications and the claims of the patent text together as input and fully explores the characteristics and details of both. Then, the model strengthens its consideration of new technical terms in the input sequence through a pointer network, and further strengthens the correlation with the input text by reweighting the "remembered" and "forgotten" parts of the input sequence from the encoder. Finally, an enhanced repetition suppression mechanism for patent text is introduced to ensure that the generated abstracts are accurate and non-redundant. On a publicly available patent text dataset, compared to the state-of-the-art model, Improved Multi-Head Attention Mechanism (IMHAM), the MSEA model achieves an improvement of 0.006, 0.005, and 0.005 in Rouge-1, Rouge-2, and Rouge-L scores, respectively. MSEA leverages the characteristics of patent texts to effectively enhance the quality of patent text generation, demonstrating its advancement and effectiveness in the experiments.</p> </div> <div class="ltx_keywords"> <h6 class="ltx_title ltx_title_keywords">Keywords: </h6>patent text abstract generation master-slave encoder. 
</div> <section class="ltx_section" id="S1"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">1 </span>Introduction</h2> <div class="ltx_para" id="S1.p1"> <p class="ltx_p" id="S1.p1.1">In recent years, as the number of patent applications in China has steadily increased, the quality of patent abstract generation has become increasingly scrutinized <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib138" title="">138</a>]</cite>. Unfortunately, many patent applications are rejected, one major reason being defects in the claims, specifications, and abstracts <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib139" title="">139</a>]</cite>. These defects can partly be attributed to three key issues in abstract generation: first, the neglect of the importance of the claims; second, the Out of Vocabulary (OOV) issue with new patent terminology; and third, the failure to properly handle information redundancy and maintain professional value <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib45" title="">45</a>]</cite>.</p> </div> <div class="ltx_para" id="S1.p2"> <p class="ltx_p" id="S1.p2.1">Recently, encoder-decoder models based on neural networks have been used in various sequence-to-sequence tasks <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib76" title="">76</a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib106" title="">106</a>]</cite>, such as machine translation, speech recognition, and text summary generation. 
Although these models have made significant progress in other sequence-to-sequence tasks like machine translation and speech recognition, they still face a range of challenges in patent summary generation <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib138" title="">138</a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib45" title="">45</a>]</cite>. We address the following three main issues:</p> </div> <div class="ltx_para" id="S1.p3"> <p class="ltx_p" id="S1.p3.1">Traditional patent text summary generation models <span class="ltx_text ltx_font_bold" id="S1.p3.1.1">(1) only source input text from patent specifications</span>, resulting in summaries that may lack the core points of the patent. This is because patent texts include both the claims and the specifications, which differ in content (the specifications typically include the title, technical field, background technology, content of the invention, description of the drawings, and specific embodiments, while the claims include independent and dependent claims) and importance.</p> </div> <div class="ltx_para" id="S1.p4"> <p class="ltx_p" id="S1.p4.1">The rapid updating of patents and the fine classification of technology result in a large number of new technological terms, <span class="ltx_text ltx_font_bold" id="S1.p4.1.1">(2) making the OOV issue particularly prominent</span>. 
However, traditional pointer networks often only consider the context vector, the state of the decoder, and the input to the decoder, and do not fully account for the importance of these new technological terms, leading to summaries that may neglect or mismanage these terms.</p> </div> <div class="ltx_para" id="S1.p5"> <p class="ltx_p" id="S1.p5.1">The high professionalism and uniqueness of patent texts <span class="ltx_text ltx_font_bold" id="S1.p5.1.1">(3) require summary generation models to be highly accurate and free of redundancy</span>. Every specification, term, and technical detail in a patent text carries crucial information. Thus, compared to general text generation, repetitive generation in patent texts not only leads to information redundancy but also causes the generated text to lose its intended value and professionalism. Traditional coverage mechanisms have not fully addressed the challenge of repetitive generation, and they are not adapted to generating text content with such high specialization and uniqueness.</p> </div> <div class="ltx_para" id="S1.p6"> <p class="ltx_p" id="S1.p6.1">To address these issues, we propose a new patent text summary generation model based on a master-slave encoder architecture (MSEA), built on the existing sequence-to-sequence framework. MSEA considers the different importance of the specifications and claims in patent texts, dividing the encoder into two parts: a master encoder and a slave encoder, with a separate decoder. The slave encoder processes the input from the master encoder and other inputs separately, producing a new vector as an additional input to the decoder, which enables the decoder to obtain more semantic information. 
In the decoding phase, the model performs multi-step decoding and establishes a semantic feature vector for the text content at each step, allowing the decoder to continuously ‘remember’ the content generated in previous time steps to avoid repetition, thereby enhancing the accuracy and professional value of the summary.</p> </div> <div class="ltx_para" id="S1.p7"> <p class="ltx_p" id="S1.p7.1">The main contributions of this paper are summarized as follows:</p> </div> <div class="ltx_para" id="S1.p8"> <p class="ltx_p" id="S1.p8.1">(1) The master-slave encoder architecture (MSEA) is specifically designed to handle the specifications and claims in patent texts, integrating these two parts as inputs to fully explore their characteristics and details. This approach significantly improves the core point coverage of patent summaries, ensuring that the generated summaries accurately reflect the innovative aspects and technical scope of the patents.</p> </div> <div class="ltx_para" id="S1.p9"> <p class="ltx_p" id="S1.p9.1">(2) The MSEA model, through an improved pointer network and reweighted input sequence, particularly enhances the recognition and handling of new technological terms. This not only addresses the OOV issue but also enhances the model’s relevance to the input text by ‘remembering’ and ‘forgetting’ different parts of the input sequence, improving the accuracy of term processing.</p> </div> <div class="ltx_para" id="S1.p10"> <p class="ltx_p" id="S1.p10.1">(3) The MSEA model introduces an enhanced repetition suppression mechanism specifically tailored for the high professionalism and uniqueness of patent texts, ensuring that the generated summaries are both accurate and non-redundant. 
Through multi-step decoding and suppression of repetitive content, the model effectively prevents the generation of redundant information, ensuring the professional value and precision of the summaries.</p> </div> <div class="ltx_para" id="S1.p11"> <p class="ltx_p" id="S1.p11.1">(4) Experimental results show that the MSEA model surpasses advanced patent summary generation models (such as IMHAM) on common Rouge scoring metrics, demonstrating its advanced capabilities and effectiveness in the field of patent summary generation.</p> </div> </section> <section class="ltx_section" id="S2"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">2 </span>Related work</h2> <section class="ltx_subsection" id="S2.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">2.1 </span>Patent Text Abstract Generation</h3> <div class="ltx_para" id="S2.SS1.p1"> <p class="ltx_p" id="S2.SS1.p1.1">Text summarization, particularly in the context of patent texts, remains a vital area of research in natural language processing. Deep learning advancements have notably improved summary quality, yet accurately summarizing main points continues to be a challenge <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib76" title="">76</a>]</cite>. 
Summarization techniques include extractive summarization, which compiles key sentences from the text <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib79" title="">79</a>]</cite>, and abstractive summarization, which creates summaries that may diverge from the source text to include multiple topics and varied sentence structures <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib130" title="">130</a>]</cite>.</p> </div> <div class="ltx_para" id="S2.SS1.p2"> <p class="ltx_p" id="S2.SS1.p2.1">In patent summarization, existing models struggle with producing summaries of appropriate length and relevance <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib76" title="">76</a>]</cite>. Recent models like the Reinforcement Learning Chinese Patent Rewriting Abstract (RLCPRA) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib139" title="">139</a>]</cite>, which focuses on patent specifications using reinforcement learning, and the Strategy Transformer Network Language Towards Patent (STNLTP) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib138" title="">138</a>]</cite>, which employs an ensemble approach for the same, address specific challenges such as out-of-vocabulary issues and repetitiveness. The Improved Multi-Head Attention Mechanism (IMHAM) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib45" title="">45</a>]</cite> and Master-Slave Encoder Architecture (MSEA) models incorporate both specifications and claims of patents, aiming to enhance summary quality by acknowledging the unique structure of patent texts. 
MSEA differentiates itself by using a dual-encoder (master-slave) approach to capture more detailed aspects of patents compared to IMHAM’s focus on either claims or specifications.</p> </div> <div class="ltx_para" id="S2.SS1.p3"> <p class="ltx_p" id="S2.SS1.p3.1">Despite these advancements, combining specifications and claims in patent summarization remains an emerging and still-evolving area, underscoring the need for continued research.</p> </div> </section> <section class="ltx_subsection" id="S2.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">2.2 </span>Out-of-vocabulary Problem</h3> <div class="ltx_para" id="S2.SS2.p1"> <p class="ltx_p" id="S2.SS2.p1.1">The Out-Of-Vocabulary (OOV) issue is a significant challenge in natural language processing, where words not in a model’s vocabulary lead to errors in output. Recent strategies include subword tokenization <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib52" title="">52</a>]</cite> and vocabulary construction with hierarchical supervision <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib29" title="">29</a>]</cite>, which allow models to generate words beyond their initial vocabulary, thus improving their handling of OOV words.</p> </div> <div class="ltx_para" id="S2.SS2.p2"> <p class="ltx_p" id="S2.SS2.p2.1">Additionally, incorporating external knowledge sources like word embeddings or dictionaries has been shown to enhance text generation quality by providing extra context and meaning <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib65" title="">65</a>]</cite>. 
Moreover, the application of reinforcement learning has furthered model performance by enabling iterative learning from feedback, optimizing the handling of OOV words in generated text.</p> </div> </section> <section class="ltx_subsection" id="S2.SS3"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">2.3 </span>Repeated Generation Issues</h3> <div class="ltx_para" id="S2.SS3.p1"> <p class="ltx_p" id="S2.SS3.p1.1">A common issue with neural network-based encoder-decoder models is their tendency to produce repetitive and incoherent phrases in longer summaries. To avoid this, a coverage mechanism can be used to reduce repetitions by discouraging the model from repeatedly focusing on the same parts of the encoded input <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib99" title="">99</a>]</cite>. Additionally, the information already decoded by the decoder can be used to prevent repetition.</p> </div> <div class="ltx_para" id="S2.SS3.p2"> <p class="ltx_p" id="S2.SS3.p2.1">The problem of repetition occurs more frequently in long sequence generation tasks. However, researchers have seldom focused on using large datasets to summarize longer texts. Nallapati et al. <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib79" title="">79</a>]</cite> proposed an encoder-decoder model with hierarchical attention based on Recurrent Neural Networks (RNNs) for abstractive summarization tasks. 
Later, another hierarchical RNN model was developed, achieving significantly better results in abstraction <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib78" title="">78</a>]</cite>.</p> </div> </section> </section> <section class="ltx_section" id="S3"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">3 </span>Methodology</h2> <div class="ltx_para" id="S3.p1"> <p class="ltx_p" id="S3.p1.7">In the task of generating patent text summaries, a patent typically contains information such as specification, claims, publication number, and title. This model specifically combines information from the specifications and claims. Specifically, it takes as input a sequence of source text from the patent’s specifications <math alttext="X=(x_{1},x_{2},\ldots,x_{j},\ldots x_{m})" class="ltx_Math" display="inline" id="S3.p1.1.m1.5"><semantics id="S3.p1.1.m1.5a"><mrow id="S3.p1.1.m1.5.5" xref="S3.p1.1.m1.5.5.cmml"><mi id="S3.p1.1.m1.5.5.6" xref="S3.p1.1.m1.5.5.6.cmml">X</mi><mo id="S3.p1.1.m1.5.5.5" xref="S3.p1.1.m1.5.5.5.cmml">=</mo><mrow id="S3.p1.1.m1.5.5.4.4" xref="S3.p1.1.m1.5.5.4.5.cmml"><mo id="S3.p1.1.m1.5.5.4.4.5" stretchy="false" xref="S3.p1.1.m1.5.5.4.5.cmml">(</mo><msub id="S3.p1.1.m1.2.2.1.1.1" xref="S3.p1.1.m1.2.2.1.1.1.cmml"><mi id="S3.p1.1.m1.2.2.1.1.1.2" xref="S3.p1.1.m1.2.2.1.1.1.2.cmml">x</mi><mn id="S3.p1.1.m1.2.2.1.1.1.3" xref="S3.p1.1.m1.2.2.1.1.1.3.cmml">1</mn></msub><mo id="S3.p1.1.m1.5.5.4.4.6" xref="S3.p1.1.m1.5.5.4.5.cmml">,</mo><msub id="S3.p1.1.m1.3.3.2.2.2" xref="S3.p1.1.m1.3.3.2.2.2.cmml"><mi id="S3.p1.1.m1.3.3.2.2.2.2" xref="S3.p1.1.m1.3.3.2.2.2.2.cmml">x</mi><mn id="S3.p1.1.m1.3.3.2.2.2.3" xref="S3.p1.1.m1.3.3.2.2.2.3.cmml">2</mn></msub><mo id="S3.p1.1.m1.5.5.4.4.7" xref="S3.p1.1.m1.5.5.4.5.cmml">,</mo><mi id="S3.p1.1.m1.1.1" mathvariant="normal" xref="S3.p1.1.m1.1.1.cmml">…</mi><mo id="S3.p1.1.m1.5.5.4.4.8" xref="S3.p1.1.m1.5.5.4.5.cmml">,</mo><msub 
id="S3.p1.1.m1.4.4.3.3.3" xref="S3.p1.1.m1.4.4.3.3.3.cmml"><mi id="S3.p1.1.m1.4.4.3.3.3.2" xref="S3.p1.1.m1.4.4.3.3.3.2.cmml">x</mi><mi id="S3.p1.1.m1.4.4.3.3.3.3" xref="S3.p1.1.m1.4.4.3.3.3.3.cmml">j</mi></msub><mo id="S3.p1.1.m1.5.5.4.4.9" xref="S3.p1.1.m1.5.5.4.5.cmml">,</mo><mrow id="S3.p1.1.m1.5.5.4.4.4" xref="S3.p1.1.m1.5.5.4.4.4.cmml"><mi id="S3.p1.1.m1.5.5.4.4.4.2" mathvariant="normal" xref="S3.p1.1.m1.5.5.4.4.4.2.cmml">…</mi><mo id="S3.p1.1.m1.5.5.4.4.4.1" xref="S3.p1.1.m1.5.5.4.4.4.1.cmml">⁢</mo><msub id="S3.p1.1.m1.5.5.4.4.4.3" xref="S3.p1.1.m1.5.5.4.4.4.3.cmml"><mi id="S3.p1.1.m1.5.5.4.4.4.3.2" xref="S3.p1.1.m1.5.5.4.4.4.3.2.cmml">x</mi><mi id="S3.p1.1.m1.5.5.4.4.4.3.3" xref="S3.p1.1.m1.5.5.4.4.4.3.3.cmml">m</mi></msub></mrow><mo id="S3.p1.1.m1.5.5.4.4.10" stretchy="false" xref="S3.p1.1.m1.5.5.4.5.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.p1.1.m1.5b"><apply id="S3.p1.1.m1.5.5.cmml" xref="S3.p1.1.m1.5.5"><eq id="S3.p1.1.m1.5.5.5.cmml" xref="S3.p1.1.m1.5.5.5"></eq><ci id="S3.p1.1.m1.5.5.6.cmml" xref="S3.p1.1.m1.5.5.6">𝑋</ci><vector id="S3.p1.1.m1.5.5.4.5.cmml" xref="S3.p1.1.m1.5.5.4.4"><apply id="S3.p1.1.m1.2.2.1.1.1.cmml" xref="S3.p1.1.m1.2.2.1.1.1"><csymbol cd="ambiguous" id="S3.p1.1.m1.2.2.1.1.1.1.cmml" xref="S3.p1.1.m1.2.2.1.1.1">subscript</csymbol><ci id="S3.p1.1.m1.2.2.1.1.1.2.cmml" xref="S3.p1.1.m1.2.2.1.1.1.2">𝑥</ci><cn id="S3.p1.1.m1.2.2.1.1.1.3.cmml" type="integer" xref="S3.p1.1.m1.2.2.1.1.1.3">1</cn></apply><apply id="S3.p1.1.m1.3.3.2.2.2.cmml" xref="S3.p1.1.m1.3.3.2.2.2"><csymbol cd="ambiguous" id="S3.p1.1.m1.3.3.2.2.2.1.cmml" xref="S3.p1.1.m1.3.3.2.2.2">subscript</csymbol><ci id="S3.p1.1.m1.3.3.2.2.2.2.cmml" xref="S3.p1.1.m1.3.3.2.2.2.2">𝑥</ci><cn id="S3.p1.1.m1.3.3.2.2.2.3.cmml" type="integer" xref="S3.p1.1.m1.3.3.2.2.2.3">2</cn></apply><ci id="S3.p1.1.m1.1.1.cmml" xref="S3.p1.1.m1.1.1">…</ci><apply id="S3.p1.1.m1.4.4.3.3.3.cmml" xref="S3.p1.1.m1.4.4.3.3.3"><csymbol cd="ambiguous" 
id="S3.p1.1.m1.4.4.3.3.3.1.cmml" xref="S3.p1.1.m1.4.4.3.3.3">subscript</csymbol><ci id="S3.p1.1.m1.4.4.3.3.3.2.cmml" xref="S3.p1.1.m1.4.4.3.3.3.2">𝑥</ci><ci id="S3.p1.1.m1.4.4.3.3.3.3.cmml" xref="S3.p1.1.m1.4.4.3.3.3.3">𝑗</ci></apply><apply id="S3.p1.1.m1.5.5.4.4.4.cmml" xref="S3.p1.1.m1.5.5.4.4.4"><times id="S3.p1.1.m1.5.5.4.4.4.1.cmml" xref="S3.p1.1.m1.5.5.4.4.4.1"></times><ci id="S3.p1.1.m1.5.5.4.4.4.2.cmml" xref="S3.p1.1.m1.5.5.4.4.4.2">…</ci><apply id="S3.p1.1.m1.5.5.4.4.4.3.cmml" xref="S3.p1.1.m1.5.5.4.4.4.3"><csymbol cd="ambiguous" id="S3.p1.1.m1.5.5.4.4.4.3.1.cmml" xref="S3.p1.1.m1.5.5.4.4.4.3">subscript</csymbol><ci id="S3.p1.1.m1.5.5.4.4.4.3.2.cmml" xref="S3.p1.1.m1.5.5.4.4.4.3.2">𝑥</ci><ci id="S3.p1.1.m1.5.5.4.4.4.3.3.cmml" xref="S3.p1.1.m1.5.5.4.4.4.3.3">𝑚</ci></apply></apply></vector></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.p1.1.m1.5c">X=(x_{1},x_{2},\ldots,x_{j},\ldots x_{m})</annotation><annotation encoding="application/x-llamapun" id="S3.p1.1.m1.5d">italic_X = ( italic_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_x start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , … , italic_x start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , … italic_x start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT )</annotation></semantics></math> and a sequence from the patent’s claims <math alttext="X^{\prime}=(x^{\prime}_{1},x^{\prime}_{2},\ldots,x^{\prime}_{j},\ldots x^{% \prime}_{m})" class="ltx_Math" display="inline" id="S3.p1.2.m2.5"><semantics id="S3.p1.2.m2.5a"><mrow id="S3.p1.2.m2.5.5" xref="S3.p1.2.m2.5.5.cmml"><msup id="S3.p1.2.m2.5.5.6" xref="S3.p1.2.m2.5.5.6.cmml"><mi id="S3.p1.2.m2.5.5.6.2" xref="S3.p1.2.m2.5.5.6.2.cmml">X</mi><mo id="S3.p1.2.m2.5.5.6.3" xref="S3.p1.2.m2.5.5.6.3.cmml">′</mo></msup><mo id="S3.p1.2.m2.5.5.5" xref="S3.p1.2.m2.5.5.5.cmml">=</mo><mrow id="S3.p1.2.m2.5.5.4.4" xref="S3.p1.2.m2.5.5.4.5.cmml"><mo id="S3.p1.2.m2.5.5.4.4.5" stretchy="false" xref="S3.p1.2.m2.5.5.4.5.cmml">(</mo><msubsup id="S3.p1.2.m2.2.2.1.1.1" 
xref="S3.p1.2.m2.2.2.1.1.1.cmml"><mi id="S3.p1.2.m2.2.2.1.1.1.2.2" xref="S3.p1.2.m2.2.2.1.1.1.2.2.cmml">x</mi><mn id="S3.p1.2.m2.2.2.1.1.1.3" xref="S3.p1.2.m2.2.2.1.1.1.3.cmml">1</mn><mo id="S3.p1.2.m2.2.2.1.1.1.2.3" xref="S3.p1.2.m2.2.2.1.1.1.2.3.cmml">′</mo></msubsup><mo id="S3.p1.2.m2.5.5.4.4.6" xref="S3.p1.2.m2.5.5.4.5.cmml">,</mo><msubsup id="S3.p1.2.m2.3.3.2.2.2" xref="S3.p1.2.m2.3.3.2.2.2.cmml"><mi id="S3.p1.2.m2.3.3.2.2.2.2.2" xref="S3.p1.2.m2.3.3.2.2.2.2.2.cmml">x</mi><mn id="S3.p1.2.m2.3.3.2.2.2.3" xref="S3.p1.2.m2.3.3.2.2.2.3.cmml">2</mn><mo id="S3.p1.2.m2.3.3.2.2.2.2.3" xref="S3.p1.2.m2.3.3.2.2.2.2.3.cmml">′</mo></msubsup><mo id="S3.p1.2.m2.5.5.4.4.7" xref="S3.p1.2.m2.5.5.4.5.cmml">,</mo><mi id="S3.p1.2.m2.1.1" mathvariant="normal" xref="S3.p1.2.m2.1.1.cmml">…</mi><mo id="S3.p1.2.m2.5.5.4.4.8" xref="S3.p1.2.m2.5.5.4.5.cmml">,</mo><msubsup id="S3.p1.2.m2.4.4.3.3.3" xref="S3.p1.2.m2.4.4.3.3.3.cmml"><mi id="S3.p1.2.m2.4.4.3.3.3.2.2" xref="S3.p1.2.m2.4.4.3.3.3.2.2.cmml">x</mi><mi id="S3.p1.2.m2.4.4.3.3.3.3" xref="S3.p1.2.m2.4.4.3.3.3.3.cmml">j</mi><mo id="S3.p1.2.m2.4.4.3.3.3.2.3" xref="S3.p1.2.m2.4.4.3.3.3.2.3.cmml">′</mo></msubsup><mo id="S3.p1.2.m2.5.5.4.4.9" xref="S3.p1.2.m2.5.5.4.5.cmml">,</mo><mrow id="S3.p1.2.m2.5.5.4.4.4" xref="S3.p1.2.m2.5.5.4.4.4.cmml"><mi id="S3.p1.2.m2.5.5.4.4.4.2" mathvariant="normal" xref="S3.p1.2.m2.5.5.4.4.4.2.cmml">…</mi><mo id="S3.p1.2.m2.5.5.4.4.4.1" xref="S3.p1.2.m2.5.5.4.4.4.1.cmml">⁢</mo><msubsup id="S3.p1.2.m2.5.5.4.4.4.3" xref="S3.p1.2.m2.5.5.4.4.4.3.cmml"><mi id="S3.p1.2.m2.5.5.4.4.4.3.2.2" xref="S3.p1.2.m2.5.5.4.4.4.3.2.2.cmml">x</mi><mi id="S3.p1.2.m2.5.5.4.4.4.3.3" xref="S3.p1.2.m2.5.5.4.4.4.3.3.cmml">m</mi><mo id="S3.p1.2.m2.5.5.4.4.4.3.2.3" xref="S3.p1.2.m2.5.5.4.4.4.3.2.3.cmml">′</mo></msubsup></mrow><mo id="S3.p1.2.m2.5.5.4.4.10" stretchy="false" xref="S3.p1.2.m2.5.5.4.5.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.p1.2.m2.5b"><apply id="S3.p1.2.m2.5.5.cmml" 
xref="S3.p1.2.m2.5.5"><eq id="S3.p1.2.m2.5.5.5.cmml" xref="S3.p1.2.m2.5.5.5"></eq><apply id="S3.p1.2.m2.5.5.6.cmml" xref="S3.p1.2.m2.5.5.6"><csymbol cd="ambiguous" id="S3.p1.2.m2.5.5.6.1.cmml" xref="S3.p1.2.m2.5.5.6">superscript</csymbol><ci id="S3.p1.2.m2.5.5.6.2.cmml" xref="S3.p1.2.m2.5.5.6.2">𝑋</ci><ci id="S3.p1.2.m2.5.5.6.3.cmml" xref="S3.p1.2.m2.5.5.6.3">′</ci></apply><vector id="S3.p1.2.m2.5.5.4.5.cmml" xref="S3.p1.2.m2.5.5.4.4"><apply id="S3.p1.2.m2.2.2.1.1.1.cmml" xref="S3.p1.2.m2.2.2.1.1.1"><csymbol cd="ambiguous" id="S3.p1.2.m2.2.2.1.1.1.1.cmml" xref="S3.p1.2.m2.2.2.1.1.1">subscript</csymbol><apply id="S3.p1.2.m2.2.2.1.1.1.2.cmml" xref="S3.p1.2.m2.2.2.1.1.1"><csymbol cd="ambiguous" id="S3.p1.2.m2.2.2.1.1.1.2.1.cmml" xref="S3.p1.2.m2.2.2.1.1.1">superscript</csymbol><ci id="S3.p1.2.m2.2.2.1.1.1.2.2.cmml" xref="S3.p1.2.m2.2.2.1.1.1.2.2">𝑥</ci><ci id="S3.p1.2.m2.2.2.1.1.1.2.3.cmml" xref="S3.p1.2.m2.2.2.1.1.1.2.3">′</ci></apply><cn id="S3.p1.2.m2.2.2.1.1.1.3.cmml" type="integer" xref="S3.p1.2.m2.2.2.1.1.1.3">1</cn></apply><apply id="S3.p1.2.m2.3.3.2.2.2.cmml" xref="S3.p1.2.m2.3.3.2.2.2"><csymbol cd="ambiguous" id="S3.p1.2.m2.3.3.2.2.2.1.cmml" xref="S3.p1.2.m2.3.3.2.2.2">subscript</csymbol><apply id="S3.p1.2.m2.3.3.2.2.2.2.cmml" xref="S3.p1.2.m2.3.3.2.2.2"><csymbol cd="ambiguous" id="S3.p1.2.m2.3.3.2.2.2.2.1.cmml" xref="S3.p1.2.m2.3.3.2.2.2">superscript</csymbol><ci id="S3.p1.2.m2.3.3.2.2.2.2.2.cmml" xref="S3.p1.2.m2.3.3.2.2.2.2.2">𝑥</ci><ci id="S3.p1.2.m2.3.3.2.2.2.2.3.cmml" xref="S3.p1.2.m2.3.3.2.2.2.2.3">′</ci></apply><cn id="S3.p1.2.m2.3.3.2.2.2.3.cmml" type="integer" xref="S3.p1.2.m2.3.3.2.2.2.3">2</cn></apply><ci id="S3.p1.2.m2.1.1.cmml" xref="S3.p1.2.m2.1.1">…</ci><apply id="S3.p1.2.m2.4.4.3.3.3.cmml" xref="S3.p1.2.m2.4.4.3.3.3"><csymbol cd="ambiguous" id="S3.p1.2.m2.4.4.3.3.3.1.cmml" xref="S3.p1.2.m2.4.4.3.3.3">subscript</csymbol><apply id="S3.p1.2.m2.4.4.3.3.3.2.cmml" xref="S3.p1.2.m2.4.4.3.3.3"><csymbol cd="ambiguous" 
id="S3.p1.2.m2.4.4.3.3.3.2.1.cmml" xref="S3.p1.2.m2.4.4.3.3.3">superscript</csymbol><ci id="S3.p1.2.m2.4.4.3.3.3.2.2.cmml" xref="S3.p1.2.m2.4.4.3.3.3.2.2">𝑥</ci><ci id="S3.p1.2.m2.4.4.3.3.3.2.3.cmml" xref="S3.p1.2.m2.4.4.3.3.3.2.3">′</ci></apply><ci id="S3.p1.2.m2.4.4.3.3.3.3.cmml" xref="S3.p1.2.m2.4.4.3.3.3.3">𝑗</ci></apply><apply id="S3.p1.2.m2.5.5.4.4.4.cmml" xref="S3.p1.2.m2.5.5.4.4.4"><times id="S3.p1.2.m2.5.5.4.4.4.1.cmml" xref="S3.p1.2.m2.5.5.4.4.4.1"></times><ci id="S3.p1.2.m2.5.5.4.4.4.2.cmml" xref="S3.p1.2.m2.5.5.4.4.4.2">…</ci><apply id="S3.p1.2.m2.5.5.4.4.4.3.cmml" xref="S3.p1.2.m2.5.5.4.4.4.3"><csymbol cd="ambiguous" id="S3.p1.2.m2.5.5.4.4.4.3.1.cmml" xref="S3.p1.2.m2.5.5.4.4.4.3">subscript</csymbol><apply id="S3.p1.2.m2.5.5.4.4.4.3.2.cmml" xref="S3.p1.2.m2.5.5.4.4.4.3"><csymbol cd="ambiguous" id="S3.p1.2.m2.5.5.4.4.4.3.2.1.cmml" xref="S3.p1.2.m2.5.5.4.4.4.3">superscript</csymbol><ci id="S3.p1.2.m2.5.5.4.4.4.3.2.2.cmml" xref="S3.p1.2.m2.5.5.4.4.4.3.2.2">𝑥</ci><ci id="S3.p1.2.m2.5.5.4.4.4.3.2.3.cmml" xref="S3.p1.2.m2.5.5.4.4.4.3.2.3">′</ci></apply><ci id="S3.p1.2.m2.5.5.4.4.4.3.3.cmml" xref="S3.p1.2.m2.5.5.4.4.4.3.3">𝑚</ci></apply></apply></vector></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.p1.2.m2.5c">X^{\prime}=(x^{\prime}_{1},x^{\prime}_{2},\ldots,x^{\prime}_{j},\ldots x^{% \prime}_{m})</annotation><annotation encoding="application/x-llamapun" id="S3.p1.2.m2.5d">italic_X start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , … , italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , … italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT )</annotation></semantics></math>, where <math alttext="j" class="ltx_Math" display="inline" 
id="S3.p1.3.m3.1"><semantics id="S3.p1.3.m3.1a"><mi id="S3.p1.3.m3.1.1" xref="S3.p1.3.m3.1.1.cmml">j</mi><annotation-xml encoding="MathML-Content" id="S3.p1.3.m3.1b"><ci id="S3.p1.3.m3.1.1.cmml" xref="S3.p1.3.m3.1.1">𝑗</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.p1.3.m3.1c">j</annotation><annotation encoding="application/x-llamapun" id="S3.p1.3.m3.1d">italic_j</annotation></semantics></math> and <math alttext="m" class="ltx_Math" display="inline" id="S3.p1.4.m4.1"><semantics id="S3.p1.4.m4.1a"><mi id="S3.p1.4.m4.1.1" xref="S3.p1.4.m4.1.1.cmml">m</mi><annotation-xml encoding="MathML-Content" id="S3.p1.4.m4.1b"><ci id="S3.p1.4.m4.1.1.cmml" xref="S3.p1.4.m4.1.1">𝑚</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.p1.4.m4.1c">m</annotation><annotation encoding="application/x-llamapun" id="S3.p1.4.m4.1d">italic_m</annotation></semantics></math> represent the index and number of words in the source text, respectively. The output is a summary sequence of the patent text <math alttext="Y=(y_{1},y_{2},\ldots,y_{i},\ldots y_{n})" class="ltx_Math" display="inline" id="S3.p1.5.m5.5"><semantics id="S3.p1.5.m5.5a"><mrow id="S3.p1.5.m5.5.5" xref="S3.p1.5.m5.5.5.cmml"><mi id="S3.p1.5.m5.5.5.6" xref="S3.p1.5.m5.5.5.6.cmml">Y</mi><mo id="S3.p1.5.m5.5.5.5" xref="S3.p1.5.m5.5.5.5.cmml">=</mo><mrow id="S3.p1.5.m5.5.5.4.4" xref="S3.p1.5.m5.5.5.4.5.cmml"><mo id="S3.p1.5.m5.5.5.4.4.5" stretchy="false" xref="S3.p1.5.m5.5.5.4.5.cmml">(</mo><msub id="S3.p1.5.m5.2.2.1.1.1" xref="S3.p1.5.m5.2.2.1.1.1.cmml"><mi id="S3.p1.5.m5.2.2.1.1.1.2" xref="S3.p1.5.m5.2.2.1.1.1.2.cmml">y</mi><mn id="S3.p1.5.m5.2.2.1.1.1.3" xref="S3.p1.5.m5.2.2.1.1.1.3.cmml">1</mn></msub><mo id="S3.p1.5.m5.5.5.4.4.6" xref="S3.p1.5.m5.5.5.4.5.cmml">,</mo><msub id="S3.p1.5.m5.3.3.2.2.2" xref="S3.p1.5.m5.3.3.2.2.2.cmml"><mi id="S3.p1.5.m5.3.3.2.2.2.2" xref="S3.p1.5.m5.3.3.2.2.2.2.cmml">y</mi><mn id="S3.p1.5.m5.3.3.2.2.2.3" xref="S3.p1.5.m5.3.3.2.2.2.3.cmml">2</mn></msub><mo 
id="S3.p1.5.m5.5.5.4.4.7" xref="S3.p1.5.m5.5.5.4.5.cmml">,</mo><mi id="S3.p1.5.m5.1.1" mathvariant="normal" xref="S3.p1.5.m5.1.1.cmml">…</mi><mo id="S3.p1.5.m5.5.5.4.4.8" xref="S3.p1.5.m5.5.5.4.5.cmml">,</mo><msub id="S3.p1.5.m5.4.4.3.3.3" xref="S3.p1.5.m5.4.4.3.3.3.cmml"><mi id="S3.p1.5.m5.4.4.3.3.3.2" xref="S3.p1.5.m5.4.4.3.3.3.2.cmml">y</mi><mi id="S3.p1.5.m5.4.4.3.3.3.3" xref="S3.p1.5.m5.4.4.3.3.3.3.cmml">i</mi></msub><mo id="S3.p1.5.m5.5.5.4.4.9" xref="S3.p1.5.m5.5.5.4.5.cmml">,</mo><mrow id="S3.p1.5.m5.5.5.4.4.4" xref="S3.p1.5.m5.5.5.4.4.4.cmml"><mi id="S3.p1.5.m5.5.5.4.4.4.2" mathvariant="normal" xref="S3.p1.5.m5.5.5.4.4.4.2.cmml">…</mi><mo id="S3.p1.5.m5.5.5.4.4.4.1" xref="S3.p1.5.m5.5.5.4.4.4.1.cmml">⁢</mo><msub id="S3.p1.5.m5.5.5.4.4.4.3" xref="S3.p1.5.m5.5.5.4.4.4.3.cmml"><mi id="S3.p1.5.m5.5.5.4.4.4.3.2" xref="S3.p1.5.m5.5.5.4.4.4.3.2.cmml">y</mi><mi id="S3.p1.5.m5.5.5.4.4.4.3.3" xref="S3.p1.5.m5.5.5.4.4.4.3.3.cmml">n</mi></msub></mrow><mo id="S3.p1.5.m5.5.5.4.4.10" stretchy="false" xref="S3.p1.5.m5.5.5.4.5.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.p1.5.m5.5b"><apply id="S3.p1.5.m5.5.5.cmml" xref="S3.p1.5.m5.5.5"><eq id="S3.p1.5.m5.5.5.5.cmml" xref="S3.p1.5.m5.5.5.5"></eq><ci id="S3.p1.5.m5.5.5.6.cmml" xref="S3.p1.5.m5.5.5.6">𝑌</ci><vector id="S3.p1.5.m5.5.5.4.5.cmml" xref="S3.p1.5.m5.5.5.4.4"><apply id="S3.p1.5.m5.2.2.1.1.1.cmml" xref="S3.p1.5.m5.2.2.1.1.1"><csymbol cd="ambiguous" id="S3.p1.5.m5.2.2.1.1.1.1.cmml" xref="S3.p1.5.m5.2.2.1.1.1">subscript</csymbol><ci id="S3.p1.5.m5.2.2.1.1.1.2.cmml" xref="S3.p1.5.m5.2.2.1.1.1.2">𝑦</ci><cn id="S3.p1.5.m5.2.2.1.1.1.3.cmml" type="integer" xref="S3.p1.5.m5.2.2.1.1.1.3">1</cn></apply><apply id="S3.p1.5.m5.3.3.2.2.2.cmml" xref="S3.p1.5.m5.3.3.2.2.2"><csymbol cd="ambiguous" id="S3.p1.5.m5.3.3.2.2.2.1.cmml" xref="S3.p1.5.m5.3.3.2.2.2">subscript</csymbol><ci id="S3.p1.5.m5.3.3.2.2.2.2.cmml" xref="S3.p1.5.m5.3.3.2.2.2.2">𝑦</ci><cn id="S3.p1.5.m5.3.3.2.2.2.3.cmml" type="integer" 
xref="S3.p1.5.m5.3.3.2.2.2.3">2</cn></apply><ci id="S3.p1.5.m5.1.1.cmml" xref="S3.p1.5.m5.1.1">…</ci><apply id="S3.p1.5.m5.4.4.3.3.3.cmml" xref="S3.p1.5.m5.4.4.3.3.3"><csymbol cd="ambiguous" id="S3.p1.5.m5.4.4.3.3.3.1.cmml" xref="S3.p1.5.m5.4.4.3.3.3">subscript</csymbol><ci id="S3.p1.5.m5.4.4.3.3.3.2.cmml" xref="S3.p1.5.m5.4.4.3.3.3.2">𝑦</ci><ci id="S3.p1.5.m5.4.4.3.3.3.3.cmml" xref="S3.p1.5.m5.4.4.3.3.3.3">𝑖</ci></apply><apply id="S3.p1.5.m5.5.5.4.4.4.cmml" xref="S3.p1.5.m5.5.5.4.4.4"><times id="S3.p1.5.m5.5.5.4.4.4.1.cmml" xref="S3.p1.5.m5.5.5.4.4.4.1"></times><ci id="S3.p1.5.m5.5.5.4.4.4.2.cmml" xref="S3.p1.5.m5.5.5.4.4.4.2">…</ci><apply id="S3.p1.5.m5.5.5.4.4.4.3.cmml" xref="S3.p1.5.m5.5.5.4.4.4.3"><csymbol cd="ambiguous" id="S3.p1.5.m5.5.5.4.4.4.3.1.cmml" xref="S3.p1.5.m5.5.5.4.4.4.3">subscript</csymbol><ci id="S3.p1.5.m5.5.5.4.4.4.3.2.cmml" xref="S3.p1.5.m5.5.5.4.4.4.3.2">𝑦</ci><ci id="S3.p1.5.m5.5.5.4.4.4.3.3.cmml" xref="S3.p1.5.m5.5.5.4.4.4.3.3">𝑛</ci></apply></apply></vector></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.p1.5.m5.5c">Y=(y_{1},y_{2},\ldots,y_{i},\ldots y_{n})</annotation><annotation encoding="application/x-llamapun" id="S3.p1.5.m5.5d">italic_Y = ( italic_y start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_y start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , … , italic_y start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , … italic_y start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT )</annotation></semantics></math>, where <math alttext="i" class="ltx_Math" display="inline" id="S3.p1.6.m6.1"><semantics id="S3.p1.6.m6.1a"><mi id="S3.p1.6.m6.1.1" xref="S3.p1.6.m6.1.1.cmml">i</mi><annotation-xml encoding="MathML-Content" id="S3.p1.6.m6.1b"><ci id="S3.p1.6.m6.1.1.cmml" xref="S3.p1.6.m6.1.1">𝑖</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.p1.6.m6.1c">i</annotation><annotation encoding="application/x-llamapun" id="S3.p1.6.m6.1d">italic_i</annotation></semantics></math> and <math alttext="n" class="ltx_Math" 
display="inline" id="S3.p1.7.m7.1"><semantics id="S3.p1.7.m7.1a"><mi id="S3.p1.7.m7.1.1" xref="S3.p1.7.m7.1.1.cmml">n</mi><annotation-xml encoding="MathML-Content" id="S3.p1.7.m7.1b"><ci id="S3.p1.7.m7.1.1.cmml" xref="S3.p1.7.m7.1.1">𝑛</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.p1.7.m7.1c">n</annotation><annotation encoding="application/x-llamapun" id="S3.p1.7.m7.1d">italic_n</annotation></semantics></math> represent the index and number of words in the summary text, respectively.</p> </div> <figure class="ltx_figure" id="S3.F1"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="353" id="S3.F1.g1" src="x1.png" width="830"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure">Figure 1: </span>The overall architecture of the MSEA model, which has a master encoder, a slave encoder, and a decoder with an attention mechanism.</figcaption> </figure> <div class="ltx_para" id="S3.p2"> <p class="ltx_p" id="S3.p2.1">This section describes the designed master-slave encoder model in detail. 
Figure <a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S3.F1" title="Figure 1 ‣ 3 Methodology ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_tag">1</span></a> shows the model structure, which consists of a master encoder, a slave encoder, and a decoder with an attention mechanism:</p> </div> <div class="ltx_para" id="S3.p3"> <p class="ltx_p" id="S3.p3.1">The master encoder computes a semantic vector for each word in the specifications of the input patent text.</p> </div> <div class="ltx_para" id="S3.p4"> <p class="ltx_p" id="S3.p4.1">The slave encoder first computes an importance weight for each word in the specifications of the input patent text, then adds the words from the claims of the patent text, and finally recomputes the corresponding semantic vectors.</p> </div> <div class="ltx_para" id="S3.p5"> <p class="ltx_p" id="S3.p5.1">The decoder uses an attention mechanism and decodes in stages, producing a fixed-length part of the output sequence at each stage.</p> </div> <section class="ltx_subsection" id="S3.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">3.1 </span>Master Encoder</h3> <div class="ltx_para" id="S3.SS1.p1"> <p class="ltx_p" id="S3.SS1.p1.7">In this paper, a gated recurrent unit (GRU) is used to adaptively capture dependencies at different time scales. 
As shown in Figure <a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S3.F1" title="Figure 1 ‣ 3 Methodology ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_tag">1</span></a>, the master encoder can be described by the following equations:</p> <table class="ltx_equation ltx_eqn_table" id="S3.E1"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="\begin{cases}u_{t}=\sigma(W_{u}[x_{t},h_{t-1}])\\ r_{t}=\sigma(W_{r}[x_{t},h_{t-1}])\\ h_{t}^{\prime}=\tanh(W_{h}[x_{t},r_{t}\odot h_{t-1}])\\ h_{t}=(1-u_{t})\odot h_{t-1}+u_{t}\odot h_{t}^{\prime}\end{cases}" class="ltx_Math" display="block" id="S3.E1.m1.4"><semantics id="S3.E1.m1.4a"><mrow id="S3.E1.m1.4.4" xref="S3.E1.m1.4.5.1.cmml"><mo id="S3.E1.m1.4.4.5" xref="S3.E1.m1.4.5.1.1.cmml">{</mo><mtable columnspacing="5pt" displaystyle="true" id="S3.E1.m1.4.4.4" rowspacing="0pt" xref="S3.E1.m1.4.5.1.cmml"><mtr id="S3.E1.m1.4.4.4a" xref="S3.E1.m1.4.5.1.cmml"><mtd class="ltx_align_left" columnalign="left" id="S3.E1.m1.4.4.4b" xref="S3.E1.m1.4.5.1.cmml"><mrow id="S3.E1.m1.1.1.1.1.1.1" xref="S3.E1.m1.1.1.1.1.1.1.cmml"><msub id="S3.E1.m1.1.1.1.1.1.1.3" xref="S3.E1.m1.1.1.1.1.1.1.3.cmml"><mi id="S3.E1.m1.1.1.1.1.1.1.3.2" xref="S3.E1.m1.1.1.1.1.1.1.3.2.cmml">u</mi><mi id="S3.E1.m1.1.1.1.1.1.1.3.3" xref="S3.E1.m1.1.1.1.1.1.1.3.3.cmml">t</mi></msub><mo id="S3.E1.m1.1.1.1.1.1.1.2" xref="S3.E1.m1.1.1.1.1.1.1.2.cmml">=</mo><mrow id="S3.E1.m1.1.1.1.1.1.1.1" xref="S3.E1.m1.1.1.1.1.1.1.1.cmml"><mi id="S3.E1.m1.1.1.1.1.1.1.1.3" xref="S3.E1.m1.1.1.1.1.1.1.1.3.cmml">σ</mi><mo id="S3.E1.m1.1.1.1.1.1.1.1.2" xref="S3.E1.m1.1.1.1.1.1.1.1.2.cmml">⁢</mo><mrow id="S3.E1.m1.1.1.1.1.1.1.1.1.1" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.cmml"><mo id="S3.E1.m1.1.1.1.1.1.1.1.1.1.2" stretchy="false" 
xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.cmml">(</mo><mrow id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.cmml"><msub id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.4" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.4.cmml"><mi id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.4.2" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.4.2.cmml">W</mi><mi id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.4.3" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.4.3.cmml">u</mi></msub><mo id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.3" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.3.cmml">⁢</mo><mrow id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.3.cmml"><mo id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.3" stretchy="false" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.3.cmml">[</mo><msub id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.1.1.1" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml"><mi id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.2" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml">x</mi><mi id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.3" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.cmml">t</mi></msub><mo id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.4" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.3.cmml">,</mo><msub id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.cmml"><mi id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.2" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.2.cmml">h</mi><mrow id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.3" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.3.cmml"><mi id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.3.2" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.3.2.cmml">t</mi><mo id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.3.1" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.3.1.cmml">−</mo><mn id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.3.3" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.3.3.cmml">1</mn></mrow></msub><mo id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.5" stretchy="false" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.3.cmml">]</mo></mrow></mrow><mo id="S3.E1.m1.1.1.1.1.1.1.1.1.1.3" stretchy="false" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.cmml">)</mo></mrow></mrow></mrow></mtd><mtd id="S3.E1.m1.4.4.4c" 
xref="S3.E1.m1.4.5.1.1.cmml"></mtd></mtr><mtr id="S3.E1.m1.4.4.4d" xref="S3.E1.m1.4.5.1.cmml"><mtd class="ltx_align_left" columnalign="left" id="S3.E1.m1.4.4.4e" xref="S3.E1.m1.4.5.1.cmml"><mrow id="S3.E1.m1.2.2.2.2.1.1" xref="S3.E1.m1.2.2.2.2.1.1.cmml"><msub id="S3.E1.m1.2.2.2.2.1.1.3" xref="S3.E1.m1.2.2.2.2.1.1.3.cmml"><mi id="S3.E1.m1.2.2.2.2.1.1.3.2" xref="S3.E1.m1.2.2.2.2.1.1.3.2.cmml">r</mi><mi id="S3.E1.m1.2.2.2.2.1.1.3.3" xref="S3.E1.m1.2.2.2.2.1.1.3.3.cmml">t</mi></msub><mo id="S3.E1.m1.2.2.2.2.1.1.2" xref="S3.E1.m1.2.2.2.2.1.1.2.cmml">=</mo><mrow id="S3.E1.m1.2.2.2.2.1.1.1" xref="S3.E1.m1.2.2.2.2.1.1.1.cmml"><mi id="S3.E1.m1.2.2.2.2.1.1.1.3" xref="S3.E1.m1.2.2.2.2.1.1.1.3.cmml">σ</mi><mo id="S3.E1.m1.2.2.2.2.1.1.1.2" xref="S3.E1.m1.2.2.2.2.1.1.1.2.cmml">⁢</mo><mrow id="S3.E1.m1.2.2.2.2.1.1.1.1.1" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.cmml"><mo id="S3.E1.m1.2.2.2.2.1.1.1.1.1.2" stretchy="false" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.cmml">(</mo><mrow id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.cmml"><msub id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.4" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.4.cmml"><mi id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.4.2" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.4.2.cmml">W</mi><mi id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.4.3" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.4.3.cmml">r</mi></msub><mo id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.3" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.3.cmml">⁢</mo><mrow id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.3.cmml"><mo id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.3" stretchy="false" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.3.cmml">[</mo><msub id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.1.1.1" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.1.1.1.cmml"><mi id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.1.1.1.2" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.1.1.1.2.cmml">x</mi><mi id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.1.1.1.3" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.1.1.1.3.cmml">t</mi></msub><mo id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.4" 
xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.3.cmml">,</mo><msub id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.cmml"><mi id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.2" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.2.cmml">h</mi><mrow id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.3" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.3.cmml"><mi id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.3.2" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.3.2.cmml">t</mi><mo id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.3.1" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.3.1.cmml">−</mo><mn id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.3.3" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.3.3.cmml">1</mn></mrow></msub><mo id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.5" stretchy="false" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.3.cmml">]</mo></mrow></mrow><mo id="S3.E1.m1.2.2.2.2.1.1.1.1.1.3" stretchy="false" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.cmml">)</mo></mrow></mrow></mrow></mtd><mtd id="S3.E1.m1.4.4.4f" xref="S3.E1.m1.4.5.1.1.cmml"></mtd></mtr><mtr id="S3.E1.m1.4.4.4g" xref="S3.E1.m1.4.5.1.cmml"><mtd class="ltx_align_left" columnalign="left" id="S3.E1.m1.4.4.4h" xref="S3.E1.m1.4.5.1.cmml"><mrow id="S3.E1.m1.3.3.3.3.1.1" xref="S3.E1.m1.3.3.3.3.1.1.cmml"><msubsup id="S3.E1.m1.3.3.3.3.1.1.4" xref="S3.E1.m1.3.3.3.3.1.1.4.cmml"><mi id="S3.E1.m1.3.3.3.3.1.1.4.2.2" xref="S3.E1.m1.3.3.3.3.1.1.4.2.2.cmml">h</mi><mi id="S3.E1.m1.3.3.3.3.1.1.4.2.3" xref="S3.E1.m1.3.3.3.3.1.1.4.2.3.cmml">t</mi><mo id="S3.E1.m1.3.3.3.3.1.1.4.3" xref="S3.E1.m1.3.3.3.3.1.1.4.3.cmml">′</mo></msubsup><mo id="S3.E1.m1.3.3.3.3.1.1.3" xref="S3.E1.m1.3.3.3.3.1.1.3.cmml">=</mo><mrow id="S3.E1.m1.3.3.3.3.1.1.2.1" xref="S3.E1.m1.3.3.3.3.1.1.2.2.cmml"><mi id="S3.E1.m1.3.3.3.3.1.1.1" xref="S3.E1.m1.3.3.3.3.1.1.1.cmml">tanh</mi><mo id="S3.E1.m1.3.3.3.3.1.1.2.1a" xref="S3.E1.m1.3.3.3.3.1.1.2.2.cmml">⁡</mo><mrow id="S3.E1.m1.3.3.3.3.1.1.2.1.1" xref="S3.E1.m1.3.3.3.3.1.1.2.2.cmml"><mo id="S3.E1.m1.3.3.3.3.1.1.2.1.1.2" stretchy="false" 
xref="S3.E1.m1.3.3.3.3.1.1.2.2.cmml">(</mo><mrow id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.cmml"><msub id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.4" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.4.cmml"><mi id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.4.2" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.4.2.cmml">W</mi><mi id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.4.3" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.4.3.cmml">h</mi></msub><mo id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.3" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.3.cmml">⁢</mo><mrow id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.3.cmml"><mo id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.3" stretchy="false" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.3.cmml">[</mo><msub id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.1.1.1" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.1.1.1.cmml"><mi id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.1.1.1.2" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.1.1.1.2.cmml">x</mi><mi id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.1.1.1.3" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.1.1.1.3.cmml">t</mi></msub><mo id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.4" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.3.cmml">,</mo><mrow id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.cmml"><msub id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.2" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.2.cmml"><mi id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.2.2" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.2.2.cmml">r</mi><mi id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.2.3" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.2.3.cmml">t</mi></msub><mo id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.1" lspace="0.222em" rspace="0.222em" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.1.cmml">⊙</mo><msub id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.cmml"><mi id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.2" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.2.cmml">h</mi><mrow id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.3" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.3.cmml"><mi id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.3.2" 
xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.3.2.cmml">t</mi><mo id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.3.1" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.3.1.cmml">−</mo><mn id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.3.3" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.3.3.cmml">1</mn></mrow></msub></mrow><mo id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.5" stretchy="false" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.3.cmml">]</mo></mrow></mrow><mo id="S3.E1.m1.3.3.3.3.1.1.2.1.1.3" stretchy="false" xref="S3.E1.m1.3.3.3.3.1.1.2.2.cmml">)</mo></mrow></mrow></mrow></mtd><mtd id="S3.E1.m1.4.4.4i" xref="S3.E1.m1.4.5.1.1.cmml"></mtd></mtr><mtr id="S3.E1.m1.4.4.4j" xref="S3.E1.m1.4.5.1.cmml"><mtd class="ltx_align_left" columnalign="left" id="S3.E1.m1.4.4.4k" xref="S3.E1.m1.4.5.1.cmml"><mrow id="S3.E1.m1.4.4.4.4.1.1" xref="S3.E1.m1.4.4.4.4.1.1.cmml"><msub id="S3.E1.m1.4.4.4.4.1.1.3" xref="S3.E1.m1.4.4.4.4.1.1.3.cmml"><mi id="S3.E1.m1.4.4.4.4.1.1.3.2" xref="S3.E1.m1.4.4.4.4.1.1.3.2.cmml">h</mi><mi id="S3.E1.m1.4.4.4.4.1.1.3.3" xref="S3.E1.m1.4.4.4.4.1.1.3.3.cmml">t</mi></msub><mo id="S3.E1.m1.4.4.4.4.1.1.2" xref="S3.E1.m1.4.4.4.4.1.1.2.cmml">=</mo><mrow id="S3.E1.m1.4.4.4.4.1.1.1" xref="S3.E1.m1.4.4.4.4.1.1.1.cmml"><mrow id="S3.E1.m1.4.4.4.4.1.1.1.1" xref="S3.E1.m1.4.4.4.4.1.1.1.1.cmml"><mrow id="S3.E1.m1.4.4.4.4.1.1.1.1.1.1" xref="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.cmml"><mo id="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.2" stretchy="false" xref="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.cmml">(</mo><mrow id="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1" xref="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.cmml"><mn id="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.2" xref="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.2.cmml">1</mn><mo id="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.1" xref="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.1.cmml">−</mo><msub id="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.3" xref="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.3.cmml"><mi id="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.3.2" xref="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.3.2.cmml">u</mi><mi id="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.3.3" 
xref="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.3.3.cmml">t</mi></msub></mrow><mo id="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.3" rspace="0.055em" stretchy="false" xref="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.cmml">)</mo></mrow><mo id="S3.E1.m1.4.4.4.4.1.1.1.1.2" rspace="0.222em" xref="S3.E1.m1.4.4.4.4.1.1.1.1.2.cmml">⊙</mo><msub id="S3.E1.m1.4.4.4.4.1.1.1.1.3" xref="S3.E1.m1.4.4.4.4.1.1.1.1.3.cmml"><mi id="S3.E1.m1.4.4.4.4.1.1.1.1.3.2" xref="S3.E1.m1.4.4.4.4.1.1.1.1.3.2.cmml">h</mi><mrow id="S3.E1.m1.4.4.4.4.1.1.1.1.3.3" xref="S3.E1.m1.4.4.4.4.1.1.1.1.3.3.cmml"><mi id="S3.E1.m1.4.4.4.4.1.1.1.1.3.3.2" xref="S3.E1.m1.4.4.4.4.1.1.1.1.3.3.2.cmml">t</mi><mo id="S3.E1.m1.4.4.4.4.1.1.1.1.3.3.1" xref="S3.E1.m1.4.4.4.4.1.1.1.1.3.3.1.cmml">−</mo><mn id="S3.E1.m1.4.4.4.4.1.1.1.1.3.3.3" xref="S3.E1.m1.4.4.4.4.1.1.1.1.3.3.3.cmml">1</mn></mrow></msub></mrow><mo id="S3.E1.m1.4.4.4.4.1.1.1.2" xref="S3.E1.m1.4.4.4.4.1.1.1.2.cmml">+</mo><mrow id="S3.E1.m1.4.4.4.4.1.1.1.3" xref="S3.E1.m1.4.4.4.4.1.1.1.3.cmml"><msub id="S3.E1.m1.4.4.4.4.1.1.1.3.2" xref="S3.E1.m1.4.4.4.4.1.1.1.3.2.cmml"><mi id="S3.E1.m1.4.4.4.4.1.1.1.3.2.2" xref="S3.E1.m1.4.4.4.4.1.1.1.3.2.2.cmml">u</mi><mi id="S3.E1.m1.4.4.4.4.1.1.1.3.2.3" xref="S3.E1.m1.4.4.4.4.1.1.1.3.2.3.cmml">t</mi></msub><mo id="S3.E1.m1.4.4.4.4.1.1.1.3.1" lspace="0.222em" rspace="0.222em" xref="S3.E1.m1.4.4.4.4.1.1.1.3.1.cmml">⊙</mo><msubsup id="S3.E1.m1.4.4.4.4.1.1.1.3.3" xref="S3.E1.m1.4.4.4.4.1.1.1.3.3.cmml"><mi id="S3.E1.m1.4.4.4.4.1.1.1.3.3.2.2" xref="S3.E1.m1.4.4.4.4.1.1.1.3.3.2.2.cmml">h</mi><mi id="S3.E1.m1.4.4.4.4.1.1.1.3.3.2.3" xref="S3.E1.m1.4.4.4.4.1.1.1.3.3.2.3.cmml">t</mi><mo id="S3.E1.m1.4.4.4.4.1.1.1.3.3.3" xref="S3.E1.m1.4.4.4.4.1.1.1.3.3.3.cmml">′</mo></msubsup></mrow></mrow></mrow></mtd><mtd id="S3.E1.m1.4.4.4l" xref="S3.E1.m1.4.5.1.1.cmml"></mtd></mtr></mtable></mrow><annotation-xml encoding="MathML-Content" id="S3.E1.m1.4b"><apply id="S3.E1.m1.4.5.1.cmml" xref="S3.E1.m1.4.4"><csymbol cd="latexml" id="S3.E1.m1.4.5.1.1.cmml" 
xref="S3.E1.m1.4.4.5">cases</csymbol><apply id="S3.E1.m1.1.1.1.1.1.1.cmml" xref="S3.E1.m1.1.1.1.1.1.1"><eq id="S3.E1.m1.1.1.1.1.1.1.2.cmml" xref="S3.E1.m1.1.1.1.1.1.1.2"></eq><apply id="S3.E1.m1.1.1.1.1.1.1.3.cmml" xref="S3.E1.m1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S3.E1.m1.1.1.1.1.1.1.3.1.cmml" xref="S3.E1.m1.1.1.1.1.1.1.3">subscript</csymbol><ci id="S3.E1.m1.1.1.1.1.1.1.3.2.cmml" xref="S3.E1.m1.1.1.1.1.1.1.3.2">𝑢</ci><ci id="S3.E1.m1.1.1.1.1.1.1.3.3.cmml" xref="S3.E1.m1.1.1.1.1.1.1.3.3">𝑡</ci></apply><apply id="S3.E1.m1.1.1.1.1.1.1.1.cmml" xref="S3.E1.m1.1.1.1.1.1.1.1"><times id="S3.E1.m1.1.1.1.1.1.1.1.2.cmml" xref="S3.E1.m1.1.1.1.1.1.1.1.2"></times><ci id="S3.E1.m1.1.1.1.1.1.1.1.3.cmml" xref="S3.E1.m1.1.1.1.1.1.1.1.3">𝜎</ci><apply id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1"><times id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.3"></times><apply id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.4.cmml" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.4"><csymbol cd="ambiguous" id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.4.1.cmml" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.4">subscript</csymbol><ci id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.4.2.cmml" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.4.2">𝑊</ci><ci id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.4.3.cmml" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.4.3">𝑢</ci></apply><interval closure="closed" id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.3.cmml" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2"><apply id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.1.1.1">subscript</csymbol><ci id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.2">𝑥</ci><ci id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.3">𝑡</ci></apply><apply id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.cmml" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2"><csymbol cd="ambiguous" 
id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.1.cmml" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2">subscript</csymbol><ci id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.2.cmml" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.2">ℎ</ci><apply id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.3.cmml" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.3"><minus id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.3.1.cmml" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.3.1"></minus><ci id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.3.2.cmml" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.3.2">𝑡</ci><cn id="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.3.3.cmml" type="integer" xref="S3.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2.3.3">1</cn></apply></apply></interval></apply></apply></apply><ci id="S3.E1.m1.4.5.1.3a.cmml" xref="S3.E1.m1.4.4"><mtext class="ltx_mathvariant_italic" id="S3.E1.m1.4.5.1.3.cmml" xref="S3.E1.m1.4.4.5">otherwise</mtext></ci><apply id="S3.E1.m1.2.2.2.2.1.1.cmml" xref="S3.E1.m1.2.2.2.2.1.1"><eq id="S3.E1.m1.2.2.2.2.1.1.2.cmml" xref="S3.E1.m1.2.2.2.2.1.1.2"></eq><apply id="S3.E1.m1.2.2.2.2.1.1.3.cmml" xref="S3.E1.m1.2.2.2.2.1.1.3"><csymbol cd="ambiguous" id="S3.E1.m1.2.2.2.2.1.1.3.1.cmml" xref="S3.E1.m1.2.2.2.2.1.1.3">subscript</csymbol><ci id="S3.E1.m1.2.2.2.2.1.1.3.2.cmml" xref="S3.E1.m1.2.2.2.2.1.1.3.2">𝑟</ci><ci id="S3.E1.m1.2.2.2.2.1.1.3.3.cmml" xref="S3.E1.m1.2.2.2.2.1.1.3.3">𝑡</ci></apply><apply id="S3.E1.m1.2.2.2.2.1.1.1.cmml" xref="S3.E1.m1.2.2.2.2.1.1.1"><times id="S3.E1.m1.2.2.2.2.1.1.1.2.cmml" xref="S3.E1.m1.2.2.2.2.1.1.1.2"></times><ci id="S3.E1.m1.2.2.2.2.1.1.1.3.cmml" xref="S3.E1.m1.2.2.2.2.1.1.1.3">𝜎</ci><apply id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.cmml" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1"><times id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.3.cmml" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.3"></times><apply id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.4.cmml" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.4"><csymbol cd="ambiguous" id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.4.1.cmml" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.4">subscript</csymbol><ci id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.4.2.cmml" 
xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.4.2">𝑊</ci><ci id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.4.3.cmml" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.4.3">𝑟</ci></apply><interval closure="closed" id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.3.cmml" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2"><apply id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.1.1.1.cmml" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.1.1.1.1.cmml" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.1.1.1">subscript</csymbol><ci id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.1.1.1.2.cmml" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.1.1.1.2">𝑥</ci><ci id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.1.1.1.3.cmml" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.1.1.1.3">𝑡</ci></apply><apply id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.cmml" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2"><csymbol cd="ambiguous" id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.1.cmml" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2">subscript</csymbol><ci id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.2.cmml" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.2">ℎ</ci><apply id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.3.cmml" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.3"><minus id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.3.1.cmml" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.3.1"></minus><ci id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.3.2.cmml" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.3.2">𝑡</ci><cn id="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.3.3.cmml" type="integer" xref="S3.E1.m1.2.2.2.2.1.1.1.1.1.1.2.2.2.3.3">1</cn></apply></apply></interval></apply></apply></apply><ci id="S3.E1.m1.4.5.1.5a.cmml" xref="S3.E1.m1.4.4"><mtext class="ltx_mathvariant_italic" id="S3.E1.m1.4.5.1.5.cmml" xref="S3.E1.m1.4.4.5">otherwise</mtext></ci><apply id="S3.E1.m1.3.3.3.3.1.1.cmml" xref="S3.E1.m1.3.3.3.3.1.1"><eq id="S3.E1.m1.3.3.3.3.1.1.3.cmml" xref="S3.E1.m1.3.3.3.3.1.1.3"></eq><apply id="S3.E1.m1.3.3.3.3.1.1.4.cmml" xref="S3.E1.m1.3.3.3.3.1.1.4"><csymbol cd="ambiguous" id="S3.E1.m1.3.3.3.3.1.1.4.1.cmml" xref="S3.E1.m1.3.3.3.3.1.1.4">superscript</csymbol><apply 
id="S3.E1.m1.3.3.3.3.1.1.4.2.cmml" xref="S3.E1.m1.3.3.3.3.1.1.4"><csymbol cd="ambiguous" id="S3.E1.m1.3.3.3.3.1.1.4.2.1.cmml" xref="S3.E1.m1.3.3.3.3.1.1.4">subscript</csymbol><ci id="S3.E1.m1.3.3.3.3.1.1.4.2.2.cmml" xref="S3.E1.m1.3.3.3.3.1.1.4.2.2">ℎ</ci><ci id="S3.E1.m1.3.3.3.3.1.1.4.2.3.cmml" xref="S3.E1.m1.3.3.3.3.1.1.4.2.3">𝑡</ci></apply><ci id="S3.E1.m1.3.3.3.3.1.1.4.3.cmml" xref="S3.E1.m1.3.3.3.3.1.1.4.3">′</ci></apply><apply id="S3.E1.m1.3.3.3.3.1.1.2.2.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1"><tanh id="S3.E1.m1.3.3.3.3.1.1.1.cmml" xref="S3.E1.m1.3.3.3.3.1.1.1"></tanh><apply id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1"><times id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.3.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.3"></times><apply id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.4.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.4"><csymbol cd="ambiguous" id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.4.1.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.4">subscript</csymbol><ci id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.4.2.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.4.2">𝑊</ci><ci id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.4.3.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.4.3">ℎ</ci></apply><interval closure="closed" id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.3.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2"><apply id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.1.1.1.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.1.1.1.1.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.1.1.1">subscript</csymbol><ci id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.1.1.1.2.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.1.1.1.2">𝑥</ci><ci id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.1.1.1.3.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.1.1.1.3">𝑡</ci></apply><apply id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2"><csymbol cd="latexml" id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.1.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.1">direct-product</csymbol><apply id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.2.cmml" 
xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.2"><csymbol cd="ambiguous" id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.2.1.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.2">subscript</csymbol><ci id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.2.2.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.2.2">𝑟</ci><ci id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.2.3.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.2.3">𝑡</ci></apply><apply id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3"><csymbol cd="ambiguous" id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.1.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3">subscript</csymbol><ci id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.2.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.2">ℎ</ci><apply id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.3.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.3"><minus id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.3.1.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.3.1"></minus><ci id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.3.2.cmml" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.3.2">𝑡</ci><cn id="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.3.3.cmml" type="integer" xref="S3.E1.m1.3.3.3.3.1.1.2.1.1.1.2.2.2.3.3.3">1</cn></apply></apply></apply></interval></apply></apply></apply><ci id="S3.E1.m1.4.5.1.7a.cmml" xref="S3.E1.m1.4.4"><mtext class="ltx_mathvariant_italic" id="S3.E1.m1.4.5.1.7.cmml" xref="S3.E1.m1.4.4.5">otherwise</mtext></ci><apply id="S3.E1.m1.4.4.4.4.1.1.cmml" xref="S3.E1.m1.4.4.4.4.1.1"><eq id="S3.E1.m1.4.4.4.4.1.1.2.cmml" xref="S3.E1.m1.4.4.4.4.1.1.2"></eq><apply id="S3.E1.m1.4.4.4.4.1.1.3.cmml" xref="S3.E1.m1.4.4.4.4.1.1.3"><csymbol cd="ambiguous" id="S3.E1.m1.4.4.4.4.1.1.3.1.cmml" xref="S3.E1.m1.4.4.4.4.1.1.3">subscript</csymbol><ci id="S3.E1.m1.4.4.4.4.1.1.3.2.cmml" xref="S3.E1.m1.4.4.4.4.1.1.3.2">ℎ</ci><ci id="S3.E1.m1.4.4.4.4.1.1.3.3.cmml" xref="S3.E1.m1.4.4.4.4.1.1.3.3">𝑡</ci></apply><apply id="S3.E1.m1.4.4.4.4.1.1.1.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1"><plus id="S3.E1.m1.4.4.4.4.1.1.1.2.cmml" 
xref="S3.E1.m1.4.4.4.4.1.1.1.2"></plus><apply id="S3.E1.m1.4.4.4.4.1.1.1.1.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.1"><csymbol cd="latexml" id="S3.E1.m1.4.4.4.4.1.1.1.1.2.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.1.2">direct-product</csymbol><apply id="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.1.1.1"><minus id="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.1.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.1"></minus><cn id="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.2.cmml" type="integer" xref="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.2">1</cn><apply id="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.3.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.3.1.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.3">subscript</csymbol><ci id="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.3.2.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.3.2">𝑢</ci><ci id="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.3.3.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.1.1.1.1.3.3">𝑡</ci></apply></apply><apply id="S3.E1.m1.4.4.4.4.1.1.1.1.3.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.1.3"><csymbol cd="ambiguous" id="S3.E1.m1.4.4.4.4.1.1.1.1.3.1.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.1.3">subscript</csymbol><ci id="S3.E1.m1.4.4.4.4.1.1.1.1.3.2.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.1.3.2">ℎ</ci><apply id="S3.E1.m1.4.4.4.4.1.1.1.1.3.3.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.1.3.3"><minus id="S3.E1.m1.4.4.4.4.1.1.1.1.3.3.1.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.1.3.3.1"></minus><ci id="S3.E1.m1.4.4.4.4.1.1.1.1.3.3.2.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.1.3.3.2">𝑡</ci><cn id="S3.E1.m1.4.4.4.4.1.1.1.1.3.3.3.cmml" type="integer" xref="S3.E1.m1.4.4.4.4.1.1.1.1.3.3.3">1</cn></apply></apply></apply><apply id="S3.E1.m1.4.4.4.4.1.1.1.3.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.3"><csymbol cd="latexml" id="S3.E1.m1.4.4.4.4.1.1.1.3.1.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.3.1">direct-product</csymbol><apply id="S3.E1.m1.4.4.4.4.1.1.1.3.2.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.3.2"><csymbol cd="ambiguous" id="S3.E1.m1.4.4.4.4.1.1.1.3.2.1.cmml" 
xref="S3.E1.m1.4.4.4.4.1.1.1.3.2">subscript</csymbol><ci id="S3.E1.m1.4.4.4.4.1.1.1.3.2.2.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.3.2.2">𝑢</ci><ci id="S3.E1.m1.4.4.4.4.1.1.1.3.2.3.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.3.2.3">𝑡</ci></apply><apply id="S3.E1.m1.4.4.4.4.1.1.1.3.3.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.3.3"><csymbol cd="ambiguous" id="S3.E1.m1.4.4.4.4.1.1.1.3.3.1.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.3.3">superscript</csymbol><apply id="S3.E1.m1.4.4.4.4.1.1.1.3.3.2.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.3.3"><csymbol cd="ambiguous" id="S3.E1.m1.4.4.4.4.1.1.1.3.3.2.1.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.3.3">subscript</csymbol><ci id="S3.E1.m1.4.4.4.4.1.1.1.3.3.2.2.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.3.3.2.2">ℎ</ci><ci id="S3.E1.m1.4.4.4.4.1.1.1.3.3.2.3.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.3.3.2.3">𝑡</ci></apply><ci id="S3.E1.m1.4.4.4.4.1.1.1.3.3.3.cmml" xref="S3.E1.m1.4.4.4.4.1.1.1.3.3.3">′</ci></apply></apply></apply></apply><ci id="S3.E1.m1.4.5.1.9a.cmml" xref="S3.E1.m1.4.4"><mtext class="ltx_mathvariant_italic" id="S3.E1.m1.4.5.1.9.cmml" xref="S3.E1.m1.4.4.5">otherwise</mtext></ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.E1.m1.4c">\begin{cases}u_{t}=\sigma(W_{u}[x_{t},h_{t-1}])\\ r_{t}=\sigma(W_{r}[x_{t},h_{t-1}])\\ h_{t}^{\prime}=\tanh(W_{h}[x_{t},r_{t}\odot h_{t-1}])\\ h_{t}=(1-u_{t})\odot h_{t-1}+u_{t}\odot h_{t}^{\prime}\end{cases}</annotation><annotation encoding="application/x-llamapun" id="S3.E1.m1.4d">{ start_ROW start_CELL italic_u start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = italic_σ ( italic_W start_POSTSUBSCRIPT italic_u end_POSTSUBSCRIPT [ italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , italic_h start_POSTSUBSCRIPT italic_t - 1 end_POSTSUBSCRIPT ] ) end_CELL start_CELL end_CELL end_ROW start_ROW start_CELL italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = italic_σ ( italic_W start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT [ italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , italic_h start_POSTSUBSCRIPT 
italic_t - 1 end_POSTSUBSCRIPT ] ) end_CELL start_CELL end_CELL end_ROW start_ROW start_CELL italic_h start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = roman_tanh ( italic_W start_POSTSUBSCRIPT italic_h end_POSTSUBSCRIPT [ italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ⊙ italic_h start_POSTSUBSCRIPT italic_t - 1 end_POSTSUBSCRIPT ] ) end_CELL start_CELL end_CELL end_ROW start_ROW start_CELL italic_h start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = ( 1 - italic_u start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) ⊙ italic_h start_POSTSUBSCRIPT italic_t - 1 end_POSTSUBSCRIPT + italic_u start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ⊙ italic_h start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_CELL start_CELL end_CELL end_ROW</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(1)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S3.SS1.p1.6">where <math alttext="W_{u}" class="ltx_Math" display="inline" id="S3.SS1.p1.1.m1.1"><semantics id="S3.SS1.p1.1.m1.1a"><msub id="S3.SS1.p1.1.m1.1.1" xref="S3.SS1.p1.1.m1.1.1.cmml"><mi id="S3.SS1.p1.1.m1.1.1.2" xref="S3.SS1.p1.1.m1.1.1.2.cmml">W</mi><mi id="S3.SS1.p1.1.m1.1.1.3" xref="S3.SS1.p1.1.m1.1.1.3.cmml">u</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.1.m1.1b"><apply id="S3.SS1.p1.1.m1.1.1.cmml" xref="S3.SS1.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S3.SS1.p1.1.m1.1.1.1.cmml" xref="S3.SS1.p1.1.m1.1.1">subscript</csymbol><ci id="S3.SS1.p1.1.m1.1.1.2.cmml" xref="S3.SS1.p1.1.m1.1.1.2">𝑊</ci><ci id="S3.SS1.p1.1.m1.1.1.3.cmml" xref="S3.SS1.p1.1.m1.1.1.3">𝑢</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.1.m1.1c">W_{u}</annotation><annotation 
encoding="application/x-llamapun" id="S3.SS1.p1.1.m1.1d">italic_W start_POSTSUBSCRIPT italic_u end_POSTSUBSCRIPT</annotation></semantics></math>, <math alttext="W_{r}" class="ltx_Math" display="inline" id="S3.SS1.p1.2.m2.1"><semantics id="S3.SS1.p1.2.m2.1a"><msub id="S3.SS1.p1.2.m2.1.1" xref="S3.SS1.p1.2.m2.1.1.cmml"><mi id="S3.SS1.p1.2.m2.1.1.2" xref="S3.SS1.p1.2.m2.1.1.2.cmml">W</mi><mi id="S3.SS1.p1.2.m2.1.1.3" xref="S3.SS1.p1.2.m2.1.1.3.cmml">r</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.2.m2.1b"><apply id="S3.SS1.p1.2.m2.1.1.cmml" xref="S3.SS1.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S3.SS1.p1.2.m2.1.1.1.cmml" xref="S3.SS1.p1.2.m2.1.1">subscript</csymbol><ci id="S3.SS1.p1.2.m2.1.1.2.cmml" xref="S3.SS1.p1.2.m2.1.1.2">𝑊</ci><ci id="S3.SS1.p1.2.m2.1.1.3.cmml" xref="S3.SS1.p1.2.m2.1.1.3">𝑟</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.2.m2.1c">W_{r}</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p1.2.m2.1d">italic_W start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT</annotation></semantics></math>, and <math alttext="W_{h}" class="ltx_Math" display="inline" id="S3.SS1.p1.3.m3.1"><semantics id="S3.SS1.p1.3.m3.1a"><msub id="S3.SS1.p1.3.m3.1.1" xref="S3.SS1.p1.3.m3.1.1.cmml"><mi id="S3.SS1.p1.3.m3.1.1.2" xref="S3.SS1.p1.3.m3.1.1.2.cmml">W</mi><mi id="S3.SS1.p1.3.m3.1.1.3" xref="S3.SS1.p1.3.m3.1.1.3.cmml">h</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.3.m3.1b"><apply id="S3.SS1.p1.3.m3.1.1.cmml" xref="S3.SS1.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S3.SS1.p1.3.m3.1.1.1.cmml" xref="S3.SS1.p1.3.m3.1.1">subscript</csymbol><ci id="S3.SS1.p1.3.m3.1.1.2.cmml" xref="S3.SS1.p1.3.m3.1.1.2">𝑊</ci><ci id="S3.SS1.p1.3.m3.1.1.3.cmml" xref="S3.SS1.p1.3.m3.1.1.3">ℎ</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.3.m3.1c">W_{h}</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p1.3.m3.1d">italic_W start_POSTSUBSCRIPT italic_h 
end_POSTSUBSCRIPT</annotation></semantics></math> are parameter matrices, and <math alttext="x_{t}" class="ltx_Math" display="inline" id="S3.SS1.p1.4.m4.1"><semantics id="S3.SS1.p1.4.m4.1a"><msub id="S3.SS1.p1.4.m4.1.1" xref="S3.SS1.p1.4.m4.1.1.cmml"><mi id="S3.SS1.p1.4.m4.1.1.2" xref="S3.SS1.p1.4.m4.1.1.2.cmml">x</mi><mi id="S3.SS1.p1.4.m4.1.1.3" xref="S3.SS1.p1.4.m4.1.1.3.cmml">t</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.4.m4.1b"><apply id="S3.SS1.p1.4.m4.1.1.cmml" xref="S3.SS1.p1.4.m4.1.1"><csymbol cd="ambiguous" id="S3.SS1.p1.4.m4.1.1.1.cmml" xref="S3.SS1.p1.4.m4.1.1">subscript</csymbol><ci id="S3.SS1.p1.4.m4.1.1.2.cmml" xref="S3.SS1.p1.4.m4.1.1.2">𝑥</ci><ci id="S3.SS1.p1.4.m4.1.1.3.cmml" xref="S3.SS1.p1.4.m4.1.1.3">𝑡</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.4.m4.1c">x_{t}</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p1.4.m4.1d">italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math> and <math alttext="h_{t}" class="ltx_Math" display="inline" id="S3.SS1.p1.5.m5.1"><semantics id="S3.SS1.p1.5.m5.1a"><msub id="S3.SS1.p1.5.m5.1.1" xref="S3.SS1.p1.5.m5.1.1.cmml"><mi id="S3.SS1.p1.5.m5.1.1.2" xref="S3.SS1.p1.5.m5.1.1.2.cmml">h</mi><mi id="S3.SS1.p1.5.m5.1.1.3" xref="S3.SS1.p1.5.m5.1.1.3.cmml">t</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.5.m5.1b"><apply id="S3.SS1.p1.5.m5.1.1.cmml" xref="S3.SS1.p1.5.m5.1.1"><csymbol cd="ambiguous" id="S3.SS1.p1.5.m5.1.1.1.cmml" xref="S3.SS1.p1.5.m5.1.1">subscript</csymbol><ci id="S3.SS1.p1.5.m5.1.1.2.cmml" xref="S3.SS1.p1.5.m5.1.1.2">ℎ</ci><ci id="S3.SS1.p1.5.m5.1.1.3.cmml" xref="S3.SS1.p1.5.m5.1.1.3">𝑡</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.5.m5.1c">h_{t}</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p1.5.m5.1d">italic_h start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math> represent the input 
vector and hidden state vector at time step <math alttext="t" class="ltx_Math" display="inline" id="S3.SS1.p1.6.m6.1"><semantics id="S3.SS1.p1.6.m6.1a"><mi id="S3.SS1.p1.6.m6.1.1" xref="S3.SS1.p1.6.m6.1.1.cmml">t</mi><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.6.m6.1b"><ci id="S3.SS1.p1.6.m6.1.1.cmml" xref="S3.SS1.p1.6.m6.1.1">𝑡</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.6.m6.1c">t</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p1.6.m6.1d">italic_t</annotation></semantics></math>, respectively.</p> </div> <div class="ltx_para" id="S3.SS1.p2"> <p class="ltx_p" id="S3.SS1.p2.3">The main encoder constructs feature representations of the input sentences from the patent specifications. Its recurrent unit is a Bi-GRU, which consists of a forward GRU and a backward GRU. Given an input sequence (such as <math alttext="x_{1},x_{2},\ldots,x_{m}" class="ltx_Math" display="inline" id="S3.SS1.p2.1.m1.4"><semantics id="S3.SS1.p2.1.m1.4a"><mrow id="S3.SS1.p2.1.m1.4.4.3" xref="S3.SS1.p2.1.m1.4.4.4.cmml"><msub id="S3.SS1.p2.1.m1.2.2.1.1" xref="S3.SS1.p2.1.m1.2.2.1.1.cmml"><mi id="S3.SS1.p2.1.m1.2.2.1.1.2" xref="S3.SS1.p2.1.m1.2.2.1.1.2.cmml">x</mi><mn id="S3.SS1.p2.1.m1.2.2.1.1.3" xref="S3.SS1.p2.1.m1.2.2.1.1.3.cmml">1</mn></msub><mo id="S3.SS1.p2.1.m1.4.4.3.4" xref="S3.SS1.p2.1.m1.4.4.4.cmml">,</mo><msub id="S3.SS1.p2.1.m1.3.3.2.2" xref="S3.SS1.p2.1.m1.3.3.2.2.cmml"><mi id="S3.SS1.p2.1.m1.3.3.2.2.2" xref="S3.SS1.p2.1.m1.3.3.2.2.2.cmml">x</mi><mn id="S3.SS1.p2.1.m1.3.3.2.2.3" xref="S3.SS1.p2.1.m1.3.3.2.2.3.cmml">2</mn></msub><mo id="S3.SS1.p2.1.m1.4.4.3.5" xref="S3.SS1.p2.1.m1.4.4.4.cmml">,</mo><mi id="S3.SS1.p2.1.m1.1.1" mathvariant="normal" xref="S3.SS1.p2.1.m1.1.1.cmml">…</mi><mo id="S3.SS1.p2.1.m1.4.4.3.6" xref="S3.SS1.p2.1.m1.4.4.4.cmml">,</mo><msub id="S3.SS1.p2.1.m1.4.4.3.3" xref="S3.SS1.p2.1.m1.4.4.3.3.cmml"><mi id="S3.SS1.p2.1.m1.4.4.3.3.2" 
xref="S3.SS1.p2.1.m1.4.4.3.3.2.cmml">x</mi><mi id="S3.SS1.p2.1.m1.4.4.3.3.3" xref="S3.SS1.p2.1.m1.4.4.3.3.3.cmml">m</mi></msub></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p2.1.m1.4b"><list id="S3.SS1.p2.1.m1.4.4.4.cmml" xref="S3.SS1.p2.1.m1.4.4.3"><apply id="S3.SS1.p2.1.m1.2.2.1.1.cmml" xref="S3.SS1.p2.1.m1.2.2.1.1"><csymbol cd="ambiguous" id="S3.SS1.p2.1.m1.2.2.1.1.1.cmml" xref="S3.SS1.p2.1.m1.2.2.1.1">subscript</csymbol><ci id="S3.SS1.p2.1.m1.2.2.1.1.2.cmml" xref="S3.SS1.p2.1.m1.2.2.1.1.2">𝑥</ci><cn id="S3.SS1.p2.1.m1.2.2.1.1.3.cmml" type="integer" xref="S3.SS1.p2.1.m1.2.2.1.1.3">1</cn></apply><apply id="S3.SS1.p2.1.m1.3.3.2.2.cmml" xref="S3.SS1.p2.1.m1.3.3.2.2"><csymbol cd="ambiguous" id="S3.SS1.p2.1.m1.3.3.2.2.1.cmml" xref="S3.SS1.p2.1.m1.3.3.2.2">subscript</csymbol><ci id="S3.SS1.p2.1.m1.3.3.2.2.2.cmml" xref="S3.SS1.p2.1.m1.3.3.2.2.2">𝑥</ci><cn id="S3.SS1.p2.1.m1.3.3.2.2.3.cmml" type="integer" xref="S3.SS1.p2.1.m1.3.3.2.2.3">2</cn></apply><ci id="S3.SS1.p2.1.m1.1.1.cmml" xref="S3.SS1.p2.1.m1.1.1">…</ci><apply id="S3.SS1.p2.1.m1.4.4.3.3.cmml" xref="S3.SS1.p2.1.m1.4.4.3.3"><csymbol cd="ambiguous" id="S3.SS1.p2.1.m1.4.4.3.3.1.cmml" xref="S3.SS1.p2.1.m1.4.4.3.3">subscript</csymbol><ci id="S3.SS1.p2.1.m1.4.4.3.3.2.cmml" xref="S3.SS1.p2.1.m1.4.4.3.3.2">𝑥</ci><ci id="S3.SS1.p2.1.m1.4.4.3.3.3.cmml" xref="S3.SS1.p2.1.m1.4.4.3.3.3">𝑚</ci></apply></list></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p2.1.m1.4c">x_{1},x_{2},\ldots,x_{m}</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p2.1.m1.4d">italic_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_x start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , … , italic_x start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT</annotation></semantics></math> in Figure 1), the forward GRU sequentially calculates the hidden state representations <math alttext="\overrightarrow{h_{1}^{p}},\overrightarrow{h_{2}^{p}},\ldots,\overrightarrow{h% _{m}^{p}}" class="ltx_Math" display="inline" 
id="S3.SS1.p2.2.m2.4"><semantics id="S3.SS1.p2.2.m2.4a"><mrow id="S3.SS1.p2.2.m2.4.5.2" xref="S3.SS1.p2.2.m2.4.5.1.cmml"><mover accent="true" id="S3.SS1.p2.2.m2.1.1" xref="S3.SS1.p2.2.m2.1.1.cmml"><msubsup id="S3.SS1.p2.2.m2.1.1.2" xref="S3.SS1.p2.2.m2.1.1.2.cmml"><mi id="S3.SS1.p2.2.m2.1.1.2.2.2" xref="S3.SS1.p2.2.m2.1.1.2.2.2.cmml">h</mi><mn id="S3.SS1.p2.2.m2.1.1.2.2.3" xref="S3.SS1.p2.2.m2.1.1.2.2.3.cmml">1</mn><mi id="S3.SS1.p2.2.m2.1.1.2.3" xref="S3.SS1.p2.2.m2.1.1.2.3.cmml">p</mi></msubsup><mo id="S3.SS1.p2.2.m2.1.1.1" stretchy="false" xref="S3.SS1.p2.2.m2.1.1.1.cmml">→</mo></mover><mo id="S3.SS1.p2.2.m2.4.5.2.1" xref="S3.SS1.p2.2.m2.4.5.1.cmml">,</mo><mover accent="true" id="S3.SS1.p2.2.m2.2.2" xref="S3.SS1.p2.2.m2.2.2.cmml"><msubsup id="S3.SS1.p2.2.m2.2.2.2" xref="S3.SS1.p2.2.m2.2.2.2.cmml"><mi id="S3.SS1.p2.2.m2.2.2.2.2.2" xref="S3.SS1.p2.2.m2.2.2.2.2.2.cmml">h</mi><mn id="S3.SS1.p2.2.m2.2.2.2.2.3" xref="S3.SS1.p2.2.m2.2.2.2.2.3.cmml">2</mn><mi id="S3.SS1.p2.2.m2.2.2.2.3" xref="S3.SS1.p2.2.m2.2.2.2.3.cmml">p</mi></msubsup><mo id="S3.SS1.p2.2.m2.2.2.1" stretchy="false" xref="S3.SS1.p2.2.m2.2.2.1.cmml">→</mo></mover><mo id="S3.SS1.p2.2.m2.4.5.2.2" xref="S3.SS1.p2.2.m2.4.5.1.cmml">,</mo><mi id="S3.SS1.p2.2.m2.3.3" mathvariant="normal" xref="S3.SS1.p2.2.m2.3.3.cmml">…</mi><mo id="S3.SS1.p2.2.m2.4.5.2.3" xref="S3.SS1.p2.2.m2.4.5.1.cmml">,</mo><mover accent="true" id="S3.SS1.p2.2.m2.4.4" xref="S3.SS1.p2.2.m2.4.4.cmml"><msubsup id="S3.SS1.p2.2.m2.4.4.2" xref="S3.SS1.p2.2.m2.4.4.2.cmml"><mi id="S3.SS1.p2.2.m2.4.4.2.2.2" xref="S3.SS1.p2.2.m2.4.4.2.2.2.cmml">h</mi><mi id="S3.SS1.p2.2.m2.4.4.2.2.3" xref="S3.SS1.p2.2.m2.4.4.2.2.3.cmml">m</mi><mi id="S3.SS1.p2.2.m2.4.4.2.3" xref="S3.SS1.p2.2.m2.4.4.2.3.cmml">p</mi></msubsup><mo id="S3.SS1.p2.2.m2.4.4.1" stretchy="false" xref="S3.SS1.p2.2.m2.4.4.1.cmml">→</mo></mover></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p2.2.m2.4b"><list id="S3.SS1.p2.2.m2.4.5.1.cmml" xref="S3.SS1.p2.2.m2.4.5.2"><apply 
id="S3.SS1.p2.2.m2.1.1.cmml" xref="S3.SS1.p2.2.m2.1.1"><ci id="S3.SS1.p2.2.m2.1.1.1.cmml" xref="S3.SS1.p2.2.m2.1.1.1">→</ci><apply id="S3.SS1.p2.2.m2.1.1.2.cmml" xref="S3.SS1.p2.2.m2.1.1.2"><csymbol cd="ambiguous" id="S3.SS1.p2.2.m2.1.1.2.1.cmml" xref="S3.SS1.p2.2.m2.1.1.2">superscript</csymbol><apply id="S3.SS1.p2.2.m2.1.1.2.2.cmml" xref="S3.SS1.p2.2.m2.1.1.2"><csymbol cd="ambiguous" id="S3.SS1.p2.2.m2.1.1.2.2.1.cmml" xref="S3.SS1.p2.2.m2.1.1.2">subscript</csymbol><ci id="S3.SS1.p2.2.m2.1.1.2.2.2.cmml" xref="S3.SS1.p2.2.m2.1.1.2.2.2">ℎ</ci><cn id="S3.SS1.p2.2.m2.1.1.2.2.3.cmml" type="integer" xref="S3.SS1.p2.2.m2.1.1.2.2.3">1</cn></apply><ci id="S3.SS1.p2.2.m2.1.1.2.3.cmml" xref="S3.SS1.p2.2.m2.1.1.2.3">𝑝</ci></apply></apply><apply id="S3.SS1.p2.2.m2.2.2.cmml" xref="S3.SS1.p2.2.m2.2.2"><ci id="S3.SS1.p2.2.m2.2.2.1.cmml" xref="S3.SS1.p2.2.m2.2.2.1">→</ci><apply id="S3.SS1.p2.2.m2.2.2.2.cmml" xref="S3.SS1.p2.2.m2.2.2.2"><csymbol cd="ambiguous" id="S3.SS1.p2.2.m2.2.2.2.1.cmml" xref="S3.SS1.p2.2.m2.2.2.2">superscript</csymbol><apply id="S3.SS1.p2.2.m2.2.2.2.2.cmml" xref="S3.SS1.p2.2.m2.2.2.2"><csymbol cd="ambiguous" id="S3.SS1.p2.2.m2.2.2.2.2.1.cmml" xref="S3.SS1.p2.2.m2.2.2.2">subscript</csymbol><ci id="S3.SS1.p2.2.m2.2.2.2.2.2.cmml" xref="S3.SS1.p2.2.m2.2.2.2.2.2">ℎ</ci><cn id="S3.SS1.p2.2.m2.2.2.2.2.3.cmml" type="integer" xref="S3.SS1.p2.2.m2.2.2.2.2.3">2</cn></apply><ci id="S3.SS1.p2.2.m2.2.2.2.3.cmml" xref="S3.SS1.p2.2.m2.2.2.2.3">𝑝</ci></apply></apply><ci id="S3.SS1.p2.2.m2.3.3.cmml" xref="S3.SS1.p2.2.m2.3.3">…</ci><apply id="S3.SS1.p2.2.m2.4.4.cmml" xref="S3.SS1.p2.2.m2.4.4"><ci id="S3.SS1.p2.2.m2.4.4.1.cmml" xref="S3.SS1.p2.2.m2.4.4.1">→</ci><apply id="S3.SS1.p2.2.m2.4.4.2.cmml" xref="S3.SS1.p2.2.m2.4.4.2"><csymbol cd="ambiguous" id="S3.SS1.p2.2.m2.4.4.2.1.cmml" xref="S3.SS1.p2.2.m2.4.4.2">superscript</csymbol><apply id="S3.SS1.p2.2.m2.4.4.2.2.cmml" xref="S3.SS1.p2.2.m2.4.4.2"><csymbol cd="ambiguous" id="S3.SS1.p2.2.m2.4.4.2.2.1.cmml" 
xref="S3.SS1.p2.2.m2.4.4.2">subscript</csymbol><ci id="S3.SS1.p2.2.m2.4.4.2.2.2.cmml" xref="S3.SS1.p2.2.m2.4.4.2.2.2">ℎ</ci><ci id="S3.SS1.p2.2.m2.4.4.2.2.3.cmml" xref="S3.SS1.p2.2.m2.4.4.2.2.3">𝑚</ci></apply><ci id="S3.SS1.p2.2.m2.4.4.2.3.cmml" xref="S3.SS1.p2.2.m2.4.4.2.3">𝑝</ci></apply></apply></list></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p2.2.m2.4c">\overrightarrow{h_{1}^{p}},\overrightarrow{h_{2}^{p}},\ldots,\overrightarrow{h% _{m}^{p}}</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p2.2.m2.4d">over→ start_ARG italic_h start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT end_ARG , over→ start_ARG italic_h start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT end_ARG , … , over→ start_ARG italic_h start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT end_ARG</annotation></semantics></math> based on the current word embeddings. 
The backward GRU generates the hidden state representations <math alttext="\overleftarrow{h_{1}^{p}},\overleftarrow{h_{2}^{p}},\ldots,\overleftarrow{h_{m% }^{p}}" class="ltx_Math" display="inline" id="S3.SS1.p2.3.m3.4"><semantics id="S3.SS1.p2.3.m3.4a"><mrow id="S3.SS1.p2.3.m3.4.5.2" xref="S3.SS1.p2.3.m3.4.5.1.cmml"><mover accent="true" id="S3.SS1.p2.3.m3.1.1" xref="S3.SS1.p2.3.m3.1.1.cmml"><msubsup id="S3.SS1.p2.3.m3.1.1.2" xref="S3.SS1.p2.3.m3.1.1.2.cmml"><mi id="S3.SS1.p2.3.m3.1.1.2.2.2" xref="S3.SS1.p2.3.m3.1.1.2.2.2.cmml">h</mi><mn id="S3.SS1.p2.3.m3.1.1.2.2.3" xref="S3.SS1.p2.3.m3.1.1.2.2.3.cmml">1</mn><mi id="S3.SS1.p2.3.m3.1.1.2.3" xref="S3.SS1.p2.3.m3.1.1.2.3.cmml">p</mi></msubsup><mo id="S3.SS1.p2.3.m3.1.1.1" stretchy="false" xref="S3.SS1.p2.3.m3.1.1.1.cmml">←</mo></mover><mo id="S3.SS1.p2.3.m3.4.5.2.1" xref="S3.SS1.p2.3.m3.4.5.1.cmml">,</mo><mover accent="true" id="S3.SS1.p2.3.m3.2.2" xref="S3.SS1.p2.3.m3.2.2.cmml"><msubsup id="S3.SS1.p2.3.m3.2.2.2" xref="S3.SS1.p2.3.m3.2.2.2.cmml"><mi id="S3.SS1.p2.3.m3.2.2.2.2.2" xref="S3.SS1.p2.3.m3.2.2.2.2.2.cmml">h</mi><mn id="S3.SS1.p2.3.m3.2.2.2.2.3" xref="S3.SS1.p2.3.m3.2.2.2.2.3.cmml">2</mn><mi id="S3.SS1.p2.3.m3.2.2.2.3" xref="S3.SS1.p2.3.m3.2.2.2.3.cmml">p</mi></msubsup><mo id="S3.SS1.p2.3.m3.2.2.1" stretchy="false" xref="S3.SS1.p2.3.m3.2.2.1.cmml">←</mo></mover><mo id="S3.SS1.p2.3.m3.4.5.2.2" xref="S3.SS1.p2.3.m3.4.5.1.cmml">,</mo><mi id="S3.SS1.p2.3.m3.3.3" mathvariant="normal" xref="S3.SS1.p2.3.m3.3.3.cmml">…</mi><mo id="S3.SS1.p2.3.m3.4.5.2.3" xref="S3.SS1.p2.3.m3.4.5.1.cmml">,</mo><mover accent="true" id="S3.SS1.p2.3.m3.4.4" xref="S3.SS1.p2.3.m3.4.4.cmml"><msubsup id="S3.SS1.p2.3.m3.4.4.2" xref="S3.SS1.p2.3.m3.4.4.2.cmml"><mi id="S3.SS1.p2.3.m3.4.4.2.2.2" xref="S3.SS1.p2.3.m3.4.4.2.2.2.cmml">h</mi><mi id="S3.SS1.p2.3.m3.4.4.2.2.3" xref="S3.SS1.p2.3.m3.4.4.2.2.3.cmml">m</mi><mi id="S3.SS1.p2.3.m3.4.4.2.3" xref="S3.SS1.p2.3.m3.4.4.2.3.cmml">p</mi></msubsup><mo id="S3.SS1.p2.3.m3.4.4.1" stretchy="false" 
xref="S3.SS1.p2.3.m3.4.4.1.cmml">←</mo></mover></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p2.3.m3.4b"><list id="S3.SS1.p2.3.m3.4.5.1.cmml" xref="S3.SS1.p2.3.m3.4.5.2"><apply id="S3.SS1.p2.3.m3.1.1.cmml" xref="S3.SS1.p2.3.m3.1.1"><ci id="S3.SS1.p2.3.m3.1.1.1.cmml" xref="S3.SS1.p2.3.m3.1.1.1">←</ci><apply id="S3.SS1.p2.3.m3.1.1.2.cmml" xref="S3.SS1.p2.3.m3.1.1.2"><csymbol cd="ambiguous" id="S3.SS1.p2.3.m3.1.1.2.1.cmml" xref="S3.SS1.p2.3.m3.1.1.2">superscript</csymbol><apply id="S3.SS1.p2.3.m3.1.1.2.2.cmml" xref="S3.SS1.p2.3.m3.1.1.2"><csymbol cd="ambiguous" id="S3.SS1.p2.3.m3.1.1.2.2.1.cmml" xref="S3.SS1.p2.3.m3.1.1.2">subscript</csymbol><ci id="S3.SS1.p2.3.m3.1.1.2.2.2.cmml" xref="S3.SS1.p2.3.m3.1.1.2.2.2">ℎ</ci><cn id="S3.SS1.p2.3.m3.1.1.2.2.3.cmml" type="integer" xref="S3.SS1.p2.3.m3.1.1.2.2.3">1</cn></apply><ci id="S3.SS1.p2.3.m3.1.1.2.3.cmml" xref="S3.SS1.p2.3.m3.1.1.2.3">𝑝</ci></apply></apply><apply id="S3.SS1.p2.3.m3.2.2.cmml" xref="S3.SS1.p2.3.m3.2.2"><ci id="S3.SS1.p2.3.m3.2.2.1.cmml" xref="S3.SS1.p2.3.m3.2.2.1">←</ci><apply id="S3.SS1.p2.3.m3.2.2.2.cmml" xref="S3.SS1.p2.3.m3.2.2.2"><csymbol cd="ambiguous" id="S3.SS1.p2.3.m3.2.2.2.1.cmml" xref="S3.SS1.p2.3.m3.2.2.2">superscript</csymbol><apply id="S3.SS1.p2.3.m3.2.2.2.2.cmml" xref="S3.SS1.p2.3.m3.2.2.2"><csymbol cd="ambiguous" id="S3.SS1.p2.3.m3.2.2.2.2.1.cmml" xref="S3.SS1.p2.3.m3.2.2.2">subscript</csymbol><ci id="S3.SS1.p2.3.m3.2.2.2.2.2.cmml" xref="S3.SS1.p2.3.m3.2.2.2.2.2">ℎ</ci><cn id="S3.SS1.p2.3.m3.2.2.2.2.3.cmml" type="integer" xref="S3.SS1.p2.3.m3.2.2.2.2.3">2</cn></apply><ci id="S3.SS1.p2.3.m3.2.2.2.3.cmml" xref="S3.SS1.p2.3.m3.2.2.2.3">𝑝</ci></apply></apply><ci id="S3.SS1.p2.3.m3.3.3.cmml" xref="S3.SS1.p2.3.m3.3.3">…</ci><apply id="S3.SS1.p2.3.m3.4.4.cmml" xref="S3.SS1.p2.3.m3.4.4"><ci id="S3.SS1.p2.3.m3.4.4.1.cmml" xref="S3.SS1.p2.3.m3.4.4.1">←</ci><apply id="S3.SS1.p2.3.m3.4.4.2.cmml" xref="S3.SS1.p2.3.m3.4.4.2"><csymbol cd="ambiguous" id="S3.SS1.p2.3.m3.4.4.2.1.cmml" 
xref="S3.SS1.p2.3.m3.4.4.2">superscript</csymbol><apply id="S3.SS1.p2.3.m3.4.4.2.2.cmml" xref="S3.SS1.p2.3.m3.4.4.2"><csymbol cd="ambiguous" id="S3.SS1.p2.3.m3.4.4.2.2.1.cmml" xref="S3.SS1.p2.3.m3.4.4.2">subscript</csymbol><ci id="S3.SS1.p2.3.m3.4.4.2.2.2.cmml" xref="S3.SS1.p2.3.m3.4.4.2.2.2">ℎ</ci><ci id="S3.SS1.p2.3.m3.4.4.2.2.3.cmml" xref="S3.SS1.p2.3.m3.4.4.2.2.3">𝑚</ci></apply><ci id="S3.SS1.p2.3.m3.4.4.2.3.cmml" xref="S3.SS1.p2.3.m3.4.4.2.3">𝑝</ci></apply></apply></list></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p2.3.m3.4c">\overleftarrow{h_{1}^{p}},\overleftarrow{h_{2}^{p}},\ldots,\overleftarrow{h_{m% }^{p}}</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p2.3.m3.4d">over← start_ARG italic_h start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT end_ARG , over← start_ARG italic_h start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT end_ARG , … , over← start_ARG italic_h start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT end_ARG</annotation></semantics></math> for each word in the reverse sequence. 
These two hidden states are defined as:</p> <table class="ltx_equation ltx_eqn_table" id="S3.E2"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="\begin{cases}\overrightarrow{h_{t}^{p}}=\text{GRU}^{p}(x_{t},\overrightarrow{h% _{t-1}^{p}})\\ \overleftarrow{h_{t}^{p}}=\text{GRU}^{p}(x_{t},\overleftarrow{h_{t-1}^{p}})% \end{cases}" class="ltx_Math" display="block" id="S3.E2.m1.2"><semantics id="S3.E2.m1.2a"><mrow id="S3.E2.m1.2.2" xref="S3.E2.m1.2.3.1.cmml"><mo id="S3.E2.m1.2.2.3" xref="S3.E2.m1.2.3.1.1.cmml">{</mo><mtable columnspacing="5pt" displaystyle="true" id="S3.E2.m1.2.2.2" rowspacing="0pt" xref="S3.E2.m1.2.3.1.cmml"><mtr id="S3.E2.m1.2.2.2a" xref="S3.E2.m1.2.3.1.cmml"><mtd class="ltx_align_left" columnalign="left" id="S3.E2.m1.2.2.2b" xref="S3.E2.m1.2.3.1.cmml"><mrow id="S3.E2.m1.1.1.1.1.1.1" xref="S3.E2.m1.1.1.1.1.1.1.cmml"><mover accent="true" id="S3.E2.m1.1.1.1.1.1.1.4" xref="S3.E2.m1.1.1.1.1.1.1.4.cmml"><msubsup id="S3.E2.m1.1.1.1.1.1.1.4.2" xref="S3.E2.m1.1.1.1.1.1.1.4.2.cmml"><mi id="S3.E2.m1.1.1.1.1.1.1.4.2.2.2" xref="S3.E2.m1.1.1.1.1.1.1.4.2.2.2.cmml">h</mi><mi id="S3.E2.m1.1.1.1.1.1.1.4.2.2.3" xref="S3.E2.m1.1.1.1.1.1.1.4.2.2.3.cmml">t</mi><mi id="S3.E2.m1.1.1.1.1.1.1.4.2.3" xref="S3.E2.m1.1.1.1.1.1.1.4.2.3.cmml">p</mi></msubsup><mo id="S3.E2.m1.1.1.1.1.1.1.4.1" stretchy="false" xref="S3.E2.m1.1.1.1.1.1.1.4.1.cmml">→</mo></mover><mo id="S3.E2.m1.1.1.1.1.1.1.3" xref="S3.E2.m1.1.1.1.1.1.1.3.cmml">=</mo><mrow id="S3.E2.m1.1.1.1.1.1.1.2" xref="S3.E2.m1.1.1.1.1.1.1.2.cmml"><msup id="S3.E2.m1.1.1.1.1.1.1.2.3" xref="S3.E2.m1.1.1.1.1.1.1.2.3.cmml"><mtext id="S3.E2.m1.1.1.1.1.1.1.2.3.2" xref="S3.E2.m1.1.1.1.1.1.1.2.3.2a.cmml">GRU</mtext><mi id="S3.E2.m1.1.1.1.1.1.1.2.3.3" xref="S3.E2.m1.1.1.1.1.1.1.2.3.3.cmml">p</mi></msup><mo id="S3.E2.m1.1.1.1.1.1.1.2.2" xref="S3.E2.m1.1.1.1.1.1.1.2.2.cmml">⁢</mo><mrow 
id="S3.E2.m1.1.1.1.1.1.1.2.1.1" xref="S3.E2.m1.1.1.1.1.1.1.2.1.2.cmml"><mo id="S3.E2.m1.1.1.1.1.1.1.2.1.1.2" stretchy="false" xref="S3.E2.m1.1.1.1.1.1.1.2.1.2.cmml">(</mo><msub id="S3.E2.m1.1.1.1.1.1.1.2.1.1.1" xref="S3.E2.m1.1.1.1.1.1.1.2.1.1.1.cmml"><mi id="S3.E2.m1.1.1.1.1.1.1.2.1.1.1.2" xref="S3.E2.m1.1.1.1.1.1.1.2.1.1.1.2.cmml">x</mi><mi id="S3.E2.m1.1.1.1.1.1.1.2.1.1.1.3" xref="S3.E2.m1.1.1.1.1.1.1.2.1.1.1.3.cmml">t</mi></msub><mo id="S3.E2.m1.1.1.1.1.1.1.2.1.1.3" xref="S3.E2.m1.1.1.1.1.1.1.2.1.2.cmml">,</mo><mover accent="true" id="S3.E2.m1.1.1.1.1.1.1.1" xref="S3.E2.m1.1.1.1.1.1.1.1.cmml"><msubsup id="S3.E2.m1.1.1.1.1.1.1.1.2" xref="S3.E2.m1.1.1.1.1.1.1.1.2.cmml"><mi id="S3.E2.m1.1.1.1.1.1.1.1.2.2.2" xref="S3.E2.m1.1.1.1.1.1.1.1.2.2.2.cmml">h</mi><mrow id="S3.E2.m1.1.1.1.1.1.1.1.2.2.3" xref="S3.E2.m1.1.1.1.1.1.1.1.2.2.3.cmml"><mi id="S3.E2.m1.1.1.1.1.1.1.1.2.2.3.2" xref="S3.E2.m1.1.1.1.1.1.1.1.2.2.3.2.cmml">t</mi><mo id="S3.E2.m1.1.1.1.1.1.1.1.2.2.3.1" xref="S3.E2.m1.1.1.1.1.1.1.1.2.2.3.1.cmml">−</mo><mn id="S3.E2.m1.1.1.1.1.1.1.1.2.2.3.3" xref="S3.E2.m1.1.1.1.1.1.1.1.2.2.3.3.cmml">1</mn></mrow><mi id="S3.E2.m1.1.1.1.1.1.1.1.2.3" xref="S3.E2.m1.1.1.1.1.1.1.1.2.3.cmml">p</mi></msubsup><mo id="S3.E2.m1.1.1.1.1.1.1.1.1" stretchy="false" xref="S3.E2.m1.1.1.1.1.1.1.1.1.cmml">→</mo></mover><mo id="S3.E2.m1.1.1.1.1.1.1.2.1.1.4" stretchy="false" xref="S3.E2.m1.1.1.1.1.1.1.2.1.2.cmml">)</mo></mrow></mrow></mrow></mtd><mtd id="S3.E2.m1.2.2.2c" xref="S3.E2.m1.2.3.1.1.cmml"></mtd></mtr><mtr id="S3.E2.m1.2.2.2d" xref="S3.E2.m1.2.3.1.cmml"><mtd class="ltx_align_left" columnalign="left" id="S3.E2.m1.2.2.2e" xref="S3.E2.m1.2.3.1.cmml"><mrow id="S3.E2.m1.2.2.2.2.1.1" xref="S3.E2.m1.2.2.2.2.1.1.cmml"><mover accent="true" id="S3.E2.m1.2.2.2.2.1.1.4" xref="S3.E2.m1.2.2.2.2.1.1.4.cmml"><msubsup id="S3.E2.m1.2.2.2.2.1.1.4.2" xref="S3.E2.m1.2.2.2.2.1.1.4.2.cmml"><mi id="S3.E2.m1.2.2.2.2.1.1.4.2.2.2" xref="S3.E2.m1.2.2.2.2.1.1.4.2.2.2.cmml">h</mi><mi 
id="S3.E2.m1.2.2.2.2.1.1.4.2.2.3" xref="S3.E2.m1.2.2.2.2.1.1.4.2.2.3.cmml">t</mi><mi id="S3.E2.m1.2.2.2.2.1.1.4.2.3" xref="S3.E2.m1.2.2.2.2.1.1.4.2.3.cmml">p</mi></msubsup><mo id="S3.E2.m1.2.2.2.2.1.1.4.1" stretchy="false" xref="S3.E2.m1.2.2.2.2.1.1.4.1.cmml">←</mo></mover><mo id="S3.E2.m1.2.2.2.2.1.1.3" xref="S3.E2.m1.2.2.2.2.1.1.3.cmml">=</mo><mrow id="S3.E2.m1.2.2.2.2.1.1.2" xref="S3.E2.m1.2.2.2.2.1.1.2.cmml"><msup id="S3.E2.m1.2.2.2.2.1.1.2.3" xref="S3.E2.m1.2.2.2.2.1.1.2.3.cmml"><mtext id="S3.E2.m1.2.2.2.2.1.1.2.3.2" xref="S3.E2.m1.2.2.2.2.1.1.2.3.2a.cmml">GRU</mtext><mi id="S3.E2.m1.2.2.2.2.1.1.2.3.3" xref="S3.E2.m1.2.2.2.2.1.1.2.3.3.cmml">p</mi></msup><mo id="S3.E2.m1.2.2.2.2.1.1.2.2" xref="S3.E2.m1.2.2.2.2.1.1.2.2.cmml">⁢</mo><mrow id="S3.E2.m1.2.2.2.2.1.1.2.1.1" xref="S3.E2.m1.2.2.2.2.1.1.2.1.2.cmml"><mo id="S3.E2.m1.2.2.2.2.1.1.2.1.1.2" stretchy="false" xref="S3.E2.m1.2.2.2.2.1.1.2.1.2.cmml">(</mo><msub id="S3.E2.m1.2.2.2.2.1.1.2.1.1.1" xref="S3.E2.m1.2.2.2.2.1.1.2.1.1.1.cmml"><mi id="S3.E2.m1.2.2.2.2.1.1.2.1.1.1.2" xref="S3.E2.m1.2.2.2.2.1.1.2.1.1.1.2.cmml">x</mi><mi id="S3.E2.m1.2.2.2.2.1.1.2.1.1.1.3" xref="S3.E2.m1.2.2.2.2.1.1.2.1.1.1.3.cmml">t</mi></msub><mo id="S3.E2.m1.2.2.2.2.1.1.2.1.1.3" xref="S3.E2.m1.2.2.2.2.1.1.2.1.2.cmml">,</mo><mover accent="true" id="S3.E2.m1.2.2.2.2.1.1.1" xref="S3.E2.m1.2.2.2.2.1.1.1.cmml"><msubsup id="S3.E2.m1.2.2.2.2.1.1.1.2" xref="S3.E2.m1.2.2.2.2.1.1.1.2.cmml"><mi id="S3.E2.m1.2.2.2.2.1.1.1.2.2.2" xref="S3.E2.m1.2.2.2.2.1.1.1.2.2.2.cmml">h</mi><mrow id="S3.E2.m1.2.2.2.2.1.1.1.2.2.3" xref="S3.E2.m1.2.2.2.2.1.1.1.2.2.3.cmml"><mi id="S3.E2.m1.2.2.2.2.1.1.1.2.2.3.2" xref="S3.E2.m1.2.2.2.2.1.1.1.2.2.3.2.cmml">t</mi><mo id="S3.E2.m1.2.2.2.2.1.1.1.2.2.3.1" xref="S3.E2.m1.2.2.2.2.1.1.1.2.2.3.1.cmml">−</mo><mn id="S3.E2.m1.2.2.2.2.1.1.1.2.2.3.3" xref="S3.E2.m1.2.2.2.2.1.1.1.2.2.3.3.cmml">1</mn></mrow><mi id="S3.E2.m1.2.2.2.2.1.1.1.2.3" xref="S3.E2.m1.2.2.2.2.1.1.1.2.3.cmml">p</mi></msubsup><mo id="S3.E2.m1.2.2.2.2.1.1.1.1" 
stretchy="false" xref="S3.E2.m1.2.2.2.2.1.1.1.1.cmml">←</mo></mover><mo id="S3.E2.m1.2.2.2.2.1.1.2.1.1.4" stretchy="false" xref="S3.E2.m1.2.2.2.2.1.1.2.1.2.cmml">)</mo></mrow></mrow></mrow></mtd><mtd id="S3.E2.m1.2.2.2f" xref="S3.E2.m1.2.3.1.1.cmml"></mtd></mtr></mtable></mrow><annotation-xml encoding="MathML-Content" id="S3.E2.m1.2b"><apply id="S3.E2.m1.2.3.1.cmml" xref="S3.E2.m1.2.2"><csymbol cd="latexml" id="S3.E2.m1.2.3.1.1.cmml" xref="S3.E2.m1.2.2.3">cases</csymbol><apply id="S3.E2.m1.1.1.1.1.1.1.cmml" xref="S3.E2.m1.1.1.1.1.1.1"><eq id="S3.E2.m1.1.1.1.1.1.1.3.cmml" xref="S3.E2.m1.1.1.1.1.1.1.3"></eq><apply id="S3.E2.m1.1.1.1.1.1.1.4.cmml" xref="S3.E2.m1.1.1.1.1.1.1.4"><ci id="S3.E2.m1.1.1.1.1.1.1.4.1.cmml" xref="S3.E2.m1.1.1.1.1.1.1.4.1">→</ci><apply id="S3.E2.m1.1.1.1.1.1.1.4.2.cmml" xref="S3.E2.m1.1.1.1.1.1.1.4.2"><csymbol cd="ambiguous" id="S3.E2.m1.1.1.1.1.1.1.4.2.1.cmml" xref="S3.E2.m1.1.1.1.1.1.1.4.2">superscript</csymbol><apply id="S3.E2.m1.1.1.1.1.1.1.4.2.2.cmml" xref="S3.E2.m1.1.1.1.1.1.1.4.2"><csymbol cd="ambiguous" id="S3.E2.m1.1.1.1.1.1.1.4.2.2.1.cmml" xref="S3.E2.m1.1.1.1.1.1.1.4.2">subscript</csymbol><ci id="S3.E2.m1.1.1.1.1.1.1.4.2.2.2.cmml" xref="S3.E2.m1.1.1.1.1.1.1.4.2.2.2">ℎ</ci><ci id="S3.E2.m1.1.1.1.1.1.1.4.2.2.3.cmml" xref="S3.E2.m1.1.1.1.1.1.1.4.2.2.3">𝑡</ci></apply><ci id="S3.E2.m1.1.1.1.1.1.1.4.2.3.cmml" xref="S3.E2.m1.1.1.1.1.1.1.4.2.3">𝑝</ci></apply></apply><apply id="S3.E2.m1.1.1.1.1.1.1.2.cmml" xref="S3.E2.m1.1.1.1.1.1.1.2"><times id="S3.E2.m1.1.1.1.1.1.1.2.2.cmml" xref="S3.E2.m1.1.1.1.1.1.1.2.2"></times><apply id="S3.E2.m1.1.1.1.1.1.1.2.3.cmml" xref="S3.E2.m1.1.1.1.1.1.1.2.3"><csymbol cd="ambiguous" id="S3.E2.m1.1.1.1.1.1.1.2.3.1.cmml" xref="S3.E2.m1.1.1.1.1.1.1.2.3">superscript</csymbol><ci id="S3.E2.m1.1.1.1.1.1.1.2.3.2a.cmml" xref="S3.E2.m1.1.1.1.1.1.1.2.3.2"><mtext id="S3.E2.m1.1.1.1.1.1.1.2.3.2.cmml" xref="S3.E2.m1.1.1.1.1.1.1.2.3.2">GRU</mtext></ci><ci id="S3.E2.m1.1.1.1.1.1.1.2.3.3.cmml" 
xref="S3.E2.m1.1.1.1.1.1.1.2.3.3">𝑝</ci></apply><interval closure="open" id="S3.E2.m1.1.1.1.1.1.1.2.1.2.cmml" xref="S3.E2.m1.1.1.1.1.1.1.2.1.1"><apply id="S3.E2.m1.1.1.1.1.1.1.2.1.1.1.cmml" xref="S3.E2.m1.1.1.1.1.1.1.2.1.1.1"><csymbol cd="ambiguous" id="S3.E2.m1.1.1.1.1.1.1.2.1.1.1.1.cmml" xref="S3.E2.m1.1.1.1.1.1.1.2.1.1.1">subscript</csymbol><ci id="S3.E2.m1.1.1.1.1.1.1.2.1.1.1.2.cmml" xref="S3.E2.m1.1.1.1.1.1.1.2.1.1.1.2">𝑥</ci><ci id="S3.E2.m1.1.1.1.1.1.1.2.1.1.1.3.cmml" xref="S3.E2.m1.1.1.1.1.1.1.2.1.1.1.3">𝑡</ci></apply><apply id="S3.E2.m1.1.1.1.1.1.1.1.cmml" xref="S3.E2.m1.1.1.1.1.1.1.1"><ci id="S3.E2.m1.1.1.1.1.1.1.1.1.cmml" xref="S3.E2.m1.1.1.1.1.1.1.1.1">→</ci><apply id="S3.E2.m1.1.1.1.1.1.1.1.2.cmml" xref="S3.E2.m1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S3.E2.m1.1.1.1.1.1.1.1.2.1.cmml" xref="S3.E2.m1.1.1.1.1.1.1.1.2">superscript</csymbol><apply id="S3.E2.m1.1.1.1.1.1.1.1.2.2.cmml" xref="S3.E2.m1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S3.E2.m1.1.1.1.1.1.1.1.2.2.1.cmml" xref="S3.E2.m1.1.1.1.1.1.1.1.2">subscript</csymbol><ci id="S3.E2.m1.1.1.1.1.1.1.1.2.2.2.cmml" xref="S3.E2.m1.1.1.1.1.1.1.1.2.2.2">ℎ</ci><apply id="S3.E2.m1.1.1.1.1.1.1.1.2.2.3.cmml" xref="S3.E2.m1.1.1.1.1.1.1.1.2.2.3"><minus id="S3.E2.m1.1.1.1.1.1.1.1.2.2.3.1.cmml" xref="S3.E2.m1.1.1.1.1.1.1.1.2.2.3.1"></minus><ci id="S3.E2.m1.1.1.1.1.1.1.1.2.2.3.2.cmml" xref="S3.E2.m1.1.1.1.1.1.1.1.2.2.3.2">𝑡</ci><cn id="S3.E2.m1.1.1.1.1.1.1.1.2.2.3.3.cmml" type="integer" xref="S3.E2.m1.1.1.1.1.1.1.1.2.2.3.3">1</cn></apply></apply><ci id="S3.E2.m1.1.1.1.1.1.1.1.2.3.cmml" xref="S3.E2.m1.1.1.1.1.1.1.1.2.3">𝑝</ci></apply></apply></interval></apply></apply><ci id="S3.E2.m1.2.3.1.3a.cmml" xref="S3.E2.m1.2.2"><mtext class="ltx_mathvariant_italic" id="S3.E2.m1.2.3.1.3.cmml" xref="S3.E2.m1.2.2.3">otherwise</mtext></ci><apply id="S3.E2.m1.2.2.2.2.1.1.cmml" xref="S3.E2.m1.2.2.2.2.1.1"><eq id="S3.E2.m1.2.2.2.2.1.1.3.cmml" xref="S3.E2.m1.2.2.2.2.1.1.3"></eq><apply id="S3.E2.m1.2.2.2.2.1.1.4.cmml" 
xref="S3.E2.m1.2.2.2.2.1.1.4"><ci id="S3.E2.m1.2.2.2.2.1.1.4.1.cmml" xref="S3.E2.m1.2.2.2.2.1.1.4.1">←</ci><apply id="S3.E2.m1.2.2.2.2.1.1.4.2.cmml" xref="S3.E2.m1.2.2.2.2.1.1.4.2"><csymbol cd="ambiguous" id="S3.E2.m1.2.2.2.2.1.1.4.2.1.cmml" xref="S3.E2.m1.2.2.2.2.1.1.4.2">superscript</csymbol><apply id="S3.E2.m1.2.2.2.2.1.1.4.2.2.cmml" xref="S3.E2.m1.2.2.2.2.1.1.4.2"><csymbol cd="ambiguous" id="S3.E2.m1.2.2.2.2.1.1.4.2.2.1.cmml" xref="S3.E2.m1.2.2.2.2.1.1.4.2">subscript</csymbol><ci id="S3.E2.m1.2.2.2.2.1.1.4.2.2.2.cmml" xref="S3.E2.m1.2.2.2.2.1.1.4.2.2.2">ℎ</ci><ci id="S3.E2.m1.2.2.2.2.1.1.4.2.2.3.cmml" xref="S3.E2.m1.2.2.2.2.1.1.4.2.2.3">𝑡</ci></apply><ci id="S3.E2.m1.2.2.2.2.1.1.4.2.3.cmml" xref="S3.E2.m1.2.2.2.2.1.1.4.2.3">𝑝</ci></apply></apply><apply id="S3.E2.m1.2.2.2.2.1.1.2.cmml" xref="S3.E2.m1.2.2.2.2.1.1.2"><times id="S3.E2.m1.2.2.2.2.1.1.2.2.cmml" xref="S3.E2.m1.2.2.2.2.1.1.2.2"></times><apply id="S3.E2.m1.2.2.2.2.1.1.2.3.cmml" xref="S3.E2.m1.2.2.2.2.1.1.2.3"><csymbol cd="ambiguous" id="S3.E2.m1.2.2.2.2.1.1.2.3.1.cmml" xref="S3.E2.m1.2.2.2.2.1.1.2.3">superscript</csymbol><ci id="S3.E2.m1.2.2.2.2.1.1.2.3.2a.cmml" xref="S3.E2.m1.2.2.2.2.1.1.2.3.2"><mtext id="S3.E2.m1.2.2.2.2.1.1.2.3.2.cmml" xref="S3.E2.m1.2.2.2.2.1.1.2.3.2">GRU</mtext></ci><ci id="S3.E2.m1.2.2.2.2.1.1.2.3.3.cmml" xref="S3.E2.m1.2.2.2.2.1.1.2.3.3">𝑝</ci></apply><interval closure="open" id="S3.E2.m1.2.2.2.2.1.1.2.1.2.cmml" xref="S3.E2.m1.2.2.2.2.1.1.2.1.1"><apply id="S3.E2.m1.2.2.2.2.1.1.2.1.1.1.cmml" xref="S3.E2.m1.2.2.2.2.1.1.2.1.1.1"><csymbol cd="ambiguous" id="S3.E2.m1.2.2.2.2.1.1.2.1.1.1.1.cmml" xref="S3.E2.m1.2.2.2.2.1.1.2.1.1.1">subscript</csymbol><ci id="S3.E2.m1.2.2.2.2.1.1.2.1.1.1.2.cmml" xref="S3.E2.m1.2.2.2.2.1.1.2.1.1.1.2">𝑥</ci><ci id="S3.E2.m1.2.2.2.2.1.1.2.1.1.1.3.cmml" xref="S3.E2.m1.2.2.2.2.1.1.2.1.1.1.3">𝑡</ci></apply><apply id="S3.E2.m1.2.2.2.2.1.1.1.cmml" xref="S3.E2.m1.2.2.2.2.1.1.1"><ci id="S3.E2.m1.2.2.2.2.1.1.1.1.cmml" xref="S3.E2.m1.2.2.2.2.1.1.1.1">←</ci><apply 
id="S3.E2.m1.2.2.2.2.1.1.1.2.cmml" xref="S3.E2.m1.2.2.2.2.1.1.1.2"><csymbol cd="ambiguous" id="S3.E2.m1.2.2.2.2.1.1.1.2.1.cmml" xref="S3.E2.m1.2.2.2.2.1.1.1.2">superscript</csymbol><apply id="S3.E2.m1.2.2.2.2.1.1.1.2.2.cmml" xref="S3.E2.m1.2.2.2.2.1.1.1.2"><csymbol cd="ambiguous" id="S3.E2.m1.2.2.2.2.1.1.1.2.2.1.cmml" xref="S3.E2.m1.2.2.2.2.1.1.1.2">subscript</csymbol><ci id="S3.E2.m1.2.2.2.2.1.1.1.2.2.2.cmml" xref="S3.E2.m1.2.2.2.2.1.1.1.2.2.2">ℎ</ci><apply id="S3.E2.m1.2.2.2.2.1.1.1.2.2.3.cmml" xref="S3.E2.m1.2.2.2.2.1.1.1.2.2.3"><minus id="S3.E2.m1.2.2.2.2.1.1.1.2.2.3.1.cmml" xref="S3.E2.m1.2.2.2.2.1.1.1.2.2.3.1"></minus><ci id="S3.E2.m1.2.2.2.2.1.1.1.2.2.3.2.cmml" xref="S3.E2.m1.2.2.2.2.1.1.1.2.2.3.2">𝑡</ci><cn id="S3.E2.m1.2.2.2.2.1.1.1.2.2.3.3.cmml" type="integer" xref="S3.E2.m1.2.2.2.2.1.1.1.2.2.3.3">1</cn></apply></apply><ci id="S3.E2.m1.2.2.2.2.1.1.1.2.3.cmml" xref="S3.E2.m1.2.2.2.2.1.1.1.2.3">𝑝</ci></apply></apply></interval></apply></apply><ci id="S3.E2.m1.2.3.1.5a.cmml" xref="S3.E2.m1.2.2"><mtext class="ltx_mathvariant_italic" id="S3.E2.m1.2.3.1.5.cmml" xref="S3.E2.m1.2.2.3">otherwise</mtext></ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.E2.m1.2c">\begin{cases}\overrightarrow{h_{t}^{p}}=\text{GRU}^{p}(x_{t},\overrightarrow{h% _{t-1}^{p}})\\ \overleftarrow{h_{t}^{p}}=\text{GRU}^{p}(x_{t},\overleftarrow{h_{t-1}^{p}})% \end{cases}</annotation><annotation encoding="application/x-llamapun" id="S3.E2.m1.2d">{ start_ROW start_CELL over→ start_ARG italic_h start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT end_ARG = GRU start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT ( italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , over→ start_ARG italic_h start_POSTSUBSCRIPT italic_t - 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT end_ARG ) end_CELL start_CELL end_CELL end_ROW start_ROW start_CELL over← start_ARG italic_h start_POSTSUBSCRIPT italic_t 
end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT end_ARG = GRU start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT ( italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , over← start_ARG italic_h start_POSTSUBSCRIPT italic_t - 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT end_ARG ) end_CELL start_CELL end_CELL end_ROW</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(2)</span></td> </tr></tbody> </table> </div> <div class="ltx_para" id="S3.SS1.p3"> <p class="ltx_p" id="S3.SS1.p3.3">The initial states of the Bi-GRU are set to zero vectors, i.e., <math alttext="\overrightarrow{h_{0}^{p}}=0" class="ltx_Math" display="inline" id="S3.SS1.p3.1.m1.1"><semantics id="S3.SS1.p3.1.m1.1a"><mrow id="S3.SS1.p3.1.m1.1.1" xref="S3.SS1.p3.1.m1.1.1.cmml"><mover accent="true" id="S3.SS1.p3.1.m1.1.1.2" xref="S3.SS1.p3.1.m1.1.1.2.cmml"><msubsup id="S3.SS1.p3.1.m1.1.1.2.2" xref="S3.SS1.p3.1.m1.1.1.2.2.cmml"><mi id="S3.SS1.p3.1.m1.1.1.2.2.2.2" xref="S3.SS1.p3.1.m1.1.1.2.2.2.2.cmml">h</mi><mn id="S3.SS1.p3.1.m1.1.1.2.2.2.3" xref="S3.SS1.p3.1.m1.1.1.2.2.2.3.cmml">0</mn><mi id="S3.SS1.p3.1.m1.1.1.2.2.3" xref="S3.SS1.p3.1.m1.1.1.2.2.3.cmml">p</mi></msubsup><mo id="S3.SS1.p3.1.m1.1.1.2.1" stretchy="false" xref="S3.SS1.p3.1.m1.1.1.2.1.cmml">→</mo></mover><mo id="S3.SS1.p3.1.m1.1.1.1" xref="S3.SS1.p3.1.m1.1.1.1.cmml">=</mo><mn id="S3.SS1.p3.1.m1.1.1.3" xref="S3.SS1.p3.1.m1.1.1.3.cmml">0</mn></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p3.1.m1.1b"><apply id="S3.SS1.p3.1.m1.1.1.cmml" xref="S3.SS1.p3.1.m1.1.1"><eq id="S3.SS1.p3.1.m1.1.1.1.cmml" xref="S3.SS1.p3.1.m1.1.1.1"></eq><apply id="S3.SS1.p3.1.m1.1.1.2.cmml" xref="S3.SS1.p3.1.m1.1.1.2"><ci id="S3.SS1.p3.1.m1.1.1.2.1.cmml" xref="S3.SS1.p3.1.m1.1.1.2.1">→</ci><apply id="S3.SS1.p3.1.m1.1.1.2.2.cmml" 
xref="S3.SS1.p3.1.m1.1.1.2.2"><csymbol cd="ambiguous" id="S3.SS1.p3.1.m1.1.1.2.2.1.cmml" xref="S3.SS1.p3.1.m1.1.1.2.2">superscript</csymbol><apply id="S3.SS1.p3.1.m1.1.1.2.2.2.cmml" xref="S3.SS1.p3.1.m1.1.1.2.2"><csymbol cd="ambiguous" id="S3.SS1.p3.1.m1.1.1.2.2.2.1.cmml" xref="S3.SS1.p3.1.m1.1.1.2.2">subscript</csymbol><ci id="S3.SS1.p3.1.m1.1.1.2.2.2.2.cmml" xref="S3.SS1.p3.1.m1.1.1.2.2.2.2">ℎ</ci><cn id="S3.SS1.p3.1.m1.1.1.2.2.2.3.cmml" type="integer" xref="S3.SS1.p3.1.m1.1.1.2.2.2.3">0</cn></apply><ci id="S3.SS1.p3.1.m1.1.1.2.2.3.cmml" xref="S3.SS1.p3.1.m1.1.1.2.2.3">𝑝</ci></apply></apply><cn id="S3.SS1.p3.1.m1.1.1.3.cmml" type="integer" xref="S3.SS1.p3.1.m1.1.1.3">0</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p3.1.m1.1c">\overrightarrow{h_{0}^{p}}=0</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p3.1.m1.1d">over→ start_ARG italic_h start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT end_ARG = 0</annotation></semantics></math> and <math alttext="\overleftarrow{h_{0}^{p}}=0" class="ltx_Math" display="inline" id="S3.SS1.p3.2.m2.1"><semantics id="S3.SS1.p3.2.m2.1a"><mrow id="S3.SS1.p3.2.m2.1.1" xref="S3.SS1.p3.2.m2.1.1.cmml"><mover accent="true" id="S3.SS1.p3.2.m2.1.1.2" xref="S3.SS1.p3.2.m2.1.1.2.cmml"><msubsup id="S3.SS1.p3.2.m2.1.1.2.2" xref="S3.SS1.p3.2.m2.1.1.2.2.cmml"><mi id="S3.SS1.p3.2.m2.1.1.2.2.2.2" xref="S3.SS1.p3.2.m2.1.1.2.2.2.2.cmml">h</mi><mn id="S3.SS1.p3.2.m2.1.1.2.2.2.3" xref="S3.SS1.p3.2.m2.1.1.2.2.2.3.cmml">0</mn><mi id="S3.SS1.p3.2.m2.1.1.2.2.3" xref="S3.SS1.p3.2.m2.1.1.2.2.3.cmml">p</mi></msubsup><mo id="S3.SS1.p3.2.m2.1.1.2.1" stretchy="false" xref="S3.SS1.p3.2.m2.1.1.2.1.cmml">←</mo></mover><mo id="S3.SS1.p3.2.m2.1.1.1" xref="S3.SS1.p3.2.m2.1.1.1.cmml">=</mo><mn id="S3.SS1.p3.2.m2.1.1.3" xref="S3.SS1.p3.2.m2.1.1.3.cmml">0</mn></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p3.2.m2.1b"><apply id="S3.SS1.p3.2.m2.1.1.cmml" 
xref="S3.SS1.p3.2.m2.1.1"><eq id="S3.SS1.p3.2.m2.1.1.1.cmml" xref="S3.SS1.p3.2.m2.1.1.1"></eq><apply id="S3.SS1.p3.2.m2.1.1.2.cmml" xref="S3.SS1.p3.2.m2.1.1.2"><ci id="S3.SS1.p3.2.m2.1.1.2.1.cmml" xref="S3.SS1.p3.2.m2.1.1.2.1">←</ci><apply id="S3.SS1.p3.2.m2.1.1.2.2.cmml" xref="S3.SS1.p3.2.m2.1.1.2.2"><csymbol cd="ambiguous" id="S3.SS1.p3.2.m2.1.1.2.2.1.cmml" xref="S3.SS1.p3.2.m2.1.1.2.2">superscript</csymbol><apply id="S3.SS1.p3.2.m2.1.1.2.2.2.cmml" xref="S3.SS1.p3.2.m2.1.1.2.2"><csymbol cd="ambiguous" id="S3.SS1.p3.2.m2.1.1.2.2.2.1.cmml" xref="S3.SS1.p3.2.m2.1.1.2.2">subscript</csymbol><ci id="S3.SS1.p3.2.m2.1.1.2.2.2.2.cmml" xref="S3.SS1.p3.2.m2.1.1.2.2.2.2">ℎ</ci><cn id="S3.SS1.p3.2.m2.1.1.2.2.2.3.cmml" type="integer" xref="S3.SS1.p3.2.m2.1.1.2.2.2.3">0</cn></apply><ci id="S3.SS1.p3.2.m2.1.1.2.2.3.cmml" xref="S3.SS1.p3.2.m2.1.1.2.2.3">𝑝</ci></apply></apply><cn id="S3.SS1.p3.2.m2.1.1.3.cmml" type="integer" xref="S3.SS1.p3.2.m2.1.1.3">0</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p3.2.m2.1c">\overleftarrow{h_{0}^{p}}=0</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p3.2.m2.1d">over← start_ARG italic_h start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT end_ARG = 0</annotation></semantics></math>. 
After the main encoder reads the input sentence, each word in the sentence can be represented by concatenating the forward and backward GRU states as <math alttext="h_{t}^{p}=[\overrightarrow{h_{t}^{p}},\overleftarrow{h_{t}^{p}}]" class="ltx_Math" display="inline" id="S3.SS1.p3.3.m3.2"><semantics id="S3.SS1.p3.3.m3.2a"><mrow id="S3.SS1.p3.3.m3.2.3" xref="S3.SS1.p3.3.m3.2.3.cmml"><msubsup id="S3.SS1.p3.3.m3.2.3.2" xref="S3.SS1.p3.3.m3.2.3.2.cmml"><mi id="S3.SS1.p3.3.m3.2.3.2.2.2" xref="S3.SS1.p3.3.m3.2.3.2.2.2.cmml">h</mi><mi id="S3.SS1.p3.3.m3.2.3.2.2.3" xref="S3.SS1.p3.3.m3.2.3.2.2.3.cmml">t</mi><mi id="S3.SS1.p3.3.m3.2.3.2.3" xref="S3.SS1.p3.3.m3.2.3.2.3.cmml">p</mi></msubsup><mo id="S3.SS1.p3.3.m3.2.3.1" xref="S3.SS1.p3.3.m3.2.3.1.cmml">=</mo><mrow id="S3.SS1.p3.3.m3.2.3.3.2" xref="S3.SS1.p3.3.m3.2.3.3.1.cmml"><mo id="S3.SS1.p3.3.m3.2.3.3.2.1" stretchy="false" xref="S3.SS1.p3.3.m3.2.3.3.1.cmml">[</mo><mover accent="true" id="S3.SS1.p3.3.m3.1.1" xref="S3.SS1.p3.3.m3.1.1.cmml"><msubsup id="S3.SS1.p3.3.m3.1.1.2" xref="S3.SS1.p3.3.m3.1.1.2.cmml"><mi id="S3.SS1.p3.3.m3.1.1.2.2.2" xref="S3.SS1.p3.3.m3.1.1.2.2.2.cmml">h</mi><mi id="S3.SS1.p3.3.m3.1.1.2.2.3" xref="S3.SS1.p3.3.m3.1.1.2.2.3.cmml">t</mi><mi id="S3.SS1.p3.3.m3.1.1.2.3" xref="S3.SS1.p3.3.m3.1.1.2.3.cmml">p</mi></msubsup><mo id="S3.SS1.p3.3.m3.1.1.1" stretchy="false" xref="S3.SS1.p3.3.m3.1.1.1.cmml">→</mo></mover><mo id="S3.SS1.p3.3.m3.2.3.3.2.2" xref="S3.SS1.p3.3.m3.2.3.3.1.cmml">,</mo><mover accent="true" id="S3.SS1.p3.3.m3.2.2" xref="S3.SS1.p3.3.m3.2.2.cmml"><msubsup id="S3.SS1.p3.3.m3.2.2.2" xref="S3.SS1.p3.3.m3.2.2.2.cmml"><mi id="S3.SS1.p3.3.m3.2.2.2.2.2" xref="S3.SS1.p3.3.m3.2.2.2.2.2.cmml">h</mi><mi id="S3.SS1.p3.3.m3.2.2.2.2.3" xref="S3.SS1.p3.3.m3.2.2.2.2.3.cmml">t</mi><mi id="S3.SS1.p3.3.m3.2.2.2.3" xref="S3.SS1.p3.3.m3.2.2.2.3.cmml">p</mi></msubsup><mo id="S3.SS1.p3.3.m3.2.2.1" stretchy="false" xref="S3.SS1.p3.3.m3.2.2.1.cmml">←</mo></mover><mo id="S3.SS1.p3.3.m3.2.3.3.2.3" stretchy="false" 
xref="S3.SS1.p3.3.m3.2.3.3.1.cmml">]</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p3.3.m3.2b"><apply id="S3.SS1.p3.3.m3.2.3.cmml" xref="S3.SS1.p3.3.m3.2.3"><eq id="S3.SS1.p3.3.m3.2.3.1.cmml" xref="S3.SS1.p3.3.m3.2.3.1"></eq><apply id="S3.SS1.p3.3.m3.2.3.2.cmml" xref="S3.SS1.p3.3.m3.2.3.2"><csymbol cd="ambiguous" id="S3.SS1.p3.3.m3.2.3.2.1.cmml" xref="S3.SS1.p3.3.m3.2.3.2">superscript</csymbol><apply id="S3.SS1.p3.3.m3.2.3.2.2.cmml" xref="S3.SS1.p3.3.m3.2.3.2"><csymbol cd="ambiguous" id="S3.SS1.p3.3.m3.2.3.2.2.1.cmml" xref="S3.SS1.p3.3.m3.2.3.2">subscript</csymbol><ci id="S3.SS1.p3.3.m3.2.3.2.2.2.cmml" xref="S3.SS1.p3.3.m3.2.3.2.2.2">ℎ</ci><ci id="S3.SS1.p3.3.m3.2.3.2.2.3.cmml" xref="S3.SS1.p3.3.m3.2.3.2.2.3">𝑡</ci></apply><ci id="S3.SS1.p3.3.m3.2.3.2.3.cmml" xref="S3.SS1.p3.3.m3.2.3.2.3">𝑝</ci></apply><interval closure="closed" id="S3.SS1.p3.3.m3.2.3.3.1.cmml" xref="S3.SS1.p3.3.m3.2.3.3.2"><apply id="S3.SS1.p3.3.m3.1.1.cmml" xref="S3.SS1.p3.3.m3.1.1"><ci id="S3.SS1.p3.3.m3.1.1.1.cmml" xref="S3.SS1.p3.3.m3.1.1.1">→</ci><apply id="S3.SS1.p3.3.m3.1.1.2.cmml" xref="S3.SS1.p3.3.m3.1.1.2"><csymbol cd="ambiguous" id="S3.SS1.p3.3.m3.1.1.2.1.cmml" xref="S3.SS1.p3.3.m3.1.1.2">superscript</csymbol><apply id="S3.SS1.p3.3.m3.1.1.2.2.cmml" xref="S3.SS1.p3.3.m3.1.1.2"><csymbol cd="ambiguous" id="S3.SS1.p3.3.m3.1.1.2.2.1.cmml" xref="S3.SS1.p3.3.m3.1.1.2">subscript</csymbol><ci id="S3.SS1.p3.3.m3.1.1.2.2.2.cmml" xref="S3.SS1.p3.3.m3.1.1.2.2.2">ℎ</ci><ci id="S3.SS1.p3.3.m3.1.1.2.2.3.cmml" xref="S3.SS1.p3.3.m3.1.1.2.2.3">𝑡</ci></apply><ci id="S3.SS1.p3.3.m3.1.1.2.3.cmml" xref="S3.SS1.p3.3.m3.1.1.2.3">𝑝</ci></apply></apply><apply id="S3.SS1.p3.3.m3.2.2.cmml" xref="S3.SS1.p3.3.m3.2.2"><ci id="S3.SS1.p3.3.m3.2.2.1.cmml" xref="S3.SS1.p3.3.m3.2.2.1">←</ci><apply id="S3.SS1.p3.3.m3.2.2.2.cmml" xref="S3.SS1.p3.3.m3.2.2.2"><csymbol cd="ambiguous" id="S3.SS1.p3.3.m3.2.2.2.1.cmml" xref="S3.SS1.p3.3.m3.2.2.2">superscript</csymbol><apply id="S3.SS1.p3.3.m3.2.2.2.2.cmml" 
xref="S3.SS1.p3.3.m3.2.2.2"><csymbol cd="ambiguous" id="S3.SS1.p3.3.m3.2.2.2.2.1.cmml" xref="S3.SS1.p3.3.m3.2.2.2">subscript</csymbol><ci id="S3.SS1.p3.3.m3.2.2.2.2.2.cmml" xref="S3.SS1.p3.3.m3.2.2.2.2.2">ℎ</ci><ci id="S3.SS1.p3.3.m3.2.2.2.2.3.cmml" xref="S3.SS1.p3.3.m3.2.2.2.2.3">𝑡</ci></apply><ci id="S3.SS1.p3.3.m3.2.2.2.3.cmml" xref="S3.SS1.p3.3.m3.2.2.2.3">𝑝</ci></apply></apply></interval></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p3.3.m3.2c">h_{t}^{p}=[\overrightarrow{h_{t}^{p}},\overleftarrow{h_{t}^{p}}]</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p3.3.m3.2d">italic_h start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT = [ over→ start_ARG italic_h start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT end_ARG , over← start_ARG italic_h start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT end_ARG ]</annotation></semantics></math>. Subsequently, the representation of the text sequence in the patent specification is modeled as a nonlinear transformation of the average of the Bi-GRU hidden states. 
This is expressed as:</p> <table class="ltx_equation ltx_eqn_table" id="S3.E3"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="C^{p}=\tanh(W_{p}\cdot\frac{1}{N}\sum_{t=1}^{N}h_{t}^{p}+b_{p})" class="ltx_Math" display="block" id="S3.E3.m1.2"><semantics id="S3.E3.m1.2a"><mrow id="S3.E3.m1.2.2" xref="S3.E3.m1.2.2.cmml"><msup id="S3.E3.m1.2.2.3" xref="S3.E3.m1.2.2.3.cmml"><mi id="S3.E3.m1.2.2.3.2" xref="S3.E3.m1.2.2.3.2.cmml">C</mi><mi id="S3.E3.m1.2.2.3.3" xref="S3.E3.m1.2.2.3.3.cmml">p</mi></msup><mo id="S3.E3.m1.2.2.2" xref="S3.E3.m1.2.2.2.cmml">=</mo><mrow id="S3.E3.m1.2.2.1.1" xref="S3.E3.m1.2.2.1.2.cmml"><mi id="S3.E3.m1.1.1" xref="S3.E3.m1.1.1.cmml">tanh</mi><mo id="S3.E3.m1.2.2.1.1a" xref="S3.E3.m1.2.2.1.2.cmml">⁡</mo><mrow id="S3.E3.m1.2.2.1.1.1" xref="S3.E3.m1.2.2.1.2.cmml"><mo id="S3.E3.m1.2.2.1.1.1.2" stretchy="false" xref="S3.E3.m1.2.2.1.2.cmml">(</mo><mrow id="S3.E3.m1.2.2.1.1.1.1" xref="S3.E3.m1.2.2.1.1.1.1.cmml"><mrow id="S3.E3.m1.2.2.1.1.1.1.2" xref="S3.E3.m1.2.2.1.1.1.1.2.cmml"><mrow id="S3.E3.m1.2.2.1.1.1.1.2.2" xref="S3.E3.m1.2.2.1.1.1.1.2.2.cmml"><msub id="S3.E3.m1.2.2.1.1.1.1.2.2.2" xref="S3.E3.m1.2.2.1.1.1.1.2.2.2.cmml"><mi id="S3.E3.m1.2.2.1.1.1.1.2.2.2.2" xref="S3.E3.m1.2.2.1.1.1.1.2.2.2.2.cmml">W</mi><mi id="S3.E3.m1.2.2.1.1.1.1.2.2.2.3" xref="S3.E3.m1.2.2.1.1.1.1.2.2.2.3.cmml">p</mi></msub><mo id="S3.E3.m1.2.2.1.1.1.1.2.2.1" lspace="0.222em" rspace="0.222em" xref="S3.E3.m1.2.2.1.1.1.1.2.2.1.cmml">⋅</mo><mfrac id="S3.E3.m1.2.2.1.1.1.1.2.2.3" xref="S3.E3.m1.2.2.1.1.1.1.2.2.3.cmml"><mn id="S3.E3.m1.2.2.1.1.1.1.2.2.3.2" xref="S3.E3.m1.2.2.1.1.1.1.2.2.3.2.cmml">1</mn><mi id="S3.E3.m1.2.2.1.1.1.1.2.2.3.3" xref="S3.E3.m1.2.2.1.1.1.1.2.2.3.3.cmml">N</mi></mfrac></mrow><mo id="S3.E3.m1.2.2.1.1.1.1.2.1" xref="S3.E3.m1.2.2.1.1.1.1.2.1.cmml">⁢</mo><mrow id="S3.E3.m1.2.2.1.1.1.1.2.3" 
xref="S3.E3.m1.2.2.1.1.1.1.2.3.cmml"><munderover id="S3.E3.m1.2.2.1.1.1.1.2.3.1" xref="S3.E3.m1.2.2.1.1.1.1.2.3.1.cmml"><mo id="S3.E3.m1.2.2.1.1.1.1.2.3.1.2.2" movablelimits="false" xref="S3.E3.m1.2.2.1.1.1.1.2.3.1.2.2.cmml">∑</mo><mrow id="S3.E3.m1.2.2.1.1.1.1.2.3.1.2.3" xref="S3.E3.m1.2.2.1.1.1.1.2.3.1.2.3.cmml"><mi id="S3.E3.m1.2.2.1.1.1.1.2.3.1.2.3.2" xref="S3.E3.m1.2.2.1.1.1.1.2.3.1.2.3.2.cmml">t</mi><mo id="S3.E3.m1.2.2.1.1.1.1.2.3.1.2.3.1" xref="S3.E3.m1.2.2.1.1.1.1.2.3.1.2.3.1.cmml">=</mo><mn id="S3.E3.m1.2.2.1.1.1.1.2.3.1.2.3.3" xref="S3.E3.m1.2.2.1.1.1.1.2.3.1.2.3.3.cmml">1</mn></mrow><mi id="S3.E3.m1.2.2.1.1.1.1.2.3.1.3" xref="S3.E3.m1.2.2.1.1.1.1.2.3.1.3.cmml">N</mi></munderover><msubsup id="S3.E3.m1.2.2.1.1.1.1.2.3.2" xref="S3.E3.m1.2.2.1.1.1.1.2.3.2.cmml"><mi id="S3.E3.m1.2.2.1.1.1.1.2.3.2.2.2" xref="S3.E3.m1.2.2.1.1.1.1.2.3.2.2.2.cmml">h</mi><mi id="S3.E3.m1.2.2.1.1.1.1.2.3.2.2.3" xref="S3.E3.m1.2.2.1.1.1.1.2.3.2.2.3.cmml">t</mi><mi id="S3.E3.m1.2.2.1.1.1.1.2.3.2.3" xref="S3.E3.m1.2.2.1.1.1.1.2.3.2.3.cmml">p</mi></msubsup></mrow></mrow><mo id="S3.E3.m1.2.2.1.1.1.1.1" xref="S3.E3.m1.2.2.1.1.1.1.1.cmml">+</mo><msub id="S3.E3.m1.2.2.1.1.1.1.3" xref="S3.E3.m1.2.2.1.1.1.1.3.cmml"><mi id="S3.E3.m1.2.2.1.1.1.1.3.2" xref="S3.E3.m1.2.2.1.1.1.1.3.2.cmml">b</mi><mi id="S3.E3.m1.2.2.1.1.1.1.3.3" xref="S3.E3.m1.2.2.1.1.1.1.3.3.cmml">p</mi></msub></mrow><mo id="S3.E3.m1.2.2.1.1.1.3" stretchy="false" xref="S3.E3.m1.2.2.1.2.cmml">)</mo></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.E3.m1.2b"><apply id="S3.E3.m1.2.2.cmml" xref="S3.E3.m1.2.2"><eq id="S3.E3.m1.2.2.2.cmml" xref="S3.E3.m1.2.2.2"></eq><apply id="S3.E3.m1.2.2.3.cmml" xref="S3.E3.m1.2.2.3"><csymbol cd="ambiguous" id="S3.E3.m1.2.2.3.1.cmml" xref="S3.E3.m1.2.2.3">superscript</csymbol><ci id="S3.E3.m1.2.2.3.2.cmml" xref="S3.E3.m1.2.2.3.2">𝐶</ci><ci id="S3.E3.m1.2.2.3.3.cmml" xref="S3.E3.m1.2.2.3.3">𝑝</ci></apply><apply id="S3.E3.m1.2.2.1.2.cmml" xref="S3.E3.m1.2.2.1.1"><tanh 
id="S3.E3.m1.1.1.cmml" xref="S3.E3.m1.1.1"></tanh><apply id="S3.E3.m1.2.2.1.1.1.1.cmml" xref="S3.E3.m1.2.2.1.1.1.1"><plus id="S3.E3.m1.2.2.1.1.1.1.1.cmml" xref="S3.E3.m1.2.2.1.1.1.1.1"></plus><apply id="S3.E3.m1.2.2.1.1.1.1.2.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2"><times id="S3.E3.m1.2.2.1.1.1.1.2.1.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.1"></times><apply id="S3.E3.m1.2.2.1.1.1.1.2.2.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.2"><ci id="S3.E3.m1.2.2.1.1.1.1.2.2.1.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.2.1">⋅</ci><apply id="S3.E3.m1.2.2.1.1.1.1.2.2.2.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.2.2"><csymbol cd="ambiguous" id="S3.E3.m1.2.2.1.1.1.1.2.2.2.1.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.2.2">subscript</csymbol><ci id="S3.E3.m1.2.2.1.1.1.1.2.2.2.2.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.2.2.2">𝑊</ci><ci id="S3.E3.m1.2.2.1.1.1.1.2.2.2.3.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.2.2.3">𝑝</ci></apply><apply id="S3.E3.m1.2.2.1.1.1.1.2.2.3.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.2.3"><divide id="S3.E3.m1.2.2.1.1.1.1.2.2.3.1.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.2.3"></divide><cn id="S3.E3.m1.2.2.1.1.1.1.2.2.3.2.cmml" type="integer" xref="S3.E3.m1.2.2.1.1.1.1.2.2.3.2">1</cn><ci id="S3.E3.m1.2.2.1.1.1.1.2.2.3.3.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.2.3.3">𝑁</ci></apply></apply><apply id="S3.E3.m1.2.2.1.1.1.1.2.3.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.3"><apply id="S3.E3.m1.2.2.1.1.1.1.2.3.1.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.3.1"><csymbol cd="ambiguous" id="S3.E3.m1.2.2.1.1.1.1.2.3.1.1.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.3.1">superscript</csymbol><apply id="S3.E3.m1.2.2.1.1.1.1.2.3.1.2.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.3.1"><csymbol cd="ambiguous" id="S3.E3.m1.2.2.1.1.1.1.2.3.1.2.1.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.3.1">subscript</csymbol><sum id="S3.E3.m1.2.2.1.1.1.1.2.3.1.2.2.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.3.1.2.2"></sum><apply id="S3.E3.m1.2.2.1.1.1.1.2.3.1.2.3.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.3.1.2.3"><eq id="S3.E3.m1.2.2.1.1.1.1.2.3.1.2.3.1.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.3.1.2.3.1"></eq><ci 
id="S3.E3.m1.2.2.1.1.1.1.2.3.1.2.3.2.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.3.1.2.3.2">𝑡</ci><cn id="S3.E3.m1.2.2.1.1.1.1.2.3.1.2.3.3.cmml" type="integer" xref="S3.E3.m1.2.2.1.1.1.1.2.3.1.2.3.3">1</cn></apply></apply><ci id="S3.E3.m1.2.2.1.1.1.1.2.3.1.3.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.3.1.3">𝑁</ci></apply><apply id="S3.E3.m1.2.2.1.1.1.1.2.3.2.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.3.2"><csymbol cd="ambiguous" id="S3.E3.m1.2.2.1.1.1.1.2.3.2.1.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.3.2">superscript</csymbol><apply id="S3.E3.m1.2.2.1.1.1.1.2.3.2.2.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.3.2"><csymbol cd="ambiguous" id="S3.E3.m1.2.2.1.1.1.1.2.3.2.2.1.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.3.2">subscript</csymbol><ci id="S3.E3.m1.2.2.1.1.1.1.2.3.2.2.2.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.3.2.2.2">ℎ</ci><ci id="S3.E3.m1.2.2.1.1.1.1.2.3.2.2.3.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.3.2.2.3">𝑡</ci></apply><ci id="S3.E3.m1.2.2.1.1.1.1.2.3.2.3.cmml" xref="S3.E3.m1.2.2.1.1.1.1.2.3.2.3">𝑝</ci></apply></apply></apply><apply id="S3.E3.m1.2.2.1.1.1.1.3.cmml" xref="S3.E3.m1.2.2.1.1.1.1.3"><csymbol cd="ambiguous" id="S3.E3.m1.2.2.1.1.1.1.3.1.cmml" xref="S3.E3.m1.2.2.1.1.1.1.3">subscript</csymbol><ci id="S3.E3.m1.2.2.1.1.1.1.3.2.cmml" xref="S3.E3.m1.2.2.1.1.1.1.3.2">𝑏</ci><ci id="S3.E3.m1.2.2.1.1.1.1.3.3.cmml" xref="S3.E3.m1.2.2.1.1.1.1.3.3">𝑝</ci></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.E3.m1.2c">C^{p}=\tanh(W_{p}\cdot\frac{1}{N}\sum_{t=1}^{N}h_{t}^{p}+b_{p})</annotation><annotation encoding="application/x-llamapun" id="S3.E3.m1.2d">italic_C start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT = roman_tanh ( italic_W start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ⋅ divide start_ARG 1 end_ARG start_ARG italic_N end_ARG ∑ start_POSTSUBSCRIPT italic_t = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT italic_h start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT + italic_b 
start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT )</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(3)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S3.SS1.p3.6">where <math alttext="W_{p}" class="ltx_Math" display="inline" id="S3.SS1.p3.4.m1.1"><semantics id="S3.SS1.p3.4.m1.1a"><msub id="S3.SS1.p3.4.m1.1.1" xref="S3.SS1.p3.4.m1.1.1.cmml"><mi id="S3.SS1.p3.4.m1.1.1.2" xref="S3.SS1.p3.4.m1.1.1.2.cmml">W</mi><mi id="S3.SS1.p3.4.m1.1.1.3" xref="S3.SS1.p3.4.m1.1.1.3.cmml">p</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS1.p3.4.m1.1b"><apply id="S3.SS1.p3.4.m1.1.1.cmml" xref="S3.SS1.p3.4.m1.1.1"><csymbol cd="ambiguous" id="S3.SS1.p3.4.m1.1.1.1.cmml" xref="S3.SS1.p3.4.m1.1.1">subscript</csymbol><ci id="S3.SS1.p3.4.m1.1.1.2.cmml" xref="S3.SS1.p3.4.m1.1.1.2">𝑊</ci><ci id="S3.SS1.p3.4.m1.1.1.3.cmml" xref="S3.SS1.p3.4.m1.1.1.3">𝑝</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p3.4.m1.1c">W_{p}</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p3.4.m1.1d">italic_W start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT</annotation></semantics></math> and <math alttext="b_{p}" class="ltx_Math" display="inline" id="S3.SS1.p3.5.m2.1"><semantics id="S3.SS1.p3.5.m2.1a"><msub id="S3.SS1.p3.5.m2.1.1" xref="S3.SS1.p3.5.m2.1.1.cmml"><mi id="S3.SS1.p3.5.m2.1.1.2" xref="S3.SS1.p3.5.m2.1.1.2.cmml">b</mi><mi id="S3.SS1.p3.5.m2.1.1.3" xref="S3.SS1.p3.5.m2.1.1.3.cmml">p</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS1.p3.5.m2.1b"><apply id="S3.SS1.p3.5.m2.1.1.cmml" xref="S3.SS1.p3.5.m2.1.1"><csymbol cd="ambiguous" id="S3.SS1.p3.5.m2.1.1.1.cmml" xref="S3.SS1.p3.5.m2.1.1">subscript</csymbol><ci id="S3.SS1.p3.5.m2.1.1.2.cmml" xref="S3.SS1.p3.5.m2.1.1.2">𝑏</ci><ci id="S3.SS1.p3.5.m2.1.1.3.cmml" 
xref="S3.SS1.p3.5.m2.1.1.3">𝑝</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p3.5.m2.1c">b_{p}</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p3.5.m2.1d">italic_b start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT</annotation></semantics></math> are learnable parameters, and <math alttext="N" class="ltx_Math" display="inline" id="S3.SS1.p3.6.m3.1"><semantics id="S3.SS1.p3.6.m3.1a"><mi id="S3.SS1.p3.6.m3.1.1" xref="S3.SS1.p3.6.m3.1.1.cmml">N</mi><annotation-xml encoding="MathML-Content" id="S3.SS1.p3.6.m3.1b"><ci id="S3.SS1.p3.6.m3.1.1.cmml" xref="S3.SS1.p3.6.m3.1.1">𝑁</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p3.6.m3.1c">N</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p3.6.m3.1d">italic_N</annotation></semantics></math> is the length of the input sequence.</p> </div> </section> <section class="ltx_subsection" id="S3.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">3.2 </span>Slave Encoder</h3> <div class="ltx_para" id="S3.SS2.p1"> <p class="ltx_p" id="S3.SS2.p1.6">The slave encoder, depicted in Figure 1, uses a unidirectional GRU to process the input sequence from the patent specification once every <math alttext="K" class="ltx_Math" display="inline" id="S3.SS2.p1.1.m1.1"><semantics id="S3.SS2.p1.1.m1.1a"><mi id="S3.SS2.p1.1.m1.1.1" xref="S3.SS2.p1.1.m1.1.1.cmml">K</mi><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.1.m1.1b"><ci id="S3.SS2.p1.1.m1.1.1.cmml" xref="S3.SS2.p1.1.m1.1.1">𝐾</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.1.m1.1c">K</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.1.m1.1d">italic_K</annotation></semantics></math> decoding steps, producing hidden states and computing the context via an attention mechanism. 
It calculates the importance weight <math alttext="\alpha_{t}" class="ltx_Math" display="inline" id="S3.SS2.p1.2.m2.1"><semantics id="S3.SS2.p1.2.m2.1a"><msub id="S3.SS2.p1.2.m2.1.1" xref="S3.SS2.p1.2.m2.1.1.cmml"><mi id="S3.SS2.p1.2.m2.1.1.2" xref="S3.SS2.p1.2.m2.1.1.2.cmml">α</mi><mi id="S3.SS2.p1.2.m2.1.1.3" xref="S3.SS2.p1.2.m2.1.1.3.cmml">t</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.2.m2.1b"><apply id="S3.SS2.p1.2.m2.1.1.cmml" xref="S3.SS2.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S3.SS2.p1.2.m2.1.1.1.cmml" xref="S3.SS2.p1.2.m2.1.1">subscript</csymbol><ci id="S3.SS2.p1.2.m2.1.1.2.cmml" xref="S3.SS2.p1.2.m2.1.1.2">𝛼</ci><ci id="S3.SS2.p1.2.m2.1.1.3.cmml" xref="S3.SS2.p1.2.m2.1.1.3">𝑡</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.2.m2.1c">\alpha_{t}</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.2.m2.1d">italic_α start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math> based on the features of each word <math alttext="h_{t}^{p}" class="ltx_Math" display="inline" id="S3.SS2.p1.3.m3.1"><semantics id="S3.SS2.p1.3.m3.1a"><msubsup id="S3.SS2.p1.3.m3.1.1" xref="S3.SS2.p1.3.m3.1.1.cmml"><mi id="S3.SS2.p1.3.m3.1.1.2.2" xref="S3.SS2.p1.3.m3.1.1.2.2.cmml">h</mi><mi id="S3.SS2.p1.3.m3.1.1.2.3" xref="S3.SS2.p1.3.m3.1.1.2.3.cmml">t</mi><mi id="S3.SS2.p1.3.m3.1.1.3" xref="S3.SS2.p1.3.m3.1.1.3.cmml">p</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.3.m3.1b"><apply id="S3.SS2.p1.3.m3.1.1.cmml" xref="S3.SS2.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S3.SS2.p1.3.m3.1.1.1.cmml" xref="S3.SS2.p1.3.m3.1.1">superscript</csymbol><apply id="S3.SS2.p1.3.m3.1.1.2.cmml" xref="S3.SS2.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S3.SS2.p1.3.m3.1.1.2.1.cmml" xref="S3.SS2.p1.3.m3.1.1">subscript</csymbol><ci id="S3.SS2.p1.3.m3.1.1.2.2.cmml" xref="S3.SS2.p1.3.m3.1.1.2.2">ℎ</ci><ci id="S3.SS2.p1.3.m3.1.1.2.3.cmml" xref="S3.SS2.p1.3.m3.1.1.2.3">𝑡</ci></apply><ci 
id="S3.SS2.p1.3.m3.1.1.3.cmml" xref="S3.SS2.p1.3.m3.1.1.3">𝑝</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.3.m3.1c">h_{t}^{p}</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.3.m3.1d">italic_h start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT</annotation></semantics></math>, the contents of the specifications <math alttext="C^{p}" class="ltx_Math" display="inline" id="S3.SS2.p1.4.m4.1"><semantics id="S3.SS2.p1.4.m4.1a"><msup id="S3.SS2.p1.4.m4.1.1" xref="S3.SS2.p1.4.m4.1.1.cmml"><mi id="S3.SS2.p1.4.m4.1.1.2" xref="S3.SS2.p1.4.m4.1.1.2.cmml">C</mi><mi id="S3.SS2.p1.4.m4.1.1.3" xref="S3.SS2.p1.4.m4.1.1.3.cmml">p</mi></msup><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.4.m4.1b"><apply id="S3.SS2.p1.4.m4.1.1.cmml" xref="S3.SS2.p1.4.m4.1.1"><csymbol cd="ambiguous" id="S3.SS2.p1.4.m4.1.1.1.cmml" xref="S3.SS2.p1.4.m4.1.1">superscript</csymbol><ci id="S3.SS2.p1.4.m4.1.1.2.cmml" xref="S3.SS2.p1.4.m4.1.1.2">𝐶</ci><ci id="S3.SS2.p1.4.m4.1.1.3.cmml" xref="S3.SS2.p1.4.m4.1.1.3">𝑝</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.4.m4.1c">C^{p}</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.4.m4.1d">italic_C start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT</annotation></semantics></math>, claims <math alttext="C^{q}" class="ltx_Math" display="inline" id="S3.SS2.p1.5.m5.1"><semantics id="S3.SS2.p1.5.m5.1a"><msup id="S3.SS2.p1.5.m5.1.1" xref="S3.SS2.p1.5.m5.1.1.cmml"><mi id="S3.SS2.p1.5.m5.1.1.2" xref="S3.SS2.p1.5.m5.1.1.2.cmml">C</mi><mi id="S3.SS2.p1.5.m5.1.1.3" xref="S3.SS2.p1.5.m5.1.1.3.cmml">q</mi></msup><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.5.m5.1b"><apply id="S3.SS2.p1.5.m5.1.1.cmml" xref="S3.SS2.p1.5.m5.1.1"><csymbol cd="ambiguous" id="S3.SS2.p1.5.m5.1.1.1.cmml" xref="S3.SS2.p1.5.m5.1.1">superscript</csymbol><ci id="S3.SS2.p1.5.m5.1.1.2.cmml" 
xref="S3.SS2.p1.5.m5.1.1.2">𝐶</ci><ci id="S3.SS2.p1.5.m5.1.1.3.cmml" xref="S3.SS2.p1.5.m5.1.1.3">𝑞</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.5.m5.1c">C^{q}</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.5.m5.1d">italic_C start_POSTSUPERSCRIPT italic_q end_POSTSUPERSCRIPT</annotation></semantics></math>, and the decoder’s output <math alttext="C^{d}" class="ltx_Math" display="inline" id="S3.SS2.p1.6.m6.1"><semantics id="S3.SS2.p1.6.m6.1a"><msup id="S3.SS2.p1.6.m6.1.1" xref="S3.SS2.p1.6.m6.1.1.cmml"><mi id="S3.SS2.p1.6.m6.1.1.2" xref="S3.SS2.p1.6.m6.1.1.2.cmml">C</mi><mi id="S3.SS2.p1.6.m6.1.1.3" xref="S3.SS2.p1.6.m6.1.1.3.cmml">d</mi></msup><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.6.m6.1b"><apply id="S3.SS2.p1.6.m6.1.1.cmml" xref="S3.SS2.p1.6.m6.1.1"><csymbol cd="ambiguous" id="S3.SS2.p1.6.m6.1.1.1.cmml" xref="S3.SS2.p1.6.m6.1.1">superscript</csymbol><ci id="S3.SS2.p1.6.m6.1.1.2.cmml" xref="S3.SS2.p1.6.m6.1.1.2">𝐶</ci><ci id="S3.SS2.p1.6.m6.1.1.3.cmml" xref="S3.SS2.p1.6.m6.1.1.3">𝑑</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.6.m6.1c">C^{d}</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.6.m6.1d">italic_C start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT</annotation></semantics></math> as follows:</p> <table class="ltx_equation ltx_eqn_table" id="S3.E4"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="\begin{split}\alpha_{t}=\sigma(W_{2}(\tanh(W_{1}[h_{t}^{p},C^{p},C^{q},C^{d}]+% b_{1}))+\\ h_{t}^{p}{}^{T}W_{s}C^{p}+h_{t}^{p}{}^{T}W_{s}C^{d}-C^{p}{}^{T}W_{r}C^{d}+C^{q% }{}^{T}W_{k}+b_{2})\\ \end{split}" class="ltx_Math" display="block" id="S3.E4.m1.68"><semantics id="S3.E4.m1.68a"><mtable displaystyle="true" id="S3.E4.m1.67.67" rowspacing="0pt" xref="S3.E4.m1.68.68.1.cmml"><mtr id="S3.E4.m1.67.67a" 
xref="S3.E4.m1.68.68.1.cmml"><mtd class="ltx_align_right" columnalign="right" id="S3.E4.m1.67.67b" xref="S3.E4.m1.68.68.1.cmml"><mrow id="S3.E4.m1.32.32.32.32.32" xref="S3.E4.m1.68.68.1.cmml"><msub id="S3.E4.m1.32.32.32.32.32.33" xref="S3.E4.m1.68.68.1.cmml"><mi id="S3.E4.m1.1.1.1.1.1.1" xref="S3.E4.m1.1.1.1.1.1.1.cmml">α</mi><mi id="S3.E4.m1.2.2.2.2.2.2.1" xref="S3.E4.m1.2.2.2.2.2.2.1.cmml">t</mi></msub><mo id="S3.E4.m1.3.3.3.3.3.3" xref="S3.E4.m1.3.3.3.3.3.3.cmml">=</mo><mi id="S3.E4.m1.4.4.4.4.4.4" xref="S3.E4.m1.4.4.4.4.4.4.cmml">σ</mi><mrow id="S3.E4.m1.32.32.32.32.32.34" xref="S3.E4.m1.68.68.1.cmml"><mo id="S3.E4.m1.5.5.5.5.5.5" stretchy="false" xref="S3.E4.m1.68.68.1.cmml">(</mo><msub id="S3.E4.m1.32.32.32.32.32.34.1" xref="S3.E4.m1.68.68.1.cmml"><mi id="S3.E4.m1.6.6.6.6.6.6" xref="S3.E4.m1.6.6.6.6.6.6.cmml">W</mi><mn id="S3.E4.m1.7.7.7.7.7.7.1" xref="S3.E4.m1.7.7.7.7.7.7.1.cmml">2</mn></msub><mrow id="S3.E4.m1.32.32.32.32.32.34.2" xref="S3.E4.m1.68.68.1.cmml"><mo id="S3.E4.m1.8.8.8.8.8.8" stretchy="false" xref="S3.E4.m1.68.68.1.cmml">(</mo><mi id="S3.E4.m1.9.9.9.9.9.9" xref="S3.E4.m1.9.9.9.9.9.9.cmml">tanh</mi><mrow id="S3.E4.m1.32.32.32.32.32.34.2.1" xref="S3.E4.m1.68.68.1.cmml"><mo id="S3.E4.m1.10.10.10.10.10.10" stretchy="false" xref="S3.E4.m1.68.68.1.cmml">(</mo><msub id="S3.E4.m1.32.32.32.32.32.34.2.1.1" xref="S3.E4.m1.68.68.1.cmml"><mi id="S3.E4.m1.11.11.11.11.11.11" xref="S3.E4.m1.11.11.11.11.11.11.cmml">W</mi><mn id="S3.E4.m1.12.12.12.12.12.12.1" xref="S3.E4.m1.12.12.12.12.12.12.1.cmml">1</mn></msub><mrow id="S3.E4.m1.32.32.32.32.32.34.2.1.2" xref="S3.E4.m1.68.68.1.cmml"><mo id="S3.E4.m1.13.13.13.13.13.13" stretchy="false" xref="S3.E4.m1.68.68.1.cmml">[</mo><msubsup id="S3.E4.m1.32.32.32.32.32.34.2.1.2.1" xref="S3.E4.m1.68.68.1.cmml"><mi id="S3.E4.m1.14.14.14.14.14.14" xref="S3.E4.m1.14.14.14.14.14.14.cmml">h</mi><mi id="S3.E4.m1.15.15.15.15.15.15.1" xref="S3.E4.m1.15.15.15.15.15.15.1.cmml">t</mi><mi id="S3.E4.m1.16.16.16.16.16.16.1" 
xref="S3.E4.m1.16.16.16.16.16.16.1.cmml">p</mi></msubsup><mo id="S3.E4.m1.17.17.17.17.17.17" xref="S3.E4.m1.68.68.1.cmml">,</mo><msup id="S3.E4.m1.32.32.32.32.32.34.2.1.2.2" xref="S3.E4.m1.68.68.1.cmml"><mi id="S3.E4.m1.18.18.18.18.18.18" xref="S3.E4.m1.18.18.18.18.18.18.cmml">C</mi><mi id="S3.E4.m1.19.19.19.19.19.19.1" xref="S3.E4.m1.19.19.19.19.19.19.1.cmml">p</mi></msup><mo id="S3.E4.m1.20.20.20.20.20.20" xref="S3.E4.m1.68.68.1.cmml">,</mo><msup id="S3.E4.m1.32.32.32.32.32.34.2.1.2.3" xref="S3.E4.m1.68.68.1.cmml"><mi id="S3.E4.m1.21.21.21.21.21.21" xref="S3.E4.m1.21.21.21.21.21.21.cmml">C</mi><mi id="S3.E4.m1.22.22.22.22.22.22.1" xref="S3.E4.m1.22.22.22.22.22.22.1.cmml">q</mi></msup><mo id="S3.E4.m1.23.23.23.23.23.23" xref="S3.E4.m1.68.68.1.cmml">,</mo><msup id="S3.E4.m1.32.32.32.32.32.34.2.1.2.4" xref="S3.E4.m1.68.68.1.cmml"><mi id="S3.E4.m1.24.24.24.24.24.24" xref="S3.E4.m1.24.24.24.24.24.24.cmml">C</mi><mi id="S3.E4.m1.25.25.25.25.25.25.1" xref="S3.E4.m1.25.25.25.25.25.25.1.cmml">d</mi></msup><mo id="S3.E4.m1.26.26.26.26.26.26" stretchy="false" xref="S3.E4.m1.68.68.1.cmml">]</mo></mrow><mo id="S3.E4.m1.27.27.27.27.27.27" xref="S3.E4.m1.27.27.27.27.27.27.cmml">+</mo><msub id="S3.E4.m1.32.32.32.32.32.34.2.1.3" xref="S3.E4.m1.68.68.1.cmml"><mi id="S3.E4.m1.28.28.28.28.28.28" xref="S3.E4.m1.28.28.28.28.28.28.cmml">b</mi><mn id="S3.E4.m1.29.29.29.29.29.29.1" xref="S3.E4.m1.29.29.29.29.29.29.1.cmml">1</mn></msub><mo id="S3.E4.m1.30.30.30.30.30.30" stretchy="false" xref="S3.E4.m1.68.68.1.cmml">)</mo></mrow><mo id="S3.E4.m1.31.31.31.31.31.31" stretchy="false" xref="S3.E4.m1.68.68.1.cmml">)</mo></mrow><mo id="S3.E4.m1.32.32.32.32.32.32" xref="S3.E4.m1.32.32.32.32.32.32.cmml">+</mo></mrow></mrow></mtd></mtr><mtr id="S3.E4.m1.67.67c" xref="S3.E4.m1.68.68.1.cmml"><mtd class="ltx_align_right" columnalign="right" id="S3.E4.m1.67.67d" xref="S3.E4.m1.68.68.1.cmml"><mrow id="S3.E4.m1.67.67.67.35.35" xref="S3.E4.m1.68.68.1.cmml"><msubsup id="S3.E4.m1.67.67.67.35.35.36" 
xref="S3.E4.m1.68.68.1.cmml"><mi id="S3.E4.m1.33.33.33.1.1.1" xref="S3.E4.m1.33.33.33.1.1.1.cmml">h</mi><mi id="S3.E4.m1.34.34.34.2.2.2.1" xref="S3.E4.m1.34.34.34.2.2.2.1.cmml">t</mi><mi id="S3.E4.m1.35.35.35.3.3.3.1" xref="S3.E4.m1.35.35.35.3.3.3.1.cmml">p</mi></msubsup><mmultiscripts id="S3.E4.m1.67.67.67.35.35.37" xref="S3.E4.m1.68.68.1.cmml"><mi id="S3.E4.m1.37.37.37.5.5.5" xref="S3.E4.m1.37.37.37.5.5.5.cmml">W</mi><mi id="S3.E4.m1.38.38.38.6.6.6.1" xref="S3.E4.m1.38.38.38.6.6.6.1.cmml">s</mi><mrow id="S3.E4.m1.67.67.67.35.35.37a" xref="S3.E4.m1.68.68.1.cmml"></mrow><mprescripts id="S3.E4.m1.67.67.67.35.35.37b" xref="S3.E4.m1.68.68.1.cmml"></mprescripts><mrow id="S3.E4.m1.67.67.67.35.35.37c" xref="S3.E4.m1.68.68.1.cmml"></mrow><mi id="S3.E4.m1.36.36.36.4.4.4.1" xref="S3.E4.m1.36.36.36.4.4.4.1.cmml">T</mi></mmultiscripts><msup id="S3.E4.m1.67.67.67.35.35.38" xref="S3.E4.m1.68.68.1.cmml"><mi id="S3.E4.m1.39.39.39.7.7.7" xref="S3.E4.m1.39.39.39.7.7.7.cmml">C</mi><mi id="S3.E4.m1.40.40.40.8.8.8.1" xref="S3.E4.m1.40.40.40.8.8.8.1.cmml">p</mi></msup><mo id="S3.E4.m1.41.41.41.9.9.9" xref="S3.E4.m1.68.68.1.cmml">+</mo><msubsup id="S3.E4.m1.67.67.67.35.35.39" xref="S3.E4.m1.68.68.1.cmml"><mi id="S3.E4.m1.42.42.42.10.10.10" xref="S3.E4.m1.42.42.42.10.10.10.cmml">h</mi><mi id="S3.E4.m1.43.43.43.11.11.11.1" xref="S3.E4.m1.43.43.43.11.11.11.1.cmml">t</mi><mi id="S3.E4.m1.44.44.44.12.12.12.1" xref="S3.E4.m1.44.44.44.12.12.12.1.cmml">p</mi></msubsup><mmultiscripts id="S3.E4.m1.67.67.67.35.35.40" xref="S3.E4.m1.68.68.1.cmml"><mi id="S3.E4.m1.46.46.46.14.14.14" xref="S3.E4.m1.46.46.46.14.14.14.cmml">W</mi><mi id="S3.E4.m1.47.47.47.15.15.15.1" xref="S3.E4.m1.47.47.47.15.15.15.1.cmml">s</mi><mrow id="S3.E4.m1.67.67.67.35.35.40a" xref="S3.E4.m1.68.68.1.cmml"></mrow><mprescripts id="S3.E4.m1.67.67.67.35.35.40b" xref="S3.E4.m1.68.68.1.cmml"></mprescripts><mrow id="S3.E4.m1.67.67.67.35.35.40c" xref="S3.E4.m1.68.68.1.cmml"></mrow><mi id="S3.E4.m1.45.45.45.13.13.13.1" 
xref="S3.E4.m1.45.45.45.13.13.13.1.cmml">T</mi></mmultiscripts><msup id="S3.E4.m1.67.67.67.35.35.41" xref="S3.E4.m1.68.68.1.cmml"><mi id="S3.E4.m1.48.48.48.16.16.16" xref="S3.E4.m1.48.48.48.16.16.16.cmml">C</mi><mi id="S3.E4.m1.49.49.49.17.17.17.1" xref="S3.E4.m1.49.49.49.17.17.17.1.cmml">d</mi></msup><mo id="S3.E4.m1.50.50.50.18.18.18" xref="S3.E4.m1.50.50.50.18.18.18.cmml">−</mo><msup id="S3.E4.m1.67.67.67.35.35.42" xref="S3.E4.m1.68.68.1.cmml"><mi id="S3.E4.m1.51.51.51.19.19.19" xref="S3.E4.m1.51.51.51.19.19.19.cmml">C</mi><mi id="S3.E4.m1.52.52.52.20.20.20.1" xref="S3.E4.m1.52.52.52.20.20.20.1.cmml">p</mi></msup><mmultiscripts id="S3.E4.m1.67.67.67.35.35.43" xref="S3.E4.m1.68.68.1.cmml"><mi id="S3.E4.m1.54.54.54.22.22.22" xref="S3.E4.m1.54.54.54.22.22.22.cmml">W</mi><mi id="S3.E4.m1.55.55.55.23.23.23.1" xref="S3.E4.m1.55.55.55.23.23.23.1.cmml">r</mi><mrow id="S3.E4.m1.67.67.67.35.35.43a" xref="S3.E4.m1.68.68.1.cmml"></mrow><mprescripts id="S3.E4.m1.67.67.67.35.35.43b" xref="S3.E4.m1.68.68.1.cmml"></mprescripts><mrow id="S3.E4.m1.67.67.67.35.35.43c" xref="S3.E4.m1.68.68.1.cmml"></mrow><mi id="S3.E4.m1.53.53.53.21.21.21.1" xref="S3.E4.m1.53.53.53.21.21.21.1.cmml">T</mi></mmultiscripts><msup id="S3.E4.m1.67.67.67.35.35.44" xref="S3.E4.m1.68.68.1.cmml"><mi id="S3.E4.m1.56.56.56.24.24.24" xref="S3.E4.m1.56.56.56.24.24.24.cmml">C</mi><mi id="S3.E4.m1.57.57.57.25.25.25.1" xref="S3.E4.m1.57.57.57.25.25.25.1.cmml">d</mi></msup><mo id="S3.E4.m1.58.58.58.26.26.26" xref="S3.E4.m1.58.58.58.26.26.26.cmml">+</mo><msup id="S3.E4.m1.67.67.67.35.35.45" xref="S3.E4.m1.68.68.1.cmml"><mi id="S3.E4.m1.59.59.59.27.27.27" xref="S3.E4.m1.59.59.59.27.27.27.cmml">C</mi><mi id="S3.E4.m1.60.60.60.28.28.28.1" xref="S3.E4.m1.60.60.60.28.28.28.1.cmml">q</mi></msup><mmultiscripts id="S3.E4.m1.67.67.67.35.35.46" xref="S3.E4.m1.68.68.1.cmml"><mi id="S3.E4.m1.62.62.62.30.30.30" xref="S3.E4.m1.62.62.62.30.30.30.cmml">W</mi><mi id="S3.E4.m1.63.63.63.31.31.31.1" 
xref="S3.E4.m1.63.63.63.31.31.31.1.cmml">k</mi><mrow id="S3.E4.m1.67.67.67.35.35.46a" xref="S3.E4.m1.68.68.1.cmml"></mrow><mprescripts id="S3.E4.m1.67.67.67.35.35.46b" xref="S3.E4.m1.68.68.1.cmml"></mprescripts><mrow id="S3.E4.m1.67.67.67.35.35.46c" xref="S3.E4.m1.68.68.1.cmml"></mrow><mi id="S3.E4.m1.61.61.61.29.29.29.1" xref="S3.E4.m1.61.61.61.29.29.29.1.cmml">T</mi></mmultiscripts><mo id="S3.E4.m1.64.64.64.32.32.32" xref="S3.E4.m1.68.68.1.cmml">+</mo><msub id="S3.E4.m1.67.67.67.35.35.47" xref="S3.E4.m1.68.68.1.cmml"><mi id="S3.E4.m1.65.65.65.33.33.33" xref="S3.E4.m1.65.65.65.33.33.33.cmml">b</mi><mn id="S3.E4.m1.66.66.66.34.34.34.1" xref="S3.E4.m1.66.66.66.34.34.34.1.cmml">2</mn></msub><mo id="S3.E4.m1.67.67.67.35.35.35" stretchy="false" xref="S3.E4.m1.68.68.1.cmml">)</mo></mrow></mtd></mtr></mtable><annotation-xml encoding="MathML-Content" id="S3.E4.m1.68b"><apply id="S3.E4.m1.68.68.1.cmml" xref="S3.E4.m1.67.67"><eq id="S3.E4.m1.3.3.3.3.3.3.cmml" xref="S3.E4.m1.3.3.3.3.3.3"></eq><apply id="S3.E4.m1.68.68.1.3.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.3.1.cmml" xref="S3.E4.m1.67.67">subscript</csymbol><ci id="S3.E4.m1.1.1.1.1.1.1.cmml" xref="S3.E4.m1.1.1.1.1.1.1">𝛼</ci><ci id="S3.E4.m1.2.2.2.2.2.2.1.cmml" xref="S3.E4.m1.2.2.2.2.2.2.1">𝑡</ci></apply><apply id="S3.E4.m1.68.68.1.1.cmml" xref="S3.E4.m1.67.67"><times id="S3.E4.m1.68.68.1.1.2.cmml" xref="S3.E4.m1.67.67"></times><ci id="S3.E4.m1.4.4.4.4.4.4.cmml" xref="S3.E4.m1.4.4.4.4.4.4">𝜎</ci><apply id="S3.E4.m1.68.68.1.1.1.1.1.cmml" xref="S3.E4.m1.67.67"><plus id="S3.E4.m1.58.58.58.26.26.26.cmml" xref="S3.E4.m1.58.58.58.26.26.26"></plus><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.cmml" xref="S3.E4.m1.67.67"><minus id="S3.E4.m1.50.50.50.18.18.18.cmml" xref="S3.E4.m1.50.50.50.18.18.18"></minus><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.cmml" xref="S3.E4.m1.67.67"><plus id="S3.E4.m1.32.32.32.32.32.32.cmml" xref="S3.E4.m1.32.32.32.32.32.32"></plus><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.cmml" 
xref="S3.E4.m1.67.67"><times id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.2.cmml" xref="S3.E4.m1.67.67"></times><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.3.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.3.1.cmml" xref="S3.E4.m1.67.67">subscript</csymbol><ci id="S3.E4.m1.6.6.6.6.6.6.cmml" xref="S3.E4.m1.6.6.6.6.6.6">𝑊</ci><cn id="S3.E4.m1.7.7.7.7.7.7.1.cmml" type="integer" xref="S3.E4.m1.7.7.7.7.7.7.1">2</cn></apply><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S3.E4.m1.67.67"><tanh id="S3.E4.m1.9.9.9.9.9.9.cmml" xref="S3.E4.m1.9.9.9.9.9.9"></tanh><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S3.E4.m1.67.67"><plus id="S3.E4.m1.27.27.27.27.27.27.cmml" xref="S3.E4.m1.27.27.27.27.27.27"></plus><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.1.1.1.1.1.1.4.cmml" xref="S3.E4.m1.67.67"><times id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.1.1.1.1.1.1.4.5.cmml" xref="S3.E4.m1.67.67"></times><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.1.1.1.1.1.1.4.6.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.1.1.1.1.1.1.4.6.1.cmml" xref="S3.E4.m1.67.67">subscript</csymbol><ci id="S3.E4.m1.11.11.11.11.11.11.cmml" xref="S3.E4.m1.11.11.11.11.11.11">𝑊</ci><cn id="S3.E4.m1.12.12.12.12.12.12.1.cmml" type="integer" xref="S3.E4.m1.12.12.12.12.12.12.1">1</cn></apply><list id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.1.1.1.1.1.1.4.4.5.cmml" xref="S3.E4.m1.67.67"><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S3.E4.m1.67.67">superscript</csymbol><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S3.E4.m1.67.67">subscript</csymbol><ci id="S3.E4.m1.14.14.14.14.14.14.cmml" xref="S3.E4.m1.14.14.14.14.14.14">ℎ</ci><ci id="S3.E4.m1.15.15.15.15.15.15.1.cmml" 
xref="S3.E4.m1.15.15.15.15.15.15.1">𝑡</ci></apply><ci id="S3.E4.m1.16.16.16.16.16.16.1.cmml" xref="S3.E4.m1.16.16.16.16.16.16.1">𝑝</ci></apply><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.2.2.2.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.2.2.2.1.cmml" xref="S3.E4.m1.67.67">superscript</csymbol><ci id="S3.E4.m1.18.18.18.18.18.18.cmml" xref="S3.E4.m1.18.18.18.18.18.18">𝐶</ci><ci id="S3.E4.m1.19.19.19.19.19.19.1.cmml" xref="S3.E4.m1.19.19.19.19.19.19.1">𝑝</ci></apply><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.3.3.3.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.3.3.3.1.cmml" xref="S3.E4.m1.67.67">superscript</csymbol><ci id="S3.E4.m1.21.21.21.21.21.21.cmml" xref="S3.E4.m1.21.21.21.21.21.21">𝐶</ci><ci id="S3.E4.m1.22.22.22.22.22.22.1.cmml" xref="S3.E4.m1.22.22.22.22.22.22.1">𝑞</ci></apply><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.1.1.1.1.1.1.4.4.4.4.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.1.1.1.1.1.1.4.4.4.4.1.cmml" xref="S3.E4.m1.67.67">superscript</csymbol><ci id="S3.E4.m1.24.24.24.24.24.24.cmml" xref="S3.E4.m1.24.24.24.24.24.24">𝐶</ci><ci id="S3.E4.m1.25.25.25.25.25.25.1.cmml" xref="S3.E4.m1.25.25.25.25.25.25.1">𝑑</ci></apply></list></apply><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.1.1.1.1.1.1.6.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.1.1.1.1.1.1.1.1.1.6.1.cmml" xref="S3.E4.m1.67.67">subscript</csymbol><ci id="S3.E4.m1.28.28.28.28.28.28.cmml" xref="S3.E4.m1.28.28.28.28.28.28">𝑏</ci><cn id="S3.E4.m1.29.29.29.29.29.29.1.cmml" type="integer" xref="S3.E4.m1.29.29.29.29.29.29.1">1</cn></apply></apply></apply></apply><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.3.cmml" xref="S3.E4.m1.67.67"><times id="S3.E4.m1.68.68.1.1.1.1.1.1.1.3.1.cmml" xref="S3.E4.m1.67.67"></times><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.3.2.cmml" xref="S3.E4.m1.67.67"><csymbol 
cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.1.1.3.2.1.cmml" xref="S3.E4.m1.67.67">superscript</csymbol><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.3.2.2.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.1.1.3.2.2.1.cmml" xref="S3.E4.m1.67.67">subscript</csymbol><ci id="S3.E4.m1.33.33.33.1.1.1.cmml" xref="S3.E4.m1.33.33.33.1.1.1">ℎ</ci><ci id="S3.E4.m1.34.34.34.2.2.2.1.cmml" xref="S3.E4.m1.34.34.34.2.2.2.1">𝑡</ci></apply><ci id="S3.E4.m1.35.35.35.3.3.3.1.cmml" xref="S3.E4.m1.35.35.35.3.3.3.1">𝑝</ci></apply><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.3.3.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.1.1.3.3.1.cmml" xref="S3.E4.m1.67.67">superscript</csymbol><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.3.3.2.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.1.1.3.3.2.1.cmml" xref="S3.E4.m1.67.67">subscript</csymbol><ci id="S3.E4.m1.37.37.37.5.5.5.cmml" xref="S3.E4.m1.37.37.37.5.5.5">𝑊</ci><ci id="S3.E4.m1.38.38.38.6.6.6.1.cmml" xref="S3.E4.m1.38.38.38.6.6.6.1">𝑠</ci></apply><ci id="S3.E4.m1.36.36.36.4.4.4.1.cmml" xref="S3.E4.m1.36.36.36.4.4.4.1">𝑇</ci></apply><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.3.4.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.1.1.3.4.1.cmml" xref="S3.E4.m1.67.67">superscript</csymbol><ci id="S3.E4.m1.39.39.39.7.7.7.cmml" xref="S3.E4.m1.39.39.39.7.7.7">𝐶</ci><ci id="S3.E4.m1.40.40.40.8.8.8.1.cmml" xref="S3.E4.m1.40.40.40.8.8.8.1">𝑝</ci></apply></apply><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.4.cmml" xref="S3.E4.m1.67.67"><times id="S3.E4.m1.68.68.1.1.1.1.1.1.1.4.1.cmml" xref="S3.E4.m1.67.67"></times><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.4.2.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.1.1.4.2.1.cmml" xref="S3.E4.m1.67.67">superscript</csymbol><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.4.2.2.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.1.1.4.2.2.1.cmml" 
xref="S3.E4.m1.67.67">subscript</csymbol><ci id="S3.E4.m1.42.42.42.10.10.10.cmml" xref="S3.E4.m1.42.42.42.10.10.10">ℎ</ci><ci id="S3.E4.m1.43.43.43.11.11.11.1.cmml" xref="S3.E4.m1.43.43.43.11.11.11.1">𝑡</ci></apply><ci id="S3.E4.m1.44.44.44.12.12.12.1.cmml" xref="S3.E4.m1.44.44.44.12.12.12.1">𝑝</ci></apply><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.4.3.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.1.1.4.3.1.cmml" xref="S3.E4.m1.67.67">superscript</csymbol><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.4.3.2.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.1.1.4.3.2.1.cmml" xref="S3.E4.m1.67.67">subscript</csymbol><ci id="S3.E4.m1.46.46.46.14.14.14.cmml" xref="S3.E4.m1.46.46.46.14.14.14">𝑊</ci><ci id="S3.E4.m1.47.47.47.15.15.15.1.cmml" xref="S3.E4.m1.47.47.47.15.15.15.1">𝑠</ci></apply><ci id="S3.E4.m1.45.45.45.13.13.13.1.cmml" xref="S3.E4.m1.45.45.45.13.13.13.1">𝑇</ci></apply><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.1.4.4.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.1.1.4.4.1.cmml" xref="S3.E4.m1.67.67">superscript</csymbol><ci id="S3.E4.m1.48.48.48.16.16.16.cmml" xref="S3.E4.m1.48.48.48.16.16.16">𝐶</ci><ci id="S3.E4.m1.49.49.49.17.17.17.1.cmml" xref="S3.E4.m1.49.49.49.17.17.17.1">𝑑</ci></apply></apply></apply><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.3.cmml" xref="S3.E4.m1.67.67"><times id="S3.E4.m1.68.68.1.1.1.1.1.1.3.1.cmml" xref="S3.E4.m1.67.67"></times><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.3.2.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.1.3.2.1.cmml" xref="S3.E4.m1.67.67">superscript</csymbol><ci id="S3.E4.m1.51.51.51.19.19.19.cmml" xref="S3.E4.m1.51.51.51.19.19.19">𝐶</ci><ci id="S3.E4.m1.52.52.52.20.20.20.1.cmml" xref="S3.E4.m1.52.52.52.20.20.20.1">𝑝</ci></apply><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.3.3.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.1.3.3.1.cmml" 
xref="S3.E4.m1.67.67">superscript</csymbol><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.3.3.2.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.1.3.3.2.1.cmml" xref="S3.E4.m1.67.67">subscript</csymbol><ci id="S3.E4.m1.54.54.54.22.22.22.cmml" xref="S3.E4.m1.54.54.54.22.22.22">𝑊</ci><ci id="S3.E4.m1.55.55.55.23.23.23.1.cmml" xref="S3.E4.m1.55.55.55.23.23.23.1">𝑟</ci></apply><ci id="S3.E4.m1.53.53.53.21.21.21.1.cmml" xref="S3.E4.m1.53.53.53.21.21.21.1">𝑇</ci></apply><apply id="S3.E4.m1.68.68.1.1.1.1.1.1.3.4.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.1.3.4.1.cmml" xref="S3.E4.m1.67.67">superscript</csymbol><ci id="S3.E4.m1.56.56.56.24.24.24.cmml" xref="S3.E4.m1.56.56.56.24.24.24">𝐶</ci><ci id="S3.E4.m1.57.57.57.25.25.25.1.cmml" xref="S3.E4.m1.57.57.57.25.25.25.1">𝑑</ci></apply></apply></apply><apply id="S3.E4.m1.68.68.1.1.1.1.1.3.cmml" xref="S3.E4.m1.67.67"><times id="S3.E4.m1.68.68.1.1.1.1.1.3.1.cmml" xref="S3.E4.m1.67.67"></times><apply id="S3.E4.m1.68.68.1.1.1.1.1.3.2.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.3.2.1.cmml" xref="S3.E4.m1.67.67">superscript</csymbol><ci id="S3.E4.m1.59.59.59.27.27.27.cmml" xref="S3.E4.m1.59.59.59.27.27.27">𝐶</ci><ci id="S3.E4.m1.60.60.60.28.28.28.1.cmml" xref="S3.E4.m1.60.60.60.28.28.28.1">𝑞</ci></apply><apply id="S3.E4.m1.68.68.1.1.1.1.1.3.3.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.3.3.1.cmml" xref="S3.E4.m1.67.67">superscript</csymbol><apply id="S3.E4.m1.68.68.1.1.1.1.1.3.3.2.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.3.3.2.1.cmml" xref="S3.E4.m1.67.67">subscript</csymbol><ci id="S3.E4.m1.62.62.62.30.30.30.cmml" xref="S3.E4.m1.62.62.62.30.30.30">𝑊</ci><ci id="S3.E4.m1.63.63.63.31.31.31.1.cmml" xref="S3.E4.m1.63.63.63.31.31.31.1">𝑘</ci></apply><ci id="S3.E4.m1.61.61.61.29.29.29.1.cmml" xref="S3.E4.m1.61.61.61.29.29.29.1">𝑇</ci></apply></apply><apply 
id="S3.E4.m1.68.68.1.1.1.1.1.4.cmml" xref="S3.E4.m1.67.67"><csymbol cd="ambiguous" id="S3.E4.m1.68.68.1.1.1.1.1.4.1.cmml" xref="S3.E4.m1.67.67">subscript</csymbol><ci id="S3.E4.m1.65.65.65.33.33.33.cmml" xref="S3.E4.m1.65.65.65.33.33.33">𝑏</ci><cn id="S3.E4.m1.66.66.66.34.34.34.1.cmml" type="integer" xref="S3.E4.m1.66.66.66.34.34.34.1">2</cn></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.E4.m1.68c">\begin{split}\alpha_{t}=\sigma(W_{2}(\tanh(W_{1}[h_{t}^{p},C^{p},C^{q},C^{d}]+% b_{1}))+\\ h_{t}^{p}{}^{T}W_{s}C^{p}+h_{t}^{p}{}^{T}W_{s}C^{d}-C^{p}{}^{T}W_{r}C^{d}+C^{q% }{}^{T}W_{k}+b_{2})\\ \end{split}</annotation><annotation encoding="application/x-llamapun" id="S3.E4.m1.68d">start_ROW start_CELL italic_α start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = italic_σ ( italic_W start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ( roman_tanh ( italic_W start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT [ italic_h start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT , italic_C start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT , italic_C start_POSTSUPERSCRIPT italic_q end_POSTSUPERSCRIPT , italic_C start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT ] + italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) ) + end_CELL end_ROW start_ROW start_CELL italic_h start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT start_FLOATSUPERSCRIPT italic_T end_FLOATSUPERSCRIPT italic_W start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT italic_C start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT + italic_h start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT start_FLOATSUPERSCRIPT italic_T end_FLOATSUPERSCRIPT italic_W start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT italic_C start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT - italic_C start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT start_FLOATSUPERSCRIPT italic_T end_FLOATSUPERSCRIPT 
italic_W start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT italic_C start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT + italic_C start_POSTSUPERSCRIPT italic_q end_POSTSUPERSCRIPT start_FLOATSUPERSCRIPT italic_T end_FLOATSUPERSCRIPT italic_W start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT + italic_b start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) end_CELL end_ROW</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(4)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S3.SS2.p1.9">where parameters <math alttext="W_{1},W_{2},W_{s},W_{r},W_{k},b_{1},b_{2}" class="ltx_Math" display="inline" id="S3.SS2.p1.7.m1.7"><semantics id="S3.SS2.p1.7.m1.7a"><mrow id="S3.SS2.p1.7.m1.7.7.7" xref="S3.SS2.p1.7.m1.7.7.8.cmml"><msub id="S3.SS2.p1.7.m1.1.1.1.1" xref="S3.SS2.p1.7.m1.1.1.1.1.cmml"><mi id="S3.SS2.p1.7.m1.1.1.1.1.2" xref="S3.SS2.p1.7.m1.1.1.1.1.2.cmml">W</mi><mn id="S3.SS2.p1.7.m1.1.1.1.1.3" xref="S3.SS2.p1.7.m1.1.1.1.1.3.cmml">1</mn></msub><mo id="S3.SS2.p1.7.m1.7.7.7.8" xref="S3.SS2.p1.7.m1.7.7.8.cmml">,</mo><msub id="S3.SS2.p1.7.m1.2.2.2.2" xref="S3.SS2.p1.7.m1.2.2.2.2.cmml"><mi id="S3.SS2.p1.7.m1.2.2.2.2.2" xref="S3.SS2.p1.7.m1.2.2.2.2.2.cmml">W</mi><mn id="S3.SS2.p1.7.m1.2.2.2.2.3" xref="S3.SS2.p1.7.m1.2.2.2.2.3.cmml">2</mn></msub><mo id="S3.SS2.p1.7.m1.7.7.7.9" xref="S3.SS2.p1.7.m1.7.7.8.cmml">,</mo><msub id="S3.SS2.p1.7.m1.3.3.3.3" xref="S3.SS2.p1.7.m1.3.3.3.3.cmml"><mi id="S3.SS2.p1.7.m1.3.3.3.3.2" xref="S3.SS2.p1.7.m1.3.3.3.3.2.cmml">W</mi><mi id="S3.SS2.p1.7.m1.3.3.3.3.3" xref="S3.SS2.p1.7.m1.3.3.3.3.3.cmml">s</mi></msub><mo id="S3.SS2.p1.7.m1.7.7.7.10" xref="S3.SS2.p1.7.m1.7.7.8.cmml">,</mo><msub id="S3.SS2.p1.7.m1.4.4.4.4" xref="S3.SS2.p1.7.m1.4.4.4.4.cmml"><mi id="S3.SS2.p1.7.m1.4.4.4.4.2" xref="S3.SS2.p1.7.m1.4.4.4.4.2.cmml">W</mi><mi id="S3.SS2.p1.7.m1.4.4.4.4.3" 
xref="S3.SS2.p1.7.m1.4.4.4.4.3.cmml">r</mi></msub><mo id="S3.SS2.p1.7.m1.7.7.7.11" xref="S3.SS2.p1.7.m1.7.7.8.cmml">,</mo><msub id="S3.SS2.p1.7.m1.5.5.5.5" xref="S3.SS2.p1.7.m1.5.5.5.5.cmml"><mi id="S3.SS2.p1.7.m1.5.5.5.5.2" xref="S3.SS2.p1.7.m1.5.5.5.5.2.cmml">W</mi><mi id="S3.SS2.p1.7.m1.5.5.5.5.3" xref="S3.SS2.p1.7.m1.5.5.5.5.3.cmml">k</mi></msub><mo id="S3.SS2.p1.7.m1.7.7.7.12" xref="S3.SS2.p1.7.m1.7.7.8.cmml">,</mo><msub id="S3.SS2.p1.7.m1.6.6.6.6" xref="S3.SS2.p1.7.m1.6.6.6.6.cmml"><mi id="S3.SS2.p1.7.m1.6.6.6.6.2" xref="S3.SS2.p1.7.m1.6.6.6.6.2.cmml">b</mi><mn id="S3.SS2.p1.7.m1.6.6.6.6.3" xref="S3.SS2.p1.7.m1.6.6.6.6.3.cmml">1</mn></msub><mo id="S3.SS2.p1.7.m1.7.7.7.13" xref="S3.SS2.p1.7.m1.7.7.8.cmml">,</mo><msub id="S3.SS2.p1.7.m1.7.7.7.7" xref="S3.SS2.p1.7.m1.7.7.7.7.cmml"><mi id="S3.SS2.p1.7.m1.7.7.7.7.2" xref="S3.SS2.p1.7.m1.7.7.7.7.2.cmml">b</mi><mn id="S3.SS2.p1.7.m1.7.7.7.7.3" xref="S3.SS2.p1.7.m1.7.7.7.7.3.cmml">2</mn></msub></mrow><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.7.m1.7b"><list id="S3.SS2.p1.7.m1.7.7.8.cmml" xref="S3.SS2.p1.7.m1.7.7.7"><apply id="S3.SS2.p1.7.m1.1.1.1.1.cmml" xref="S3.SS2.p1.7.m1.1.1.1.1"><csymbol cd="ambiguous" id="S3.SS2.p1.7.m1.1.1.1.1.1.cmml" xref="S3.SS2.p1.7.m1.1.1.1.1">subscript</csymbol><ci id="S3.SS2.p1.7.m1.1.1.1.1.2.cmml" xref="S3.SS2.p1.7.m1.1.1.1.1.2">𝑊</ci><cn id="S3.SS2.p1.7.m1.1.1.1.1.3.cmml" type="integer" xref="S3.SS2.p1.7.m1.1.1.1.1.3">1</cn></apply><apply id="S3.SS2.p1.7.m1.2.2.2.2.cmml" xref="S3.SS2.p1.7.m1.2.2.2.2"><csymbol cd="ambiguous" id="S3.SS2.p1.7.m1.2.2.2.2.1.cmml" xref="S3.SS2.p1.7.m1.2.2.2.2">subscript</csymbol><ci id="S3.SS2.p1.7.m1.2.2.2.2.2.cmml" xref="S3.SS2.p1.7.m1.2.2.2.2.2">𝑊</ci><cn id="S3.SS2.p1.7.m1.2.2.2.2.3.cmml" type="integer" xref="S3.SS2.p1.7.m1.2.2.2.2.3">2</cn></apply><apply id="S3.SS2.p1.7.m1.3.3.3.3.cmml" xref="S3.SS2.p1.7.m1.3.3.3.3"><csymbol cd="ambiguous" id="S3.SS2.p1.7.m1.3.3.3.3.1.cmml" xref="S3.SS2.p1.7.m1.3.3.3.3">subscript</csymbol><ci 
id="S3.SS2.p1.7.m1.3.3.3.3.2.cmml" xref="S3.SS2.p1.7.m1.3.3.3.3.2">𝑊</ci><ci id="S3.SS2.p1.7.m1.3.3.3.3.3.cmml" xref="S3.SS2.p1.7.m1.3.3.3.3.3">𝑠</ci></apply><apply id="S3.SS2.p1.7.m1.4.4.4.4.cmml" xref="S3.SS2.p1.7.m1.4.4.4.4"><csymbol cd="ambiguous" id="S3.SS2.p1.7.m1.4.4.4.4.1.cmml" xref="S3.SS2.p1.7.m1.4.4.4.4">subscript</csymbol><ci id="S3.SS2.p1.7.m1.4.4.4.4.2.cmml" xref="S3.SS2.p1.7.m1.4.4.4.4.2">𝑊</ci><ci id="S3.SS2.p1.7.m1.4.4.4.4.3.cmml" xref="S3.SS2.p1.7.m1.4.4.4.4.3">𝑟</ci></apply><apply id="S3.SS2.p1.7.m1.5.5.5.5.cmml" xref="S3.SS2.p1.7.m1.5.5.5.5"><csymbol cd="ambiguous" id="S3.SS2.p1.7.m1.5.5.5.5.1.cmml" xref="S3.SS2.p1.7.m1.5.5.5.5">subscript</csymbol><ci id="S3.SS2.p1.7.m1.5.5.5.5.2.cmml" xref="S3.SS2.p1.7.m1.5.5.5.5.2">𝑊</ci><ci id="S3.SS2.p1.7.m1.5.5.5.5.3.cmml" xref="S3.SS2.p1.7.m1.5.5.5.5.3">𝑘</ci></apply><apply id="S3.SS2.p1.7.m1.6.6.6.6.cmml" xref="S3.SS2.p1.7.m1.6.6.6.6"><csymbol cd="ambiguous" id="S3.SS2.p1.7.m1.6.6.6.6.1.cmml" xref="S3.SS2.p1.7.m1.6.6.6.6">subscript</csymbol><ci id="S3.SS2.p1.7.m1.6.6.6.6.2.cmml" xref="S3.SS2.p1.7.m1.6.6.6.6.2">𝑏</ci><cn id="S3.SS2.p1.7.m1.6.6.6.6.3.cmml" type="integer" xref="S3.SS2.p1.7.m1.6.6.6.6.3">1</cn></apply><apply id="S3.SS2.p1.7.m1.7.7.7.7.cmml" xref="S3.SS2.p1.7.m1.7.7.7.7"><csymbol cd="ambiguous" id="S3.SS2.p1.7.m1.7.7.7.7.1.cmml" xref="S3.SS2.p1.7.m1.7.7.7.7">subscript</csymbol><ci id="S3.SS2.p1.7.m1.7.7.7.7.2.cmml" xref="S3.SS2.p1.7.m1.7.7.7.7.2">𝑏</ci><cn id="S3.SS2.p1.7.m1.7.7.7.7.3.cmml" type="integer" xref="S3.SS2.p1.7.m1.7.7.7.7.3">2</cn></apply></list></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.7.m1.7c">W_{1},W_{2},W_{s},W_{r},W_{k},b_{1},b_{2}</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.7.m1.7d">italic_W start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_W start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , italic_W start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT , italic_W start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT , italic_W 
start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT , italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_b start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT</annotation></semantics></math> adjust how the model attends to and processes the input words <math alttext="x_{t}" class="ltx_Math" display="inline" id="S3.SS2.p1.8.m2.1"><semantics id="S3.SS2.p1.8.m2.1a"><msub id="S3.SS2.p1.8.m2.1.1" xref="S3.SS2.p1.8.m2.1.1.cmml"><mi id="S3.SS2.p1.8.m2.1.1.2" xref="S3.SS2.p1.8.m2.1.1.2.cmml">x</mi><mi id="S3.SS2.p1.8.m2.1.1.3" xref="S3.SS2.p1.8.m2.1.1.3.cmml">t</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.8.m2.1b"><apply id="S3.SS2.p1.8.m2.1.1.cmml" xref="S3.SS2.p1.8.m2.1.1"><csymbol cd="ambiguous" id="S3.SS2.p1.8.m2.1.1.1.cmml" xref="S3.SS2.p1.8.m2.1.1">subscript</csymbol><ci id="S3.SS2.p1.8.m2.1.1.2.cmml" xref="S3.SS2.p1.8.m2.1.1.2">𝑥</ci><ci id="S3.SS2.p1.8.m2.1.1.3.cmml" xref="S3.SS2.p1.8.m2.1.1.3">𝑡</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.8.m2.1c">x_{t}</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.8.m2.1d">italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math> and <math alttext="x_{t}^{\prime}" class="ltx_Math" display="inline" id="S3.SS2.p1.9.m3.1"><semantics id="S3.SS2.p1.9.m3.1a"><msubsup id="S3.SS2.p1.9.m3.1.1" xref="S3.SS2.p1.9.m3.1.1.cmml"><mi id="S3.SS2.p1.9.m3.1.1.2.2" xref="S3.SS2.p1.9.m3.1.1.2.2.cmml">x</mi><mi id="S3.SS2.p1.9.m3.1.1.2.3" xref="S3.SS2.p1.9.m3.1.1.2.3.cmml">t</mi><mo id="S3.SS2.p1.9.m3.1.1.3" xref="S3.SS2.p1.9.m3.1.1.3.cmml">′</mo></msubsup><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.9.m3.1b"><apply id="S3.SS2.p1.9.m3.1.1.cmml" xref="S3.SS2.p1.9.m3.1.1"><csymbol cd="ambiguous" id="S3.SS2.p1.9.m3.1.1.1.cmml" xref="S3.SS2.p1.9.m3.1.1">superscript</csymbol><apply id="S3.SS2.p1.9.m3.1.1.2.cmml" xref="S3.SS2.p1.9.m3.1.1"><csymbol cd="ambiguous" id="S3.SS2.p1.9.m3.1.1.2.1.cmml" 
xref="S3.SS2.p1.9.m3.1.1">subscript</csymbol><ci id="S3.SS2.p1.9.m3.1.1.2.2.cmml" xref="S3.SS2.p1.9.m3.1.1.2.2">𝑥</ci><ci id="S3.SS2.p1.9.m3.1.1.2.3.cmml" xref="S3.SS2.p1.9.m3.1.1.2.3">𝑡</ci></apply><ci id="S3.SS2.p1.9.m3.1.1.3.cmml" xref="S3.SS2.p1.9.m3.1.1.3">′</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.9.m3.1c">x_{t}^{\prime}</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.9.m3.1d">italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT</annotation></semantics></math>.</p> </div> <div class="ltx_para" id="S3.SS2.p2"> <p class="ltx_p" id="S3.SS2.p2.2">The updating rule for the slave encoder’s hidden state <math alttext="h_{t}^{s}" class="ltx_Math" display="inline" id="S3.SS2.p2.1.m1.1"><semantics id="S3.SS2.p2.1.m1.1a"><msubsup id="S3.SS2.p2.1.m1.1.1" xref="S3.SS2.p2.1.m1.1.1.cmml"><mi id="S3.SS2.p2.1.m1.1.1.2.2" xref="S3.SS2.p2.1.m1.1.1.2.2.cmml">h</mi><mi id="S3.SS2.p2.1.m1.1.1.2.3" xref="S3.SS2.p2.1.m1.1.1.2.3.cmml">t</mi><mi id="S3.SS2.p2.1.m1.1.1.3" xref="S3.SS2.p2.1.m1.1.1.3.cmml">s</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.SS2.p2.1.m1.1b"><apply id="S3.SS2.p2.1.m1.1.1.cmml" xref="S3.SS2.p2.1.m1.1.1"><csymbol cd="ambiguous" id="S3.SS2.p2.1.m1.1.1.1.cmml" xref="S3.SS2.p2.1.m1.1.1">superscript</csymbol><apply id="S3.SS2.p2.1.m1.1.1.2.cmml" xref="S3.SS2.p2.1.m1.1.1"><csymbol cd="ambiguous" id="S3.SS2.p2.1.m1.1.1.2.1.cmml" xref="S3.SS2.p2.1.m1.1.1">subscript</csymbol><ci id="S3.SS2.p2.1.m1.1.1.2.2.cmml" xref="S3.SS2.p2.1.m1.1.1.2.2">ℎ</ci><ci id="S3.SS2.p2.1.m1.1.1.2.3.cmml" xref="S3.SS2.p2.1.m1.1.1.2.3">𝑡</ci></apply><ci id="S3.SS2.p2.1.m1.1.1.3.cmml" xref="S3.SS2.p2.1.m1.1.1.3">𝑠</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p2.1.m1.1c">h_{t}^{s}</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p2.1.m1.1d">italic_h start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT 
start_POSTSUPERSCRIPT italic_s end_POSTSUPERSCRIPT</annotation></semantics></math> based on <math alttext="\alpha_{t}" class="ltx_Math" display="inline" id="S3.SS2.p2.2.m2.1"><semantics id="S3.SS2.p2.2.m2.1a"><msub id="S3.SS2.p2.2.m2.1.1" xref="S3.SS2.p2.2.m2.1.1.cmml"><mi id="S3.SS2.p2.2.m2.1.1.2" xref="S3.SS2.p2.2.m2.1.1.2.cmml">α</mi><mi id="S3.SS2.p2.2.m2.1.1.3" xref="S3.SS2.p2.2.m2.1.1.3.cmml">t</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS2.p2.2.m2.1b"><apply id="S3.SS2.p2.2.m2.1.1.cmml" xref="S3.SS2.p2.2.m2.1.1"><csymbol cd="ambiguous" id="S3.SS2.p2.2.m2.1.1.1.cmml" xref="S3.SS2.p2.2.m2.1.1">subscript</csymbol><ci id="S3.SS2.p2.2.m2.1.1.2.cmml" xref="S3.SS2.p2.2.m2.1.1.2">𝛼</ci><ci id="S3.SS2.p2.2.m2.1.1.3.cmml" xref="S3.SS2.p2.2.m2.1.1.3">𝑡</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p2.2.m2.1c">\alpha_{t}</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p2.2.m2.1d">italic_α start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math> is:</p> <table class="ltx_equation ltx_eqn_table" id="S3.E5"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="h_{t}^{s}=(1-\alpha_{t})\odot h_{t-1}^{s}+\alpha_{t}\odot\text{GRU}^{s}(x_{t},% h_{t-1}^{s})" class="ltx_Math" display="block" id="S3.E5.m1.3"><semantics id="S3.E5.m1.3a"><mrow id="S3.E5.m1.3.3" xref="S3.E5.m1.3.3.cmml"><msubsup id="S3.E5.m1.3.3.5" xref="S3.E5.m1.3.3.5.cmml"><mi id="S3.E5.m1.3.3.5.2.2" xref="S3.E5.m1.3.3.5.2.2.cmml">h</mi><mi id="S3.E5.m1.3.3.5.2.3" xref="S3.E5.m1.3.3.5.2.3.cmml">t</mi><mi id="S3.E5.m1.3.3.5.3" xref="S3.E5.m1.3.3.5.3.cmml">s</mi></msubsup><mo id="S3.E5.m1.3.3.4" xref="S3.E5.m1.3.3.4.cmml">=</mo><mrow id="S3.E5.m1.3.3.3" xref="S3.E5.m1.3.3.3.cmml"><mrow id="S3.E5.m1.1.1.1.1" xref="S3.E5.m1.1.1.1.1.cmml"><mrow id="S3.E5.m1.1.1.1.1.1.1" 
xref="S3.E5.m1.1.1.1.1.1.1.1.cmml"><mo id="S3.E5.m1.1.1.1.1.1.1.2" stretchy="false" xref="S3.E5.m1.1.1.1.1.1.1.1.cmml">(</mo><mrow id="S3.E5.m1.1.1.1.1.1.1.1" xref="S3.E5.m1.1.1.1.1.1.1.1.cmml"><mn id="S3.E5.m1.1.1.1.1.1.1.1.2" xref="S3.E5.m1.1.1.1.1.1.1.1.2.cmml">1</mn><mo id="S3.E5.m1.1.1.1.1.1.1.1.1" xref="S3.E5.m1.1.1.1.1.1.1.1.1.cmml">−</mo><msub id="S3.E5.m1.1.1.1.1.1.1.1.3" xref="S3.E5.m1.1.1.1.1.1.1.1.3.cmml"><mi id="S3.E5.m1.1.1.1.1.1.1.1.3.2" xref="S3.E5.m1.1.1.1.1.1.1.1.3.2.cmml">α</mi><mi id="S3.E5.m1.1.1.1.1.1.1.1.3.3" xref="S3.E5.m1.1.1.1.1.1.1.1.3.3.cmml">t</mi></msub></mrow><mo id="S3.E5.m1.1.1.1.1.1.1.3" rspace="0.055em" stretchy="false" xref="S3.E5.m1.1.1.1.1.1.1.1.cmml">)</mo></mrow><mo id="S3.E5.m1.1.1.1.1.2" rspace="0.222em" xref="S3.E5.m1.1.1.1.1.2.cmml">⊙</mo><msubsup id="S3.E5.m1.1.1.1.1.3" xref="S3.E5.m1.1.1.1.1.3.cmml"><mi id="S3.E5.m1.1.1.1.1.3.2.2" xref="S3.E5.m1.1.1.1.1.3.2.2.cmml">h</mi><mrow id="S3.E5.m1.1.1.1.1.3.2.3" xref="S3.E5.m1.1.1.1.1.3.2.3.cmml"><mi id="S3.E5.m1.1.1.1.1.3.2.3.2" xref="S3.E5.m1.1.1.1.1.3.2.3.2.cmml">t</mi><mo id="S3.E5.m1.1.1.1.1.3.2.3.1" xref="S3.E5.m1.1.1.1.1.3.2.3.1.cmml">−</mo><mn id="S3.E5.m1.1.1.1.1.3.2.3.3" xref="S3.E5.m1.1.1.1.1.3.2.3.3.cmml">1</mn></mrow><mi id="S3.E5.m1.1.1.1.1.3.3" xref="S3.E5.m1.1.1.1.1.3.3.cmml">s</mi></msubsup></mrow><mo id="S3.E5.m1.3.3.3.4" xref="S3.E5.m1.3.3.3.4.cmml">+</mo><mrow id="S3.E5.m1.3.3.3.3" xref="S3.E5.m1.3.3.3.3.cmml"><mrow id="S3.E5.m1.3.3.3.3.4" xref="S3.E5.m1.3.3.3.3.4.cmml"><msub id="S3.E5.m1.3.3.3.3.4.2" xref="S3.E5.m1.3.3.3.3.4.2.cmml"><mi id="S3.E5.m1.3.3.3.3.4.2.2" xref="S3.E5.m1.3.3.3.3.4.2.2.cmml">α</mi><mi id="S3.E5.m1.3.3.3.3.4.2.3" xref="S3.E5.m1.3.3.3.3.4.2.3.cmml">t</mi></msub><mo id="S3.E5.m1.3.3.3.3.4.1" lspace="0.222em" rspace="0.222em" xref="S3.E5.m1.3.3.3.3.4.1.cmml">⊙</mo><msup id="S3.E5.m1.3.3.3.3.4.3" xref="S3.E5.m1.3.3.3.3.4.3.cmml"><mtext id="S3.E5.m1.3.3.3.3.4.3.2" xref="S3.E5.m1.3.3.3.3.4.3.2a.cmml">GRU</mtext><mi 
id="S3.E5.m1.3.3.3.3.4.3.3" xref="S3.E5.m1.3.3.3.3.4.3.3.cmml">s</mi></msup></mrow><mo id="S3.E5.m1.3.3.3.3.3" xref="S3.E5.m1.3.3.3.3.3.cmml">⁢</mo><mrow id="S3.E5.m1.3.3.3.3.2.2" xref="S3.E5.m1.3.3.3.3.2.3.cmml"><mo id="S3.E5.m1.3.3.3.3.2.2.3" stretchy="false" xref="S3.E5.m1.3.3.3.3.2.3.cmml">(</mo><msub id="S3.E5.m1.2.2.2.2.1.1.1" xref="S3.E5.m1.2.2.2.2.1.1.1.cmml"><mi id="S3.E5.m1.2.2.2.2.1.1.1.2" xref="S3.E5.m1.2.2.2.2.1.1.1.2.cmml">x</mi><mi id="S3.E5.m1.2.2.2.2.1.1.1.3" xref="S3.E5.m1.2.2.2.2.1.1.1.3.cmml">t</mi></msub><mo id="S3.E5.m1.3.3.3.3.2.2.4" xref="S3.E5.m1.3.3.3.3.2.3.cmml">,</mo><msubsup id="S3.E5.m1.3.3.3.3.2.2.2" xref="S3.E5.m1.3.3.3.3.2.2.2.cmml"><mi id="S3.E5.m1.3.3.3.3.2.2.2.2.2" xref="S3.E5.m1.3.3.3.3.2.2.2.2.2.cmml">h</mi><mrow id="S3.E5.m1.3.3.3.3.2.2.2.2.3" xref="S3.E5.m1.3.3.3.3.2.2.2.2.3.cmml"><mi id="S3.E5.m1.3.3.3.3.2.2.2.2.3.2" xref="S3.E5.m1.3.3.3.3.2.2.2.2.3.2.cmml">t</mi><mo id="S3.E5.m1.3.3.3.3.2.2.2.2.3.1" xref="S3.E5.m1.3.3.3.3.2.2.2.2.3.1.cmml">−</mo><mn id="S3.E5.m1.3.3.3.3.2.2.2.2.3.3" xref="S3.E5.m1.3.3.3.3.2.2.2.2.3.3.cmml">1</mn></mrow><mi id="S3.E5.m1.3.3.3.3.2.2.2.3" xref="S3.E5.m1.3.3.3.3.2.2.2.3.cmml">s</mi></msubsup><mo id="S3.E5.m1.3.3.3.3.2.2.5" stretchy="false" xref="S3.E5.m1.3.3.3.3.2.3.cmml">)</mo></mrow></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.E5.m1.3b"><apply id="S3.E5.m1.3.3.cmml" xref="S3.E5.m1.3.3"><eq id="S3.E5.m1.3.3.4.cmml" xref="S3.E5.m1.3.3.4"></eq><apply id="S3.E5.m1.3.3.5.cmml" xref="S3.E5.m1.3.3.5"><csymbol cd="ambiguous" id="S3.E5.m1.3.3.5.1.cmml" xref="S3.E5.m1.3.3.5">superscript</csymbol><apply id="S3.E5.m1.3.3.5.2.cmml" xref="S3.E5.m1.3.3.5"><csymbol cd="ambiguous" id="S3.E5.m1.3.3.5.2.1.cmml" xref="S3.E5.m1.3.3.5">subscript</csymbol><ci id="S3.E5.m1.3.3.5.2.2.cmml" xref="S3.E5.m1.3.3.5.2.2">ℎ</ci><ci id="S3.E5.m1.3.3.5.2.3.cmml" xref="S3.E5.m1.3.3.5.2.3">𝑡</ci></apply><ci id="S3.E5.m1.3.3.5.3.cmml" xref="S3.E5.m1.3.3.5.3">𝑠</ci></apply><apply id="S3.E5.m1.3.3.3.cmml" 
xref="S3.E5.m1.3.3.3"><plus id="S3.E5.m1.3.3.3.4.cmml" xref="S3.E5.m1.3.3.3.4"></plus><apply id="S3.E5.m1.1.1.1.1.cmml" xref="S3.E5.m1.1.1.1.1"><csymbol cd="latexml" id="S3.E5.m1.1.1.1.1.2.cmml" xref="S3.E5.m1.1.1.1.1.2">direct-product</csymbol><apply id="S3.E5.m1.1.1.1.1.1.1.1.cmml" xref="S3.E5.m1.1.1.1.1.1.1"><minus id="S3.E5.m1.1.1.1.1.1.1.1.1.cmml" xref="S3.E5.m1.1.1.1.1.1.1.1.1"></minus><cn id="S3.E5.m1.1.1.1.1.1.1.1.2.cmml" type="integer" xref="S3.E5.m1.1.1.1.1.1.1.1.2">1</cn><apply id="S3.E5.m1.1.1.1.1.1.1.1.3.cmml" xref="S3.E5.m1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S3.E5.m1.1.1.1.1.1.1.1.3.1.cmml" xref="S3.E5.m1.1.1.1.1.1.1.1.3">subscript</csymbol><ci id="S3.E5.m1.1.1.1.1.1.1.1.3.2.cmml" xref="S3.E5.m1.1.1.1.1.1.1.1.3.2">𝛼</ci><ci id="S3.E5.m1.1.1.1.1.1.1.1.3.3.cmml" xref="S3.E5.m1.1.1.1.1.1.1.1.3.3">𝑡</ci></apply></apply><apply id="S3.E5.m1.1.1.1.1.3.cmml" xref="S3.E5.m1.1.1.1.1.3"><csymbol cd="ambiguous" id="S3.E5.m1.1.1.1.1.3.1.cmml" xref="S3.E5.m1.1.1.1.1.3">superscript</csymbol><apply id="S3.E5.m1.1.1.1.1.3.2.cmml" xref="S3.E5.m1.1.1.1.1.3"><csymbol cd="ambiguous" id="S3.E5.m1.1.1.1.1.3.2.1.cmml" xref="S3.E5.m1.1.1.1.1.3">subscript</csymbol><ci id="S3.E5.m1.1.1.1.1.3.2.2.cmml" xref="S3.E5.m1.1.1.1.1.3.2.2">ℎ</ci><apply id="S3.E5.m1.1.1.1.1.3.2.3.cmml" xref="S3.E5.m1.1.1.1.1.3.2.3"><minus id="S3.E5.m1.1.1.1.1.3.2.3.1.cmml" xref="S3.E5.m1.1.1.1.1.3.2.3.1"></minus><ci id="S3.E5.m1.1.1.1.1.3.2.3.2.cmml" xref="S3.E5.m1.1.1.1.1.3.2.3.2">𝑡</ci><cn id="S3.E5.m1.1.1.1.1.3.2.3.3.cmml" type="integer" xref="S3.E5.m1.1.1.1.1.3.2.3.3">1</cn></apply></apply><ci id="S3.E5.m1.1.1.1.1.3.3.cmml" xref="S3.E5.m1.1.1.1.1.3.3">𝑠</ci></apply></apply><apply id="S3.E5.m1.3.3.3.3.cmml" xref="S3.E5.m1.3.3.3.3"><times id="S3.E5.m1.3.3.3.3.3.cmml" xref="S3.E5.m1.3.3.3.3.3"></times><apply id="S3.E5.m1.3.3.3.3.4.cmml" xref="S3.E5.m1.3.3.3.3.4"><csymbol cd="latexml" id="S3.E5.m1.3.3.3.3.4.1.cmml" xref="S3.E5.m1.3.3.3.3.4.1">direct-product</csymbol><apply 
id="S3.E5.m1.3.3.3.3.4.2.cmml" xref="S3.E5.m1.3.3.3.3.4.2"><csymbol cd="ambiguous" id="S3.E5.m1.3.3.3.3.4.2.1.cmml" xref="S3.E5.m1.3.3.3.3.4.2">subscript</csymbol><ci id="S3.E5.m1.3.3.3.3.4.2.2.cmml" xref="S3.E5.m1.3.3.3.3.4.2.2">𝛼</ci><ci id="S3.E5.m1.3.3.3.3.4.2.3.cmml" xref="S3.E5.m1.3.3.3.3.4.2.3">𝑡</ci></apply><apply id="S3.E5.m1.3.3.3.3.4.3.cmml" xref="S3.E5.m1.3.3.3.3.4.3"><csymbol cd="ambiguous" id="S3.E5.m1.3.3.3.3.4.3.1.cmml" xref="S3.E5.m1.3.3.3.3.4.3">superscript</csymbol><ci id="S3.E5.m1.3.3.3.3.4.3.2a.cmml" xref="S3.E5.m1.3.3.3.3.4.3.2"><mtext id="S3.E5.m1.3.3.3.3.4.3.2.cmml" xref="S3.E5.m1.3.3.3.3.4.3.2">GRU</mtext></ci><ci id="S3.E5.m1.3.3.3.3.4.3.3.cmml" xref="S3.E5.m1.3.3.3.3.4.3.3">𝑠</ci></apply></apply><interval closure="open" id="S3.E5.m1.3.3.3.3.2.3.cmml" xref="S3.E5.m1.3.3.3.3.2.2"><apply id="S3.E5.m1.2.2.2.2.1.1.1.cmml" xref="S3.E5.m1.2.2.2.2.1.1.1"><csymbol cd="ambiguous" id="S3.E5.m1.2.2.2.2.1.1.1.1.cmml" xref="S3.E5.m1.2.2.2.2.1.1.1">subscript</csymbol><ci id="S3.E5.m1.2.2.2.2.1.1.1.2.cmml" xref="S3.E5.m1.2.2.2.2.1.1.1.2">𝑥</ci><ci id="S3.E5.m1.2.2.2.2.1.1.1.3.cmml" xref="S3.E5.m1.2.2.2.2.1.1.1.3">𝑡</ci></apply><apply id="S3.E5.m1.3.3.3.3.2.2.2.cmml" xref="S3.E5.m1.3.3.3.3.2.2.2"><csymbol cd="ambiguous" id="S3.E5.m1.3.3.3.3.2.2.2.1.cmml" xref="S3.E5.m1.3.3.3.3.2.2.2">superscript</csymbol><apply id="S3.E5.m1.3.3.3.3.2.2.2.2.cmml" xref="S3.E5.m1.3.3.3.3.2.2.2"><csymbol cd="ambiguous" id="S3.E5.m1.3.3.3.3.2.2.2.2.1.cmml" xref="S3.E5.m1.3.3.3.3.2.2.2">subscript</csymbol><ci id="S3.E5.m1.3.3.3.3.2.2.2.2.2.cmml" xref="S3.E5.m1.3.3.3.3.2.2.2.2.2">ℎ</ci><apply id="S3.E5.m1.3.3.3.3.2.2.2.2.3.cmml" xref="S3.E5.m1.3.3.3.3.2.2.2.2.3"><minus id="S3.E5.m1.3.3.3.3.2.2.2.2.3.1.cmml" xref="S3.E5.m1.3.3.3.3.2.2.2.2.3.1"></minus><ci id="S3.E5.m1.3.3.3.3.2.2.2.2.3.2.cmml" xref="S3.E5.m1.3.3.3.3.2.2.2.2.3.2">𝑡</ci><cn id="S3.E5.m1.3.3.3.3.2.2.2.2.3.3.cmml" type="integer" xref="S3.E5.m1.3.3.3.3.2.2.2.2.3.3">1</cn></apply></apply><ci 
id="S3.E5.m1.3.3.3.3.2.2.2.3.cmml" xref="S3.E5.m1.3.3.3.3.2.2.2.3">𝑠</ci></apply></interval></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.E5.m1.3c">h_{t}^{s}=(1-\alpha_{t})\odot h_{t-1}^{s}+\alpha_{t}\odot\text{GRU}^{s}(x_{t},% h_{t-1}^{s})</annotation><annotation encoding="application/x-llamapun" id="S3.E5.m1.3d">italic_h start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_s end_POSTSUPERSCRIPT = ( 1 - italic_α start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) ⊙ italic_h start_POSTSUBSCRIPT italic_t - 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_s end_POSTSUPERSCRIPT + italic_α start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ⊙ GRU start_POSTSUPERSCRIPT italic_s end_POSTSUPERSCRIPT ( italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , italic_h start_POSTSUBSCRIPT italic_t - 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_s end_POSTSUPERSCRIPT )</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(5)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S3.SS2.p2.3">This configuration allows the slave encoder to selectively integrate new input with prior information, and it helps the decoder generate accurate summaries by passing it supplementary information through the final hidden state <math alttext="h_{m}^{s}" class="ltx_Math" display="inline" id="S3.SS2.p2.3.m1.1"><semantics id="S3.SS2.p2.3.m1.1a"><msubsup id="S3.SS2.p2.3.m1.1.1" xref="S3.SS2.p2.3.m1.1.1.cmml"><mi id="S3.SS2.p2.3.m1.1.1.2.2" xref="S3.SS2.p2.3.m1.1.1.2.2.cmml">h</mi><mi id="S3.SS2.p2.3.m1.1.1.2.3" xref="S3.SS2.p2.3.m1.1.1.2.3.cmml">m</mi><mi id="S3.SS2.p2.3.m1.1.1.3" xref="S3.SS2.p2.3.m1.1.1.3.cmml">s</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.SS2.p2.3.m1.1b"><apply id="S3.SS2.p2.3.m1.1.1.cmml" 
xref="S3.SS2.p2.3.m1.1.1"><csymbol cd="ambiguous" id="S3.SS2.p2.3.m1.1.1.1.cmml" xref="S3.SS2.p2.3.m1.1.1">superscript</csymbol><apply id="S3.SS2.p2.3.m1.1.1.2.cmml" xref="S3.SS2.p2.3.m1.1.1"><csymbol cd="ambiguous" id="S3.SS2.p2.3.m1.1.1.2.1.cmml" xref="S3.SS2.p2.3.m1.1.1">subscript</csymbol><ci id="S3.SS2.p2.3.m1.1.1.2.2.cmml" xref="S3.SS2.p2.3.m1.1.1.2.2">ℎ</ci><ci id="S3.SS2.p2.3.m1.1.1.2.3.cmml" xref="S3.SS2.p2.3.m1.1.1.2.3">𝑚</ci></apply><ci id="S3.SS2.p2.3.m1.1.1.3.cmml" xref="S3.SS2.p2.3.m1.1.1.3">𝑠</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p2.3.m1.1c">h_{m}^{s}</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p2.3.m1.1d">italic_h start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_s end_POSTSUPERSCRIPT</annotation></semantics></math>.</p> </div> </section> <section class="ltx_subsection" id="S3.SS3"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">3.3 </span>Decoder</h3> <div class="ltx_para" id="S3.SS3.p1"> <p class="ltx_p" id="S3.SS3.p1.2">In patent summary generation, the slave encoder acts as a supplementary, dependent encoder that enhances the performance of the base model used in this paper. 
The decoder uses an attention mechanism that computes context vectors from the hidden states <math alttext="h_{1}^{p},h_{2}^{p},\ldots,h_{m}^{p}" class="ltx_Math" display="inline" id="S3.SS3.p1.1.m1.4"><semantics id="S3.SS3.p1.1.m1.4a"><mrow id="S3.SS3.p1.1.m1.4.4.3" xref="S3.SS3.p1.1.m1.4.4.4.cmml"><msubsup id="S3.SS3.p1.1.m1.2.2.1.1" xref="S3.SS3.p1.1.m1.2.2.1.1.cmml"><mi id="S3.SS3.p1.1.m1.2.2.1.1.2.2" xref="S3.SS3.p1.1.m1.2.2.1.1.2.2.cmml">h</mi><mn id="S3.SS3.p1.1.m1.2.2.1.1.2.3" xref="S3.SS3.p1.1.m1.2.2.1.1.2.3.cmml">1</mn><mi id="S3.SS3.p1.1.m1.2.2.1.1.3" xref="S3.SS3.p1.1.m1.2.2.1.1.3.cmml">p</mi></msubsup><mo id="S3.SS3.p1.1.m1.4.4.3.4" xref="S3.SS3.p1.1.m1.4.4.4.cmml">,</mo><msubsup id="S3.SS3.p1.1.m1.3.3.2.2" xref="S3.SS3.p1.1.m1.3.3.2.2.cmml"><mi id="S3.SS3.p1.1.m1.3.3.2.2.2.2" xref="S3.SS3.p1.1.m1.3.3.2.2.2.2.cmml">h</mi><mn id="S3.SS3.p1.1.m1.3.3.2.2.2.3" xref="S3.SS3.p1.1.m1.3.3.2.2.2.3.cmml">2</mn><mi id="S3.SS3.p1.1.m1.3.3.2.2.3" xref="S3.SS3.p1.1.m1.3.3.2.2.3.cmml">p</mi></msubsup><mo id="S3.SS3.p1.1.m1.4.4.3.5" xref="S3.SS3.p1.1.m1.4.4.4.cmml">,</mo><mi id="S3.SS3.p1.1.m1.1.1" mathvariant="normal" xref="S3.SS3.p1.1.m1.1.1.cmml">…</mi><mo id="S3.SS3.p1.1.m1.4.4.3.6" xref="S3.SS3.p1.1.m1.4.4.4.cmml">,</mo><msubsup id="S3.SS3.p1.1.m1.4.4.3.3" xref="S3.SS3.p1.1.m1.4.4.3.3.cmml"><mi id="S3.SS3.p1.1.m1.4.4.3.3.2.2" xref="S3.SS3.p1.1.m1.4.4.3.3.2.2.cmml">h</mi><mi id="S3.SS3.p1.1.m1.4.4.3.3.2.3" xref="S3.SS3.p1.1.m1.4.4.3.3.2.3.cmml">m</mi><mi id="S3.SS3.p1.1.m1.4.4.3.3.3" xref="S3.SS3.p1.1.m1.4.4.3.3.3.cmml">p</mi></msubsup></mrow><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.1.m1.4b"><list id="S3.SS3.p1.1.m1.4.4.4.cmml" xref="S3.SS3.p1.1.m1.4.4.3"><apply id="S3.SS3.p1.1.m1.2.2.1.1.cmml" xref="S3.SS3.p1.1.m1.2.2.1.1"><csymbol cd="ambiguous" id="S3.SS3.p1.1.m1.2.2.1.1.1.cmml" xref="S3.SS3.p1.1.m1.2.2.1.1">superscript</csymbol><apply id="S3.SS3.p1.1.m1.2.2.1.1.2.cmml" xref="S3.SS3.p1.1.m1.2.2.1.1"><csymbol 
cd="ambiguous" id="S3.SS3.p1.1.m1.2.2.1.1.2.1.cmml" xref="S3.SS3.p1.1.m1.2.2.1.1">subscript</csymbol><ci id="S3.SS3.p1.1.m1.2.2.1.1.2.2.cmml" xref="S3.SS3.p1.1.m1.2.2.1.1.2.2">ℎ</ci><cn id="S3.SS3.p1.1.m1.2.2.1.1.2.3.cmml" type="integer" xref="S3.SS3.p1.1.m1.2.2.1.1.2.3">1</cn></apply><ci id="S3.SS3.p1.1.m1.2.2.1.1.3.cmml" xref="S3.SS3.p1.1.m1.2.2.1.1.3">𝑝</ci></apply><apply id="S3.SS3.p1.1.m1.3.3.2.2.cmml" xref="S3.SS3.p1.1.m1.3.3.2.2"><csymbol cd="ambiguous" id="S3.SS3.p1.1.m1.3.3.2.2.1.cmml" xref="S3.SS3.p1.1.m1.3.3.2.2">superscript</csymbol><apply id="S3.SS3.p1.1.m1.3.3.2.2.2.cmml" xref="S3.SS3.p1.1.m1.3.3.2.2"><csymbol cd="ambiguous" id="S3.SS3.p1.1.m1.3.3.2.2.2.1.cmml" xref="S3.SS3.p1.1.m1.3.3.2.2">subscript</csymbol><ci id="S3.SS3.p1.1.m1.3.3.2.2.2.2.cmml" xref="S3.SS3.p1.1.m1.3.3.2.2.2.2">ℎ</ci><cn id="S3.SS3.p1.1.m1.3.3.2.2.2.3.cmml" type="integer" xref="S3.SS3.p1.1.m1.3.3.2.2.2.3">2</cn></apply><ci id="S3.SS3.p1.1.m1.3.3.2.2.3.cmml" xref="S3.SS3.p1.1.m1.3.3.2.2.3">𝑝</ci></apply><ci id="S3.SS3.p1.1.m1.1.1.cmml" xref="S3.SS3.p1.1.m1.1.1">…</ci><apply id="S3.SS3.p1.1.m1.4.4.3.3.cmml" xref="S3.SS3.p1.1.m1.4.4.3.3"><csymbol cd="ambiguous" id="S3.SS3.p1.1.m1.4.4.3.3.1.cmml" xref="S3.SS3.p1.1.m1.4.4.3.3">superscript</csymbol><apply id="S3.SS3.p1.1.m1.4.4.3.3.2.cmml" xref="S3.SS3.p1.1.m1.4.4.3.3"><csymbol cd="ambiguous" id="S3.SS3.p1.1.m1.4.4.3.3.2.1.cmml" xref="S3.SS3.p1.1.m1.4.4.3.3">subscript</csymbol><ci id="S3.SS3.p1.1.m1.4.4.3.3.2.2.cmml" xref="S3.SS3.p1.1.m1.4.4.3.3.2.2">ℎ</ci><ci id="S3.SS3.p1.1.m1.4.4.3.3.2.3.cmml" xref="S3.SS3.p1.1.m1.4.4.3.3.2.3">𝑚</ci></apply><ci id="S3.SS3.p1.1.m1.4.4.3.3.3.cmml" xref="S3.SS3.p1.1.m1.4.4.3.3.3">𝑝</ci></apply></list></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.1.m1.4c">h_{1}^{p},h_{2}^{p},\ldots,h_{m}^{p}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.1.m1.4d">italic_h start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT , 
italic_h start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT , … , italic_h start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT</annotation></semantics></math> from the master encoder. The context vector <math alttext="c_{i}" class="ltx_Math" display="inline" id="S3.SS3.p1.2.m2.1"><semantics id="S3.SS3.p1.2.m2.1a"><msub id="S3.SS3.p1.2.m2.1.1" xref="S3.SS3.p1.2.m2.1.1.cmml"><mi id="S3.SS3.p1.2.m2.1.1.2" xref="S3.SS3.p1.2.m2.1.1.2.cmml">c</mi><mi id="S3.SS3.p1.2.m2.1.1.3" xref="S3.SS3.p1.2.m2.1.1.3.cmml">i</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.2.m2.1b"><apply id="S3.SS3.p1.2.m2.1.1.cmml" xref="S3.SS3.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S3.SS3.p1.2.m2.1.1.1.cmml" xref="S3.SS3.p1.2.m2.1.1">subscript</csymbol><ci id="S3.SS3.p1.2.m2.1.1.2.cmml" xref="S3.SS3.p1.2.m2.1.1.2">𝑐</ci><ci id="S3.SS3.p1.2.m2.1.1.3.cmml" xref="S3.SS3.p1.2.m2.1.1.3">𝑖</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.2.m2.1c">c_{i}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.2.m2.1d">italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT</annotation></semantics></math> is calculated as the weighted sum of these hidden states, inspired by the Transformer, as shown in the following formula:</p> <table class="ltx_equation ltx_eqn_table" id="S3.E6"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="c_{i}=\sum_{j=1}^{n}a_{ij}h_{j}^{p}" class="ltx_Math" display="block" id="S3.E6.m1.1"><semantics id="S3.E6.m1.1a"><mrow id="S3.E6.m1.1.1" xref="S3.E6.m1.1.1.cmml"><msub id="S3.E6.m1.1.1.2" xref="S3.E6.m1.1.1.2.cmml"><mi id="S3.E6.m1.1.1.2.2" xref="S3.E6.m1.1.1.2.2.cmml">c</mi><mi id="S3.E6.m1.1.1.2.3" xref="S3.E6.m1.1.1.2.3.cmml">i</mi></msub><mo id="S3.E6.m1.1.1.1" rspace="0.111em" 
xref="S3.E6.m1.1.1.1.cmml">=</mo><mrow id="S3.E6.m1.1.1.3" xref="S3.E6.m1.1.1.3.cmml"><munderover id="S3.E6.m1.1.1.3.1" xref="S3.E6.m1.1.1.3.1.cmml"><mo id="S3.E6.m1.1.1.3.1.2.2" movablelimits="false" xref="S3.E6.m1.1.1.3.1.2.2.cmml">∑</mo><mrow id="S3.E6.m1.1.1.3.1.2.3" xref="S3.E6.m1.1.1.3.1.2.3.cmml"><mi id="S3.E6.m1.1.1.3.1.2.3.2" xref="S3.E6.m1.1.1.3.1.2.3.2.cmml">j</mi><mo id="S3.E6.m1.1.1.3.1.2.3.1" xref="S3.E6.m1.1.1.3.1.2.3.1.cmml">=</mo><mn id="S3.E6.m1.1.1.3.1.2.3.3" xref="S3.E6.m1.1.1.3.1.2.3.3.cmml">1</mn></mrow><mi id="S3.E6.m1.1.1.3.1.3" xref="S3.E6.m1.1.1.3.1.3.cmml">n</mi></munderover><mrow id="S3.E6.m1.1.1.3.2" xref="S3.E6.m1.1.1.3.2.cmml"><msub id="S3.E6.m1.1.1.3.2.2" xref="S3.E6.m1.1.1.3.2.2.cmml"><mi id="S3.E6.m1.1.1.3.2.2.2" xref="S3.E6.m1.1.1.3.2.2.2.cmml">a</mi><mrow id="S3.E6.m1.1.1.3.2.2.3" xref="S3.E6.m1.1.1.3.2.2.3.cmml"><mi id="S3.E6.m1.1.1.3.2.2.3.2" xref="S3.E6.m1.1.1.3.2.2.3.2.cmml">i</mi><mo id="S3.E6.m1.1.1.3.2.2.3.1" xref="S3.E6.m1.1.1.3.2.2.3.1.cmml">⁢</mo><mi id="S3.E6.m1.1.1.3.2.2.3.3" xref="S3.E6.m1.1.1.3.2.2.3.3.cmml">j</mi></mrow></msub><mo id="S3.E6.m1.1.1.3.2.1" xref="S3.E6.m1.1.1.3.2.1.cmml">⁢</mo><msubsup id="S3.E6.m1.1.1.3.2.3" xref="S3.E6.m1.1.1.3.2.3.cmml"><mi id="S3.E6.m1.1.1.3.2.3.2.2" xref="S3.E6.m1.1.1.3.2.3.2.2.cmml">h</mi><mi id="S3.E6.m1.1.1.3.2.3.2.3" xref="S3.E6.m1.1.1.3.2.3.2.3.cmml">j</mi><mi id="S3.E6.m1.1.1.3.2.3.3" xref="S3.E6.m1.1.1.3.2.3.3.cmml">p</mi></msubsup></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.E6.m1.1b"><apply id="S3.E6.m1.1.1.cmml" xref="S3.E6.m1.1.1"><eq id="S3.E6.m1.1.1.1.cmml" xref="S3.E6.m1.1.1.1"></eq><apply id="S3.E6.m1.1.1.2.cmml" xref="S3.E6.m1.1.1.2"><csymbol cd="ambiguous" id="S3.E6.m1.1.1.2.1.cmml" xref="S3.E6.m1.1.1.2">subscript</csymbol><ci id="S3.E6.m1.1.1.2.2.cmml" xref="S3.E6.m1.1.1.2.2">𝑐</ci><ci id="S3.E6.m1.1.1.2.3.cmml" xref="S3.E6.m1.1.1.2.3">𝑖</ci></apply><apply id="S3.E6.m1.1.1.3.cmml" xref="S3.E6.m1.1.1.3"><apply id="S3.E6.m1.1.1.3.1.cmml" 
xref="S3.E6.m1.1.1.3.1"><csymbol cd="ambiguous" id="S3.E6.m1.1.1.3.1.1.cmml" xref="S3.E6.m1.1.1.3.1">superscript</csymbol><apply id="S3.E6.m1.1.1.3.1.2.cmml" xref="S3.E6.m1.1.1.3.1"><csymbol cd="ambiguous" id="S3.E6.m1.1.1.3.1.2.1.cmml" xref="S3.E6.m1.1.1.3.1">subscript</csymbol><sum id="S3.E6.m1.1.1.3.1.2.2.cmml" xref="S3.E6.m1.1.1.3.1.2.2"></sum><apply id="S3.E6.m1.1.1.3.1.2.3.cmml" xref="S3.E6.m1.1.1.3.1.2.3"><eq id="S3.E6.m1.1.1.3.1.2.3.1.cmml" xref="S3.E6.m1.1.1.3.1.2.3.1"></eq><ci id="S3.E6.m1.1.1.3.1.2.3.2.cmml" xref="S3.E6.m1.1.1.3.1.2.3.2">𝑗</ci><cn id="S3.E6.m1.1.1.3.1.2.3.3.cmml" type="integer" xref="S3.E6.m1.1.1.3.1.2.3.3">1</cn></apply></apply><ci id="S3.E6.m1.1.1.3.1.3.cmml" xref="S3.E6.m1.1.1.3.1.3">𝑛</ci></apply><apply id="S3.E6.m1.1.1.3.2.cmml" xref="S3.E6.m1.1.1.3.2"><times id="S3.E6.m1.1.1.3.2.1.cmml" xref="S3.E6.m1.1.1.3.2.1"></times><apply id="S3.E6.m1.1.1.3.2.2.cmml" xref="S3.E6.m1.1.1.3.2.2"><csymbol cd="ambiguous" id="S3.E6.m1.1.1.3.2.2.1.cmml" xref="S3.E6.m1.1.1.3.2.2">subscript</csymbol><ci id="S3.E6.m1.1.1.3.2.2.2.cmml" xref="S3.E6.m1.1.1.3.2.2.2">𝑎</ci><apply id="S3.E6.m1.1.1.3.2.2.3.cmml" xref="S3.E6.m1.1.1.3.2.2.3"><times id="S3.E6.m1.1.1.3.2.2.3.1.cmml" xref="S3.E6.m1.1.1.3.2.2.3.1"></times><ci id="S3.E6.m1.1.1.3.2.2.3.2.cmml" xref="S3.E6.m1.1.1.3.2.2.3.2">𝑖</ci><ci id="S3.E6.m1.1.1.3.2.2.3.3.cmml" xref="S3.E6.m1.1.1.3.2.2.3.3">𝑗</ci></apply></apply><apply id="S3.E6.m1.1.1.3.2.3.cmml" xref="S3.E6.m1.1.1.3.2.3"><csymbol cd="ambiguous" id="S3.E6.m1.1.1.3.2.3.1.cmml" xref="S3.E6.m1.1.1.3.2.3">superscript</csymbol><apply id="S3.E6.m1.1.1.3.2.3.2.cmml" xref="S3.E6.m1.1.1.3.2.3"><csymbol cd="ambiguous" id="S3.E6.m1.1.1.3.2.3.2.1.cmml" xref="S3.E6.m1.1.1.3.2.3">subscript</csymbol><ci id="S3.E6.m1.1.1.3.2.3.2.2.cmml" xref="S3.E6.m1.1.1.3.2.3.2.2">ℎ</ci><ci id="S3.E6.m1.1.1.3.2.3.2.3.cmml" xref="S3.E6.m1.1.1.3.2.3.2.3">𝑗</ci></apply><ci id="S3.E6.m1.1.1.3.2.3.3.cmml" 
xref="S3.E6.m1.1.1.3.2.3.3">𝑝</ci></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.E6.m1.1c">c_{i}=\sum_{j=1}^{n}a_{ij}h_{j}^{p}</annotation><annotation encoding="application/x-llamapun" id="S3.E6.m1.1d">italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT italic_h start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(6)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S3.SS3.p1.4">where each hidden state <math alttext="h_{j}^{p}" class="ltx_Math" display="inline" id="S3.SS3.p1.3.m1.1"><semantics id="S3.SS3.p1.3.m1.1a"><msubsup id="S3.SS3.p1.3.m1.1.1" xref="S3.SS3.p1.3.m1.1.1.cmml"><mi id="S3.SS3.p1.3.m1.1.1.2.2" xref="S3.SS3.p1.3.m1.1.1.2.2.cmml">h</mi><mi id="S3.SS3.p1.3.m1.1.1.2.3" xref="S3.SS3.p1.3.m1.1.1.2.3.cmml">j</mi><mi id="S3.SS3.p1.3.m1.1.1.3" xref="S3.SS3.p1.3.m1.1.1.3.cmml">p</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.3.m1.1b"><apply id="S3.SS3.p1.3.m1.1.1.cmml" xref="S3.SS3.p1.3.m1.1.1"><csymbol cd="ambiguous" id="S3.SS3.p1.3.m1.1.1.1.cmml" xref="S3.SS3.p1.3.m1.1.1">superscript</csymbol><apply id="S3.SS3.p1.3.m1.1.1.2.cmml" xref="S3.SS3.p1.3.m1.1.1"><csymbol cd="ambiguous" id="S3.SS3.p1.3.m1.1.1.2.1.cmml" xref="S3.SS3.p1.3.m1.1.1">subscript</csymbol><ci id="S3.SS3.p1.3.m1.1.1.2.2.cmml" xref="S3.SS3.p1.3.m1.1.1.2.2">ℎ</ci><ci id="S3.SS3.p1.3.m1.1.1.2.3.cmml" xref="S3.SS3.p1.3.m1.1.1.2.3">𝑗</ci></apply><ci id="S3.SS3.p1.3.m1.1.1.3.cmml" xref="S3.SS3.p1.3.m1.1.1.3">𝑝</ci></apply></annotation-xml><annotation 
encoding="application/x-tex" id="S3.SS3.p1.3.m1.1c">h_{j}^{p}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.3.m1.1d">italic_h start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT</annotation></semantics></math> is assigned a weight <math alttext="a_{ij}" class="ltx_Math" display="inline" id="S3.SS3.p1.4.m2.1"><semantics id="S3.SS3.p1.4.m2.1a"><msub id="S3.SS3.p1.4.m2.1.1" xref="S3.SS3.p1.4.m2.1.1.cmml"><mi id="S3.SS3.p1.4.m2.1.1.2" xref="S3.SS3.p1.4.m2.1.1.2.cmml">a</mi><mrow id="S3.SS3.p1.4.m2.1.1.3" xref="S3.SS3.p1.4.m2.1.1.3.cmml"><mi id="S3.SS3.p1.4.m2.1.1.3.2" xref="S3.SS3.p1.4.m2.1.1.3.2.cmml">i</mi><mo id="S3.SS3.p1.4.m2.1.1.3.1" xref="S3.SS3.p1.4.m2.1.1.3.1.cmml">⁢</mo><mi id="S3.SS3.p1.4.m2.1.1.3.3" xref="S3.SS3.p1.4.m2.1.1.3.3.cmml">j</mi></mrow></msub><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.4.m2.1b"><apply id="S3.SS3.p1.4.m2.1.1.cmml" xref="S3.SS3.p1.4.m2.1.1"><csymbol cd="ambiguous" id="S3.SS3.p1.4.m2.1.1.1.cmml" xref="S3.SS3.p1.4.m2.1.1">subscript</csymbol><ci id="S3.SS3.p1.4.m2.1.1.2.cmml" xref="S3.SS3.p1.4.m2.1.1.2">𝑎</ci><apply id="S3.SS3.p1.4.m2.1.1.3.cmml" xref="S3.SS3.p1.4.m2.1.1.3"><times id="S3.SS3.p1.4.m2.1.1.3.1.cmml" xref="S3.SS3.p1.4.m2.1.1.3.1"></times><ci id="S3.SS3.p1.4.m2.1.1.3.2.cmml" xref="S3.SS3.p1.4.m2.1.1.3.2">𝑖</ci><ci id="S3.SS3.p1.4.m2.1.1.3.3.cmml" xref="S3.SS3.p1.4.m2.1.1.3.3">𝑗</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.4.m2.1c">a_{ij}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.4.m2.1d">italic_a start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT</annotation></semantics></math>, computed as:</p> <table class="ltx_equationgroup ltx_eqn_align ltx_eqn_table" id="S5.EGx1"> <tbody id="S3.E7"><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_align_right ltx_eqn_cell"><math 
alttext="\displaystyle a_{ij}" class="ltx_Math" display="inline" id="S3.E7.m1.1"><semantics id="S3.E7.m1.1a"><msub id="S3.E7.m1.1.1" xref="S3.E7.m1.1.1.cmml"><mi id="S3.E7.m1.1.1.2" xref="S3.E7.m1.1.1.2.cmml">a</mi><mrow id="S3.E7.m1.1.1.3" xref="S3.E7.m1.1.1.3.cmml"><mi id="S3.E7.m1.1.1.3.2" xref="S3.E7.m1.1.1.3.2.cmml">i</mi><mo id="S3.E7.m1.1.1.3.1" xref="S3.E7.m1.1.1.3.1.cmml">⁢</mo><mi id="S3.E7.m1.1.1.3.3" xref="S3.E7.m1.1.1.3.3.cmml">j</mi></mrow></msub><annotation-xml encoding="MathML-Content" id="S3.E7.m1.1b"><apply id="S3.E7.m1.1.1.cmml" xref="S3.E7.m1.1.1"><csymbol cd="ambiguous" id="S3.E7.m1.1.1.1.cmml" xref="S3.E7.m1.1.1">subscript</csymbol><ci id="S3.E7.m1.1.1.2.cmml" xref="S3.E7.m1.1.1.2">𝑎</ci><apply id="S3.E7.m1.1.1.3.cmml" xref="S3.E7.m1.1.1.3"><times id="S3.E7.m1.1.1.3.1.cmml" xref="S3.E7.m1.1.1.3.1"></times><ci id="S3.E7.m1.1.1.3.2.cmml" xref="S3.E7.m1.1.1.3.2">𝑖</ci><ci id="S3.E7.m1.1.1.3.3.cmml" xref="S3.E7.m1.1.1.3.3">𝑗</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.E7.m1.1c">\displaystyle a_{ij}</annotation><annotation encoding="application/x-llamapun" id="S3.E7.m1.1d">italic_a start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_left ltx_eqn_cell"><math alttext="\displaystyle=\frac{\exp(e_{ij})}{\sum_{k=1}^{n}\exp(e_{ik})}" class="ltx_Math" display="inline" id="S3.E7.m2.4"><semantics id="S3.E7.m2.4a"><mrow id="S3.E7.m2.4.5" xref="S3.E7.m2.4.5.cmml"><mi id="S3.E7.m2.4.5.2" xref="S3.E7.m2.4.5.2.cmml"></mi><mo id="S3.E7.m2.4.5.1" xref="S3.E7.m2.4.5.1.cmml">=</mo><mstyle displaystyle="true" id="S3.E7.m2.4.4" xref="S3.E7.m2.4.4.cmml"><mfrac id="S3.E7.m2.4.4a" xref="S3.E7.m2.4.4.cmml"><mrow id="S3.E7.m2.2.2.2.2" xref="S3.E7.m2.2.2.2.3.cmml"><mi id="S3.E7.m2.1.1.1.1" xref="S3.E7.m2.1.1.1.1.cmml">exp</mi><mo id="S3.E7.m2.2.2.2.2a" xref="S3.E7.m2.2.2.2.3.cmml">⁡</mo><mrow id="S3.E7.m2.2.2.2.2.1" xref="S3.E7.m2.2.2.2.3.cmml"><mo 
id="S3.E7.m2.2.2.2.2.1.2" stretchy="false" xref="S3.E7.m2.2.2.2.3.cmml">(</mo><msub id="S3.E7.m2.2.2.2.2.1.1" xref="S3.E7.m2.2.2.2.2.1.1.cmml"><mi id="S3.E7.m2.2.2.2.2.1.1.2" xref="S3.E7.m2.2.2.2.2.1.1.2.cmml">e</mi><mrow id="S3.E7.m2.2.2.2.2.1.1.3" xref="S3.E7.m2.2.2.2.2.1.1.3.cmml"><mi id="S3.E7.m2.2.2.2.2.1.1.3.2" xref="S3.E7.m2.2.2.2.2.1.1.3.2.cmml">i</mi><mo id="S3.E7.m2.2.2.2.2.1.1.3.1" xref="S3.E7.m2.2.2.2.2.1.1.3.1.cmml">⁢</mo><mi id="S3.E7.m2.2.2.2.2.1.1.3.3" xref="S3.E7.m2.2.2.2.2.1.1.3.3.cmml">j</mi></mrow></msub><mo id="S3.E7.m2.2.2.2.2.1.3" stretchy="false" xref="S3.E7.m2.2.2.2.3.cmml">)</mo></mrow></mrow><mrow id="S3.E7.m2.4.4.4" xref="S3.E7.m2.4.4.4.cmml"><msubsup id="S3.E7.m2.4.4.4.3" xref="S3.E7.m2.4.4.4.3.cmml"><mo id="S3.E7.m2.4.4.4.3.2.2" xref="S3.E7.m2.4.4.4.3.2.2.cmml">∑</mo><mrow id="S3.E7.m2.4.4.4.3.2.3" xref="S3.E7.m2.4.4.4.3.2.3.cmml"><mi id="S3.E7.m2.4.4.4.3.2.3.2" xref="S3.E7.m2.4.4.4.3.2.3.2.cmml">k</mi><mo id="S3.E7.m2.4.4.4.3.2.3.1" xref="S3.E7.m2.4.4.4.3.2.3.1.cmml">=</mo><mn id="S3.E7.m2.4.4.4.3.2.3.3" xref="S3.E7.m2.4.4.4.3.2.3.3.cmml">1</mn></mrow><mi id="S3.E7.m2.4.4.4.3.3" xref="S3.E7.m2.4.4.4.3.3.cmml">n</mi></msubsup><mrow id="S3.E7.m2.4.4.4.2.1" xref="S3.E7.m2.4.4.4.2.2.cmml"><mi id="S3.E7.m2.3.3.3.1" xref="S3.E7.m2.3.3.3.1.cmml">exp</mi><mo id="S3.E7.m2.4.4.4.2.1a" xref="S3.E7.m2.4.4.4.2.2.cmml">⁡</mo><mrow id="S3.E7.m2.4.4.4.2.1.1" xref="S3.E7.m2.4.4.4.2.2.cmml"><mo id="S3.E7.m2.4.4.4.2.1.1.2" stretchy="false" xref="S3.E7.m2.4.4.4.2.2.cmml">(</mo><msub id="S3.E7.m2.4.4.4.2.1.1.1" xref="S3.E7.m2.4.4.4.2.1.1.1.cmml"><mi id="S3.E7.m2.4.4.4.2.1.1.1.2" xref="S3.E7.m2.4.4.4.2.1.1.1.2.cmml">e</mi><mrow id="S3.E7.m2.4.4.4.2.1.1.1.3" xref="S3.E7.m2.4.4.4.2.1.1.1.3.cmml"><mi id="S3.E7.m2.4.4.4.2.1.1.1.3.2" xref="S3.E7.m2.4.4.4.2.1.1.1.3.2.cmml">i</mi><mo id="S3.E7.m2.4.4.4.2.1.1.1.3.1" xref="S3.E7.m2.4.4.4.2.1.1.1.3.1.cmml">⁢</mo><mi id="S3.E7.m2.4.4.4.2.1.1.1.3.3" xref="S3.E7.m2.4.4.4.2.1.1.1.3.3.cmml">k</mi></mrow></msub><mo 
id="S3.E7.m2.4.4.4.2.1.1.3" stretchy="false" xref="S3.E7.m2.4.4.4.2.2.cmml">)</mo></mrow></mrow></mrow></mfrac></mstyle></mrow><annotation-xml encoding="MathML-Content" id="S3.E7.m2.4b"><apply id="S3.E7.m2.4.5.cmml" xref="S3.E7.m2.4.5"><eq id="S3.E7.m2.4.5.1.cmml" xref="S3.E7.m2.4.5.1"></eq><csymbol cd="latexml" id="S3.E7.m2.4.5.2.cmml" xref="S3.E7.m2.4.5.2">absent</csymbol><apply id="S3.E7.m2.4.4.cmml" xref="S3.E7.m2.4.4"><divide id="S3.E7.m2.4.4.5.cmml" xref="S3.E7.m2.4.4"></divide><apply id="S3.E7.m2.2.2.2.3.cmml" xref="S3.E7.m2.2.2.2.2"><exp id="S3.E7.m2.1.1.1.1.cmml" xref="S3.E7.m2.1.1.1.1"></exp><apply id="S3.E7.m2.2.2.2.2.1.1.cmml" xref="S3.E7.m2.2.2.2.2.1.1"><csymbol cd="ambiguous" id="S3.E7.m2.2.2.2.2.1.1.1.cmml" xref="S3.E7.m2.2.2.2.2.1.1">subscript</csymbol><ci id="S3.E7.m2.2.2.2.2.1.1.2.cmml" xref="S3.E7.m2.2.2.2.2.1.1.2">𝑒</ci><apply id="S3.E7.m2.2.2.2.2.1.1.3.cmml" xref="S3.E7.m2.2.2.2.2.1.1.3"><times id="S3.E7.m2.2.2.2.2.1.1.3.1.cmml" xref="S3.E7.m2.2.2.2.2.1.1.3.1"></times><ci id="S3.E7.m2.2.2.2.2.1.1.3.2.cmml" xref="S3.E7.m2.2.2.2.2.1.1.3.2">𝑖</ci><ci id="S3.E7.m2.2.2.2.2.1.1.3.3.cmml" xref="S3.E7.m2.2.2.2.2.1.1.3.3">𝑗</ci></apply></apply></apply><apply id="S3.E7.m2.4.4.4.cmml" xref="S3.E7.m2.4.4.4"><apply id="S3.E7.m2.4.4.4.3.cmml" xref="S3.E7.m2.4.4.4.3"><csymbol cd="ambiguous" id="S3.E7.m2.4.4.4.3.1.cmml" xref="S3.E7.m2.4.4.4.3">superscript</csymbol><apply id="S3.E7.m2.4.4.4.3.2.cmml" xref="S3.E7.m2.4.4.4.3"><csymbol cd="ambiguous" id="S3.E7.m2.4.4.4.3.2.1.cmml" xref="S3.E7.m2.4.4.4.3">subscript</csymbol><sum id="S3.E7.m2.4.4.4.3.2.2.cmml" xref="S3.E7.m2.4.4.4.3.2.2"></sum><apply id="S3.E7.m2.4.4.4.3.2.3.cmml" xref="S3.E7.m2.4.4.4.3.2.3"><eq id="S3.E7.m2.4.4.4.3.2.3.1.cmml" xref="S3.E7.m2.4.4.4.3.2.3.1"></eq><ci id="S3.E7.m2.4.4.4.3.2.3.2.cmml" xref="S3.E7.m2.4.4.4.3.2.3.2">𝑘</ci><cn id="S3.E7.m2.4.4.4.3.2.3.3.cmml" type="integer" xref="S3.E7.m2.4.4.4.3.2.3.3">1</cn></apply></apply><ci id="S3.E7.m2.4.4.4.3.3.cmml" 
xref="S3.E7.m2.4.4.4.3.3">𝑛</ci></apply><apply id="S3.E7.m2.4.4.4.2.2.cmml" xref="S3.E7.m2.4.4.4.2.1"><exp id="S3.E7.m2.3.3.3.1.cmml" xref="S3.E7.m2.3.3.3.1"></exp><apply id="S3.E7.m2.4.4.4.2.1.1.1.cmml" xref="S3.E7.m2.4.4.4.2.1.1.1"><csymbol cd="ambiguous" id="S3.E7.m2.4.4.4.2.1.1.1.1.cmml" xref="S3.E7.m2.4.4.4.2.1.1.1">subscript</csymbol><ci id="S3.E7.m2.4.4.4.2.1.1.1.2.cmml" xref="S3.E7.m2.4.4.4.2.1.1.1.2">𝑒</ci><apply id="S3.E7.m2.4.4.4.2.1.1.1.3.cmml" xref="S3.E7.m2.4.4.4.2.1.1.1.3"><times id="S3.E7.m2.4.4.4.2.1.1.1.3.1.cmml" xref="S3.E7.m2.4.4.4.2.1.1.1.3.1"></times><ci id="S3.E7.m2.4.4.4.2.1.1.1.3.2.cmml" xref="S3.E7.m2.4.4.4.2.1.1.1.3.2">𝑖</ci><ci id="S3.E7.m2.4.4.4.2.1.1.1.3.3.cmml" xref="S3.E7.m2.4.4.4.2.1.1.1.3.3">𝑘</ci></apply></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.E7.m2.4c">\displaystyle=\frac{\exp(e_{ij})}{\sum_{k=1}^{n}\exp(e_{ik})}</annotation><annotation encoding="application/x-llamapun" id="S3.E7.m2.4d">= divide start_ARG roman_exp ( italic_e start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ) end_ARG start_ARG ∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT roman_exp ( italic_e start_POSTSUBSCRIPT italic_i italic_k end_POSTSUBSCRIPT ) end_ARG</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(7)</span></td> </tr></tbody> <tbody id="S3.E8"><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_align_right ltx_eqn_cell"><math alttext="\displaystyle e_{ij}" class="ltx_Math" display="inline" id="S3.E8.m1.1"><semantics id="S3.E8.m1.1a"><msub id="S3.E8.m1.1.1" xref="S3.E8.m1.1.1.cmml"><mi id="S3.E8.m1.1.1.2" xref="S3.E8.m1.1.1.2.cmml">e</mi><mrow id="S3.E8.m1.1.1.3" 
xref="S3.E8.m1.1.1.3.cmml"><mi id="S3.E8.m1.1.1.3.2" xref="S3.E8.m1.1.1.3.2.cmml">i</mi><mo id="S3.E8.m1.1.1.3.1" xref="S3.E8.m1.1.1.3.1.cmml">⁢</mo><mi id="S3.E8.m1.1.1.3.3" xref="S3.E8.m1.1.1.3.3.cmml">j</mi></mrow></msub><annotation-xml encoding="MathML-Content" id="S3.E8.m1.1b"><apply id="S3.E8.m1.1.1.cmml" xref="S3.E8.m1.1.1"><csymbol cd="ambiguous" id="S3.E8.m1.1.1.1.cmml" xref="S3.E8.m1.1.1">subscript</csymbol><ci id="S3.E8.m1.1.1.2.cmml" xref="S3.E8.m1.1.1.2">𝑒</ci><apply id="S3.E8.m1.1.1.3.cmml" xref="S3.E8.m1.1.1.3"><times id="S3.E8.m1.1.1.3.1.cmml" xref="S3.E8.m1.1.1.3.1"></times><ci id="S3.E8.m1.1.1.3.2.cmml" xref="S3.E8.m1.1.1.3.2">𝑖</ci><ci id="S3.E8.m1.1.1.3.3.cmml" xref="S3.E8.m1.1.1.3.3">𝑗</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.E8.m1.1c">\displaystyle e_{ij}</annotation><annotation encoding="application/x-llamapun" id="S3.E8.m1.1d">italic_e start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_left ltx_eqn_cell"><math alttext="\displaystyle=v_{a}^{T}\tanh(W_{a}h_{i-1}^{d}+U_{a}h_{j}^{p})" class="ltx_Math" display="inline" id="S3.E8.m2.2"><semantics id="S3.E8.m2.2a"><mrow id="S3.E8.m2.2.2" xref="S3.E8.m2.2.2.cmml"><mi id="S3.E8.m2.2.2.3" xref="S3.E8.m2.2.2.3.cmml"></mi><mo id="S3.E8.m2.2.2.2" xref="S3.E8.m2.2.2.2.cmml">=</mo><mrow id="S3.E8.m2.2.2.1" xref="S3.E8.m2.2.2.1.cmml"><msubsup id="S3.E8.m2.2.2.1.3" xref="S3.E8.m2.2.2.1.3.cmml"><mi id="S3.E8.m2.2.2.1.3.2.2" xref="S3.E8.m2.2.2.1.3.2.2.cmml">v</mi><mi id="S3.E8.m2.2.2.1.3.2.3" xref="S3.E8.m2.2.2.1.3.2.3.cmml">a</mi><mi id="S3.E8.m2.2.2.1.3.3" xref="S3.E8.m2.2.2.1.3.3.cmml">T</mi></msubsup><mo id="S3.E8.m2.2.2.1.2" lspace="0.167em" xref="S3.E8.m2.2.2.1.2.cmml">⁢</mo><mrow id="S3.E8.m2.2.2.1.1.1" xref="S3.E8.m2.2.2.1.1.2.cmml"><mi id="S3.E8.m2.1.1" xref="S3.E8.m2.1.1.cmml">tanh</mi><mo id="S3.E8.m2.2.2.1.1.1a" xref="S3.E8.m2.2.2.1.1.2.cmml">⁡</mo><mrow id="S3.E8.m2.2.2.1.1.1.1" 
xref="S3.E8.m2.2.2.1.1.2.cmml"><mo id="S3.E8.m2.2.2.1.1.1.1.2" stretchy="false" xref="S3.E8.m2.2.2.1.1.2.cmml">(</mo><mrow id="S3.E8.m2.2.2.1.1.1.1.1" xref="S3.E8.m2.2.2.1.1.1.1.1.cmml"><mrow id="S3.E8.m2.2.2.1.1.1.1.1.2" xref="S3.E8.m2.2.2.1.1.1.1.1.2.cmml"><msub id="S3.E8.m2.2.2.1.1.1.1.1.2.2" xref="S3.E8.m2.2.2.1.1.1.1.1.2.2.cmml"><mi id="S3.E8.m2.2.2.1.1.1.1.1.2.2.2" xref="S3.E8.m2.2.2.1.1.1.1.1.2.2.2.cmml">W</mi><mi id="S3.E8.m2.2.2.1.1.1.1.1.2.2.3" xref="S3.E8.m2.2.2.1.1.1.1.1.2.2.3.cmml">a</mi></msub><mo id="S3.E8.m2.2.2.1.1.1.1.1.2.1" xref="S3.E8.m2.2.2.1.1.1.1.1.2.1.cmml">⁢</mo><msubsup id="S3.E8.m2.2.2.1.1.1.1.1.2.3" xref="S3.E8.m2.2.2.1.1.1.1.1.2.3.cmml"><mi id="S3.E8.m2.2.2.1.1.1.1.1.2.3.2.2" xref="S3.E8.m2.2.2.1.1.1.1.1.2.3.2.2.cmml">h</mi><mrow id="S3.E8.m2.2.2.1.1.1.1.1.2.3.2.3" xref="S3.E8.m2.2.2.1.1.1.1.1.2.3.2.3.cmml"><mi id="S3.E8.m2.2.2.1.1.1.1.1.2.3.2.3.2" xref="S3.E8.m2.2.2.1.1.1.1.1.2.3.2.3.2.cmml">i</mi><mo id="S3.E8.m2.2.2.1.1.1.1.1.2.3.2.3.1" xref="S3.E8.m2.2.2.1.1.1.1.1.2.3.2.3.1.cmml">−</mo><mn id="S3.E8.m2.2.2.1.1.1.1.1.2.3.2.3.3" xref="S3.E8.m2.2.2.1.1.1.1.1.2.3.2.3.3.cmml">1</mn></mrow><mi id="S3.E8.m2.2.2.1.1.1.1.1.2.3.3" xref="S3.E8.m2.2.2.1.1.1.1.1.2.3.3.cmml">d</mi></msubsup></mrow><mo id="S3.E8.m2.2.2.1.1.1.1.1.1" xref="S3.E8.m2.2.2.1.1.1.1.1.1.cmml">+</mo><mrow id="S3.E8.m2.2.2.1.1.1.1.1.3" xref="S3.E8.m2.2.2.1.1.1.1.1.3.cmml"><msub id="S3.E8.m2.2.2.1.1.1.1.1.3.2" xref="S3.E8.m2.2.2.1.1.1.1.1.3.2.cmml"><mi id="S3.E8.m2.2.2.1.1.1.1.1.3.2.2" xref="S3.E8.m2.2.2.1.1.1.1.1.3.2.2.cmml">U</mi><mi id="S3.E8.m2.2.2.1.1.1.1.1.3.2.3" xref="S3.E8.m2.2.2.1.1.1.1.1.3.2.3.cmml">a</mi></msub><mo id="S3.E8.m2.2.2.1.1.1.1.1.3.1" xref="S3.E8.m2.2.2.1.1.1.1.1.3.1.cmml">⁢</mo><msubsup id="S3.E8.m2.2.2.1.1.1.1.1.3.3" xref="S3.E8.m2.2.2.1.1.1.1.1.3.3.cmml"><mi id="S3.E8.m2.2.2.1.1.1.1.1.3.3.2.2" xref="S3.E8.m2.2.2.1.1.1.1.1.3.3.2.2.cmml">h</mi><mi id="S3.E8.m2.2.2.1.1.1.1.1.3.3.2.3" xref="S3.E8.m2.2.2.1.1.1.1.1.3.3.2.3.cmml">j</mi><mi 
id="S3.E8.m2.2.2.1.1.1.1.1.3.3.3" xref="S3.E8.m2.2.2.1.1.1.1.1.3.3.3.cmml">p</mi></msubsup></mrow></mrow><mo id="S3.E8.m2.2.2.1.1.1.1.3" stretchy="false" xref="S3.E8.m2.2.2.1.1.2.cmml">)</mo></mrow></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.E8.m2.2b"><apply id="S3.E8.m2.2.2.cmml" xref="S3.E8.m2.2.2"><eq id="S3.E8.m2.2.2.2.cmml" xref="S3.E8.m2.2.2.2"></eq><csymbol cd="latexml" id="S3.E8.m2.2.2.3.cmml" xref="S3.E8.m2.2.2.3">absent</csymbol><apply id="S3.E8.m2.2.2.1.cmml" xref="S3.E8.m2.2.2.1"><times id="S3.E8.m2.2.2.1.2.cmml" xref="S3.E8.m2.2.2.1.2"></times><apply id="S3.E8.m2.2.2.1.3.cmml" xref="S3.E8.m2.2.2.1.3"><csymbol cd="ambiguous" id="S3.E8.m2.2.2.1.3.1.cmml" xref="S3.E8.m2.2.2.1.3">superscript</csymbol><apply id="S3.E8.m2.2.2.1.3.2.cmml" xref="S3.E8.m2.2.2.1.3"><csymbol cd="ambiguous" id="S3.E8.m2.2.2.1.3.2.1.cmml" xref="S3.E8.m2.2.2.1.3">subscript</csymbol><ci id="S3.E8.m2.2.2.1.3.2.2.cmml" xref="S3.E8.m2.2.2.1.3.2.2">𝑣</ci><ci id="S3.E8.m2.2.2.1.3.2.3.cmml" xref="S3.E8.m2.2.2.1.3.2.3">𝑎</ci></apply><ci id="S3.E8.m2.2.2.1.3.3.cmml" xref="S3.E8.m2.2.2.1.3.3">𝑇</ci></apply><apply id="S3.E8.m2.2.2.1.1.2.cmml" xref="S3.E8.m2.2.2.1.1.1"><tanh id="S3.E8.m2.1.1.cmml" xref="S3.E8.m2.1.1"></tanh><apply id="S3.E8.m2.2.2.1.1.1.1.1.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1"><plus id="S3.E8.m2.2.2.1.1.1.1.1.1.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.1"></plus><apply id="S3.E8.m2.2.2.1.1.1.1.1.2.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.2"><times id="S3.E8.m2.2.2.1.1.1.1.1.2.1.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.2.1"></times><apply id="S3.E8.m2.2.2.1.1.1.1.1.2.2.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.2.2"><csymbol cd="ambiguous" id="S3.E8.m2.2.2.1.1.1.1.1.2.2.1.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.2.2">subscript</csymbol><ci id="S3.E8.m2.2.2.1.1.1.1.1.2.2.2.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.2.2.2">𝑊</ci><ci id="S3.E8.m2.2.2.1.1.1.1.1.2.2.3.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.2.2.3">𝑎</ci></apply><apply id="S3.E8.m2.2.2.1.1.1.1.1.2.3.cmml" 
xref="S3.E8.m2.2.2.1.1.1.1.1.2.3"><csymbol cd="ambiguous" id="S3.E8.m2.2.2.1.1.1.1.1.2.3.1.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.2.3">superscript</csymbol><apply id="S3.E8.m2.2.2.1.1.1.1.1.2.3.2.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.2.3"><csymbol cd="ambiguous" id="S3.E8.m2.2.2.1.1.1.1.1.2.3.2.1.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.2.3">subscript</csymbol><ci id="S3.E8.m2.2.2.1.1.1.1.1.2.3.2.2.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.2.3.2.2">ℎ</ci><apply id="S3.E8.m2.2.2.1.1.1.1.1.2.3.2.3.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.2.3.2.3"><minus id="S3.E8.m2.2.2.1.1.1.1.1.2.3.2.3.1.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.2.3.2.3.1"></minus><ci id="S3.E8.m2.2.2.1.1.1.1.1.2.3.2.3.2.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.2.3.2.3.2">𝑖</ci><cn id="S3.E8.m2.2.2.1.1.1.1.1.2.3.2.3.3.cmml" type="integer" xref="S3.E8.m2.2.2.1.1.1.1.1.2.3.2.3.3">1</cn></apply></apply><ci id="S3.E8.m2.2.2.1.1.1.1.1.2.3.3.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.2.3.3">𝑑</ci></apply></apply><apply id="S3.E8.m2.2.2.1.1.1.1.1.3.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.3"><times id="S3.E8.m2.2.2.1.1.1.1.1.3.1.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.3.1"></times><apply id="S3.E8.m2.2.2.1.1.1.1.1.3.2.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.3.2"><csymbol cd="ambiguous" id="S3.E8.m2.2.2.1.1.1.1.1.3.2.1.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.3.2">subscript</csymbol><ci id="S3.E8.m2.2.2.1.1.1.1.1.3.2.2.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.3.2.2">𝑈</ci><ci id="S3.E8.m2.2.2.1.1.1.1.1.3.2.3.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.3.2.3">𝑎</ci></apply><apply id="S3.E8.m2.2.2.1.1.1.1.1.3.3.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.3.3"><csymbol cd="ambiguous" id="S3.E8.m2.2.2.1.1.1.1.1.3.3.1.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.3.3">superscript</csymbol><apply id="S3.E8.m2.2.2.1.1.1.1.1.3.3.2.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.3.3"><csymbol cd="ambiguous" id="S3.E8.m2.2.2.1.1.1.1.1.3.3.2.1.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.3.3">subscript</csymbol><ci id="S3.E8.m2.2.2.1.1.1.1.1.3.3.2.2.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.3.3.2.2">ℎ</ci><ci 
id="S3.E8.m2.2.2.1.1.1.1.1.3.3.2.3.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.3.3.2.3">𝑗</ci></apply><ci id="S3.E8.m2.2.2.1.1.1.1.1.3.3.3.cmml" xref="S3.E8.m2.2.2.1.1.1.1.1.3.3.3">𝑝</ci></apply></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.E8.m2.2c">\displaystyle=v_{a}^{T}\tanh(W_{a}h_{i-1}^{d}+U_{a}h_{j}^{p})</annotation><annotation encoding="application/x-llamapun" id="S3.E8.m2.2d">= italic_v start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT roman_tanh ( italic_W start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT italic_h start_POSTSUBSCRIPT italic_i - 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT + italic_U start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT italic_h start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT )</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(8)</span></td> </tr></tbody> <tbody id="S3.E9"><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_align_right ltx_eqn_cell"><math alttext="\displaystyle h_{i}^{d}" class="ltx_Math" display="inline" id="S3.E9.m1.1"><semantics id="S3.E9.m1.1a"><msubsup id="S3.E9.m1.1.1" xref="S3.E9.m1.1.1.cmml"><mi id="S3.E9.m1.1.1.2.2" xref="S3.E9.m1.1.1.2.2.cmml">h</mi><mi id="S3.E9.m1.1.1.2.3" xref="S3.E9.m1.1.1.2.3.cmml">i</mi><mi id="S3.E9.m1.1.1.3" xref="S3.E9.m1.1.1.3.cmml">d</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.E9.m1.1b"><apply id="S3.E9.m1.1.1.cmml" xref="S3.E9.m1.1.1"><csymbol cd="ambiguous" id="S3.E9.m1.1.1.1.cmml" xref="S3.E9.m1.1.1">superscript</csymbol><apply id="S3.E9.m1.1.1.2.cmml" xref="S3.E9.m1.1.1"><csymbol cd="ambiguous" id="S3.E9.m1.1.1.2.1.cmml" 
xref="S3.E9.m1.1.1">subscript</csymbol><ci id="S3.E9.m1.1.1.2.2.cmml" xref="S3.E9.m1.1.1.2.2">ℎ</ci><ci id="S3.E9.m1.1.1.2.3.cmml" xref="S3.E9.m1.1.1.2.3">𝑖</ci></apply><ci id="S3.E9.m1.1.1.3.cmml" xref="S3.E9.m1.1.1.3">𝑑</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.E9.m1.1c">\displaystyle h_{i}^{d}</annotation><annotation encoding="application/x-llamapun" id="S3.E9.m1.1d">italic_h start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_left ltx_eqn_cell"><math alttext="\displaystyle=\text{GRU}^{d}(y_{i},h_{i-1}^{d})" class="ltx_Math" display="inline" id="S3.E9.m2.2"><semantics id="S3.E9.m2.2a"><mrow id="S3.E9.m2.2.2" xref="S3.E9.m2.2.2.cmml"><mi id="S3.E9.m2.2.2.4" xref="S3.E9.m2.2.2.4.cmml"></mi><mo id="S3.E9.m2.2.2.3" xref="S3.E9.m2.2.2.3.cmml">=</mo><mrow id="S3.E9.m2.2.2.2" xref="S3.E9.m2.2.2.2.cmml"><msup id="S3.E9.m2.2.2.2.4" xref="S3.E9.m2.2.2.2.4.cmml"><mtext id="S3.E9.m2.2.2.2.4.2" xref="S3.E9.m2.2.2.2.4.2a.cmml">GRU</mtext><mi id="S3.E9.m2.2.2.2.4.3" xref="S3.E9.m2.2.2.2.4.3.cmml">d</mi></msup><mo id="S3.E9.m2.2.2.2.3" xref="S3.E9.m2.2.2.2.3.cmml">⁢</mo><mrow id="S3.E9.m2.2.2.2.2.2" xref="S3.E9.m2.2.2.2.2.3.cmml"><mo id="S3.E9.m2.2.2.2.2.2.3" stretchy="false" xref="S3.E9.m2.2.2.2.2.3.cmml">(</mo><msub id="S3.E9.m2.1.1.1.1.1.1" xref="S3.E9.m2.1.1.1.1.1.1.cmml"><mi id="S3.E9.m2.1.1.1.1.1.1.2" xref="S3.E9.m2.1.1.1.1.1.1.2.cmml">y</mi><mi id="S3.E9.m2.1.1.1.1.1.1.3" xref="S3.E9.m2.1.1.1.1.1.1.3.cmml">i</mi></msub><mo id="S3.E9.m2.2.2.2.2.2.4" xref="S3.E9.m2.2.2.2.2.3.cmml">,</mo><msubsup id="S3.E9.m2.2.2.2.2.2.2" xref="S3.E9.m2.2.2.2.2.2.2.cmml"><mi id="S3.E9.m2.2.2.2.2.2.2.2.2" xref="S3.E9.m2.2.2.2.2.2.2.2.2.cmml">h</mi><mrow id="S3.E9.m2.2.2.2.2.2.2.2.3" xref="S3.E9.m2.2.2.2.2.2.2.2.3.cmml"><mi id="S3.E9.m2.2.2.2.2.2.2.2.3.2" xref="S3.E9.m2.2.2.2.2.2.2.2.3.2.cmml">i</mi><mo id="S3.E9.m2.2.2.2.2.2.2.2.3.1" 
xref="S3.E9.m2.2.2.2.2.2.2.2.3.1.cmml">−</mo><mn id="S3.E9.m2.2.2.2.2.2.2.2.3.3" xref="S3.E9.m2.2.2.2.2.2.2.2.3.3.cmml">1</mn></mrow><mi id="S3.E9.m2.2.2.2.2.2.2.3" xref="S3.E9.m2.2.2.2.2.2.2.3.cmml">d</mi></msubsup><mo id="S3.E9.m2.2.2.2.2.2.5" stretchy="false" xref="S3.E9.m2.2.2.2.2.3.cmml">)</mo></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.E9.m2.2b"><apply id="S3.E9.m2.2.2.cmml" xref="S3.E9.m2.2.2"><eq id="S3.E9.m2.2.2.3.cmml" xref="S3.E9.m2.2.2.3"></eq><csymbol cd="latexml" id="S3.E9.m2.2.2.4.cmml" xref="S3.E9.m2.2.2.4">absent</csymbol><apply id="S3.E9.m2.2.2.2.cmml" xref="S3.E9.m2.2.2.2"><times id="S3.E9.m2.2.2.2.3.cmml" xref="S3.E9.m2.2.2.2.3"></times><apply id="S3.E9.m2.2.2.2.4.cmml" xref="S3.E9.m2.2.2.2.4"><csymbol cd="ambiguous" id="S3.E9.m2.2.2.2.4.1.cmml" xref="S3.E9.m2.2.2.2.4">superscript</csymbol><ci id="S3.E9.m2.2.2.2.4.2a.cmml" xref="S3.E9.m2.2.2.2.4.2"><mtext id="S3.E9.m2.2.2.2.4.2.cmml" xref="S3.E9.m2.2.2.2.4.2">GRU</mtext></ci><ci id="S3.E9.m2.2.2.2.4.3.cmml" xref="S3.E9.m2.2.2.2.4.3">𝑑</ci></apply><interval closure="open" id="S3.E9.m2.2.2.2.2.3.cmml" xref="S3.E9.m2.2.2.2.2.2"><apply id="S3.E9.m2.1.1.1.1.1.1.cmml" xref="S3.E9.m2.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S3.E9.m2.1.1.1.1.1.1.1.cmml" xref="S3.E9.m2.1.1.1.1.1.1">subscript</csymbol><ci id="S3.E9.m2.1.1.1.1.1.1.2.cmml" xref="S3.E9.m2.1.1.1.1.1.1.2">𝑦</ci><ci id="S3.E9.m2.1.1.1.1.1.1.3.cmml" xref="S3.E9.m2.1.1.1.1.1.1.3">𝑖</ci></apply><apply id="S3.E9.m2.2.2.2.2.2.2.cmml" xref="S3.E9.m2.2.2.2.2.2.2"><csymbol cd="ambiguous" id="S3.E9.m2.2.2.2.2.2.2.1.cmml" xref="S3.E9.m2.2.2.2.2.2.2">superscript</csymbol><apply id="S3.E9.m2.2.2.2.2.2.2.2.cmml" xref="S3.E9.m2.2.2.2.2.2.2"><csymbol cd="ambiguous" id="S3.E9.m2.2.2.2.2.2.2.2.1.cmml" xref="S3.E9.m2.2.2.2.2.2.2">subscript</csymbol><ci id="S3.E9.m2.2.2.2.2.2.2.2.2.cmml" xref="S3.E9.m2.2.2.2.2.2.2.2.2">ℎ</ci><apply id="S3.E9.m2.2.2.2.2.2.2.2.3.cmml" xref="S3.E9.m2.2.2.2.2.2.2.2.3"><minus 
id="S3.E9.m2.2.2.2.2.2.2.2.3.1.cmml" xref="S3.E9.m2.2.2.2.2.2.2.2.3.1"></minus><ci id="S3.E9.m2.2.2.2.2.2.2.2.3.2.cmml" xref="S3.E9.m2.2.2.2.2.2.2.2.3.2">𝑖</ci><cn id="S3.E9.m2.2.2.2.2.2.2.2.3.3.cmml" type="integer" xref="S3.E9.m2.2.2.2.2.2.2.2.3.3">1</cn></apply></apply><ci id="S3.E9.m2.2.2.2.2.2.2.3.cmml" xref="S3.E9.m2.2.2.2.2.2.2.3">𝑑</ci></apply></interval></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.E9.m2.2c">\displaystyle=\text{GRU}^{d}(y_{i},h_{i-1}^{d})</annotation><annotation encoding="application/x-llamapun" id="S3.E9.m2.2d">= GRU start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT ( italic_y start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_h start_POSTSUBSCRIPT italic_i - 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT )</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(9)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S3.SS3.p1.11">Here, <math alttext="e_{ij}" class="ltx_Math" display="inline" id="S3.SS3.p1.5.m1.1"><semantics id="S3.SS3.p1.5.m1.1a"><msub id="S3.SS3.p1.5.m1.1.1" xref="S3.SS3.p1.5.m1.1.1.cmml"><mi id="S3.SS3.p1.5.m1.1.1.2" xref="S3.SS3.p1.5.m1.1.1.2.cmml">e</mi><mrow id="S3.SS3.p1.5.m1.1.1.3" xref="S3.SS3.p1.5.m1.1.1.3.cmml"><mi id="S3.SS3.p1.5.m1.1.1.3.2" xref="S3.SS3.p1.5.m1.1.1.3.2.cmml">i</mi><mo id="S3.SS3.p1.5.m1.1.1.3.1" xref="S3.SS3.p1.5.m1.1.1.3.1.cmml">⁢</mo><mi id="S3.SS3.p1.5.m1.1.1.3.3" xref="S3.SS3.p1.5.m1.1.1.3.3.cmml">j</mi></mrow></msub><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.5.m1.1b"><apply id="S3.SS3.p1.5.m1.1.1.cmml" xref="S3.SS3.p1.5.m1.1.1"><csymbol cd="ambiguous" id="S3.SS3.p1.5.m1.1.1.1.cmml" xref="S3.SS3.p1.5.m1.1.1">subscript</csymbol><ci id="S3.SS3.p1.5.m1.1.1.2.cmml" xref="S3.SS3.p1.5.m1.1.1.2">𝑒</ci><apply id="S3.SS3.p1.5.m1.1.1.3.cmml" 
xref="S3.SS3.p1.5.m1.1.1.3"><times id="S3.SS3.p1.5.m1.1.1.3.1.cmml" xref="S3.SS3.p1.5.m1.1.1.3.1"></times><ci id="S3.SS3.p1.5.m1.1.1.3.2.cmml" xref="S3.SS3.p1.5.m1.1.1.3.2">𝑖</ci><ci id="S3.SS3.p1.5.m1.1.1.3.3.cmml" xref="S3.SS3.p1.5.m1.1.1.3.3">𝑗</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.5.m1.1c">e_{ij}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.5.m1.1d">italic_e start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT</annotation></semantics></math> scores how well the input around position <math alttext="j" class="ltx_Math" display="inline" id="S3.SS3.p1.6.m2.1"><semantics id="S3.SS3.p1.6.m2.1a"><mi id="S3.SS3.p1.6.m2.1.1" xref="S3.SS3.p1.6.m2.1.1.cmml">j</mi><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.6.m2.1b"><ci id="S3.SS3.p1.6.m2.1.1.cmml" xref="S3.SS3.p1.6.m2.1.1">𝑗</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.6.m2.1c">j</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.6.m2.1d">italic_j</annotation></semantics></math> matches the output at position <math alttext="i" class="ltx_Math" display="inline" id="S3.SS3.p1.7.m3.1"><semantics id="S3.SS3.p1.7.m3.1a"><mi id="S3.SS3.p1.7.m3.1.1" xref="S3.SS3.p1.7.m3.1.1.cmml">i</mi><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.7.m3.1b"><ci id="S3.SS3.p1.7.m3.1.1.cmml" xref="S3.SS3.p1.7.m3.1.1">𝑖</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.7.m3.1c">i</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.7.m3.1d">italic_i</annotation></semantics></math>, and <math alttext="h_{i}^{d}" class="ltx_Math" display="inline" id="S3.SS3.p1.8.m4.1"><semantics id="S3.SS3.p1.8.m4.1a"><msubsup id="S3.SS3.p1.8.m4.1.1" xref="S3.SS3.p1.8.m4.1.1.cmml"><mi id="S3.SS3.p1.8.m4.1.1.2.2" xref="S3.SS3.p1.8.m4.1.1.2.2.cmml">h</mi><mi id="S3.SS3.p1.8.m4.1.1.2.3" xref="S3.SS3.p1.8.m4.1.1.2.3.cmml">i</mi><mi id="S3.SS3.p1.8.m4.1.1.3" 
xref="S3.SS3.p1.8.m4.1.1.3.cmml">d</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.8.m4.1b"><apply id="S3.SS3.p1.8.m4.1.1.cmml" xref="S3.SS3.p1.8.m4.1.1"><csymbol cd="ambiguous" id="S3.SS3.p1.8.m4.1.1.1.cmml" xref="S3.SS3.p1.8.m4.1.1">superscript</csymbol><apply id="S3.SS3.p1.8.m4.1.1.2.cmml" xref="S3.SS3.p1.8.m4.1.1"><csymbol cd="ambiguous" id="S3.SS3.p1.8.m4.1.1.2.1.cmml" xref="S3.SS3.p1.8.m4.1.1">subscript</csymbol><ci id="S3.SS3.p1.8.m4.1.1.2.2.cmml" xref="S3.SS3.p1.8.m4.1.1.2.2">ℎ</ci><ci id="S3.SS3.p1.8.m4.1.1.2.3.cmml" xref="S3.SS3.p1.8.m4.1.1.2.3">𝑖</ci></apply><ci id="S3.SS3.p1.8.m4.1.1.3.cmml" xref="S3.SS3.p1.8.m4.1.1.3">𝑑</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.8.m4.1c">h_{i}^{d}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.8.m4.1d">italic_h start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT</annotation></semantics></math> is the hidden state generated by the decoder, computed from its previous hidden state <math alttext="h_{i-1}^{d}" class="ltx_Math" display="inline" id="S3.SS3.p1.9.m5.1"><semantics id="S3.SS3.p1.9.m5.1a"><msubsup id="S3.SS3.p1.9.m5.1.1" xref="S3.SS3.p1.9.m5.1.1.cmml"><mi id="S3.SS3.p1.9.m5.1.1.2.2" xref="S3.SS3.p1.9.m5.1.1.2.2.cmml">h</mi><mrow id="S3.SS3.p1.9.m5.1.1.2.3" xref="S3.SS3.p1.9.m5.1.1.2.3.cmml"><mi id="S3.SS3.p1.9.m5.1.1.2.3.2" xref="S3.SS3.p1.9.m5.1.1.2.3.2.cmml">i</mi><mo id="S3.SS3.p1.9.m5.1.1.2.3.1" xref="S3.SS3.p1.9.m5.1.1.2.3.1.cmml">−</mo><mn id="S3.SS3.p1.9.m5.1.1.2.3.3" xref="S3.SS3.p1.9.m5.1.1.2.3.3.cmml">1</mn></mrow><mi id="S3.SS3.p1.9.m5.1.1.3" xref="S3.SS3.p1.9.m5.1.1.3.cmml">d</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.9.m5.1b"><apply id="S3.SS3.p1.9.m5.1.1.cmml" xref="S3.SS3.p1.9.m5.1.1"><csymbol cd="ambiguous" id="S3.SS3.p1.9.m5.1.1.1.cmml" xref="S3.SS3.p1.9.m5.1.1">superscript</csymbol><apply id="S3.SS3.p1.9.m5.1.1.2.cmml" 
xref="S3.SS3.p1.9.m5.1.1"><csymbol cd="ambiguous" id="S3.SS3.p1.9.m5.1.1.2.1.cmml" xref="S3.SS3.p1.9.m5.1.1">subscript</csymbol><ci id="S3.SS3.p1.9.m5.1.1.2.2.cmml" xref="S3.SS3.p1.9.m5.1.1.2.2">ℎ</ci><apply id="S3.SS3.p1.9.m5.1.1.2.3.cmml" xref="S3.SS3.p1.9.m5.1.1.2.3"><minus id="S3.SS3.p1.9.m5.1.1.2.3.1.cmml" xref="S3.SS3.p1.9.m5.1.1.2.3.1"></minus><ci id="S3.SS3.p1.9.m5.1.1.2.3.2.cmml" xref="S3.SS3.p1.9.m5.1.1.2.3.2">𝑖</ci><cn id="S3.SS3.p1.9.m5.1.1.2.3.3.cmml" type="integer" xref="S3.SS3.p1.9.m5.1.1.2.3.3">1</cn></apply></apply><ci id="S3.SS3.p1.9.m5.1.1.3.cmml" xref="S3.SS3.p1.9.m5.1.1.3">𝑑</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.9.m5.1c">h_{i-1}^{d}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.9.m5.1d">italic_h start_POSTSUBSCRIPT italic_i - 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT</annotation></semantics></math> and the <math alttext="i" class="ltx_Math" display="inline" id="S3.SS3.p1.10.m6.1"><semantics id="S3.SS3.p1.10.m6.1a"><mi id="S3.SS3.p1.10.m6.1.1" xref="S3.SS3.p1.10.m6.1.1.cmml">i</mi><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.10.m6.1b"><ci id="S3.SS3.p1.10.m6.1.1.cmml" xref="S3.SS3.p1.10.m6.1.1">𝑖</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.10.m6.1c">i</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.10.m6.1d">italic_i</annotation></semantics></math>th target in the output sequence <math alttext="y_{i}" class="ltx_Math" display="inline" id="S3.SS3.p1.11.m7.1"><semantics id="S3.SS3.p1.11.m7.1a"><msub id="S3.SS3.p1.11.m7.1.1" xref="S3.SS3.p1.11.m7.1.1.cmml"><mi id="S3.SS3.p1.11.m7.1.1.2" xref="S3.SS3.p1.11.m7.1.1.2.cmml">y</mi><mi id="S3.SS3.p1.11.m7.1.1.3" xref="S3.SS3.p1.11.m7.1.1.3.cmml">i</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.11.m7.1b"><apply id="S3.SS3.p1.11.m7.1.1.cmml" xref="S3.SS3.p1.11.m7.1.1"><csymbol cd="ambiguous" 
id="S3.SS3.p1.11.m7.1.1.1.cmml" xref="S3.SS3.p1.11.m7.1.1">subscript</csymbol><ci id="S3.SS3.p1.11.m7.1.1.2.cmml" xref="S3.SS3.p1.11.m7.1.1.2">𝑦</ci><ci id="S3.SS3.p1.11.m7.1.1.3.cmml" xref="S3.SS3.p1.11.m7.1.1.3">𝑖</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.11.m7.1c">y_{i}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.11.m7.1d">italic_y start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT</annotation></semantics></math>.</p> </div> <div class="ltx_para" id="S3.SS3.p2"> <p class="ltx_p" id="S3.SS3.p2.1">The master-slave encoding model in this paper does not decode the entire output sequence at once; instead, it decodes the sequence in stages. Each stage decodes a segment of fixed length <math alttext="K" class="ltx_Math" display="inline" id="S3.SS3.p2.1.m1.1"><semantics id="S3.SS3.p2.1.m1.1a"><mi id="S3.SS3.p2.1.m1.1.1" xref="S3.SS3.p2.1.m1.1.1.cmml">K</mi><annotation-xml encoding="MathML-Content" id="S3.SS3.p2.1.m1.1b"><ci id="S3.SS3.p2.1.m1.1.1.cmml" xref="S3.SS3.p2.1.m1.1.1">𝐾</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p2.1.m1.1c">K</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p2.1.m1.1d">italic_K</annotation></semantics></math>, and the entire decoded sequence is represented as:</p> <table class="ltx_equation ltx_eqn_table" id="S3.E10"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="C^{d}=\tanh(W_{d}\frac{1}{L}\sum_{i=1}^{L}h_{i}^{p}+b_{d})" class="ltx_Math" display="block" id="S3.E10.m1.2"><semantics id="S3.E10.m1.2a"><mrow id="S3.E10.m1.2.2" xref="S3.E10.m1.2.2.cmml"><msup id="S3.E10.m1.2.2.3" xref="S3.E10.m1.2.2.3.cmml"><mi id="S3.E10.m1.2.2.3.2" xref="S3.E10.m1.2.2.3.2.cmml">C</mi><mi id="S3.E10.m1.2.2.3.3" xref="S3.E10.m1.2.2.3.3.cmml">d</mi></msup><mo id="S3.E10.m1.2.2.2" 
xref="S3.E10.m1.2.2.2.cmml">=</mo><mrow id="S3.E10.m1.2.2.1.1" xref="S3.E10.m1.2.2.1.2.cmml"><mi id="S3.E10.m1.1.1" xref="S3.E10.m1.1.1.cmml">tanh</mi><mo id="S3.E10.m1.2.2.1.1a" xref="S3.E10.m1.2.2.1.2.cmml">⁡</mo><mrow id="S3.E10.m1.2.2.1.1.1" xref="S3.E10.m1.2.2.1.2.cmml"><mo id="S3.E10.m1.2.2.1.1.1.2" stretchy="false" xref="S3.E10.m1.2.2.1.2.cmml">(</mo><mrow id="S3.E10.m1.2.2.1.1.1.1" xref="S3.E10.m1.2.2.1.1.1.1.cmml"><mrow id="S3.E10.m1.2.2.1.1.1.1.2" xref="S3.E10.m1.2.2.1.1.1.1.2.cmml"><msub id="S3.E10.m1.2.2.1.1.1.1.2.2" xref="S3.E10.m1.2.2.1.1.1.1.2.2.cmml"><mi id="S3.E10.m1.2.2.1.1.1.1.2.2.2" xref="S3.E10.m1.2.2.1.1.1.1.2.2.2.cmml">W</mi><mi id="S3.E10.m1.2.2.1.1.1.1.2.2.3" xref="S3.E10.m1.2.2.1.1.1.1.2.2.3.cmml">d</mi></msub><mo id="S3.E10.m1.2.2.1.1.1.1.2.1" xref="S3.E10.m1.2.2.1.1.1.1.2.1.cmml">⁢</mo><mfrac id="S3.E10.m1.2.2.1.1.1.1.2.3" xref="S3.E10.m1.2.2.1.1.1.1.2.3.cmml"><mn id="S3.E10.m1.2.2.1.1.1.1.2.3.2" xref="S3.E10.m1.2.2.1.1.1.1.2.3.2.cmml">1</mn><mi id="S3.E10.m1.2.2.1.1.1.1.2.3.3" xref="S3.E10.m1.2.2.1.1.1.1.2.3.3.cmml">L</mi></mfrac><mo id="S3.E10.m1.2.2.1.1.1.1.2.1a" xref="S3.E10.m1.2.2.1.1.1.1.2.1.cmml">⁢</mo><mrow id="S3.E10.m1.2.2.1.1.1.1.2.4" xref="S3.E10.m1.2.2.1.1.1.1.2.4.cmml"><munderover id="S3.E10.m1.2.2.1.1.1.1.2.4.1" xref="S3.E10.m1.2.2.1.1.1.1.2.4.1.cmml"><mo id="S3.E10.m1.2.2.1.1.1.1.2.4.1.2.2" movablelimits="false" xref="S3.E10.m1.2.2.1.1.1.1.2.4.1.2.2.cmml">∑</mo><mrow id="S3.E10.m1.2.2.1.1.1.1.2.4.1.2.3" xref="S3.E10.m1.2.2.1.1.1.1.2.4.1.2.3.cmml"><mi id="S3.E10.m1.2.2.1.1.1.1.2.4.1.2.3.2" xref="S3.E10.m1.2.2.1.1.1.1.2.4.1.2.3.2.cmml">i</mi><mo id="S3.E10.m1.2.2.1.1.1.1.2.4.1.2.3.1" xref="S3.E10.m1.2.2.1.1.1.1.2.4.1.2.3.1.cmml">=</mo><mn id="S3.E10.m1.2.2.1.1.1.1.2.4.1.2.3.3" xref="S3.E10.m1.2.2.1.1.1.1.2.4.1.2.3.3.cmml">1</mn></mrow><mi id="S3.E10.m1.2.2.1.1.1.1.2.4.1.3" xref="S3.E10.m1.2.2.1.1.1.1.2.4.1.3.cmml">L</mi></munderover><msubsup id="S3.E10.m1.2.2.1.1.1.1.2.4.2" xref="S3.E10.m1.2.2.1.1.1.1.2.4.2.cmml"><mi 
id="S3.E10.m1.2.2.1.1.1.1.2.4.2.2.2" xref="S3.E10.m1.2.2.1.1.1.1.2.4.2.2.2.cmml">h</mi><mi id="S3.E10.m1.2.2.1.1.1.1.2.4.2.2.3" xref="S3.E10.m1.2.2.1.1.1.1.2.4.2.2.3.cmml">i</mi><mi id="S3.E10.m1.2.2.1.1.1.1.2.4.2.3" xref="S3.E10.m1.2.2.1.1.1.1.2.4.2.3.cmml">p</mi></msubsup></mrow></mrow><mo id="S3.E10.m1.2.2.1.1.1.1.1" xref="S3.E10.m1.2.2.1.1.1.1.1.cmml">+</mo><msub id="S3.E10.m1.2.2.1.1.1.1.3" xref="S3.E10.m1.2.2.1.1.1.1.3.cmml"><mi id="S3.E10.m1.2.2.1.1.1.1.3.2" xref="S3.E10.m1.2.2.1.1.1.1.3.2.cmml">b</mi><mi id="S3.E10.m1.2.2.1.1.1.1.3.3" xref="S3.E10.m1.2.2.1.1.1.1.3.3.cmml">d</mi></msub></mrow><mo id="S3.E10.m1.2.2.1.1.1.3" stretchy="false" xref="S3.E10.m1.2.2.1.2.cmml">)</mo></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.E10.m1.2b"><apply id="S3.E10.m1.2.2.cmml" xref="S3.E10.m1.2.2"><eq id="S3.E10.m1.2.2.2.cmml" xref="S3.E10.m1.2.2.2"></eq><apply id="S3.E10.m1.2.2.3.cmml" xref="S3.E10.m1.2.2.3"><csymbol cd="ambiguous" id="S3.E10.m1.2.2.3.1.cmml" xref="S3.E10.m1.2.2.3">superscript</csymbol><ci id="S3.E10.m1.2.2.3.2.cmml" xref="S3.E10.m1.2.2.3.2">𝐶</ci><ci id="S3.E10.m1.2.2.3.3.cmml" xref="S3.E10.m1.2.2.3.3">𝑑</ci></apply><apply id="S3.E10.m1.2.2.1.2.cmml" xref="S3.E10.m1.2.2.1.1"><tanh id="S3.E10.m1.1.1.cmml" xref="S3.E10.m1.1.1"></tanh><apply id="S3.E10.m1.2.2.1.1.1.1.cmml" xref="S3.E10.m1.2.2.1.1.1.1"><plus id="S3.E10.m1.2.2.1.1.1.1.1.cmml" xref="S3.E10.m1.2.2.1.1.1.1.1"></plus><apply id="S3.E10.m1.2.2.1.1.1.1.2.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2"><times id="S3.E10.m1.2.2.1.1.1.1.2.1.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.1"></times><apply id="S3.E10.m1.2.2.1.1.1.1.2.2.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.2"><csymbol cd="ambiguous" id="S3.E10.m1.2.2.1.1.1.1.2.2.1.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.2">subscript</csymbol><ci id="S3.E10.m1.2.2.1.1.1.1.2.2.2.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.2.2">𝑊</ci><ci id="S3.E10.m1.2.2.1.1.1.1.2.2.3.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.2.3">𝑑</ci></apply><apply id="S3.E10.m1.2.2.1.1.1.1.2.3.cmml" 
xref="S3.E10.m1.2.2.1.1.1.1.2.3"><divide id="S3.E10.m1.2.2.1.1.1.1.2.3.1.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.3"></divide><cn id="S3.E10.m1.2.2.1.1.1.1.2.3.2.cmml" type="integer" xref="S3.E10.m1.2.2.1.1.1.1.2.3.2">1</cn><ci id="S3.E10.m1.2.2.1.1.1.1.2.3.3.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.3.3">𝐿</ci></apply><apply id="S3.E10.m1.2.2.1.1.1.1.2.4.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.4"><apply id="S3.E10.m1.2.2.1.1.1.1.2.4.1.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.4.1"><csymbol cd="ambiguous" id="S3.E10.m1.2.2.1.1.1.1.2.4.1.1.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.4.1">superscript</csymbol><apply id="S3.E10.m1.2.2.1.1.1.1.2.4.1.2.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.4.1"><csymbol cd="ambiguous" id="S3.E10.m1.2.2.1.1.1.1.2.4.1.2.1.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.4.1">subscript</csymbol><sum id="S3.E10.m1.2.2.1.1.1.1.2.4.1.2.2.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.4.1.2.2"></sum><apply id="S3.E10.m1.2.2.1.1.1.1.2.4.1.2.3.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.4.1.2.3"><eq id="S3.E10.m1.2.2.1.1.1.1.2.4.1.2.3.1.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.4.1.2.3.1"></eq><ci id="S3.E10.m1.2.2.1.1.1.1.2.4.1.2.3.2.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.4.1.2.3.2">𝑖</ci><cn id="S3.E10.m1.2.2.1.1.1.1.2.4.1.2.3.3.cmml" type="integer" xref="S3.E10.m1.2.2.1.1.1.1.2.4.1.2.3.3">1</cn></apply></apply><ci id="S3.E10.m1.2.2.1.1.1.1.2.4.1.3.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.4.1.3">𝐿</ci></apply><apply id="S3.E10.m1.2.2.1.1.1.1.2.4.2.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.4.2"><csymbol cd="ambiguous" id="S3.E10.m1.2.2.1.1.1.1.2.4.2.1.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.4.2">superscript</csymbol><apply id="S3.E10.m1.2.2.1.1.1.1.2.4.2.2.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.4.2"><csymbol cd="ambiguous" id="S3.E10.m1.2.2.1.1.1.1.2.4.2.2.1.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.4.2">subscript</csymbol><ci id="S3.E10.m1.2.2.1.1.1.1.2.4.2.2.2.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.4.2.2.2">ℎ</ci><ci id="S3.E10.m1.2.2.1.1.1.1.2.4.2.2.3.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.4.2.2.3">𝑖</ci></apply><ci 
id="S3.E10.m1.2.2.1.1.1.1.2.4.2.3.cmml" xref="S3.E10.m1.2.2.1.1.1.1.2.4.2.3">𝑝</ci></apply></apply></apply><apply id="S3.E10.m1.2.2.1.1.1.1.3.cmml" xref="S3.E10.m1.2.2.1.1.1.1.3"><csymbol cd="ambiguous" id="S3.E10.m1.2.2.1.1.1.1.3.1.cmml" xref="S3.E10.m1.2.2.1.1.1.1.3">subscript</csymbol><ci id="S3.E10.m1.2.2.1.1.1.1.3.2.cmml" xref="S3.E10.m1.2.2.1.1.1.1.3.2">𝑏</ci><ci id="S3.E10.m1.2.2.1.1.1.1.3.3.cmml" xref="S3.E10.m1.2.2.1.1.1.1.3.3">𝑑</ci></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.E10.m1.2c">C^{d}=\tanh(W_{d}\frac{1}{L}\sum_{i=1}^{L}h_{i}^{p}+b_{d})</annotation><annotation encoding="application/x-llamapun" id="S3.E10.m1.2d">italic_C start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT = roman_tanh ( italic_W start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT divide start_ARG 1 end_ARG start_ARG italic_L end_ARG ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_L end_POSTSUPERSCRIPT italic_h start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT + italic_b start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT )</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(10)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S3.SS3.p2.5">where <math alttext="W_{d}" class="ltx_Math" display="inline" id="S3.SS3.p2.2.m1.1"><semantics id="S3.SS3.p2.2.m1.1a"><msub id="S3.SS3.p2.2.m1.1.1" xref="S3.SS3.p2.2.m1.1.1.cmml"><mi id="S3.SS3.p2.2.m1.1.1.2" xref="S3.SS3.p2.2.m1.1.1.2.cmml">W</mi><mi id="S3.SS3.p2.2.m1.1.1.3" xref="S3.SS3.p2.2.m1.1.1.3.cmml">d</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS3.p2.2.m1.1b"><apply id="S3.SS3.p2.2.m1.1.1.cmml" xref="S3.SS3.p2.2.m1.1.1"><csymbol cd="ambiguous" id="S3.SS3.p2.2.m1.1.1.1.cmml" 
xref="S3.SS3.p2.2.m1.1.1">subscript</csymbol><ci id="S3.SS3.p2.2.m1.1.1.2.cmml" xref="S3.SS3.p2.2.m1.1.1.2">𝑊</ci><ci id="S3.SS3.p2.2.m1.1.1.3.cmml" xref="S3.SS3.p2.2.m1.1.1.3">𝑑</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p2.2.m1.1c">W_{d}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p2.2.m1.1d">italic_W start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT</annotation></semantics></math> and <math alttext="b_{d}" class="ltx_Math" display="inline" id="S3.SS3.p2.3.m2.1"><semantics id="S3.SS3.p2.3.m2.1a"><msub id="S3.SS3.p2.3.m2.1.1" xref="S3.SS3.p2.3.m2.1.1.cmml"><mi id="S3.SS3.p2.3.m2.1.1.2" xref="S3.SS3.p2.3.m2.1.1.2.cmml">b</mi><mi id="S3.SS3.p2.3.m2.1.1.3" xref="S3.SS3.p2.3.m2.1.1.3.cmml">d</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS3.p2.3.m2.1b"><apply id="S3.SS3.p2.3.m2.1.1.cmml" xref="S3.SS3.p2.3.m2.1.1"><csymbol cd="ambiguous" id="S3.SS3.p2.3.m2.1.1.1.cmml" xref="S3.SS3.p2.3.m2.1.1">subscript</csymbol><ci id="S3.SS3.p2.3.m2.1.1.2.cmml" xref="S3.SS3.p2.3.m2.1.1.2">𝑏</ci><ci id="S3.SS3.p2.3.m2.1.1.3.cmml" xref="S3.SS3.p2.3.m2.1.1.3">𝑑</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p2.3.m2.1c">b_{d}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p2.3.m2.1d">italic_b start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT</annotation></semantics></math> are learnable parameters, and <math alttext="L" class="ltx_Math" display="inline" id="S3.SS3.p2.4.m3.1"><semantics id="S3.SS3.p2.4.m3.1a"><mi id="S3.SS3.p2.4.m3.1.1" xref="S3.SS3.p2.4.m3.1.1.cmml">L</mi><annotation-xml encoding="MathML-Content" id="S3.SS3.p2.4.m3.1b"><ci id="S3.SS3.p2.4.m3.1.1.cmml" xref="S3.SS3.p2.4.m3.1.1">𝐿</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p2.4.m3.1c">L</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p2.4.m3.1d">italic_L</annotation></semantics></math> is the length of the currently encoded sentence. 
<math alttext="C^{d}" class="ltx_Math" display="inline" id="S3.SS3.p2.5.m4.1"><semantics id="S3.SS3.p2.5.m4.1a"><msup id="S3.SS3.p2.5.m4.1.1" xref="S3.SS3.p2.5.m4.1.1.cmml"><mi id="S3.SS3.p2.5.m4.1.1.2" xref="S3.SS3.p2.5.m4.1.1.2.cmml">C</mi><mi id="S3.SS3.p2.5.m4.1.1.3" xref="S3.SS3.p2.5.m4.1.1.3.cmml">d</mi></msup><annotation-xml encoding="MathML-Content" id="S3.SS3.p2.5.m4.1b"><apply id="S3.SS3.p2.5.m4.1.1.cmml" xref="S3.SS3.p2.5.m4.1.1"><csymbol cd="ambiguous" id="S3.SS3.p2.5.m4.1.1.1.cmml" xref="S3.SS3.p2.5.m4.1.1">superscript</csymbol><ci id="S3.SS3.p2.5.m4.1.1.2.cmml" xref="S3.SS3.p2.5.m4.1.1.2">𝐶</ci><ci id="S3.SS3.p2.5.m4.1.1.3.cmml" xref="S3.SS3.p2.5.m4.1.1.3">𝑑</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p2.5.m4.1c">C^{d}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p2.5.m4.1d">italic_C start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT</annotation></semantics></math> is the content currently produced by the decoder, which is used to adjust the attention weights of the slave encoder for each word in the input sequence.</p> </div> <div class="ltx_para" id="S3.SS3.p3"> <p class="ltx_p" id="S3.SS3.p3.1">After each fixed-length decoding, the slave encoder generates a new final state <math alttext="h_{m}^{s}" class="ltx_Math" display="inline" id="S3.SS3.p3.1.m1.1"><semantics id="S3.SS3.p3.1.m1.1a"><msubsup id="S3.SS3.p3.1.m1.1.1" xref="S3.SS3.p3.1.m1.1.1.cmml"><mi id="S3.SS3.p3.1.m1.1.1.2.2" xref="S3.SS3.p3.1.m1.1.1.2.2.cmml">h</mi><mi id="S3.SS3.p3.1.m1.1.1.2.3" xref="S3.SS3.p3.1.m1.1.1.2.3.cmml">m</mi><mi id="S3.SS3.p3.1.m1.1.1.3" xref="S3.SS3.p3.1.m1.1.1.3.cmml">s</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.SS3.p3.1.m1.1b"><apply id="S3.SS3.p3.1.m1.1.1.cmml" xref="S3.SS3.p3.1.m1.1.1"><csymbol cd="ambiguous" id="S3.SS3.p3.1.m1.1.1.1.cmml" xref="S3.SS3.p3.1.m1.1.1">superscript</csymbol><apply id="S3.SS3.p3.1.m1.1.1.2.cmml" xref="S3.SS3.p3.1.m1.1.1"><csymbol cd="ambiguous" 
id="S3.SS3.p3.1.m1.1.1.2.1.cmml" xref="S3.SS3.p3.1.m1.1.1">subscript</csymbol><ci id="S3.SS3.p3.1.m1.1.1.2.2.cmml" xref="S3.SS3.p3.1.m1.1.1.2.2">ℎ</ci><ci id="S3.SS3.p3.1.m1.1.1.2.3.cmml" xref="S3.SS3.p3.1.m1.1.1.2.3">𝑚</ci></apply><ci id="S3.SS3.p3.1.m1.1.1.3.cmml" xref="S3.SS3.p3.1.m1.1.1.3">𝑠</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p3.1.m1.1c">h_{m}^{s}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p3.1.m1.1d">italic_h start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_s end_POSTSUPERSCRIPT</annotation></semantics></math>, and the decoder state update is rewritten as follows:</p> <table class="ltx_equation ltx_eqn_table" id="S3.E11"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="h_{i}^{d}=\begin{cases}\text{GRU}^{d}(y_{i},[h_{i-1}^{d},h_{m}^{s}]),&amp;\text{if% }L\%k==0\\ \text{GRU}^{d}(y_{i},h_{i-1}^{d}),&amp;\text{if }L\%k\neq 0\end{cases}" class="ltx_math_unparsed" display="block" id="S3.E11.m1.4"><semantics id="S3.E11.m1.4a"><mrow id="S3.E11.m1.4.5"><msubsup id="S3.E11.m1.4.5.2"><mi id="S3.E11.m1.4.5.2.2.2">h</mi><mi id="S3.E11.m1.4.5.2.2.3">i</mi><mi id="S3.E11.m1.4.5.2.3">d</mi></msubsup><mo id="S3.E11.m1.4.5.1">=</mo><mrow id="S3.E11.m1.4.4"><mo id="S3.E11.m1.4.4.5">{</mo><mtable columnspacing="5pt" displaystyle="true" id="S3.E11.m1.4.4.4" rowspacing="0pt"><mtr id="S3.E11.m1.4.4.4a"><mtd class="ltx_align_left" columnalign="left" id="S3.E11.m1.4.4.4b"><mrow id="S3.E11.m1.1.1.1.1.1.1.1"><mrow id="S3.E11.m1.1.1.1.1.1.1.1.1"><msup id="S3.E11.m1.1.1.1.1.1.1.1.1.4"><mtext id="S3.E11.m1.1.1.1.1.1.1.1.1.4.2">GRU</mtext><mi id="S3.E11.m1.1.1.1.1.1.1.1.1.4.3">d</mi></msup><mo id="S3.E11.m1.1.1.1.1.1.1.1.1.3">⁢</mo><mrow id="S3.E11.m1.1.1.1.1.1.1.1.1.2.2"><mo id="S3.E11.m1.1.1.1.1.1.1.1.1.2.2.3" stretchy="false">(</mo><msub 
id="S3.E11.m1.1.1.1.1.1.1.1.1.1.1.1"><mi id="S3.E11.m1.1.1.1.1.1.1.1.1.1.1.1.2">y</mi><mi id="S3.E11.m1.1.1.1.1.1.1.1.1.1.1.1.3">i</mi></msub><mo id="S3.E11.m1.1.1.1.1.1.1.1.1.2.2.4">,</mo><mrow id="S3.E11.m1.1.1.1.1.1.1.1.1.2.2.2.2"><mo id="S3.E11.m1.1.1.1.1.1.1.1.1.2.2.2.2.3" stretchy="false">[</mo><msubsup id="S3.E11.m1.1.1.1.1.1.1.1.1.2.2.2.1.1"><mi id="S3.E11.m1.1.1.1.1.1.1.1.1.2.2.2.1.1.2.2">h</mi><mrow id="S3.E11.m1.1.1.1.1.1.1.1.1.2.2.2.1.1.2.3"><mi id="S3.E11.m1.1.1.1.1.1.1.1.1.2.2.2.1.1.2.3.2">i</mi><mo id="S3.E11.m1.1.1.1.1.1.1.1.1.2.2.2.1.1.2.3.1">−</mo><mn id="S3.E11.m1.1.1.1.1.1.1.1.1.2.2.2.1.1.2.3.3">1</mn></mrow><mi id="S3.E11.m1.1.1.1.1.1.1.1.1.2.2.2.1.1.3">d</mi></msubsup><mo id="S3.E11.m1.1.1.1.1.1.1.1.1.2.2.2.2.4">,</mo><msubsup id="S3.E11.m1.1.1.1.1.1.1.1.1.2.2.2.2.2"><mi id="S3.E11.m1.1.1.1.1.1.1.1.1.2.2.2.2.2.2.2">h</mi><mi id="S3.E11.m1.1.1.1.1.1.1.1.1.2.2.2.2.2.2.3">m</mi><mi id="S3.E11.m1.1.1.1.1.1.1.1.1.2.2.2.2.2.3">s</mi></msubsup><mo id="S3.E11.m1.1.1.1.1.1.1.1.1.2.2.2.2.5" stretchy="false">]</mo></mrow><mo id="S3.E11.m1.1.1.1.1.1.1.1.1.2.2.5" stretchy="false">)</mo></mrow></mrow><mo id="S3.E11.m1.1.1.1.1.1.1.1.2">,</mo></mrow></mtd><mtd class="ltx_align_left" columnalign="left" id="S3.E11.m1.4.4.4c"><mrow id="S3.E11.m1.2.2.2.2.2.1"><mtext id="S3.E11.m1.2.2.2.2.2.1.1">if </mtext><mi id="S3.E11.m1.2.2.2.2.2.1.2">L</mi><mo id="S3.E11.m1.2.2.2.2.2.1.3">%</mo><mi id="S3.E11.m1.2.2.2.2.2.1.4">k</mi><mo id="S3.E11.m1.2.2.2.2.2.1.5" rspace="0em">=</mo><mo id="S3.E11.m1.2.2.2.2.2.1.6" lspace="0em">=</mo><mn id="S3.E11.m1.2.2.2.2.2.1.7">0</mn></mrow></mtd></mtr><mtr id="S3.E11.m1.4.4.4d"><mtd class="ltx_align_left" columnalign="left" id="S3.E11.m1.4.4.4e"><mrow id="S3.E11.m1.3.3.3.3.1.1.1"><mrow id="S3.E11.m1.3.3.3.3.1.1.1.1"><msup id="S3.E11.m1.3.3.3.3.1.1.1.1.4"><mtext id="S3.E11.m1.3.3.3.3.1.1.1.1.4.2">GRU</mtext><mi id="S3.E11.m1.3.3.3.3.1.1.1.1.4.3">d</mi></msup><mo id="S3.E11.m1.3.3.3.3.1.1.1.1.3">⁢</mo><mrow 
id="S3.E11.m1.3.3.3.3.1.1.1.1.2.2"><mo id="S3.E11.m1.3.3.3.3.1.1.1.1.2.2.3" stretchy="false">(</mo><msub id="S3.E11.m1.3.3.3.3.1.1.1.1.1.1.1"><mi id="S3.E11.m1.3.3.3.3.1.1.1.1.1.1.1.2">y</mi><mi id="S3.E11.m1.3.3.3.3.1.1.1.1.1.1.1.3">i</mi></msub><mo id="S3.E11.m1.3.3.3.3.1.1.1.1.2.2.4">,</mo><msubsup id="S3.E11.m1.3.3.3.3.1.1.1.1.2.2.2"><mi id="S3.E11.m1.3.3.3.3.1.1.1.1.2.2.2.2.2">h</mi><mrow id="S3.E11.m1.3.3.3.3.1.1.1.1.2.2.2.2.3"><mi id="S3.E11.m1.3.3.3.3.1.1.1.1.2.2.2.2.3.2">i</mi><mo id="S3.E11.m1.3.3.3.3.1.1.1.1.2.2.2.2.3.1">−</mo><mn id="S3.E11.m1.3.3.3.3.1.1.1.1.2.2.2.2.3.3">1</mn></mrow><mi id="S3.E11.m1.3.3.3.3.1.1.1.1.2.2.2.3">d</mi></msubsup><mo id="S3.E11.m1.3.3.3.3.1.1.1.1.2.2.5" stretchy="false">)</mo></mrow></mrow><mo id="S3.E11.m1.3.3.3.3.1.1.1.2">,</mo></mrow></mtd><mtd class="ltx_align_left" columnalign="left" id="S3.E11.m1.4.4.4f"><mrow id="S3.E11.m1.4.4.4.4.2.1"><mrow id="S3.E11.m1.4.4.4.4.2.1.2"><mtext id="S3.E11.m1.4.4.4.4.2.1.2.2">if </mtext><mo id="S3.E11.m1.4.4.4.4.2.1.2.1">⁢</mo><mrow id="S3.E11.m1.4.4.4.4.2.1.2.3"><mi id="S3.E11.m1.4.4.4.4.2.1.2.3.2">L</mi><mo id="S3.E11.m1.4.4.4.4.2.1.2.3.1">%</mo></mrow><mo id="S3.E11.m1.4.4.4.4.2.1.2.1a">⁢</mo><mi id="S3.E11.m1.4.4.4.4.2.1.2.4">k</mi></mrow><mo id="S3.E11.m1.4.4.4.4.2.1.1">≠</mo><mn id="S3.E11.m1.4.4.4.4.2.1.3">0</mn></mrow></mtd></mtr></mtable></mrow></mrow><annotation encoding="application/x-tex" id="S3.E11.m1.4b">h_{i}^{d}=\begin{cases}\text{GRU}^{d}(y_{i},[h_{i-1}^{d},h_{m}^{s}]),&amp;\text{if% }L\%k==0\\ \text{GRU}^{d}(y_{i},h_{i-1}^{d}),&amp;\text{if }L\%k\neq 0\end{cases}</annotation><annotation encoding="application/x-llamapun" id="S3.E11.m1.4c">italic_h start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT = { start_ROW start_CELL GRU start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT ( italic_y start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , [ italic_h start_POSTSUBSCRIPT italic_i - 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 
italic_d end_POSTSUPERSCRIPT , italic_h start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_s end_POSTSUPERSCRIPT ] ) , end_CELL start_CELL if italic_L % italic_k = = 0 end_CELL end_ROW start_ROW start_CELL GRU start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT ( italic_y start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_h start_POSTSUBSCRIPT italic_i - 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT ) , end_CELL start_CELL if italic_L % italic_k ≠ 0 end_CELL end_ROW</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(11)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S3.SS3.p3.5">The initial state of the decoder is set to the final state of the master encoder, i.e., <math alttext="h_{0}^{d}=h_{m}^{p}" class="ltx_Math" display="inline" id="S3.SS3.p3.2.m1.1"><semantics id="S3.SS3.p3.2.m1.1a"><mrow id="S3.SS3.p3.2.m1.1.1" xref="S3.SS3.p3.2.m1.1.1.cmml"><msubsup id="S3.SS3.p3.2.m1.1.1.2" xref="S3.SS3.p3.2.m1.1.1.2.cmml"><mi id="S3.SS3.p3.2.m1.1.1.2.2.2" xref="S3.SS3.p3.2.m1.1.1.2.2.2.cmml">h</mi><mn id="S3.SS3.p3.2.m1.1.1.2.2.3" xref="S3.SS3.p3.2.m1.1.1.2.2.3.cmml">0</mn><mi id="S3.SS3.p3.2.m1.1.1.2.3" xref="S3.SS3.p3.2.m1.1.1.2.3.cmml">d</mi></msubsup><mo id="S3.SS3.p3.2.m1.1.1.1" xref="S3.SS3.p3.2.m1.1.1.1.cmml">=</mo><msubsup id="S3.SS3.p3.2.m1.1.1.3" xref="S3.SS3.p3.2.m1.1.1.3.cmml"><mi id="S3.SS3.p3.2.m1.1.1.3.2.2" xref="S3.SS3.p3.2.m1.1.1.3.2.2.cmml">h</mi><mi id="S3.SS3.p3.2.m1.1.1.3.2.3" xref="S3.SS3.p3.2.m1.1.1.3.2.3.cmml">m</mi><mi id="S3.SS3.p3.2.m1.1.1.3.3" xref="S3.SS3.p3.2.m1.1.1.3.3.cmml">p</mi></msubsup></mrow><annotation-xml encoding="MathML-Content" id="S3.SS3.p3.2.m1.1b"><apply id="S3.SS3.p3.2.m1.1.1.cmml" xref="S3.SS3.p3.2.m1.1.1"><eq id="S3.SS3.p3.2.m1.1.1.1.cmml" 
xref="S3.SS3.p3.2.m1.1.1.1"></eq><apply id="S3.SS3.p3.2.m1.1.1.2.cmml" xref="S3.SS3.p3.2.m1.1.1.2"><csymbol cd="ambiguous" id="S3.SS3.p3.2.m1.1.1.2.1.cmml" xref="S3.SS3.p3.2.m1.1.1.2">superscript</csymbol><apply id="S3.SS3.p3.2.m1.1.1.2.2.cmml" xref="S3.SS3.p3.2.m1.1.1.2"><csymbol cd="ambiguous" id="S3.SS3.p3.2.m1.1.1.2.2.1.cmml" xref="S3.SS3.p3.2.m1.1.1.2">subscript</csymbol><ci id="S3.SS3.p3.2.m1.1.1.2.2.2.cmml" xref="S3.SS3.p3.2.m1.1.1.2.2.2">ℎ</ci><cn id="S3.SS3.p3.2.m1.1.1.2.2.3.cmml" type="integer" xref="S3.SS3.p3.2.m1.1.1.2.2.3">0</cn></apply><ci id="S3.SS3.p3.2.m1.1.1.2.3.cmml" xref="S3.SS3.p3.2.m1.1.1.2.3">𝑑</ci></apply><apply id="S3.SS3.p3.2.m1.1.1.3.cmml" xref="S3.SS3.p3.2.m1.1.1.3"><csymbol cd="ambiguous" id="S3.SS3.p3.2.m1.1.1.3.1.cmml" xref="S3.SS3.p3.2.m1.1.1.3">superscript</csymbol><apply id="S3.SS3.p3.2.m1.1.1.3.2.cmml" xref="S3.SS3.p3.2.m1.1.1.3"><csymbol cd="ambiguous" id="S3.SS3.p3.2.m1.1.1.3.2.1.cmml" xref="S3.SS3.p3.2.m1.1.1.3">subscript</csymbol><ci id="S3.SS3.p3.2.m1.1.1.3.2.2.cmml" xref="S3.SS3.p3.2.m1.1.1.3.2.2">ℎ</ci><ci id="S3.SS3.p3.2.m1.1.1.3.2.3.cmml" xref="S3.SS3.p3.2.m1.1.1.3.2.3">𝑚</ci></apply><ci id="S3.SS3.p3.2.m1.1.1.3.3.cmml" xref="S3.SS3.p3.2.m1.1.1.3.3">𝑝</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p3.2.m1.1c">h_{0}^{d}=h_{m}^{p}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p3.2.m1.1d">italic_h start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT = italic_h start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT</annotation></semantics></math>. 
Every <math alttext="K" class="ltx_Math" display="inline" id="S3.SS3.p3.3.m2.1"><semantics id="S3.SS3.p3.3.m2.1a"><mi id="S3.SS3.p3.3.m2.1.1" xref="S3.SS3.p3.3.m2.1.1.cmml">K</mi><annotation-xml encoding="MathML-Content" id="S3.SS3.p3.3.m2.1b"><ci id="S3.SS3.p3.3.m2.1.1.cmml" xref="S3.SS3.p3.3.m2.1.1">𝐾</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p3.3.m2.1c">K</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p3.3.m2.1d">italic_K</annotation></semantics></math> decoding steps, the decoded content is re-encoded and its representation is recalculated. Then, the current context vector <math alttext="c_{i}" class="ltx_Math" display="inline" id="S3.SS3.p3.4.m3.1"><semantics id="S3.SS3.p3.4.m3.1a"><msub id="S3.SS3.p3.4.m3.1.1" xref="S3.SS3.p3.4.m3.1.1.cmml"><mi id="S3.SS3.p3.4.m3.1.1.2" xref="S3.SS3.p3.4.m3.1.1.2.cmml">c</mi><mi id="S3.SS3.p3.4.m3.1.1.3" xref="S3.SS3.p3.4.m3.1.1.3.cmml">i</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS3.p3.4.m3.1b"><apply id="S3.SS3.p3.4.m3.1.1.cmml" xref="S3.SS3.p3.4.m3.1.1"><csymbol cd="ambiguous" id="S3.SS3.p3.4.m3.1.1.1.cmml" xref="S3.SS3.p3.4.m3.1.1">subscript</csymbol><ci id="S3.SS3.p3.4.m3.1.1.2.cmml" xref="S3.SS3.p3.4.m3.1.1.2">𝑐</ci><ci id="S3.SS3.p3.4.m3.1.1.3.cmml" xref="S3.SS3.p3.4.m3.1.1.3">𝑖</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p3.4.m3.1c">c_{i}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p3.4.m3.1d">italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT</annotation></semantics></math> obtained from the master encoder and the hidden state <math alttext="h_{i}^{d}" class="ltx_Math" display="inline" id="S3.SS3.p3.5.m4.1"><semantics id="S3.SS3.p3.5.m4.1a"><msubsup id="S3.SS3.p3.5.m4.1.1" xref="S3.SS3.p3.5.m4.1.1.cmml"><mi id="S3.SS3.p3.5.m4.1.1.2.2" xref="S3.SS3.p3.5.m4.1.1.2.2.cmml">h</mi><mi id="S3.SS3.p3.5.m4.1.1.2.3" xref="S3.SS3.p3.5.m4.1.1.2.3.cmml">i</mi><mi id="S3.SS3.p3.5.m4.1.1.3" 
xref="S3.SS3.p3.5.m4.1.1.3.cmml">d</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.SS3.p3.5.m4.1b"><apply id="S3.SS3.p3.5.m4.1.1.cmml" xref="S3.SS3.p3.5.m4.1.1"><csymbol cd="ambiguous" id="S3.SS3.p3.5.m4.1.1.1.cmml" xref="S3.SS3.p3.5.m4.1.1">superscript</csymbol><apply id="S3.SS3.p3.5.m4.1.1.2.cmml" xref="S3.SS3.p3.5.m4.1.1"><csymbol cd="ambiguous" id="S3.SS3.p3.5.m4.1.1.2.1.cmml" xref="S3.SS3.p3.5.m4.1.1">subscript</csymbol><ci id="S3.SS3.p3.5.m4.1.1.2.2.cmml" xref="S3.SS3.p3.5.m4.1.1.2.2">ℎ</ci><ci id="S3.SS3.p3.5.m4.1.1.2.3.cmml" xref="S3.SS3.p3.5.m4.1.1.2.3">𝑖</ci></apply><ci id="S3.SS3.p3.5.m4.1.1.3.cmml" xref="S3.SS3.p3.5.m4.1.1.3">𝑑</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p3.5.m4.1c">h_{i}^{d}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p3.5.m4.1d">italic_h start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT</annotation></semantics></math> of the decoder are concatenated and passed through a linear layer to produce a vocabulary distribution, as follows:</p> <table class="ltx_equation ltx_eqn_table" id="S3.E12"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="P_{v}=P(y_{i}\mid y_{1},\ldots,y_{i-1};x)=\text{softmax}(W_{v}[h_{i}^{d},c_{i}% ]+b_{v})" class="ltx_Math" display="block" id="S3.E12.m1.4"><semantics id="S3.E12.m1.4a"><mrow id="S3.E12.m1.4.4" xref="S3.E12.m1.4.4.cmml"><msub id="S3.E12.m1.4.4.4" xref="S3.E12.m1.4.4.4.cmml"><mi id="S3.E12.m1.4.4.4.2" xref="S3.E12.m1.4.4.4.2.cmml">P</mi><mi id="S3.E12.m1.4.4.4.3" xref="S3.E12.m1.4.4.4.3.cmml">v</mi></msub><mo id="S3.E12.m1.4.4.5" xref="S3.E12.m1.4.4.5.cmml">=</mo><mrow id="S3.E12.m1.3.3.1" xref="S3.E12.m1.3.3.1.cmml"><mi id="S3.E12.m1.3.3.1.3" xref="S3.E12.m1.3.3.1.3.cmml">P</mi><mo id="S3.E12.m1.3.3.1.2" xref="S3.E12.m1.3.3.1.2.cmml">⁢</mo><mrow 
id="S3.E12.m1.3.3.1.1.1" xref="S3.E12.m1.3.3.1.1.1.1.cmml"><mo id="S3.E12.m1.3.3.1.1.1.2" stretchy="false" xref="S3.E12.m1.3.3.1.1.1.1.cmml">(</mo><mrow id="S3.E12.m1.3.3.1.1.1.1" xref="S3.E12.m1.3.3.1.1.1.1.cmml"><msub id="S3.E12.m1.3.3.1.1.1.1.4" xref="S3.E12.m1.3.3.1.1.1.1.4.cmml"><mi id="S3.E12.m1.3.3.1.1.1.1.4.2" xref="S3.E12.m1.3.3.1.1.1.1.4.2.cmml">y</mi><mi id="S3.E12.m1.3.3.1.1.1.1.4.3" xref="S3.E12.m1.3.3.1.1.1.1.4.3.cmml">i</mi></msub><mo id="S3.E12.m1.3.3.1.1.1.1.3" xref="S3.E12.m1.3.3.1.1.1.1.3.cmml">∣</mo><mrow id="S3.E12.m1.3.3.1.1.1.1.2.2" xref="S3.E12.m1.3.3.1.1.1.1.2.3.cmml"><msub id="S3.E12.m1.3.3.1.1.1.1.1.1.1" xref="S3.E12.m1.3.3.1.1.1.1.1.1.1.cmml"><mi id="S3.E12.m1.3.3.1.1.1.1.1.1.1.2" xref="S3.E12.m1.3.3.1.1.1.1.1.1.1.2.cmml">y</mi><mn id="S3.E12.m1.3.3.1.1.1.1.1.1.1.3" xref="S3.E12.m1.3.3.1.1.1.1.1.1.1.3.cmml">1</mn></msub><mo id="S3.E12.m1.3.3.1.1.1.1.2.2.3" xref="S3.E12.m1.3.3.1.1.1.1.2.3.cmml">,</mo><mi id="S3.E12.m1.1.1" mathvariant="normal" xref="S3.E12.m1.1.1.cmml">…</mi><mo id="S3.E12.m1.3.3.1.1.1.1.2.2.4" xref="S3.E12.m1.3.3.1.1.1.1.2.3.cmml">,</mo><msub id="S3.E12.m1.3.3.1.1.1.1.2.2.2" xref="S3.E12.m1.3.3.1.1.1.1.2.2.2.cmml"><mi id="S3.E12.m1.3.3.1.1.1.1.2.2.2.2" xref="S3.E12.m1.3.3.1.1.1.1.2.2.2.2.cmml">y</mi><mrow id="S3.E12.m1.3.3.1.1.1.1.2.2.2.3" xref="S3.E12.m1.3.3.1.1.1.1.2.2.2.3.cmml"><mi id="S3.E12.m1.3.3.1.1.1.1.2.2.2.3.2" xref="S3.E12.m1.3.3.1.1.1.1.2.2.2.3.2.cmml">i</mi><mo id="S3.E12.m1.3.3.1.1.1.1.2.2.2.3.1" xref="S3.E12.m1.3.3.1.1.1.1.2.2.2.3.1.cmml">−</mo><mn id="S3.E12.m1.3.3.1.1.1.1.2.2.2.3.3" xref="S3.E12.m1.3.3.1.1.1.1.2.2.2.3.3.cmml">1</mn></mrow></msub><mo id="S3.E12.m1.3.3.1.1.1.1.2.2.5" xref="S3.E12.m1.3.3.1.1.1.1.2.3.cmml">;</mo><mi id="S3.E12.m1.2.2" xref="S3.E12.m1.2.2.cmml">x</mi></mrow></mrow><mo id="S3.E12.m1.3.3.1.1.1.3" stretchy="false" xref="S3.E12.m1.3.3.1.1.1.1.cmml">)</mo></mrow></mrow><mo id="S3.E12.m1.4.4.6" xref="S3.E12.m1.4.4.6.cmml">=</mo><mrow id="S3.E12.m1.4.4.2" 
xref="S3.E12.m1.4.4.2.cmml"><mtext id="S3.E12.m1.4.4.2.3" xref="S3.E12.m1.4.4.2.3a.cmml">softmax</mtext><mo id="S3.E12.m1.4.4.2.2" xref="S3.E12.m1.4.4.2.2.cmml">⁢</mo><mrow id="S3.E12.m1.4.4.2.1.1" xref="S3.E12.m1.4.4.2.1.1.1.cmml"><mo id="S3.E12.m1.4.4.2.1.1.2" stretchy="false" xref="S3.E12.m1.4.4.2.1.1.1.cmml">(</mo><mrow id="S3.E12.m1.4.4.2.1.1.1" xref="S3.E12.m1.4.4.2.1.1.1.cmml"><mrow id="S3.E12.m1.4.4.2.1.1.1.2" xref="S3.E12.m1.4.4.2.1.1.1.2.cmml"><msub id="S3.E12.m1.4.4.2.1.1.1.2.4" xref="S3.E12.m1.4.4.2.1.1.1.2.4.cmml"><mi id="S3.E12.m1.4.4.2.1.1.1.2.4.2" xref="S3.E12.m1.4.4.2.1.1.1.2.4.2.cmml">W</mi><mi id="S3.E12.m1.4.4.2.1.1.1.2.4.3" xref="S3.E12.m1.4.4.2.1.1.1.2.4.3.cmml">v</mi></msub><mo id="S3.E12.m1.4.4.2.1.1.1.2.3" xref="S3.E12.m1.4.4.2.1.1.1.2.3.cmml">⁢</mo><mrow id="S3.E12.m1.4.4.2.1.1.1.2.2.2" xref="S3.E12.m1.4.4.2.1.1.1.2.2.3.cmml"><mo id="S3.E12.m1.4.4.2.1.1.1.2.2.2.3" stretchy="false" xref="S3.E12.m1.4.4.2.1.1.1.2.2.3.cmml">[</mo><msubsup id="S3.E12.m1.4.4.2.1.1.1.1.1.1.1" xref="S3.E12.m1.4.4.2.1.1.1.1.1.1.1.cmml"><mi id="S3.E12.m1.4.4.2.1.1.1.1.1.1.1.2.2" xref="S3.E12.m1.4.4.2.1.1.1.1.1.1.1.2.2.cmml">h</mi><mi id="S3.E12.m1.4.4.2.1.1.1.1.1.1.1.2.3" xref="S3.E12.m1.4.4.2.1.1.1.1.1.1.1.2.3.cmml">i</mi><mi id="S3.E12.m1.4.4.2.1.1.1.1.1.1.1.3" xref="S3.E12.m1.4.4.2.1.1.1.1.1.1.1.3.cmml">d</mi></msubsup><mo id="S3.E12.m1.4.4.2.1.1.1.2.2.2.4" xref="S3.E12.m1.4.4.2.1.1.1.2.2.3.cmml">,</mo><msub id="S3.E12.m1.4.4.2.1.1.1.2.2.2.2" xref="S3.E12.m1.4.4.2.1.1.1.2.2.2.2.cmml"><mi id="S3.E12.m1.4.4.2.1.1.1.2.2.2.2.2" xref="S3.E12.m1.4.4.2.1.1.1.2.2.2.2.2.cmml">c</mi><mi id="S3.E12.m1.4.4.2.1.1.1.2.2.2.2.3" xref="S3.E12.m1.4.4.2.1.1.1.2.2.2.2.3.cmml">i</mi></msub><mo id="S3.E12.m1.4.4.2.1.1.1.2.2.2.5" stretchy="false" xref="S3.E12.m1.4.4.2.1.1.1.2.2.3.cmml">]</mo></mrow></mrow><mo id="S3.E12.m1.4.4.2.1.1.1.3" xref="S3.E12.m1.4.4.2.1.1.1.3.cmml">+</mo><msub id="S3.E12.m1.4.4.2.1.1.1.4" xref="S3.E12.m1.4.4.2.1.1.1.4.cmml"><mi id="S3.E12.m1.4.4.2.1.1.1.4.2" 
xref="S3.E12.m1.4.4.2.1.1.1.4.2.cmml">b</mi><mi id="S3.E12.m1.4.4.2.1.1.1.4.3" xref="S3.E12.m1.4.4.2.1.1.1.4.3.cmml">v</mi></msub></mrow><mo id="S3.E12.m1.4.4.2.1.1.3" stretchy="false" xref="S3.E12.m1.4.4.2.1.1.1.cmml">)</mo></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.E12.m1.4b"><apply id="S3.E12.m1.4.4.cmml" xref="S3.E12.m1.4.4"><and id="S3.E12.m1.4.4a.cmml" xref="S3.E12.m1.4.4"></and><apply id="S3.E12.m1.4.4b.cmml" xref="S3.E12.m1.4.4"><eq id="S3.E12.m1.4.4.5.cmml" xref="S3.E12.m1.4.4.5"></eq><apply id="S3.E12.m1.4.4.4.cmml" xref="S3.E12.m1.4.4.4"><csymbol cd="ambiguous" id="S3.E12.m1.4.4.4.1.cmml" xref="S3.E12.m1.4.4.4">subscript</csymbol><ci id="S3.E12.m1.4.4.4.2.cmml" xref="S3.E12.m1.4.4.4.2">𝑃</ci><ci id="S3.E12.m1.4.4.4.3.cmml" xref="S3.E12.m1.4.4.4.3">𝑣</ci></apply><apply id="S3.E12.m1.3.3.1.cmml" xref="S3.E12.m1.3.3.1"><times id="S3.E12.m1.3.3.1.2.cmml" xref="S3.E12.m1.3.3.1.2"></times><ci id="S3.E12.m1.3.3.1.3.cmml" xref="S3.E12.m1.3.3.1.3">𝑃</ci><apply id="S3.E12.m1.3.3.1.1.1.1.cmml" xref="S3.E12.m1.3.3.1.1.1"><csymbol cd="latexml" id="S3.E12.m1.3.3.1.1.1.1.3.cmml" xref="S3.E12.m1.3.3.1.1.1.1.3">conditional</csymbol><apply id="S3.E12.m1.3.3.1.1.1.1.4.cmml" xref="S3.E12.m1.3.3.1.1.1.1.4"><csymbol cd="ambiguous" id="S3.E12.m1.3.3.1.1.1.1.4.1.cmml" xref="S3.E12.m1.3.3.1.1.1.1.4">subscript</csymbol><ci id="S3.E12.m1.3.3.1.1.1.1.4.2.cmml" xref="S3.E12.m1.3.3.1.1.1.1.4.2">𝑦</ci><ci id="S3.E12.m1.3.3.1.1.1.1.4.3.cmml" xref="S3.E12.m1.3.3.1.1.1.1.4.3">𝑖</ci></apply><list id="S3.E12.m1.3.3.1.1.1.1.2.3.cmml" xref="S3.E12.m1.3.3.1.1.1.1.2.2"><apply id="S3.E12.m1.3.3.1.1.1.1.1.1.1.cmml" xref="S3.E12.m1.3.3.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S3.E12.m1.3.3.1.1.1.1.1.1.1.1.cmml" xref="S3.E12.m1.3.3.1.1.1.1.1.1.1">subscript</csymbol><ci id="S3.E12.m1.3.3.1.1.1.1.1.1.1.2.cmml" xref="S3.E12.m1.3.3.1.1.1.1.1.1.1.2">𝑦</ci><cn id="S3.E12.m1.3.3.1.1.1.1.1.1.1.3.cmml" type="integer" xref="S3.E12.m1.3.3.1.1.1.1.1.1.1.3">1</cn></apply><ci 
id="S3.E12.m1.1.1.cmml" xref="S3.E12.m1.1.1">…</ci><apply id="S3.E12.m1.3.3.1.1.1.1.2.2.2.cmml" xref="S3.E12.m1.3.3.1.1.1.1.2.2.2"><csymbol cd="ambiguous" id="S3.E12.m1.3.3.1.1.1.1.2.2.2.1.cmml" xref="S3.E12.m1.3.3.1.1.1.1.2.2.2">subscript</csymbol><ci id="S3.E12.m1.3.3.1.1.1.1.2.2.2.2.cmml" xref="S3.E12.m1.3.3.1.1.1.1.2.2.2.2">𝑦</ci><apply id="S3.E12.m1.3.3.1.1.1.1.2.2.2.3.cmml" xref="S3.E12.m1.3.3.1.1.1.1.2.2.2.3"><minus id="S3.E12.m1.3.3.1.1.1.1.2.2.2.3.1.cmml" xref="S3.E12.m1.3.3.1.1.1.1.2.2.2.3.1"></minus><ci id="S3.E12.m1.3.3.1.1.1.1.2.2.2.3.2.cmml" xref="S3.E12.m1.3.3.1.1.1.1.2.2.2.3.2">𝑖</ci><cn id="S3.E12.m1.3.3.1.1.1.1.2.2.2.3.3.cmml" type="integer" xref="S3.E12.m1.3.3.1.1.1.1.2.2.2.3.3">1</cn></apply></apply><ci id="S3.E12.m1.2.2.cmml" xref="S3.E12.m1.2.2">𝑥</ci></list></apply></apply></apply><apply id="S3.E12.m1.4.4c.cmml" xref="S3.E12.m1.4.4"><eq id="S3.E12.m1.4.4.6.cmml" xref="S3.E12.m1.4.4.6"></eq><share href="https://arxiv.org/html/2411.14072v1#S3.E12.m1.3.3.1.cmml" id="S3.E12.m1.4.4d.cmml" xref="S3.E12.m1.4.4"></share><apply id="S3.E12.m1.4.4.2.cmml" xref="S3.E12.m1.4.4.2"><times id="S3.E12.m1.4.4.2.2.cmml" xref="S3.E12.m1.4.4.2.2"></times><ci id="S3.E12.m1.4.4.2.3a.cmml" xref="S3.E12.m1.4.4.2.3"><mtext id="S3.E12.m1.4.4.2.3.cmml" xref="S3.E12.m1.4.4.2.3">softmax</mtext></ci><apply id="S3.E12.m1.4.4.2.1.1.1.cmml" xref="S3.E12.m1.4.4.2.1.1"><plus id="S3.E12.m1.4.4.2.1.1.1.3.cmml" xref="S3.E12.m1.4.4.2.1.1.1.3"></plus><apply id="S3.E12.m1.4.4.2.1.1.1.2.cmml" xref="S3.E12.m1.4.4.2.1.1.1.2"><times id="S3.E12.m1.4.4.2.1.1.1.2.3.cmml" xref="S3.E12.m1.4.4.2.1.1.1.2.3"></times><apply id="S3.E12.m1.4.4.2.1.1.1.2.4.cmml" xref="S3.E12.m1.4.4.2.1.1.1.2.4"><csymbol cd="ambiguous" id="S3.E12.m1.4.4.2.1.1.1.2.4.1.cmml" xref="S3.E12.m1.4.4.2.1.1.1.2.4">subscript</csymbol><ci id="S3.E12.m1.4.4.2.1.1.1.2.4.2.cmml" xref="S3.E12.m1.4.4.2.1.1.1.2.4.2">𝑊</ci><ci id="S3.E12.m1.4.4.2.1.1.1.2.4.3.cmml" xref="S3.E12.m1.4.4.2.1.1.1.2.4.3">𝑣</ci></apply><interval 
closure="closed" id="S3.E12.m1.4.4.2.1.1.1.2.2.3.cmml" xref="S3.E12.m1.4.4.2.1.1.1.2.2.2"><apply id="S3.E12.m1.4.4.2.1.1.1.1.1.1.1.cmml" xref="S3.E12.m1.4.4.2.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S3.E12.m1.4.4.2.1.1.1.1.1.1.1.1.cmml" xref="S3.E12.m1.4.4.2.1.1.1.1.1.1.1">superscript</csymbol><apply id="S3.E12.m1.4.4.2.1.1.1.1.1.1.1.2.cmml" xref="S3.E12.m1.4.4.2.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S3.E12.m1.4.4.2.1.1.1.1.1.1.1.2.1.cmml" xref="S3.E12.m1.4.4.2.1.1.1.1.1.1.1">subscript</csymbol><ci id="S3.E12.m1.4.4.2.1.1.1.1.1.1.1.2.2.cmml" xref="S3.E12.m1.4.4.2.1.1.1.1.1.1.1.2.2">ℎ</ci><ci id="S3.E12.m1.4.4.2.1.1.1.1.1.1.1.2.3.cmml" xref="S3.E12.m1.4.4.2.1.1.1.1.1.1.1.2.3">𝑖</ci></apply><ci id="S3.E12.m1.4.4.2.1.1.1.1.1.1.1.3.cmml" xref="S3.E12.m1.4.4.2.1.1.1.1.1.1.1.3">𝑑</ci></apply><apply id="S3.E12.m1.4.4.2.1.1.1.2.2.2.2.cmml" xref="S3.E12.m1.4.4.2.1.1.1.2.2.2.2"><csymbol cd="ambiguous" id="S3.E12.m1.4.4.2.1.1.1.2.2.2.2.1.cmml" xref="S3.E12.m1.4.4.2.1.1.1.2.2.2.2">subscript</csymbol><ci id="S3.E12.m1.4.4.2.1.1.1.2.2.2.2.2.cmml" xref="S3.E12.m1.4.4.2.1.1.1.2.2.2.2.2">𝑐</ci><ci id="S3.E12.m1.4.4.2.1.1.1.2.2.2.2.3.cmml" xref="S3.E12.m1.4.4.2.1.1.1.2.2.2.2.3">𝑖</ci></apply></interval></apply><apply id="S3.E12.m1.4.4.2.1.1.1.4.cmml" xref="S3.E12.m1.4.4.2.1.1.1.4"><csymbol cd="ambiguous" id="S3.E12.m1.4.4.2.1.1.1.4.1.cmml" xref="S3.E12.m1.4.4.2.1.1.1.4">subscript</csymbol><ci id="S3.E12.m1.4.4.2.1.1.1.4.2.cmml" xref="S3.E12.m1.4.4.2.1.1.1.4.2">𝑏</ci><ci id="S3.E12.m1.4.4.2.1.1.1.4.3.cmml" xref="S3.E12.m1.4.4.2.1.1.1.4.3">𝑣</ci></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.E12.m1.4c">P_{v}=P(y_{i}\mid y_{1},\ldots,y_{i-1};x)=\text{softmax}(W_{v}[h_{i}^{d},c_{i}% ]+b_{v})</annotation><annotation encoding="application/x-llamapun" id="S3.E12.m1.4d">italic_P start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT = italic_P ( italic_y start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∣ italic_y start_POSTSUBSCRIPT 1 
end_POSTSUBSCRIPT , … , italic_y start_POSTSUBSCRIPT italic_i - 1 end_POSTSUBSCRIPT ; italic_x ) = softmax ( italic_W start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT [ italic_h start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT , italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ] + italic_b start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT )</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(12)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S3.SS3.p3.10">where <math alttext="P(y_{i}\mid y_{1},\ldots,y_{i-1};x)" class="ltx_Math" display="inline" id="S3.SS3.p3.6.m1.3"><semantics id="S3.SS3.p3.6.m1.3a"><mrow id="S3.SS3.p3.6.m1.3.3" xref="S3.SS3.p3.6.m1.3.3.cmml"><mi id="S3.SS3.p3.6.m1.3.3.3" xref="S3.SS3.p3.6.m1.3.3.3.cmml">P</mi><mo id="S3.SS3.p3.6.m1.3.3.2" xref="S3.SS3.p3.6.m1.3.3.2.cmml">⁢</mo><mrow id="S3.SS3.p3.6.m1.3.3.1.1" xref="S3.SS3.p3.6.m1.3.3.1.1.1.cmml"><mo id="S3.SS3.p3.6.m1.3.3.1.1.2" stretchy="false" xref="S3.SS3.p3.6.m1.3.3.1.1.1.cmml">(</mo><mrow id="S3.SS3.p3.6.m1.3.3.1.1.1" xref="S3.SS3.p3.6.m1.3.3.1.1.1.cmml"><msub id="S3.SS3.p3.6.m1.3.3.1.1.1.4" xref="S3.SS3.p3.6.m1.3.3.1.1.1.4.cmml"><mi id="S3.SS3.p3.6.m1.3.3.1.1.1.4.2" xref="S3.SS3.p3.6.m1.3.3.1.1.1.4.2.cmml">y</mi><mi id="S3.SS3.p3.6.m1.3.3.1.1.1.4.3" xref="S3.SS3.p3.6.m1.3.3.1.1.1.4.3.cmml">i</mi></msub><mo id="S3.SS3.p3.6.m1.3.3.1.1.1.3" xref="S3.SS3.p3.6.m1.3.3.1.1.1.3.cmml">∣</mo><mrow id="S3.SS3.p3.6.m1.3.3.1.1.1.2.2" xref="S3.SS3.p3.6.m1.3.3.1.1.1.2.3.cmml"><msub id="S3.SS3.p3.6.m1.3.3.1.1.1.1.1.1" xref="S3.SS3.p3.6.m1.3.3.1.1.1.1.1.1.cmml"><mi id="S3.SS3.p3.6.m1.3.3.1.1.1.1.1.1.2" xref="S3.SS3.p3.6.m1.3.3.1.1.1.1.1.1.2.cmml">y</mi><mn id="S3.SS3.p3.6.m1.3.3.1.1.1.1.1.1.3" xref="S3.SS3.p3.6.m1.3.3.1.1.1.1.1.1.3.cmml">1</mn></msub><mo 
id="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.3" xref="S3.SS3.p3.6.m1.3.3.1.1.1.2.3.cmml">,</mo><mi id="S3.SS3.p3.6.m1.1.1" mathvariant="normal" xref="S3.SS3.p3.6.m1.1.1.cmml">…</mi><mo id="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.4" xref="S3.SS3.p3.6.m1.3.3.1.1.1.2.3.cmml">,</mo><msub id="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2" xref="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.cmml"><mi id="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.2" xref="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.2.cmml">y</mi><mrow id="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.3" xref="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.3.cmml"><mi id="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.3.2" xref="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.3.2.cmml">i</mi><mo id="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.3.1" xref="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.3.1.cmml">−</mo><mn id="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.3.3" xref="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.3.3.cmml">1</mn></mrow></msub><mo id="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.5" xref="S3.SS3.p3.6.m1.3.3.1.1.1.2.3.cmml">;</mo><mi id="S3.SS3.p3.6.m1.2.2" xref="S3.SS3.p3.6.m1.2.2.cmml">x</mi></mrow></mrow><mo id="S3.SS3.p3.6.m1.3.3.1.1.3" stretchy="false" xref="S3.SS3.p3.6.m1.3.3.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS3.p3.6.m1.3b"><apply id="S3.SS3.p3.6.m1.3.3.cmml" xref="S3.SS3.p3.6.m1.3.3"><times id="S3.SS3.p3.6.m1.3.3.2.cmml" xref="S3.SS3.p3.6.m1.3.3.2"></times><ci id="S3.SS3.p3.6.m1.3.3.3.cmml" xref="S3.SS3.p3.6.m1.3.3.3">𝑃</ci><apply id="S3.SS3.p3.6.m1.3.3.1.1.1.cmml" xref="S3.SS3.p3.6.m1.3.3.1.1"><csymbol cd="latexml" id="S3.SS3.p3.6.m1.3.3.1.1.1.3.cmml" xref="S3.SS3.p3.6.m1.3.3.1.1.1.3">conditional</csymbol><apply id="S3.SS3.p3.6.m1.3.3.1.1.1.4.cmml" xref="S3.SS3.p3.6.m1.3.3.1.1.1.4"><csymbol cd="ambiguous" id="S3.SS3.p3.6.m1.3.3.1.1.1.4.1.cmml" xref="S3.SS3.p3.6.m1.3.3.1.1.1.4">subscript</csymbol><ci id="S3.SS3.p3.6.m1.3.3.1.1.1.4.2.cmml" xref="S3.SS3.p3.6.m1.3.3.1.1.1.4.2">𝑦</ci><ci id="S3.SS3.p3.6.m1.3.3.1.1.1.4.3.cmml" xref="S3.SS3.p3.6.m1.3.3.1.1.1.4.3">𝑖</ci></apply><list id="S3.SS3.p3.6.m1.3.3.1.1.1.2.3.cmml" 
xref="S3.SS3.p3.6.m1.3.3.1.1.1.2.2"><apply id="S3.SS3.p3.6.m1.3.3.1.1.1.1.1.1.cmml" xref="S3.SS3.p3.6.m1.3.3.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S3.SS3.p3.6.m1.3.3.1.1.1.1.1.1.1.cmml" xref="S3.SS3.p3.6.m1.3.3.1.1.1.1.1.1">subscript</csymbol><ci id="S3.SS3.p3.6.m1.3.3.1.1.1.1.1.1.2.cmml" xref="S3.SS3.p3.6.m1.3.3.1.1.1.1.1.1.2">𝑦</ci><cn id="S3.SS3.p3.6.m1.3.3.1.1.1.1.1.1.3.cmml" type="integer" xref="S3.SS3.p3.6.m1.3.3.1.1.1.1.1.1.3">1</cn></apply><ci id="S3.SS3.p3.6.m1.1.1.cmml" xref="S3.SS3.p3.6.m1.1.1">…</ci><apply id="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.cmml" xref="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2"><csymbol cd="ambiguous" id="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.1.cmml" xref="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2">subscript</csymbol><ci id="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.2.cmml" xref="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.2">𝑦</ci><apply id="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.3.cmml" xref="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.3"><minus id="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.3.1.cmml" xref="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.3.1"></minus><ci id="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.3.2.cmml" xref="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.3.2">𝑖</ci><cn id="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.3.3.cmml" type="integer" xref="S3.SS3.p3.6.m1.3.3.1.1.1.2.2.2.3.3">1</cn></apply></apply><ci id="S3.SS3.p3.6.m1.2.2.cmml" xref="S3.SS3.p3.6.m1.2.2">𝑥</ci></list></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p3.6.m1.3c">P(y_{i}\mid y_{1},\ldots,y_{i-1};x)</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p3.6.m1.3d">italic_P ( italic_y start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∣ italic_y start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_y start_POSTSUBSCRIPT italic_i - 1 end_POSTSUBSCRIPT ; italic_x )</annotation></semantics></math> is the conditional probability distribution of the target word <math alttext="y_{i}" class="ltx_Math" display="inline" id="S3.SS3.p3.7.m2.1"><semantics id="S3.SS3.p3.7.m2.1a"><msub id="S3.SS3.p3.7.m2.1.1" xref="S3.SS3.p3.7.m2.1.1.cmml"><mi 
id="S3.SS3.p3.7.m2.1.1.2" xref="S3.SS3.p3.7.m2.1.1.2.cmml">y</mi><mi id="S3.SS3.p3.7.m2.1.1.3" xref="S3.SS3.p3.7.m2.1.1.3.cmml">i</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS3.p3.7.m2.1b"><apply id="S3.SS3.p3.7.m2.1.1.cmml" xref="S3.SS3.p3.7.m2.1.1"><csymbol cd="ambiguous" id="S3.SS3.p3.7.m2.1.1.1.cmml" xref="S3.SS3.p3.7.m2.1.1">subscript</csymbol><ci id="S3.SS3.p3.7.m2.1.1.2.cmml" xref="S3.SS3.p3.7.m2.1.1.2">𝑦</ci><ci id="S3.SS3.p3.7.m2.1.1.3.cmml" xref="S3.SS3.p3.7.m2.1.1.3">𝑖</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p3.7.m2.1c">y_{i}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p3.7.m2.1d">italic_y start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT</annotation></semantics></math> at time step <math alttext="i" class="ltx_Math" display="inline" id="S3.SS3.p3.8.m3.1"><semantics id="S3.SS3.p3.8.m3.1a"><mi id="S3.SS3.p3.8.m3.1.1" xref="S3.SS3.p3.8.m3.1.1.cmml">i</mi><annotation-xml encoding="MathML-Content" id="S3.SS3.p3.8.m3.1b"><ci id="S3.SS3.p3.8.m3.1.1.cmml" xref="S3.SS3.p3.8.m3.1.1">𝑖</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p3.8.m3.1c">i</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p3.8.m3.1d">italic_i</annotation></semantics></math>, with <math alttext="W_{v}" class="ltx_Math" display="inline" id="S3.SS3.p3.9.m4.1"><semantics id="S3.SS3.p3.9.m4.1a"><msub id="S3.SS3.p3.9.m4.1.1" xref="S3.SS3.p3.9.m4.1.1.cmml"><mi id="S3.SS3.p3.9.m4.1.1.2" xref="S3.SS3.p3.9.m4.1.1.2.cmml">W</mi><mi id="S3.SS3.p3.9.m4.1.1.3" xref="S3.SS3.p3.9.m4.1.1.3.cmml">v</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS3.p3.9.m4.1b"><apply id="S3.SS3.p3.9.m4.1.1.cmml" xref="S3.SS3.p3.9.m4.1.1"><csymbol cd="ambiguous" id="S3.SS3.p3.9.m4.1.1.1.cmml" xref="S3.SS3.p3.9.m4.1.1">subscript</csymbol><ci id="S3.SS3.p3.9.m4.1.1.2.cmml" xref="S3.SS3.p3.9.m4.1.1.2">𝑊</ci><ci id="S3.SS3.p3.9.m4.1.1.3.cmml" 
xref="S3.SS3.p3.9.m4.1.1.3">𝑣</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p3.9.m4.1c">W_{v}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p3.9.m4.1d">italic_W start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT</annotation></semantics></math> and <math alttext="b_{v}" class="ltx_Math" display="inline" id="S3.SS3.p3.10.m5.1"><semantics id="S3.SS3.p3.10.m5.1a"><msub id="S3.SS3.p3.10.m5.1.1" xref="S3.SS3.p3.10.m5.1.1.cmml"><mi id="S3.SS3.p3.10.m5.1.1.2" xref="S3.SS3.p3.10.m5.1.1.2.cmml">b</mi><mi id="S3.SS3.p3.10.m5.1.1.3" xref="S3.SS3.p3.10.m5.1.1.3.cmml">v</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS3.p3.10.m5.1b"><apply id="S3.SS3.p3.10.m5.1.1.cmml" xref="S3.SS3.p3.10.m5.1.1"><csymbol cd="ambiguous" id="S3.SS3.p3.10.m5.1.1.1.cmml" xref="S3.SS3.p3.10.m5.1.1">subscript</csymbol><ci id="S3.SS3.p3.10.m5.1.1.2.cmml" xref="S3.SS3.p3.10.m5.1.1.2">𝑏</ci><ci id="S3.SS3.p3.10.m5.1.1.3.cmml" xref="S3.SS3.p3.10.m5.1.1.3">𝑣</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p3.10.m5.1c">b_{v}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p3.10.m5.1d">italic_b start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT</annotation></semantics></math> as learnable parameters.</p> </div> </section> <section class="ltx_subsection" id="S3.SS4"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">3.4 </span>Pointer Network</h3> <div class="ltx_para" id="S3.SS4.p1"> <p class="ltx_p" id="S3.SS4.p1.5">This paper primarily uses a pointer network <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib99" title="">99</a>]</cite> to address the out-of-vocabulary (OOV) problem. 
A soft switch <math alttext="P_{p}" class="ltx_Math" display="inline" id="S3.SS4.p1.1.m1.1"><semantics id="S3.SS4.p1.1.m1.1a"><msub id="S3.SS4.p1.1.m1.1.1" xref="S3.SS4.p1.1.m1.1.1.cmml"><mi id="S3.SS4.p1.1.m1.1.1.2" xref="S3.SS4.p1.1.m1.1.1.2.cmml">P</mi><mi id="S3.SS4.p1.1.m1.1.1.3" xref="S3.SS4.p1.1.m1.1.1.3.cmml">p</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS4.p1.1.m1.1b"><apply id="S3.SS4.p1.1.m1.1.1.cmml" xref="S3.SS4.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S3.SS4.p1.1.m1.1.1.1.cmml" xref="S3.SS4.p1.1.m1.1.1">subscript</csymbol><ci id="S3.SS4.p1.1.m1.1.1.2.cmml" xref="S3.SS4.p1.1.m1.1.1.2">𝑃</ci><ci id="S3.SS4.p1.1.m1.1.1.3.cmml" xref="S3.SS4.p1.1.m1.1.1.3">𝑝</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS4.p1.1.m1.1c">P_{p}</annotation><annotation encoding="application/x-llamapun" id="S3.SS4.p1.1.m1.1d">italic_P start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT</annotation></semantics></math> is used to choose between generating a word from a fixed vocabulary by sampling from <math alttext="P_{v}" class="ltx_Math" display="inline" id="S3.SS4.p1.2.m2.1"><semantics id="S3.SS4.p1.2.m2.1a"><msub id="S3.SS4.p1.2.m2.1.1" xref="S3.SS4.p1.2.m2.1.1.cmml"><mi id="S3.SS4.p1.2.m2.1.1.2" xref="S3.SS4.p1.2.m2.1.1.2.cmml">P</mi><mi id="S3.SS4.p1.2.m2.1.1.3" xref="S3.SS4.p1.2.m2.1.1.3.cmml">v</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS4.p1.2.m2.1b"><apply id="S3.SS4.p1.2.m2.1.1.cmml" xref="S3.SS4.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S3.SS4.p1.2.m2.1.1.1.cmml" xref="S3.SS4.p1.2.m2.1.1">subscript</csymbol><ci id="S3.SS4.p1.2.m2.1.1.2.cmml" xref="S3.SS4.p1.2.m2.1.1.2">𝑃</ci><ci id="S3.SS4.p1.2.m2.1.1.3.cmml" xref="S3.SS4.p1.2.m2.1.1.3">𝑣</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS4.p1.2.m2.1c">P_{v}</annotation><annotation encoding="application/x-llamapun" id="S3.SS4.p1.2.m2.1d">italic_P start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT</annotation></semantics></math> and 
copying a word from the input sequence by sampling from the attention distribution <math alttext="a_{i}" class="ltx_Math" display="inline" id="S3.SS4.p1.3.m3.1"><semantics id="S3.SS4.p1.3.m3.1a"><msub id="S3.SS4.p1.3.m3.1.1" xref="S3.SS4.p1.3.m3.1.1.cmml"><mi id="S3.SS4.p1.3.m3.1.1.2" xref="S3.SS4.p1.3.m3.1.1.2.cmml">a</mi><mi id="S3.SS4.p1.3.m3.1.1.3" xref="S3.SS4.p1.3.m3.1.1.3.cmml">i</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS4.p1.3.m3.1b"><apply id="S3.SS4.p1.3.m3.1.1.cmml" xref="S3.SS4.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S3.SS4.p1.3.m3.1.1.1.cmml" xref="S3.SS4.p1.3.m3.1.1">subscript</csymbol><ci id="S3.SS4.p1.3.m3.1.1.2.cmml" xref="S3.SS4.p1.3.m3.1.1.2">𝑎</ci><ci id="S3.SS4.p1.3.m3.1.1.3.cmml" xref="S3.SS4.p1.3.m3.1.1.3">𝑖</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS4.p1.3.m3.1c">a_{i}</annotation><annotation encoding="application/x-llamapun" id="S3.SS4.p1.3.m3.1d">italic_a start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT</annotation></semantics></math>. 
<math alttext="P_{p}" class="ltx_Math" display="inline" id="S3.SS4.p1.4.m4.1"><semantics id="S3.SS4.p1.4.m4.1a"><msub id="S3.SS4.p1.4.m4.1.1" xref="S3.SS4.p1.4.m4.1.1.cmml"><mi id="S3.SS4.p1.4.m4.1.1.2" xref="S3.SS4.p1.4.m4.1.1.2.cmml">P</mi><mi id="S3.SS4.p1.4.m4.1.1.3" xref="S3.SS4.p1.4.m4.1.1.3.cmml">p</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS4.p1.4.m4.1b"><apply id="S3.SS4.p1.4.m4.1.1.cmml" xref="S3.SS4.p1.4.m4.1.1"><csymbol cd="ambiguous" id="S3.SS4.p1.4.m4.1.1.1.cmml" xref="S3.SS4.p1.4.m4.1.1">subscript</csymbol><ci id="S3.SS4.p1.4.m4.1.1.2.cmml" xref="S3.SS4.p1.4.m4.1.1.2">𝑃</ci><ci id="S3.SS4.p1.4.m4.1.1.3.cmml" xref="S3.SS4.p1.4.m4.1.1.3">𝑝</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS4.p1.4.m4.1c">P_{p}</annotation><annotation encoding="application/x-llamapun" id="S3.SS4.p1.4.m4.1d">italic_P start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT</annotation></semantics></math> is the generation probability at time step <math alttext="i" class="ltx_Math" display="inline" id="S3.SS4.p1.5.m5.1"><semantics id="S3.SS4.p1.5.m5.1a"><mi id="S3.SS4.p1.5.m5.1.1" xref="S3.SS4.p1.5.m5.1.1.cmml">i</mi><annotation-xml encoding="MathML-Content" id="S3.SS4.p1.5.m5.1b"><ci id="S3.SS4.p1.5.m5.1.1.cmml" xref="S3.SS4.p1.5.m5.1.1">𝑖</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS4.p1.5.m5.1c">i</annotation><annotation encoding="application/x-llamapun" id="S3.SS4.p1.5.m5.1d">italic_i</annotation></semantics></math>, and the main formulas are as follows:</p> <table class="ltx_equation ltx_eqn_table" id="S3.E13"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="P_{p}=\sigma(\omega_{c}^{T}c_{i}+\omega_{h}^{T}h_{i}^{d}+\omega_{y}^{T}y_{i}+% \omega_{d}^{T}C^{d}+b_{g})" class="ltx_Math" display="block" id="S3.E13.m1.1"><semantics id="S3.E13.m1.1a"><mrow id="S3.E13.m1.1.1" 
xref="S3.E13.m1.1.1.cmml"><msub id="S3.E13.m1.1.1.3" xref="S3.E13.m1.1.1.3.cmml"><mi id="S3.E13.m1.1.1.3.2" xref="S3.E13.m1.1.1.3.2.cmml">P</mi><mi id="S3.E13.m1.1.1.3.3" xref="S3.E13.m1.1.1.3.3.cmml">p</mi></msub><mo id="S3.E13.m1.1.1.2" xref="S3.E13.m1.1.1.2.cmml">=</mo><mrow id="S3.E13.m1.1.1.1" xref="S3.E13.m1.1.1.1.cmml"><mi id="S3.E13.m1.1.1.1.3" xref="S3.E13.m1.1.1.1.3.cmml">σ</mi><mo id="S3.E13.m1.1.1.1.2" xref="S3.E13.m1.1.1.1.2.cmml">⁢</mo><mrow id="S3.E13.m1.1.1.1.1.1" xref="S3.E13.m1.1.1.1.1.1.1.cmml"><mo id="S3.E13.m1.1.1.1.1.1.2" stretchy="false" xref="S3.E13.m1.1.1.1.1.1.1.cmml">(</mo><mrow id="S3.E13.m1.1.1.1.1.1.1" xref="S3.E13.m1.1.1.1.1.1.1.cmml"><mrow id="S3.E13.m1.1.1.1.1.1.1.2" xref="S3.E13.m1.1.1.1.1.1.1.2.cmml"><msubsup id="S3.E13.m1.1.1.1.1.1.1.2.2" xref="S3.E13.m1.1.1.1.1.1.1.2.2.cmml"><mi id="S3.E13.m1.1.1.1.1.1.1.2.2.2.2" xref="S3.E13.m1.1.1.1.1.1.1.2.2.2.2.cmml">ω</mi><mi id="S3.E13.m1.1.1.1.1.1.1.2.2.2.3" xref="S3.E13.m1.1.1.1.1.1.1.2.2.2.3.cmml">c</mi><mi id="S3.E13.m1.1.1.1.1.1.1.2.2.3" xref="S3.E13.m1.1.1.1.1.1.1.2.2.3.cmml">T</mi></msubsup><mo id="S3.E13.m1.1.1.1.1.1.1.2.1" xref="S3.E13.m1.1.1.1.1.1.1.2.1.cmml">⁢</mo><msub id="S3.E13.m1.1.1.1.1.1.1.2.3" xref="S3.E13.m1.1.1.1.1.1.1.2.3.cmml"><mi id="S3.E13.m1.1.1.1.1.1.1.2.3.2" xref="S3.E13.m1.1.1.1.1.1.1.2.3.2.cmml">c</mi><mi id="S3.E13.m1.1.1.1.1.1.1.2.3.3" xref="S3.E13.m1.1.1.1.1.1.1.2.3.3.cmml">i</mi></msub></mrow><mo id="S3.E13.m1.1.1.1.1.1.1.1" xref="S3.E13.m1.1.1.1.1.1.1.1.cmml">+</mo><mrow id="S3.E13.m1.1.1.1.1.1.1.3" xref="S3.E13.m1.1.1.1.1.1.1.3.cmml"><msubsup id="S3.E13.m1.1.1.1.1.1.1.3.2" xref="S3.E13.m1.1.1.1.1.1.1.3.2.cmml"><mi id="S3.E13.m1.1.1.1.1.1.1.3.2.2.2" xref="S3.E13.m1.1.1.1.1.1.1.3.2.2.2.cmml">ω</mi><mi id="S3.E13.m1.1.1.1.1.1.1.3.2.2.3" xref="S3.E13.m1.1.1.1.1.1.1.3.2.2.3.cmml">h</mi><mi id="S3.E13.m1.1.1.1.1.1.1.3.2.3" xref="S3.E13.m1.1.1.1.1.1.1.3.2.3.cmml">T</mi></msubsup><mo id="S3.E13.m1.1.1.1.1.1.1.3.1" 
xref="S3.E13.m1.1.1.1.1.1.1.3.1.cmml">⁢</mo><msubsup id="S3.E13.m1.1.1.1.1.1.1.3.3" xref="S3.E13.m1.1.1.1.1.1.1.3.3.cmml"><mi id="S3.E13.m1.1.1.1.1.1.1.3.3.2.2" xref="S3.E13.m1.1.1.1.1.1.1.3.3.2.2.cmml">h</mi><mi id="S3.E13.m1.1.1.1.1.1.1.3.3.2.3" xref="S3.E13.m1.1.1.1.1.1.1.3.3.2.3.cmml">i</mi><mi id="S3.E13.m1.1.1.1.1.1.1.3.3.3" xref="S3.E13.m1.1.1.1.1.1.1.3.3.3.cmml">d</mi></msubsup></mrow><mo id="S3.E13.m1.1.1.1.1.1.1.1a" xref="S3.E13.m1.1.1.1.1.1.1.1.cmml">+</mo><mrow id="S3.E13.m1.1.1.1.1.1.1.4" xref="S3.E13.m1.1.1.1.1.1.1.4.cmml"><msubsup id="S3.E13.m1.1.1.1.1.1.1.4.2" xref="S3.E13.m1.1.1.1.1.1.1.4.2.cmml"><mi id="S3.E13.m1.1.1.1.1.1.1.4.2.2.2" xref="S3.E13.m1.1.1.1.1.1.1.4.2.2.2.cmml">ω</mi><mi id="S3.E13.m1.1.1.1.1.1.1.4.2.2.3" xref="S3.E13.m1.1.1.1.1.1.1.4.2.2.3.cmml">y</mi><mi id="S3.E13.m1.1.1.1.1.1.1.4.2.3" xref="S3.E13.m1.1.1.1.1.1.1.4.2.3.cmml">T</mi></msubsup><mo id="S3.E13.m1.1.1.1.1.1.1.4.1" xref="S3.E13.m1.1.1.1.1.1.1.4.1.cmml">⁢</mo><msub id="S3.E13.m1.1.1.1.1.1.1.4.3" xref="S3.E13.m1.1.1.1.1.1.1.4.3.cmml"><mi id="S3.E13.m1.1.1.1.1.1.1.4.3.2" xref="S3.E13.m1.1.1.1.1.1.1.4.3.2.cmml">y</mi><mi id="S3.E13.m1.1.1.1.1.1.1.4.3.3" xref="S3.E13.m1.1.1.1.1.1.1.4.3.3.cmml">i</mi></msub></mrow><mo id="S3.E13.m1.1.1.1.1.1.1.1b" xref="S3.E13.m1.1.1.1.1.1.1.1.cmml">+</mo><mrow id="S3.E13.m1.1.1.1.1.1.1.5" xref="S3.E13.m1.1.1.1.1.1.1.5.cmml"><msubsup id="S3.E13.m1.1.1.1.1.1.1.5.2" xref="S3.E13.m1.1.1.1.1.1.1.5.2.cmml"><mi id="S3.E13.m1.1.1.1.1.1.1.5.2.2.2" xref="S3.E13.m1.1.1.1.1.1.1.5.2.2.2.cmml">ω</mi><mi id="S3.E13.m1.1.1.1.1.1.1.5.2.2.3" xref="S3.E13.m1.1.1.1.1.1.1.5.2.2.3.cmml">d</mi><mi id="S3.E13.m1.1.1.1.1.1.1.5.2.3" xref="S3.E13.m1.1.1.1.1.1.1.5.2.3.cmml">T</mi></msubsup><mo id="S3.E13.m1.1.1.1.1.1.1.5.1" xref="S3.E13.m1.1.1.1.1.1.1.5.1.cmml">⁢</mo><msup id="S3.E13.m1.1.1.1.1.1.1.5.3" xref="S3.E13.m1.1.1.1.1.1.1.5.3.cmml"><mi id="S3.E13.m1.1.1.1.1.1.1.5.3.2" xref="S3.E13.m1.1.1.1.1.1.1.5.3.2.cmml">C</mi><mi id="S3.E13.m1.1.1.1.1.1.1.5.3.3" 
xref="S3.E13.m1.1.1.1.1.1.1.5.3.3.cmml">d</mi></msup></mrow><mo id="S3.E13.m1.1.1.1.1.1.1.1c" xref="S3.E13.m1.1.1.1.1.1.1.1.cmml">+</mo><msub id="S3.E13.m1.1.1.1.1.1.1.6" xref="S3.E13.m1.1.1.1.1.1.1.6.cmml"><mi id="S3.E13.m1.1.1.1.1.1.1.6.2" xref="S3.E13.m1.1.1.1.1.1.1.6.2.cmml">b</mi><mi id="S3.E13.m1.1.1.1.1.1.1.6.3" xref="S3.E13.m1.1.1.1.1.1.1.6.3.cmml">g</mi></msub></mrow><mo id="S3.E13.m1.1.1.1.1.1.3" stretchy="false" xref="S3.E13.m1.1.1.1.1.1.1.cmml">)</mo></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.E13.m1.1b"><apply id="S3.E13.m1.1.1.cmml" xref="S3.E13.m1.1.1"><eq id="S3.E13.m1.1.1.2.cmml" xref="S3.E13.m1.1.1.2"></eq><apply id="S3.E13.m1.1.1.3.cmml" xref="S3.E13.m1.1.1.3"><csymbol cd="ambiguous" id="S3.E13.m1.1.1.3.1.cmml" xref="S3.E13.m1.1.1.3">subscript</csymbol><ci id="S3.E13.m1.1.1.3.2.cmml" xref="S3.E13.m1.1.1.3.2">𝑃</ci><ci id="S3.E13.m1.1.1.3.3.cmml" xref="S3.E13.m1.1.1.3.3">𝑝</ci></apply><apply id="S3.E13.m1.1.1.1.cmml" xref="S3.E13.m1.1.1.1"><times id="S3.E13.m1.1.1.1.2.cmml" xref="S3.E13.m1.1.1.1.2"></times><ci id="S3.E13.m1.1.1.1.3.cmml" xref="S3.E13.m1.1.1.1.3">𝜎</ci><apply id="S3.E13.m1.1.1.1.1.1.1.cmml" xref="S3.E13.m1.1.1.1.1.1"><plus id="S3.E13.m1.1.1.1.1.1.1.1.cmml" xref="S3.E13.m1.1.1.1.1.1.1.1"></plus><apply id="S3.E13.m1.1.1.1.1.1.1.2.cmml" xref="S3.E13.m1.1.1.1.1.1.1.2"><times id="S3.E13.m1.1.1.1.1.1.1.2.1.cmml" xref="S3.E13.m1.1.1.1.1.1.1.2.1"></times><apply id="S3.E13.m1.1.1.1.1.1.1.2.2.cmml" xref="S3.E13.m1.1.1.1.1.1.1.2.2"><csymbol cd="ambiguous" id="S3.E13.m1.1.1.1.1.1.1.2.2.1.cmml" xref="S3.E13.m1.1.1.1.1.1.1.2.2">superscript</csymbol><apply id="S3.E13.m1.1.1.1.1.1.1.2.2.2.cmml" xref="S3.E13.m1.1.1.1.1.1.1.2.2"><csymbol cd="ambiguous" id="S3.E13.m1.1.1.1.1.1.1.2.2.2.1.cmml" xref="S3.E13.m1.1.1.1.1.1.1.2.2">subscript</csymbol><ci id="S3.E13.m1.1.1.1.1.1.1.2.2.2.2.cmml" xref="S3.E13.m1.1.1.1.1.1.1.2.2.2.2">𝜔</ci><ci id="S3.E13.m1.1.1.1.1.1.1.2.2.2.3.cmml" 
xref="S3.E13.m1.1.1.1.1.1.1.2.2.2.3">𝑐</ci></apply><ci id="S3.E13.m1.1.1.1.1.1.1.2.2.3.cmml" xref="S3.E13.m1.1.1.1.1.1.1.2.2.3">𝑇</ci></apply><apply id="S3.E13.m1.1.1.1.1.1.1.2.3.cmml" xref="S3.E13.m1.1.1.1.1.1.1.2.3"><csymbol cd="ambiguous" id="S3.E13.m1.1.1.1.1.1.1.2.3.1.cmml" xref="S3.E13.m1.1.1.1.1.1.1.2.3">subscript</csymbol><ci id="S3.E13.m1.1.1.1.1.1.1.2.3.2.cmml" xref="S3.E13.m1.1.1.1.1.1.1.2.3.2">𝑐</ci><ci id="S3.E13.m1.1.1.1.1.1.1.2.3.3.cmml" xref="S3.E13.m1.1.1.1.1.1.1.2.3.3">𝑖</ci></apply></apply><apply id="S3.E13.m1.1.1.1.1.1.1.3.cmml" xref="S3.E13.m1.1.1.1.1.1.1.3"><times id="S3.E13.m1.1.1.1.1.1.1.3.1.cmml" xref="S3.E13.m1.1.1.1.1.1.1.3.1"></times><apply id="S3.E13.m1.1.1.1.1.1.1.3.2.cmml" xref="S3.E13.m1.1.1.1.1.1.1.3.2"><csymbol cd="ambiguous" id="S3.E13.m1.1.1.1.1.1.1.3.2.1.cmml" xref="S3.E13.m1.1.1.1.1.1.1.3.2">superscript</csymbol><apply id="S3.E13.m1.1.1.1.1.1.1.3.2.2.cmml" xref="S3.E13.m1.1.1.1.1.1.1.3.2"><csymbol cd="ambiguous" id="S3.E13.m1.1.1.1.1.1.1.3.2.2.1.cmml" xref="S3.E13.m1.1.1.1.1.1.1.3.2">subscript</csymbol><ci id="S3.E13.m1.1.1.1.1.1.1.3.2.2.2.cmml" xref="S3.E13.m1.1.1.1.1.1.1.3.2.2.2">𝜔</ci><ci id="S3.E13.m1.1.1.1.1.1.1.3.2.2.3.cmml" xref="S3.E13.m1.1.1.1.1.1.1.3.2.2.3">ℎ</ci></apply><ci id="S3.E13.m1.1.1.1.1.1.1.3.2.3.cmml" xref="S3.E13.m1.1.1.1.1.1.1.3.2.3">𝑇</ci></apply><apply id="S3.E13.m1.1.1.1.1.1.1.3.3.cmml" xref="S3.E13.m1.1.1.1.1.1.1.3.3"><csymbol cd="ambiguous" id="S3.E13.m1.1.1.1.1.1.1.3.3.1.cmml" xref="S3.E13.m1.1.1.1.1.1.1.3.3">superscript</csymbol><apply id="S3.E13.m1.1.1.1.1.1.1.3.3.2.cmml" xref="S3.E13.m1.1.1.1.1.1.1.3.3"><csymbol cd="ambiguous" id="S3.E13.m1.1.1.1.1.1.1.3.3.2.1.cmml" xref="S3.E13.m1.1.1.1.1.1.1.3.3">subscript</csymbol><ci id="S3.E13.m1.1.1.1.1.1.1.3.3.2.2.cmml" xref="S3.E13.m1.1.1.1.1.1.1.3.3.2.2">ℎ</ci><ci id="S3.E13.m1.1.1.1.1.1.1.3.3.2.3.cmml" xref="S3.E13.m1.1.1.1.1.1.1.3.3.2.3">𝑖</ci></apply><ci id="S3.E13.m1.1.1.1.1.1.1.3.3.3.cmml" 
xref="S3.E13.m1.1.1.1.1.1.1.3.3.3">𝑑</ci></apply></apply><apply id="S3.E13.m1.1.1.1.1.1.1.4.cmml" xref="S3.E13.m1.1.1.1.1.1.1.4"><times id="S3.E13.m1.1.1.1.1.1.1.4.1.cmml" xref="S3.E13.m1.1.1.1.1.1.1.4.1"></times><apply id="S3.E13.m1.1.1.1.1.1.1.4.2.cmml" xref="S3.E13.m1.1.1.1.1.1.1.4.2"><csymbol cd="ambiguous" id="S3.E13.m1.1.1.1.1.1.1.4.2.1.cmml" xref="S3.E13.m1.1.1.1.1.1.1.4.2">superscript</csymbol><apply id="S3.E13.m1.1.1.1.1.1.1.4.2.2.cmml" xref="S3.E13.m1.1.1.1.1.1.1.4.2"><csymbol cd="ambiguous" id="S3.E13.m1.1.1.1.1.1.1.4.2.2.1.cmml" xref="S3.E13.m1.1.1.1.1.1.1.4.2">subscript</csymbol><ci id="S3.E13.m1.1.1.1.1.1.1.4.2.2.2.cmml" xref="S3.E13.m1.1.1.1.1.1.1.4.2.2.2">𝜔</ci><ci id="S3.E13.m1.1.1.1.1.1.1.4.2.2.3.cmml" xref="S3.E13.m1.1.1.1.1.1.1.4.2.2.3">𝑦</ci></apply><ci id="S3.E13.m1.1.1.1.1.1.1.4.2.3.cmml" xref="S3.E13.m1.1.1.1.1.1.1.4.2.3">𝑇</ci></apply><apply id="S3.E13.m1.1.1.1.1.1.1.4.3.cmml" xref="S3.E13.m1.1.1.1.1.1.1.4.3"><csymbol cd="ambiguous" id="S3.E13.m1.1.1.1.1.1.1.4.3.1.cmml" xref="S3.E13.m1.1.1.1.1.1.1.4.3">subscript</csymbol><ci id="S3.E13.m1.1.1.1.1.1.1.4.3.2.cmml" xref="S3.E13.m1.1.1.1.1.1.1.4.3.2">𝑦</ci><ci id="S3.E13.m1.1.1.1.1.1.1.4.3.3.cmml" xref="S3.E13.m1.1.1.1.1.1.1.4.3.3">𝑖</ci></apply></apply><apply id="S3.E13.m1.1.1.1.1.1.1.5.cmml" xref="S3.E13.m1.1.1.1.1.1.1.5"><times id="S3.E13.m1.1.1.1.1.1.1.5.1.cmml" xref="S3.E13.m1.1.1.1.1.1.1.5.1"></times><apply id="S3.E13.m1.1.1.1.1.1.1.5.2.cmml" xref="S3.E13.m1.1.1.1.1.1.1.5.2"><csymbol cd="ambiguous" id="S3.E13.m1.1.1.1.1.1.1.5.2.1.cmml" xref="S3.E13.m1.1.1.1.1.1.1.5.2">superscript</csymbol><apply id="S3.E13.m1.1.1.1.1.1.1.5.2.2.cmml" xref="S3.E13.m1.1.1.1.1.1.1.5.2"><csymbol cd="ambiguous" id="S3.E13.m1.1.1.1.1.1.1.5.2.2.1.cmml" xref="S3.E13.m1.1.1.1.1.1.1.5.2">subscript</csymbol><ci id="S3.E13.m1.1.1.1.1.1.1.5.2.2.2.cmml" xref="S3.E13.m1.1.1.1.1.1.1.5.2.2.2">𝜔</ci><ci id="S3.E13.m1.1.1.1.1.1.1.5.2.2.3.cmml" xref="S3.E13.m1.1.1.1.1.1.1.5.2.2.3">𝑑</ci></apply><ci 
id="S3.E13.m1.1.1.1.1.1.1.5.2.3.cmml" xref="S3.E13.m1.1.1.1.1.1.1.5.2.3">𝑇</ci></apply><apply id="S3.E13.m1.1.1.1.1.1.1.5.3.cmml" xref="S3.E13.m1.1.1.1.1.1.1.5.3"><csymbol cd="ambiguous" id="S3.E13.m1.1.1.1.1.1.1.5.3.1.cmml" xref="S3.E13.m1.1.1.1.1.1.1.5.3">superscript</csymbol><ci id="S3.E13.m1.1.1.1.1.1.1.5.3.2.cmml" xref="S3.E13.m1.1.1.1.1.1.1.5.3.2">𝐶</ci><ci id="S3.E13.m1.1.1.1.1.1.1.5.3.3.cmml" xref="S3.E13.m1.1.1.1.1.1.1.5.3.3">𝑑</ci></apply></apply><apply id="S3.E13.m1.1.1.1.1.1.1.6.cmml" xref="S3.E13.m1.1.1.1.1.1.1.6"><csymbol cd="ambiguous" id="S3.E13.m1.1.1.1.1.1.1.6.1.cmml" xref="S3.E13.m1.1.1.1.1.1.1.6">subscript</csymbol><ci id="S3.E13.m1.1.1.1.1.1.1.6.2.cmml" xref="S3.E13.m1.1.1.1.1.1.1.6.2">𝑏</ci><ci id="S3.E13.m1.1.1.1.1.1.1.6.3.cmml" xref="S3.E13.m1.1.1.1.1.1.1.6.3">𝑔</ci></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.E13.m1.1c">P_{p}=\sigma(\omega_{c}^{T}c_{i}+\omega_{h}^{T}h_{i}^{d}+\omega_{y}^{T}y_{i}+% \omega_{d}^{T}C^{d}+b_{g})</annotation><annotation encoding="application/x-llamapun" id="S3.E13.m1.1d">italic_P start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT = italic_σ ( italic_ω start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT + italic_ω start_POSTSUBSCRIPT italic_h end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_h start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT + italic_ω start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_y start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT + italic_ω start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_C start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT + italic_b start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT )</annotation></semantics></math></td> <td class="ltx_eqn_cell 
ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(13)</span></td> </tr></tbody> </table> <table class="ltx_equation ltx_eqn_table" id="S3.E14"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="P_{w}=P_{p}P_{v}(w)+(1-P_{p})\sum_{j}^{w_{i}}a_{ij}" class="ltx_Math" display="block" id="S3.E14.m1.2"><semantics id="S3.E14.m1.2a"><mrow id="S3.E14.m1.2.2" xref="S3.E14.m1.2.2.cmml"><msub id="S3.E14.m1.2.2.3" xref="S3.E14.m1.2.2.3.cmml"><mi id="S3.E14.m1.2.2.3.2" xref="S3.E14.m1.2.2.3.2.cmml">P</mi><mi id="S3.E14.m1.2.2.3.3" xref="S3.E14.m1.2.2.3.3.cmml">w</mi></msub><mo id="S3.E14.m1.2.2.2" xref="S3.E14.m1.2.2.2.cmml">=</mo><mrow id="S3.E14.m1.2.2.1" xref="S3.E14.m1.2.2.1.cmml"><mrow id="S3.E14.m1.2.2.1.3" xref="S3.E14.m1.2.2.1.3.cmml"><msub id="S3.E14.m1.2.2.1.3.2" xref="S3.E14.m1.2.2.1.3.2.cmml"><mi id="S3.E14.m1.2.2.1.3.2.2" xref="S3.E14.m1.2.2.1.3.2.2.cmml">P</mi><mi id="S3.E14.m1.2.2.1.3.2.3" xref="S3.E14.m1.2.2.1.3.2.3.cmml">p</mi></msub><mo id="S3.E14.m1.2.2.1.3.1" xref="S3.E14.m1.2.2.1.3.1.cmml">⁢</mo><msub id="S3.E14.m1.2.2.1.3.3" xref="S3.E14.m1.2.2.1.3.3.cmml"><mi id="S3.E14.m1.2.2.1.3.3.2" xref="S3.E14.m1.2.2.1.3.3.2.cmml">P</mi><mi id="S3.E14.m1.2.2.1.3.3.3" xref="S3.E14.m1.2.2.1.3.3.3.cmml">v</mi></msub><mo id="S3.E14.m1.2.2.1.3.1a" xref="S3.E14.m1.2.2.1.3.1.cmml">⁢</mo><mrow id="S3.E14.m1.2.2.1.3.4.2" xref="S3.E14.m1.2.2.1.3.cmml"><mo id="S3.E14.m1.2.2.1.3.4.2.1" stretchy="false" xref="S3.E14.m1.2.2.1.3.cmml">(</mo><mi id="S3.E14.m1.1.1" xref="S3.E14.m1.1.1.cmml">w</mi><mo id="S3.E14.m1.2.2.1.3.4.2.2" stretchy="false" xref="S3.E14.m1.2.2.1.3.cmml">)</mo></mrow></mrow><mo id="S3.E14.m1.2.2.1.2" xref="S3.E14.m1.2.2.1.2.cmml">+</mo><mrow id="S3.E14.m1.2.2.1.1" xref="S3.E14.m1.2.2.1.1.cmml"><mrow 
id="S3.E14.m1.2.2.1.1.1.1" xref="S3.E14.m1.2.2.1.1.1.1.1.cmml"><mo id="S3.E14.m1.2.2.1.1.1.1.2" stretchy="false" xref="S3.E14.m1.2.2.1.1.1.1.1.cmml">(</mo><mrow id="S3.E14.m1.2.2.1.1.1.1.1" xref="S3.E14.m1.2.2.1.1.1.1.1.cmml"><mn id="S3.E14.m1.2.2.1.1.1.1.1.2" xref="S3.E14.m1.2.2.1.1.1.1.1.2.cmml">1</mn><mo id="S3.E14.m1.2.2.1.1.1.1.1.1" xref="S3.E14.m1.2.2.1.1.1.1.1.1.cmml">−</mo><msub id="S3.E14.m1.2.2.1.1.1.1.1.3" xref="S3.E14.m1.2.2.1.1.1.1.1.3.cmml"><mi id="S3.E14.m1.2.2.1.1.1.1.1.3.2" xref="S3.E14.m1.2.2.1.1.1.1.1.3.2.cmml">P</mi><mi id="S3.E14.m1.2.2.1.1.1.1.1.3.3" xref="S3.E14.m1.2.2.1.1.1.1.1.3.3.cmml">p</mi></msub></mrow><mo id="S3.E14.m1.2.2.1.1.1.1.3" stretchy="false" xref="S3.E14.m1.2.2.1.1.1.1.1.cmml">)</mo></mrow><mo id="S3.E14.m1.2.2.1.1.2" xref="S3.E14.m1.2.2.1.1.2.cmml">⁢</mo><mrow id="S3.E14.m1.2.2.1.1.3" xref="S3.E14.m1.2.2.1.1.3.cmml"><munderover id="S3.E14.m1.2.2.1.1.3.1" xref="S3.E14.m1.2.2.1.1.3.1.cmml"><mo id="S3.E14.m1.2.2.1.1.3.1.2.2" movablelimits="false" xref="S3.E14.m1.2.2.1.1.3.1.2.2.cmml">∑</mo><mi id="S3.E14.m1.2.2.1.1.3.1.2.3" xref="S3.E14.m1.2.2.1.1.3.1.2.3.cmml">j</mi><msub id="S3.E14.m1.2.2.1.1.3.1.3" xref="S3.E14.m1.2.2.1.1.3.1.3.cmml"><mi id="S3.E14.m1.2.2.1.1.3.1.3.2" xref="S3.E14.m1.2.2.1.1.3.1.3.2.cmml">w</mi><mi id="S3.E14.m1.2.2.1.1.3.1.3.3" xref="S3.E14.m1.2.2.1.1.3.1.3.3.cmml">i</mi></msub></munderover><msub id="S3.E14.m1.2.2.1.1.3.2" xref="S3.E14.m1.2.2.1.1.3.2.cmml"><mi id="S3.E14.m1.2.2.1.1.3.2.2" xref="S3.E14.m1.2.2.1.1.3.2.2.cmml">a</mi><mrow id="S3.E14.m1.2.2.1.1.3.2.3" xref="S3.E14.m1.2.2.1.1.3.2.3.cmml"><mi id="S3.E14.m1.2.2.1.1.3.2.3.2" xref="S3.E14.m1.2.2.1.1.3.2.3.2.cmml">i</mi><mo id="S3.E14.m1.2.2.1.1.3.2.3.1" xref="S3.E14.m1.2.2.1.1.3.2.3.1.cmml">⁢</mo><mi id="S3.E14.m1.2.2.1.1.3.2.3.3" xref="S3.E14.m1.2.2.1.1.3.2.3.3.cmml">j</mi></mrow></msub></mrow></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.E14.m1.2b"><apply id="S3.E14.m1.2.2.cmml" xref="S3.E14.m1.2.2"><eq 
id="S3.E14.m1.2.2.2.cmml" xref="S3.E14.m1.2.2.2"></eq><apply id="S3.E14.m1.2.2.3.cmml" xref="S3.E14.m1.2.2.3"><csymbol cd="ambiguous" id="S3.E14.m1.2.2.3.1.cmml" xref="S3.E14.m1.2.2.3">subscript</csymbol><ci id="S3.E14.m1.2.2.3.2.cmml" xref="S3.E14.m1.2.2.3.2">𝑃</ci><ci id="S3.E14.m1.2.2.3.3.cmml" xref="S3.E14.m1.2.2.3.3">𝑤</ci></apply><apply id="S3.E14.m1.2.2.1.cmml" xref="S3.E14.m1.2.2.1"><plus id="S3.E14.m1.2.2.1.2.cmml" xref="S3.E14.m1.2.2.1.2"></plus><apply id="S3.E14.m1.2.2.1.3.cmml" xref="S3.E14.m1.2.2.1.3"><times id="S3.E14.m1.2.2.1.3.1.cmml" xref="S3.E14.m1.2.2.1.3.1"></times><apply id="S3.E14.m1.2.2.1.3.2.cmml" xref="S3.E14.m1.2.2.1.3.2"><csymbol cd="ambiguous" id="S3.E14.m1.2.2.1.3.2.1.cmml" xref="S3.E14.m1.2.2.1.3.2">subscript</csymbol><ci id="S3.E14.m1.2.2.1.3.2.2.cmml" xref="S3.E14.m1.2.2.1.3.2.2">𝑃</ci><ci id="S3.E14.m1.2.2.1.3.2.3.cmml" xref="S3.E14.m1.2.2.1.3.2.3">𝑝</ci></apply><apply id="S3.E14.m1.2.2.1.3.3.cmml" xref="S3.E14.m1.2.2.1.3.3"><csymbol cd="ambiguous" id="S3.E14.m1.2.2.1.3.3.1.cmml" xref="S3.E14.m1.2.2.1.3.3">subscript</csymbol><ci id="S3.E14.m1.2.2.1.3.3.2.cmml" xref="S3.E14.m1.2.2.1.3.3.2">𝑃</ci><ci id="S3.E14.m1.2.2.1.3.3.3.cmml" xref="S3.E14.m1.2.2.1.3.3.3">𝑣</ci></apply><ci id="S3.E14.m1.1.1.cmml" xref="S3.E14.m1.1.1">𝑤</ci></apply><apply id="S3.E14.m1.2.2.1.1.cmml" xref="S3.E14.m1.2.2.1.1"><times id="S3.E14.m1.2.2.1.1.2.cmml" xref="S3.E14.m1.2.2.1.1.2"></times><apply id="S3.E14.m1.2.2.1.1.1.1.1.cmml" xref="S3.E14.m1.2.2.1.1.1.1"><minus id="S3.E14.m1.2.2.1.1.1.1.1.1.cmml" xref="S3.E14.m1.2.2.1.1.1.1.1.1"></minus><cn id="S3.E14.m1.2.2.1.1.1.1.1.2.cmml" type="integer" xref="S3.E14.m1.2.2.1.1.1.1.1.2">1</cn><apply id="S3.E14.m1.2.2.1.1.1.1.1.3.cmml" xref="S3.E14.m1.2.2.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S3.E14.m1.2.2.1.1.1.1.1.3.1.cmml" xref="S3.E14.m1.2.2.1.1.1.1.1.3">subscript</csymbol><ci id="S3.E14.m1.2.2.1.1.1.1.1.3.2.cmml" xref="S3.E14.m1.2.2.1.1.1.1.1.3.2">𝑃</ci><ci id="S3.E14.m1.2.2.1.1.1.1.1.3.3.cmml" 
xref="S3.E14.m1.2.2.1.1.1.1.1.3.3">𝑝</ci></apply></apply><apply id="S3.E14.m1.2.2.1.1.3.cmml" xref="S3.E14.m1.2.2.1.1.3"><apply id="S3.E14.m1.2.2.1.1.3.1.cmml" xref="S3.E14.m1.2.2.1.1.3.1"><csymbol cd="ambiguous" id="S3.E14.m1.2.2.1.1.3.1.1.cmml" xref="S3.E14.m1.2.2.1.1.3.1">superscript</csymbol><apply id="S3.E14.m1.2.2.1.1.3.1.2.cmml" xref="S3.E14.m1.2.2.1.1.3.1"><csymbol cd="ambiguous" id="S3.E14.m1.2.2.1.1.3.1.2.1.cmml" xref="S3.E14.m1.2.2.1.1.3.1">subscript</csymbol><sum id="S3.E14.m1.2.2.1.1.3.1.2.2.cmml" xref="S3.E14.m1.2.2.1.1.3.1.2.2"></sum><ci id="S3.E14.m1.2.2.1.1.3.1.2.3.cmml" xref="S3.E14.m1.2.2.1.1.3.1.2.3">𝑗</ci></apply><apply id="S3.E14.m1.2.2.1.1.3.1.3.cmml" xref="S3.E14.m1.2.2.1.1.3.1.3"><csymbol cd="ambiguous" id="S3.E14.m1.2.2.1.1.3.1.3.1.cmml" xref="S3.E14.m1.2.2.1.1.3.1.3">subscript</csymbol><ci id="S3.E14.m1.2.2.1.1.3.1.3.2.cmml" xref="S3.E14.m1.2.2.1.1.3.1.3.2">𝑤</ci><ci id="S3.E14.m1.2.2.1.1.3.1.3.3.cmml" xref="S3.E14.m1.2.2.1.1.3.1.3.3">𝑖</ci></apply></apply><apply id="S3.E14.m1.2.2.1.1.3.2.cmml" xref="S3.E14.m1.2.2.1.1.3.2"><csymbol cd="ambiguous" id="S3.E14.m1.2.2.1.1.3.2.1.cmml" xref="S3.E14.m1.2.2.1.1.3.2">subscript</csymbol><ci id="S3.E14.m1.2.2.1.1.3.2.2.cmml" xref="S3.E14.m1.2.2.1.1.3.2.2">𝑎</ci><apply id="S3.E14.m1.2.2.1.1.3.2.3.cmml" xref="S3.E14.m1.2.2.1.1.3.2.3"><times id="S3.E14.m1.2.2.1.1.3.2.3.1.cmml" xref="S3.E14.m1.2.2.1.1.3.2.3.1"></times><ci id="S3.E14.m1.2.2.1.1.3.2.3.2.cmml" xref="S3.E14.m1.2.2.1.1.3.2.3.2">𝑖</ci><ci id="S3.E14.m1.2.2.1.1.3.2.3.3.cmml" xref="S3.E14.m1.2.2.1.1.3.2.3.3">𝑗</ci></apply></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.E14.m1.2c">P_{w}=P_{p}P_{v}(w)+(1-P_{p})\sum_{j}^{w_{i}}a_{ij}</annotation><annotation encoding="application/x-llamapun" id="S3.E14.m1.2d">italic_P start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT = italic_P start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT italic_P start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT ( italic_w ) + ( 
1 - italic_P start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ) ∑ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_w start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(14)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S3.SS4.p1.15">where <math alttext="\omega_{c}^{T}" class="ltx_Math" display="inline" id="S3.SS4.p1.6.m1.1"><semantics id="S3.SS4.p1.6.m1.1a"><msubsup id="S3.SS4.p1.6.m1.1.1" xref="S3.SS4.p1.6.m1.1.1.cmml"><mi id="S3.SS4.p1.6.m1.1.1.2.2" xref="S3.SS4.p1.6.m1.1.1.2.2.cmml">ω</mi><mi id="S3.SS4.p1.6.m1.1.1.2.3" xref="S3.SS4.p1.6.m1.1.1.2.3.cmml">c</mi><mi id="S3.SS4.p1.6.m1.1.1.3" xref="S3.SS4.p1.6.m1.1.1.3.cmml">T</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.SS4.p1.6.m1.1b"><apply id="S3.SS4.p1.6.m1.1.1.cmml" xref="S3.SS4.p1.6.m1.1.1"><csymbol cd="ambiguous" id="S3.SS4.p1.6.m1.1.1.1.cmml" xref="S3.SS4.p1.6.m1.1.1">superscript</csymbol><apply id="S3.SS4.p1.6.m1.1.1.2.cmml" xref="S3.SS4.p1.6.m1.1.1"><csymbol cd="ambiguous" id="S3.SS4.p1.6.m1.1.1.2.1.cmml" xref="S3.SS4.p1.6.m1.1.1">subscript</csymbol><ci id="S3.SS4.p1.6.m1.1.1.2.2.cmml" xref="S3.SS4.p1.6.m1.1.1.2.2">𝜔</ci><ci id="S3.SS4.p1.6.m1.1.1.2.3.cmml" xref="S3.SS4.p1.6.m1.1.1.2.3">𝑐</ci></apply><ci id="S3.SS4.p1.6.m1.1.1.3.cmml" xref="S3.SS4.p1.6.m1.1.1.3">𝑇</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS4.p1.6.m1.1c">\omega_{c}^{T}</annotation><annotation encoding="application/x-llamapun" id="S3.SS4.p1.6.m1.1d">italic_ω start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT</annotation></semantics></math>, <math alttext="\omega_{h}^{T}" class="ltx_Math" 
display="inline" id="S3.SS4.p1.7.m2.1"><semantics id="S3.SS4.p1.7.m2.1a"><msubsup id="S3.SS4.p1.7.m2.1.1" xref="S3.SS4.p1.7.m2.1.1.cmml"><mi id="S3.SS4.p1.7.m2.1.1.2.2" xref="S3.SS4.p1.7.m2.1.1.2.2.cmml">ω</mi><mi id="S3.SS4.p1.7.m2.1.1.2.3" xref="S3.SS4.p1.7.m2.1.1.2.3.cmml">h</mi><mi id="S3.SS4.p1.7.m2.1.1.3" xref="S3.SS4.p1.7.m2.1.1.3.cmml">T</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.SS4.p1.7.m2.1b"><apply id="S3.SS4.p1.7.m2.1.1.cmml" xref="S3.SS4.p1.7.m2.1.1"><csymbol cd="ambiguous" id="S3.SS4.p1.7.m2.1.1.1.cmml" xref="S3.SS4.p1.7.m2.1.1">superscript</csymbol><apply id="S3.SS4.p1.7.m2.1.1.2.cmml" xref="S3.SS4.p1.7.m2.1.1"><csymbol cd="ambiguous" id="S3.SS4.p1.7.m2.1.1.2.1.cmml" xref="S3.SS4.p1.7.m2.1.1">subscript</csymbol><ci id="S3.SS4.p1.7.m2.1.1.2.2.cmml" xref="S3.SS4.p1.7.m2.1.1.2.2">𝜔</ci><ci id="S3.SS4.p1.7.m2.1.1.2.3.cmml" xref="S3.SS4.p1.7.m2.1.1.2.3">ℎ</ci></apply><ci id="S3.SS4.p1.7.m2.1.1.3.cmml" xref="S3.SS4.p1.7.m2.1.1.3">𝑇</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS4.p1.7.m2.1c">\omega_{h}^{T}</annotation><annotation encoding="application/x-llamapun" id="S3.SS4.p1.7.m2.1d">italic_ω start_POSTSUBSCRIPT italic_h end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT</annotation></semantics></math>, <math alttext="\omega_{y}^{T}" class="ltx_Math" display="inline" id="S3.SS4.p1.8.m3.1"><semantics id="S3.SS4.p1.8.m3.1a"><msubsup id="S3.SS4.p1.8.m3.1.1" xref="S3.SS4.p1.8.m3.1.1.cmml"><mi id="S3.SS4.p1.8.m3.1.1.2.2" xref="S3.SS4.p1.8.m3.1.1.2.2.cmml">ω</mi><mi id="S3.SS4.p1.8.m3.1.1.2.3" xref="S3.SS4.p1.8.m3.1.1.2.3.cmml">y</mi><mi id="S3.SS4.p1.8.m3.1.1.3" xref="S3.SS4.p1.8.m3.1.1.3.cmml">T</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.SS4.p1.8.m3.1b"><apply id="S3.SS4.p1.8.m3.1.1.cmml" xref="S3.SS4.p1.8.m3.1.1"><csymbol cd="ambiguous" id="S3.SS4.p1.8.m3.1.1.1.cmml" xref="S3.SS4.p1.8.m3.1.1">superscript</csymbol><apply id="S3.SS4.p1.8.m3.1.1.2.cmml" 
xref="S3.SS4.p1.8.m3.1.1"><csymbol cd="ambiguous" id="S3.SS4.p1.8.m3.1.1.2.1.cmml" xref="S3.SS4.p1.8.m3.1.1">subscript</csymbol><ci id="S3.SS4.p1.8.m3.1.1.2.2.cmml" xref="S3.SS4.p1.8.m3.1.1.2.2">𝜔</ci><ci id="S3.SS4.p1.8.m3.1.1.2.3.cmml" xref="S3.SS4.p1.8.m3.1.1.2.3">𝑦</ci></apply><ci id="S3.SS4.p1.8.m3.1.1.3.cmml" xref="S3.SS4.p1.8.m3.1.1.3">𝑇</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS4.p1.8.m3.1c">\omega_{y}^{T}</annotation><annotation encoding="application/x-llamapun" id="S3.SS4.p1.8.m3.1d">italic_ω start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT</annotation></semantics></math>, <math alttext="\omega_{d}^{T}" class="ltx_Math" display="inline" id="S3.SS4.p1.9.m4.1"><semantics id="S3.SS4.p1.9.m4.1a"><msubsup id="S3.SS4.p1.9.m4.1.1" xref="S3.SS4.p1.9.m4.1.1.cmml"><mi id="S3.SS4.p1.9.m4.1.1.2.2" xref="S3.SS4.p1.9.m4.1.1.2.2.cmml">ω</mi><mi id="S3.SS4.p1.9.m4.1.1.2.3" xref="S3.SS4.p1.9.m4.1.1.2.3.cmml">d</mi><mi id="S3.SS4.p1.9.m4.1.1.3" xref="S3.SS4.p1.9.m4.1.1.3.cmml">T</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.SS4.p1.9.m4.1b"><apply id="S3.SS4.p1.9.m4.1.1.cmml" xref="S3.SS4.p1.9.m4.1.1"><csymbol cd="ambiguous" id="S3.SS4.p1.9.m4.1.1.1.cmml" xref="S3.SS4.p1.9.m4.1.1">superscript</csymbol><apply id="S3.SS4.p1.9.m4.1.1.2.cmml" xref="S3.SS4.p1.9.m4.1.1"><csymbol cd="ambiguous" id="S3.SS4.p1.9.m4.1.1.2.1.cmml" xref="S3.SS4.p1.9.m4.1.1">subscript</csymbol><ci id="S3.SS4.p1.9.m4.1.1.2.2.cmml" xref="S3.SS4.p1.9.m4.1.1.2.2">𝜔</ci><ci id="S3.SS4.p1.9.m4.1.1.2.3.cmml" xref="S3.SS4.p1.9.m4.1.1.2.3">𝑑</ci></apply><ci id="S3.SS4.p1.9.m4.1.1.3.cmml" xref="S3.SS4.p1.9.m4.1.1.3">𝑇</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS4.p1.9.m4.1c">\omega_{d}^{T}</annotation><annotation encoding="application/x-llamapun" id="S3.SS4.p1.9.m4.1d">italic_ω start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T 
end_POSTSUPERSCRIPT</annotation></semantics></math>, and <math alttext="b_{g}" class="ltx_Math" display="inline" id="S3.SS4.p1.10.m5.1"><semantics id="S3.SS4.p1.10.m5.1a"><msub id="S3.SS4.p1.10.m5.1.1" xref="S3.SS4.p1.10.m5.1.1.cmml"><mi id="S3.SS4.p1.10.m5.1.1.2" xref="S3.SS4.p1.10.m5.1.1.2.cmml">b</mi><mi id="S3.SS4.p1.10.m5.1.1.3" xref="S3.SS4.p1.10.m5.1.1.3.cmml">g</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS4.p1.10.m5.1b"><apply id="S3.SS4.p1.10.m5.1.1.cmml" xref="S3.SS4.p1.10.m5.1.1"><csymbol cd="ambiguous" id="S3.SS4.p1.10.m5.1.1.1.cmml" xref="S3.SS4.p1.10.m5.1.1">subscript</csymbol><ci id="S3.SS4.p1.10.m5.1.1.2.cmml" xref="S3.SS4.p1.10.m5.1.1.2">𝑏</ci><ci id="S3.SS4.p1.10.m5.1.1.3.cmml" xref="S3.SS4.p1.10.m5.1.1.3">𝑔</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS4.p1.10.m5.1c">b_{g}</annotation><annotation encoding="application/x-llamapun" id="S3.SS4.p1.10.m5.1d">italic_b start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT</annotation></semantics></math> are learnable parameters, <math alttext="c_{i}" class="ltx_Math" display="inline" id="S3.SS4.p1.11.m6.1"><semantics id="S3.SS4.p1.11.m6.1a"><msub id="S3.SS4.p1.11.m6.1.1" xref="S3.SS4.p1.11.m6.1.1.cmml"><mi id="S3.SS4.p1.11.m6.1.1.2" xref="S3.SS4.p1.11.m6.1.1.2.cmml">c</mi><mi id="S3.SS4.p1.11.m6.1.1.3" xref="S3.SS4.p1.11.m6.1.1.3.cmml">i</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS4.p1.11.m6.1b"><apply id="S3.SS4.p1.11.m6.1.1.cmml" xref="S3.SS4.p1.11.m6.1.1"><csymbol cd="ambiguous" id="S3.SS4.p1.11.m6.1.1.1.cmml" xref="S3.SS4.p1.11.m6.1.1">subscript</csymbol><ci id="S3.SS4.p1.11.m6.1.1.2.cmml" xref="S3.SS4.p1.11.m6.1.1.2">𝑐</ci><ci id="S3.SS4.p1.11.m6.1.1.3.cmml" xref="S3.SS4.p1.11.m6.1.1.3">𝑖</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS4.p1.11.m6.1c">c_{i}</annotation><annotation encoding="application/x-llamapun" id="S3.SS4.p1.11.m6.1d">italic_c start_POSTSUBSCRIPT italic_i 
end_POSTSUBSCRIPT</annotation></semantics></math> is the context vector, <math alttext="h_{i}^{d}" class="ltx_Math" display="inline" id="S3.SS4.p1.12.m7.1"><semantics id="S3.SS4.p1.12.m7.1a"><msubsup id="S3.SS4.p1.12.m7.1.1" xref="S3.SS4.p1.12.m7.1.1.cmml"><mi id="S3.SS4.p1.12.m7.1.1.2.2" xref="S3.SS4.p1.12.m7.1.1.2.2.cmml">h</mi><mi id="S3.SS4.p1.12.m7.1.1.2.3" xref="S3.SS4.p1.12.m7.1.1.2.3.cmml">i</mi><mi id="S3.SS4.p1.12.m7.1.1.3" xref="S3.SS4.p1.12.m7.1.1.3.cmml">d</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.SS4.p1.12.m7.1b"><apply id="S3.SS4.p1.12.m7.1.1.cmml" xref="S3.SS4.p1.12.m7.1.1"><csymbol cd="ambiguous" id="S3.SS4.p1.12.m7.1.1.1.cmml" xref="S3.SS4.p1.12.m7.1.1">superscript</csymbol><apply id="S3.SS4.p1.12.m7.1.1.2.cmml" xref="S3.SS4.p1.12.m7.1.1"><csymbol cd="ambiguous" id="S3.SS4.p1.12.m7.1.1.2.1.cmml" xref="S3.SS4.p1.12.m7.1.1">subscript</csymbol><ci id="S3.SS4.p1.12.m7.1.1.2.2.cmml" xref="S3.SS4.p1.12.m7.1.1.2.2">ℎ</ci><ci id="S3.SS4.p1.12.m7.1.1.2.3.cmml" xref="S3.SS4.p1.12.m7.1.1.2.3">𝑖</ci></apply><ci id="S3.SS4.p1.12.m7.1.1.3.cmml" xref="S3.SS4.p1.12.m7.1.1.3">𝑑</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS4.p1.12.m7.1c">h_{i}^{d}</annotation><annotation encoding="application/x-llamapun" id="S3.SS4.p1.12.m7.1d">italic_h start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT</annotation></semantics></math> is the hidden state of the decoder, <math alttext="y_{i}" class="ltx_Math" display="inline" id="S3.SS4.p1.13.m8.1"><semantics id="S3.SS4.p1.13.m8.1a"><msub id="S3.SS4.p1.13.m8.1.1" xref="S3.SS4.p1.13.m8.1.1.cmml"><mi id="S3.SS4.p1.13.m8.1.1.2" xref="S3.SS4.p1.13.m8.1.1.2.cmml">y</mi><mi id="S3.SS4.p1.13.m8.1.1.3" xref="S3.SS4.p1.13.m8.1.1.3.cmml">i</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS4.p1.13.m8.1b"><apply id="S3.SS4.p1.13.m8.1.1.cmml" xref="S3.SS4.p1.13.m8.1.1"><csymbol cd="ambiguous" id="S3.SS4.p1.13.m8.1.1.1.cmml" 
xref="S3.SS4.p1.13.m8.1.1">subscript</csymbol><ci id="S3.SS4.p1.13.m8.1.1.2.cmml" xref="S3.SS4.p1.13.m8.1.1.2">𝑦</ci><ci id="S3.SS4.p1.13.m8.1.1.3.cmml" xref="S3.SS4.p1.13.m8.1.1.3">𝑖</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS4.p1.13.m8.1c">y_{i}</annotation><annotation encoding="application/x-llamapun" id="S3.SS4.p1.13.m8.1d">italic_y start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT</annotation></semantics></math> is the input to the decoder, <math alttext="C^{d}" class="ltx_Math" display="inline" id="S3.SS4.p1.14.m9.1"><semantics id="S3.SS4.p1.14.m9.1a"><msup id="S3.SS4.p1.14.m9.1.1" xref="S3.SS4.p1.14.m9.1.1.cmml"><mi id="S3.SS4.p1.14.m9.1.1.2" xref="S3.SS4.p1.14.m9.1.1.2.cmml">C</mi><mi id="S3.SS4.p1.14.m9.1.1.3" xref="S3.SS4.p1.14.m9.1.1.3.cmml">d</mi></msup><annotation-xml encoding="MathML-Content" id="S3.SS4.p1.14.m9.1b"><apply id="S3.SS4.p1.14.m9.1.1.cmml" xref="S3.SS4.p1.14.m9.1.1"><csymbol cd="ambiguous" id="S3.SS4.p1.14.m9.1.1.1.cmml" xref="S3.SS4.p1.14.m9.1.1">superscript</csymbol><ci id="S3.SS4.p1.14.m9.1.1.2.cmml" xref="S3.SS4.p1.14.m9.1.1.2">𝐶</ci><ci id="S3.SS4.p1.14.m9.1.1.3.cmml" xref="S3.SS4.p1.14.m9.1.1.3">𝑑</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS4.p1.14.m9.1c">C^{d}</annotation><annotation encoding="application/x-llamapun" id="S3.SS4.p1.14.m9.1d">italic_C start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT</annotation></semantics></math> represents the content of the partially decoded sequence, and <math alttext="P_{w}" class="ltx_Math" display="inline" id="S3.SS4.p1.15.m10.1"><semantics id="S3.SS4.p1.15.m10.1a"><msub id="S3.SS4.p1.15.m10.1.1" xref="S3.SS4.p1.15.m10.1.1.cmml"><mi id="S3.SS4.p1.15.m10.1.1.2" xref="S3.SS4.p1.15.m10.1.1.2.cmml">P</mi><mi id="S3.SS4.p1.15.m10.1.1.3" xref="S3.SS4.p1.15.m10.1.1.3.cmml">w</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS4.p1.15.m10.1b"><apply id="S3.SS4.p1.15.m10.1.1.cmml" xref="S3.SS4.p1.15.m10.1.1"><csymbol 
cd="ambiguous" id="S3.SS4.p1.15.m10.1.1.1.cmml" xref="S3.SS4.p1.15.m10.1.1">subscript</csymbol><ci id="S3.SS4.p1.15.m10.1.1.2.cmml" xref="S3.SS4.p1.15.m10.1.1.2">𝑃</ci><ci id="S3.SS4.p1.15.m10.1.1.3.cmml" xref="S3.SS4.p1.15.m10.1.1.3">𝑤</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS4.p1.15.m10.1c">P_{w}</annotation><annotation encoding="application/x-llamapun" id="S3.SS4.p1.15.m10.1d">italic_P start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT</annotation></semantics></math> is the probability distribution over the extended vocabulary. For details of the pointer network, see the original work <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib99" title="">99</a>]</cite>.</p> </div> </section> <section class="ltx_subsection" id="S3.SS5"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">3.5 </span>Enhanced Repetition Suppression Mechanism</h3> <div class="ltx_para" id="S3.SS5.p1"> <p class="ltx_p" id="S3.SS5.p1.2">Repetition is a common issue for sequence-to-sequence models, especially when generating multi-sentence text. The model in this paper adopts an enhanced repetition suppression mechanism to address this problem. 
On the one hand, the slave encoder generates an encoded feature vector every <math alttext="K" class="ltx_Math" display="inline" id="S3.SS5.p1.1.m1.1"><semantics id="S3.SS5.p1.1.m1.1a"><mi id="S3.SS5.p1.1.m1.1.1" xref="S3.SS5.p1.1.m1.1.1.cmml">K</mi><annotation-xml encoding="MathML-Content" id="S3.SS5.p1.1.m1.1b"><ci id="S3.SS5.p1.1.m1.1.1.cmml" xref="S3.SS5.p1.1.m1.1.1">𝐾</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS5.p1.1.m1.1c">K</annotation><annotation encoding="application/x-llamapun" id="S3.SS5.p1.1.m1.1d">italic_K</annotation></semantics></math> steps, which lets the decoder "remember" the content produced at earlier time steps and so avoid repeating it. On the other hand, the model uses a coverage mechanism, where the coverage vector <math alttext="c^{v}" class="ltx_Math" display="inline" id="S3.SS5.p1.2.m2.1"><semantics id="S3.SS5.p1.2.m2.1a"><msup id="S3.SS5.p1.2.m2.1.1" xref="S3.SS5.p1.2.m2.1.1.cmml"><mi id="S3.SS5.p1.2.m2.1.1.2" xref="S3.SS5.p1.2.m2.1.1.2.cmml">c</mi><mi id="S3.SS5.p1.2.m2.1.1.3" xref="S3.SS5.p1.2.m2.1.1.3.cmml">v</mi></msup><annotation-xml encoding="MathML-Content" id="S3.SS5.p1.2.m2.1b"><apply id="S3.SS5.p1.2.m2.1.1.cmml" xref="S3.SS5.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S3.SS5.p1.2.m2.1.1.1.cmml" xref="S3.SS5.p1.2.m2.1.1">superscript</csymbol><ci id="S3.SS5.p1.2.m2.1.1.2.cmml" xref="S3.SS5.p1.2.m2.1.1.2">𝑐</ci><ci id="S3.SS5.p1.2.m2.1.1.3.cmml" xref="S3.SS5.p1.2.m2.1.1.3">𝑣</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS5.p1.2.m2.1c">c^{v}</annotation><annotation encoding="application/x-llamapun" id="S3.SS5.p1.2.m2.1d">italic_c start_POSTSUPERSCRIPT italic_v end_POSTSUPERSCRIPT</annotation></semantics></math> is defined as the sum of the attention distributions over all previous decoder time steps:</p> <table class="ltx_equation ltx_eqn_table" id="S3.E15"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell 
ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="c_{i}^{v}=\sum_{i^{\prime}=0}^{i-1}a_{i^{\prime}}" class="ltx_Math" display="block" id="S3.E15.m1.1"><semantics id="S3.E15.m1.1a"><mrow id="S3.E15.m1.1.1" xref="S3.E15.m1.1.1.cmml"><msubsup id="S3.E15.m1.1.1.2" xref="S3.E15.m1.1.1.2.cmml"><mi id="S3.E15.m1.1.1.2.2.2" xref="S3.E15.m1.1.1.2.2.2.cmml">c</mi><mi id="S3.E15.m1.1.1.2.2.3" xref="S3.E15.m1.1.1.2.2.3.cmml">i</mi><mi id="S3.E15.m1.1.1.2.3" xref="S3.E15.m1.1.1.2.3.cmml">v</mi></msubsup><mo id="S3.E15.m1.1.1.1" rspace="0.111em" xref="S3.E15.m1.1.1.1.cmml">=</mo><mrow id="S3.E15.m1.1.1.3" xref="S3.E15.m1.1.1.3.cmml"><munderover id="S3.E15.m1.1.1.3.1" xref="S3.E15.m1.1.1.3.1.cmml"><mo id="S3.E15.m1.1.1.3.1.2.2" movablelimits="false" xref="S3.E15.m1.1.1.3.1.2.2.cmml">∑</mo><mrow id="S3.E15.m1.1.1.3.1.2.3" xref="S3.E15.m1.1.1.3.1.2.3.cmml"><msup id="S3.E15.m1.1.1.3.1.2.3.2" xref="S3.E15.m1.1.1.3.1.2.3.2.cmml"><mi id="S3.E15.m1.1.1.3.1.2.3.2.2" xref="S3.E15.m1.1.1.3.1.2.3.2.2.cmml">i</mi><mo id="S3.E15.m1.1.1.3.1.2.3.2.3" xref="S3.E15.m1.1.1.3.1.2.3.2.3.cmml">′</mo></msup><mo id="S3.E15.m1.1.1.3.1.2.3.1" xref="S3.E15.m1.1.1.3.1.2.3.1.cmml">=</mo><mn id="S3.E15.m1.1.1.3.1.2.3.3" xref="S3.E15.m1.1.1.3.1.2.3.3.cmml">0</mn></mrow><mrow id="S3.E15.m1.1.1.3.1.3" xref="S3.E15.m1.1.1.3.1.3.cmml"><mi id="S3.E15.m1.1.1.3.1.3.2" xref="S3.E15.m1.1.1.3.1.3.2.cmml">i</mi><mo id="S3.E15.m1.1.1.3.1.3.1" xref="S3.E15.m1.1.1.3.1.3.1.cmml">−</mo><mn id="S3.E15.m1.1.1.3.1.3.3" xref="S3.E15.m1.1.1.3.1.3.3.cmml">1</mn></mrow></munderover><msub id="S3.E15.m1.1.1.3.2" xref="S3.E15.m1.1.1.3.2.cmml"><mi id="S3.E15.m1.1.1.3.2.2" xref="S3.E15.m1.1.1.3.2.2.cmml">a</mi><msup id="S3.E15.m1.1.1.3.2.3" xref="S3.E15.m1.1.1.3.2.3.cmml"><mi id="S3.E15.m1.1.1.3.2.3.2" xref="S3.E15.m1.1.1.3.2.3.2.cmml">i</mi><mo id="S3.E15.m1.1.1.3.2.3.3" xref="S3.E15.m1.1.1.3.2.3.3.cmml">′</mo></msup></msub></mrow></mrow><annotation-xml encoding="MathML-Content" 
id="S3.E15.m1.1b"><apply id="S3.E15.m1.1.1.cmml" xref="S3.E15.m1.1.1"><eq id="S3.E15.m1.1.1.1.cmml" xref="S3.E15.m1.1.1.1"></eq><apply id="S3.E15.m1.1.1.2.cmml" xref="S3.E15.m1.1.1.2"><csymbol cd="ambiguous" id="S3.E15.m1.1.1.2.1.cmml" xref="S3.E15.m1.1.1.2">superscript</csymbol><apply id="S3.E15.m1.1.1.2.2.cmml" xref="S3.E15.m1.1.1.2"><csymbol cd="ambiguous" id="S3.E15.m1.1.1.2.2.1.cmml" xref="S3.E15.m1.1.1.2">subscript</csymbol><ci id="S3.E15.m1.1.1.2.2.2.cmml" xref="S3.E15.m1.1.1.2.2.2">𝑐</ci><ci id="S3.E15.m1.1.1.2.2.3.cmml" xref="S3.E15.m1.1.1.2.2.3">𝑖</ci></apply><ci id="S3.E15.m1.1.1.2.3.cmml" xref="S3.E15.m1.1.1.2.3">𝑣</ci></apply><apply id="S3.E15.m1.1.1.3.cmml" xref="S3.E15.m1.1.1.3"><apply id="S3.E15.m1.1.1.3.1.cmml" xref="S3.E15.m1.1.1.3.1"><csymbol cd="ambiguous" id="S3.E15.m1.1.1.3.1.1.cmml" xref="S3.E15.m1.1.1.3.1">superscript</csymbol><apply id="S3.E15.m1.1.1.3.1.2.cmml" xref="S3.E15.m1.1.1.3.1"><csymbol cd="ambiguous" id="S3.E15.m1.1.1.3.1.2.1.cmml" xref="S3.E15.m1.1.1.3.1">subscript</csymbol><sum id="S3.E15.m1.1.1.3.1.2.2.cmml" xref="S3.E15.m1.1.1.3.1.2.2"></sum><apply id="S3.E15.m1.1.1.3.1.2.3.cmml" xref="S3.E15.m1.1.1.3.1.2.3"><eq id="S3.E15.m1.1.1.3.1.2.3.1.cmml" xref="S3.E15.m1.1.1.3.1.2.3.1"></eq><apply id="S3.E15.m1.1.1.3.1.2.3.2.cmml" xref="S3.E15.m1.1.1.3.1.2.3.2"><csymbol cd="ambiguous" id="S3.E15.m1.1.1.3.1.2.3.2.1.cmml" xref="S3.E15.m1.1.1.3.1.2.3.2">superscript</csymbol><ci id="S3.E15.m1.1.1.3.1.2.3.2.2.cmml" xref="S3.E15.m1.1.1.3.1.2.3.2.2">𝑖</ci><ci id="S3.E15.m1.1.1.3.1.2.3.2.3.cmml" xref="S3.E15.m1.1.1.3.1.2.3.2.3">′</ci></apply><cn id="S3.E15.m1.1.1.3.1.2.3.3.cmml" type="integer" xref="S3.E15.m1.1.1.3.1.2.3.3">0</cn></apply></apply><apply id="S3.E15.m1.1.1.3.1.3.cmml" xref="S3.E15.m1.1.1.3.1.3"><minus id="S3.E15.m1.1.1.3.1.3.1.cmml" xref="S3.E15.m1.1.1.3.1.3.1"></minus><ci id="S3.E15.m1.1.1.3.1.3.2.cmml" xref="S3.E15.m1.1.1.3.1.3.2">𝑖</ci><cn id="S3.E15.m1.1.1.3.1.3.3.cmml" type="integer" 
xref="S3.E15.m1.1.1.3.1.3.3">1</cn></apply></apply><apply id="S3.E15.m1.1.1.3.2.cmml" xref="S3.E15.m1.1.1.3.2"><csymbol cd="ambiguous" id="S3.E15.m1.1.1.3.2.1.cmml" xref="S3.E15.m1.1.1.3.2">subscript</csymbol><ci id="S3.E15.m1.1.1.3.2.2.cmml" xref="S3.E15.m1.1.1.3.2.2">𝑎</ci><apply id="S3.E15.m1.1.1.3.2.3.cmml" xref="S3.E15.m1.1.1.3.2.3"><csymbol cd="ambiguous" id="S3.E15.m1.1.1.3.2.3.1.cmml" xref="S3.E15.m1.1.1.3.2.3">superscript</csymbol><ci id="S3.E15.m1.1.1.3.2.3.2.cmml" xref="S3.E15.m1.1.1.3.2.3.2">𝑖</ci><ci id="S3.E15.m1.1.1.3.2.3.3.cmml" xref="S3.E15.m1.1.1.3.2.3.3">′</ci></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.E15.m1.1c">c_{i}^{v}=\sum_{i^{\prime}=0}^{i-1}a_{i^{\prime}}</annotation><annotation encoding="application/x-llamapun" id="S3.E15.m1.1d">italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_v end_POSTSUPERSCRIPT = ∑ start_POSTSUBSCRIPT italic_i start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i - 1 end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(15)</span></td> </tr></tbody> </table> </div> <div class="ltx_para" id="S3.SS5.p2"> <p class="ltx_p" id="S3.SS5.p2.1">Next, the coverage vector is also used as an additional input in the attention mechanism formula. 
The attention formula is therefore updated to:</p> <table class="ltx_equation ltx_eqn_table" id="S3.E16"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="e_{ij}=v_{a}^{T}\tanh(W_{a}h_{i-1}^{d}+U_{a}h_{j}^{p}+W_{c}hc_{i}^{v})" class="ltx_Math" display="block" id="S3.E16.m1.2"><semantics id="S3.E16.m1.2a"><mrow id="S3.E16.m1.2.2" xref="S3.E16.m1.2.2.cmml"><msub id="S3.E16.m1.2.2.3" xref="S3.E16.m1.2.2.3.cmml"><mi id="S3.E16.m1.2.2.3.2" xref="S3.E16.m1.2.2.3.2.cmml">e</mi><mrow id="S3.E16.m1.2.2.3.3" xref="S3.E16.m1.2.2.3.3.cmml"><mi id="S3.E16.m1.2.2.3.3.2" xref="S3.E16.m1.2.2.3.3.2.cmml">i</mi><mo id="S3.E16.m1.2.2.3.3.1" xref="S3.E16.m1.2.2.3.3.1.cmml">⁢</mo><mi id="S3.E16.m1.2.2.3.3.3" xref="S3.E16.m1.2.2.3.3.3.cmml">j</mi></mrow></msub><mo id="S3.E16.m1.2.2.2" xref="S3.E16.m1.2.2.2.cmml">=</mo><mrow id="S3.E16.m1.2.2.1" xref="S3.E16.m1.2.2.1.cmml"><msubsup id="S3.E16.m1.2.2.1.3" xref="S3.E16.m1.2.2.1.3.cmml"><mi id="S3.E16.m1.2.2.1.3.2.2" xref="S3.E16.m1.2.2.1.3.2.2.cmml">v</mi><mi id="S3.E16.m1.2.2.1.3.2.3" xref="S3.E16.m1.2.2.1.3.2.3.cmml">a</mi><mi id="S3.E16.m1.2.2.1.3.3" xref="S3.E16.m1.2.2.1.3.3.cmml">T</mi></msubsup><mo id="S3.E16.m1.2.2.1.2" lspace="0.167em" xref="S3.E16.m1.2.2.1.2.cmml">⁢</mo><mrow id="S3.E16.m1.2.2.1.1.1" xref="S3.E16.m1.2.2.1.1.2.cmml"><mi id="S3.E16.m1.1.1" xref="S3.E16.m1.1.1.cmml">tanh</mi><mo id="S3.E16.m1.2.2.1.1.1a" xref="S3.E16.m1.2.2.1.1.2.cmml">⁡</mo><mrow id="S3.E16.m1.2.2.1.1.1.1" xref="S3.E16.m1.2.2.1.1.2.cmml"><mo id="S3.E16.m1.2.2.1.1.1.1.2" stretchy="false" xref="S3.E16.m1.2.2.1.1.2.cmml">(</mo><mrow id="S3.E16.m1.2.2.1.1.1.1.1" xref="S3.E16.m1.2.2.1.1.1.1.1.cmml"><mrow id="S3.E16.m1.2.2.1.1.1.1.1.2" xref="S3.E16.m1.2.2.1.1.1.1.1.2.cmml"><msub id="S3.E16.m1.2.2.1.1.1.1.1.2.2" xref="S3.E16.m1.2.2.1.1.1.1.1.2.2.cmml"><mi id="S3.E16.m1.2.2.1.1.1.1.1.2.2.2" 
xref="S3.E16.m1.2.2.1.1.1.1.1.2.2.2.cmml">W</mi><mi id="S3.E16.m1.2.2.1.1.1.1.1.2.2.3" xref="S3.E16.m1.2.2.1.1.1.1.1.2.2.3.cmml">a</mi></msub><mo id="S3.E16.m1.2.2.1.1.1.1.1.2.1" xref="S3.E16.m1.2.2.1.1.1.1.1.2.1.cmml">⁢</mo><msubsup id="S3.E16.m1.2.2.1.1.1.1.1.2.3" xref="S3.E16.m1.2.2.1.1.1.1.1.2.3.cmml"><mi id="S3.E16.m1.2.2.1.1.1.1.1.2.3.2.2" xref="S3.E16.m1.2.2.1.1.1.1.1.2.3.2.2.cmml">h</mi><mrow id="S3.E16.m1.2.2.1.1.1.1.1.2.3.2.3" xref="S3.E16.m1.2.2.1.1.1.1.1.2.3.2.3.cmml"><mi id="S3.E16.m1.2.2.1.1.1.1.1.2.3.2.3.2" xref="S3.E16.m1.2.2.1.1.1.1.1.2.3.2.3.2.cmml">i</mi><mo id="S3.E16.m1.2.2.1.1.1.1.1.2.3.2.3.1" xref="S3.E16.m1.2.2.1.1.1.1.1.2.3.2.3.1.cmml">−</mo><mn id="S3.E16.m1.2.2.1.1.1.1.1.2.3.2.3.3" xref="S3.E16.m1.2.2.1.1.1.1.1.2.3.2.3.3.cmml">1</mn></mrow><mi id="S3.E16.m1.2.2.1.1.1.1.1.2.3.3" xref="S3.E16.m1.2.2.1.1.1.1.1.2.3.3.cmml">d</mi></msubsup></mrow><mo id="S3.E16.m1.2.2.1.1.1.1.1.1" xref="S3.E16.m1.2.2.1.1.1.1.1.1.cmml">+</mo><mrow id="S3.E16.m1.2.2.1.1.1.1.1.3" xref="S3.E16.m1.2.2.1.1.1.1.1.3.cmml"><msub id="S3.E16.m1.2.2.1.1.1.1.1.3.2" xref="S3.E16.m1.2.2.1.1.1.1.1.3.2.cmml"><mi id="S3.E16.m1.2.2.1.1.1.1.1.3.2.2" xref="S3.E16.m1.2.2.1.1.1.1.1.3.2.2.cmml">U</mi><mi id="S3.E16.m1.2.2.1.1.1.1.1.3.2.3" xref="S3.E16.m1.2.2.1.1.1.1.1.3.2.3.cmml">a</mi></msub><mo id="S3.E16.m1.2.2.1.1.1.1.1.3.1" xref="S3.E16.m1.2.2.1.1.1.1.1.3.1.cmml">⁢</mo><msubsup id="S3.E16.m1.2.2.1.1.1.1.1.3.3" xref="S3.E16.m1.2.2.1.1.1.1.1.3.3.cmml"><mi id="S3.E16.m1.2.2.1.1.1.1.1.3.3.2.2" xref="S3.E16.m1.2.2.1.1.1.1.1.3.3.2.2.cmml">h</mi><mi id="S3.E16.m1.2.2.1.1.1.1.1.3.3.2.3" xref="S3.E16.m1.2.2.1.1.1.1.1.3.3.2.3.cmml">j</mi><mi id="S3.E16.m1.2.2.1.1.1.1.1.3.3.3" xref="S3.E16.m1.2.2.1.1.1.1.1.3.3.3.cmml">p</mi></msubsup></mrow><mo id="S3.E16.m1.2.2.1.1.1.1.1.1a" xref="S3.E16.m1.2.2.1.1.1.1.1.1.cmml">+</mo><mrow id="S3.E16.m1.2.2.1.1.1.1.1.4" xref="S3.E16.m1.2.2.1.1.1.1.1.4.cmml"><msub id="S3.E16.m1.2.2.1.1.1.1.1.4.2" xref="S3.E16.m1.2.2.1.1.1.1.1.4.2.cmml"><mi 
id="S3.E16.m1.2.2.1.1.1.1.1.4.2.2" xref="S3.E16.m1.2.2.1.1.1.1.1.4.2.2.cmml">W</mi><mi id="S3.E16.m1.2.2.1.1.1.1.1.4.2.3" xref="S3.E16.m1.2.2.1.1.1.1.1.4.2.3.cmml">c</mi></msub><mo id="S3.E16.m1.2.2.1.1.1.1.1.4.1" xref="S3.E16.m1.2.2.1.1.1.1.1.4.1.cmml">⁢</mo><mi id="S3.E16.m1.2.2.1.1.1.1.1.4.3" xref="S3.E16.m1.2.2.1.1.1.1.1.4.3.cmml">h</mi><mo id="S3.E16.m1.2.2.1.1.1.1.1.4.1a" xref="S3.E16.m1.2.2.1.1.1.1.1.4.1.cmml">⁢</mo><msubsup id="S3.E16.m1.2.2.1.1.1.1.1.4.4" xref="S3.E16.m1.2.2.1.1.1.1.1.4.4.cmml"><mi id="S3.E16.m1.2.2.1.1.1.1.1.4.4.2.2" xref="S3.E16.m1.2.2.1.1.1.1.1.4.4.2.2.cmml">c</mi><mi id="S3.E16.m1.2.2.1.1.1.1.1.4.4.2.3" xref="S3.E16.m1.2.2.1.1.1.1.1.4.4.2.3.cmml">i</mi><mi id="S3.E16.m1.2.2.1.1.1.1.1.4.4.3" xref="S3.E16.m1.2.2.1.1.1.1.1.4.4.3.cmml">v</mi></msubsup></mrow></mrow><mo id="S3.E16.m1.2.2.1.1.1.1.3" stretchy="false" xref="S3.E16.m1.2.2.1.1.2.cmml">)</mo></mrow></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.E16.m1.2b"><apply id="S3.E16.m1.2.2.cmml" xref="S3.E16.m1.2.2"><eq id="S3.E16.m1.2.2.2.cmml" xref="S3.E16.m1.2.2.2"></eq><apply id="S3.E16.m1.2.2.3.cmml" xref="S3.E16.m1.2.2.3"><csymbol cd="ambiguous" id="S3.E16.m1.2.2.3.1.cmml" xref="S3.E16.m1.2.2.3">subscript</csymbol><ci id="S3.E16.m1.2.2.3.2.cmml" xref="S3.E16.m1.2.2.3.2">𝑒</ci><apply id="S3.E16.m1.2.2.3.3.cmml" xref="S3.E16.m1.2.2.3.3"><times id="S3.E16.m1.2.2.3.3.1.cmml" xref="S3.E16.m1.2.2.3.3.1"></times><ci id="S3.E16.m1.2.2.3.3.2.cmml" xref="S3.E16.m1.2.2.3.3.2">𝑖</ci><ci id="S3.E16.m1.2.2.3.3.3.cmml" xref="S3.E16.m1.2.2.3.3.3">𝑗</ci></apply></apply><apply id="S3.E16.m1.2.2.1.cmml" xref="S3.E16.m1.2.2.1"><times id="S3.E16.m1.2.2.1.2.cmml" xref="S3.E16.m1.2.2.1.2"></times><apply id="S3.E16.m1.2.2.1.3.cmml" xref="S3.E16.m1.2.2.1.3"><csymbol cd="ambiguous" id="S3.E16.m1.2.2.1.3.1.cmml" xref="S3.E16.m1.2.2.1.3">superscript</csymbol><apply id="S3.E16.m1.2.2.1.3.2.cmml" xref="S3.E16.m1.2.2.1.3"><csymbol cd="ambiguous" id="S3.E16.m1.2.2.1.3.2.1.cmml" 
xref="S3.E16.m1.2.2.1.3">subscript</csymbol><ci id="S3.E16.m1.2.2.1.3.2.2.cmml" xref="S3.E16.m1.2.2.1.3.2.2">𝑣</ci><ci id="S3.E16.m1.2.2.1.3.2.3.cmml" xref="S3.E16.m1.2.2.1.3.2.3">𝑎</ci></apply><ci id="S3.E16.m1.2.2.1.3.3.cmml" xref="S3.E16.m1.2.2.1.3.3">𝑇</ci></apply><apply id="S3.E16.m1.2.2.1.1.2.cmml" xref="S3.E16.m1.2.2.1.1.1"><tanh id="S3.E16.m1.1.1.cmml" xref="S3.E16.m1.1.1"></tanh><apply id="S3.E16.m1.2.2.1.1.1.1.1.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1"><plus id="S3.E16.m1.2.2.1.1.1.1.1.1.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.1"></plus><apply id="S3.E16.m1.2.2.1.1.1.1.1.2.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.2"><times id="S3.E16.m1.2.2.1.1.1.1.1.2.1.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.2.1"></times><apply id="S3.E16.m1.2.2.1.1.1.1.1.2.2.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.2.2"><csymbol cd="ambiguous" id="S3.E16.m1.2.2.1.1.1.1.1.2.2.1.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.2.2">subscript</csymbol><ci id="S3.E16.m1.2.2.1.1.1.1.1.2.2.2.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.2.2.2">𝑊</ci><ci id="S3.E16.m1.2.2.1.1.1.1.1.2.2.3.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.2.2.3">𝑎</ci></apply><apply id="S3.E16.m1.2.2.1.1.1.1.1.2.3.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.2.3"><csymbol cd="ambiguous" id="S3.E16.m1.2.2.1.1.1.1.1.2.3.1.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.2.3">superscript</csymbol><apply id="S3.E16.m1.2.2.1.1.1.1.1.2.3.2.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.2.3"><csymbol cd="ambiguous" id="S3.E16.m1.2.2.1.1.1.1.1.2.3.2.1.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.2.3">subscript</csymbol><ci id="S3.E16.m1.2.2.1.1.1.1.1.2.3.2.2.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.2.3.2.2">ℎ</ci><apply id="S3.E16.m1.2.2.1.1.1.1.1.2.3.2.3.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.2.3.2.3"><minus id="S3.E16.m1.2.2.1.1.1.1.1.2.3.2.3.1.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.2.3.2.3.1"></minus><ci id="S3.E16.m1.2.2.1.1.1.1.1.2.3.2.3.2.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.2.3.2.3.2">𝑖</ci><cn id="S3.E16.m1.2.2.1.1.1.1.1.2.3.2.3.3.cmml" type="integer" xref="S3.E16.m1.2.2.1.1.1.1.1.2.3.2.3.3">1</cn></apply></apply><ci 
id="S3.E16.m1.2.2.1.1.1.1.1.2.3.3.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.2.3.3">𝑑</ci></apply></apply><apply id="S3.E16.m1.2.2.1.1.1.1.1.3.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.3"><times id="S3.E16.m1.2.2.1.1.1.1.1.3.1.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.3.1"></times><apply id="S3.E16.m1.2.2.1.1.1.1.1.3.2.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.3.2"><csymbol cd="ambiguous" id="S3.E16.m1.2.2.1.1.1.1.1.3.2.1.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.3.2">subscript</csymbol><ci id="S3.E16.m1.2.2.1.1.1.1.1.3.2.2.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.3.2.2">𝑈</ci><ci id="S3.E16.m1.2.2.1.1.1.1.1.3.2.3.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.3.2.3">𝑎</ci></apply><apply id="S3.E16.m1.2.2.1.1.1.1.1.3.3.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.3.3"><csymbol cd="ambiguous" id="S3.E16.m1.2.2.1.1.1.1.1.3.3.1.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.3.3">superscript</csymbol><apply id="S3.E16.m1.2.2.1.1.1.1.1.3.3.2.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.3.3"><csymbol cd="ambiguous" id="S3.E16.m1.2.2.1.1.1.1.1.3.3.2.1.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.3.3">subscript</csymbol><ci id="S3.E16.m1.2.2.1.1.1.1.1.3.3.2.2.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.3.3.2.2">ℎ</ci><ci id="S3.E16.m1.2.2.1.1.1.1.1.3.3.2.3.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.3.3.2.3">𝑗</ci></apply><ci id="S3.E16.m1.2.2.1.1.1.1.1.3.3.3.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.3.3.3">𝑝</ci></apply></apply><apply id="S3.E16.m1.2.2.1.1.1.1.1.4.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.4"><times id="S3.E16.m1.2.2.1.1.1.1.1.4.1.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.4.1"></times><apply id="S3.E16.m1.2.2.1.1.1.1.1.4.2.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.4.2"><csymbol cd="ambiguous" id="S3.E16.m1.2.2.1.1.1.1.1.4.2.1.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.4.2">subscript</csymbol><ci id="S3.E16.m1.2.2.1.1.1.1.1.4.2.2.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.4.2.2">𝑊</ci><ci id="S3.E16.m1.2.2.1.1.1.1.1.4.2.3.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.4.2.3">𝑐</ci></apply><ci id="S3.E16.m1.2.2.1.1.1.1.1.4.3.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.4.3">ℎ</ci><apply 
id="S3.E16.m1.2.2.1.1.1.1.1.4.4.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.4.4"><csymbol cd="ambiguous" id="S3.E16.m1.2.2.1.1.1.1.1.4.4.1.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.4.4">superscript</csymbol><apply id="S3.E16.m1.2.2.1.1.1.1.1.4.4.2.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.4.4"><csymbol cd="ambiguous" id="S3.E16.m1.2.2.1.1.1.1.1.4.4.2.1.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.4.4">subscript</csymbol><ci id="S3.E16.m1.2.2.1.1.1.1.1.4.4.2.2.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.4.4.2.2">𝑐</ci><ci id="S3.E16.m1.2.2.1.1.1.1.1.4.4.2.3.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.4.4.2.3">𝑖</ci></apply><ci id="S3.E16.m1.2.2.1.1.1.1.1.4.4.3.cmml" xref="S3.E16.m1.2.2.1.1.1.1.1.4.4.3">𝑣</ci></apply></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.E16.m1.2c">e_{ij}=v_{a}^{T}\tanh(W_{a}h_{i-1}^{d}+U_{a}h_{j}^{p}+W_{c}hc_{i}^{v})</annotation><annotation encoding="application/x-llamapun" id="S3.E16.m1.2d">italic_e start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT = italic_v start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT roman_tanh ( italic_W start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT italic_h start_POSTSUBSCRIPT italic_i - 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_d end_POSTSUPERSCRIPT + italic_U start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT italic_h start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p end_POSTSUPERSCRIPT + italic_W start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT italic_h italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_v end_POSTSUPERSCRIPT )</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(16)</span></td> </tr></tbody> </table> </div> <div class="ltx_para" id="S3.SS5.p3"> <p class="ltx_p" id="S3.SS5.p3.4">At the same 
time, we continue to define an additional coverage loss to penalize repetitive behavior. The formula for their loss function is written as:</p> <table class="ltx_equation ltx_eqn_table" id="S3.E17"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="L=\frac{1}{T}\sum_{i=0}^{T}\left(\Gamma_{i}+\lambda\sum_{j=0}^{T}\min(a_{ij},c% _{ij}^{v})\right)" class="ltx_Math" display="block" id="S3.E17.m1.2"><semantics id="S3.E17.m1.2a"><mrow id="S3.E17.m1.2.2" xref="S3.E17.m1.2.2.cmml"><mi id="S3.E17.m1.2.2.3" xref="S3.E17.m1.2.2.3.cmml">L</mi><mo id="S3.E17.m1.2.2.2" xref="S3.E17.m1.2.2.2.cmml">=</mo><mrow id="S3.E17.m1.2.2.1" xref="S3.E17.m1.2.2.1.cmml"><mfrac id="S3.E17.m1.2.2.1.3" xref="S3.E17.m1.2.2.1.3.cmml"><mn id="S3.E17.m1.2.2.1.3.2" xref="S3.E17.m1.2.2.1.3.2.cmml">1</mn><mi id="S3.E17.m1.2.2.1.3.3" xref="S3.E17.m1.2.2.1.3.3.cmml">T</mi></mfrac><mo id="S3.E17.m1.2.2.1.2" xref="S3.E17.m1.2.2.1.2.cmml">⁢</mo><mrow id="S3.E17.m1.2.2.1.1" xref="S3.E17.m1.2.2.1.1.cmml"><munderover id="S3.E17.m1.2.2.1.1.2" xref="S3.E17.m1.2.2.1.1.2.cmml"><mo id="S3.E17.m1.2.2.1.1.2.2.2" movablelimits="false" rspace="0em" xref="S3.E17.m1.2.2.1.1.2.2.2.cmml">∑</mo><mrow id="S3.E17.m1.2.2.1.1.2.2.3" xref="S3.E17.m1.2.2.1.1.2.2.3.cmml"><mi id="S3.E17.m1.2.2.1.1.2.2.3.2" xref="S3.E17.m1.2.2.1.1.2.2.3.2.cmml">i</mi><mo id="S3.E17.m1.2.2.1.1.2.2.3.1" xref="S3.E17.m1.2.2.1.1.2.2.3.1.cmml">=</mo><mn id="S3.E17.m1.2.2.1.1.2.2.3.3" xref="S3.E17.m1.2.2.1.1.2.2.3.3.cmml">0</mn></mrow><mi id="S3.E17.m1.2.2.1.1.2.3" xref="S3.E17.m1.2.2.1.1.2.3.cmml">T</mi></munderover><mrow id="S3.E17.m1.2.2.1.1.1.1" xref="S3.E17.m1.2.2.1.1.1.1.1.cmml"><mo id="S3.E17.m1.2.2.1.1.1.1.2" xref="S3.E17.m1.2.2.1.1.1.1.1.cmml">(</mo><mrow id="S3.E17.m1.2.2.1.1.1.1.1" xref="S3.E17.m1.2.2.1.1.1.1.1.cmml"><msub id="S3.E17.m1.2.2.1.1.1.1.1.4" xref="S3.E17.m1.2.2.1.1.1.1.1.4.cmml"><mi 
id="S3.E17.m1.2.2.1.1.1.1.1.4.2" mathvariant="normal" xref="S3.E17.m1.2.2.1.1.1.1.1.4.2.cmml">Γ</mi><mi id="S3.E17.m1.2.2.1.1.1.1.1.4.3" xref="S3.E17.m1.2.2.1.1.1.1.1.4.3.cmml">i</mi></msub><mo id="S3.E17.m1.2.2.1.1.1.1.1.3" xref="S3.E17.m1.2.2.1.1.1.1.1.3.cmml">+</mo><mrow id="S3.E17.m1.2.2.1.1.1.1.1.2" xref="S3.E17.m1.2.2.1.1.1.1.1.2.cmml"><mi id="S3.E17.m1.2.2.1.1.1.1.1.2.4" xref="S3.E17.m1.2.2.1.1.1.1.1.2.4.cmml">λ</mi><mo id="S3.E17.m1.2.2.1.1.1.1.1.2.3" xref="S3.E17.m1.2.2.1.1.1.1.1.2.3.cmml">⁢</mo><mrow id="S3.E17.m1.2.2.1.1.1.1.1.2.2" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.cmml"><munderover id="S3.E17.m1.2.2.1.1.1.1.1.2.2.3" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.cmml"><mo id="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.2.2" movablelimits="false" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.2.2.cmml">∑</mo><mrow id="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.2.3" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.2.3.cmml"><mi id="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.2.3.2" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.2.3.2.cmml">j</mi><mo id="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.2.3.1" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.2.3.1.cmml">=</mo><mn id="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.2.3.3" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.2.3.3.cmml">0</mn></mrow><mi id="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.3" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.3.cmml">T</mi></munderover><mrow id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.3.cmml"><mi id="S3.E17.m1.1.1" xref="S3.E17.m1.1.1.cmml">min</mi><mo id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2a" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.3.cmml">⁡</mo><mrow id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.3.cmml"><mo id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.3" stretchy="false" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.3.cmml">(</mo><msub id="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1" xref="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.cmml"><mi id="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.2" xref="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.2.cmml">a</mi><mrow id="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.3" 
xref="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.3.cmml"><mi id="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.3.2" xref="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.3.2.cmml">i</mi><mo id="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.3.1" xref="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.3.1.cmml">⁢</mo><mi id="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.3.3" xref="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml">j</mi></mrow></msub><mo id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.4" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.3.cmml">,</mo><msubsup id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.cmml"><mi id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.2.2" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.2.2.cmml">c</mi><mrow id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.2.3" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.2.3.cmml"><mi id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.2.3.2" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.2.3.2.cmml">i</mi><mo id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.2.3.1" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.2.3.1.cmml">⁢</mo><mi id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.2.3.3" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.2.3.3.cmml">j</mi></mrow><mi id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.3" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.3.cmml">v</mi></msubsup><mo id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.5" stretchy="false" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.3.cmml">)</mo></mrow></mrow></mrow></mrow></mrow><mo id="S3.E17.m1.2.2.1.1.1.1.3" xref="S3.E17.m1.2.2.1.1.1.1.1.cmml">)</mo></mrow></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.E17.m1.2b"><apply id="S3.E17.m1.2.2.cmml" xref="S3.E17.m1.2.2"><eq id="S3.E17.m1.2.2.2.cmml" xref="S3.E17.m1.2.2.2"></eq><ci id="S3.E17.m1.2.2.3.cmml" xref="S3.E17.m1.2.2.3">𝐿</ci><apply id="S3.E17.m1.2.2.1.cmml" xref="S3.E17.m1.2.2.1"><times id="S3.E17.m1.2.2.1.2.cmml" xref="S3.E17.m1.2.2.1.2"></times><apply id="S3.E17.m1.2.2.1.3.cmml" xref="S3.E17.m1.2.2.1.3"><divide id="S3.E17.m1.2.2.1.3.1.cmml" xref="S3.E17.m1.2.2.1.3"></divide><cn 
id="S3.E17.m1.2.2.1.3.2.cmml" type="integer" xref="S3.E17.m1.2.2.1.3.2">1</cn><ci id="S3.E17.m1.2.2.1.3.3.cmml" xref="S3.E17.m1.2.2.1.3.3">𝑇</ci></apply><apply id="S3.E17.m1.2.2.1.1.cmml" xref="S3.E17.m1.2.2.1.1"><apply id="S3.E17.m1.2.2.1.1.2.cmml" xref="S3.E17.m1.2.2.1.1.2"><csymbol cd="ambiguous" id="S3.E17.m1.2.2.1.1.2.1.cmml" xref="S3.E17.m1.2.2.1.1.2">superscript</csymbol><apply id="S3.E17.m1.2.2.1.1.2.2.cmml" xref="S3.E17.m1.2.2.1.1.2"><csymbol cd="ambiguous" id="S3.E17.m1.2.2.1.1.2.2.1.cmml" xref="S3.E17.m1.2.2.1.1.2">subscript</csymbol><sum id="S3.E17.m1.2.2.1.1.2.2.2.cmml" xref="S3.E17.m1.2.2.1.1.2.2.2"></sum><apply id="S3.E17.m1.2.2.1.1.2.2.3.cmml" xref="S3.E17.m1.2.2.1.1.2.2.3"><eq id="S3.E17.m1.2.2.1.1.2.2.3.1.cmml" xref="S3.E17.m1.2.2.1.1.2.2.3.1"></eq><ci id="S3.E17.m1.2.2.1.1.2.2.3.2.cmml" xref="S3.E17.m1.2.2.1.1.2.2.3.2">𝑖</ci><cn id="S3.E17.m1.2.2.1.1.2.2.3.3.cmml" type="integer" xref="S3.E17.m1.2.2.1.1.2.2.3.3">0</cn></apply></apply><ci id="S3.E17.m1.2.2.1.1.2.3.cmml" xref="S3.E17.m1.2.2.1.1.2.3">𝑇</ci></apply><apply id="S3.E17.m1.2.2.1.1.1.1.1.cmml" xref="S3.E17.m1.2.2.1.1.1.1"><plus id="S3.E17.m1.2.2.1.1.1.1.1.3.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.3"></plus><apply id="S3.E17.m1.2.2.1.1.1.1.1.4.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.4"><csymbol cd="ambiguous" id="S3.E17.m1.2.2.1.1.1.1.1.4.1.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.4">subscript</csymbol><ci id="S3.E17.m1.2.2.1.1.1.1.1.4.2.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.4.2">Γ</ci><ci id="S3.E17.m1.2.2.1.1.1.1.1.4.3.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.4.3">𝑖</ci></apply><apply id="S3.E17.m1.2.2.1.1.1.1.1.2.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2"><times id="S3.E17.m1.2.2.1.1.1.1.1.2.3.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.3"></times><ci id="S3.E17.m1.2.2.1.1.1.1.1.2.4.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.4">𝜆</ci><apply id="S3.E17.m1.2.2.1.1.1.1.1.2.2.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2"><apply id="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.3"><csymbol cd="ambiguous" 
id="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.1.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.3">superscript</csymbol><apply id="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.2.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.3"><csymbol cd="ambiguous" id="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.2.1.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.3">subscript</csymbol><sum id="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.2.2.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.2.2"></sum><apply id="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.2.3.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.2.3"><eq id="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.2.3.1.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.2.3.1"></eq><ci id="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.2.3.2.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.2.3.2">𝑗</ci><cn id="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.2.3.3.cmml" type="integer" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.2.3.3">0</cn></apply></apply><ci id="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.3.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.3.3">𝑇</ci></apply><apply id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.3.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2"><min id="S3.E17.m1.1.1.cmml" xref="S3.E17.m1.1.1"></min><apply id="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1">subscript</csymbol><ci id="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.2">𝑎</ci><apply id="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.3"><times id="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.3.1.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.3.1"></times><ci id="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.3.2.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.3.2">𝑖</ci><ci id="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.3.3">𝑗</ci></apply></apply><apply id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2"><csymbol cd="ambiguous" 
id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.1.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2">superscript</csymbol><apply id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.2.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2"><csymbol cd="ambiguous" id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.2.1.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2">subscript</csymbol><ci id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.2.2.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.2.2">𝑐</ci><apply id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.2.3.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.2.3"><times id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.2.3.1.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.2.3.1"></times><ci id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.2.3.2.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.2.3.2">𝑖</ci><ci id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.2.3.3.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.2.3.3">𝑗</ci></apply></apply><ci id="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.3.cmml" xref="S3.E17.m1.2.2.1.1.1.1.1.2.2.2.2.2.2.3">𝑣</ci></apply></apply></apply></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.E17.m1.2c">L=\frac{1}{T}\sum_{i=0}^{T}\left(\Gamma_{i}+\lambda\sum_{j=0}^{T}\min(a_{ij},c% _{ij}^{v})\right)</annotation><annotation encoding="application/x-llamapun" id="S3.E17.m1.2d">italic_L = divide start_ARG 1 end_ARG start_ARG italic_T end_ARG ∑ start_POSTSUBSCRIPT italic_i = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( roman_Γ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT + italic_λ ∑ start_POSTSUBSCRIPT italic_j = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT roman_min ( italic_a start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT , italic_c start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_v end_POSTSUPERSCRIPT ) )</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno 
ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(17)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S3.SS5.p3.3">where <math alttext="\lambda" class="ltx_Math" display="inline" id="S3.SS5.p3.1.m1.1"><semantics id="S3.SS5.p3.1.m1.1a"><mi id="S3.SS5.p3.1.m1.1.1" xref="S3.SS5.p3.1.m1.1.1.cmml">λ</mi><annotation-xml encoding="MathML-Content" id="S3.SS5.p3.1.m1.1b"><ci id="S3.SS5.p3.1.m1.1.1.cmml" xref="S3.SS5.p3.1.m1.1.1">𝜆</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS5.p3.1.m1.1c">\lambda</annotation><annotation encoding="application/x-llamapun" id="S3.SS5.p3.1.m1.1d">italic_λ</annotation></semantics></math> is a hyperparameter. <math alttext="i" class="ltx_Math" display="inline" id="S3.SS5.p3.2.m2.1"><semantics id="S3.SS5.p3.2.m2.1a"><mi id="S3.SS5.p3.2.m2.1.1" xref="S3.SS5.p3.2.m2.1.1.cmml">i</mi><annotation-xml encoding="MathML-Content" id="S3.SS5.p3.2.m2.1b"><ci id="S3.SS5.p3.2.m2.1.1.cmml" xref="S3.SS5.p3.2.m2.1.1">𝑖</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS5.p3.2.m2.1c">i</annotation><annotation encoding="application/x-llamapun" id="S3.SS5.p3.2.m2.1d">italic_i</annotation></semantics></math> and <math alttext="j" class="ltx_Math" display="inline" id="S3.SS5.p3.3.m3.1"><semantics id="S3.SS5.p3.3.m3.1a"><mi id="S3.SS5.p3.3.m3.1.1" xref="S3.SS5.p3.3.m3.1.1.cmml">j</mi><annotation-xml encoding="MathML-Content" id="S3.SS5.p3.3.m3.1b"><ci id="S3.SS5.p3.3.m3.1.1.cmml" xref="S3.SS5.p3.3.m3.1.1">𝑗</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS5.p3.3.m3.1c">j</annotation><annotation encoding="application/x-llamapun" id="S3.SS5.p3.3.m3.1d">italic_j</annotation></semantics></math> respectively represent the decoding time step and the position in the input sequence.</p> </div> </section> </section> <section class="ltx_section" id="S4"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">4 
</span>Experiments</h2> <section class="ltx_subsection" id="S4.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">4.1 </span>Data Source and Preprocessing</h3> <div class="ltx_para" id="S4.SS1.p1"> <p class="ltx_p" id="S4.SS1.p1.1">The experimental data for this paper come primarily from the Yizhuan Patent Retrieval and Analysis Database (https://www.patyee.com/), which includes patent data from most countries and is available for download and research use. The retrieval date range (by publication date) runs from January 1, 2015, to January 1, 2022, and covers China (including Hong Kong, Taiwan, and Macau). The patent types include invention applications, invention grants, utility models, design patents, and others, mainly covering five domains: water resources, artificial intelligence, fiber optics, finance, and agriculture. The patent status is restricted to valid and the patent language to Chinese. In total, we select 50,769 water resource patents, 32,939 artificial intelligence patents, 126,987 fiber optics patents, 18,758 finance patents, and 36,483 agriculture patents.</p> </div> <div class="ltx_para" id="S4.SS1.p2"> <p class="ltx_p" id="S4.SS1.p2.1">For data processing, this paper uses regular expressions to remove special characters, punctuation, spaces, and other special formatting. Web tags, image references, and similar non-text content embedded in the patent claims and specifications are also removed. 
The types of regular expression processing and corresponding expressions are shown in Table <a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S4.T1" title="Table 1 ‣ 4.1 Data Source and Preprocessing ‣ 4 Experiments ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_tag">1</span></a>:</p> </div> <figure class="ltx_table" id="S4.T1"> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_table">Table 1: </span>Regular expression processing types and their expressions</figcaption> <table class="ltx_tabular ltx_centering ltx_guessed_headers ltx_align_middle" id="S4.T1.1"> <thead class="ltx_thead"> <tr class="ltx_tr" id="S4.T1.1.1.1"> <th class="ltx_td ltx_align_left ltx_th ltx_th_column ltx_border_tt" id="S4.T1.1.1.1.1"><span class="ltx_text ltx_font_bold" id="S4.T1.1.1.1.1.1">Regular Expression</span></th> <th class="ltx_td ltx_align_left ltx_th ltx_th_column ltx_border_tt" id="S4.T1.1.1.1.2"><span class="ltx_text ltx_font_bold" id="S4.T1.1.1.1.2.1">Function</span></th> </tr> </thead> <tbody class="ltx_tbody"> <tr class="ltx_tr" id="S4.T1.1.2.1"> <td class="ltx_td ltx_align_left ltx_border_t" id="S4.T1.1.2.1.1"><span class="ltx_text ltx_font_typewriter" id="S4.T1.1.2.1.1.1">&lt;script[^&gt;]*?&gt;[\s\S]*?&lt;\/script&gt;</span></td> <td class="ltx_td ltx_align_left ltx_border_t" id="S4.T1.1.2.1.2">Handles web tags and similar formats</td> </tr> <tr class="ltx_tr" id="S4.T1.1.3.2"> <td class="ltx_td ltx_align_left" id="S4.T1.1.3.2.1"><span class="ltx_text ltx_font_typewriter" id="S4.T1.1.3.2.1.1">&lt;style[^&gt;]*?&gt;[\s\S]*?&lt;\/style&gt;</span></td> <td class="ltx_td" id="S4.T1.1.3.2.2"></td> </tr> <tr class="ltx_tr" id="S4.T1.1.4.3"> <td class="ltx_td ltx_align_left" id="S4.T1.1.4.3.1"><span class="ltx_text ltx_font_typewriter" id="S4.T1.1.4.3.1.1">&lt;(?!div|/div|p|/p|br)[^&gt;]*&gt;</span></td> <td class="ltx_td" id="S4.T1.1.4.3.2"></td> 
</tr> <tr class="ltx_tr" id="S4.T1.1.5.4"> <td class="ltx_td ltx_align_left" id="S4.T1.1.5.4.1"><span class="ltx_text ltx_font_typewriter" id="S4.T1.1.5.4.1.1">&lt;tr&gt;(.*?)&lt;/tr&gt;</span></td> <td class="ltx_td" id="S4.T1.1.5.4.2"></td> </tr> <tr class="ltx_tr" id="S4.T1.1.6.5"> <td class="ltx_td ltx_align_left" id="S4.T1.1.6.5.1"><span class="ltx_text ltx_font_typewriter" id="S4.T1.1.6.5.1.1">&lt;th&gt;(.*?)&lt;/th&gt;</span></td> <td class="ltx_td" id="S4.T1.1.6.5.2"></td> </tr> <tr class="ltx_tr" id="S4.T1.1.7.6"> <td class="ltx_td ltx_align_left" id="S4.T1.1.7.6.1"><span class="ltx_text ltx_font_typewriter" id="S4.T1.1.7.6.1.1">&lt;td&gt;(.*?)&lt;/td&gt;</span></td> <td class="ltx_td" id="S4.T1.1.7.6.2"></td> </tr> <tr class="ltx_tr" id="S4.T1.1.8.7"> <td class="ltx_td ltx_align_left" id="S4.T1.1.8.7.1"><span class="ltx_text ltx_font_typewriter" id="S4.T1.1.8.7.1.1">(?&lt;=&lt;title&gt;).*?(?=&lt;\/title&gt;)</span></td> <td class="ltx_td ltx_align_left" id="S4.T1.1.8.7.2">Processes titles</td> </tr> <tr class="ltx_tr" id="S4.T1.1.9.8"> <td class="ltx_td ltx_align_left" id="S4.T1.1.9.8.1"><span class="ltx_text ltx_font_typewriter" id="S4.T1.1.9.8.1.1">&lt;a.*?href=.*?&lt;\/a&gt;</span></td> <td class="ltx_td ltx_align_left" id="S4.T1.1.9.8.2">Handles image references and hyperlinks</td> </tr> <tr class="ltx_tr" id="S4.T1.1.10.9"> <td class="ltx_td ltx_align_left ltx_border_bb" id="S4.T1.1.10.9.1"><span class="ltx_text ltx_font_typewriter" id="S4.T1.1.10.9.1.1">\s*|\t|\r|\n</span></td> <td class="ltx_td ltx_align_left ltx_border_bb" id="S4.T1.1.10.9.2">Handles excess spaces and line breaks</td> </tr> </tbody> </table> </figure> <div class="ltx_para" id="S4.SS1.p3"> <p class="ltx_p" id="S4.SS1.p3.1">The data are split into training, test, and validation sets at a ratio of 8:1:1. Each downloaded data entry includes a title, publication number, abstract, specification text, and claims. 
An example of the patent data presentation form for the artificial intelligence domain is shown in Table <a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S4.T2" title="Table 2 ‣ 4.1 Data Source and Preprocessing ‣ 4 Experiments ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_tag">2</span></a>:</p> </div> <figure class="ltx_table" id="S4.T2"> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_table">Table 2: </span>An Example of a Patent Data Presentation Form</figcaption> <table class="ltx_tabular ltx_centering ltx_guessed_headers ltx_align_middle" id="S4.T2.1"> <thead class="ltx_thead"> <tr class="ltx_tr" id="S4.T2.1.1.1"> <th class="ltx_td ltx_align_justify ltx_align_top ltx_th ltx_th_column ltx_border_t" id="S4.T2.1.1.1.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T2.1.1.1.1.1"> <span class="ltx_p" id="S4.T2.1.1.1.1.1.1" style="width:65.0pt;"><span class="ltx_text ltx_font_bold" id="S4.T2.1.1.1.1.1.1.1">Attribute</span></span> </span> </th> <th class="ltx_td ltx_align_justify ltx_align_top ltx_th ltx_th_column ltx_border_t" id="S4.T2.1.1.1.2"> <span class="ltx_inline-block ltx_align_top" id="S4.T2.1.1.1.2.1"> <span class="ltx_p" id="S4.T2.1.1.1.2.1.1" style="width:346.9pt;"><span class="ltx_text ltx_font_bold" id="S4.T2.1.1.1.2.1.1.1">Content</span></span> </span> </th> </tr> </thead> <tbody class="ltx_tbody"> <tr class="ltx_tr" id="S4.T2.1.2.1"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S4.T2.1.2.1.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T2.1.2.1.1.1"> <span class="ltx_p" id="S4.T2.1.2.1.1.1.1" style="width:65.0pt;">Title</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S4.T2.1.2.1.2"> <span class="ltx_inline-block ltx_align_top" id="S4.T2.1.2.1.2.1"> <span class="ltx_p" id="S4.T2.1.2.1.2.1.1" style="width:346.9pt;">Task Scheduling Method 
and Device Based on Multiple GPUs</span> </span> </td> </tr> <tr class="ltx_tr" id="S4.T2.1.3.2"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S4.T2.1.3.2.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T2.1.3.2.1.1"> <span class="ltx_p" id="S4.T2.1.3.2.1.1.1" style="width:65.0pt;">Publication Number</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S4.T2.1.3.2.2"> <span class="ltx_inline-block ltx_align_top" id="S4.T2.1.3.2.2.1"> <span class="ltx_p" id="S4.T2.1.3.2.2.1.1" style="width:346.9pt;">CN113391905A</span> </span> </td> </tr> <tr class="ltx_tr" id="S4.T2.1.4.3"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S4.T2.1.4.3.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T2.1.4.3.1.1"> <span class="ltx_p" id="S4.T2.1.4.3.1.1.1" style="width:65.0pt;">Abstract</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S4.T2.1.4.3.2"> <span class="ltx_inline-block ltx_align_top" id="S4.T2.1.4.3.2.1"> <span class="ltx_p" id="S4.T2.1.4.3.2.1.1" style="width:346.9pt;">The invention discloses a task scheduling method and device based on multiple GPUs, including: allocating a minimum and maximum number of GPUs for different task types; loading tasks from the database into the task queue, distributing GPUs …</span> </span> </td> </tr> <tr class="ltx_tr" id="S4.T2.1.5.4"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S4.T2.1.5.4.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T2.1.5.4.1.1"> <span class="ltx_p" id="S4.T2.1.5.4.1.1.1" style="width:65.0pt;">Specification Text</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S4.T2.1.5.4.2"> <span class="ltx_inline-block ltx_align_top" id="S4.T2.1.5.4.2.1"> <span class="ltx_p" id="S4.T2.1.5.4.2.1.1" style="width:346.9pt;">The invention aims to provide a task scheduling method and device based on multiple GPUs. 
Technical solution: The invention provides a task scheduling method that includes: determining the priority of task types, allocating a minimum and maximum number of GPUs for different task types…</span> </span> </td> </tr> <tr class="ltx_tr" id="S4.T2.1.6.5"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_b ltx_border_t" id="S4.T2.1.6.5.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T2.1.6.5.1.1"> <span class="ltx_p" id="S4.T2.1.6.5.1.1.1" style="width:65.0pt;">Claims</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_b ltx_border_t" id="S4.T2.1.6.5.2"> <span class="ltx_inline-block ltx_align_top" id="S4.T2.1.6.5.2.1"> <span class="ltx_p" id="S4.T2.1.6.5.2.1.1" style="width:346.9pt;">The second scheduling unit is for, if the number of GPUs in use has reached the minimum GPU requirement for each task type, or all tasks of a task type have been satisfied, and there are still tasks in the task queue and available GPUs…</span> </span> </td> </tr> </tbody> </table> </figure> </section> <section class="ltx_subsection" id="S4.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">4.2 </span>Model parameter settings and Metrics</h3> <div class="ltx_para" id="S4.SS2.p1"> <p class="ltx_p" id="S4.SS2.p1.1">For model training, the batch size is set to 32, the dropout probability to 0.5, and the initial learning rate to 0.001. The hidden layer dimensions of the master encoder, the slave encoder, and the decoder are each set to 256. The maximum text input length is set to 500, the maximum text output length is set to 100, and the optimizer is Adam. The maximum vocabulary size is set to 100,000, and OOV words are replaced with &lt;UNK&gt;. In this paper, we do not use pre-trained word embeddings, but learn them from scratch during training. The dimension of the word embeddings is set to 256. 
Network parameters are randomly initialized from a uniform distribution over [-0.05, 0.05]. Every <math alttext="K" class="ltx_Math" display="inline" id="S4.SS2.p1.1.m1.1"><semantics id="S4.SS2.p1.1.m1.1a"><mi id="S4.SS2.p1.1.m1.1.1" xref="S4.SS2.p1.1.m1.1.1.cmml">K</mi><annotation-xml encoding="MathML-Content" id="S4.SS2.p1.1.m1.1b"><ci id="S4.SS2.p1.1.m1.1.1.cmml" xref="S4.SS2.p1.1.m1.1.1">𝐾</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.p1.1.m1.1c">K</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.p1.1.m1.1d">italic_K</annotation></semantics></math> steps from the encoder was set to 100.</p> </div> <div class="ltx_para" id="S4.SS2.p2"> <p class="ltx_p" id="S4.SS2.p2.1">In text generation, the system-generated abstracts are compared with the human-written abstracts of the patents themselves, and generation quality is evaluated by measuring the overlap between the two. In this paper, ROUGE is used as the measure of the model’s effectiveness.</p> </div> </section> <section class="ltx_subsection" id="S4.SS3"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">4.3 </span>Baselines</h3> <div class="ltx_para" id="S4.SS3.p1"> <p class="ltx_p" id="S4.SS3.p1.1">In this paper, we compare our model with other current models in the same field. SuRuNNer <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib78" title="">78</a>]</cite> is based on a recurrent neural network sequence model combined with an attention mechanism <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib99" title="">99</a>]</cite> for extractive summarization. TextRank <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib71" title="">71</a>]</cite> uses a graph-based text processing model with two innovative unsupervised keyword extraction methods. 
MedWriter <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib84" title="">84</a>]</cite> uses a knowledge-aware text generation model capable of learning graph-level representations. RLCPRA <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib139" title="">139</a>]</cite> uses reinforcement learning for patent text generation. STNLTP <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib138" title="">138</a>]</cite> uses an integrated strategy for text generation. IMHAM <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib45" title="">45</a>]</cite> applies improved multi-head attention to the encoder and decoder, using semantic similarity to select the most important documents together with pointer network optimization; it is among the current state-of-the-art patent summary generation models.</p> </div> </section> <section class="ltx_subsection" id="S4.SS4"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">4.4 </span>Comparison</h3> <div class="ltx_para" id="S4.SS4.p1"> <p class="ltx_p" id="S4.SS4.p1.1">Table <a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S4.T3" title="Table 3 ‣ 4.4 Comparison ‣ 4 Experiments ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_tag">3</span></a> reports the ROUGE scores of each model. MSEA performs better than TextRank, SuRuNNer, MedWriter, and IMHAM across all metrics. 
On Rouge-1, Rouge-2, and Rouge-L, MSEA scores 0.006, 0.005, and 0.005 higher, respectively, than the strongest baseline, IMHAM, and 0.109, 0.117, and 0.080 higher than TextRank. The Transformer performs worse than MedWriter and IMHAM, possibly because those two models incorporate more features related to patent structure and use BERT. MSEA may outperform RLCPRA because RLCPRA uses reinforcement learning to generate text specifically for the patent specifications in order to address OOV and repetitive generation issues. MSEA may outperform STNLTP because STNLTP employs an integration strategy that generates text solely from the patent specifications. Overall, MSEA achieves the best results on all three metrics.</p> </div> <figure class="ltx_table" id="S4.T3"> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table">Table 3: </span>ROUGE performance of the models on the patent text dataset</figcaption> <table class="ltx_tabular ltx_centering ltx_guessed_headers ltx_align_middle" id="S4.T3.1"> <thead class="ltx_thead"> <tr class="ltx_tr" id="S4.T3.1.1.1"> <th class="ltx_td ltx_align_left ltx_th ltx_th_column ltx_th_row ltx_border_tt" id="S4.T3.1.1.1.1"><span class="ltx_text ltx_font_bold" id="S4.T3.1.1.1.1.1">Model</span></th> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_tt" id="S4.T3.1.1.1.2"><span class="ltx_text ltx_font_bold" id="S4.T3.1.1.1.2.1">Rouge-1</span></th> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_tt" id="S4.T3.1.1.1.3"><span class="ltx_text ltx_font_bold" id="S4.T3.1.1.1.3.1">Rouge-2</span></th> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_tt" id="S4.T3.1.1.1.4"><span class="ltx_text ltx_font_bold" id="S4.T3.1.1.1.4.1">Rouge-L</span></th> </tr> </thead> <tbody class="ltx_tbody"> <tr 
class="ltx_tr" id="S4.T3.1.2.1"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_t" id="S4.T3.1.2.1.1">TextRank <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib71" title="">71</a>]</cite> </th> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T3.1.2.1.2">0.432</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T3.1.2.1.3">0.235</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T3.1.2.1.4">0.367</td> </tr> <tr class="ltx_tr" id="S4.T3.1.3.2"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row" id="S4.T3.1.3.2.1">SuRuNNer (2017) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib78" title="">78</a>]</cite> </th> <td class="ltx_td ltx_align_center" id="S4.T3.1.3.2.2">0.482</td> <td class="ltx_td ltx_align_center" id="S4.T3.1.3.2.3">0.293</td> <td class="ltx_td ltx_align_center" id="S4.T3.1.3.2.4">0.393</td> </tr> <tr class="ltx_tr" id="S4.T3.1.4.3"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row" id="S4.T3.1.4.3.1">Transformer (2017)</th> <td class="ltx_td ltx_align_center" id="S4.T3.1.4.3.2">0.491</td> <td class="ltx_td ltx_align_center" id="S4.T3.1.4.3.3">0.316</td> <td class="ltx_td ltx_align_center" id="S4.T3.1.4.3.4">0.402</td> </tr> <tr class="ltx_tr" id="S4.T3.1.5.4"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row" id="S4.T3.1.5.4.1">MedWriter (2020) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib84" title="">84</a>]</cite> </th> <td class="ltx_td ltx_align_center" id="S4.T3.1.5.4.2">0.518</td> <td class="ltx_td ltx_align_center" id="S4.T3.1.5.4.3">0.339</td> <td class="ltx_td ltx_align_center" id="S4.T3.1.5.4.4">0.419</td> </tr> <tr class="ltx_tr" id="S4.T3.1.6.5"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row" id="S4.T3.1.6.5.1">RLCPRA (2021) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" 
href="https://arxiv.org/html/2411.14072v1#bib.bib139" title="">139</a>]</cite> </th> <td class="ltx_td ltx_align_center" id="S4.T3.1.6.5.2">0.521</td> <td class="ltx_td ltx_align_center" id="S4.T3.1.6.5.3">0.342</td> <td class="ltx_td ltx_align_center" id="S4.T3.1.6.5.4">0.424</td> </tr> <tr class="ltx_tr" id="S4.T3.1.7.6"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row" id="S4.T3.1.7.6.1">STNLTP (2022) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib138" title="">138</a>]</cite> </th> <td class="ltx_td ltx_align_center" id="S4.T3.1.7.6.2">0.528</td> <td class="ltx_td ltx_align_center" id="S4.T3.1.7.6.3">0.344</td> <td class="ltx_td ltx_align_center" id="S4.T3.1.7.6.4">0.431</td> </tr> <tr class="ltx_tr" id="S4.T3.1.8.7"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row" id="S4.T3.1.8.7.1">IMHAM (2023) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#bib.bib45" title="">45</a>]</cite> </th> <td class="ltx_td ltx_align_center" id="S4.T3.1.8.7.2">0.535</td> <td class="ltx_td ltx_align_center" id="S4.T3.1.8.7.3">0.347</td> <td class="ltx_td ltx_align_center" id="S4.T3.1.8.7.4">0.442</td> </tr> <tr class="ltx_tr" id="S4.T3.1.9.8"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_bb" id="S4.T3.1.9.8.1">MSEA</th> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T3.1.9.8.2">0.541</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T3.1.9.8.3">0.352</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T3.1.9.8.4">0.447</td> </tr> </tbody> </table> </figure> </section> <section class="ltx_subsection" id="S4.SS5"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">4.5 </span>Analysis of Differences between Using Specifications and Claims Text in Patents</h3> <div class="ltx_para" id="S4.SS5.p1"> <p class="ltx_p" id="S4.SS5.p1.1">As shown in Table <a class="ltx_ref" 
href="https://arxiv.org/html/2411.14072v1#S4.T4" title="Table 4 ‣ 4.5 Analysis of Differences between Using Specifications and Claims Text in Patents ‣ 4 Experiments ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_tag">4</span></a>, the performance of the MSEA model differs depending on whether the specifications text, the claims text, or both are used. The model achieves its best results when both texts are used together. Using only the specifications text or only the claims text gives lower scores on all three metrics, which supports combining the two inputs in the MSEA model.</p> </div> <figure class="ltx_table" id="S4.T4"> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table">Table 4: </span>ROUGE performance of the model using the patent specifications text, the claims text, or both</figcaption> <table class="ltx_tabular ltx_centering ltx_guessed_headers ltx_align_middle" id="S4.T4.1"> <thead class="ltx_thead"> <tr class="ltx_tr" id="S4.T4.1.1.1"> <th class="ltx_td ltx_align_left ltx_th ltx_th_column ltx_th_row ltx_border_tt" id="S4.T4.1.1.1.1"><span class="ltx_text ltx_font_bold" id="S4.T4.1.1.1.1.1">Use of Specifications or Claims Text</span></th> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_tt" id="S4.T4.1.1.1.2"><span class="ltx_text ltx_font_bold" id="S4.T4.1.1.1.2.1">Rouge-1</span></th> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_tt" id="S4.T4.1.1.1.3"><span class="ltx_text ltx_font_bold" id="S4.T4.1.1.1.3.1">Rouge-2</span></th> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_tt" id="S4.T4.1.1.1.4"><span class="ltx_text ltx_font_bold" id="S4.T4.1.1.1.4.1">Rouge-L</span></th> </tr> </thead> <tbody class="ltx_tbody"> <tr class="ltx_tr" id="S4.T4.1.2.1"> <th 
class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_t" id="S4.T4.1.2.1.1">Specifications Only</th> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T4.1.2.1.2">0.537</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T4.1.2.1.3">0.348</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T4.1.2.1.4">0.443</td> </tr> <tr class="ltx_tr" id="S4.T4.1.3.2"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row" id="S4.T4.1.3.2.1">Claims Only</th> <td class="ltx_td ltx_align_center" id="S4.T4.1.3.2.2">0.534</td> <td class="ltx_td ltx_align_center" id="S4.T4.1.3.2.3">0.345</td> <td class="ltx_td ltx_align_center" id="S4.T4.1.3.2.4">0.444</td> </tr> <tr class="ltx_tr" id="S4.T4.1.4.3"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_bb" id="S4.T4.1.4.3.1">Both Used</th> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T4.1.4.3.2">0.541</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T4.1.4.3.3">0.352</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T4.1.4.3.4">0.447</td> </tr> </tbody> </table> </figure> </section> <section class="ltx_subsection" id="S4.SS6"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">4.6 </span>Sensitivity Analysis Under Different Decoding Lengths</h3> <div class="ltx_para" id="S4.SS6.p1"> <p class="ltx_p" id="S4.SS6.p1.1">To assess the impact of decoding length on the performance of the MSEA model, several decoding lengths <math alttext="K=\{20,30,50,100,150,200\}" class="ltx_Math" display="inline" id="S4.SS6.p1.1.m1.6"><semantics id="S4.SS6.p1.1.m1.6a"><mrow id="S4.SS6.p1.1.m1.6.7" xref="S4.SS6.p1.1.m1.6.7.cmml"><mi id="S4.SS6.p1.1.m1.6.7.2" xref="S4.SS6.p1.1.m1.6.7.2.cmml">K</mi><mo id="S4.SS6.p1.1.m1.6.7.1" xref="S4.SS6.p1.1.m1.6.7.1.cmml">=</mo><mrow id="S4.SS6.p1.1.m1.6.7.3.2" xref="S4.SS6.p1.1.m1.6.7.3.1.cmml"><mo id="S4.SS6.p1.1.m1.6.7.3.2.1" stretchy="false" 
xref="S4.SS6.p1.1.m1.6.7.3.1.cmml">{</mo><mn id="S4.SS6.p1.1.m1.1.1" xref="S4.SS6.p1.1.m1.1.1.cmml">20</mn><mo id="S4.SS6.p1.1.m1.6.7.3.2.2" xref="S4.SS6.p1.1.m1.6.7.3.1.cmml">,</mo><mn id="S4.SS6.p1.1.m1.2.2" xref="S4.SS6.p1.1.m1.2.2.cmml">30</mn><mo id="S4.SS6.p1.1.m1.6.7.3.2.3" xref="S4.SS6.p1.1.m1.6.7.3.1.cmml">,</mo><mn id="S4.SS6.p1.1.m1.3.3" xref="S4.SS6.p1.1.m1.3.3.cmml">50</mn><mo id="S4.SS6.p1.1.m1.6.7.3.2.4" xref="S4.SS6.p1.1.m1.6.7.3.1.cmml">,</mo><mn id="S4.SS6.p1.1.m1.4.4" xref="S4.SS6.p1.1.m1.4.4.cmml">100</mn><mo id="S4.SS6.p1.1.m1.6.7.3.2.5" xref="S4.SS6.p1.1.m1.6.7.3.1.cmml">,</mo><mn id="S4.SS6.p1.1.m1.5.5" xref="S4.SS6.p1.1.m1.5.5.cmml">150</mn><mo id="S4.SS6.p1.1.m1.6.7.3.2.6" xref="S4.SS6.p1.1.m1.6.7.3.1.cmml">,</mo><mn id="S4.SS6.p1.1.m1.6.6" xref="S4.SS6.p1.1.m1.6.6.cmml">200</mn><mo id="S4.SS6.p1.1.m1.6.7.3.2.7" stretchy="false" xref="S4.SS6.p1.1.m1.6.7.3.1.cmml">}</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S4.SS6.p1.1.m1.6b"><apply id="S4.SS6.p1.1.m1.6.7.cmml" xref="S4.SS6.p1.1.m1.6.7"><eq id="S4.SS6.p1.1.m1.6.7.1.cmml" xref="S4.SS6.p1.1.m1.6.7.1"></eq><ci id="S4.SS6.p1.1.m1.6.7.2.cmml" xref="S4.SS6.p1.1.m1.6.7.2">𝐾</ci><set id="S4.SS6.p1.1.m1.6.7.3.1.cmml" xref="S4.SS6.p1.1.m1.6.7.3.2"><cn id="S4.SS6.p1.1.m1.1.1.cmml" type="integer" xref="S4.SS6.p1.1.m1.1.1">20</cn><cn id="S4.SS6.p1.1.m1.2.2.cmml" type="integer" xref="S4.SS6.p1.1.m1.2.2">30</cn><cn id="S4.SS6.p1.1.m1.3.3.cmml" type="integer" xref="S4.SS6.p1.1.m1.3.3">50</cn><cn id="S4.SS6.p1.1.m1.4.4.cmml" type="integer" xref="S4.SS6.p1.1.m1.4.4">100</cn><cn id="S4.SS6.p1.1.m1.5.5.cmml" type="integer" xref="S4.SS6.p1.1.m1.5.5">150</cn><cn id="S4.SS6.p1.1.m1.6.6.cmml" type="integer" xref="S4.SS6.p1.1.m1.6.6">200</cn></set></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS6.p1.1.m1.6c">K=\{20,30,50,100,150,200\}</annotation><annotation encoding="application/x-llamapun" id="S4.SS6.p1.1.m1.6d">italic_K = { 20 , 30 , 50 , 100 , 150 , 200 
}</annotation></semantics></math> were set. A decoding length of 200 means that the entire output sequence is decoded in a single pass. Table <a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S4.T5" title="Table 5 ‣ 4.6 Sensitivity Analysis Under Different Decoding Lengths ‣ 4 Experiments ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_tag">5</span></a> reports the ROUGE scores of the model at each decoding length.</p> </div> <figure class="ltx_table" id="S4.T5"> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table">Table 5: </span>ROUGE performance of different decoding lengths on the patent text dataset</figcaption> <table class="ltx_tabular ltx_centering ltx_guessed_headers ltx_align_middle" id="S4.T5.1"> <thead class="ltx_thead"> <tr class="ltx_tr" id="S4.T5.1.1.1"> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_th_row ltx_border_tt" id="S4.T5.1.1.1.1"><span class="ltx_text ltx_font_bold" id="S4.T5.1.1.1.1.1">Decoding Length</span></th> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_tt" id="S4.T5.1.1.1.2"><span class="ltx_text ltx_font_bold" id="S4.T5.1.1.1.2.1">Rouge-1</span></th> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_tt" id="S4.T5.1.1.1.3"><span class="ltx_text ltx_font_bold" id="S4.T5.1.1.1.3.1">Rouge-2</span></th> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_tt" id="S4.T5.1.1.1.4"><span class="ltx_text ltx_font_bold" id="S4.T5.1.1.1.4.1">Rouge-L</span></th> </tr> </thead> <tbody class="ltx_tbody"> <tr class="ltx_tr" id="S4.T5.1.2.1"> <th class="ltx_td ltx_align_center ltx_th ltx_th_row ltx_border_t" id="S4.T5.1.2.1.1">20</th> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T5.1.2.1.2">0.533</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T5.1.2.1.3">0.343</td> <td class="ltx_td 
ltx_align_center ltx_border_t" id="S4.T5.1.2.1.4">0.437</td> </tr> <tr class="ltx_tr" id="S4.T5.1.3.2"> <th class="ltx_td ltx_align_center ltx_th ltx_th_row" id="S4.T5.1.3.2.1">30</th> <td class="ltx_td ltx_align_center" id="S4.T5.1.3.2.2">0.536</td> <td class="ltx_td ltx_align_center" id="S4.T5.1.3.2.3">0.347</td> <td class="ltx_td ltx_align_center" id="S4.T5.1.3.2.4">0.444</td> </tr> <tr class="ltx_tr" id="S4.T5.1.4.3"> <th class="ltx_td ltx_align_center ltx_th ltx_th_row" id="S4.T5.1.4.3.1">50</th> <td class="ltx_td ltx_align_center" id="S4.T5.1.4.3.2">0.538</td> <td class="ltx_td ltx_align_center" id="S4.T5.1.4.3.3">0.346</td> <td class="ltx_td ltx_align_center" id="S4.T5.1.4.3.4">0.443</td> </tr> <tr class="ltx_tr" id="S4.T5.1.5.4"> <th class="ltx_td ltx_align_center ltx_th ltx_th_row" id="S4.T5.1.5.4.1">100</th> <td class="ltx_td ltx_align_center" id="S4.T5.1.5.4.2">0.540</td> <td class="ltx_td ltx_align_center" id="S4.T5.1.5.4.3">0.349</td> <td class="ltx_td ltx_align_center" id="S4.T5.1.5.4.4">0.446</td> </tr> <tr class="ltx_tr" id="S4.T5.1.6.5"> <th class="ltx_td ltx_align_center ltx_th ltx_th_row" id="S4.T5.1.6.5.1">150</th> <td class="ltx_td ltx_align_center" id="S4.T5.1.6.5.2">0.537</td> <td class="ltx_td ltx_align_center" id="S4.T5.1.6.5.3">0.351</td> <td class="ltx_td ltx_align_center" id="S4.T5.1.6.5.4">0.441</td> </tr> <tr class="ltx_tr" id="S4.T5.1.7.6"> <th class="ltx_td ltx_align_center ltx_th ltx_th_row ltx_border_bb" id="S4.T5.1.7.6.1">200</th> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T5.1.7.6.2">0.531</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T5.1.7.6.3">0.343</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T5.1.7.6.4">0.435</td> </tr> </tbody> </table> </figure> <div class="ltx_para" id="S4.SS6.p2"> <p class="ltx_p" id="S4.SS6.p2.1">As can be seen from Table <a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S4.T5" title="Table 5 ‣ 4.6 Sensitivity Analysis Under Different Decoding 
Lengths ‣ 4 Experiments ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_tag">5</span></a>, performance decreases when the decoding length is too short (K = 20) and is also lower when the whole sequence is decoded at once (K = 200). Rouge-1 and Rouge-L peak at K = 100 and Rouge-2 peaks at K = 150, so a decoding length between 100 and 150 gives the best results.</p> </div> </section> <section class="ltx_subsection" id="S4.SS7"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">4.7 </span>Sensitivity Analysis of Hidden Layers</h3> <div class="ltx_para" id="S4.SS7.p1"> <p class="ltx_p" id="S4.SS7.p1.1">The hidden layer dimensions of the master encoder, slave encoder, and decoder affect the experimental results, because these layers carry crucial feature information. Table 6 shows the effect of the hidden layer size of each of the three components on the results.</p> </div> <figure class="ltx_table" id="S4.T6"> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table">Table 6: </span>Effect of hidden layer size of master encoder, slave encoder, and decoder on experimental metrics Rouge-1, Rouge-2, and Rouge-L</figcaption> <table class="ltx_tabular ltx_centering ltx_guessed_headers ltx_align_middle" id="S4.T6.1"> <thead class="ltx_thead"> <tr class="ltx_tr" id="S4.T6.1.1.1"> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_tt" id="S4.T6.1.1.1.1"><span class="ltx_text ltx_font_bold" id="S4.T6.1.1.1.1.1">Rouge Metric</span></th> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_tt" colspan="3" id="S4.T6.1.1.1.2"><span class="ltx_text ltx_font_bold" id="S4.T6.1.1.1.2.1">Master Encoder</span></th> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_tt" colspan="3" id="S4.T6.1.1.1.3"><span class="ltx_text ltx_font_bold" id="S4.T6.1.1.1.3.1">Slave Encoder</span></th> <th class="ltx_td ltx_align_center 
ltx_th ltx_th_column ltx_border_tt" colspan="3" id="S4.T6.1.1.1.4"><span class="ltx_text ltx_font_bold" id="S4.T6.1.1.1.4.1">Decoder</span></th> </tr> </thead> <tbody class="ltx_tbody"> <tr class="ltx_tr" id="S4.T6.1.2.1"> <td class="ltx_td ltx_border_t" id="S4.T6.1.2.1.1"></td> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_t" id="S4.T6.1.2.1.2">128</th> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_t" id="S4.T6.1.2.1.3">256</th> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_t" id="S4.T6.1.2.1.4">512</th> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_t" id="S4.T6.1.2.1.5">128</th> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_t" id="S4.T6.1.2.1.6">256</th> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_t" id="S4.T6.1.2.1.7">512</th> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_t" id="S4.T6.1.2.1.8">128</th> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_t" id="S4.T6.1.2.1.9">256</th> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_t" id="S4.T6.1.2.1.10">512</th> </tr> <tr class="ltx_tr" id="S4.T6.1.3.2"> <td class="ltx_td ltx_align_center" id="S4.T6.1.3.2.1"><span class="ltx_text ltx_font_bold" id="S4.T6.1.3.2.1.1">Rouge-1</span></td> <td class="ltx_td ltx_align_center" id="S4.T6.1.3.2.2">0.535</td> <td class="ltx_td ltx_align_center" id="S4.T6.1.3.2.3">0.537</td> <td class="ltx_td ltx_align_center" id="S4.T6.1.3.2.4">0.532</td> <td class="ltx_td ltx_align_center" id="S4.T6.1.3.2.5">0.537</td> <td class="ltx_td ltx_align_center" id="S4.T6.1.3.2.6">0.541</td> <td class="ltx_td ltx_align_center" id="S4.T6.1.3.2.7">0.534</td> <td class="ltx_td ltx_align_center" id="S4.T6.1.3.2.8">0.539</td> <td class="ltx_td ltx_align_center" id="S4.T6.1.3.2.9">0.541</td> <td class="ltx_td ltx_align_center" id="S4.T6.1.3.2.10">0.535</td> </tr> <tr class="ltx_tr" id="S4.T6.1.4.3"> <td class="ltx_td 
ltx_align_center" id="S4.T6.1.4.3.1"><span class="ltx_text ltx_font_bold" id="S4.T6.1.4.3.1.1">Rouge-2</span></td> <td class="ltx_td ltx_align_center" id="S4.T6.1.4.3.2">0.348</td> <td class="ltx_td ltx_align_center" id="S4.T6.1.4.3.3">0.349</td> <td class="ltx_td ltx_align_center" id="S4.T6.1.4.3.4">0.349</td> <td class="ltx_td ltx_align_center" id="S4.T6.1.4.3.5">0.347</td> <td class="ltx_td ltx_align_center" id="S4.T6.1.4.3.6">0.352</td> <td class="ltx_td ltx_align_center" id="S4.T6.1.4.3.7">0.348</td> <td class="ltx_td ltx_align_center" id="S4.T6.1.4.3.8">0.351</td> <td class="ltx_td ltx_align_center" id="S4.T6.1.4.3.9">0.351</td> <td class="ltx_td ltx_align_center" id="S4.T6.1.4.3.10">0.348</td> </tr> <tr class="ltx_tr" id="S4.T6.1.5.4"> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T6.1.5.4.1"><span class="ltx_text ltx_font_bold" id="S4.T6.1.5.4.1.1">Rouge-L</span></td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T6.1.5.4.2">0.442</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T6.1.5.4.3">0.443</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T6.1.5.4.4">0.439</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T6.1.5.4.5">0.441</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T6.1.5.4.6">0.446</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T6.1.5.4.7">0.441</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T6.1.5.4.8">0.445</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T6.1.5.4.9">0.445</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T6.1.5.4.10">0.440</td> </tr> </tbody> </table> </figure> <div class="ltx_para" id="S4.SS7.p2"> <p class="ltx_p" id="S4.SS7.p2.1">From Table <a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S4.T6" title="Table 6 ‣ 4.7 Sensitivity Analysis of Hidden Layers ‣ 4 Experiments ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications 
and Claims"><span class="ltx_text ltx_ref_tag">6</span></a>, it can be seen that the model achieves its best or tied-best Rouge-1, Rouge-2, and Rouge-L scores when the hidden layer size is set to 256 for the master encoder, slave encoder, and decoder. This indicates that hidden layers that are either too large or too small hurt model performance, while moderately sized hidden layers improve it.</p> </div> </section> <section class="ltx_subsection" id="S4.SS8"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">4.8 </span>Ablation Study</h3> <div class="ltx_para" id="S4.SS8.p1"> <p class="ltx_p" id="S4.SS8.p1.1">To further explore the impact of the different modules on the experimental results, an ablation study was conducted. Specifically, one, two, or all three of the modules were removed from the MSEA model each time, as shown in Table <a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S4.T7" title="Table 7 ‣ 4.8 Ablation Study ‣ 4 Experiments ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_tag">7</span></a>. In particular, when the master-slave encoding mechanism and all other modules are removed, the model reverts to a classic sequence-to-sequence model.</p> </div> <figure class="ltx_table" id="S4.T7"> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table">Table 7: </span>Experimental analysis of ablation of the MSEA model</figcaption> <table class="ltx_tabular ltx_centering ltx_guessed_headers ltx_align_middle" id="S4.T7.1"> <thead class="ltx_thead"> <tr class="ltx_tr" id="S4.T7.1.1.1"> <th class="ltx_td ltx_align_justify ltx_th ltx_th_column ltx_border_t" id="S4.T7.1.1.1.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T7.1.1.1.1.1"> <span class="ltx_p" id="S4.T7.1.1.1.1.1.1"><span class="ltx_text ltx_font_bold" 
id="S4.T7.1.1.1.1.1.1.1">Variable</span></span> </span> </th> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_t" id="S4.T7.1.1.1.2"><span class="ltx_text ltx_font_bold" id="S4.T7.1.1.1.2.1">Rouge-1</span></th> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_t" id="S4.T7.1.1.1.3"><span class="ltx_text ltx_font_bold" id="S4.T7.1.1.1.3.1">Rouge-2</span></th> <th class="ltx_td ltx_align_center ltx_th ltx_th_column ltx_border_t" id="S4.T7.1.1.1.4"><span class="ltx_text ltx_font_bold" id="S4.T7.1.1.1.4.1">Rouge-L</span></th> </tr> </thead> <tbody class="ltx_tbody"> <tr class="ltx_tr" id="S4.T7.1.2.1"> <td class="ltx_td ltx_align_justify ltx_border_t" id="S4.T7.1.2.1.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T7.1.2.1.1.1"> <span class="ltx_p" id="S4.T7.1.2.1.1.1.1">Model Itself</span> </span> </td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T7.1.2.1.2">0.541</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T7.1.2.1.3">0.352</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T7.1.2.1.4">0.447</td> </tr> <tr class="ltx_tr" id="S4.T7.1.3.2"> <td class="ltx_td ltx_align_justify" id="S4.T7.1.3.2.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T7.1.3.2.1.1"> <span class="ltx_p" id="S4.T7.1.3.2.1.1.1">No Enhanced Repetition Suppression</span> </span> </td> <td class="ltx_td ltx_align_center" id="S4.T7.1.3.2.2">0.524</td> <td class="ltx_td ltx_align_center" id="S4.T7.1.3.2.3">0.337</td> <td class="ltx_td ltx_align_center" id="S4.T7.1.3.2.4">0.423</td> </tr> <tr class="ltx_tr" id="S4.T7.1.4.3"> <td class="ltx_td ltx_align_justify" id="S4.T7.1.4.3.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T7.1.4.3.1.1"> <span class="ltx_p" id="S4.T7.1.4.3.1.1.1">No Master-Slave Encoding</span> </span> </td> <td class="ltx_td ltx_align_center" id="S4.T7.1.4.3.2">0.534</td> <td class="ltx_td ltx_align_center" id="S4.T7.1.4.3.3">0.346</td> <td class="ltx_td ltx_align_center" 
id="S4.T7.1.4.3.4">0.434</td> </tr> <tr class="ltx_tr" id="S4.T7.1.5.4"> <td class="ltx_td ltx_align_justify" id="S4.T7.1.5.4.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T7.1.5.4.1.1"> <span class="ltx_p" id="S4.T7.1.5.4.1.1.1">No Pointer Network</span> </span> </td> <td class="ltx_td ltx_align_center" id="S4.T7.1.5.4.2">0.538</td> <td class="ltx_td ltx_align_center" id="S4.T7.1.5.4.3">0.349</td> <td class="ltx_td ltx_align_center" id="S4.T7.1.5.4.4">0.438</td> </tr> <tr class="ltx_tr" id="S4.T7.1.6.5"> <td class="ltx_td ltx_align_justify" id="S4.T7.1.6.5.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T7.1.6.5.1.1"> <span class="ltx_p" id="S4.T7.1.6.5.1.1.1">No Pointer Network + No Enhanced Repetition Suppression</span> </span> </td> <td class="ltx_td ltx_align_center" id="S4.T7.1.6.5.2">0.478</td> <td class="ltx_td ltx_align_center" id="S4.T7.1.6.5.3">0.276</td> <td class="ltx_td ltx_align_center" id="S4.T7.1.6.5.4">0.382</td> </tr> <tr class="ltx_tr" id="S4.T7.1.7.6"> <td class="ltx_td ltx_align_justify" id="S4.T7.1.7.6.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T7.1.7.6.1.1"> <span class="ltx_p" id="S4.T7.1.7.6.1.1.1">No Master-Slave Encoding + No Enhanced Repetition Suppression</span> </span> </td> <td class="ltx_td ltx_align_center" id="S4.T7.1.7.6.2">0.512</td> <td class="ltx_td ltx_align_center" id="S4.T7.1.7.6.3">0.309</td> <td class="ltx_td ltx_align_center" id="S4.T7.1.7.6.4">0.408</td> </tr> <tr class="ltx_tr" id="S4.T7.1.8.7"> <td class="ltx_td ltx_align_justify ltx_border_b" id="S4.T7.1.8.7.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T7.1.8.7.1.1"> <span class="ltx_p" id="S4.T7.1.8.7.1.1.1">No Enhanced Repetition Suppression + No Master-Slave Encoding + No Pointer Network</span> </span> </td> <td class="ltx_td ltx_align_center ltx_border_b" id="S4.T7.1.8.7.2">0.466</td> <td class="ltx_td ltx_align_center ltx_border_b" id="S4.T7.1.8.7.3">0.268</td> <td class="ltx_td ltx_align_center ltx_border_b" 
id="S4.T7.1.8.7.4">0.377</td> </tr> </tbody> </table> </figure> <div class="ltx_para" id="S4.SS8.p2"> <p class="ltx_p" id="S4.SS8.p2.1">From Table <a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S4.T7" title="Table 7 ‣ 4.8 Ablation Study ‣ 4 Experiments ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_tag">7</span></a>, it is evident that, among the single-module ablations, removing the repetition suppression mechanism causes the largest drop in ROUGE scores (0.017, 0.015, and 0.024 in Rouge-1, Rouge-2, and Rouge-L, respectively). This indicates that repetition significantly degrades summary generation and that the "RA" of this paper suppresses it effectively. Removing the master-slave encoding mechanism lowers Rouge-1, Rouge-2, and Rouge-L by 0.007, 0.006, and 0.013, respectively, which suggests that the slave encoder performs more precise encoding and helps the model take more detailed information into account. Performance also declines without the pointer mechanism, likely because more OOV words appear in the generated summaries. All three components are therefore important to the performance of the MSEA model.</p> </div> </section> <section class="ltx_subsection" id="S4.SS9"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">4.9 </span>Enhanced Repetition Suppression Mechanism Results Analysis</h3> <div class="ltx_para" id="S4.SS9.p1"> <p class="ltx_p" id="S4.SS9.p1.1">The master-slave encoding model in this paper uses an enhanced repetition suppression mechanism that combines the existing coverage mechanism with the outputs already generated by the decoder. 
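</p> </div> <div class="ltx_para"> <p class="ltx_p">To make the coverage idea concrete, the following is a minimal sketch of one common formulation of a coverage mechanism, in which a coverage vector accumulates the attention weights of earlier decoding steps and a loss term penalizes attending again to source tokens that are already covered. It is an assumption that the "existing coverage mechanism" referred to here takes this form, and the sketch does not include the paper's enhancement that also uses the decoder's generated outputs. The function names <code>coverage_step</code> and <code>total_coverage_loss</code> are hypothetical.</p>
<pre class="ltx_verbatim">
```python
def coverage_step(attention, coverage):
    """One decoding step of a standard coverage mechanism (illustrative only).

    attention: attention weights over source tokens at the current step.
    coverage:  sum of the attention weights from all previous steps.
    Returns (coverage_loss, updated_coverage).
    """
    # Penalize attention mass that falls on tokens already attended to.
    loss = sum(min(a, c) for a, c in zip(attention, coverage))
    # Accumulate the current attention into the coverage vector.
    updated = [c + a for a, c in zip(attention, coverage)]
    return loss, updated


def total_coverage_loss(attention_steps):
    """Sum the coverage loss over a sequence of attention distributions."""
    coverage = [0.0] * len(attention_steps[0])
    total = 0.0
    for attention in attention_steps:
        loss, coverage = coverage_step(attention, coverage)
        total += loss
    return total
```
</pre>
<p class="ltx_p">In this sketch, attending to the same source token at two consecutive steps incurs a positive loss, while attending to different tokens incurs none, which is the behaviour that discourages repeated phrases in the generated summary.</p> </div> <div class="ltx_para"> <p class="ltx_p">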
To test whether repetition can be eliminated in summaries generated without the coverage mechanism, the decoding length was set to a smaller value so that the decoder can better "remember" decoding information from earlier time steps.</p> </div> <figure class="ltx_table" id="S4.T8"> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table">Table 8: </span>Case analysis for different decoding lengths and their effect on output quality</figcaption> <div class="ltx_inline-block ltx_align_center ltx_transformed_outer" id="S4.T8.2" style="width:414.9pt;height:348.5pt;vertical-align:-0.9pt;"><span class="ltx_transformed_inner" style="transform:translate(-23.0pt,19.3pt) scale(0.9,0.9) ;"> <table class="ltx_tabular ltx_guessed_headers ltx_align_middle" id="S4.T8.2.2"> <thead class="ltx_thead"> <tr class="ltx_tr" id="S4.T8.2.2.3.1"> <th class="ltx_td ltx_align_justify ltx_align_top ltx_th ltx_th_column ltx_border_t" id="S4.T8.2.2.3.1.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T8.2.2.3.1.1.1"> <span class="ltx_p" id="S4.T8.2.2.3.1.1.1.1" style="width:43.4pt;"><span class="ltx_text ltx_font_bold" id="S4.T8.2.2.3.1.1.1.1.1">Model</span></span> </span> </th> <th class="ltx_td ltx_align_justify ltx_align_top ltx_th ltx_th_column ltx_border_t" id="S4.T8.2.2.3.1.2"> <span class="ltx_inline-block ltx_align_top" id="S4.T8.2.2.3.1.2.1"> <span class="ltx_p" id="S4.T8.2.2.3.1.2.1.1" style="width:390.3pt;"><span class="ltx_text ltx_font_bold" id="S4.T8.2.2.3.1.2.1.1.1">Summary Content</span></span> </span> </th> </tr> </thead> <tbody class="ltx_tbody"> <tr class="ltx_tr" id="S4.T8.2.2.4.1"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S4.T8.2.2.4.1.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T8.2.2.4.1.1.1"> <span class="ltx_p" id="S4.T8.2.2.4.1.1.1.1" style="width:43.4pt;">Original Content</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S4.T8.2.2.4.1.2"> 
<span class="ltx_inline-block ltx_align_top" id="S4.T8.2.2.4.1.2.1"> <span class="ltx_p" id="S4.T8.2.2.4.1.2.1.1" style="width:390.3pt;">Hydraulics engineering projects undertake tasks of water retention and drainage, thus requiring special properties such as stability, pressure bearing, impermeability, abrasion resistance, frost resistance, and crack resistance in hydraulic structures. According to the technical specifications of hydraulic engineering, specific construction methods and measures must be taken to ensure the quality of the work. However, in practical use, embankments are easily loosened after prolonged exposure to tidal impacts, posing significant risks, and the water surface at the embankment carries a lot of floating large debris, which is very inconvenient for workers to salvage.</span> </span> </td> </tr> <tr class="ltx_tr" id="S4.T8.2.2.5.2"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S4.T8.2.2.5.2.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T8.2.2.5.2.1.1"> <span class="ltx_p" id="S4.T8.2.2.5.2.1.1.1" style="width:43.4pt;">Manual Summary</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S4.T8.2.2.5.2.2"> <span class="ltx_inline-block ltx_align_top" id="S4.T8.2.2.5.2.2.1"> <span class="ltx_p" id="S4.T8.2.2.5.2.2.1.1" style="width:390.3pt;">This utility model discloses a water engineering anti-surge embankment protection device, related to the technical field of water engineering. The device includes two support rods, two second connection plates, and two first connection plates, all fixed at both ends of the top of the protective board. 
The device features a protection cleaning mechanism, sliding of the reinforcement plate within the groove to facilitate adjustment of its position for adapting to different water levels, rotation of the reel to adjust the position of the collection plate, facilitating workers in collecting surface garbage, with a simple overall design and compact structure, which protects the embankment while facilitating cleaning of surface garbage, possessing good practicality.</span> </span> </td> </tr> <tr class="ltx_tr" id="S4.T8.1.1.1"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S4.T8.1.1.1.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T8.1.1.1.1.1"> <span class="ltx_p" id="S4.T8.1.1.1.1.1.1" style="width:43.4pt;">MSEA, <math alttext="K=200" class="ltx_Math" display="inline" id="S4.T8.1.1.1.1.1.1.m1.1"><semantics id="S4.T8.1.1.1.1.1.1.m1.1a"><mrow id="S4.T8.1.1.1.1.1.1.m1.1.1" xref="S4.T8.1.1.1.1.1.1.m1.1.1.cmml"><mi id="S4.T8.1.1.1.1.1.1.m1.1.1.2" xref="S4.T8.1.1.1.1.1.1.m1.1.1.2.cmml">K</mi><mo id="S4.T8.1.1.1.1.1.1.m1.1.1.1" xref="S4.T8.1.1.1.1.1.1.m1.1.1.1.cmml">=</mo><mn id="S4.T8.1.1.1.1.1.1.m1.1.1.3" xref="S4.T8.1.1.1.1.1.1.m1.1.1.3.cmml">200</mn></mrow><annotation-xml encoding="MathML-Content" id="S4.T8.1.1.1.1.1.1.m1.1b"><apply id="S4.T8.1.1.1.1.1.1.m1.1.1.cmml" xref="S4.T8.1.1.1.1.1.1.m1.1.1"><eq id="S4.T8.1.1.1.1.1.1.m1.1.1.1.cmml" xref="S4.T8.1.1.1.1.1.1.m1.1.1.1"></eq><ci id="S4.T8.1.1.1.1.1.1.m1.1.1.2.cmml" xref="S4.T8.1.1.1.1.1.1.m1.1.1.2">𝐾</ci><cn id="S4.T8.1.1.1.1.1.1.m1.1.1.3.cmml" type="integer" xref="S4.T8.1.1.1.1.1.1.m1.1.1.3">200</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.T8.1.1.1.1.1.1.m1.1c">K=200</annotation><annotation encoding="application/x-llamapun" id="S4.T8.1.1.1.1.1.1.m1.1d">italic_K = 200</annotation></semantics></math></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S4.T8.1.1.1.2"> <span class="ltx_inline-block ltx_align_top" 
id="S4.T8.1.1.1.2.1"> <span class="ltx_p" id="S4.T8.1.1.1.2.1.1" style="width:390.3pt;">This utility model relates to the technical field of water engineering, specifically an embankment protection device for water engineering, including a protective board with two support rods, two second connection plates fixed at both ends of the top of the protective board. The protective board’s surface has two grooves, and the protective cleaning mechanism is set on the surface. The protective cleaning mechanism includes a reinforcement plate, and the support plate is set on the surface of the protective board, able to adjust…</span> </span> </td> </tr> <tr class="ltx_tr" id="S4.T8.2.2.2"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_b ltx_border_t" id="S4.T8.2.2.2.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T8.2.2.2.1.1"> <span class="ltx_p" id="S4.T8.2.2.2.1.1.1" style="width:43.4pt;">MSEA, <math alttext="K=150" class="ltx_Math" display="inline" id="S4.T8.2.2.2.1.1.1.m1.1"><semantics id="S4.T8.2.2.2.1.1.1.m1.1a"><mrow id="S4.T8.2.2.2.1.1.1.m1.1.1" xref="S4.T8.2.2.2.1.1.1.m1.1.1.cmml"><mi id="S4.T8.2.2.2.1.1.1.m1.1.1.2" xref="S4.T8.2.2.2.1.1.1.m1.1.1.2.cmml">K</mi><mo id="S4.T8.2.2.2.1.1.1.m1.1.1.1" xref="S4.T8.2.2.2.1.1.1.m1.1.1.1.cmml">=</mo><mn id="S4.T8.2.2.2.1.1.1.m1.1.1.3" xref="S4.T8.2.2.2.1.1.1.m1.1.1.3.cmml">150</mn></mrow><annotation-xml encoding="MathML-Content" id="S4.T8.2.2.2.1.1.1.m1.1b"><apply id="S4.T8.2.2.2.1.1.1.m1.1.1.cmml" xref="S4.T8.2.2.2.1.1.1.m1.1.1"><eq id="S4.T8.2.2.2.1.1.1.m1.1.1.1.cmml" xref="S4.T8.2.2.2.1.1.1.m1.1.1.1"></eq><ci id="S4.T8.2.2.2.1.1.1.m1.1.1.2.cmml" xref="S4.T8.2.2.2.1.1.1.m1.1.1.2">𝐾</ci><cn id="S4.T8.2.2.2.1.1.1.m1.1.1.3.cmml" type="integer" xref="S4.T8.2.2.2.1.1.1.m1.1.1.3">150</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.T8.2.2.2.1.1.1.m1.1c">K=150</annotation><annotation encoding="application/x-llamapun" id="S4.T8.2.2.2.1.1.1.m1.1d">italic_K = 
150</annotation></semantics></math></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_b ltx_border_t" id="S4.T8.2.2.2.2"> <span class="ltx_inline-block ltx_align_top" id="S4.T8.2.2.2.2.1"> <span class="ltx_p" id="S4.T8.2.2.2.2.1.1" style="width:390.3pt;">This utility model pertains to the technical field of water engineering, especially an embankment protection device for water engineering, including a protective board with two support rods and two second connection plates fixed at both ends of the top of the protective board. The device features a cleaning part that facilitates adjustment of the reinforcement plate, suitable for different water levels, easy position adjustment, and convenient garbage cleaning. This device is ingeniously designed for embankment protection and is useful.</span> </span> </td> </tr> </tbody> </table> </span></div> </figure> <div class="ltx_para" id="S4.SS9.p2"> <p class="ltx_p" id="S4.SS9.p2.1">Table <a class="ltx_ref" href="https://arxiv.org/html/2411.14072v1#S4.T8" title="Table 8 ‣ 4.9 Enhanced Repetition Suppression Mechanism Results Analysis ‣ 4 Experiments ‣ The Master-Slave Encoder Model for Improving Patent Text Summarization: A New Approach to Combining Specifications and Claims"><span class="ltx_text ltx_ref_tag">8</span></a> displays examples generated by the MSEA model of this paper without using the coverage mechanism. 
The results show that when the decoding length is set to the smaller value (150 rather than 200 in Table 8), the master-slave encoder model still suppresses repetition even without the coverage mechanism.</p> </div> </section> </section> <section class="ltx_section" id="S5"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">5 </span>Conclusions</h2> <div class="ltx_para" id="S5.p1"> <p class="ltx_p" id="S5.p1.1">This study improves patent text summarization with a master-slave encoder architecture (MSEA) model that integrates patent specifications and claims, significantly improving summary quality. The model incorporates a pointer network to handle new technical terms that fall outside the vocabulary and an enhanced repetition suppression mechanism to reduce redundant content, thereby addressing the shortcomings of previous summarization methods and of the coverage mechanism.</p> </div> </section> <section class="ltx_bibliography" id="bib"> <h2 class="ltx_title ltx_title_bibliography">References</h2> <ul class="ltx_biblist"> <li class="ltx_bibitem" id="bib.bib1"> <span class="ltx_tag ltx_tag_bibitem">[1]</span> <span class="ltx_bibblock"> Following Hillary Clinton and Donald Trump: A dissection of their tweets in the 2016 U.S. presidential election (2016) </span> </li> <li class="ltx_bibitem" id="bib.bib2"> <span class="ltx_tag ltx_tag_bibitem">[2]</span> <span class="ltx_bibblock"> Abdulahi, S.M., Yitayaw, M.K., Feyisa, H.L., Mamo, W.B.: Factor affecting technical efficiency of the banking sector: Evidence from Ethiopia. Cogent Economics &amp; Finance <span class="ltx_text ltx_font_bold" id="bib.bib2.1.1">11</span>(1), 2186039 (Dec 2023). https://doi.org/10.1080/23322039.2023.2186039 </span> </li> <li class="ltx_bibitem" id="bib.bib3"> <span class="ltx_tag ltx_tag_bibitem">[3]</span> <span class="ltx_bibblock"> Al-Saif, H.F., Al-Dossari, H.Z.: Exploring the Role of Emotions in Arabic Rumor Detection in Social Media. 
Applied Sciences <span class="ltx_text ltx_font_bold" id="bib.bib3.1.1">13</span>(15),  8815 (Jul 2023). https://doi.org/10.3390/app13158815 </span> </li> <li class="ltx_bibitem" id="bib.bib4"> <span class="ltx_tag ltx_tag_bibitem">[4]</span> <span class="ltx_bibblock"> Ali, G., Malik, M.S.I.: Rumour identification on Twitter as a function of novel textual and language-context features. Multimedia Tools and Applications <span class="ltx_text ltx_font_bold" id="bib.bib4.1.1">82</span>(5), 7017–7038 (Feb 2023). https://doi.org/10.1007/s11042-022-13595-4 </span> </li> <li class="ltx_bibitem" id="bib.bib5"> <span class="ltx_tag ltx_tag_bibitem">[5]</span> <span class="ltx_bibblock"> Armoti, A.A., Oswal, N., Jawabri, A.: The relationship between the recruitment strategy and the competitive advantage. International Journal of Business Performance Management <span class="ltx_text ltx_font_bold" id="bib.bib5.1.1">24</span>(3-4), 286–303 (2023). https://doi.org/10.1504/IJBPM.2023.132313 </span> </li> <li class="ltx_bibitem" id="bib.bib6"> <span class="ltx_tag ltx_tag_bibitem">[6]</span> <span class="ltx_bibblock"> Bai, N., Meng, F., Rui, X., Wang, Z.: Rumor detection based on a Source-Replies conversation Tree Convolutional Neural Net. Computing <span class="ltx_text ltx_font_bold" id="bib.bib6.1.1">104</span>(5), 1155–1171 (May 2022). https://doi.org/10.1007/s00607-021-01034-5 </span> </li> <li class="ltx_bibitem" id="bib.bib7"> <span class="ltx_tag ltx_tag_bibitem">[7]</span> <span class="ltx_bibblock"> Barton, M., Hamza, M., Guevel, B.: Racial Equity in Healthcare Machine Learning: Illustrating Bias in Models With Minimal Bias Mitigation. Cureus (Feb 2023). https://doi.org/10.7759/cureus.35037 </span> </li> <li class="ltx_bibitem" id="bib.bib8"> <span class="ltx_tag ltx_tag_bibitem">[8]</span> <span class="ltx_bibblock"> Baykara, B., Güngör, T.: Turkish abstractive text summarization using pretrained sequence-to-sequence models. 
Natural Language Engineering <span class="ltx_text ltx_font_bold" id="bib.bib8.1.1">29</span>(5), 1275–1304 (Sep 2023). https://doi.org/10.1017/S1351324922000195 </span> </li> <li class="ltx_bibitem" id="bib.bib9"> <span class="ltx_tag ltx_tag_bibitem">[9]</span> <span class="ltx_bibblock"> Bharadwaj, A., El Sawy, O.A., University of Southern California, Pavlou, P.A., Temple University, Venkatraman, N., Boston University: Digital Business Strategy: Toward a Next Generation of Insights. MIS Quarterly <span class="ltx_text ltx_font_bold" id="bib.bib9.1.1">37</span>(2), 471–482 (Feb 2013). https://doi.org/10.25300/MISQ/2013/37:2.3 </span> </li> <li class="ltx_bibitem" id="bib.bib10"> <span class="ltx_tag ltx_tag_bibitem">[10]</span> <span class="ltx_bibblock"> Bian, T., Xiao, X., Xu, T., Zhao, P., Huang, W., Rong, Y., Huang, J.: Rumor Detection on Social Media with Bi-Directional Graph Convolutional Networks (Jan 2020) </span> </li> <li class="ltx_bibitem" id="bib.bib11"> <span class="ltx_tag ltx_tag_bibitem">[11]</span> <span class="ltx_bibblock"> Bing, C., Wu, Y., Dong, F., Xu, S., Liu, X., Sun, S.: Dual Co-Attention-Based Multi-Feature Fusion Method for Rumor Detection. Information <span class="ltx_text ltx_font_bold" id="bib.bib11.1.1">13</span>(1),  25 (Jan 2022). https://doi.org/10.3390/info13010025 </span> </li> <li class="ltx_bibitem" id="bib.bib12"> <span class="ltx_tag ltx_tag_bibitem">[12]</span> <span class="ltx_bibblock"> Bordoloi, M., Chatterjee, P.C., Biswas, S.K., Purkayastha, B.: Keyword extraction using supervised cumulative TextRank. Multimedia Tools and Applications <span class="ltx_text ltx_font_bold" id="bib.bib12.1.1">79</span>(41-42), 31467–31496 (Nov 2020). 
https://doi.org/10.1007/s11042-020-09335-1 </span> </li> <li class="ltx_bibitem" id="bib.bib13"> <span class="ltx_tag ltx_tag_bibitem">[13]</span> <span class="ltx_bibblock"> Boreshban, Y., Mirbostani, S.M., Ghassem-Sani, G., Mirroshandel, S.A., Amiriparian, S.: Improving question answering performance using knowledge distillation and active learning. Engineering Applications of Artificial Intelligence <span class="ltx_text ltx_font_bold" id="bib.bib13.1.1">123</span>, 106137 (Aug 2023). https://doi.org/10.1016/j.engappai.2023.106137 </span> </li> <li class="ltx_bibitem" id="bib.bib14"> <span class="ltx_tag ltx_tag_bibitem">[14]</span> <span class="ltx_bibblock"> Bosselut, A., Rashkin, H., Sap, M., Malaviya, C., Celikyilmaz, A., Choi, Y.: COMET: Commonsense Transformers for Automatic Knowledge Graph Construction. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. pp. 4762–4779. Association for Computational Linguistics, Florence, Italy (2019). https://doi.org/10.18653/v1/P19-1470 </span> </li> <li class="ltx_bibitem" id="bib.bib15"> <span class="ltx_tag ltx_tag_bibitem">[15]</span> <span class="ltx_bibblock"> Boubaker, S., Do, D.T., Hammami, H., Ly, K.C.: The role of bank affiliation in bank efficiency: A fuzzy multi-objective data envelopment analysis approach. Annals of Operations Research <span class="ltx_text ltx_font_bold" id="bib.bib15.1.1">311</span>(2), 611–639 (Apr 2022). https://doi.org/10.1007/s10479-020-03817-z </span> </li> <li class="ltx_bibitem" id="bib.bib16"> <span class="ltx_tag ltx_tag_bibitem">[16]</span> <span class="ltx_bibblock"> Boussaha, B.E.A., Hernandez, N., Jacquin, C., Morin, E.: End-to-end response selection based on multi-level context response matching. Computer Speech &amp; Language <span class="ltx_text ltx_font_bold" id="bib.bib16.1.1">63</span>, 101080 (Sep 2020). 
https://doi.org/10.1016/j.csl.2020.101080 </span> </li> <li class="ltx_bibitem" id="bib.bib17"> <span class="ltx_tag ltx_tag_bibitem">[17]</span> <span class="ltx_bibblock"> Cadene, R., Dancette, C.: RUBi: Reducing Unimodal Biases for Visual Question Answering </span> </li> <li class="ltx_bibitem" id="bib.bib18"> <span class="ltx_tag ltx_tag_bibitem">[18]</span> <span class="ltx_bibblock"> Cao, L.: A New Age of AI: Features and Futures. IEEE Intelligent Systems <span class="ltx_text ltx_font_bold" id="bib.bib18.1.1">37</span>(1), 25–37 (Jan 2022). https://doi.org/10.1109/MIS.2022.3150944 </span> </li> <li class="ltx_bibitem" id="bib.bib19"> <span class="ltx_tag ltx_tag_bibitem">[19]</span> <span class="ltx_bibblock"> Chen, J., Wei, N., Yang, H.: Immune Algorithm to Suppress Rumor Propagation Based on Influence Maximization. Security and Communication Networks <span class="ltx_text ltx_font_bold" id="bib.bib19.1.1">2022</span>, 6785828 (Apr 2022). https://doi.org/10.1155/2022/6785828 </span> </li> <li class="ltx_bibitem" id="bib.bib20"> <span class="ltx_tag ltx_tag_bibitem">[20]</span> <span class="ltx_bibblock"> Chen, W., Zhang, Y., Yeo, C.K., Lau, C.T., Lee, B.S.: Unsupervised rumor detection based on users’ behaviors using neural networks. Pattern Recognition Letters <span class="ltx_text ltx_font_bold" id="bib.bib20.1.1">105</span>, 226–233 (Apr 2018). https://doi.org/10.1016/j.patrec.2017.10.014 </span> </li> <li class="ltx_bibitem" id="bib.bib21"> <span class="ltx_tag ltx_tag_bibitem">[21]</span> <span class="ltx_bibblock"> Chen, X., Jia, S., Xiang, Y.: A review: Knowledge reasoning over knowledge graph. Expert Systems with Applications <span class="ltx_text ltx_font_bold" id="bib.bib21.1.1">141</span>, 112948 (Mar 2020). 
https://doi.org/10.1016/j.eswa.2019.112948 </span> </li> <li class="ltx_bibitem" id="bib.bib22"> <span class="ltx_tag ltx_tag_bibitem">[22]</span> <span class="ltx_bibblock"> Chen, X., Zhou, F., Trajcevski, G., Bonsangue, M.: Multi-view learning with distinguishable feature fusion for rumor detection. Knowledge-Based Systems <span class="ltx_text ltx_font_bold" id="bib.bib22.1.1">240</span>, 108085 (Mar 2022). https://doi.org/10.1016/j.knosys.2021.108085 </span> </li> <li class="ltx_bibitem" id="bib.bib23"> <span class="ltx_tag ltx_tag_bibitem">[23]</span> <span class="ltx_bibblock"> Chen, Z., Zhang, Y., Fang, Y., Geng, Y., Guo, L., Chen, X., Li, Q., Zhang, W., Chen, J., Zhu, Y., et al.: Knowledge graphs meet multi-modal learning: A comprehensive survey. arXiv preprint arXiv:2402.05391 (2024) </span> </li> <li class="ltx_bibitem" id="bib.bib24"> <span class="ltx_tag ltx_tag_bibitem">[24]</span> <span class="ltx_bibblock"> Choi, E., Hewlett, D., Uszkoreit, J., Polosukhin, I., Lacoste, A., Berant, J.: Coarse-to-Fine Question Answering for Long Documents. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). pp. 209–220. Association for Computational Linguistics, Vancouver, Canada (2017). https://doi.org/10.18653/v1/P17-1020 </span> </li> <li class="ltx_bibitem" id="bib.bib25"> <span class="ltx_tag ltx_tag_bibitem">[25]</span> <span class="ltx_bibblock"> Choi, K.: Computational Thematic Analysis of Poetry via Bimodal Large Language Models. Proceedings of the Association for Information Science and Technology <span class="ltx_text ltx_font_bold" id="bib.bib25.1.1">60</span>(1), 538–542 (Oct 2023). https://doi.org/10.1002/pra2.812 </span> </li> <li class="ltx_bibitem" id="bib.bib26"> <span class="ltx_tag ltx_tag_bibitem">[26]</span> <span class="ltx_bibblock"> Choi, K.: Computational Thematic Analysis of Poetry via Bimodal Large Language Models. 
Proceedings of the Association for Information Science and Technology <span class="ltx_text ltx_font_bold" id="bib.bib26.1.1">60</span>(1), 538–542 (Oct 2023). https://doi.org/10.1002/pra2.812 </span> </li> <li class="ltx_bibitem" id="bib.bib27"> <span class="ltx_tag ltx_tag_bibitem">[27]</span> <span class="ltx_bibblock"> Clark, K., Luong, M.T., Le, Q.V., Manning, C.D.: ELECTRA: Pre-training text encoders as discriminators rather than generators. In: ICLR (2020), <a class="ltx_ref" href="https://openreview.net/pdf?id=r1xMH1BtvB" title="">https://openreview.net/pdf?id=r1xMH1BtvB</a> </span> </li> <li class="ltx_bibitem" id="bib.bib28"> <span class="ltx_tag ltx_tag_bibitem">[28]</span> <span class="ltx_bibblock"> Dai, Y., Fu, Y., Yang, L.: A Multiple-Choice Machine Reading Comprehension Model with Multi-Granularity Semantic Reasoning. Applied Sciences <span class="ltx_text ltx_font_bold" id="bib.bib28.1.1">11</span>(17),  7945 (Aug 2021). https://doi.org/10.3390/app11177945 </span> </li> <li class="ltx_bibitem" id="bib.bib29"> <span class="ltx_tag ltx_tag_bibitem">[29]</span> <span class="ltx_bibblock"> Deng, D., Jing, L., Yu, J., Sun, S., Ng, M.K.: Sentiment lexicon construction with hierarchical supervision topic model. IEEE/ACM Transactions on audio, speech, and language processing <span class="ltx_text ltx_font_bold" id="bib.bib29.1.1">27</span>(4), 704–718 (2019) </span> </li> <li class="ltx_bibitem" id="bib.bib30"> <span class="ltx_tag ltx_tag_bibitem">[30]</span> <span class="ltx_bibblock"> Deng, Z., Zou, D., Jiang, D.: Multi-turn response selection with multi-level granularity representations and 3D convolutional neural network. In: Proceedings of the 2020 4th High Performance Computing and Cluster Technologies Conference &amp; 2020 3rd International Conference on Big Data and Artificial Intelligence. pp. 18–23. ACM, Qingdao China (Jul 2020). 
https://doi.org/10.1145/3409501.3409529 </span> </li> <li class="ltx_bibitem" id="bib.bib31"> <span class="ltx_tag ltx_tag_bibitem">[31]</span> <span class="ltx_bibblock"> Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). pp. 4171–4186 (2019) </span> </li> <li class="ltx_bibitem" id="bib.bib32"> <span class="ltx_tag ltx_tag_bibitem">[32]</span> <span class="ltx_bibblock"> Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. pp. 4171–4186 (2019). https://doi.org/10.18653/v1/N19-1423 </span> </li> <li class="ltx_bibitem" id="bib.bib33"> <span class="ltx_tag ltx_tag_bibitem">[33]</span> <span class="ltx_bibblock"> Ditkaew, K.: Strategic Management Accounting on Competitive Advantage. International Journal of Asian Business and Information Management <span class="ltx_text ltx_font_bold" id="bib.bib33.1.1">10</span>(2), 232–244 (2022) </span> </li> <li class="ltx_bibitem" id="bib.bib34"> <span class="ltx_tag ltx_tag_bibitem">[34]</span> <span class="ltx_bibblock"> Dong, X., Lian, Y., Chi, Y., Tang, X., Liu, Y.: A two-step rumor detection model based on the supernetwork theory about Weibo. Journal of Supercomputing <span class="ltx_text ltx_font_bold" id="bib.bib34.1.1">77</span>(10), 12050–12074 (Oct 2021). https://doi.org/10.1007/s11227-021-03748-x </span> </li> <li class="ltx_bibitem" id="bib.bib35"> <span class="ltx_tag ltx_tag_bibitem">[35]</span> <span class="ltx_bibblock"> Dou, X., Li, M., Zhao, J., Gao, S.: A Text Classification Model Based on Virtual Adversarial Training and Bilateral Contrastive Learning. In: 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC). pp. 1978–1983. 
IEEE, Honolulu, Oahu, HI, USA (Oct 2023). https://doi.org/10.1109/SMC53992.2023.10394238 </span> </li> <li class="ltx_bibitem" id="bib.bib36"> <span class="ltx_tag ltx_tag_bibitem">[36]</span> <span class="ltx_bibblock"> Draude, C., Klumbyte, G., Lücking, P., Treusch, P.: Situated algorithms: A sociotechnical systemic approach to bias. Online Information Review <span class="ltx_text ltx_font_bold" id="bib.bib36.1.1">44</span>(2), 325–342 (Nov 2019). https://doi.org/10.1108/OIR-10-2018-0332 </span> </li> <li class="ltx_bibitem" id="bib.bib37"> <span class="ltx_tag ltx_tag_bibitem">[37]</span> <span class="ltx_bibblock"> Garrido-Muñoz, I., Montejo-Ráez, A., Martínez-Santiago, F., Ureña-López, L.A.: A Survey on Bias in Deep NLP (Mar 2021). https://doi.org/10.20944/preprints202103.0049.v1 </span> </li> <li class="ltx_bibitem" id="bib.bib38"> <span class="ltx_tag ltx_tag_bibitem">[38]</span> <span class="ltx_bibblock"> Ge, X., Zhang, M., Wei, B., Liu, Y.: A Rumor Detection Method Based on Graph Convolutional Network. In: Li, X. (ed.) Advances in Intelligent Automation and Soft Computing, vol. 80, pp. 423–429. Springer International Publishing, Cham (2022). https://doi.org/10.1007/978-3-030-81007-8_47 </span> </li> <li class="ltx_bibitem" id="bib.bib39"> <span class="ltx_tag ltx_tag_bibitem">[39]</span> <span class="ltx_bibblock"> Ghosal, D., Majumder, N., Poria, S., Chhaya, N., Gelbukh, A.: DialogueGCN: A Graph Convolutional Neural Network for Emotion Recognition in Conversation. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). pp. 154–164. Association for Computational Linguistics, Hong Kong, China (2019). 
https://doi.org/10.18653/v1/D19-1015 </span> </li> <li class="ltx_bibitem" id="bib.bib40"> <span class="ltx_tag ltx_tag_bibitem">[40]</span> <span class="ltx_bibblock"> Goby, V.P., Karimova, G.Z.: “Simple rules” as an approach to corporate selection of CSR strategies. International Journal of Organizational Analysis <span class="ltx_text ltx_font_bold" id="bib.bib40.1.1">30</span>(2), 197–206 (Jan 2022). https://doi.org/10.1108/IJOA-07-2020-2320 </span> </li> <li class="ltx_bibitem" id="bib.bib41"> <span class="ltx_tag ltx_tag_bibitem">[41]</span> <span class="ltx_bibblock"> Grosz, B.J., Kraus, S.: Collaborative plans for complex group action. Artificial Intelligence <span class="ltx_text ltx_font_bold" id="bib.bib41.1.1">86</span>(2), 269–357 (1996) </span> </li> <li class="ltx_bibitem" id="bib.bib42"> <span class="ltx_tag ltx_tag_bibitem">[42]</span> <span class="ltx_bibblock"> Gu, J.C., Ling, Z.H., Liu, Q.: Utterance-to-Utterance Interactive Matching Network for Multi-Turn Response Selection in Retrieval-Based Chatbots. IEEE/ACM Transactions on Audio, Speech, and Language Processing <span class="ltx_text ltx_font_bold" id="bib.bib42.1.1">28</span>, 369–379 (2020). https://doi.org/10.1109/TASLP.2019.2955290 </span> </li> <li class="ltx_bibitem" id="bib.bib43"> <span class="ltx_tag ltx_tag_bibitem">[43]</span> <span class="ltx_bibblock"> Gumaei, A., Al-Rakhami, M.S., Hassan, M.M., De Albuquerque, V.H.C., Camacho, D.: An Effective Approach for Rumor Detection of Arabic Tweets Using eXtreme Gradient Boosting Method. ACM Transactions on Asian and Low-Resource Language Information Processing <span class="ltx_text ltx_font_bold" id="bib.bib43.1.1">21</span>(1), 1–16 (Jan 2022). 
https://doi.org/10.1145/3461697 </span> </li> <li class="ltx_bibitem" id="bib.bib44"> <span class="ltx_tag ltx_tag_bibitem">[44]</span> <span class="ltx_bibblock"> Guo, Y., Nie, L., Cheng, Z., Tian, Q., Zhang, M.: Loss re-scaling VQA: Revisiting the Language Prior Problem from a Class-imbalance View (Dec 2021) </span> </li> <li class="ltx_bibitem" id="bib.bib45"> <span class="ltx_tag ltx_tag_bibitem">[45]</span> <span class="ltx_bibblock"> Guoliang, S., Shu, Z., Yunfeng, W., Chunjiang, S., Liang, L.: Generating patent text abstracts based on improved multi-head attention mechanism. Data Analysis and Knowledge Discovery <span class="ltx_text ltx_font_bold" id="bib.bib45.1.1">7</span>(6), 61–72 (2023) </span> </li> <li class="ltx_bibitem" id="bib.bib46"> <span class="ltx_tag ltx_tag_bibitem">[46]</span> <span class="ltx_bibblock"> Hu, M., Peng, Y., Huang, Z., Li, D.: Retrieve, Read, Rerank: Towards End-to-End Multi-Document Reading Comprehension. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. pp. 2285–2295. Association for Computational Linguistics, Florence, Italy (2019). https://doi.org/10.18653/v1/P19-1221 </span> </li> <li class="ltx_bibitem" id="bib.bib47"> <span class="ltx_tag ltx_tag_bibitem">[47]</span> <span class="ltx_bibblock"> Jia, R., Liang, P.: Adversarial examples for evaluating reading comprehension systems. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. pp. 2021–2031 (2017) </span> </li> <li class="ltx_bibitem" id="bib.bib48"> <span class="ltx_tag ltx_tag_bibitem">[48]</span> <span class="ltx_bibblock"> Jing, C., Wu, Y., Zhang, X., Jia, Y., Wu, Q.: Overcoming Language Priors in VQA via Decomposed Linguistic Representations. Proceedings of the AAAI Conference on Artificial Intelligence <span class="ltx_text ltx_font_bold" id="bib.bib48.1.1">34</span>(07), 11181–11188 (Apr 2020). 
https://doi.org/10.1609/aaai.v34i07.6776 </span> </li> <li class="ltx_bibitem" id="bib.bib49"> <span class="ltx_tag ltx_tag_bibitem">[49]</span> <span class="ltx_bibblock"> Joshi, M., Chen, D., Liu, Y., Weld, D.S., Zettlemoyer, L., Levy, O.: SpanBERT: Improving Pre-training by Representing and Predicting Spans (Jan 2020) </span> </li> <li class="ltx_bibitem" id="bib.bib50"> <span class="ltx_tag ltx_tag_bibitem">[50]</span> <span class="ltx_bibblock"> Kautz, H.A., Selman, B.: Planning as satisfiability. In: Proceedings of the 10th European Conference on Artificial Intelligence (ECAI). pp. 359–363 (1992) </span> </li> <li class="ltx_bibitem" id="bib.bib51"> <span class="ltx_tag ltx_tag_bibitem">[51]</span> <span class="ltx_bibblock"> Khoury, C., Owen-Smith, A., Joh, U., Duan, Y., Hemsley, J.: <span class="ltx_text ltx_font_smallcaps" id="bib.bib51.1.1">Multi-Modal</span> Crisis Discourse and Collective Sensemaking on <span class="ltx_text ltx_font_smallcaps" id="bib.bib51.2.2">TikTok</span>. Proceedings of the Association for Information Science and Technology <span class="ltx_text ltx_font_bold" id="bib.bib51.3.3">60</span>(1), 203–212 (Oct 2023). https://doi.org/10.1002/pra2.781 </span> </li> <li class="ltx_bibitem" id="bib.bib52"> <span class="ltx_tag ltx_tag_bibitem">[52]</span> <span class="ltx_bibblock"> Kudo, T., Richardson, J.: SentencePiece: A simple and language independent subword tokenizer and detokenizer for neural text processing. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. pp. 66–71 (2018) </span> </li> <li class="ltx_bibitem" id="bib.bib53"> <span class="ltx_tag ltx_tag_bibitem">[53]</span> <span class="ltx_bibblock"> Le, Z., Jidong, L., Xueqiang, L., Zhuo, C., Lei, W., Xindong, Y.: RLCPAR: A rewriting model for Chinese patent abstracts based on reinforcement learning. 
Data Analysis and Knowledge Discovery <span class="ltx_text ltx_font_bold" id="bib.bib53.1.1">5</span>(7), 59–69 (2021) </span> </li> <li class="ltx_bibitem" id="bib.bib54"> <span class="ltx_tag ltx_tag_bibitem">[54]</span> <span class="ltx_bibblock"> Levine, A., Park, J., Kuo, H.J.: Understanding Disability Biases in Undergraduate Rehabilitation Students: An Exploratory Study. Rehabilitation Counseling Bulletin <span class="ltx_text ltx_font_bold" id="bib.bib54.1.1">64</span>(3), 172–180 (Apr 2021). https://doi.org/10.1177/0034355220910238 </span> </li> <li class="ltx_bibitem" id="bib.bib55"> <span class="ltx_tag ltx_tag_bibitem">[55]</span> <span class="ltx_bibblock"> Li, J., Liu, M., Kan, M.Y., Zheng, Z., Wang, Z., Lei, W., Liu, T., Qin, B.: Molweni: A Challenge Multiparty Dialogue-based Machine Reading Comprehension Dataset with Discourse Structure </span> </li> <li class="ltx_bibitem" id="bib.bib56"> <span class="ltx_tag ltx_tag_bibitem">[56]</span> <span class="ltx_bibblock"> Li, J., Liu, M., Zheng, Z., Zhang, H., Qin, B., Kan, M.Y., Liu, T.: DADgraph: A Discourse-aware Dialogue Graph Neural Network for Multiparty Dialogue Machine Reading Comprehension (Apr 2021) </span> </li> <li class="ltx_bibitem" id="bib.bib57"> <span class="ltx_tag ltx_tag_bibitem">[57]</span> <span class="ltx_bibblock"> Li, J., Zhang, C., Chen, X., Cao, Y., Liao, P., Zhang, P.: Abstractive Text Summarization with Multi-Head Attention. In: 2019 International Joint Conference on Neural Networks (IJCNN). pp. 1–8. IEEE, Budapest, Hungary (Jul 2019). https://doi.org/10.1109/IJCNN.2019.8851885 </span> </li> <li class="ltx_bibitem" id="bib.bib58"> <span class="ltx_tag ltx_tag_bibitem">[58]</span> <span class="ltx_bibblock"> Li, J., Liu, C., Tao, C., Chan, Z., Zhao, D., Zhang, M., Yan, R.: Dialogue History Matters! Personalized Response Selection in Multi-Turn Retrieval-Based Chatbots. 
ACM Transactions on Information Systems <span class="ltx_text ltx_font_bold" id="bib.bib58.1.1">39</span>(4), 1–25 (Oct 2021). https://doi.org/10.1145/3453183 </span> </li> <li class="ltx_bibitem" id="bib.bib59"> <span class="ltx_tag ltx_tag_bibitem">[59]</span> <span class="ltx_bibblock"> Li, L., Li, C., Ji, D.: Deep context modeling for multi-turn response selection in dialogue systems. Information Processing &amp; Management <span class="ltx_text ltx_font_bold" id="bib.bib59.1.1">58</span>(1), 102415 (Jan 2021). https://doi.org/10.1016/j.ipm.2020.102415 </span> </li> <li class="ltx_bibitem" id="bib.bib60"> <span class="ltx_tag ltx_tag_bibitem">[60]</span> <span class="ltx_bibblock"> Li, Y., Zhao, H.: Self- and Pseudo-self-supervised Prediction of Speaker and Key-utterance for Multi-party Dialogue Reading Comprehension. In: Findings of the Association for Computational Linguistics: EMNLP 2021. pp. 2053–2063. Association for Computational Linguistics, Punta Cana, Dominican Republic (2021). https://doi.org/10.18653/v1/2021.findings-emnlp.176 </span> </li> <li class="ltx_bibitem" id="bib.bib61"> <span class="ltx_tag ltx_tag_bibitem">[61]</span> <span class="ltx_bibblock"> Li, Y., Zhao, H., Zhang, Z.: Back to the Future: Bidirectional Information Decoupling Network for Multi-turn Dialogue Modeling. In: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. pp. 2761–2774. Association for Computational Linguistics, Abu Dhabi, United Arab Emirates (2022). https://doi.org/10.18653/v1/2022.emnlp-main.177 </span> </li> <li class="ltx_bibitem" id="bib.bib62"> <span class="ltx_tag ltx_tag_bibitem">[62]</span> <span class="ltx_bibblock"> Liew, S.R.C., Law, N.F.: Use of subword tokenization for domain generation algorithm classification. Cybersecurity <span class="ltx_text ltx_font_bold" id="bib.bib62.1.1">6</span>(1),  49 (Sep 2023). 
https://doi.org/10.1186/s42400-023-00183-8 </span> </li> <li class="ltx_bibitem" id="bib.bib63"> <span class="ltx_tag ltx_tag_bibitem">[63]</span> <span class="ltx_bibblock"> Lin, J., Su, Q., Yang, P., Ma, S., Sun, X.: Semantic-Unit-Based Dilated Convolution for Multi-Label Text Classification. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. pp. 4554–4564. Association for Computational Linguistics, Brussels, Belgium (2018). https://doi.org/10.18653/v1/D18-1485 </span> </li> <li class="ltx_bibitem" id="bib.bib64"> <span class="ltx_tag ltx_tag_bibitem">[64]</span> <span class="ltx_bibblock"> Liu, L., Zhang, Z., Zhao, H., Zhou, X., Zhou, X.: Filling the Gap of Utterance-aware and Speaker-aware Representation for Multi-turn Dialogue. Proceedings of the AAAI Conference on Artificial Intelligence <span class="ltx_text ltx_font_bold" id="bib.bib64.1.1">35</span>(15), 13406–13414 (May 2021). https://doi.org/10.1609/aaai.v35i15.17582 </span> </li> <li class="ltx_bibitem" id="bib.bib65"> <span class="ltx_tag ltx_tag_bibitem">[65]</span> <span class="ltx_bibblock"> Liu, M., Yang, E., Xiong, D., Zhang, Y., Meng, Y., Hu, C., Xu, J., Chen, Y.: A learning-exploring method to generate diverse paraphrases with multi-objective deep reinforcement learning. In: Proceedings of the 28th International Conference on Computational Linguistics. pp. 2310–2321 (2020) </span> </li> <li class="ltx_bibitem" id="bib.bib66"> <span class="ltx_tag ltx_tag_bibitem">[66]</span> <span class="ltx_bibblock"> Lu, Y., Guo, C., Dou, Y., Dai, X., Wang, F.Y.: Could ChatGPT Imagine: Content Control for Artistic Painting Generation Via Large Language Models. Journal of Intelligent &amp; Robotic Systems <span class="ltx_text ltx_font_bold" id="bib.bib66.1.1">109</span>(2),  39 (Oct 2023). 
https://doi.org/10.1007/s10846-023-01956-6 </span> </li> <li class="ltx_bibitem" id="bib.bib67"> <span class="ltx_tag ltx_tag_bibitem">[67]</span> <span class="ltx_bibblock"> Ma, J., Gao, W., Wong, K.F.: Rumor Detection on Twitter with Tree-structured Recursive Neural Networks. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). pp. 1980–1989. Association for Computational Linguistics, Melbourne, Australia (2018). https://doi.org/10.18653/v1/P18-1184 </span> </li> <li class="ltx_bibitem" id="bib.bib68"> <span class="ltx_tag ltx_tag_bibitem">[68]</span> <span class="ltx_bibblock"> Ma, X., Zhang, Z., Zhao, H.: Enhanced Speaker-Aware Multi-Party Multi-Turn Dialogue Comprehension. IEEE/ACM Transactions on Audio, Speech, and Language Processing <span class="ltx_text ltx_font_bold" id="bib.bib68.1.1">31</span>, 2410–2423 (2023). https://doi.org/10.1109/TASLP.2023.3284516 </span> </li> <li class="ltx_bibitem" id="bib.bib69"> <span class="ltx_tag ltx_tag_bibitem">[69]</span> <span class="ltx_bibblock"> Manzini, T., Lim, Y.C., Tsvetkov, Y., Black, A.W.: Black is to Criminal as Caucasian is to Police: Detecting and Removing Multiclass Bias in Word Embeddings (Jul 2019) </span> </li> <li class="ltx_bibitem" id="bib.bib70"> <span class="ltx_tag ltx_tag_bibitem">[70]</span> <span class="ltx_bibblock"> Marodon, R.: Can Development Banks Step Up to the Challenge of Sustainable Development? Review of Political Economy <span class="ltx_text ltx_font_bold" id="bib.bib70.1.1">34</span>(2), 268–285 (Apr 2022). https://doi.org/10.1080/09538259.2021.1977542 </span> </li> <li class="ltx_bibitem" id="bib.bib71"> <span class="ltx_tag ltx_tag_bibitem">[71]</span> <span class="ltx_bibblock"> Mihalcea, R., Tarau, P.: Textrank: Bringing order into text. In: Proceedings of the 2004 conference on empirical methods in natural language processing. pp. 
404–411 (2004) </span> </li> <li class="ltx_bibitem" id="bib.bib72"> <span class="ltx_tag ltx_tag_bibitem">[72]</span> <span class="ltx_bibblock"> Mihaylov, T., Frank, A.: Knowledgeable Reader: Enhancing Cloze-Style Reading Comprehension with External Commonsense Knowledge. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). pp. 821–832. Association for Computational Linguistics, Melbourne, Australia (2018). https://doi.org/10.18653/v1/P18-1076 </span> </li> <li class="ltx_bibitem" id="bib.bib73"> <span class="ltx_tag ltx_tag_bibitem">[73]</span> <span class="ltx_bibblock"> Mikhalkina, T., Cabantous, L.: Business Model Innovation: How Iconic Business Models Emerge. In: Baden-Fuller, C., Mangematin, V. (eds.) Advances in Strategic Management, vol. 33, pp. 59–95. Emerald Group Publishing Limited (Oct 2015). https://doi.org/10.1108/S0742-332220150000033024 </span> </li> <li class="ltx_bibitem" id="bib.bib74"> <span class="ltx_tag ltx_tag_bibitem">[74]</span> <span class="ltx_bibblock"> Min, S., Zhong, V., Socher, R., Xiong, C.: Efficient and Robust Question Answering from Minimal Context over Documents. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). pp. 1725–1735. Association for Computational Linguistics, Melbourne, Australia (2018). https://doi.org/10.18653/v1/P18-1160 </span> </li> <li class="ltx_bibitem" id="bib.bib75"> <span class="ltx_tag ltx_tag_bibitem">[75]</span> <span class="ltx_bibblock"> Min, S., Zhong, V., Socher, R., Xiong, C.: Efficient and Robust Question Answering from Minimal Context over Documents. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). pp. 1725–1735. Association for Computational Linguistics, Melbourne, Australia (2018). 
https://doi.org/10.18653/v1/P18-1160 </span> </li> <li class="ltx_bibitem" id="bib.bib76"> <span class="ltx_tag ltx_tag_bibitem">[76]</span> <span class="ltx_bibblock"> Moratanch, N., Chitrakala, S.: A survey on abstractive text summarization. In: 2016 International Conference on Circuit, power and computing technologies (ICCPCT). pp. 1–7. IEEE (2016) </span> </li> <li class="ltx_bibitem" id="bib.bib77"> <span class="ltx_tag ltx_tag_bibitem">[77]</span> <span class="ltx_bibblock"> Nadeem, M., Bethke, A., Reddy, S.: StereoSet: Measuring stereotypical bias in pretrained language models (Apr 2020) </span> </li> <li class="ltx_bibitem" id="bib.bib78"> <span class="ltx_tag ltx_tag_bibitem">[78]</span> <span class="ltx_bibblock"> Nallapati, R., Zhai, F., Zhou, B.: Summarunner: A recurrent neural network based sequence model for extractive summarization of documents. In: Proceedings of the AAAI conference on artificial intelligence. vol. 31 (2017) </span> </li> <li class="ltx_bibitem" id="bib.bib79"> <span class="ltx_tag ltx_tag_bibitem">[79]</span> <span class="ltx_bibblock"> Nallapati, R., Zhou, B., dos Santos, C., Xiang, B.: Abstractive text summarization using sequence-to-sequence rnns and beyond. In: Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning. pp. 280–290 (2016) </span> </li> <li class="ltx_bibitem" id="bib.bib80"> <span class="ltx_tag ltx_tag_bibitem">[80]</span> <span class="ltx_bibblock"> Nie, Y., Wang, S., Bansal, M.: Revealing the Importance of Semantic Retrieval for Machine Reading at Scale. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). pp. 2553–2566. Association for Computational Linguistics, Hong Kong, China (2019). 
https://doi.org/10.18653/v1/D19-1258 </span> </li> <li class="ltx_bibitem" id="bib.bib81"> <span class="ltx_tag ltx_tag_bibitem">[81]</span> <span class="ltx_bibblock"> Osiyevskyy, O., Dewald, J.: Explorative Versus Exploitative Business Model Change: The Cognitive Antecedents of Firm-Level Responses to Disruptive Innovation. Strategic Entrepreneurship Journal <span class="ltx_text ltx_font_bold" id="bib.bib81.1.1">9</span>(1), 58–78 (Mar 2015). https://doi.org/10.1002/sej.1192 </span> </li> <li class="ltx_bibitem" id="bib.bib82"> <span class="ltx_tag ltx_tag_bibitem">[82]</span> <span class="ltx_bibblock"> Pak, S.J.: Reputation and Social Ties: J. P. Morgan &amp; Co. and Private Investment Banking. Business History Review <span class="ltx_text ltx_font_bold" id="bib.bib82.1.1">87</span>(4), 703–728 (2013). https://doi.org/10.1017/S0007680513001104 </span> </li> <li class="ltx_bibitem" id="bib.bib83"> <span class="ltx_tag ltx_tag_bibitem">[83]</span> <span class="ltx_bibblock"> Pan, Y., Chen, Q., Peng, W., Wang, X., Hu, B., Liu, X., Chen, J., Zhou, W.: MedWriter: Knowledge-Aware Medical Text Generation. In: Proceedings of the 28th International Conference on Computational Linguistics. pp. 2363–2368. International Committee on Computational Linguistics, Barcelona, Spain (Online) (2020). https://doi.org/10.18653/v1/2020.coling-main.214 </span> </li> <li class="ltx_bibitem" id="bib.bib84"> <span class="ltx_tag ltx_tag_bibitem">[84]</span> <span class="ltx_bibblock"> Pan, Y., Chen, Q., Peng, W., Wang, X., Hu, B., Liu, X., Chen, J., Zhou, W.: Medwriter: Knowledge-aware medical text generation. In: Proceedings of the 28th International Conference on Computational Linguistics. pp. 2363–2368 (2020) </span> </li> <li class="ltx_bibitem" id="bib.bib85"> <span class="ltx_tag ltx_tag_bibitem">[85]</span> <span class="ltx_bibblock"> Paulheim, H.: Knowledge Graph Refinement: A Survey of Approaches and Evaluation Methods. 
Semantic Web <span class="ltx_text ltx_font_bold" id="bib.bib85.1.1">8</span>(3), 489–+ (2017). https://doi.org/10.3233/sw-160218 </span> </li> <li class="ltx_bibitem" id="bib.bib86"> <span class="ltx_tag ltx_tag_bibitem">[86]</span> <span class="ltx_bibblock"> Perelman, G.: The entropy formula for the Ricci flow and its geometric applications. Preprint arXiv:math/0211159 (2002) </span> </li> <li class="ltx_bibitem" id="bib.bib87"> <span class="ltx_tag ltx_tag_bibitem">[87]</span> <span class="ltx_bibblock"> Perin, F., Renggli, L., Ressia, J.: Linguistic style checking with program checking tools. Computer Languages, Systems &amp; Structures <span class="ltx_text ltx_font_bold" id="bib.bib87.1.1">38</span>(1), 61–72 (Apr 2012). https://doi.org/10.1016/j.cl.2011.11.002 </span> </li> <li class="ltx_bibitem" id="bib.bib88"> <span class="ltx_tag ltx_tag_bibitem">[88]</span> <span class="ltx_bibblock"> Peters, M., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep Contextualized Word Representations. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018). https://doi.org/10.18653/v1/N18-1202 </span> </li> <li class="ltx_bibitem" id="bib.bib89"> <span class="ltx_tag ltx_tag_bibitem">[89]</span> <span class="ltx_bibblock"> Pittman, W.C.: The rise and fall of strategic planning. IEEE Transactions on Engineering Management <span class="ltx_text ltx_font_bold" id="bib.bib89.1.1">47</span>(2), 281–282 (May 2000). 
https://doi.org/10.1109/TEM.2000.846794 </span> </li> <li class="ltx_bibitem" id="bib.bib90"> <span class="ltx_tag ltx_tag_bibitem">[90]</span> <span class="ltx_bibblock"> Pruksachatkun, Y., Yeres, P., Liu, H., Phang, J., Htut, P.M., Wang, A., Tenney, I., Bowman, S.R.: Jiant: A Software Toolkit for Research on General-Purpose Text Understanding Models (May 2020) </span> </li> <li class="ltx_bibitem" id="bib.bib91"> <span class="ltx_tag ltx_tag_bibitem">[91]</span> <span class="ltx_bibblock"> Qiu, D., Yang, B.: Text summarization based on multi-head self-attention mechanism and pointer network. Complex &amp; Intelligent Systems <span class="ltx_text ltx_font_bold" id="bib.bib91.1.1">8</span>(1), 555–567 (Feb 2022). https://doi.org/10.1007/s40747-021-00527-2 </span> </li> <li class="ltx_bibitem" id="bib.bib92"> <span class="ltx_tag ltx_tag_bibitem">[92]</span> <span class="ltx_bibblock"> Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., Sutskever, I.: Learning Transferable Visual Models From Natural Language Supervision (Feb 2021) </span> </li> <li class="ltx_bibitem" id="bib.bib93"> <span class="ltx_tag ltx_tag_bibitem">[93]</span> <span class="ltx_bibblock"> Rahim, A.: Rumor Identification on Twitter Data for 2020 US Presidential Elections with BERT Model. UMT Artificial Intelligence Review <span class="ltx_text ltx_font_bold" id="bib.bib93.1.1">1</span>(1),  1–1 (Jun 2021). https://doi.org/10.32350/umtair.11.03 </span> </li> <li class="ltx_bibitem" id="bib.bib94"> <span class="ltx_tag ltx_tag_bibitem">[94]</span> <span class="ltx_bibblock"> Rajpurkar, P., Jia, R., Liang, P.: Know What You Don’t Know: Unanswerable Questions for SQuAD. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). pp. 784–789. Association for Computational Linguistics, Melbourne, Australia (2018). 
https://doi.org/10.18653/v1/P18-2124 </span> </li> <li class="ltx_bibitem" id="bib.bib95"> <span class="ltx_tag ltx_tag_bibitem">[95]</span> <span class="ltx_bibblock"> Rajpurkar, P., Jia, R., Liang, P.: Know what you don’t know: Unanswerable questions for SQuAD. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). pp. 784–789 (2018) </span> </li> <li class="ltx_bibitem" id="bib.bib96"> <span class="ltx_tag ltx_tag_bibitem">[96]</span> <span class="ltx_bibblock"> Rajpurkar, P., Zhang, J., Lopyrev, K., Liang, P.: SQuAD: 100,000+ Questions for Machine Comprehension of Text. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. pp. 2383–2392. Association for Computational Linguistics, Austin, Texas (2016). https://doi.org/10.18653/v1/D16-1264 </span> </li> <li class="ltx_bibitem" id="bib.bib97"> <span class="ltx_tag ltx_tag_bibitem">[97]</span> <span class="ltx_bibblock"> Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning representations by back-propagating errors. Nature <span class="ltx_text ltx_font_bold" id="bib.bib97.1.1">323</span>(6088), 533–536 (1986) </span> </li> <li class="ltx_bibitem" id="bib.bib98"> <span class="ltx_tag ltx_tag_bibitem">[98]</span> <span class="ltx_bibblock"> Yun, S., Kang, S., Kim, H.: BERT-Based Logits Ensemble Model for Gender Bias and Hate Speech Detection. Journal of Information Processing Systems <span class="ltx_text ltx_font_bold" id="bib.bib98.1.1">19</span>(5), 641–651 (Oct 2023). https://doi.org/10.3745/JIPS.04.0287 </span> </li> <li class="ltx_bibitem" id="bib.bib99"> <span class="ltx_tag ltx_tag_bibitem">[99]</span> <span class="ltx_bibblock"> See, A., Liu, P.J., Manning, C.D.: Get to the point: Summarization with pointer-generator networks. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). pp. 
1073–1083 (2017) </span> </li> <li class="ltx_bibitem" id="bib.bib100"> <span class="ltx_tag ltx_tag_bibitem">[100]</span> <span class="ltx_bibblock"> Selvaraju, R.R., Lee, S., Shen, Y., Jin, H., Ghosh, S., Heck, L., Batra, D., Parikh, D.: Taking a HINT: Leveraging Explanations to Make Vision and Language Models More Grounded. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV). pp. 2591–2600. IEEE, Seoul, Korea (South) (Oct 2019). https://doi.org/10.1109/ICCV.2019.00268 </span> </li> <li class="ltx_bibitem" id="bib.bib101"> <span class="ltx_tag ltx_tag_bibitem">[101]</span> <span class="ltx_bibblock"> Shi, G., Zhou, S., Wang, Y., Shi, C., Liu, L.: Generating Patent Text Abstracts Based on Improved Multi-head Attention Mechanism. Data Analysis and Knowledge Discovery <span class="ltx_text ltx_font_bold" id="bib.bib101.1.1">7</span>(6), 61–72 (2023). https://doi.org/10.11925/infotech.2096-3467.2022.0530 </span> </li> <li class="ltx_bibitem" id="bib.bib102"> <span class="ltx_tag ltx_tag_bibitem">[102]</span> <span class="ltx_bibblock"> Smetona, M.J.: On the Interrelation of Production and Reproduction. Theoria <span class="ltx_text ltx_font_bold" id="bib.bib102.1.1">65</span>(156), 52–75 (Sep 2018). https://doi.org/10.3167/th.2018.6515603 </span> </li> <li class="ltx_bibitem" id="bib.bib103"> <span class="ltx_tag ltx_tag_bibitem">[103]</span> <span class="ltx_bibblock"> Sotudeh, S., Goharian, N.: TSTR: Too Short to Represent, Summarize with Details! Intro-Guided Extended Summary Generation (Jun 2022) </span> </li> <li class="ltx_bibitem" id="bib.bib104"> <span class="ltx_tag ltx_tag_bibitem">[104]</span> <span class="ltx_bibblock"> Steed, R., Caliskan, A.: Image Representations Learned With Unsupervised Pre-Training Contain Human-like Biases. In: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. pp. 701–713 (Mar 2021). 
https://doi.org/10.1145/3442188.3445932 </span> </li> <li class="ltx_bibitem" id="bib.bib105"> <span class="ltx_tag ltx_tag_bibitem">[105]</span> <span class="ltx_bibblock"> Sun, M., Zhang, X., Zheng, J., Ma, G.: DDGCN: Dual Dynamic Graph Convolutional Networks for Rumor Detection on Social Media. Proceedings of the AAAI Conference on Artificial Intelligence <span class="ltx_text ltx_font_bold" id="bib.bib105.1.1">36</span>(4), 4611–4619 (Jun 2022). https://doi.org/10.1609/aaai.v36i4.20385 </span> </li> <li class="ltx_bibitem" id="bib.bib106"> <span class="ltx_tag ltx_tag_bibitem">[106]</span> <span class="ltx_bibblock"> Syed, A.A., Gaol, F.L., Matsuo, T.: A survey of the state-of-the-art models in neural abstractive text summarization. IEEE Access <span class="ltx_text ltx_font_bold" id="bib.bib106.1.1">9</span>, 13248–13265 (2021) </span> </li> <li class="ltx_bibitem" id="bib.bib107"> <span class="ltx_tag ltx_tag_bibitem">[107]</span> <span class="ltx_bibblock"> Lee, T.S., Lee, H.Y., Kang, S.S.: Improving Abstractive Summarization by Training Masked Out-of-Vocabulary Words. Journal of Information Processing Systems <span class="ltx_text ltx_font_bold" id="bib.bib107.1.1">18</span>(3), 344–358 (Jun 2022). https://doi.org/10.3745/JIPS.02.0172 </span> </li> <li class="ltx_bibitem" id="bib.bib108"> <span class="ltx_tag ltx_tag_bibitem">[108]</span> <span class="ltx_bibblock"> Tan, L., Wang, G., Jia, F., Lian, X.: Research status of deep learning methods for rumor detection. Multimedia Tools and Applications <span class="ltx_text ltx_font_bold" id="bib.bib108.1.1">82</span>(2), 2941–2982 (Jan 2023). https://doi.org/10.1007/s11042-022-12800-8 </span> </li> <li class="ltx_bibitem" id="bib.bib109"> <span class="ltx_tag ltx_tag_bibitem">[109]</span> <span class="ltx_bibblock"> Tao, C., Wu, W., Feng, Y., Zhao, D., Yan, R.: Improving Matching Models with Hierarchical Contextualized Representations for Multi-turn Response Selection. 
In: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 1865–1868. ACM, Virtual Event China (Jul 2020). https://doi.org/10.1145/3397271.3401290 </span> </li> <li class="ltx_bibitem" id="bib.bib110"> <span class="ltx_tag ltx_tag_bibitem">[110]</span> <span class="ltx_bibblock"> Tao, C., Wu, W., Xu, C., Hu, W., Zhao, D., Yan, R.: One Time of Interaction May Not Be Enough: Go Deep with an Interaction-over-Interaction Network for Response Selection in Dialogues. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. pp. 1–11. Association for Computational Linguistics, Florence, Italy (2019). https://doi.org/10.18653/v1/P19-1001 </span> </li> <li class="ltx_bibitem" id="bib.bib111"> <span class="ltx_tag ltx_tag_bibitem">[111]</span> <span class="ltx_bibblock"> Teece, D.J.: Explicating dynamic capabilities: The nature and microfoundations of (sustainable) enterprise performance. Strategic Management Journal <span class="ltx_text ltx_font_bold" id="bib.bib111.1.1">28</span>(13), 1319–1350 (Dec 2007). https://doi.org/10.1002/smj.640 </span> </li> <li class="ltx_bibitem" id="bib.bib112"> <span class="ltx_tag ltx_tag_bibitem">[112]</span> <span class="ltx_bibblock"> Teney, D., Abbasnejad, E., Van Den Hengel, A.: Unshuffling Data for Improved Generalization in Visual Question Answering. In: 2021 IEEE/CVF International Conference on Computer Vision (ICCV). pp. 1397–1407. IEEE, Montreal, QC, Canada (Oct 2021). https://doi.org/10.1109/ICCV48922.2021.00145 </span> </li> <li class="ltx_bibitem" id="bib.bib113"> <span class="ltx_tag ltx_tag_bibitem">[113]</span> <span class="ltx_bibblock"> Truong, K.H.V.T., Huynh, V.P., Nguyen, H.D.: Corporate Strategy for Sustainability: Reflections of Prospective Entrepreneurs. Foresight and STI Governance <span class="ltx_text ltx_font_bold" id="bib.bib113.1.1">17</span>(2), 21–34 (2023). 
https://doi.org/10.17323/2500-2597.2023.2.21.34 </span> </li> <li class="ltx_bibitem" id="bib.bib114"> <span class="ltx_tag ltx_tag_bibitem">[114]</span> <span class="ltx_bibblock"> Turing, A.M.: Computing machinery and intelligence. Mind <span class="ltx_text ltx_font_bold" id="bib.bib114.1.1">LIX</span>(236), 433–460 (1950) </span> </li> <li class="ltx_bibitem" id="bib.bib115"> <span class="ltx_tag ltx_tag_bibitem">[115]</span> <span class="ltx_bibblock"> Turner, T.: Marxian value theory: An anthropological perspective. Anthropological Theory <span class="ltx_text ltx_font_bold" id="bib.bib115.1.1">8</span>(1), 43–56 (Mar 2008). https://doi.org/10.1177/1463499607087494 </span> </li> <li class="ltx_bibitem" id="bib.bib116"> <span class="ltx_tag ltx_tag_bibitem">[116]</span> <span class="ltx_bibblock"> Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is All you Need </span> </li> <li class="ltx_bibitem" id="bib.bib117"> <span class="ltx_tag ltx_tag_bibitem">[117]</span> <span class="ltx_bibblock"> Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention Is All You Need (Aug 2017) </span> </li> <li class="ltx_bibitem" id="bib.bib118"> <span class="ltx_tag ltx_tag_bibitem">[118]</span> <span class="ltx_bibblock"> Wan, S., Lan, Y., Guo, J., Xu, J., Pang, L., Cheng, X.: A Deep Architecture for Semantic Matching with Multiple Positional Sentence Representations. Proceedings of the AAAI Conference on Artificial Intelligence <span class="ltx_text ltx_font_bold" id="bib.bib118.1.1">30</span>(1) (Mar 2016). https://doi.org/10.1609/aaai.v30i1.10342 </span> </li> <li class="ltx_bibitem" id="bib.bib119"> <span class="ltx_tag ltx_tag_bibitem">[119]</span> <span class="ltx_bibblock"> Wan, S., Tang, B., Dong, F., Wang, M., Yang, G.: A writing style-based multi-task model with the hierarchical attention for rumor detection. 
International Journal of Machine Learning and Cybernetics <span class="ltx_text ltx_font_bold" id="bib.bib119.1.1">14</span>(11), 3993–4008 (Nov 2023). https://doi.org/10.1007/s13042-023-01877-8 </span> </li> <li class="ltx_bibitem" id="bib.bib120"> <span class="ltx_tag ltx_tag_bibitem">[120]</span> <span class="ltx_bibblock"> Wang, E., Peng, Z., Xie, Z., Liu, X., Cheng, M.M.: GET: Unlocking the Multi-modal Potential of CLIP for Generalized Category Discovery (Mar 2024) </span> </li> <li class="ltx_bibitem" id="bib.bib121"> <span class="ltx_tag ltx_tag_bibitem">[121]</span> <span class="ltx_bibblock"> Wang, S., Jiang, J.: Learning Natural Language Inference with LSTM. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. pp. 1442–1451. Association for Computational Linguistics, San Diego, California (2016). https://doi.org/10.18653/v1/N16-1170 </span> </li> <li class="ltx_bibitem" id="bib.bib122"> <span class="ltx_tag ltx_tag_bibitem">[122]</span> <span class="ltx_bibblock"> Wang, S., Jiang, J.: Machine Comprehension Using Match-LSTM and Answer Pointer (Nov 2016) </span> </li> <li class="ltx_bibitem" id="bib.bib123"> <span class="ltx_tag ltx_tag_bibitem">[123]</span> <span class="ltx_bibblock"> Wang, Z., Wang, Z., Long, Y., Wang, J., Xu, Z., Wang, B.: Enhancing generative conversational service agents with dialog history and external knowledge. Computer Speech &amp; Language <span class="ltx_text ltx_font_bold" id="bib.bib123.1.1">54</span>, 71–85 (Mar 2019). https://doi.org/10.1016/j.csl.2018.09.003 </span> </li> <li class="ltx_bibitem" id="bib.bib124"> <span class="ltx_tag ltx_tag_bibitem">[124]</span> <span class="ltx_bibblock"> Wazery, Y., Saleh, M.E., Alharbi, A., Ali, A.A.: Abstractive Arabic Text Summarization Based on Deep Learning. Computational Intelligence and Neuroscience <span class="ltx_text ltx_font_bold" id="bib.bib124.1.1">2022</span>, 1–14 (Jan 2022). 
https://doi.org/10.1155/2022/1566890 </span> </li> <li class="ltx_bibitem" id="bib.bib125"> <span class="ltx_tag ltx_tag_bibitem">[125]</span> <span class="ltx_bibblock"> Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., Davison, J., Shleifer, S., Von Platen, P., Ma, C., Jernite, Y., Plu, J., Xu, C., Le Scao, T., Gugger, S., Drame, M., Lhoest, Q., Rush, A.: Transformers: State-of-the-Art Natural Language Processing. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations. pp. 38–45. Association for Computational Linguistics, Online (2020). https://doi.org/10.18653/v1/2020.emnlp-demos.6 </span> </li> <li class="ltx_bibitem" id="bib.bib126"> <span class="ltx_tag ltx_tag_bibitem">[126]</span> <span class="ltx_bibblock"> Wu, J., Mooney, R.: Self-Critical Reasoning for Robust Visual Question Answering. Advances in Neural Information Processing Systems (34), 3784–3796 (2021) </span> </li> <li class="ltx_bibitem" id="bib.bib127"> <span class="ltx_tag ltx_tag_bibitem">[127]</span> <span class="ltx_bibblock"> Wu, Y., Wu, W., Xing, C., Zhou, M., Li, Z.: Sequential Matching Network: A New Architecture for Multi-turn Response Selection in Retrieval-Based Chatbots. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). pp. 496–505. Association for Computational Linguistics, Vancouver, Canada (2017). https://doi.org/10.18653/v1/P17-1046 </span> </li> <li class="ltx_bibitem" id="bib.bib128"> <span class="ltx_tag ltx_tag_bibitem">[128]</span> <span class="ltx_bibblock"> Yadav, P.L., Han, S.H., Kim, H.: Sustaining Competitive Advantage Through Corporate Environmental Performance. Business Strategy and the Environment <span class="ltx_text ltx_font_bold" id="bib.bib128.1.1">26</span>(3), 345–357 (Mar 2017). 
https://doi.org/10.1002/bse.1921 </span> </li> <li class="ltx_bibitem" id="bib.bib129"> <span class="ltx_tag ltx_tag_bibitem">[129]</span> <span class="ltx_bibblock"> Yamada, I., Asai, A., Shindo, H., Takeda, H., Matsumoto, Y.: LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). pp. 6442–6454. Association for Computational Linguistics, Online (2020). https://doi.org/10.18653/v1/2020.emnlp-main.523 </span> </li> <li class="ltx_bibitem" id="bib.bib130"> <span class="ltx_tag ltx_tag_bibitem">[130]</span> <span class="ltx_bibblock"> Yang, M., Li, C., Shen, Y., Wu, Q., Zhao, Z., Chen, X.: Hierarchical human-like deep neural networks for abstractive text summarization. IEEE Transactions on Neural Networks and Learning Systems <span class="ltx_text ltx_font_bold" id="bib.bib130.1.1">32</span>(6), 2744–2757 (2020) </span> </li> <li class="ltx_bibitem" id="bib.bib131"> <span class="ltx_tag ltx_tag_bibitem">[131]</span> <span class="ltx_bibblock"> Yang, S., Feng, D., Liu, Y., Li, D.: Distant context aware text generation from abstract meaning representation. Applied Intelligence <span class="ltx_text ltx_font_bold" id="bib.bib131.1.1">52</span>(2), 1672–1685 (Jan 2022). https://doi.org/10.1007/s10489-021-02431-1 </span> </li> <li class="ltx_bibitem" id="bib.bib132"> <span class="ltx_tag ltx_tag_bibitem">[132]</span> <span class="ltx_bibblock"> Yang, Z., Choi, J.D.: FriendsQA: Open-Domain Question Answering on TV Show Transcripts. In: Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue. pp. 188–197. Association for Computational Linguistics, Stockholm, Sweden (2019). 
https://doi.org/10.18653/v1/W19-5923 </span> </li> <li class="ltx_bibitem" id="bib.bib133"> <span class="ltx_tag ltx_tag_bibitem">[133]</span> <span class="ltx_bibblock"> Ye, N., Yu, D., Zhou, Y., Shang, K.k., Zhang, S.: Graph Convolutional-Based Deep Residual Modeling for Rumor Detection on Social Media. Mathematics <span class="ltx_text ltx_font_bold" id="bib.bib133.1.1">11</span>(15),  3393 (Aug 2023). https://doi.org/10.3390/math11153393 </span> </li> <li class="ltx_bibitem" id="bib.bib134"> <span class="ltx_tag ltx_tag_bibitem">[134]</span> <span class="ltx_bibblock"> Yu, D., Zhou, Y., Zhang, S., Liu, C.: Heterogeneous Graph Convolutional Network-Based Dynamic Rumor Detection on Social Media. Complexity <span class="ltx_text ltx_font_bold" id="bib.bib134.1.1">2022</span>, 8393736 (Apr 2022). https://doi.org/10.1155/2022/8393736 </span> </li> <li class="ltx_bibitem" id="bib.bib135"> <span class="ltx_tag ltx_tag_bibitem">[135]</span> <span class="ltx_bibblock"> Yuan, C., Zhou, W., Li, M., Lv, S., Zhu, F., Han, J., Hu, S.: Multi-hop Selector Network for Multi-turn Response Selection in Retrieval-based Chatbots. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). pp. 111–120. Association for Computational Linguistics, Hong Kong, China (2019). https://doi.org/10.18653/v1/D19-1011 </span> </li> <li class="ltx_bibitem" id="bib.bib136"> <span class="ltx_tag ltx_tag_bibitem">[136]</span> <span class="ltx_bibblock"> Zhang, F.: Reducing Multi-model Biases for Robust Visual Question Answering. 
Acta Scientiarum Naturalium Universitatis Pekinensis <span class="ltx_text ltx_font_bold" id="bib.bib136.1.1">60</span>(1) (Jan 2024) </span> </li> <li class="ltx_bibitem" id="bib.bib137"> <span class="ltx_tag ltx_tag_bibitem">[137]</span> <span class="ltx_bibblock"> Zhang, H., Xu, H., Lin, T.E.: Deep Open Intent Classification with Adaptive Decision Boundary. Proceedings of the AAAI Conference on Artificial Intelligence <span class="ltx_text ltx_font_bold" id="bib.bib137.1.1">35</span>(16), 14374–14382 (May 2021). https://doi.org/10.1609/aaai.v35i16.17690 </span> </li> <li class="ltx_bibitem" id="bib.bib138"> <span class="ltx_tag ltx_tag_bibitem">[138]</span> <span class="ltx_bibblock"> Zhang, L., Du, Y., Lv, X.: STNLTP: Generating Chinese Patent Abstracts Based on Integrated Strategy. Data Analysis and Knowledge Discovery <span class="ltx_text ltx_font_bold" id="bib.bib138.1.1">6</span>(7), 107–117 (2022). https://doi.org/10.11925/infotech.2096-3467.2021.1307 </span> </li> <li class="ltx_bibitem" id="bib.bib139"> <span class="ltx_tag ltx_tag_bibitem">[139]</span> <span class="ltx_bibblock"> Zhang, L., Leng, J., Lv, X., Cui, Z., Wang, L., You, X.: RLCPAR: A Rewriting Model for Chinese Patent Abstracts Based on Reinforcement Learning. Data Analysis and Knowledge Discovery <span class="ltx_text ltx_font_bold" id="bib.bib139.1.1">5</span>(7), 59–69 (2021). https://doi.org/10.11925/infotech.2096-3467.2021.0089 </span> </li> <li class="ltx_bibitem" id="bib.bib140"> <span class="ltx_tag ltx_tag_bibitem">[140]</span> <span class="ltx_bibblock"> Zhang, Z., Li, J., Zhao, H.: Multi-Turn Dialogue Reading Comprehension With Pivot Turns and Knowledge. IEEE/ACM Transactions on Audio, Speech, and Language Processing <span class="ltx_text ltx_font_bold" id="bib.bib140.1.1">29</span>, 1161–1173 (2021). 
https://doi.org/10.1109/TASLP.2021.3058616 </span> </li> <li class="ltx_bibitem" id="bib.bib141"> <span class="ltx_tag ltx_tag_bibitem">[141]</span> <span class="ltx_bibblock"> Zhang, Z., Wu, Y., Zhou, J., Duan, S., Zhao, H., Wang, R.: SG-Net: Syntax-Guided Machine Reading Comprehension. Proceedings of the AAAI Conference on Artificial Intelligence <span class="ltx_text ltx_font_bold" id="bib.bib141.1.1">34</span>(05), 9636–9643 (Apr 2020). https://doi.org/10.1609/aaai.v34i05.6511 </span> </li> <li class="ltx_bibitem" id="bib.bib142"> <span class="ltx_tag ltx_tag_bibitem">[142]</span> <span class="ltx_bibblock"> Zhao, Y., Zhao, H., Duan, S.: Multi-Grained Evidence Inference for Multi-Choice Reading Comprehension. IEEE/ACM Transactions on Audio, Speech, and Language Processing <span class="ltx_text ltx_font_bold" id="bib.bib142.1.1">31</span>, 3896–3907 (2023). https://doi.org/10.1109/TASLP.2023.3313885 </span> </li> <li class="ltx_bibitem" id="bib.bib143"> <span class="ltx_tag ltx_tag_bibitem">[143]</span> <span class="ltx_bibblock"> Zhong, L., Wu, J., Li, Q., Peng, H., Wu, X.: A Comprehensive Survey on Automatic Knowledge Graph Construction (Feb 2023) </span> </li> <li class="ltx_bibitem" id="bib.bib144"> <span class="ltx_tag ltx_tag_bibitem">[144]</span> <span class="ltx_bibblock"> Zhong, N., Zhou, G., Ding, W., Zhang, J.: A Rumor Detection Method Based on Multimodal Feature Fusion by a Joining Aggregation Structure. Electronics <span class="ltx_text ltx_font_bold" id="bib.bib144.1.1">11</span>(19),  3200 (Oct 2022). 
https://doi.org/10.3390/electronics11193200 </span> </li> <li class="ltx_bibitem" id="bib.bib145"> <span class="ltx_tag ltx_tag_bibitem">[145]</span> <span class="ltx_bibblock"> Zhong, V., Xiong, C., Keskar, N.S., Socher, R.: Coarse-grain Fine-grain Coattention Network for Multi-evidence Question Answering (arXiv:1901.00603) (May 2019) </span> </li> <li class="ltx_bibitem" id="bib.bib146"> <span class="ltx_tag ltx_tag_bibitem">[146]</span> <span class="ltx_bibblock"> Zhou, X., Dong, D., Wu, H., Zhao, S., Yu, D., Tian, H., Liu, X., Yan, R.: Multi-view Response Selection for Human-Computer Conversation. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. pp. 372–381. Association for Computational Linguistics, Austin, Texas (2016). https://doi.org/10.18653/v1/D16-1036 </span> </li> <li class="ltx_bibitem" id="bib.bib147"> <span class="ltx_tag ltx_tag_bibitem">[147]</span> <span class="ltx_bibblock"> Zhou, X., Dong, D., Wu, H., Zhao, S., Yu, D., Tian, H., Liu, X., Yan, R.: Multi-view Response Selection for Human-Computer Conversation. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. pp. 372–381. Association for Computational Linguistics, Austin, Texas (2016). https://doi.org/10.18653/v1/D16-1036 </span> </li> <li class="ltx_bibitem" id="bib.bib148"> <span class="ltx_tag ltx_tag_bibitem">[148]</span> <span class="ltx_bibblock"> Zhou, X., Li, L., Dong, D., Liu, Y., Chen, Y., Zhao, W.X., Yu, D., Wu, H.: Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). pp. 1118–1127. Association for Computational Linguistics, Melbourne, Australia (2018). 
https://doi.org/10.18653/v1/P18-1103 </span> </li> <li class="ltx_bibitem" id="bib.bib149"> <span class="ltx_tag ltx_tag_bibitem">[149]</span> <span class="ltx_bibblock"> Zhu, X., Wang, J., Zhang, X.: An Enhanced Key-utterance Interactive Model with Decouped Auxiliary Tasks for Multi-party Dialogue Reading Comprehension. In: 2022 International Joint Conference on Neural Networks (IJCNN). pp. 1–8. IEEE, Padua, Italy (Jul 2022). https://doi.org/10.1109/IJCNN55064.2022.9892162 </span> </li> <li class="ltx_bibitem" id="bib.bib150"> <span class="ltx_tag ltx_tag_bibitem">[150]</span> <span class="ltx_bibblock"> Zhu, Y., Wang, G., Li, S., Huang, X.: A Novel Rumor Detection Method Based on Non-Consecutive Semantic Features and Comment Stance. IEEE Access <span class="ltx_text ltx_font_bold" id="bib.bib150.1.1">11</span>, 58016–58024 (2023). https://doi.org/10.1109/ACCESS.2023.3284308 </span> </li> <li class="ltx_bibitem" id="bib.bib151"> <span class="ltx_tag ltx_tag_bibitem">[151]</span> <span class="ltx_bibblock"> Zott, C., Amit, R.: Business Model Design: An Activity System Perspective. Long Range Planning <span class="ltx_text ltx_font_bold" id="bib.bib151.1.1">43</span>(2-3), 216–226 (Apr 2010). 
https://doi.org/10.1016/j.lrp.2009.07.004 </span> </li> </ul> </section> </article> </div> <footer class="ltx_page_footer"> <div class="ltx_page_logo">Generated on Thu Nov 21 12:24:34 2024 by <a class="ltx_LaTeXML_logo" href="http://dlmf.nist.gov/LaTeXML/"><span style="letter-spacing:-0.2em; margin-right:0.1em;">L<span class="ltx_font_smallcaps" style="position:relative; bottom:2.2pt;">a</span>T<span class="ltx_font_smallcaps" style="font-size:120%;position:relative; bottom:-0.2ex;">e</span></span><span style="font-size:90%; position:relative; bottom:-0.2ex;">XML</span></a> </div></footer> </div> </body> </html>
