<!DOCTYPE html> <html lang="en"> <head> <meta content="text/html; charset=utf-8" http-equiv="content-type"/> <title>Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC</title> <!--Generated on Fri Jan 3 12:26:00 2025 by LaTeXML (version 0.8.8) http://dlmf.nist.gov/LaTeXML/.--> <meta content="width=device-width, initial-scale=1, shrink-to-fit=no" name="viewport"/> <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/ar5iv.0.7.9.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/ar5iv-fonts.0.7.9.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/latexml_styles.css" rel="stylesheet" type="text/css"/> <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/js/bootstrap.bundle.min.js"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/html2canvas/1.3.3/html2canvas.min.js"></script> <script src="/static/browse/0.3.4/js/addons_new.js"></script> <script src="/static/browse/0.3.4/js/feedbackOverlay.js"></script> <meta content=" multi-talker speech recognition, speech recognition, Connectionist Temporal Classification, cocktail party problem, speech separation " lang="en" name="keywords"/> <base href="/html/2409.12388v2/"/></head> <body> <nav class="ltx_page_navbar"> <nav class="ltx_TOC"> <ol class="ltx_toclist"> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#S1" title="In Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">I </span><span class="ltx_text ltx_font_smallcaps">Introduction</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#S2" title="In Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware 
CTC"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">II </span><span class="ltx_text ltx_font_smallcaps">Methods</span></span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#S2.SS1" title="In II Methods ‣ Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">II-A</span> </span><span class="ltx_text ltx_font_italic">Revisit CTC in speech recognition</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#S2.SS2" title="In II Methods ‣ Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">II-B</span> </span><span class="ltx_text ltx_font_italic">Speaker-aware CTC based on minimizing Bayes risk</span></span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#S3" title="In Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">III </span><span class="ltx_text ltx_font_smallcaps">Experimental setup</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#S4" title="In Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">IV </span><span class="ltx_text ltx_font_smallcaps">Results and discussions</span></span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#S4.SS1" 
title="In IV Results and discussions ‣ Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">IV-A</span> </span><span class="ltx_text ltx_font_italic">Analysis of vanilla CTC</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#S4.SS2" title="In IV Results and discussions ‣ Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">IV-B</span> </span><span class="ltx_text ltx_font_italic">Performance of SACTC</span></span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#S5" title="In Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">V </span><span class="ltx_text ltx_font_smallcaps">conclusions</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#S6" title="In Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">VI </span><span class="ltx_text ltx_font_smallcaps">Acknowledgements</span></span></a></li> </ol></nav> </nav> <div class="ltx_page_main"> <div class="ltx_page_content"> <article class="ltx_document ltx_authors_1line"> <h1 class="ltx_title ltx_title_document"> Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC </h1> <div class="ltx_authors"> <span class="ltx_creator ltx_role_author"> <span class="ltx_personname"> <span class="ltx_text ltx_font_italic" id="id1.1.id1" style="font-size:144%;">Jiawen Kang, Lingwei Meng, Mingyu Cui, Yuejiao Wang, Xixin Wu, Xunying 
Liu, Helen Meng<span class="ltx_text ltx_font_upright" id="id1.1.id1.1"> <br class="ltx_break"/>The Chinese University of Hong Kong, Hong Kong SAR, China </span></span> </span></span> </div> <div class="ltx_abstract"> <h6 class="ltx_title ltx_title_abstract">Abstract</h6> <p class="ltx_p" id="id2.id1">Multi-talker speech recognition (MTASR) faces unique challenges in disentangling and transcribing overlapping speech. To address these challenges, this paper investigates the role of Connectionist Temporal Classification (CTC) in speaker disentanglement when combined with Serialized Output Training (SOT) for MTASR. Our visualization reveals that CTC guides the encoder to represent different speakers in distinct temporal regions of acoustic embeddings. Leveraging this insight, we propose a novel Speaker-Aware CTC (SACTC) training objective, based on the Bayes risk CTC framework. SACTC is a tailored CTC variant for multi-talker scenarios: it explicitly models speaker disentanglement by constraining the encoder to represent different speakers’ tokens at specific time frames. When integrated with SOT, the SOT-SACTC model consistently outperforms standard SOT-CTC across various degrees of speech overlap. Specifically, we observe relative word error rate reductions of 10% overall and 15% on low-overlap speech. This work represents an initial exploration of CTC-based enhancements for MTASR tasks, offering a new perspective on speaker disentanglement in multi-talker speech recognition. 
The code is available at <a class="ltx_ref ltx_href" href="https://github.com/kjw11/Speaker-Aware-CTC" style="color:#0000FF;" title="">https://github.com/kjw11/Speaker-Aware-CTC</a>.</p> </div> <div class="ltx_keywords"> <h6 class="ltx_title ltx_title_keywords">Index Terms: </h6> multi-talker speech recognition, speech recognition, Connectionist Temporal Classification, cocktail party problem, speech separation </div> <section class="ltx_section" id="S1"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">I </span><span class="ltx_text ltx_font_smallcaps" id="S1.1.1">Introduction</span> </h2> <div class="ltx_para" id="S1.p1"> <p class="ltx_p" id="S1.p1.1">Natural human conversations always involve multiple speakers, with varying degrees of speech overlap. Multi-talker speech recognition (MTASR) has emerged as a critical field, aiming to transcribe such natural conversational speech. While traditional automatic speech recognition (ASR) tasks typically perform monotonic speech-to-text sequence mapping, MTASR presents unique challenges: recognition models are required to disentangle speech from distinct speakers and transcribe each speaker's speech separately.</p> </div> <div class="ltx_para" id="S1.p2"> <p class="ltx_p" id="S1.p2.1">In recent years, many approaches have been proposed to address this challenge. These approaches can be categorized into two types based on how they differentiate speakers: branched acoustic encoder (BAE) based models and serialized output training <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib1" title="">1</a>]</cite> based models. BAE models leverage structural priors to disentangle speakers: they separate mixed speech into independent branches, then use shared recognition blocks to transcribe different speakers in parallel. 
To align branches with their respective speakers, permutation invariant training (PIT) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib2" title="">2</a>, <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib3" title="">3</a>]</cite> is applied to calculate the ASR loss. Yu et al. <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib4" title="">4</a>]</cite> first adopted a BAE-style model with PIT loss. Seki et al. <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib5" title="">5</a>]</cite> further extended this approach in a fully end-to-end manner. Subsequently, Chang et al. <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib6" title="">6</a>]</cite> incorporated a transformer backbone into this framework. Further works <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib7" title="">7</a>, <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib8" title="">8</a>, <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib9" title="">9</a>]</cite> explored streaming ASR following the BAE framework. 
More recently, sidecar separator-based methods <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib10" title="">10</a>, <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib11" title="">11</a>, <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib12" title="">12</a>]</cite> were proposed to convert a single-talker ASR system into a multi-talker one through model-internal separation.</p> </div> <div class="ltx_para" id="S1.p3"> <p class="ltx_p" id="S1.p3.1">Another line of research builds on Serialized Output Training (SOT) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib1" title="">1</a>]</cite>. The SOT approach serializes text from different speakers as a single stream, with a special token <math alttext="\langle sc\rangle" class="ltx_Math" display="inline" id="S1.p3.1.m1.1"><semantics id="S1.p3.1.m1.1a"><mrow id="S1.p3.1.m1.1.1.1" xref="S1.p3.1.m1.1.1.2.cmml"><mo id="S1.p3.1.m1.1.1.1.2" stretchy="false" xref="S1.p3.1.m1.1.1.2.1.cmml">⟨</mo><mrow id="S1.p3.1.m1.1.1.1.1" xref="S1.p3.1.m1.1.1.1.1.cmml"><mi id="S1.p3.1.m1.1.1.1.1.2" xref="S1.p3.1.m1.1.1.1.1.2.cmml">s</mi><mo id="S1.p3.1.m1.1.1.1.1.1" xref="S1.p3.1.m1.1.1.1.1.1.cmml">⁢</mo><mi id="S1.p3.1.m1.1.1.1.1.3" xref="S1.p3.1.m1.1.1.1.1.3.cmml">c</mi></mrow><mo id="S1.p3.1.m1.1.1.1.3" stretchy="false" xref="S1.p3.1.m1.1.1.2.1.cmml">⟩</mo></mrow><annotation-xml encoding="MathML-Content" id="S1.p3.1.m1.1b"><apply id="S1.p3.1.m1.1.1.2.cmml" xref="S1.p3.1.m1.1.1.1"><csymbol cd="latexml" id="S1.p3.1.m1.1.1.2.1.cmml" xref="S1.p3.1.m1.1.1.1.2">delimited-⟨⟩</csymbol><apply id="S1.p3.1.m1.1.1.1.1.cmml" xref="S1.p3.1.m1.1.1.1.1"><times id="S1.p3.1.m1.1.1.1.1.1.cmml" xref="S1.p3.1.m1.1.1.1.1.1"></times><ci id="S1.p3.1.m1.1.1.1.1.2.cmml" xref="S1.p3.1.m1.1.1.1.1.2">𝑠</ci><ci id="S1.p3.1.m1.1.1.1.1.3.cmml" 
xref="S1.p3.1.m1.1.1.1.1.3">𝑐</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S1.p3.1.m1.1c">\langle sc\rangle</annotation><annotation encoding="application/x-llamapun" id="S1.p3.1.m1.1d">⟨ italic_s italic_c ⟩</annotation></semantics></math> as a delimiter between speakers. In contrast to the structural priors of BAE models, this approach relies on the attention mechanism of attention encoder-decoder (AED) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib13" title="">13</a>]</cite> models to disambiguate speakers. An advantage of this design is that it does not require pre-defining the number of speakers or branches, allowing it to handle a variable number of speakers. The superiority of SOT methods has been demonstrated in the M2Met challenges <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib14" title="">14</a>, <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib15" title="">15</a>]</cite>, which provided challenging “in the wild” multi-talker meeting speech. 
SOT methods have been further enhanced with speaker information <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib16" title="">16</a>, <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib17" title="">17</a>]</cite>, time boundaries <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib18" title="">18</a>, <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib19" title="">19</a>]</cite>, learnable speaker orders <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib20" title="">20</a>]</cite>, large language models <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib21" title="">21</a>]</cite>, and have been integrated with BAE structures as a hybrid branched SOT model <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib22" title="">22</a>]</cite>.</p> </div> <div class="ltx_para" id="S1.p4"> <p class="ltx_p" id="S1.p4.1">In contrast to these two paradigms, the role of connectionist temporal classification (CTC) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib23" title="">23</a>]</cite> in MTASR has received little investigation. CTC has become a fundamental training criterion for sequence-to-sequence tasks including speech recognition. Specifically, CTC introduces a blank token to construct alignments between input and target sequences, providing a method to compute posterior probabilities by summing over all possible alignment paths. 
Compared to other ASR architectures such as AED and Neural Transducer <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib24" title="">24</a>, <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib25" title="">25</a>]</cite>, CTC generates all tokens in the sequence simultaneously in a non-autoregressive manner, thus enabling faster decoding. CTC has also been used together with AED models as a joint CTC/Attention model <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib26" title="">26</a>]</cite>, which has long been considered one of the state-of-the-art approaches for speech recognition. In the MTASR domain, although the original SOT adopted the AED architecture without a CTC loss, many studies have empirically demonstrated that the joint CTC/Attention SOT model can effectively improve recognition accuracy on overlapped speech <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib27" title="">27</a>, <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib28" title="">28</a>, <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib29" title="">29</a>, <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib18" title="">18</a>]</cite>. However, given CTC’s monotonicity assumption, it is counter-intuitive that CTC can non-monotonically map overlapped speech to serialized transcriptions, and little literature has investigated these results.</p> </div> <div class="ltx_para" id="S1.p5"> <p class="ltx_p" id="S1.p5.1">In this work, we investigate the effect of CTC, especially when combined with SOT. Our experiments with conformer encoders reveal that CTC loss enables the acoustic encoder to represent different speakers in distinct temporal regions of the acoustic embeddings. 
Distinct from existing BAE and SOT approaches, we attribute CTC’s speaker distinction capability to its non-autoregressive reordering capability <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib30" title="">30</a>, <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib31" title="">31</a>]</cite>, which is potentially a novel direction for speaker disentanglement in MTASR. Building on this insight, we propose a novel speaker-aware CTC (SACTC) as an enhanced and tailored CTC variant for multi-talker scenarios. SACTC explicitly models speaker disentanglement by constraining the encoder to represent different speakers’ tokens at specific temporal locations. This is achieved within the Bayes risk CTC framework, where we introduce a speaker-aware risk function to penalize CTC paths with undesired token emissions. In our experiments, SACTC is used as an auxiliary loss for SOT-based MTASR models. Experimental results demonstrate that the SOT-SACTC model consistently outperforms the standard SOT-CTC approach across various degrees of speech overlap. Notably, we observe relative word error rate reductions of 10% overall and 15% on low-overlap speech. 
To our knowledge, this work represents the first exploration of CTC-based enhancements for MTASR tasks.</p> </div> </section> <section class="ltx_section" id="S2"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">II </span><span class="ltx_text ltx_font_smallcaps" id="S2.1.1">Methods</span> </h2> <section class="ltx_subsection" id="S2.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S2.SS1.5.1.1">II-A</span> </span><span class="ltx_text ltx_font_italic" id="S2.SS1.6.2">Revisit CTC in speech recognition</span> </h3> <div class="ltx_para" id="S2.SS1.p1"> <p class="ltx_p" id="S2.SS1.p1.15">CTC loss guides sequence-to-sequence models by maximizing the posterior probability <math alttext="p(l|x)" class="ltx_Math" display="inline" id="S2.SS1.p1.1.m1.1"><semantics id="S2.SS1.p1.1.m1.1a"><mrow id="S2.SS1.p1.1.m1.1.1" xref="S2.SS1.p1.1.m1.1.1.cmml"><mi id="S2.SS1.p1.1.m1.1.1.3" xref="S2.SS1.p1.1.m1.1.1.3.cmml">p</mi><mo id="S2.SS1.p1.1.m1.1.1.2" xref="S2.SS1.p1.1.m1.1.1.2.cmml"></mo><mrow id="S2.SS1.p1.1.m1.1.1.1.1" xref="S2.SS1.p1.1.m1.1.1.1.1.1.cmml"><mo id="S2.SS1.p1.1.m1.1.1.1.1.2" stretchy="false" xref="S2.SS1.p1.1.m1.1.1.1.1.1.cmml">(</mo><mrow id="S2.SS1.p1.1.m1.1.1.1.1.1" xref="S2.SS1.p1.1.m1.1.1.1.1.1.cmml"><mi id="S2.SS1.p1.1.m1.1.1.1.1.1.2" xref="S2.SS1.p1.1.m1.1.1.1.1.1.2.cmml">l</mi><mo fence="false" id="S2.SS1.p1.1.m1.1.1.1.1.1.1" xref="S2.SS1.p1.1.m1.1.1.1.1.1.1.cmml">|</mo><mi id="S2.SS1.p1.1.m1.1.1.1.1.1.3" xref="S2.SS1.p1.1.m1.1.1.1.1.1.3.cmml">x</mi></mrow><mo id="S2.SS1.p1.1.m1.1.1.1.1.3" stretchy="false" xref="S2.SS1.p1.1.m1.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.1.m1.1b"><apply id="S2.SS1.p1.1.m1.1.1.cmml" xref="S2.SS1.p1.1.m1.1.1"><times id="S2.SS1.p1.1.m1.1.1.2.cmml" xref="S2.SS1.p1.1.m1.1.1.2"></times><ci id="S2.SS1.p1.1.m1.1.1.3.cmml" xref="S2.SS1.p1.1.m1.1.1.3">𝑝</ci><apply 
id="S2.SS1.p1.1.m1.1.1.1.1.1.cmml" xref="S2.SS1.p1.1.m1.1.1.1.1"><csymbol cd="latexml" id="S2.SS1.p1.1.m1.1.1.1.1.1.1.cmml" xref="S2.SS1.p1.1.m1.1.1.1.1.1.1">conditional</csymbol><ci id="S2.SS1.p1.1.m1.1.1.1.1.1.2.cmml" xref="S2.SS1.p1.1.m1.1.1.1.1.1.2">𝑙</ci><ci id="S2.SS1.p1.1.m1.1.1.1.1.1.3.cmml" xref="S2.SS1.p1.1.m1.1.1.1.1.1.3">𝑥</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.1.m1.1c">p(l|x)</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.1.m1.1d">italic_p ( italic_l | italic_x )</annotation></semantics></math> of the target sequence, where <math alttext="x=[x_{1},...,x_{T}]" class="ltx_Math" display="inline" id="S2.SS1.p1.2.m2.3"><semantics id="S2.SS1.p1.2.m2.3a"><mrow id="S2.SS1.p1.2.m2.3.3" xref="S2.SS1.p1.2.m2.3.3.cmml"><mi id="S2.SS1.p1.2.m2.3.3.4" xref="S2.SS1.p1.2.m2.3.3.4.cmml">x</mi><mo id="S2.SS1.p1.2.m2.3.3.3" xref="S2.SS1.p1.2.m2.3.3.3.cmml">=</mo><mrow id="S2.SS1.p1.2.m2.3.3.2.2" xref="S2.SS1.p1.2.m2.3.3.2.3.cmml"><mo id="S2.SS1.p1.2.m2.3.3.2.2.3" stretchy="false" xref="S2.SS1.p1.2.m2.3.3.2.3.cmml">[</mo><msub id="S2.SS1.p1.2.m2.2.2.1.1.1" xref="S2.SS1.p1.2.m2.2.2.1.1.1.cmml"><mi id="S2.SS1.p1.2.m2.2.2.1.1.1.2" xref="S2.SS1.p1.2.m2.2.2.1.1.1.2.cmml">x</mi><mn id="S2.SS1.p1.2.m2.2.2.1.1.1.3" xref="S2.SS1.p1.2.m2.2.2.1.1.1.3.cmml">1</mn></msub><mo id="S2.SS1.p1.2.m2.3.3.2.2.4" xref="S2.SS1.p1.2.m2.3.3.2.3.cmml">,</mo><mi id="S2.SS1.p1.2.m2.1.1" mathvariant="normal" xref="S2.SS1.p1.2.m2.1.1.cmml">…</mi><mo id="S2.SS1.p1.2.m2.3.3.2.2.5" xref="S2.SS1.p1.2.m2.3.3.2.3.cmml">,</mo><msub id="S2.SS1.p1.2.m2.3.3.2.2.2" xref="S2.SS1.p1.2.m2.3.3.2.2.2.cmml"><mi id="S2.SS1.p1.2.m2.3.3.2.2.2.2" xref="S2.SS1.p1.2.m2.3.3.2.2.2.2.cmml">x</mi><mi id="S2.SS1.p1.2.m2.3.3.2.2.2.3" xref="S2.SS1.p1.2.m2.3.3.2.2.2.3.cmml">T</mi></msub><mo id="S2.SS1.p1.2.m2.3.3.2.2.6" stretchy="false" xref="S2.SS1.p1.2.m2.3.3.2.3.cmml">]</mo></mrow></mrow><annotation-xml encoding="MathML-Content" 
id="S2.SS1.p1.2.m2.3b"><apply id="S2.SS1.p1.2.m2.3.3.cmml" xref="S2.SS1.p1.2.m2.3.3"><eq id="S2.SS1.p1.2.m2.3.3.3.cmml" xref="S2.SS1.p1.2.m2.3.3.3"></eq><ci id="S2.SS1.p1.2.m2.3.3.4.cmml" xref="S2.SS1.p1.2.m2.3.3.4">𝑥</ci><list id="S2.SS1.p1.2.m2.3.3.2.3.cmml" xref="S2.SS1.p1.2.m2.3.3.2.2"><apply id="S2.SS1.p1.2.m2.2.2.1.1.1.cmml" xref="S2.SS1.p1.2.m2.2.2.1.1.1"><csymbol cd="ambiguous" id="S2.SS1.p1.2.m2.2.2.1.1.1.1.cmml" xref="S2.SS1.p1.2.m2.2.2.1.1.1">subscript</csymbol><ci id="S2.SS1.p1.2.m2.2.2.1.1.1.2.cmml" xref="S2.SS1.p1.2.m2.2.2.1.1.1.2">𝑥</ci><cn id="S2.SS1.p1.2.m2.2.2.1.1.1.3.cmml" type="integer" xref="S2.SS1.p1.2.m2.2.2.1.1.1.3">1</cn></apply><ci id="S2.SS1.p1.2.m2.1.1.cmml" xref="S2.SS1.p1.2.m2.1.1">…</ci><apply id="S2.SS1.p1.2.m2.3.3.2.2.2.cmml" xref="S2.SS1.p1.2.m2.3.3.2.2.2"><csymbol cd="ambiguous" id="S2.SS1.p1.2.m2.3.3.2.2.2.1.cmml" xref="S2.SS1.p1.2.m2.3.3.2.2.2">subscript</csymbol><ci id="S2.SS1.p1.2.m2.3.3.2.2.2.2.cmml" xref="S2.SS1.p1.2.m2.3.3.2.2.2.2">𝑥</ci><ci id="S2.SS1.p1.2.m2.3.3.2.2.2.3.cmml" xref="S2.SS1.p1.2.m2.3.3.2.2.2.3">𝑇</ci></apply></list></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.2.m2.3c">x=[x_{1},...,x_{T}]</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.2.m2.3d">italic_x = [ italic_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_x start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT ]</annotation></semantics></math> represents the input acoustic embedding, e.g., from an acoustic encoder, and <math alttext="l=[l_{1},...,l_{U}]" class="ltx_Math" display="inline" id="S2.SS1.p1.3.m3.3"><semantics id="S2.SS1.p1.3.m3.3a"><mrow id="S2.SS1.p1.3.m3.3.3" xref="S2.SS1.p1.3.m3.3.3.cmml"><mi id="S2.SS1.p1.3.m3.3.3.4" xref="S2.SS1.p1.3.m3.3.3.4.cmml">l</mi><mo id="S2.SS1.p1.3.m3.3.3.3" xref="S2.SS1.p1.3.m3.3.3.3.cmml">=</mo><mrow id="S2.SS1.p1.3.m3.3.3.2.2" xref="S2.SS1.p1.3.m3.3.3.2.3.cmml"><mo id="S2.SS1.p1.3.m3.3.3.2.2.3" stretchy="false" 
xref="S2.SS1.p1.3.m3.3.3.2.3.cmml">[</mo><msub id="S2.SS1.p1.3.m3.2.2.1.1.1" xref="S2.SS1.p1.3.m3.2.2.1.1.1.cmml"><mi id="S2.SS1.p1.3.m3.2.2.1.1.1.2" xref="S2.SS1.p1.3.m3.2.2.1.1.1.2.cmml">l</mi><mn id="S2.SS1.p1.3.m3.2.2.1.1.1.3" xref="S2.SS1.p1.3.m3.2.2.1.1.1.3.cmml">1</mn></msub><mo id="S2.SS1.p1.3.m3.3.3.2.2.4" xref="S2.SS1.p1.3.m3.3.3.2.3.cmml">,</mo><mi id="S2.SS1.p1.3.m3.1.1" mathvariant="normal" xref="S2.SS1.p1.3.m3.1.1.cmml">…</mi><mo id="S2.SS1.p1.3.m3.3.3.2.2.5" xref="S2.SS1.p1.3.m3.3.3.2.3.cmml">,</mo><msub id="S2.SS1.p1.3.m3.3.3.2.2.2" xref="S2.SS1.p1.3.m3.3.3.2.2.2.cmml"><mi id="S2.SS1.p1.3.m3.3.3.2.2.2.2" xref="S2.SS1.p1.3.m3.3.3.2.2.2.2.cmml">l</mi><mi id="S2.SS1.p1.3.m3.3.3.2.2.2.3" xref="S2.SS1.p1.3.m3.3.3.2.2.2.3.cmml">U</mi></msub><mo id="S2.SS1.p1.3.m3.3.3.2.2.6" stretchy="false" xref="S2.SS1.p1.3.m3.3.3.2.3.cmml">]</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.3.m3.3b"><apply id="S2.SS1.p1.3.m3.3.3.cmml" xref="S2.SS1.p1.3.m3.3.3"><eq id="S2.SS1.p1.3.m3.3.3.3.cmml" xref="S2.SS1.p1.3.m3.3.3.3"></eq><ci id="S2.SS1.p1.3.m3.3.3.4.cmml" xref="S2.SS1.p1.3.m3.3.3.4">𝑙</ci><list id="S2.SS1.p1.3.m3.3.3.2.3.cmml" xref="S2.SS1.p1.3.m3.3.3.2.2"><apply id="S2.SS1.p1.3.m3.2.2.1.1.1.cmml" xref="S2.SS1.p1.3.m3.2.2.1.1.1"><csymbol cd="ambiguous" id="S2.SS1.p1.3.m3.2.2.1.1.1.1.cmml" xref="S2.SS1.p1.3.m3.2.2.1.1.1">subscript</csymbol><ci id="S2.SS1.p1.3.m3.2.2.1.1.1.2.cmml" xref="S2.SS1.p1.3.m3.2.2.1.1.1.2">𝑙</ci><cn id="S2.SS1.p1.3.m3.2.2.1.1.1.3.cmml" type="integer" xref="S2.SS1.p1.3.m3.2.2.1.1.1.3">1</cn></apply><ci id="S2.SS1.p1.3.m3.1.1.cmml" xref="S2.SS1.p1.3.m3.1.1">…</ci><apply id="S2.SS1.p1.3.m3.3.3.2.2.2.cmml" xref="S2.SS1.p1.3.m3.3.3.2.2.2"><csymbol cd="ambiguous" id="S2.SS1.p1.3.m3.3.3.2.2.2.1.cmml" xref="S2.SS1.p1.3.m3.3.3.2.2.2">subscript</csymbol><ci id="S2.SS1.p1.3.m3.3.3.2.2.2.2.cmml" xref="S2.SS1.p1.3.m3.3.3.2.2.2.2">𝑙</ci><ci id="S2.SS1.p1.3.m3.3.3.2.2.2.3.cmml" 
xref="S2.SS1.p1.3.m3.3.3.2.2.2.3">𝑈</ci></apply></list></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.3.m3.3c">l=[l_{1},...,l_{U}]</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.3.m3.3d">italic_l = [ italic_l start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_l start_POSTSUBSCRIPT italic_U end_POSTSUBSCRIPT ]</annotation></semantics></math> represents the transcription label sequence. To compensate for the length discrepancy between <math alttext="x" class="ltx_Math" display="inline" id="S2.SS1.p1.4.m4.1"><semantics id="S2.SS1.p1.4.m4.1a"><mi id="S2.SS1.p1.4.m4.1.1" xref="S2.SS1.p1.4.m4.1.1.cmml">x</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.4.m4.1b"><ci id="S2.SS1.p1.4.m4.1.1.cmml" xref="S2.SS1.p1.4.m4.1.1">𝑥</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.4.m4.1c">x</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.4.m4.1d">italic_x</annotation></semantics></math> and <math alttext="l" class="ltx_Math" display="inline" id="S2.SS1.p1.5.m5.1"><semantics id="S2.SS1.p1.5.m5.1a"><mi id="S2.SS1.p1.5.m5.1.1" xref="S2.SS1.p1.5.m5.1.1.cmml">l</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.5.m5.1b"><ci id="S2.SS1.p1.5.m5.1.1.cmml" xref="S2.SS1.p1.5.m5.1.1">𝑙</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.5.m5.1c">l</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.5.m5.1d">italic_l</annotation></semantics></math>, CTC introduces blank tokens <math alttext="\varnothing" class="ltx_Math" display="inline" id="S2.SS1.p1.6.m6.1"><semantics id="S2.SS1.p1.6.m6.1a"><mi id="S2.SS1.p1.6.m6.1.1" mathvariant="normal" xref="S2.SS1.p1.6.m6.1.1.cmml">∅</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.6.m6.1b"><emptyset id="S2.SS1.p1.6.m6.1.1.cmml" xref="S2.SS1.p1.6.m6.1.1"></emptyset></annotation-xml><annotation encoding="application/x-tex" 
id="S2.SS1.p1.6.m6.1c">\varnothing</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.6.m6.1d">∅</annotation></semantics></math> into the label sequence <math alttext="l" class="ltx_Math" display="inline" id="S2.SS1.p1.7.m7.1"><semantics id="S2.SS1.p1.7.m7.1a"><mi id="S2.SS1.p1.7.m7.1.1" xref="S2.SS1.p1.7.m7.1.1.cmml">l</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.7.m7.1b"><ci id="S2.SS1.p1.7.m7.1.1.cmml" xref="S2.SS1.p1.7.m7.1.1">𝑙</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.7.m7.1c">l</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.7.m7.1d">italic_l</annotation></semantics></math> to construct alignment paths (also known as CTC labels) <math alttext="\pi=[\pi_{1},...,\pi_{T}]" class="ltx_Math" display="inline" id="S2.SS1.p1.8.m8.3"><semantics id="S2.SS1.p1.8.m8.3a"><mrow id="S2.SS1.p1.8.m8.3.3" xref="S2.SS1.p1.8.m8.3.3.cmml"><mi id="S2.SS1.p1.8.m8.3.3.4" xref="S2.SS1.p1.8.m8.3.3.4.cmml">π</mi><mo id="S2.SS1.p1.8.m8.3.3.3" xref="S2.SS1.p1.8.m8.3.3.3.cmml">=</mo><mrow id="S2.SS1.p1.8.m8.3.3.2.2" xref="S2.SS1.p1.8.m8.3.3.2.3.cmml"><mo id="S2.SS1.p1.8.m8.3.3.2.2.3" stretchy="false" xref="S2.SS1.p1.8.m8.3.3.2.3.cmml">[</mo><msub id="S2.SS1.p1.8.m8.2.2.1.1.1" xref="S2.SS1.p1.8.m8.2.2.1.1.1.cmml"><mi id="S2.SS1.p1.8.m8.2.2.1.1.1.2" xref="S2.SS1.p1.8.m8.2.2.1.1.1.2.cmml">π</mi><mn id="S2.SS1.p1.8.m8.2.2.1.1.1.3" xref="S2.SS1.p1.8.m8.2.2.1.1.1.3.cmml">1</mn></msub><mo id="S2.SS1.p1.8.m8.3.3.2.2.4" xref="S2.SS1.p1.8.m8.3.3.2.3.cmml">,</mo><mi id="S2.SS1.p1.8.m8.1.1" mathvariant="normal" xref="S2.SS1.p1.8.m8.1.1.cmml">…</mi><mo id="S2.SS1.p1.8.m8.3.3.2.2.5" xref="S2.SS1.p1.8.m8.3.3.2.3.cmml">,</mo><msub id="S2.SS1.p1.8.m8.3.3.2.2.2" xref="S2.SS1.p1.8.m8.3.3.2.2.2.cmml"><mi id="S2.SS1.p1.8.m8.3.3.2.2.2.2" xref="S2.SS1.p1.8.m8.3.3.2.2.2.2.cmml">π</mi><mi id="S2.SS1.p1.8.m8.3.3.2.2.2.3" xref="S2.SS1.p1.8.m8.3.3.2.2.2.3.cmml">T</mi></msub><mo id="S2.SS1.p1.8.m8.3.3.2.2.6" 
stretchy="false" xref="S2.SS1.p1.8.m8.3.3.2.3.cmml">]</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.8.m8.3b"><apply id="S2.SS1.p1.8.m8.3.3.cmml" xref="S2.SS1.p1.8.m8.3.3"><eq id="S2.SS1.p1.8.m8.3.3.3.cmml" xref="S2.SS1.p1.8.m8.3.3.3"></eq><ci id="S2.SS1.p1.8.m8.3.3.4.cmml" xref="S2.SS1.p1.8.m8.3.3.4">𝜋</ci><list id="S2.SS1.p1.8.m8.3.3.2.3.cmml" xref="S2.SS1.p1.8.m8.3.3.2.2"><apply id="S2.SS1.p1.8.m8.2.2.1.1.1.cmml" xref="S2.SS1.p1.8.m8.2.2.1.1.1"><csymbol cd="ambiguous" id="S2.SS1.p1.8.m8.2.2.1.1.1.1.cmml" xref="S2.SS1.p1.8.m8.2.2.1.1.1">subscript</csymbol><ci id="S2.SS1.p1.8.m8.2.2.1.1.1.2.cmml" xref="S2.SS1.p1.8.m8.2.2.1.1.1.2">𝜋</ci><cn id="S2.SS1.p1.8.m8.2.2.1.1.1.3.cmml" type="integer" xref="S2.SS1.p1.8.m8.2.2.1.1.1.3">1</cn></apply><ci id="S2.SS1.p1.8.m8.1.1.cmml" xref="S2.SS1.p1.8.m8.1.1">…</ci><apply id="S2.SS1.p1.8.m8.3.3.2.2.2.cmml" xref="S2.SS1.p1.8.m8.3.3.2.2.2"><csymbol cd="ambiguous" id="S2.SS1.p1.8.m8.3.3.2.2.2.1.cmml" xref="S2.SS1.p1.8.m8.3.3.2.2.2">subscript</csymbol><ci id="S2.SS1.p1.8.m8.3.3.2.2.2.2.cmml" xref="S2.SS1.p1.8.m8.3.3.2.2.2.2">𝜋</ci><ci id="S2.SS1.p1.8.m8.3.3.2.2.2.3.cmml" xref="S2.SS1.p1.8.m8.3.3.2.2.2.3">𝑇</ci></apply></list></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.8.m8.3c">\pi=[\pi_{1},...,\pi_{T}]</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.8.m8.3d">italic_π = [ italic_π start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_π start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT ]</annotation></semantics></math> between <math alttext="x" class="ltx_Math" display="inline" id="S2.SS1.p1.9.m9.1"><semantics id="S2.SS1.p1.9.m9.1a"><mi id="S2.SS1.p1.9.m9.1.1" xref="S2.SS1.p1.9.m9.1.1.cmml">x</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.9.m9.1b"><ci id="S2.SS1.p1.9.m9.1.1.cmml" xref="S2.SS1.p1.9.m9.1.1">𝑥</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.9.m9.1c">x</annotation><annotation 
encoding="application/x-llamapun" id="S2.SS1.p1.9.m9.1d">italic_x</annotation></semantics></math> and <math alttext="l" class="ltx_Math" display="inline" id="S2.SS1.p1.10.m10.1"><semantics id="S2.SS1.p1.10.m10.1a"><mi id="S2.SS1.p1.10.m10.1.1" xref="S2.SS1.p1.10.m10.1.1.cmml">l</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.10.m10.1b"><ci id="S2.SS1.p1.10.m10.1.1.cmml" xref="S2.SS1.p1.10.m10.1.1">𝑙</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.10.m10.1c">l</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.10.m10.1d">italic_l</annotation></semantics></math>. <math alttext="\pi_{t}" class="ltx_Math" display="inline" id="S2.SS1.p1.11.m11.1"><semantics id="S2.SS1.p1.11.m11.1a"><msub id="S2.SS1.p1.11.m11.1.1" xref="S2.SS1.p1.11.m11.1.1.cmml"><mi id="S2.SS1.p1.11.m11.1.1.2" xref="S2.SS1.p1.11.m11.1.1.2.cmml">π</mi><mi id="S2.SS1.p1.11.m11.1.1.3" xref="S2.SS1.p1.11.m11.1.1.3.cmml">t</mi></msub><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.11.m11.1b"><apply id="S2.SS1.p1.11.m11.1.1.cmml" xref="S2.SS1.p1.11.m11.1.1"><csymbol cd="ambiguous" id="S2.SS1.p1.11.m11.1.1.1.cmml" xref="S2.SS1.p1.11.m11.1.1">subscript</csymbol><ci id="S2.SS1.p1.11.m11.1.1.2.cmml" xref="S2.SS1.p1.11.m11.1.1.2">𝜋</ci><ci id="S2.SS1.p1.11.m11.1.1.3.cmml" xref="S2.SS1.p1.11.m11.1.1.3">𝑡</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.11.m11.1c">\pi_{t}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.11.m11.1d">italic_π start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math> denotes the output token at time step <math alttext="t" class="ltx_Math" display="inline" id="S2.SS1.p1.12.m12.1"><semantics id="S2.SS1.p1.12.m12.1a"><mi id="S2.SS1.p1.12.m12.1.1" xref="S2.SS1.p1.12.m12.1.1.cmml">t</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.12.m12.1b"><ci id="S2.SS1.p1.12.m12.1.1.cmml" 
xref="S2.SS1.p1.12.m12.1.1">𝑡</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.12.m12.1c">t</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.12.m12.1d">italic_t</annotation></semantics></math>. A collapsing function <math alttext="B(\pi)=l" class="ltx_Math" display="inline" id="S2.SS1.p1.13.m13.1"><semantics id="S2.SS1.p1.13.m13.1a"><mrow id="S2.SS1.p1.13.m13.1.2" xref="S2.SS1.p1.13.m13.1.2.cmml"><mrow id="S2.SS1.p1.13.m13.1.2.2" xref="S2.SS1.p1.13.m13.1.2.2.cmml"><mi id="S2.SS1.p1.13.m13.1.2.2.2" xref="S2.SS1.p1.13.m13.1.2.2.2.cmml">B</mi><mo id="S2.SS1.p1.13.m13.1.2.2.1" xref="S2.SS1.p1.13.m13.1.2.2.1.cmml"></mo><mrow id="S2.SS1.p1.13.m13.1.2.2.3.2" xref="S2.SS1.p1.13.m13.1.2.2.cmml"><mo id="S2.SS1.p1.13.m13.1.2.2.3.2.1" stretchy="false" xref="S2.SS1.p1.13.m13.1.2.2.cmml">(</mo><mi id="S2.SS1.p1.13.m13.1.1" xref="S2.SS1.p1.13.m13.1.1.cmml">π</mi><mo id="S2.SS1.p1.13.m13.1.2.2.3.2.2" stretchy="false" xref="S2.SS1.p1.13.m13.1.2.2.cmml">)</mo></mrow></mrow><mo id="S2.SS1.p1.13.m13.1.2.1" xref="S2.SS1.p1.13.m13.1.2.1.cmml">=</mo><mi id="S2.SS1.p1.13.m13.1.2.3" xref="S2.SS1.p1.13.m13.1.2.3.cmml">l</mi></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.13.m13.1b"><apply id="S2.SS1.p1.13.m13.1.2.cmml" xref="S2.SS1.p1.13.m13.1.2"><eq id="S2.SS1.p1.13.m13.1.2.1.cmml" xref="S2.SS1.p1.13.m13.1.2.1"></eq><apply id="S2.SS1.p1.13.m13.1.2.2.cmml" xref="S2.SS1.p1.13.m13.1.2.2"><times id="S2.SS1.p1.13.m13.1.2.2.1.cmml" xref="S2.SS1.p1.13.m13.1.2.2.1"></times><ci id="S2.SS1.p1.13.m13.1.2.2.2.cmml" xref="S2.SS1.p1.13.m13.1.2.2.2">𝐵</ci><ci id="S2.SS1.p1.13.m13.1.1.cmml" xref="S2.SS1.p1.13.m13.1.1">𝜋</ci></apply><ci id="S2.SS1.p1.13.m13.1.2.3.cmml" xref="S2.SS1.p1.13.m13.1.2.3">𝑙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.13.m13.1c">B(\pi)=l</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.13.m13.1d">italic_B ( italic_π ) = 
italic_l</annotation></semantics></math> maps alignment paths to text labels by collapsing repeated consecutive labels into a single label and removing all blank labels (e.g., <math alttext="B(\varnothing a\varnothing aabb)=aab" class="ltx_Math" display="inline" id="S2.SS1.p1.14.m14.1"><semantics id="S2.SS1.p1.14.m14.1a"><mrow id="S2.SS1.p1.14.m14.1.1" xref="S2.SS1.p1.14.m14.1.1.cmml"><mrow id="S2.SS1.p1.14.m14.1.1.1" xref="S2.SS1.p1.14.m14.1.1.1.cmml"><mi id="S2.SS1.p1.14.m14.1.1.1.3" xref="S2.SS1.p1.14.m14.1.1.1.3.cmml">B</mi><mo id="S2.SS1.p1.14.m14.1.1.1.2" xref="S2.SS1.p1.14.m14.1.1.1.2.cmml"></mo><mrow id="S2.SS1.p1.14.m14.1.1.1.1.1" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.cmml"><mo id="S2.SS1.p1.14.m14.1.1.1.1.1.2" stretchy="false" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.cmml">(</mo><mrow id="S2.SS1.p1.14.m14.1.1.1.1.1.1" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.cmml"><mi id="S2.SS1.p1.14.m14.1.1.1.1.1.1.2" mathvariant="normal" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.2.cmml">∅</mi><mo id="S2.SS1.p1.14.m14.1.1.1.1.1.1.1" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.1.cmml"></mo><mi id="S2.SS1.p1.14.m14.1.1.1.1.1.1.3" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.3.cmml">a</mi><mo id="S2.SS1.p1.14.m14.1.1.1.1.1.1.1a" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.1.cmml"></mo><mi id="S2.SS1.p1.14.m14.1.1.1.1.1.1.4" mathvariant="normal" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.4.cmml">∅</mi><mo id="S2.SS1.p1.14.m14.1.1.1.1.1.1.1b" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.1.cmml"></mo><mi id="S2.SS1.p1.14.m14.1.1.1.1.1.1.5" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.5.cmml">a</mi><mo id="S2.SS1.p1.14.m14.1.1.1.1.1.1.1c" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.1.cmml"></mo><mi id="S2.SS1.p1.14.m14.1.1.1.1.1.1.6" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.6.cmml">a</mi><mo id="S2.SS1.p1.14.m14.1.1.1.1.1.1.1d" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.1.cmml"></mo><mi id="S2.SS1.p1.14.m14.1.1.1.1.1.1.7" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.7.cmml">b</mi><mo id="S2.SS1.p1.14.m14.1.1.1.1.1.1.1e" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.1.cmml"></mo><mi 
id="S2.SS1.p1.14.m14.1.1.1.1.1.1.8" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.8.cmml">b</mi></mrow><mo id="S2.SS1.p1.14.m14.1.1.1.1.1.3" stretchy="false" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.cmml">)</mo></mrow></mrow><mo id="S2.SS1.p1.14.m14.1.1.2" xref="S2.SS1.p1.14.m14.1.1.2.cmml">=</mo><mrow id="S2.SS1.p1.14.m14.1.1.3" xref="S2.SS1.p1.14.m14.1.1.3.cmml"><mi id="S2.SS1.p1.14.m14.1.1.3.2" xref="S2.SS1.p1.14.m14.1.1.3.2.cmml">a</mi><mo id="S2.SS1.p1.14.m14.1.1.3.1" xref="S2.SS1.p1.14.m14.1.1.3.1.cmml"></mo><mi id="S2.SS1.p1.14.m14.1.1.3.3" xref="S2.SS1.p1.14.m14.1.1.3.3.cmml">a</mi><mo id="S2.SS1.p1.14.m14.1.1.3.1a" xref="S2.SS1.p1.14.m14.1.1.3.1.cmml"></mo><mi id="S2.SS1.p1.14.m14.1.1.3.4" xref="S2.SS1.p1.14.m14.1.1.3.4.cmml">b</mi></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.14.m14.1b"><apply id="S2.SS1.p1.14.m14.1.1.cmml" xref="S2.SS1.p1.14.m14.1.1"><eq id="S2.SS1.p1.14.m14.1.1.2.cmml" xref="S2.SS1.p1.14.m14.1.1.2"></eq><apply id="S2.SS1.p1.14.m14.1.1.1.cmml" xref="S2.SS1.p1.14.m14.1.1.1"><times id="S2.SS1.p1.14.m14.1.1.1.2.cmml" xref="S2.SS1.p1.14.m14.1.1.1.2"></times><ci id="S2.SS1.p1.14.m14.1.1.1.3.cmml" xref="S2.SS1.p1.14.m14.1.1.1.3">𝐵</ci><apply id="S2.SS1.p1.14.m14.1.1.1.1.1.1.cmml" xref="S2.SS1.p1.14.m14.1.1.1.1.1"><times id="S2.SS1.p1.14.m14.1.1.1.1.1.1.1.cmml" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.1"></times><emptyset id="S2.SS1.p1.14.m14.1.1.1.1.1.1.2.cmml" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.2"></emptyset><ci id="S2.SS1.p1.14.m14.1.1.1.1.1.1.3.cmml" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.3">𝑎</ci><emptyset id="S2.SS1.p1.14.m14.1.1.1.1.1.1.4.cmml" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.4"></emptyset><ci id="S2.SS1.p1.14.m14.1.1.1.1.1.1.5.cmml" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.5">𝑎</ci><ci id="S2.SS1.p1.14.m14.1.1.1.1.1.1.6.cmml" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.6">𝑎</ci><ci id="S2.SS1.p1.14.m14.1.1.1.1.1.1.7.cmml" xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.7">𝑏</ci><ci id="S2.SS1.p1.14.m14.1.1.1.1.1.1.8.cmml" 
xref="S2.SS1.p1.14.m14.1.1.1.1.1.1.8">𝑏</ci></apply></apply><apply id="S2.SS1.p1.14.m14.1.1.3.cmml" xref="S2.SS1.p1.14.m14.1.1.3"><times id="S2.SS1.p1.14.m14.1.1.3.1.cmml" xref="S2.SS1.p1.14.m14.1.1.3.1"></times><ci id="S2.SS1.p1.14.m14.1.1.3.2.cmml" xref="S2.SS1.p1.14.m14.1.1.3.2">𝑎</ci><ci id="S2.SS1.p1.14.m14.1.1.3.3.cmml" xref="S2.SS1.p1.14.m14.1.1.3.3">𝑎</ci><ci id="S2.SS1.p1.14.m14.1.1.3.4.cmml" xref="S2.SS1.p1.14.m14.1.1.3.4">𝑏</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.14.m14.1c">B(\varnothing a\varnothing aabb)=aab</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.14.m14.1d">italic_B ( ∅ italic_a ∅ italic_a italic_a italic_b italic_b ) = italic_a italic_a italic_b</annotation></semantics></math>). Subsequently, the posterior probability <math alttext="p(l|x)" class="ltx_Math" display="inline" id="S2.SS1.p1.15.m15.1"><semantics id="S2.SS1.p1.15.m15.1a"><mrow id="S2.SS1.p1.15.m15.1.1" xref="S2.SS1.p1.15.m15.1.1.cmml"><mi id="S2.SS1.p1.15.m15.1.1.3" xref="S2.SS1.p1.15.m15.1.1.3.cmml">p</mi><mo id="S2.SS1.p1.15.m15.1.1.2" xref="S2.SS1.p1.15.m15.1.1.2.cmml"></mo><mrow id="S2.SS1.p1.15.m15.1.1.1.1" xref="S2.SS1.p1.15.m15.1.1.1.1.1.cmml"><mo id="S2.SS1.p1.15.m15.1.1.1.1.2" stretchy="false" xref="S2.SS1.p1.15.m15.1.1.1.1.1.cmml">(</mo><mrow id="S2.SS1.p1.15.m15.1.1.1.1.1" xref="S2.SS1.p1.15.m15.1.1.1.1.1.cmml"><mi id="S2.SS1.p1.15.m15.1.1.1.1.1.2" xref="S2.SS1.p1.15.m15.1.1.1.1.1.2.cmml">l</mi><mo fence="false" id="S2.SS1.p1.15.m15.1.1.1.1.1.1" xref="S2.SS1.p1.15.m15.1.1.1.1.1.1.cmml">|</mo><mi id="S2.SS1.p1.15.m15.1.1.1.1.1.3" xref="S2.SS1.p1.15.m15.1.1.1.1.1.3.cmml">x</mi></mrow><mo id="S2.SS1.p1.15.m15.1.1.1.1.3" stretchy="false" xref="S2.SS1.p1.15.m15.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.15.m15.1b"><apply id="S2.SS1.p1.15.m15.1.1.cmml" xref="S2.SS1.p1.15.m15.1.1"><times id="S2.SS1.p1.15.m15.1.1.2.cmml" 
xref="S2.SS1.p1.15.m15.1.1.2"></times><ci id="S2.SS1.p1.15.m15.1.1.3.cmml" xref="S2.SS1.p1.15.m15.1.1.3">𝑝</ci><apply id="S2.SS1.p1.15.m15.1.1.1.1.1.cmml" xref="S2.SS1.p1.15.m15.1.1.1.1"><csymbol cd="latexml" id="S2.SS1.p1.15.m15.1.1.1.1.1.1.cmml" xref="S2.SS1.p1.15.m15.1.1.1.1.1.1">conditional</csymbol><ci id="S2.SS1.p1.15.m15.1.1.1.1.1.2.cmml" xref="S2.SS1.p1.15.m15.1.1.1.1.1.2">𝑙</ci><ci id="S2.SS1.p1.15.m15.1.1.1.1.1.3.cmml" xref="S2.SS1.p1.15.m15.1.1.1.1.1.3">𝑥</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.15.m15.1c">p(l|x)</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.15.m15.1d">italic_p ( italic_l | italic_x )</annotation></semantics></math> of the label sequence can be calculated by summing up the posteriors of all possible alignments:</p> <table class="ltx_equation ltx_eqn_table" id="S2.E1"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="P(l|x)=\sum_{\pi\in B^{-1}(l)}p(\pi|x)" class="ltx_Math" display="block" id="S2.E1.m1.3"><semantics id="S2.E1.m1.3a"><mrow id="S2.E1.m1.3.3" xref="S2.E1.m1.3.3.cmml"><mrow id="S2.E1.m1.2.2.1" xref="S2.E1.m1.2.2.1.cmml"><mi id="S2.E1.m1.2.2.1.3" xref="S2.E1.m1.2.2.1.3.cmml">P</mi><mo id="S2.E1.m1.2.2.1.2" xref="S2.E1.m1.2.2.1.2.cmml"></mo><mrow id="S2.E1.m1.2.2.1.1.1" xref="S2.E1.m1.2.2.1.1.1.1.cmml"><mo id="S2.E1.m1.2.2.1.1.1.2" stretchy="false" xref="S2.E1.m1.2.2.1.1.1.1.cmml">(</mo><mrow id="S2.E1.m1.2.2.1.1.1.1" xref="S2.E1.m1.2.2.1.1.1.1.cmml"><mi id="S2.E1.m1.2.2.1.1.1.1.2" xref="S2.E1.m1.2.2.1.1.1.1.2.cmml">l</mi><mo fence="false" id="S2.E1.m1.2.2.1.1.1.1.1" xref="S2.E1.m1.2.2.1.1.1.1.1.cmml">|</mo><mi id="S2.E1.m1.2.2.1.1.1.1.3" xref="S2.E1.m1.2.2.1.1.1.1.3.cmml">x</mi></mrow><mo id="S2.E1.m1.2.2.1.1.1.3" stretchy="false" xref="S2.E1.m1.2.2.1.1.1.1.cmml">)</mo></mrow></mrow><mo id="S2.E1.m1.3.3.3" rspace="0.111em" 
xref="S2.E1.m1.3.3.3.cmml">=</mo><mrow id="S2.E1.m1.3.3.2" xref="S2.E1.m1.3.3.2.cmml"><munder id="S2.E1.m1.3.3.2.2" xref="S2.E1.m1.3.3.2.2.cmml"><mo id="S2.E1.m1.3.3.2.2.2" movablelimits="false" xref="S2.E1.m1.3.3.2.2.2.cmml">∑</mo><mrow id="S2.E1.m1.1.1.1" xref="S2.E1.m1.1.1.1.cmml"><mi id="S2.E1.m1.1.1.1.3" xref="S2.E1.m1.1.1.1.3.cmml">π</mi><mo id="S2.E1.m1.1.1.1.2" xref="S2.E1.m1.1.1.1.2.cmml">∈</mo><mrow id="S2.E1.m1.1.1.1.4" xref="S2.E1.m1.1.1.1.4.cmml"><msup id="S2.E1.m1.1.1.1.4.2" xref="S2.E1.m1.1.1.1.4.2.cmml"><mi id="S2.E1.m1.1.1.1.4.2.2" xref="S2.E1.m1.1.1.1.4.2.2.cmml">B</mi><mrow id="S2.E1.m1.1.1.1.4.2.3" xref="S2.E1.m1.1.1.1.4.2.3.cmml"><mo id="S2.E1.m1.1.1.1.4.2.3a" xref="S2.E1.m1.1.1.1.4.2.3.cmml">−</mo><mn id="S2.E1.m1.1.1.1.4.2.3.2" xref="S2.E1.m1.1.1.1.4.2.3.2.cmml">1</mn></mrow></msup><mo id="S2.E1.m1.1.1.1.4.1" xref="S2.E1.m1.1.1.1.4.1.cmml"></mo><mrow id="S2.E1.m1.1.1.1.4.3.2" xref="S2.E1.m1.1.1.1.4.cmml"><mo id="S2.E1.m1.1.1.1.4.3.2.1" stretchy="false" xref="S2.E1.m1.1.1.1.4.cmml">(</mo><mi id="S2.E1.m1.1.1.1.1" xref="S2.E1.m1.1.1.1.1.cmml">l</mi><mo id="S2.E1.m1.1.1.1.4.3.2.2" stretchy="false" xref="S2.E1.m1.1.1.1.4.cmml">)</mo></mrow></mrow></mrow></munder><mrow id="S2.E1.m1.3.3.2.1" xref="S2.E1.m1.3.3.2.1.cmml"><mi id="S2.E1.m1.3.3.2.1.3" xref="S2.E1.m1.3.3.2.1.3.cmml">p</mi><mo id="S2.E1.m1.3.3.2.1.2" xref="S2.E1.m1.3.3.2.1.2.cmml"></mo><mrow id="S2.E1.m1.3.3.2.1.1.1" xref="S2.E1.m1.3.3.2.1.1.1.1.cmml"><mo id="S2.E1.m1.3.3.2.1.1.1.2" stretchy="false" xref="S2.E1.m1.3.3.2.1.1.1.1.cmml">(</mo><mrow id="S2.E1.m1.3.3.2.1.1.1.1" xref="S2.E1.m1.3.3.2.1.1.1.1.cmml"><mi id="S2.E1.m1.3.3.2.1.1.1.1.2" xref="S2.E1.m1.3.3.2.1.1.1.1.2.cmml">π</mi><mo fence="false" id="S2.E1.m1.3.3.2.1.1.1.1.1" xref="S2.E1.m1.3.3.2.1.1.1.1.1.cmml">|</mo><mi id="S2.E1.m1.3.3.2.1.1.1.1.3" xref="S2.E1.m1.3.3.2.1.1.1.1.3.cmml">x</mi></mrow><mo id="S2.E1.m1.3.3.2.1.1.1.3" stretchy="false" xref="S2.E1.m1.3.3.2.1.1.1.1.cmml">)</mo></mrow></mrow></mrow></mrow><annotation-xml 
encoding="MathML-Content" id="S2.E1.m1.3b"><apply id="S2.E1.m1.3.3.cmml" xref="S2.E1.m1.3.3"><eq id="S2.E1.m1.3.3.3.cmml" xref="S2.E1.m1.3.3.3"></eq><apply id="S2.E1.m1.2.2.1.cmml" xref="S2.E1.m1.2.2.1"><times id="S2.E1.m1.2.2.1.2.cmml" xref="S2.E1.m1.2.2.1.2"></times><ci id="S2.E1.m1.2.2.1.3.cmml" xref="S2.E1.m1.2.2.1.3">𝑃</ci><apply id="S2.E1.m1.2.2.1.1.1.1.cmml" xref="S2.E1.m1.2.2.1.1.1"><csymbol cd="latexml" id="S2.E1.m1.2.2.1.1.1.1.1.cmml" xref="S2.E1.m1.2.2.1.1.1.1.1">conditional</csymbol><ci id="S2.E1.m1.2.2.1.1.1.1.2.cmml" xref="S2.E1.m1.2.2.1.1.1.1.2">𝑙</ci><ci id="S2.E1.m1.2.2.1.1.1.1.3.cmml" xref="S2.E1.m1.2.2.1.1.1.1.3">𝑥</ci></apply></apply><apply id="S2.E1.m1.3.3.2.cmml" xref="S2.E1.m1.3.3.2"><apply id="S2.E1.m1.3.3.2.2.cmml" xref="S2.E1.m1.3.3.2.2"><csymbol cd="ambiguous" id="S2.E1.m1.3.3.2.2.1.cmml" xref="S2.E1.m1.3.3.2.2">subscript</csymbol><sum id="S2.E1.m1.3.3.2.2.2.cmml" xref="S2.E1.m1.3.3.2.2.2"></sum><apply id="S2.E1.m1.1.1.1.cmml" xref="S2.E1.m1.1.1.1"><in id="S2.E1.m1.1.1.1.2.cmml" xref="S2.E1.m1.1.1.1.2"></in><ci id="S2.E1.m1.1.1.1.3.cmml" xref="S2.E1.m1.1.1.1.3">𝜋</ci><apply id="S2.E1.m1.1.1.1.4.cmml" xref="S2.E1.m1.1.1.1.4"><times id="S2.E1.m1.1.1.1.4.1.cmml" xref="S2.E1.m1.1.1.1.4.1"></times><apply id="S2.E1.m1.1.1.1.4.2.cmml" xref="S2.E1.m1.1.1.1.4.2"><csymbol cd="ambiguous" id="S2.E1.m1.1.1.1.4.2.1.cmml" xref="S2.E1.m1.1.1.1.4.2">superscript</csymbol><ci id="S2.E1.m1.1.1.1.4.2.2.cmml" xref="S2.E1.m1.1.1.1.4.2.2">𝐵</ci><apply id="S2.E1.m1.1.1.1.4.2.3.cmml" xref="S2.E1.m1.1.1.1.4.2.3"><minus id="S2.E1.m1.1.1.1.4.2.3.1.cmml" xref="S2.E1.m1.1.1.1.4.2.3"></minus><cn id="S2.E1.m1.1.1.1.4.2.3.2.cmml" type="integer" xref="S2.E1.m1.1.1.1.4.2.3.2">1</cn></apply></apply><ci id="S2.E1.m1.1.1.1.1.cmml" xref="S2.E1.m1.1.1.1.1">𝑙</ci></apply></apply></apply><apply id="S2.E1.m1.3.3.2.1.cmml" xref="S2.E1.m1.3.3.2.1"><times id="S2.E1.m1.3.3.2.1.2.cmml" xref="S2.E1.m1.3.3.2.1.2"></times><ci id="S2.E1.m1.3.3.2.1.3.cmml" 
xref="S2.E1.m1.3.3.2.1.3">𝑝</ci><apply id="S2.E1.m1.3.3.2.1.1.1.1.cmml" xref="S2.E1.m1.3.3.2.1.1.1"><csymbol cd="latexml" id="S2.E1.m1.3.3.2.1.1.1.1.1.cmml" xref="S2.E1.m1.3.3.2.1.1.1.1.1">conditional</csymbol><ci id="S2.E1.m1.3.3.2.1.1.1.1.2.cmml" xref="S2.E1.m1.3.3.2.1.1.1.1.2">𝜋</ci><ci id="S2.E1.m1.3.3.2.1.1.1.1.3.cmml" xref="S2.E1.m1.3.3.2.1.1.1.1.3">𝑥</ci></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E1.m1.3c">P(l|x)=\sum_{\pi\in B^{-1}(l)}p(\pi|x)</annotation><annotation encoding="application/x-llamapun" id="S2.E1.m1.3d">italic_P ( italic_l | italic_x ) = ∑ start_POSTSUBSCRIPT italic_π ∈ italic_B start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT ( italic_l ) end_POSTSUBSCRIPT italic_p ( italic_π | italic_x )</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(1)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S2.SS1.p1.21">where <math alttext="\pi\in B^{-1}(l)" class="ltx_Math" display="inline" id="S2.SS1.p1.16.m1.1"><semantics id="S2.SS1.p1.16.m1.1a"><mrow id="S2.SS1.p1.16.m1.1.2" xref="S2.SS1.p1.16.m1.1.2.cmml"><mi id="S2.SS1.p1.16.m1.1.2.2" xref="S2.SS1.p1.16.m1.1.2.2.cmml">π</mi><mo id="S2.SS1.p1.16.m1.1.2.1" xref="S2.SS1.p1.16.m1.1.2.1.cmml">∈</mo><mrow id="S2.SS1.p1.16.m1.1.2.3" xref="S2.SS1.p1.16.m1.1.2.3.cmml"><msup id="S2.SS1.p1.16.m1.1.2.3.2" xref="S2.SS1.p1.16.m1.1.2.3.2.cmml"><mi id="S2.SS1.p1.16.m1.1.2.3.2.2" xref="S2.SS1.p1.16.m1.1.2.3.2.2.cmml">B</mi><mrow id="S2.SS1.p1.16.m1.1.2.3.2.3" xref="S2.SS1.p1.16.m1.1.2.3.2.3.cmml"><mo id="S2.SS1.p1.16.m1.1.2.3.2.3a" xref="S2.SS1.p1.16.m1.1.2.3.2.3.cmml">−</mo><mn id="S2.SS1.p1.16.m1.1.2.3.2.3.2" xref="S2.SS1.p1.16.m1.1.2.3.2.3.2.cmml">1</mn></mrow></msup><mo id="S2.SS1.p1.16.m1.1.2.3.1" xref="S2.SS1.p1.16.m1.1.2.3.1.cmml"></mo><mrow id="S2.SS1.p1.16.m1.1.2.3.3.2" 
xref="S2.SS1.p1.16.m1.1.2.3.cmml"><mo id="S2.SS1.p1.16.m1.1.2.3.3.2.1" stretchy="false" xref="S2.SS1.p1.16.m1.1.2.3.cmml">(</mo><mi id="S2.SS1.p1.16.m1.1.1" xref="S2.SS1.p1.16.m1.1.1.cmml">l</mi><mo id="S2.SS1.p1.16.m1.1.2.3.3.2.2" stretchy="false" xref="S2.SS1.p1.16.m1.1.2.3.cmml">)</mo></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.16.m1.1b"><apply id="S2.SS1.p1.16.m1.1.2.cmml" xref="S2.SS1.p1.16.m1.1.2"><in id="S2.SS1.p1.16.m1.1.2.1.cmml" xref="S2.SS1.p1.16.m1.1.2.1"></in><ci id="S2.SS1.p1.16.m1.1.2.2.cmml" xref="S2.SS1.p1.16.m1.1.2.2">𝜋</ci><apply id="S2.SS1.p1.16.m1.1.2.3.cmml" xref="S2.SS1.p1.16.m1.1.2.3"><times id="S2.SS1.p1.16.m1.1.2.3.1.cmml" xref="S2.SS1.p1.16.m1.1.2.3.1"></times><apply id="S2.SS1.p1.16.m1.1.2.3.2.cmml" xref="S2.SS1.p1.16.m1.1.2.3.2"><csymbol cd="ambiguous" id="S2.SS1.p1.16.m1.1.2.3.2.1.cmml" xref="S2.SS1.p1.16.m1.1.2.3.2">superscript</csymbol><ci id="S2.SS1.p1.16.m1.1.2.3.2.2.cmml" xref="S2.SS1.p1.16.m1.1.2.3.2.2">𝐵</ci><apply id="S2.SS1.p1.16.m1.1.2.3.2.3.cmml" xref="S2.SS1.p1.16.m1.1.2.3.2.3"><minus id="S2.SS1.p1.16.m1.1.2.3.2.3.1.cmml" xref="S2.SS1.p1.16.m1.1.2.3.2.3"></minus><cn id="S2.SS1.p1.16.m1.1.2.3.2.3.2.cmml" type="integer" xref="S2.SS1.p1.16.m1.1.2.3.2.3.2">1</cn></apply></apply><ci id="S2.SS1.p1.16.m1.1.1.cmml" xref="S2.SS1.p1.16.m1.1.1">𝑙</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.16.m1.1c">\pi\in B^{-1}(l)</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.16.m1.1d">italic_π ∈ italic_B start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT ( italic_l )</annotation></semantics></math> if <math alttext="B(\pi)=l" class="ltx_Math" display="inline" id="S2.SS1.p1.17.m2.1"><semantics id="S2.SS1.p1.17.m2.1a"><mrow id="S2.SS1.p1.17.m2.1.2" xref="S2.SS1.p1.17.m2.1.2.cmml"><mrow id="S2.SS1.p1.17.m2.1.2.2" xref="S2.SS1.p1.17.m2.1.2.2.cmml"><mi id="S2.SS1.p1.17.m2.1.2.2.2" xref="S2.SS1.p1.17.m2.1.2.2.2.cmml">B</mi><mo 
id="S2.SS1.p1.17.m2.1.2.2.1" xref="S2.SS1.p1.17.m2.1.2.2.1.cmml"></mo><mrow id="S2.SS1.p1.17.m2.1.2.2.3.2" xref="S2.SS1.p1.17.m2.1.2.2.cmml"><mo id="S2.SS1.p1.17.m2.1.2.2.3.2.1" stretchy="false" xref="S2.SS1.p1.17.m2.1.2.2.cmml">(</mo><mi id="S2.SS1.p1.17.m2.1.1" xref="S2.SS1.p1.17.m2.1.1.cmml">π</mi><mo id="S2.SS1.p1.17.m2.1.2.2.3.2.2" stretchy="false" xref="S2.SS1.p1.17.m2.1.2.2.cmml">)</mo></mrow></mrow><mo id="S2.SS1.p1.17.m2.1.2.1" xref="S2.SS1.p1.17.m2.1.2.1.cmml">=</mo><mi id="S2.SS1.p1.17.m2.1.2.3" xref="S2.SS1.p1.17.m2.1.2.3.cmml">l</mi></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.17.m2.1b"><apply id="S2.SS1.p1.17.m2.1.2.cmml" xref="S2.SS1.p1.17.m2.1.2"><eq id="S2.SS1.p1.17.m2.1.2.1.cmml" xref="S2.SS1.p1.17.m2.1.2.1"></eq><apply id="S2.SS1.p1.17.m2.1.2.2.cmml" xref="S2.SS1.p1.17.m2.1.2.2"><times id="S2.SS1.p1.17.m2.1.2.2.1.cmml" xref="S2.SS1.p1.17.m2.1.2.2.1"></times><ci id="S2.SS1.p1.17.m2.1.2.2.2.cmml" xref="S2.SS1.p1.17.m2.1.2.2.2">𝐵</ci><ci id="S2.SS1.p1.17.m2.1.1.cmml" xref="S2.SS1.p1.17.m2.1.1">𝜋</ci></apply><ci id="S2.SS1.p1.17.m2.1.2.3.cmml" xref="S2.SS1.p1.17.m2.1.2.3">𝑙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.17.m2.1c">B(\pi)=l</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.17.m2.1d">italic_B ( italic_π ) = italic_l</annotation></semantics></math>. 
<math alttext="p(\pi|x)" class="ltx_Math" display="inline" id="S2.SS1.p1.18.m3.1"><semantics id="S2.SS1.p1.18.m3.1a"><mrow id="S2.SS1.p1.18.m3.1.1" xref="S2.SS1.p1.18.m3.1.1.cmml"><mi id="S2.SS1.p1.18.m3.1.1.3" xref="S2.SS1.p1.18.m3.1.1.3.cmml">p</mi><mo id="S2.SS1.p1.18.m3.1.1.2" xref="S2.SS1.p1.18.m3.1.1.2.cmml"></mo><mrow id="S2.SS1.p1.18.m3.1.1.1.1" xref="S2.SS1.p1.18.m3.1.1.1.1.1.cmml"><mo id="S2.SS1.p1.18.m3.1.1.1.1.2" stretchy="false" xref="S2.SS1.p1.18.m3.1.1.1.1.1.cmml">(</mo><mrow id="S2.SS1.p1.18.m3.1.1.1.1.1" xref="S2.SS1.p1.18.m3.1.1.1.1.1.cmml"><mi id="S2.SS1.p1.18.m3.1.1.1.1.1.2" xref="S2.SS1.p1.18.m3.1.1.1.1.1.2.cmml">π</mi><mo fence="false" id="S2.SS1.p1.18.m3.1.1.1.1.1.1" xref="S2.SS1.p1.18.m3.1.1.1.1.1.1.cmml">|</mo><mi id="S2.SS1.p1.18.m3.1.1.1.1.1.3" xref="S2.SS1.p1.18.m3.1.1.1.1.1.3.cmml">x</mi></mrow><mo id="S2.SS1.p1.18.m3.1.1.1.1.3" stretchy="false" xref="S2.SS1.p1.18.m3.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.18.m3.1b"><apply id="S2.SS1.p1.18.m3.1.1.cmml" xref="S2.SS1.p1.18.m3.1.1"><times id="S2.SS1.p1.18.m3.1.1.2.cmml" xref="S2.SS1.p1.18.m3.1.1.2"></times><ci id="S2.SS1.p1.18.m3.1.1.3.cmml" xref="S2.SS1.p1.18.m3.1.1.3">𝑝</ci><apply id="S2.SS1.p1.18.m3.1.1.1.1.1.cmml" xref="S2.SS1.p1.18.m3.1.1.1.1"><csymbol cd="latexml" id="S2.SS1.p1.18.m3.1.1.1.1.1.1.cmml" xref="S2.SS1.p1.18.m3.1.1.1.1.1.1">conditional</csymbol><ci id="S2.SS1.p1.18.m3.1.1.1.1.1.2.cmml" xref="S2.SS1.p1.18.m3.1.1.1.1.1.2">𝜋</ci><ci id="S2.SS1.p1.18.m3.1.1.1.1.1.3.cmml" xref="S2.SS1.p1.18.m3.1.1.1.1.1.3">𝑥</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.18.m3.1c">p(\pi|x)</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.18.m3.1d">italic_p ( italic_π | italic_x )</annotation></semantics></math> denotes the posterior probability of path <math alttext="\pi" class="ltx_Math" display="inline" id="S2.SS1.p1.19.m4.1"><semantics id="S2.SS1.p1.19.m4.1a"><mi 
id="S2.SS1.p1.19.m4.1.1" xref="S2.SS1.p1.19.m4.1.1.cmml">π</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.19.m4.1b"><ci id="S2.SS1.p1.19.m4.1.1.cmml" xref="S2.SS1.p1.19.m4.1.1">𝜋</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.19.m4.1c">\pi</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.19.m4.1d">italic_π</annotation></semantics></math>, calculated as the product of the posterior probabilities of <math alttext="\pi_{t}" class="ltx_Math" display="inline" id="S2.SS1.p1.20.m5.1"><semantics id="S2.SS1.p1.20.m5.1a"><msub id="S2.SS1.p1.20.m5.1.1" xref="S2.SS1.p1.20.m5.1.1.cmml"><mi id="S2.SS1.p1.20.m5.1.1.2" xref="S2.SS1.p1.20.m5.1.1.2.cmml">π</mi><mi id="S2.SS1.p1.20.m5.1.1.3" xref="S2.SS1.p1.20.m5.1.1.3.cmml">t</mi></msub><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.20.m5.1b"><apply id="S2.SS1.p1.20.m5.1.1.cmml" xref="S2.SS1.p1.20.m5.1.1"><csymbol cd="ambiguous" id="S2.SS1.p1.20.m5.1.1.1.cmml" xref="S2.SS1.p1.20.m5.1.1">subscript</csymbol><ci id="S2.SS1.p1.20.m5.1.1.2.cmml" xref="S2.SS1.p1.20.m5.1.1.2">𝜋</ci><ci id="S2.SS1.p1.20.m5.1.1.3.cmml" xref="S2.SS1.p1.20.m5.1.1.3">𝑡</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.20.m5.1c">\pi_{t}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.20.m5.1d">italic_π start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math> across <math alttext="T" class="ltx_Math" display="inline" id="S2.SS1.p1.21.m6.1"><semantics id="S2.SS1.p1.21.m6.1a"><mi id="S2.SS1.p1.21.m6.1.1" xref="S2.SS1.p1.21.m6.1.1.cmml">T</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.21.m6.1b"><ci id="S2.SS1.p1.21.m6.1.1.cmml" xref="S2.SS1.p1.21.m6.1.1">𝑇</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.21.m6.1c">T</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.21.m6.1d">italic_T</annotation></semantics></math> time steps:</p> <table class="ltx_equation 
ltx_eqn_table" id="S2.E2"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="p(\pi|x)=\prod_{t=1}^{T}p(\pi_{t}|x_{t})=\prod_{t=1}^{T}y^{t}_{\pi_{t}}" class="ltx_Math" display="block" id="S2.E2.m1.2"><semantics id="S2.E2.m1.2a"><mrow id="S2.E2.m1.2.2" xref="S2.E2.m1.2.2.cmml"><mrow id="S2.E2.m1.1.1.1" xref="S2.E2.m1.1.1.1.cmml"><mi id="S2.E2.m1.1.1.1.3" xref="S2.E2.m1.1.1.1.3.cmml">p</mi><mo id="S2.E2.m1.1.1.1.2" xref="S2.E2.m1.1.1.1.2.cmml"></mo><mrow id="S2.E2.m1.1.1.1.1.1" xref="S2.E2.m1.1.1.1.1.1.1.cmml"><mo id="S2.E2.m1.1.1.1.1.1.2" stretchy="false" xref="S2.E2.m1.1.1.1.1.1.1.cmml">(</mo><mrow id="S2.E2.m1.1.1.1.1.1.1" xref="S2.E2.m1.1.1.1.1.1.1.cmml"><mi id="S2.E2.m1.1.1.1.1.1.1.2" xref="S2.E2.m1.1.1.1.1.1.1.2.cmml">π</mi><mo fence="false" id="S2.E2.m1.1.1.1.1.1.1.1" xref="S2.E2.m1.1.1.1.1.1.1.1.cmml">|</mo><mi id="S2.E2.m1.1.1.1.1.1.1.3" xref="S2.E2.m1.1.1.1.1.1.1.3.cmml">x</mi></mrow><mo id="S2.E2.m1.1.1.1.1.1.3" stretchy="false" xref="S2.E2.m1.1.1.1.1.1.1.cmml">)</mo></mrow></mrow><mo id="S2.E2.m1.2.2.4" rspace="0.111em" xref="S2.E2.m1.2.2.4.cmml">=</mo><mrow id="S2.E2.m1.2.2.2" xref="S2.E2.m1.2.2.2.cmml"><munderover id="S2.E2.m1.2.2.2.2" xref="S2.E2.m1.2.2.2.2.cmml"><mo id="S2.E2.m1.2.2.2.2.2.2" movablelimits="false" xref="S2.E2.m1.2.2.2.2.2.2.cmml">∏</mo><mrow id="S2.E2.m1.2.2.2.2.2.3" xref="S2.E2.m1.2.2.2.2.2.3.cmml"><mi id="S2.E2.m1.2.2.2.2.2.3.2" xref="S2.E2.m1.2.2.2.2.2.3.2.cmml">t</mi><mo id="S2.E2.m1.2.2.2.2.2.3.1" xref="S2.E2.m1.2.2.2.2.2.3.1.cmml">=</mo><mn id="S2.E2.m1.2.2.2.2.2.3.3" xref="S2.E2.m1.2.2.2.2.2.3.3.cmml">1</mn></mrow><mi id="S2.E2.m1.2.2.2.2.3" xref="S2.E2.m1.2.2.2.2.3.cmml">T</mi></munderover><mrow id="S2.E2.m1.2.2.2.1" xref="S2.E2.m1.2.2.2.1.cmml"><mi id="S2.E2.m1.2.2.2.1.3" xref="S2.E2.m1.2.2.2.1.3.cmml">p</mi><mo id="S2.E2.m1.2.2.2.1.2" xref="S2.E2.m1.2.2.2.1.2.cmml"></mo><mrow id="S2.E2.m1.2.2.2.1.1.1" 
xref="S2.E2.m1.2.2.2.1.1.1.1.cmml"><mo id="S2.E2.m1.2.2.2.1.1.1.2" stretchy="false" xref="S2.E2.m1.2.2.2.1.1.1.1.cmml">(</mo><mrow id="S2.E2.m1.2.2.2.1.1.1.1" xref="S2.E2.m1.2.2.2.1.1.1.1.cmml"><msub id="S2.E2.m1.2.2.2.1.1.1.1.2" xref="S2.E2.m1.2.2.2.1.1.1.1.2.cmml"><mi id="S2.E2.m1.2.2.2.1.1.1.1.2.2" xref="S2.E2.m1.2.2.2.1.1.1.1.2.2.cmml">π</mi><mi id="S2.E2.m1.2.2.2.1.1.1.1.2.3" xref="S2.E2.m1.2.2.2.1.1.1.1.2.3.cmml">t</mi></msub><mo fence="false" id="S2.E2.m1.2.2.2.1.1.1.1.1" xref="S2.E2.m1.2.2.2.1.1.1.1.1.cmml">|</mo><msub id="S2.E2.m1.2.2.2.1.1.1.1.3" xref="S2.E2.m1.2.2.2.1.1.1.1.3.cmml"><mi id="S2.E2.m1.2.2.2.1.1.1.1.3.2" xref="S2.E2.m1.2.2.2.1.1.1.1.3.2.cmml">x</mi><mi id="S2.E2.m1.2.2.2.1.1.1.1.3.3" xref="S2.E2.m1.2.2.2.1.1.1.1.3.3.cmml">t</mi></msub></mrow><mo id="S2.E2.m1.2.2.2.1.1.1.3" stretchy="false" xref="S2.E2.m1.2.2.2.1.1.1.1.cmml">)</mo></mrow></mrow></mrow><mo id="S2.E2.m1.2.2.5" rspace="0.111em" xref="S2.E2.m1.2.2.5.cmml">=</mo><mrow id="S2.E2.m1.2.2.6" xref="S2.E2.m1.2.2.6.cmml"><munderover id="S2.E2.m1.2.2.6.1" xref="S2.E2.m1.2.2.6.1.cmml"><mo id="S2.E2.m1.2.2.6.1.2.2" movablelimits="false" xref="S2.E2.m1.2.2.6.1.2.2.cmml">∏</mo><mrow id="S2.E2.m1.2.2.6.1.2.3" xref="S2.E2.m1.2.2.6.1.2.3.cmml"><mi id="S2.E2.m1.2.2.6.1.2.3.2" xref="S2.E2.m1.2.2.6.1.2.3.2.cmml">t</mi><mo id="S2.E2.m1.2.2.6.1.2.3.1" xref="S2.E2.m1.2.2.6.1.2.3.1.cmml">=</mo><mn id="S2.E2.m1.2.2.6.1.2.3.3" xref="S2.E2.m1.2.2.6.1.2.3.3.cmml">1</mn></mrow><mi id="S2.E2.m1.2.2.6.1.3" xref="S2.E2.m1.2.2.6.1.3.cmml">T</mi></munderover><msubsup id="S2.E2.m1.2.2.6.2" xref="S2.E2.m1.2.2.6.2.cmml"><mi id="S2.E2.m1.2.2.6.2.2.2" xref="S2.E2.m1.2.2.6.2.2.2.cmml">y</mi><msub id="S2.E2.m1.2.2.6.2.3" xref="S2.E2.m1.2.2.6.2.3.cmml"><mi id="S2.E2.m1.2.2.6.2.3.2" xref="S2.E2.m1.2.2.6.2.3.2.cmml">π</mi><mi id="S2.E2.m1.2.2.6.2.3.3" xref="S2.E2.m1.2.2.6.2.3.3.cmml">t</mi></msub><mi id="S2.E2.m1.2.2.6.2.2.3" xref="S2.E2.m1.2.2.6.2.2.3.cmml">t</mi></msubsup></mrow></mrow><annotation-xml 
encoding="MathML-Content" id="S2.E2.m1.2b"><apply id="S2.E2.m1.2.2.cmml" xref="S2.E2.m1.2.2"><and id="S2.E2.m1.2.2a.cmml" xref="S2.E2.m1.2.2"></and><apply id="S2.E2.m1.2.2b.cmml" xref="S2.E2.m1.2.2"><eq id="S2.E2.m1.2.2.4.cmml" xref="S2.E2.m1.2.2.4"></eq><apply id="S2.E2.m1.1.1.1.cmml" xref="S2.E2.m1.1.1.1"><times id="S2.E2.m1.1.1.1.2.cmml" xref="S2.E2.m1.1.1.1.2"></times><ci id="S2.E2.m1.1.1.1.3.cmml" xref="S2.E2.m1.1.1.1.3">𝑝</ci><apply id="S2.E2.m1.1.1.1.1.1.1.cmml" xref="S2.E2.m1.1.1.1.1.1"><csymbol cd="latexml" id="S2.E2.m1.1.1.1.1.1.1.1.cmml" xref="S2.E2.m1.1.1.1.1.1.1.1">conditional</csymbol><ci id="S2.E2.m1.1.1.1.1.1.1.2.cmml" xref="S2.E2.m1.1.1.1.1.1.1.2">𝜋</ci><ci id="S2.E2.m1.1.1.1.1.1.1.3.cmml" xref="S2.E2.m1.1.1.1.1.1.1.3">𝑥</ci></apply></apply><apply id="S2.E2.m1.2.2.2.cmml" xref="S2.E2.m1.2.2.2"><apply id="S2.E2.m1.2.2.2.2.cmml" xref="S2.E2.m1.2.2.2.2"><csymbol cd="ambiguous" id="S2.E2.m1.2.2.2.2.1.cmml" xref="S2.E2.m1.2.2.2.2">superscript</csymbol><apply id="S2.E2.m1.2.2.2.2.2.cmml" xref="S2.E2.m1.2.2.2.2"><csymbol cd="ambiguous" id="S2.E2.m1.2.2.2.2.2.1.cmml" xref="S2.E2.m1.2.2.2.2">subscript</csymbol><csymbol cd="latexml" id="S2.E2.m1.2.2.2.2.2.2.cmml" xref="S2.E2.m1.2.2.2.2.2.2">product</csymbol><apply id="S2.E2.m1.2.2.2.2.2.3.cmml" xref="S2.E2.m1.2.2.2.2.2.3"><eq id="S2.E2.m1.2.2.2.2.2.3.1.cmml" xref="S2.E2.m1.2.2.2.2.2.3.1"></eq><ci id="S2.E2.m1.2.2.2.2.2.3.2.cmml" xref="S2.E2.m1.2.2.2.2.2.3.2">𝑡</ci><cn id="S2.E2.m1.2.2.2.2.2.3.3.cmml" type="integer" xref="S2.E2.m1.2.2.2.2.2.3.3">1</cn></apply></apply><ci id="S2.E2.m1.2.2.2.2.3.cmml" xref="S2.E2.m1.2.2.2.2.3">𝑇</ci></apply><apply id="S2.E2.m1.2.2.2.1.cmml" xref="S2.E2.m1.2.2.2.1"><times id="S2.E2.m1.2.2.2.1.2.cmml" xref="S2.E2.m1.2.2.2.1.2"></times><ci id="S2.E2.m1.2.2.2.1.3.cmml" xref="S2.E2.m1.2.2.2.1.3">𝑝</ci><apply id="S2.E2.m1.2.2.2.1.1.1.1.cmml" xref="S2.E2.m1.2.2.2.1.1.1"><csymbol cd="latexml" id="S2.E2.m1.2.2.2.1.1.1.1.1.cmml" xref="S2.E2.m1.2.2.2.1.1.1.1.1">conditional</csymbol><apply 
id="S2.E2.m1.2.2.2.1.1.1.1.2.cmml" xref="S2.E2.m1.2.2.2.1.1.1.1.2"><csymbol cd="ambiguous" id="S2.E2.m1.2.2.2.1.1.1.1.2.1.cmml" xref="S2.E2.m1.2.2.2.1.1.1.1.2">subscript</csymbol><ci id="S2.E2.m1.2.2.2.1.1.1.1.2.2.cmml" xref="S2.E2.m1.2.2.2.1.1.1.1.2.2">𝜋</ci><ci id="S2.E2.m1.2.2.2.1.1.1.1.2.3.cmml" xref="S2.E2.m1.2.2.2.1.1.1.1.2.3">𝑡</ci></apply><apply id="S2.E2.m1.2.2.2.1.1.1.1.3.cmml" xref="S2.E2.m1.2.2.2.1.1.1.1.3"><csymbol cd="ambiguous" id="S2.E2.m1.2.2.2.1.1.1.1.3.1.cmml" xref="S2.E2.m1.2.2.2.1.1.1.1.3">subscript</csymbol><ci id="S2.E2.m1.2.2.2.1.1.1.1.3.2.cmml" xref="S2.E2.m1.2.2.2.1.1.1.1.3.2">𝑥</ci><ci id="S2.E2.m1.2.2.2.1.1.1.1.3.3.cmml" xref="S2.E2.m1.2.2.2.1.1.1.1.3.3">𝑡</ci></apply></apply></apply></apply></apply><apply id="S2.E2.m1.2.2c.cmml" xref="S2.E2.m1.2.2"><eq id="S2.E2.m1.2.2.5.cmml" xref="S2.E2.m1.2.2.5"></eq><share href="https://arxiv.org/html/2409.12388v2#S2.E2.m1.2.2.2.cmml" id="S2.E2.m1.2.2d.cmml" xref="S2.E2.m1.2.2"></share><apply id="S2.E2.m1.2.2.6.cmml" xref="S2.E2.m1.2.2.6"><apply id="S2.E2.m1.2.2.6.1.cmml" xref="S2.E2.m1.2.2.6.1"><csymbol cd="ambiguous" id="S2.E2.m1.2.2.6.1.1.cmml" xref="S2.E2.m1.2.2.6.1">superscript</csymbol><apply id="S2.E2.m1.2.2.6.1.2.cmml" xref="S2.E2.m1.2.2.6.1"><csymbol cd="ambiguous" id="S2.E2.m1.2.2.6.1.2.1.cmml" xref="S2.E2.m1.2.2.6.1">subscript</csymbol><csymbol cd="latexml" id="S2.E2.m1.2.2.6.1.2.2.cmml" xref="S2.E2.m1.2.2.6.1.2.2">product</csymbol><apply id="S2.E2.m1.2.2.6.1.2.3.cmml" xref="S2.E2.m1.2.2.6.1.2.3"><eq id="S2.E2.m1.2.2.6.1.2.3.1.cmml" xref="S2.E2.m1.2.2.6.1.2.3.1"></eq><ci id="S2.E2.m1.2.2.6.1.2.3.2.cmml" xref="S2.E2.m1.2.2.6.1.2.3.2">𝑡</ci><cn id="S2.E2.m1.2.2.6.1.2.3.3.cmml" type="integer" xref="S2.E2.m1.2.2.6.1.2.3.3">1</cn></apply></apply><ci id="S2.E2.m1.2.2.6.1.3.cmml" xref="S2.E2.m1.2.2.6.1.3">𝑇</ci></apply><apply id="S2.E2.m1.2.2.6.2.cmml" xref="S2.E2.m1.2.2.6.2"><csymbol cd="ambiguous" id="S2.E2.m1.2.2.6.2.1.cmml" xref="S2.E2.m1.2.2.6.2">subscript</csymbol><apply 
id="S2.E2.m1.2.2.6.2.2.cmml" xref="S2.E2.m1.2.2.6.2"><csymbol cd="ambiguous" id="S2.E2.m1.2.2.6.2.2.1.cmml" xref="S2.E2.m1.2.2.6.2">superscript</csymbol><ci id="S2.E2.m1.2.2.6.2.2.2.cmml" xref="S2.E2.m1.2.2.6.2.2.2">𝑦</ci><ci id="S2.E2.m1.2.2.6.2.2.3.cmml" xref="S2.E2.m1.2.2.6.2.2.3">𝑡</ci></apply><apply id="S2.E2.m1.2.2.6.2.3.cmml" xref="S2.E2.m1.2.2.6.2.3"><csymbol cd="ambiguous" id="S2.E2.m1.2.2.6.2.3.1.cmml" xref="S2.E2.m1.2.2.6.2.3">subscript</csymbol><ci id="S2.E2.m1.2.2.6.2.3.2.cmml" xref="S2.E2.m1.2.2.6.2.3.2">𝜋</ci><ci id="S2.E2.m1.2.2.6.2.3.3.cmml" xref="S2.E2.m1.2.2.6.2.3.3">𝑡</ci></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E2.m1.2c">p(\pi|x)=\prod_{t=1}^{T}p(\pi_{t}|x_{t})=\prod_{t=1}^{T}y^{t}_{\pi_{t}}</annotation><annotation encoding="application/x-llamapun" id="S2.E2.m1.2d">italic_p ( italic_π | italic_x ) = ∏ start_POSTSUBSCRIPT italic_t = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_p ( italic_π start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT | italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) = ∏ start_POSTSUBSCRIPT italic_t = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_y start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_π start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(2)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S2.SS1.p1.25">Here <math alttext="p(\pi_{t}|x_{t})" class="ltx_Math" display="inline" id="S2.SS1.p1.22.m1.1"><semantics id="S2.SS1.p1.22.m1.1a"><mrow id="S2.SS1.p1.22.m1.1.1" xref="S2.SS1.p1.22.m1.1.1.cmml"><mi id="S2.SS1.p1.22.m1.1.1.3" xref="S2.SS1.p1.22.m1.1.1.3.cmml">p</mi><mo id="S2.SS1.p1.22.m1.1.1.2" 
xref="S2.SS1.p1.22.m1.1.1.2.cmml"></mo><mrow id="S2.SS1.p1.22.m1.1.1.1.1" xref="S2.SS1.p1.22.m1.1.1.1.1.1.cmml"><mo id="S2.SS1.p1.22.m1.1.1.1.1.2" stretchy="false" xref="S2.SS1.p1.22.m1.1.1.1.1.1.cmml">(</mo><mrow id="S2.SS1.p1.22.m1.1.1.1.1.1" xref="S2.SS1.p1.22.m1.1.1.1.1.1.cmml"><msub id="S2.SS1.p1.22.m1.1.1.1.1.1.2" xref="S2.SS1.p1.22.m1.1.1.1.1.1.2.cmml"><mi id="S2.SS1.p1.22.m1.1.1.1.1.1.2.2" xref="S2.SS1.p1.22.m1.1.1.1.1.1.2.2.cmml">π</mi><mi id="S2.SS1.p1.22.m1.1.1.1.1.1.2.3" xref="S2.SS1.p1.22.m1.1.1.1.1.1.2.3.cmml">t</mi></msub><mo fence="false" id="S2.SS1.p1.22.m1.1.1.1.1.1.1" xref="S2.SS1.p1.22.m1.1.1.1.1.1.1.cmml">|</mo><msub id="S2.SS1.p1.22.m1.1.1.1.1.1.3" xref="S2.SS1.p1.22.m1.1.1.1.1.1.3.cmml"><mi id="S2.SS1.p1.22.m1.1.1.1.1.1.3.2" xref="S2.SS1.p1.22.m1.1.1.1.1.1.3.2.cmml">x</mi><mi id="S2.SS1.p1.22.m1.1.1.1.1.1.3.3" xref="S2.SS1.p1.22.m1.1.1.1.1.1.3.3.cmml">t</mi></msub></mrow><mo id="S2.SS1.p1.22.m1.1.1.1.1.3" stretchy="false" xref="S2.SS1.p1.22.m1.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.22.m1.1b"><apply id="S2.SS1.p1.22.m1.1.1.cmml" xref="S2.SS1.p1.22.m1.1.1"><times id="S2.SS1.p1.22.m1.1.1.2.cmml" xref="S2.SS1.p1.22.m1.1.1.2"></times><ci id="S2.SS1.p1.22.m1.1.1.3.cmml" xref="S2.SS1.p1.22.m1.1.1.3">𝑝</ci><apply id="S2.SS1.p1.22.m1.1.1.1.1.1.cmml" xref="S2.SS1.p1.22.m1.1.1.1.1"><csymbol cd="latexml" id="S2.SS1.p1.22.m1.1.1.1.1.1.1.cmml" xref="S2.SS1.p1.22.m1.1.1.1.1.1.1">conditional</csymbol><apply id="S2.SS1.p1.22.m1.1.1.1.1.1.2.cmml" xref="S2.SS1.p1.22.m1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S2.SS1.p1.22.m1.1.1.1.1.1.2.1.cmml" xref="S2.SS1.p1.22.m1.1.1.1.1.1.2">subscript</csymbol><ci id="S2.SS1.p1.22.m1.1.1.1.1.1.2.2.cmml" xref="S2.SS1.p1.22.m1.1.1.1.1.1.2.2">𝜋</ci><ci id="S2.SS1.p1.22.m1.1.1.1.1.1.2.3.cmml" xref="S2.SS1.p1.22.m1.1.1.1.1.1.2.3">𝑡</ci></apply><apply id="S2.SS1.p1.22.m1.1.1.1.1.1.3.cmml" xref="S2.SS1.p1.22.m1.1.1.1.1.1.3"><csymbol cd="ambiguous" 
id="S2.SS1.p1.22.m1.1.1.1.1.1.3.1.cmml" xref="S2.SS1.p1.22.m1.1.1.1.1.1.3">subscript</csymbol><ci id="S2.SS1.p1.22.m1.1.1.1.1.1.3.2.cmml" xref="S2.SS1.p1.22.m1.1.1.1.1.1.3.2">𝑥</ci><ci id="S2.SS1.p1.22.m1.1.1.1.1.1.3.3.cmml" xref="S2.SS1.p1.22.m1.1.1.1.1.1.3.3">𝑡</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.22.m1.1c">p(\pi_{t}|x_{t})</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.22.m1.1d">italic_p ( italic_π start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT | italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT )</annotation></semantics></math> is typically computed by a linear projection followed by a softmax function, yielding the frame-wise posterior of <math alttext="\pi_{t}" class="ltx_Math" display="inline" id="S2.SS1.p1.23.m2.1"><semantics id="S2.SS1.p1.23.m2.1a"><msub id="S2.SS1.p1.23.m2.1.1" xref="S2.SS1.p1.23.m2.1.1.cmml"><mi id="S2.SS1.p1.23.m2.1.1.2" xref="S2.SS1.p1.23.m2.1.1.2.cmml">π</mi><mi id="S2.SS1.p1.23.m2.1.1.3" xref="S2.SS1.p1.23.m2.1.1.3.cmml">t</mi></msub><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.23.m2.1b"><apply id="S2.SS1.p1.23.m2.1.1.cmml" xref="S2.SS1.p1.23.m2.1.1"><csymbol cd="ambiguous" id="S2.SS1.p1.23.m2.1.1.1.cmml" xref="S2.SS1.p1.23.m2.1.1">subscript</csymbol><ci id="S2.SS1.p1.23.m2.1.1.2.cmml" xref="S2.SS1.p1.23.m2.1.1.2">𝜋</ci><ci id="S2.SS1.p1.23.m2.1.1.3.cmml" xref="S2.SS1.p1.23.m2.1.1.3">𝑡</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.23.m2.1c">\pi_{t}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.23.m2.1d">italic_π start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math> at the <math alttext="t" class="ltx_Math" display="inline" id="S2.SS1.p1.24.m3.1"><semantics id="S2.SS1.p1.24.m3.1a"><mi id="S2.SS1.p1.24.m3.1.1" xref="S2.SS1.p1.24.m3.1.1.cmml">t</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.24.m3.1b"><ci id="S2.SS1.p1.24.m3.1.1.cmml" 
xref="S2.SS1.p1.24.m3.1.1">𝑡</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.24.m3.1c">t</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.24.m3.1d">italic_t</annotation></semantics></math>-th frame. We denote the output as <math alttext="y^{t}_{\pi_{t}}" class="ltx_Math" display="inline" id="S2.SS1.p1.25.m4.1"><semantics id="S2.SS1.p1.25.m4.1a"><msubsup id="S2.SS1.p1.25.m4.1.1" xref="S2.SS1.p1.25.m4.1.1.cmml"><mi id="S2.SS1.p1.25.m4.1.1.2.2" xref="S2.SS1.p1.25.m4.1.1.2.2.cmml">y</mi><msub id="S2.SS1.p1.25.m4.1.1.3" xref="S2.SS1.p1.25.m4.1.1.3.cmml"><mi id="S2.SS1.p1.25.m4.1.1.3.2" xref="S2.SS1.p1.25.m4.1.1.3.2.cmml">π</mi><mi id="S2.SS1.p1.25.m4.1.1.3.3" xref="S2.SS1.p1.25.m4.1.1.3.3.cmml">t</mi></msub><mi id="S2.SS1.p1.25.m4.1.1.2.3" xref="S2.SS1.p1.25.m4.1.1.2.3.cmml">t</mi></msubsup><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.25.m4.1b"><apply id="S2.SS1.p1.25.m4.1.1.cmml" xref="S2.SS1.p1.25.m4.1.1"><csymbol cd="ambiguous" id="S2.SS1.p1.25.m4.1.1.1.cmml" xref="S2.SS1.p1.25.m4.1.1">subscript</csymbol><apply id="S2.SS1.p1.25.m4.1.1.2.cmml" xref="S2.SS1.p1.25.m4.1.1"><csymbol cd="ambiguous" id="S2.SS1.p1.25.m4.1.1.2.1.cmml" xref="S2.SS1.p1.25.m4.1.1">superscript</csymbol><ci id="S2.SS1.p1.25.m4.1.1.2.2.cmml" xref="S2.SS1.p1.25.m4.1.1.2.2">𝑦</ci><ci id="S2.SS1.p1.25.m4.1.1.2.3.cmml" xref="S2.SS1.p1.25.m4.1.1.2.3">𝑡</ci></apply><apply id="S2.SS1.p1.25.m4.1.1.3.cmml" xref="S2.SS1.p1.25.m4.1.1.3"><csymbol cd="ambiguous" id="S2.SS1.p1.25.m4.1.1.3.1.cmml" xref="S2.SS1.p1.25.m4.1.1.3">subscript</csymbol><ci id="S2.SS1.p1.25.m4.1.1.3.2.cmml" xref="S2.SS1.p1.25.m4.1.1.3.2">𝜋</ci><ci id="S2.SS1.p1.25.m4.1.1.3.3.cmml" xref="S2.SS1.p1.25.m4.1.1.3.3">𝑡</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.25.m4.1c">y^{t}_{\pi_{t}}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.25.m4.1d">italic_y start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT 
start_POSTSUBSCRIPT italic_π start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT end_POSTSUBSCRIPT</annotation></semantics></math>.</p> </div> <div class="ltx_para" id="S2.SS1.p2"> <p class="ltx_p" id="S2.SS1.p2.6">Because enumerating all alignment paths is combinatorially explosive, the forward-backward algorithm <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib32" title="">32</a>]</cite> is commonly used to compute <math alttext="P(l|x)" class="ltx_Math" display="inline" id="S2.SS1.p2.1.m1.1"><semantics id="S2.SS1.p2.1.m1.1a"><mrow id="S2.SS1.p2.1.m1.1.1" xref="S2.SS1.p2.1.m1.1.1.cmml"><mi id="S2.SS1.p2.1.m1.1.1.3" xref="S2.SS1.p2.1.m1.1.1.3.cmml">P</mi><mo id="S2.SS1.p2.1.m1.1.1.2" xref="S2.SS1.p2.1.m1.1.1.2.cmml"></mo><mrow id="S2.SS1.p2.1.m1.1.1.1.1" xref="S2.SS1.p2.1.m1.1.1.1.1.1.cmml"><mo id="S2.SS1.p2.1.m1.1.1.1.1.2" stretchy="false" xref="S2.SS1.p2.1.m1.1.1.1.1.1.cmml">(</mo><mrow id="S2.SS1.p2.1.m1.1.1.1.1.1" xref="S2.SS1.p2.1.m1.1.1.1.1.1.cmml"><mi id="S2.SS1.p2.1.m1.1.1.1.1.1.2" xref="S2.SS1.p2.1.m1.1.1.1.1.1.2.cmml">l</mi><mo fence="false" id="S2.SS1.p2.1.m1.1.1.1.1.1.1" xref="S2.SS1.p2.1.m1.1.1.1.1.1.1.cmml">|</mo><mi id="S2.SS1.p2.1.m1.1.1.1.1.1.3" xref="S2.SS1.p2.1.m1.1.1.1.1.1.3.cmml">x</mi></mrow><mo id="S2.SS1.p2.1.m1.1.1.1.1.3" stretchy="false" xref="S2.SS1.p2.1.m1.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p2.1.m1.1b"><apply id="S2.SS1.p2.1.m1.1.1.cmml" xref="S2.SS1.p2.1.m1.1.1"><times id="S2.SS1.p2.1.m1.1.1.2.cmml" xref="S2.SS1.p2.1.m1.1.1.2"></times><ci id="S2.SS1.p2.1.m1.1.1.3.cmml" xref="S2.SS1.p2.1.m1.1.1.3">𝑃</ci><apply id="S2.SS1.p2.1.m1.1.1.1.1.1.cmml" xref="S2.SS1.p2.1.m1.1.1.1.1"><csymbol cd="latexml" id="S2.SS1.p2.1.m1.1.1.1.1.1.1.cmml" xref="S2.SS1.p2.1.m1.1.1.1.1.1.1">conditional</csymbol><ci id="S2.SS1.p2.1.m1.1.1.1.1.1.2.cmml" xref="S2.SS1.p2.1.m1.1.1.1.1.1.2">𝑙</ci><ci id="S2.SS1.p2.1.m1.1.1.1.1.1.3.cmml" 
xref="S2.SS1.p2.1.m1.1.1.1.1.1.3">𝑥</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p2.1.m1.1c">P(l|x)</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p2.1.m1.1d">italic_P ( italic_l | italic_x )</annotation></semantics></math> efficiently. First, the original label sequence <math alttext="l" class="ltx_Math" display="inline" id="S2.SS1.p2.2.m2.1"><semantics id="S2.SS1.p2.2.m2.1a"><mi id="S2.SS1.p2.2.m2.1.1" xref="S2.SS1.p2.2.m2.1.1.cmml">l</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.p2.2.m2.1b"><ci id="S2.SS1.p2.2.m2.1.1.cmml" xref="S2.SS1.p2.2.m2.1.1">𝑙</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p2.2.m2.1c">l</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p2.2.m2.1d">italic_l</annotation></semantics></math> is extended by inserting the blank symbol <math alttext="\varnothing" class="ltx_Math" display="inline" id="S2.SS1.p2.3.m3.1"><semantics id="S2.SS1.p2.3.m3.1a"><mi id="S2.SS1.p2.3.m3.1.1" mathvariant="normal" xref="S2.SS1.p2.3.m3.1.1.cmml">∅</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.p2.3.m3.1b"><emptyset id="S2.SS1.p2.3.m3.1.1.cmml" xref="S2.SS1.p2.3.m3.1.1"></emptyset></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p2.3.m3.1c">\varnothing</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p2.3.m3.1d">∅</annotation></semantics></math> between every pair of adjacent tokens, as well as at the beginning and end: <math alttext="l^{\prime}=[\varnothing,l_{1},\varnothing,...,\varnothing,l_{U},\varnothing]" class="ltx_Math" display="inline" id="S2.SS1.p2.4.m4.7"><semantics id="S2.SS1.p2.4.m4.7a"><mrow id="S2.SS1.p2.4.m4.7.7" xref="S2.SS1.p2.4.m4.7.7.cmml"><msup id="S2.SS1.p2.4.m4.7.7.4" xref="S2.SS1.p2.4.m4.7.7.4.cmml"><mi id="S2.SS1.p2.4.m4.7.7.4.2" xref="S2.SS1.p2.4.m4.7.7.4.2.cmml">l</mi><mo id="S2.SS1.p2.4.m4.7.7.4.3" xref="S2.SS1.p2.4.m4.7.7.4.3.cmml">′</mo></msup><mo id="S2.SS1.p2.4.m4.7.7.3" 
xref="S2.SS1.p2.4.m4.7.7.3.cmml">=</mo><mrow id="S2.SS1.p2.4.m4.7.7.2.2" xref="S2.SS1.p2.4.m4.7.7.2.3.cmml"><mo id="S2.SS1.p2.4.m4.7.7.2.2.3" stretchy="false" xref="S2.SS1.p2.4.m4.7.7.2.3.cmml">[</mo><mi id="S2.SS1.p2.4.m4.1.1" mathvariant="normal" xref="S2.SS1.p2.4.m4.1.1.cmml">∅</mi><mo id="S2.SS1.p2.4.m4.7.7.2.2.4" xref="S2.SS1.p2.4.m4.7.7.2.3.cmml">,</mo><msub id="S2.SS1.p2.4.m4.6.6.1.1.1" xref="S2.SS1.p2.4.m4.6.6.1.1.1.cmml"><mi id="S2.SS1.p2.4.m4.6.6.1.1.1.2" xref="S2.SS1.p2.4.m4.6.6.1.1.1.2.cmml">l</mi><mn id="S2.SS1.p2.4.m4.6.6.1.1.1.3" xref="S2.SS1.p2.4.m4.6.6.1.1.1.3.cmml">1</mn></msub><mo id="S2.SS1.p2.4.m4.7.7.2.2.5" xref="S2.SS1.p2.4.m4.7.7.2.3.cmml">,</mo><mi id="S2.SS1.p2.4.m4.2.2" mathvariant="normal" xref="S2.SS1.p2.4.m4.2.2.cmml">∅</mi><mo id="S2.SS1.p2.4.m4.7.7.2.2.6" xref="S2.SS1.p2.4.m4.7.7.2.3.cmml">,</mo><mi id="S2.SS1.p2.4.m4.3.3" mathvariant="normal" xref="S2.SS1.p2.4.m4.3.3.cmml">…</mi><mo id="S2.SS1.p2.4.m4.7.7.2.2.7" xref="S2.SS1.p2.4.m4.7.7.2.3.cmml">,</mo><mi id="S2.SS1.p2.4.m4.4.4" mathvariant="normal" xref="S2.SS1.p2.4.m4.4.4.cmml">∅</mi><mo id="S2.SS1.p2.4.m4.7.7.2.2.8" xref="S2.SS1.p2.4.m4.7.7.2.3.cmml">,</mo><msub id="S2.SS1.p2.4.m4.7.7.2.2.2" xref="S2.SS1.p2.4.m4.7.7.2.2.2.cmml"><mi id="S2.SS1.p2.4.m4.7.7.2.2.2.2" xref="S2.SS1.p2.4.m4.7.7.2.2.2.2.cmml">l</mi><mi id="S2.SS1.p2.4.m4.7.7.2.2.2.3" xref="S2.SS1.p2.4.m4.7.7.2.2.2.3.cmml">U</mi></msub><mo id="S2.SS1.p2.4.m4.7.7.2.2.9" xref="S2.SS1.p2.4.m4.7.7.2.3.cmml">,</mo><mi id="S2.SS1.p2.4.m4.5.5" mathvariant="normal" xref="S2.SS1.p2.4.m4.5.5.cmml">∅</mi><mo id="S2.SS1.p2.4.m4.7.7.2.2.10" stretchy="false" xref="S2.SS1.p2.4.m4.7.7.2.3.cmml">]</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p2.4.m4.7b"><apply id="S2.SS1.p2.4.m4.7.7.cmml" xref="S2.SS1.p2.4.m4.7.7"><eq id="S2.SS1.p2.4.m4.7.7.3.cmml" xref="S2.SS1.p2.4.m4.7.7.3"></eq><apply id="S2.SS1.p2.4.m4.7.7.4.cmml" xref="S2.SS1.p2.4.m4.7.7.4"><csymbol cd="ambiguous" id="S2.SS1.p2.4.m4.7.7.4.1.cmml" 
xref="S2.SS1.p2.4.m4.7.7.4">superscript</csymbol><ci id="S2.SS1.p2.4.m4.7.7.4.2.cmml" xref="S2.SS1.p2.4.m4.7.7.4.2">𝑙</ci><ci id="S2.SS1.p2.4.m4.7.7.4.3.cmml" xref="S2.SS1.p2.4.m4.7.7.4.3">′</ci></apply><list id="S2.SS1.p2.4.m4.7.7.2.3.cmml" xref="S2.SS1.p2.4.m4.7.7.2.2"><emptyset id="S2.SS1.p2.4.m4.1.1.cmml" xref="S2.SS1.p2.4.m4.1.1"></emptyset><apply id="S2.SS1.p2.4.m4.6.6.1.1.1.cmml" xref="S2.SS1.p2.4.m4.6.6.1.1.1"><csymbol cd="ambiguous" id="S2.SS1.p2.4.m4.6.6.1.1.1.1.cmml" xref="S2.SS1.p2.4.m4.6.6.1.1.1">subscript</csymbol><ci id="S2.SS1.p2.4.m4.6.6.1.1.1.2.cmml" xref="S2.SS1.p2.4.m4.6.6.1.1.1.2">𝑙</ci><cn id="S2.SS1.p2.4.m4.6.6.1.1.1.3.cmml" type="integer" xref="S2.SS1.p2.4.m4.6.6.1.1.1.3">1</cn></apply><emptyset id="S2.SS1.p2.4.m4.2.2.cmml" xref="S2.SS1.p2.4.m4.2.2"></emptyset><ci id="S2.SS1.p2.4.m4.3.3.cmml" xref="S2.SS1.p2.4.m4.3.3">…</ci><emptyset id="S2.SS1.p2.4.m4.4.4.cmml" xref="S2.SS1.p2.4.m4.4.4"></emptyset><apply id="S2.SS1.p2.4.m4.7.7.2.2.2.cmml" xref="S2.SS1.p2.4.m4.7.7.2.2.2"><csymbol cd="ambiguous" id="S2.SS1.p2.4.m4.7.7.2.2.2.1.cmml" xref="S2.SS1.p2.4.m4.7.7.2.2.2">subscript</csymbol><ci id="S2.SS1.p2.4.m4.7.7.2.2.2.2.cmml" xref="S2.SS1.p2.4.m4.7.7.2.2.2.2">𝑙</ci><ci id="S2.SS1.p2.4.m4.7.7.2.2.2.3.cmml" xref="S2.SS1.p2.4.m4.7.7.2.2.2.3">𝑈</ci></apply><emptyset id="S2.SS1.p2.4.m4.5.5.cmml" xref="S2.SS1.p2.4.m4.5.5"></emptyset></list></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p2.4.m4.7c">l^{\prime}=[\varnothing,l_{1},\varnothing,...,\varnothing,l_{U},\varnothing]</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p2.4.m4.7d">italic_l start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = [ ∅ , italic_l start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , ∅ , … , ∅ , italic_l start_POSTSUBSCRIPT italic_U end_POSTSUBSCRIPT , ∅ ]</annotation></semantics></math>. 
Then we recursively compute the forward and backward variables <math alttext="\alpha(t,v)" class="ltx_Math" display="inline" id="S2.SS1.p2.5.m5.2"><semantics id="S2.SS1.p2.5.m5.2a"><mrow id="S2.SS1.p2.5.m5.2.3" xref="S2.SS1.p2.5.m5.2.3.cmml"><mi id="S2.SS1.p2.5.m5.2.3.2" xref="S2.SS1.p2.5.m5.2.3.2.cmml">α</mi><mo id="S2.SS1.p2.5.m5.2.3.1" xref="S2.SS1.p2.5.m5.2.3.1.cmml"></mo><mrow id="S2.SS1.p2.5.m5.2.3.3.2" xref="S2.SS1.p2.5.m5.2.3.3.1.cmml"><mo id="S2.SS1.p2.5.m5.2.3.3.2.1" stretchy="false" xref="S2.SS1.p2.5.m5.2.3.3.1.cmml">(</mo><mi id="S2.SS1.p2.5.m5.1.1" xref="S2.SS1.p2.5.m5.1.1.cmml">t</mi><mo id="S2.SS1.p2.5.m5.2.3.3.2.2" xref="S2.SS1.p2.5.m5.2.3.3.1.cmml">,</mo><mi id="S2.SS1.p2.5.m5.2.2" xref="S2.SS1.p2.5.m5.2.2.cmml">v</mi><mo id="S2.SS1.p2.5.m5.2.3.3.2.3" stretchy="false" xref="S2.SS1.p2.5.m5.2.3.3.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p2.5.m5.2b"><apply id="S2.SS1.p2.5.m5.2.3.cmml" xref="S2.SS1.p2.5.m5.2.3"><times id="S2.SS1.p2.5.m5.2.3.1.cmml" xref="S2.SS1.p2.5.m5.2.3.1"></times><ci id="S2.SS1.p2.5.m5.2.3.2.cmml" xref="S2.SS1.p2.5.m5.2.3.2">𝛼</ci><interval closure="open" id="S2.SS1.p2.5.m5.2.3.3.1.cmml" xref="S2.SS1.p2.5.m5.2.3.3.2"><ci id="S2.SS1.p2.5.m5.1.1.cmml" xref="S2.SS1.p2.5.m5.1.1">𝑡</ci><ci id="S2.SS1.p2.5.m5.2.2.cmml" xref="S2.SS1.p2.5.m5.2.2">𝑣</ci></interval></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p2.5.m5.2c">\alpha(t,v)</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p2.5.m5.2d">italic_α ( italic_t , italic_v )</annotation></semantics></math> and <math alttext="\beta(t,v)" class="ltx_Math" display="inline" id="S2.SS1.p2.6.m6.2"><semantics id="S2.SS1.p2.6.m6.2a"><mrow id="S2.SS1.p2.6.m6.2.3" xref="S2.SS1.p2.6.m6.2.3.cmml"><mi id="S2.SS1.p2.6.m6.2.3.2" xref="S2.SS1.p2.6.m6.2.3.2.cmml">β</mi><mo id="S2.SS1.p2.6.m6.2.3.1" xref="S2.SS1.p2.6.m6.2.3.1.cmml"></mo><mrow id="S2.SS1.p2.6.m6.2.3.3.2" xref="S2.SS1.p2.6.m6.2.3.3.1.cmml"><mo 
id="S2.SS1.p2.6.m6.2.3.3.2.1" stretchy="false" xref="S2.SS1.p2.6.m6.2.3.3.1.cmml">(</mo><mi id="S2.SS1.p2.6.m6.1.1" xref="S2.SS1.p2.6.m6.1.1.cmml">t</mi><mo id="S2.SS1.p2.6.m6.2.3.3.2.2" xref="S2.SS1.p2.6.m6.2.3.3.1.cmml">,</mo><mi id="S2.SS1.p2.6.m6.2.2" xref="S2.SS1.p2.6.m6.2.2.cmml">v</mi><mo id="S2.SS1.p2.6.m6.2.3.3.2.3" stretchy="false" xref="S2.SS1.p2.6.m6.2.3.3.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p2.6.m6.2b"><apply id="S2.SS1.p2.6.m6.2.3.cmml" xref="S2.SS1.p2.6.m6.2.3"><times id="S2.SS1.p2.6.m6.2.3.1.cmml" xref="S2.SS1.p2.6.m6.2.3.1"></times><ci id="S2.SS1.p2.6.m6.2.3.2.cmml" xref="S2.SS1.p2.6.m6.2.3.2">𝛽</ci><interval closure="open" id="S2.SS1.p2.6.m6.2.3.3.1.cmml" xref="S2.SS1.p2.6.m6.2.3.3.2"><ci id="S2.SS1.p2.6.m6.1.1.cmml" xref="S2.SS1.p2.6.m6.1.1">𝑡</ci><ci id="S2.SS1.p2.6.m6.2.2.cmml" xref="S2.SS1.p2.6.m6.2.2">𝑣</ci></interval></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p2.6.m6.2c">\beta(t,v)</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p2.6.m6.2d">italic_β ( italic_t , italic_v )</annotation></semantics></math>:</p> <table class="ltx_equation ltx_eqn_table" id="S2.E3"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="\alpha(t,v)=\sum_{\begin{subarray}{c}\pi:B(\pi_{1:t})=B(l^{\prime}_{1:v})\\ \pi_{t}=l^{\prime}_{v}\end{subarray}}\prod^{t}_{t^{\prime}=1}y^{t^{\prime}}_{% \pi_{t^{\prime}}}" class="ltx_Math" display="block" id="S2.E3.m1.3"><semantics id="S2.E3.m1.3a"><mrow id="S2.E3.m1.3.4" xref="S2.E3.m1.3.4.cmml"><mrow id="S2.E3.m1.3.4.2" xref="S2.E3.m1.3.4.2.cmml"><mi id="S2.E3.m1.3.4.2.2" xref="S2.E3.m1.3.4.2.2.cmml">α</mi><mo id="S2.E3.m1.3.4.2.1" xref="S2.E3.m1.3.4.2.1.cmml"></mo><mrow id="S2.E3.m1.3.4.2.3.2" xref="S2.E3.m1.3.4.2.3.1.cmml"><mo id="S2.E3.m1.3.4.2.3.2.1" stretchy="false" 
xref="S2.E3.m1.3.4.2.3.1.cmml">(</mo><mi id="S2.E3.m1.2.2" xref="S2.E3.m1.2.2.cmml">t</mi><mo id="S2.E3.m1.3.4.2.3.2.2" xref="S2.E3.m1.3.4.2.3.1.cmml">,</mo><mi id="S2.E3.m1.3.3" xref="S2.E3.m1.3.3.cmml">v</mi><mo id="S2.E3.m1.3.4.2.3.2.3" stretchy="false" xref="S2.E3.m1.3.4.2.3.1.cmml">)</mo></mrow></mrow><mo id="S2.E3.m1.3.4.1" rspace="0.111em" xref="S2.E3.m1.3.4.1.cmml">=</mo><mrow id="S2.E3.m1.3.4.3" xref="S2.E3.m1.3.4.3.cmml"><munder id="S2.E3.m1.3.4.3.1" xref="S2.E3.m1.3.4.3.1.cmml"><mo id="S2.E3.m1.3.4.3.1.2" movablelimits="false" rspace="0em" xref="S2.E3.m1.3.4.3.1.2.cmml">∑</mo><mtable id="S2.E3.m1.1.1.1.1.1.1" rowspacing="0pt" xref="S2.E3.m1.1.1.1.2.cmml"><mtr id="S2.E3.m1.1.1.1.1.1.1a" xref="S2.E3.m1.1.1.1.2.cmml"><mtd id="S2.E3.m1.1.1.1.1.1.1b" xref="S2.E3.m1.1.1.1.2.cmml"><mrow id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.cmml"><mi id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.4" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.4.cmml">π</mi><mo id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.3" lspace="0.278em" rspace="0.278em" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.3.cmml">:</mo><mrow id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.cmml"><mrow id="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1" xref="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.cmml"><mi id="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.3" xref="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.3.cmml">B</mi><mo id="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.2" xref="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml"></mo><mrow id="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1" xref="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml"><mo id="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2" stretchy="false" xref="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml">(</mo><msub id="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1" xref="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml"><mi id="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2" xref="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml">π</mi><mrow id="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3" xref="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.cmml"><mn 
id="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.2" xref="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.2.cmml">1</mn><mo id="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.1" lspace="0.278em" rspace="0.278em" xref="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.1.cmml">:</mo><mi id="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.3" xref="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml">t</mi></mrow></msub><mo id="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3" stretchy="false" xref="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml">)</mo></mrow></mrow><mo id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.3" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.3.cmml">=</mo><mrow id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.cmml"><mi id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.3" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.3.cmml">B</mi><mo id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.2" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.2.cmml"></mo><mrow id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.cmml"><mo id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.2" stretchy="false" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.cmml">(</mo><msubsup id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.cmml"><mi id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.2.2" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.2.2.cmml">l</mi><mrow id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.cmml"><mn id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.2" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.2.cmml">1</mn><mo id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.1" lspace="0.278em" rspace="0.278em" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.1.cmml">:</mo><mi id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.cmml">v</mi></mrow><mo id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.2.3" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.2.3.cmml">′</mo></msubsup><mo id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.3" 
stretchy="false" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.cmml">)</mo></mrow></mrow></mrow></mrow></mtd></mtr><mtr id="S2.E3.m1.1.1.1.1.1.1c" xref="S2.E3.m1.1.1.1.2.cmml"><mtd id="S2.E3.m1.1.1.1.1.1.1d" xref="S2.E3.m1.1.1.1.2.cmml"><mrow id="S2.E3.m1.1.1.1.1.1.1.3.1.1" xref="S2.E3.m1.1.1.1.1.1.1.3.1.1.cmml"><msub id="S2.E3.m1.1.1.1.1.1.1.3.1.1.2" xref="S2.E3.m1.1.1.1.1.1.1.3.1.1.2.cmml"><mi id="S2.E3.m1.1.1.1.1.1.1.3.1.1.2.2" xref="S2.E3.m1.1.1.1.1.1.1.3.1.1.2.2.cmml">π</mi><mi id="S2.E3.m1.1.1.1.1.1.1.3.1.1.2.3" xref="S2.E3.m1.1.1.1.1.1.1.3.1.1.2.3.cmml">t</mi></msub><mo id="S2.E3.m1.1.1.1.1.1.1.3.1.1.1" xref="S2.E3.m1.1.1.1.1.1.1.3.1.1.1.cmml">=</mo><msubsup id="S2.E3.m1.1.1.1.1.1.1.3.1.1.3" xref="S2.E3.m1.1.1.1.1.1.1.3.1.1.3.cmml"><mi id="S2.E3.m1.1.1.1.1.1.1.3.1.1.3.2.2" xref="S2.E3.m1.1.1.1.1.1.1.3.1.1.3.2.2.cmml">l</mi><mi id="S2.E3.m1.1.1.1.1.1.1.3.1.1.3.3" xref="S2.E3.m1.1.1.1.1.1.1.3.1.1.3.3.cmml">v</mi><mo id="S2.E3.m1.1.1.1.1.1.1.3.1.1.3.2.3" xref="S2.E3.m1.1.1.1.1.1.1.3.1.1.3.2.3.cmml">′</mo></msubsup></mrow></mtd></mtr></mtable></munder><mrow id="S2.E3.m1.3.4.3.2" xref="S2.E3.m1.3.4.3.2.cmml"><munderover id="S2.E3.m1.3.4.3.2.1" xref="S2.E3.m1.3.4.3.2.1.cmml"><mo id="S2.E3.m1.3.4.3.2.1.2.2" movablelimits="false" xref="S2.E3.m1.3.4.3.2.1.2.2.cmml">∏</mo><mrow id="S2.E3.m1.3.4.3.2.1.3" xref="S2.E3.m1.3.4.3.2.1.3.cmml"><msup id="S2.E3.m1.3.4.3.2.1.3.2" xref="S2.E3.m1.3.4.3.2.1.3.2.cmml"><mi id="S2.E3.m1.3.4.3.2.1.3.2.2" xref="S2.E3.m1.3.4.3.2.1.3.2.2.cmml">t</mi><mo id="S2.E3.m1.3.4.3.2.1.3.2.3" xref="S2.E3.m1.3.4.3.2.1.3.2.3.cmml">′</mo></msup><mo id="S2.E3.m1.3.4.3.2.1.3.1" xref="S2.E3.m1.3.4.3.2.1.3.1.cmml">=</mo><mn id="S2.E3.m1.3.4.3.2.1.3.3" xref="S2.E3.m1.3.4.3.2.1.3.3.cmml">1</mn></mrow><mi id="S2.E3.m1.3.4.3.2.1.2.3" xref="S2.E3.m1.3.4.3.2.1.2.3.cmml">t</mi></munderover><msubsup id="S2.E3.m1.3.4.3.2.2" xref="S2.E3.m1.3.4.3.2.2.cmml"><mi id="S2.E3.m1.3.4.3.2.2.2.2" xref="S2.E3.m1.3.4.3.2.2.2.2.cmml">y</mi><msub id="S2.E3.m1.3.4.3.2.2.3" 
xref="S2.E3.m1.3.4.3.2.2.3.cmml"><mi id="S2.E3.m1.3.4.3.2.2.3.2" xref="S2.E3.m1.3.4.3.2.2.3.2.cmml">π</mi><msup id="S2.E3.m1.3.4.3.2.2.3.3" xref="S2.E3.m1.3.4.3.2.2.3.3.cmml"><mi id="S2.E3.m1.3.4.3.2.2.3.3.2" xref="S2.E3.m1.3.4.3.2.2.3.3.2.cmml">t</mi><mo id="S2.E3.m1.3.4.3.2.2.3.3.3" xref="S2.E3.m1.3.4.3.2.2.3.3.3.cmml">′</mo></msup></msub><msup id="S2.E3.m1.3.4.3.2.2.2.3" xref="S2.E3.m1.3.4.3.2.2.2.3.cmml"><mi id="S2.E3.m1.3.4.3.2.2.2.3.2" xref="S2.E3.m1.3.4.3.2.2.2.3.2.cmml">t</mi><mo id="S2.E3.m1.3.4.3.2.2.2.3.3" xref="S2.E3.m1.3.4.3.2.2.2.3.3.cmml">′</mo></msup></msubsup></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.E3.m1.3b"><apply id="S2.E3.m1.3.4.cmml" xref="S2.E3.m1.3.4"><eq id="S2.E3.m1.3.4.1.cmml" xref="S2.E3.m1.3.4.1"></eq><apply id="S2.E3.m1.3.4.2.cmml" xref="S2.E3.m1.3.4.2"><times id="S2.E3.m1.3.4.2.1.cmml" xref="S2.E3.m1.3.4.2.1"></times><ci id="S2.E3.m1.3.4.2.2.cmml" xref="S2.E3.m1.3.4.2.2">𝛼</ci><interval closure="open" id="S2.E3.m1.3.4.2.3.1.cmml" xref="S2.E3.m1.3.4.2.3.2"><ci id="S2.E3.m1.2.2.cmml" xref="S2.E3.m1.2.2">𝑡</ci><ci id="S2.E3.m1.3.3.cmml" xref="S2.E3.m1.3.3">𝑣</ci></interval></apply><apply id="S2.E3.m1.3.4.3.cmml" xref="S2.E3.m1.3.4.3"><apply id="S2.E3.m1.3.4.3.1.cmml" xref="S2.E3.m1.3.4.3.1"><csymbol cd="ambiguous" id="S2.E3.m1.3.4.3.1.1.cmml" xref="S2.E3.m1.3.4.3.1">subscript</csymbol><sum id="S2.E3.m1.3.4.3.1.2.cmml" xref="S2.E3.m1.3.4.3.1.2"></sum><list id="S2.E3.m1.1.1.1.2.cmml" xref="S2.E3.m1.1.1.1.1.1.1"><matrix id="S2.E3.m1.1.1.1.1.1.1.cmml" xref="S2.E3.m1.1.1.1.1.1.1"><matrixrow id="S2.E3.m1.1.1.1.1.1.1a.cmml" xref="S2.E3.m1.1.1.1.1.1.1"><apply id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.cmml" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2"><ci id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.3.cmml" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.3">:</ci><ci id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.4.cmml" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.4">𝜋</ci><apply id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.cmml" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2"><eq 
id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.3.cmml" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.3"></eq><apply id="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1"><times id="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.2"></times><ci id="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.3">𝐵</ci><apply id="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1">subscript</csymbol><ci id="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2">𝜋</ci><apply id="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3"><ci id="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.1.cmml" xref="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.1">:</ci><cn id="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.2.cmml" type="integer" xref="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.2">1</cn><ci id="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml" xref="S2.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.3">𝑡</ci></apply></apply></apply><apply id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.cmml" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2"><times id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.2.cmml" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.2"></times><ci id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.3.cmml" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.3">𝐵</ci><apply id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.cmml" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1"><csymbol cd="ambiguous" id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.1.cmml" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1">subscript</csymbol><apply id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.2.cmml" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1"><csymbol cd="ambiguous" id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.2.1.cmml" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1">superscript</csymbol><ci 
id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.2.2.cmml" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.2.2">𝑙</ci><ci id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.2.3.cmml" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.2.3">′</ci></apply><apply id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.cmml" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3"><ci id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.1.cmml" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.1">:</ci><cn id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.2.cmml" type="integer" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.2">1</cn><ci id="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.cmml" xref="S2.E3.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3">𝑣</ci></apply></apply></apply></apply></apply></matrixrow><matrixrow id="S2.E3.m1.1.1.1.1.1.1b.cmml" xref="S2.E3.m1.1.1.1.1.1.1"><apply id="S2.E3.m1.1.1.1.1.1.1.3.1.1.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3.1.1"><eq id="S2.E3.m1.1.1.1.1.1.1.3.1.1.1.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3.1.1.1"></eq><apply id="S2.E3.m1.1.1.1.1.1.1.3.1.1.2.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3.1.1.2"><csymbol cd="ambiguous" id="S2.E3.m1.1.1.1.1.1.1.3.1.1.2.1.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3.1.1.2">subscript</csymbol><ci id="S2.E3.m1.1.1.1.1.1.1.3.1.1.2.2.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3.1.1.2.2">𝜋</ci><ci id="S2.E3.m1.1.1.1.1.1.1.3.1.1.2.3.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3.1.1.2.3">𝑡</ci></apply><apply id="S2.E3.m1.1.1.1.1.1.1.3.1.1.3.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3.1.1.3"><csymbol cd="ambiguous" id="S2.E3.m1.1.1.1.1.1.1.3.1.1.3.1.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3.1.1.3">subscript</csymbol><apply id="S2.E3.m1.1.1.1.1.1.1.3.1.1.3.2.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3.1.1.3"><csymbol cd="ambiguous" id="S2.E3.m1.1.1.1.1.1.1.3.1.1.3.2.1.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3.1.1.3">superscript</csymbol><ci id="S2.E3.m1.1.1.1.1.1.1.3.1.1.3.2.2.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3.1.1.3.2.2">𝑙</ci><ci id="S2.E3.m1.1.1.1.1.1.1.3.1.1.3.2.3.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3.1.1.3.2.3">′</ci></apply><ci 
id="S2.E3.m1.1.1.1.1.1.1.3.1.1.3.3.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3.1.1.3.3">𝑣</ci></apply></apply></matrixrow></matrix></list></apply><apply id="S2.E3.m1.3.4.3.2.cmml" xref="S2.E3.m1.3.4.3.2"><apply id="S2.E3.m1.3.4.3.2.1.cmml" xref="S2.E3.m1.3.4.3.2.1"><csymbol cd="ambiguous" id="S2.E3.m1.3.4.3.2.1.1.cmml" xref="S2.E3.m1.3.4.3.2.1">subscript</csymbol><apply id="S2.E3.m1.3.4.3.2.1.2.cmml" xref="S2.E3.m1.3.4.3.2.1"><csymbol cd="ambiguous" id="S2.E3.m1.3.4.3.2.1.2.1.cmml" xref="S2.E3.m1.3.4.3.2.1">superscript</csymbol><csymbol cd="latexml" id="S2.E3.m1.3.4.3.2.1.2.2.cmml" xref="S2.E3.m1.3.4.3.2.1.2.2">product</csymbol><ci id="S2.E3.m1.3.4.3.2.1.2.3.cmml" xref="S2.E3.m1.3.4.3.2.1.2.3">𝑡</ci></apply><apply id="S2.E3.m1.3.4.3.2.1.3.cmml" xref="S2.E3.m1.3.4.3.2.1.3"><eq id="S2.E3.m1.3.4.3.2.1.3.1.cmml" xref="S2.E3.m1.3.4.3.2.1.3.1"></eq><apply id="S2.E3.m1.3.4.3.2.1.3.2.cmml" xref="S2.E3.m1.3.4.3.2.1.3.2"><csymbol cd="ambiguous" id="S2.E3.m1.3.4.3.2.1.3.2.1.cmml" xref="S2.E3.m1.3.4.3.2.1.3.2">superscript</csymbol><ci id="S2.E3.m1.3.4.3.2.1.3.2.2.cmml" xref="S2.E3.m1.3.4.3.2.1.3.2.2">𝑡</ci><ci id="S2.E3.m1.3.4.3.2.1.3.2.3.cmml" xref="S2.E3.m1.3.4.3.2.1.3.2.3">′</ci></apply><cn id="S2.E3.m1.3.4.3.2.1.3.3.cmml" type="integer" xref="S2.E3.m1.3.4.3.2.1.3.3">1</cn></apply></apply><apply id="S2.E3.m1.3.4.3.2.2.cmml" xref="S2.E3.m1.3.4.3.2.2"><csymbol cd="ambiguous" id="S2.E3.m1.3.4.3.2.2.1.cmml" xref="S2.E3.m1.3.4.3.2.2">subscript</csymbol><apply id="S2.E3.m1.3.4.3.2.2.2.cmml" xref="S2.E3.m1.3.4.3.2.2"><csymbol cd="ambiguous" id="S2.E3.m1.3.4.3.2.2.2.1.cmml" xref="S2.E3.m1.3.4.3.2.2">superscript</csymbol><ci id="S2.E3.m1.3.4.3.2.2.2.2.cmml" xref="S2.E3.m1.3.4.3.2.2.2.2">𝑦</ci><apply id="S2.E3.m1.3.4.3.2.2.2.3.cmml" xref="S2.E3.m1.3.4.3.2.2.2.3"><csymbol cd="ambiguous" id="S2.E3.m1.3.4.3.2.2.2.3.1.cmml" xref="S2.E3.m1.3.4.3.2.2.2.3">superscript</csymbol><ci id="S2.E3.m1.3.4.3.2.2.2.3.2.cmml" xref="S2.E3.m1.3.4.3.2.2.2.3.2">𝑡</ci><ci id="S2.E3.m1.3.4.3.2.2.2.3.3.cmml" 
xref="S2.E3.m1.3.4.3.2.2.2.3.3">′</ci></apply></apply><apply id="S2.E3.m1.3.4.3.2.2.3.cmml" xref="S2.E3.m1.3.4.3.2.2.3"><csymbol cd="ambiguous" id="S2.E3.m1.3.4.3.2.2.3.1.cmml" xref="S2.E3.m1.3.4.3.2.2.3">subscript</csymbol><ci id="S2.E3.m1.3.4.3.2.2.3.2.cmml" xref="S2.E3.m1.3.4.3.2.2.3.2">𝜋</ci><apply id="S2.E3.m1.3.4.3.2.2.3.3.cmml" xref="S2.E3.m1.3.4.3.2.2.3.3"><csymbol cd="ambiguous" id="S2.E3.m1.3.4.3.2.2.3.3.1.cmml" xref="S2.E3.m1.3.4.3.2.2.3.3">superscript</csymbol><ci id="S2.E3.m1.3.4.3.2.2.3.3.2.cmml" xref="S2.E3.m1.3.4.3.2.2.3.3.2">𝑡</ci><ci id="S2.E3.m1.3.4.3.2.2.3.3.3.cmml" xref="S2.E3.m1.3.4.3.2.2.3.3.3">′</ci></apply></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E3.m1.3c">\alpha(t,v)=\sum_{\begin{subarray}{c}\pi:B(\pi_{1:t})=B(l^{\prime}_{1:v})\\ \pi_{t}=l^{\prime}_{v}\end{subarray}}\prod^{t}_{t^{\prime}=1}y^{t^{\prime}}_{% \pi_{t^{\prime}}}</annotation><annotation encoding="application/x-llamapun" id="S2.E3.m1.3d">italic_α ( italic_t , italic_v ) = ∑ start_POSTSUBSCRIPT start_ARG start_ROW start_CELL italic_π : italic_B ( italic_π start_POSTSUBSCRIPT 1 : italic_t end_POSTSUBSCRIPT ) = italic_B ( italic_l start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 : italic_v end_POSTSUBSCRIPT ) end_CELL end_ROW start_ROW start_CELL italic_π start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = italic_l start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT end_CELL end_ROW end_ARG end_POSTSUBSCRIPT ∏ start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = 1 end_POSTSUBSCRIPT italic_y start_POSTSUPERSCRIPT italic_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_π start_POSTSUBSCRIPT italic_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_eqn_cell 
ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(3)</span></td> </tr></tbody> </table> <table class="ltx_equation ltx_eqn_table" id="S2.E4"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="\beta(t,v)=\sum_{\begin{subarray}{c}\pi:B(\pi_{t:T})=B(l^{\prime}_{v:2U+1})\\ \pi_{t}=l^{\prime}_{v}\end{subarray}}\prod^{T}_{t^{\prime}=t}y^{t^{\prime}}_{% \pi_{t^{\prime}}}" class="ltx_Math" display="block" id="S2.E4.m1.3"><semantics id="S2.E4.m1.3a"><mrow id="S2.E4.m1.3.4" xref="S2.E4.m1.3.4.cmml"><mrow id="S2.E4.m1.3.4.2" xref="S2.E4.m1.3.4.2.cmml"><mi id="S2.E4.m1.3.4.2.2" xref="S2.E4.m1.3.4.2.2.cmml">β</mi><mo id="S2.E4.m1.3.4.2.1" xref="S2.E4.m1.3.4.2.1.cmml"></mo><mrow id="S2.E4.m1.3.4.2.3.2" xref="S2.E4.m1.3.4.2.3.1.cmml"><mo id="S2.E4.m1.3.4.2.3.2.1" stretchy="false" xref="S2.E4.m1.3.4.2.3.1.cmml">(</mo><mi id="S2.E4.m1.2.2" xref="S2.E4.m1.2.2.cmml">t</mi><mo id="S2.E4.m1.3.4.2.3.2.2" xref="S2.E4.m1.3.4.2.3.1.cmml">,</mo><mi id="S2.E4.m1.3.3" xref="S2.E4.m1.3.3.cmml">v</mi><mo id="S2.E4.m1.3.4.2.3.2.3" stretchy="false" xref="S2.E4.m1.3.4.2.3.1.cmml">)</mo></mrow></mrow><mo id="S2.E4.m1.3.4.1" rspace="0.111em" xref="S2.E4.m1.3.4.1.cmml">=</mo><mrow id="S2.E4.m1.3.4.3" xref="S2.E4.m1.3.4.3.cmml"><munder id="S2.E4.m1.3.4.3.1" xref="S2.E4.m1.3.4.3.1.cmml"><mo id="S2.E4.m1.3.4.3.1.2" movablelimits="false" rspace="0em" xref="S2.E4.m1.3.4.3.1.2.cmml">∑</mo><mtable id="S2.E4.m1.1.1.1.1.1.1" rowspacing="0pt" xref="S2.E4.m1.1.1.1.2.cmml"><mtr id="S2.E4.m1.1.1.1.1.1.1a" xref="S2.E4.m1.1.1.1.2.cmml"><mtd id="S2.E4.m1.1.1.1.1.1.1b" xref="S2.E4.m1.1.1.1.2.cmml"><mrow id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.cmml"><mi id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.4" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.4.cmml">π</mi><mo 
id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.3" lspace="0.278em" rspace="0.278em" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.3.cmml">:</mo><mrow id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.cmml"><mrow id="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1" xref="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.cmml"><mi id="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.3" xref="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.3.cmml">B</mi><mo id="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.2" xref="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml"></mo><mrow id="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1" xref="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml"><mo id="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2" stretchy="false" xref="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml">(</mo><msub id="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1" xref="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml"><mi id="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2" xref="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml">π</mi><mrow id="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3" xref="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.cmml"><mi id="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.2" xref="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.2.cmml">t</mi><mo id="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.1" lspace="0.278em" rspace="0.278em" xref="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.1.cmml">:</mo><mi id="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.3" xref="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml">T</mi></mrow></msub><mo id="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3" stretchy="false" xref="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml">)</mo></mrow></mrow><mo id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.3" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.3.cmml">=</mo><mrow id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.cmml"><mi id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.3" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.3.cmml">B</mi><mo id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.2" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.2.cmml"></mo><mrow id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1" 
xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.cmml"><mo id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.2" stretchy="false" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.cmml">(</mo><msubsup id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.cmml"><mi id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.2.2" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.2.2.cmml">l</mi><mrow id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.cmml"><mi id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.2" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.2.cmml">v</mi><mo id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.1" lspace="0.278em" rspace="0.278em" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.1.cmml">:</mo><mrow id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.cmml"><mrow id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.2" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.2.cmml"><mn id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.2.2" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.2.2.cmml">2</mn><mo id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.2.1" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.2.1.cmml"></mo><mi id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.2.3" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.2.3.cmml">U</mi></mrow><mo id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.1" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.1.cmml">+</mo><mn id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.3" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.3.cmml">1</mn></mrow></mrow><mo id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.2.3" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.2.3.cmml">′</mo></msubsup><mo id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.3" stretchy="false" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.cmml">)</mo></mrow></mrow></mrow></mrow></mtd></mtr><mtr id="S2.E4.m1.1.1.1.1.1.1c" xref="S2.E4.m1.1.1.1.2.cmml"><mtd id="S2.E4.m1.1.1.1.1.1.1d" 
xref="S2.E4.m1.1.1.1.2.cmml"><mrow id="S2.E4.m1.1.1.1.1.1.1.3.1.1" xref="S2.E4.m1.1.1.1.1.1.1.3.1.1.cmml"><msub id="S2.E4.m1.1.1.1.1.1.1.3.1.1.2" xref="S2.E4.m1.1.1.1.1.1.1.3.1.1.2.cmml"><mi id="S2.E4.m1.1.1.1.1.1.1.3.1.1.2.2" xref="S2.E4.m1.1.1.1.1.1.1.3.1.1.2.2.cmml">π</mi><mi id="S2.E4.m1.1.1.1.1.1.1.3.1.1.2.3" xref="S2.E4.m1.1.1.1.1.1.1.3.1.1.2.3.cmml">t</mi></msub><mo id="S2.E4.m1.1.1.1.1.1.1.3.1.1.1" xref="S2.E4.m1.1.1.1.1.1.1.3.1.1.1.cmml">=</mo><msubsup id="S2.E4.m1.1.1.1.1.1.1.3.1.1.3" xref="S2.E4.m1.1.1.1.1.1.1.3.1.1.3.cmml"><mi id="S2.E4.m1.1.1.1.1.1.1.3.1.1.3.2.2" xref="S2.E4.m1.1.1.1.1.1.1.3.1.1.3.2.2.cmml">l</mi><mi id="S2.E4.m1.1.1.1.1.1.1.3.1.1.3.3" xref="S2.E4.m1.1.1.1.1.1.1.3.1.1.3.3.cmml">v</mi><mo id="S2.E4.m1.1.1.1.1.1.1.3.1.1.3.2.3" xref="S2.E4.m1.1.1.1.1.1.1.3.1.1.3.2.3.cmml">′</mo></msubsup></mrow></mtd></mtr></mtable></munder><mrow id="S2.E4.m1.3.4.3.2" xref="S2.E4.m1.3.4.3.2.cmml"><munderover id="S2.E4.m1.3.4.3.2.1" xref="S2.E4.m1.3.4.3.2.1.cmml"><mo id="S2.E4.m1.3.4.3.2.1.2.2" movablelimits="false" xref="S2.E4.m1.3.4.3.2.1.2.2.cmml">∏</mo><mrow id="S2.E4.m1.3.4.3.2.1.3" xref="S2.E4.m1.3.4.3.2.1.3.cmml"><msup id="S2.E4.m1.3.4.3.2.1.3.2" xref="S2.E4.m1.3.4.3.2.1.3.2.cmml"><mi id="S2.E4.m1.3.4.3.2.1.3.2.2" xref="S2.E4.m1.3.4.3.2.1.3.2.2.cmml">t</mi><mo id="S2.E4.m1.3.4.3.2.1.3.2.3" xref="S2.E4.m1.3.4.3.2.1.3.2.3.cmml">′</mo></msup><mo id="S2.E4.m1.3.4.3.2.1.3.1" xref="S2.E4.m1.3.4.3.2.1.3.1.cmml">=</mo><mi id="S2.E4.m1.3.4.3.2.1.3.3" xref="S2.E4.m1.3.4.3.2.1.3.3.cmml">t</mi></mrow><mi id="S2.E4.m1.3.4.3.2.1.2.3" xref="S2.E4.m1.3.4.3.2.1.2.3.cmml">T</mi></munderover><msubsup id="S2.E4.m1.3.4.3.2.2" xref="S2.E4.m1.3.4.3.2.2.cmml"><mi id="S2.E4.m1.3.4.3.2.2.2.2" xref="S2.E4.m1.3.4.3.2.2.2.2.cmml">y</mi><msub id="S2.E4.m1.3.4.3.2.2.3" xref="S2.E4.m1.3.4.3.2.2.3.cmml"><mi id="S2.E4.m1.3.4.3.2.2.3.2" xref="S2.E4.m1.3.4.3.2.2.3.2.cmml">π</mi><msup id="S2.E4.m1.3.4.3.2.2.3.3" xref="S2.E4.m1.3.4.3.2.2.3.3.cmml"><mi id="S2.E4.m1.3.4.3.2.2.3.3.2" 
xref="S2.E4.m1.3.4.3.2.2.3.3.2.cmml">t</mi><mo id="S2.E4.m1.3.4.3.2.2.3.3.3" xref="S2.E4.m1.3.4.3.2.2.3.3.3.cmml">′</mo></msup></msub><msup id="S2.E4.m1.3.4.3.2.2.2.3" xref="S2.E4.m1.3.4.3.2.2.2.3.cmml"><mi id="S2.E4.m1.3.4.3.2.2.2.3.2" xref="S2.E4.m1.3.4.3.2.2.2.3.2.cmml">t</mi><mo id="S2.E4.m1.3.4.3.2.2.2.3.3" xref="S2.E4.m1.3.4.3.2.2.2.3.3.cmml">′</mo></msup></msubsup></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.E4.m1.3b"><apply id="S2.E4.m1.3.4.cmml" xref="S2.E4.m1.3.4"><eq id="S2.E4.m1.3.4.1.cmml" xref="S2.E4.m1.3.4.1"></eq><apply id="S2.E4.m1.3.4.2.cmml" xref="S2.E4.m1.3.4.2"><times id="S2.E4.m1.3.4.2.1.cmml" xref="S2.E4.m1.3.4.2.1"></times><ci id="S2.E4.m1.3.4.2.2.cmml" xref="S2.E4.m1.3.4.2.2">𝛽</ci><interval closure="open" id="S2.E4.m1.3.4.2.3.1.cmml" xref="S2.E4.m1.3.4.2.3.2"><ci id="S2.E4.m1.2.2.cmml" xref="S2.E4.m1.2.2">𝑡</ci><ci id="S2.E4.m1.3.3.cmml" xref="S2.E4.m1.3.3">𝑣</ci></interval></apply><apply id="S2.E4.m1.3.4.3.cmml" xref="S2.E4.m1.3.4.3"><apply id="S2.E4.m1.3.4.3.1.cmml" xref="S2.E4.m1.3.4.3.1"><csymbol cd="ambiguous" id="S2.E4.m1.3.4.3.1.1.cmml" xref="S2.E4.m1.3.4.3.1">subscript</csymbol><sum id="S2.E4.m1.3.4.3.1.2.cmml" xref="S2.E4.m1.3.4.3.1.2"></sum><list id="S2.E4.m1.1.1.1.2.cmml" xref="S2.E4.m1.1.1.1.1.1.1"><matrix id="S2.E4.m1.1.1.1.1.1.1.cmml" xref="S2.E4.m1.1.1.1.1.1.1"><matrixrow id="S2.E4.m1.1.1.1.1.1.1a.cmml" xref="S2.E4.m1.1.1.1.1.1.1"><apply id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2"><ci id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.3.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.3">:</ci><ci id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.4.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.4">𝜋</ci><apply id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2"><eq id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.3.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.3"></eq><apply id="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1"><times id="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml" 
xref="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.2"></times><ci id="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.3">𝐵</ci><apply id="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1">subscript</csymbol><ci id="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2">𝜋</ci><apply id="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3"><ci id="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.1.cmml" xref="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.1">:</ci><ci id="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.2.cmml" xref="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.2">𝑡</ci><ci id="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml" xref="S2.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.3">𝑇</ci></apply></apply></apply><apply id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2"><times id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.2.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.2"></times><ci id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.3.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.3">𝐵</ci><apply id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1"><csymbol cd="ambiguous" id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.1.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1">subscript</csymbol><apply id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.2.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1"><csymbol cd="ambiguous" id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.2.1.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1">superscript</csymbol><ci id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.2.2.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.2.2">𝑙</ci><ci id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.2.3.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.2.3">′</ci></apply><apply 
id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3"><ci id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.1.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.1">:</ci><ci id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.2.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.2">𝑣</ci><apply id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3"><plus id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.1.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.1"></plus><apply id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.2.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.2"><times id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.2.1.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.2.1"></times><cn id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.2.2.cmml" type="integer" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.2.2">2</cn><ci id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.2.3.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.2.3">𝑈</ci></apply><cn id="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.3.cmml" type="integer" xref="S2.E4.m1.1.1.1.1.1.1.2.2.2.2.2.2.1.1.1.3.3.3">1</cn></apply></apply></apply></apply></apply></apply></matrixrow><matrixrow id="S2.E4.m1.1.1.1.1.1.1b.cmml" xref="S2.E4.m1.1.1.1.1.1.1"><apply id="S2.E4.m1.1.1.1.1.1.1.3.1.1.cmml" xref="S2.E4.m1.1.1.1.1.1.1.3.1.1"><eq id="S2.E4.m1.1.1.1.1.1.1.3.1.1.1.cmml" xref="S2.E4.m1.1.1.1.1.1.1.3.1.1.1"></eq><apply id="S2.E4.m1.1.1.1.1.1.1.3.1.1.2.cmml" xref="S2.E4.m1.1.1.1.1.1.1.3.1.1.2"><csymbol cd="ambiguous" id="S2.E4.m1.1.1.1.1.1.1.3.1.1.2.1.cmml" xref="S2.E4.m1.1.1.1.1.1.1.3.1.1.2">subscript</csymbol><ci id="S2.E4.m1.1.1.1.1.1.1.3.1.1.2.2.cmml" xref="S2.E4.m1.1.1.1.1.1.1.3.1.1.2.2">𝜋</ci><ci id="S2.E4.m1.1.1.1.1.1.1.3.1.1.2.3.cmml" xref="S2.E4.m1.1.1.1.1.1.1.3.1.1.2.3">𝑡</ci></apply><apply id="S2.E4.m1.1.1.1.1.1.1.3.1.1.3.cmml" xref="S2.E4.m1.1.1.1.1.1.1.3.1.1.3"><csymbol cd="ambiguous" 
id="S2.E4.m1.1.1.1.1.1.1.3.1.1.3.1.cmml" xref="S2.E4.m1.1.1.1.1.1.1.3.1.1.3">subscript</csymbol><apply id="S2.E4.m1.1.1.1.1.1.1.3.1.1.3.2.cmml" xref="S2.E4.m1.1.1.1.1.1.1.3.1.1.3"><csymbol cd="ambiguous" id="S2.E4.m1.1.1.1.1.1.1.3.1.1.3.2.1.cmml" xref="S2.E4.m1.1.1.1.1.1.1.3.1.1.3">superscript</csymbol><ci id="S2.E4.m1.1.1.1.1.1.1.3.1.1.3.2.2.cmml" xref="S2.E4.m1.1.1.1.1.1.1.3.1.1.3.2.2">𝑙</ci><ci id="S2.E4.m1.1.1.1.1.1.1.3.1.1.3.2.3.cmml" xref="S2.E4.m1.1.1.1.1.1.1.3.1.1.3.2.3">′</ci></apply><ci id="S2.E4.m1.1.1.1.1.1.1.3.1.1.3.3.cmml" xref="S2.E4.m1.1.1.1.1.1.1.3.1.1.3.3">𝑣</ci></apply></apply></matrixrow></matrix></list></apply><apply id="S2.E4.m1.3.4.3.2.cmml" xref="S2.E4.m1.3.4.3.2"><apply id="S2.E4.m1.3.4.3.2.1.cmml" xref="S2.E4.m1.3.4.3.2.1"><csymbol cd="ambiguous" id="S2.E4.m1.3.4.3.2.1.1.cmml" xref="S2.E4.m1.3.4.3.2.1">subscript</csymbol><apply id="S2.E4.m1.3.4.3.2.1.2.cmml" xref="S2.E4.m1.3.4.3.2.1"><csymbol cd="ambiguous" id="S2.E4.m1.3.4.3.2.1.2.1.cmml" xref="S2.E4.m1.3.4.3.2.1">superscript</csymbol><csymbol cd="latexml" id="S2.E4.m1.3.4.3.2.1.2.2.cmml" xref="S2.E4.m1.3.4.3.2.1.2.2">product</csymbol><ci id="S2.E4.m1.3.4.3.2.1.2.3.cmml" xref="S2.E4.m1.3.4.3.2.1.2.3">𝑇</ci></apply><apply id="S2.E4.m1.3.4.3.2.1.3.cmml" xref="S2.E4.m1.3.4.3.2.1.3"><eq id="S2.E4.m1.3.4.3.2.1.3.1.cmml" xref="S2.E4.m1.3.4.3.2.1.3.1"></eq><apply id="S2.E4.m1.3.4.3.2.1.3.2.cmml" xref="S2.E4.m1.3.4.3.2.1.3.2"><csymbol cd="ambiguous" id="S2.E4.m1.3.4.3.2.1.3.2.1.cmml" xref="S2.E4.m1.3.4.3.2.1.3.2">superscript</csymbol><ci id="S2.E4.m1.3.4.3.2.1.3.2.2.cmml" xref="S2.E4.m1.3.4.3.2.1.3.2.2">𝑡</ci><ci id="S2.E4.m1.3.4.3.2.1.3.2.3.cmml" xref="S2.E4.m1.3.4.3.2.1.3.2.3">′</ci></apply><ci id="S2.E4.m1.3.4.3.2.1.3.3.cmml" xref="S2.E4.m1.3.4.3.2.1.3.3">𝑡</ci></apply></apply><apply id="S2.E4.m1.3.4.3.2.2.cmml" xref="S2.E4.m1.3.4.3.2.2"><csymbol cd="ambiguous" id="S2.E4.m1.3.4.3.2.2.1.cmml" xref="S2.E4.m1.3.4.3.2.2">subscript</csymbol><apply id="S2.E4.m1.3.4.3.2.2.2.cmml" 
xref="S2.E4.m1.3.4.3.2.2"><csymbol cd="ambiguous" id="S2.E4.m1.3.4.3.2.2.2.1.cmml" xref="S2.E4.m1.3.4.3.2.2">superscript</csymbol><ci id="S2.E4.m1.3.4.3.2.2.2.2.cmml" xref="S2.E4.m1.3.4.3.2.2.2.2">𝑦</ci><apply id="S2.E4.m1.3.4.3.2.2.2.3.cmml" xref="S2.E4.m1.3.4.3.2.2.2.3"><csymbol cd="ambiguous" id="S2.E4.m1.3.4.3.2.2.2.3.1.cmml" xref="S2.E4.m1.3.4.3.2.2.2.3">superscript</csymbol><ci id="S2.E4.m1.3.4.3.2.2.2.3.2.cmml" xref="S2.E4.m1.3.4.3.2.2.2.3.2">𝑡</ci><ci id="S2.E4.m1.3.4.3.2.2.2.3.3.cmml" xref="S2.E4.m1.3.4.3.2.2.2.3.3">′</ci></apply></apply><apply id="S2.E4.m1.3.4.3.2.2.3.cmml" xref="S2.E4.m1.3.4.3.2.2.3"><csymbol cd="ambiguous" id="S2.E4.m1.3.4.3.2.2.3.1.cmml" xref="S2.E4.m1.3.4.3.2.2.3">subscript</csymbol><ci id="S2.E4.m1.3.4.3.2.2.3.2.cmml" xref="S2.E4.m1.3.4.3.2.2.3.2">𝜋</ci><apply id="S2.E4.m1.3.4.3.2.2.3.3.cmml" xref="S2.E4.m1.3.4.3.2.2.3.3"><csymbol cd="ambiguous" id="S2.E4.m1.3.4.3.2.2.3.3.1.cmml" xref="S2.E4.m1.3.4.3.2.2.3.3">superscript</csymbol><ci id="S2.E4.m1.3.4.3.2.2.3.3.2.cmml" xref="S2.E4.m1.3.4.3.2.2.3.3.2">𝑡</ci><ci id="S2.E4.m1.3.4.3.2.2.3.3.3.cmml" xref="S2.E4.m1.3.4.3.2.2.3.3.3">′</ci></apply></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E4.m1.3c">\beta(t,v)=\sum_{\begin{subarray}{c}\pi:B(\pi_{t:T})=B(l^{\prime}_{v:2U+1})\\ \pi_{t}=l^{\prime}_{v}\end{subarray}}\prod^{T}_{t^{\prime}=t}y^{t^{\prime}}_{% \pi_{t^{\prime}}}</annotation><annotation encoding="application/x-llamapun" id="S2.E4.m1.3d">italic_β ( italic_t , italic_v ) = ∑ start_POSTSUBSCRIPT start_ARG start_ROW start_CELL italic_π : italic_B ( italic_π start_POSTSUBSCRIPT italic_t : italic_T end_POSTSUBSCRIPT ) = italic_B ( italic_l start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_v : 2 italic_U + 1 end_POSTSUBSCRIPT ) end_CELL end_ROW start_ROW start_CELL italic_π start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = italic_l start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_v 
end_POSTSUBSCRIPT end_CELL end_ROW end_ARG end_POSTSUBSCRIPT ∏ start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = italic_t end_POSTSUBSCRIPT italic_y start_POSTSUPERSCRIPT italic_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_π start_POSTSUBSCRIPT italic_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(4)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S2.SS1.p2.15">where <math alttext="(t,v)" class="ltx_Math" display="inline" id="S2.SS1.p2.7.m1.2"><semantics id="S2.SS1.p2.7.m1.2a"><mrow id="S2.SS1.p2.7.m1.2.3.2" xref="S2.SS1.p2.7.m1.2.3.1.cmml"><mo id="S2.SS1.p2.7.m1.2.3.2.1" stretchy="false" xref="S2.SS1.p2.7.m1.2.3.1.cmml">(</mo><mi id="S2.SS1.p2.7.m1.1.1" xref="S2.SS1.p2.7.m1.1.1.cmml">t</mi><mo id="S2.SS1.p2.7.m1.2.3.2.2" xref="S2.SS1.p2.7.m1.2.3.1.cmml">,</mo><mi id="S2.SS1.p2.7.m1.2.2" xref="S2.SS1.p2.7.m1.2.2.cmml">v</mi><mo id="S2.SS1.p2.7.m1.2.3.2.3" stretchy="false" xref="S2.SS1.p2.7.m1.2.3.1.cmml">)</mo></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p2.7.m1.2b"><interval closure="open" id="S2.SS1.p2.7.m1.2.3.1.cmml" xref="S2.SS1.p2.7.m1.2.3.2"><ci id="S2.SS1.p2.7.m1.1.1.cmml" xref="S2.SS1.p2.7.m1.1.1">𝑡</ci><ci id="S2.SS1.p2.7.m1.2.2.cmml" xref="S2.SS1.p2.7.m1.2.2">𝑣</ci></interval></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p2.7.m1.2c">(t,v)</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p2.7.m1.2d">( italic_t , italic_v )</annotation></semantics></math> is a node in the CTC lattice and <math alttext="1\,{\leq}\,t\,{\leq}\,T" class="ltx_Math" display="inline" id="S2.SS1.p2.8.m2.1"><semantics 
id="S2.SS1.p2.8.m2.1a"><mrow id="S2.SS1.p2.8.m2.1.1" xref="S2.SS1.p2.8.m2.1.1.cmml"><mn id="S2.SS1.p2.8.m2.1.1.2" xref="S2.SS1.p2.8.m2.1.1.2.cmml">1</mn><mo id="S2.SS1.p2.8.m2.1.1.3" lspace="0.448em" rspace="0.448em" xref="S2.SS1.p2.8.m2.1.1.3.cmml">≤</mo><mi id="S2.SS1.p2.8.m2.1.1.4" xref="S2.SS1.p2.8.m2.1.1.4.cmml">t</mi><mo id="S2.SS1.p2.8.m2.1.1.5" lspace="0.448em" rspace="0.448em" xref="S2.SS1.p2.8.m2.1.1.5.cmml">≤</mo><mi id="S2.SS1.p2.8.m2.1.1.6" xref="S2.SS1.p2.8.m2.1.1.6.cmml">T</mi></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p2.8.m2.1b"><apply id="S2.SS1.p2.8.m2.1.1.cmml" xref="S2.SS1.p2.8.m2.1.1"><and id="S2.SS1.p2.8.m2.1.1a.cmml" xref="S2.SS1.p2.8.m2.1.1"></and><apply id="S2.SS1.p2.8.m2.1.1b.cmml" xref="S2.SS1.p2.8.m2.1.1"><leq id="S2.SS1.p2.8.m2.1.1.3.cmml" xref="S2.SS1.p2.8.m2.1.1.3"></leq><cn id="S2.SS1.p2.8.m2.1.1.2.cmml" type="integer" xref="S2.SS1.p2.8.m2.1.1.2">1</cn><ci id="S2.SS1.p2.8.m2.1.1.4.cmml" xref="S2.SS1.p2.8.m2.1.1.4">𝑡</ci></apply><apply id="S2.SS1.p2.8.m2.1.1c.cmml" xref="S2.SS1.p2.8.m2.1.1"><leq id="S2.SS1.p2.8.m2.1.1.5.cmml" xref="S2.SS1.p2.8.m2.1.1.5"></leq><share href="https://arxiv.org/html/2409.12388v2#S2.SS1.p2.8.m2.1.1.4.cmml" id="S2.SS1.p2.8.m2.1.1d.cmml" xref="S2.SS1.p2.8.m2.1.1"></share><ci id="S2.SS1.p2.8.m2.1.1.6.cmml" xref="S2.SS1.p2.8.m2.1.1.6">𝑇</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p2.8.m2.1c">1\,{\leq}\,t\,{\leq}\,T</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p2.8.m2.1d">1 ≤ italic_t ≤ italic_T</annotation></semantics></math>, <math alttext="1\,{\leq}\,v\,{\leq}\,2U+1" class="ltx_Math" display="inline" id="S2.SS1.p2.9.m3.1"><semantics id="S2.SS1.p2.9.m3.1a"><mrow id="S2.SS1.p2.9.m3.1.1" xref="S2.SS1.p2.9.m3.1.1.cmml"><mn id="S2.SS1.p2.9.m3.1.1.2" xref="S2.SS1.p2.9.m3.1.1.2.cmml">1</mn><mo id="S2.SS1.p2.9.m3.1.1.3" lspace="0.448em" rspace="0.448em" xref="S2.SS1.p2.9.m3.1.1.3.cmml">≤</mo><mi id="S2.SS1.p2.9.m3.1.1.4" 
xref="S2.SS1.p2.9.m3.1.1.4.cmml">v</mi><mo id="S2.SS1.p2.9.m3.1.1.5" lspace="0.448em" xref="S2.SS1.p2.9.m3.1.1.5.cmml">≤</mo><mrow id="S2.SS1.p2.9.m3.1.1.6" xref="S2.SS1.p2.9.m3.1.1.6.cmml"><mrow id="S2.SS1.p2.9.m3.1.1.6.2" xref="S2.SS1.p2.9.m3.1.1.6.2.cmml"><mn id="S2.SS1.p2.9.m3.1.1.6.2.2" xref="S2.SS1.p2.9.m3.1.1.6.2.2.cmml"> 2</mn><mo id="S2.SS1.p2.9.m3.1.1.6.2.1" xref="S2.SS1.p2.9.m3.1.1.6.2.1.cmml"></mo><mi id="S2.SS1.p2.9.m3.1.1.6.2.3" xref="S2.SS1.p2.9.m3.1.1.6.2.3.cmml">U</mi></mrow><mo id="S2.SS1.p2.9.m3.1.1.6.1" xref="S2.SS1.p2.9.m3.1.1.6.1.cmml">+</mo><mn id="S2.SS1.p2.9.m3.1.1.6.3" xref="S2.SS1.p2.9.m3.1.1.6.3.cmml">1</mn></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p2.9.m3.1b"><apply id="S2.SS1.p2.9.m3.1.1.cmml" xref="S2.SS1.p2.9.m3.1.1"><and id="S2.SS1.p2.9.m3.1.1a.cmml" xref="S2.SS1.p2.9.m3.1.1"></and><apply id="S2.SS1.p2.9.m3.1.1b.cmml" xref="S2.SS1.p2.9.m3.1.1"><leq id="S2.SS1.p2.9.m3.1.1.3.cmml" xref="S2.SS1.p2.9.m3.1.1.3"></leq><cn id="S2.SS1.p2.9.m3.1.1.2.cmml" type="integer" xref="S2.SS1.p2.9.m3.1.1.2">1</cn><ci id="S2.SS1.p2.9.m3.1.1.4.cmml" xref="S2.SS1.p2.9.m3.1.1.4">𝑣</ci></apply><apply id="S2.SS1.p2.9.m3.1.1c.cmml" xref="S2.SS1.p2.9.m3.1.1"><leq id="S2.SS1.p2.9.m3.1.1.5.cmml" xref="S2.SS1.p2.9.m3.1.1.5"></leq><share href="https://arxiv.org/html/2409.12388v2#S2.SS1.p2.9.m3.1.1.4.cmml" id="S2.SS1.p2.9.m3.1.1d.cmml" xref="S2.SS1.p2.9.m3.1.1"></share><apply id="S2.SS1.p2.9.m3.1.1.6.cmml" xref="S2.SS1.p2.9.m3.1.1.6"><plus id="S2.SS1.p2.9.m3.1.1.6.1.cmml" xref="S2.SS1.p2.9.m3.1.1.6.1"></plus><apply id="S2.SS1.p2.9.m3.1.1.6.2.cmml" xref="S2.SS1.p2.9.m3.1.1.6.2"><times id="S2.SS1.p2.9.m3.1.1.6.2.1.cmml" xref="S2.SS1.p2.9.m3.1.1.6.2.1"></times><cn id="S2.SS1.p2.9.m3.1.1.6.2.2.cmml" type="integer" xref="S2.SS1.p2.9.m3.1.1.6.2.2">2</cn><ci id="S2.SS1.p2.9.m3.1.1.6.2.3.cmml" xref="S2.SS1.p2.9.m3.1.1.6.2.3">𝑈</ci></apply><cn id="S2.SS1.p2.9.m3.1.1.6.3.cmml" type="integer" 
xref="S2.SS1.p2.9.m3.1.1.6.3">1</cn></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p2.9.m3.1c">1\,{\leq}\,v\,{\leq}\,2U+1</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p2.9.m3.1d">1 ≤ italic_v ≤ 2 italic_U + 1</annotation></semantics></math>. These two variables summarize the posterior of all paths passing through node <math alttext="(t,v)" class="ltx_Math" display="inline" id="S2.SS1.p2.10.m4.2"><semantics id="S2.SS1.p2.10.m4.2a"><mrow id="S2.SS1.p2.10.m4.2.3.2" xref="S2.SS1.p2.10.m4.2.3.1.cmml"><mo id="S2.SS1.p2.10.m4.2.3.2.1" stretchy="false" xref="S2.SS1.p2.10.m4.2.3.1.cmml">(</mo><mi id="S2.SS1.p2.10.m4.1.1" xref="S2.SS1.p2.10.m4.1.1.cmml">t</mi><mo id="S2.SS1.p2.10.m4.2.3.2.2" xref="S2.SS1.p2.10.m4.2.3.1.cmml">,</mo><mi id="S2.SS1.p2.10.m4.2.2" xref="S2.SS1.p2.10.m4.2.2.cmml">v</mi><mo id="S2.SS1.p2.10.m4.2.3.2.3" stretchy="false" xref="S2.SS1.p2.10.m4.2.3.1.cmml">)</mo></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p2.10.m4.2b"><interval closure="open" id="S2.SS1.p2.10.m4.2.3.1.cmml" xref="S2.SS1.p2.10.m4.2.3.2"><ci id="S2.SS1.p2.10.m4.1.1.cmml" xref="S2.SS1.p2.10.m4.1.1">𝑡</ci><ci id="S2.SS1.p2.10.m4.2.2.cmml" xref="S2.SS1.p2.10.m4.2.2">𝑣</ci></interval></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p2.10.m4.2c">(t,v)</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p2.10.m4.2d">( italic_t , italic_v )</annotation></semantics></math>, with an exact <math alttext="1:v" class="ltx_Math" display="inline" id="S2.SS1.p2.11.m5.1"><semantics id="S2.SS1.p2.11.m5.1a"><mrow id="S2.SS1.p2.11.m5.1.1" xref="S2.SS1.p2.11.m5.1.1.cmml"><mn id="S2.SS1.p2.11.m5.1.1.2" xref="S2.SS1.p2.11.m5.1.1.2.cmml">1</mn><mo id="S2.SS1.p2.11.m5.1.1.1" lspace="0.278em" rspace="0.278em" xref="S2.SS1.p2.11.m5.1.1.1.cmml">:</mo><mi id="S2.SS1.p2.11.m5.1.1.3" xref="S2.SS1.p2.11.m5.1.1.3.cmml">v</mi></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p2.11.m5.1b"><apply 
id="S2.SS1.p2.11.m5.1.1.cmml" xref="S2.SS1.p2.11.m5.1.1"><ci id="S2.SS1.p2.11.m5.1.1.1.cmml" xref="S2.SS1.p2.11.m5.1.1.1">:</ci><cn id="S2.SS1.p2.11.m5.1.1.2.cmml" type="integer" xref="S2.SS1.p2.11.m5.1.1.2">1</cn><ci id="S2.SS1.p2.11.m5.1.1.3.cmml" xref="S2.SS1.p2.11.m5.1.1.3">𝑣</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p2.11.m5.1c">1:v</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p2.11.m5.1d">1 : italic_v</annotation></semantics></math> prefix and <math alttext="v:2U+1" class="ltx_Math" display="inline" id="S2.SS1.p2.12.m6.1"><semantics id="S2.SS1.p2.12.m6.1a"><mrow id="S2.SS1.p2.12.m6.1.1" xref="S2.SS1.p2.12.m6.1.1.cmml"><mi id="S2.SS1.p2.12.m6.1.1.2" xref="S2.SS1.p2.12.m6.1.1.2.cmml">v</mi><mo id="S2.SS1.p2.12.m6.1.1.1" lspace="0.278em" rspace="0.278em" xref="S2.SS1.p2.12.m6.1.1.1.cmml">:</mo><mrow id="S2.SS1.p2.12.m6.1.1.3" xref="S2.SS1.p2.12.m6.1.1.3.cmml"><mrow id="S2.SS1.p2.12.m6.1.1.3.2" xref="S2.SS1.p2.12.m6.1.1.3.2.cmml"><mn id="S2.SS1.p2.12.m6.1.1.3.2.2" xref="S2.SS1.p2.12.m6.1.1.3.2.2.cmml">2</mn><mo id="S2.SS1.p2.12.m6.1.1.3.2.1" xref="S2.SS1.p2.12.m6.1.1.3.2.1.cmml"></mo><mi id="S2.SS1.p2.12.m6.1.1.3.2.3" xref="S2.SS1.p2.12.m6.1.1.3.2.3.cmml">U</mi></mrow><mo id="S2.SS1.p2.12.m6.1.1.3.1" xref="S2.SS1.p2.12.m6.1.1.3.1.cmml">+</mo><mn id="S2.SS1.p2.12.m6.1.1.3.3" xref="S2.SS1.p2.12.m6.1.1.3.3.cmml">1</mn></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p2.12.m6.1b"><apply id="S2.SS1.p2.12.m6.1.1.cmml" xref="S2.SS1.p2.12.m6.1.1"><ci id="S2.SS1.p2.12.m6.1.1.1.cmml" xref="S2.SS1.p2.12.m6.1.1.1">:</ci><ci id="S2.SS1.p2.12.m6.1.1.2.cmml" xref="S2.SS1.p2.12.m6.1.1.2">𝑣</ci><apply id="S2.SS1.p2.12.m6.1.1.3.cmml" xref="S2.SS1.p2.12.m6.1.1.3"><plus id="S2.SS1.p2.12.m6.1.1.3.1.cmml" xref="S2.SS1.p2.12.m6.1.1.3.1"></plus><apply id="S2.SS1.p2.12.m6.1.1.3.2.cmml" xref="S2.SS1.p2.12.m6.1.1.3.2"><times id="S2.SS1.p2.12.m6.1.1.3.2.1.cmml" xref="S2.SS1.p2.12.m6.1.1.3.2.1"></times><cn 
id="S2.SS1.p2.12.m6.1.1.3.2.2.cmml" type="integer" xref="S2.SS1.p2.12.m6.1.1.3.2.2">2</cn><ci id="S2.SS1.p2.12.m6.1.1.3.2.3.cmml" xref="S2.SS1.p2.12.m6.1.1.3.2.3">𝑈</ci></apply><cn id="S2.SS1.p2.12.m6.1.1.3.3.cmml" type="integer" xref="S2.SS1.p2.12.m6.1.1.3.3">1</cn></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p2.12.m6.1c">v:2U+1</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p2.12.m6.1d">italic_v : 2 italic_U + 1</annotation></semantics></math> suffix alignment. Subsequently, for any fixed time step <math alttext="t" class="ltx_Math" display="inline" id="S2.SS1.p2.13.m7.1"><semantics id="S2.SS1.p2.13.m7.1a"><mi id="S2.SS1.p2.13.m7.1.1" xref="S2.SS1.p2.13.m7.1.1.cmml">t</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.p2.13.m7.1b"><ci id="S2.SS1.p2.13.m7.1.1.cmml" xref="S2.SS1.p2.13.m7.1.1">𝑡</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p2.13.m7.1c">t</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p2.13.m7.1d">italic_t</annotation></semantics></math>, enumerating all possible tokens <math alttext="v" class="ltx_Math" display="inline" id="S2.SS1.p2.14.m8.1"><semantics id="S2.SS1.p2.14.m8.1a"><mi id="S2.SS1.p2.14.m8.1.1" xref="S2.SS1.p2.14.m8.1.1.cmml">v</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.p2.14.m8.1b"><ci id="S2.SS1.p2.14.m8.1.1.cmml" xref="S2.SS1.p2.14.m8.1.1">𝑣</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p2.14.m8.1c">v</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p2.14.m8.1d">italic_v</annotation></semantics></math> in <math alttext="l^{\prime}" class="ltx_Math" display="inline" id="S2.SS1.p2.15.m9.1"><semantics id="S2.SS1.p2.15.m9.1a"><msup id="S2.SS1.p2.15.m9.1.1" xref="S2.SS1.p2.15.m9.1.1.cmml"><mi id="S2.SS1.p2.15.m9.1.1.2" xref="S2.SS1.p2.15.m9.1.1.2.cmml">l</mi><mo id="S2.SS1.p2.15.m9.1.1.3" xref="S2.SS1.p2.15.m9.1.1.3.cmml">′</mo></msup><annotation-xml 
encoding="MathML-Content" id="S2.SS1.p2.15.m9.1b"><apply id="S2.SS1.p2.15.m9.1.1.cmml" xref="S2.SS1.p2.15.m9.1.1"><csymbol cd="ambiguous" id="S2.SS1.p2.15.m9.1.1.1.cmml" xref="S2.SS1.p2.15.m9.1.1">superscript</csymbol><ci id="S2.SS1.p2.15.m9.1.1.2.cmml" xref="S2.SS1.p2.15.m9.1.1.2">𝑙</ci><ci id="S2.SS1.p2.15.m9.1.1.3.cmml" xref="S2.SS1.p2.15.m9.1.1.3">′</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p2.15.m9.1c">l^{\prime}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p2.15.m9.1d">italic_l start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT</annotation></semantics></math> will consider all possible paths. Therefore the CTC posteriors can be calculated by:</p> <table class="ltx_equation ltx_eqn_table" id="S2.E5"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="P(l|x)=\sum_{\pi\in B^{-1}(l)}p(\pi|x)=\sum^{2U+1}_{v=1}\frac{\alpha(t,v)\cdot% \beta(t,v)}{y^{t}_{l^{\prime}_{v}}}" class="ltx_Math" display="block" id="S2.E5.m1.7"><semantics id="S2.E5.m1.7a"><mrow id="S2.E5.m1.7.7" xref="S2.E5.m1.7.7.cmml"><mrow id="S2.E5.m1.6.6.1" xref="S2.E5.m1.6.6.1.cmml"><mi id="S2.E5.m1.6.6.1.3" xref="S2.E5.m1.6.6.1.3.cmml">P</mi><mo id="S2.E5.m1.6.6.1.2" xref="S2.E5.m1.6.6.1.2.cmml"></mo><mrow id="S2.E5.m1.6.6.1.1.1" xref="S2.E5.m1.6.6.1.1.1.1.cmml"><mo id="S2.E5.m1.6.6.1.1.1.2" stretchy="false" xref="S2.E5.m1.6.6.1.1.1.1.cmml">(</mo><mrow id="S2.E5.m1.6.6.1.1.1.1" xref="S2.E5.m1.6.6.1.1.1.1.cmml"><mi id="S2.E5.m1.6.6.1.1.1.1.2" xref="S2.E5.m1.6.6.1.1.1.1.2.cmml">l</mi><mo fence="false" id="S2.E5.m1.6.6.1.1.1.1.1" xref="S2.E5.m1.6.6.1.1.1.1.1.cmml">|</mo><mi id="S2.E5.m1.6.6.1.1.1.1.3" xref="S2.E5.m1.6.6.1.1.1.1.3.cmml">x</mi></mrow><mo id="S2.E5.m1.6.6.1.1.1.3" stretchy="false" xref="S2.E5.m1.6.6.1.1.1.1.cmml">)</mo></mrow></mrow><mo id="S2.E5.m1.7.7.4" rspace="0.111em" xref="S2.E5.m1.7.7.4.cmml">=</mo><mrow 
id="S2.E5.m1.7.7.2" xref="S2.E5.m1.7.7.2.cmml"><munder id="S2.E5.m1.7.7.2.2" xref="S2.E5.m1.7.7.2.2.cmml"><mo id="S2.E5.m1.7.7.2.2.2" movablelimits="false" xref="S2.E5.m1.7.7.2.2.2.cmml">∑</mo><mrow id="S2.E5.m1.1.1.1" xref="S2.E5.m1.1.1.1.cmml"><mi id="S2.E5.m1.1.1.1.3" xref="S2.E5.m1.1.1.1.3.cmml">π</mi><mo id="S2.E5.m1.1.1.1.2" xref="S2.E5.m1.1.1.1.2.cmml">∈</mo><mrow id="S2.E5.m1.1.1.1.4" xref="S2.E5.m1.1.1.1.4.cmml"><msup id="S2.E5.m1.1.1.1.4.2" xref="S2.E5.m1.1.1.1.4.2.cmml"><mi id="S2.E5.m1.1.1.1.4.2.2" xref="S2.E5.m1.1.1.1.4.2.2.cmml">B</mi><mrow id="S2.E5.m1.1.1.1.4.2.3" xref="S2.E5.m1.1.1.1.4.2.3.cmml"><mo id="S2.E5.m1.1.1.1.4.2.3a" xref="S2.E5.m1.1.1.1.4.2.3.cmml">−</mo><mn id="S2.E5.m1.1.1.1.4.2.3.2" xref="S2.E5.m1.1.1.1.4.2.3.2.cmml">1</mn></mrow></msup><mo id="S2.E5.m1.1.1.1.4.1" xref="S2.E5.m1.1.1.1.4.1.cmml"></mo><mrow id="S2.E5.m1.1.1.1.4.3.2" xref="S2.E5.m1.1.1.1.4.cmml"><mo id="S2.E5.m1.1.1.1.4.3.2.1" stretchy="false" xref="S2.E5.m1.1.1.1.4.cmml">(</mo><mi id="S2.E5.m1.1.1.1.1" xref="S2.E5.m1.1.1.1.1.cmml">l</mi><mo id="S2.E5.m1.1.1.1.4.3.2.2" stretchy="false" xref="S2.E5.m1.1.1.1.4.cmml">)</mo></mrow></mrow></mrow></munder><mrow id="S2.E5.m1.7.7.2.1" xref="S2.E5.m1.7.7.2.1.cmml"><mi id="S2.E5.m1.7.7.2.1.3" xref="S2.E5.m1.7.7.2.1.3.cmml">p</mi><mo id="S2.E5.m1.7.7.2.1.2" xref="S2.E5.m1.7.7.2.1.2.cmml"></mo><mrow id="S2.E5.m1.7.7.2.1.1.1" xref="S2.E5.m1.7.7.2.1.1.1.1.cmml"><mo id="S2.E5.m1.7.7.2.1.1.1.2" stretchy="false" xref="S2.E5.m1.7.7.2.1.1.1.1.cmml">(</mo><mrow id="S2.E5.m1.7.7.2.1.1.1.1" xref="S2.E5.m1.7.7.2.1.1.1.1.cmml"><mi id="S2.E5.m1.7.7.2.1.1.1.1.2" xref="S2.E5.m1.7.7.2.1.1.1.1.2.cmml">π</mi><mo fence="false" id="S2.E5.m1.7.7.2.1.1.1.1.1" xref="S2.E5.m1.7.7.2.1.1.1.1.1.cmml">|</mo><mi id="S2.E5.m1.7.7.2.1.1.1.1.3" xref="S2.E5.m1.7.7.2.1.1.1.1.3.cmml">x</mi></mrow><mo id="S2.E5.m1.7.7.2.1.1.1.3" stretchy="false" xref="S2.E5.m1.7.7.2.1.1.1.1.cmml">)</mo></mrow></mrow></mrow><mo id="S2.E5.m1.7.7.5" rspace="0.111em" 
xref="S2.E5.m1.7.7.5.cmml">=</mo><mrow id="S2.E5.m1.7.7.6" xref="S2.E5.m1.7.7.6.cmml"><munderover id="S2.E5.m1.7.7.6.1" xref="S2.E5.m1.7.7.6.1.cmml"><mo id="S2.E5.m1.7.7.6.1.2.2" movablelimits="false" xref="S2.E5.m1.7.7.6.1.2.2.cmml">∑</mo><mrow id="S2.E5.m1.7.7.6.1.3" xref="S2.E5.m1.7.7.6.1.3.cmml"><mi id="S2.E5.m1.7.7.6.1.3.2" xref="S2.E5.m1.7.7.6.1.3.2.cmml">v</mi><mo id="S2.E5.m1.7.7.6.1.3.1" xref="S2.E5.m1.7.7.6.1.3.1.cmml">=</mo><mn id="S2.E5.m1.7.7.6.1.3.3" xref="S2.E5.m1.7.7.6.1.3.3.cmml">1</mn></mrow><mrow id="S2.E5.m1.7.7.6.1.2.3" xref="S2.E5.m1.7.7.6.1.2.3.cmml"><mrow id="S2.E5.m1.7.7.6.1.2.3.2" xref="S2.E5.m1.7.7.6.1.2.3.2.cmml"><mn id="S2.E5.m1.7.7.6.1.2.3.2.2" xref="S2.E5.m1.7.7.6.1.2.3.2.2.cmml">2</mn><mo id="S2.E5.m1.7.7.6.1.2.3.2.1" xref="S2.E5.m1.7.7.6.1.2.3.2.1.cmml"></mo><mi id="S2.E5.m1.7.7.6.1.2.3.2.3" xref="S2.E5.m1.7.7.6.1.2.3.2.3.cmml">U</mi></mrow><mo id="S2.E5.m1.7.7.6.1.2.3.1" xref="S2.E5.m1.7.7.6.1.2.3.1.cmml">+</mo><mn id="S2.E5.m1.7.7.6.1.2.3.3" xref="S2.E5.m1.7.7.6.1.2.3.3.cmml">1</mn></mrow></munderover><mfrac id="S2.E5.m1.5.5" xref="S2.E5.m1.5.5.cmml"><mrow id="S2.E5.m1.5.5.4" xref="S2.E5.m1.5.5.4.cmml"><mrow id="S2.E5.m1.5.5.4.6" xref="S2.E5.m1.5.5.4.6.cmml"><mrow id="S2.E5.m1.5.5.4.6.2" xref="S2.E5.m1.5.5.4.6.2.cmml"><mi id="S2.E5.m1.5.5.4.6.2.2" xref="S2.E5.m1.5.5.4.6.2.2.cmml">α</mi><mo id="S2.E5.m1.5.5.4.6.2.1" xref="S2.E5.m1.5.5.4.6.2.1.cmml"></mo><mrow id="S2.E5.m1.5.5.4.6.2.3.2" xref="S2.E5.m1.5.5.4.6.2.3.1.cmml"><mo id="S2.E5.m1.5.5.4.6.2.3.2.1" stretchy="false" xref="S2.E5.m1.5.5.4.6.2.3.1.cmml">(</mo><mi id="S2.E5.m1.2.2.1.1" xref="S2.E5.m1.2.2.1.1.cmml">t</mi><mo id="S2.E5.m1.5.5.4.6.2.3.2.2" xref="S2.E5.m1.5.5.4.6.2.3.1.cmml">,</mo><mi id="S2.E5.m1.3.3.2.2" xref="S2.E5.m1.3.3.2.2.cmml">v</mi><mo id="S2.E5.m1.5.5.4.6.2.3.2.3" rspace="0.055em" stretchy="false" xref="S2.E5.m1.5.5.4.6.2.3.1.cmml">)</mo></mrow></mrow><mo id="S2.E5.m1.5.5.4.6.1" rspace="0.222em" xref="S2.E5.m1.5.5.4.6.1.cmml">⋅</mo><mi 
id="S2.E5.m1.5.5.4.6.3" xref="S2.E5.m1.5.5.4.6.3.cmml">β</mi></mrow><mo id="S2.E5.m1.5.5.4.5" xref="S2.E5.m1.5.5.4.5.cmml"></mo><mrow id="S2.E5.m1.5.5.4.7.2" xref="S2.E5.m1.5.5.4.7.1.cmml"><mo id="S2.E5.m1.5.5.4.7.2.1" stretchy="false" xref="S2.E5.m1.5.5.4.7.1.cmml">(</mo><mi id="S2.E5.m1.4.4.3.3" xref="S2.E5.m1.4.4.3.3.cmml">t</mi><mo id="S2.E5.m1.5.5.4.7.2.2" xref="S2.E5.m1.5.5.4.7.1.cmml">,</mo><mi id="S2.E5.m1.5.5.4.4" xref="S2.E5.m1.5.5.4.4.cmml">v</mi><mo id="S2.E5.m1.5.5.4.7.2.3" stretchy="false" xref="S2.E5.m1.5.5.4.7.1.cmml">)</mo></mrow></mrow><msubsup id="S2.E5.m1.5.5.6" xref="S2.E5.m1.5.5.6.cmml"><mi id="S2.E5.m1.5.5.6.2.2" xref="S2.E5.m1.5.5.6.2.2.cmml">y</mi><msubsup id="S2.E5.m1.5.5.6.3" xref="S2.E5.m1.5.5.6.3.cmml"><mi id="S2.E5.m1.5.5.6.3.2.2" xref="S2.E5.m1.5.5.6.3.2.2.cmml">l</mi><mi id="S2.E5.m1.5.5.6.3.3" xref="S2.E5.m1.5.5.6.3.3.cmml">v</mi><mo id="S2.E5.m1.5.5.6.3.2.3" xref="S2.E5.m1.5.5.6.3.2.3.cmml">′</mo></msubsup><mi id="S2.E5.m1.5.5.6.2.3" xref="S2.E5.m1.5.5.6.2.3.cmml">t</mi></msubsup></mfrac></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.E5.m1.7b"><apply id="S2.E5.m1.7.7.cmml" xref="S2.E5.m1.7.7"><and id="S2.E5.m1.7.7a.cmml" xref="S2.E5.m1.7.7"></and><apply id="S2.E5.m1.7.7b.cmml" xref="S2.E5.m1.7.7"><eq id="S2.E5.m1.7.7.4.cmml" xref="S2.E5.m1.7.7.4"></eq><apply id="S2.E5.m1.6.6.1.cmml" xref="S2.E5.m1.6.6.1"><times id="S2.E5.m1.6.6.1.2.cmml" xref="S2.E5.m1.6.6.1.2"></times><ci id="S2.E5.m1.6.6.1.3.cmml" xref="S2.E5.m1.6.6.1.3">𝑃</ci><apply id="S2.E5.m1.6.6.1.1.1.1.cmml" xref="S2.E5.m1.6.6.1.1.1"><csymbol cd="latexml" id="S2.E5.m1.6.6.1.1.1.1.1.cmml" xref="S2.E5.m1.6.6.1.1.1.1.1">conditional</csymbol><ci id="S2.E5.m1.6.6.1.1.1.1.2.cmml" xref="S2.E5.m1.6.6.1.1.1.1.2">𝑙</ci><ci id="S2.E5.m1.6.6.1.1.1.1.3.cmml" xref="S2.E5.m1.6.6.1.1.1.1.3">𝑥</ci></apply></apply><apply id="S2.E5.m1.7.7.2.cmml" xref="S2.E5.m1.7.7.2"><apply id="S2.E5.m1.7.7.2.2.cmml" xref="S2.E5.m1.7.7.2.2"><csymbol cd="ambiguous" id="S2.E5.m1.7.7.2.2.1.cmml" 
xref="S2.E5.m1.7.7.2.2">subscript</csymbol><sum id="S2.E5.m1.7.7.2.2.2.cmml" xref="S2.E5.m1.7.7.2.2.2"></sum><apply id="S2.E5.m1.1.1.1.cmml" xref="S2.E5.m1.1.1.1"><in id="S2.E5.m1.1.1.1.2.cmml" xref="S2.E5.m1.1.1.1.2"></in><ci id="S2.E5.m1.1.1.1.3.cmml" xref="S2.E5.m1.1.1.1.3">𝜋</ci><apply id="S2.E5.m1.1.1.1.4.cmml" xref="S2.E5.m1.1.1.1.4"><times id="S2.E5.m1.1.1.1.4.1.cmml" xref="S2.E5.m1.1.1.1.4.1"></times><apply id="S2.E5.m1.1.1.1.4.2.cmml" xref="S2.E5.m1.1.1.1.4.2"><csymbol cd="ambiguous" id="S2.E5.m1.1.1.1.4.2.1.cmml" xref="S2.E5.m1.1.1.1.4.2">superscript</csymbol><ci id="S2.E5.m1.1.1.1.4.2.2.cmml" xref="S2.E5.m1.1.1.1.4.2.2">𝐵</ci><apply id="S2.E5.m1.1.1.1.4.2.3.cmml" xref="S2.E5.m1.1.1.1.4.2.3"><minus id="S2.E5.m1.1.1.1.4.2.3.1.cmml" xref="S2.E5.m1.1.1.1.4.2.3"></minus><cn id="S2.E5.m1.1.1.1.4.2.3.2.cmml" type="integer" xref="S2.E5.m1.1.1.1.4.2.3.2">1</cn></apply></apply><ci id="S2.E5.m1.1.1.1.1.cmml" xref="S2.E5.m1.1.1.1.1">𝑙</ci></apply></apply></apply><apply id="S2.E5.m1.7.7.2.1.cmml" xref="S2.E5.m1.7.7.2.1"><times id="S2.E5.m1.7.7.2.1.2.cmml" xref="S2.E5.m1.7.7.2.1.2"></times><ci id="S2.E5.m1.7.7.2.1.3.cmml" xref="S2.E5.m1.7.7.2.1.3">𝑝</ci><apply id="S2.E5.m1.7.7.2.1.1.1.1.cmml" xref="S2.E5.m1.7.7.2.1.1.1"><csymbol cd="latexml" id="S2.E5.m1.7.7.2.1.1.1.1.1.cmml" xref="S2.E5.m1.7.7.2.1.1.1.1.1">conditional</csymbol><ci id="S2.E5.m1.7.7.2.1.1.1.1.2.cmml" xref="S2.E5.m1.7.7.2.1.1.1.1.2">𝜋</ci><ci id="S2.E5.m1.7.7.2.1.1.1.1.3.cmml" xref="S2.E5.m1.7.7.2.1.1.1.1.3">𝑥</ci></apply></apply></apply></apply><apply id="S2.E5.m1.7.7c.cmml" xref="S2.E5.m1.7.7"><eq id="S2.E5.m1.7.7.5.cmml" xref="S2.E5.m1.7.7.5"></eq><share href="https://arxiv.org/html/2409.12388v2#S2.E5.m1.7.7.2.cmml" id="S2.E5.m1.7.7d.cmml" xref="S2.E5.m1.7.7"></share><apply id="S2.E5.m1.7.7.6.cmml" xref="S2.E5.m1.7.7.6"><apply id="S2.E5.m1.7.7.6.1.cmml" xref="S2.E5.m1.7.7.6.1"><csymbol cd="ambiguous" id="S2.E5.m1.7.7.6.1.1.cmml" xref="S2.E5.m1.7.7.6.1">subscript</csymbol><apply 
id="S2.E5.m1.7.7.6.1.2.cmml" xref="S2.E5.m1.7.7.6.1"><csymbol cd="ambiguous" id="S2.E5.m1.7.7.6.1.2.1.cmml" xref="S2.E5.m1.7.7.6.1">superscript</csymbol><sum id="S2.E5.m1.7.7.6.1.2.2.cmml" xref="S2.E5.m1.7.7.6.1.2.2"></sum><apply id="S2.E5.m1.7.7.6.1.2.3.cmml" xref="S2.E5.m1.7.7.6.1.2.3"><plus id="S2.E5.m1.7.7.6.1.2.3.1.cmml" xref="S2.E5.m1.7.7.6.1.2.3.1"></plus><apply id="S2.E5.m1.7.7.6.1.2.3.2.cmml" xref="S2.E5.m1.7.7.6.1.2.3.2"><times id="S2.E5.m1.7.7.6.1.2.3.2.1.cmml" xref="S2.E5.m1.7.7.6.1.2.3.2.1"></times><cn id="S2.E5.m1.7.7.6.1.2.3.2.2.cmml" type="integer" xref="S2.E5.m1.7.7.6.1.2.3.2.2">2</cn><ci id="S2.E5.m1.7.7.6.1.2.3.2.3.cmml" xref="S2.E5.m1.7.7.6.1.2.3.2.3">𝑈</ci></apply><cn id="S2.E5.m1.7.7.6.1.2.3.3.cmml" type="integer" xref="S2.E5.m1.7.7.6.1.2.3.3">1</cn></apply></apply><apply id="S2.E5.m1.7.7.6.1.3.cmml" xref="S2.E5.m1.7.7.6.1.3"><eq id="S2.E5.m1.7.7.6.1.3.1.cmml" xref="S2.E5.m1.7.7.6.1.3.1"></eq><ci id="S2.E5.m1.7.7.6.1.3.2.cmml" xref="S2.E5.m1.7.7.6.1.3.2">𝑣</ci><cn id="S2.E5.m1.7.7.6.1.3.3.cmml" type="integer" xref="S2.E5.m1.7.7.6.1.3.3">1</cn></apply></apply><apply id="S2.E5.m1.5.5.cmml" xref="S2.E5.m1.5.5"><divide id="S2.E5.m1.5.5.5.cmml" xref="S2.E5.m1.5.5"></divide><apply id="S2.E5.m1.5.5.4.cmml" xref="S2.E5.m1.5.5.4"><times id="S2.E5.m1.5.5.4.5.cmml" xref="S2.E5.m1.5.5.4.5"></times><apply id="S2.E5.m1.5.5.4.6.cmml" xref="S2.E5.m1.5.5.4.6"><ci id="S2.E5.m1.5.5.4.6.1.cmml" xref="S2.E5.m1.5.5.4.6.1">⋅</ci><apply id="S2.E5.m1.5.5.4.6.2.cmml" xref="S2.E5.m1.5.5.4.6.2"><times id="S2.E5.m1.5.5.4.6.2.1.cmml" xref="S2.E5.m1.5.5.4.6.2.1"></times><ci id="S2.E5.m1.5.5.4.6.2.2.cmml" xref="S2.E5.m1.5.5.4.6.2.2">𝛼</ci><interval closure="open" id="S2.E5.m1.5.5.4.6.2.3.1.cmml" xref="S2.E5.m1.5.5.4.6.2.3.2"><ci id="S2.E5.m1.2.2.1.1.cmml" xref="S2.E5.m1.2.2.1.1">𝑡</ci><ci id="S2.E5.m1.3.3.2.2.cmml" xref="S2.E5.m1.3.3.2.2">𝑣</ci></interval></apply><ci id="S2.E5.m1.5.5.4.6.3.cmml" xref="S2.E5.m1.5.5.4.6.3">𝛽</ci></apply><interval closure="open" 
id="S2.E5.m1.5.5.4.7.1.cmml" xref="S2.E5.m1.5.5.4.7.2"><ci id="S2.E5.m1.4.4.3.3.cmml" xref="S2.E5.m1.4.4.3.3">𝑡</ci><ci id="S2.E5.m1.5.5.4.4.cmml" xref="S2.E5.m1.5.5.4.4">𝑣</ci></interval></apply><apply id="S2.E5.m1.5.5.6.cmml" xref="S2.E5.m1.5.5.6"><csymbol cd="ambiguous" id="S2.E5.m1.5.5.6.1.cmml" xref="S2.E5.m1.5.5.6">subscript</csymbol><apply id="S2.E5.m1.5.5.6.2.cmml" xref="S2.E5.m1.5.5.6"><csymbol cd="ambiguous" id="S2.E5.m1.5.5.6.2.1.cmml" xref="S2.E5.m1.5.5.6">superscript</csymbol><ci id="S2.E5.m1.5.5.6.2.2.cmml" xref="S2.E5.m1.5.5.6.2.2">𝑦</ci><ci id="S2.E5.m1.5.5.6.2.3.cmml" xref="S2.E5.m1.5.5.6.2.3">𝑡</ci></apply><apply id="S2.E5.m1.5.5.6.3.cmml" xref="S2.E5.m1.5.5.6.3"><csymbol cd="ambiguous" id="S2.E5.m1.5.5.6.3.1.cmml" xref="S2.E5.m1.5.5.6.3">subscript</csymbol><apply id="S2.E5.m1.5.5.6.3.2.cmml" xref="S2.E5.m1.5.5.6.3"><csymbol cd="ambiguous" id="S2.E5.m1.5.5.6.3.2.1.cmml" xref="S2.E5.m1.5.5.6.3">superscript</csymbol><ci id="S2.E5.m1.5.5.6.3.2.2.cmml" xref="S2.E5.m1.5.5.6.3.2.2">𝑙</ci><ci id="S2.E5.m1.5.5.6.3.2.3.cmml" xref="S2.E5.m1.5.5.6.3.2.3">′</ci></apply><ci id="S2.E5.m1.5.5.6.3.3.cmml" xref="S2.E5.m1.5.5.6.3.3">𝑣</ci></apply></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E5.m1.7c">P(l|x)=\sum_{\pi\in B^{-1}(l)}p(\pi|x)=\sum^{2U+1}_{v=1}\frac{\alpha(t,v)\cdot% \beta(t,v)}{y^{t}_{l^{\prime}_{v}}}</annotation><annotation encoding="application/x-llamapun" id="S2.E5.m1.7d">italic_P ( italic_l | italic_x ) = ∑ start_POSTSUBSCRIPT italic_π ∈ italic_B start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT ( italic_l ) end_POSTSUBSCRIPT italic_p ( italic_π | italic_x ) = ∑ start_POSTSUPERSCRIPT 2 italic_U + 1 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_v = 1 end_POSTSUBSCRIPT divide start_ARG italic_α ( italic_t , italic_v ) ⋅ italic_β ( italic_t , italic_v ) end_ARG start_ARG italic_y start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_l start_POSTSUPERSCRIPT ′ 
end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_ARG</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(5)</span></td> </tr></tbody> </table> </div> </section> <section class="ltx_subsection" id="S2.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S2.SS2.5.1.1">II-B</span> </span><span class="ltx_text ltx_font_italic" id="S2.SS2.6.2">Speaker-aware CTC based on minimizing Bayes risk</span> </h3> <div class="ltx_para" id="S2.SS2.p1"> <p class="ltx_p" id="S2.SS2.p1.16">Consider a two-talker scenario with serialized target label <math alttext="l=[l_{1}^{a},...,l_{M}^{a},\langle sc\rangle,l_{1}^{b},...,l_{N}^{b}]" class="ltx_Math" display="inline" id="S2.SS2.p1.1.m1.7"><semantics id="S2.SS2.p1.1.m1.7a"><mrow id="S2.SS2.p1.1.m1.7.7" xref="S2.SS2.p1.1.m1.7.7.cmml"><mi id="S2.SS2.p1.1.m1.7.7.7" xref="S2.SS2.p1.1.m1.7.7.7.cmml">l</mi><mo id="S2.SS2.p1.1.m1.7.7.6" xref="S2.SS2.p1.1.m1.7.7.6.cmml">=</mo><mrow id="S2.SS2.p1.1.m1.7.7.5.5" xref="S2.SS2.p1.1.m1.7.7.5.6.cmml"><mo id="S2.SS2.p1.1.m1.7.7.5.5.6" stretchy="false" xref="S2.SS2.p1.1.m1.7.7.5.6.cmml">[</mo><msubsup id="S2.SS2.p1.1.m1.3.3.1.1.1" xref="S2.SS2.p1.1.m1.3.3.1.1.1.cmml"><mi id="S2.SS2.p1.1.m1.3.3.1.1.1.2.2" xref="S2.SS2.p1.1.m1.3.3.1.1.1.2.2.cmml">l</mi><mn id="S2.SS2.p1.1.m1.3.3.1.1.1.2.3" xref="S2.SS2.p1.1.m1.3.3.1.1.1.2.3.cmml">1</mn><mi id="S2.SS2.p1.1.m1.3.3.1.1.1.3" xref="S2.SS2.p1.1.m1.3.3.1.1.1.3.cmml">a</mi></msubsup><mo id="S2.SS2.p1.1.m1.7.7.5.5.7" xref="S2.SS2.p1.1.m1.7.7.5.6.cmml">,</mo><mi id="S2.SS2.p1.1.m1.1.1" mathvariant="normal" xref="S2.SS2.p1.1.m1.1.1.cmml">…</mi><mo id="S2.SS2.p1.1.m1.7.7.5.5.8" xref="S2.SS2.p1.1.m1.7.7.5.6.cmml">,</mo><msubsup id="S2.SS2.p1.1.m1.4.4.2.2.2" 
xref="S2.SS2.p1.1.m1.4.4.2.2.2.cmml"><mi id="S2.SS2.p1.1.m1.4.4.2.2.2.2.2" xref="S2.SS2.p1.1.m1.4.4.2.2.2.2.2.cmml">l</mi><mi id="S2.SS2.p1.1.m1.4.4.2.2.2.2.3" xref="S2.SS2.p1.1.m1.4.4.2.2.2.2.3.cmml">M</mi><mi id="S2.SS2.p1.1.m1.4.4.2.2.2.3" xref="S2.SS2.p1.1.m1.4.4.2.2.2.3.cmml">a</mi></msubsup><mo id="S2.SS2.p1.1.m1.7.7.5.5.9" xref="S2.SS2.p1.1.m1.7.7.5.6.cmml">,</mo><mrow id="S2.SS2.p1.1.m1.5.5.3.3.3.1" xref="S2.SS2.p1.1.m1.5.5.3.3.3.2.cmml"><mo id="S2.SS2.p1.1.m1.5.5.3.3.3.1.2" stretchy="false" xref="S2.SS2.p1.1.m1.5.5.3.3.3.2.1.cmml">⟨</mo><mrow id="S2.SS2.p1.1.m1.5.5.3.3.3.1.1" xref="S2.SS2.p1.1.m1.5.5.3.3.3.1.1.cmml"><mi id="S2.SS2.p1.1.m1.5.5.3.3.3.1.1.2" xref="S2.SS2.p1.1.m1.5.5.3.3.3.1.1.2.cmml">s</mi><mo id="S2.SS2.p1.1.m1.5.5.3.3.3.1.1.1" xref="S2.SS2.p1.1.m1.5.5.3.3.3.1.1.1.cmml"></mo><mi id="S2.SS2.p1.1.m1.5.5.3.3.3.1.1.3" xref="S2.SS2.p1.1.m1.5.5.3.3.3.1.1.3.cmml">c</mi></mrow><mo id="S2.SS2.p1.1.m1.5.5.3.3.3.1.3" stretchy="false" xref="S2.SS2.p1.1.m1.5.5.3.3.3.2.1.cmml">⟩</mo></mrow><mo id="S2.SS2.p1.1.m1.7.7.5.5.10" xref="S2.SS2.p1.1.m1.7.7.5.6.cmml">,</mo><msubsup id="S2.SS2.p1.1.m1.6.6.4.4.4" xref="S2.SS2.p1.1.m1.6.6.4.4.4.cmml"><mi id="S2.SS2.p1.1.m1.6.6.4.4.4.2.2" xref="S2.SS2.p1.1.m1.6.6.4.4.4.2.2.cmml">l</mi><mn id="S2.SS2.p1.1.m1.6.6.4.4.4.2.3" xref="S2.SS2.p1.1.m1.6.6.4.4.4.2.3.cmml">1</mn><mi id="S2.SS2.p1.1.m1.6.6.4.4.4.3" xref="S2.SS2.p1.1.m1.6.6.4.4.4.3.cmml">b</mi></msubsup><mo id="S2.SS2.p1.1.m1.7.7.5.5.11" xref="S2.SS2.p1.1.m1.7.7.5.6.cmml">,</mo><mi id="S2.SS2.p1.1.m1.2.2" mathvariant="normal" xref="S2.SS2.p1.1.m1.2.2.cmml">…</mi><mo id="S2.SS2.p1.1.m1.7.7.5.5.12" xref="S2.SS2.p1.1.m1.7.7.5.6.cmml">,</mo><msubsup id="S2.SS2.p1.1.m1.7.7.5.5.5" xref="S2.SS2.p1.1.m1.7.7.5.5.5.cmml"><mi id="S2.SS2.p1.1.m1.7.7.5.5.5.2.2" xref="S2.SS2.p1.1.m1.7.7.5.5.5.2.2.cmml">l</mi><mi id="S2.SS2.p1.1.m1.7.7.5.5.5.2.3" xref="S2.SS2.p1.1.m1.7.7.5.5.5.2.3.cmml">N</mi><mi id="S2.SS2.p1.1.m1.7.7.5.5.5.3" 
xref="S2.SS2.p1.1.m1.7.7.5.5.5.3.cmml">b</mi></msubsup><mo id="S2.SS2.p1.1.m1.7.7.5.5.13" stretchy="false" xref="S2.SS2.p1.1.m1.7.7.5.6.cmml">]</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.1.m1.7b"><apply id="S2.SS2.p1.1.m1.7.7.cmml" xref="S2.SS2.p1.1.m1.7.7"><eq id="S2.SS2.p1.1.m1.7.7.6.cmml" xref="S2.SS2.p1.1.m1.7.7.6"></eq><ci id="S2.SS2.p1.1.m1.7.7.7.cmml" xref="S2.SS2.p1.1.m1.7.7.7">𝑙</ci><list id="S2.SS2.p1.1.m1.7.7.5.6.cmml" xref="S2.SS2.p1.1.m1.7.7.5.5"><apply id="S2.SS2.p1.1.m1.3.3.1.1.1.cmml" xref="S2.SS2.p1.1.m1.3.3.1.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.1.m1.3.3.1.1.1.1.cmml" xref="S2.SS2.p1.1.m1.3.3.1.1.1">superscript</csymbol><apply id="S2.SS2.p1.1.m1.3.3.1.1.1.2.cmml" xref="S2.SS2.p1.1.m1.3.3.1.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.1.m1.3.3.1.1.1.2.1.cmml" xref="S2.SS2.p1.1.m1.3.3.1.1.1">subscript</csymbol><ci id="S2.SS2.p1.1.m1.3.3.1.1.1.2.2.cmml" xref="S2.SS2.p1.1.m1.3.3.1.1.1.2.2">𝑙</ci><cn id="S2.SS2.p1.1.m1.3.3.1.1.1.2.3.cmml" type="integer" xref="S2.SS2.p1.1.m1.3.3.1.1.1.2.3">1</cn></apply><ci id="S2.SS2.p1.1.m1.3.3.1.1.1.3.cmml" xref="S2.SS2.p1.1.m1.3.3.1.1.1.3">𝑎</ci></apply><ci id="S2.SS2.p1.1.m1.1.1.cmml" xref="S2.SS2.p1.1.m1.1.1">…</ci><apply id="S2.SS2.p1.1.m1.4.4.2.2.2.cmml" xref="S2.SS2.p1.1.m1.4.4.2.2.2"><csymbol cd="ambiguous" id="S2.SS2.p1.1.m1.4.4.2.2.2.1.cmml" xref="S2.SS2.p1.1.m1.4.4.2.2.2">superscript</csymbol><apply id="S2.SS2.p1.1.m1.4.4.2.2.2.2.cmml" xref="S2.SS2.p1.1.m1.4.4.2.2.2"><csymbol cd="ambiguous" id="S2.SS2.p1.1.m1.4.4.2.2.2.2.1.cmml" xref="S2.SS2.p1.1.m1.4.4.2.2.2">subscript</csymbol><ci id="S2.SS2.p1.1.m1.4.4.2.2.2.2.2.cmml" xref="S2.SS2.p1.1.m1.4.4.2.2.2.2.2">𝑙</ci><ci id="S2.SS2.p1.1.m1.4.4.2.2.2.2.3.cmml" xref="S2.SS2.p1.1.m1.4.4.2.2.2.2.3">𝑀</ci></apply><ci id="S2.SS2.p1.1.m1.4.4.2.2.2.3.cmml" xref="S2.SS2.p1.1.m1.4.4.2.2.2.3">𝑎</ci></apply><apply id="S2.SS2.p1.1.m1.5.5.3.3.3.2.cmml" xref="S2.SS2.p1.1.m1.5.5.3.3.3.1"><csymbol cd="latexml" 
id="S2.SS2.p1.1.m1.5.5.3.3.3.2.1.cmml" xref="S2.SS2.p1.1.m1.5.5.3.3.3.1.2">delimited-⟨⟩</csymbol><apply id="S2.SS2.p1.1.m1.5.5.3.3.3.1.1.cmml" xref="S2.SS2.p1.1.m1.5.5.3.3.3.1.1"><times id="S2.SS2.p1.1.m1.5.5.3.3.3.1.1.1.cmml" xref="S2.SS2.p1.1.m1.5.5.3.3.3.1.1.1"></times><ci id="S2.SS2.p1.1.m1.5.5.3.3.3.1.1.2.cmml" xref="S2.SS2.p1.1.m1.5.5.3.3.3.1.1.2">𝑠</ci><ci id="S2.SS2.p1.1.m1.5.5.3.3.3.1.1.3.cmml" xref="S2.SS2.p1.1.m1.5.5.3.3.3.1.1.3">𝑐</ci></apply></apply><apply id="S2.SS2.p1.1.m1.6.6.4.4.4.cmml" xref="S2.SS2.p1.1.m1.6.6.4.4.4"><csymbol cd="ambiguous" id="S2.SS2.p1.1.m1.6.6.4.4.4.1.cmml" xref="S2.SS2.p1.1.m1.6.6.4.4.4">superscript</csymbol><apply id="S2.SS2.p1.1.m1.6.6.4.4.4.2.cmml" xref="S2.SS2.p1.1.m1.6.6.4.4.4"><csymbol cd="ambiguous" id="S2.SS2.p1.1.m1.6.6.4.4.4.2.1.cmml" xref="S2.SS2.p1.1.m1.6.6.4.4.4">subscript</csymbol><ci id="S2.SS2.p1.1.m1.6.6.4.4.4.2.2.cmml" xref="S2.SS2.p1.1.m1.6.6.4.4.4.2.2">𝑙</ci><cn id="S2.SS2.p1.1.m1.6.6.4.4.4.2.3.cmml" type="integer" xref="S2.SS2.p1.1.m1.6.6.4.4.4.2.3">1</cn></apply><ci id="S2.SS2.p1.1.m1.6.6.4.4.4.3.cmml" xref="S2.SS2.p1.1.m1.6.6.4.4.4.3">𝑏</ci></apply><ci id="S2.SS2.p1.1.m1.2.2.cmml" xref="S2.SS2.p1.1.m1.2.2">…</ci><apply id="S2.SS2.p1.1.m1.7.7.5.5.5.cmml" xref="S2.SS2.p1.1.m1.7.7.5.5.5"><csymbol cd="ambiguous" id="S2.SS2.p1.1.m1.7.7.5.5.5.1.cmml" xref="S2.SS2.p1.1.m1.7.7.5.5.5">superscript</csymbol><apply id="S2.SS2.p1.1.m1.7.7.5.5.5.2.cmml" xref="S2.SS2.p1.1.m1.7.7.5.5.5"><csymbol cd="ambiguous" id="S2.SS2.p1.1.m1.7.7.5.5.5.2.1.cmml" xref="S2.SS2.p1.1.m1.7.7.5.5.5">subscript</csymbol><ci id="S2.SS2.p1.1.m1.7.7.5.5.5.2.2.cmml" xref="S2.SS2.p1.1.m1.7.7.5.5.5.2.2">𝑙</ci><ci id="S2.SS2.p1.1.m1.7.7.5.5.5.2.3.cmml" xref="S2.SS2.p1.1.m1.7.7.5.5.5.2.3">𝑁</ci></apply><ci id="S2.SS2.p1.1.m1.7.7.5.5.5.3.cmml" xref="S2.SS2.p1.1.m1.7.7.5.5.5.3">𝑏</ci></apply></list></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.1.m1.7c">l=[l_{1}^{a},...,l_{M}^{a},\langle 
sc\rangle,l_{1}^{b},...,l_{N}^{b}]</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.1.m1.7d">italic_l = [ italic_l start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_a end_POSTSUPERSCRIPT , … , italic_l start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_a end_POSTSUPERSCRIPT , ⟨ italic_s italic_c ⟩ , italic_l start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT , … , italic_l start_POSTSUBSCRIPT italic_N end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT ]</annotation></semantics></math>, where <math alttext="a" class="ltx_Math" display="inline" id="S2.SS2.p1.2.m2.1"><semantics id="S2.SS2.p1.2.m2.1a"><mi id="S2.SS2.p1.2.m2.1.1" xref="S2.SS2.p1.2.m2.1.1.cmml">a</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.2.m2.1b"><ci id="S2.SS2.p1.2.m2.1.1.cmml" xref="S2.SS2.p1.2.m2.1.1">𝑎</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.2.m2.1c">a</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.2.m2.1d">italic_a</annotation></semantics></math> and <math alttext="b" class="ltx_Math" display="inline" id="S2.SS2.p1.3.m3.1"><semantics id="S2.SS2.p1.3.m3.1a"><mi id="S2.SS2.p1.3.m3.1.1" xref="S2.SS2.p1.3.m3.1.1.cmml">b</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.3.m3.1b"><ci id="S2.SS2.p1.3.m3.1.1.cmml" xref="S2.SS2.p1.3.m3.1.1">𝑏</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.3.m3.1c">b</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.3.m3.1d">italic_b</annotation></semantics></math> denote the two speakers and the <math alttext="\langle sc\rangle" class="ltx_Math" display="inline" id="S2.SS2.p1.4.m4.1"><semantics id="S2.SS2.p1.4.m4.1a"><mrow id="S2.SS2.p1.4.m4.1.1.1" xref="S2.SS2.p1.4.m4.1.1.2.cmml"><mo id="S2.SS2.p1.4.m4.1.1.1.2" stretchy="false" xref="S2.SS2.p1.4.m4.1.1.2.1.cmml">⟨</mo><mrow id="S2.SS2.p1.4.m4.1.1.1.1" 
xref="S2.SS2.p1.4.m4.1.1.1.1.cmml"><mi id="S2.SS2.p1.4.m4.1.1.1.1.2" xref="S2.SS2.p1.4.m4.1.1.1.1.2.cmml">s</mi><mo id="S2.SS2.p1.4.m4.1.1.1.1.1" xref="S2.SS2.p1.4.m4.1.1.1.1.1.cmml"></mo><mi id="S2.SS2.p1.4.m4.1.1.1.1.3" xref="S2.SS2.p1.4.m4.1.1.1.1.3.cmml">c</mi></mrow><mo id="S2.SS2.p1.4.m4.1.1.1.3" stretchy="false" xref="S2.SS2.p1.4.m4.1.1.2.1.cmml">⟩</mo></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.4.m4.1b"><apply id="S2.SS2.p1.4.m4.1.1.2.cmml" xref="S2.SS2.p1.4.m4.1.1.1"><csymbol cd="latexml" id="S2.SS2.p1.4.m4.1.1.2.1.cmml" xref="S2.SS2.p1.4.m4.1.1.1.2">delimited-⟨⟩</csymbol><apply id="S2.SS2.p1.4.m4.1.1.1.1.cmml" xref="S2.SS2.p1.4.m4.1.1.1.1"><times id="S2.SS2.p1.4.m4.1.1.1.1.1.cmml" xref="S2.SS2.p1.4.m4.1.1.1.1.1"></times><ci id="S2.SS2.p1.4.m4.1.1.1.1.2.cmml" xref="S2.SS2.p1.4.m4.1.1.1.1.2">𝑠</ci><ci id="S2.SS2.p1.4.m4.1.1.1.1.3.cmml" xref="S2.SS2.p1.4.m4.1.1.1.1.3">𝑐</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.4.m4.1c">\langle sc\rangle</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.4.m4.1d">⟨ italic_s italic_c ⟩</annotation></semantics></math> token separates them. 
We denote the alignment path as <math alttext="\pi=[\pi_{1}^{a},...,\pi_{K}^{a},\pi_{K+1}^{b},...,\pi_{T}^{b}]" class="ltx_Math" display="inline" id="S2.SS2.p1.5.m5.6"><semantics id="S2.SS2.p1.5.m5.6a"><mrow id="S2.SS2.p1.5.m5.6.6" xref="S2.SS2.p1.5.m5.6.6.cmml"><mi id="S2.SS2.p1.5.m5.6.6.6" xref="S2.SS2.p1.5.m5.6.6.6.cmml">π</mi><mo id="S2.SS2.p1.5.m5.6.6.5" xref="S2.SS2.p1.5.m5.6.6.5.cmml">=</mo><mrow id="S2.SS2.p1.5.m5.6.6.4.4" xref="S2.SS2.p1.5.m5.6.6.4.5.cmml"><mo id="S2.SS2.p1.5.m5.6.6.4.4.5" stretchy="false" xref="S2.SS2.p1.5.m5.6.6.4.5.cmml">[</mo><msubsup id="S2.SS2.p1.5.m5.3.3.1.1.1" xref="S2.SS2.p1.5.m5.3.3.1.1.1.cmml"><mi id="S2.SS2.p1.5.m5.3.3.1.1.1.2.2" xref="S2.SS2.p1.5.m5.3.3.1.1.1.2.2.cmml">π</mi><mn id="S2.SS2.p1.5.m5.3.3.1.1.1.2.3" xref="S2.SS2.p1.5.m5.3.3.1.1.1.2.3.cmml">1</mn><mi id="S2.SS2.p1.5.m5.3.3.1.1.1.3" xref="S2.SS2.p1.5.m5.3.3.1.1.1.3.cmml">a</mi></msubsup><mo id="S2.SS2.p1.5.m5.6.6.4.4.6" xref="S2.SS2.p1.5.m5.6.6.4.5.cmml">,</mo><mi id="S2.SS2.p1.5.m5.1.1" mathvariant="normal" xref="S2.SS2.p1.5.m5.1.1.cmml">…</mi><mo id="S2.SS2.p1.5.m5.6.6.4.4.7" xref="S2.SS2.p1.5.m5.6.6.4.5.cmml">,</mo><msubsup id="S2.SS2.p1.5.m5.4.4.2.2.2" xref="S2.SS2.p1.5.m5.4.4.2.2.2.cmml"><mi id="S2.SS2.p1.5.m5.4.4.2.2.2.2.2" xref="S2.SS2.p1.5.m5.4.4.2.2.2.2.2.cmml">π</mi><mi id="S2.SS2.p1.5.m5.4.4.2.2.2.2.3" xref="S2.SS2.p1.5.m5.4.4.2.2.2.2.3.cmml">K</mi><mi id="S2.SS2.p1.5.m5.4.4.2.2.2.3" xref="S2.SS2.p1.5.m5.4.4.2.2.2.3.cmml">a</mi></msubsup><mo id="S2.SS2.p1.5.m5.6.6.4.4.8" xref="S2.SS2.p1.5.m5.6.6.4.5.cmml">,</mo><msubsup id="S2.SS2.p1.5.m5.5.5.3.3.3" xref="S2.SS2.p1.5.m5.5.5.3.3.3.cmml"><mi id="S2.SS2.p1.5.m5.5.5.3.3.3.2.2" xref="S2.SS2.p1.5.m5.5.5.3.3.3.2.2.cmml">π</mi><mrow id="S2.SS2.p1.5.m5.5.5.3.3.3.2.3" xref="S2.SS2.p1.5.m5.5.5.3.3.3.2.3.cmml"><mi id="S2.SS2.p1.5.m5.5.5.3.3.3.2.3.2" xref="S2.SS2.p1.5.m5.5.5.3.3.3.2.3.2.cmml">K</mi><mo id="S2.SS2.p1.5.m5.5.5.3.3.3.2.3.1" xref="S2.SS2.p1.5.m5.5.5.3.3.3.2.3.1.cmml">+</mo><mn 
id="S2.SS2.p1.5.m5.5.5.3.3.3.2.3.3" xref="S2.SS2.p1.5.m5.5.5.3.3.3.2.3.3.cmml">1</mn></mrow><mi id="S2.SS2.p1.5.m5.5.5.3.3.3.3" xref="S2.SS2.p1.5.m5.5.5.3.3.3.3.cmml">b</mi></msubsup><mo id="S2.SS2.p1.5.m5.6.6.4.4.9" xref="S2.SS2.p1.5.m5.6.6.4.5.cmml">,</mo><mi id="S2.SS2.p1.5.m5.2.2" mathvariant="normal" xref="S2.SS2.p1.5.m5.2.2.cmml">…</mi><mo id="S2.SS2.p1.5.m5.6.6.4.4.10" xref="S2.SS2.p1.5.m5.6.6.4.5.cmml">,</mo><msubsup id="S2.SS2.p1.5.m5.6.6.4.4.4" xref="S2.SS2.p1.5.m5.6.6.4.4.4.cmml"><mi id="S2.SS2.p1.5.m5.6.6.4.4.4.2.2" xref="S2.SS2.p1.5.m5.6.6.4.4.4.2.2.cmml">π</mi><mi id="S2.SS2.p1.5.m5.6.6.4.4.4.2.3" xref="S2.SS2.p1.5.m5.6.6.4.4.4.2.3.cmml">T</mi><mi id="S2.SS2.p1.5.m5.6.6.4.4.4.3" xref="S2.SS2.p1.5.m5.6.6.4.4.4.3.cmml">b</mi></msubsup><mo id="S2.SS2.p1.5.m5.6.6.4.4.11" stretchy="false" xref="S2.SS2.p1.5.m5.6.6.4.5.cmml">]</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.5.m5.6b"><apply id="S2.SS2.p1.5.m5.6.6.cmml" xref="S2.SS2.p1.5.m5.6.6"><eq id="S2.SS2.p1.5.m5.6.6.5.cmml" xref="S2.SS2.p1.5.m5.6.6.5"></eq><ci id="S2.SS2.p1.5.m5.6.6.6.cmml" xref="S2.SS2.p1.5.m5.6.6.6">𝜋</ci><list id="S2.SS2.p1.5.m5.6.6.4.5.cmml" xref="S2.SS2.p1.5.m5.6.6.4.4"><apply id="S2.SS2.p1.5.m5.3.3.1.1.1.cmml" xref="S2.SS2.p1.5.m5.3.3.1.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.5.m5.3.3.1.1.1.1.cmml" xref="S2.SS2.p1.5.m5.3.3.1.1.1">superscript</csymbol><apply id="S2.SS2.p1.5.m5.3.3.1.1.1.2.cmml" xref="S2.SS2.p1.5.m5.3.3.1.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.5.m5.3.3.1.1.1.2.1.cmml" xref="S2.SS2.p1.5.m5.3.3.1.1.1">subscript</csymbol><ci id="S2.SS2.p1.5.m5.3.3.1.1.1.2.2.cmml" xref="S2.SS2.p1.5.m5.3.3.1.1.1.2.2">𝜋</ci><cn id="S2.SS2.p1.5.m5.3.3.1.1.1.2.3.cmml" type="integer" xref="S2.SS2.p1.5.m5.3.3.1.1.1.2.3">1</cn></apply><ci id="S2.SS2.p1.5.m5.3.3.1.1.1.3.cmml" xref="S2.SS2.p1.5.m5.3.3.1.1.1.3">𝑎</ci></apply><ci id="S2.SS2.p1.5.m5.1.1.cmml" xref="S2.SS2.p1.5.m5.1.1">…</ci><apply id="S2.SS2.p1.5.m5.4.4.2.2.2.cmml" 
xref="S2.SS2.p1.5.m5.4.4.2.2.2"><csymbol cd="ambiguous" id="S2.SS2.p1.5.m5.4.4.2.2.2.1.cmml" xref="S2.SS2.p1.5.m5.4.4.2.2.2">superscript</csymbol><apply id="S2.SS2.p1.5.m5.4.4.2.2.2.2.cmml" xref="S2.SS2.p1.5.m5.4.4.2.2.2"><csymbol cd="ambiguous" id="S2.SS2.p1.5.m5.4.4.2.2.2.2.1.cmml" xref="S2.SS2.p1.5.m5.4.4.2.2.2">subscript</csymbol><ci id="S2.SS2.p1.5.m5.4.4.2.2.2.2.2.cmml" xref="S2.SS2.p1.5.m5.4.4.2.2.2.2.2">𝜋</ci><ci id="S2.SS2.p1.5.m5.4.4.2.2.2.2.3.cmml" xref="S2.SS2.p1.5.m5.4.4.2.2.2.2.3">𝐾</ci></apply><ci id="S2.SS2.p1.5.m5.4.4.2.2.2.3.cmml" xref="S2.SS2.p1.5.m5.4.4.2.2.2.3">𝑎</ci></apply><apply id="S2.SS2.p1.5.m5.5.5.3.3.3.cmml" xref="S2.SS2.p1.5.m5.5.5.3.3.3"><csymbol cd="ambiguous" id="S2.SS2.p1.5.m5.5.5.3.3.3.1.cmml" xref="S2.SS2.p1.5.m5.5.5.3.3.3">superscript</csymbol><apply id="S2.SS2.p1.5.m5.5.5.3.3.3.2.cmml" xref="S2.SS2.p1.5.m5.5.5.3.3.3"><csymbol cd="ambiguous" id="S2.SS2.p1.5.m5.5.5.3.3.3.2.1.cmml" xref="S2.SS2.p1.5.m5.5.5.3.3.3">subscript</csymbol><ci id="S2.SS2.p1.5.m5.5.5.3.3.3.2.2.cmml" xref="S2.SS2.p1.5.m5.5.5.3.3.3.2.2">𝜋</ci><apply id="S2.SS2.p1.5.m5.5.5.3.3.3.2.3.cmml" xref="S2.SS2.p1.5.m5.5.5.3.3.3.2.3"><plus id="S2.SS2.p1.5.m5.5.5.3.3.3.2.3.1.cmml" xref="S2.SS2.p1.5.m5.5.5.3.3.3.2.3.1"></plus><ci id="S2.SS2.p1.5.m5.5.5.3.3.3.2.3.2.cmml" xref="S2.SS2.p1.5.m5.5.5.3.3.3.2.3.2">𝐾</ci><cn id="S2.SS2.p1.5.m5.5.5.3.3.3.2.3.3.cmml" type="integer" xref="S2.SS2.p1.5.m5.5.5.3.3.3.2.3.3">1</cn></apply></apply><ci id="S2.SS2.p1.5.m5.5.5.3.3.3.3.cmml" xref="S2.SS2.p1.5.m5.5.5.3.3.3.3">𝑏</ci></apply><ci id="S2.SS2.p1.5.m5.2.2.cmml" xref="S2.SS2.p1.5.m5.2.2">…</ci><apply id="S2.SS2.p1.5.m5.6.6.4.4.4.cmml" xref="S2.SS2.p1.5.m5.6.6.4.4.4"><csymbol cd="ambiguous" id="S2.SS2.p1.5.m5.6.6.4.4.4.1.cmml" xref="S2.SS2.p1.5.m5.6.6.4.4.4">superscript</csymbol><apply id="S2.SS2.p1.5.m5.6.6.4.4.4.2.cmml" xref="S2.SS2.p1.5.m5.6.6.4.4.4"><csymbol cd="ambiguous" id="S2.SS2.p1.5.m5.6.6.4.4.4.2.1.cmml" xref="S2.SS2.p1.5.m5.6.6.4.4.4">subscript</csymbol><ci 
id="S2.SS2.p1.5.m5.6.6.4.4.4.2.2.cmml" xref="S2.SS2.p1.5.m5.6.6.4.4.4.2.2">𝜋</ci><ci id="S2.SS2.p1.5.m5.6.6.4.4.4.2.3.cmml" xref="S2.SS2.p1.5.m5.6.6.4.4.4.2.3">𝑇</ci></apply><ci id="S2.SS2.p1.5.m5.6.6.4.4.4.3.cmml" xref="S2.SS2.p1.5.m5.6.6.4.4.4.3">𝑏</ci></apply></list></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.5.m5.6c">\pi=[\pi_{1}^{a},...,\pi_{K}^{a},\pi_{K+1}^{b},...,\pi_{T}^{b}]</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.5.m5.6d">italic_π = [ italic_π start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_a end_POSTSUPERSCRIPT , … , italic_π start_POSTSUBSCRIPT italic_K end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_a end_POSTSUPERSCRIPT , italic_π start_POSTSUBSCRIPT italic_K + 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT , … , italic_π start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT ]</annotation></semantics></math>, where <math alttext="\pi_{K}^{a}" class="ltx_Math" display="inline" id="S2.SS2.p1.6.m6.1"><semantics id="S2.SS2.p1.6.m6.1a"><msubsup id="S2.SS2.p1.6.m6.1.1" xref="S2.SS2.p1.6.m6.1.1.cmml"><mi id="S2.SS2.p1.6.m6.1.1.2.2" xref="S2.SS2.p1.6.m6.1.1.2.2.cmml">π</mi><mi id="S2.SS2.p1.6.m6.1.1.2.3" xref="S2.SS2.p1.6.m6.1.1.2.3.cmml">K</mi><mi id="S2.SS2.p1.6.m6.1.1.3" xref="S2.SS2.p1.6.m6.1.1.3.cmml">a</mi></msubsup><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.6.m6.1b"><apply id="S2.SS2.p1.6.m6.1.1.cmml" xref="S2.SS2.p1.6.m6.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.6.m6.1.1.1.cmml" xref="S2.SS2.p1.6.m6.1.1">superscript</csymbol><apply id="S2.SS2.p1.6.m6.1.1.2.cmml" xref="S2.SS2.p1.6.m6.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.6.m6.1.1.2.1.cmml" xref="S2.SS2.p1.6.m6.1.1">subscript</csymbol><ci id="S2.SS2.p1.6.m6.1.1.2.2.cmml" xref="S2.SS2.p1.6.m6.1.1.2.2">𝜋</ci><ci id="S2.SS2.p1.6.m6.1.1.2.3.cmml" xref="S2.SS2.p1.6.m6.1.1.2.3">𝐾</ci></apply><ci id="S2.SS2.p1.6.m6.1.1.3.cmml" 
xref="S2.SS2.p1.6.m6.1.1.3">𝑎</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.6.m6.1c">\pi_{K}^{a}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.6.m6.1d">italic_π start_POSTSUBSCRIPT italic_K end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_a end_POSTSUPERSCRIPT</annotation></semantics></math> represents the last <math alttext="\langle sc\rangle" class="ltx_Math" display="inline" id="S2.SS2.p1.7.m7.1"><semantics id="S2.SS2.p1.7.m7.1a"><mrow id="S2.SS2.p1.7.m7.1.1.1" xref="S2.SS2.p1.7.m7.1.1.2.cmml"><mo id="S2.SS2.p1.7.m7.1.1.1.2" stretchy="false" xref="S2.SS2.p1.7.m7.1.1.2.1.cmml">⟨</mo><mrow id="S2.SS2.p1.7.m7.1.1.1.1" xref="S2.SS2.p1.7.m7.1.1.1.1.cmml"><mi id="S2.SS2.p1.7.m7.1.1.1.1.2" xref="S2.SS2.p1.7.m7.1.1.1.1.2.cmml">s</mi><mo id="S2.SS2.p1.7.m7.1.1.1.1.1" xref="S2.SS2.p1.7.m7.1.1.1.1.1.cmml"></mo><mi id="S2.SS2.p1.7.m7.1.1.1.1.3" xref="S2.SS2.p1.7.m7.1.1.1.1.3.cmml">c</mi></mrow><mo id="S2.SS2.p1.7.m7.1.1.1.3" stretchy="false" xref="S2.SS2.p1.7.m7.1.1.2.1.cmml">⟩</mo></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.7.m7.1b"><apply id="S2.SS2.p1.7.m7.1.1.2.cmml" xref="S2.SS2.p1.7.m7.1.1.1"><csymbol cd="latexml" id="S2.SS2.p1.7.m7.1.1.2.1.cmml" xref="S2.SS2.p1.7.m7.1.1.1.2">delimited-⟨⟩</csymbol><apply id="S2.SS2.p1.7.m7.1.1.1.1.cmml" xref="S2.SS2.p1.7.m7.1.1.1.1"><times id="S2.SS2.p1.7.m7.1.1.1.1.1.cmml" xref="S2.SS2.p1.7.m7.1.1.1.1.1"></times><ci id="S2.SS2.p1.7.m7.1.1.1.1.2.cmml" xref="S2.SS2.p1.7.m7.1.1.1.1.2">𝑠</ci><ci id="S2.SS2.p1.7.m7.1.1.1.1.3.cmml" xref="S2.SS2.p1.7.m7.1.1.1.1.3">𝑐</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.7.m7.1c">\langle sc\rangle</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.7.m7.1d">⟨ italic_s italic_c ⟩</annotation></semantics></math> token. 
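</p>
<p class="ltx_p">As a minimal illustrative sketch of this notation (not the authors' implementation), the snippet below takes a toy frame-level path, collapses it with the CTC mapping B to recover the serialized label, and locates K, the last frame carrying the speaker-change token. The symbols <code>BLANK</code> and <code>SC</code>, the toy tokens, and the helper names <code>collapse</code> and <code>split_at_last_sc</code> are assumptions made only for this example.</p>

```python
from itertools import groupby

BLANK = "_"   # assumed blank symbol (illustrative only)
SC = "[sc]"   # assumed stand-in for the speaker-change token


def collapse(path):
    """CTC mapping B: merge consecutive repeats, then drop blanks."""
    merged = [token for token, _ in groupby(path)]
    return [token for token in merged if token != BLANK]


def split_at_last_sc(path):
    """Return (K, frames 1..K, frames K+1..T), with K counted from 1."""
    # K is the last frame whose label is the speaker-change token.
    k = max(i for i, token in enumerate(path, start=1) if token == SC)
    return k, path[:k], path[k:]


# Toy path over T = 10 frames: speaker a says "hi", speaker b says "yo".
path = ["h", "h", "_", "i", SC, SC, "y", "_", "o", "o"]
k, path_a, path_b = split_at_last_sc(path)

print(collapse(path))    # serialized label l
print(k)                 # index K of the last speaker-change frame
print(collapse(path_a))  # speaker a tokens followed by the change token
print(collapse(path_b))  # speaker b tokens
```

<p class="ltx_p">In this toy path, frames 1 to K form the speaker a segment (ending with the speaker-change token) and frames K+1 to T form the speaker b segment, mirroring the split of the alignment path described above.</p>
<p class="ltx_p">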
During CTC training, we consider the case where multi-talker overlapped speech is encoded by an acoustic encoder, resulting in an embedding <math alttext="x" class="ltx_Math" display="inline" id="S2.SS2.p1.8.m8.1"><semantics id="S2.SS2.p1.8.m8.1a"><mi id="S2.SS2.p1.8.m8.1.1" xref="S2.SS2.p1.8.m8.1.1.cmml">x</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.8.m8.1b"><ci id="S2.SS2.p1.8.m8.1.1.cmml" xref="S2.SS2.p1.8.m8.1.1">𝑥</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.8.m8.1c">x</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.8.m8.1d">italic_x</annotation></semantics></math>. Eq. <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#S2.E2" title="In II-A Revisit CTC in speech recognition ‣ II Methods ‣ Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC"><span class="ltx_text ltx_ref_tag">2</span></a> suggests that the information carried by <math alttext="x" class="ltx_Math" display="inline" id="S2.SS2.p1.9.m9.1"><semantics id="S2.SS2.p1.9.m9.1a"><mi id="S2.SS2.p1.9.m9.1.1" xref="S2.SS2.p1.9.m9.1.1.cmml">x</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.9.m9.1b"><ci id="S2.SS2.p1.9.m9.1.1.cmml" xref="S2.SS2.p1.9.m9.1.1">𝑥</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.9.m9.1c">x</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.9.m9.1d">italic_x</annotation></semantics></math> is inherently encouraged to align with <math alttext="\pi" class="ltx_Math" display="inline" id="S2.SS2.p1.10.m10.1"><semantics id="S2.SS2.p1.10.m10.1a"><mi id="S2.SS2.p1.10.m10.1.1" xref="S2.SS2.p1.10.m10.1.1.cmml">π</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.10.m10.1b"><ci id="S2.SS2.p1.10.m10.1.1.cmml" xref="S2.SS2.p1.10.m10.1.1">𝜋</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.10.m10.1c">\pi</annotation><annotation encoding="application/x-llamapun" 
id="S2.SS2.p1.10.m10.1d">italic_π</annotation></semantics></math>. I.e., <math alttext="[x_{1},...,x_{K-1}]" class="ltx_Math" display="inline" id="S2.SS2.p1.11.m11.3"><semantics id="S2.SS2.p1.11.m11.3a"><mrow id="S2.SS2.p1.11.m11.3.3.2" xref="S2.SS2.p1.11.m11.3.3.3.cmml"><mo id="S2.SS2.p1.11.m11.3.3.2.3" stretchy="false" xref="S2.SS2.p1.11.m11.3.3.3.cmml">[</mo><msub id="S2.SS2.p1.11.m11.2.2.1.1" xref="S2.SS2.p1.11.m11.2.2.1.1.cmml"><mi id="S2.SS2.p1.11.m11.2.2.1.1.2" xref="S2.SS2.p1.11.m11.2.2.1.1.2.cmml">x</mi><mn id="S2.SS2.p1.11.m11.2.2.1.1.3" xref="S2.SS2.p1.11.m11.2.2.1.1.3.cmml">1</mn></msub><mo id="S2.SS2.p1.11.m11.3.3.2.4" xref="S2.SS2.p1.11.m11.3.3.3.cmml">,</mo><mi id="S2.SS2.p1.11.m11.1.1" mathvariant="normal" xref="S2.SS2.p1.11.m11.1.1.cmml">…</mi><mo id="S2.SS2.p1.11.m11.3.3.2.5" xref="S2.SS2.p1.11.m11.3.3.3.cmml">,</mo><msub id="S2.SS2.p1.11.m11.3.3.2.2" xref="S2.SS2.p1.11.m11.3.3.2.2.cmml"><mi id="S2.SS2.p1.11.m11.3.3.2.2.2" xref="S2.SS2.p1.11.m11.3.3.2.2.2.cmml">x</mi><mrow id="S2.SS2.p1.11.m11.3.3.2.2.3" xref="S2.SS2.p1.11.m11.3.3.2.2.3.cmml"><mi id="S2.SS2.p1.11.m11.3.3.2.2.3.2" xref="S2.SS2.p1.11.m11.3.3.2.2.3.2.cmml">K</mi><mo id="S2.SS2.p1.11.m11.3.3.2.2.3.1" xref="S2.SS2.p1.11.m11.3.3.2.2.3.1.cmml">−</mo><mn id="S2.SS2.p1.11.m11.3.3.2.2.3.3" xref="S2.SS2.p1.11.m11.3.3.2.2.3.3.cmml">1</mn></mrow></msub><mo id="S2.SS2.p1.11.m11.3.3.2.6" stretchy="false" xref="S2.SS2.p1.11.m11.3.3.3.cmml">]</mo></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.11.m11.3b"><list id="S2.SS2.p1.11.m11.3.3.3.cmml" xref="S2.SS2.p1.11.m11.3.3.2"><apply id="S2.SS2.p1.11.m11.2.2.1.1.cmml" xref="S2.SS2.p1.11.m11.2.2.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.11.m11.2.2.1.1.1.cmml" xref="S2.SS2.p1.11.m11.2.2.1.1">subscript</csymbol><ci id="S2.SS2.p1.11.m11.2.2.1.1.2.cmml" xref="S2.SS2.p1.11.m11.2.2.1.1.2">𝑥</ci><cn id="S2.SS2.p1.11.m11.2.2.1.1.3.cmml" type="integer" xref="S2.SS2.p1.11.m11.2.2.1.1.3">1</cn></apply><ci id="S2.SS2.p1.11.m11.1.1.cmml" 
xref="S2.SS2.p1.11.m11.1.1">…</ci><apply id="S2.SS2.p1.11.m11.3.3.2.2.cmml" xref="S2.SS2.p1.11.m11.3.3.2.2"><csymbol cd="ambiguous" id="S2.SS2.p1.11.m11.3.3.2.2.1.cmml" xref="S2.SS2.p1.11.m11.3.3.2.2">subscript</csymbol><ci id="S2.SS2.p1.11.m11.3.3.2.2.2.cmml" xref="S2.SS2.p1.11.m11.3.3.2.2.2">𝑥</ci><apply id="S2.SS2.p1.11.m11.3.3.2.2.3.cmml" xref="S2.SS2.p1.11.m11.3.3.2.2.3"><minus id="S2.SS2.p1.11.m11.3.3.2.2.3.1.cmml" xref="S2.SS2.p1.11.m11.3.3.2.2.3.1"></minus><ci id="S2.SS2.p1.11.m11.3.3.2.2.3.2.cmml" xref="S2.SS2.p1.11.m11.3.3.2.2.3.2">𝐾</ci><cn id="S2.SS2.p1.11.m11.3.3.2.2.3.3.cmml" type="integer" xref="S2.SS2.p1.11.m11.3.3.2.2.3.3">1</cn></apply></apply></list></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.11.m11.3c">[x_{1},...,x_{K-1}]</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.11.m11.3d">[ italic_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_x start_POSTSUBSCRIPT italic_K - 1 end_POSTSUBSCRIPT ]</annotation></semantics></math> encodes speaker <math alttext="a" class="ltx_Math" display="inline" id="S2.SS2.p1.12.m12.1"><semantics id="S2.SS2.p1.12.m12.1a"><mi id="S2.SS2.p1.12.m12.1.1" xref="S2.SS2.p1.12.m12.1.1.cmml">a</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.12.m12.1b"><ci id="S2.SS2.p1.12.m12.1.1.cmml" xref="S2.SS2.p1.12.m12.1.1">𝑎</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.12.m12.1c">a</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.12.m12.1d">italic_a</annotation></semantics></math> and <math alttext="[x_{K+1},...,x_{T}]" class="ltx_Math" display="inline" id="S2.SS2.p1.13.m13.3"><semantics id="S2.SS2.p1.13.m13.3a"><mrow id="S2.SS2.p1.13.m13.3.3.2" xref="S2.SS2.p1.13.m13.3.3.3.cmml"><mo id="S2.SS2.p1.13.m13.3.3.2.3" stretchy="false" xref="S2.SS2.p1.13.m13.3.3.3.cmml">[</mo><msub id="S2.SS2.p1.13.m13.2.2.1.1" xref="S2.SS2.p1.13.m13.2.2.1.1.cmml"><mi id="S2.SS2.p1.13.m13.2.2.1.1.2" 
xref="S2.SS2.p1.13.m13.2.2.1.1.2.cmml">x</mi><mrow id="S2.SS2.p1.13.m13.2.2.1.1.3" xref="S2.SS2.p1.13.m13.2.2.1.1.3.cmml"><mi id="S2.SS2.p1.13.m13.2.2.1.1.3.2" xref="S2.SS2.p1.13.m13.2.2.1.1.3.2.cmml">K</mi><mo id="S2.SS2.p1.13.m13.2.2.1.1.3.1" xref="S2.SS2.p1.13.m13.2.2.1.1.3.1.cmml">+</mo><mn id="S2.SS2.p1.13.m13.2.2.1.1.3.3" xref="S2.SS2.p1.13.m13.2.2.1.1.3.3.cmml">1</mn></mrow></msub><mo id="S2.SS2.p1.13.m13.3.3.2.4" xref="S2.SS2.p1.13.m13.3.3.3.cmml">,</mo><mi id="S2.SS2.p1.13.m13.1.1" mathvariant="normal" xref="S2.SS2.p1.13.m13.1.1.cmml">…</mi><mo id="S2.SS2.p1.13.m13.3.3.2.5" xref="S2.SS2.p1.13.m13.3.3.3.cmml">,</mo><msub id="S2.SS2.p1.13.m13.3.3.2.2" xref="S2.SS2.p1.13.m13.3.3.2.2.cmml"><mi id="S2.SS2.p1.13.m13.3.3.2.2.2" xref="S2.SS2.p1.13.m13.3.3.2.2.2.cmml">x</mi><mi id="S2.SS2.p1.13.m13.3.3.2.2.3" xref="S2.SS2.p1.13.m13.3.3.2.2.3.cmml">T</mi></msub><mo id="S2.SS2.p1.13.m13.3.3.2.6" stretchy="false" xref="S2.SS2.p1.13.m13.3.3.3.cmml">]</mo></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.13.m13.3b"><list id="S2.SS2.p1.13.m13.3.3.3.cmml" xref="S2.SS2.p1.13.m13.3.3.2"><apply id="S2.SS2.p1.13.m13.2.2.1.1.cmml" xref="S2.SS2.p1.13.m13.2.2.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.13.m13.2.2.1.1.1.cmml" xref="S2.SS2.p1.13.m13.2.2.1.1">subscript</csymbol><ci id="S2.SS2.p1.13.m13.2.2.1.1.2.cmml" xref="S2.SS2.p1.13.m13.2.2.1.1.2">𝑥</ci><apply id="S2.SS2.p1.13.m13.2.2.1.1.3.cmml" xref="S2.SS2.p1.13.m13.2.2.1.1.3"><plus id="S2.SS2.p1.13.m13.2.2.1.1.3.1.cmml" xref="S2.SS2.p1.13.m13.2.2.1.1.3.1"></plus><ci id="S2.SS2.p1.13.m13.2.2.1.1.3.2.cmml" xref="S2.SS2.p1.13.m13.2.2.1.1.3.2">𝐾</ci><cn id="S2.SS2.p1.13.m13.2.2.1.1.3.3.cmml" type="integer" xref="S2.SS2.p1.13.m13.2.2.1.1.3.3">1</cn></apply></apply><ci id="S2.SS2.p1.13.m13.1.1.cmml" xref="S2.SS2.p1.13.m13.1.1">…</ci><apply id="S2.SS2.p1.13.m13.3.3.2.2.cmml" xref="S2.SS2.p1.13.m13.3.3.2.2"><csymbol cd="ambiguous" id="S2.SS2.p1.13.m13.3.3.2.2.1.cmml" 
xref="S2.SS2.p1.13.m13.3.3.2.2">subscript</csymbol><ci id="S2.SS2.p1.13.m13.3.3.2.2.2.cmml" xref="S2.SS2.p1.13.m13.3.3.2.2.2">𝑥</ci><ci id="S2.SS2.p1.13.m13.3.3.2.2.3.cmml" xref="S2.SS2.p1.13.m13.3.3.2.2.3">𝑇</ci></apply></list></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.13.m13.3c">[x_{K+1},...,x_{T}]</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.13.m13.3d">[ italic_x start_POSTSUBSCRIPT italic_K + 1 end_POSTSUBSCRIPT , … , italic_x start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT ]</annotation></semantics></math> encodes speaker <math alttext="b" class="ltx_Math" display="inline" id="S2.SS2.p1.14.m14.1"><semantics id="S2.SS2.p1.14.m14.1a"><mi id="S2.SS2.p1.14.m14.1.1" xref="S2.SS2.p1.14.m14.1.1.cmml">b</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.14.m14.1b"><ci id="S2.SS2.p1.14.m14.1.1.cmml" xref="S2.SS2.p1.14.m14.1.1">𝑏</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.14.m14.1c">b</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.14.m14.1d">italic_b</annotation></semantics></math>. We note that the “speaker boundary” <math alttext="K" class="ltx_Math" display="inline" id="S2.SS2.p1.15.m15.1"><semantics id="S2.SS2.p1.15.m15.1a"><mi id="S2.SS2.p1.15.m15.1.1" xref="S2.SS2.p1.15.m15.1.1.cmml">K</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.15.m15.1b"><ci id="S2.SS2.p1.15.m15.1.1.cmml" xref="S2.SS2.p1.15.m15.1.1">𝐾</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.15.m15.1c">K</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.15.m15.1d">italic_K</annotation></semantics></math> varies across alignment paths, potentially confusing the encoder on how different speakers are represented. 
Furthermore, an embedding <math alttext="x" class="ltx_Math" display="inline" id="S2.SS2.p1.16.m16.1"><semantics id="S2.SS2.p1.16.m16.1a"><mi id="S2.SS2.p1.16.m16.1.1" xref="S2.SS2.p1.16.m16.1.1.cmml">x</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.16.m16.1b"><ci id="S2.SS2.p1.16.m16.1.1.cmml" xref="S2.SS2.p1.16.m16.1.1">𝑥</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.16.m16.1c">x</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.16.m16.1d">italic_x</annotation></semantics></math> with a nondeterministic speaker boundary may complicate subsequent processing, e.g., hindering a cascaded ASR decoder from recognizing different speakers.</p> </div> <div class="ltx_para" id="S2.SS2.p2"> <p class="ltx_p" id="S2.SS2.p2.1">To address this issue, we propose a speaker-aware CTC training objective as an enhanced and tailored loss function for MTASR. The core idea is to constrain the encoder model to represent different speakers’ tokens at specific time frames, which <span class="ltx_text ltx_font_italic" id="S2.SS2.p2.1.1">explicitly models speaker disentanglement</span>. To control CTC prediction, the Bayes risk CTC (BRCTC) framework <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib33" title="">33</a>]</cite> is used to introduce a preference over alignment paths. 
Specifically, BRCTC defined Bayes risk function <math alttext="r(\pi)" class="ltx_Math" display="inline" id="S2.SS2.p2.1.m1.1"><semantics id="S2.SS2.p2.1.m1.1a"><mrow id="S2.SS2.p2.1.m1.1.2" xref="S2.SS2.p2.1.m1.1.2.cmml"><mi id="S2.SS2.p2.1.m1.1.2.2" xref="S2.SS2.p2.1.m1.1.2.2.cmml">r</mi><mo id="S2.SS2.p2.1.m1.1.2.1" xref="S2.SS2.p2.1.m1.1.2.1.cmml"></mo><mrow id="S2.SS2.p2.1.m1.1.2.3.2" xref="S2.SS2.p2.1.m1.1.2.cmml"><mo id="S2.SS2.p2.1.m1.1.2.3.2.1" stretchy="false" xref="S2.SS2.p2.1.m1.1.2.cmml">(</mo><mi id="S2.SS2.p2.1.m1.1.1" xref="S2.SS2.p2.1.m1.1.1.cmml">π</mi><mo id="S2.SS2.p2.1.m1.1.2.3.2.2" stretchy="false" xref="S2.SS2.p2.1.m1.1.2.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p2.1.m1.1b"><apply id="S2.SS2.p2.1.m1.1.2.cmml" xref="S2.SS2.p2.1.m1.1.2"><times id="S2.SS2.p2.1.m1.1.2.1.cmml" xref="S2.SS2.p2.1.m1.1.2.1"></times><ci id="S2.SS2.p2.1.m1.1.2.2.cmml" xref="S2.SS2.p2.1.m1.1.2.2">𝑟</ci><ci id="S2.SS2.p2.1.m1.1.1.cmml" xref="S2.SS2.p2.1.m1.1.1">𝜋</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p2.1.m1.1c">r(\pi)</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p2.1.m1.1d">italic_r ( italic_π )</annotation></semantics></math> over posteriors of alignment paths, and the training objective is:</p> <table class="ltx_equation ltx_eqn_table" id="S2.E6"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="\mathcal{J}_{br}(l,x)=\sum_{\pi\in B^{-1}(l)}r(\pi)\cdot p(\pi|x)" class="ltx_Math" display="block" id="S2.E6.m1.5"><semantics id="S2.E6.m1.5a"><mrow id="S2.E6.m1.5.5" xref="S2.E6.m1.5.5.cmml"><mrow id="S2.E6.m1.5.5.3" xref="S2.E6.m1.5.5.3.cmml"><msub id="S2.E6.m1.5.5.3.2" xref="S2.E6.m1.5.5.3.2.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.E6.m1.5.5.3.2.2" xref="S2.E6.m1.5.5.3.2.2.cmml">𝒥</mi><mrow id="S2.E6.m1.5.5.3.2.3" 
xref="S2.E6.m1.5.5.3.2.3.cmml"><mi id="S2.E6.m1.5.5.3.2.3.2" xref="S2.E6.m1.5.5.3.2.3.2.cmml">b</mi><mo id="S2.E6.m1.5.5.3.2.3.1" xref="S2.E6.m1.5.5.3.2.3.1.cmml"></mo><mi id="S2.E6.m1.5.5.3.2.3.3" xref="S2.E6.m1.5.5.3.2.3.3.cmml">r</mi></mrow></msub><mo id="S2.E6.m1.5.5.3.1" xref="S2.E6.m1.5.5.3.1.cmml"></mo><mrow id="S2.E6.m1.5.5.3.3.2" xref="S2.E6.m1.5.5.3.3.1.cmml"><mo id="S2.E6.m1.5.5.3.3.2.1" stretchy="false" xref="S2.E6.m1.5.5.3.3.1.cmml">(</mo><mi id="S2.E6.m1.2.2" xref="S2.E6.m1.2.2.cmml">l</mi><mo id="S2.E6.m1.5.5.3.3.2.2" xref="S2.E6.m1.5.5.3.3.1.cmml">,</mo><mi id="S2.E6.m1.3.3" xref="S2.E6.m1.3.3.cmml">x</mi><mo id="S2.E6.m1.5.5.3.3.2.3" stretchy="false" xref="S2.E6.m1.5.5.3.3.1.cmml">)</mo></mrow></mrow><mo id="S2.E6.m1.5.5.2" rspace="0.111em" xref="S2.E6.m1.5.5.2.cmml">=</mo><mrow id="S2.E6.m1.5.5.1" xref="S2.E6.m1.5.5.1.cmml"><munder id="S2.E6.m1.5.5.1.2" xref="S2.E6.m1.5.5.1.2.cmml"><mo id="S2.E6.m1.5.5.1.2.2" movablelimits="false" xref="S2.E6.m1.5.5.1.2.2.cmml">∑</mo><mrow id="S2.E6.m1.1.1.1" xref="S2.E6.m1.1.1.1.cmml"><mi id="S2.E6.m1.1.1.1.3" xref="S2.E6.m1.1.1.1.3.cmml">π</mi><mo id="S2.E6.m1.1.1.1.2" xref="S2.E6.m1.1.1.1.2.cmml">∈</mo><mrow id="S2.E6.m1.1.1.1.4" xref="S2.E6.m1.1.1.1.4.cmml"><msup id="S2.E6.m1.1.1.1.4.2" xref="S2.E6.m1.1.1.1.4.2.cmml"><mi id="S2.E6.m1.1.1.1.4.2.2" xref="S2.E6.m1.1.1.1.4.2.2.cmml">B</mi><mrow id="S2.E6.m1.1.1.1.4.2.3" xref="S2.E6.m1.1.1.1.4.2.3.cmml"><mo id="S2.E6.m1.1.1.1.4.2.3a" xref="S2.E6.m1.1.1.1.4.2.3.cmml">−</mo><mn id="S2.E6.m1.1.1.1.4.2.3.2" xref="S2.E6.m1.1.1.1.4.2.3.2.cmml">1</mn></mrow></msup><mo id="S2.E6.m1.1.1.1.4.1" xref="S2.E6.m1.1.1.1.4.1.cmml"></mo><mrow id="S2.E6.m1.1.1.1.4.3.2" xref="S2.E6.m1.1.1.1.4.cmml"><mo id="S2.E6.m1.1.1.1.4.3.2.1" stretchy="false" xref="S2.E6.m1.1.1.1.4.cmml">(</mo><mi id="S2.E6.m1.1.1.1.1" xref="S2.E6.m1.1.1.1.1.cmml">l</mi><mo id="S2.E6.m1.1.1.1.4.3.2.2" stretchy="false" xref="S2.E6.m1.1.1.1.4.cmml">)</mo></mrow></mrow></mrow></munder><mrow id="S2.E6.m1.5.5.1.1" 
xref="S2.E6.m1.5.5.1.1.cmml"><mrow id="S2.E6.m1.5.5.1.1.3" xref="S2.E6.m1.5.5.1.1.3.cmml"><mrow id="S2.E6.m1.5.5.1.1.3.2" xref="S2.E6.m1.5.5.1.1.3.2.cmml"><mi id="S2.E6.m1.5.5.1.1.3.2.2" xref="S2.E6.m1.5.5.1.1.3.2.2.cmml">r</mi><mo id="S2.E6.m1.5.5.1.1.3.2.1" xref="S2.E6.m1.5.5.1.1.3.2.1.cmml"></mo><mrow id="S2.E6.m1.5.5.1.1.3.2.3.2" xref="S2.E6.m1.5.5.1.1.3.2.cmml"><mo id="S2.E6.m1.5.5.1.1.3.2.3.2.1" stretchy="false" xref="S2.E6.m1.5.5.1.1.3.2.cmml">(</mo><mi id="S2.E6.m1.4.4" xref="S2.E6.m1.4.4.cmml">π</mi><mo id="S2.E6.m1.5.5.1.1.3.2.3.2.2" rspace="0.055em" stretchy="false" xref="S2.E6.m1.5.5.1.1.3.2.cmml">)</mo></mrow></mrow><mo id="S2.E6.m1.5.5.1.1.3.1" rspace="0.222em" xref="S2.E6.m1.5.5.1.1.3.1.cmml">⋅</mo><mi id="S2.E6.m1.5.5.1.1.3.3" xref="S2.E6.m1.5.5.1.1.3.3.cmml">p</mi></mrow><mo id="S2.E6.m1.5.5.1.1.2" xref="S2.E6.m1.5.5.1.1.2.cmml"></mo><mrow id="S2.E6.m1.5.5.1.1.1.1" xref="S2.E6.m1.5.5.1.1.1.1.1.cmml"><mo id="S2.E6.m1.5.5.1.1.1.1.2" stretchy="false" xref="S2.E6.m1.5.5.1.1.1.1.1.cmml">(</mo><mrow id="S2.E6.m1.5.5.1.1.1.1.1" xref="S2.E6.m1.5.5.1.1.1.1.1.cmml"><mi id="S2.E6.m1.5.5.1.1.1.1.1.2" xref="S2.E6.m1.5.5.1.1.1.1.1.2.cmml">π</mi><mo fence="false" id="S2.E6.m1.5.5.1.1.1.1.1.1" xref="S2.E6.m1.5.5.1.1.1.1.1.1.cmml">|</mo><mi id="S2.E6.m1.5.5.1.1.1.1.1.3" xref="S2.E6.m1.5.5.1.1.1.1.1.3.cmml">x</mi></mrow><mo id="S2.E6.m1.5.5.1.1.1.1.3" stretchy="false" xref="S2.E6.m1.5.5.1.1.1.1.1.cmml">)</mo></mrow></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.E6.m1.5b"><apply id="S2.E6.m1.5.5.cmml" xref="S2.E6.m1.5.5"><eq id="S2.E6.m1.5.5.2.cmml" xref="S2.E6.m1.5.5.2"></eq><apply id="S2.E6.m1.5.5.3.cmml" xref="S2.E6.m1.5.5.3"><times id="S2.E6.m1.5.5.3.1.cmml" xref="S2.E6.m1.5.5.3.1"></times><apply id="S2.E6.m1.5.5.3.2.cmml" xref="S2.E6.m1.5.5.3.2"><csymbol cd="ambiguous" id="S2.E6.m1.5.5.3.2.1.cmml" xref="S2.E6.m1.5.5.3.2">subscript</csymbol><ci id="S2.E6.m1.5.5.3.2.2.cmml" xref="S2.E6.m1.5.5.3.2.2">𝒥</ci><apply id="S2.E6.m1.5.5.3.2.3.cmml" 
xref="S2.E6.m1.5.5.3.2.3"><times id="S2.E6.m1.5.5.3.2.3.1.cmml" xref="S2.E6.m1.5.5.3.2.3.1"></times><ci id="S2.E6.m1.5.5.3.2.3.2.cmml" xref="S2.E6.m1.5.5.3.2.3.2">𝑏</ci><ci id="S2.E6.m1.5.5.3.2.3.3.cmml" xref="S2.E6.m1.5.5.3.2.3.3">𝑟</ci></apply></apply><interval closure="open" id="S2.E6.m1.5.5.3.3.1.cmml" xref="S2.E6.m1.5.5.3.3.2"><ci id="S2.E6.m1.2.2.cmml" xref="S2.E6.m1.2.2">𝑙</ci><ci id="S2.E6.m1.3.3.cmml" xref="S2.E6.m1.3.3">𝑥</ci></interval></apply><apply id="S2.E6.m1.5.5.1.cmml" xref="S2.E6.m1.5.5.1"><apply id="S2.E6.m1.5.5.1.2.cmml" xref="S2.E6.m1.5.5.1.2"><csymbol cd="ambiguous" id="S2.E6.m1.5.5.1.2.1.cmml" xref="S2.E6.m1.5.5.1.2">subscript</csymbol><sum id="S2.E6.m1.5.5.1.2.2.cmml" xref="S2.E6.m1.5.5.1.2.2"></sum><apply id="S2.E6.m1.1.1.1.cmml" xref="S2.E6.m1.1.1.1"><in id="S2.E6.m1.1.1.1.2.cmml" xref="S2.E6.m1.1.1.1.2"></in><ci id="S2.E6.m1.1.1.1.3.cmml" xref="S2.E6.m1.1.1.1.3">𝜋</ci><apply id="S2.E6.m1.1.1.1.4.cmml" xref="S2.E6.m1.1.1.1.4"><times id="S2.E6.m1.1.1.1.4.1.cmml" xref="S2.E6.m1.1.1.1.4.1"></times><apply id="S2.E6.m1.1.1.1.4.2.cmml" xref="S2.E6.m1.1.1.1.4.2"><csymbol cd="ambiguous" id="S2.E6.m1.1.1.1.4.2.1.cmml" xref="S2.E6.m1.1.1.1.4.2">superscript</csymbol><ci id="S2.E6.m1.1.1.1.4.2.2.cmml" xref="S2.E6.m1.1.1.1.4.2.2">𝐵</ci><apply id="S2.E6.m1.1.1.1.4.2.3.cmml" xref="S2.E6.m1.1.1.1.4.2.3"><minus id="S2.E6.m1.1.1.1.4.2.3.1.cmml" xref="S2.E6.m1.1.1.1.4.2.3"></minus><cn id="S2.E6.m1.1.1.1.4.2.3.2.cmml" type="integer" xref="S2.E6.m1.1.1.1.4.2.3.2">1</cn></apply></apply><ci id="S2.E6.m1.1.1.1.1.cmml" xref="S2.E6.m1.1.1.1.1">𝑙</ci></apply></apply></apply><apply id="S2.E6.m1.5.5.1.1.cmml" xref="S2.E6.m1.5.5.1.1"><times id="S2.E6.m1.5.5.1.1.2.cmml" xref="S2.E6.m1.5.5.1.1.2"></times><apply id="S2.E6.m1.5.5.1.1.3.cmml" xref="S2.E6.m1.5.5.1.1.3"><ci id="S2.E6.m1.5.5.1.1.3.1.cmml" xref="S2.E6.m1.5.5.1.1.3.1">⋅</ci><apply id="S2.E6.m1.5.5.1.1.3.2.cmml" xref="S2.E6.m1.5.5.1.1.3.2"><times id="S2.E6.m1.5.5.1.1.3.2.1.cmml" 
xref="S2.E6.m1.5.5.1.1.3.2.1"></times><ci id="S2.E6.m1.5.5.1.1.3.2.2.cmml" xref="S2.E6.m1.5.5.1.1.3.2.2">𝑟</ci><ci id="S2.E6.m1.4.4.cmml" xref="S2.E6.m1.4.4">𝜋</ci></apply><ci id="S2.E6.m1.5.5.1.1.3.3.cmml" xref="S2.E6.m1.5.5.1.1.3.3">𝑝</ci></apply><apply id="S2.E6.m1.5.5.1.1.1.1.1.cmml" xref="S2.E6.m1.5.5.1.1.1.1"><csymbol cd="latexml" id="S2.E6.m1.5.5.1.1.1.1.1.1.cmml" xref="S2.E6.m1.5.5.1.1.1.1.1.1">conditional</csymbol><ci id="S2.E6.m1.5.5.1.1.1.1.1.2.cmml" xref="S2.E6.m1.5.5.1.1.1.1.1.2">𝜋</ci><ci id="S2.E6.m1.5.5.1.1.1.1.1.3.cmml" xref="S2.E6.m1.5.5.1.1.1.1.1.3">𝑥</ci></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E6.m1.5c">\mathcal{J}_{br}(l,x)=\sum_{\pi\in B^{-1}(l)}r(\pi)\cdot p(\pi|x)</annotation><annotation encoding="application/x-llamapun" id="S2.E6.m1.5d">caligraphic_J start_POSTSUBSCRIPT italic_b italic_r end_POSTSUBSCRIPT ( italic_l , italic_x ) = ∑ start_POSTSUBSCRIPT italic_π ∈ italic_B start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT ( italic_l ) end_POSTSUBSCRIPT italic_r ( italic_π ) ⋅ italic_p ( italic_π | italic_x )</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(6)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S2.SS2.p2.4">Because paths that share the property of interest can be assigned the same risk value, we group paths by the ending point (frame) of a given non-blank token and use group-wise risk functions to control the frames at which specific speakers are encoded. 
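</p> <p class="ltx_p">As an illustration only, the following brute-force sketch evaluates Eq. (6) on a toy example and groups the valid paths by the ending frame of one non-blank token. It is not the authors' implementation: the posteriors y, the label [1, 2], the helper names, and the risk function r_g are all assumptions chosen for this sketch, and full enumeration is feasible only for very small inputs.</p>

```python
import itertools
import math

BLANK = 0  # index of the CTC blank symbol (an assumption of this sketch)


def collapse(path):
    """CTC mapping B: merge repeated symbols, then remove blanks."""
    merged = [symbol for symbol, _ in itertools.groupby(path)]
    return [symbol for symbol in merged if symbol != BLANK]


def path_prob(path, y):
    """p(pi|x): product of the per-frame posteriors y[t][pi_t]."""
    return math.prod(y[t][symbol] for t, symbol in enumerate(path))


def end_frame(path, u):
    """1-based frame at which the u-th (1-based) non-blank token ends."""
    count, prev, end = 0, BLANK, None
    for t, symbol in enumerate(path, start=1):
        if symbol != BLANK and symbol != prev:
            count += 1  # a new non-blank token starts at this frame
        if symbol != BLANK and count == u:
            end = t  # the u-th token is still being emitted
        prev = symbol
    return end


def valid_paths(label, y):
    """Enumerate B^{-1}(label) by brute force (toy sizes only)."""
    num_frames, vocab_size = len(y), len(y[0])
    for path in itertools.product(range(vocab_size), repeat=num_frames):
        if collapse(path) == list(label):
            yield path


def bayes_risk_objective(label, y, risk):
    """Eq. (6): sum of r(pi) * p(pi|x) over all valid paths."""
    return sum(risk(p) * path_prob(p, y) for p in valid_paths(label, y))


def grouped_mass(label, y, u):
    """Posterior mass of the valid paths, grouped by end frame of token u."""
    mass = {t: 0.0 for t in range(1, len(y) + 1)}
    for path in valid_paths(label, y):
        mass[end_frame(path, u)] += path_prob(path, y)
    return mass


# Toy posteriors: 4 frames, vocabulary {blank, token 1, token 2}.
y = [[0.2, 0.7, 0.1], [0.5, 0.3, 0.2], [0.3, 0.2, 0.5], [0.4, 0.1, 0.5]]
label = [1, 2]


def r_g(t):
    """Hypothetical group risk preferring token 1 to end at frame 2."""
    return math.exp(-abs(t - 2))


ctc_likelihood = bayes_risk_objective(label, y, lambda path: 1.0)
mass = grouped_mass(label, y, u=1)
grouped = sum(r_g(t) * m for t, m in mass.items())
pathwise = bayes_risk_objective(label, y, lambda path: r_g(end_frame(path, 1)))
print(ctc_likelihood, grouped, pathwise)
```

<p class="ltx_p">In this sketch, a risk that is constant at 1 recovers the ordinary CTC likelihood, and because every valid path has exactly one ending frame for the chosen token, the group masses sum to that likelihood. Weighting each group by r_g(t) therefore gives the same value as weighting every path individually.</p> <p class="ltx_p">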
Given a constant non-blank token <math alttext="l_{u}=l^{\prime}_{2u}" class="ltx_Math" display="inline" id="S2.SS2.p2.2.m1.1"><semantics id="S2.SS2.p2.2.m1.1a"><mrow id="S2.SS2.p2.2.m1.1.1" xref="S2.SS2.p2.2.m1.1.1.cmml"><msub id="S2.SS2.p2.2.m1.1.1.2" xref="S2.SS2.p2.2.m1.1.1.2.cmml"><mi id="S2.SS2.p2.2.m1.1.1.2.2" xref="S2.SS2.p2.2.m1.1.1.2.2.cmml">l</mi><mi id="S2.SS2.p2.2.m1.1.1.2.3" xref="S2.SS2.p2.2.m1.1.1.2.3.cmml">u</mi></msub><mo id="S2.SS2.p2.2.m1.1.1.1" xref="S2.SS2.p2.2.m1.1.1.1.cmml">=</mo><msubsup id="S2.SS2.p2.2.m1.1.1.3" xref="S2.SS2.p2.2.m1.1.1.3.cmml"><mi id="S2.SS2.p2.2.m1.1.1.3.2.2" xref="S2.SS2.p2.2.m1.1.1.3.2.2.cmml">l</mi><mrow id="S2.SS2.p2.2.m1.1.1.3.3" xref="S2.SS2.p2.2.m1.1.1.3.3.cmml"><mn id="S2.SS2.p2.2.m1.1.1.3.3.2" xref="S2.SS2.p2.2.m1.1.1.3.3.2.cmml">2</mn><mo id="S2.SS2.p2.2.m1.1.1.3.3.1" xref="S2.SS2.p2.2.m1.1.1.3.3.1.cmml"></mo><mi id="S2.SS2.p2.2.m1.1.1.3.3.3" xref="S2.SS2.p2.2.m1.1.1.3.3.3.cmml">u</mi></mrow><mo id="S2.SS2.p2.2.m1.1.1.3.2.3" xref="S2.SS2.p2.2.m1.1.1.3.2.3.cmml">′</mo></msubsup></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p2.2.m1.1b"><apply id="S2.SS2.p2.2.m1.1.1.cmml" xref="S2.SS2.p2.2.m1.1.1"><eq id="S2.SS2.p2.2.m1.1.1.1.cmml" xref="S2.SS2.p2.2.m1.1.1.1"></eq><apply id="S2.SS2.p2.2.m1.1.1.2.cmml" xref="S2.SS2.p2.2.m1.1.1.2"><csymbol cd="ambiguous" id="S2.SS2.p2.2.m1.1.1.2.1.cmml" xref="S2.SS2.p2.2.m1.1.1.2">subscript</csymbol><ci id="S2.SS2.p2.2.m1.1.1.2.2.cmml" xref="S2.SS2.p2.2.m1.1.1.2.2">𝑙</ci><ci id="S2.SS2.p2.2.m1.1.1.2.3.cmml" xref="S2.SS2.p2.2.m1.1.1.2.3">𝑢</ci></apply><apply id="S2.SS2.p2.2.m1.1.1.3.cmml" xref="S2.SS2.p2.2.m1.1.1.3"><csymbol cd="ambiguous" id="S2.SS2.p2.2.m1.1.1.3.1.cmml" xref="S2.SS2.p2.2.m1.1.1.3">subscript</csymbol><apply id="S2.SS2.p2.2.m1.1.1.3.2.cmml" xref="S2.SS2.p2.2.m1.1.1.3"><csymbol cd="ambiguous" id="S2.SS2.p2.2.m1.1.1.3.2.1.cmml" xref="S2.SS2.p2.2.m1.1.1.3">superscript</csymbol><ci id="S2.SS2.p2.2.m1.1.1.3.2.2.cmml" xref="S2.SS2.p2.2.m1.1.1.3.2.2">𝑙</ci><ci 
id="S2.SS2.p2.2.m1.1.1.3.2.3.cmml" xref="S2.SS2.p2.2.m1.1.1.3.2.3">′</ci></apply><apply id="S2.SS2.p2.2.m1.1.1.3.3.cmml" xref="S2.SS2.p2.2.m1.1.1.3.3"><times id="S2.SS2.p2.2.m1.1.1.3.3.1.cmml" xref="S2.SS2.p2.2.m1.1.1.3.3.1"></times><cn id="S2.SS2.p2.2.m1.1.1.3.3.2.cmml" type="integer" xref="S2.SS2.p2.2.m1.1.1.3.3.2">2</cn><ci id="S2.SS2.p2.2.m1.1.1.3.3.3.cmml" xref="S2.SS2.p2.2.m1.1.1.3.3.3">𝑢</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p2.2.m1.1c">l_{u}=l^{\prime}_{2u}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p2.2.m1.1d">italic_l start_POSTSUBSCRIPT italic_u end_POSTSUBSCRIPT = italic_l start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 2 italic_u end_POSTSUBSCRIPT</annotation></semantics></math>, the ending point of <math alttext="l_{u}" class="ltx_Math" display="inline" id="S2.SS2.p2.3.m2.1"><semantics id="S2.SS2.p2.3.m2.1a"><msub id="S2.SS2.p2.3.m2.1.1" xref="S2.SS2.p2.3.m2.1.1.cmml"><mi id="S2.SS2.p2.3.m2.1.1.2" xref="S2.SS2.p2.3.m2.1.1.2.cmml">l</mi><mi id="S2.SS2.p2.3.m2.1.1.3" xref="S2.SS2.p2.3.m2.1.1.3.cmml">u</mi></msub><annotation-xml encoding="MathML-Content" id="S2.SS2.p2.3.m2.1b"><apply id="S2.SS2.p2.3.m2.1.1.cmml" xref="S2.SS2.p2.3.m2.1.1"><csymbol cd="ambiguous" id="S2.SS2.p2.3.m2.1.1.1.cmml" xref="S2.SS2.p2.3.m2.1.1">subscript</csymbol><ci id="S2.SS2.p2.3.m2.1.1.2.cmml" xref="S2.SS2.p2.3.m2.1.1.2">𝑙</ci><ci id="S2.SS2.p2.3.m2.1.1.3.cmml" xref="S2.SS2.p2.3.m2.1.1.3">𝑢</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p2.3.m2.1c">l_{u}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p2.3.m2.1d">italic_l start_POSTSUBSCRIPT italic_u end_POSTSUBSCRIPT</annotation></semantics></math> is exclusive over time frames <math alttext="t" class="ltx_Math" display="inline" id="S2.SS2.p2.4.m3.1"><semantics id="S2.SS2.p2.4.m3.1a"><mi id="S2.SS2.p2.4.m3.1.1" xref="S2.SS2.p2.4.m3.1.1.cmml">t</mi><annotation-xml 
encoding="MathML-Content" id="S2.SS2.p2.4.m3.1b"><ci id="S2.SS2.p2.4.m3.1.1.cmml" xref="S2.SS2.p2.4.m3.1.1">𝑡</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p2.4.m3.1c">t</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p2.4.m3.1d">italic_t</annotation></semantics></math>, thus CTC posterior can be alternatively calculated by enumerating all possible frames, and the training objective can be reformulated as:</p> <table class="ltx_equation ltx_eqn_table" id="S2.E7"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="\mathcal{J}_{brctc}(l,x)=\sum^{T}_{t=1}r_{g}(t)\cdot\frac{\alpha(t,2u)\cdot% \hat{\beta}(t,2u)}{y^{t}_{\pi_{t}}}" class="ltx_Math" display="block" id="S2.E7.m1.7"><semantics id="S2.E7.m1.7a"><mrow id="S2.E7.m1.7.8" xref="S2.E7.m1.7.8.cmml"><mrow id="S2.E7.m1.7.8.2" xref="S2.E7.m1.7.8.2.cmml"><msub id="S2.E7.m1.7.8.2.2" xref="S2.E7.m1.7.8.2.2.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.E7.m1.7.8.2.2.2" xref="S2.E7.m1.7.8.2.2.2.cmml">𝒥</mi><mrow id="S2.E7.m1.7.8.2.2.3" xref="S2.E7.m1.7.8.2.2.3.cmml"><mi id="S2.E7.m1.7.8.2.2.3.2" xref="S2.E7.m1.7.8.2.2.3.2.cmml">b</mi><mo id="S2.E7.m1.7.8.2.2.3.1" xref="S2.E7.m1.7.8.2.2.3.1.cmml"></mo><mi id="S2.E7.m1.7.8.2.2.3.3" xref="S2.E7.m1.7.8.2.2.3.3.cmml">r</mi><mo id="S2.E7.m1.7.8.2.2.3.1a" xref="S2.E7.m1.7.8.2.2.3.1.cmml"></mo><mi id="S2.E7.m1.7.8.2.2.3.4" xref="S2.E7.m1.7.8.2.2.3.4.cmml">c</mi><mo id="S2.E7.m1.7.8.2.2.3.1b" xref="S2.E7.m1.7.8.2.2.3.1.cmml"></mo><mi id="S2.E7.m1.7.8.2.2.3.5" xref="S2.E7.m1.7.8.2.2.3.5.cmml">t</mi><mo id="S2.E7.m1.7.8.2.2.3.1c" xref="S2.E7.m1.7.8.2.2.3.1.cmml"></mo><mi id="S2.E7.m1.7.8.2.2.3.6" xref="S2.E7.m1.7.8.2.2.3.6.cmml">c</mi></mrow></msub><mo id="S2.E7.m1.7.8.2.1" xref="S2.E7.m1.7.8.2.1.cmml"></mo><mrow id="S2.E7.m1.7.8.2.3.2" xref="S2.E7.m1.7.8.2.3.1.cmml"><mo id="S2.E7.m1.7.8.2.3.2.1" 
stretchy="false" xref="S2.E7.m1.7.8.2.3.1.cmml">(</mo><mi id="S2.E7.m1.5.5" xref="S2.E7.m1.5.5.cmml">l</mi><mo id="S2.E7.m1.7.8.2.3.2.2" xref="S2.E7.m1.7.8.2.3.1.cmml">,</mo><mi id="S2.E7.m1.6.6" xref="S2.E7.m1.6.6.cmml">x</mi><mo id="S2.E7.m1.7.8.2.3.2.3" stretchy="false" xref="S2.E7.m1.7.8.2.3.1.cmml">)</mo></mrow></mrow><mo id="S2.E7.m1.7.8.1" rspace="0.111em" xref="S2.E7.m1.7.8.1.cmml">=</mo><mrow id="S2.E7.m1.7.8.3" xref="S2.E7.m1.7.8.3.cmml"><munderover id="S2.E7.m1.7.8.3.1" xref="S2.E7.m1.7.8.3.1.cmml"><mo id="S2.E7.m1.7.8.3.1.2.2" movablelimits="false" xref="S2.E7.m1.7.8.3.1.2.2.cmml">∑</mo><mrow id="S2.E7.m1.7.8.3.1.3" xref="S2.E7.m1.7.8.3.1.3.cmml"><mi id="S2.E7.m1.7.8.3.1.3.2" xref="S2.E7.m1.7.8.3.1.3.2.cmml">t</mi><mo id="S2.E7.m1.7.8.3.1.3.1" xref="S2.E7.m1.7.8.3.1.3.1.cmml">=</mo><mn id="S2.E7.m1.7.8.3.1.3.3" xref="S2.E7.m1.7.8.3.1.3.3.cmml">1</mn></mrow><mi id="S2.E7.m1.7.8.3.1.2.3" xref="S2.E7.m1.7.8.3.1.2.3.cmml">T</mi></munderover><mrow id="S2.E7.m1.7.8.3.2" xref="S2.E7.m1.7.8.3.2.cmml"><mrow id="S2.E7.m1.7.8.3.2.2" xref="S2.E7.m1.7.8.3.2.2.cmml"><msub id="S2.E7.m1.7.8.3.2.2.2" xref="S2.E7.m1.7.8.3.2.2.2.cmml"><mi id="S2.E7.m1.7.8.3.2.2.2.2" xref="S2.E7.m1.7.8.3.2.2.2.2.cmml">r</mi><mi id="S2.E7.m1.7.8.3.2.2.2.3" xref="S2.E7.m1.7.8.3.2.2.2.3.cmml">g</mi></msub><mo id="S2.E7.m1.7.8.3.2.2.1" xref="S2.E7.m1.7.8.3.2.2.1.cmml"></mo><mrow id="S2.E7.m1.7.8.3.2.2.3.2" xref="S2.E7.m1.7.8.3.2.2.cmml"><mo id="S2.E7.m1.7.8.3.2.2.3.2.1" stretchy="false" xref="S2.E7.m1.7.8.3.2.2.cmml">(</mo><mi id="S2.E7.m1.7.7" xref="S2.E7.m1.7.7.cmml">t</mi><mo id="S2.E7.m1.7.8.3.2.2.3.2.2" rspace="0.055em" stretchy="false" xref="S2.E7.m1.7.8.3.2.2.cmml">)</mo></mrow></mrow><mo id="S2.E7.m1.7.8.3.2.1" rspace="0.222em" xref="S2.E7.m1.7.8.3.2.1.cmml">⋅</mo><mfrac id="S2.E7.m1.4.4" xref="S2.E7.m1.4.4.cmml"><mrow id="S2.E7.m1.4.4.4" xref="S2.E7.m1.4.4.4.cmml"><mrow id="S2.E7.m1.3.3.3.3" xref="S2.E7.m1.3.3.3.3.cmml"><mrow id="S2.E7.m1.3.3.3.3.1" xref="S2.E7.m1.3.3.3.3.1.cmml"><mi 
id="S2.E7.m1.3.3.3.3.1.3" xref="S2.E7.m1.3.3.3.3.1.3.cmml">α</mi><mo id="S2.E7.m1.3.3.3.3.1.2" xref="S2.E7.m1.3.3.3.3.1.2.cmml"></mo><mrow id="S2.E7.m1.3.3.3.3.1.1.1" xref="S2.E7.m1.3.3.3.3.1.1.2.cmml"><mo id="S2.E7.m1.3.3.3.3.1.1.1.2" stretchy="false" xref="S2.E7.m1.3.3.3.3.1.1.2.cmml">(</mo><mi id="S2.E7.m1.1.1.1.1" xref="S2.E7.m1.1.1.1.1.cmml">t</mi><mo id="S2.E7.m1.3.3.3.3.1.1.1.3" xref="S2.E7.m1.3.3.3.3.1.1.2.cmml">,</mo><mrow id="S2.E7.m1.3.3.3.3.1.1.1.1" xref="S2.E7.m1.3.3.3.3.1.1.1.1.cmml"><mn id="S2.E7.m1.3.3.3.3.1.1.1.1.2" xref="S2.E7.m1.3.3.3.3.1.1.1.1.2.cmml">2</mn><mo id="S2.E7.m1.3.3.3.3.1.1.1.1.1" xref="S2.E7.m1.3.3.3.3.1.1.1.1.1.cmml"></mo><mi id="S2.E7.m1.3.3.3.3.1.1.1.1.3" xref="S2.E7.m1.3.3.3.3.1.1.1.1.3.cmml">u</mi></mrow><mo id="S2.E7.m1.3.3.3.3.1.1.1.4" rspace="0.055em" stretchy="false" xref="S2.E7.m1.3.3.3.3.1.1.2.cmml">)</mo></mrow></mrow><mo id="S2.E7.m1.3.3.3.3.2" rspace="0.222em" xref="S2.E7.m1.3.3.3.3.2.cmml">⋅</mo><mover accent="true" id="S2.E7.m1.3.3.3.3.3" xref="S2.E7.m1.3.3.3.3.3.cmml"><mi id="S2.E7.m1.3.3.3.3.3.2" xref="S2.E7.m1.3.3.3.3.3.2.cmml">β</mi><mo id="S2.E7.m1.3.3.3.3.3.1" xref="S2.E7.m1.3.3.3.3.3.1.cmml">^</mo></mover></mrow><mo id="S2.E7.m1.4.4.4.5" xref="S2.E7.m1.4.4.4.5.cmml"></mo><mrow id="S2.E7.m1.4.4.4.4.1" xref="S2.E7.m1.4.4.4.4.2.cmml"><mo id="S2.E7.m1.4.4.4.4.1.2" stretchy="false" xref="S2.E7.m1.4.4.4.4.2.cmml">(</mo><mi id="S2.E7.m1.2.2.2.2" xref="S2.E7.m1.2.2.2.2.cmml">t</mi><mo id="S2.E7.m1.4.4.4.4.1.3" xref="S2.E7.m1.4.4.4.4.2.cmml">,</mo><mrow id="S2.E7.m1.4.4.4.4.1.1" xref="S2.E7.m1.4.4.4.4.1.1.cmml"><mn id="S2.E7.m1.4.4.4.4.1.1.2" xref="S2.E7.m1.4.4.4.4.1.1.2.cmml">2</mn><mo id="S2.E7.m1.4.4.4.4.1.1.1" xref="S2.E7.m1.4.4.4.4.1.1.1.cmml"></mo><mi id="S2.E7.m1.4.4.4.4.1.1.3" xref="S2.E7.m1.4.4.4.4.1.1.3.cmml">u</mi></mrow><mo id="S2.E7.m1.4.4.4.4.1.4" stretchy="false" xref="S2.E7.m1.4.4.4.4.2.cmml">)</mo></mrow></mrow><msubsup id="S2.E7.m1.4.4.6" xref="S2.E7.m1.4.4.6.cmml"><mi id="S2.E7.m1.4.4.6.2.2" 
xref="S2.E7.m1.4.4.6.2.2.cmml">y</mi><msub id="S2.E7.m1.4.4.6.3" xref="S2.E7.m1.4.4.6.3.cmml"><mi id="S2.E7.m1.4.4.6.3.2" xref="S2.E7.m1.4.4.6.3.2.cmml">π</mi><mi id="S2.E7.m1.4.4.6.3.3" xref="S2.E7.m1.4.4.6.3.3.cmml">t</mi></msub><mi id="S2.E7.m1.4.4.6.2.3" xref="S2.E7.m1.4.4.6.2.3.cmml">t</mi></msubsup></mfrac></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.E7.m1.7b"><apply id="S2.E7.m1.7.8.cmml" xref="S2.E7.m1.7.8"><eq id="S2.E7.m1.7.8.1.cmml" xref="S2.E7.m1.7.8.1"></eq><apply id="S2.E7.m1.7.8.2.cmml" xref="S2.E7.m1.7.8.2"><times id="S2.E7.m1.7.8.2.1.cmml" xref="S2.E7.m1.7.8.2.1"></times><apply id="S2.E7.m1.7.8.2.2.cmml" xref="S2.E7.m1.7.8.2.2"><csymbol cd="ambiguous" id="S2.E7.m1.7.8.2.2.1.cmml" xref="S2.E7.m1.7.8.2.2">subscript</csymbol><ci id="S2.E7.m1.7.8.2.2.2.cmml" xref="S2.E7.m1.7.8.2.2.2">𝒥</ci><apply id="S2.E7.m1.7.8.2.2.3.cmml" xref="S2.E7.m1.7.8.2.2.3"><times id="S2.E7.m1.7.8.2.2.3.1.cmml" xref="S2.E7.m1.7.8.2.2.3.1"></times><ci id="S2.E7.m1.7.8.2.2.3.2.cmml" xref="S2.E7.m1.7.8.2.2.3.2">𝑏</ci><ci id="S2.E7.m1.7.8.2.2.3.3.cmml" xref="S2.E7.m1.7.8.2.2.3.3">𝑟</ci><ci id="S2.E7.m1.7.8.2.2.3.4.cmml" xref="S2.E7.m1.7.8.2.2.3.4">𝑐</ci><ci id="S2.E7.m1.7.8.2.2.3.5.cmml" xref="S2.E7.m1.7.8.2.2.3.5">𝑡</ci><ci id="S2.E7.m1.7.8.2.2.3.6.cmml" xref="S2.E7.m1.7.8.2.2.3.6">𝑐</ci></apply></apply><interval closure="open" id="S2.E7.m1.7.8.2.3.1.cmml" xref="S2.E7.m1.7.8.2.3.2"><ci id="S2.E7.m1.5.5.cmml" xref="S2.E7.m1.5.5">𝑙</ci><ci id="S2.E7.m1.6.6.cmml" xref="S2.E7.m1.6.6">𝑥</ci></interval></apply><apply id="S2.E7.m1.7.8.3.cmml" xref="S2.E7.m1.7.8.3"><apply id="S2.E7.m1.7.8.3.1.cmml" xref="S2.E7.m1.7.8.3.1"><csymbol cd="ambiguous" id="S2.E7.m1.7.8.3.1.1.cmml" xref="S2.E7.m1.7.8.3.1">subscript</csymbol><apply id="S2.E7.m1.7.8.3.1.2.cmml" xref="S2.E7.m1.7.8.3.1"><csymbol cd="ambiguous" id="S2.E7.m1.7.8.3.1.2.1.cmml" xref="S2.E7.m1.7.8.3.1">superscript</csymbol><sum id="S2.E7.m1.7.8.3.1.2.2.cmml" xref="S2.E7.m1.7.8.3.1.2.2"></sum><ci 
id="S2.E7.m1.7.8.3.1.2.3.cmml" xref="S2.E7.m1.7.8.3.1.2.3">𝑇</ci></apply><apply id="S2.E7.m1.7.8.3.1.3.cmml" xref="S2.E7.m1.7.8.3.1.3"><eq id="S2.E7.m1.7.8.3.1.3.1.cmml" xref="S2.E7.m1.7.8.3.1.3.1"></eq><ci id="S2.E7.m1.7.8.3.1.3.2.cmml" xref="S2.E7.m1.7.8.3.1.3.2">𝑡</ci><cn id="S2.E7.m1.7.8.3.1.3.3.cmml" type="integer" xref="S2.E7.m1.7.8.3.1.3.3">1</cn></apply></apply><apply id="S2.E7.m1.7.8.3.2.cmml" xref="S2.E7.m1.7.8.3.2"><ci id="S2.E7.m1.7.8.3.2.1.cmml" xref="S2.E7.m1.7.8.3.2.1">⋅</ci><apply id="S2.E7.m1.7.8.3.2.2.cmml" xref="S2.E7.m1.7.8.3.2.2"><times id="S2.E7.m1.7.8.3.2.2.1.cmml" xref="S2.E7.m1.7.8.3.2.2.1"></times><apply id="S2.E7.m1.7.8.3.2.2.2.cmml" xref="S2.E7.m1.7.8.3.2.2.2"><csymbol cd="ambiguous" id="S2.E7.m1.7.8.3.2.2.2.1.cmml" xref="S2.E7.m1.7.8.3.2.2.2">subscript</csymbol><ci id="S2.E7.m1.7.8.3.2.2.2.2.cmml" xref="S2.E7.m1.7.8.3.2.2.2.2">𝑟</ci><ci id="S2.E7.m1.7.8.3.2.2.2.3.cmml" xref="S2.E7.m1.7.8.3.2.2.2.3">𝑔</ci></apply><ci id="S2.E7.m1.7.7.cmml" xref="S2.E7.m1.7.7">𝑡</ci></apply><apply id="S2.E7.m1.4.4.cmml" xref="S2.E7.m1.4.4"><divide id="S2.E7.m1.4.4.5.cmml" xref="S2.E7.m1.4.4"></divide><apply id="S2.E7.m1.4.4.4.cmml" xref="S2.E7.m1.4.4.4"><times id="S2.E7.m1.4.4.4.5.cmml" xref="S2.E7.m1.4.4.4.5"></times><apply id="S2.E7.m1.3.3.3.3.cmml" xref="S2.E7.m1.3.3.3.3"><ci id="S2.E7.m1.3.3.3.3.2.cmml" xref="S2.E7.m1.3.3.3.3.2">⋅</ci><apply id="S2.E7.m1.3.3.3.3.1.cmml" xref="S2.E7.m1.3.3.3.3.1"><times id="S2.E7.m1.3.3.3.3.1.2.cmml" xref="S2.E7.m1.3.3.3.3.1.2"></times><ci id="S2.E7.m1.3.3.3.3.1.3.cmml" xref="S2.E7.m1.3.3.3.3.1.3">𝛼</ci><interval closure="open" id="S2.E7.m1.3.3.3.3.1.1.2.cmml" xref="S2.E7.m1.3.3.3.3.1.1.1"><ci id="S2.E7.m1.1.1.1.1.cmml" xref="S2.E7.m1.1.1.1.1">𝑡</ci><apply id="S2.E7.m1.3.3.3.3.1.1.1.1.cmml" xref="S2.E7.m1.3.3.3.3.1.1.1.1"><times id="S2.E7.m1.3.3.3.3.1.1.1.1.1.cmml" xref="S2.E7.m1.3.3.3.3.1.1.1.1.1"></times><cn id="S2.E7.m1.3.3.3.3.1.1.1.1.2.cmml" type="integer" xref="S2.E7.m1.3.3.3.3.1.1.1.1.2">2</cn><ci 
id="S2.E7.m1.3.3.3.3.1.1.1.1.3.cmml" xref="S2.E7.m1.3.3.3.3.1.1.1.1.3">𝑢</ci></apply></interval></apply><apply id="S2.E7.m1.3.3.3.3.3.cmml" xref="S2.E7.m1.3.3.3.3.3"><ci id="S2.E7.m1.3.3.3.3.3.1.cmml" xref="S2.E7.m1.3.3.3.3.3.1">^</ci><ci id="S2.E7.m1.3.3.3.3.3.2.cmml" xref="S2.E7.m1.3.3.3.3.3.2">𝛽</ci></apply></apply><interval closure="open" id="S2.E7.m1.4.4.4.4.2.cmml" xref="S2.E7.m1.4.4.4.4.1"><ci id="S2.E7.m1.2.2.2.2.cmml" xref="S2.E7.m1.2.2.2.2">𝑡</ci><apply id="S2.E7.m1.4.4.4.4.1.1.cmml" xref="S2.E7.m1.4.4.4.4.1.1"><times id="S2.E7.m1.4.4.4.4.1.1.1.cmml" xref="S2.E7.m1.4.4.4.4.1.1.1"></times><cn id="S2.E7.m1.4.4.4.4.1.1.2.cmml" type="integer" xref="S2.E7.m1.4.4.4.4.1.1.2">2</cn><ci id="S2.E7.m1.4.4.4.4.1.1.3.cmml" xref="S2.E7.m1.4.4.4.4.1.1.3">𝑢</ci></apply></interval></apply><apply id="S2.E7.m1.4.4.6.cmml" xref="S2.E7.m1.4.4.6"><csymbol cd="ambiguous" id="S2.E7.m1.4.4.6.1.cmml" xref="S2.E7.m1.4.4.6">subscript</csymbol><apply id="S2.E7.m1.4.4.6.2.cmml" xref="S2.E7.m1.4.4.6"><csymbol cd="ambiguous" id="S2.E7.m1.4.4.6.2.1.cmml" xref="S2.E7.m1.4.4.6">superscript</csymbol><ci id="S2.E7.m1.4.4.6.2.2.cmml" xref="S2.E7.m1.4.4.6.2.2">𝑦</ci><ci id="S2.E7.m1.4.4.6.2.3.cmml" xref="S2.E7.m1.4.4.6.2.3">𝑡</ci></apply><apply id="S2.E7.m1.4.4.6.3.cmml" xref="S2.E7.m1.4.4.6.3"><csymbol cd="ambiguous" id="S2.E7.m1.4.4.6.3.1.cmml" xref="S2.E7.m1.4.4.6.3">subscript</csymbol><ci id="S2.E7.m1.4.4.6.3.2.cmml" xref="S2.E7.m1.4.4.6.3.2">𝜋</ci><ci id="S2.E7.m1.4.4.6.3.3.cmml" xref="S2.E7.m1.4.4.6.3.3">𝑡</ci></apply></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E7.m1.7c">\mathcal{J}_{brctc}(l,x)=\sum^{T}_{t=1}r_{g}(t)\cdot\frac{\alpha(t,2u)\cdot% \hat{\beta}(t,2u)}{y^{t}_{\pi_{t}}}</annotation><annotation encoding="application/x-llamapun" id="S2.E7.m1.7d">caligraphic_J start_POSTSUBSCRIPT italic_b italic_r italic_c italic_t italic_c end_POSTSUBSCRIPT ( italic_l , italic_x ) = ∑ start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT 
start_POSTSUBSCRIPT italic_t = 1 end_POSTSUBSCRIPT italic_r start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT ( italic_t ) ⋅ divide start_ARG italic_α ( italic_t , 2 italic_u ) ⋅ over^ start_ARG italic_β end_ARG ( italic_t , 2 italic_u ) end_ARG start_ARG italic_y start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_π start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_ARG</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(7)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S2.SS2.p2.12">in which <math alttext="\hat{\beta}(t,2u)" class="ltx_Math" display="inline" id="S2.SS2.p2.5.m1.2"><semantics id="S2.SS2.p2.5.m1.2a"><mrow id="S2.SS2.p2.5.m1.2.2" xref="S2.SS2.p2.5.m1.2.2.cmml"><mover accent="true" id="S2.SS2.p2.5.m1.2.2.3" xref="S2.SS2.p2.5.m1.2.2.3.cmml"><mi id="S2.SS2.p2.5.m1.2.2.3.2" xref="S2.SS2.p2.5.m1.2.2.3.2.cmml">β</mi><mo id="S2.SS2.p2.5.m1.2.2.3.1" xref="S2.SS2.p2.5.m1.2.2.3.1.cmml">^</mo></mover><mo id="S2.SS2.p2.5.m1.2.2.2" xref="S2.SS2.p2.5.m1.2.2.2.cmml"></mo><mrow id="S2.SS2.p2.5.m1.2.2.1.1" xref="S2.SS2.p2.5.m1.2.2.1.2.cmml"><mo id="S2.SS2.p2.5.m1.2.2.1.1.2" stretchy="false" xref="S2.SS2.p2.5.m1.2.2.1.2.cmml">(</mo><mi id="S2.SS2.p2.5.m1.1.1" xref="S2.SS2.p2.5.m1.1.1.cmml">t</mi><mo id="S2.SS2.p2.5.m1.2.2.1.1.3" xref="S2.SS2.p2.5.m1.2.2.1.2.cmml">,</mo><mrow id="S2.SS2.p2.5.m1.2.2.1.1.1" xref="S2.SS2.p2.5.m1.2.2.1.1.1.cmml"><mn id="S2.SS2.p2.5.m1.2.2.1.1.1.2" xref="S2.SS2.p2.5.m1.2.2.1.1.1.2.cmml">2</mn><mo id="S2.SS2.p2.5.m1.2.2.1.1.1.1" xref="S2.SS2.p2.5.m1.2.2.1.1.1.1.cmml"></mo><mi id="S2.SS2.p2.5.m1.2.2.1.1.1.3" xref="S2.SS2.p2.5.m1.2.2.1.1.1.3.cmml">u</mi></mrow><mo id="S2.SS2.p2.5.m1.2.2.1.1.4" stretchy="false" xref="S2.SS2.p2.5.m1.2.2.1.2.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" 
id="S2.SS2.p2.5.m1.2b"><apply id="S2.SS2.p2.5.m1.2.2.cmml" xref="S2.SS2.p2.5.m1.2.2"><times id="S2.SS2.p2.5.m1.2.2.2.cmml" xref="S2.SS2.p2.5.m1.2.2.2"></times><apply id="S2.SS2.p2.5.m1.2.2.3.cmml" xref="S2.SS2.p2.5.m1.2.2.3"><ci id="S2.SS2.p2.5.m1.2.2.3.1.cmml" xref="S2.SS2.p2.5.m1.2.2.3.1">^</ci><ci id="S2.SS2.p2.5.m1.2.2.3.2.cmml" xref="S2.SS2.p2.5.m1.2.2.3.2">𝛽</ci></apply><interval closure="open" id="S2.SS2.p2.5.m1.2.2.1.2.cmml" xref="S2.SS2.p2.5.m1.2.2.1.1"><ci id="S2.SS2.p2.5.m1.1.1.cmml" xref="S2.SS2.p2.5.m1.1.1">𝑡</ci><apply id="S2.SS2.p2.5.m1.2.2.1.1.1.cmml" xref="S2.SS2.p2.5.m1.2.2.1.1.1"><times id="S2.SS2.p2.5.m1.2.2.1.1.1.1.cmml" xref="S2.SS2.p2.5.m1.2.2.1.1.1.1"></times><cn id="S2.SS2.p2.5.m1.2.2.1.1.1.2.cmml" type="integer" xref="S2.SS2.p2.5.m1.2.2.1.1.1.2">2</cn><ci id="S2.SS2.p2.5.m1.2.2.1.1.1.3.cmml" xref="S2.SS2.p2.5.m1.2.2.1.1.1.3">𝑢</ci></apply></interval></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p2.5.m1.2c">\hat{\beta}(t,2u)</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p2.5.m1.2d">over^ start_ARG italic_β end_ARG ( italic_t , 2 italic_u )</annotation></semantics></math> is a revised backward variable, summarizing posteriors of the paths where <math alttext="l_{u}" class="ltx_Math" display="inline" id="S2.SS2.p2.6.m2.1"><semantics id="S2.SS2.p2.6.m2.1a"><msub id="S2.SS2.p2.6.m2.1.1" xref="S2.SS2.p2.6.m2.1.1.cmml"><mi id="S2.SS2.p2.6.m2.1.1.2" xref="S2.SS2.p2.6.m2.1.1.2.cmml">l</mi><mi id="S2.SS2.p2.6.m2.1.1.3" xref="S2.SS2.p2.6.m2.1.1.3.cmml">u</mi></msub><annotation-xml encoding="MathML-Content" id="S2.SS2.p2.6.m2.1b"><apply id="S2.SS2.p2.6.m2.1.1.cmml" xref="S2.SS2.p2.6.m2.1.1"><csymbol cd="ambiguous" id="S2.SS2.p2.6.m2.1.1.1.cmml" xref="S2.SS2.p2.6.m2.1.1">subscript</csymbol><ci id="S2.SS2.p2.6.m2.1.1.2.cmml" xref="S2.SS2.p2.6.m2.1.1.2">𝑙</ci><ci id="S2.SS2.p2.6.m2.1.1.3.cmml" xref="S2.SS2.p2.6.m2.1.1.3">𝑢</ci></apply></annotation-xml><annotation encoding="application/x-tex" 
id="S2.SS2.p2.6.m2.1c">l_{u}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p2.6.m2.1d">italic_l start_POSTSUBSCRIPT italic_u end_POSTSUBSCRIPT</annotation></semantics></math> ends at <math alttext="t" class="ltx_Math" display="inline" id="S2.SS2.p2.7.m3.1"><semantics id="S2.SS2.p2.7.m3.1a"><mi id="S2.SS2.p2.7.m3.1.1" xref="S2.SS2.p2.7.m3.1.1.cmml">t</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p2.7.m3.1b"><ci id="S2.SS2.p2.7.m3.1.1.cmml" xref="S2.SS2.p2.7.m3.1.1">𝑡</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p2.7.m3.1c">t</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p2.7.m3.1d">italic_t</annotation></semantics></math>, i.e., <math alttext="t=argmax_{\tau}" class="ltx_Math" display="inline" id="S2.SS2.p2.8.m4.1"><semantics id="S2.SS2.p2.8.m4.1a"><mrow id="S2.SS2.p2.8.m4.1.1" xref="S2.SS2.p2.8.m4.1.1.cmml"><mi id="S2.SS2.p2.8.m4.1.1.2" xref="S2.SS2.p2.8.m4.1.1.2.cmml">t</mi><mo id="S2.SS2.p2.8.m4.1.1.1" xref="S2.SS2.p2.8.m4.1.1.1.cmml">=</mo><mrow id="S2.SS2.p2.8.m4.1.1.3" xref="S2.SS2.p2.8.m4.1.1.3.cmml"><mi id="S2.SS2.p2.8.m4.1.1.3.2" xref="S2.SS2.p2.8.m4.1.1.3.2.cmml">a</mi><mo id="S2.SS2.p2.8.m4.1.1.3.1" xref="S2.SS2.p2.8.m4.1.1.3.1.cmml"></mo><mi id="S2.SS2.p2.8.m4.1.1.3.3" xref="S2.SS2.p2.8.m4.1.1.3.3.cmml">r</mi><mo id="S2.SS2.p2.8.m4.1.1.3.1a" xref="S2.SS2.p2.8.m4.1.1.3.1.cmml"></mo><mi id="S2.SS2.p2.8.m4.1.1.3.4" xref="S2.SS2.p2.8.m4.1.1.3.4.cmml">g</mi><mo id="S2.SS2.p2.8.m4.1.1.3.1b" xref="S2.SS2.p2.8.m4.1.1.3.1.cmml"></mo><mi id="S2.SS2.p2.8.m4.1.1.3.5" xref="S2.SS2.p2.8.m4.1.1.3.5.cmml">m</mi><mo id="S2.SS2.p2.8.m4.1.1.3.1c" xref="S2.SS2.p2.8.m4.1.1.3.1.cmml"></mo><mi id="S2.SS2.p2.8.m4.1.1.3.6" xref="S2.SS2.p2.8.m4.1.1.3.6.cmml">a</mi><mo id="S2.SS2.p2.8.m4.1.1.3.1d" xref="S2.SS2.p2.8.m4.1.1.3.1.cmml"></mo><msub id="S2.SS2.p2.8.m4.1.1.3.7" xref="S2.SS2.p2.8.m4.1.1.3.7.cmml"><mi id="S2.SS2.p2.8.m4.1.1.3.7.2" xref="S2.SS2.p2.8.m4.1.1.3.7.2.cmml">x</mi><mi 
id="S2.SS2.p2.8.m4.1.1.3.7.3" xref="S2.SS2.p2.8.m4.1.1.3.7.3.cmml">τ</mi></msub></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p2.8.m4.1b"><apply id="S2.SS2.p2.8.m4.1.1.cmml" xref="S2.SS2.p2.8.m4.1.1"><eq id="S2.SS2.p2.8.m4.1.1.1.cmml" xref="S2.SS2.p2.8.m4.1.1.1"></eq><ci id="S2.SS2.p2.8.m4.1.1.2.cmml" xref="S2.SS2.p2.8.m4.1.1.2">𝑡</ci><apply id="S2.SS2.p2.8.m4.1.1.3.cmml" xref="S2.SS2.p2.8.m4.1.1.3"><times id="S2.SS2.p2.8.m4.1.1.3.1.cmml" xref="S2.SS2.p2.8.m4.1.1.3.1"></times><ci id="S2.SS2.p2.8.m4.1.1.3.2.cmml" xref="S2.SS2.p2.8.m4.1.1.3.2">𝑎</ci><ci id="S2.SS2.p2.8.m4.1.1.3.3.cmml" xref="S2.SS2.p2.8.m4.1.1.3.3">𝑟</ci><ci id="S2.SS2.p2.8.m4.1.1.3.4.cmml" xref="S2.SS2.p2.8.m4.1.1.3.4">𝑔</ci><ci id="S2.SS2.p2.8.m4.1.1.3.5.cmml" xref="S2.SS2.p2.8.m4.1.1.3.5">𝑚</ci><ci id="S2.SS2.p2.8.m4.1.1.3.6.cmml" xref="S2.SS2.p2.8.m4.1.1.3.6">𝑎</ci><apply id="S2.SS2.p2.8.m4.1.1.3.7.cmml" xref="S2.SS2.p2.8.m4.1.1.3.7"><csymbol cd="ambiguous" id="S2.SS2.p2.8.m4.1.1.3.7.1.cmml" xref="S2.SS2.p2.8.m4.1.1.3.7">subscript</csymbol><ci id="S2.SS2.p2.8.m4.1.1.3.7.2.cmml" xref="S2.SS2.p2.8.m4.1.1.3.7.2">𝑥</ci><ci id="S2.SS2.p2.8.m4.1.1.3.7.3.cmml" xref="S2.SS2.p2.8.m4.1.1.3.7.3">𝜏</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p2.8.m4.1c">t=argmax_{\tau}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p2.8.m4.1d">italic_t = italic_a italic_r italic_g italic_m italic_a italic_x start_POSTSUBSCRIPT italic_τ end_POSTSUBSCRIPT</annotation></semantics></math> for <math alttext="1\,{\leq}\,\tau\,{\leq}\,T" class="ltx_Math" display="inline" id="S2.SS2.p2.9.m5.1"><semantics id="S2.SS2.p2.9.m5.1a"><mrow id="S2.SS2.p2.9.m5.1.1" xref="S2.SS2.p2.9.m5.1.1.cmml"><mn id="S2.SS2.p2.9.m5.1.1.2" xref="S2.SS2.p2.9.m5.1.1.2.cmml">1</mn><mo id="S2.SS2.p2.9.m5.1.1.3" lspace="0.448em" rspace="0.448em" xref="S2.SS2.p2.9.m5.1.1.3.cmml">≤</mo><mi id="S2.SS2.p2.9.m5.1.1.4" xref="S2.SS2.p2.9.m5.1.1.4.cmml">τ</mi><mo 
id="S2.SS2.p2.9.m5.1.1.5" lspace="0.448em" rspace="0.448em" xref="S2.SS2.p2.9.m5.1.1.5.cmml">≤</mo><mi id="S2.SS2.p2.9.m5.1.1.6" xref="S2.SS2.p2.9.m5.1.1.6.cmml">T</mi></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p2.9.m5.1b"><apply id="S2.SS2.p2.9.m5.1.1.cmml" xref="S2.SS2.p2.9.m5.1.1"><and id="S2.SS2.p2.9.m5.1.1a.cmml" xref="S2.SS2.p2.9.m5.1.1"></and><apply id="S2.SS2.p2.9.m5.1.1b.cmml" xref="S2.SS2.p2.9.m5.1.1"><leq id="S2.SS2.p2.9.m5.1.1.3.cmml" xref="S2.SS2.p2.9.m5.1.1.3"></leq><cn id="S2.SS2.p2.9.m5.1.1.2.cmml" type="integer" xref="S2.SS2.p2.9.m5.1.1.2">1</cn><ci id="S2.SS2.p2.9.m5.1.1.4.cmml" xref="S2.SS2.p2.9.m5.1.1.4">𝜏</ci></apply><apply id="S2.SS2.p2.9.m5.1.1c.cmml" xref="S2.SS2.p2.9.m5.1.1"><leq id="S2.SS2.p2.9.m5.1.1.5.cmml" xref="S2.SS2.p2.9.m5.1.1.5"></leq><share href="https://arxiv.org/html/2409.12388v2#S2.SS2.p2.9.m5.1.1.4.cmml" id="S2.SS2.p2.9.m5.1.1d.cmml" xref="S2.SS2.p2.9.m5.1.1"></share><ci id="S2.SS2.p2.9.m5.1.1.6.cmml" xref="S2.SS2.p2.9.m5.1.1.6">𝑇</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p2.9.m5.1c">1\,{\leq}\,\tau\,{\leq}\,T</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p2.9.m5.1d">1 ≤ italic_τ ≤ italic_T</annotation></semantics></math>, s.t. 
<math alttext="\pi_{\tau}=l^{\prime}_{2u}" class="ltx_Math" display="inline" id="S2.SS2.p2.10.m6.1"><semantics id="S2.SS2.p2.10.m6.1a"><mrow id="S2.SS2.p2.10.m6.1.1" xref="S2.SS2.p2.10.m6.1.1.cmml"><msub id="S2.SS2.p2.10.m6.1.1.2" xref="S2.SS2.p2.10.m6.1.1.2.cmml"><mi id="S2.SS2.p2.10.m6.1.1.2.2" xref="S2.SS2.p2.10.m6.1.1.2.2.cmml">π</mi><mi id="S2.SS2.p2.10.m6.1.1.2.3" xref="S2.SS2.p2.10.m6.1.1.2.3.cmml">τ</mi></msub><mo id="S2.SS2.p2.10.m6.1.1.1" xref="S2.SS2.p2.10.m6.1.1.1.cmml">=</mo><msubsup id="S2.SS2.p2.10.m6.1.1.3" xref="S2.SS2.p2.10.m6.1.1.3.cmml"><mi id="S2.SS2.p2.10.m6.1.1.3.2.2" xref="S2.SS2.p2.10.m6.1.1.3.2.2.cmml">l</mi><mrow id="S2.SS2.p2.10.m6.1.1.3.3" xref="S2.SS2.p2.10.m6.1.1.3.3.cmml"><mn id="S2.SS2.p2.10.m6.1.1.3.3.2" xref="S2.SS2.p2.10.m6.1.1.3.3.2.cmml">2</mn><mo id="S2.SS2.p2.10.m6.1.1.3.3.1" xref="S2.SS2.p2.10.m6.1.1.3.3.1.cmml"></mo><mi id="S2.SS2.p2.10.m6.1.1.3.3.3" xref="S2.SS2.p2.10.m6.1.1.3.3.3.cmml">u</mi></mrow><mo id="S2.SS2.p2.10.m6.1.1.3.2.3" xref="S2.SS2.p2.10.m6.1.1.3.2.3.cmml">′</mo></msubsup></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p2.10.m6.1b"><apply id="S2.SS2.p2.10.m6.1.1.cmml" xref="S2.SS2.p2.10.m6.1.1"><eq id="S2.SS2.p2.10.m6.1.1.1.cmml" xref="S2.SS2.p2.10.m6.1.1.1"></eq><apply id="S2.SS2.p2.10.m6.1.1.2.cmml" xref="S2.SS2.p2.10.m6.1.1.2"><csymbol cd="ambiguous" id="S2.SS2.p2.10.m6.1.1.2.1.cmml" xref="S2.SS2.p2.10.m6.1.1.2">subscript</csymbol><ci id="S2.SS2.p2.10.m6.1.1.2.2.cmml" xref="S2.SS2.p2.10.m6.1.1.2.2">𝜋</ci><ci id="S2.SS2.p2.10.m6.1.1.2.3.cmml" xref="S2.SS2.p2.10.m6.1.1.2.3">𝜏</ci></apply><apply id="S2.SS2.p2.10.m6.1.1.3.cmml" xref="S2.SS2.p2.10.m6.1.1.3"><csymbol cd="ambiguous" id="S2.SS2.p2.10.m6.1.1.3.1.cmml" xref="S2.SS2.p2.10.m6.1.1.3">subscript</csymbol><apply id="S2.SS2.p2.10.m6.1.1.3.2.cmml" xref="S2.SS2.p2.10.m6.1.1.3"><csymbol cd="ambiguous" id="S2.SS2.p2.10.m6.1.1.3.2.1.cmml" xref="S2.SS2.p2.10.m6.1.1.3">superscript</csymbol><ci id="S2.SS2.p2.10.m6.1.1.3.2.2.cmml" 
xref="S2.SS2.p2.10.m6.1.1.3.2.2">𝑙</ci><ci id="S2.SS2.p2.10.m6.1.1.3.2.3.cmml" xref="S2.SS2.p2.10.m6.1.1.3.2.3">′</ci></apply><apply id="S2.SS2.p2.10.m6.1.1.3.3.cmml" xref="S2.SS2.p2.10.m6.1.1.3.3"><times id="S2.SS2.p2.10.m6.1.1.3.3.1.cmml" xref="S2.SS2.p2.10.m6.1.1.3.3.1"></times><cn id="S2.SS2.p2.10.m6.1.1.3.3.2.cmml" type="integer" xref="S2.SS2.p2.10.m6.1.1.3.3.2">2</cn><ci id="S2.SS2.p2.10.m6.1.1.3.3.3.cmml" xref="S2.SS2.p2.10.m6.1.1.3.3.3">𝑢</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p2.10.m6.1c">\pi_{\tau}=l^{\prime}_{2u}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p2.10.m6.1d">italic_π start_POSTSUBSCRIPT italic_τ end_POSTSUBSCRIPT = italic_l start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 2 italic_u end_POSTSUBSCRIPT</annotation></semantics></math>. This can be achieved by eliminating non-ending paths such that <math alttext="\pi_{t+1}=l^{\prime}_{2u}" class="ltx_Math" display="inline" id="S2.SS2.p2.11.m7.1"><semantics id="S2.SS2.p2.11.m7.1a"><mrow id="S2.SS2.p2.11.m7.1.1" xref="S2.SS2.p2.11.m7.1.1.cmml"><msub id="S2.SS2.p2.11.m7.1.1.2" xref="S2.SS2.p2.11.m7.1.1.2.cmml"><mi id="S2.SS2.p2.11.m7.1.1.2.2" xref="S2.SS2.p2.11.m7.1.1.2.2.cmml">π</mi><mrow id="S2.SS2.p2.11.m7.1.1.2.3" xref="S2.SS2.p2.11.m7.1.1.2.3.cmml"><mi id="S2.SS2.p2.11.m7.1.1.2.3.2" xref="S2.SS2.p2.11.m7.1.1.2.3.2.cmml">t</mi><mo id="S2.SS2.p2.11.m7.1.1.2.3.1" xref="S2.SS2.p2.11.m7.1.1.2.3.1.cmml">+</mo><mn id="S2.SS2.p2.11.m7.1.1.2.3.3" xref="S2.SS2.p2.11.m7.1.1.2.3.3.cmml">1</mn></mrow></msub><mo id="S2.SS2.p2.11.m7.1.1.1" xref="S2.SS2.p2.11.m7.1.1.1.cmml">=</mo><msubsup id="S2.SS2.p2.11.m7.1.1.3" xref="S2.SS2.p2.11.m7.1.1.3.cmml"><mi id="S2.SS2.p2.11.m7.1.1.3.2.2" xref="S2.SS2.p2.11.m7.1.1.3.2.2.cmml">l</mi><mrow id="S2.SS2.p2.11.m7.1.1.3.3" xref="S2.SS2.p2.11.m7.1.1.3.3.cmml"><mn id="S2.SS2.p2.11.m7.1.1.3.3.2" xref="S2.SS2.p2.11.m7.1.1.3.3.2.cmml">2</mn><mo id="S2.SS2.p2.11.m7.1.1.3.3.1" 
xref="S2.SS2.p2.11.m7.1.1.3.3.1.cmml"></mo><mi id="S2.SS2.p2.11.m7.1.1.3.3.3" xref="S2.SS2.p2.11.m7.1.1.3.3.3.cmml">u</mi></mrow><mo id="S2.SS2.p2.11.m7.1.1.3.2.3" xref="S2.SS2.p2.11.m7.1.1.3.2.3.cmml">′</mo></msubsup></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p2.11.m7.1b"><apply id="S2.SS2.p2.11.m7.1.1.cmml" xref="S2.SS2.p2.11.m7.1.1"><eq id="S2.SS2.p2.11.m7.1.1.1.cmml" xref="S2.SS2.p2.11.m7.1.1.1"></eq><apply id="S2.SS2.p2.11.m7.1.1.2.cmml" xref="S2.SS2.p2.11.m7.1.1.2"><csymbol cd="ambiguous" id="S2.SS2.p2.11.m7.1.1.2.1.cmml" xref="S2.SS2.p2.11.m7.1.1.2">subscript</csymbol><ci id="S2.SS2.p2.11.m7.1.1.2.2.cmml" xref="S2.SS2.p2.11.m7.1.1.2.2">𝜋</ci><apply id="S2.SS2.p2.11.m7.1.1.2.3.cmml" xref="S2.SS2.p2.11.m7.1.1.2.3"><plus id="S2.SS2.p2.11.m7.1.1.2.3.1.cmml" xref="S2.SS2.p2.11.m7.1.1.2.3.1"></plus><ci id="S2.SS2.p2.11.m7.1.1.2.3.2.cmml" xref="S2.SS2.p2.11.m7.1.1.2.3.2">𝑡</ci><cn id="S2.SS2.p2.11.m7.1.1.2.3.3.cmml" type="integer" xref="S2.SS2.p2.11.m7.1.1.2.3.3">1</cn></apply></apply><apply id="S2.SS2.p2.11.m7.1.1.3.cmml" xref="S2.SS2.p2.11.m7.1.1.3"><csymbol cd="ambiguous" id="S2.SS2.p2.11.m7.1.1.3.1.cmml" xref="S2.SS2.p2.11.m7.1.1.3">subscript</csymbol><apply id="S2.SS2.p2.11.m7.1.1.3.2.cmml" xref="S2.SS2.p2.11.m7.1.1.3"><csymbol cd="ambiguous" id="S2.SS2.p2.11.m7.1.1.3.2.1.cmml" xref="S2.SS2.p2.11.m7.1.1.3">superscript</csymbol><ci id="S2.SS2.p2.11.m7.1.1.3.2.2.cmml" xref="S2.SS2.p2.11.m7.1.1.3.2.2">𝑙</ci><ci id="S2.SS2.p2.11.m7.1.1.3.2.3.cmml" xref="S2.SS2.p2.11.m7.1.1.3.2.3">′</ci></apply><apply id="S2.SS2.p2.11.m7.1.1.3.3.cmml" xref="S2.SS2.p2.11.m7.1.1.3.3"><times id="S2.SS2.p2.11.m7.1.1.3.3.1.cmml" xref="S2.SS2.p2.11.m7.1.1.3.3.1"></times><cn id="S2.SS2.p2.11.m7.1.1.3.3.2.cmml" type="integer" xref="S2.SS2.p2.11.m7.1.1.3.3.2">2</cn><ci id="S2.SS2.p2.11.m7.1.1.3.3.3.cmml" xref="S2.SS2.p2.11.m7.1.1.3.3.3">𝑢</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" 
id="S2.SS2.p2.11.m7.1c">\pi_{t+1}=l^{\prime}_{2u}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p2.11.m7.1d">italic_π start_POSTSUBSCRIPT italic_t + 1 end_POSTSUBSCRIPT = italic_l start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 2 italic_u end_POSTSUBSCRIPT</annotation></semantics></math> for <math alttext="t\,{<}\,T" class="ltx_Math" display="inline" id="S2.SS2.p2.12.m8.1"><semantics id="S2.SS2.p2.12.m8.1a"><mrow id="S2.SS2.p2.12.m8.1.1" xref="S2.SS2.p2.12.m8.1.1.cmml"><mi id="S2.SS2.p2.12.m8.1.1.2" xref="S2.SS2.p2.12.m8.1.1.2.cmml">t</mi><mo id="S2.SS2.p2.12.m8.1.1.1" lspace="0.448em" rspace="0.448em" xref="S2.SS2.p2.12.m8.1.1.1.cmml"><</mo><mi id="S2.SS2.p2.12.m8.1.1.3" xref="S2.SS2.p2.12.m8.1.1.3.cmml">T</mi></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p2.12.m8.1b"><apply id="S2.SS2.p2.12.m8.1.1.cmml" xref="S2.SS2.p2.12.m8.1.1"><lt id="S2.SS2.p2.12.m8.1.1.1.cmml" xref="S2.SS2.p2.12.m8.1.1.1"></lt><ci id="S2.SS2.p2.12.m8.1.1.2.cmml" xref="S2.SS2.p2.12.m8.1.1.2">𝑡</ci><ci id="S2.SS2.p2.12.m8.1.1.3.cmml" xref="S2.SS2.p2.12.m8.1.1.3">𝑇</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p2.12.m8.1c">t\,{<}\,T</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p2.12.m8.1d">italic_t < italic_T</annotation></semantics></math>. 
Accordingly:</p> <table class="ltx_equation ltx_eqn_table" id="S2.E8"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="\hat{\beta}(t,2u)=\begin{cases}\beta(t,2u)-\beta(t+1,2u)\cdot y_{\pi_{t}}^{t}&% \text{if }t<T\\ \beta(t,2u),&\text{Otherwise}\end{cases}" class="ltx_Math" display="block" id="S2.E8.m1.6"><semantics id="S2.E8.m1.6a"><mrow id="S2.E8.m1.6.6" xref="S2.E8.m1.6.6.cmml"><mrow id="S2.E8.m1.6.6.1" xref="S2.E8.m1.6.6.1.cmml"><mover accent="true" id="S2.E8.m1.6.6.1.3" xref="S2.E8.m1.6.6.1.3.cmml"><mi id="S2.E8.m1.6.6.1.3.2" xref="S2.E8.m1.6.6.1.3.2.cmml">β</mi><mo id="S2.E8.m1.6.6.1.3.1" xref="S2.E8.m1.6.6.1.3.1.cmml">^</mo></mover><mo id="S2.E8.m1.6.6.1.2" xref="S2.E8.m1.6.6.1.2.cmml"></mo><mrow id="S2.E8.m1.6.6.1.1.1" xref="S2.E8.m1.6.6.1.1.2.cmml"><mo id="S2.E8.m1.6.6.1.1.1.2" stretchy="false" xref="S2.E8.m1.6.6.1.1.2.cmml">(</mo><mi id="S2.E8.m1.5.5" xref="S2.E8.m1.5.5.cmml">t</mi><mo id="S2.E8.m1.6.6.1.1.1.3" xref="S2.E8.m1.6.6.1.1.2.cmml">,</mo><mrow id="S2.E8.m1.6.6.1.1.1.1" xref="S2.E8.m1.6.6.1.1.1.1.cmml"><mn id="S2.E8.m1.6.6.1.1.1.1.2" xref="S2.E8.m1.6.6.1.1.1.1.2.cmml">2</mn><mo id="S2.E8.m1.6.6.1.1.1.1.1" xref="S2.E8.m1.6.6.1.1.1.1.1.cmml"></mo><mi id="S2.E8.m1.6.6.1.1.1.1.3" xref="S2.E8.m1.6.6.1.1.1.1.3.cmml">u</mi></mrow><mo id="S2.E8.m1.6.6.1.1.1.4" stretchy="false" xref="S2.E8.m1.6.6.1.1.2.cmml">)</mo></mrow></mrow><mo id="S2.E8.m1.6.6.2" xref="S2.E8.m1.6.6.2.cmml">=</mo><mrow id="S2.E8.m1.4.4" xref="S2.E8.m1.6.6.3.1.cmml"><mo id="S2.E8.m1.4.4.5" xref="S2.E8.m1.6.6.3.1.1.cmml">{</mo><mtable columnspacing="5pt" displaystyle="true" id="S2.E8.m1.4.4.4" rowspacing="0pt" xref="S2.E8.m1.6.6.3.1.cmml"><mtr id="S2.E8.m1.4.4.4a" xref="S2.E8.m1.6.6.3.1.cmml"><mtd class="ltx_align_left" columnalign="left" id="S2.E8.m1.4.4.4b" xref="S2.E8.m1.6.6.3.1.cmml"><mrow id="S2.E8.m1.1.1.1.1.1.1" xref="S2.E8.m1.1.1.1.1.1.1.cmml"><mrow 
id="S2.E8.m1.1.1.1.1.1.1.2" xref="S2.E8.m1.1.1.1.1.1.1.2.cmml"><mi id="S2.E8.m1.1.1.1.1.1.1.2.3" xref="S2.E8.m1.1.1.1.1.1.1.2.3.cmml">β</mi><mo id="S2.E8.m1.1.1.1.1.1.1.2.2" xref="S2.E8.m1.1.1.1.1.1.1.2.2.cmml"></mo><mrow id="S2.E8.m1.1.1.1.1.1.1.2.1.1" xref="S2.E8.m1.1.1.1.1.1.1.2.1.2.cmml"><mo id="S2.E8.m1.1.1.1.1.1.1.2.1.1.2" stretchy="false" xref="S2.E8.m1.1.1.1.1.1.1.2.1.2.cmml">(</mo><mi id="S2.E8.m1.1.1.1.1.1.1.1" xref="S2.E8.m1.1.1.1.1.1.1.1.cmml">t</mi><mo id="S2.E8.m1.1.1.1.1.1.1.2.1.1.3" xref="S2.E8.m1.1.1.1.1.1.1.2.1.2.cmml">,</mo><mrow id="S2.E8.m1.1.1.1.1.1.1.2.1.1.1" xref="S2.E8.m1.1.1.1.1.1.1.2.1.1.1.cmml"><mn id="S2.E8.m1.1.1.1.1.1.1.2.1.1.1.2" xref="S2.E8.m1.1.1.1.1.1.1.2.1.1.1.2.cmml">2</mn><mo id="S2.E8.m1.1.1.1.1.1.1.2.1.1.1.1" xref="S2.E8.m1.1.1.1.1.1.1.2.1.1.1.1.cmml"></mo><mi id="S2.E8.m1.1.1.1.1.1.1.2.1.1.1.3" xref="S2.E8.m1.1.1.1.1.1.1.2.1.1.1.3.cmml">u</mi></mrow><mo id="S2.E8.m1.1.1.1.1.1.1.2.1.1.4" stretchy="false" xref="S2.E8.m1.1.1.1.1.1.1.2.1.2.cmml">)</mo></mrow></mrow><mo id="S2.E8.m1.1.1.1.1.1.1.5" xref="S2.E8.m1.1.1.1.1.1.1.5.cmml">−</mo><mrow id="S2.E8.m1.1.1.1.1.1.1.4" xref="S2.E8.m1.1.1.1.1.1.1.4.cmml"><mrow id="S2.E8.m1.1.1.1.1.1.1.4.2" xref="S2.E8.m1.1.1.1.1.1.1.4.2.cmml"><mi id="S2.E8.m1.1.1.1.1.1.1.4.2.4" xref="S2.E8.m1.1.1.1.1.1.1.4.2.4.cmml">β</mi><mo id="S2.E8.m1.1.1.1.1.1.1.4.2.3" xref="S2.E8.m1.1.1.1.1.1.1.4.2.3.cmml"></mo><mrow id="S2.E8.m1.1.1.1.1.1.1.4.2.2.2" xref="S2.E8.m1.1.1.1.1.1.1.4.2.2.3.cmml"><mo id="S2.E8.m1.1.1.1.1.1.1.4.2.2.2.3" stretchy="false" xref="S2.E8.m1.1.1.1.1.1.1.4.2.2.3.cmml">(</mo><mrow id="S2.E8.m1.1.1.1.1.1.1.3.1.1.1.1" xref="S2.E8.m1.1.1.1.1.1.1.3.1.1.1.1.cmml"><mi id="S2.E8.m1.1.1.1.1.1.1.3.1.1.1.1.2" xref="S2.E8.m1.1.1.1.1.1.1.3.1.1.1.1.2.cmml">t</mi><mo id="S2.E8.m1.1.1.1.1.1.1.3.1.1.1.1.1" xref="S2.E8.m1.1.1.1.1.1.1.3.1.1.1.1.1.cmml">+</mo><mn id="S2.E8.m1.1.1.1.1.1.1.3.1.1.1.1.3" xref="S2.E8.m1.1.1.1.1.1.1.3.1.1.1.1.3.cmml">1</mn></mrow><mo id="S2.E8.m1.1.1.1.1.1.1.4.2.2.2.4" 
xref="S2.E8.m1.1.1.1.1.1.1.4.2.2.3.cmml">,</mo><mrow id="S2.E8.m1.1.1.1.1.1.1.4.2.2.2.2" xref="S2.E8.m1.1.1.1.1.1.1.4.2.2.2.2.cmml"><mn id="S2.E8.m1.1.1.1.1.1.1.4.2.2.2.2.2" xref="S2.E8.m1.1.1.1.1.1.1.4.2.2.2.2.2.cmml">2</mn><mo id="S2.E8.m1.1.1.1.1.1.1.4.2.2.2.2.1" xref="S2.E8.m1.1.1.1.1.1.1.4.2.2.2.2.1.cmml"></mo><mi id="S2.E8.m1.1.1.1.1.1.1.4.2.2.2.2.3" xref="S2.E8.m1.1.1.1.1.1.1.4.2.2.2.2.3.cmml">u</mi></mrow><mo id="S2.E8.m1.1.1.1.1.1.1.4.2.2.2.5" rspace="0.055em" stretchy="false" xref="S2.E8.m1.1.1.1.1.1.1.4.2.2.3.cmml">)</mo></mrow></mrow><mo id="S2.E8.m1.1.1.1.1.1.1.4.3" rspace="0.222em" xref="S2.E8.m1.1.1.1.1.1.1.4.3.cmml">⋅</mo><msubsup id="S2.E8.m1.1.1.1.1.1.1.4.4" xref="S2.E8.m1.1.1.1.1.1.1.4.4.cmml"><mi id="S2.E8.m1.1.1.1.1.1.1.4.4.2.2" xref="S2.E8.m1.1.1.1.1.1.1.4.4.2.2.cmml">y</mi><msub id="S2.E8.m1.1.1.1.1.1.1.4.4.2.3" xref="S2.E8.m1.1.1.1.1.1.1.4.4.2.3.cmml"><mi id="S2.E8.m1.1.1.1.1.1.1.4.4.2.3.2" xref="S2.E8.m1.1.1.1.1.1.1.4.4.2.3.2.cmml">π</mi><mi id="S2.E8.m1.1.1.1.1.1.1.4.4.2.3.3" xref="S2.E8.m1.1.1.1.1.1.1.4.4.2.3.3.cmml">t</mi></msub><mi id="S2.E8.m1.1.1.1.1.1.1.4.4.3" xref="S2.E8.m1.1.1.1.1.1.1.4.4.3.cmml">t</mi></msubsup></mrow></mrow></mtd><mtd class="ltx_align_left" columnalign="left" id="S2.E8.m1.4.4.4c" xref="S2.E8.m1.6.6.3.1.cmml"><mrow id="S2.E8.m1.2.2.2.2.2.1" xref="S2.E8.m1.2.2.2.2.2.1.cmml"><mrow id="S2.E8.m1.2.2.2.2.2.1.2" xref="S2.E8.m1.2.2.2.2.2.1.2.cmml"><mtext id="S2.E8.m1.2.2.2.2.2.1.2.2" xref="S2.E8.m1.2.2.2.2.2.1.2.2a.cmml">if </mtext><mo id="S2.E8.m1.2.2.2.2.2.1.2.1" xref="S2.E8.m1.2.2.2.2.2.1.2.1.cmml"></mo><mi id="S2.E8.m1.2.2.2.2.2.1.2.3" xref="S2.E8.m1.2.2.2.2.2.1.2.3.cmml">t</mi></mrow><mo id="S2.E8.m1.2.2.2.2.2.1.1" xref="S2.E8.m1.2.2.2.2.2.1.1.cmml"><</mo><mi id="S2.E8.m1.2.2.2.2.2.1.3" xref="S2.E8.m1.2.2.2.2.2.1.3.cmml">T</mi></mrow></mtd></mtr><mtr id="S2.E8.m1.4.4.4d" xref="S2.E8.m1.6.6.3.1.cmml"><mtd class="ltx_align_left" columnalign="left" id="S2.E8.m1.4.4.4e" xref="S2.E8.m1.6.6.3.1.cmml"><mrow 
id="S2.E8.m1.3.3.3.3.1.1.2" xref="S2.E8.m1.3.3.3.3.1.1.2.1.cmml"><mrow id="S2.E8.m1.3.3.3.3.1.1.2.1" xref="S2.E8.m1.3.3.3.3.1.1.2.1.cmml"><mi id="S2.E8.m1.3.3.3.3.1.1.2.1.3" xref="S2.E8.m1.3.3.3.3.1.1.2.1.3.cmml">β</mi><mo id="S2.E8.m1.3.3.3.3.1.1.2.1.2" xref="S2.E8.m1.3.3.3.3.1.1.2.1.2.cmml"></mo><mrow id="S2.E8.m1.3.3.3.3.1.1.2.1.1.1" xref="S2.E8.m1.3.3.3.3.1.1.2.1.1.2.cmml"><mo id="S2.E8.m1.3.3.3.3.1.1.2.1.1.1.2" stretchy="false" xref="S2.E8.m1.3.3.3.3.1.1.2.1.1.2.cmml">(</mo><mi id="S2.E8.m1.3.3.3.3.1.1.1" xref="S2.E8.m1.3.3.3.3.1.1.1.cmml">t</mi><mo id="S2.E8.m1.3.3.3.3.1.1.2.1.1.1.3" xref="S2.E8.m1.3.3.3.3.1.1.2.1.1.2.cmml">,</mo><mrow id="S2.E8.m1.3.3.3.3.1.1.2.1.1.1.1" xref="S2.E8.m1.3.3.3.3.1.1.2.1.1.1.1.cmml"><mn id="S2.E8.m1.3.3.3.3.1.1.2.1.1.1.1.2" xref="S2.E8.m1.3.3.3.3.1.1.2.1.1.1.1.2.cmml">2</mn><mo id="S2.E8.m1.3.3.3.3.1.1.2.1.1.1.1.1" xref="S2.E8.m1.3.3.3.3.1.1.2.1.1.1.1.1.cmml"></mo><mi id="S2.E8.m1.3.3.3.3.1.1.2.1.1.1.1.3" xref="S2.E8.m1.3.3.3.3.1.1.2.1.1.1.1.3.cmml">u</mi></mrow><mo id="S2.E8.m1.3.3.3.3.1.1.2.1.1.1.4" stretchy="false" xref="S2.E8.m1.3.3.3.3.1.1.2.1.1.2.cmml">)</mo></mrow></mrow><mo id="S2.E8.m1.3.3.3.3.1.1.2.2" xref="S2.E8.m1.3.3.3.3.1.1.2.1.cmml">,</mo></mrow></mtd><mtd class="ltx_align_left" columnalign="left" id="S2.E8.m1.4.4.4f" xref="S2.E8.m1.6.6.3.1.cmml"><mtext id="S2.E8.m1.4.4.4.4.2.1" xref="S2.E8.m1.4.4.4.4.2.1a.cmml">Otherwise</mtext></mtd></mtr></mtable></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.E8.m1.6b"><apply id="S2.E8.m1.6.6.cmml" xref="S2.E8.m1.6.6"><eq id="S2.E8.m1.6.6.2.cmml" xref="S2.E8.m1.6.6.2"></eq><apply id="S2.E8.m1.6.6.1.cmml" xref="S2.E8.m1.6.6.1"><times id="S2.E8.m1.6.6.1.2.cmml" xref="S2.E8.m1.6.6.1.2"></times><apply id="S2.E8.m1.6.6.1.3.cmml" xref="S2.E8.m1.6.6.1.3"><ci id="S2.E8.m1.6.6.1.3.1.cmml" xref="S2.E8.m1.6.6.1.3.1">^</ci><ci id="S2.E8.m1.6.6.1.3.2.cmml" xref="S2.E8.m1.6.6.1.3.2">𝛽</ci></apply><interval closure="open" id="S2.E8.m1.6.6.1.1.2.cmml" 
xref="S2.E8.m1.6.6.1.1.1"><ci id="S2.E8.m1.5.5.cmml" xref="S2.E8.m1.5.5">𝑡</ci><apply id="S2.E8.m1.6.6.1.1.1.1.cmml" xref="S2.E8.m1.6.6.1.1.1.1"><times id="S2.E8.m1.6.6.1.1.1.1.1.cmml" xref="S2.E8.m1.6.6.1.1.1.1.1"></times><cn id="S2.E8.m1.6.6.1.1.1.1.2.cmml" type="integer" xref="S2.E8.m1.6.6.1.1.1.1.2">2</cn><ci id="S2.E8.m1.6.6.1.1.1.1.3.cmml" xref="S2.E8.m1.6.6.1.1.1.1.3">𝑢</ci></apply></interval></apply><apply id="S2.E8.m1.6.6.3.1.cmml" xref="S2.E8.m1.4.4"><csymbol cd="latexml" id="S2.E8.m1.6.6.3.1.1.cmml" xref="S2.E8.m1.4.4.5">cases</csymbol><apply id="S2.E8.m1.1.1.1.1.1.1.cmml" xref="S2.E8.m1.1.1.1.1.1.1"><minus id="S2.E8.m1.1.1.1.1.1.1.5.cmml" xref="S2.E8.m1.1.1.1.1.1.1.5"></minus><apply id="S2.E8.m1.1.1.1.1.1.1.2.cmml" xref="S2.E8.m1.1.1.1.1.1.1.2"><times id="S2.E8.m1.1.1.1.1.1.1.2.2.cmml" xref="S2.E8.m1.1.1.1.1.1.1.2.2"></times><ci id="S2.E8.m1.1.1.1.1.1.1.2.3.cmml" xref="S2.E8.m1.1.1.1.1.1.1.2.3">𝛽</ci><interval closure="open" id="S2.E8.m1.1.1.1.1.1.1.2.1.2.cmml" xref="S2.E8.m1.1.1.1.1.1.1.2.1.1"><ci id="S2.E8.m1.1.1.1.1.1.1.1.cmml" xref="S2.E8.m1.1.1.1.1.1.1.1">𝑡</ci><apply id="S2.E8.m1.1.1.1.1.1.1.2.1.1.1.cmml" xref="S2.E8.m1.1.1.1.1.1.1.2.1.1.1"><times id="S2.E8.m1.1.1.1.1.1.1.2.1.1.1.1.cmml" xref="S2.E8.m1.1.1.1.1.1.1.2.1.1.1.1"></times><cn id="S2.E8.m1.1.1.1.1.1.1.2.1.1.1.2.cmml" type="integer" xref="S2.E8.m1.1.1.1.1.1.1.2.1.1.1.2">2</cn><ci id="S2.E8.m1.1.1.1.1.1.1.2.1.1.1.3.cmml" xref="S2.E8.m1.1.1.1.1.1.1.2.1.1.1.3">𝑢</ci></apply></interval></apply><apply id="S2.E8.m1.1.1.1.1.1.1.4.cmml" xref="S2.E8.m1.1.1.1.1.1.1.4"><ci id="S2.E8.m1.1.1.1.1.1.1.4.3.cmml" xref="S2.E8.m1.1.1.1.1.1.1.4.3">⋅</ci><apply id="S2.E8.m1.1.1.1.1.1.1.4.2.cmml" xref="S2.E8.m1.1.1.1.1.1.1.4.2"><times id="S2.E8.m1.1.1.1.1.1.1.4.2.3.cmml" xref="S2.E8.m1.1.1.1.1.1.1.4.2.3"></times><ci id="S2.E8.m1.1.1.1.1.1.1.4.2.4.cmml" xref="S2.E8.m1.1.1.1.1.1.1.4.2.4">𝛽</ci><interval closure="open" id="S2.E8.m1.1.1.1.1.1.1.4.2.2.3.cmml" xref="S2.E8.m1.1.1.1.1.1.1.4.2.2.2"><apply 
id="S2.E8.m1.1.1.1.1.1.1.3.1.1.1.1.cmml" xref="S2.E8.m1.1.1.1.1.1.1.3.1.1.1.1"><plus id="S2.E8.m1.1.1.1.1.1.1.3.1.1.1.1.1.cmml" xref="S2.E8.m1.1.1.1.1.1.1.3.1.1.1.1.1"></plus><ci id="S2.E8.m1.1.1.1.1.1.1.3.1.1.1.1.2.cmml" xref="S2.E8.m1.1.1.1.1.1.1.3.1.1.1.1.2">𝑡</ci><cn id="S2.E8.m1.1.1.1.1.1.1.3.1.1.1.1.3.cmml" type="integer" xref="S2.E8.m1.1.1.1.1.1.1.3.1.1.1.1.3">1</cn></apply><apply id="S2.E8.m1.1.1.1.1.1.1.4.2.2.2.2.cmml" xref="S2.E8.m1.1.1.1.1.1.1.4.2.2.2.2"><times id="S2.E8.m1.1.1.1.1.1.1.4.2.2.2.2.1.cmml" xref="S2.E8.m1.1.1.1.1.1.1.4.2.2.2.2.1"></times><cn id="S2.E8.m1.1.1.1.1.1.1.4.2.2.2.2.2.cmml" type="integer" xref="S2.E8.m1.1.1.1.1.1.1.4.2.2.2.2.2">2</cn><ci id="S2.E8.m1.1.1.1.1.1.1.4.2.2.2.2.3.cmml" xref="S2.E8.m1.1.1.1.1.1.1.4.2.2.2.2.3">𝑢</ci></apply></interval></apply><apply id="S2.E8.m1.1.1.1.1.1.1.4.4.cmml" xref="S2.E8.m1.1.1.1.1.1.1.4.4"><csymbol cd="ambiguous" id="S2.E8.m1.1.1.1.1.1.1.4.4.1.cmml" xref="S2.E8.m1.1.1.1.1.1.1.4.4">superscript</csymbol><apply id="S2.E8.m1.1.1.1.1.1.1.4.4.2.cmml" xref="S2.E8.m1.1.1.1.1.1.1.4.4"><csymbol cd="ambiguous" id="S2.E8.m1.1.1.1.1.1.1.4.4.2.1.cmml" xref="S2.E8.m1.1.1.1.1.1.1.4.4">subscript</csymbol><ci id="S2.E8.m1.1.1.1.1.1.1.4.4.2.2.cmml" xref="S2.E8.m1.1.1.1.1.1.1.4.4.2.2">𝑦</ci><apply id="S2.E8.m1.1.1.1.1.1.1.4.4.2.3.cmml" xref="S2.E8.m1.1.1.1.1.1.1.4.4.2.3"><csymbol cd="ambiguous" id="S2.E8.m1.1.1.1.1.1.1.4.4.2.3.1.cmml" xref="S2.E8.m1.1.1.1.1.1.1.4.4.2.3">subscript</csymbol><ci id="S2.E8.m1.1.1.1.1.1.1.4.4.2.3.2.cmml" xref="S2.E8.m1.1.1.1.1.1.1.4.4.2.3.2">𝜋</ci><ci id="S2.E8.m1.1.1.1.1.1.1.4.4.2.3.3.cmml" xref="S2.E8.m1.1.1.1.1.1.1.4.4.2.3.3">𝑡</ci></apply></apply><ci id="S2.E8.m1.1.1.1.1.1.1.4.4.3.cmml" xref="S2.E8.m1.1.1.1.1.1.1.4.4.3">𝑡</ci></apply></apply></apply><apply id="S2.E8.m1.2.2.2.2.2.1.cmml" xref="S2.E8.m1.2.2.2.2.2.1"><lt id="S2.E8.m1.2.2.2.2.2.1.1.cmml" xref="S2.E8.m1.2.2.2.2.2.1.1"></lt><apply id="S2.E8.m1.2.2.2.2.2.1.2.cmml" xref="S2.E8.m1.2.2.2.2.2.1.2"><times 
id="S2.E8.m1.2.2.2.2.2.1.2.1.cmml" xref="S2.E8.m1.2.2.2.2.2.1.2.1"></times><ci id="S2.E8.m1.2.2.2.2.2.1.2.2a.cmml" xref="S2.E8.m1.2.2.2.2.2.1.2.2"><mtext id="S2.E8.m1.2.2.2.2.2.1.2.2.cmml" xref="S2.E8.m1.2.2.2.2.2.1.2.2">if </mtext></ci><ci id="S2.E8.m1.2.2.2.2.2.1.2.3.cmml" xref="S2.E8.m1.2.2.2.2.2.1.2.3">𝑡</ci></apply><ci id="S2.E8.m1.2.2.2.2.2.1.3.cmml" xref="S2.E8.m1.2.2.2.2.2.1.3">𝑇</ci></apply><apply id="S2.E8.m1.3.3.3.3.1.1.2.1.cmml" xref="S2.E8.m1.3.3.3.3.1.1.2"><times id="S2.E8.m1.3.3.3.3.1.1.2.1.2.cmml" xref="S2.E8.m1.3.3.3.3.1.1.2.1.2"></times><ci id="S2.E8.m1.3.3.3.3.1.1.2.1.3.cmml" xref="S2.E8.m1.3.3.3.3.1.1.2.1.3">𝛽</ci><interval closure="open" id="S2.E8.m1.3.3.3.3.1.1.2.1.1.2.cmml" xref="S2.E8.m1.3.3.3.3.1.1.2.1.1.1"><ci id="S2.E8.m1.3.3.3.3.1.1.1.cmml" xref="S2.E8.m1.3.3.3.3.1.1.1">𝑡</ci><apply id="S2.E8.m1.3.3.3.3.1.1.2.1.1.1.1.cmml" xref="S2.E8.m1.3.3.3.3.1.1.2.1.1.1.1"><times id="S2.E8.m1.3.3.3.3.1.1.2.1.1.1.1.1.cmml" xref="S2.E8.m1.3.3.3.3.1.1.2.1.1.1.1.1"></times><cn id="S2.E8.m1.3.3.3.3.1.1.2.1.1.1.1.2.cmml" type="integer" xref="S2.E8.m1.3.3.3.3.1.1.2.1.1.1.1.2">2</cn><ci id="S2.E8.m1.3.3.3.3.1.1.2.1.1.1.1.3.cmml" xref="S2.E8.m1.3.3.3.3.1.1.2.1.1.1.1.3">𝑢</ci></apply></interval></apply><ci id="S2.E8.m1.4.4.4.4.2.1a.cmml" xref="S2.E8.m1.4.4.4.4.2.1"><mtext id="S2.E8.m1.4.4.4.4.2.1.cmml" xref="S2.E8.m1.4.4.4.4.2.1">Otherwise</mtext></ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E8.m1.6c">\hat{\beta}(t,2u)=\begin{cases}\beta(t,2u)-\beta(t+1,2u)\cdot y_{\pi_{t}}^{t}&% \text{if }t<T\\ \beta(t,2u),&\text{Otherwise}\end{cases}</annotation><annotation encoding="application/x-llamapun" id="S2.E8.m1.6d">over^ start_ARG italic_β end_ARG ( italic_t , 2 italic_u ) = { start_ROW start_CELL italic_β ( italic_t , 2 italic_u ) - italic_β ( italic_t + 1 , 2 italic_u ) ⋅ italic_y start_POSTSUBSCRIPT italic_π start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT 
end_CELL start_CELL if italic_t < italic_T end_CELL end_ROW start_ROW start_CELL italic_β ( italic_t , 2 italic_u ) , end_CELL start_CELL Otherwise end_CELL end_ROW</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(8)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S2.SS2.p2.13">This grouping strategy follows the original derivation in <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib33" title="">33</a>]</cite>.</p> </div> <div class="ltx_para" id="S2.SS2.p3"> <p class="ltx_p" id="S2.SS2.p3.18">Since Eq. <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#S2.E7" title="In II-B Speaker-aware CTC based on minimizing Bayes risk ‣ II Methods ‣ Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC"><span class="ltx_text ltx_ref_tag">7</span></a> enables computing the training objective by summing over time steps, we define the following speaker-aware risk function, which constrains the emission time of each token according to the speaker it belongs to:</p> <table class="ltx_equation ltx_eqn_table" id="S2.E9"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="r_{sa}(s,t)=\begin{cases}-\frac{1}{1+e^{(\lambda(\frac{t}{T}-b))}}&\text{if }s% =1\\ -\frac{1}{1+e^{-(\lambda(\frac{t}{T}-b))}}&\text{Otherwise}\end{cases},\,\,b=% \frac{M}{M+N}" class="ltx_Math" display="block" id="S2.E9.m1.8"><semantics id="S2.E9.m1.8a"><mrow id="S2.E9.m1.8.8.2" xref="S2.E9.m1.8.8.3.cmml"><mrow id="S2.E9.m1.7.7.1.1" xref="S2.E9.m1.7.7.1.1.cmml"><mrow id="S2.E9.m1.7.7.1.1.2" xref="S2.E9.m1.7.7.1.1.2.cmml"><msub id="S2.E9.m1.7.7.1.1.2.2" xref="S2.E9.m1.7.7.1.1.2.2.cmml"><mi 
id="S2.E9.m1.7.7.1.1.2.2.2" xref="S2.E9.m1.7.7.1.1.2.2.2.cmml">r</mi><mrow id="S2.E9.m1.7.7.1.1.2.2.3" xref="S2.E9.m1.7.7.1.1.2.2.3.cmml"><mi id="S2.E9.m1.7.7.1.1.2.2.3.2" xref="S2.E9.m1.7.7.1.1.2.2.3.2.cmml">s</mi><mo id="S2.E9.m1.7.7.1.1.2.2.3.1" xref="S2.E9.m1.7.7.1.1.2.2.3.1.cmml"></mo><mi id="S2.E9.m1.7.7.1.1.2.2.3.3" xref="S2.E9.m1.7.7.1.1.2.2.3.3.cmml">a</mi></mrow></msub><mo id="S2.E9.m1.7.7.1.1.2.1" xref="S2.E9.m1.7.7.1.1.2.1.cmml"></mo><mrow id="S2.E9.m1.7.7.1.1.2.3.2" xref="S2.E9.m1.7.7.1.1.2.3.1.cmml"><mo id="S2.E9.m1.7.7.1.1.2.3.2.1" stretchy="false" xref="S2.E9.m1.7.7.1.1.2.3.1.cmml">(</mo><mi id="S2.E9.m1.5.5" xref="S2.E9.m1.5.5.cmml">s</mi><mo id="S2.E9.m1.7.7.1.1.2.3.2.2" xref="S2.E9.m1.7.7.1.1.2.3.1.cmml">,</mo><mi id="S2.E9.m1.6.6" xref="S2.E9.m1.6.6.cmml">t</mi><mo id="S2.E9.m1.7.7.1.1.2.3.2.3" stretchy="false" xref="S2.E9.m1.7.7.1.1.2.3.1.cmml">)</mo></mrow></mrow><mo id="S2.E9.m1.7.7.1.1.1" xref="S2.E9.m1.7.7.1.1.1.cmml">=</mo><mrow id="S2.E9.m1.4.4" xref="S2.E9.m1.7.7.1.1.3.1.cmml"><mo id="S2.E9.m1.4.4.5" xref="S2.E9.m1.7.7.1.1.3.1.1.cmml">{</mo><mtable columnspacing="5pt" displaystyle="true" id="S2.E9.m1.4.4.4" rowspacing="0pt" xref="S2.E9.m1.7.7.1.1.3.1.cmml"><mtr id="S2.E9.m1.4.4.4a" xref="S2.E9.m1.7.7.1.1.3.1.cmml"><mtd class="ltx_align_left" columnalign="left" id="S2.E9.m1.4.4.4b" xref="S2.E9.m1.7.7.1.1.3.1.cmml"><mrow id="S2.E9.m1.1.1.1.1.1.1" xref="S2.E9.m1.1.1.1.1.1.1.cmml"><mo id="S2.E9.m1.1.1.1.1.1.1a" xref="S2.E9.m1.1.1.1.1.1.1.cmml">−</mo><mstyle displaystyle="false" id="S2.E9.m1.1.1.1.1.1.1.1" xref="S2.E9.m1.1.1.1.1.1.1.1.cmml"><mfrac id="S2.E9.m1.1.1.1.1.1.1.1a" xref="S2.E9.m1.1.1.1.1.1.1.1.cmml"><mn id="S2.E9.m1.1.1.1.1.1.1.1.3" xref="S2.E9.m1.1.1.1.1.1.1.1.3.cmml">1</mn><mrow id="S2.E9.m1.1.1.1.1.1.1.1.1" xref="S2.E9.m1.1.1.1.1.1.1.1.1.cmml"><mn id="S2.E9.m1.1.1.1.1.1.1.1.1.3" xref="S2.E9.m1.1.1.1.1.1.1.1.1.3.cmml">1</mn><mo id="S2.E9.m1.1.1.1.1.1.1.1.1.2" xref="S2.E9.m1.1.1.1.1.1.1.1.1.2.cmml">+</mo><msup 
id="S2.E9.m1.1.1.1.1.1.1.1.1.4" xref="S2.E9.m1.1.1.1.1.1.1.1.1.4.cmml"><mi id="S2.E9.m1.1.1.1.1.1.1.1.1.4.2" xref="S2.E9.m1.1.1.1.1.1.1.1.1.4.2.cmml">e</mi><mrow id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.cmml"><mo id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.2" stretchy="false" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.cmml">(</mo><mrow id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.cmml"><mi id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.3" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.3.cmml">λ</mi><mo id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.2" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml"></mo><mrow id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml"><mo id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2" stretchy="false" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml">(</mo><mrow id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml"><mfrac id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml"><mi id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.2" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.2.cmml">t</mi><mi id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.3" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.3.cmml">T</mi></mfrac><mo id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml">−</mo><mi id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.cmml">b</mi></mrow><mo id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3" stretchy="false" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml">)</mo></mrow></mrow><mo id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.3" stretchy="false" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.cmml">)</mo></mrow></msup></mrow></mfrac></mstyle></mrow></mtd><mtd class="ltx_align_left" columnalign="left" id="S2.E9.m1.4.4.4c" xref="S2.E9.m1.7.7.1.1.3.1.cmml"><mrow id="S2.E9.m1.2.2.2.2.2.1" xref="S2.E9.m1.2.2.2.2.2.1.cmml"><mrow 
id="S2.E9.m1.2.2.2.2.2.1.2" xref="S2.E9.m1.2.2.2.2.2.1.2.cmml"><mtext id="S2.E9.m1.2.2.2.2.2.1.2.2" xref="S2.E9.m1.2.2.2.2.2.1.2.2a.cmml">if </mtext><mo id="S2.E9.m1.2.2.2.2.2.1.2.1" xref="S2.E9.m1.2.2.2.2.2.1.2.1.cmml"></mo><mi id="S2.E9.m1.2.2.2.2.2.1.2.3" xref="S2.E9.m1.2.2.2.2.2.1.2.3.cmml">s</mi></mrow><mo id="S2.E9.m1.2.2.2.2.2.1.1" xref="S2.E9.m1.2.2.2.2.2.1.1.cmml">=</mo><mn id="S2.E9.m1.2.2.2.2.2.1.3" xref="S2.E9.m1.2.2.2.2.2.1.3.cmml">1</mn></mrow></mtd></mtr><mtr id="S2.E9.m1.4.4.4d" xref="S2.E9.m1.7.7.1.1.3.1.cmml"><mtd class="ltx_align_left" columnalign="left" id="S2.E9.m1.4.4.4e" xref="S2.E9.m1.7.7.1.1.3.1.cmml"><mrow id="S2.E9.m1.3.3.3.3.1.1" xref="S2.E9.m1.3.3.3.3.1.1.cmml"><mo id="S2.E9.m1.3.3.3.3.1.1a" xref="S2.E9.m1.3.3.3.3.1.1.cmml">−</mo><mstyle displaystyle="false" id="S2.E9.m1.3.3.3.3.1.1.1" xref="S2.E9.m1.3.3.3.3.1.1.1.cmml"><mfrac id="S2.E9.m1.3.3.3.3.1.1.1a" xref="S2.E9.m1.3.3.3.3.1.1.1.cmml"><mn id="S2.E9.m1.3.3.3.3.1.1.1.3" xref="S2.E9.m1.3.3.3.3.1.1.1.3.cmml">1</mn><mrow id="S2.E9.m1.3.3.3.3.1.1.1.1" xref="S2.E9.m1.3.3.3.3.1.1.1.1.cmml"><mn id="S2.E9.m1.3.3.3.3.1.1.1.1.3" xref="S2.E9.m1.3.3.3.3.1.1.1.1.3.cmml">1</mn><mo id="S2.E9.m1.3.3.3.3.1.1.1.1.2" xref="S2.E9.m1.3.3.3.3.1.1.1.1.2.cmml">+</mo><msup id="S2.E9.m1.3.3.3.3.1.1.1.1.4" xref="S2.E9.m1.3.3.3.3.1.1.1.1.4.cmml"><mi id="S2.E9.m1.3.3.3.3.1.1.1.1.4.2" xref="S2.E9.m1.3.3.3.3.1.1.1.1.4.2.cmml">e</mi><mrow id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.cmml"><mo id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1a" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.cmml">−</mo><mrow id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.cmml"><mo id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.2" stretchy="false" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.cmml">(</mo><mrow id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.cmml"><mi id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.3" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.3.cmml">λ</mi><mo 
id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.2" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.2.cmml"></mo><mrow id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.cmml"><mo id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.2" stretchy="false" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.cmml">(</mo><mrow id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.cmml"><mfrac id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.2" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml"><mi id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.2.2" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.2.2.cmml">t</mi><mi id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.2.3" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.2.3.cmml">T</mi></mfrac><mo id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.1" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml">−</mo><mi id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.3" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.3.cmml">b</mi></mrow><mo id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.3" stretchy="false" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.cmml">)</mo></mrow></mrow><mo id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.3" stretchy="false" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.cmml">)</mo></mrow></mrow></msup></mrow></mfrac></mstyle></mrow></mtd><mtd class="ltx_align_left" columnalign="left" id="S2.E9.m1.4.4.4f" xref="S2.E9.m1.7.7.1.1.3.1.cmml"><mtext id="S2.E9.m1.4.4.4.4.2.1" xref="S2.E9.m1.4.4.4.4.2.1a.cmml">Otherwise</mtext></mtd></mtr></mtable></mrow></mrow><mo id="S2.E9.m1.8.8.2.3" rspace="0.497em" xref="S2.E9.m1.8.8.3a.cmml">,</mo><mrow id="S2.E9.m1.8.8.2.2" xref="S2.E9.m1.8.8.2.2.cmml"><mi id="S2.E9.m1.8.8.2.2.2" xref="S2.E9.m1.8.8.2.2.2.cmml">b</mi><mo id="S2.E9.m1.8.8.2.2.1" xref="S2.E9.m1.8.8.2.2.1.cmml">=</mo><mfrac id="S2.E9.m1.8.8.2.2.3" xref="S2.E9.m1.8.8.2.2.3.cmml"><mi id="S2.E9.m1.8.8.2.2.3.2" xref="S2.E9.m1.8.8.2.2.3.2.cmml">M</mi><mrow id="S2.E9.m1.8.8.2.2.3.3" 
xref="S2.E9.m1.8.8.2.2.3.3.cmml"><mi id="S2.E9.m1.8.8.2.2.3.3.2" xref="S2.E9.m1.8.8.2.2.3.3.2.cmml">M</mi><mo id="S2.E9.m1.8.8.2.2.3.3.1" xref="S2.E9.m1.8.8.2.2.3.3.1.cmml">+</mo><mi id="S2.E9.m1.8.8.2.2.3.3.3" xref="S2.E9.m1.8.8.2.2.3.3.3.cmml">N</mi></mrow></mfrac></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.E9.m1.8b"><apply id="S2.E9.m1.8.8.3.cmml" xref="S2.E9.m1.8.8.2"><csymbol cd="ambiguous" id="S2.E9.m1.8.8.3a.cmml" xref="S2.E9.m1.8.8.2.3">formulae-sequence</csymbol><apply id="S2.E9.m1.7.7.1.1.cmml" xref="S2.E9.m1.7.7.1.1"><eq id="S2.E9.m1.7.7.1.1.1.cmml" xref="S2.E9.m1.7.7.1.1.1"></eq><apply id="S2.E9.m1.7.7.1.1.2.cmml" xref="S2.E9.m1.7.7.1.1.2"><times id="S2.E9.m1.7.7.1.1.2.1.cmml" xref="S2.E9.m1.7.7.1.1.2.1"></times><apply id="S2.E9.m1.7.7.1.1.2.2.cmml" xref="S2.E9.m1.7.7.1.1.2.2"><csymbol cd="ambiguous" id="S2.E9.m1.7.7.1.1.2.2.1.cmml" xref="S2.E9.m1.7.7.1.1.2.2">subscript</csymbol><ci id="S2.E9.m1.7.7.1.1.2.2.2.cmml" xref="S2.E9.m1.7.7.1.1.2.2.2">𝑟</ci><apply id="S2.E9.m1.7.7.1.1.2.2.3.cmml" xref="S2.E9.m1.7.7.1.1.2.2.3"><times id="S2.E9.m1.7.7.1.1.2.2.3.1.cmml" xref="S2.E9.m1.7.7.1.1.2.2.3.1"></times><ci id="S2.E9.m1.7.7.1.1.2.2.3.2.cmml" xref="S2.E9.m1.7.7.1.1.2.2.3.2">𝑠</ci><ci id="S2.E9.m1.7.7.1.1.2.2.3.3.cmml" xref="S2.E9.m1.7.7.1.1.2.2.3.3">𝑎</ci></apply></apply><interval closure="open" id="S2.E9.m1.7.7.1.1.2.3.1.cmml" xref="S2.E9.m1.7.7.1.1.2.3.2"><ci id="S2.E9.m1.5.5.cmml" xref="S2.E9.m1.5.5">𝑠</ci><ci id="S2.E9.m1.6.6.cmml" xref="S2.E9.m1.6.6">𝑡</ci></interval></apply><apply id="S2.E9.m1.7.7.1.1.3.1.cmml" xref="S2.E9.m1.4.4"><csymbol cd="latexml" id="S2.E9.m1.7.7.1.1.3.1.1.cmml" xref="S2.E9.m1.4.4.5">cases</csymbol><apply id="S2.E9.m1.1.1.1.1.1.1.cmml" xref="S2.E9.m1.1.1.1.1.1.1"><minus id="S2.E9.m1.1.1.1.1.1.1.2.cmml" xref="S2.E9.m1.1.1.1.1.1.1"></minus><apply id="S2.E9.m1.1.1.1.1.1.1.1.cmml" xref="S2.E9.m1.1.1.1.1.1.1.1"><divide id="S2.E9.m1.1.1.1.1.1.1.1.2.cmml" xref="S2.E9.m1.1.1.1.1.1.1.1"></divide><cn 
id="S2.E9.m1.1.1.1.1.1.1.1.3.cmml" type="integer" xref="S2.E9.m1.1.1.1.1.1.1.1.3">1</cn><apply id="S2.E9.m1.1.1.1.1.1.1.1.1.cmml" xref="S2.E9.m1.1.1.1.1.1.1.1.1"><plus id="S2.E9.m1.1.1.1.1.1.1.1.1.2.cmml" xref="S2.E9.m1.1.1.1.1.1.1.1.1.2"></plus><cn id="S2.E9.m1.1.1.1.1.1.1.1.1.3.cmml" type="integer" xref="S2.E9.m1.1.1.1.1.1.1.1.1.3">1</cn><apply id="S2.E9.m1.1.1.1.1.1.1.1.1.4.cmml" xref="S2.E9.m1.1.1.1.1.1.1.1.1.4"><csymbol cd="ambiguous" id="S2.E9.m1.1.1.1.1.1.1.1.1.4.1.cmml" xref="S2.E9.m1.1.1.1.1.1.1.1.1.4">superscript</csymbol><ci id="S2.E9.m1.1.1.1.1.1.1.1.1.4.2.cmml" xref="S2.E9.m1.1.1.1.1.1.1.1.1.4.2">𝑒</ci><apply id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1"><times id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.2"></times><ci id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.3">𝜆</ci><apply id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1"><minus id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1"></minus><apply id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2"><divide id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2"></divide><ci id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.2.cmml" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.2">𝑡</ci><ci id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.3.cmml" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.2.3">𝑇</ci></apply><ci id="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S2.E9.m1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.1.3">𝑏</ci></apply></apply></apply></apply></apply></apply><apply id="S2.E9.m1.2.2.2.2.2.1.cmml" xref="S2.E9.m1.2.2.2.2.2.1"><eq id="S2.E9.m1.2.2.2.2.2.1.1.cmml" xref="S2.E9.m1.2.2.2.2.2.1.1"></eq><apply id="S2.E9.m1.2.2.2.2.2.1.2.cmml" xref="S2.E9.m1.2.2.2.2.2.1.2"><times id="S2.E9.m1.2.2.2.2.2.1.2.1.cmml" 
xref="S2.E9.m1.2.2.2.2.2.1.2.1"></times><ci id="S2.E9.m1.2.2.2.2.2.1.2.2a.cmml" xref="S2.E9.m1.2.2.2.2.2.1.2.2"><mtext id="S2.E9.m1.2.2.2.2.2.1.2.2.cmml" xref="S2.E9.m1.2.2.2.2.2.1.2.2">if </mtext></ci><ci id="S2.E9.m1.2.2.2.2.2.1.2.3.cmml" xref="S2.E9.m1.2.2.2.2.2.1.2.3">𝑠</ci></apply><cn id="S2.E9.m1.2.2.2.2.2.1.3.cmml" type="integer" xref="S2.E9.m1.2.2.2.2.2.1.3">1</cn></apply><apply id="S2.E9.m1.3.3.3.3.1.1.cmml" xref="S2.E9.m1.3.3.3.3.1.1"><minus id="S2.E9.m1.3.3.3.3.1.1.2.cmml" xref="S2.E9.m1.3.3.3.3.1.1"></minus><apply id="S2.E9.m1.3.3.3.3.1.1.1.cmml" xref="S2.E9.m1.3.3.3.3.1.1.1"><divide id="S2.E9.m1.3.3.3.3.1.1.1.2.cmml" xref="S2.E9.m1.3.3.3.3.1.1.1"></divide><cn id="S2.E9.m1.3.3.3.3.1.1.1.3.cmml" type="integer" xref="S2.E9.m1.3.3.3.3.1.1.1.3">1</cn><apply id="S2.E9.m1.3.3.3.3.1.1.1.1.cmml" xref="S2.E9.m1.3.3.3.3.1.1.1.1"><plus id="S2.E9.m1.3.3.3.3.1.1.1.1.2.cmml" xref="S2.E9.m1.3.3.3.3.1.1.1.1.2"></plus><cn id="S2.E9.m1.3.3.3.3.1.1.1.1.3.cmml" type="integer" xref="S2.E9.m1.3.3.3.3.1.1.1.1.3">1</cn><apply id="S2.E9.m1.3.3.3.3.1.1.1.1.4.cmml" xref="S2.E9.m1.3.3.3.3.1.1.1.1.4"><csymbol cd="ambiguous" id="S2.E9.m1.3.3.3.3.1.1.1.1.4.1.cmml" xref="S2.E9.m1.3.3.3.3.1.1.1.1.4">superscript</csymbol><ci id="S2.E9.m1.3.3.3.3.1.1.1.1.4.2.cmml" xref="S2.E9.m1.3.3.3.3.1.1.1.1.4.2">𝑒</ci><apply id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.cmml" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1"><minus id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.2.cmml" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1"></minus><apply id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1"><times id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.2.cmml" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.2"></times><ci id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.3.cmml" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.3">𝜆</ci><apply id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1"><minus id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" 
xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.1"></minus><apply id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.2"><divide id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.2"></divide><ci id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.2.2.cmml" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.2.2">𝑡</ci><ci id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.2.3.cmml" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.2.3">𝑇</ci></apply><ci id="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S2.E9.m1.3.3.3.3.1.1.1.1.1.1.1.1.1.1.1.1.3">𝑏</ci></apply></apply></apply></apply></apply></apply></apply><ci id="S2.E9.m1.4.4.4.4.2.1a.cmml" xref="S2.E9.m1.4.4.4.4.2.1"><mtext id="S2.E9.m1.4.4.4.4.2.1.cmml" xref="S2.E9.m1.4.4.4.4.2.1">Otherwise</mtext></ci></apply></apply><apply id="S2.E9.m1.8.8.2.2.cmml" xref="S2.E9.m1.8.8.2.2"><eq id="S2.E9.m1.8.8.2.2.1.cmml" xref="S2.E9.m1.8.8.2.2.1"></eq><ci id="S2.E9.m1.8.8.2.2.2.cmml" xref="S2.E9.m1.8.8.2.2.2">𝑏</ci><apply id="S2.E9.m1.8.8.2.2.3.cmml" xref="S2.E9.m1.8.8.2.2.3"><divide id="S2.E9.m1.8.8.2.2.3.1.cmml" xref="S2.E9.m1.8.8.2.2.3"></divide><ci id="S2.E9.m1.8.8.2.2.3.2.cmml" xref="S2.E9.m1.8.8.2.2.3.2">𝑀</ci><apply id="S2.E9.m1.8.8.2.2.3.3.cmml" xref="S2.E9.m1.8.8.2.2.3.3"><plus id="S2.E9.m1.8.8.2.2.3.3.1.cmml" xref="S2.E9.m1.8.8.2.2.3.3.1"></plus><ci id="S2.E9.m1.8.8.2.2.3.3.2.cmml" xref="S2.E9.m1.8.8.2.2.3.3.2">𝑀</ci><ci id="S2.E9.m1.8.8.2.2.3.3.3.cmml" xref="S2.E9.m1.8.8.2.2.3.3.3">𝑁</ci></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E9.m1.8c">r_{sa}(s,t)=\begin{cases}-\frac{1}{1+e^{(\lambda(\frac{t}{T}-b))}}&\text{if }s% =1\\ -\frac{1}{1+e^{-(\lambda(\frac{t}{T}-b))}}&\text{Otherwise}\end{cases},\,\,b=% \frac{M}{M+N}</annotation><annotation encoding="application/x-llamapun" id="S2.E9.m1.8d">italic_r start_POSTSUBSCRIPT italic_s italic_a end_POSTSUBSCRIPT ( italic_s , italic_t ) = 
{ start_ROW start_CELL - divide start_ARG 1 end_ARG start_ARG 1 + italic_e start_POSTSUPERSCRIPT ( italic_λ ( divide start_ARG italic_t end_ARG start_ARG italic_T end_ARG - italic_b ) ) end_POSTSUPERSCRIPT end_ARG end_CELL start_CELL if italic_s = 1 end_CELL end_ROW start_ROW start_CELL - divide start_ARG 1 end_ARG start_ARG 1 + italic_e start_POSTSUPERSCRIPT - ( italic_λ ( divide start_ARG italic_t end_ARG start_ARG italic_T end_ARG - italic_b ) ) end_POSTSUPERSCRIPT end_ARG end_CELL start_CELL Otherwise end_CELL end_ROW , italic_b = divide start_ARG italic_M end_ARG start_ARG italic_M + italic_N end_ARG</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(9)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S2.SS2.p3.6">where <math alttext="s" class="ltx_Math" display="inline" id="S2.SS2.p3.1.m1.1"><semantics id="S2.SS2.p3.1.m1.1a"><mi id="S2.SS2.p3.1.m1.1.1" xref="S2.SS2.p3.1.m1.1.1.cmml">s</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.1.m1.1b"><ci id="S2.SS2.p3.1.m1.1.1.cmml" xref="S2.SS2.p3.1.m1.1.1">𝑠</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.1.m1.1c">s</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.1.m1.1d">italic_s</annotation></semantics></math> denotes the speaker index, with speakers ordered chronologically following the first-in-first-out serialization strategy. 
<math alttext="\lambda" class="ltx_Math" display="inline" id="S2.SS2.p3.2.m2.1"><semantics id="S2.SS2.p3.2.m2.1a"><mi id="S2.SS2.p3.2.m2.1.1" xref="S2.SS2.p3.2.m2.1.1.cmml">λ</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.2.m2.1b"><ci id="S2.SS2.p3.2.m2.1.1.cmml" xref="S2.SS2.p3.2.m2.1.1">𝜆</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.2.m2.1c">\lambda</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.2.m2.1d">italic_λ</annotation></semantics></math> is an adjustable Bayes risk factor controlling the sharpness of the risks; setting <math alttext="\lambda=0" class="ltx_Math" display="inline" id="S2.SS2.p3.3.m3.1"><semantics id="S2.SS2.p3.3.m3.1a"><mrow id="S2.SS2.p3.3.m3.1.1" xref="S2.SS2.p3.3.m3.1.1.cmml"><mi id="S2.SS2.p3.3.m3.1.1.2" xref="S2.SS2.p3.3.m3.1.1.2.cmml">λ</mi><mo id="S2.SS2.p3.3.m3.1.1.1" xref="S2.SS2.p3.3.m3.1.1.1.cmml">=</mo><mn id="S2.SS2.p3.3.m3.1.1.3" xref="S2.SS2.p3.3.m3.1.1.3.cmml">0</mn></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.3.m3.1b"><apply id="S2.SS2.p3.3.m3.1.1.cmml" xref="S2.SS2.p3.3.m3.1.1"><eq id="S2.SS2.p3.3.m3.1.1.1.cmml" xref="S2.SS2.p3.3.m3.1.1.1"></eq><ci id="S2.SS2.p3.3.m3.1.1.2.cmml" xref="S2.SS2.p3.3.m3.1.1.2">𝜆</ci><cn id="S2.SS2.p3.3.m3.1.1.3.cmml" type="integer" xref="S2.SS2.p3.3.m3.1.1.3">0</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.3.m3.1c">\lambda=0</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.3.m3.1d">italic_λ = 0</annotation></semantics></math> leads to uniform risks for all paths. 
Here, <math alttext="b" class="ltx_Math" display="inline" id="S2.SS2.p3.4.m4.1"><semantics id="S2.SS2.p3.4.m4.1a"><mi id="S2.SS2.p3.4.m4.1.1" xref="S2.SS2.p3.4.m4.1.1.cmml">b</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.4.m4.1b"><ci id="S2.SS2.p3.4.m4.1.1.cmml" xref="S2.SS2.p3.4.m4.1.1">𝑏</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.4.m4.1c">b</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.4.m4.1d">italic_b</annotation></semantics></math> is a ratio of speaker utterance lengths, used to determine a speaker boundary K that is invariant to alignment paths. Overall, <math alttext="r_{sa}(s,t)" class="ltx_Math" display="inline" id="S2.SS2.p3.5.m5.2"><semantics id="S2.SS2.p3.5.m5.2a"><mrow id="S2.SS2.p3.5.m5.2.3" xref="S2.SS2.p3.5.m5.2.3.cmml"><msub id="S2.SS2.p3.5.m5.2.3.2" xref="S2.SS2.p3.5.m5.2.3.2.cmml"><mi id="S2.SS2.p3.5.m5.2.3.2.2" xref="S2.SS2.p3.5.m5.2.3.2.2.cmml">r</mi><mrow id="S2.SS2.p3.5.m5.2.3.2.3" xref="S2.SS2.p3.5.m5.2.3.2.3.cmml"><mi id="S2.SS2.p3.5.m5.2.3.2.3.2" xref="S2.SS2.p3.5.m5.2.3.2.3.2.cmml">s</mi><mo id="S2.SS2.p3.5.m5.2.3.2.3.1" xref="S2.SS2.p3.5.m5.2.3.2.3.1.cmml"></mo><mi id="S2.SS2.p3.5.m5.2.3.2.3.3" xref="S2.SS2.p3.5.m5.2.3.2.3.3.cmml">a</mi></mrow></msub><mo id="S2.SS2.p3.5.m5.2.3.1" xref="S2.SS2.p3.5.m5.2.3.1.cmml"></mo><mrow id="S2.SS2.p3.5.m5.2.3.3.2" xref="S2.SS2.p3.5.m5.2.3.3.1.cmml"><mo id="S2.SS2.p3.5.m5.2.3.3.2.1" stretchy="false" xref="S2.SS2.p3.5.m5.2.3.3.1.cmml">(</mo><mi id="S2.SS2.p3.5.m5.1.1" xref="S2.SS2.p3.5.m5.1.1.cmml">s</mi><mo id="S2.SS2.p3.5.m5.2.3.3.2.2" xref="S2.SS2.p3.5.m5.2.3.3.1.cmml">,</mo><mi id="S2.SS2.p3.5.m5.2.2" xref="S2.SS2.p3.5.m5.2.2.cmml">t</mi><mo id="S2.SS2.p3.5.m5.2.3.3.2.3" stretchy="false" xref="S2.SS2.p3.5.m5.2.3.3.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.5.m5.2b"><apply id="S2.SS2.p3.5.m5.2.3.cmml" xref="S2.SS2.p3.5.m5.2.3"><times id="S2.SS2.p3.5.m5.2.3.1.cmml" 
xref="S2.SS2.p3.5.m5.2.3.1"></times><apply id="S2.SS2.p3.5.m5.2.3.2.cmml" xref="S2.SS2.p3.5.m5.2.3.2"><csymbol cd="ambiguous" id="S2.SS2.p3.5.m5.2.3.2.1.cmml" xref="S2.SS2.p3.5.m5.2.3.2">subscript</csymbol><ci id="S2.SS2.p3.5.m5.2.3.2.2.cmml" xref="S2.SS2.p3.5.m5.2.3.2.2">𝑟</ci><apply id="S2.SS2.p3.5.m5.2.3.2.3.cmml" xref="S2.SS2.p3.5.m5.2.3.2.3"><times id="S2.SS2.p3.5.m5.2.3.2.3.1.cmml" xref="S2.SS2.p3.5.m5.2.3.2.3.1"></times><ci id="S2.SS2.p3.5.m5.2.3.2.3.2.cmml" xref="S2.SS2.p3.5.m5.2.3.2.3.2">𝑠</ci><ci id="S2.SS2.p3.5.m5.2.3.2.3.3.cmml" xref="S2.SS2.p3.5.m5.2.3.2.3.3">𝑎</ci></apply></apply><interval closure="open" id="S2.SS2.p3.5.m5.2.3.3.1.cmml" xref="S2.SS2.p3.5.m5.2.3.3.2"><ci id="S2.SS2.p3.5.m5.1.1.cmml" xref="S2.SS2.p3.5.m5.1.1">𝑠</ci><ci id="S2.SS2.p3.5.m5.2.2.cmml" xref="S2.SS2.p3.5.m5.2.2">𝑡</ci></interval></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.5.m5.2c">r_{sa}(s,t)</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.5.m5.2d">italic_r start_POSTSUBSCRIPT italic_s italic_a end_POSTSUBSCRIPT ( italic_s , italic_t )</annotation></semantics></math> is a conditional sigmoid function that assigns high or low risks according to the established speaker boundary. 
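</p> <p class="ltx_p">As a minimal sketch of Eq. 9, and not the authors' implementation, the risk can be evaluated directly in Python. The function name speaker_aware_risk and its argument names are hypothetical; M and N are taken to be the two quantities whose ratio defines b in Eq. 9, and t is assumed to be a 1-indexed frame position out of T frames.</p>

```python
import math


def speaker_aware_risk(s, t, T, M, N, lam):
    """Evaluate r_sa(s, t) from Eq. 9 for a single frame.

    s    : speaker index (1 for the first speaker, any other value for the rest)
    t    : frame position, assumed 1-indexed, out of T frames in total
    M, N : the two quantities whose ratio gives the boundary b = M / (M + N)
    lam  : the Bayes risk factor lambda, assumed small enough that exp() does not overflow
    """
    b = M / (M + N)
    z = lam * (t / T - b)
    # Speaker 1 uses the exponent +z; every other speaker uses -z.
    exponent = z if s == 1 else -z
    return -1.0 / (1.0 + math.exp(exponent))


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    T, M, N, lam = 100, 10, 10, 15.0
    for t in (10, 50, 90):
        print(t, speaker_aware_risk(1, t, T, M, N, lam), speaker_aware_risk(2, t, T, M, N, lam))
```

<p class="ltx_p">In this sketch, lam = 0 gives -0.5 for every speaker and frame, which matches the uniform-risk case described above. For any lam, the two branches sum to -1 at each frame, so a frame that is near -1 for the first speaker is near 0 for the second, and the roles swap on the other side of the boundary b.</p> <p class="ltx_p">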
Subsequently, the training objective for a certain <math alttext="l_{u}" class="ltx_Math" display="inline" id="S2.SS2.p3.6.m6.1"><semantics id="S2.SS2.p3.6.m6.1a"><msub id="S2.SS2.p3.6.m6.1.1" xref="S2.SS2.p3.6.m6.1.1.cmml"><mi id="S2.SS2.p3.6.m6.1.1.2" xref="S2.SS2.p3.6.m6.1.1.2.cmml">l</mi><mi id="S2.SS2.p3.6.m6.1.1.3" xref="S2.SS2.p3.6.m6.1.1.3.cmml">u</mi></msub><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.6.m6.1b"><apply id="S2.SS2.p3.6.m6.1.1.cmml" xref="S2.SS2.p3.6.m6.1.1"><csymbol cd="ambiguous" id="S2.SS2.p3.6.m6.1.1.1.cmml" xref="S2.SS2.p3.6.m6.1.1">subscript</csymbol><ci id="S2.SS2.p3.6.m6.1.1.2.cmml" xref="S2.SS2.p3.6.m6.1.1.2">𝑙</ci><ci id="S2.SS2.p3.6.m6.1.1.3.cmml" xref="S2.SS2.p3.6.m6.1.1.3">𝑢</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.6.m6.1c">l_{u}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.6.m6.1d">italic_l start_POSTSUBSCRIPT italic_u end_POSTSUBSCRIPT</annotation></semantics></math> is:</p> <table class="ltx_equation ltx_eqn_table" id="S2.E10"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="\mathcal{J}^{\prime}_{sa}(l,x,s,u)=\sum^{T}_{t=1}r_{sa}(s,t)\cdot\frac{\alpha(% t,2u)\cdot\hat{\beta}(t,2u)}{y^{t}_{\pi_{t}}}" class="ltx_Math" display="block" id="S2.E10.m1.10"><semantics id="S2.E10.m1.10a"><mrow id="S2.E10.m1.10.11" xref="S2.E10.m1.10.11.cmml"><mrow id="S2.E10.m1.10.11.2" xref="S2.E10.m1.10.11.2.cmml"><msubsup id="S2.E10.m1.10.11.2.2" xref="S2.E10.m1.10.11.2.2.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.E10.m1.10.11.2.2.2.2" xref="S2.E10.m1.10.11.2.2.2.2.cmml">𝒥</mi><mrow id="S2.E10.m1.10.11.2.2.3" xref="S2.E10.m1.10.11.2.2.3.cmml"><mi id="S2.E10.m1.10.11.2.2.3.2" xref="S2.E10.m1.10.11.2.2.3.2.cmml">s</mi><mo id="S2.E10.m1.10.11.2.2.3.1" xref="S2.E10.m1.10.11.2.2.3.1.cmml"></mo><mi id="S2.E10.m1.10.11.2.2.3.3" 
xref="S2.E10.m1.10.11.2.2.3.3.cmml">a</mi></mrow><mo id="S2.E10.m1.10.11.2.2.2.3" xref="S2.E10.m1.10.11.2.2.2.3.cmml">′</mo></msubsup><mo id="S2.E10.m1.10.11.2.1" xref="S2.E10.m1.10.11.2.1.cmml"></mo><mrow id="S2.E10.m1.10.11.2.3.2" xref="S2.E10.m1.10.11.2.3.1.cmml"><mo id="S2.E10.m1.10.11.2.3.2.1" stretchy="false" xref="S2.E10.m1.10.11.2.3.1.cmml">(</mo><mi id="S2.E10.m1.5.5" xref="S2.E10.m1.5.5.cmml">l</mi><mo id="S2.E10.m1.10.11.2.3.2.2" xref="S2.E10.m1.10.11.2.3.1.cmml">,</mo><mi id="S2.E10.m1.6.6" xref="S2.E10.m1.6.6.cmml">x</mi><mo id="S2.E10.m1.10.11.2.3.2.3" xref="S2.E10.m1.10.11.2.3.1.cmml">,</mo><mi id="S2.E10.m1.7.7" xref="S2.E10.m1.7.7.cmml">s</mi><mo id="S2.E10.m1.10.11.2.3.2.4" xref="S2.E10.m1.10.11.2.3.1.cmml">,</mo><mi id="S2.E10.m1.8.8" xref="S2.E10.m1.8.8.cmml">u</mi><mo id="S2.E10.m1.10.11.2.3.2.5" stretchy="false" xref="S2.E10.m1.10.11.2.3.1.cmml">)</mo></mrow></mrow><mo id="S2.E10.m1.10.11.1" rspace="0.111em" xref="S2.E10.m1.10.11.1.cmml">=</mo><mrow id="S2.E10.m1.10.11.3" xref="S2.E10.m1.10.11.3.cmml"><munderover id="S2.E10.m1.10.11.3.1" xref="S2.E10.m1.10.11.3.1.cmml"><mo id="S2.E10.m1.10.11.3.1.2.2" movablelimits="false" xref="S2.E10.m1.10.11.3.1.2.2.cmml">∑</mo><mrow id="S2.E10.m1.10.11.3.1.3" xref="S2.E10.m1.10.11.3.1.3.cmml"><mi id="S2.E10.m1.10.11.3.1.3.2" xref="S2.E10.m1.10.11.3.1.3.2.cmml">t</mi><mo id="S2.E10.m1.10.11.3.1.3.1" xref="S2.E10.m1.10.11.3.1.3.1.cmml">=</mo><mn id="S2.E10.m1.10.11.3.1.3.3" xref="S2.E10.m1.10.11.3.1.3.3.cmml">1</mn></mrow><mi id="S2.E10.m1.10.11.3.1.2.3" xref="S2.E10.m1.10.11.3.1.2.3.cmml">T</mi></munderover><mrow id="S2.E10.m1.10.11.3.2" xref="S2.E10.m1.10.11.3.2.cmml"><mrow id="S2.E10.m1.10.11.3.2.2" xref="S2.E10.m1.10.11.3.2.2.cmml"><msub id="S2.E10.m1.10.11.3.2.2.2" xref="S2.E10.m1.10.11.3.2.2.2.cmml"><mi id="S2.E10.m1.10.11.3.2.2.2.2" xref="S2.E10.m1.10.11.3.2.2.2.2.cmml">r</mi><mrow id="S2.E10.m1.10.11.3.2.2.2.3" xref="S2.E10.m1.10.11.3.2.2.2.3.cmml"><mi id="S2.E10.m1.10.11.3.2.2.2.3.2" 
xref="S2.E10.m1.10.11.3.2.2.2.3.2.cmml">s</mi><mo id="S2.E10.m1.10.11.3.2.2.2.3.1" xref="S2.E10.m1.10.11.3.2.2.2.3.1.cmml"></mo><mi id="S2.E10.m1.10.11.3.2.2.2.3.3" xref="S2.E10.m1.10.11.3.2.2.2.3.3.cmml">a</mi></mrow></msub><mo id="S2.E10.m1.10.11.3.2.2.1" xref="S2.E10.m1.10.11.3.2.2.1.cmml"></mo><mrow id="S2.E10.m1.10.11.3.2.2.3.2" xref="S2.E10.m1.10.11.3.2.2.3.1.cmml"><mo id="S2.E10.m1.10.11.3.2.2.3.2.1" stretchy="false" xref="S2.E10.m1.10.11.3.2.2.3.1.cmml">(</mo><mi id="S2.E10.m1.9.9" xref="S2.E10.m1.9.9.cmml">s</mi><mo id="S2.E10.m1.10.11.3.2.2.3.2.2" xref="S2.E10.m1.10.11.3.2.2.3.1.cmml">,</mo><mi id="S2.E10.m1.10.10" xref="S2.E10.m1.10.10.cmml">t</mi><mo id="S2.E10.m1.10.11.3.2.2.3.2.3" rspace="0.055em" stretchy="false" xref="S2.E10.m1.10.11.3.2.2.3.1.cmml">)</mo></mrow></mrow><mo id="S2.E10.m1.10.11.3.2.1" rspace="0.222em" xref="S2.E10.m1.10.11.3.2.1.cmml">⋅</mo><mfrac id="S2.E10.m1.4.4" xref="S2.E10.m1.4.4.cmml"><mrow id="S2.E10.m1.4.4.4" xref="S2.E10.m1.4.4.4.cmml"><mrow id="S2.E10.m1.3.3.3.3" xref="S2.E10.m1.3.3.3.3.cmml"><mrow id="S2.E10.m1.3.3.3.3.1" xref="S2.E10.m1.3.3.3.3.1.cmml"><mi id="S2.E10.m1.3.3.3.3.1.3" xref="S2.E10.m1.3.3.3.3.1.3.cmml">α</mi><mo id="S2.E10.m1.3.3.3.3.1.2" xref="S2.E10.m1.3.3.3.3.1.2.cmml"></mo><mrow id="S2.E10.m1.3.3.3.3.1.1.1" xref="S2.E10.m1.3.3.3.3.1.1.2.cmml"><mo id="S2.E10.m1.3.3.3.3.1.1.1.2" stretchy="false" xref="S2.E10.m1.3.3.3.3.1.1.2.cmml">(</mo><mi id="S2.E10.m1.1.1.1.1" xref="S2.E10.m1.1.1.1.1.cmml">t</mi><mo id="S2.E10.m1.3.3.3.3.1.1.1.3" xref="S2.E10.m1.3.3.3.3.1.1.2.cmml">,</mo><mrow id="S2.E10.m1.3.3.3.3.1.1.1.1" xref="S2.E10.m1.3.3.3.3.1.1.1.1.cmml"><mn id="S2.E10.m1.3.3.3.3.1.1.1.1.2" xref="S2.E10.m1.3.3.3.3.1.1.1.1.2.cmml">2</mn><mo id="S2.E10.m1.3.3.3.3.1.1.1.1.1" xref="S2.E10.m1.3.3.3.3.1.1.1.1.1.cmml"></mo><mi id="S2.E10.m1.3.3.3.3.1.1.1.1.3" xref="S2.E10.m1.3.3.3.3.1.1.1.1.3.cmml">u</mi></mrow><mo id="S2.E10.m1.3.3.3.3.1.1.1.4" rspace="0.055em" stretchy="false" 
xref="S2.E10.m1.3.3.3.3.1.1.2.cmml">)</mo></mrow></mrow><mo id="S2.E10.m1.3.3.3.3.2" rspace="0.222em" xref="S2.E10.m1.3.3.3.3.2.cmml">⋅</mo><mover accent="true" id="S2.E10.m1.3.3.3.3.3" xref="S2.E10.m1.3.3.3.3.3.cmml"><mi id="S2.E10.m1.3.3.3.3.3.2" xref="S2.E10.m1.3.3.3.3.3.2.cmml">β</mi><mo id="S2.E10.m1.3.3.3.3.3.1" xref="S2.E10.m1.3.3.3.3.3.1.cmml">^</mo></mover></mrow><mo id="S2.E10.m1.4.4.4.5" xref="S2.E10.m1.4.4.4.5.cmml"></mo><mrow id="S2.E10.m1.4.4.4.4.1" xref="S2.E10.m1.4.4.4.4.2.cmml"><mo id="S2.E10.m1.4.4.4.4.1.2" stretchy="false" xref="S2.E10.m1.4.4.4.4.2.cmml">(</mo><mi id="S2.E10.m1.2.2.2.2" xref="S2.E10.m1.2.2.2.2.cmml">t</mi><mo id="S2.E10.m1.4.4.4.4.1.3" xref="S2.E10.m1.4.4.4.4.2.cmml">,</mo><mrow id="S2.E10.m1.4.4.4.4.1.1" xref="S2.E10.m1.4.4.4.4.1.1.cmml"><mn id="S2.E10.m1.4.4.4.4.1.1.2" xref="S2.E10.m1.4.4.4.4.1.1.2.cmml">2</mn><mo id="S2.E10.m1.4.4.4.4.1.1.1" xref="S2.E10.m1.4.4.4.4.1.1.1.cmml"></mo><mi id="S2.E10.m1.4.4.4.4.1.1.3" xref="S2.E10.m1.4.4.4.4.1.1.3.cmml">u</mi></mrow><mo id="S2.E10.m1.4.4.4.4.1.4" stretchy="false" xref="S2.E10.m1.4.4.4.4.2.cmml">)</mo></mrow></mrow><msubsup id="S2.E10.m1.4.4.6" xref="S2.E10.m1.4.4.6.cmml"><mi id="S2.E10.m1.4.4.6.2.2" xref="S2.E10.m1.4.4.6.2.2.cmml">y</mi><msub id="S2.E10.m1.4.4.6.3" xref="S2.E10.m1.4.4.6.3.cmml"><mi id="S2.E10.m1.4.4.6.3.2" xref="S2.E10.m1.4.4.6.3.2.cmml">π</mi><mi id="S2.E10.m1.4.4.6.3.3" xref="S2.E10.m1.4.4.6.3.3.cmml">t</mi></msub><mi id="S2.E10.m1.4.4.6.2.3" xref="S2.E10.m1.4.4.6.2.3.cmml">t</mi></msubsup></mfrac></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.E10.m1.10b"><apply id="S2.E10.m1.10.11.cmml" xref="S2.E10.m1.10.11"><eq id="S2.E10.m1.10.11.1.cmml" xref="S2.E10.m1.10.11.1"></eq><apply id="S2.E10.m1.10.11.2.cmml" xref="S2.E10.m1.10.11.2"><times id="S2.E10.m1.10.11.2.1.cmml" xref="S2.E10.m1.10.11.2.1"></times><apply id="S2.E10.m1.10.11.2.2.cmml" xref="S2.E10.m1.10.11.2.2"><csymbol cd="ambiguous" id="S2.E10.m1.10.11.2.2.1.cmml" 
xref="S2.E10.m1.10.11.2.2">subscript</csymbol><apply id="S2.E10.m1.10.11.2.2.2.cmml" xref="S2.E10.m1.10.11.2.2"><csymbol cd="ambiguous" id="S2.E10.m1.10.11.2.2.2.1.cmml" xref="S2.E10.m1.10.11.2.2">superscript</csymbol><ci id="S2.E10.m1.10.11.2.2.2.2.cmml" xref="S2.E10.m1.10.11.2.2.2.2">𝒥</ci><ci id="S2.E10.m1.10.11.2.2.2.3.cmml" xref="S2.E10.m1.10.11.2.2.2.3">′</ci></apply><apply id="S2.E10.m1.10.11.2.2.3.cmml" xref="S2.E10.m1.10.11.2.2.3"><times id="S2.E10.m1.10.11.2.2.3.1.cmml" xref="S2.E10.m1.10.11.2.2.3.1"></times><ci id="S2.E10.m1.10.11.2.2.3.2.cmml" xref="S2.E10.m1.10.11.2.2.3.2">𝑠</ci><ci id="S2.E10.m1.10.11.2.2.3.3.cmml" xref="S2.E10.m1.10.11.2.2.3.3">𝑎</ci></apply></apply><vector id="S2.E10.m1.10.11.2.3.1.cmml" xref="S2.E10.m1.10.11.2.3.2"><ci id="S2.E10.m1.5.5.cmml" xref="S2.E10.m1.5.5">𝑙</ci><ci id="S2.E10.m1.6.6.cmml" xref="S2.E10.m1.6.6">𝑥</ci><ci id="S2.E10.m1.7.7.cmml" xref="S2.E10.m1.7.7">𝑠</ci><ci id="S2.E10.m1.8.8.cmml" xref="S2.E10.m1.8.8">𝑢</ci></vector></apply><apply id="S2.E10.m1.10.11.3.cmml" xref="S2.E10.m1.10.11.3"><apply id="S2.E10.m1.10.11.3.1.cmml" xref="S2.E10.m1.10.11.3.1"><csymbol cd="ambiguous" id="S2.E10.m1.10.11.3.1.1.cmml" xref="S2.E10.m1.10.11.3.1">subscript</csymbol><apply id="S2.E10.m1.10.11.3.1.2.cmml" xref="S2.E10.m1.10.11.3.1"><csymbol cd="ambiguous" id="S2.E10.m1.10.11.3.1.2.1.cmml" xref="S2.E10.m1.10.11.3.1">superscript</csymbol><sum id="S2.E10.m1.10.11.3.1.2.2.cmml" xref="S2.E10.m1.10.11.3.1.2.2"></sum><ci id="S2.E10.m1.10.11.3.1.2.3.cmml" xref="S2.E10.m1.10.11.3.1.2.3">𝑇</ci></apply><apply id="S2.E10.m1.10.11.3.1.3.cmml" xref="S2.E10.m1.10.11.3.1.3"><eq id="S2.E10.m1.10.11.3.1.3.1.cmml" xref="S2.E10.m1.10.11.3.1.3.1"></eq><ci id="S2.E10.m1.10.11.3.1.3.2.cmml" xref="S2.E10.m1.10.11.3.1.3.2">𝑡</ci><cn id="S2.E10.m1.10.11.3.1.3.3.cmml" type="integer" xref="S2.E10.m1.10.11.3.1.3.3">1</cn></apply></apply><apply id="S2.E10.m1.10.11.3.2.cmml" xref="S2.E10.m1.10.11.3.2"><ci id="S2.E10.m1.10.11.3.2.1.cmml" 
xref="S2.E10.m1.10.11.3.2.1">⋅</ci><apply id="S2.E10.m1.10.11.3.2.2.cmml" xref="S2.E10.m1.10.11.3.2.2"><times id="S2.E10.m1.10.11.3.2.2.1.cmml" xref="S2.E10.m1.10.11.3.2.2.1"></times><apply id="S2.E10.m1.10.11.3.2.2.2.cmml" xref="S2.E10.m1.10.11.3.2.2.2"><csymbol cd="ambiguous" id="S2.E10.m1.10.11.3.2.2.2.1.cmml" xref="S2.E10.m1.10.11.3.2.2.2">subscript</csymbol><ci id="S2.E10.m1.10.11.3.2.2.2.2.cmml" xref="S2.E10.m1.10.11.3.2.2.2.2">𝑟</ci><apply id="S2.E10.m1.10.11.3.2.2.2.3.cmml" xref="S2.E10.m1.10.11.3.2.2.2.3"><times id="S2.E10.m1.10.11.3.2.2.2.3.1.cmml" xref="S2.E10.m1.10.11.3.2.2.2.3.1"></times><ci id="S2.E10.m1.10.11.3.2.2.2.3.2.cmml" xref="S2.E10.m1.10.11.3.2.2.2.3.2">𝑠</ci><ci id="S2.E10.m1.10.11.3.2.2.2.3.3.cmml" xref="S2.E10.m1.10.11.3.2.2.2.3.3">𝑎</ci></apply></apply><interval closure="open" id="S2.E10.m1.10.11.3.2.2.3.1.cmml" xref="S2.E10.m1.10.11.3.2.2.3.2"><ci id="S2.E10.m1.9.9.cmml" xref="S2.E10.m1.9.9">𝑠</ci><ci id="S2.E10.m1.10.10.cmml" xref="S2.E10.m1.10.10">𝑡</ci></interval></apply><apply id="S2.E10.m1.4.4.cmml" xref="S2.E10.m1.4.4"><divide id="S2.E10.m1.4.4.5.cmml" xref="S2.E10.m1.4.4"></divide><apply id="S2.E10.m1.4.4.4.cmml" xref="S2.E10.m1.4.4.4"><times id="S2.E10.m1.4.4.4.5.cmml" xref="S2.E10.m1.4.4.4.5"></times><apply id="S2.E10.m1.3.3.3.3.cmml" xref="S2.E10.m1.3.3.3.3"><ci id="S2.E10.m1.3.3.3.3.2.cmml" xref="S2.E10.m1.3.3.3.3.2">⋅</ci><apply id="S2.E10.m1.3.3.3.3.1.cmml" xref="S2.E10.m1.3.3.3.3.1"><times id="S2.E10.m1.3.3.3.3.1.2.cmml" xref="S2.E10.m1.3.3.3.3.1.2"></times><ci id="S2.E10.m1.3.3.3.3.1.3.cmml" xref="S2.E10.m1.3.3.3.3.1.3">𝛼</ci><interval closure="open" id="S2.E10.m1.3.3.3.3.1.1.2.cmml" xref="S2.E10.m1.3.3.3.3.1.1.1"><ci id="S2.E10.m1.1.1.1.1.cmml" xref="S2.E10.m1.1.1.1.1">𝑡</ci><apply id="S2.E10.m1.3.3.3.3.1.1.1.1.cmml" xref="S2.E10.m1.3.3.3.3.1.1.1.1"><times id="S2.E10.m1.3.3.3.3.1.1.1.1.1.cmml" xref="S2.E10.m1.3.3.3.3.1.1.1.1.1"></times><cn id="S2.E10.m1.3.3.3.3.1.1.1.1.2.cmml" type="integer" 
xref="S2.E10.m1.3.3.3.3.1.1.1.1.2">2</cn><ci id="S2.E10.m1.3.3.3.3.1.1.1.1.3.cmml" xref="S2.E10.m1.3.3.3.3.1.1.1.1.3">𝑢</ci></apply></interval></apply><apply id="S2.E10.m1.3.3.3.3.3.cmml" xref="S2.E10.m1.3.3.3.3.3"><ci id="S2.E10.m1.3.3.3.3.3.1.cmml" xref="S2.E10.m1.3.3.3.3.3.1">^</ci><ci id="S2.E10.m1.3.3.3.3.3.2.cmml" xref="S2.E10.m1.3.3.3.3.3.2">𝛽</ci></apply></apply><interval closure="open" id="S2.E10.m1.4.4.4.4.2.cmml" xref="S2.E10.m1.4.4.4.4.1"><ci id="S2.E10.m1.2.2.2.2.cmml" xref="S2.E10.m1.2.2.2.2">𝑡</ci><apply id="S2.E10.m1.4.4.4.4.1.1.cmml" xref="S2.E10.m1.4.4.4.4.1.1"><times id="S2.E10.m1.4.4.4.4.1.1.1.cmml" xref="S2.E10.m1.4.4.4.4.1.1.1"></times><cn id="S2.E10.m1.4.4.4.4.1.1.2.cmml" type="integer" xref="S2.E10.m1.4.4.4.4.1.1.2">2</cn><ci id="S2.E10.m1.4.4.4.4.1.1.3.cmml" xref="S2.E10.m1.4.4.4.4.1.1.3">𝑢</ci></apply></interval></apply><apply id="S2.E10.m1.4.4.6.cmml" xref="S2.E10.m1.4.4.6"><csymbol cd="ambiguous" id="S2.E10.m1.4.4.6.1.cmml" xref="S2.E10.m1.4.4.6">subscript</csymbol><apply id="S2.E10.m1.4.4.6.2.cmml" xref="S2.E10.m1.4.4.6"><csymbol cd="ambiguous" id="S2.E10.m1.4.4.6.2.1.cmml" xref="S2.E10.m1.4.4.6">superscript</csymbol><ci id="S2.E10.m1.4.4.6.2.2.cmml" xref="S2.E10.m1.4.4.6.2.2">𝑦</ci><ci id="S2.E10.m1.4.4.6.2.3.cmml" xref="S2.E10.m1.4.4.6.2.3">𝑡</ci></apply><apply id="S2.E10.m1.4.4.6.3.cmml" xref="S2.E10.m1.4.4.6.3"><csymbol cd="ambiguous" id="S2.E10.m1.4.4.6.3.1.cmml" xref="S2.E10.m1.4.4.6.3">subscript</csymbol><ci id="S2.E10.m1.4.4.6.3.2.cmml" xref="S2.E10.m1.4.4.6.3.2">𝜋</ci><ci id="S2.E10.m1.4.4.6.3.3.cmml" xref="S2.E10.m1.4.4.6.3.3">𝑡</ci></apply></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E10.m1.10c">\mathcal{J}^{\prime}_{sa}(l,x,s,u)=\sum^{T}_{t=1}r_{sa}(s,t)\cdot\frac{\alpha(% t,2u)\cdot\hat{\beta}(t,2u)}{y^{t}_{\pi_{t}}}</annotation><annotation encoding="application/x-llamapun" id="S2.E10.m1.10d">caligraphic_J start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT 
start_POSTSUBSCRIPT italic_s italic_a end_POSTSUBSCRIPT ( italic_l , italic_x , italic_s , italic_u ) = ∑ start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t = 1 end_POSTSUBSCRIPT italic_r start_POSTSUBSCRIPT italic_s italic_a end_POSTSUBSCRIPT ( italic_s , italic_t ) ⋅ divide start_ARG italic_α ( italic_t , 2 italic_u ) ⋅ over^ start_ARG italic_β end_ARG ( italic_t , 2 italic_u ) end_ARG start_ARG italic_y start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_π start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_ARG</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(10)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S2.SS2.p3.10">Note that <math alttext="r_{sa}(s,t)" class="ltx_Math" display="inline" id="S2.SS2.p3.7.m1.2"><semantics id="S2.SS2.p3.7.m1.2a"><mrow id="S2.SS2.p3.7.m1.2.3" xref="S2.SS2.p3.7.m1.2.3.cmml"><msub id="S2.SS2.p3.7.m1.2.3.2" xref="S2.SS2.p3.7.m1.2.3.2.cmml"><mi id="S2.SS2.p3.7.m1.2.3.2.2" xref="S2.SS2.p3.7.m1.2.3.2.2.cmml">r</mi><mrow id="S2.SS2.p3.7.m1.2.3.2.3" xref="S2.SS2.p3.7.m1.2.3.2.3.cmml"><mi id="S2.SS2.p3.7.m1.2.3.2.3.2" xref="S2.SS2.p3.7.m1.2.3.2.3.2.cmml">s</mi><mo id="S2.SS2.p3.7.m1.2.3.2.3.1" xref="S2.SS2.p3.7.m1.2.3.2.3.1.cmml"></mo><mi id="S2.SS2.p3.7.m1.2.3.2.3.3" xref="S2.SS2.p3.7.m1.2.3.2.3.3.cmml">a</mi></mrow></msub><mo id="S2.SS2.p3.7.m1.2.3.1" xref="S2.SS2.p3.7.m1.2.3.1.cmml"></mo><mrow id="S2.SS2.p3.7.m1.2.3.3.2" xref="S2.SS2.p3.7.m1.2.3.3.1.cmml"><mo id="S2.SS2.p3.7.m1.2.3.3.2.1" stretchy="false" xref="S2.SS2.p3.7.m1.2.3.3.1.cmml">(</mo><mi id="S2.SS2.p3.7.m1.1.1" xref="S2.SS2.p3.7.m1.1.1.cmml">s</mi><mo id="S2.SS2.p3.7.m1.2.3.3.2.2" xref="S2.SS2.p3.7.m1.2.3.3.1.cmml">,</mo><mi id="S2.SS2.p3.7.m1.2.2" xref="S2.SS2.p3.7.m1.2.2.cmml">t</mi><mo 
id="S2.SS2.p3.7.m1.2.3.3.2.3" stretchy="false" xref="S2.SS2.p3.7.m1.2.3.3.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.7.m1.2b"><apply id="S2.SS2.p3.7.m1.2.3.cmml" xref="S2.SS2.p3.7.m1.2.3"><times id="S2.SS2.p3.7.m1.2.3.1.cmml" xref="S2.SS2.p3.7.m1.2.3.1"></times><apply id="S2.SS2.p3.7.m1.2.3.2.cmml" xref="S2.SS2.p3.7.m1.2.3.2"><csymbol cd="ambiguous" id="S2.SS2.p3.7.m1.2.3.2.1.cmml" xref="S2.SS2.p3.7.m1.2.3.2">subscript</csymbol><ci id="S2.SS2.p3.7.m1.2.3.2.2.cmml" xref="S2.SS2.p3.7.m1.2.3.2.2">𝑟</ci><apply id="S2.SS2.p3.7.m1.2.3.2.3.cmml" xref="S2.SS2.p3.7.m1.2.3.2.3"><times id="S2.SS2.p3.7.m1.2.3.2.3.1.cmml" xref="S2.SS2.p3.7.m1.2.3.2.3.1"></times><ci id="S2.SS2.p3.7.m1.2.3.2.3.2.cmml" xref="S2.SS2.p3.7.m1.2.3.2.3.2">𝑠</ci><ci id="S2.SS2.p3.7.m1.2.3.2.3.3.cmml" xref="S2.SS2.p3.7.m1.2.3.2.3.3">𝑎</ci></apply></apply><interval closure="open" id="S2.SS2.p3.7.m1.2.3.3.1.cmml" xref="S2.SS2.p3.7.m1.2.3.3.2"><ci id="S2.SS2.p3.7.m1.1.1.cmml" xref="S2.SS2.p3.7.m1.1.1">𝑠</ci><ci id="S2.SS2.p3.7.m1.2.2.cmml" xref="S2.SS2.p3.7.m1.2.2">𝑡</ci></interval></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.7.m1.2c">r_{sa}(s,t)</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.7.m1.2d">italic_r start_POSTSUBSCRIPT italic_s italic_a end_POSTSUBSCRIPT ( italic_s , italic_t )</annotation></semantics></math> is consistently <math alttext="{\,<}\,0" class="ltx_Math" display="inline" id="S2.SS2.p3.8.m2.1"><semantics id="S2.SS2.p3.8.m2.1a"><mrow id="S2.SS2.p3.8.m2.1.1" xref="S2.SS2.p3.8.m2.1.1.cmml"><mi id="S2.SS2.p3.8.m2.1.1.2" xref="S2.SS2.p3.8.m2.1.1.2.cmml"></mi><mo id="S2.SS2.p3.8.m2.1.1.1" lspace="0.448em" xref="S2.SS2.p3.8.m2.1.1.1.cmml"><</mo><mn id="S2.SS2.p3.8.m2.1.1.3" xref="S2.SS2.p3.8.m2.1.1.3.cmml"> 0</mn></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.8.m2.1b"><apply id="S2.SS2.p3.8.m2.1.1.cmml" xref="S2.SS2.p3.8.m2.1.1"><lt id="S2.SS2.p3.8.m2.1.1.1.cmml" 
xref="S2.SS2.p3.8.m2.1.1.1"></lt><csymbol cd="latexml" id="S2.SS2.p3.8.m2.1.1.2.cmml" xref="S2.SS2.p3.8.m2.1.1.2">absent</csymbol><cn id="S2.SS2.p3.8.m2.1.1.3.cmml" type="float" xref="S2.SS2.p3.8.m2.1.1.3"> 0</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.8.m2.1c">{\,<}\,0</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.8.m2.1d">< 0</annotation></semantics></math>, so here we minimize expected Bayes risk <math alttext="\mathcal{J}^{\prime}_{sa}" class="ltx_Math" display="inline" id="S2.SS2.p3.9.m3.1"><semantics id="S2.SS2.p3.9.m3.1a"><msubsup id="S2.SS2.p3.9.m3.1.1" xref="S2.SS2.p3.9.m3.1.1.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.SS2.p3.9.m3.1.1.2.2" xref="S2.SS2.p3.9.m3.1.1.2.2.cmml">𝒥</mi><mrow id="S2.SS2.p3.9.m3.1.1.3" xref="S2.SS2.p3.9.m3.1.1.3.cmml"><mi id="S2.SS2.p3.9.m3.1.1.3.2" xref="S2.SS2.p3.9.m3.1.1.3.2.cmml">s</mi><mo id="S2.SS2.p3.9.m3.1.1.3.1" xref="S2.SS2.p3.9.m3.1.1.3.1.cmml"></mo><mi id="S2.SS2.p3.9.m3.1.1.3.3" xref="S2.SS2.p3.9.m3.1.1.3.3.cmml">a</mi></mrow><mo id="S2.SS2.p3.9.m3.1.1.2.3" xref="S2.SS2.p3.9.m3.1.1.2.3.cmml">′</mo></msubsup><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.9.m3.1b"><apply id="S2.SS2.p3.9.m3.1.1.cmml" xref="S2.SS2.p3.9.m3.1.1"><csymbol cd="ambiguous" id="S2.SS2.p3.9.m3.1.1.1.cmml" xref="S2.SS2.p3.9.m3.1.1">subscript</csymbol><apply id="S2.SS2.p3.9.m3.1.1.2.cmml" xref="S2.SS2.p3.9.m3.1.1"><csymbol cd="ambiguous" id="S2.SS2.p3.9.m3.1.1.2.1.cmml" xref="S2.SS2.p3.9.m3.1.1">superscript</csymbol><ci id="S2.SS2.p3.9.m3.1.1.2.2.cmml" xref="S2.SS2.p3.9.m3.1.1.2.2">𝒥</ci><ci id="S2.SS2.p3.9.m3.1.1.2.3.cmml" xref="S2.SS2.p3.9.m3.1.1.2.3">′</ci></apply><apply id="S2.SS2.p3.9.m3.1.1.3.cmml" xref="S2.SS2.p3.9.m3.1.1.3"><times id="S2.SS2.p3.9.m3.1.1.3.1.cmml" xref="S2.SS2.p3.9.m3.1.1.3.1"></times><ci id="S2.SS2.p3.9.m3.1.1.3.2.cmml" xref="S2.SS2.p3.9.m3.1.1.3.2">𝑠</ci><ci id="S2.SS2.p3.9.m3.1.1.3.3.cmml" 
xref="S2.SS2.p3.9.m3.1.1.3.3">𝑎</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.9.m3.1c">\mathcal{J}^{\prime}_{sa}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.9.m3.1d">caligraphic_J start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_s italic_a end_POSTSUBSCRIPT</annotation></semantics></math>, which contrasts with maximizing a posterior as in Eq. <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#S2.E1" title="In II-A Revisit CTC in speech recognition ‣ II Methods ‣ Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC"><span class="ltx_text ltx_ref_tag">1</span></a>. Furthermore, to constrain every token <math alttext="l_{u}\in l" class="ltx_Math" display="inline" id="S2.SS2.p3.10.m4.1"><semantics id="S2.SS2.p3.10.m4.1a"><mrow id="S2.SS2.p3.10.m4.1.1" xref="S2.SS2.p3.10.m4.1.1.cmml"><msub id="S2.SS2.p3.10.m4.1.1.2" xref="S2.SS2.p3.10.m4.1.1.2.cmml"><mi id="S2.SS2.p3.10.m4.1.1.2.2" xref="S2.SS2.p3.10.m4.1.1.2.2.cmml">l</mi><mi id="S2.SS2.p3.10.m4.1.1.2.3" xref="S2.SS2.p3.10.m4.1.1.2.3.cmml">u</mi></msub><mo id="S2.SS2.p3.10.m4.1.1.1" xref="S2.SS2.p3.10.m4.1.1.1.cmml">∈</mo><mi id="S2.SS2.p3.10.m4.1.1.3" xref="S2.SS2.p3.10.m4.1.1.3.cmml">l</mi></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.10.m4.1b"><apply id="S2.SS2.p3.10.m4.1.1.cmml" xref="S2.SS2.p3.10.m4.1.1"><in id="S2.SS2.p3.10.m4.1.1.1.cmml" xref="S2.SS2.p3.10.m4.1.1.1"></in><apply id="S2.SS2.p3.10.m4.1.1.2.cmml" xref="S2.SS2.p3.10.m4.1.1.2"><csymbol cd="ambiguous" id="S2.SS2.p3.10.m4.1.1.2.1.cmml" xref="S2.SS2.p3.10.m4.1.1.2">subscript</csymbol><ci id="S2.SS2.p3.10.m4.1.1.2.2.cmml" xref="S2.SS2.p3.10.m4.1.1.2.2">𝑙</ci><ci id="S2.SS2.p3.10.m4.1.1.2.3.cmml" xref="S2.SS2.p3.10.m4.1.1.2.3">𝑢</ci></apply><ci id="S2.SS2.p3.10.m4.1.1.3.cmml" xref="S2.SS2.p3.10.m4.1.1.3">𝑙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.10.m4.1c">l_{u}\in 
l</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.10.m4.1d">italic_l start_POSTSUBSCRIPT italic_u end_POSTSUBSCRIPT ∈ italic_l</annotation></semantics></math> from all speakers, the final training objective is to minimize the following:</p> <table class="ltx_equation ltx_eqn_table" id="S2.E11"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="\mathcal{J}_{sa}(l,x)=\frac{1}{S}\cdot\sum^{S}_{s=1}\,[\,\frac{1}{U}\cdot\sum^% {U}_{u}\mathbbm{1}(s,u)\cdot log\mathcal{J}^{\prime}_{sa}(l,x,s,u)\,]" class="ltx_Math" display="block" id="S2.E11.m1.9"><semantics id="S2.E11.m1.9a"><mrow id="S2.E11.m1.9.9" xref="S2.E11.m1.9.9.cmml"><mrow id="S2.E11.m1.9.9.3" xref="S2.E11.m1.9.9.3.cmml"><msub id="S2.E11.m1.9.9.3.2" xref="S2.E11.m1.9.9.3.2.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.E11.m1.9.9.3.2.2" xref="S2.E11.m1.9.9.3.2.2.cmml">𝒥</mi><mrow id="S2.E11.m1.9.9.3.2.3" xref="S2.E11.m1.9.9.3.2.3.cmml"><mi id="S2.E11.m1.9.9.3.2.3.2" xref="S2.E11.m1.9.9.3.2.3.2.cmml">s</mi><mo id="S2.E11.m1.9.9.3.2.3.1" xref="S2.E11.m1.9.9.3.2.3.1.cmml"></mo><mi id="S2.E11.m1.9.9.3.2.3.3" xref="S2.E11.m1.9.9.3.2.3.3.cmml">a</mi></mrow></msub><mo id="S2.E11.m1.9.9.3.1" xref="S2.E11.m1.9.9.3.1.cmml"></mo><mrow id="S2.E11.m1.9.9.3.3.2" xref="S2.E11.m1.9.9.3.3.1.cmml"><mo id="S2.E11.m1.9.9.3.3.2.1" stretchy="false" xref="S2.E11.m1.9.9.3.3.1.cmml">(</mo><mi id="S2.E11.m1.1.1" xref="S2.E11.m1.1.1.cmml">l</mi><mo id="S2.E11.m1.9.9.3.3.2.2" xref="S2.E11.m1.9.9.3.3.1.cmml">,</mo><mi id="S2.E11.m1.2.2" xref="S2.E11.m1.2.2.cmml">x</mi><mo id="S2.E11.m1.9.9.3.3.2.3" stretchy="false" xref="S2.E11.m1.9.9.3.3.1.cmml">)</mo></mrow></mrow><mo id="S2.E11.m1.9.9.2" xref="S2.E11.m1.9.9.2.cmml">=</mo><mrow id="S2.E11.m1.9.9.1" xref="S2.E11.m1.9.9.1.cmml"><mfrac id="S2.E11.m1.9.9.1.3" xref="S2.E11.m1.9.9.1.3.cmml"><mn id="S2.E11.m1.9.9.1.3.2" 
xref="S2.E11.m1.9.9.1.3.2.cmml">1</mn><mi id="S2.E11.m1.9.9.1.3.3" xref="S2.E11.m1.9.9.1.3.3.cmml">S</mi></mfrac><mo id="S2.E11.m1.9.9.1.2" lspace="0.222em" rspace="0.055em" xref="S2.E11.m1.9.9.1.2.cmml">⋅</mo><mrow id="S2.E11.m1.9.9.1.1" xref="S2.E11.m1.9.9.1.1.cmml"><munderover id="S2.E11.m1.9.9.1.1.2" xref="S2.E11.m1.9.9.1.1.2.cmml"><mo id="S2.E11.m1.9.9.1.1.2.2.2" movablelimits="false" rspace="0em" xref="S2.E11.m1.9.9.1.1.2.2.2.cmml">∑</mo><mrow id="S2.E11.m1.9.9.1.1.2.3" xref="S2.E11.m1.9.9.1.1.2.3.cmml"><mi id="S2.E11.m1.9.9.1.1.2.3.2" xref="S2.E11.m1.9.9.1.1.2.3.2.cmml">s</mi><mo id="S2.E11.m1.9.9.1.1.2.3.1" xref="S2.E11.m1.9.9.1.1.2.3.1.cmml">=</mo><mn id="S2.E11.m1.9.9.1.1.2.3.3" xref="S2.E11.m1.9.9.1.1.2.3.3.cmml">1</mn></mrow><mi id="S2.E11.m1.9.9.1.1.2.2.3" xref="S2.E11.m1.9.9.1.1.2.2.3.cmml">S</mi></munderover><mrow id="S2.E11.m1.9.9.1.1.1.1" xref="S2.E11.m1.9.9.1.1.1.2.cmml"><mo id="S2.E11.m1.9.9.1.1.1.1.2" rspace="0.170em" stretchy="false" xref="S2.E11.m1.9.9.1.1.1.2.1.cmml">[</mo><mrow id="S2.E11.m1.9.9.1.1.1.1.1" xref="S2.E11.m1.9.9.1.1.1.1.1.cmml"><mfrac id="S2.E11.m1.9.9.1.1.1.1.1.2" xref="S2.E11.m1.9.9.1.1.1.1.1.2.cmml"><mn id="S2.E11.m1.9.9.1.1.1.1.1.2.2" xref="S2.E11.m1.9.9.1.1.1.1.1.2.2.cmml">1</mn><mi id="S2.E11.m1.9.9.1.1.1.1.1.2.3" xref="S2.E11.m1.9.9.1.1.1.1.1.2.3.cmml">U</mi></mfrac><mo id="S2.E11.m1.9.9.1.1.1.1.1.1" lspace="0.222em" rspace="0.055em" xref="S2.E11.m1.9.9.1.1.1.1.1.1.cmml">⋅</mo><mrow id="S2.E11.m1.9.9.1.1.1.1.1.3" xref="S2.E11.m1.9.9.1.1.1.1.1.3.cmml"><munderover id="S2.E11.m1.9.9.1.1.1.1.1.3.1" xref="S2.E11.m1.9.9.1.1.1.1.1.3.1.cmml"><mo id="S2.E11.m1.9.9.1.1.1.1.1.3.1.2.2" movablelimits="false" xref="S2.E11.m1.9.9.1.1.1.1.1.3.1.2.2.cmml">∑</mo><mi id="S2.E11.m1.9.9.1.1.1.1.1.3.1.3" xref="S2.E11.m1.9.9.1.1.1.1.1.3.1.3.cmml">u</mi><mi id="S2.E11.m1.9.9.1.1.1.1.1.3.1.2.3" xref="S2.E11.m1.9.9.1.1.1.1.1.3.1.2.3.cmml">U</mi></munderover><mrow id="S2.E11.m1.9.9.1.1.1.1.1.3.2" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.cmml"><mrow 
id="S2.E11.m1.9.9.1.1.1.1.1.3.2.2" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.cmml"><mrow id="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.2" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.2.cmml"><mn id="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.2.2" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.2.2.cmml">𝟙</mn><mo id="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.2.1" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.2.1.cmml"></mo><mrow id="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.2.3.2" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.2.3.1.cmml"><mo id="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.2.3.2.1" stretchy="false" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.2.3.1.cmml">(</mo><mi id="S2.E11.m1.3.3" xref="S2.E11.m1.3.3.cmml">s</mi><mo id="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.2.3.2.2" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.2.3.1.cmml">,</mo><mi id="S2.E11.m1.4.4" xref="S2.E11.m1.4.4.cmml">u</mi><mo id="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.2.3.2.3" rspace="0.055em" stretchy="false" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.2.3.1.cmml">)</mo></mrow></mrow><mo id="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.1" rspace="0.222em" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.1.cmml">⋅</mo><mi id="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.3" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.3.cmml">l</mi></mrow><mo id="S2.E11.m1.9.9.1.1.1.1.1.3.2.1" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.1.cmml"></mo><mi id="S2.E11.m1.9.9.1.1.1.1.1.3.2.3" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.3.cmml">o</mi><mo id="S2.E11.m1.9.9.1.1.1.1.1.3.2.1a" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.1.cmml"></mo><mi id="S2.E11.m1.9.9.1.1.1.1.1.3.2.4" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.4.cmml">g</mi><mo id="S2.E11.m1.9.9.1.1.1.1.1.3.2.1b" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.1.cmml"></mo><msubsup id="S2.E11.m1.9.9.1.1.1.1.1.3.2.5" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.2.2" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.2.2.cmml">𝒥</mi><mrow id="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.3" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.3.cmml"><mi id="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.3.2" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.3.2.cmml">s</mi><mo 
id="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.3.1" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.3.1.cmml"></mo><mi id="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.3.3" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.3.3.cmml">a</mi></mrow><mo id="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.2.3" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.2.3.cmml">′</mo></msubsup><mo id="S2.E11.m1.9.9.1.1.1.1.1.3.2.1c" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.1.cmml"></mo><mrow id="S2.E11.m1.9.9.1.1.1.1.1.3.2.6.2" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.6.1.cmml"><mo id="S2.E11.m1.9.9.1.1.1.1.1.3.2.6.2.1" stretchy="false" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.6.1.cmml">(</mo><mi id="S2.E11.m1.5.5" xref="S2.E11.m1.5.5.cmml">l</mi><mo id="S2.E11.m1.9.9.1.1.1.1.1.3.2.6.2.2" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.6.1.cmml">,</mo><mi id="S2.E11.m1.6.6" xref="S2.E11.m1.6.6.cmml">x</mi><mo id="S2.E11.m1.9.9.1.1.1.1.1.3.2.6.2.3" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.6.1.cmml">,</mo><mi id="S2.E11.m1.7.7" xref="S2.E11.m1.7.7.cmml">s</mi><mo id="S2.E11.m1.9.9.1.1.1.1.1.3.2.6.2.4" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.6.1.cmml">,</mo><mi id="S2.E11.m1.8.8" xref="S2.E11.m1.8.8.cmml">u</mi><mo id="S2.E11.m1.9.9.1.1.1.1.1.3.2.6.2.5" rspace="0.170em" stretchy="false" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.6.1.cmml">)</mo></mrow></mrow></mrow></mrow><mo id="S2.E11.m1.9.9.1.1.1.1.3" stretchy="false" xref="S2.E11.m1.9.9.1.1.1.2.1.cmml">]</mo></mrow></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.E11.m1.9b"><apply id="S2.E11.m1.9.9.cmml" xref="S2.E11.m1.9.9"><eq id="S2.E11.m1.9.9.2.cmml" xref="S2.E11.m1.9.9.2"></eq><apply id="S2.E11.m1.9.9.3.cmml" xref="S2.E11.m1.9.9.3"><times id="S2.E11.m1.9.9.3.1.cmml" xref="S2.E11.m1.9.9.3.1"></times><apply id="S2.E11.m1.9.9.3.2.cmml" xref="S2.E11.m1.9.9.3.2"><csymbol cd="ambiguous" id="S2.E11.m1.9.9.3.2.1.cmml" xref="S2.E11.m1.9.9.3.2">subscript</csymbol><ci id="S2.E11.m1.9.9.3.2.2.cmml" xref="S2.E11.m1.9.9.3.2.2">𝒥</ci><apply id="S2.E11.m1.9.9.3.2.3.cmml" xref="S2.E11.m1.9.9.3.2.3"><times id="S2.E11.m1.9.9.3.2.3.1.cmml" 
xref="S2.E11.m1.9.9.3.2.3.1"></times><ci id="S2.E11.m1.9.9.3.2.3.2.cmml" xref="S2.E11.m1.9.9.3.2.3.2">𝑠</ci><ci id="S2.E11.m1.9.9.3.2.3.3.cmml" xref="S2.E11.m1.9.9.3.2.3.3">𝑎</ci></apply></apply><interval closure="open" id="S2.E11.m1.9.9.3.3.1.cmml" xref="S2.E11.m1.9.9.3.3.2"><ci id="S2.E11.m1.1.1.cmml" xref="S2.E11.m1.1.1">𝑙</ci><ci id="S2.E11.m1.2.2.cmml" xref="S2.E11.m1.2.2">𝑥</ci></interval></apply><apply id="S2.E11.m1.9.9.1.cmml" xref="S2.E11.m1.9.9.1"><ci id="S2.E11.m1.9.9.1.2.cmml" xref="S2.E11.m1.9.9.1.2">⋅</ci><apply id="S2.E11.m1.9.9.1.3.cmml" xref="S2.E11.m1.9.9.1.3"><divide id="S2.E11.m1.9.9.1.3.1.cmml" xref="S2.E11.m1.9.9.1.3"></divide><cn id="S2.E11.m1.9.9.1.3.2.cmml" type="integer" xref="S2.E11.m1.9.9.1.3.2">1</cn><ci id="S2.E11.m1.9.9.1.3.3.cmml" xref="S2.E11.m1.9.9.1.3.3">𝑆</ci></apply><apply id="S2.E11.m1.9.9.1.1.cmml" xref="S2.E11.m1.9.9.1.1"><apply id="S2.E11.m1.9.9.1.1.2.cmml" xref="S2.E11.m1.9.9.1.1.2"><csymbol cd="ambiguous" id="S2.E11.m1.9.9.1.1.2.1.cmml" xref="S2.E11.m1.9.9.1.1.2">subscript</csymbol><apply id="S2.E11.m1.9.9.1.1.2.2.cmml" xref="S2.E11.m1.9.9.1.1.2"><csymbol cd="ambiguous" id="S2.E11.m1.9.9.1.1.2.2.1.cmml" xref="S2.E11.m1.9.9.1.1.2">superscript</csymbol><sum id="S2.E11.m1.9.9.1.1.2.2.2.cmml" xref="S2.E11.m1.9.9.1.1.2.2.2"></sum><ci id="S2.E11.m1.9.9.1.1.2.2.3.cmml" xref="S2.E11.m1.9.9.1.1.2.2.3">𝑆</ci></apply><apply id="S2.E11.m1.9.9.1.1.2.3.cmml" xref="S2.E11.m1.9.9.1.1.2.3"><eq id="S2.E11.m1.9.9.1.1.2.3.1.cmml" xref="S2.E11.m1.9.9.1.1.2.3.1"></eq><ci id="S2.E11.m1.9.9.1.1.2.3.2.cmml" xref="S2.E11.m1.9.9.1.1.2.3.2">𝑠</ci><cn id="S2.E11.m1.9.9.1.1.2.3.3.cmml" type="integer" xref="S2.E11.m1.9.9.1.1.2.3.3">1</cn></apply></apply><apply id="S2.E11.m1.9.9.1.1.1.2.cmml" xref="S2.E11.m1.9.9.1.1.1.1"><csymbol cd="latexml" id="S2.E11.m1.9.9.1.1.1.2.1.cmml" xref="S2.E11.m1.9.9.1.1.1.1.2">delimited-[]</csymbol><apply id="S2.E11.m1.9.9.1.1.1.1.1.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1"><ci id="S2.E11.m1.9.9.1.1.1.1.1.1.cmml" 
xref="S2.E11.m1.9.9.1.1.1.1.1.1">⋅</ci><apply id="S2.E11.m1.9.9.1.1.1.1.1.2.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.2"><divide id="S2.E11.m1.9.9.1.1.1.1.1.2.1.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.2"></divide><cn id="S2.E11.m1.9.9.1.1.1.1.1.2.2.cmml" type="integer" xref="S2.E11.m1.9.9.1.1.1.1.1.2.2">1</cn><ci id="S2.E11.m1.9.9.1.1.1.1.1.2.3.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.2.3">𝑈</ci></apply><apply id="S2.E11.m1.9.9.1.1.1.1.1.3.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3"><apply id="S2.E11.m1.9.9.1.1.1.1.1.3.1.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.1"><csymbol cd="ambiguous" id="S2.E11.m1.9.9.1.1.1.1.1.3.1.1.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.1">subscript</csymbol><apply id="S2.E11.m1.9.9.1.1.1.1.1.3.1.2.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.1"><csymbol cd="ambiguous" id="S2.E11.m1.9.9.1.1.1.1.1.3.1.2.1.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.1">superscript</csymbol><sum id="S2.E11.m1.9.9.1.1.1.1.1.3.1.2.2.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.1.2.2"></sum><ci id="S2.E11.m1.9.9.1.1.1.1.1.3.1.2.3.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.1.2.3">𝑈</ci></apply><ci id="S2.E11.m1.9.9.1.1.1.1.1.3.1.3.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.1.3">𝑢</ci></apply><apply id="S2.E11.m1.9.9.1.1.1.1.1.3.2.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2"><times id="S2.E11.m1.9.9.1.1.1.1.1.3.2.1.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.1"></times><apply id="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.2"><ci id="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.1.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.1">⋅</ci><apply id="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.2.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.2"><times id="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.2.1.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.2.1"></times><cn id="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.2.2.cmml" type="integer" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.2.2">1</cn><interval closure="open" id="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.2.3.1.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.2.3.2"><ci id="S2.E11.m1.3.3.cmml" xref="S2.E11.m1.3.3">𝑠</ci><ci id="S2.E11.m1.4.4.cmml" 
xref="S2.E11.m1.4.4">𝑢</ci></interval></apply><ci id="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.3.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.2.3">𝑙</ci></apply><ci id="S2.E11.m1.9.9.1.1.1.1.1.3.2.3.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.3">𝑜</ci><ci id="S2.E11.m1.9.9.1.1.1.1.1.3.2.4.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.4">𝑔</ci><apply id="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.5"><csymbol cd="ambiguous" id="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.1.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.5">subscript</csymbol><apply id="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.2.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.5"><csymbol cd="ambiguous" id="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.2.1.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.5">superscript</csymbol><ci id="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.2.2.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.2.2">𝒥</ci><ci id="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.2.3.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.2.3">′</ci></apply><apply id="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.3.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.3"><times id="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.3.1.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.3.1"></times><ci id="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.3.2.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.3.2">𝑠</ci><ci id="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.3.3.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.5.3.3">𝑎</ci></apply></apply><vector id="S2.E11.m1.9.9.1.1.1.1.1.3.2.6.1.cmml" xref="S2.E11.m1.9.9.1.1.1.1.1.3.2.6.2"><ci id="S2.E11.m1.5.5.cmml" xref="S2.E11.m1.5.5">𝑙</ci><ci id="S2.E11.m1.6.6.cmml" xref="S2.E11.m1.6.6">𝑥</ci><ci id="S2.E11.m1.7.7.cmml" xref="S2.E11.m1.7.7">𝑠</ci><ci id="S2.E11.m1.8.8.cmml" xref="S2.E11.m1.8.8">𝑢</ci></vector></apply></apply></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E11.m1.9c">\mathcal{J}_{sa}(l,x)=\frac{1}{S}\cdot\sum^{S}_{s=1}\,[\,\frac{1}{U}\cdot\sum^% {U}_{u}\mathbbm{1}(s,u)\cdot log\mathcal{J}^{\prime}_{sa}(l,x,s,u)\,]</annotation><annotation encoding="application/x-llamapun" 
id="S2.E11.m1.9d">caligraphic_J start_POSTSUBSCRIPT italic_s italic_a end_POSTSUBSCRIPT ( italic_l , italic_x ) = divide start_ARG 1 end_ARG start_ARG italic_S end_ARG ⋅ ∑ start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_s = 1 end_POSTSUBSCRIPT [ divide start_ARG 1 end_ARG start_ARG italic_U end_ARG ⋅ ∑ start_POSTSUPERSCRIPT italic_U end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_u end_POSTSUBSCRIPT blackboard_1 ( italic_s , italic_u ) ⋅ italic_l italic_o italic_g caligraphic_J start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_s italic_a end_POSTSUBSCRIPT ( italic_l , italic_x , italic_s , italic_u ) ]</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(11)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S2.SS2.p3.17">where <math alttext="S" class="ltx_Math" display="inline" id="S2.SS2.p3.11.m1.1"><semantics id="S2.SS2.p3.11.m1.1a"><mi id="S2.SS2.p3.11.m1.1.1" xref="S2.SS2.p3.11.m1.1.1.cmml">S</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.11.m1.1b"><ci id="S2.SS2.p3.11.m1.1.1.cmml" xref="S2.SS2.p3.11.m1.1.1">𝑆</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.11.m1.1c">S</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.11.m1.1d">italic_S</annotation></semantics></math> is the total number of speakers and the indicator <math alttext="\mathbbm{1}(s,u)\,{=}\,1" class="ltx_Math" display="inline" id="S2.SS2.p3.12.m2.2"><semantics id="S2.SS2.p3.12.m2.2a"><mrow id="S2.SS2.p3.12.m2.2.3" xref="S2.SS2.p3.12.m2.2.3.cmml"><mrow id="S2.SS2.p3.12.m2.2.3.2" xref="S2.SS2.p3.12.m2.2.3.2.cmml"><mn id="S2.SS2.p3.12.m2.2.3.2.2" xref="S2.SS2.p3.12.m2.2.3.2.2.cmml">𝟙</mn><mo id="S2.SS2.p3.12.m2.2.3.2.1" xref="S2.SS2.p3.12.m2.2.3.2.1.cmml"></mo><mrow id="S2.SS2.p3.12.m2.2.3.2.3.2" 
xref="S2.SS2.p3.12.m2.2.3.2.3.1.cmml"><mo id="S2.SS2.p3.12.m2.2.3.2.3.2.1" stretchy="false" xref="S2.SS2.p3.12.m2.2.3.2.3.1.cmml">(</mo><mi id="S2.SS2.p3.12.m2.1.1" xref="S2.SS2.p3.12.m2.1.1.cmml">s</mi><mo id="S2.SS2.p3.12.m2.2.3.2.3.2.2" xref="S2.SS2.p3.12.m2.2.3.2.3.1.cmml">,</mo><mi id="S2.SS2.p3.12.m2.2.2" xref="S2.SS2.p3.12.m2.2.2.cmml">u</mi><mo id="S2.SS2.p3.12.m2.2.3.2.3.2.3" rspace="0.170em" stretchy="false" xref="S2.SS2.p3.12.m2.2.3.2.3.1.cmml">)</mo></mrow></mrow><mo id="S2.SS2.p3.12.m2.2.3.1" xref="S2.SS2.p3.12.m2.2.3.1.cmml">=</mo><mn id="S2.SS2.p3.12.m2.2.3.3" xref="S2.SS2.p3.12.m2.2.3.3.cmml"> 1</mn></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.12.m2.2b"><apply id="S2.SS2.p3.12.m2.2.3.cmml" xref="S2.SS2.p3.12.m2.2.3"><eq id="S2.SS2.p3.12.m2.2.3.1.cmml" xref="S2.SS2.p3.12.m2.2.3.1"></eq><apply id="S2.SS2.p3.12.m2.2.3.2.cmml" xref="S2.SS2.p3.12.m2.2.3.2"><times id="S2.SS2.p3.12.m2.2.3.2.1.cmml" xref="S2.SS2.p3.12.m2.2.3.2.1"></times><cn id="S2.SS2.p3.12.m2.2.3.2.2.cmml" type="integer" xref="S2.SS2.p3.12.m2.2.3.2.2">1</cn><interval closure="open" id="S2.SS2.p3.12.m2.2.3.2.3.1.cmml" xref="S2.SS2.p3.12.m2.2.3.2.3.2"><ci id="S2.SS2.p3.12.m2.1.1.cmml" xref="S2.SS2.p3.12.m2.1.1">𝑠</ci><ci id="S2.SS2.p3.12.m2.2.2.cmml" xref="S2.SS2.p3.12.m2.2.2">𝑢</ci></interval></apply><cn id="S2.SS2.p3.12.m2.2.3.3.cmml" type="integer" xref="S2.SS2.p3.12.m2.2.3.3">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.12.m2.2c">\mathbbm{1}(s,u)\,{=}\,1</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.12.m2.2d">blackboard_1 ( italic_s , italic_u ) = 1</annotation></semantics></math> if token <math alttext="l_{u}" class="ltx_Math" display="inline" id="S2.SS2.p3.13.m3.1"><semantics id="S2.SS2.p3.13.m3.1a"><msub id="S2.SS2.p3.13.m3.1.1" xref="S2.SS2.p3.13.m3.1.1.cmml"><mi id="S2.SS2.p3.13.m3.1.1.2" xref="S2.SS2.p3.13.m3.1.1.2.cmml">l</mi><mi id="S2.SS2.p3.13.m3.1.1.3" 
xref="S2.SS2.p3.13.m3.1.1.3.cmml">u</mi></msub><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.13.m3.1b"><apply id="S2.SS2.p3.13.m3.1.1.cmml" xref="S2.SS2.p3.13.m3.1.1"><csymbol cd="ambiguous" id="S2.SS2.p3.13.m3.1.1.1.cmml" xref="S2.SS2.p3.13.m3.1.1">subscript</csymbol><ci id="S2.SS2.p3.13.m3.1.1.2.cmml" xref="S2.SS2.p3.13.m3.1.1.2">𝑙</ci><ci id="S2.SS2.p3.13.m3.1.1.3.cmml" xref="S2.SS2.p3.13.m3.1.1.3">𝑢</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.13.m3.1c">l_{u}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.13.m3.1d">italic_l start_POSTSUBSCRIPT italic_u end_POSTSUBSCRIPT</annotation></semantics></math> belongs to speaker <math alttext="s" class="ltx_Math" display="inline" id="S2.SS2.p3.14.m4.1"><semantics id="S2.SS2.p3.14.m4.1a"><mi id="S2.SS2.p3.14.m4.1.1" xref="S2.SS2.p3.14.m4.1.1.cmml">s</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.14.m4.1b"><ci id="S2.SS2.p3.14.m4.1.1.cmml" xref="S2.SS2.p3.14.m4.1.1">𝑠</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.14.m4.1c">s</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.14.m4.1d">italic_s</annotation></semantics></math>, otherwise <math alttext="0" class="ltx_Math" display="inline" id="S2.SS2.p3.15.m5.1"><semantics id="S2.SS2.p3.15.m5.1a"><mn id="S2.SS2.p3.15.m5.1.1" xref="S2.SS2.p3.15.m5.1.1.cmml">0</mn><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.15.m5.1b"><cn id="S2.SS2.p3.15.m5.1.1.cmml" type="integer" xref="S2.SS2.p3.15.m5.1.1">0</cn></annotation-xml></semantics></math>. 
Note that if <math alttext="r_{sa}(s,t)" class="ltx_Math" display="inline" id="S2.SS2.p3.16.m6.2"><semantics id="S2.SS2.p3.16.m6.2a"><mrow id="S2.SS2.p3.16.m6.2.3" xref="S2.SS2.p3.16.m6.2.3.cmml"><msub id="S2.SS2.p3.16.m6.2.3.2" xref="S2.SS2.p3.16.m6.2.3.2.cmml"><mi id="S2.SS2.p3.16.m6.2.3.2.2" xref="S2.SS2.p3.16.m6.2.3.2.2.cmml">r</mi><mrow id="S2.SS2.p3.16.m6.2.3.2.3" xref="S2.SS2.p3.16.m6.2.3.2.3.cmml"><mi id="S2.SS2.p3.16.m6.2.3.2.3.2" xref="S2.SS2.p3.16.m6.2.3.2.3.2.cmml">s</mi><mo id="S2.SS2.p3.16.m6.2.3.2.3.1" xref="S2.SS2.p3.16.m6.2.3.2.3.1.cmml"></mo><mi id="S2.SS2.p3.16.m6.2.3.2.3.3" xref="S2.SS2.p3.16.m6.2.3.2.3.3.cmml">a</mi></mrow></msub><mo id="S2.SS2.p3.16.m6.2.3.1" xref="S2.SS2.p3.16.m6.2.3.1.cmml"></mo><mrow id="S2.SS2.p3.16.m6.2.3.3.2" xref="S2.SS2.p3.16.m6.2.3.3.1.cmml"><mo id="S2.SS2.p3.16.m6.2.3.3.2.1" stretchy="false" xref="S2.SS2.p3.16.m6.2.3.3.1.cmml">(</mo><mi id="S2.SS2.p3.16.m6.1.1" xref="S2.SS2.p3.16.m6.1.1.cmml">s</mi><mo id="S2.SS2.p3.16.m6.2.3.3.2.2" xref="S2.SS2.p3.16.m6.2.3.3.1.cmml">,</mo><mi id="S2.SS2.p3.16.m6.2.2" xref="S2.SS2.p3.16.m6.2.2.cmml">t</mi><mo id="S2.SS2.p3.16.m6.2.3.3.2.3" stretchy="false" xref="S2.SS2.p3.16.m6.2.3.3.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.16.m6.2b"><apply id="S2.SS2.p3.16.m6.2.3.cmml" xref="S2.SS2.p3.16.m6.2.3"><times id="S2.SS2.p3.16.m6.2.3.1.cmml" xref="S2.SS2.p3.16.m6.2.3.1"></times><apply id="S2.SS2.p3.16.m6.2.3.2.cmml" xref="S2.SS2.p3.16.m6.2.3.2"><csymbol cd="ambiguous" id="S2.SS2.p3.16.m6.2.3.2.1.cmml" xref="S2.SS2.p3.16.m6.2.3.2">subscript</csymbol><ci id="S2.SS2.p3.16.m6.2.3.2.2.cmml" xref="S2.SS2.p3.16.m6.2.3.2.2">𝑟</ci><apply id="S2.SS2.p3.16.m6.2.3.2.3.cmml" xref="S2.SS2.p3.16.m6.2.3.2.3"><times id="S2.SS2.p3.16.m6.2.3.2.3.1.cmml" xref="S2.SS2.p3.16.m6.2.3.2.3.1"></times><ci id="S2.SS2.p3.16.m6.2.3.2.3.2.cmml" xref="S2.SS2.p3.16.m6.2.3.2.3.2">𝑠</ci><ci id="S2.SS2.p3.16.m6.2.3.2.3.3.cmml" 
xref="S2.SS2.p3.16.m6.2.3.2.3.3">𝑎</ci></apply></apply><interval closure="open" id="S2.SS2.p3.16.m6.2.3.3.1.cmml" xref="S2.SS2.p3.16.m6.2.3.3.2"><ci id="S2.SS2.p3.16.m6.1.1.cmml" xref="S2.SS2.p3.16.m6.1.1">𝑠</ci><ci id="S2.SS2.p3.16.m6.2.2.cmml" xref="S2.SS2.p3.16.m6.2.2">𝑡</ci></interval></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.16.m6.2c">r_{sa}(s,t)</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p3.16.m6.2d">italic_r start_POSTSUBSCRIPT italic_s italic_a end_POSTSUBSCRIPT ( italic_s , italic_t )</annotation></semantics></math> provides uniform risks across all paths, <math alttext="\mathcal{J}_{sa}" class="ltx_Math" display="inline" id="S2.SS2.p3.17.m7.1"><semantics id="S2.SS2.p3.17.m7.1a"><msub id="S2.SS2.p3.17.m7.1.1" xref="S2.SS2.p3.17.m7.1.1.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.SS2.p3.17.m7.1.1.2" xref="S2.SS2.p3.17.m7.1.1.2.cmml">𝒥</mi><mrow id="S2.SS2.p3.17.m7.1.1.3" xref="S2.SS2.p3.17.m7.1.1.3.cmml"><mi id="S2.SS2.p3.17.m7.1.1.3.2" xref="S2.SS2.p3.17.m7.1.1.3.2.cmml">s</mi><mo id="S2.SS2.p3.17.m7.1.1.3.1" xref="S2.SS2.p3.17.m7.1.1.3.1.cmml"></mo><mi id="S2.SS2.p3.17.m7.1.1.3.3" xref="S2.SS2.p3.17.m7.1.1.3.3.cmml">a</mi></mrow></msub><annotation-xml encoding="MathML-Content" id="S2.SS2.p3.17.m7.1b"><apply id="S2.SS2.p3.17.m7.1.1.cmml" xref="S2.SS2.p3.17.m7.1.1"><csymbol cd="ambiguous" id="S2.SS2.p3.17.m7.1.1.1.cmml" xref="S2.SS2.p3.17.m7.1.1">subscript</csymbol><ci id="S2.SS2.p3.17.m7.1.1.2.cmml" xref="S2.SS2.p3.17.m7.1.1.2">𝒥</ci><apply id="S2.SS2.p3.17.m7.1.1.3.cmml" xref="S2.SS2.p3.17.m7.1.1.3"><times id="S2.SS2.p3.17.m7.1.1.3.1.cmml" xref="S2.SS2.p3.17.m7.1.1.3.1"></times><ci id="S2.SS2.p3.17.m7.1.1.3.2.cmml" xref="S2.SS2.p3.17.m7.1.1.3.2">𝑠</ci><ci id="S2.SS2.p3.17.m7.1.1.3.3.cmml" xref="S2.SS2.p3.17.m7.1.1.3.3">𝑎</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p3.17.m7.1c">\mathcal{J}_{sa}</annotation><annotation 
encoding="application/x-llamapun" id="S2.SS2.p3.17.m7.1d">caligraphic_J start_POSTSUBSCRIPT italic_s italic_a end_POSTSUBSCRIPT</annotation></semantics></math> degenerates to vanilla CTC, as it treats all paths equally. From this perspective, this training objective can be understood as adding a path penalty on top of the vanilla CTC objective, where the penalty corresponds to the risk function.</p> </div> <figure class="ltx_figure" id="S2.F1"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="329" id="S2.F1.1.g1" src="x1.png" width="747"/> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_figure">Figure 1: </span> A simplified illustration of the proposed speaker-aware risk function on the CTC lattice. Red areas indicate high risk and green areas low risk. Token 1 and tokens 2, 3, 4 are from different speakers. Two encouraged alignments are shown as examples. </figcaption> </figure> <div class="ltx_para" id="S2.SS2.p4"> <p class="ltx_p" id="S2.SS2.p4.1">Fig. <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#S2.F1" title="Figure 1 ‣ II-B Speaker-aware CTC based on minimizing Bayes risk ‣ II Methods ‣ Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC"><span class="ltx_text ltx_ref_tag">1</span></a> presents a simplified illustration of how the proposed training objective works. 
It requires the front-end encoder to disentangle separate speakers onto specific frames.</p> </div> </section> </section> <section class="ltx_section" id="S3"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">III </span><span class="ltx_text ltx_font_smallcaps" id="S3.1.1">Experimental setup</span> </h2> <div class="ltx_para ltx_noindent" id="S3.p1"> <p class="ltx_p" id="S3.p1.1"><span class="ltx_text ltx_font_bold" id="S3.p1.1.1">Dataset</span> Our experiments employed LibriSpeechMix (LSM) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib1" title="">1</a>]</cite> as a benchmark dataset. This dataset is derived from the LibriSpeech (LS) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib34" title="">34</a>]</cite> corpus and simulates both 2-speaker (LSM-2mix) and 3-speaker (LSM-3mix) mixed speech. As LSM only provides development and test sets, we generated 2-speaker mixed speech for training using a protocol similar to that in <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib1" title="">1</a>, <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib22" title="">22</a>]</cite>. Specifically, for each utterance in the LS 960-hour training set, we randomly sampled another utterance and mixed the two with a random offset. We expect a practical MTASR model to handle single- and multi-talker scenarios simultaneously. Thus, the generated mixed data was combined with the single-talker LS training set, resulting in a training set of around 560k utterances with 1.7k hours of speech. To probe model performance on varying degrees of overlapped speech, we further divided the LSM test set into three subsets, representing low, medium, and high overlap conditions. The corresponding overlap ratios are (0, 0.2], (0.2, 0.5], and (0.5, 1.0] respectively. 
The overlap ratio is defined as the duration of the overlapped region divided by the total duration of the mixed speech. In addition, we concatenate the transcriptions of the separate speakers into a single text label, using the first-in-first-out serialization strategy.</p> </div> <div class="ltx_para ltx_noindent" id="S3.p2"> <p class="ltx_p" id="S3.p2.1"><span class="ltx_text ltx_font_bold" id="S3.p2.1.1">Model settings</span> We implemented CTC and AED ASR models with the ESPnet2 toolkit <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib35" title="">35</a>]</cite>. For the CTC model, we use a conformer encoder with 12 conformer blocks. Each block has 4-head self-attention with 256 hidden units and two 1024-dimensional feed-forward layers (macaron style). The AED model has an additional transformer decoder comprising 8 transformer blocks, each also with 4-head self-attention and 256 hidden units, but with a 2048-dimensional feed-forward layer. As a result, the CTC model has 22.14M parameters and the AED model has 34.18M parameters.</p> </div> <div class="ltx_para ltx_noindent" id="S3.p3"> <p class="ltx_p" id="S3.p3.1"><span class="ltx_text ltx_font_bold" id="S3.p3.1.1">Training settings</span> The CTC models were trained with either the vanilla CTC or the proposed speaker-aware CTC (SACTC) objective. The AED models were trained with either the AED loss alone (w/o CTC) or the joint CTC/attention loss, with the CTC weight set to 0.3. During training, the Adam optimizer was used with a learning rate of 5e-4, 25k warm-up steps, and 35M batch bins. Our preliminary study showed that CTC converges more slowly than AED in MTASR, so we trained the CTC models for 80 epochs and the AED models for 50 epochs. 
After training, the 10 best epochs on the dev set were fused to form the final models.</p> </div> <div class="ltx_para ltx_noindent" id="S3.p4"> <p class="ltx_p" id="S3.p4.1"><span class="ltx_text ltx_font_bold" id="S3.p4.1.1">Metrics</span> For the single-talker condition, we used the standard word error rate (WER) as the evaluation metric. For the multi-talker condition, we used permutation-invariant WER, a common metric for SOT approaches. This metric evaluates all possible permutations of the speaker order and picks the lowest WER. Additionally, we implemented overlap-aware WER (OA-WER) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#bib.bib22" title="">22</a>]</cite>. OA-WER averages WERs across various overlap ratios, aiming to balance the impact of different degrees of overlapped speech.</p> </div> <figure class="ltx_table" id="S3.T1"> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table">TABLE I: </span> WER (%) of vanilla CTC and SACTC in MTASR. C1 and E1 are the main experiments. “dec.” stands for decoding. 
</figcaption> <div class="ltx_inline-block ltx_align_center ltx_transformed_outer" id="S3.T1.4" style="width:296.5pt;height:203.5pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(-26.2pt,18.0pt) scale(0.85,0.85) ;"> <table class="ltx_tabular ltx_align_middle" id="S3.T1.4.4"> <tr class="ltx_tr" id="S3.T1.4.4.5"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_tt" id="S3.T1.4.4.5.1" rowspan="2" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold" id="S3.T1.4.4.5.1.1">ID</span></td> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_tt" id="S3.T1.4.4.5.2" style="padding:1.5pt 3.0pt;"> <span class="ltx_text" id="S3.T1.4.4.5.2.1"></span> <span class="ltx_text" id="S3.T1.4.4.5.2.2"> <span class="ltx_tabular ltx_align_middle" id="S3.T1.4.4.5.2.2.1"> <span class="ltx_tr" id="S3.T1.4.4.5.2.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center ltx_rowspan ltx_rowspan_3" id="S3.T1.4.4.5.2.2.1.1.1" style="padding:1.5pt 3.0pt;"><span class="ltx_text" id="S3.T1.4.4.5.2.2.1.1.1.1"> <span class="ltx_text ltx_font_bold" id="S3.T1.4.4.5.2.2.1.1.1.1.1">System</span></span></span></span> </span></span><span class="ltx_text" id="S3.T1.4.4.5.2.3"></span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" colspan="2" id="S3.T1.4.4.5.3" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold" id="S3.T1.4.4.5.3.1">Librispeech</span></td> <td class="ltx_td ltx_align_center ltx_border_tt" colspan="6" id="S3.T1.4.4.5.4" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold" id="S3.T1.4.4.5.4.1">LibrispeechMix-2mix</span></td> </tr> <tr class="ltx_tr" id="S3.T1.4.4.6"> <td class="ltx_td ltx_border_r" id="S3.T1.4.4.6.1" style="padding:1.5pt 3.0pt;"></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.6.2" rowspan="2" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold" id="S3.T1.4.4.6.2.1">dev</span></td> <td class="ltx_td ltx_align_center ltx_border_r 
ltx_border_t" id="S3.T1.4.4.6.3" rowspan="2" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold" id="S3.T1.4.4.6.3.1">test</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.6.4" rowspan="2" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold" id="S3.T1.4.4.6.4.1">Dev</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S3.T1.4.4.6.5" rowspan="2" style="padding:1.5pt 3.0pt;"><span class="ltx_text" id="S3.T1.4.4.6.5.1"> <span class="ltx_inline-block" id="S3.T1.4.4.6.5.1.1"> <span class="ltx_p" id="S3.T1.4.4.6.5.1.1.1"><span class="ltx_text ltx_font_bold" id="S3.T1.4.4.6.5.1.1.1.1">Test</span></span> <span class="ltx_p" id="S3.T1.4.4.6.5.1.1.2">(Overall)</span> </span></span></td> <td class="ltx_td ltx_align_center ltx_border_t" colspan="4" id="S3.T1.4.4.6.6" style="padding:1.5pt 3.0pt;"> <span class="ltx_text ltx_font_bold" id="S3.T1.4.4.6.6.1">Test</span> (Conditional)</td> </tr> <tr class="ltx_tr" id="S3.T1.4.4.7"> <td class="ltx_td ltx_border_r" id="S3.T1.4.4.7.1" style="padding:1.5pt 3.0pt;"></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.7.2" style="padding:1.5pt 3.0pt;">low</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.7.3" style="padding:1.5pt 3.0pt;">mid</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.7.4" style="padding:1.5pt 3.0pt;">high</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.7.5" style="padding:1.5pt 3.0pt;">OA-WER</td> </tr> <tr class="ltx_tr" id="S3.T1.4.4.8"> <td class="ltx_td ltx_nopad_l ltx_nopad_r ltx_align_left" colspan="10" id="S3.T1.4.4.8.1" style="padding:1.5pt 3.0pt;"><span class="ltx_rule" style="width:100%;height:0.9pt;background:black;display:inline-block;"> </span></td> </tr> <tr class="ltx_tr" id="S3.T1.4.4.9"> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T1.4.4.9.1" style="padding:1.5pt 3.0pt;">A1</td> <td class="ltx_td ltx_align_left ltx_border_r" 
id="S3.T1.4.4.9.2" style="padding:1.5pt 3.0pt;">SOT</td> <td class="ltx_td ltx_align_center" id="S3.T1.4.4.9.3" style="padding:1.5pt 3.0pt;">4.1</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S3.T1.4.4.9.4" style="padding:1.5pt 3.0pt;">4.6</td> <td class="ltx_td ltx_align_center" id="S3.T1.4.4.9.5" style="padding:1.5pt 3.0pt;">7.9</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S3.T1.4.4.9.6" style="padding:1.5pt 3.0pt;">9.2</td> <td class="ltx_td ltx_align_center" id="S3.T1.4.4.9.7" style="padding:1.5pt 3.0pt;">9.0</td> <td class="ltx_td ltx_align_center" id="S3.T1.4.4.9.8" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S3.T1.4.4.9.8.1">8.0</span></td> <td class="ltx_td ltx_align_center" id="S3.T1.4.4.9.9" style="padding:1.5pt 3.0pt;">12.8</td> <td class="ltx_td ltx_align_center" id="S3.T1.4.4.9.10" style="padding:1.5pt 3.0pt;">9.9</td> </tr> <tr class="ltx_tr" id="S3.T1.4.4.10"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T1.4.4.10.1" style="padding:1.5pt 3.0pt;">B1</td> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T1.4.4.10.2" style="padding:1.5pt 3.0pt;">CTC</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.10.3" style="padding:1.5pt 3.0pt;">5.0</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S3.T1.4.4.10.4" style="padding:1.5pt 3.0pt;">5.4</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.10.5" style="padding:1.5pt 3.0pt;">11.7</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S3.T1.4.4.10.6" style="padding:1.5pt 3.0pt;">11.1</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.10.7" style="padding:1.5pt 3.0pt;">7.5</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.10.8" style="padding:1.5pt 3.0pt;">12.4</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.10.9" style="padding:1.5pt 3.0pt;">18.2</td> <td 
class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.10.10" style="padding:1.5pt 3.0pt;">12.7</td> </tr> <tr class="ltx_tr" id="S3.T1.4.4.11"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T1.4.4.11.1" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S3.T1.4.4.11.1.1">C1</span></td> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T1.4.4.11.2" style="padding:1.5pt 3.0pt;">SOT+CTC</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.11.3" style="padding:1.5pt 3.0pt;">4.3</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S3.T1.4.4.11.4" style="padding:1.5pt 3.0pt;">4.5</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.11.5" style="padding:1.5pt 3.0pt;">8.4</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S3.T1.4.4.11.6" style="padding:1.5pt 3.0pt;">8.8</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.11.7" style="padding:1.5pt 3.0pt;">7.1</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.11.8" style="padding:1.5pt 3.0pt;">9.0</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.11.9" style="padding:1.5pt 3.0pt;">13.1</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.11.10" style="padding:1.5pt 3.0pt;">9.7</td> </tr> <tr class="ltx_tr" id="S3.T1.1.1.1"> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T1.1.1.1.2" style="padding:1.5pt 3.0pt;">C2</td> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T1.1.1.1.1" style="padding:1.5pt 3.0pt;"> <math alttext="\hookrightarrow" class="ltx_Math" display="inline" id="S3.T1.1.1.1.1.m1.1"><semantics id="S3.T1.1.1.1.1.m1.1a"><mo id="S3.T1.1.1.1.1.m1.1.1" stretchy="false" xref="S3.T1.1.1.1.1.m1.1.1.cmml">↪</mo><annotation-xml encoding="MathML-Content" id="S3.T1.1.1.1.1.m1.1b"><ci id="S3.T1.1.1.1.1.m1.1.1.cmml" xref="S3.T1.1.1.1.1.m1.1.1">↪</ci></annotation-xml><annotation 
encoding="application/x-tex" id="S3.T1.1.1.1.1.m1.1c">\hookrightarrow</annotation><annotation encoding="application/x-llamapun" id="S3.T1.1.1.1.1.m1.1d">↪</annotation></semantics></math> AED only dec.</td> <td class="ltx_td ltx_align_center" id="S3.T1.1.1.1.3" style="padding:1.5pt 3.0pt;">4.8</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S3.T1.1.1.1.4" style="padding:1.5pt 3.0pt;">5.4</td> <td class="ltx_td ltx_align_center" id="S3.T1.1.1.1.5" style="padding:1.5pt 3.0pt;">11.5</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S3.T1.1.1.1.6" style="padding:1.5pt 3.0pt;">12.9</td> <td class="ltx_td ltx_align_center" id="S3.T1.1.1.1.7" style="padding:1.5pt 3.0pt;">11.7</td> <td class="ltx_td ltx_align_center" id="S3.T1.1.1.1.8" style="padding:1.5pt 3.0pt;">12.5</td> <td class="ltx_td ltx_align_center" id="S3.T1.1.1.1.9" style="padding:1.5pt 3.0pt;">17.3</td> <td class="ltx_td ltx_align_center" id="S3.T1.1.1.1.10" style="padding:1.5pt 3.0pt;">13.8</td> </tr> <tr class="ltx_tr" id="S3.T1.2.2.2"> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T1.2.2.2.2" style="padding:1.5pt 3.0pt;">C3</td> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T1.2.2.2.1" style="padding:1.5pt 3.0pt;"> <math alttext="\hookrightarrow" class="ltx_Math" display="inline" id="S3.T1.2.2.2.1.m1.1"><semantics id="S3.T1.2.2.2.1.m1.1a"><mo id="S3.T1.2.2.2.1.m1.1.1" stretchy="false" xref="S3.T1.2.2.2.1.m1.1.1.cmml">↪</mo><annotation-xml encoding="MathML-Content" id="S3.T1.2.2.2.1.m1.1b"><ci id="S3.T1.2.2.2.1.m1.1.1.cmml" xref="S3.T1.2.2.2.1.m1.1.1">↪</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.T1.2.2.2.1.m1.1c">\hookrightarrow</annotation><annotation encoding="application/x-llamapun" id="S3.T1.2.2.2.1.m1.1d">↪</annotation></semantics></math> CTC only dec.</td> <td class="ltx_td ltx_align_center" id="S3.T1.2.2.2.3" style="padding:1.5pt 3.0pt;">5.5</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S3.T1.2.2.2.4" style="padding:1.5pt 
3.0pt;">5.6</td> <td class="ltx_td ltx_align_center" id="S3.T1.2.2.2.5" style="padding:1.5pt 3.0pt;">12.7</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S3.T1.2.2.2.6" style="padding:1.5pt 3.0pt;">12.0</td> <td class="ltx_td ltx_align_center" id="S3.T1.2.2.2.7" style="padding:1.5pt 3.0pt;">7.8</td> <td class="ltx_td ltx_align_center" id="S3.T1.2.2.2.8" style="padding:1.5pt 3.0pt;">13.8</td> <td class="ltx_td ltx_align_center" id="S3.T1.2.2.2.9" style="padding:1.5pt 3.0pt;">19.7</td> <td class="ltx_td ltx_align_center" id="S3.T1.2.2.2.10" style="padding:1.5pt 3.0pt;">13.8</td> </tr> <tr class="ltx_tr" id="S3.T1.4.4.12"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T1.4.4.12.1" style="padding:1.5pt 3.0pt;">D1</td> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T1.4.4.12.2" style="padding:1.5pt 3.0pt;">SACTC</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.12.3" style="padding:1.5pt 3.0pt;">5.4</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S3.T1.4.4.12.4" style="padding:1.5pt 3.0pt;">5.6</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.12.5" style="padding:1.5pt 3.0pt;">13.5</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S3.T1.4.4.12.6" style="padding:1.5pt 3.0pt;">12.3</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.12.7" style="padding:1.5pt 3.0pt;">8.0</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.12.8" style="padding:1.5pt 3.0pt;">13.9</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.12.9" style="padding:1.5pt 3.0pt;">20.8</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.12.10" style="padding:1.5pt 3.0pt;">14.2</td> </tr> <tr class="ltx_tr" id="S3.T1.4.4.13"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T1.4.4.13.1" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" 
id="S3.T1.4.4.13.1.1">E1</span></td> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T1.4.4.13.2" style="padding:1.5pt 3.0pt;">SOT+SACTC</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.13.3" style="padding:1.5pt 3.0pt;">3.9</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S3.T1.4.4.13.4" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S3.T1.4.4.13.4.1">4.1</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.13.5" style="padding:1.5pt 3.0pt;">8.2</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S3.T1.4.4.13.6" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S3.T1.4.4.13.6.1">8.0</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.13.7" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S3.T1.4.4.13.7.1">6.0</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.13.8" style="padding:1.5pt 3.0pt;">8.4</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.13.9" style="padding:1.5pt 3.0pt;">12.8</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.4.13.10" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S3.T1.4.4.13.10.1">9.1</span></td> </tr> <tr class="ltx_tr" id="S3.T1.3.3.3"> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T1.3.3.3.2" style="padding:1.5pt 3.0pt;">E2</td> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T1.3.3.3.1" style="padding:1.5pt 3.0pt;"> <math alttext="\hookrightarrow" class="ltx_Math" display="inline" id="S3.T1.3.3.3.1.m1.1"><semantics id="S3.T1.3.3.3.1.m1.1a"><mo id="S3.T1.3.3.3.1.m1.1.1" stretchy="false" xref="S3.T1.3.3.3.1.m1.1.1.cmml">↪</mo><annotation-xml encoding="MathML-Content" id="S3.T1.3.3.3.1.m1.1b"><ci id="S3.T1.3.3.3.1.m1.1.1.cmml" 
xref="S3.T1.3.3.3.1.m1.1.1">↪</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.T1.3.3.3.1.m1.1c">\hookrightarrow</annotation><annotation encoding="application/x-llamapun" id="S3.T1.3.3.3.1.m1.1d">↪</annotation></semantics></math> AED only dec.</td> <td class="ltx_td ltx_align_center" id="S3.T1.3.3.3.3" style="padding:1.5pt 3.0pt;">4.0</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S3.T1.3.3.3.4" style="padding:1.5pt 3.0pt;">4.5</td> <td class="ltx_td ltx_align_center" id="S3.T1.3.3.3.5" style="padding:1.5pt 3.0pt;">8.4</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S3.T1.3.3.3.6" style="padding:1.5pt 3.0pt;">8.8</td> <td class="ltx_td ltx_align_center" id="S3.T1.3.3.3.7" style="padding:1.5pt 3.0pt;">8.2</td> <td class="ltx_td ltx_align_center" id="S3.T1.3.3.3.8" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S3.T1.3.3.3.8.1">8.0</span></td> <td class="ltx_td ltx_align_center" id="S3.T1.3.3.3.9" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S3.T1.3.3.3.9.1">12.3</span></td> <td class="ltx_td ltx_align_center" id="S3.T1.3.3.3.10" style="padding:1.5pt 3.0pt;">9.5</td> </tr> <tr class="ltx_tr" id="S3.T1.4.4.4"> <td class="ltx_td ltx_align_left ltx_border_bb ltx_border_r" id="S3.T1.4.4.4.2" style="padding:1.5pt 3.0pt;">E3</td> <td class="ltx_td ltx_align_left ltx_border_bb ltx_border_r" id="S3.T1.4.4.4.1" style="padding:1.5pt 3.0pt;"> <math alttext="\hookrightarrow" class="ltx_Math" display="inline" id="S3.T1.4.4.4.1.m1.1"><semantics id="S3.T1.4.4.4.1.m1.1a"><mo id="S3.T1.4.4.4.1.m1.1.1" stretchy="false" xref="S3.T1.4.4.4.1.m1.1.1.cmml">↪</mo><annotation-xml encoding="MathML-Content" id="S3.T1.4.4.4.1.m1.1b"><ci id="S3.T1.4.4.4.1.m1.1.1.cmml" xref="S3.T1.4.4.4.1.m1.1.1">↪</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.T1.4.4.4.1.m1.1c">\hookrightarrow</annotation><annotation 
encoding="application/x-llamapun" id="S3.T1.4.4.4.1.m1.1d">↪</annotation></semantics></math> CTC only dec.</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S3.T1.4.4.4.3" style="padding:1.5pt 3.0pt;">5.5</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S3.T1.4.4.4.4" style="padding:1.5pt 3.0pt;">5.8</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S3.T1.4.4.4.5" style="padding:1.5pt 3.0pt;">12.5</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S3.T1.4.4.4.6" style="padding:1.5pt 3.0pt;">11.9</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S3.T1.4.4.4.7" style="padding:1.5pt 3.0pt;">8.1</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S3.T1.4.4.4.8" style="padding:1.5pt 3.0pt;">13.1</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S3.T1.4.4.4.9" style="padding:1.5pt 3.0pt;">19.9</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S3.T1.4.4.4.10" style="padding:1.5pt 3.0pt;">13.7</td> </tr> </table> </span></div> </figure> </section> <section class="ltx_section" id="S4"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">IV </span><span class="ltx_text ltx_font_smallcaps" id="S4.1.1">Results and discussions</span> </h2> <div class="ltx_para" id="S4.p1"> <p class="ltx_p" id="S4.p1.1">In this section, we first analyze the effect of vanilla CTC in SOT-based MTASR. We then present and discuss the experimental results of the proposed SACTC approach, comparing it to vanilla CTC.</p> </div> <section class="ltx_subsection" id="S4.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S4.SS1.5.1.1">IV-A</span> </span><span class="ltx_text ltx_font_italic" id="S4.SS1.6.2">Analysis of vanilla CTC</span> </h3> <div class="ltx_para" id="S4.SS1.p1"> <p class="ltx_p" id="S4.SS1.p1.1">Previous research has demonstrated that integrating CTC with SOT improves MTASR performance. 
We reproduced these experiments, with results presented in Table <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#S3.T1" title="TABLE I ‣ III Experimental setup ‣ Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC"><span class="ltx_text ltx_ref_tag">I</span></a>, systems A to C. Comparing CTC with SOT, we observe that while CTC generally performed worse, it achieved better WER on the low-overlap subset (7.5 vs. 9.0). Moreover, incorporating CTC with SOT (C1) did not enhance single-talker performance but improved multi-talker recognition, particularly on the low-overlap subset (9.0→7.1). However, <span class="ltx_text ltx_font_italic" id="S4.SS1.p1.1.1">the addition of CTC led to decreased performance on mid- and high-overlap speech.</span> These results indicate that CTC helps recognize low-overlap speech but degrades performance under more severe overlap.</p> </div> <figure class="ltx_figure" id="S4.F2"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="581" id="S4.F2.1.g1" src="x2.png" width="830"/> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_figure">Figure 2: </span> Visualization of top-50 attended frames for two speakers (red and blue colors). Purple colors represent two speakers attending simultaneously. </figcaption> </figure> <figure class="ltx_figure" id="S4.F3"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="500" id="S4.F3.1.g1" src="x3.png" width="747"/> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_figure">Figure 3: </span> Attention matrices in the last conformer block of the SOT (a) and SOT+CTC (b) models. In (b), the overlapped area was encoded into separate output frames. 
</figcaption> </figure> <div class="ltx_para" id="S4.SS1.p2"> <p class="ltx_p" id="S4.SS1.p2.1">To better understand the interaction between CTC and multi-talker speech, we examined the attention patterns in the conformer encoder for different speakers. Specifically, we visualized the top 50 attended frames in self-attention for each CTC-emitted token, then accumulated the attention for tokens from the two speakers in distinct colors. Fig. <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#S4.F2" title="Figure 2 ‣ IV-A Analysis of vanilla CTC ‣ IV Results and discussions ‣ Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC"><span class="ltx_text ltx_ref_tag">2</span></a> illustrates an interesting pattern: tokens from the two speakers generally attended to all frames in the shallower blocks, whereas from layer 10 onwards they began to focus on distinct regions. This aligns with our derivation in Section <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#S2.SS2" title="II-B Speaker-aware CTC based on minimizing Bayes risk ‣ II Methods ‣ Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC"><span class="ltx_text ltx_ref_tag"><span class="ltx_text">II-B</span></span></a>. Notably, we did not observe this phenomenon in the sole SOT system. Fig. <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#S4.F3" title="Figure 3 ‣ IV-A Analysis of vanilla CTC ‣ IV Results and discussions ‣ Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC"><span class="ltx_text ltx_ref_tag">3</span></a> further visualizes the attention matrices in the last conformer block. Compared to the sole SOT system, the use of CTC leads to information re-ordering: certain portions of the input embedding are attended by distinct regions of the output embedding (illustrated in Fig. 
<a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#S4.F3" title="Figure 3 ‣ IV-A Analysis of vanilla CTC ‣ IV Results and discussions ‣ Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC"><span class="ltx_text ltx_ref_tag">3</span></a>(b) Head 4). Moreover, these repeatedly attended regions show a direct correlation to the overlapped area in the input speech. These visualizations suggest that with CTC guidance, the self-attention modules disentangle different speakers along the time dimension to align with concatenated labels. Together with WER results, we hypothesize that this capacity is limited for handling severely overlapped speech.</p> </div> </section> <section class="ltx_subsection" id="S4.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S4.SS2.5.1.1">IV-B</span> </span><span class="ltx_text ltx_font_italic" id="S4.SS2.6.2">Performance of SACTC</span> </h3> <div class="ltx_para" id="S4.SS2.p1"> <p class="ltx_p" id="S4.SS2.p1.1">We evaluated the proposed SACTC approach using a default risk factor parameter of 15. Initially, we trained a model with the SACTC objective alone. As shown in Table <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#S3.T1" title="TABLE I ‣ III Experimental setup ‣ Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC"><span class="ltx_text ltx_ref_tag">I</span></a>, SACTC by itself did not outperform the vanilla CTC model (B1 vs. D1). However, when combined with SOT (E1), the model showed significant improvements over its vanilla CTC counterpart (C1): overall LSM-2mix WER improved from 8.8 to 8.0, and mid-overlap WER from 9.0 to 8.4. 
This outcome is understandable, as SACTC is designed to enhance MTASR embedding with deterministic speaker disentangling, thus not necessarily improving token-level recognition<span class="ltx_note ltx_role_footnote" id="footnote1"><sup class="ltx_note_mark">1</sup><span class="ltx_note_outer"><span class="ltx_note_content"><sup class="ltx_note_mark">1</sup><span class="ltx_tag ltx_tag_note">1</span> For encoder-only models, there might exist potential trade-offs between these two aspects. </span></span></span>. When integrated with SOT, SACTC enhanced low-overlap recognition similar to vanilla CTC, while mitigating performance degradation on more severe overlaps. Experiment E2 also supports this interpretation. For the model trained with SOT+SACTC, AED-only decoding led to performance gains compared to SOT+CTC, particularly improving recognition in high-overlap conditions (13.1→12.3). This suggests that SACTC resulted in embeddings with enhanced speaker discriminability. Based on these findings, we propose that AED-only decoding should be preferred for heavily overlapped scenarios.</p> </div> <figure class="ltx_table" id="S4.T2"> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table">TABLE II: </span> WER (%) of SOT+SACTC with different risk factors, where 15 is the default setting. 
</figcaption> <div class="ltx_inline-block ltx_align_center ltx_transformed_outer" id="S4.T2.1" style="width:266.5pt;height:137.7pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(-23.5pt,12.1pt) scale(0.85,0.85) ;"> <table class="ltx_tabular ltx_align_middle" id="S4.T2.1.1"> <tr class="ltx_tr" id="S4.T2.1.1.1"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_tt" id="S4.T2.1.1.1.1" rowspan="3" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold" id="S4.T2.1.1.1.1.1">ID</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" id="S4.T2.1.1.1.2" rowspan="3" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold" id="S4.T2.1.1.1.2.1">Risk factor</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" colspan="2" id="S4.T2.1.1.1.3" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold" id="S4.T2.1.1.1.3.1">Librispeech</span></td> <td class="ltx_td ltx_align_center ltx_border_tt" colspan="6" id="S4.T2.1.1.1.4" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold" id="S4.T2.1.1.1.4.1">LibrispeechMix-2mix</span></td> </tr> <tr class="ltx_tr" id="S4.T2.1.1.2"> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.1.1.2.1" rowspan="2" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold" id="S4.T2.1.1.2.1.1">dev</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T2.1.1.2.2" rowspan="2" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold" id="S4.T2.1.1.2.2.1">test</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.1.1.2.3" rowspan="2" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold" id="S4.T2.1.1.2.3.1">Dev</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T2.1.1.2.4" rowspan="2" style="padding:1.5pt 3.0pt;"><span class="ltx_text" id="S4.T2.1.1.2.4.1"> <span class="ltx_inline-block" id="S4.T2.1.1.2.4.1.1"> <span 
class="ltx_p" id="S4.T2.1.1.2.4.1.1.1"><span class="ltx_text ltx_font_bold" id="S4.T2.1.1.2.4.1.1.1.1">Test</span></span> <span class="ltx_p" id="S4.T2.1.1.2.4.1.1.2">(Overall)</span> </span></span></td> <td class="ltx_td ltx_align_center ltx_border_t" colspan="4" id="S4.T2.1.1.2.5" style="padding:1.5pt 3.0pt;"> <span class="ltx_text ltx_font_bold" id="S4.T2.1.1.2.5.1">Test</span> (Conditional)</td> </tr> <tr class="ltx_tr" id="S4.T2.1.1.3"> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.1.1.3.1" style="padding:1.5pt 3.0pt;">low</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.1.1.3.2" style="padding:1.5pt 3.0pt;">mid</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.1.1.3.3" style="padding:1.5pt 3.0pt;">high</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.1.1.3.4" style="padding:1.5pt 3.0pt;">OA-WER</td> </tr> <tr class="ltx_tr" id="S4.T2.1.1.4"> <td class="ltx_td ltx_nopad_l ltx_nopad_r ltx_align_left" colspan="10" id="S4.T2.1.1.4.1" style="padding:1.5pt 3.0pt;"><span class="ltx_rule" style="width:100%;height:0.9pt;background:black;display:inline-block;"> </span></td> </tr> <tr class="ltx_tr" id="S4.T2.1.1.5"> <td class="ltx_td ltx_align_left ltx_border_r" id="S4.T2.1.1.5.1" style="padding:1.5pt 3.0pt;">C1</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.1.1.5.2" style="padding:1.5pt 3.0pt;">SOT+CTC</td> <td class="ltx_td ltx_align_center" id="S4.T2.1.1.5.3" style="padding:1.5pt 3.0pt;">4.3</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.1.1.5.4" style="padding:1.5pt 3.0pt;">4.5</td> <td class="ltx_td ltx_align_center" id="S4.T2.1.1.5.5" style="padding:1.5pt 3.0pt;">8.4</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.1.1.5.6" style="padding:1.5pt 3.0pt;">8.8</td> <td class="ltx_td ltx_align_center" id="S4.T2.1.1.5.7" style="padding:1.5pt 3.0pt;">7.1</td> <td class="ltx_td ltx_align_center" id="S4.T2.1.1.5.8" style="padding:1.5pt 3.0pt;">9.0</td> <td 
class="ltx_td ltx_align_center" id="S4.T2.1.1.5.9" style="padding:1.5pt 3.0pt;">13.1</td> <td class="ltx_td ltx_align_center" id="S4.T2.1.1.5.10" style="padding:1.5pt 3.0pt;">9.7</td> </tr> <tr class="ltx_tr" id="S4.T2.1.1.6"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S4.T2.1.1.6.1" style="padding:1.5pt 3.0pt;">F1</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T2.1.1.6.2" style="padding:1.5pt 3.0pt;">5</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.1.1.6.3" style="padding:1.5pt 3.0pt;">4.3</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T2.1.1.6.4" style="padding:1.5pt 3.0pt;">4.6</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.1.1.6.5" style="padding:1.5pt 3.0pt;">8.4</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T2.1.1.6.6" style="padding:1.5pt 3.0pt;">8.8</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.1.1.6.7" style="padding:1.5pt 3.0pt;">7.3</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.1.1.6.8" style="padding:1.5pt 3.0pt;">8.9</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.1.1.6.9" style="padding:1.5pt 3.0pt;">12.7</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.1.1.6.10" style="padding:1.5pt 3.0pt;">9.6</td> </tr> <tr class="ltx_tr" id="S4.T2.1.1.7"> <td class="ltx_td ltx_align_left ltx_border_r" id="S4.T2.1.1.7.1" style="padding:1.5pt 3.0pt;">F2</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.1.1.7.2" style="padding:1.5pt 3.0pt;">10</td> <td class="ltx_td ltx_align_center" id="S4.T2.1.1.7.3" style="padding:1.5pt 3.0pt;">4.0</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.1.1.7.4" style="padding:1.5pt 3.0pt;">4.4</td> <td class="ltx_td ltx_align_center" id="S4.T2.1.1.7.5" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S4.T2.1.1.7.5.1">8.1</span></td> <td class="ltx_td 
ltx_align_center ltx_border_r" id="S4.T2.1.1.7.6" style="padding:1.5pt 3.0pt;">8.3</td> <td class="ltx_td ltx_align_center" id="S4.T2.1.1.7.7" style="padding:1.5pt 3.0pt;">6.5</td> <td class="ltx_td ltx_align_center" id="S4.T2.1.1.7.8" style="padding:1.5pt 3.0pt;">8.6</td> <td class="ltx_td ltx_align_center" id="S4.T2.1.1.7.9" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S4.T2.1.1.7.9.1">12.4</span></td> <td class="ltx_td ltx_align_center" id="S4.T2.1.1.7.10" style="padding:1.5pt 3.0pt;">9.2</td> </tr> <tr class="ltx_tr" id="S4.T2.1.1.8"> <td class="ltx_td ltx_align_left ltx_border_r" id="S4.T2.1.1.8.1" style="padding:1.5pt 3.0pt;">F3</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.1.1.8.2" style="padding:1.5pt 3.0pt;">15</td> <td class="ltx_td ltx_align_center" id="S4.T2.1.1.8.3" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S4.T2.1.1.8.3.1">3.9</span></td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.1.1.8.4" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S4.T2.1.1.8.4.1">4.1</span></td> <td class="ltx_td ltx_align_center" id="S4.T2.1.1.8.5" style="padding:1.5pt 3.0pt;">8.2</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.1.1.8.6" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S4.T2.1.1.8.6.1">8.0</span></td> <td class="ltx_td ltx_align_center" id="S4.T2.1.1.8.7" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S4.T2.1.1.8.7.1">6.0</span></td> <td class="ltx_td ltx_align_center" id="S4.T2.1.1.8.8" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S4.T2.1.1.8.8.1">8.4</span></td> <td class="ltx_td ltx_align_center" id="S4.T2.1.1.8.9" style="padding:1.5pt 3.0pt;">12.8</td> <td class="ltx_td 
ltx_align_center" id="S4.T2.1.1.8.10" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S4.T2.1.1.8.10.1">9.1</span></td> </tr> <tr class="ltx_tr" id="S4.T2.1.1.9"> <td class="ltx_td ltx_align_left ltx_border_bb ltx_border_r" id="S4.T2.1.1.9.1" style="padding:1.5pt 3.0pt;">F4</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T2.1.1.9.2" style="padding:1.5pt 3.0pt;">20</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T2.1.1.9.3" style="padding:1.5pt 3.0pt;">4.1</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T2.1.1.9.4" style="padding:1.5pt 3.0pt;">4.3</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T2.1.1.9.5" style="padding:1.5pt 3.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S4.T2.1.1.9.5.1">8.1</span></td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T2.1.1.9.6" style="padding:1.5pt 3.0pt;">8.3</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T2.1.1.9.7" style="padding:1.5pt 3.0pt;">6.4</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T2.1.1.9.8" style="padding:1.5pt 3.0pt;">8.6</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T2.1.1.9.9" style="padding:1.5pt 3.0pt;">12.7</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T2.1.1.9.10" style="padding:1.5pt 3.0pt;">9.2</td> </tr> </table> </span></div> </figure> <div class="ltx_para ltx_noindent" id="S4.SS2.p2"> <p class="ltx_p" id="S4.SS2.p2.1"><span class="ltx_text ltx_font_bold" id="S4.SS2.p2.1.1">Impact of hyperparameter</span> We also investigated the impact of different risk factors (RFs) in SOT-SACTC, as shown in Table <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#S4.T2" title="TABLE II ‣ IV-B Performance of SACTC ‣ IV Results and discussions ‣ Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC"><span class="ltx_text 
ltx_ref_tag">II</span></a>. All tested RFs matched or improved on the baseline C1 in overall LSM-2mix test WER. With a small RF of 5, the performance was close to the baseline (8.8 overall in both cases), while the best overall result (8.0) was achieved using RF=15.</p> </div> <div class="ltx_para ltx_noindent" id="S4.SS2.p3"> <p class="ltx_p" id="S4.SS2.p3.1"><span class="ltx_text ltx_font_bold" id="S4.SS2.p3.1.1">Generalization to more speakers</span> A key advantage of SOT-based models is their ability to generalize to a greater number of speakers than present in the training data. To test this, we evaluated our models on the LSM-3mix test set, even though all models were trained only on 1- and 2-speaker scenarios. As shown in Table <a class="ltx_ref" href="https://arxiv.org/html/2409.12388v2#S4.T3" title="TABLE III ‣ IV-B Performance of SACTC ‣ IV Results and discussions ‣ Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC"><span class="ltx_text ltx_ref_tag">III</span></a>, the trends observed were similar to those in the 2-speaker scenario. The SOT-CTC model significantly outperformed the SOT model on low-overlap speech (17.7% vs. 23.6%), but showed degraded performance on high-overlap speech (30.1% vs. 29.5%). In contrast, the SOT-SACTC model achieved the best performance across all conditions.</p> </div> <figure class="ltx_table" id="S4.T3"> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table">TABLE III: </span> WER (%) of MTASR models on the 3-speaker test set. 
</figcaption> <div class="ltx_inline-block ltx_align_center ltx_transformed_outer" id="S4.T3.1" style="width:242.7pt;height:91.8pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(-21.4pt,8.1pt) scale(0.85,0.85) ;"> <table class="ltx_tabular ltx_align_middle" id="S4.T3.1.1"> <tr class="ltx_tr" id="S4.T3.1.1.1"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_tt" id="S4.T3.1.1.1.1" rowspan="2" style="padding:1.5pt 4.0pt;"><span class="ltx_text ltx_font_bold" id="S4.T3.1.1.1.1.1">System</span></td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S4.T3.1.1.1.2" rowspan="2" style="padding:1.5pt 4.0pt;"><span class="ltx_text ltx_font_bold" id="S4.T3.1.1.1.2.1">Dev</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" id="S4.T3.1.1.1.3" rowspan="2" style="padding:1.5pt 4.0pt;"><span class="ltx_text" id="S4.T3.1.1.1.3.1"> <span class="ltx_inline-block" id="S4.T3.1.1.1.3.1.1"> <span class="ltx_p" id="S4.T3.1.1.1.3.1.1.1"><span class="ltx_text ltx_font_bold" id="S4.T3.1.1.1.3.1.1.1.1">Test</span></span> <span class="ltx_p" id="S4.T3.1.1.1.3.1.1.2">(Overall)</span> </span></span></td> <td class="ltx_td ltx_align_center ltx_border_tt" colspan="4" id="S4.T3.1.1.1.4" style="padding:1.5pt 4.0pt;"> <span class="ltx_text ltx_font_bold" id="S4.T3.1.1.1.4.1">Test</span> (Conditional)</td> </tr> <tr class="ltx_tr" id="S4.T3.1.1.2"> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T3.1.1.2.1" style="padding:1.5pt 4.0pt;">low</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T3.1.1.2.2" style="padding:1.5pt 4.0pt;">mid</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T3.1.1.2.3" style="padding:1.5pt 4.0pt;">high</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T3.1.1.2.4" style="padding:1.5pt 4.0pt;">OA-WER</td> </tr> <tr class="ltx_tr" id="S4.T3.1.1.3"> <td class="ltx_td ltx_nopad_l ltx_nopad_r ltx_align_left" colspan="7" id="S4.T3.1.1.3.1" style="padding:1.5pt 4.0pt;"><span 
class="ltx_rule" style="width:100%;height:0.9pt;background:black;display:inline-block;"> </span></td> </tr> <tr class="ltx_tr" id="S4.T3.1.1.4"> <td class="ltx_td ltx_align_left ltx_border_r" id="S4.T3.1.1.4.1" style="padding:1.5pt 4.0pt;">SOT</td> <td class="ltx_td ltx_align_center" id="S4.T3.1.1.4.2" style="padding:1.5pt 4.0pt;">22.9</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T3.1.1.4.3" style="padding:1.5pt 4.0pt;">25.3</td> <td class="ltx_td ltx_align_center" id="S4.T3.1.1.4.4" style="padding:1.5pt 4.0pt;">23.6</td> <td class="ltx_td ltx_align_center" id="S4.T3.1.1.4.5" style="padding:1.5pt 4.0pt;">24.3</td> <td class="ltx_td ltx_align_center" id="S4.T3.1.1.4.6" style="padding:1.5pt 4.0pt;">29.5</td> <td class="ltx_td ltx_align_center" id="S4.T3.1.1.4.7" style="padding:1.5pt 4.0pt;">25.8</td> </tr> <tr class="ltx_tr" id="S4.T3.1.1.5"> <td class="ltx_td ltx_align_left ltx_border_r" id="S4.T3.1.1.5.1" style="padding:1.5pt 4.0pt;">SOT-CTC</td> <td class="ltx_td ltx_align_center" id="S4.T3.1.1.5.2" style="padding:1.5pt 4.0pt;">23.5</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T3.1.1.5.3" style="padding:1.5pt 4.0pt;">23.6</td> <td class="ltx_td ltx_align_center" id="S4.T3.1.1.5.4" style="padding:1.5pt 4.0pt;">17.7</td> <td class="ltx_td ltx_align_center" id="S4.T3.1.1.5.5" style="padding:1.5pt 4.0pt;">23.3</td> <td class="ltx_td ltx_align_center" id="S4.T3.1.1.5.6" style="padding:1.5pt 4.0pt;">30.1</td> <td class="ltx_td ltx_align_center" id="S4.T3.1.1.5.7" style="padding:1.5pt 4.0pt;">23.7</td> </tr> <tr class="ltx_tr" id="S4.T3.1.1.6"> <td class="ltx_td ltx_align_left ltx_border_bb ltx_border_r" id="S4.T3.1.1.6.1" style="padding:1.5pt 4.0pt;">SOT-SACTC</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T3.1.1.6.2" style="padding:1.5pt 4.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S4.T3.1.1.6.2.1">22.6</span></td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" 
id="S4.T3.1.1.6.3" style="padding:1.5pt 4.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S4.T3.1.1.6.3.1">22.6</span></td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T3.1.1.6.4" style="padding:1.5pt 4.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S4.T3.1.1.6.4.1">15.9</span></td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T3.1.1.6.5" style="padding:1.5pt 4.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S4.T3.1.1.6.5.1">22.7</span></td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T3.1.1.6.6" style="padding:1.5pt 4.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S4.T3.1.1.6.6.1">29.1</span></td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T3.1.1.6.7" style="padding:1.5pt 4.0pt;"><span class="ltx_text ltx_font_bold ltx_framed ltx_framed_underline" id="S4.T3.1.1.6.7.1">22.6</span></td> </tr> </table> </span></div> </figure> </section> </section> <section class="ltx_section" id="S5"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">V </span><span class="ltx_text ltx_font_smallcaps" id="S5.1.1">conclusions</span> </h2> <div class="ltx_para" id="S5.p1"> <p class="ltx_p" id="S5.p1.1">In this work, we investigated the effect of CTC in multi-talker speech recognition (MTASR) based on serialized output training (SOT). Our findings reveal that the CTC training objective guides the ASR encoder to encode different speakers into distinct temporal regions within acoustic embeddings. Building upon this insight, we leveraged the Bayes risk CTC framework and proposed a speaker-aware CTC (SACTC), an enhanced CTC variant tailored for MTASR. The core idea of SACTC is to constrain the encoder model to represent different speakers’ tokens at specific time frames, explicitly modeling speaker disentanglement. 
SACTC was used as an auxiliary loss for SOT-based MTASR models in our experiments. Experimental results demonstrate that the SOT-SACTC model consistently outperforms the standard SOT-CTC approach across various degrees of speech overlap. Notably, we observe relative WER reductions of 10% overall and of 15% on low-overlap speech. To our knowledge, this work represents the first exploration of CTC-based enhancements for MTASR tasks. Future research directions may include extending SACTC to streaming scenarios and exploring its potential in non-autoregressive speech recognition.</p> </div> </section> <section class="ltx_section" id="S6"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">VI </span><span class="ltx_text ltx_font_smallcaps" id="S6.1.1">Acknowledgements</span> </h2> <div class="ltx_para" id="S6.p1"> <p class="ltx_p" id="S6.p1.1">This work is supported by the HKSARG Research Grants Council’s Theme-based Research Grant Scheme (Project No. T45-407/19N) and the CUHK Stanley Ho Big Data Decision Research Centre.</p> </div> <div class="ltx_pagination ltx_role_newpage"></div> </section> <section class="ltx_bibliography" id="bib"> <h2 class="ltx_title ltx_title_bibliography">References</h2> <ul class="ltx_biblist"> <li class="ltx_bibitem" id="bib.bib1"> <span class="ltx_tag ltx_tag_bibitem">[1]</span> <span class="ltx_bibblock"> N. Kanda, Y. Gaur, X. Wang, Z. Meng, and T. Yoshioka, “Serialized output training for end-to-end overlapped speech recognition,” <em class="ltx_emph ltx_font_italic" id="bib.bib1.1.1">arXiv preprint arXiv:2003.12687</em>, 2020. </span> </li> <li class="ltx_bibitem" id="bib.bib2"> <span class="ltx_tag ltx_tag_bibitem">[2]</span> <span class="ltx_bibblock"> D. Yu, M. Kolbæk, Z.-H. Tan, and J. 
Jensen, “Permutation invariant training of deep models for speaker-independent multi-talker speech separation,” in <em class="ltx_emph ltx_font_italic" id="bib.bib2.1.1">2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</em>. IEEE, 2017, pp. 241–245. </span> </li> <li class="ltx_bibitem" id="bib.bib3"> <span class="ltx_tag ltx_tag_bibitem">[3]</span> <span class="ltx_bibblock"> M. Kolbæk, D. Yu, Z.-H. Tan, and J. Jensen, “Multitalker speech separation with utterance-level permutation invariant training of deep recurrent neural networks,” <em class="ltx_emph ltx_font_italic" id="bib.bib3.1.1">IEEE/ACM Transactions on Audio, Speech, and Language Processing</em>, vol. 25, no. 10, pp. 1901–1913, 2017. </span> </li> <li class="ltx_bibitem" id="bib.bib4"> <span class="ltx_tag ltx_tag_bibitem">[4]</span> <span class="ltx_bibblock"> D. Yu, X. Chang, and Y. Qian, “Recognizing multi-talker speech with permutation invariant training,” <em class="ltx_emph ltx_font_italic" id="bib.bib4.1.1">arXiv preprint arXiv:1704.01985</em>, 2017. </span> </li> <li class="ltx_bibitem" id="bib.bib5"> <span class="ltx_tag ltx_tag_bibitem">[5]</span> <span class="ltx_bibblock"> H. Seki, T. Hori, S. Watanabe, J. L. Roux, and J. R. Hershey, “A purely end-to-end system for multi-speaker speech recognition,” <em class="ltx_emph ltx_font_italic" id="bib.bib5.1.1">arXiv preprint arXiv:1805.05826</em>, 2018. </span> </li> <li class="ltx_bibitem" id="bib.bib6"> <span class="ltx_tag ltx_tag_bibitem">[6]</span> <span class="ltx_bibblock"> X. Chang, W. Zhang, Y. Qian, J. Le Roux, and S. Watanabe, “End-to-end multi-speaker speech recognition with transformer,” in <em class="ltx_emph ltx_font_italic" id="bib.bib6.1.1">ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</em>. IEEE, 2020, pp. 6134–6138. 
</span> </li> <li class="ltx_bibitem" id="bib.bib7"> <span class="ltx_tag ltx_tag_bibitem">[7]</span> <span class="ltx_bibblock"> L. Lu, N. Kanda, J. Li, and Y. Gong, “Streaming end-to-end multi-talker speech recognition,” <em class="ltx_emph ltx_font_italic" id="bib.bib7.1.1">IEEE Signal Processing Letters</em>, vol. 28, pp. 803–807, 2021. </span> </li> <li class="ltx_bibitem" id="bib.bib8"> <span class="ltx_tag ltx_tag_bibitem">[8]</span> <span class="ltx_bibblock"> D. Raj, D. Povey, and S. Khudanpur, “Surt 2.0: Advances in transducer-based multi-talker speech recognition,” <em class="ltx_emph ltx_font_italic" id="bib.bib8.1.1">IEEE/ACM Transactions on Audio, Speech, and Language Processing</em>, 2023. </span> </li> <li class="ltx_bibitem" id="bib.bib9"> <span class="ltx_tag ltx_tag_bibitem">[9]</span> <span class="ltx_bibblock"> A. Tripathi, H. Lu, and H. Sak, “End-to-end multi-talker overlapping speech recognition,” in <em class="ltx_emph ltx_font_italic" id="bib.bib9.1.1">ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</em>. IEEE, 2020, pp. 6129–6133. </span> </li> <li class="ltx_bibitem" id="bib.bib10"> <span class="ltx_tag ltx_tag_bibitem">[10]</span> <span class="ltx_bibblock"> L. Meng, J. Kang, M. Cui, Y. Wang, X. Wu, and H. Meng, “A sidecar separator can convert a single-talker speech recognition system to a multi-talker one,” in <em class="ltx_emph ltx_font_italic" id="bib.bib10.1.1">ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</em>. IEEE, 2023, pp. 1–5. </span> </li> <li class="ltx_bibitem" id="bib.bib11"> <span class="ltx_tag ltx_tag_bibitem">[11]</span> <span class="ltx_bibblock"> L. Meng, J. Kang, M. Cui, H. Wu, X. Wu, and H. Meng, “Unified modeling of multi-talker overlapped speech recognition and diarization with a sidecar separator,” in <em class="ltx_emph ltx_font_italic" id="bib.bib11.1.1">Proceedings of Interspeech</em>, 2023, pp. 3467–3471. 
</span> </li> <li class="ltx_bibitem" id="bib.bib12"> <span class="ltx_tag ltx_tag_bibitem">[12]</span> <span class="ltx_bibblock"> L. Meng, J. Kang, Y. Wang, Z. Jin, X. Wu, X. Liu, and H. Meng, “Empowering whisper as a joint multi-talker and target-talker speech recognition system,” <em class="ltx_emph ltx_font_italic" id="bib.bib12.1.1">arXiv preprint arXiv:2407.09817</em>, 2024. </span> </li> <li class="ltx_bibitem" id="bib.bib13"> <span class="ltx_tag ltx_tag_bibitem">[13]</span> <span class="ltx_bibblock"> W. Chan, N. Jaitly, Q. Le, and O. Vinyals, “Listen, attend and spell: A neural network for large vocabulary conversational speech recognition,” in <em class="ltx_emph ltx_font_italic" id="bib.bib13.1.1">2016 IEEE international conference on acoustics, speech and signal processing (ICASSP)</em>. IEEE, 2016, pp. 4960–4964. </span> </li> <li class="ltx_bibitem" id="bib.bib14"> <span class="ltx_tag ltx_tag_bibitem">[14]</span> <span class="ltx_bibblock"> F. Yu, S. Zhang, Y. Fu, L. Xie, S. Zheng, Z. Du, W. Huang, P. Guo, Z. Yan, B. Ma <em class="ltx_emph ltx_font_italic" id="bib.bib14.1.1">et al.</em>, “M2met: The icassp 2022 multi-channel multi-party meeting transcription challenge,” in <em class="ltx_emph ltx_font_italic" id="bib.bib14.2.2">ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</em>. IEEE, 2022, pp. 6167–6171. </span> </li> <li class="ltx_bibitem" id="bib.bib15"> <span class="ltx_tag ltx_tag_bibitem">[15]</span> <span class="ltx_bibblock"> Y. Liang, M. Shi, F. Yu, Y. Li, S. Zhang, Z. Du, Q. Chen, L. Xie, Y. Qian, J. Wu <em class="ltx_emph ltx_font_italic" id="bib.bib15.1.1">et al.</em>, “The second multi-channel multi-party meeting transcription challenge (m2met 2.0): A benchmark for speaker-attributed asr,” in <em class="ltx_emph ltx_font_italic" id="bib.bib15.2.2">2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)</em>. IEEE, 2023, pp. 1–8. 
</span> </li> <li class="ltx_bibitem" id="bib.bib16"> <span class="ltx_tag ltx_tag_bibitem">[16]</span> <span class="ltx_bibblock"> N. Kanda, Y. Gaur, X. Wang, Z. Meng, Z. Chen, T. Zhou, and T. Yoshioka, “Joint speaker counting, speech recognition, and speaker identification for overlapped speech of any number of speakers,” <em class="ltx_emph ltx_font_italic" id="bib.bib16.1.1">arXiv preprint arXiv:2006.10930</em>, 2020. </span> </li> <li class="ltx_bibitem" id="bib.bib17"> <span class="ltx_tag ltx_tag_bibitem">[17]</span> <span class="ltx_bibblock"> Z. Fan, L. Dong, J. Zhang, L. Lu, and Z. Ma, “Sa-sot: Speaker-aware serialized output training for multi-talker asr,” in <em class="ltx_emph ltx_font_italic" id="bib.bib17.1.1">ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</em>. IEEE, 2024, pp. 9986–9990. </span> </li> <li class="ltx_bibitem" id="bib.bib18"> <span class="ltx_tag ltx_tag_bibitem">[18]</span> <span class="ltx_bibblock"> Y. Liang, F. Yu, Y. Li, P. Guo, S. Zhang, Q. Chen, and L. Xie, “Ba-sot: Boundary-aware serialized output training for multi-talker asr,” <em class="ltx_emph ltx_font_italic" id="bib.bib18.1.1">arXiv preprint arXiv:2305.13716</em>, 2023. </span> </li> <li class="ltx_bibitem" id="bib.bib19"> <span class="ltx_tag ltx_tag_bibitem">[19]</span> <span class="ltx_bibblock"> N. Kanda, J. Wu, Y. Wu, X. Xiao, Z. Meng, X. Wang, Y. Gaur, Z. Chen, J. Li, and T. Yoshioka, “Streaming multi-talker asr with token-level serialized output training,” <em class="ltx_emph ltx_font_italic" id="bib.bib19.1.1">arXiv preprint arXiv:2202.00842</em>, 2022. </span> </li> <li class="ltx_bibitem" id="bib.bib20"> <span class="ltx_tag ltx_tag_bibitem">[20]</span> <span class="ltx_bibblock"> Y. Shi, L. Li, S. Yin, D. Wang, and J. Han, “Serialized output training by learned dominance,” <em class="ltx_emph ltx_font_italic" id="bib.bib20.1.1">arXiv preprint arXiv:2407.03966</em>, 2024. 
</span> </li> <li class="ltx_bibitem" id="bib.bib21"> <span class="ltx_tag ltx_tag_bibitem">[21]</span> <span class="ltx_bibblock"> L. Meng, S. Hu, J. Kang, Z. Li, Y. Wang, W. Wu, X. Wu, X. Liu, and H. Meng, “Large language model can transcribe speech in multi-talker scenarios with versatile instructions,” <em class="ltx_emph ltx_font_italic" id="bib.bib21.1.1">arXiv preprint arXiv:2409.08596</em>, 2024. </span> </li> <li class="ltx_bibitem" id="bib.bib22"> <span class="ltx_tag ltx_tag_bibitem">[22]</span> <span class="ltx_bibblock"> J. Kang, L. Meng, M. Cui, H. Guo, X. Wu, X. Liu, and H. Meng, “Cross-speaker encoding network for multi-talker speech recognition,” in <em class="ltx_emph ltx_font_italic" id="bib.bib22.1.1">ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</em>. IEEE, 2024, pp. 11 986–11 990. </span> </li> <li class="ltx_bibitem" id="bib.bib23"> <span class="ltx_tag ltx_tag_bibitem">[23]</span> <span class="ltx_bibblock"> A. Graves, S. Fernández, F. Gomez, and J. Schmidhuber, “Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks,” in <em class="ltx_emph ltx_font_italic" id="bib.bib23.1.1">Proceedings of the 23rd international conference on Machine learning</em>, 2006, pp. 369–376. </span> </li> <li class="ltx_bibitem" id="bib.bib24"> <span class="ltx_tag ltx_tag_bibitem">[24]</span> <span class="ltx_bibblock"> A. Graves, “Sequence transduction with recurrent neural networks,” <em class="ltx_emph ltx_font_italic" id="bib.bib24.1.1">arXiv preprint arXiv:1211.3711</em>, 2012. </span> </li> <li class="ltx_bibitem" id="bib.bib25"> <span class="ltx_tag ltx_tag_bibitem">[25]</span> <span class="ltx_bibblock"> K. Rao, H. Sak, and R. 
Prabhavalkar, “Exploring architectures, data and units for streaming end-to-end speech recognition with rnn-transducer,” in <em class="ltx_emph ltx_font_italic" id="bib.bib25.1.1">2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)</em>. IEEE, 2017, pp. 193–199. </span> </li> <li class="ltx_bibitem" id="bib.bib26"> <span class="ltx_tag ltx_tag_bibitem">[26]</span> <span class="ltx_bibblock"> S. Watanabe, T. Hori, S. Kim, J. R. Hershey, and T. Hayashi, “Hybrid ctc/attention architecture for end-to-end speech recognition,” <em class="ltx_emph ltx_font_italic" id="bib.bib26.1.1">IEEE Journal of Selected Topics in Signal Processing</em>, vol. 11, no. 8, pp. 1240–1253, 2017. </span> </li> <li class="ltx_bibitem" id="bib.bib27"> <span class="ltx_tag ltx_tag_bibitem">[27]</span> <span class="ltx_bibblock"> F. Yu, S. Zhang, P. Guo, Y. Fu, Z. Du, S. Zheng, W. Huang, L. Xie, Z.-H. Tan, D. Wang <em class="ltx_emph ltx_font_italic" id="bib.bib27.1.1">et al.</em>, “Summary on the icassp 2022 multi-channel multi-party meeting transcription grand challenge,” in <em class="ltx_emph ltx_font_italic" id="bib.bib27.2.2">ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</em>. IEEE, 2022, pp. 9156–9160. </span> </li> <li class="ltx_bibitem" id="bib.bib28"> <span class="ltx_tag ltx_tag_bibitem">[28]</span> <span class="ltx_bibblock"> C. Shen, Y. Liu, W. Fan, B. Wang, S. Wen, Y. Tian, J. Zhang, J. Yang, and Z. Ma, “The volcspeech system for the icassp 2022 multi-channel multi-party meeting transcription challenge,” in <em class="ltx_emph ltx_font_italic" id="bib.bib28.1.1">ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</em>. IEEE, 2022, pp. 9176–9180. </span> </li> <li class="ltx_bibitem" id="bib.bib29"> <span class="ltx_tag ltx_tag_bibitem">[29]</span> <span class="ltx_bibblock"> S. Ye, P. Wang, S. Chen, X. Hu, and X. 
Xu, “The RoyalFlush system of speech recognition for M2MeT challenge,” in <em class="ltx_emph ltx_font_italic" id="bib.bib29.1.1">ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</em>. IEEE, 2022, pp. 9181–9185. </span> </li> <li class="ltx_bibitem" id="bib.bib30"> <span class="ltx_tag ltx_tag_bibitem">[30]</span> <span class="ltx_bibblock"> S.-P. Chuang, Y.-S. Chuang, C.-C. Chang, and H.-y. Lee, “Investigating the reordering capability in CTC-based non-autoregressive end-to-end speech translation,” <em class="ltx_emph ltx_font_italic" id="bib.bib30.1.1">arXiv preprint arXiv:2105.04840</em>, 2021. </span> </li> <li class="ltx_bibitem" id="bib.bib31"> <span class="ltx_tag ltx_tag_bibitem">[31]</span> <span class="ltx_bibblock"> Y. Shi, D. Wang, L. Li, and J. Han, “A glance is enough: Extract target sentence by looking at a keyword,” <em class="ltx_emph ltx_font_italic" id="bib.bib31.1.1">arXiv preprint arXiv:2310.05352</em>, 2023. </span> </li> <li class="ltx_bibitem" id="bib.bib32"> <span class="ltx_tag ltx_tag_bibitem">[32]</span> <span class="ltx_bibblock"> L. Rabiner and B. Juang, “An introduction to hidden Markov models,” <em class="ltx_emph ltx_font_italic" id="bib.bib32.1.1">IEEE ASSP Magazine</em>, vol. 3, no. 1, pp. 4–16, 1986. </span> </li> <li class="ltx_bibitem" id="bib.bib33"> <span class="ltx_tag ltx_tag_bibitem">[33]</span> <span class="ltx_bibblock"> J. Tian, B. Yan, J. Yu, C. Weng, D. Yu, and S. Watanabe, “Bayes risk CTC: Controllable CTC alignment in sequence-to-sequence tasks,” <em class="ltx_emph ltx_font_italic" id="bib.bib33.1.1">arXiv preprint arXiv:2210.07499</em>, 2022. </span> </li> <li class="ltx_bibitem" id="bib.bib34"> <span class="ltx_tag ltx_tag_bibitem">[34]</span> <span class="ltx_bibblock"> V. Panayotov, G. Chen, D. Povey, and S. 
Khudanpur, “Librispeech: an ASR corpus based on public domain audio books,” in <em class="ltx_emph ltx_font_italic" id="bib.bib34.1.1">2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)</em>. IEEE, 2015, pp. 5206–5210. </span> </li> <li class="ltx_bibitem" id="bib.bib35"> <span class="ltx_tag ltx_tag_bibitem">[35]</span> <span class="ltx_bibblock"> S. Watanabe, T. Hori, S. Karita, T. Hayashi, J. Nishitoba, Y. Unno, N. Enrique Yalta Soplin, J. Heymann, M. Wiesner, N. Chen, A. Renduchintala, and T. Ochiai, “ESPnet: End-to-end speech processing toolkit,” in <em class="ltx_emph ltx_font_italic" id="bib.bib35.1.1">Proceedings of Interspeech</em>, 2018, pp. 2207–2211. [Online]. Available: <a class="ltx_ref ltx_url ltx_font_typewriter" href="http://dx.doi.org/10.21437/Interspeech.2018-1456" title="">http://dx.doi.org/10.21437/Interspeech.2018-1456</a> </span> </li> </ul> </section> </article> </div> <footer class="ltx_page_footer"> <div class="ltx_page_logo">Generated on Fri Jan 3 12:26:00 2025 by <a class="ltx_LaTeXML_logo" href="http://dlmf.nist.gov/LaTeXML/"><span style="letter-spacing:-0.2em; margin-right:0.1em;">L<span class="ltx_font_smallcaps" style="position:relative; bottom:2.2pt;">a</span>T<span class="ltx_font_smallcaps" style="font-size:120%;position:relative; bottom:-0.2ex;">e</span></span><span style="font-size:90%; position:relative; bottom:-0.2ex;">XML</span><img alt="Mascot Sammy" 
src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAsAAAAOCAYAAAD5YeaVAAAAAXNSR0IArs4c6QAAAAZiS0dEAP8A/wD/oL2nkwAAAAlwSFlzAAALEwAACxMBAJqcGAAAAAd0SU1FB9wKExQZLWTEaOUAAAAddEVYdENvbW1lbnQAQ3JlYXRlZCB3aXRoIFRoZSBHSU1Q72QlbgAAAdpJREFUKM9tkL+L2nAARz9fPZNCKFapUn8kyI0e4iRHSR1Kb8ng0lJw6FYHFwv2LwhOpcWxTjeUunYqOmqd6hEoRDhtDWdA8ApRYsSUCDHNt5ul13vz4w0vWCgUnnEc975arX6ORqN3VqtVZbfbTQC4uEHANM3jSqXymFI6yWazP2KxWAXAL9zCUa1Wy2tXVxheKA9YNoR8Pt+aTqe4FVVVvz05O6MBhqUIBGk8Hn8HAOVy+T+XLJfLS4ZhTiRJgqIoVBRFIoric47jPnmeB1mW/9rr9ZpSSn3Lsmir1fJZlqWlUonKsvwWwD8ymc/nXwVBeLjf7xEKhdBut9Hr9WgmkyGEkJwsy5eHG5vN5g0AKIoCAEgkEkin0wQAfN9/cXPdheu6P33fBwB4ngcAcByHJpPJl+fn54mD3Gg0NrquXxeLRQAAwzAYj8cwTZPwPH9/sVg8PXweDAauqqr2cDjEer1GJBLBZDJBs9mE4zjwfZ85lAGg2+06hmGgXq+j3+/DsixYlgVN03a9Xu8jgCNCyIegIAgx13Vfd7vdu+FweG8YRkjXdWy329+dTgeSJD3ieZ7RNO0VAXAPwDEAO5VKndi2fWrb9jWl9Esul6PZbDY9Go1OZ7PZ9z/lyuD3OozU2wAAAABJRU5ErkJggg=="/></a> </div></footer> </div> </body> </html>