<!DOCTYPE html> <html lang="en"> <head> <meta content="text/html; charset=utf-8" http-equiv="content-type"/> <title>Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement</title>  <meta content="width=device-width, initial-scale=1, shrink-to-fit=no" name="viewport"/> <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/ar5iv.0.7.9.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/ar5iv-fonts.0.7.9.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/latexml_styles.css" rel="stylesheet" type="text/css"/> <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/js/bootstrap.bundle.min.js"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/html2canvas/1.3.3/html2canvas.min.js"></script> <script src="/static/browse/0.3.4/js/addons_new.js"></script> <script src="/static/browse/0.3.4/js/feedbackOverlay.js"></script> <base href="/html/2503.14854v1/"/></head> <body> <nav class="ltx_page_navbar"> <nav class="ltx_TOC"> <ol class="ltx_toclist"> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S1" title="In Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">1 </span>Introduction</span></a></li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S2" title="In Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">2 </span>NyTT and its related works</span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S2.SS1" title="In 2 NyTT 
and its related works ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">2.1 </span>Noise2Noise</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S2.SS2" title="In 2 NyTT and its related works ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">2.2 </span>MixIT</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S2.SS3" title="In 2 NyTT and its related works ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">2.3 </span>NyTT</span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S3" title="In Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">3 </span>Motivation and content of the investigation</span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S3.SS1" title="In 3 Motivation and content of the investigation ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">3.1 </span>Validity of the interpretation of NyTT</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S3.SS2" title="In 3 Motivation and content of the investigation ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target 
Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">3.2 </span>Improvement of NyTT through iteration</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S3.SS3" title="In 3 Motivation and content of the investigation ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">3.3 </span>Effects of mismatches between noise signals</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S3.SS4" title="In 3 Motivation and content of the investigation ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">3.4 </span>Effectiveness of utilizing noisy signals in a situation where clean target signals are available</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S3.SS5" title="In 3 Motivation and content of the investigation ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">3.5 </span>Capabilities in the dereverberation task</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S3.SS6" title="In 3 Motivation and content of the investigation ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">3.6 </span>Capabilities in the declipping task</span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4" title="In Analysis and Extension of 
Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4 </span>Experimental analysis in the denoising task</span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.SS1" title="In 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.1 </span>Setups</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"> <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.SS2" title="In 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.2 </span>Validity of interpretation of NyTT</span></a> <ol class="ltx_toclist ltx_toclist_subsection"> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.SS2.SSS1" title="In 4.2 Validity of interpretation of NyTT ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.2.1 </span>Analysis of signals processed in NyTT</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.SS2.SSS2" title="In 4.2 Validity of interpretation of NyTT ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.2.2 </span>Evaluation of NyTT with loss functions that do not satisfy the conditions of 
Noise2Noise</span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.SS3" title="In 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.3 </span>Effectiveness of IterNyTT</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"> <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.SS4" title="In 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.4 </span>Effects of noise mismatches</span></a> <ol class="ltx_toclist ltx_toclist_subsection"> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.SS4.SSS1" title="In 4.4 Effects of noise mismatches ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.4.1 </span>Effects of mismatches on the performance</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.SS4.SSS2" title="In 4.4 Effects of noise mismatches ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.4.2 </span>Difference in the impact of <math alttext="\mathrm{SNR}_{\textbf{x}}" class="ltx_Math" display="inline"><semantics><msub><mi>SNR</mi><mtext class="ltx_mathvariant_bold">x</mtext></msub><annotation-xml encoding="MathML-Content"><apply><csymbol 
cd="ambiguous">subscript</csymbol><ci>SNR</ci><ci><mtext class="ltx_mathvariant_bold" mathsize="70%">x</mtext></ci></apply></annotation-xml><annotation encoding="application/x-tex">\mathrm{SNR}_{\textbf{x}}</annotation><annotation encoding="application/x-llamapun">roman_SNR start_POSTSUBSCRIPT x end_POSTSUBSCRIPT</annotation></semantics></math> with and without the mismatch between <math alttext="\textbf{n}^{\rm obs}" class="ltx_Math" display="inline"><semantics><msup><mtext class="ltx_mathvariant_bold">n</mtext><mi>obs</mi></msup><annotation-xml encoding="MathML-Content"><apply><csymbol cd="ambiguous">superscript</csymbol><ci><mtext class="ltx_mathvariant_bold">n</mtext></ci><ci>obs</ci></apply></annotation-xml><annotation encoding="application/x-tex">\textbf{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun">n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\textbf{n}^{\rm test}" class="ltx_Math" display="inline"><semantics><msup><mtext class="ltx_mathvariant_bold">n</mtext><mi>test</mi></msup><annotation-xml encoding="MathML-Content"><apply><csymbol cd="ambiguous">superscript</csymbol><ci><mtext class="ltx_mathvariant_bold">n</mtext></ci><ci>test</ci></apply></annotation-xml><annotation encoding="application/x-tex">\textbf{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun">n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.SS4.SSS3" title="In 4.4 Effects of noise mismatches ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.4.3 </span>Difference in the impact of <math alttext="\mathrm{SNR}_{\textbf{y}}" class="ltx_Math" 
display="inline"><semantics><msub><mi>SNR</mi><mtext class="ltx_mathvariant_bold">y</mtext></msub><annotation-xml encoding="MathML-Content"><apply><csymbol cd="ambiguous">subscript</csymbol><ci>SNR</ci><ci><mtext class="ltx_mathvariant_bold" mathsize="70%">y</mtext></ci></apply></annotation-xml><annotation encoding="application/x-tex">\mathrm{SNR}_{\textbf{y}}</annotation><annotation encoding="application/x-llamapun">roman_SNR start_POSTSUBSCRIPT y end_POSTSUBSCRIPT</annotation></semantics></math> with and without the mismatch between <math alttext="\textbf{n}^{\rm obs}" class="ltx_Math" display="inline"><semantics><msup><mtext class="ltx_mathvariant_bold">n</mtext><mi>obs</mi></msup><annotation-xml encoding="MathML-Content"><apply><csymbol cd="ambiguous">superscript</csymbol><ci><mtext class="ltx_mathvariant_bold">n</mtext></ci><ci>obs</ci></apply></annotation-xml><annotation encoding="application/x-tex">\textbf{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun">n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\textbf{n}^{\rm test}" class="ltx_Math" display="inline"><semantics><msup><mtext class="ltx_mathvariant_bold">n</mtext><mi>test</mi></msup><annotation-xml encoding="MathML-Content"><apply><csymbol cd="ambiguous">superscript</csymbol><ci><mtext class="ltx_mathvariant_bold">n</mtext></ci><ci>test</ci></apply></annotation-xml><annotation encoding="application/x-tex">\textbf{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun">n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math></span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.SS5" title="In 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag 
ltx_tag_ref">4.5 </span>Effectiveness of utilizing noisy signals in a situation where clean target signals are available</span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S5" title="In Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">5 </span>Experimental analysis in the dereverberation task</span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S5.SS1" title="In 5 Experimental analysis in the dereverberation task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">5.1 </span>Setups</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S5.SS2" title="In 5 Experimental analysis in the dereverberation task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">5.2 </span>Effectiveness of NyTT in the dereverberation task</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S5.SS3" title="In 5 Experimental analysis in the dereverberation task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">5.3 </span>Effectiveness of IterNyTT in the dereverberation task</span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S6" title="In Analysis and Extension of Noisy-target Training for Unsupervised Target Signal 
Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">6 </span>Experimental analysis in the declipping task</span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S6.SS1" title="In 6 Experimental analysis in the declipping task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">6.1 </span>Setups</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S6.SS2" title="In 6 Experimental analysis in the declipping task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">6.2 </span>Effectiveness of NyTT in the declipping task</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S6.SS3" title="In 6 Experimental analysis in the declipping task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">6.3 </span>Effectiveness of IterNyTT in the declipping task</span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S7" title="In Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">7 </span>Conclusion</span></a></li> </ol></nav> </nav> <div class="ltx_page_main"> <div class="ltx_page_content"> <article class="ltx_document ltx_authors_1line"> <h1 class="ltx_title ltx_title_document">Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement</h1> <div class="ltx_authors"> <span class="ltx_creator ltx_role_author"> 
<span class="ltx_personname">Takuya Fujimura <br class="ltx_break"/>Graduate School of Informatics, Nagoya University, Nagoya, Japan </span></span> <span class="ltx_author_before"> </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Tomoki Toda <br class="ltx_break"/>Information Technology Center, Nagoya University, Nagoya, Japan </span></span> </div> <div class="ltx_abstract"> <h6 class="ltx_title ltx_title_abstract">Abstract</h6> <p class="ltx_p" id="id1.id1">Deep neural network-based target signal enhancement (TSE) is usually trained in a supervised manner using clean target signals. However, collecting clean target signals is costly and such signals are not always available. Thus, it is desirable to develop an unsupervised method that does not rely on clean target signals. Among various studies on unsupervised TSE methods, Noisy-target Training (NyTT) has been established as a fundamental method. NyTT simply replaces clean target signals with noisy ones in the typical supervised training, and it has been experimentally shown to achieve TSE. Despite its effectiveness and simplicity, its mechanism and detailed behavior are still unclear. In this paper, to advance NyTT and, thus, unsupervised methods as a whole, we analyze NyTT from various perspectives. We experimentally demonstrate the mechanism of NyTT, the desirable conditions, and the effectiveness of utilizing noisy signals in situations where a small number of clean target signals are available. 
Furthermore, we propose an improved version of NyTT based on its properties and explore its capabilities in the dereverberation and declipping tasks, beyond the denoising task.</p> </div> <section class="ltx_section" id="S1"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">1 </span>Introduction</h2> <div class="ltx_para" id="S1.p1"> <p class="ltx_p" id="S1.p1.1">Target signal enhancement (TSE) is a technique to extract a target signal from a noisy observation. In various speech communication systems, such as online meetings, hearing aids, and automatic speech recognition (ASR) systems, this technique has been employed to extract human speech <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">Narayanan_2013</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">Yoshioka_2015</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">Kinoshita_2020</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">Fedorov_2020</span>]</cite>. It has also been applied to various types of target signal beyond speech, including music <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">hennequin2020spleeter</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">defossez2019music</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">rouard2022hybrid</span>]</cite> and environmental sounds <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">kavalerov2019universal</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">turpault2020improving</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">fujimura2023multi</span>]</cite>. This TSE technique can be classified into multi-channel and single-channel methods. 
Multi-channel methods extract the target signal by leveraging the spatial information obtained from multiple microphones <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">DeMuth_1977</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">Veen_1988</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">Trees_2004</span>]</cite>. However, the physical size of the microphone array can sometimes limit its application. Consequently, single-channel methods, which use a single microphone and perform TSE based on differences in acoustic features between the target signal and other noise signals, also play a crucial role in TSE applications. Classical signal processing-based single-channel methods <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">Boll_1979</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">Wiener_1949</span>]</cite> have been widely adopted owing to their simplicity and low computational cost; however, their enhancement performance is often insufficient. 
In contrast, recent single-channel TSE methods have achieved significant performance improvements by incorporating deep neural networks (DNNs) <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">Williamson_2016</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">luo2019conv</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">subakan2021attention</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">Koizumi_2021</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">wang2023tf</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">rouard2022hybrid</span>]</cite>.</p> </div> <div class="ltx_para" id="S1.p2"> <p class="ltx_p" id="S1.p2.1">Most single-channel DNN-based TSE methods rely on supervised learning with clean target signals, which we refer to as Clean-target Training (CTT) (Fig. <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S1.F1" title="Figure 1 ‣ 1 Introduction ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">1</span></a>(a)). In CTT, we input a noisy signal into a DNN and train it to predict the corresponding clean target signal. 
CTT is an appropriate strategy, and various improvements have been made, including modifications to model architectures (e.g., convolutional networks <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">luo2019conv</span>]</cite>, recurrent networks <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">luo2020dual</span>]</cite>, and Transformers <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">subakan2021attention</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">Koizumi_2021</span>]</cite>), loss functions (e.g., mean-squared-error (MSE) and signal-to-noise ratio (SNR) <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">erdogan18_interspeech</span>]</cite>), and signal representations to which TSE is applied (e.g., amplitude spectrograms <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">lu2013speech</span>]</cite>, complex spectrograms <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">Williamson_2016</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">wang2023tf</span>]</cite>, waveforms <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">pascual2017segan</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">luo2019conv</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">subakan2021attention</span>]</cite>, and both spectrograms and waveforms <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">rouard2022hybrid</span>]</cite>). 
Furthermore, memory-efficient <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">tzinis2020sudo</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">subakan2022resource</span>]</cite> and real-time <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">defossez2020real</span>]</cite> architectures have also been explored. Despite these improvements, CTT still has one major problem: collecting clean target signals is costly. Typically, clean target signals are recorded in controlled settings, such as an anechoic chamber, to prevent degradation from environmental noise and reverberation. Consequently, the recording process is costly and time-consuming, limiting the amount of training data. Moreover, although TSE could in principle be applied to any type of target signal, such as animal or vehicle sounds, it is often infeasible to record clean versions of such signals. Therefore, the types of target signal that can be used in CTT are limited in practice. 
</p> </div> <div class="ltx_para" id="S1.p3"> <p class="ltx_p" id="S1.p3.1">To alleviate this limitation, unsupervised<span class="ltx_note ltx_role_footnote" id="footnote1"><sup class="ltx_note_mark">1</sup><span class="ltx_note_outer"><span class="ltx_note_content"><sup class="ltx_note_mark">1</sup><span class="ltx_tag ltx_tag_note">1</span>In the context of TSE, methods that do not require clean target signals are considered unsupervised, even if the training is performed in a supervised manner <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">Wisdom_2020</span>]</cite>.</span></span></span> TSE methods have been studied <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">Wisdom_2020</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">ito2023audio</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">Fu_2022</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">kashyap2021speech</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">alamdari2021improving</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">Fujimura_2021</span>]</cite>. PULSE <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">ito2023audio</span>]</cite> is an unsupervised TSE method based on positive-unlabeled (PU) learning <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">du2014analysis</span>]</cite> and uses noisy target signals and additional noise signals for its training. PU learning is a machine learning technique that enables the classification of positive and negative examples using positive and unlabeled training data. On the basis of this technique, PULSE classifies local patches of amplitude spectrograms into noise (positive) or target signal (negative) components. 
For training, patches of noise signals are used as positive data, while patches of noisy signals are used as unlabeled data, since a noisy signal contains both noise and target signal patches. During inference, PULSE performs TSE by applying a mask to the input amplitude spectrogram, filtering out noise (positive) patches. Another method, MetricGAN-U, is based on a generative adversarial network (GAN) and uses only noisy signals for its training. In MetricGAN-U, the discriminator is trained to predict a signal quality metric, whereas the generator is trained to maximize the score assigned by the discriminator. Unsupervised TSE is achieved by having the discriminator mimic a non-intrusive metric, that is, a metric that does not require a clean target signal as a reference. Although PULSE and MetricGAN-U have demonstrated their TSE capabilities, they cannot directly inherit advancements made in the CTT framework owing to their specialized training algorithms. For example, PULSE restricts TSE models to the time–frequency (T–F) masking approach because it relies on the classification of spectrogram patches, even though time-domain models have also been developed within the CTT framework <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">luo2019conv</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">subakan2021attention</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">Koizumi_2021</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">rouard2022hybrid</span>]</cite>. Moreover, MetricGAN-U requires a non-intrusive evaluation metric for its training, but such a metric is not always available. 
Although a pre-trained DNN-based evaluation metric predictor can be employed as the non-intrusive metric, as demonstrated in the experiments in <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">Fu_2022</span>]</cite>, constructing this predictor still requires clean target signals. Thus, this workaround is not a fundamental solution, especially when developing TSE systems for new types of target signal.</p> </div> <div class="ltx_para" id="S1.p4"> <p class="ltx_p" id="S1.p4.1">In contrast to the aforementioned unsupervised TSE methods, another approach uses noisy signals with the same training algorithm as CTT <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">Wisdom_2020</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">kashyap2021speech</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">Fujimura_2021</span>]</cite>. Noisy-target Training (NyTT) <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">Fujimura_2021</span>]</cite> is the most basic method of this approach (Fig. <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S1.F1" title="Figure 1 ‣ 1 Introduction ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">1</span></a>(d)). NyTT uses a noisy signal as the training target instead of a clean one. It trains a DNN to predict the noisy target from a noisier input synthesized by mixing the noisy target with additional noise. During inference, an unprocessed noisy signal is fed to the DNN, and its output is taken directly as the enhanced signal. NyTT has been shown experimentally to achieve TSE without clean target signals; because its training algorithm is the same as that of CTT, it can readily inherit advancements made in the CTT framework. 
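</p> <p class="ltx_p">The NyTT training pair described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (make_nytt_pair, power, l1_loss), the SNR-based mixing rule, the L1 loss, and the synthetic signals are all hypothetical choices introduced here for clarity.</p>
<pre class="ltx_verbatim">
```python
import math
import random


def power(signal):
    """Mean squared amplitude of a 1-D signal given as a list of floats."""
    return sum(v * v for v in signal) / len(signal)


def make_nytt_pair(noisy_target, extra_noise, snr_db):
    """Build one (input, target) training pair in the NyTT style.

    The target is the noisy signal itself; the input is that same signal
    with additional noise mixed in. `snr_db` (an assumed knob) is the power
    ratio of the noisy target to the scaled additional noise, in dB.
    """
    if len(noisy_target) != len(extra_noise):
        raise ValueError("signals must have the same length")
    noise_power = power(extra_noise)
    if noise_power == 0.0:
        raise ValueError("additional noise must not be silent")
    # Gain that brings the additional noise to the requested level.
    gain = math.sqrt(power(noisy_target) / (noise_power * 10.0 ** (snr_db / 10.0)))
    noisier_input = [y + gain * n for y, n in zip(noisy_target, extra_noise)]
    return noisier_input, list(noisy_target)


def l1_loss(prediction, target):
    """Mean absolute error, an assumed stand-in for the training loss."""
    return sum(abs(p - t) for p, t in zip(prediction, target)) / len(target)


rng = random.Random(0)
# Stand-ins for real recordings: a sinusoid corrupted by recording noise,
# plus an independent additional noise signal.
noisy_target = [math.sin(0.05 * t) + 0.1 * rng.gauss(0.0, 1.0) for t in range(1000)]
extra_noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]

model_input, target = make_nytt_pair(noisy_target, extra_noise, snr_db=5.0)
# An identity "model" leaves the additional noise in place, so its loss is
# positive; a trained DNN would be optimized to drive this quantity down.
print(l1_loss(model_input, target))
```
</pre>
<p class="ltx_p">In an actual system, the pair would be fed to a DNN, and the loss between the DNN output for model_input and target would be minimized with the same procedure as in CTT; at inference time, the unprocessed noisy signal is passed to the DNN directly.</p> <p class="ltx_p">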
However, the exact mechanism and desirable conditions of NyTT have not been clarified, hindering further advancements.</p> </div> <div class="ltx_para" id="S1.p5"> <p class="ltx_p" id="S1.p5.1">In this paper, we aim to advance the field of unsupervised TSE by analyzing the fundamental method, NyTT, from various perspectives and deepening our understanding of NyTT<span class="ltx_note ltx_role_footnote" id="footnote2"><sup class="ltx_note_mark">2</sup><span class="ltx_note_outer"><span class="ltx_note_content"><sup class="ltx_note_mark">2</sup><span class="ltx_tag ltx_tag_note">2</span>This paper is an extension of our previous paper <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">Fujimura_2023</span>]</cite>. Compared with the previous paper, this paper provides a more thorough discussion of related works, a more comprehensive analysis of the desirable conditions for NyTT, and an investigation of its capability in dereverberation and declipping tasks, employing multiple evaluation metrics.</span></span></span>. Through this analysis, we clarify 1) the mechanism of NyTT, 2) the desirable conditions of NyTT, and 3) the effectiveness of utilizing noisy signals in situations where a small number of clean target signals are available. Additionally, 4) we propose an improved version of NyTT based on its properties, demonstrating its potential to achieve performance comparable to CTT by iteratively improving the quality of the noisy target signals. Finally, 5) we demonstrate that NyTT can also handle dereverberation and declipping tasks, inheriting the broad applicability of CTT.</p> </div> <div class="ltx_para" id="S1.p6"> <p class="ltx_p" id="S1.p6.1">The rest of this paper is organized as follows: Sec. 
<a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S2" title="2 NyTT and its related works ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">2</span></a> provides details of NyTT and its closely related works. In Sec. <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S3" title="3 Motivation and content of the investigation ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">3</span></a>, we outline the motivation and contents of our analyses. In Secs. <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4" title="4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">4</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S5" title="5 Experimental analysis in the dereverberation task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">5</span></a>, and <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S6" title="6 Experimental analysis in the declipping task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">6</span></a>, we present experimental results in denoising, dereverberation, and declipping tasks, respectively. Finally, in Sec. 
<a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S7" title="7 Conclusion ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">7</span></a>, we conclude this paper.</p> </div> <figure class="ltx_figure" id="S1.F1"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="443" id="S1.F1.g1" src="x1.png" width="822"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure">Figure 1: </span>Comparison of NyTT and its related methods.</figcaption> </figure> </section> <section class="ltx_section" id="S2"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">2 </span>NyTT and its related works</h2> <section class="ltx_subsection" id="S2.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">2.1 </span>Noise2Noise</h3> <div class="ltx_para" id="S2.SS1.p1"> <p class="ltx_p" id="S2.SS1.p1.7">Noise2Noise <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">Lehtinen_2018</span>]</cite> is an unsupervised training method originally proposed for image denoising (Fig. <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S1.F1" title="Figure 1 ‣ 1 Introduction ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">1</span></a>(b)). 
In Noise2Noise, pairs of noisy signals <math alttext="(\bm{y}^{(1)}=\bm{s}+\bm{n}^{(1)},\bm{y}^{(2)}=\bm{s}+\bm{n}^{(2)})" class="ltx_Math" display="inline" id="S2.SS1.p1.1.m1.5"><semantics id="S2.SS1.p1.1.m1.5a"><mrow id="S2.SS1.p1.1.m1.5.5.1"><mo id="S2.SS1.p1.1.m1.5.5.1.2" stretchy="false">(</mo><mrow id="S2.SS1.p1.1.m1.5.5.1.1.2" xref="S2.SS1.p1.1.m1.5.5.1.1.3.cmml"><mrow id="S2.SS1.p1.1.m1.5.5.1.1.1.1" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.cmml"><msup id="S2.SS1.p1.1.m1.5.5.1.1.1.1.2" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.2.cmml"><mi id="S2.SS1.p1.1.m1.5.5.1.1.1.1.2.2" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.2.2.cmml">𝒚</mi><mrow id="S2.SS1.p1.1.m1.1.1.1.3" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.2.cmml"><mo id="S2.SS1.p1.1.m1.1.1.1.3.1" stretchy="false" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.2.cmml">(</mo><mn id="S2.SS1.p1.1.m1.1.1.1.1" xref="S2.SS1.p1.1.m1.1.1.1.1.cmml">1</mn><mo id="S2.SS1.p1.1.m1.1.1.1.3.2" stretchy="false" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.2.cmml">)</mo></mrow></msup><mo id="S2.SS1.p1.1.m1.5.5.1.1.1.1.1" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.1.cmml">=</mo><mrow id="S2.SS1.p1.1.m1.5.5.1.1.1.1.3" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.cmml"><mi id="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.2" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.2.cmml">𝒔</mi><mo id="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.1" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.1.cmml">+</mo><msup id="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.3" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.3.cmml"><mi id="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.3.2" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.3.2.cmml">𝒏</mi><mrow id="S2.SS1.p1.1.m1.2.2.1.3" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.3.cmml"><mo id="S2.SS1.p1.1.m1.2.2.1.3.1" stretchy="false" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.3.cmml">(</mo><mn id="S2.SS1.p1.1.m1.2.2.1.1" xref="S2.SS1.p1.1.m1.2.2.1.1.cmml">1</mn><mo id="S2.SS1.p1.1.m1.2.2.1.3.2" stretchy="false" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.3.cmml">)</mo></mrow></msup></mrow></mrow><mo id="S2.SS1.p1.1.m1.5.5.1.1.2.3" xref="S2.SS1.p1.1.m1.5.5.1.1.3a.cmml">,</mo><mrow id="S2.SS1.p1.1.m1.5.5.1.1.2.2" 
xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.cmml"><msup id="S2.SS1.p1.1.m1.5.5.1.1.2.2.2" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.2.cmml"><mi id="S2.SS1.p1.1.m1.5.5.1.1.2.2.2.2" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.2.2.cmml">𝒚</mi><mrow id="S2.SS1.p1.1.m1.3.3.1.3" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.2.cmml"><mo id="S2.SS1.p1.1.m1.3.3.1.3.1" stretchy="false" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.2.cmml">(</mo><mn id="S2.SS1.p1.1.m1.3.3.1.1" xref="S2.SS1.p1.1.m1.3.3.1.1.cmml">2</mn><mo id="S2.SS1.p1.1.m1.3.3.1.3.2" stretchy="false" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.2.cmml">)</mo></mrow></msup><mo id="S2.SS1.p1.1.m1.5.5.1.1.2.2.1" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.1.cmml">=</mo><mrow id="S2.SS1.p1.1.m1.5.5.1.1.2.2.3" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.cmml"><mi id="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.2" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.2.cmml">𝒔</mi><mo id="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.1" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.1.cmml">+</mo><msup id="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.3" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.3.cmml"><mi id="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.3.2" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.3.2.cmml">𝒏</mi><mrow id="S2.SS1.p1.1.m1.4.4.1.3" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.3.cmml"><mo id="S2.SS1.p1.1.m1.4.4.1.3.1" stretchy="false" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.3.cmml">(</mo><mn id="S2.SS1.p1.1.m1.4.4.1.1" xref="S2.SS1.p1.1.m1.4.4.1.1.cmml">2</mn><mo id="S2.SS1.p1.1.m1.4.4.1.3.2" stretchy="false" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.3.cmml">)</mo></mrow></msup></mrow></mrow></mrow><mo id="S2.SS1.p1.1.m1.5.5.1.3" stretchy="false">)</mo></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.1.m1.5b"><apply id="S2.SS1.p1.1.m1.5.5.1.1.3.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.2"><csymbol cd="ambiguous" id="S2.SS1.p1.1.m1.5.5.1.1.3a.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.2.3">formulae-sequence</csymbol><apply id="S2.SS1.p1.1.m1.5.5.1.1.1.1.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1"><eq id="S2.SS1.p1.1.m1.5.5.1.1.1.1.1.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.1"></eq><apply 
id="S2.SS1.p1.1.m1.5.5.1.1.1.1.2.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.2"><csymbol cd="ambiguous" id="S2.SS1.p1.1.m1.5.5.1.1.1.1.2.1.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.2">superscript</csymbol><ci id="S2.SS1.p1.1.m1.5.5.1.1.1.1.2.2.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.2.2">𝒚</ci><cn id="S2.SS1.p1.1.m1.1.1.1.1.cmml" type="integer" xref="S2.SS1.p1.1.m1.1.1.1.1">1</cn></apply><apply id="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.3"><plus id="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.1.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.1"></plus><ci id="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.2.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.2">𝒔</ci><apply id="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.3.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.3"><csymbol cd="ambiguous" id="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.3.1.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.3">superscript</csymbol><ci id="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.3.2.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.1.1.3.3.2">𝒏</ci><cn id="S2.SS1.p1.1.m1.2.2.1.1.cmml" type="integer" xref="S2.SS1.p1.1.m1.2.2.1.1">1</cn></apply></apply></apply><apply id="S2.SS1.p1.1.m1.5.5.1.1.2.2.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2"><eq id="S2.SS1.p1.1.m1.5.5.1.1.2.2.1.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.1"></eq><apply id="S2.SS1.p1.1.m1.5.5.1.1.2.2.2.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.2"><csymbol cd="ambiguous" id="S2.SS1.p1.1.m1.5.5.1.1.2.2.2.1.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.2">superscript</csymbol><ci id="S2.SS1.p1.1.m1.5.5.1.1.2.2.2.2.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.2.2">𝒚</ci><cn id="S2.SS1.p1.1.m1.3.3.1.1.cmml" type="integer" xref="S2.SS1.p1.1.m1.3.3.1.1">2</cn></apply><apply id="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.3"><plus id="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.1.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.1"></plus><ci id="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.2.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.2">𝒔</ci><apply id="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.3.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.3"><csymbol cd="ambiguous" 
id="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.3.1.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.3">superscript</csymbol><ci id="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.3.2.cmml" xref="S2.SS1.p1.1.m1.5.5.1.1.2.2.3.3.2">𝒏</ci><cn id="S2.SS1.p1.1.m1.4.4.1.1.cmml" type="integer" xref="S2.SS1.p1.1.m1.4.4.1.1">2</cn></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.1.m1.5c">(\bm{y}^{(1)}=\bm{s}+\bm{n}^{(1)},\bm{y}^{(2)}=\bm{s}+\bm{n}^{(2)})</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.1.m1.5d">( bold_italic_y start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT = bold_italic_s + bold_italic_n start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT , bold_italic_y start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT = bold_italic_s + bold_italic_n start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT )</annotation></semantics></math> are used as training data, where the two noisy signals <math alttext="\bm{y}^{(1)}" class="ltx_Math" display="inline" id="S2.SS1.p1.2.m2.1"><semantics id="S2.SS1.p1.2.m2.1a"><msup id="S2.SS1.p1.2.m2.1.2" xref="S2.SS1.p1.2.m2.1.2.cmml"><mi id="S2.SS1.p1.2.m2.1.2.2" xref="S2.SS1.p1.2.m2.1.2.2.cmml">𝒚</mi><mrow id="S2.SS1.p1.2.m2.1.1.1.3" xref="S2.SS1.p1.2.m2.1.2.cmml"><mo id="S2.SS1.p1.2.m2.1.1.1.3.1" stretchy="false" xref="S2.SS1.p1.2.m2.1.2.cmml">(</mo><mn id="S2.SS1.p1.2.m2.1.1.1.1" xref="S2.SS1.p1.2.m2.1.1.1.1.cmml">1</mn><mo id="S2.SS1.p1.2.m2.1.1.1.3.2" stretchy="false" xref="S2.SS1.p1.2.m2.1.2.cmml">)</mo></mrow></msup><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.2.m2.1b"><apply id="S2.SS1.p1.2.m2.1.2.cmml" xref="S2.SS1.p1.2.m2.1.2"><csymbol cd="ambiguous" id="S2.SS1.p1.2.m2.1.2.1.cmml" xref="S2.SS1.p1.2.m2.1.2">superscript</csymbol><ci id="S2.SS1.p1.2.m2.1.2.2.cmml" xref="S2.SS1.p1.2.m2.1.2.2">𝒚</ci><cn id="S2.SS1.p1.2.m2.1.1.1.1.cmml" type="integer" xref="S2.SS1.p1.2.m2.1.1.1.1">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" 
id="S2.SS1.p1.2.m2.1c">\bm{y}^{(1)}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.2.m2.1d">bold_italic_y start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{y}^{(2)}" class="ltx_Math" display="inline" id="S2.SS1.p1.3.m3.1"><semantics id="S2.SS1.p1.3.m3.1a"><msup id="S2.SS1.p1.3.m3.1.2" xref="S2.SS1.p1.3.m3.1.2.cmml"><mi id="S2.SS1.p1.3.m3.1.2.2" xref="S2.SS1.p1.3.m3.1.2.2.cmml">𝒚</mi><mrow id="S2.SS1.p1.3.m3.1.1.1.3" xref="S2.SS1.p1.3.m3.1.2.cmml"><mo id="S2.SS1.p1.3.m3.1.1.1.3.1" stretchy="false" xref="S2.SS1.p1.3.m3.1.2.cmml">(</mo><mn id="S2.SS1.p1.3.m3.1.1.1.1" xref="S2.SS1.p1.3.m3.1.1.1.1.cmml">2</mn><mo id="S2.SS1.p1.3.m3.1.1.1.3.2" stretchy="false" xref="S2.SS1.p1.3.m3.1.2.cmml">)</mo></mrow></msup><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.3.m3.1b"><apply id="S2.SS1.p1.3.m3.1.2.cmml" xref="S2.SS1.p1.3.m3.1.2"><csymbol cd="ambiguous" id="S2.SS1.p1.3.m3.1.2.1.cmml" xref="S2.SS1.p1.3.m3.1.2">superscript</csymbol><ci id="S2.SS1.p1.3.m3.1.2.2.cmml" xref="S2.SS1.p1.3.m3.1.2.2">𝒚</ci><cn id="S2.SS1.p1.3.m3.1.1.1.1.cmml" type="integer" xref="S2.SS1.p1.3.m3.1.1.1.1">2</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.3.m3.1c">\bm{y}^{(2)}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.3.m3.1d">bold_italic_y start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT</annotation></semantics></math> share the same clean target signal <math alttext="\bm{s}" class="ltx_Math" display="inline" id="S2.SS1.p1.4.m4.1"><semantics id="S2.SS1.p1.4.m4.1a"><mi id="S2.SS1.p1.4.m4.1.1" xref="S2.SS1.p1.4.m4.1.1.cmml">𝒔</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.4.m4.1b"><ci id="S2.SS1.p1.4.m4.1.1.cmml" xref="S2.SS1.p1.4.m4.1.1">𝒔</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.4.m4.1c">\bm{s}</annotation><annotation encoding="application/x-llamapun" 
id="S2.SS1.p1.4.m4.1d">bold_italic_s</annotation></semantics></math> but have different noise components <math alttext="\bm{n}^{(1)}" class="ltx_Math" display="inline" id="S2.SS1.p1.5.m5.1"><semantics id="S2.SS1.p1.5.m5.1a"><msup id="S2.SS1.p1.5.m5.1.2" xref="S2.SS1.p1.5.m5.1.2.cmml"><mi id="S2.SS1.p1.5.m5.1.2.2" xref="S2.SS1.p1.5.m5.1.2.2.cmml">𝒏</mi><mrow id="S2.SS1.p1.5.m5.1.1.1.3" xref="S2.SS1.p1.5.m5.1.2.cmml"><mo id="S2.SS1.p1.5.m5.1.1.1.3.1" stretchy="false" xref="S2.SS1.p1.5.m5.1.2.cmml">(</mo><mn id="S2.SS1.p1.5.m5.1.1.1.1" xref="S2.SS1.p1.5.m5.1.1.1.1.cmml">1</mn><mo id="S2.SS1.p1.5.m5.1.1.1.3.2" stretchy="false" xref="S2.SS1.p1.5.m5.1.2.cmml">)</mo></mrow></msup><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.5.m5.1b"><apply id="S2.SS1.p1.5.m5.1.2.cmml" xref="S2.SS1.p1.5.m5.1.2"><csymbol cd="ambiguous" id="S2.SS1.p1.5.m5.1.2.1.cmml" xref="S2.SS1.p1.5.m5.1.2">superscript</csymbol><ci id="S2.SS1.p1.5.m5.1.2.2.cmml" xref="S2.SS1.p1.5.m5.1.2.2">𝒏</ci><cn id="S2.SS1.p1.5.m5.1.1.1.1.cmml" type="integer" xref="S2.SS1.p1.5.m5.1.1.1.1">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.5.m5.1c">\bm{n}^{(1)}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.5.m5.1d">bold_italic_n start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{(2)}" class="ltx_Math" display="inline" id="S2.SS1.p1.6.m6.1"><semantics id="S2.SS1.p1.6.m6.1a"><msup id="S2.SS1.p1.6.m6.1.2" xref="S2.SS1.p1.6.m6.1.2.cmml"><mi id="S2.SS1.p1.6.m6.1.2.2" xref="S2.SS1.p1.6.m6.1.2.2.cmml">𝒏</mi><mrow id="S2.SS1.p1.6.m6.1.1.1.3" xref="S2.SS1.p1.6.m6.1.2.cmml"><mo id="S2.SS1.p1.6.m6.1.1.1.3.1" stretchy="false" xref="S2.SS1.p1.6.m6.1.2.cmml">(</mo><mn id="S2.SS1.p1.6.m6.1.1.1.1" xref="S2.SS1.p1.6.m6.1.1.1.1.cmml">2</mn><mo id="S2.SS1.p1.6.m6.1.1.1.3.2" stretchy="false" xref="S2.SS1.p1.6.m6.1.2.cmml">)</mo></mrow></msup><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.6.m6.1b"><apply 
id="S2.SS1.p1.6.m6.1.2.cmml" xref="S2.SS1.p1.6.m6.1.2"><csymbol cd="ambiguous" id="S2.SS1.p1.6.m6.1.2.1.cmml" xref="S2.SS1.p1.6.m6.1.2">superscript</csymbol><ci id="S2.SS1.p1.6.m6.1.2.2.cmml" xref="S2.SS1.p1.6.m6.1.2.2">𝒏</ci><cn id="S2.SS1.p1.6.m6.1.1.1.1.cmml" type="integer" xref="S2.SS1.p1.6.m6.1.1.1.1">2</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.6.m6.1c">\bm{n}^{(2)}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.6.m6.1d">bold_italic_n start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT</annotation></semantics></math>. A DNN <math alttext="f(\cdot)" class="ltx_Math" display="inline" id="S2.SS1.p1.7.m7.1"><semantics id="S2.SS1.p1.7.m7.1a"><mrow id="S2.SS1.p1.7.m7.1.2" xref="S2.SS1.p1.7.m7.1.2.cmml"><mi id="S2.SS1.p1.7.m7.1.2.2" xref="S2.SS1.p1.7.m7.1.2.2.cmml">f</mi><mo id="S2.SS1.p1.7.m7.1.2.1" xref="S2.SS1.p1.7.m7.1.2.1.cmml">⁢</mo><mrow id="S2.SS1.p1.7.m7.1.2.3.2" xref="S2.SS1.p1.7.m7.1.2.cmml"><mo id="S2.SS1.p1.7.m7.1.2.3.2.1" stretchy="false" xref="S2.SS1.p1.7.m7.1.2.cmml">(</mo><mo id="S2.SS1.p1.7.m7.1.1" lspace="0em" rspace="0em" xref="S2.SS1.p1.7.m7.1.1.cmml">⋅</mo><mo id="S2.SS1.p1.7.m7.1.2.3.2.2" stretchy="false" xref="S2.SS1.p1.7.m7.1.2.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.7.m7.1b"><apply id="S2.SS1.p1.7.m7.1.2.cmml" xref="S2.SS1.p1.7.m7.1.2"><times id="S2.SS1.p1.7.m7.1.2.1.cmml" xref="S2.SS1.p1.7.m7.1.2.1"></times><ci id="S2.SS1.p1.7.m7.1.2.2.cmml" xref="S2.SS1.p1.7.m7.1.2.2">𝑓</ci><ci id="S2.SS1.p1.7.m7.1.1.cmml" xref="S2.SS1.p1.7.m7.1.1">⋅</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.7.m7.1c">f(\cdot)</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.7.m7.1d">italic_f ( ⋅ )</annotation></semantics></math> is trained to minimize the following prediction error:</p> <table class="ltx_equation ltx_eqn_table" id="S2.E1"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td 
class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="\mathcal{L}^{\rm N2N}=\mathbb{E}_{(\bm{y}^{(1)},\bm{y}^{(2)})\sim\mathcal{D}}[% L(f(\bm{y}^{(1)};\theta),\bm{y}^{(2)})]," class="ltx_Math" display="block" id="S2.E1.m1.8"><semantics id="S2.E1.m1.8a"><mrow id="S2.E1.m1.8.8.1" xref="S2.E1.m1.8.8.1.1.cmml"><mrow id="S2.E1.m1.8.8.1.1" xref="S2.E1.m1.8.8.1.1.cmml"><msup id="S2.E1.m1.8.8.1.1.3" xref="S2.E1.m1.8.8.1.1.3.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.E1.m1.8.8.1.1.3.2" xref="S2.E1.m1.8.8.1.1.3.2.cmml">ℒ</mi><mi id="S2.E1.m1.8.8.1.1.3.3" xref="S2.E1.m1.8.8.1.1.3.3.cmml">N2N</mi></msup><mo id="S2.E1.m1.8.8.1.1.2" xref="S2.E1.m1.8.8.1.1.2.cmml">=</mo><mrow id="S2.E1.m1.8.8.1.1.1" xref="S2.E1.m1.8.8.1.1.1.cmml"><msub id="S2.E1.m1.8.8.1.1.1.3" xref="S2.E1.m1.8.8.1.1.1.3.cmml"><mi id="S2.E1.m1.8.8.1.1.1.3.2" xref="S2.E1.m1.8.8.1.1.1.3.2.cmml">𝔼</mi><mrow id="S2.E1.m1.4.4.4" xref="S2.E1.m1.4.4.4.cmml"><mrow id="S2.E1.m1.4.4.4.4.2" xref="S2.E1.m1.4.4.4.4.3.cmml"><mo id="S2.E1.m1.4.4.4.4.2.3" stretchy="false" xref="S2.E1.m1.4.4.4.4.3.cmml">(</mo><msup id="S2.E1.m1.3.3.3.3.1.1" xref="S2.E1.m1.3.3.3.3.1.1.cmml"><mi id="S2.E1.m1.3.3.3.3.1.1.2" xref="S2.E1.m1.3.3.3.3.1.1.2.cmml">𝒚</mi><mrow id="S2.E1.m1.1.1.1.1.1.3" xref="S2.E1.m1.3.3.3.3.1.1.cmml"><mo id="S2.E1.m1.1.1.1.1.1.3.1" stretchy="false" xref="S2.E1.m1.3.3.3.3.1.1.cmml">(</mo><mn id="S2.E1.m1.1.1.1.1.1.1" xref="S2.E1.m1.1.1.1.1.1.1.cmml">1</mn><mo id="S2.E1.m1.1.1.1.1.1.3.2" stretchy="false" xref="S2.E1.m1.3.3.3.3.1.1.cmml">)</mo></mrow></msup><mo id="S2.E1.m1.4.4.4.4.2.4" xref="S2.E1.m1.4.4.4.4.3.cmml">,</mo><msup id="S2.E1.m1.4.4.4.4.2.2" xref="S2.E1.m1.4.4.4.4.2.2.cmml"><mi id="S2.E1.m1.4.4.4.4.2.2.2" xref="S2.E1.m1.4.4.4.4.2.2.2.cmml">𝒚</mi><mrow id="S2.E1.m1.2.2.2.2.1.3" xref="S2.E1.m1.4.4.4.4.2.2.cmml"><mo id="S2.E1.m1.2.2.2.2.1.3.1" stretchy="false" xref="S2.E1.m1.4.4.4.4.2.2.cmml">(</mo><mn id="S2.E1.m1.2.2.2.2.1.1" 
xref="S2.E1.m1.2.2.2.2.1.1.cmml">2</mn><mo id="S2.E1.m1.2.2.2.2.1.3.2" stretchy="false" xref="S2.E1.m1.4.4.4.4.2.2.cmml">)</mo></mrow></msup><mo id="S2.E1.m1.4.4.4.4.2.5" stretchy="false" xref="S2.E1.m1.4.4.4.4.3.cmml">)</mo></mrow><mo id="S2.E1.m1.4.4.4.5" xref="S2.E1.m1.4.4.4.5.cmml">∼</mo><mi class="ltx_font_mathcaligraphic" id="S2.E1.m1.4.4.4.6" xref="S2.E1.m1.4.4.4.6.cmml">𝒟</mi></mrow></msub><mo id="S2.E1.m1.8.8.1.1.1.2" xref="S2.E1.m1.8.8.1.1.1.2.cmml">⁢</mo><mrow id="S2.E1.m1.8.8.1.1.1.1.1" xref="S2.E1.m1.8.8.1.1.1.1.2.cmml"><mo id="S2.E1.m1.8.8.1.1.1.1.1.2" stretchy="false" xref="S2.E1.m1.8.8.1.1.1.1.2.1.cmml">[</mo><mrow id="S2.E1.m1.8.8.1.1.1.1.1.1" xref="S2.E1.m1.8.8.1.1.1.1.1.1.cmml"><mi id="S2.E1.m1.8.8.1.1.1.1.1.1.4" xref="S2.E1.m1.8.8.1.1.1.1.1.1.4.cmml">L</mi><mo id="S2.E1.m1.8.8.1.1.1.1.1.1.3" xref="S2.E1.m1.8.8.1.1.1.1.1.1.3.cmml">⁢</mo><mrow id="S2.E1.m1.8.8.1.1.1.1.1.1.2.2" xref="S2.E1.m1.8.8.1.1.1.1.1.1.2.3.cmml"><mo id="S2.E1.m1.8.8.1.1.1.1.1.1.2.2.3" stretchy="false" xref="S2.E1.m1.8.8.1.1.1.1.1.1.2.3.cmml">(</mo><mrow id="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1" xref="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.cmml"><mi id="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.3" xref="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.3.cmml">f</mi><mo id="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.2" xref="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.2.cmml">⁢</mo><mrow id="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.1" xref="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.2.cmml"><mo id="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.1.2" stretchy="false" xref="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.2.cmml">(</mo><msup id="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.1.1" xref="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.1.1.cmml"><mi id="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.1.1.2" xref="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml">𝒚</mi><mrow id="S2.E1.m1.5.5.1.3" xref="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.1.1.cmml"><mo id="S2.E1.m1.5.5.1.3.1" stretchy="false" xref="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.1.1.cmml">(</mo><mn id="S2.E1.m1.5.5.1.1" xref="S2.E1.m1.5.5.1.1.cmml">1</mn><mo 
id="S2.E1.m1.5.5.1.3.2" stretchy="false" xref="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.1.1.cmml">)</mo></mrow></msup><mo id="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.1.3" xref="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.2.cmml">;</mo><mi id="S2.E1.m1.7.7" xref="S2.E1.m1.7.7.cmml">θ</mi><mo id="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.1.4" stretchy="false" xref="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.2.cmml">)</mo></mrow></mrow><mo id="S2.E1.m1.8.8.1.1.1.1.1.1.2.2.4" xref="S2.E1.m1.8.8.1.1.1.1.1.1.2.3.cmml">,</mo><msup id="S2.E1.m1.8.8.1.1.1.1.1.1.2.2.2" xref="S2.E1.m1.8.8.1.1.1.1.1.1.2.2.2.cmml"><mi id="S2.E1.m1.8.8.1.1.1.1.1.1.2.2.2.2" xref="S2.E1.m1.8.8.1.1.1.1.1.1.2.2.2.2.cmml">𝒚</mi><mrow id="S2.E1.m1.6.6.1.3" xref="S2.E1.m1.8.8.1.1.1.1.1.1.2.2.2.cmml"><mo id="S2.E1.m1.6.6.1.3.1" stretchy="false" xref="S2.E1.m1.8.8.1.1.1.1.1.1.2.2.2.cmml">(</mo><mn id="S2.E1.m1.6.6.1.1" xref="S2.E1.m1.6.6.1.1.cmml">2</mn><mo id="S2.E1.m1.6.6.1.3.2" stretchy="false" xref="S2.E1.m1.8.8.1.1.1.1.1.1.2.2.2.cmml">)</mo></mrow></msup><mo id="S2.E1.m1.8.8.1.1.1.1.1.1.2.2.5" stretchy="false" xref="S2.E1.m1.8.8.1.1.1.1.1.1.2.3.cmml">)</mo></mrow></mrow><mo id="S2.E1.m1.8.8.1.1.1.1.1.3" stretchy="false" xref="S2.E1.m1.8.8.1.1.1.1.2.1.cmml">]</mo></mrow></mrow></mrow><mo id="S2.E1.m1.8.8.1.2" xref="S2.E1.m1.8.8.1.1.cmml">,</mo></mrow><annotation-xml encoding="MathML-Content" id="S2.E1.m1.8b"><apply id="S2.E1.m1.8.8.1.1.cmml" xref="S2.E1.m1.8.8.1"><eq id="S2.E1.m1.8.8.1.1.2.cmml" xref="S2.E1.m1.8.8.1.1.2"></eq><apply id="S2.E1.m1.8.8.1.1.3.cmml" xref="S2.E1.m1.8.8.1.1.3"><csymbol cd="ambiguous" id="S2.E1.m1.8.8.1.1.3.1.cmml" xref="S2.E1.m1.8.8.1.1.3">superscript</csymbol><ci id="S2.E1.m1.8.8.1.1.3.2.cmml" xref="S2.E1.m1.8.8.1.1.3.2">ℒ</ci><ci id="S2.E1.m1.8.8.1.1.3.3.cmml" xref="S2.E1.m1.8.8.1.1.3.3">N2N</ci></apply><apply id="S2.E1.m1.8.8.1.1.1.cmml" xref="S2.E1.m1.8.8.1.1.1"><times id="S2.E1.m1.8.8.1.1.1.2.cmml" xref="S2.E1.m1.8.8.1.1.1.2"></times><apply id="S2.E1.m1.8.8.1.1.1.3.cmml" xref="S2.E1.m1.8.8.1.1.1.3"><csymbol 
cd="ambiguous" id="S2.E1.m1.8.8.1.1.1.3.1.cmml" xref="S2.E1.m1.8.8.1.1.1.3">subscript</csymbol><ci id="S2.E1.m1.8.8.1.1.1.3.2.cmml" xref="S2.E1.m1.8.8.1.1.1.3.2">𝔼</ci><apply id="S2.E1.m1.4.4.4.cmml" xref="S2.E1.m1.4.4.4"><csymbol cd="latexml" id="S2.E1.m1.4.4.4.5.cmml" xref="S2.E1.m1.4.4.4.5">similar-to</csymbol><interval closure="open" id="S2.E1.m1.4.4.4.4.3.cmml" xref="S2.E1.m1.4.4.4.4.2"><apply id="S2.E1.m1.3.3.3.3.1.1.cmml" xref="S2.E1.m1.3.3.3.3.1.1"><csymbol cd="ambiguous" id="S2.E1.m1.3.3.3.3.1.1.1.cmml" xref="S2.E1.m1.3.3.3.3.1.1">superscript</csymbol><ci id="S2.E1.m1.3.3.3.3.1.1.2.cmml" xref="S2.E1.m1.3.3.3.3.1.1.2">𝒚</ci><cn id="S2.E1.m1.1.1.1.1.1.1.cmml" type="integer" xref="S2.E1.m1.1.1.1.1.1.1">1</cn></apply><apply id="S2.E1.m1.4.4.4.4.2.2.cmml" xref="S2.E1.m1.4.4.4.4.2.2"><csymbol cd="ambiguous" id="S2.E1.m1.4.4.4.4.2.2.1.cmml" xref="S2.E1.m1.4.4.4.4.2.2">superscript</csymbol><ci id="S2.E1.m1.4.4.4.4.2.2.2.cmml" xref="S2.E1.m1.4.4.4.4.2.2.2">𝒚</ci><cn id="S2.E1.m1.2.2.2.2.1.1.cmml" type="integer" xref="S2.E1.m1.2.2.2.2.1.1">2</cn></apply></interval><ci id="S2.E1.m1.4.4.4.6.cmml" xref="S2.E1.m1.4.4.4.6">𝒟</ci></apply></apply><apply id="S2.E1.m1.8.8.1.1.1.1.2.cmml" xref="S2.E1.m1.8.8.1.1.1.1.1"><csymbol cd="latexml" id="S2.E1.m1.8.8.1.1.1.1.2.1.cmml" xref="S2.E1.m1.8.8.1.1.1.1.1.2">delimited-[]</csymbol><apply id="S2.E1.m1.8.8.1.1.1.1.1.1.cmml" xref="S2.E1.m1.8.8.1.1.1.1.1.1"><times id="S2.E1.m1.8.8.1.1.1.1.1.1.3.cmml" xref="S2.E1.m1.8.8.1.1.1.1.1.1.3"></times><ci id="S2.E1.m1.8.8.1.1.1.1.1.1.4.cmml" xref="S2.E1.m1.8.8.1.1.1.1.1.1.4">𝐿</ci><interval closure="open" id="S2.E1.m1.8.8.1.1.1.1.1.1.2.3.cmml" xref="S2.E1.m1.8.8.1.1.1.1.1.1.2.2"><apply id="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1"><times id="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.2.cmml" xref="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.2"></times><ci id="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.3.cmml" xref="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.3">𝑓</ci><list 
id="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.1"><apply id="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.1.1">superscript</csymbol><ci id="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S2.E1.m1.8.8.1.1.1.1.1.1.1.1.1.1.1.1.2">𝒚</ci><cn id="S2.E1.m1.5.5.1.1.cmml" type="integer" xref="S2.E1.m1.5.5.1.1">1</cn></apply><ci id="S2.E1.m1.7.7.cmml" xref="S2.E1.m1.7.7">𝜃</ci></list></apply><apply id="S2.E1.m1.8.8.1.1.1.1.1.1.2.2.2.cmml" xref="S2.E1.m1.8.8.1.1.1.1.1.1.2.2.2"><csymbol cd="ambiguous" id="S2.E1.m1.8.8.1.1.1.1.1.1.2.2.2.1.cmml" xref="S2.E1.m1.8.8.1.1.1.1.1.1.2.2.2">superscript</csymbol><ci id="S2.E1.m1.8.8.1.1.1.1.1.1.2.2.2.2.cmml" xref="S2.E1.m1.8.8.1.1.1.1.1.1.2.2.2.2">𝒚</ci><cn id="S2.E1.m1.6.6.1.1.cmml" type="integer" xref="S2.E1.m1.6.6.1.1">2</cn></apply></interval></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E1.m1.8c">\mathcal{L}^{\rm N2N}=\mathbb{E}_{(\bm{y}^{(1)},\bm{y}^{(2)})\sim\mathcal{D}}[% L(f(\bm{y}^{(1)};\theta),\bm{y}^{(2)})],</annotation><annotation encoding="application/x-llamapun" id="S2.E1.m1.8d">caligraphic_L start_POSTSUPERSCRIPT N2N end_POSTSUPERSCRIPT = blackboard_E start_POSTSUBSCRIPT ( bold_italic_y start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT , bold_italic_y start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT ) ∼ caligraphic_D end_POSTSUBSCRIPT [ italic_L ( italic_f ( bold_italic_y start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT ; italic_θ ) , bold_italic_y start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT ) ] ,</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(1)</span></td> </tr></tbody> </table> <p class="ltx_p" 
id="S2.SS1.p1.19">where <math alttext="\mathbb{E}[\cdot]" class="ltx_Math" display="inline" id="S2.SS1.p1.8.m1.1"><semantics id="S2.SS1.p1.8.m1.1a"><mrow id="S2.SS1.p1.8.m1.1.2" xref="S2.SS1.p1.8.m1.1.2.cmml"><mi id="S2.SS1.p1.8.m1.1.2.2" xref="S2.SS1.p1.8.m1.1.2.2.cmml">𝔼</mi><mo id="S2.SS1.p1.8.m1.1.2.1" xref="S2.SS1.p1.8.m1.1.2.1.cmml">⁢</mo><mrow id="S2.SS1.p1.8.m1.1.2.3.2" xref="S2.SS1.p1.8.m1.1.2.3.1.cmml"><mo id="S2.SS1.p1.8.m1.1.2.3.2.1" stretchy="false" xref="S2.SS1.p1.8.m1.1.2.3.1.1.cmml">[</mo><mo id="S2.SS1.p1.8.m1.1.1" lspace="0em" rspace="0em" xref="S2.SS1.p1.8.m1.1.1.cmml">⋅</mo><mo id="S2.SS1.p1.8.m1.1.2.3.2.2" stretchy="false" xref="S2.SS1.p1.8.m1.1.2.3.1.1.cmml">]</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.8.m1.1b"><apply id="S2.SS1.p1.8.m1.1.2.cmml" xref="S2.SS1.p1.8.m1.1.2"><times id="S2.SS1.p1.8.m1.1.2.1.cmml" xref="S2.SS1.p1.8.m1.1.2.1"></times><ci id="S2.SS1.p1.8.m1.1.2.2.cmml" xref="S2.SS1.p1.8.m1.1.2.2">𝔼</ci><apply id="S2.SS1.p1.8.m1.1.2.3.1.cmml" xref="S2.SS1.p1.8.m1.1.2.3.2"><csymbol cd="latexml" id="S2.SS1.p1.8.m1.1.2.3.1.1.cmml" xref="S2.SS1.p1.8.m1.1.2.3.2.1">delimited-[]</csymbol><ci id="S2.SS1.p1.8.m1.1.1.cmml" xref="S2.SS1.p1.8.m1.1.1">⋅</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.8.m1.1c">\mathbb{E}[\cdot]</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.8.m1.1d">blackboard_E [ ⋅ ]</annotation></semantics></math> is the expectation operator, <math alttext="\mathcal{D}" class="ltx_Math" display="inline" id="S2.SS1.p1.9.m2.1"><semantics id="S2.SS1.p1.9.m2.1a"><mi class="ltx_font_mathcaligraphic" id="S2.SS1.p1.9.m2.1.1" xref="S2.SS1.p1.9.m2.1.1.cmml">𝒟</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.9.m2.1b"><ci id="S2.SS1.p1.9.m2.1.1.cmml" xref="S2.SS1.p1.9.m2.1.1">𝒟</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.9.m2.1c">\mathcal{D}</annotation><annotation encoding="application/x-llamapun" 
id="S2.SS1.p1.9.m2.1d">caligraphic_D</annotation></semantics></math> is a training dataset, <math alttext="L(\cdot)" class="ltx_Math" display="inline" id="S2.SS1.p1.10.m3.1"><semantics id="S2.SS1.p1.10.m3.1a"><mrow id="S2.SS1.p1.10.m3.1.2" xref="S2.SS1.p1.10.m3.1.2.cmml"><mi id="S2.SS1.p1.10.m3.1.2.2" xref="S2.SS1.p1.10.m3.1.2.2.cmml">L</mi><mo id="S2.SS1.p1.10.m3.1.2.1" xref="S2.SS1.p1.10.m3.1.2.1.cmml">⁢</mo><mrow id="S2.SS1.p1.10.m3.1.2.3.2" xref="S2.SS1.p1.10.m3.1.2.cmml"><mo id="S2.SS1.p1.10.m3.1.2.3.2.1" stretchy="false" xref="S2.SS1.p1.10.m3.1.2.cmml">(</mo><mo id="S2.SS1.p1.10.m3.1.1" lspace="0em" rspace="0em" xref="S2.SS1.p1.10.m3.1.1.cmml">⋅</mo><mo id="S2.SS1.p1.10.m3.1.2.3.2.2" stretchy="false" xref="S2.SS1.p1.10.m3.1.2.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.10.m3.1b"><apply id="S2.SS1.p1.10.m3.1.2.cmml" xref="S2.SS1.p1.10.m3.1.2"><times id="S2.SS1.p1.10.m3.1.2.1.cmml" xref="S2.SS1.p1.10.m3.1.2.1"></times><ci id="S2.SS1.p1.10.m3.1.2.2.cmml" xref="S2.SS1.p1.10.m3.1.2.2">𝐿</ci><ci id="S2.SS1.p1.10.m3.1.1.cmml" xref="S2.SS1.p1.10.m3.1.1">⋅</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.10.m3.1c">L(\cdot)</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.10.m3.1d">italic_L ( ⋅ )</annotation></semantics></math> is a loss function, and <math alttext="\theta" class="ltx_Math" display="inline" id="S2.SS1.p1.11.m4.1"><semantics id="S2.SS1.p1.11.m4.1a"><mi id="S2.SS1.p1.11.m4.1.1" xref="S2.SS1.p1.11.m4.1.1.cmml">θ</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.11.m4.1b"><ci id="S2.SS1.p1.11.m4.1.1.cmml" xref="S2.SS1.p1.11.m4.1.1">𝜃</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.11.m4.1c">\theta</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.11.m4.1d">italic_θ</annotation></semantics></math> is the set of parameters of the DNN <math alttext="f(\cdot)" class="ltx_Math" display="inline" 
id="S2.SS1.p1.12.m5.1"><semantics id="S2.SS1.p1.12.m5.1a"><mrow id="S2.SS1.p1.12.m5.1.2" xref="S2.SS1.p1.12.m5.1.2.cmml"><mi id="S2.SS1.p1.12.m5.1.2.2" xref="S2.SS1.p1.12.m5.1.2.2.cmml">f</mi><mo id="S2.SS1.p1.12.m5.1.2.1" xref="S2.SS1.p1.12.m5.1.2.1.cmml">⁢</mo><mrow id="S2.SS1.p1.12.m5.1.2.3.2" xref="S2.SS1.p1.12.m5.1.2.cmml"><mo id="S2.SS1.p1.12.m5.1.2.3.2.1" stretchy="false" xref="S2.SS1.p1.12.m5.1.2.cmml">(</mo><mo id="S2.SS1.p1.12.m5.1.1" lspace="0em" rspace="0em" xref="S2.SS1.p1.12.m5.1.1.cmml">⋅</mo><mo id="S2.SS1.p1.12.m5.1.2.3.2.2" stretchy="false" xref="S2.SS1.p1.12.m5.1.2.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.12.m5.1b"><apply id="S2.SS1.p1.12.m5.1.2.cmml" xref="S2.SS1.p1.12.m5.1.2"><times id="S2.SS1.p1.12.m5.1.2.1.cmml" xref="S2.SS1.p1.12.m5.1.2.1"></times><ci id="S2.SS1.p1.12.m5.1.2.2.cmml" xref="S2.SS1.p1.12.m5.1.2.2">𝑓</ci><ci id="S2.SS1.p1.12.m5.1.1.cmml" xref="S2.SS1.p1.12.m5.1.1">⋅</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.12.m5.1c">f(\cdot)</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.12.m5.1d">italic_f ( ⋅ )</annotation></semantics></math>. 
Here, Noise2Noise trains the DNN to acquire one-to-one mappings between the two noisy signals <math alttext="\bm{y}^{(1)}" class="ltx_Math" display="inline" id="S2.SS1.p1.13.m6.1"><semantics id="S2.SS1.p1.13.m6.1a"><msup id="S2.SS1.p1.13.m6.1.2" xref="S2.SS1.p1.13.m6.1.2.cmml"><mi id="S2.SS1.p1.13.m6.1.2.2" xref="S2.SS1.p1.13.m6.1.2.2.cmml">𝒚</mi><mrow id="S2.SS1.p1.13.m6.1.1.1.3" xref="S2.SS1.p1.13.m6.1.2.cmml"><mo id="S2.SS1.p1.13.m6.1.1.1.3.1" stretchy="false" xref="S2.SS1.p1.13.m6.1.2.cmml">(</mo><mn id="S2.SS1.p1.13.m6.1.1.1.1" xref="S2.SS1.p1.13.m6.1.1.1.1.cmml">1</mn><mo id="S2.SS1.p1.13.m6.1.1.1.3.2" stretchy="false" xref="S2.SS1.p1.13.m6.1.2.cmml">)</mo></mrow></msup><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.13.m6.1b"><apply id="S2.SS1.p1.13.m6.1.2.cmml" xref="S2.SS1.p1.13.m6.1.2"><csymbol cd="ambiguous" id="S2.SS1.p1.13.m6.1.2.1.cmml" xref="S2.SS1.p1.13.m6.1.2">superscript</csymbol><ci id="S2.SS1.p1.13.m6.1.2.2.cmml" xref="S2.SS1.p1.13.m6.1.2.2">𝒚</ci><cn id="S2.SS1.p1.13.m6.1.1.1.1.cmml" type="integer" xref="S2.SS1.p1.13.m6.1.1.1.1">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.13.m6.1c">\bm{y}^{(1)}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.13.m6.1d">bold_italic_y start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{y}^{(2)}" class="ltx_Math" display="inline" id="S2.SS1.p1.14.m7.1"><semantics id="S2.SS1.p1.14.m7.1a"><msup id="S2.SS1.p1.14.m7.1.2" xref="S2.SS1.p1.14.m7.1.2.cmml"><mi id="S2.SS1.p1.14.m7.1.2.2" xref="S2.SS1.p1.14.m7.1.2.2.cmml">𝒚</mi><mrow id="S2.SS1.p1.14.m7.1.1.1.3" xref="S2.SS1.p1.14.m7.1.2.cmml"><mo id="S2.SS1.p1.14.m7.1.1.1.3.1" stretchy="false" xref="S2.SS1.p1.14.m7.1.2.cmml">(</mo><mn id="S2.SS1.p1.14.m7.1.1.1.1" xref="S2.SS1.p1.14.m7.1.1.1.1.cmml">2</mn><mo id="S2.SS1.p1.14.m7.1.1.1.3.2" stretchy="false" xref="S2.SS1.p1.14.m7.1.2.cmml">)</mo></mrow></msup><annotation-xml encoding="MathML-Content" 
id="S2.SS1.p1.14.m7.1b"><apply id="S2.SS1.p1.14.m7.1.2.cmml" xref="S2.SS1.p1.14.m7.1.2"><csymbol cd="ambiguous" id="S2.SS1.p1.14.m7.1.2.1.cmml" xref="S2.SS1.p1.14.m7.1.2">superscript</csymbol><ci id="S2.SS1.p1.14.m7.1.2.2.cmml" xref="S2.SS1.p1.14.m7.1.2.2">𝒚</ci><cn id="S2.SS1.p1.14.m7.1.1.1.1.cmml" type="integer" xref="S2.SS1.p1.14.m7.1.1.1.1">2</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.14.m7.1c">\bm{y}^{(2)}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.14.m7.1d">bold_italic_y start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT</annotation></semantics></math>. However, multiple plausible outputs can exist for a given input, especially when there is no consistent relationship between the two noise signals <math alttext="\bm{n}^{(1)}" class="ltx_Math" display="inline" id="S2.SS1.p1.15.m8.1"><semantics id="S2.SS1.p1.15.m8.1a"><msup id="S2.SS1.p1.15.m8.1.2" xref="S2.SS1.p1.15.m8.1.2.cmml"><mi id="S2.SS1.p1.15.m8.1.2.2" xref="S2.SS1.p1.15.m8.1.2.2.cmml">𝒏</mi><mrow id="S2.SS1.p1.15.m8.1.1.1.3" xref="S2.SS1.p1.15.m8.1.2.cmml"><mo id="S2.SS1.p1.15.m8.1.1.1.3.1" stretchy="false" xref="S2.SS1.p1.15.m8.1.2.cmml">(</mo><mn id="S2.SS1.p1.15.m8.1.1.1.1" xref="S2.SS1.p1.15.m8.1.1.1.1.cmml">1</mn><mo id="S2.SS1.p1.15.m8.1.1.1.3.2" stretchy="false" xref="S2.SS1.p1.15.m8.1.2.cmml">)</mo></mrow></msup><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.15.m8.1b"><apply id="S2.SS1.p1.15.m8.1.2.cmml" xref="S2.SS1.p1.15.m8.1.2"><csymbol cd="ambiguous" id="S2.SS1.p1.15.m8.1.2.1.cmml" xref="S2.SS1.p1.15.m8.1.2">superscript</csymbol><ci id="S2.SS1.p1.15.m8.1.2.2.cmml" xref="S2.SS1.p1.15.m8.1.2.2">𝒏</ci><cn id="S2.SS1.p1.15.m8.1.1.1.1.cmml" type="integer" xref="S2.SS1.p1.15.m8.1.1.1.1">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.15.m8.1c">\bm{n}^{(1)}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.15.m8.1d">bold_italic_n start_POSTSUPERSCRIPT ( 1 ) 
end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{(2)}" class="ltx_Math" display="inline" id="S2.SS1.p1.16.m9.1"><semantics id="S2.SS1.p1.16.m9.1a"><msup id="S2.SS1.p1.16.m9.1.2" xref="S2.SS1.p1.16.m9.1.2.cmml"><mi id="S2.SS1.p1.16.m9.1.2.2" xref="S2.SS1.p1.16.m9.1.2.2.cmml">𝒏</mi><mrow id="S2.SS1.p1.16.m9.1.1.1.3" xref="S2.SS1.p1.16.m9.1.2.cmml"><mo id="S2.SS1.p1.16.m9.1.1.1.3.1" stretchy="false" xref="S2.SS1.p1.16.m9.1.2.cmml">(</mo><mn id="S2.SS1.p1.16.m9.1.1.1.1" xref="S2.SS1.p1.16.m9.1.1.1.1.cmml">2</mn><mo id="S2.SS1.p1.16.m9.1.1.1.3.2" stretchy="false" xref="S2.SS1.p1.16.m9.1.2.cmml">)</mo></mrow></msup><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.16.m9.1b"><apply id="S2.SS1.p1.16.m9.1.2.cmml" xref="S2.SS1.p1.16.m9.1.2"><csymbol cd="ambiguous" id="S2.SS1.p1.16.m9.1.2.1.cmml" xref="S2.SS1.p1.16.m9.1.2">superscript</csymbol><ci id="S2.SS1.p1.16.m9.1.2.2.cmml" xref="S2.SS1.p1.16.m9.1.2.2">𝒏</ci><cn id="S2.SS1.p1.16.m9.1.1.1.1.cmml" type="integer" xref="S2.SS1.p1.16.m9.1.1.1.1">2</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.16.m9.1c">\bm{n}^{(2)}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.16.m9.1d">bold_italic_n start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT</annotation></semantics></math>. In such a case, if the loss function is the MSE, the optimal output for a given input is the average of its plausible candidates. 
For example, consider the optimal output <math alttext="\hat{\bm{y}}^{(2)}" class="ltx_Math" display="inline" id="S2.SS1.p1.17.m10.1"><semantics id="S2.SS1.p1.17.m10.1a"><msup id="S2.SS1.p1.17.m10.1.2" xref="S2.SS1.p1.17.m10.1.2.cmml"><mover accent="true" id="S2.SS1.p1.17.m10.1.2.2" xref="S2.SS1.p1.17.m10.1.2.2.cmml"><mi id="S2.SS1.p1.17.m10.1.2.2.2" xref="S2.SS1.p1.17.m10.1.2.2.2.cmml">𝒚</mi><mo id="S2.SS1.p1.17.m10.1.2.2.1" xref="S2.SS1.p1.17.m10.1.2.2.1.cmml">^</mo></mover><mrow id="S2.SS1.p1.17.m10.1.1.1.3" xref="S2.SS1.p1.17.m10.1.2.cmml"><mo id="S2.SS1.p1.17.m10.1.1.1.3.1" stretchy="false" xref="S2.SS1.p1.17.m10.1.2.cmml">(</mo><mn id="S2.SS1.p1.17.m10.1.1.1.1" xref="S2.SS1.p1.17.m10.1.1.1.1.cmml">2</mn><mo id="S2.SS1.p1.17.m10.1.1.1.3.2" stretchy="false" xref="S2.SS1.p1.17.m10.1.2.cmml">)</mo></mrow></msup><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.17.m10.1b"><apply id="S2.SS1.p1.17.m10.1.2.cmml" xref="S2.SS1.p1.17.m10.1.2"><csymbol cd="ambiguous" id="S2.SS1.p1.17.m10.1.2.1.cmml" xref="S2.SS1.p1.17.m10.1.2">superscript</csymbol><apply id="S2.SS1.p1.17.m10.1.2.2.cmml" xref="S2.SS1.p1.17.m10.1.2.2"><ci id="S2.SS1.p1.17.m10.1.2.2.1.cmml" xref="S2.SS1.p1.17.m10.1.2.2.1">^</ci><ci id="S2.SS1.p1.17.m10.1.2.2.2.cmml" xref="S2.SS1.p1.17.m10.1.2.2.2">𝒚</ci></apply><cn id="S2.SS1.p1.17.m10.1.1.1.1.cmml" type="integer" xref="S2.SS1.p1.17.m10.1.1.1.1">2</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.17.m10.1c">\hat{\bm{y}}^{(2)}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.17.m10.1d">over^ start_ARG bold_italic_y end_ARG start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT</annotation></semantics></math> for a given input <math alttext="\bm{y}^{(1)}" class="ltx_Math" display="inline" id="S2.SS1.p1.18.m11.1"><semantics id="S2.SS1.p1.18.m11.1a"><msup id="S2.SS1.p1.18.m11.1.2" xref="S2.SS1.p1.18.m11.1.2.cmml"><mi id="S2.SS1.p1.18.m11.1.2.2" xref="S2.SS1.p1.18.m11.1.2.2.cmml">𝒚</mi><mrow 
id="S2.SS1.p1.18.m11.1.1.1.3" xref="S2.SS1.p1.18.m11.1.2.cmml"><mo id="S2.SS1.p1.18.m11.1.1.1.3.1" stretchy="false" xref="S2.SS1.p1.18.m11.1.2.cmml">(</mo><mn id="S2.SS1.p1.18.m11.1.1.1.1" xref="S2.SS1.p1.18.m11.1.1.1.1.cmml">1</mn><mo id="S2.SS1.p1.18.m11.1.1.1.3.2" stretchy="false" xref="S2.SS1.p1.18.m11.1.2.cmml">)</mo></mrow></msup><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.18.m11.1b"><apply id="S2.SS1.p1.18.m11.1.2.cmml" xref="S2.SS1.p1.18.m11.1.2"><csymbol cd="ambiguous" id="S2.SS1.p1.18.m11.1.2.1.cmml" xref="S2.SS1.p1.18.m11.1.2">superscript</csymbol><ci id="S2.SS1.p1.18.m11.1.2.2.cmml" xref="S2.SS1.p1.18.m11.1.2.2">𝒚</ci><cn id="S2.SS1.p1.18.m11.1.1.1.1.cmml" type="integer" xref="S2.SS1.p1.18.m11.1.1.1.1">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.18.m11.1c">\bm{y}^{(1)}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.18.m11.1d">bold_italic_y start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT</annotation></semantics></math> when MSE is used as the loss function. 
Here, the training objective is to minimize the following conditional expected loss <math alttext="\mathcal{L}^{\rm N2N}_{\bm{y}^{(2)}|\bm{y}^{(1)}}" class="ltx_Math" display="inline" id="S2.SS1.p1.19.m12.2"><semantics id="S2.SS1.p1.19.m12.2a"><msubsup id="S2.SS1.p1.19.m12.2.3" xref="S2.SS1.p1.19.m12.2.3.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.SS1.p1.19.m12.2.3.2.2" xref="S2.SS1.p1.19.m12.2.3.2.2.cmml">ℒ</mi><mrow id="S2.SS1.p1.19.m12.2.2.2" xref="S2.SS1.p1.19.m12.2.2.2.cmml"><msup id="S2.SS1.p1.19.m12.2.2.2.4" xref="S2.SS1.p1.19.m12.2.2.2.4.cmml"><mi id="S2.SS1.p1.19.m12.2.2.2.4.2" xref="S2.SS1.p1.19.m12.2.2.2.4.2.cmml">𝒚</mi><mrow id="S2.SS1.p1.19.m12.1.1.1.1.1.3" xref="S2.SS1.p1.19.m12.2.2.2.4.cmml"><mo id="S2.SS1.p1.19.m12.1.1.1.1.1.3.1" stretchy="false" xref="S2.SS1.p1.19.m12.2.2.2.4.cmml">(</mo><mn id="S2.SS1.p1.19.m12.1.1.1.1.1.1" xref="S2.SS1.p1.19.m12.1.1.1.1.1.1.cmml">2</mn><mo id="S2.SS1.p1.19.m12.1.1.1.1.1.3.2" stretchy="false" xref="S2.SS1.p1.19.m12.2.2.2.4.cmml">)</mo></mrow></msup><mo fence="false" id="S2.SS1.p1.19.m12.2.2.2.3" xref="S2.SS1.p1.19.m12.2.2.2.3.cmml">|</mo><msup id="S2.SS1.p1.19.m12.2.2.2.5" xref="S2.SS1.p1.19.m12.2.2.2.5.cmml"><mi id="S2.SS1.p1.19.m12.2.2.2.5.2" xref="S2.SS1.p1.19.m12.2.2.2.5.2.cmml">𝒚</mi><mrow id="S2.SS1.p1.19.m12.2.2.2.2.1.3" xref="S2.SS1.p1.19.m12.2.2.2.5.cmml"><mo id="S2.SS1.p1.19.m12.2.2.2.2.1.3.1" stretchy="false" xref="S2.SS1.p1.19.m12.2.2.2.5.cmml">(</mo><mn id="S2.SS1.p1.19.m12.2.2.2.2.1.1" xref="S2.SS1.p1.19.m12.2.2.2.2.1.1.cmml">1</mn><mo id="S2.SS1.p1.19.m12.2.2.2.2.1.3.2" stretchy="false" xref="S2.SS1.p1.19.m12.2.2.2.5.cmml">)</mo></mrow></msup></mrow><mi id="S2.SS1.p1.19.m12.2.3.2.3" xref="S2.SS1.p1.19.m12.2.3.2.3.cmml">N2N</mi></msubsup><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.19.m12.2b"><apply id="S2.SS1.p1.19.m12.2.3.cmml" xref="S2.SS1.p1.19.m12.2.3"><csymbol cd="ambiguous" id="S2.SS1.p1.19.m12.2.3.1.cmml" xref="S2.SS1.p1.19.m12.2.3">subscript</csymbol><apply id="S2.SS1.p1.19.m12.2.3.2.cmml" 
xref="S2.SS1.p1.19.m12.2.3"><csymbol cd="ambiguous" id="S2.SS1.p1.19.m12.2.3.2.1.cmml" xref="S2.SS1.p1.19.m12.2.3">superscript</csymbol><ci id="S2.SS1.p1.19.m12.2.3.2.2.cmml" xref="S2.SS1.p1.19.m12.2.3.2.2">ℒ</ci><ci id="S2.SS1.p1.19.m12.2.3.2.3.cmml" xref="S2.SS1.p1.19.m12.2.3.2.3">N2N</ci></apply><apply id="S2.SS1.p1.19.m12.2.2.2.cmml" xref="S2.SS1.p1.19.m12.2.2.2"><csymbol cd="latexml" id="S2.SS1.p1.19.m12.2.2.2.3.cmml" xref="S2.SS1.p1.19.m12.2.2.2.3">conditional</csymbol><apply id="S2.SS1.p1.19.m12.2.2.2.4.cmml" xref="S2.SS1.p1.19.m12.2.2.2.4"><csymbol cd="ambiguous" id="S2.SS1.p1.19.m12.2.2.2.4.1.cmml" xref="S2.SS1.p1.19.m12.2.2.2.4">superscript</csymbol><ci id="S2.SS1.p1.19.m12.2.2.2.4.2.cmml" xref="S2.SS1.p1.19.m12.2.2.2.4.2">𝒚</ci><cn id="S2.SS1.p1.19.m12.1.1.1.1.1.1.cmml" type="integer" xref="S2.SS1.p1.19.m12.1.1.1.1.1.1">2</cn></apply><apply id="S2.SS1.p1.19.m12.2.2.2.5.cmml" xref="S2.SS1.p1.19.m12.2.2.2.5"><csymbol cd="ambiguous" id="S2.SS1.p1.19.m12.2.2.2.5.1.cmml" xref="S2.SS1.p1.19.m12.2.2.2.5">superscript</csymbol><ci id="S2.SS1.p1.19.m12.2.2.2.5.2.cmml" xref="S2.SS1.p1.19.m12.2.2.2.5.2">𝒚</ci><cn id="S2.SS1.p1.19.m12.2.2.2.2.1.1.cmml" type="integer" xref="S2.SS1.p1.19.m12.2.2.2.2.1.1">1</cn></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.19.m12.2c">\mathcal{L}^{\rm N2N}_{\bm{y}^{(2)}|\bm{y}^{(1)}}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.19.m12.2d">caligraphic_L start_POSTSUPERSCRIPT N2N end_POSTSUPERSCRIPT start_POSTSUBSCRIPT bold_italic_y start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT | bold_italic_y start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT end_POSTSUBSCRIPT</annotation></semantics></math>:</p> <table class="ltx_equationgroup ltx_eqn_align ltx_eqn_table" id="Sx2.EGx1"> <tbody id="S2.E2"><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_align_right ltx_eqn_cell"><math 
alttext="\displaystyle\mathcal{L}^{\rm N2N}_{\bm{y}^{(2)}|\bm{y}^{(1)}}" class="ltx_Math" display="inline" id="S2.E2.m1.2"><semantics id="S2.E2.m1.2a"><msubsup id="S2.E2.m1.2.3" xref="S2.E2.m1.2.3.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.E2.m1.2.3.2.2" xref="S2.E2.m1.2.3.2.2.cmml">ℒ</mi><mrow id="S2.E2.m1.2.2.2" xref="S2.E2.m1.2.2.2.cmml"><msup id="S2.E2.m1.2.2.2.4" xref="S2.E2.m1.2.2.2.4.cmml"><mi id="S2.E2.m1.2.2.2.4.2" xref="S2.E2.m1.2.2.2.4.2.cmml">𝒚</mi><mrow id="S2.E2.m1.1.1.1.1.1.3" xref="S2.E2.m1.2.2.2.4.cmml"><mo id="S2.E2.m1.1.1.1.1.1.3.1" stretchy="false" xref="S2.E2.m1.2.2.2.4.cmml">(</mo><mn id="S2.E2.m1.1.1.1.1.1.1" xref="S2.E2.m1.1.1.1.1.1.1.cmml">2</mn><mo id="S2.E2.m1.1.1.1.1.1.3.2" stretchy="false" xref="S2.E2.m1.2.2.2.4.cmml">)</mo></mrow></msup><mo fence="false" id="S2.E2.m1.2.2.2.3" xref="S2.E2.m1.2.2.2.3.cmml">|</mo><msup id="S2.E2.m1.2.2.2.5" xref="S2.E2.m1.2.2.2.5.cmml"><mi id="S2.E2.m1.2.2.2.5.2" xref="S2.E2.m1.2.2.2.5.2.cmml">𝒚</mi><mrow id="S2.E2.m1.2.2.2.2.1.3" xref="S2.E2.m1.2.2.2.5.cmml"><mo id="S2.E2.m1.2.2.2.2.1.3.1" stretchy="false" xref="S2.E2.m1.2.2.2.5.cmml">(</mo><mn id="S2.E2.m1.2.2.2.2.1.1" xref="S2.E2.m1.2.2.2.2.1.1.cmml">1</mn><mo id="S2.E2.m1.2.2.2.2.1.3.2" stretchy="false" xref="S2.E2.m1.2.2.2.5.cmml">)</mo></mrow></msup></mrow><mi id="S2.E2.m1.2.3.2.3" xref="S2.E2.m1.2.3.2.3.cmml">N2N</mi></msubsup><annotation-xml encoding="MathML-Content" id="S2.E2.m1.2b"><apply id="S2.E2.m1.2.3.cmml" xref="S2.E2.m1.2.3"><csymbol cd="ambiguous" id="S2.E2.m1.2.3.1.cmml" xref="S2.E2.m1.2.3">subscript</csymbol><apply id="S2.E2.m1.2.3.2.cmml" xref="S2.E2.m1.2.3"><csymbol cd="ambiguous" id="S2.E2.m1.2.3.2.1.cmml" xref="S2.E2.m1.2.3">superscript</csymbol><ci id="S2.E2.m1.2.3.2.2.cmml" xref="S2.E2.m1.2.3.2.2">ℒ</ci><ci id="S2.E2.m1.2.3.2.3.cmml" xref="S2.E2.m1.2.3.2.3">N2N</ci></apply><apply id="S2.E2.m1.2.2.2.cmml" xref="S2.E2.m1.2.2.2"><csymbol cd="latexml" id="S2.E2.m1.2.2.2.3.cmml" 
xref="S2.E2.m1.2.2.2.3">conditional</csymbol><apply id="S2.E2.m1.2.2.2.4.cmml" xref="S2.E2.m1.2.2.2.4"><csymbol cd="ambiguous" id="S2.E2.m1.2.2.2.4.1.cmml" xref="S2.E2.m1.2.2.2.4">superscript</csymbol><ci id="S2.E2.m1.2.2.2.4.2.cmml" xref="S2.E2.m1.2.2.2.4.2">𝒚</ci><cn id="S2.E2.m1.1.1.1.1.1.1.cmml" type="integer" xref="S2.E2.m1.1.1.1.1.1.1">2</cn></apply><apply id="S2.E2.m1.2.2.2.5.cmml" xref="S2.E2.m1.2.2.2.5"><csymbol cd="ambiguous" id="S2.E2.m1.2.2.2.5.1.cmml" xref="S2.E2.m1.2.2.2.5">superscript</csymbol><ci id="S2.E2.m1.2.2.2.5.2.cmml" xref="S2.E2.m1.2.2.2.5.2">𝒚</ci><cn id="S2.E2.m1.2.2.2.2.1.1.cmml" type="integer" xref="S2.E2.m1.2.2.2.2.1.1">1</cn></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E2.m1.2c">\displaystyle\mathcal{L}^{\rm N2N}_{\bm{y}^{(2)}|\bm{y}^{(1)}}</annotation><annotation encoding="application/x-llamapun" id="S2.E2.m1.2d">caligraphic_L start_POSTSUPERSCRIPT N2N end_POSTSUPERSCRIPT start_POSTSUBSCRIPT bold_italic_y start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT | bold_italic_y start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_left ltx_eqn_cell"><math alttext="\displaystyle=\mathbb{E}_{\bm{y}^{(2)}|\bm{y}^{(1)}}[L(\hat{\bm{y}}^{(2)},\bm{% y}^{(2)})]." 
class="ltx_Math" display="inline" id="S2.E2.m2.5"><semantics id="S2.E2.m2.5a"><mrow id="S2.E2.m2.5.5.1" xref="S2.E2.m2.5.5.1.1.cmml"><mrow id="S2.E2.m2.5.5.1.1" xref="S2.E2.m2.5.5.1.1.cmml"><mi id="S2.E2.m2.5.5.1.1.3" xref="S2.E2.m2.5.5.1.1.3.cmml"></mi><mo id="S2.E2.m2.5.5.1.1.2" xref="S2.E2.m2.5.5.1.1.2.cmml">=</mo><mrow id="S2.E2.m2.5.5.1.1.1" xref="S2.E2.m2.5.5.1.1.1.cmml"><msub id="S2.E2.m2.5.5.1.1.1.3" xref="S2.E2.m2.5.5.1.1.1.3.cmml"><mi id="S2.E2.m2.5.5.1.1.1.3.2" xref="S2.E2.m2.5.5.1.1.1.3.2.cmml">𝔼</mi><mrow id="S2.E2.m2.2.2.2" xref="S2.E2.m2.2.2.2.cmml"><msup id="S2.E2.m2.2.2.2.4" xref="S2.E2.m2.2.2.2.4.cmml"><mi id="S2.E2.m2.2.2.2.4.2" xref="S2.E2.m2.2.2.2.4.2.cmml">𝒚</mi><mrow id="S2.E2.m2.1.1.1.1.1.3" xref="S2.E2.m2.2.2.2.4.cmml"><mo id="S2.E2.m2.1.1.1.1.1.3.1" stretchy="false" xref="S2.E2.m2.2.2.2.4.cmml">(</mo><mn id="S2.E2.m2.1.1.1.1.1.1" xref="S2.E2.m2.1.1.1.1.1.1.cmml">2</mn><mo id="S2.E2.m2.1.1.1.1.1.3.2" stretchy="false" xref="S2.E2.m2.2.2.2.4.cmml">)</mo></mrow></msup><mo fence="false" id="S2.E2.m2.2.2.2.3" xref="S2.E2.m2.2.2.2.3.cmml">|</mo><msup id="S2.E2.m2.2.2.2.5" xref="S2.E2.m2.2.2.2.5.cmml"><mi id="S2.E2.m2.2.2.2.5.2" xref="S2.E2.m2.2.2.2.5.2.cmml">𝒚</mi><mrow id="S2.E2.m2.2.2.2.2.1.3" xref="S2.E2.m2.2.2.2.5.cmml"><mo id="S2.E2.m2.2.2.2.2.1.3.1" stretchy="false" xref="S2.E2.m2.2.2.2.5.cmml">(</mo><mn id="S2.E2.m2.2.2.2.2.1.1" xref="S2.E2.m2.2.2.2.2.1.1.cmml">1</mn><mo id="S2.E2.m2.2.2.2.2.1.3.2" stretchy="false" xref="S2.E2.m2.2.2.2.5.cmml">)</mo></mrow></msup></mrow></msub><mo id="S2.E2.m2.5.5.1.1.1.2" xref="S2.E2.m2.5.5.1.1.1.2.cmml">⁢</mo><mrow id="S2.E2.m2.5.5.1.1.1.1.1" xref="S2.E2.m2.5.5.1.1.1.1.2.cmml"><mo id="S2.E2.m2.5.5.1.1.1.1.1.2" stretchy="false" xref="S2.E2.m2.5.5.1.1.1.1.2.1.cmml">[</mo><mrow id="S2.E2.m2.5.5.1.1.1.1.1.1" xref="S2.E2.m2.5.5.1.1.1.1.1.1.cmml"><mi id="S2.E2.m2.5.5.1.1.1.1.1.1.4" xref="S2.E2.m2.5.5.1.1.1.1.1.1.4.cmml">L</mi><mo id="S2.E2.m2.5.5.1.1.1.1.1.1.3" 
xref="S2.E2.m2.5.5.1.1.1.1.1.1.3.cmml">⁢</mo><mrow id="S2.E2.m2.5.5.1.1.1.1.1.1.2.2" xref="S2.E2.m2.5.5.1.1.1.1.1.1.2.3.cmml"><mo id="S2.E2.m2.5.5.1.1.1.1.1.1.2.2.3" stretchy="false" xref="S2.E2.m2.5.5.1.1.1.1.1.1.2.3.cmml">(</mo><msup id="S2.E2.m2.5.5.1.1.1.1.1.1.1.1.1" xref="S2.E2.m2.5.5.1.1.1.1.1.1.1.1.1.cmml"><mover accent="true" id="S2.E2.m2.5.5.1.1.1.1.1.1.1.1.1.2" xref="S2.E2.m2.5.5.1.1.1.1.1.1.1.1.1.2.cmml"><mi id="S2.E2.m2.5.5.1.1.1.1.1.1.1.1.1.2.2" xref="S2.E2.m2.5.5.1.1.1.1.1.1.1.1.1.2.2.cmml">𝒚</mi><mo id="S2.E2.m2.5.5.1.1.1.1.1.1.1.1.1.2.1" xref="S2.E2.m2.5.5.1.1.1.1.1.1.1.1.1.2.1.cmml">^</mo></mover><mrow id="S2.E2.m2.3.3.1.3" xref="S2.E2.m2.5.5.1.1.1.1.1.1.1.1.1.cmml"><mo id="S2.E2.m2.3.3.1.3.1" stretchy="false" xref="S2.E2.m2.5.5.1.1.1.1.1.1.1.1.1.cmml">(</mo><mn id="S2.E2.m2.3.3.1.1" xref="S2.E2.m2.3.3.1.1.cmml">2</mn><mo id="S2.E2.m2.3.3.1.3.2" stretchy="false" xref="S2.E2.m2.5.5.1.1.1.1.1.1.1.1.1.cmml">)</mo></mrow></msup><mo id="S2.E2.m2.5.5.1.1.1.1.1.1.2.2.4" xref="S2.E2.m2.5.5.1.1.1.1.1.1.2.3.cmml">,</mo><msup id="S2.E2.m2.5.5.1.1.1.1.1.1.2.2.2" xref="S2.E2.m2.5.5.1.1.1.1.1.1.2.2.2.cmml"><mi id="S2.E2.m2.5.5.1.1.1.1.1.1.2.2.2.2" xref="S2.E2.m2.5.5.1.1.1.1.1.1.2.2.2.2.cmml">𝒚</mi><mrow id="S2.E2.m2.4.4.1.3" xref="S2.E2.m2.5.5.1.1.1.1.1.1.2.2.2.cmml"><mo id="S2.E2.m2.4.4.1.3.1" stretchy="false" xref="S2.E2.m2.5.5.1.1.1.1.1.1.2.2.2.cmml">(</mo><mn id="S2.E2.m2.4.4.1.1" xref="S2.E2.m2.4.4.1.1.cmml">2</mn><mo id="S2.E2.m2.4.4.1.3.2" stretchy="false" xref="S2.E2.m2.5.5.1.1.1.1.1.1.2.2.2.cmml">)</mo></mrow></msup><mo id="S2.E2.m2.5.5.1.1.1.1.1.1.2.2.5" stretchy="false" xref="S2.E2.m2.5.5.1.1.1.1.1.1.2.3.cmml">)</mo></mrow></mrow><mo id="S2.E2.m2.5.5.1.1.1.1.1.3" stretchy="false" xref="S2.E2.m2.5.5.1.1.1.1.2.1.cmml">]</mo></mrow></mrow></mrow><mo id="S2.E2.m2.5.5.1.2" lspace="0em" xref="S2.E2.m2.5.5.1.1.cmml">.</mo></mrow><annotation-xml encoding="MathML-Content" id="S2.E2.m2.5b"><apply id="S2.E2.m2.5.5.1.1.cmml" xref="S2.E2.m2.5.5.1"><eq 
id="S2.E2.m2.5.5.1.1.2.cmml" xref="S2.E2.m2.5.5.1.1.2"></eq><csymbol cd="latexml" id="S2.E2.m2.5.5.1.1.3.cmml" xref="S2.E2.m2.5.5.1.1.3">absent</csymbol><apply id="S2.E2.m2.5.5.1.1.1.cmml" xref="S2.E2.m2.5.5.1.1.1"><times id="S2.E2.m2.5.5.1.1.1.2.cmml" xref="S2.E2.m2.5.5.1.1.1.2"></times><apply id="S2.E2.m2.5.5.1.1.1.3.cmml" xref="S2.E2.m2.5.5.1.1.1.3"><csymbol cd="ambiguous" id="S2.E2.m2.5.5.1.1.1.3.1.cmml" xref="S2.E2.m2.5.5.1.1.1.3">subscript</csymbol><ci id="S2.E2.m2.5.5.1.1.1.3.2.cmml" xref="S2.E2.m2.5.5.1.1.1.3.2">𝔼</ci><apply id="S2.E2.m2.2.2.2.cmml" xref="S2.E2.m2.2.2.2"><csymbol cd="latexml" id="S2.E2.m2.2.2.2.3.cmml" xref="S2.E2.m2.2.2.2.3">conditional</csymbol><apply id="S2.E2.m2.2.2.2.4.cmml" xref="S2.E2.m2.2.2.2.4"><csymbol cd="ambiguous" id="S2.E2.m2.2.2.2.4.1.cmml" xref="S2.E2.m2.2.2.2.4">superscript</csymbol><ci id="S2.E2.m2.2.2.2.4.2.cmml" xref="S2.E2.m2.2.2.2.4.2">𝒚</ci><cn id="S2.E2.m2.1.1.1.1.1.1.cmml" type="integer" xref="S2.E2.m2.1.1.1.1.1.1">2</cn></apply><apply id="S2.E2.m2.2.2.2.5.cmml" xref="S2.E2.m2.2.2.2.5"><csymbol cd="ambiguous" id="S2.E2.m2.2.2.2.5.1.cmml" xref="S2.E2.m2.2.2.2.5">superscript</csymbol><ci id="S2.E2.m2.2.2.2.5.2.cmml" xref="S2.E2.m2.2.2.2.5.2">𝒚</ci><cn id="S2.E2.m2.2.2.2.2.1.1.cmml" type="integer" xref="S2.E2.m2.2.2.2.2.1.1">1</cn></apply></apply></apply><apply id="S2.E2.m2.5.5.1.1.1.1.2.cmml" xref="S2.E2.m2.5.5.1.1.1.1.1"><csymbol cd="latexml" id="S2.E2.m2.5.5.1.1.1.1.2.1.cmml" xref="S2.E2.m2.5.5.1.1.1.1.1.2">delimited-[]</csymbol><apply id="S2.E2.m2.5.5.1.1.1.1.1.1.cmml" xref="S2.E2.m2.5.5.1.1.1.1.1.1"><times id="S2.E2.m2.5.5.1.1.1.1.1.1.3.cmml" xref="S2.E2.m2.5.5.1.1.1.1.1.1.3"></times><ci id="S2.E2.m2.5.5.1.1.1.1.1.1.4.cmml" xref="S2.E2.m2.5.5.1.1.1.1.1.1.4">𝐿</ci><interval closure="open" id="S2.E2.m2.5.5.1.1.1.1.1.1.2.3.cmml" xref="S2.E2.m2.5.5.1.1.1.1.1.1.2.2"><apply id="S2.E2.m2.5.5.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E2.m2.5.5.1.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.E2.m2.5.5.1.1.1.1.1.1.1.1.1.1.cmml" 
xref="S2.E2.m2.5.5.1.1.1.1.1.1.1.1.1">superscript</csymbol><apply id="S2.E2.m2.5.5.1.1.1.1.1.1.1.1.1.2.cmml" xref="S2.E2.m2.5.5.1.1.1.1.1.1.1.1.1.2"><ci id="S2.E2.m2.5.5.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S2.E2.m2.5.5.1.1.1.1.1.1.1.1.1.2.1">^</ci><ci id="S2.E2.m2.5.5.1.1.1.1.1.1.1.1.1.2.2.cmml" xref="S2.E2.m2.5.5.1.1.1.1.1.1.1.1.1.2.2">𝒚</ci></apply><cn id="S2.E2.m2.3.3.1.1.cmml" type="integer" xref="S2.E2.m2.3.3.1.1">2</cn></apply><apply id="S2.E2.m2.5.5.1.1.1.1.1.1.2.2.2.cmml" xref="S2.E2.m2.5.5.1.1.1.1.1.1.2.2.2"><csymbol cd="ambiguous" id="S2.E2.m2.5.5.1.1.1.1.1.1.2.2.2.1.cmml" xref="S2.E2.m2.5.5.1.1.1.1.1.1.2.2.2">superscript</csymbol><ci id="S2.E2.m2.5.5.1.1.1.1.1.1.2.2.2.2.cmml" xref="S2.E2.m2.5.5.1.1.1.1.1.1.2.2.2.2">𝒚</ci><cn id="S2.E2.m2.4.4.1.1.cmml" type="integer" xref="S2.E2.m2.4.4.1.1">2</cn></apply></interval></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E2.m2.5c">\displaystyle=\mathbb{E}_{\bm{y}^{(2)}|\bm{y}^{(1)}}[L(\hat{\bm{y}}^{(2)},\bm{% y}^{(2)})].</annotation><annotation encoding="application/x-llamapun" id="S2.E2.m2.5d">= blackboard_E start_POSTSUBSCRIPT bold_italic_y start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT | bold_italic_y start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT end_POSTSUBSCRIPT [ italic_L ( over^ start_ARG bold_italic_y end_ARG start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT , bold_italic_y start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT ) ] .</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(2)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S2.SS1.p1.20">Therefore, <math alttext="\hat{\bm{y}}^{(2)}" class="ltx_Math" display="inline" id="S2.SS1.p1.20.m1.1"><semantics id="S2.SS1.p1.20.m1.1a"><msup id="S2.SS1.p1.20.m1.1.2" xref="S2.SS1.p1.20.m1.1.2.cmml"><mover accent="true" 
id="S2.SS1.p1.20.m1.1.2.2" xref="S2.SS1.p1.20.m1.1.2.2.cmml"><mi id="S2.SS1.p1.20.m1.1.2.2.2" xref="S2.SS1.p1.20.m1.1.2.2.2.cmml">𝒚</mi><mo id="S2.SS1.p1.20.m1.1.2.2.1" xref="S2.SS1.p1.20.m1.1.2.2.1.cmml">^</mo></mover><mrow id="S2.SS1.p1.20.m1.1.1.1.3" xref="S2.SS1.p1.20.m1.1.2.cmml"><mo id="S2.SS1.p1.20.m1.1.1.1.3.1" stretchy="false" xref="S2.SS1.p1.20.m1.1.2.cmml">(</mo><mn id="S2.SS1.p1.20.m1.1.1.1.1" xref="S2.SS1.p1.20.m1.1.1.1.1.cmml">2</mn><mo id="S2.SS1.p1.20.m1.1.1.1.3.2" stretchy="false" xref="S2.SS1.p1.20.m1.1.2.cmml">)</mo></mrow></msup><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.20.m1.1b"><apply id="S2.SS1.p1.20.m1.1.2.cmml" xref="S2.SS1.p1.20.m1.1.2"><csymbol cd="ambiguous" id="S2.SS1.p1.20.m1.1.2.1.cmml" xref="S2.SS1.p1.20.m1.1.2">superscript</csymbol><apply id="S2.SS1.p1.20.m1.1.2.2.cmml" xref="S2.SS1.p1.20.m1.1.2.2"><ci id="S2.SS1.p1.20.m1.1.2.2.1.cmml" xref="S2.SS1.p1.20.m1.1.2.2.1">^</ci><ci id="S2.SS1.p1.20.m1.1.2.2.2.cmml" xref="S2.SS1.p1.20.m1.1.2.2.2">𝒚</ci></apply><cn id="S2.SS1.p1.20.m1.1.1.1.1.cmml" type="integer" xref="S2.SS1.p1.20.m1.1.1.1.1">2</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.20.m1.1c">\hat{\bm{y}}^{(2)}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.20.m1.1d">over^ start_ARG bold_italic_y end_ARG start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT</annotation></semantics></math> is obtained as</p> <table class="ltx_equationgroup ltx_eqn_align ltx_eqn_table" id="Sx2.EGx2"> <tbody id="S2.E3"><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_align_right ltx_eqn_cell"><math alttext="\displaystyle\frac{\partial}{\partial\hat{\bm{y}}^{(2)}}\mathcal{L}^{\rm N2N}_% {\bm{y}^{(2)}|\bm{y}^{(1)}}" class="ltx_Math" display="inline" id="S2.E3.m1.3"><semantics id="S2.E3.m1.3a"><mrow id="S2.E3.m1.3.4" xref="S2.E3.m1.3.4.cmml"><mstyle displaystyle="true" id="S2.E3.m1.1.1" 
xref="S2.E3.m1.1.1.cmml"><mfrac id="S2.E3.m1.1.1a" xref="S2.E3.m1.1.1.cmml"><mo id="S2.E3.m1.1.1.3" xref="S2.E3.m1.1.1.3.cmml">∂</mo><mrow id="S2.E3.m1.1.1.1" xref="S2.E3.m1.1.1.1.cmml"><mo id="S2.E3.m1.1.1.1.2" rspace="0em" xref="S2.E3.m1.1.1.1.2.cmml">∂</mo><msup id="S2.E3.m1.1.1.1.3" xref="S2.E3.m1.1.1.1.3.cmml"><mover accent="true" id="S2.E3.m1.1.1.1.3.2" xref="S2.E3.m1.1.1.1.3.2.cmml"><mi id="S2.E3.m1.1.1.1.3.2.2" xref="S2.E3.m1.1.1.1.3.2.2.cmml">𝒚</mi><mo id="S2.E3.m1.1.1.1.3.2.1" xref="S2.E3.m1.1.1.1.3.2.1.cmml">^</mo></mover><mrow id="S2.E3.m1.1.1.1.1.1.3" xref="S2.E3.m1.1.1.1.3.cmml"><mo id="S2.E3.m1.1.1.1.1.1.3.1" stretchy="false" xref="S2.E3.m1.1.1.1.3.cmml">(</mo><mn id="S2.E3.m1.1.1.1.1.1.1" xref="S2.E3.m1.1.1.1.1.1.1.cmml">2</mn><mo id="S2.E3.m1.1.1.1.1.1.3.2" stretchy="false" xref="S2.E3.m1.1.1.1.3.cmml">)</mo></mrow></msup></mrow></mfrac></mstyle><mo id="S2.E3.m1.3.4.1" xref="S2.E3.m1.3.4.1.cmml">⁢</mo><msubsup id="S2.E3.m1.3.4.2" xref="S2.E3.m1.3.4.2.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.E3.m1.3.4.2.2.2" xref="S2.E3.m1.3.4.2.2.2.cmml">ℒ</mi><mrow id="S2.E3.m1.3.3.2" xref="S2.E3.m1.3.3.2.cmml"><msup id="S2.E3.m1.3.3.2.4" xref="S2.E3.m1.3.3.2.4.cmml"><mi id="S2.E3.m1.3.3.2.4.2" xref="S2.E3.m1.3.3.2.4.2.cmml">𝒚</mi><mrow id="S2.E3.m1.2.2.1.1.1.3" xref="S2.E3.m1.3.3.2.4.cmml"><mo id="S2.E3.m1.2.2.1.1.1.3.1" stretchy="false" xref="S2.E3.m1.3.3.2.4.cmml">(</mo><mn id="S2.E3.m1.2.2.1.1.1.1" xref="S2.E3.m1.2.2.1.1.1.1.cmml">2</mn><mo id="S2.E3.m1.2.2.1.1.1.3.2" stretchy="false" xref="S2.E3.m1.3.3.2.4.cmml">)</mo></mrow></msup><mo fence="false" id="S2.E3.m1.3.3.2.3" xref="S2.E3.m1.3.3.2.3.cmml">|</mo><msup id="S2.E3.m1.3.3.2.5" xref="S2.E3.m1.3.3.2.5.cmml"><mi id="S2.E3.m1.3.3.2.5.2" xref="S2.E3.m1.3.3.2.5.2.cmml">𝒚</mi><mrow id="S2.E3.m1.3.3.2.2.1.3" xref="S2.E3.m1.3.3.2.5.cmml"><mo id="S2.E3.m1.3.3.2.2.1.3.1" stretchy="false" xref="S2.E3.m1.3.3.2.5.cmml">(</mo><mn id="S2.E3.m1.3.3.2.2.1.1" xref="S2.E3.m1.3.3.2.2.1.1.cmml">1</mn><mo 
id="S2.E3.m1.3.3.2.2.1.3.2" stretchy="false" xref="S2.E3.m1.3.3.2.5.cmml">)</mo></mrow></msup></mrow><mi id="S2.E3.m1.3.4.2.2.3" xref="S2.E3.m1.3.4.2.2.3.cmml">N2N</mi></msubsup></mrow><annotation-xml encoding="MathML-Content" id="S2.E3.m1.3b"><apply id="S2.E3.m1.3.4.cmml" xref="S2.E3.m1.3.4"><times id="S2.E3.m1.3.4.1.cmml" xref="S2.E3.m1.3.4.1"></times><apply id="S2.E3.m1.1.1.cmml" xref="S2.E3.m1.1.1"><divide id="S2.E3.m1.1.1.2.cmml" xref="S2.E3.m1.1.1"></divide><partialdiff id="S2.E3.m1.1.1.3.cmml" xref="S2.E3.m1.1.1.3"></partialdiff><apply id="S2.E3.m1.1.1.1.cmml" xref="S2.E3.m1.1.1.1"><partialdiff id="S2.E3.m1.1.1.1.2.cmml" xref="S2.E3.m1.1.1.1.2"></partialdiff><apply id="S2.E3.m1.1.1.1.3.cmml" xref="S2.E3.m1.1.1.1.3"><csymbol cd="ambiguous" id="S2.E3.m1.1.1.1.3.1.cmml" xref="S2.E3.m1.1.1.1.3">superscript</csymbol><apply id="S2.E3.m1.1.1.1.3.2.cmml" xref="S2.E3.m1.1.1.1.3.2"><ci id="S2.E3.m1.1.1.1.3.2.1.cmml" xref="S2.E3.m1.1.1.1.3.2.1">^</ci><ci id="S2.E3.m1.1.1.1.3.2.2.cmml" xref="S2.E3.m1.1.1.1.3.2.2">𝒚</ci></apply><cn id="S2.E3.m1.1.1.1.1.1.1.cmml" type="integer" xref="S2.E3.m1.1.1.1.1.1.1">2</cn></apply></apply></apply><apply id="S2.E3.m1.3.4.2.cmml" xref="S2.E3.m1.3.4.2"><csymbol cd="ambiguous" id="S2.E3.m1.3.4.2.1.cmml" xref="S2.E3.m1.3.4.2">subscript</csymbol><apply id="S2.E3.m1.3.4.2.2.cmml" xref="S2.E3.m1.3.4.2"><csymbol cd="ambiguous" id="S2.E3.m1.3.4.2.2.1.cmml" xref="S2.E3.m1.3.4.2">superscript</csymbol><ci id="S2.E3.m1.3.4.2.2.2.cmml" xref="S2.E3.m1.3.4.2.2.2">ℒ</ci><ci id="S2.E3.m1.3.4.2.2.3.cmml" xref="S2.E3.m1.3.4.2.2.3">N2N</ci></apply><apply id="S2.E3.m1.3.3.2.cmml" xref="S2.E3.m1.3.3.2"><csymbol cd="latexml" id="S2.E3.m1.3.3.2.3.cmml" xref="S2.E3.m1.3.3.2.3">conditional</csymbol><apply id="S2.E3.m1.3.3.2.4.cmml" xref="S2.E3.m1.3.3.2.4"><csymbol cd="ambiguous" id="S2.E3.m1.3.3.2.4.1.cmml" xref="S2.E3.m1.3.3.2.4">superscript</csymbol><ci id="S2.E3.m1.3.3.2.4.2.cmml" xref="S2.E3.m1.3.3.2.4.2">𝒚</ci><cn id="S2.E3.m1.2.2.1.1.1.1.cmml" 
type="integer" xref="S2.E3.m1.2.2.1.1.1.1">2</cn></apply><apply id="S2.E3.m1.3.3.2.5.cmml" xref="S2.E3.m1.3.3.2.5"><csymbol cd="ambiguous" id="S2.E3.m1.3.3.2.5.1.cmml" xref="S2.E3.m1.3.3.2.5">superscript</csymbol><ci id="S2.E3.m1.3.3.2.5.2.cmml" xref="S2.E3.m1.3.3.2.5.2">𝒚</ci><cn id="S2.E3.m1.3.3.2.2.1.1.cmml" type="integer" xref="S2.E3.m1.3.3.2.2.1.1">1</cn></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E3.m1.3c">\displaystyle\frac{\partial}{\partial\hat{\bm{y}}^{(2)}}\mathcal{L}^{\rm N2N}_% {\bm{y}^{(2)}|\bm{y}^{(1)}}</annotation><annotation encoding="application/x-llamapun" id="S2.E3.m1.3d">divide start_ARG ∂ end_ARG start_ARG ∂ over^ start_ARG bold_italic_y end_ARG start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT end_ARG caligraphic_L start_POSTSUPERSCRIPT N2N end_POSTSUPERSCRIPT start_POSTSUBSCRIPT bold_italic_y start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT | bold_italic_y start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_left ltx_eqn_cell"><math alttext="\displaystyle=0," class="ltx_Math" display="inline" id="S2.E3.m2.1"><semantics id="S2.E3.m2.1a"><mrow id="S2.E3.m2.1.1.1" xref="S2.E3.m2.1.1.1.1.cmml"><mrow id="S2.E3.m2.1.1.1.1" xref="S2.E3.m2.1.1.1.1.cmml"><mi id="S2.E3.m2.1.1.1.1.2" xref="S2.E3.m2.1.1.1.1.2.cmml"></mi><mo id="S2.E3.m2.1.1.1.1.1" xref="S2.E3.m2.1.1.1.1.1.cmml">=</mo><mn id="S2.E3.m2.1.1.1.1.3" xref="S2.E3.m2.1.1.1.1.3.cmml">0</mn></mrow><mo id="S2.E3.m2.1.1.1.2" xref="S2.E3.m2.1.1.1.1.cmml">,</mo></mrow><annotation-xml encoding="MathML-Content" id="S2.E3.m2.1b"><apply id="S2.E3.m2.1.1.1.1.cmml" xref="S2.E3.m2.1.1.1"><eq id="S2.E3.m2.1.1.1.1.1.cmml" xref="S2.E3.m2.1.1.1.1.1"></eq><csymbol cd="latexml" id="S2.E3.m2.1.1.1.1.2.cmml" xref="S2.E3.m2.1.1.1.1.2">absent</csymbol><cn id="S2.E3.m2.1.1.1.1.3.cmml" type="integer" xref="S2.E3.m2.1.1.1.1.3">0</cn></apply></annotation-xml><annotation 
encoding="application/x-tex" id="S2.E3.m2.1c">\displaystyle=0,</annotation><annotation encoding="application/x-llamapun" id="S2.E3.m2.1d">= 0 ,</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(3)</span></td> </tr></tbody> <tbody id="S2.E4"><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_align_right ltx_eqn_cell"><math alttext="\displaystyle\frac{\partial}{\partial\hat{\bm{y}}^{(2)}}\mathbb{E}_{\bm{y}^{(2% )}|\bm{y}^{(1)}}\left[\|\hat{\bm{y}}^{(2)}-\bm{y}^{(2)}\|_{2}^{2}\right]" class="ltx_Math" display="inline" id="S2.E4.m1.6"><semantics id="S2.E4.m1.6a"><mrow id="S2.E4.m1.6.6" xref="S2.E4.m1.6.6.cmml"><mstyle displaystyle="true" id="S2.E4.m1.1.1" xref="S2.E4.m1.1.1.cmml"><mfrac id="S2.E4.m1.1.1a" xref="S2.E4.m1.1.1.cmml"><mo id="S2.E4.m1.1.1.3" xref="S2.E4.m1.1.1.3.cmml">∂</mo><mrow id="S2.E4.m1.1.1.1" xref="S2.E4.m1.1.1.1.cmml"><mo id="S2.E4.m1.1.1.1.2" rspace="0em" xref="S2.E4.m1.1.1.1.2.cmml">∂</mo><msup id="S2.E4.m1.1.1.1.3" xref="S2.E4.m1.1.1.1.3.cmml"><mover accent="true" id="S2.E4.m1.1.1.1.3.2" xref="S2.E4.m1.1.1.1.3.2.cmml"><mi id="S2.E4.m1.1.1.1.3.2.2" xref="S2.E4.m1.1.1.1.3.2.2.cmml">𝒚</mi><mo id="S2.E4.m1.1.1.1.3.2.1" xref="S2.E4.m1.1.1.1.3.2.1.cmml">^</mo></mover><mrow id="S2.E4.m1.1.1.1.1.1.3" xref="S2.E4.m1.1.1.1.3.cmml"><mo id="S2.E4.m1.1.1.1.1.1.3.1" stretchy="false" xref="S2.E4.m1.1.1.1.3.cmml">(</mo><mn id="S2.E4.m1.1.1.1.1.1.1" xref="S2.E4.m1.1.1.1.1.1.1.cmml">2</mn><mo id="S2.E4.m1.1.1.1.1.1.3.2" stretchy="false" xref="S2.E4.m1.1.1.1.3.cmml">)</mo></mrow></msup></mrow></mfrac></mstyle><mo id="S2.E4.m1.6.6.2" xref="S2.E4.m1.6.6.2.cmml">⁢</mo><msub id="S2.E4.m1.6.6.3" xref="S2.E4.m1.6.6.3.cmml"><mi id="S2.E4.m1.6.6.3.2" xref="S2.E4.m1.6.6.3.2.cmml">𝔼</mi><mrow id="S2.E4.m1.3.3.2" 
xref="S2.E4.m1.3.3.2.cmml"><msup id="S2.E4.m1.3.3.2.4" xref="S2.E4.m1.3.3.2.4.cmml"><mi id="S2.E4.m1.3.3.2.4.2" xref="S2.E4.m1.3.3.2.4.2.cmml">𝒚</mi><mrow id="S2.E4.m1.2.2.1.1.1.3" xref="S2.E4.m1.3.3.2.4.cmml"><mo id="S2.E4.m1.2.2.1.1.1.3.1" stretchy="false" xref="S2.E4.m1.3.3.2.4.cmml">(</mo><mn id="S2.E4.m1.2.2.1.1.1.1" xref="S2.E4.m1.2.2.1.1.1.1.cmml">2</mn><mo id="S2.E4.m1.2.2.1.1.1.3.2" stretchy="false" xref="S2.E4.m1.3.3.2.4.cmml">)</mo></mrow></msup><mo fence="false" id="S2.E4.m1.3.3.2.3" xref="S2.E4.m1.3.3.2.3.cmml">|</mo><msup id="S2.E4.m1.3.3.2.5" xref="S2.E4.m1.3.3.2.5.cmml"><mi id="S2.E4.m1.3.3.2.5.2" xref="S2.E4.m1.3.3.2.5.2.cmml">𝒚</mi><mrow id="S2.E4.m1.3.3.2.2.1.3" xref="S2.E4.m1.3.3.2.5.cmml"><mo id="S2.E4.m1.3.3.2.2.1.3.1" stretchy="false" xref="S2.E4.m1.3.3.2.5.cmml">(</mo><mn id="S2.E4.m1.3.3.2.2.1.1" xref="S2.E4.m1.3.3.2.2.1.1.cmml">1</mn><mo id="S2.E4.m1.3.3.2.2.1.3.2" stretchy="false" xref="S2.E4.m1.3.3.2.5.cmml">)</mo></mrow></msup></mrow></msub><mo id="S2.E4.m1.6.6.2a" xref="S2.E4.m1.6.6.2.cmml">⁢</mo><mrow id="S2.E4.m1.6.6.1.1" xref="S2.E4.m1.6.6.1.2.cmml"><mo id="S2.E4.m1.6.6.1.1.2" xref="S2.E4.m1.6.6.1.2.1.cmml">[</mo><msubsup id="S2.E4.m1.6.6.1.1.1" xref="S2.E4.m1.6.6.1.1.1.cmml"><mrow id="S2.E4.m1.6.6.1.1.1.1.1.1" xref="S2.E4.m1.6.6.1.1.1.1.1.2.cmml"><mo id="S2.E4.m1.6.6.1.1.1.1.1.1.2" stretchy="false" xref="S2.E4.m1.6.6.1.1.1.1.1.2.1.cmml">‖</mo><mrow id="S2.E4.m1.6.6.1.1.1.1.1.1.1" xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.cmml"><msup id="S2.E4.m1.6.6.1.1.1.1.1.1.1.2" xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.2.cmml"><mover accent="true" id="S2.E4.m1.6.6.1.1.1.1.1.1.1.2.2" xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.2.2.cmml"><mi id="S2.E4.m1.6.6.1.1.1.1.1.1.1.2.2.2" xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.2.2.2.cmml">𝒚</mi><mo id="S2.E4.m1.6.6.1.1.1.1.1.1.1.2.2.1" xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.2.2.1.cmml">^</mo></mover><mrow id="S2.E4.m1.4.4.1.3" xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.2.cmml"><mo id="S2.E4.m1.4.4.1.3.1" stretchy="false" 
xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.2.cmml">(</mo><mn id="S2.E4.m1.4.4.1.1" xref="S2.E4.m1.4.4.1.1.cmml">2</mn><mo id="S2.E4.m1.4.4.1.3.2" stretchy="false" xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.2.cmml">)</mo></mrow></msup><mo id="S2.E4.m1.6.6.1.1.1.1.1.1.1.1" xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.1.cmml">−</mo><msup id="S2.E4.m1.6.6.1.1.1.1.1.1.1.3" xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.3.cmml"><mi id="S2.E4.m1.6.6.1.1.1.1.1.1.1.3.2" xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.3.2.cmml">𝒚</mi><mrow id="S2.E4.m1.5.5.1.3" xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.3.cmml"><mo id="S2.E4.m1.5.5.1.3.1" stretchy="false" xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.3.cmml">(</mo><mn id="S2.E4.m1.5.5.1.1" xref="S2.E4.m1.5.5.1.1.cmml">2</mn><mo id="S2.E4.m1.5.5.1.3.2" stretchy="false" xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.3.cmml">)</mo></mrow></msup></mrow><mo id="S2.E4.m1.6.6.1.1.1.1.1.1.3" stretchy="false" xref="S2.E4.m1.6.6.1.1.1.1.1.2.1.cmml">‖</mo></mrow><mn id="S2.E4.m1.6.6.1.1.1.1.3" xref="S2.E4.m1.6.6.1.1.1.1.3.cmml">2</mn><mn id="S2.E4.m1.6.6.1.1.1.3" xref="S2.E4.m1.6.6.1.1.1.3.cmml">2</mn></msubsup><mo id="S2.E4.m1.6.6.1.1.3" xref="S2.E4.m1.6.6.1.2.1.cmml">]</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.E4.m1.6b"><apply id="S2.E4.m1.6.6.cmml" xref="S2.E4.m1.6.6"><times id="S2.E4.m1.6.6.2.cmml" xref="S2.E4.m1.6.6.2"></times><apply id="S2.E4.m1.1.1.cmml" xref="S2.E4.m1.1.1"><divide id="S2.E4.m1.1.1.2.cmml" xref="S2.E4.m1.1.1"></divide><partialdiff id="S2.E4.m1.1.1.3.cmml" xref="S2.E4.m1.1.1.3"></partialdiff><apply id="S2.E4.m1.1.1.1.cmml" xref="S2.E4.m1.1.1.1"><partialdiff id="S2.E4.m1.1.1.1.2.cmml" xref="S2.E4.m1.1.1.1.2"></partialdiff><apply id="S2.E4.m1.1.1.1.3.cmml" xref="S2.E4.m1.1.1.1.3"><csymbol cd="ambiguous" id="S2.E4.m1.1.1.1.3.1.cmml" xref="S2.E4.m1.1.1.1.3">superscript</csymbol><apply id="S2.E4.m1.1.1.1.3.2.cmml" xref="S2.E4.m1.1.1.1.3.2"><ci id="S2.E4.m1.1.1.1.3.2.1.cmml" xref="S2.E4.m1.1.1.1.3.2.1">^</ci><ci id="S2.E4.m1.1.1.1.3.2.2.cmml" xref="S2.E4.m1.1.1.1.3.2.2">𝒚</ci></apply><cn 
id="S2.E4.m1.1.1.1.1.1.1.cmml" type="integer" xref="S2.E4.m1.1.1.1.1.1.1">2</cn></apply></apply></apply><apply id="S2.E4.m1.6.6.3.cmml" xref="S2.E4.m1.6.6.3"><csymbol cd="ambiguous" id="S2.E4.m1.6.6.3.1.cmml" xref="S2.E4.m1.6.6.3">subscript</csymbol><ci id="S2.E4.m1.6.6.3.2.cmml" xref="S2.E4.m1.6.6.3.2">𝔼</ci><apply id="S2.E4.m1.3.3.2.cmml" xref="S2.E4.m1.3.3.2"><csymbol cd="latexml" id="S2.E4.m1.3.3.2.3.cmml" xref="S2.E4.m1.3.3.2.3">conditional</csymbol><apply id="S2.E4.m1.3.3.2.4.cmml" xref="S2.E4.m1.3.3.2.4"><csymbol cd="ambiguous" id="S2.E4.m1.3.3.2.4.1.cmml" xref="S2.E4.m1.3.3.2.4">superscript</csymbol><ci id="S2.E4.m1.3.3.2.4.2.cmml" xref="S2.E4.m1.3.3.2.4.2">𝒚</ci><cn id="S2.E4.m1.2.2.1.1.1.1.cmml" type="integer" xref="S2.E4.m1.2.2.1.1.1.1">2</cn></apply><apply id="S2.E4.m1.3.3.2.5.cmml" xref="S2.E4.m1.3.3.2.5"><csymbol cd="ambiguous" id="S2.E4.m1.3.3.2.5.1.cmml" xref="S2.E4.m1.3.3.2.5">superscript</csymbol><ci id="S2.E4.m1.3.3.2.5.2.cmml" xref="S2.E4.m1.3.3.2.5.2">𝒚</ci><cn id="S2.E4.m1.3.3.2.2.1.1.cmml" type="integer" xref="S2.E4.m1.3.3.2.2.1.1">1</cn></apply></apply></apply><apply id="S2.E4.m1.6.6.1.2.cmml" xref="S2.E4.m1.6.6.1.1"><csymbol cd="latexml" id="S2.E4.m1.6.6.1.2.1.cmml" xref="S2.E4.m1.6.6.1.1.2">delimited-[]</csymbol><apply id="S2.E4.m1.6.6.1.1.1.cmml" xref="S2.E4.m1.6.6.1.1.1"><csymbol cd="ambiguous" id="S2.E4.m1.6.6.1.1.1.2.cmml" xref="S2.E4.m1.6.6.1.1.1">superscript</csymbol><apply id="S2.E4.m1.6.6.1.1.1.1.cmml" xref="S2.E4.m1.6.6.1.1.1"><csymbol cd="ambiguous" id="S2.E4.m1.6.6.1.1.1.1.2.cmml" xref="S2.E4.m1.6.6.1.1.1">subscript</csymbol><apply id="S2.E4.m1.6.6.1.1.1.1.1.2.cmml" xref="S2.E4.m1.6.6.1.1.1.1.1.1"><csymbol cd="latexml" id="S2.E4.m1.6.6.1.1.1.1.1.2.1.cmml" xref="S2.E4.m1.6.6.1.1.1.1.1.1.2">norm</csymbol><apply id="S2.E4.m1.6.6.1.1.1.1.1.1.1.cmml" xref="S2.E4.m1.6.6.1.1.1.1.1.1.1"><minus id="S2.E4.m1.6.6.1.1.1.1.1.1.1.1.cmml" xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.1"></minus><apply id="S2.E4.m1.6.6.1.1.1.1.1.1.1.2.cmml" 
xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S2.E4.m1.6.6.1.1.1.1.1.1.1.2.1.cmml" xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.2">superscript</csymbol><apply id="S2.E4.m1.6.6.1.1.1.1.1.1.1.2.2.cmml" xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.2.2"><ci id="S2.E4.m1.6.6.1.1.1.1.1.1.1.2.2.1.cmml" xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.2.2.1">^</ci><ci id="S2.E4.m1.6.6.1.1.1.1.1.1.1.2.2.2.cmml" xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.2.2.2">𝒚</ci></apply><cn id="S2.E4.m1.4.4.1.1.cmml" type="integer" xref="S2.E4.m1.4.4.1.1">2</cn></apply><apply id="S2.E4.m1.6.6.1.1.1.1.1.1.1.3.cmml" xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S2.E4.m1.6.6.1.1.1.1.1.1.1.3.1.cmml" xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.3">superscript</csymbol><ci id="S2.E4.m1.6.6.1.1.1.1.1.1.1.3.2.cmml" xref="S2.E4.m1.6.6.1.1.1.1.1.1.1.3.2">𝒚</ci><cn id="S2.E4.m1.5.5.1.1.cmml" type="integer" xref="S2.E4.m1.5.5.1.1">2</cn></apply></apply></apply><cn id="S2.E4.m1.6.6.1.1.1.1.3.cmml" type="integer" xref="S2.E4.m1.6.6.1.1.1.1.3">2</cn></apply><cn id="S2.E4.m1.6.6.1.1.1.3.cmml" type="integer" xref="S2.E4.m1.6.6.1.1.1.3">2</cn></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E4.m1.6c">\displaystyle\frac{\partial}{\partial\hat{\bm{y}}^{(2)}}\mathbb{E}_{\bm{y}^{(2% )}|\bm{y}^{(1)}}\left[\|\hat{\bm{y}}^{(2)}-\bm{y}^{(2)}\|_{2}^{2}\right]</annotation><annotation encoding="application/x-llamapun" id="S2.E4.m1.6d">divide start_ARG ∂ end_ARG start_ARG ∂ over^ start_ARG bold_italic_y end_ARG start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT end_ARG blackboard_E start_POSTSUBSCRIPT bold_italic_y start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT | bold_italic_y start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT end_POSTSUBSCRIPT [ ∥ over^ start_ARG bold_italic_y end_ARG start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT - bold_italic_y start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT 
]</annotation></semantics></math></td> <td class="ltx_td ltx_align_left ltx_eqn_cell"><math alttext="\displaystyle=0," class="ltx_Math" display="inline" id="S2.E4.m2.1"><semantics id="S2.E4.m2.1a"><mrow id="S2.E4.m2.1.1.1" xref="S2.E4.m2.1.1.1.1.cmml"><mrow id="S2.E4.m2.1.1.1.1" xref="S2.E4.m2.1.1.1.1.cmml"><mi id="S2.E4.m2.1.1.1.1.2" xref="S2.E4.m2.1.1.1.1.2.cmml"></mi><mo id="S2.E4.m2.1.1.1.1.1" xref="S2.E4.m2.1.1.1.1.1.cmml">=</mo><mn id="S2.E4.m2.1.1.1.1.3" xref="S2.E4.m2.1.1.1.1.3.cmml">0</mn></mrow><mo id="S2.E4.m2.1.1.1.2" xref="S2.E4.m2.1.1.1.1.cmml">,</mo></mrow><annotation-xml encoding="MathML-Content" id="S2.E4.m2.1b"><apply id="S2.E4.m2.1.1.1.1.cmml" xref="S2.E4.m2.1.1.1"><eq id="S2.E4.m2.1.1.1.1.1.cmml" xref="S2.E4.m2.1.1.1.1.1"></eq><csymbol cd="latexml" id="S2.E4.m2.1.1.1.1.2.cmml" xref="S2.E4.m2.1.1.1.1.2">absent</csymbol><cn id="S2.E4.m2.1.1.1.1.3.cmml" type="integer" xref="S2.E4.m2.1.1.1.1.3">0</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E4.m2.1c">\displaystyle=0,</annotation><annotation encoding="application/x-llamapun" id="S2.E4.m2.1d">= 0 ,</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(4)</span></td> </tr></tbody> <tbody id="S2.E5"><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_align_right ltx_eqn_cell"><math alttext="\displaystyle\hat{\bm{y}}^{(2)}" class="ltx_Math" display="inline" id="S2.E5.m1.1"><semantics id="S2.E5.m1.1a"><msup id="S2.E5.m1.1.2" xref="S2.E5.m1.1.2.cmml"><mover accent="true" id="S2.E5.m1.1.2.2" xref="S2.E5.m1.1.2.2.cmml"><mi id="S2.E5.m1.1.2.2.2" xref="S2.E5.m1.1.2.2.2.cmml">𝒚</mi><mo id="S2.E5.m1.1.2.2.1" xref="S2.E5.m1.1.2.2.1.cmml">^</mo></mover><mrow id="S2.E5.m1.1.1.1.3" xref="S2.E5.m1.1.2.cmml"><mo 
id="S2.E5.m1.1.1.1.3.1" stretchy="false" xref="S2.E5.m1.1.2.cmml">(</mo><mn id="S2.E5.m1.1.1.1.1" xref="S2.E5.m1.1.1.1.1.cmml">2</mn><mo id="S2.E5.m1.1.1.1.3.2" stretchy="false" xref="S2.E5.m1.1.2.cmml">)</mo></mrow></msup><annotation-xml encoding="MathML-Content" id="S2.E5.m1.1b"><apply id="S2.E5.m1.1.2.cmml" xref="S2.E5.m1.1.2"><csymbol cd="ambiguous" id="S2.E5.m1.1.2.1.cmml" xref="S2.E5.m1.1.2">superscript</csymbol><apply id="S2.E5.m1.1.2.2.cmml" xref="S2.E5.m1.1.2.2"><ci id="S2.E5.m1.1.2.2.1.cmml" xref="S2.E5.m1.1.2.2.1">^</ci><ci id="S2.E5.m1.1.2.2.2.cmml" xref="S2.E5.m1.1.2.2.2">𝒚</ci></apply><cn id="S2.E5.m1.1.1.1.1.cmml" type="integer" xref="S2.E5.m1.1.1.1.1">2</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E5.m1.1c">\displaystyle\hat{\bm{y}}^{(2)}</annotation><annotation encoding="application/x-llamapun" id="S2.E5.m1.1d">over^ start_ARG bold_italic_y end_ARG start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_left ltx_eqn_cell"><math alttext="\displaystyle=\mathbb{E}_{\bm{y}^{(2)}|\bm{y}^{(1)}}[\bm{y}^{(2)}]." 
class="ltx_Math" display="inline" id="S2.E5.m2.4"><semantics id="S2.E5.m2.4a"><mrow id="S2.E5.m2.4.4.1" xref="S2.E5.m2.4.4.1.1.cmml"><mrow id="S2.E5.m2.4.4.1.1" xref="S2.E5.m2.4.4.1.1.cmml"><mi id="S2.E5.m2.4.4.1.1.3" xref="S2.E5.m2.4.4.1.1.3.cmml"></mi><mo id="S2.E5.m2.4.4.1.1.2" xref="S2.E5.m2.4.4.1.1.2.cmml">=</mo><mrow id="S2.E5.m2.4.4.1.1.1" xref="S2.E5.m2.4.4.1.1.1.cmml"><msub id="S2.E5.m2.4.4.1.1.1.3" xref="S2.E5.m2.4.4.1.1.1.3.cmml"><mi id="S2.E5.m2.4.4.1.1.1.3.2" xref="S2.E5.m2.4.4.1.1.1.3.2.cmml">𝔼</mi><mrow id="S2.E5.m2.2.2.2" xref="S2.E5.m2.2.2.2.cmml"><msup id="S2.E5.m2.2.2.2.4" xref="S2.E5.m2.2.2.2.4.cmml"><mi id="S2.E5.m2.2.2.2.4.2" xref="S2.E5.m2.2.2.2.4.2.cmml">𝒚</mi><mrow id="S2.E5.m2.1.1.1.1.1.3" xref="S2.E5.m2.2.2.2.4.cmml"><mo id="S2.E5.m2.1.1.1.1.1.3.1" stretchy="false" xref="S2.E5.m2.2.2.2.4.cmml">(</mo><mn id="S2.E5.m2.1.1.1.1.1.1" xref="S2.E5.m2.1.1.1.1.1.1.cmml">2</mn><mo id="S2.E5.m2.1.1.1.1.1.3.2" stretchy="false" xref="S2.E5.m2.2.2.2.4.cmml">)</mo></mrow></msup><mo fence="false" id="S2.E5.m2.2.2.2.3" xref="S2.E5.m2.2.2.2.3.cmml">|</mo><msup id="S2.E5.m2.2.2.2.5" xref="S2.E5.m2.2.2.2.5.cmml"><mi id="S2.E5.m2.2.2.2.5.2" xref="S2.E5.m2.2.2.2.5.2.cmml">𝒚</mi><mrow id="S2.E5.m2.2.2.2.2.1.3" xref="S2.E5.m2.2.2.2.5.cmml"><mo id="S2.E5.m2.2.2.2.2.1.3.1" stretchy="false" xref="S2.E5.m2.2.2.2.5.cmml">(</mo><mn id="S2.E5.m2.2.2.2.2.1.1" xref="S2.E5.m2.2.2.2.2.1.1.cmml">1</mn><mo id="S2.E5.m2.2.2.2.2.1.3.2" stretchy="false" xref="S2.E5.m2.2.2.2.5.cmml">)</mo></mrow></msup></mrow></msub><mo id="S2.E5.m2.4.4.1.1.1.2" xref="S2.E5.m2.4.4.1.1.1.2.cmml">⁢</mo><mrow id="S2.E5.m2.4.4.1.1.1.1.1" xref="S2.E5.m2.4.4.1.1.1.1.2.cmml"><mo id="S2.E5.m2.4.4.1.1.1.1.1.2" stretchy="false" xref="S2.E5.m2.4.4.1.1.1.1.2.1.cmml">[</mo><msup id="S2.E5.m2.4.4.1.1.1.1.1.1" xref="S2.E5.m2.4.4.1.1.1.1.1.1.cmml"><mi id="S2.E5.m2.4.4.1.1.1.1.1.1.2" xref="S2.E5.m2.4.4.1.1.1.1.1.1.2.cmml">𝒚</mi><mrow id="S2.E5.m2.3.3.1.3" xref="S2.E5.m2.4.4.1.1.1.1.1.1.cmml"><mo 
id="S2.E5.m2.3.3.1.3.1" stretchy="false" xref="S2.E5.m2.4.4.1.1.1.1.1.1.cmml">(</mo><mn id="S2.E5.m2.3.3.1.1" xref="S2.E5.m2.3.3.1.1.cmml">2</mn><mo id="S2.E5.m2.3.3.1.3.2" stretchy="false" xref="S2.E5.m2.4.4.1.1.1.1.1.1.cmml">)</mo></mrow></msup><mo id="S2.E5.m2.4.4.1.1.1.1.1.3" stretchy="false" xref="S2.E5.m2.4.4.1.1.1.1.2.1.cmml">]</mo></mrow></mrow></mrow><mo id="S2.E5.m2.4.4.1.2" lspace="0em" xref="S2.E5.m2.4.4.1.1.cmml">.</mo></mrow><annotation-xml encoding="MathML-Content" id="S2.E5.m2.4b"><apply id="S2.E5.m2.4.4.1.1.cmml" xref="S2.E5.m2.4.4.1"><eq id="S2.E5.m2.4.4.1.1.2.cmml" xref="S2.E5.m2.4.4.1.1.2"></eq><csymbol cd="latexml" id="S2.E5.m2.4.4.1.1.3.cmml" xref="S2.E5.m2.4.4.1.1.3">absent</csymbol><apply id="S2.E5.m2.4.4.1.1.1.cmml" xref="S2.E5.m2.4.4.1.1.1"><times id="S2.E5.m2.4.4.1.1.1.2.cmml" xref="S2.E5.m2.4.4.1.1.1.2"></times><apply id="S2.E5.m2.4.4.1.1.1.3.cmml" xref="S2.E5.m2.4.4.1.1.1.3"><csymbol cd="ambiguous" id="S2.E5.m2.4.4.1.1.1.3.1.cmml" xref="S2.E5.m2.4.4.1.1.1.3">subscript</csymbol><ci id="S2.E5.m2.4.4.1.1.1.3.2.cmml" xref="S2.E5.m2.4.4.1.1.1.3.2">𝔼</ci><apply id="S2.E5.m2.2.2.2.cmml" xref="S2.E5.m2.2.2.2"><csymbol cd="latexml" id="S2.E5.m2.2.2.2.3.cmml" xref="S2.E5.m2.2.2.2.3">conditional</csymbol><apply id="S2.E5.m2.2.2.2.4.cmml" xref="S2.E5.m2.2.2.2.4"><csymbol cd="ambiguous" id="S2.E5.m2.2.2.2.4.1.cmml" xref="S2.E5.m2.2.2.2.4">superscript</csymbol><ci id="S2.E5.m2.2.2.2.4.2.cmml" xref="S2.E5.m2.2.2.2.4.2">𝒚</ci><cn id="S2.E5.m2.1.1.1.1.1.1.cmml" type="integer" xref="S2.E5.m2.1.1.1.1.1.1">2</cn></apply><apply id="S2.E5.m2.2.2.2.5.cmml" xref="S2.E5.m2.2.2.2.5"><csymbol cd="ambiguous" id="S2.E5.m2.2.2.2.5.1.cmml" xref="S2.E5.m2.2.2.2.5">superscript</csymbol><ci id="S2.E5.m2.2.2.2.5.2.cmml" xref="S2.E5.m2.2.2.2.5.2">𝒚</ci><cn id="S2.E5.m2.2.2.2.2.1.1.cmml" type="integer" xref="S2.E5.m2.2.2.2.2.1.1">1</cn></apply></apply></apply><apply id="S2.E5.m2.4.4.1.1.1.1.2.cmml" xref="S2.E5.m2.4.4.1.1.1.1.1"><csymbol cd="latexml" 
id="S2.E5.m2.4.4.1.1.1.1.2.1.cmml" xref="S2.E5.m2.4.4.1.1.1.1.1.2">delimited-[]</csymbol><apply id="S2.E5.m2.4.4.1.1.1.1.1.1.cmml" xref="S2.E5.m2.4.4.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.E5.m2.4.4.1.1.1.1.1.1.1.cmml" xref="S2.E5.m2.4.4.1.1.1.1.1.1">superscript</csymbol><ci id="S2.E5.m2.4.4.1.1.1.1.1.1.2.cmml" xref="S2.E5.m2.4.4.1.1.1.1.1.1.2">𝒚</ci><cn id="S2.E5.m2.3.3.1.1.cmml" type="integer" xref="S2.E5.m2.3.3.1.1">2</cn></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E5.m2.4c">\displaystyle=\mathbb{E}_{\bm{y}^{(2)}|\bm{y}^{(1)}}[\bm{y}^{(2)}].</annotation><annotation encoding="application/x-llamapun" id="S2.E5.m2.4d">= blackboard_E start_POSTSUBSCRIPT bold_italic_y start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT | bold_italic_y start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT end_POSTSUBSCRIPT [ bold_italic_y start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT ] .</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(5)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S2.SS1.p1.21">This averaging effect is known to cause blurred outputs in super-resolution and greyish outputs in automatic colorization <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">Ledig_2017</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">Zhang_2016</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">Isola_2017</span>]</cite>. 
On the basis of this property, Noise2Noise can achieve the same denoising training as CTT without requiring clean target signals since the averaging effect can remove zero-mean noise in the output signal (i.e., <math alttext="\hat{\bm{y}}^{(2)}=\mathbb{E}_{\bm{y}^{(2)}|\bm{y}^{(1)}}[\bm{s}+\bm{n}^{(2)}]% =\mathbb{E}_{\bm{y}^{(2)}|\bm{y}^{(1)}}[\bm{s}]" class="ltx_Math" display="inline" id="S2.SS1.p1.21.m1.8"><semantics id="S2.SS1.p1.21.m1.8a"><mrow id="S2.SS1.p1.21.m1.8.8" xref="S2.SS1.p1.21.m1.8.8.cmml"><msup id="S2.SS1.p1.21.m1.8.8.3" xref="S2.SS1.p1.21.m1.8.8.3.cmml"><mover accent="true" id="S2.SS1.p1.21.m1.8.8.3.2" xref="S2.SS1.p1.21.m1.8.8.3.2.cmml"><mi id="S2.SS1.p1.21.m1.8.8.3.2.2" xref="S2.SS1.p1.21.m1.8.8.3.2.2.cmml">𝒚</mi><mo id="S2.SS1.p1.21.m1.8.8.3.2.1" xref="S2.SS1.p1.21.m1.8.8.3.2.1.cmml">^</mo></mover><mrow id="S2.SS1.p1.21.m1.1.1.1.3" xref="S2.SS1.p1.21.m1.8.8.3.cmml"><mo id="S2.SS1.p1.21.m1.1.1.1.3.1" stretchy="false" xref="S2.SS1.p1.21.m1.8.8.3.cmml">(</mo><mn id="S2.SS1.p1.21.m1.1.1.1.1" xref="S2.SS1.p1.21.m1.1.1.1.1.cmml">2</mn><mo id="S2.SS1.p1.21.m1.1.1.1.3.2" stretchy="false" xref="S2.SS1.p1.21.m1.8.8.3.cmml">)</mo></mrow></msup><mo id="S2.SS1.p1.21.m1.8.8.4" xref="S2.SS1.p1.21.m1.8.8.4.cmml">=</mo><mrow id="S2.SS1.p1.21.m1.8.8.1" xref="S2.SS1.p1.21.m1.8.8.1.cmml"><msub id="S2.SS1.p1.21.m1.8.8.1.3" xref="S2.SS1.p1.21.m1.8.8.1.3.cmml"><mi id="S2.SS1.p1.21.m1.8.8.1.3.2" xref="S2.SS1.p1.21.m1.8.8.1.3.2.cmml">𝔼</mi><mrow id="S2.SS1.p1.21.m1.3.3.2" xref="S2.SS1.p1.21.m1.3.3.2.cmml"><msup id="S2.SS1.p1.21.m1.3.3.2.4" xref="S2.SS1.p1.21.m1.3.3.2.4.cmml"><mi id="S2.SS1.p1.21.m1.3.3.2.4.2" xref="S2.SS1.p1.21.m1.3.3.2.4.2.cmml">𝒚</mi><mrow id="S2.SS1.p1.21.m1.2.2.1.1.1.3" xref="S2.SS1.p1.21.m1.3.3.2.4.cmml"><mo id="S2.SS1.p1.21.m1.2.2.1.1.1.3.1" stretchy="false" xref="S2.SS1.p1.21.m1.3.3.2.4.cmml">(</mo><mn id="S2.SS1.p1.21.m1.2.2.1.1.1.1" xref="S2.SS1.p1.21.m1.2.2.1.1.1.1.cmml">2</mn><mo id="S2.SS1.p1.21.m1.2.2.1.1.1.3.2" stretchy="false" 
xref="S2.SS1.p1.21.m1.3.3.2.4.cmml">)</mo></mrow></msup><mo fence="false" id="S2.SS1.p1.21.m1.3.3.2.3" xref="S2.SS1.p1.21.m1.3.3.2.3.cmml">|</mo><msup id="S2.SS1.p1.21.m1.3.3.2.5" xref="S2.SS1.p1.21.m1.3.3.2.5.cmml"><mi id="S2.SS1.p1.21.m1.3.3.2.5.2" xref="S2.SS1.p1.21.m1.3.3.2.5.2.cmml">𝒚</mi><mrow id="S2.SS1.p1.21.m1.3.3.2.2.1.3" xref="S2.SS1.p1.21.m1.3.3.2.5.cmml"><mo id="S2.SS1.p1.21.m1.3.3.2.2.1.3.1" stretchy="false" xref="S2.SS1.p1.21.m1.3.3.2.5.cmml">(</mo><mn id="S2.SS1.p1.21.m1.3.3.2.2.1.1" xref="S2.SS1.p1.21.m1.3.3.2.2.1.1.cmml">1</mn><mo id="S2.SS1.p1.21.m1.3.3.2.2.1.3.2" stretchy="false" xref="S2.SS1.p1.21.m1.3.3.2.5.cmml">)</mo></mrow></msup></mrow></msub><mo id="S2.SS1.p1.21.m1.8.8.1.2" xref="S2.SS1.p1.21.m1.8.8.1.2.cmml">⁢</mo><mrow id="S2.SS1.p1.21.m1.8.8.1.1.1" xref="S2.SS1.p1.21.m1.8.8.1.1.2.cmml"><mo id="S2.SS1.p1.21.m1.8.8.1.1.1.2" stretchy="false" xref="S2.SS1.p1.21.m1.8.8.1.1.2.1.cmml">[</mo><mrow id="S2.SS1.p1.21.m1.8.8.1.1.1.1" xref="S2.SS1.p1.21.m1.8.8.1.1.1.1.cmml"><mi id="S2.SS1.p1.21.m1.8.8.1.1.1.1.2" xref="S2.SS1.p1.21.m1.8.8.1.1.1.1.2.cmml">𝒔</mi><mo id="S2.SS1.p1.21.m1.8.8.1.1.1.1.1" xref="S2.SS1.p1.21.m1.8.8.1.1.1.1.1.cmml">+</mo><msup id="S2.SS1.p1.21.m1.8.8.1.1.1.1.3" xref="S2.SS1.p1.21.m1.8.8.1.1.1.1.3.cmml"><mi id="S2.SS1.p1.21.m1.8.8.1.1.1.1.3.2" xref="S2.SS1.p1.21.m1.8.8.1.1.1.1.3.2.cmml">𝒏</mi><mrow id="S2.SS1.p1.21.m1.4.4.1.3" xref="S2.SS1.p1.21.m1.8.8.1.1.1.1.3.cmml"><mo id="S2.SS1.p1.21.m1.4.4.1.3.1" stretchy="false" xref="S2.SS1.p1.21.m1.8.8.1.1.1.1.3.cmml">(</mo><mn id="S2.SS1.p1.21.m1.4.4.1.1" xref="S2.SS1.p1.21.m1.4.4.1.1.cmml">2</mn><mo id="S2.SS1.p1.21.m1.4.4.1.3.2" stretchy="false" xref="S2.SS1.p1.21.m1.8.8.1.1.1.1.3.cmml">)</mo></mrow></msup></mrow><mo id="S2.SS1.p1.21.m1.8.8.1.1.1.3" stretchy="false" xref="S2.SS1.p1.21.m1.8.8.1.1.2.1.cmml">]</mo></mrow></mrow><mo id="S2.SS1.p1.21.m1.8.8.5" xref="S2.SS1.p1.21.m1.8.8.5.cmml">=</mo><mrow id="S2.SS1.p1.21.m1.8.8.6" xref="S2.SS1.p1.21.m1.8.8.6.cmml"><msub 
id="S2.SS1.p1.21.m1.8.8.6.2" xref="S2.SS1.p1.21.m1.8.8.6.2.cmml"><mi id="S2.SS1.p1.21.m1.8.8.6.2.2" xref="S2.SS1.p1.21.m1.8.8.6.2.2.cmml">𝔼</mi><mrow id="S2.SS1.p1.21.m1.6.6.2" xref="S2.SS1.p1.21.m1.6.6.2.cmml"><msup id="S2.SS1.p1.21.m1.6.6.2.4" xref="S2.SS1.p1.21.m1.6.6.2.4.cmml"><mi id="S2.SS1.p1.21.m1.6.6.2.4.2" xref="S2.SS1.p1.21.m1.6.6.2.4.2.cmml">𝒚</mi><mrow id="S2.SS1.p1.21.m1.5.5.1.1.1.3" xref="S2.SS1.p1.21.m1.6.6.2.4.cmml"><mo id="S2.SS1.p1.21.m1.5.5.1.1.1.3.1" stretchy="false" xref="S2.SS1.p1.21.m1.6.6.2.4.cmml">(</mo><mn id="S2.SS1.p1.21.m1.5.5.1.1.1.1" xref="S2.SS1.p1.21.m1.5.5.1.1.1.1.cmml">2</mn><mo id="S2.SS1.p1.21.m1.5.5.1.1.1.3.2" stretchy="false" xref="S2.SS1.p1.21.m1.6.6.2.4.cmml">)</mo></mrow></msup><mo fence="false" id="S2.SS1.p1.21.m1.6.6.2.3" xref="S2.SS1.p1.21.m1.6.6.2.3.cmml">|</mo><msup id="S2.SS1.p1.21.m1.6.6.2.5" xref="S2.SS1.p1.21.m1.6.6.2.5.cmml"><mi id="S2.SS1.p1.21.m1.6.6.2.5.2" xref="S2.SS1.p1.21.m1.6.6.2.5.2.cmml">𝒚</mi><mrow id="S2.SS1.p1.21.m1.6.6.2.2.1.3" xref="S2.SS1.p1.21.m1.6.6.2.5.cmml"><mo id="S2.SS1.p1.21.m1.6.6.2.2.1.3.1" stretchy="false" xref="S2.SS1.p1.21.m1.6.6.2.5.cmml">(</mo><mn id="S2.SS1.p1.21.m1.6.6.2.2.1.1" xref="S2.SS1.p1.21.m1.6.6.2.2.1.1.cmml">1</mn><mo id="S2.SS1.p1.21.m1.6.6.2.2.1.3.2" stretchy="false" xref="S2.SS1.p1.21.m1.6.6.2.5.cmml">)</mo></mrow></msup></mrow></msub><mo id="S2.SS1.p1.21.m1.8.8.6.1" xref="S2.SS1.p1.21.m1.8.8.6.1.cmml">⁢</mo><mrow id="S2.SS1.p1.21.m1.8.8.6.3.2" xref="S2.SS1.p1.21.m1.8.8.6.3.1.cmml"><mo id="S2.SS1.p1.21.m1.8.8.6.3.2.1" stretchy="false" xref="S2.SS1.p1.21.m1.8.8.6.3.1.1.cmml">[</mo><mi id="S2.SS1.p1.21.m1.7.7" xref="S2.SS1.p1.21.m1.7.7.cmml">𝒔</mi><mo id="S2.SS1.p1.21.m1.8.8.6.3.2.2" stretchy="false" xref="S2.SS1.p1.21.m1.8.8.6.3.1.1.cmml">]</mo></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p1.21.m1.8b"><apply id="S2.SS1.p1.21.m1.8.8.cmml" xref="S2.SS1.p1.21.m1.8.8"><and id="S2.SS1.p1.21.m1.8.8a.cmml" xref="S2.SS1.p1.21.m1.8.8"></and><apply 
id="S2.SS1.p1.21.m1.8.8b.cmml" xref="S2.SS1.p1.21.m1.8.8"><eq id="S2.SS1.p1.21.m1.8.8.4.cmml" xref="S2.SS1.p1.21.m1.8.8.4"></eq><apply id="S2.SS1.p1.21.m1.8.8.3.cmml" xref="S2.SS1.p1.21.m1.8.8.3"><csymbol cd="ambiguous" id="S2.SS1.p1.21.m1.8.8.3.1.cmml" xref="S2.SS1.p1.21.m1.8.8.3">superscript</csymbol><apply id="S2.SS1.p1.21.m1.8.8.3.2.cmml" xref="S2.SS1.p1.21.m1.8.8.3.2"><ci id="S2.SS1.p1.21.m1.8.8.3.2.1.cmml" xref="S2.SS1.p1.21.m1.8.8.3.2.1">^</ci><ci id="S2.SS1.p1.21.m1.8.8.3.2.2.cmml" xref="S2.SS1.p1.21.m1.8.8.3.2.2">𝒚</ci></apply><cn id="S2.SS1.p1.21.m1.1.1.1.1.cmml" type="integer" xref="S2.SS1.p1.21.m1.1.1.1.1">2</cn></apply><apply id="S2.SS1.p1.21.m1.8.8.1.cmml" xref="S2.SS1.p1.21.m1.8.8.1"><times id="S2.SS1.p1.21.m1.8.8.1.2.cmml" xref="S2.SS1.p1.21.m1.8.8.1.2"></times><apply id="S2.SS1.p1.21.m1.8.8.1.3.cmml" xref="S2.SS1.p1.21.m1.8.8.1.3"><csymbol cd="ambiguous" id="S2.SS1.p1.21.m1.8.8.1.3.1.cmml" xref="S2.SS1.p1.21.m1.8.8.1.3">subscript</csymbol><ci id="S2.SS1.p1.21.m1.8.8.1.3.2.cmml" xref="S2.SS1.p1.21.m1.8.8.1.3.2">𝔼</ci><apply id="S2.SS1.p1.21.m1.3.3.2.cmml" xref="S2.SS1.p1.21.m1.3.3.2"><csymbol cd="latexml" id="S2.SS1.p1.21.m1.3.3.2.3.cmml" xref="S2.SS1.p1.21.m1.3.3.2.3">conditional</csymbol><apply id="S2.SS1.p1.21.m1.3.3.2.4.cmml" xref="S2.SS1.p1.21.m1.3.3.2.4"><csymbol cd="ambiguous" id="S2.SS1.p1.21.m1.3.3.2.4.1.cmml" xref="S2.SS1.p1.21.m1.3.3.2.4">superscript</csymbol><ci id="S2.SS1.p1.21.m1.3.3.2.4.2.cmml" xref="S2.SS1.p1.21.m1.3.3.2.4.2">𝒚</ci><cn id="S2.SS1.p1.21.m1.2.2.1.1.1.1.cmml" type="integer" xref="S2.SS1.p1.21.m1.2.2.1.1.1.1">2</cn></apply><apply id="S2.SS1.p1.21.m1.3.3.2.5.cmml" xref="S2.SS1.p1.21.m1.3.3.2.5"><csymbol cd="ambiguous" id="S2.SS1.p1.21.m1.3.3.2.5.1.cmml" xref="S2.SS1.p1.21.m1.3.3.2.5">superscript</csymbol><ci id="S2.SS1.p1.21.m1.3.3.2.5.2.cmml" xref="S2.SS1.p1.21.m1.3.3.2.5.2">𝒚</ci><cn id="S2.SS1.p1.21.m1.3.3.2.2.1.1.cmml" type="integer" xref="S2.SS1.p1.21.m1.3.3.2.2.1.1">1</cn></apply></apply></apply><apply 
id="S2.SS1.p1.21.m1.8.8.1.1.2.cmml" xref="S2.SS1.p1.21.m1.8.8.1.1.1"><csymbol cd="latexml" id="S2.SS1.p1.21.m1.8.8.1.1.2.1.cmml" xref="S2.SS1.p1.21.m1.8.8.1.1.1.2">delimited-[]</csymbol><apply id="S2.SS1.p1.21.m1.8.8.1.1.1.1.cmml" xref="S2.SS1.p1.21.m1.8.8.1.1.1.1"><plus id="S2.SS1.p1.21.m1.8.8.1.1.1.1.1.cmml" xref="S2.SS1.p1.21.m1.8.8.1.1.1.1.1"></plus><ci id="S2.SS1.p1.21.m1.8.8.1.1.1.1.2.cmml" xref="S2.SS1.p1.21.m1.8.8.1.1.1.1.2">𝒔</ci><apply id="S2.SS1.p1.21.m1.8.8.1.1.1.1.3.cmml" xref="S2.SS1.p1.21.m1.8.8.1.1.1.1.3"><csymbol cd="ambiguous" id="S2.SS1.p1.21.m1.8.8.1.1.1.1.3.1.cmml" xref="S2.SS1.p1.21.m1.8.8.1.1.1.1.3">superscript</csymbol><ci id="S2.SS1.p1.21.m1.8.8.1.1.1.1.3.2.cmml" xref="S2.SS1.p1.21.m1.8.8.1.1.1.1.3.2">𝒏</ci><cn id="S2.SS1.p1.21.m1.4.4.1.1.cmml" type="integer" xref="S2.SS1.p1.21.m1.4.4.1.1">2</cn></apply></apply></apply></apply></apply><apply id="S2.SS1.p1.21.m1.8.8c.cmml" xref="S2.SS1.p1.21.m1.8.8"><eq id="S2.SS1.p1.21.m1.8.8.5.cmml" xref="S2.SS1.p1.21.m1.8.8.5"></eq><share href="https://arxiv.org/html/2503.14854v1#S2.SS1.p1.21.m1.8.8.1.cmml" id="S2.SS1.p1.21.m1.8.8d.cmml" xref="S2.SS1.p1.21.m1.8.8"></share><apply id="S2.SS1.p1.21.m1.8.8.6.cmml" xref="S2.SS1.p1.21.m1.8.8.6"><times id="S2.SS1.p1.21.m1.8.8.6.1.cmml" xref="S2.SS1.p1.21.m1.8.8.6.1"></times><apply id="S2.SS1.p1.21.m1.8.8.6.2.cmml" xref="S2.SS1.p1.21.m1.8.8.6.2"><csymbol cd="ambiguous" id="S2.SS1.p1.21.m1.8.8.6.2.1.cmml" xref="S2.SS1.p1.21.m1.8.8.6.2">subscript</csymbol><ci id="S2.SS1.p1.21.m1.8.8.6.2.2.cmml" xref="S2.SS1.p1.21.m1.8.8.6.2.2">𝔼</ci><apply id="S2.SS1.p1.21.m1.6.6.2.cmml" xref="S2.SS1.p1.21.m1.6.6.2"><csymbol cd="latexml" id="S2.SS1.p1.21.m1.6.6.2.3.cmml" xref="S2.SS1.p1.21.m1.6.6.2.3">conditional</csymbol><apply id="S2.SS1.p1.21.m1.6.6.2.4.cmml" xref="S2.SS1.p1.21.m1.6.6.2.4"><csymbol cd="ambiguous" id="S2.SS1.p1.21.m1.6.6.2.4.1.cmml" xref="S2.SS1.p1.21.m1.6.6.2.4">superscript</csymbol><ci id="S2.SS1.p1.21.m1.6.6.2.4.2.cmml" 
xref="S2.SS1.p1.21.m1.6.6.2.4.2">𝒚</ci><cn id="S2.SS1.p1.21.m1.5.5.1.1.1.1.cmml" type="integer" xref="S2.SS1.p1.21.m1.5.5.1.1.1.1">2</cn></apply><apply id="S2.SS1.p1.21.m1.6.6.2.5.cmml" xref="S2.SS1.p1.21.m1.6.6.2.5"><csymbol cd="ambiguous" id="S2.SS1.p1.21.m1.6.6.2.5.1.cmml" xref="S2.SS1.p1.21.m1.6.6.2.5">superscript</csymbol><ci id="S2.SS1.p1.21.m1.6.6.2.5.2.cmml" xref="S2.SS1.p1.21.m1.6.6.2.5.2">𝒚</ci><cn id="S2.SS1.p1.21.m1.6.6.2.2.1.1.cmml" type="integer" xref="S2.SS1.p1.21.m1.6.6.2.2.1.1">1</cn></apply></apply></apply><apply id="S2.SS1.p1.21.m1.8.8.6.3.1.cmml" xref="S2.SS1.p1.21.m1.8.8.6.3.2"><csymbol cd="latexml" id="S2.SS1.p1.21.m1.8.8.6.3.1.1.cmml" xref="S2.SS1.p1.21.m1.8.8.6.3.2.1">delimited-[]</csymbol><ci id="S2.SS1.p1.21.m1.7.7.cmml" xref="S2.SS1.p1.21.m1.7.7">𝒔</ci></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p1.21.m1.8c">\hat{\bm{y}}^{(2)}=\mathbb{E}_{\bm{y}^{(2)}|\bm{y}^{(1)}}[\bm{s}+\bm{n}^{(2)}]% =\mathbb{E}_{\bm{y}^{(2)}|\bm{y}^{(1)}}[\bm{s}]</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p1.21.m1.8d">over^ start_ARG bold_italic_y end_ARG start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT = blackboard_E start_POSTSUBSCRIPT bold_italic_y start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT | bold_italic_y start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT end_POSTSUBSCRIPT [ bold_italic_s + bold_italic_n start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT ] = blackboard_E start_POSTSUBSCRIPT bold_italic_y start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT | bold_italic_y start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT end_POSTSUBSCRIPT [ bold_italic_s ]</annotation></semantics></math>).</p> </div> <div class="ltx_para" id="S2.SS1.p2"> <p class="ltx_p" id="S2.SS1.p2.3">It has been theoretically and experimentally proven that Noise2Noise can achieve denoising training using only noisy signals. 
In the case of images, it is relatively easy to obtain pairs of noisy signals, <math alttext="(\bm{y}^{(1)},\bm{y}^{(2)})" class="ltx_Math" display="inline" id="S2.SS1.p2.1.m1.4"><semantics id="S2.SS1.p2.1.m1.4a"><mrow id="S2.SS1.p2.1.m1.4.4.2" xref="S2.SS1.p2.1.m1.4.4.3.cmml"><mo id="S2.SS1.p2.1.m1.4.4.2.3" stretchy="false" xref="S2.SS1.p2.1.m1.4.4.3.cmml">(</mo><msup id="S2.SS1.p2.1.m1.3.3.1.1" xref="S2.SS1.p2.1.m1.3.3.1.1.cmml"><mi id="S2.SS1.p2.1.m1.3.3.1.1.2" xref="S2.SS1.p2.1.m1.3.3.1.1.2.cmml">𝒚</mi><mrow id="S2.SS1.p2.1.m1.1.1.1.3" xref="S2.SS1.p2.1.m1.3.3.1.1.cmml"><mo id="S2.SS1.p2.1.m1.1.1.1.3.1" stretchy="false" xref="S2.SS1.p2.1.m1.3.3.1.1.cmml">(</mo><mn id="S2.SS1.p2.1.m1.1.1.1.1" xref="S2.SS1.p2.1.m1.1.1.1.1.cmml">1</mn><mo id="S2.SS1.p2.1.m1.1.1.1.3.2" stretchy="false" xref="S2.SS1.p2.1.m1.3.3.1.1.cmml">)</mo></mrow></msup><mo id="S2.SS1.p2.1.m1.4.4.2.4" xref="S2.SS1.p2.1.m1.4.4.3.cmml">,</mo><msup id="S2.SS1.p2.1.m1.4.4.2.2" xref="S2.SS1.p2.1.m1.4.4.2.2.cmml"><mi id="S2.SS1.p2.1.m1.4.4.2.2.2" xref="S2.SS1.p2.1.m1.4.4.2.2.2.cmml">𝒚</mi><mrow id="S2.SS1.p2.1.m1.2.2.1.3" xref="S2.SS1.p2.1.m1.4.4.2.2.cmml"><mo id="S2.SS1.p2.1.m1.2.2.1.3.1" stretchy="false" xref="S2.SS1.p2.1.m1.4.4.2.2.cmml">(</mo><mn id="S2.SS1.p2.1.m1.2.2.1.1" xref="S2.SS1.p2.1.m1.2.2.1.1.cmml">2</mn><mo id="S2.SS1.p2.1.m1.2.2.1.3.2" stretchy="false" xref="S2.SS1.p2.1.m1.4.4.2.2.cmml">)</mo></mrow></msup><mo id="S2.SS1.p2.1.m1.4.4.2.5" stretchy="false" xref="S2.SS1.p2.1.m1.4.4.3.cmml">)</mo></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.p2.1.m1.4b"><interval closure="open" id="S2.SS1.p2.1.m1.4.4.3.cmml" xref="S2.SS1.p2.1.m1.4.4.2"><apply id="S2.SS1.p2.1.m1.3.3.1.1.cmml" xref="S2.SS1.p2.1.m1.3.3.1.1"><csymbol cd="ambiguous" id="S2.SS1.p2.1.m1.3.3.1.1.1.cmml" xref="S2.SS1.p2.1.m1.3.3.1.1">superscript</csymbol><ci id="S2.SS1.p2.1.m1.3.3.1.1.2.cmml" xref="S2.SS1.p2.1.m1.3.3.1.1.2">𝒚</ci><cn id="S2.SS1.p2.1.m1.1.1.1.1.cmml" type="integer" 
xref="S2.SS1.p2.1.m1.1.1.1.1">1</cn></apply><apply id="S2.SS1.p2.1.m1.4.4.2.2.cmml" xref="S2.SS1.p2.1.m1.4.4.2.2"><csymbol cd="ambiguous" id="S2.SS1.p2.1.m1.4.4.2.2.1.cmml" xref="S2.SS1.p2.1.m1.4.4.2.2">superscript</csymbol><ci id="S2.SS1.p2.1.m1.4.4.2.2.2.cmml" xref="S2.SS1.p2.1.m1.4.4.2.2.2">𝒚</ci><cn id="S2.SS1.p2.1.m1.2.2.1.1.cmml" type="integer" xref="S2.SS1.p2.1.m1.2.2.1.1">2</cn></apply></interval></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p2.1.m1.4c">(\bm{y}^{(1)},\bm{y}^{(2)})</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p2.1.m1.4d">( bold_italic_y start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT , bold_italic_y start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT )</annotation></semantics></math>, that share the same clean image <math alttext="\bm{s}" class="ltx_Math" display="inline" id="S2.SS1.p2.2.m2.1"><semantics id="S2.SS1.p2.2.m2.1a"><mi id="S2.SS1.p2.2.m2.1.1" xref="S2.SS1.p2.2.m2.1.1.cmml">𝒔</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.p2.2.m2.1b"><ci id="S2.SS1.p2.2.m2.1.1.cmml" xref="S2.SS1.p2.2.m2.1.1">𝒔</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p2.2.m2.1c">\bm{s}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p2.2.m2.1d">bold_italic_s</annotation></semantics></math> by taking consecutive shots when the subject is static. In this setting, Noise2Noise is a useful technique. For audio, however, such pairs of noisy signals cannot be obtained naturally. 
Instead, we must synthesize them using a clean audio signal <math alttext="\bm{s}" class="ltx_Math" display="inline" id="S2.SS1.p2.3.m3.1"><semantics id="S2.SS1.p2.3.m3.1a"><mi id="S2.SS1.p2.3.m3.1.1" xref="S2.SS1.p2.3.m3.1.1.cmml">𝒔</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.p2.3.m3.1b"><ci id="S2.SS1.p2.3.m3.1.1.cmml" xref="S2.SS1.p2.3.m3.1.1">𝒔</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.p2.3.m3.1c">\bm{s}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.p2.3.m3.1d">bold_italic_s</annotation></semantics></math>. Because this synthesis still requires clean signals, Noise2Noise is not a fundamental solution for the audio TSE task <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">kashyap2021speech</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">alamdari2021improving</span>]</cite>.</p> </div> </section> <section class="ltx_subsection" id="S2.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">2.2 </span>MixIT</h3> <div class="ltx_para" id="S2.SS2.p1"> <p class="ltx_p" id="S2.SS2.p1.5">MixIT <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">Wisdom_2020</span>]</cite> is an unsupervised training method for sound source separation that can also be used for TSE (Fig. <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S1.F1" title="Figure 1 ‣ 1 Introduction ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">1</span></a>(c)). 
In MixIT, training is conducted using noisy signals <math alttext="\bm{x}=\bm{s}+\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S2.SS2.p1.1.m1.1"><semantics id="S2.SS2.p1.1.m1.1a"><mrow id="S2.SS2.p1.1.m1.1.1" xref="S2.SS2.p1.1.m1.1.1.cmml"><mi id="S2.SS2.p1.1.m1.1.1.2" xref="S2.SS2.p1.1.m1.1.1.2.cmml">𝒙</mi><mo id="S2.SS2.p1.1.m1.1.1.1" xref="S2.SS2.p1.1.m1.1.1.1.cmml">=</mo><mrow id="S2.SS2.p1.1.m1.1.1.3" xref="S2.SS2.p1.1.m1.1.1.3.cmml"><mi id="S2.SS2.p1.1.m1.1.1.3.2" xref="S2.SS2.p1.1.m1.1.1.3.2.cmml">𝒔</mi><mo id="S2.SS2.p1.1.m1.1.1.3.1" xref="S2.SS2.p1.1.m1.1.1.3.1.cmml">+</mo><msup id="S2.SS2.p1.1.m1.1.1.3.3" xref="S2.SS2.p1.1.m1.1.1.3.3.cmml"><mi id="S2.SS2.p1.1.m1.1.1.3.3.2" xref="S2.SS2.p1.1.m1.1.1.3.3.2.cmml">𝒏</mi><mi id="S2.SS2.p1.1.m1.1.1.3.3.3" xref="S2.SS2.p1.1.m1.1.1.3.3.3.cmml">obs</mi></msup></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.1.m1.1b"><apply id="S2.SS2.p1.1.m1.1.1.cmml" xref="S2.SS2.p1.1.m1.1.1"><eq id="S2.SS2.p1.1.m1.1.1.1.cmml" xref="S2.SS2.p1.1.m1.1.1.1"></eq><ci id="S2.SS2.p1.1.m1.1.1.2.cmml" xref="S2.SS2.p1.1.m1.1.1.2">𝒙</ci><apply id="S2.SS2.p1.1.m1.1.1.3.cmml" xref="S2.SS2.p1.1.m1.1.1.3"><plus id="S2.SS2.p1.1.m1.1.1.3.1.cmml" xref="S2.SS2.p1.1.m1.1.1.3.1"></plus><ci id="S2.SS2.p1.1.m1.1.1.3.2.cmml" xref="S2.SS2.p1.1.m1.1.1.3.2">𝒔</ci><apply id="S2.SS2.p1.1.m1.1.1.3.3.cmml" xref="S2.SS2.p1.1.m1.1.1.3.3"><csymbol cd="ambiguous" id="S2.SS2.p1.1.m1.1.1.3.3.1.cmml" xref="S2.SS2.p1.1.m1.1.1.3.3">superscript</csymbol><ci id="S2.SS2.p1.1.m1.1.1.3.3.2.cmml" xref="S2.SS2.p1.1.m1.1.1.3.3.2">𝒏</ci><ci id="S2.SS2.p1.1.m1.1.1.3.3.3.cmml" xref="S2.SS2.p1.1.m1.1.1.3.3.3">obs</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.1.m1.1c">\bm{x}=\bm{s}+\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.1.m1.1d">bold_italic_x = bold_italic_s + bold_italic_n start_POSTSUPERSCRIPT roman_obs 
end_POSTSUPERSCRIPT</annotation></semantics></math> and additional noise signals <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S2.SS2.p1.2.m2.1"><semantics id="S2.SS2.p1.2.m2.1a"><msup id="S2.SS2.p1.2.m2.1.1" xref="S2.SS2.p1.2.m2.1.1.cmml"><mi id="S2.SS2.p1.2.m2.1.1.2" xref="S2.SS2.p1.2.m2.1.1.2.cmml">𝒏</mi><mi id="S2.SS2.p1.2.m2.1.1.3" xref="S2.SS2.p1.2.m2.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.2.m2.1b"><apply id="S2.SS2.p1.2.m2.1.1.cmml" xref="S2.SS2.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.2.m2.1.1.1.cmml" xref="S2.SS2.p1.2.m2.1.1">superscript</csymbol><ci id="S2.SS2.p1.2.m2.1.1.2.cmml" xref="S2.SS2.p1.2.m2.1.1.2">𝒏</ci><ci id="S2.SS2.p1.2.m2.1.1.3.cmml" xref="S2.SS2.p1.2.m2.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.2.m2.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.2.m2.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math>, where <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S2.SS2.p1.3.m3.1"><semantics id="S2.SS2.p1.3.m3.1a"><msup id="S2.SS2.p1.3.m3.1.1" xref="S2.SS2.p1.3.m3.1.1.cmml"><mi id="S2.SS2.p1.3.m3.1.1.2" xref="S2.SS2.p1.3.m3.1.1.2.cmml">𝒏</mi><mi id="S2.SS2.p1.3.m3.1.1.3" xref="S2.SS2.p1.3.m3.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.3.m3.1b"><apply id="S2.SS2.p1.3.m3.1.1.cmml" xref="S2.SS2.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.3.m3.1.1.1.cmml" xref="S2.SS2.p1.3.m3.1.1">superscript</csymbol><ci id="S2.SS2.p1.3.m3.1.1.2.cmml" xref="S2.SS2.p1.3.m3.1.1.2">𝒏</ci><ci id="S2.SS2.p1.3.m3.1.1.3.cmml" xref="S2.SS2.p1.3.m3.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.3.m3.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.3.m3.1d">bold_italic_n start_POSTSUPERSCRIPT 
roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> represents the noise already present in <math alttext="\bm{x}" class="ltx_Math" display="inline" id="S2.SS2.p1.4.m4.1"><semantics id="S2.SS2.p1.4.m4.1a"><mi id="S2.SS2.p1.4.m4.1.1" xref="S2.SS2.p1.4.m4.1.1.cmml">𝒙</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.4.m4.1b"><ci id="S2.SS2.p1.4.m4.1.1.cmml" xref="S2.SS2.p1.4.m4.1.1">𝒙</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.4.m4.1c">\bm{x}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.4.m4.1d">bold_italic_x</annotation></semantics></math> at the time of observation. MixIT trains a DNN to minimize the following prediction error <math alttext="\mathcal{L}^{\rm MixIT}" class="ltx_Math" display="inline" id="S2.SS2.p1.5.m5.1"><semantics id="S2.SS2.p1.5.m5.1a"><msup id="S2.SS2.p1.5.m5.1.1" xref="S2.SS2.p1.5.m5.1.1.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.SS2.p1.5.m5.1.1.2" xref="S2.SS2.p1.5.m5.1.1.2.cmml">ℒ</mi><mi id="S2.SS2.p1.5.m5.1.1.3" xref="S2.SS2.p1.5.m5.1.1.3.cmml">MixIT</mi></msup><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.5.m5.1b"><apply id="S2.SS2.p1.5.m5.1.1.cmml" xref="S2.SS2.p1.5.m5.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.5.m5.1.1.1.cmml" xref="S2.SS2.p1.5.m5.1.1">superscript</csymbol><ci id="S2.SS2.p1.5.m5.1.1.2.cmml" xref="S2.SS2.p1.5.m5.1.1.2">ℒ</ci><ci id="S2.SS2.p1.5.m5.1.1.3.cmml" xref="S2.SS2.p1.5.m5.1.1.3">MixIT</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.5.m5.1c">\mathcal{L}^{\rm MixIT}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.5.m5.1d">caligraphic_L start_POSTSUPERSCRIPT roman_MixIT end_POSTSUPERSCRIPT</annotation></semantics></math>:</p> <table class="ltx_equationgroup ltx_eqn_align ltx_eqn_table" id="Sx2.EGx3"> <tbody id="S2.E6"><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_align_right 
ltx_eqn_cell"><math alttext="\displaystyle\mathcal{L}^{\rm MixIT}" class="ltx_Math" display="inline" id="S2.E6.m1.1"><semantics id="S2.E6.m1.1a"><msup id="S2.E6.m1.1.1" xref="S2.E6.m1.1.1.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.E6.m1.1.1.2" xref="S2.E6.m1.1.1.2.cmml">ℒ</mi><mi id="S2.E6.m1.1.1.3" xref="S2.E6.m1.1.1.3.cmml">MixIT</mi></msup><annotation-xml encoding="MathML-Content" id="S2.E6.m1.1b"><apply id="S2.E6.m1.1.1.cmml" xref="S2.E6.m1.1.1"><csymbol cd="ambiguous" id="S2.E6.m1.1.1.1.cmml" xref="S2.E6.m1.1.1">superscript</csymbol><ci id="S2.E6.m1.1.1.2.cmml" xref="S2.E6.m1.1.1.2">ℒ</ci><ci id="S2.E6.m1.1.1.3.cmml" xref="S2.E6.m1.1.1.3">MixIT</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E6.m1.1c">\displaystyle\mathcal{L}^{\rm MixIT}</annotation><annotation encoding="application/x-llamapun" id="S2.E6.m1.1d">caligraphic_L start_POSTSUPERSCRIPT roman_MixIT end_POSTSUPERSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_left ltx_eqn_cell"><math alttext="\displaystyle=\mathbb{E}_{(\bm{x},\,\bm{n}^{\rm{add}})\sim\mathcal{D}}\left[% \min(\mathcal{L}_{\rm MixIT1},\mathcal{L}_{\rm MixIT2})\right]," class="ltx_Math" display="inline" id="S2.E6.m2.4"><semantics id="S2.E6.m2.4a"><mrow id="S2.E6.m2.4.4.1" xref="S2.E6.m2.4.4.1.1.cmml"><mrow id="S2.E6.m2.4.4.1.1" xref="S2.E6.m2.4.4.1.1.cmml"><mi id="S2.E6.m2.4.4.1.1.3" xref="S2.E6.m2.4.4.1.1.3.cmml"></mi><mo id="S2.E6.m2.4.4.1.1.2" xref="S2.E6.m2.4.4.1.1.2.cmml">=</mo><mrow id="S2.E6.m2.4.4.1.1.1" xref="S2.E6.m2.4.4.1.1.1.cmml"><msub id="S2.E6.m2.4.4.1.1.1.3" xref="S2.E6.m2.4.4.1.1.1.3.cmml"><mi id="S2.E6.m2.4.4.1.1.1.3.2" xref="S2.E6.m2.4.4.1.1.1.3.2.cmml">𝔼</mi><mrow id="S2.E6.m2.2.2.2" xref="S2.E6.m2.2.2.2.cmml"><mrow id="S2.E6.m2.2.2.2.2.1" xref="S2.E6.m2.2.2.2.2.2.cmml"><mo id="S2.E6.m2.2.2.2.2.1.2" stretchy="false" xref="S2.E6.m2.2.2.2.2.2.cmml">(</mo><mi id="S2.E6.m2.1.1.1.1" xref="S2.E6.m2.1.1.1.1.cmml">𝒙</mi><mo id="S2.E6.m2.2.2.2.2.1.3" rspace="0.337em" 
xref="S2.E6.m2.2.2.2.2.2.cmml">,</mo><msup id="S2.E6.m2.2.2.2.2.1.1" xref="S2.E6.m2.2.2.2.2.1.1.cmml"><mi id="S2.E6.m2.2.2.2.2.1.1.2" xref="S2.E6.m2.2.2.2.2.1.1.2.cmml">𝒏</mi><mi id="S2.E6.m2.2.2.2.2.1.1.3" xref="S2.E6.m2.2.2.2.2.1.1.3.cmml">add</mi></msup><mo id="S2.E6.m2.2.2.2.2.1.4" stretchy="false" xref="S2.E6.m2.2.2.2.2.2.cmml">)</mo></mrow><mo id="S2.E6.m2.2.2.2.3" xref="S2.E6.m2.2.2.2.3.cmml">∼</mo><mi class="ltx_font_mathcaligraphic" id="S2.E6.m2.2.2.2.4" xref="S2.E6.m2.2.2.2.4.cmml">𝒟</mi></mrow></msub><mo id="S2.E6.m2.4.4.1.1.1.2" xref="S2.E6.m2.4.4.1.1.1.2.cmml">⁢</mo><mrow id="S2.E6.m2.4.4.1.1.1.1.1" xref="S2.E6.m2.4.4.1.1.1.1.2.cmml"><mo id="S2.E6.m2.4.4.1.1.1.1.1.2" xref="S2.E6.m2.4.4.1.1.1.1.2.1.cmml">[</mo><mrow id="S2.E6.m2.4.4.1.1.1.1.1.1.2" xref="S2.E6.m2.4.4.1.1.1.1.1.1.3.cmml"><mi id="S2.E6.m2.3.3" xref="S2.E6.m2.3.3.cmml">min</mi><mo id="S2.E6.m2.4.4.1.1.1.1.1.1.2a" xref="S2.E6.m2.4.4.1.1.1.1.1.1.3.cmml">⁡</mo><mrow id="S2.E6.m2.4.4.1.1.1.1.1.1.2.2" xref="S2.E6.m2.4.4.1.1.1.1.1.1.3.cmml"><mo id="S2.E6.m2.4.4.1.1.1.1.1.1.2.2.3" stretchy="false" xref="S2.E6.m2.4.4.1.1.1.1.1.1.3.cmml">(</mo><msub id="S2.E6.m2.4.4.1.1.1.1.1.1.1.1.1" xref="S2.E6.m2.4.4.1.1.1.1.1.1.1.1.1.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.E6.m2.4.4.1.1.1.1.1.1.1.1.1.2" xref="S2.E6.m2.4.4.1.1.1.1.1.1.1.1.1.2.cmml">ℒ</mi><mi id="S2.E6.m2.4.4.1.1.1.1.1.1.1.1.1.3" xref="S2.E6.m2.4.4.1.1.1.1.1.1.1.1.1.3.cmml">MixIT1</mi></msub><mo id="S2.E6.m2.4.4.1.1.1.1.1.1.2.2.4" xref="S2.E6.m2.4.4.1.1.1.1.1.1.3.cmml">,</mo><msub id="S2.E6.m2.4.4.1.1.1.1.1.1.2.2.2" xref="S2.E6.m2.4.4.1.1.1.1.1.1.2.2.2.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.E6.m2.4.4.1.1.1.1.1.1.2.2.2.2" xref="S2.E6.m2.4.4.1.1.1.1.1.1.2.2.2.2.cmml">ℒ</mi><mi id="S2.E6.m2.4.4.1.1.1.1.1.1.2.2.2.3" xref="S2.E6.m2.4.4.1.1.1.1.1.1.2.2.2.3.cmml">MixIT2</mi></msub><mo id="S2.E6.m2.4.4.1.1.1.1.1.1.2.2.5" stretchy="false" xref="S2.E6.m2.4.4.1.1.1.1.1.1.3.cmml">)</mo></mrow></mrow><mo id="S2.E6.m2.4.4.1.1.1.1.1.3" 
xref="S2.E6.m2.4.4.1.1.1.1.2.1.cmml">]</mo></mrow></mrow></mrow><mo id="S2.E6.m2.4.4.1.2" xref="S2.E6.m2.4.4.1.1.cmml">,</mo></mrow><annotation-xml encoding="MathML-Content" id="S2.E6.m2.4b"><apply id="S2.E6.m2.4.4.1.1.cmml" xref="S2.E6.m2.4.4.1"><eq id="S2.E6.m2.4.4.1.1.2.cmml" xref="S2.E6.m2.4.4.1.1.2"></eq><csymbol cd="latexml" id="S2.E6.m2.4.4.1.1.3.cmml" xref="S2.E6.m2.4.4.1.1.3">absent</csymbol><apply id="S2.E6.m2.4.4.1.1.1.cmml" xref="S2.E6.m2.4.4.1.1.1"><times id="S2.E6.m2.4.4.1.1.1.2.cmml" xref="S2.E6.m2.4.4.1.1.1.2"></times><apply id="S2.E6.m2.4.4.1.1.1.3.cmml" xref="S2.E6.m2.4.4.1.1.1.3"><csymbol cd="ambiguous" id="S2.E6.m2.4.4.1.1.1.3.1.cmml" xref="S2.E6.m2.4.4.1.1.1.3">subscript</csymbol><ci id="S2.E6.m2.4.4.1.1.1.3.2.cmml" xref="S2.E6.m2.4.4.1.1.1.3.2">𝔼</ci><apply id="S2.E6.m2.2.2.2.cmml" xref="S2.E6.m2.2.2.2"><csymbol cd="latexml" id="S2.E6.m2.2.2.2.3.cmml" xref="S2.E6.m2.2.2.2.3">similar-to</csymbol><interval closure="open" id="S2.E6.m2.2.2.2.2.2.cmml" xref="S2.E6.m2.2.2.2.2.1"><ci id="S2.E6.m2.1.1.1.1.cmml" xref="S2.E6.m2.1.1.1.1">𝒙</ci><apply id="S2.E6.m2.2.2.2.2.1.1.cmml" xref="S2.E6.m2.2.2.2.2.1.1"><csymbol cd="ambiguous" id="S2.E6.m2.2.2.2.2.1.1.1.cmml" xref="S2.E6.m2.2.2.2.2.1.1">superscript</csymbol><ci id="S2.E6.m2.2.2.2.2.1.1.2.cmml" xref="S2.E6.m2.2.2.2.2.1.1.2">𝒏</ci><ci id="S2.E6.m2.2.2.2.2.1.1.3.cmml" xref="S2.E6.m2.2.2.2.2.1.1.3">add</ci></apply></interval><ci id="S2.E6.m2.2.2.2.4.cmml" xref="S2.E6.m2.2.2.2.4">𝒟</ci></apply></apply><apply id="S2.E6.m2.4.4.1.1.1.1.2.cmml" xref="S2.E6.m2.4.4.1.1.1.1.1"><csymbol cd="latexml" id="S2.E6.m2.4.4.1.1.1.1.2.1.cmml" xref="S2.E6.m2.4.4.1.1.1.1.1.2">delimited-[]</csymbol><apply id="S2.E6.m2.4.4.1.1.1.1.1.1.3.cmml" xref="S2.E6.m2.4.4.1.1.1.1.1.1.2"><min id="S2.E6.m2.3.3.cmml" xref="S2.E6.m2.3.3"></min><apply id="S2.E6.m2.4.4.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E6.m2.4.4.1.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.E6.m2.4.4.1.1.1.1.1.1.1.1.1.1.cmml" 
xref="S2.E6.m2.4.4.1.1.1.1.1.1.1.1.1">subscript</csymbol><ci id="S2.E6.m2.4.4.1.1.1.1.1.1.1.1.1.2.cmml" xref="S2.E6.m2.4.4.1.1.1.1.1.1.1.1.1.2">ℒ</ci><ci id="S2.E6.m2.4.4.1.1.1.1.1.1.1.1.1.3.cmml" xref="S2.E6.m2.4.4.1.1.1.1.1.1.1.1.1.3">MixIT1</ci></apply><apply id="S2.E6.m2.4.4.1.1.1.1.1.1.2.2.2.cmml" xref="S2.E6.m2.4.4.1.1.1.1.1.1.2.2.2"><csymbol cd="ambiguous" id="S2.E6.m2.4.4.1.1.1.1.1.1.2.2.2.1.cmml" xref="S2.E6.m2.4.4.1.1.1.1.1.1.2.2.2">subscript</csymbol><ci id="S2.E6.m2.4.4.1.1.1.1.1.1.2.2.2.2.cmml" xref="S2.E6.m2.4.4.1.1.1.1.1.1.2.2.2.2">ℒ</ci><ci id="S2.E6.m2.4.4.1.1.1.1.1.1.2.2.2.3.cmml" xref="S2.E6.m2.4.4.1.1.1.1.1.1.2.2.2.3">MixIT2</ci></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E6.m2.4c">\displaystyle=\mathbb{E}_{(\bm{x},\,\bm{n}^{\rm{add}})\sim\mathcal{D}}\left[% \min(\mathcal{L}_{\rm MixIT1},\mathcal{L}_{\rm MixIT2})\right],</annotation><annotation encoding="application/x-llamapun" id="S2.E6.m2.4d">= blackboard_E start_POSTSUBSCRIPT ( bold_italic_x , bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT ) ∼ caligraphic_D end_POSTSUBSCRIPT [ roman_min ( caligraphic_L start_POSTSUBSCRIPT MixIT1 end_POSTSUBSCRIPT , caligraphic_L start_POSTSUBSCRIPT MixIT2 end_POSTSUBSCRIPT ) ] ,</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(6)</span></td> </tr></tbody> <tbody id="S2.E7"><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_align_right ltx_eqn_cell"><math alttext="\displaystyle\mathcal{L}_{\rm MixIT1}" class="ltx_Math" display="inline" id="S2.E7.m1.1"><semantics id="S2.E7.m1.1a"><msub id="S2.E7.m1.1.1" xref="S2.E7.m1.1.1.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.E7.m1.1.1.2" xref="S2.E7.m1.1.1.2.cmml">ℒ</mi><mi 
id="S2.E7.m1.1.1.3" xref="S2.E7.m1.1.1.3.cmml">MixIT1</mi></msub><annotation-xml encoding="MathML-Content" id="S2.E7.m1.1b"><apply id="S2.E7.m1.1.1.cmml" xref="S2.E7.m1.1.1"><csymbol cd="ambiguous" id="S2.E7.m1.1.1.1.cmml" xref="S2.E7.m1.1.1">subscript</csymbol><ci id="S2.E7.m1.1.1.2.cmml" xref="S2.E7.m1.1.1.2">ℒ</ci><ci id="S2.E7.m1.1.1.3.cmml" xref="S2.E7.m1.1.1.3">MixIT1</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E7.m1.1c">\displaystyle\mathcal{L}_{\rm MixIT1}</annotation><annotation encoding="application/x-llamapun" id="S2.E7.m1.1d">caligraphic_L start_POSTSUBSCRIPT MixIT1 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_left ltx_eqn_cell"><math alttext="\displaystyle=L(\bm{u}_{1}+\bm{u}_{2},\bm{x})+L(\bm{u}_{3},\bm{n}^{\rm add})," class="ltx_Math" display="inline" id="S2.E7.m2.2"><semantics id="S2.E7.m2.2a"><mrow id="S2.E7.m2.2.2.1" xref="S2.E7.m2.2.2.1.1.cmml"><mrow id="S2.E7.m2.2.2.1.1" xref="S2.E7.m2.2.2.1.1.cmml"><mi id="S2.E7.m2.2.2.1.1.5" xref="S2.E7.m2.2.2.1.1.5.cmml"></mi><mo id="S2.E7.m2.2.2.1.1.4" xref="S2.E7.m2.2.2.1.1.4.cmml">=</mo><mrow id="S2.E7.m2.2.2.1.1.3" xref="S2.E7.m2.2.2.1.1.3.cmml"><mrow id="S2.E7.m2.2.2.1.1.1.1" xref="S2.E7.m2.2.2.1.1.1.1.cmml"><mi id="S2.E7.m2.2.2.1.1.1.1.3" xref="S2.E7.m2.2.2.1.1.1.1.3.cmml">L</mi><mo id="S2.E7.m2.2.2.1.1.1.1.2" xref="S2.E7.m2.2.2.1.1.1.1.2.cmml">⁢</mo><mrow id="S2.E7.m2.2.2.1.1.1.1.1.1" xref="S2.E7.m2.2.2.1.1.1.1.1.2.cmml"><mo id="S2.E7.m2.2.2.1.1.1.1.1.1.2" stretchy="false" xref="S2.E7.m2.2.2.1.1.1.1.1.2.cmml">(</mo><mrow id="S2.E7.m2.2.2.1.1.1.1.1.1.1" xref="S2.E7.m2.2.2.1.1.1.1.1.1.1.cmml"><msub id="S2.E7.m2.2.2.1.1.1.1.1.1.1.2" xref="S2.E7.m2.2.2.1.1.1.1.1.1.1.2.cmml"><mi id="S2.E7.m2.2.2.1.1.1.1.1.1.1.2.2" xref="S2.E7.m2.2.2.1.1.1.1.1.1.1.2.2.cmml">𝒖</mi><mn id="S2.E7.m2.2.2.1.1.1.1.1.1.1.2.3" xref="S2.E7.m2.2.2.1.1.1.1.1.1.1.2.3.cmml">1</mn></msub><mo id="S2.E7.m2.2.2.1.1.1.1.1.1.1.1" 
xref="S2.E7.m2.2.2.1.1.1.1.1.1.1.1.cmml">+</mo><msub id="S2.E7.m2.2.2.1.1.1.1.1.1.1.3" xref="S2.E7.m2.2.2.1.1.1.1.1.1.1.3.cmml"><mi id="S2.E7.m2.2.2.1.1.1.1.1.1.1.3.2" xref="S2.E7.m2.2.2.1.1.1.1.1.1.1.3.2.cmml">𝒖</mi><mn id="S2.E7.m2.2.2.1.1.1.1.1.1.1.3.3" xref="S2.E7.m2.2.2.1.1.1.1.1.1.1.3.3.cmml">2</mn></msub></mrow><mo id="S2.E7.m2.2.2.1.1.1.1.1.1.3" xref="S2.E7.m2.2.2.1.1.1.1.1.2.cmml">,</mo><mi id="S2.E7.m2.1.1" xref="S2.E7.m2.1.1.cmml">𝒙</mi><mo id="S2.E7.m2.2.2.1.1.1.1.1.1.4" stretchy="false" xref="S2.E7.m2.2.2.1.1.1.1.1.2.cmml">)</mo></mrow></mrow><mo id="S2.E7.m2.2.2.1.1.3.4" xref="S2.E7.m2.2.2.1.1.3.4.cmml">+</mo><mrow id="S2.E7.m2.2.2.1.1.3.3" xref="S2.E7.m2.2.2.1.1.3.3.cmml"><mi id="S2.E7.m2.2.2.1.1.3.3.4" xref="S2.E7.m2.2.2.1.1.3.3.4.cmml">L</mi><mo id="S2.E7.m2.2.2.1.1.3.3.3" xref="S2.E7.m2.2.2.1.1.3.3.3.cmml">⁢</mo><mrow id="S2.E7.m2.2.2.1.1.3.3.2.2" xref="S2.E7.m2.2.2.1.1.3.3.2.3.cmml"><mo id="S2.E7.m2.2.2.1.1.3.3.2.2.3" stretchy="false" xref="S2.E7.m2.2.2.1.1.3.3.2.3.cmml">(</mo><msub id="S2.E7.m2.2.2.1.1.2.2.1.1.1" xref="S2.E7.m2.2.2.1.1.2.2.1.1.1.cmml"><mi id="S2.E7.m2.2.2.1.1.2.2.1.1.1.2" xref="S2.E7.m2.2.2.1.1.2.2.1.1.1.2.cmml">𝒖</mi><mn id="S2.E7.m2.2.2.1.1.2.2.1.1.1.3" xref="S2.E7.m2.2.2.1.1.2.2.1.1.1.3.cmml">3</mn></msub><mo id="S2.E7.m2.2.2.1.1.3.3.2.2.4" xref="S2.E7.m2.2.2.1.1.3.3.2.3.cmml">,</mo><msup id="S2.E7.m2.2.2.1.1.3.3.2.2.2" xref="S2.E7.m2.2.2.1.1.3.3.2.2.2.cmml"><mi id="S2.E7.m2.2.2.1.1.3.3.2.2.2.2" xref="S2.E7.m2.2.2.1.1.3.3.2.2.2.2.cmml">𝒏</mi><mi id="S2.E7.m2.2.2.1.1.3.3.2.2.2.3" xref="S2.E7.m2.2.2.1.1.3.3.2.2.2.3.cmml">add</mi></msup><mo id="S2.E7.m2.2.2.1.1.3.3.2.2.5" stretchy="false" xref="S2.E7.m2.2.2.1.1.3.3.2.3.cmml">)</mo></mrow></mrow></mrow></mrow><mo id="S2.E7.m2.2.2.1.2" xref="S2.E7.m2.2.2.1.1.cmml">,</mo></mrow><annotation-xml encoding="MathML-Content" id="S2.E7.m2.2b"><apply id="S2.E7.m2.2.2.1.1.cmml" xref="S2.E7.m2.2.2.1"><eq id="S2.E7.m2.2.2.1.1.4.cmml" xref="S2.E7.m2.2.2.1.1.4"></eq><csymbol cd="latexml" 
id="S2.E7.m2.2.2.1.1.5.cmml" xref="S2.E7.m2.2.2.1.1.5">absent</csymbol><apply id="S2.E7.m2.2.2.1.1.3.cmml" xref="S2.E7.m2.2.2.1.1.3"><plus id="S2.E7.m2.2.2.1.1.3.4.cmml" xref="S2.E7.m2.2.2.1.1.3.4"></plus><apply id="S2.E7.m2.2.2.1.1.1.1.cmml" xref="S2.E7.m2.2.2.1.1.1.1"><times id="S2.E7.m2.2.2.1.1.1.1.2.cmml" xref="S2.E7.m2.2.2.1.1.1.1.2"></times><ci id="S2.E7.m2.2.2.1.1.1.1.3.cmml" xref="S2.E7.m2.2.2.1.1.1.1.3">𝐿</ci><interval closure="open" id="S2.E7.m2.2.2.1.1.1.1.1.2.cmml" xref="S2.E7.m2.2.2.1.1.1.1.1.1"><apply id="S2.E7.m2.2.2.1.1.1.1.1.1.1.cmml" xref="S2.E7.m2.2.2.1.1.1.1.1.1.1"><plus id="S2.E7.m2.2.2.1.1.1.1.1.1.1.1.cmml" xref="S2.E7.m2.2.2.1.1.1.1.1.1.1.1"></plus><apply id="S2.E7.m2.2.2.1.1.1.1.1.1.1.2.cmml" xref="S2.E7.m2.2.2.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S2.E7.m2.2.2.1.1.1.1.1.1.1.2.1.cmml" xref="S2.E7.m2.2.2.1.1.1.1.1.1.1.2">subscript</csymbol><ci id="S2.E7.m2.2.2.1.1.1.1.1.1.1.2.2.cmml" xref="S2.E7.m2.2.2.1.1.1.1.1.1.1.2.2">𝒖</ci><cn id="S2.E7.m2.2.2.1.1.1.1.1.1.1.2.3.cmml" type="integer" xref="S2.E7.m2.2.2.1.1.1.1.1.1.1.2.3">1</cn></apply><apply id="S2.E7.m2.2.2.1.1.1.1.1.1.1.3.cmml" xref="S2.E7.m2.2.2.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S2.E7.m2.2.2.1.1.1.1.1.1.1.3.1.cmml" xref="S2.E7.m2.2.2.1.1.1.1.1.1.1.3">subscript</csymbol><ci id="S2.E7.m2.2.2.1.1.1.1.1.1.1.3.2.cmml" xref="S2.E7.m2.2.2.1.1.1.1.1.1.1.3.2">𝒖</ci><cn id="S2.E7.m2.2.2.1.1.1.1.1.1.1.3.3.cmml" type="integer" xref="S2.E7.m2.2.2.1.1.1.1.1.1.1.3.3">2</cn></apply></apply><ci id="S2.E7.m2.1.1.cmml" xref="S2.E7.m2.1.1">𝒙</ci></interval></apply><apply id="S2.E7.m2.2.2.1.1.3.3.cmml" xref="S2.E7.m2.2.2.1.1.3.3"><times id="S2.E7.m2.2.2.1.1.3.3.3.cmml" xref="S2.E7.m2.2.2.1.1.3.3.3"></times><ci id="S2.E7.m2.2.2.1.1.3.3.4.cmml" xref="S2.E7.m2.2.2.1.1.3.3.4">𝐿</ci><interval closure="open" id="S2.E7.m2.2.2.1.1.3.3.2.3.cmml" xref="S2.E7.m2.2.2.1.1.3.3.2.2"><apply id="S2.E7.m2.2.2.1.1.2.2.1.1.1.cmml" xref="S2.E7.m2.2.2.1.1.2.2.1.1.1"><csymbol cd="ambiguous" 
id="S2.E7.m2.2.2.1.1.2.2.1.1.1.1.cmml" xref="S2.E7.m2.2.2.1.1.2.2.1.1.1">subscript</csymbol><ci id="S2.E7.m2.2.2.1.1.2.2.1.1.1.2.cmml" xref="S2.E7.m2.2.2.1.1.2.2.1.1.1.2">𝒖</ci><cn id="S2.E7.m2.2.2.1.1.2.2.1.1.1.3.cmml" type="integer" xref="S2.E7.m2.2.2.1.1.2.2.1.1.1.3">3</cn></apply><apply id="S2.E7.m2.2.2.1.1.3.3.2.2.2.cmml" xref="S2.E7.m2.2.2.1.1.3.3.2.2.2"><csymbol cd="ambiguous" id="S2.E7.m2.2.2.1.1.3.3.2.2.2.1.cmml" xref="S2.E7.m2.2.2.1.1.3.3.2.2.2">superscript</csymbol><ci id="S2.E7.m2.2.2.1.1.3.3.2.2.2.2.cmml" xref="S2.E7.m2.2.2.1.1.3.3.2.2.2.2">𝒏</ci><ci id="S2.E7.m2.2.2.1.1.3.3.2.2.2.3.cmml" xref="S2.E7.m2.2.2.1.1.3.3.2.2.2.3">add</ci></apply></interval></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E7.m2.2c">\displaystyle=L(\bm{u}_{1}+\bm{u}_{2},\bm{x})+L(\bm{u}_{3},\bm{n}^{\rm add}),</annotation><annotation encoding="application/x-llamapun" id="S2.E7.m2.2d">= italic_L ( bold_italic_u start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + bold_italic_u start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , bold_italic_x ) + italic_L ( bold_italic_u start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT , bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT ) ,</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(7)</span></td> </tr></tbody> <tbody id="S2.E8"><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_align_right ltx_eqn_cell"><math alttext="\displaystyle\mathcal{L}_{\rm MixIT2}" class="ltx_Math" display="inline" id="S2.E8.m1.1"><semantics id="S2.E8.m1.1a"><msub id="S2.E8.m1.1.1" xref="S2.E8.m1.1.1.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.E8.m1.1.1.2" xref="S2.E8.m1.1.1.2.cmml">ℒ</mi><mi id="S2.E8.m1.1.1.3" xref="S2.E8.m1.1.1.3.cmml">MixIT2</mi></msub><annotation-xml 
encoding="MathML-Content" id="S2.E8.m1.1b"><apply id="S2.E8.m1.1.1.cmml" xref="S2.E8.m1.1.1"><csymbol cd="ambiguous" id="S2.E8.m1.1.1.1.cmml" xref="S2.E8.m1.1.1">subscript</csymbol><ci id="S2.E8.m1.1.1.2.cmml" xref="S2.E8.m1.1.1.2">ℒ</ci><ci id="S2.E8.m1.1.1.3.cmml" xref="S2.E8.m1.1.1.3">MixIT2</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E8.m1.1c">\displaystyle\mathcal{L}_{\rm MixIT2}</annotation><annotation encoding="application/x-llamapun" id="S2.E8.m1.1d">caligraphic_L start_POSTSUBSCRIPT MixIT2 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_left ltx_eqn_cell"><math alttext="\displaystyle=L(\bm{u}_{1}+\bm{u}_{3},\bm{x})+L(\bm{u}_{2},\bm{n}^{\rm add})." class="ltx_Math" display="inline" id="S2.E8.m2.2"><semantics id="S2.E8.m2.2a"><mrow id="S2.E8.m2.2.2.1" xref="S2.E8.m2.2.2.1.1.cmml"><mrow id="S2.E8.m2.2.2.1.1" xref="S2.E8.m2.2.2.1.1.cmml"><mi id="S2.E8.m2.2.2.1.1.5" xref="S2.E8.m2.2.2.1.1.5.cmml"></mi><mo id="S2.E8.m2.2.2.1.1.4" xref="S2.E8.m2.2.2.1.1.4.cmml">=</mo><mrow id="S2.E8.m2.2.2.1.1.3" xref="S2.E8.m2.2.2.1.1.3.cmml"><mrow id="S2.E8.m2.2.2.1.1.1.1" xref="S2.E8.m2.2.2.1.1.1.1.cmml"><mi id="S2.E8.m2.2.2.1.1.1.1.3" xref="S2.E8.m2.2.2.1.1.1.1.3.cmml">L</mi><mo id="S2.E8.m2.2.2.1.1.1.1.2" xref="S2.E8.m2.2.2.1.1.1.1.2.cmml">⁢</mo><mrow id="S2.E8.m2.2.2.1.1.1.1.1.1" xref="S2.E8.m2.2.2.1.1.1.1.1.2.cmml"><mo id="S2.E8.m2.2.2.1.1.1.1.1.1.2" stretchy="false" xref="S2.E8.m2.2.2.1.1.1.1.1.2.cmml">(</mo><mrow id="S2.E8.m2.2.2.1.1.1.1.1.1.1" xref="S2.E8.m2.2.2.1.1.1.1.1.1.1.cmml"><msub id="S2.E8.m2.2.2.1.1.1.1.1.1.1.2" xref="S2.E8.m2.2.2.1.1.1.1.1.1.1.2.cmml"><mi id="S2.E8.m2.2.2.1.1.1.1.1.1.1.2.2" xref="S2.E8.m2.2.2.1.1.1.1.1.1.1.2.2.cmml">𝒖</mi><mn id="S2.E8.m2.2.2.1.1.1.1.1.1.1.2.3" xref="S2.E8.m2.2.2.1.1.1.1.1.1.1.2.3.cmml">1</mn></msub><mo id="S2.E8.m2.2.2.1.1.1.1.1.1.1.1" xref="S2.E8.m2.2.2.1.1.1.1.1.1.1.1.cmml">+</mo><msub id="S2.E8.m2.2.2.1.1.1.1.1.1.1.3" 
xref="S2.E8.m2.2.2.1.1.1.1.1.1.1.3.cmml"><mi id="S2.E8.m2.2.2.1.1.1.1.1.1.1.3.2" xref="S2.E8.m2.2.2.1.1.1.1.1.1.1.3.2.cmml">𝒖</mi><mn id="S2.E8.m2.2.2.1.1.1.1.1.1.1.3.3" xref="S2.E8.m2.2.2.1.1.1.1.1.1.1.3.3.cmml">3</mn></msub></mrow><mo id="S2.E8.m2.2.2.1.1.1.1.1.1.3" xref="S2.E8.m2.2.2.1.1.1.1.1.2.cmml">,</mo><mi id="S2.E8.m2.1.1" xref="S2.E8.m2.1.1.cmml">𝒙</mi><mo id="S2.E8.m2.2.2.1.1.1.1.1.1.4" stretchy="false" xref="S2.E8.m2.2.2.1.1.1.1.1.2.cmml">)</mo></mrow></mrow><mo id="S2.E8.m2.2.2.1.1.3.4" xref="S2.E8.m2.2.2.1.1.3.4.cmml">+</mo><mrow id="S2.E8.m2.2.2.1.1.3.3" xref="S2.E8.m2.2.2.1.1.3.3.cmml"><mi id="S2.E8.m2.2.2.1.1.3.3.4" xref="S2.E8.m2.2.2.1.1.3.3.4.cmml">L</mi><mo id="S2.E8.m2.2.2.1.1.3.3.3" xref="S2.E8.m2.2.2.1.1.3.3.3.cmml">⁢</mo><mrow id="S2.E8.m2.2.2.1.1.3.3.2.2" xref="S2.E8.m2.2.2.1.1.3.3.2.3.cmml"><mo id="S2.E8.m2.2.2.1.1.3.3.2.2.3" stretchy="false" xref="S2.E8.m2.2.2.1.1.3.3.2.3.cmml">(</mo><msub id="S2.E8.m2.2.2.1.1.2.2.1.1.1" xref="S2.E8.m2.2.2.1.1.2.2.1.1.1.cmml"><mi id="S2.E8.m2.2.2.1.1.2.2.1.1.1.2" xref="S2.E8.m2.2.2.1.1.2.2.1.1.1.2.cmml">𝒖</mi><mn id="S2.E8.m2.2.2.1.1.2.2.1.1.1.3" xref="S2.E8.m2.2.2.1.1.2.2.1.1.1.3.cmml">2</mn></msub><mo id="S2.E8.m2.2.2.1.1.3.3.2.2.4" xref="S2.E8.m2.2.2.1.1.3.3.2.3.cmml">,</mo><msup id="S2.E8.m2.2.2.1.1.3.3.2.2.2" xref="S2.E8.m2.2.2.1.1.3.3.2.2.2.cmml"><mi id="S2.E8.m2.2.2.1.1.3.3.2.2.2.2" xref="S2.E8.m2.2.2.1.1.3.3.2.2.2.2.cmml">𝒏</mi><mi id="S2.E8.m2.2.2.1.1.3.3.2.2.2.3" xref="S2.E8.m2.2.2.1.1.3.3.2.2.2.3.cmml">add</mi></msup><mo id="S2.E8.m2.2.2.1.1.3.3.2.2.5" stretchy="false" xref="S2.E8.m2.2.2.1.1.3.3.2.3.cmml">)</mo></mrow></mrow></mrow></mrow><mo id="S2.E8.m2.2.2.1.2" lspace="0em" xref="S2.E8.m2.2.2.1.1.cmml">.</mo></mrow><annotation-xml encoding="MathML-Content" id="S2.E8.m2.2b"><apply id="S2.E8.m2.2.2.1.1.cmml" xref="S2.E8.m2.2.2.1"><eq id="S2.E8.m2.2.2.1.1.4.cmml" xref="S2.E8.m2.2.2.1.1.4"></eq><csymbol cd="latexml" id="S2.E8.m2.2.2.1.1.5.cmml" xref="S2.E8.m2.2.2.1.1.5">absent</csymbol><apply 
id="S2.E8.m2.2.2.1.1.3.cmml" xref="S2.E8.m2.2.2.1.1.3"><plus id="S2.E8.m2.2.2.1.1.3.4.cmml" xref="S2.E8.m2.2.2.1.1.3.4"></plus><apply id="S2.E8.m2.2.2.1.1.1.1.cmml" xref="S2.E8.m2.2.2.1.1.1.1"><times id="S2.E8.m2.2.2.1.1.1.1.2.cmml" xref="S2.E8.m2.2.2.1.1.1.1.2"></times><ci id="S2.E8.m2.2.2.1.1.1.1.3.cmml" xref="S2.E8.m2.2.2.1.1.1.1.3">𝐿</ci><interval closure="open" id="S2.E8.m2.2.2.1.1.1.1.1.2.cmml" xref="S2.E8.m2.2.2.1.1.1.1.1.1"><apply id="S2.E8.m2.2.2.1.1.1.1.1.1.1.cmml" xref="S2.E8.m2.2.2.1.1.1.1.1.1.1"><plus id="S2.E8.m2.2.2.1.1.1.1.1.1.1.1.cmml" xref="S2.E8.m2.2.2.1.1.1.1.1.1.1.1"></plus><apply id="S2.E8.m2.2.2.1.1.1.1.1.1.1.2.cmml" xref="S2.E8.m2.2.2.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S2.E8.m2.2.2.1.1.1.1.1.1.1.2.1.cmml" xref="S2.E8.m2.2.2.1.1.1.1.1.1.1.2">subscript</csymbol><ci id="S2.E8.m2.2.2.1.1.1.1.1.1.1.2.2.cmml" xref="S2.E8.m2.2.2.1.1.1.1.1.1.1.2.2">𝒖</ci><cn id="S2.E8.m2.2.2.1.1.1.1.1.1.1.2.3.cmml" type="integer" xref="S2.E8.m2.2.2.1.1.1.1.1.1.1.2.3">1</cn></apply><apply id="S2.E8.m2.2.2.1.1.1.1.1.1.1.3.cmml" xref="S2.E8.m2.2.2.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S2.E8.m2.2.2.1.1.1.1.1.1.1.3.1.cmml" xref="S2.E8.m2.2.2.1.1.1.1.1.1.1.3">subscript</csymbol><ci id="S2.E8.m2.2.2.1.1.1.1.1.1.1.3.2.cmml" xref="S2.E8.m2.2.2.1.1.1.1.1.1.1.3.2">𝒖</ci><cn id="S2.E8.m2.2.2.1.1.1.1.1.1.1.3.3.cmml" type="integer" xref="S2.E8.m2.2.2.1.1.1.1.1.1.1.3.3">3</cn></apply></apply><ci id="S2.E8.m2.1.1.cmml" xref="S2.E8.m2.1.1">𝒙</ci></interval></apply><apply id="S2.E8.m2.2.2.1.1.3.3.cmml" xref="S2.E8.m2.2.2.1.1.3.3"><times id="S2.E8.m2.2.2.1.1.3.3.3.cmml" xref="S2.E8.m2.2.2.1.1.3.3.3"></times><ci id="S2.E8.m2.2.2.1.1.3.3.4.cmml" xref="S2.E8.m2.2.2.1.1.3.3.4">𝐿</ci><interval closure="open" id="S2.E8.m2.2.2.1.1.3.3.2.3.cmml" xref="S2.E8.m2.2.2.1.1.3.3.2.2"><apply id="S2.E8.m2.2.2.1.1.2.2.1.1.1.cmml" xref="S2.E8.m2.2.2.1.1.2.2.1.1.1"><csymbol cd="ambiguous" id="S2.E8.m2.2.2.1.1.2.2.1.1.1.1.cmml" xref="S2.E8.m2.2.2.1.1.2.2.1.1.1">subscript</csymbol><ci 
id="S2.E8.m2.2.2.1.1.2.2.1.1.1.2.cmml" xref="S2.E8.m2.2.2.1.1.2.2.1.1.1.2">𝒖</ci><cn id="S2.E8.m2.2.2.1.1.2.2.1.1.1.3.cmml" type="integer" xref="S2.E8.m2.2.2.1.1.2.2.1.1.1.3">2</cn></apply><apply id="S2.E8.m2.2.2.1.1.3.3.2.2.2.cmml" xref="S2.E8.m2.2.2.1.1.3.3.2.2.2"><csymbol cd="ambiguous" id="S2.E8.m2.2.2.1.1.3.3.2.2.2.1.cmml" xref="S2.E8.m2.2.2.1.1.3.3.2.2.2">superscript</csymbol><ci id="S2.E8.m2.2.2.1.1.3.3.2.2.2.2.cmml" xref="S2.E8.m2.2.2.1.1.3.3.2.2.2.2">𝒏</ci><ci id="S2.E8.m2.2.2.1.1.3.3.2.2.2.3.cmml" xref="S2.E8.m2.2.2.1.1.3.3.2.2.2.3">add</ci></apply></interval></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E8.m2.2c">\displaystyle=L(\bm{u}_{1}+\bm{u}_{3},\bm{x})+L(\bm{u}_{2},\bm{n}^{\rm add}).</annotation><annotation encoding="application/x-llamapun" id="S2.E8.m2.2d">= italic_L ( bold_italic_u start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + bold_italic_u start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT , bold_italic_x ) + italic_L ( bold_italic_u start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT ) .</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(8)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S2.SS2.p1.26">Here, <math alttext="\bm{u}_{1}" class="ltx_Math" display="inline" id="S2.SS2.p1.6.m1.1"><semantics id="S2.SS2.p1.6.m1.1a"><msub id="S2.SS2.p1.6.m1.1.1" xref="S2.SS2.p1.6.m1.1.1.cmml"><mi id="S2.SS2.p1.6.m1.1.1.2" xref="S2.SS2.p1.6.m1.1.1.2.cmml">𝒖</mi><mn id="S2.SS2.p1.6.m1.1.1.3" xref="S2.SS2.p1.6.m1.1.1.3.cmml">1</mn></msub><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.6.m1.1b"><apply id="S2.SS2.p1.6.m1.1.1.cmml" xref="S2.SS2.p1.6.m1.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.6.m1.1.1.1.cmml" xref="S2.SS2.p1.6.m1.1.1">subscript</csymbol><ci 
id="S2.SS2.p1.6.m1.1.1.2.cmml" xref="S2.SS2.p1.6.m1.1.1.2">𝒖</ci><cn id="S2.SS2.p1.6.m1.1.1.3.cmml" type="integer" xref="S2.SS2.p1.6.m1.1.1.3">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.6.m1.1c">\bm{u}_{1}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.6.m1.1d">bold_italic_u start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT</annotation></semantics></math>, <math alttext="\bm{u}_{2}" class="ltx_Math" display="inline" id="S2.SS2.p1.7.m2.1"><semantics id="S2.SS2.p1.7.m2.1a"><msub id="S2.SS2.p1.7.m2.1.1" xref="S2.SS2.p1.7.m2.1.1.cmml"><mi id="S2.SS2.p1.7.m2.1.1.2" xref="S2.SS2.p1.7.m2.1.1.2.cmml">𝒖</mi><mn id="S2.SS2.p1.7.m2.1.1.3" xref="S2.SS2.p1.7.m2.1.1.3.cmml">2</mn></msub><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.7.m2.1b"><apply id="S2.SS2.p1.7.m2.1.1.cmml" xref="S2.SS2.p1.7.m2.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.7.m2.1.1.1.cmml" xref="S2.SS2.p1.7.m2.1.1">subscript</csymbol><ci id="S2.SS2.p1.7.m2.1.1.2.cmml" xref="S2.SS2.p1.7.m2.1.1.2">𝒖</ci><cn id="S2.SS2.p1.7.m2.1.1.3.cmml" type="integer" xref="S2.SS2.p1.7.m2.1.1.3">2</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.7.m2.1c">\bm{u}_{2}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.7.m2.1d">bold_italic_u start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT</annotation></semantics></math>, and <math alttext="\bm{u}_{3}" class="ltx_Math" display="inline" id="S2.SS2.p1.8.m3.1"><semantics id="S2.SS2.p1.8.m3.1a"><msub id="S2.SS2.p1.8.m3.1.1" xref="S2.SS2.p1.8.m3.1.1.cmml"><mi id="S2.SS2.p1.8.m3.1.1.2" xref="S2.SS2.p1.8.m3.1.1.2.cmml">𝒖</mi><mn id="S2.SS2.p1.8.m3.1.1.3" xref="S2.SS2.p1.8.m3.1.1.3.cmml">3</mn></msub><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.8.m3.1b"><apply id="S2.SS2.p1.8.m3.1.1.cmml" xref="S2.SS2.p1.8.m3.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.8.m3.1.1.1.cmml" xref="S2.SS2.p1.8.m3.1.1">subscript</csymbol><ci id="S2.SS2.p1.8.m3.1.1.2.cmml" 
xref="S2.SS2.p1.8.m3.1.1.2">𝒖</ci><cn id="S2.SS2.p1.8.m3.1.1.3.cmml" type="integer" xref="S2.SS2.p1.8.m3.1.1.3">3</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.8.m3.1c">\bm{u}_{3}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.8.m3.1d">bold_italic_u start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT</annotation></semantics></math> are the outputs of the DNN <math alttext="f(\bm{y};\theta)" class="ltx_Math" display="inline" id="S2.SS2.p1.9.m4.2"><semantics id="S2.SS2.p1.9.m4.2a"><mrow id="S2.SS2.p1.9.m4.2.3" xref="S2.SS2.p1.9.m4.2.3.cmml"><mi id="S2.SS2.p1.9.m4.2.3.2" xref="S2.SS2.p1.9.m4.2.3.2.cmml">f</mi><mo id="S2.SS2.p1.9.m4.2.3.1" xref="S2.SS2.p1.9.m4.2.3.1.cmml">⁢</mo><mrow id="S2.SS2.p1.9.m4.2.3.3.2" xref="S2.SS2.p1.9.m4.2.3.3.1.cmml"><mo id="S2.SS2.p1.9.m4.2.3.3.2.1" stretchy="false" xref="S2.SS2.p1.9.m4.2.3.3.1.cmml">(</mo><mi id="S2.SS2.p1.9.m4.1.1" xref="S2.SS2.p1.9.m4.1.1.cmml">𝒚</mi><mo id="S2.SS2.p1.9.m4.2.3.3.2.2" xref="S2.SS2.p1.9.m4.2.3.3.1.cmml">;</mo><mi id="S2.SS2.p1.9.m4.2.2" xref="S2.SS2.p1.9.m4.2.2.cmml">θ</mi><mo id="S2.SS2.p1.9.m4.2.3.3.2.3" stretchy="false" xref="S2.SS2.p1.9.m4.2.3.3.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.9.m4.2b"><apply id="S2.SS2.p1.9.m4.2.3.cmml" xref="S2.SS2.p1.9.m4.2.3"><times id="S2.SS2.p1.9.m4.2.3.1.cmml" xref="S2.SS2.p1.9.m4.2.3.1"></times><ci id="S2.SS2.p1.9.m4.2.3.2.cmml" xref="S2.SS2.p1.9.m4.2.3.2">𝑓</ci><list id="S2.SS2.p1.9.m4.2.3.3.1.cmml" xref="S2.SS2.p1.9.m4.2.3.3.2"><ci id="S2.SS2.p1.9.m4.1.1.cmml" xref="S2.SS2.p1.9.m4.1.1">𝒚</ci><ci id="S2.SS2.p1.9.m4.2.2.cmml" xref="S2.SS2.p1.9.m4.2.2">𝜃</ci></list></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.9.m4.2c">f(\bm{y};\theta)</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.9.m4.2d">italic_f ( bold_italic_y ; italic_θ )</annotation></semantics></math>, where <math alttext="\bm{y}=\bm{x}+\bm{n}^{\rm add}" 
class="ltx_Math" display="inline" id="S2.SS2.p1.10.m5.1"><semantics id="S2.SS2.p1.10.m5.1a"><mrow id="S2.SS2.p1.10.m5.1.1" xref="S2.SS2.p1.10.m5.1.1.cmml"><mi id="S2.SS2.p1.10.m5.1.1.2" xref="S2.SS2.p1.10.m5.1.1.2.cmml">𝒚</mi><mo id="S2.SS2.p1.10.m5.1.1.1" xref="S2.SS2.p1.10.m5.1.1.1.cmml">=</mo><mrow id="S2.SS2.p1.10.m5.1.1.3" xref="S2.SS2.p1.10.m5.1.1.3.cmml"><mi id="S2.SS2.p1.10.m5.1.1.3.2" xref="S2.SS2.p1.10.m5.1.1.3.2.cmml">𝒙</mi><mo id="S2.SS2.p1.10.m5.1.1.3.1" xref="S2.SS2.p1.10.m5.1.1.3.1.cmml">+</mo><msup id="S2.SS2.p1.10.m5.1.1.3.3" xref="S2.SS2.p1.10.m5.1.1.3.3.cmml"><mi id="S2.SS2.p1.10.m5.1.1.3.3.2" xref="S2.SS2.p1.10.m5.1.1.3.3.2.cmml">𝒏</mi><mi id="S2.SS2.p1.10.m5.1.1.3.3.3" xref="S2.SS2.p1.10.m5.1.1.3.3.3.cmml">add</mi></msup></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.10.m5.1b"><apply id="S2.SS2.p1.10.m5.1.1.cmml" xref="S2.SS2.p1.10.m5.1.1"><eq id="S2.SS2.p1.10.m5.1.1.1.cmml" xref="S2.SS2.p1.10.m5.1.1.1"></eq><ci id="S2.SS2.p1.10.m5.1.1.2.cmml" xref="S2.SS2.p1.10.m5.1.1.2">𝒚</ci><apply id="S2.SS2.p1.10.m5.1.1.3.cmml" xref="S2.SS2.p1.10.m5.1.1.3"><plus id="S2.SS2.p1.10.m5.1.1.3.1.cmml" xref="S2.SS2.p1.10.m5.1.1.3.1"></plus><ci id="S2.SS2.p1.10.m5.1.1.3.2.cmml" xref="S2.SS2.p1.10.m5.1.1.3.2">𝒙</ci><apply id="S2.SS2.p1.10.m5.1.1.3.3.cmml" xref="S2.SS2.p1.10.m5.1.1.3.3"><csymbol cd="ambiguous" id="S2.SS2.p1.10.m5.1.1.3.3.1.cmml" xref="S2.SS2.p1.10.m5.1.1.3.3">superscript</csymbol><ci id="S2.SS2.p1.10.m5.1.1.3.3.2.cmml" xref="S2.SS2.p1.10.m5.1.1.3.3.2">𝒏</ci><ci id="S2.SS2.p1.10.m5.1.1.3.3.3.cmml" xref="S2.SS2.p1.10.m5.1.1.3.3.3">add</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.10.m5.1c">\bm{y}=\bm{x}+\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.10.m5.1d">bold_italic_y = bold_italic_x + bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math>. 
MixIT can achieve TSE because <math alttext="(\bm{u}_{1},\bm{u}_{2},\bm{u}_{3})=(\bm{s},\bm{n}^{\rm obs},\bm{n}^{\rm add})" class="ltx_Math" display="inline" id="S2.SS2.p1.11.m6.6"><semantics id="S2.SS2.p1.11.m6.6a"><mrow id="S2.SS2.p1.11.m6.6.6" xref="S2.SS2.p1.11.m6.6.6.cmml"><mrow id="S2.SS2.p1.11.m6.4.4.3.3" xref="S2.SS2.p1.11.m6.4.4.3.4.cmml"><mo id="S2.SS2.p1.11.m6.4.4.3.3.4" stretchy="false" xref="S2.SS2.p1.11.m6.4.4.3.4.cmml">(</mo><msub id="S2.SS2.p1.11.m6.2.2.1.1.1" xref="S2.SS2.p1.11.m6.2.2.1.1.1.cmml"><mi id="S2.SS2.p1.11.m6.2.2.1.1.1.2" xref="S2.SS2.p1.11.m6.2.2.1.1.1.2.cmml">𝒖</mi><mn id="S2.SS2.p1.11.m6.2.2.1.1.1.3" xref="S2.SS2.p1.11.m6.2.2.1.1.1.3.cmml">1</mn></msub><mo id="S2.SS2.p1.11.m6.4.4.3.3.5" xref="S2.SS2.p1.11.m6.4.4.3.4.cmml">,</mo><msub id="S2.SS2.p1.11.m6.3.3.2.2.2" xref="S2.SS2.p1.11.m6.3.3.2.2.2.cmml"><mi id="S2.SS2.p1.11.m6.3.3.2.2.2.2" xref="S2.SS2.p1.11.m6.3.3.2.2.2.2.cmml">𝒖</mi><mn id="S2.SS2.p1.11.m6.3.3.2.2.2.3" xref="S2.SS2.p1.11.m6.3.3.2.2.2.3.cmml">2</mn></msub><mo id="S2.SS2.p1.11.m6.4.4.3.3.6" xref="S2.SS2.p1.11.m6.4.4.3.4.cmml">,</mo><msub id="S2.SS2.p1.11.m6.4.4.3.3.3" xref="S2.SS2.p1.11.m6.4.4.3.3.3.cmml"><mi id="S2.SS2.p1.11.m6.4.4.3.3.3.2" xref="S2.SS2.p1.11.m6.4.4.3.3.3.2.cmml">𝒖</mi><mn id="S2.SS2.p1.11.m6.4.4.3.3.3.3" xref="S2.SS2.p1.11.m6.4.4.3.3.3.3.cmml">3</mn></msub><mo id="S2.SS2.p1.11.m6.4.4.3.3.7" stretchy="false" xref="S2.SS2.p1.11.m6.4.4.3.4.cmml">)</mo></mrow><mo id="S2.SS2.p1.11.m6.6.6.6" xref="S2.SS2.p1.11.m6.6.6.6.cmml">=</mo><mrow id="S2.SS2.p1.11.m6.6.6.5.2" xref="S2.SS2.p1.11.m6.6.6.5.3.cmml"><mo id="S2.SS2.p1.11.m6.6.6.5.2.3" stretchy="false" xref="S2.SS2.p1.11.m6.6.6.5.3.cmml">(</mo><mi id="S2.SS2.p1.11.m6.1.1" xref="S2.SS2.p1.11.m6.1.1.cmml">𝒔</mi><mo id="S2.SS2.p1.11.m6.6.6.5.2.4" xref="S2.SS2.p1.11.m6.6.6.5.3.cmml">,</mo><msup id="S2.SS2.p1.11.m6.5.5.4.1.1" xref="S2.SS2.p1.11.m6.5.5.4.1.1.cmml"><mi id="S2.SS2.p1.11.m6.5.5.4.1.1.2" xref="S2.SS2.p1.11.m6.5.5.4.1.1.2.cmml">𝒏</mi><mi 
id="S2.SS2.p1.11.m6.5.5.4.1.1.3" xref="S2.SS2.p1.11.m6.5.5.4.1.1.3.cmml">obs</mi></msup><mo id="S2.SS2.p1.11.m6.6.6.5.2.5" xref="S2.SS2.p1.11.m6.6.6.5.3.cmml">,</mo><msup id="S2.SS2.p1.11.m6.6.6.5.2.2" xref="S2.SS2.p1.11.m6.6.6.5.2.2.cmml"><mi id="S2.SS2.p1.11.m6.6.6.5.2.2.2" xref="S2.SS2.p1.11.m6.6.6.5.2.2.2.cmml">𝒏</mi><mi id="S2.SS2.p1.11.m6.6.6.5.2.2.3" xref="S2.SS2.p1.11.m6.6.6.5.2.2.3.cmml">add</mi></msup><mo id="S2.SS2.p1.11.m6.6.6.5.2.6" stretchy="false" xref="S2.SS2.p1.11.m6.6.6.5.3.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.11.m6.6b"><apply id="S2.SS2.p1.11.m6.6.6.cmml" xref="S2.SS2.p1.11.m6.6.6"><eq id="S2.SS2.p1.11.m6.6.6.6.cmml" xref="S2.SS2.p1.11.m6.6.6.6"></eq><vector id="S2.SS2.p1.11.m6.4.4.3.4.cmml" xref="S2.SS2.p1.11.m6.4.4.3.3"><apply id="S2.SS2.p1.11.m6.2.2.1.1.1.cmml" xref="S2.SS2.p1.11.m6.2.2.1.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.11.m6.2.2.1.1.1.1.cmml" xref="S2.SS2.p1.11.m6.2.2.1.1.1">subscript</csymbol><ci id="S2.SS2.p1.11.m6.2.2.1.1.1.2.cmml" xref="S2.SS2.p1.11.m6.2.2.1.1.1.2">𝒖</ci><cn id="S2.SS2.p1.11.m6.2.2.1.1.1.3.cmml" type="integer" xref="S2.SS2.p1.11.m6.2.2.1.1.1.3">1</cn></apply><apply id="S2.SS2.p1.11.m6.3.3.2.2.2.cmml" xref="S2.SS2.p1.11.m6.3.3.2.2.2"><csymbol cd="ambiguous" id="S2.SS2.p1.11.m6.3.3.2.2.2.1.cmml" xref="S2.SS2.p1.11.m6.3.3.2.2.2">subscript</csymbol><ci id="S2.SS2.p1.11.m6.3.3.2.2.2.2.cmml" xref="S2.SS2.p1.11.m6.3.3.2.2.2.2">𝒖</ci><cn id="S2.SS2.p1.11.m6.3.3.2.2.2.3.cmml" type="integer" xref="S2.SS2.p1.11.m6.3.3.2.2.2.3">2</cn></apply><apply id="S2.SS2.p1.11.m6.4.4.3.3.3.cmml" xref="S2.SS2.p1.11.m6.4.4.3.3.3"><csymbol cd="ambiguous" id="S2.SS2.p1.11.m6.4.4.3.3.3.1.cmml" xref="S2.SS2.p1.11.m6.4.4.3.3.3">subscript</csymbol><ci id="S2.SS2.p1.11.m6.4.4.3.3.3.2.cmml" xref="S2.SS2.p1.11.m6.4.4.3.3.3.2">𝒖</ci><cn id="S2.SS2.p1.11.m6.4.4.3.3.3.3.cmml" type="integer" xref="S2.SS2.p1.11.m6.4.4.3.3.3.3">3</cn></apply></vector><vector id="S2.SS2.p1.11.m6.6.6.5.3.cmml" 
xref="S2.SS2.p1.11.m6.6.6.5.2"><ci id="S2.SS2.p1.11.m6.1.1.cmml" xref="S2.SS2.p1.11.m6.1.1">𝒔</ci><apply id="S2.SS2.p1.11.m6.5.5.4.1.1.cmml" xref="S2.SS2.p1.11.m6.5.5.4.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.11.m6.5.5.4.1.1.1.cmml" xref="S2.SS2.p1.11.m6.5.5.4.1.1">superscript</csymbol><ci id="S2.SS2.p1.11.m6.5.5.4.1.1.2.cmml" xref="S2.SS2.p1.11.m6.5.5.4.1.1.2">𝒏</ci><ci id="S2.SS2.p1.11.m6.5.5.4.1.1.3.cmml" xref="S2.SS2.p1.11.m6.5.5.4.1.1.3">obs</ci></apply><apply id="S2.SS2.p1.11.m6.6.6.5.2.2.cmml" xref="S2.SS2.p1.11.m6.6.6.5.2.2"><csymbol cd="ambiguous" id="S2.SS2.p1.11.m6.6.6.5.2.2.1.cmml" xref="S2.SS2.p1.11.m6.6.6.5.2.2">superscript</csymbol><ci id="S2.SS2.p1.11.m6.6.6.5.2.2.2.cmml" xref="S2.SS2.p1.11.m6.6.6.5.2.2.2">𝒏</ci><ci id="S2.SS2.p1.11.m6.6.6.5.2.2.3.cmml" xref="S2.SS2.p1.11.m6.6.6.5.2.2.3">add</ci></apply></vector></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.11.m6.6c">(\bm{u}_{1},\bm{u}_{2},\bm{u}_{3})=(\bm{s},\bm{n}^{\rm obs},\bm{n}^{\rm add})</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.11.m6.6d">( bold_italic_u start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , bold_italic_u start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , bold_italic_u start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT ) = ( bold_italic_s , bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT , bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT )</annotation></semantics></math> or <math alttext="(\bm{s},\bm{n}^{\rm add},\bm{n}^{\rm obs})" class="ltx_Math" display="inline" id="S2.SS2.p1.12.m7.3"><semantics id="S2.SS2.p1.12.m7.3a"><mrow id="S2.SS2.p1.12.m7.3.3.2" xref="S2.SS2.p1.12.m7.3.3.3.cmml"><mo id="S2.SS2.p1.12.m7.3.3.2.3" stretchy="false" xref="S2.SS2.p1.12.m7.3.3.3.cmml">(</mo><mi id="S2.SS2.p1.12.m7.1.1" xref="S2.SS2.p1.12.m7.1.1.cmml">𝒔</mi><mo id="S2.SS2.p1.12.m7.3.3.2.4" xref="S2.SS2.p1.12.m7.3.3.3.cmml">,</mo><msup id="S2.SS2.p1.12.m7.2.2.1.1" xref="S2.SS2.p1.12.m7.2.2.1.1.cmml"><mi 
id="S2.SS2.p1.12.m7.2.2.1.1.2" xref="S2.SS2.p1.12.m7.2.2.1.1.2.cmml">𝒏</mi><mi id="S2.SS2.p1.12.m7.2.2.1.1.3" xref="S2.SS2.p1.12.m7.2.2.1.1.3.cmml">add</mi></msup><mo id="S2.SS2.p1.12.m7.3.3.2.5" xref="S2.SS2.p1.12.m7.3.3.3.cmml">,</mo><msup id="S2.SS2.p1.12.m7.3.3.2.2" xref="S2.SS2.p1.12.m7.3.3.2.2.cmml"><mi id="S2.SS2.p1.12.m7.3.3.2.2.2" xref="S2.SS2.p1.12.m7.3.3.2.2.2.cmml">𝒏</mi><mi id="S2.SS2.p1.12.m7.3.3.2.2.3" xref="S2.SS2.p1.12.m7.3.3.2.2.3.cmml">obs</mi></msup><mo id="S2.SS2.p1.12.m7.3.3.2.6" stretchy="false" xref="S2.SS2.p1.12.m7.3.3.3.cmml">)</mo></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.12.m7.3b"><vector id="S2.SS2.p1.12.m7.3.3.3.cmml" xref="S2.SS2.p1.12.m7.3.3.2"><ci id="S2.SS2.p1.12.m7.1.1.cmml" xref="S2.SS2.p1.12.m7.1.1">𝒔</ci><apply id="S2.SS2.p1.12.m7.2.2.1.1.cmml" xref="S2.SS2.p1.12.m7.2.2.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.12.m7.2.2.1.1.1.cmml" xref="S2.SS2.p1.12.m7.2.2.1.1">superscript</csymbol><ci id="S2.SS2.p1.12.m7.2.2.1.1.2.cmml" xref="S2.SS2.p1.12.m7.2.2.1.1.2">𝒏</ci><ci id="S2.SS2.p1.12.m7.2.2.1.1.3.cmml" xref="S2.SS2.p1.12.m7.2.2.1.1.3">add</ci></apply><apply id="S2.SS2.p1.12.m7.3.3.2.2.cmml" xref="S2.SS2.p1.12.m7.3.3.2.2"><csymbol cd="ambiguous" id="S2.SS2.p1.12.m7.3.3.2.2.1.cmml" xref="S2.SS2.p1.12.m7.3.3.2.2">superscript</csymbol><ci id="S2.SS2.p1.12.m7.3.3.2.2.2.cmml" xref="S2.SS2.p1.12.m7.3.3.2.2.2">𝒏</ci><ci id="S2.SS2.p1.12.m7.3.3.2.2.3.cmml" xref="S2.SS2.p1.12.m7.3.3.2.2.3">obs</ci></apply></vector></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.12.m7.3c">(\bm{s},\bm{n}^{\rm add},\bm{n}^{\rm obs})</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.12.m7.3d">( bold_italic_s , bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT , bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT )</annotation></semantics></math> is the optimal solution for <math alttext="\mathcal{L}^{\rm MixIT}" class="ltx_Math" display="inline" 
id="S2.SS2.p1.13.m8.1"><semantics id="S2.SS2.p1.13.m8.1a"><msup id="S2.SS2.p1.13.m8.1.1" xref="S2.SS2.p1.13.m8.1.1.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.SS2.p1.13.m8.1.1.2" xref="S2.SS2.p1.13.m8.1.1.2.cmml">ℒ</mi><mi id="S2.SS2.p1.13.m8.1.1.3" xref="S2.SS2.p1.13.m8.1.1.3.cmml">MixIT</mi></msup><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.13.m8.1b"><apply id="S2.SS2.p1.13.m8.1.1.cmml" xref="S2.SS2.p1.13.m8.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.13.m8.1.1.1.cmml" xref="S2.SS2.p1.13.m8.1.1">superscript</csymbol><ci id="S2.SS2.p1.13.m8.1.1.2.cmml" xref="S2.SS2.p1.13.m8.1.1.2">ℒ</ci><ci id="S2.SS2.p1.13.m8.1.1.3.cmml" xref="S2.SS2.p1.13.m8.1.1.3">MixIT</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.13.m8.1c">\mathcal{L}^{\rm MixIT}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.13.m8.1d">caligraphic_L start_POSTSUPERSCRIPT roman_MixIT end_POSTSUPERSCRIPT</annotation></semantics></math>. Although <math alttext="\mathcal{L}^{\rm MixIT}" class="ltx_Math" display="inline" id="S2.SS2.p1.14.m9.1"><semantics id="S2.SS2.p1.14.m9.1a"><msup id="S2.SS2.p1.14.m9.1.1" xref="S2.SS2.p1.14.m9.1.1.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.SS2.p1.14.m9.1.1.2" xref="S2.SS2.p1.14.m9.1.1.2.cmml">ℒ</mi><mi id="S2.SS2.p1.14.m9.1.1.3" xref="S2.SS2.p1.14.m9.1.1.3.cmml">MixIT</mi></msup><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.14.m9.1b"><apply id="S2.SS2.p1.14.m9.1.1.cmml" xref="S2.SS2.p1.14.m9.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.14.m9.1.1.1.cmml" xref="S2.SS2.p1.14.m9.1.1">superscript</csymbol><ci id="S2.SS2.p1.14.m9.1.1.2.cmml" xref="S2.SS2.p1.14.m9.1.1.2">ℒ</ci><ci id="S2.SS2.p1.14.m9.1.1.3.cmml" xref="S2.SS2.p1.14.m9.1.1.3">MixIT</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.14.m9.1c">\mathcal{L}^{\rm MixIT}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.14.m9.1d">caligraphic_L start_POSTSUPERSCRIPT roman_MixIT 
end_POSTSUPERSCRIPT</annotation></semantics></math> can also be minimized by outputting a noisy signal as <math alttext="\bm{u}_{1}" class="ltx_Math" display="inline" id="S2.SS2.p1.15.m10.1"><semantics id="S2.SS2.p1.15.m10.1a"><msub id="S2.SS2.p1.15.m10.1.1" xref="S2.SS2.p1.15.m10.1.1.cmml"><mi id="S2.SS2.p1.15.m10.1.1.2" xref="S2.SS2.p1.15.m10.1.1.2.cmml">𝒖</mi><mn id="S2.SS2.p1.15.m10.1.1.3" xref="S2.SS2.p1.15.m10.1.1.3.cmml">1</mn></msub><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.15.m10.1b"><apply id="S2.SS2.p1.15.m10.1.1.cmml" xref="S2.SS2.p1.15.m10.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.15.m10.1.1.1.cmml" xref="S2.SS2.p1.15.m10.1.1">subscript</csymbol><ci id="S2.SS2.p1.15.m10.1.1.2.cmml" xref="S2.SS2.p1.15.m10.1.1.2">𝒖</ci><cn id="S2.SS2.p1.15.m10.1.1.3.cmml" type="integer" xref="S2.SS2.p1.15.m10.1.1.3">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.15.m10.1c">\bm{u}_{1}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.15.m10.1d">bold_italic_u start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT</annotation></semantics></math> (i.e., <math alttext="(\bm{u}_{1},\bm{u}_{2},\bm{u}_{3})=(\bm{x},\bm{0},\bm{n}^{\rm add})" class="ltx_Math" display="inline" id="S2.SS2.p1.16.m11.6"><semantics id="S2.SS2.p1.16.m11.6a"><mrow id="S2.SS2.p1.16.m11.6.6" xref="S2.SS2.p1.16.m11.6.6.cmml"><mrow id="S2.SS2.p1.16.m11.5.5.3.3" xref="S2.SS2.p1.16.m11.5.5.3.4.cmml"><mo id="S2.SS2.p1.16.m11.5.5.3.3.4" stretchy="false" xref="S2.SS2.p1.16.m11.5.5.3.4.cmml">(</mo><msub id="S2.SS2.p1.16.m11.3.3.1.1.1" xref="S2.SS2.p1.16.m11.3.3.1.1.1.cmml"><mi id="S2.SS2.p1.16.m11.3.3.1.1.1.2" xref="S2.SS2.p1.16.m11.3.3.1.1.1.2.cmml">𝒖</mi><mn id="S2.SS2.p1.16.m11.3.3.1.1.1.3" xref="S2.SS2.p1.16.m11.3.3.1.1.1.3.cmml">1</mn></msub><mo id="S2.SS2.p1.16.m11.5.5.3.3.5" xref="S2.SS2.p1.16.m11.5.5.3.4.cmml">,</mo><msub id="S2.SS2.p1.16.m11.4.4.2.2.2" xref="S2.SS2.p1.16.m11.4.4.2.2.2.cmml"><mi id="S2.SS2.p1.16.m11.4.4.2.2.2.2" 
xref="S2.SS2.p1.16.m11.4.4.2.2.2.2.cmml">𝒖</mi><mn id="S2.SS2.p1.16.m11.4.4.2.2.2.3" xref="S2.SS2.p1.16.m11.4.4.2.2.2.3.cmml">2</mn></msub><mo id="S2.SS2.p1.16.m11.5.5.3.3.6" xref="S2.SS2.p1.16.m11.5.5.3.4.cmml">,</mo><msub id="S2.SS2.p1.16.m11.5.5.3.3.3" xref="S2.SS2.p1.16.m11.5.5.3.3.3.cmml"><mi id="S2.SS2.p1.16.m11.5.5.3.3.3.2" xref="S2.SS2.p1.16.m11.5.5.3.3.3.2.cmml">𝒖</mi><mn id="S2.SS2.p1.16.m11.5.5.3.3.3.3" xref="S2.SS2.p1.16.m11.5.5.3.3.3.3.cmml">3</mn></msub><mo id="S2.SS2.p1.16.m11.5.5.3.3.7" stretchy="false" xref="S2.SS2.p1.16.m11.5.5.3.4.cmml">)</mo></mrow><mo id="S2.SS2.p1.16.m11.6.6.5" xref="S2.SS2.p1.16.m11.6.6.5.cmml">=</mo><mrow id="S2.SS2.p1.16.m11.6.6.4.1" xref="S2.SS2.p1.16.m11.6.6.4.2.cmml"><mo id="S2.SS2.p1.16.m11.6.6.4.1.2" stretchy="false" xref="S2.SS2.p1.16.m11.6.6.4.2.cmml">(</mo><mi id="S2.SS2.p1.16.m11.1.1" xref="S2.SS2.p1.16.m11.1.1.cmml">𝒙</mi><mo id="S2.SS2.p1.16.m11.6.6.4.1.3" xref="S2.SS2.p1.16.m11.6.6.4.2.cmml">,</mo><mn id="S2.SS2.p1.16.m11.2.2" xref="S2.SS2.p1.16.m11.2.2.cmml">𝟎</mn><mo id="S2.SS2.p1.16.m11.6.6.4.1.4" xref="S2.SS2.p1.16.m11.6.6.4.2.cmml">,</mo><msup id="S2.SS2.p1.16.m11.6.6.4.1.1" xref="S2.SS2.p1.16.m11.6.6.4.1.1.cmml"><mi id="S2.SS2.p1.16.m11.6.6.4.1.1.2" xref="S2.SS2.p1.16.m11.6.6.4.1.1.2.cmml">𝒏</mi><mi id="S2.SS2.p1.16.m11.6.6.4.1.1.3" xref="S2.SS2.p1.16.m11.6.6.4.1.1.3.cmml">add</mi></msup><mo id="S2.SS2.p1.16.m11.6.6.4.1.5" stretchy="false" xref="S2.SS2.p1.16.m11.6.6.4.2.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.16.m11.6b"><apply id="S2.SS2.p1.16.m11.6.6.cmml" xref="S2.SS2.p1.16.m11.6.6"><eq id="S2.SS2.p1.16.m11.6.6.5.cmml" xref="S2.SS2.p1.16.m11.6.6.5"></eq><vector id="S2.SS2.p1.16.m11.5.5.3.4.cmml" xref="S2.SS2.p1.16.m11.5.5.3.3"><apply id="S2.SS2.p1.16.m11.3.3.1.1.1.cmml" xref="S2.SS2.p1.16.m11.3.3.1.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.16.m11.3.3.1.1.1.1.cmml" xref="S2.SS2.p1.16.m11.3.3.1.1.1">subscript</csymbol><ci id="S2.SS2.p1.16.m11.3.3.1.1.1.2.cmml" 
xref="S2.SS2.p1.16.m11.3.3.1.1.1.2">𝒖</ci><cn id="S2.SS2.p1.16.m11.3.3.1.1.1.3.cmml" type="integer" xref="S2.SS2.p1.16.m11.3.3.1.1.1.3">1</cn></apply><apply id="S2.SS2.p1.16.m11.4.4.2.2.2.cmml" xref="S2.SS2.p1.16.m11.4.4.2.2.2"><csymbol cd="ambiguous" id="S2.SS2.p1.16.m11.4.4.2.2.2.1.cmml" xref="S2.SS2.p1.16.m11.4.4.2.2.2">subscript</csymbol><ci id="S2.SS2.p1.16.m11.4.4.2.2.2.2.cmml" xref="S2.SS2.p1.16.m11.4.4.2.2.2.2">𝒖</ci><cn id="S2.SS2.p1.16.m11.4.4.2.2.2.3.cmml" type="integer" xref="S2.SS2.p1.16.m11.4.4.2.2.2.3">2</cn></apply><apply id="S2.SS2.p1.16.m11.5.5.3.3.3.cmml" xref="S2.SS2.p1.16.m11.5.5.3.3.3"><csymbol cd="ambiguous" id="S2.SS2.p1.16.m11.5.5.3.3.3.1.cmml" xref="S2.SS2.p1.16.m11.5.5.3.3.3">subscript</csymbol><ci id="S2.SS2.p1.16.m11.5.5.3.3.3.2.cmml" xref="S2.SS2.p1.16.m11.5.5.3.3.3.2">𝒖</ci><cn id="S2.SS2.p1.16.m11.5.5.3.3.3.3.cmml" type="integer" xref="S2.SS2.p1.16.m11.5.5.3.3.3.3">3</cn></apply></vector><vector id="S2.SS2.p1.16.m11.6.6.4.2.cmml" xref="S2.SS2.p1.16.m11.6.6.4.1"><ci id="S2.SS2.p1.16.m11.1.1.cmml" xref="S2.SS2.p1.16.m11.1.1">𝒙</ci><cn id="S2.SS2.p1.16.m11.2.2.cmml" type="integer" xref="S2.SS2.p1.16.m11.2.2">0</cn><apply id="S2.SS2.p1.16.m11.6.6.4.1.1.cmml" xref="S2.SS2.p1.16.m11.6.6.4.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.16.m11.6.6.4.1.1.1.cmml" xref="S2.SS2.p1.16.m11.6.6.4.1.1">superscript</csymbol><ci id="S2.SS2.p1.16.m11.6.6.4.1.1.2.cmml" xref="S2.SS2.p1.16.m11.6.6.4.1.1.2">𝒏</ci><ci id="S2.SS2.p1.16.m11.6.6.4.1.1.3.cmml" xref="S2.SS2.p1.16.m11.6.6.4.1.1.3">add</ci></apply></vector></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.16.m11.6c">(\bm{u}_{1},\bm{u}_{2},\bm{u}_{3})=(\bm{x},\bm{0},\bm{n}^{\rm add})</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.16.m11.6d">( bold_italic_u start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , bold_italic_u start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , bold_italic_u start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT ) = ( bold_italic_x , bold_0 , 
bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT )</annotation></semantics></math>), this issue can be avoided under the assumption that <math alttext="\bm{y}" class="ltx_Math" display="inline" id="S2.SS2.p1.17.m12.1"><semantics id="S2.SS2.p1.17.m12.1a"><mi id="S2.SS2.p1.17.m12.1.1" xref="S2.SS2.p1.17.m12.1.1.cmml">𝒚</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.17.m12.1b"><ci id="S2.SS2.p1.17.m12.1.1.cmml" xref="S2.SS2.p1.17.m12.1.1">𝒚</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.17.m12.1c">\bm{y}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.17.m12.1d">bold_italic_y</annotation></semantics></math> does not provide any information indicating that <math alttext="\bm{x}" class="ltx_Math" display="inline" id="S2.SS2.p1.18.m13.1"><semantics id="S2.SS2.p1.18.m13.1a"><mi id="S2.SS2.p1.18.m13.1.1" xref="S2.SS2.p1.18.m13.1.1.cmml">𝒙</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.18.m13.1b"><ci id="S2.SS2.p1.18.m13.1.1.cmml" xref="S2.SS2.p1.18.m13.1.1">𝒙</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.18.m13.1c">\bm{x}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.18.m13.1d">bold_italic_x</annotation></semantics></math> consists of <math alttext="\bm{s}" class="ltx_Math" display="inline" id="S2.SS2.p1.19.m14.1"><semantics id="S2.SS2.p1.19.m14.1a"><mi id="S2.SS2.p1.19.m14.1.1" xref="S2.SS2.p1.19.m14.1.1.cmml">𝒔</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.19.m14.1b"><ci id="S2.SS2.p1.19.m14.1.1.cmml" xref="S2.SS2.p1.19.m14.1.1">𝒔</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.19.m14.1c">\bm{s}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.19.m14.1d">bold_italic_s</annotation></semantics></math> and <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S2.SS2.p1.20.m15.1"><semantics id="S2.SS2.p1.20.m15.1a"><msup 
id="S2.SS2.p1.20.m15.1.1" xref="S2.SS2.p1.20.m15.1.1.cmml"><mi id="S2.SS2.p1.20.m15.1.1.2" xref="S2.SS2.p1.20.m15.1.1.2.cmml">𝒏</mi><mi id="S2.SS2.p1.20.m15.1.1.3" xref="S2.SS2.p1.20.m15.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.20.m15.1b"><apply id="S2.SS2.p1.20.m15.1.1.cmml" xref="S2.SS2.p1.20.m15.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.20.m15.1.1.1.cmml" xref="S2.SS2.p1.20.m15.1.1">superscript</csymbol><ci id="S2.SS2.p1.20.m15.1.1.2.cmml" xref="S2.SS2.p1.20.m15.1.1.2">𝒏</ci><ci id="S2.SS2.p1.20.m15.1.1.3.cmml" xref="S2.SS2.p1.20.m15.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.20.m15.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.20.m15.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math>. Under this assumption, the DNN cannot estimate a pair of signals that compose <math alttext="\bm{x}" class="ltx_Math" display="inline" id="S2.SS2.p1.21.m16.1"><semantics id="S2.SS2.p1.21.m16.1a"><mi id="S2.SS2.p1.21.m16.1.1" xref="S2.SS2.p1.21.m16.1.1.cmml">𝒙</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.21.m16.1b"><ci id="S2.SS2.p1.21.m16.1.1.cmml" xref="S2.SS2.p1.21.m16.1.1">𝒙</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.21.m16.1c">\bm{x}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.21.m16.1d">bold_italic_x</annotation></semantics></math> and, therefore, cannot always accurately estimate <math alttext="\bm{x}" class="ltx_Math" display="inline" id="S2.SS2.p1.22.m17.1"><semantics id="S2.SS2.p1.22.m17.1a"><mi id="S2.SS2.p1.22.m17.1.1" xref="S2.SS2.p1.22.m17.1.1.cmml">𝒙</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.22.m17.1b"><ci id="S2.SS2.p1.22.m17.1.1.cmml" xref="S2.SS2.p1.22.m17.1.1">𝒙</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.22.m17.1c">\bm{x}</annotation><annotation 
encoding="application/x-llamapun" id="S2.SS2.p1.22.m17.1d">bold_italic_x</annotation></semantics></math>. Consequently, the DNN is trained to separate <math alttext="\bm{y}" class="ltx_Math" display="inline" id="S2.SS2.p1.23.m18.1"><semantics id="S2.SS2.p1.23.m18.1a"><mi id="S2.SS2.p1.23.m18.1.1" xref="S2.SS2.p1.23.m18.1.1.cmml">𝒚</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.23.m18.1b"><ci id="S2.SS2.p1.23.m18.1.1.cmml" xref="S2.SS2.p1.23.m18.1.1">𝒚</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.23.m18.1c">\bm{y}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.23.m18.1d">bold_italic_y</annotation></semantics></math> into individual sources, as this always minimizes <math alttext="\mathcal{L}^{\rm MixIT}" class="ltx_Math" display="inline" id="S2.SS2.p1.24.m19.1"><semantics id="S2.SS2.p1.24.m19.1a"><msup id="S2.SS2.p1.24.m19.1.1" xref="S2.SS2.p1.24.m19.1.1.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.SS2.p1.24.m19.1.1.2" xref="S2.SS2.p1.24.m19.1.1.2.cmml">ℒ</mi><mi id="S2.SS2.p1.24.m19.1.1.3" xref="S2.SS2.p1.24.m19.1.1.3.cmml">MixIT</mi></msup><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.24.m19.1b"><apply id="S2.SS2.p1.24.m19.1.1.cmml" xref="S2.SS2.p1.24.m19.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.24.m19.1.1.1.cmml" xref="S2.SS2.p1.24.m19.1.1">superscript</csymbol><ci id="S2.SS2.p1.24.m19.1.1.2.cmml" xref="S2.SS2.p1.24.m19.1.1.2">ℒ</ci><ci id="S2.SS2.p1.24.m19.1.1.3.cmml" xref="S2.SS2.p1.24.m19.1.1.3">MixIT</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.24.m19.1c">\mathcal{L}^{\rm MixIT}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.24.m19.1d">caligraphic_L start_POSTSUPERSCRIPT roman_MixIT end_POSTSUPERSCRIPT</annotation></semantics></math>. 
When this assumption does not hold (e.g., when the characteristics of <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S2.SS2.p1.25.m20.1"><semantics id="S2.SS2.p1.25.m20.1a"><msup id="S2.SS2.p1.25.m20.1.1" xref="S2.SS2.p1.25.m20.1.1.cmml"><mi id="S2.SS2.p1.25.m20.1.1.2" xref="S2.SS2.p1.25.m20.1.1.2.cmml">𝒏</mi><mi id="S2.SS2.p1.25.m20.1.1.3" xref="S2.SS2.p1.25.m20.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.25.m20.1b"><apply id="S2.SS2.p1.25.m20.1.1.cmml" xref="S2.SS2.p1.25.m20.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.25.m20.1.1.1.cmml" xref="S2.SS2.p1.25.m20.1.1">superscript</csymbol><ci id="S2.SS2.p1.25.m20.1.1.2.cmml" xref="S2.SS2.p1.25.m20.1.1.2">𝒏</ci><ci id="S2.SS2.p1.25.m20.1.1.3.cmml" xref="S2.SS2.p1.25.m20.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.25.m20.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.25.m20.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S2.SS2.p1.26.m21.1"><semantics id="S2.SS2.p1.26.m21.1a"><msup id="S2.SS2.p1.26.m21.1.1" xref="S2.SS2.p1.26.m21.1.1.cmml"><mi id="S2.SS2.p1.26.m21.1.1.2" xref="S2.SS2.p1.26.m21.1.1.2.cmml">𝒏</mi><mi id="S2.SS2.p1.26.m21.1.1.3" xref="S2.SS2.p1.26.m21.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.26.m21.1b"><apply id="S2.SS2.p1.26.m21.1.1.cmml" xref="S2.SS2.p1.26.m21.1.1"><csymbol cd="ambiguous" id="S2.SS2.p1.26.m21.1.1.1.cmml" xref="S2.SS2.p1.26.m21.1.1">superscript</csymbol><ci id="S2.SS2.p1.26.m21.1.1.2.cmml" xref="S2.SS2.p1.26.m21.1.1.2">𝒏</ci><ci id="S2.SS2.p1.26.m21.1.1.3.cmml" xref="S2.SS2.p1.26.m21.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.26.m21.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" 
id="S2.SS2.p1.26.m21.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> differ and are distinguishable), it has been observed that MixIT suffers from performance degradation <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">saito2021training</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">maciejewski2021training</span>]</cite>.</p> </div> <div class="ltx_para" id="S2.SS2.p2"> <p class="ltx_p" id="S2.SS2.p2.1">MixIT is one of the major unsupervised TSE methods, and several improvements have been proposed. For instance, one method mitigated the overseparation problem by introducing a penalty term for the number of active sources and the correlation between the output sources <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">wisdom2021sparse</span>]</cite>. Other methods produced better separation by using a pre-trained classification model (e.g., an audio event classification or an ASR model) <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">wisdom2021sparse</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">trinh2022unsupervised</span>]</cite> or by employing a loss function that relaxes the training difficulty <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">saito2021training</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">maciejewski2021training</span>]</cite>. 
Teacher–student learning has also been adopted <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">zhang2021teacher</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">Tzinis_2022</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">karamatli2022mixcycle</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">saijo2023self</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">li2024remixed2remixed</span>]</cite>. In this approach, a student model is trained using the outputs of a teacher model pre-trained with MixIT as pseudo-target signals.</p> </div> </section> <section class="ltx_subsection" id="S2.SS3"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">2.3 </span>NyTT</h3> <div class="ltx_para" id="S2.SS3.p1"> <p class="ltx_p" id="S2.SS3.p1.5">NyTT <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">Fujimura_2021</span>]</cite> is an unsupervised training method designed for TSE (Fig. <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S1.F1" title="Figure 1 ‣ 1 Introduction ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">1</span></a>(d)). 
In NyTT, noisy signals <math alttext="\bm{x}" class="ltx_Math" display="inline" id="S2.SS3.p1.1.m1.1"><semantics id="S2.SS3.p1.1.m1.1a"><mi id="S2.SS3.p1.1.m1.1.1" xref="S2.SS3.p1.1.m1.1.1.cmml">𝒙</mi><annotation-xml encoding="MathML-Content" id="S2.SS3.p1.1.m1.1b"><ci id="S2.SS3.p1.1.m1.1.1.cmml" xref="S2.SS3.p1.1.m1.1.1">𝒙</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS3.p1.1.m1.1c">\bm{x}</annotation><annotation encoding="application/x-llamapun" id="S2.SS3.p1.1.m1.1d">bold_italic_x</annotation></semantics></math> and additional noise signals <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S2.SS3.p1.2.m2.1"><semantics id="S2.SS3.p1.2.m2.1a"><msup id="S2.SS3.p1.2.m2.1.1" xref="S2.SS3.p1.2.m2.1.1.cmml"><mi id="S2.SS3.p1.2.m2.1.1.2" xref="S2.SS3.p1.2.m2.1.1.2.cmml">𝒏</mi><mi id="S2.SS3.p1.2.m2.1.1.3" xref="S2.SS3.p1.2.m2.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S2.SS3.p1.2.m2.1b"><apply id="S2.SS3.p1.2.m2.1.1.cmml" xref="S2.SS3.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S2.SS3.p1.2.m2.1.1.1.cmml" xref="S2.SS3.p1.2.m2.1.1">superscript</csymbol><ci id="S2.SS3.p1.2.m2.1.1.2.cmml" xref="S2.SS3.p1.2.m2.1.1.2">𝒏</ci><ci id="S2.SS3.p1.2.m2.1.1.3.cmml" xref="S2.SS3.p1.2.m2.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS3.p1.2.m2.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S2.SS3.p1.2.m2.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> are used as training data, and a <span class="ltx_text ltx_font_italic" id="S2.SS3.p1.5.1">more noisy</span> signal <math alttext="\bm{y}" class="ltx_Math" display="inline" id="S2.SS3.p1.3.m3.1"><semantics id="S2.SS3.p1.3.m3.1a"><mi id="S2.SS3.p1.3.m3.1.1" xref="S2.SS3.p1.3.m3.1.1.cmml">𝒚</mi><annotation-xml encoding="MathML-Content" id="S2.SS3.p1.3.m3.1b"><ci id="S2.SS3.p1.3.m3.1.1.cmml" 
xref="S2.SS3.p1.3.m3.1.1">𝒚</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS3.p1.3.m3.1c">\bm{y}</annotation><annotation encoding="application/x-llamapun" id="S2.SS3.p1.3.m3.1d">bold_italic_y</annotation></semantics></math> is generated as <math alttext="\bm{y}=\bm{x}+\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S2.SS3.p1.4.m4.1"><semantics id="S2.SS3.p1.4.m4.1a"><mrow id="S2.SS3.p1.4.m4.1.1" xref="S2.SS3.p1.4.m4.1.1.cmml"><mi id="S2.SS3.p1.4.m4.1.1.2" xref="S2.SS3.p1.4.m4.1.1.2.cmml">𝒚</mi><mo id="S2.SS3.p1.4.m4.1.1.1" xref="S2.SS3.p1.4.m4.1.1.1.cmml">=</mo><mrow id="S2.SS3.p1.4.m4.1.1.3" xref="S2.SS3.p1.4.m4.1.1.3.cmml"><mi id="S2.SS3.p1.4.m4.1.1.3.2" xref="S2.SS3.p1.4.m4.1.1.3.2.cmml">𝒙</mi><mo id="S2.SS3.p1.4.m4.1.1.3.1" xref="S2.SS3.p1.4.m4.1.1.3.1.cmml">+</mo><msup id="S2.SS3.p1.4.m4.1.1.3.3" xref="S2.SS3.p1.4.m4.1.1.3.3.cmml"><mi id="S2.SS3.p1.4.m4.1.1.3.3.2" xref="S2.SS3.p1.4.m4.1.1.3.3.2.cmml">𝒏</mi><mi id="S2.SS3.p1.4.m4.1.1.3.3.3" xref="S2.SS3.p1.4.m4.1.1.3.3.3.cmml">add</mi></msup></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS3.p1.4.m4.1b"><apply id="S2.SS3.p1.4.m4.1.1.cmml" xref="S2.SS3.p1.4.m4.1.1"><eq id="S2.SS3.p1.4.m4.1.1.1.cmml" xref="S2.SS3.p1.4.m4.1.1.1"></eq><ci id="S2.SS3.p1.4.m4.1.1.2.cmml" xref="S2.SS3.p1.4.m4.1.1.2">𝒚</ci><apply id="S2.SS3.p1.4.m4.1.1.3.cmml" xref="S2.SS3.p1.4.m4.1.1.3"><plus id="S2.SS3.p1.4.m4.1.1.3.1.cmml" xref="S2.SS3.p1.4.m4.1.1.3.1"></plus><ci id="S2.SS3.p1.4.m4.1.1.3.2.cmml" xref="S2.SS3.p1.4.m4.1.1.3.2">𝒙</ci><apply id="S2.SS3.p1.4.m4.1.1.3.3.cmml" xref="S2.SS3.p1.4.m4.1.1.3.3"><csymbol cd="ambiguous" id="S2.SS3.p1.4.m4.1.1.3.3.1.cmml" xref="S2.SS3.p1.4.m4.1.1.3.3">superscript</csymbol><ci id="S2.SS3.p1.4.m4.1.1.3.3.2.cmml" xref="S2.SS3.p1.4.m4.1.1.3.3.2">𝒏</ci><ci id="S2.SS3.p1.4.m4.1.1.3.3.3.cmml" xref="S2.SS3.p1.4.m4.1.1.3.3.3">add</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" 
id="S2.SS3.p1.4.m4.1c">\bm{y}=\bm{x}+\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S2.SS3.p1.4.m4.1d">bold_italic_y = bold_italic_x + bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math>. NyTT trains a DNN to minimize the following prediction error <math alttext="\mathcal{L}^{\rm NyTT}" class="ltx_Math" display="inline" id="S2.SS3.p1.5.m5.1"><semantics id="S2.SS3.p1.5.m5.1a"><msup id="S2.SS3.p1.5.m5.1.1" xref="S2.SS3.p1.5.m5.1.1.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.SS3.p1.5.m5.1.1.2" xref="S2.SS3.p1.5.m5.1.1.2.cmml">ℒ</mi><mi id="S2.SS3.p1.5.m5.1.1.3" xref="S2.SS3.p1.5.m5.1.1.3.cmml">NyTT</mi></msup><annotation-xml encoding="MathML-Content" id="S2.SS3.p1.5.m5.1b"><apply id="S2.SS3.p1.5.m5.1.1.cmml" xref="S2.SS3.p1.5.m5.1.1"><csymbol cd="ambiguous" id="S2.SS3.p1.5.m5.1.1.1.cmml" xref="S2.SS3.p1.5.m5.1.1">superscript</csymbol><ci id="S2.SS3.p1.5.m5.1.1.2.cmml" xref="S2.SS3.p1.5.m5.1.1.2">ℒ</ci><ci id="S2.SS3.p1.5.m5.1.1.3.cmml" xref="S2.SS3.p1.5.m5.1.1.3">NyTT</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS3.p1.5.m5.1c">\mathcal{L}^{\rm NyTT}</annotation><annotation encoding="application/x-llamapun" id="S2.SS3.p1.5.m5.1d">caligraphic_L start_POSTSUPERSCRIPT roman_NyTT end_POSTSUPERSCRIPT</annotation></semantics></math>:</p> <table class="ltx_equationgroup ltx_eqn_align ltx_eqn_table" id="Sx2.EGx4"> <tbody id="S2.E9"><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_align_right ltx_eqn_cell"><math alttext="\displaystyle\mathcal{L}^{\rm NyTT}=\mathbb{E}_{(\bm{y},\,\bm{x})\sim\mathcal{% D}}\left[L(f(\bm{y};\theta),\bm{x})\right]." 
class="ltx_Math" display="inline" id="S2.E9.m1.6"><semantics id="S2.E9.m1.6a"><mrow id="S2.E9.m1.6.6.1" xref="S2.E9.m1.6.6.1.1.cmml"><mrow id="S2.E9.m1.6.6.1.1" xref="S2.E9.m1.6.6.1.1.cmml"><msup id="S2.E9.m1.6.6.1.1.3" xref="S2.E9.m1.6.6.1.1.3.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.E9.m1.6.6.1.1.3.2" xref="S2.E9.m1.6.6.1.1.3.2.cmml">ℒ</mi><mi id="S2.E9.m1.6.6.1.1.3.3" xref="S2.E9.m1.6.6.1.1.3.3.cmml">NyTT</mi></msup><mo id="S2.E9.m1.6.6.1.1.2" xref="S2.E9.m1.6.6.1.1.2.cmml">=</mo><mrow id="S2.E9.m1.6.6.1.1.1" xref="S2.E9.m1.6.6.1.1.1.cmml"><msub id="S2.E9.m1.6.6.1.1.1.3" xref="S2.E9.m1.6.6.1.1.1.3.cmml"><mi id="S2.E9.m1.6.6.1.1.1.3.2" xref="S2.E9.m1.6.6.1.1.1.3.2.cmml">𝔼</mi><mrow id="S2.E9.m1.2.2.2" xref="S2.E9.m1.2.2.2.cmml"><mrow id="S2.E9.m1.2.2.2.4.2" xref="S2.E9.m1.2.2.2.4.1.cmml"><mo id="S2.E9.m1.2.2.2.4.2.1" stretchy="false" xref="S2.E9.m1.2.2.2.4.1.cmml">(</mo><mi id="S2.E9.m1.1.1.1.1" xref="S2.E9.m1.1.1.1.1.cmml">𝒚</mi><mo id="S2.E9.m1.2.2.2.4.2.2" rspace="0.337em" xref="S2.E9.m1.2.2.2.4.1.cmml">,</mo><mi id="S2.E9.m1.2.2.2.2" xref="S2.E9.m1.2.2.2.2.cmml">𝒙</mi><mo id="S2.E9.m1.2.2.2.4.2.3" stretchy="false" xref="S2.E9.m1.2.2.2.4.1.cmml">)</mo></mrow><mo id="S2.E9.m1.2.2.2.3" xref="S2.E9.m1.2.2.2.3.cmml">∼</mo><mi class="ltx_font_mathcaligraphic" id="S2.E9.m1.2.2.2.5" xref="S2.E9.m1.2.2.2.5.cmml">𝒟</mi></mrow></msub><mo id="S2.E9.m1.6.6.1.1.1.2" xref="S2.E9.m1.6.6.1.1.1.2.cmml">⁢</mo><mrow id="S2.E9.m1.6.6.1.1.1.1.1" xref="S2.E9.m1.6.6.1.1.1.1.2.cmml"><mo id="S2.E9.m1.6.6.1.1.1.1.1.2" xref="S2.E9.m1.6.6.1.1.1.1.2.1.cmml">[</mo><mrow id="S2.E9.m1.6.6.1.1.1.1.1.1" xref="S2.E9.m1.6.6.1.1.1.1.1.1.cmml"><mi id="S2.E9.m1.6.6.1.1.1.1.1.1.3" xref="S2.E9.m1.6.6.1.1.1.1.1.1.3.cmml">L</mi><mo id="S2.E9.m1.6.6.1.1.1.1.1.1.2" xref="S2.E9.m1.6.6.1.1.1.1.1.1.2.cmml">⁢</mo><mrow id="S2.E9.m1.6.6.1.1.1.1.1.1.1.1" xref="S2.E9.m1.6.6.1.1.1.1.1.1.1.2.cmml"><mo id="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.2" stretchy="false" 
xref="S2.E9.m1.6.6.1.1.1.1.1.1.1.2.cmml">(</mo><mrow id="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.1" xref="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.1.cmml"><mi id="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.1.2" xref="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.1.2.cmml">f</mi><mo id="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.1.1" xref="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.1.1.cmml">⁢</mo><mrow id="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.1.3.2" xref="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.1.3.1.cmml"><mo id="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.1.3.2.1" stretchy="false" xref="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.1.3.1.cmml">(</mo><mi id="S2.E9.m1.3.3" xref="S2.E9.m1.3.3.cmml">𝒚</mi><mo id="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.1.3.2.2" xref="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.1.3.1.cmml">;</mo><mi id="S2.E9.m1.4.4" xref="S2.E9.m1.4.4.cmml">θ</mi><mo id="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.1.3.2.3" stretchy="false" xref="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.1.3.1.cmml">)</mo></mrow></mrow><mo id="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.3" xref="S2.E9.m1.6.6.1.1.1.1.1.1.1.2.cmml">,</mo><mi id="S2.E9.m1.5.5" xref="S2.E9.m1.5.5.cmml">𝒙</mi><mo id="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.4" stretchy="false" xref="S2.E9.m1.6.6.1.1.1.1.1.1.1.2.cmml">)</mo></mrow></mrow><mo id="S2.E9.m1.6.6.1.1.1.1.1.3" xref="S2.E9.m1.6.6.1.1.1.1.2.1.cmml">]</mo></mrow></mrow></mrow><mo id="S2.E9.m1.6.6.1.2" lspace="0em" xref="S2.E9.m1.6.6.1.1.cmml">.</mo></mrow><annotation-xml encoding="MathML-Content" id="S2.E9.m1.6b"><apply id="S2.E9.m1.6.6.1.1.cmml" xref="S2.E9.m1.6.6.1"><eq id="S2.E9.m1.6.6.1.1.2.cmml" xref="S2.E9.m1.6.6.1.1.2"></eq><apply id="S2.E9.m1.6.6.1.1.3.cmml" xref="S2.E9.m1.6.6.1.1.3"><csymbol cd="ambiguous" id="S2.E9.m1.6.6.1.1.3.1.cmml" xref="S2.E9.m1.6.6.1.1.3">superscript</csymbol><ci id="S2.E9.m1.6.6.1.1.3.2.cmml" xref="S2.E9.m1.6.6.1.1.3.2">ℒ</ci><ci id="S2.E9.m1.6.6.1.1.3.3.cmml" xref="S2.E9.m1.6.6.1.1.3.3">NyTT</ci></apply><apply id="S2.E9.m1.6.6.1.1.1.cmml" xref="S2.E9.m1.6.6.1.1.1"><times id="S2.E9.m1.6.6.1.1.1.2.cmml" xref="S2.E9.m1.6.6.1.1.1.2"></times><apply id="S2.E9.m1.6.6.1.1.1.3.cmml" 
xref="S2.E9.m1.6.6.1.1.1.3"><csymbol cd="ambiguous" id="S2.E9.m1.6.6.1.1.1.3.1.cmml" xref="S2.E9.m1.6.6.1.1.1.3">subscript</csymbol><ci id="S2.E9.m1.6.6.1.1.1.3.2.cmml" xref="S2.E9.m1.6.6.1.1.1.3.2">𝔼</ci><apply id="S2.E9.m1.2.2.2.cmml" xref="S2.E9.m1.2.2.2"><csymbol cd="latexml" id="S2.E9.m1.2.2.2.3.cmml" xref="S2.E9.m1.2.2.2.3">similar-to</csymbol><interval closure="open" id="S2.E9.m1.2.2.2.4.1.cmml" xref="S2.E9.m1.2.2.2.4.2"><ci id="S2.E9.m1.1.1.1.1.cmml" xref="S2.E9.m1.1.1.1.1">𝒚</ci><ci id="S2.E9.m1.2.2.2.2.cmml" xref="S2.E9.m1.2.2.2.2">𝒙</ci></interval><ci id="S2.E9.m1.2.2.2.5.cmml" xref="S2.E9.m1.2.2.2.5">𝒟</ci></apply></apply><apply id="S2.E9.m1.6.6.1.1.1.1.2.cmml" xref="S2.E9.m1.6.6.1.1.1.1.1"><csymbol cd="latexml" id="S2.E9.m1.6.6.1.1.1.1.2.1.cmml" xref="S2.E9.m1.6.6.1.1.1.1.1.2">delimited-[]</csymbol><apply id="S2.E9.m1.6.6.1.1.1.1.1.1.cmml" xref="S2.E9.m1.6.6.1.1.1.1.1.1"><times id="S2.E9.m1.6.6.1.1.1.1.1.1.2.cmml" xref="S2.E9.m1.6.6.1.1.1.1.1.1.2"></times><ci id="S2.E9.m1.6.6.1.1.1.1.1.1.3.cmml" xref="S2.E9.m1.6.6.1.1.1.1.1.1.3">𝐿</ci><interval closure="open" id="S2.E9.m1.6.6.1.1.1.1.1.1.1.2.cmml" xref="S2.E9.m1.6.6.1.1.1.1.1.1.1.1"><apply id="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.1"><times id="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.1.1"></times><ci id="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.1.2.cmml" xref="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.1.2">𝑓</ci><list id="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.1.3.1.cmml" xref="S2.E9.m1.6.6.1.1.1.1.1.1.1.1.1.3.2"><ci id="S2.E9.m1.3.3.cmml" xref="S2.E9.m1.3.3">𝒚</ci><ci id="S2.E9.m1.4.4.cmml" xref="S2.E9.m1.4.4">𝜃</ci></list></apply><ci id="S2.E9.m1.5.5.cmml" xref="S2.E9.m1.5.5">𝒙</ci></interval></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E9.m1.6c">\displaystyle\mathcal{L}^{\rm NyTT}=\mathbb{E}_{(\bm{y},\,\bm{x})\sim\mathcal{% D}}\left[L(f(\bm{y};\theta),\bm{x})\right].</annotation><annotation 
encoding="application/x-llamapun" id="S2.E9.m1.6d">caligraphic_L start_POSTSUPERSCRIPT roman_NyTT end_POSTSUPERSCRIPT = blackboard_E start_POSTSUBSCRIPT ( bold_italic_y , bold_italic_x ) ∼ caligraphic_D end_POSTSUBSCRIPT [ italic_L ( italic_f ( bold_italic_y ; italic_θ ) , bold_italic_x ) ] .</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(9)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S2.SS3.p1.7">NyTT was inspired by Noise2Noise and realizes Noise2Noise training in the TSE task by considering <math alttext="\bm{y}=\bm{s}+(\bm{n}^{\rm obs}+\bm{n}^{\rm add})=\bm{s}+\bm{n}^{(1)}" class="ltx_Math" display="inline" id="S2.SS3.p1.6.m1.2"><semantics id="S2.SS3.p1.6.m1.2a"><mrow id="S2.SS3.p1.6.m1.2.2" xref="S2.SS3.p1.6.m1.2.2.cmml"><mi id="S2.SS3.p1.6.m1.2.2.3" xref="S2.SS3.p1.6.m1.2.2.3.cmml">𝒚</mi><mo id="S2.SS3.p1.6.m1.2.2.4" xref="S2.SS3.p1.6.m1.2.2.4.cmml">=</mo><mrow id="S2.SS3.p1.6.m1.2.2.1" xref="S2.SS3.p1.6.m1.2.2.1.cmml"><mi id="S2.SS3.p1.6.m1.2.2.1.3" xref="S2.SS3.p1.6.m1.2.2.1.3.cmml">𝒔</mi><mo id="S2.SS3.p1.6.m1.2.2.1.2" xref="S2.SS3.p1.6.m1.2.2.1.2.cmml">+</mo><mrow id="S2.SS3.p1.6.m1.2.2.1.1.1" xref="S2.SS3.p1.6.m1.2.2.1.1.1.1.cmml"><mo id="S2.SS3.p1.6.m1.2.2.1.1.1.2" stretchy="false" xref="S2.SS3.p1.6.m1.2.2.1.1.1.1.cmml">(</mo><mrow id="S2.SS3.p1.6.m1.2.2.1.1.1.1" xref="S2.SS3.p1.6.m1.2.2.1.1.1.1.cmml"><msup id="S2.SS3.p1.6.m1.2.2.1.1.1.1.2" xref="S2.SS3.p1.6.m1.2.2.1.1.1.1.2.cmml"><mi id="S2.SS3.p1.6.m1.2.2.1.1.1.1.2.2" xref="S2.SS3.p1.6.m1.2.2.1.1.1.1.2.2.cmml">𝒏</mi><mi id="S2.SS3.p1.6.m1.2.2.1.1.1.1.2.3" xref="S2.SS3.p1.6.m1.2.2.1.1.1.1.2.3.cmml">obs</mi></msup><mo id="S2.SS3.p1.6.m1.2.2.1.1.1.1.1" xref="S2.SS3.p1.6.m1.2.2.1.1.1.1.1.cmml">+</mo><msup id="S2.SS3.p1.6.m1.2.2.1.1.1.1.3" xref="S2.SS3.p1.6.m1.2.2.1.1.1.1.3.cmml"><mi 
id="S2.SS3.p1.6.m1.2.2.1.1.1.1.3.2" xref="S2.SS3.p1.6.m1.2.2.1.1.1.1.3.2.cmml">𝒏</mi><mi id="S2.SS3.p1.6.m1.2.2.1.1.1.1.3.3" xref="S2.SS3.p1.6.m1.2.2.1.1.1.1.3.3.cmml">add</mi></msup></mrow><mo id="S2.SS3.p1.6.m1.2.2.1.1.1.3" stretchy="false" xref="S2.SS3.p1.6.m1.2.2.1.1.1.1.cmml">)</mo></mrow></mrow><mo id="S2.SS3.p1.6.m1.2.2.5" xref="S2.SS3.p1.6.m1.2.2.5.cmml">=</mo><mrow id="S2.SS3.p1.6.m1.2.2.6" xref="S2.SS3.p1.6.m1.2.2.6.cmml"><mi id="S2.SS3.p1.6.m1.2.2.6.2" xref="S2.SS3.p1.6.m1.2.2.6.2.cmml">𝒔</mi><mo id="S2.SS3.p1.6.m1.2.2.6.1" xref="S2.SS3.p1.6.m1.2.2.6.1.cmml">+</mo><msup id="S2.SS3.p1.6.m1.2.2.6.3" xref="S2.SS3.p1.6.m1.2.2.6.3.cmml"><mi id="S2.SS3.p1.6.m1.2.2.6.3.2" xref="S2.SS3.p1.6.m1.2.2.6.3.2.cmml">𝒏</mi><mrow id="S2.SS3.p1.6.m1.1.1.1.3" xref="S2.SS3.p1.6.m1.2.2.6.3.cmml"><mo id="S2.SS3.p1.6.m1.1.1.1.3.1" stretchy="false" xref="S2.SS3.p1.6.m1.2.2.6.3.cmml">(</mo><mn id="S2.SS3.p1.6.m1.1.1.1.1" xref="S2.SS3.p1.6.m1.1.1.1.1.cmml">1</mn><mo id="S2.SS3.p1.6.m1.1.1.1.3.2" stretchy="false" xref="S2.SS3.p1.6.m1.2.2.6.3.cmml">)</mo></mrow></msup></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS3.p1.6.m1.2b"><apply id="S2.SS3.p1.6.m1.2.2.cmml" xref="S2.SS3.p1.6.m1.2.2"><and id="S2.SS3.p1.6.m1.2.2a.cmml" xref="S2.SS3.p1.6.m1.2.2"></and><apply id="S2.SS3.p1.6.m1.2.2b.cmml" xref="S2.SS3.p1.6.m1.2.2"><eq id="S2.SS3.p1.6.m1.2.2.4.cmml" xref="S2.SS3.p1.6.m1.2.2.4"></eq><ci id="S2.SS3.p1.6.m1.2.2.3.cmml" xref="S2.SS3.p1.6.m1.2.2.3">𝒚</ci><apply id="S2.SS3.p1.6.m1.2.2.1.cmml" xref="S2.SS3.p1.6.m1.2.2.1"><plus id="S2.SS3.p1.6.m1.2.2.1.2.cmml" xref="S2.SS3.p1.6.m1.2.2.1.2"></plus><ci id="S2.SS3.p1.6.m1.2.2.1.3.cmml" xref="S2.SS3.p1.6.m1.2.2.1.3">𝒔</ci><apply id="S2.SS3.p1.6.m1.2.2.1.1.1.1.cmml" xref="S2.SS3.p1.6.m1.2.2.1.1.1"><plus id="S2.SS3.p1.6.m1.2.2.1.1.1.1.1.cmml" xref="S2.SS3.p1.6.m1.2.2.1.1.1.1.1"></plus><apply id="S2.SS3.p1.6.m1.2.2.1.1.1.1.2.cmml" xref="S2.SS3.p1.6.m1.2.2.1.1.1.1.2"><csymbol cd="ambiguous" 
id="S2.SS3.p1.6.m1.2.2.1.1.1.1.2.1.cmml" xref="S2.SS3.p1.6.m1.2.2.1.1.1.1.2">superscript</csymbol><ci id="S2.SS3.p1.6.m1.2.2.1.1.1.1.2.2.cmml" xref="S2.SS3.p1.6.m1.2.2.1.1.1.1.2.2">𝒏</ci><ci id="S2.SS3.p1.6.m1.2.2.1.1.1.1.2.3.cmml" xref="S2.SS3.p1.6.m1.2.2.1.1.1.1.2.3">obs</ci></apply><apply id="S2.SS3.p1.6.m1.2.2.1.1.1.1.3.cmml" xref="S2.SS3.p1.6.m1.2.2.1.1.1.1.3"><csymbol cd="ambiguous" id="S2.SS3.p1.6.m1.2.2.1.1.1.1.3.1.cmml" xref="S2.SS3.p1.6.m1.2.2.1.1.1.1.3">superscript</csymbol><ci id="S2.SS3.p1.6.m1.2.2.1.1.1.1.3.2.cmml" xref="S2.SS3.p1.6.m1.2.2.1.1.1.1.3.2">𝒏</ci><ci id="S2.SS3.p1.6.m1.2.2.1.1.1.1.3.3.cmml" xref="S2.SS3.p1.6.m1.2.2.1.1.1.1.3.3">add</ci></apply></apply></apply></apply><apply id="S2.SS3.p1.6.m1.2.2c.cmml" xref="S2.SS3.p1.6.m1.2.2"><eq id="S2.SS3.p1.6.m1.2.2.5.cmml" xref="S2.SS3.p1.6.m1.2.2.5"></eq><share href="https://arxiv.org/html/2503.14854v1#S2.SS3.p1.6.m1.2.2.1.cmml" id="S2.SS3.p1.6.m1.2.2d.cmml" xref="S2.SS3.p1.6.m1.2.2"></share><apply id="S2.SS3.p1.6.m1.2.2.6.cmml" xref="S2.SS3.p1.6.m1.2.2.6"><plus id="S2.SS3.p1.6.m1.2.2.6.1.cmml" xref="S2.SS3.p1.6.m1.2.2.6.1"></plus><ci id="S2.SS3.p1.6.m1.2.2.6.2.cmml" xref="S2.SS3.p1.6.m1.2.2.6.2">𝒔</ci><apply id="S2.SS3.p1.6.m1.2.2.6.3.cmml" xref="S2.SS3.p1.6.m1.2.2.6.3"><csymbol cd="ambiguous" id="S2.SS3.p1.6.m1.2.2.6.3.1.cmml" xref="S2.SS3.p1.6.m1.2.2.6.3">superscript</csymbol><ci id="S2.SS3.p1.6.m1.2.2.6.3.2.cmml" xref="S2.SS3.p1.6.m1.2.2.6.3.2">𝒏</ci><cn id="S2.SS3.p1.6.m1.1.1.1.1.cmml" type="integer" xref="S2.SS3.p1.6.m1.1.1.1.1">1</cn></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS3.p1.6.m1.2c">\bm{y}=\bm{s}+(\bm{n}^{\rm obs}+\bm{n}^{\rm add})=\bm{s}+\bm{n}^{(1)}</annotation><annotation encoding="application/x-llamapun" id="S2.SS3.p1.6.m1.2d">bold_italic_y = bold_italic_s + ( bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT + bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT ) = bold_italic_s + bold_italic_n 
start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{x}=\bm{s}+\bm{n}^{\rm obs}=\bm{s}+\bm{n}^{(2)}" class="ltx_Math" display="inline" id="S2.SS3.p1.7.m2.1"><semantics id="S2.SS3.p1.7.m2.1a"><mrow id="S2.SS3.p1.7.m2.1.2" xref="S2.SS3.p1.7.m2.1.2.cmml"><mi id="S2.SS3.p1.7.m2.1.2.2" xref="S2.SS3.p1.7.m2.1.2.2.cmml">𝒙</mi><mo id="S2.SS3.p1.7.m2.1.2.3" xref="S2.SS3.p1.7.m2.1.2.3.cmml">=</mo><mrow id="S2.SS3.p1.7.m2.1.2.4" xref="S2.SS3.p1.7.m2.1.2.4.cmml"><mi id="S2.SS3.p1.7.m2.1.2.4.2" xref="S2.SS3.p1.7.m2.1.2.4.2.cmml">𝒔</mi><mo id="S2.SS3.p1.7.m2.1.2.4.1" xref="S2.SS3.p1.7.m2.1.2.4.1.cmml">+</mo><msup id="S2.SS3.p1.7.m2.1.2.4.3" xref="S2.SS3.p1.7.m2.1.2.4.3.cmml"><mi id="S2.SS3.p1.7.m2.1.2.4.3.2" xref="S2.SS3.p1.7.m2.1.2.4.3.2.cmml">𝒏</mi><mi id="S2.SS3.p1.7.m2.1.2.4.3.3" xref="S2.SS3.p1.7.m2.1.2.4.3.3.cmml">obs</mi></msup></mrow><mo id="S2.SS3.p1.7.m2.1.2.5" xref="S2.SS3.p1.7.m2.1.2.5.cmml">=</mo><mrow id="S2.SS3.p1.7.m2.1.2.6" xref="S2.SS3.p1.7.m2.1.2.6.cmml"><mi id="S2.SS3.p1.7.m2.1.2.6.2" xref="S2.SS3.p1.7.m2.1.2.6.2.cmml">𝒔</mi><mo id="S2.SS3.p1.7.m2.1.2.6.1" xref="S2.SS3.p1.7.m2.1.2.6.1.cmml">+</mo><msup id="S2.SS3.p1.7.m2.1.2.6.3" xref="S2.SS3.p1.7.m2.1.2.6.3.cmml"><mi id="S2.SS3.p1.7.m2.1.2.6.3.2" xref="S2.SS3.p1.7.m2.1.2.6.3.2.cmml">𝒏</mi><mrow id="S2.SS3.p1.7.m2.1.1.1.3" xref="S2.SS3.p1.7.m2.1.2.6.3.cmml"><mo id="S2.SS3.p1.7.m2.1.1.1.3.1" stretchy="false" xref="S2.SS3.p1.7.m2.1.2.6.3.cmml">(</mo><mn id="S2.SS3.p1.7.m2.1.1.1.1" xref="S2.SS3.p1.7.m2.1.1.1.1.cmml">2</mn><mo id="S2.SS3.p1.7.m2.1.1.1.3.2" stretchy="false" xref="S2.SS3.p1.7.m2.1.2.6.3.cmml">)</mo></mrow></msup></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS3.p1.7.m2.1b"><apply id="S2.SS3.p1.7.m2.1.2.cmml" xref="S2.SS3.p1.7.m2.1.2"><and id="S2.SS3.p1.7.m2.1.2a.cmml" xref="S2.SS3.p1.7.m2.1.2"></and><apply id="S2.SS3.p1.7.m2.1.2b.cmml" xref="S2.SS3.p1.7.m2.1.2"><eq id="S2.SS3.p1.7.m2.1.2.3.cmml" xref="S2.SS3.p1.7.m2.1.2.3"></eq><ci 
id="S2.SS3.p1.7.m2.1.2.2.cmml" xref="S2.SS3.p1.7.m2.1.2.2">𝒙</ci><apply id="S2.SS3.p1.7.m2.1.2.4.cmml" xref="S2.SS3.p1.7.m2.1.2.4"><plus id="S2.SS3.p1.7.m2.1.2.4.1.cmml" xref="S2.SS3.p1.7.m2.1.2.4.1"></plus><ci id="S2.SS3.p1.7.m2.1.2.4.2.cmml" xref="S2.SS3.p1.7.m2.1.2.4.2">𝒔</ci><apply id="S2.SS3.p1.7.m2.1.2.4.3.cmml" xref="S2.SS3.p1.7.m2.1.2.4.3"><csymbol cd="ambiguous" id="S2.SS3.p1.7.m2.1.2.4.3.1.cmml" xref="S2.SS3.p1.7.m2.1.2.4.3">superscript</csymbol><ci id="S2.SS3.p1.7.m2.1.2.4.3.2.cmml" xref="S2.SS3.p1.7.m2.1.2.4.3.2">𝒏</ci><ci id="S2.SS3.p1.7.m2.1.2.4.3.3.cmml" xref="S2.SS3.p1.7.m2.1.2.4.3.3">obs</ci></apply></apply></apply><apply id="S2.SS3.p1.7.m2.1.2c.cmml" xref="S2.SS3.p1.7.m2.1.2"><eq id="S2.SS3.p1.7.m2.1.2.5.cmml" xref="S2.SS3.p1.7.m2.1.2.5"></eq><share href="https://arxiv.org/html/2503.14854v1#S2.SS3.p1.7.m2.1.2.4.cmml" id="S2.SS3.p1.7.m2.1.2d.cmml" xref="S2.SS3.p1.7.m2.1.2"></share><apply id="S2.SS3.p1.7.m2.1.2.6.cmml" xref="S2.SS3.p1.7.m2.1.2.6"><plus id="S2.SS3.p1.7.m2.1.2.6.1.cmml" xref="S2.SS3.p1.7.m2.1.2.6.1"></plus><ci id="S2.SS3.p1.7.m2.1.2.6.2.cmml" xref="S2.SS3.p1.7.m2.1.2.6.2">𝒔</ci><apply id="S2.SS3.p1.7.m2.1.2.6.3.cmml" xref="S2.SS3.p1.7.m2.1.2.6.3"><csymbol cd="ambiguous" id="S2.SS3.p1.7.m2.1.2.6.3.1.cmml" xref="S2.SS3.p1.7.m2.1.2.6.3">superscript</csymbol><ci id="S2.SS3.p1.7.m2.1.2.6.3.2.cmml" xref="S2.SS3.p1.7.m2.1.2.6.3.2">𝒏</ci><cn id="S2.SS3.p1.7.m2.1.1.1.1.cmml" type="integer" xref="S2.SS3.p1.7.m2.1.1.1.1">2</cn></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS3.p1.7.m2.1c">\bm{x}=\bm{s}+\bm{n}^{\rm obs}=\bm{s}+\bm{n}^{(2)}</annotation><annotation encoding="application/x-llamapun" id="S2.SS3.p1.7.m2.1d">bold_italic_x = bold_italic_s + bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT = bold_italic_s + bold_italic_n start_POSTSUPERSCRIPT ( 2 ) end_POSTSUPERSCRIPT</annotation></semantics></math> as pairs of noisy signals. 
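</p> <p class="ltx_p">As a minimal sketch of how the training pair and the loss in Eq. (9) could be formed, the following Python/NumPy snippet builds y = x + n_add and evaluates a time-domain MSE against the noisy target x. The function names <span class="ltx_text ltx_font_typewriter">make_nytt_pair</span> and <span class="ltx_text ltx_font_typewriter">nytt_mse_loss</span>, the one-parameter toy model <span class="ltx_text ltx_font_typewriter">scale_model</span>, the sine-wave stand-in for s, and the Gaussian noise are all illustrative assumptions and are not taken from the original NyTT implementation.</p>
<pre class="ltx_verbatim ltx_font_typewriter">
```python
import numpy as np


def make_nytt_pair(x, n_add):
    """Return (y, x): the more noisy input y = x + n_add and the noisy target x."""
    return x + n_add, x


def nytt_mse_loss(f, theta, y, x):
    """Time-domain MSE between the model output f(y, theta) and the noisy target x."""
    return float(np.mean((f(y, theta) - x) ** 2))


def scale_model(y, theta):
    """Hypothetical toy model (assumption): scales the input by a single parameter."""
    return theta * y


rng = np.random.default_rng(0)
num_samples = 16000

# Stand-ins (assumptions): a sine wave for the target s, Gaussian noise for n_obs and n_add.
s = np.sin(2 * np.pi * 440 * np.arange(num_samples) / num_samples)
n_obs = 0.1 * rng.standard_normal(num_samples)
n_add = 0.1 * rng.standard_normal(num_samples)

x = s + n_obs                       # noisy signal, used as the training target
y, target = make_nytt_pair(x, n_add)  # more noisy signal, used as the model input

# With theta = 1 the toy model is the identity, so the loss is the mean power of n_add.
loss_identity = nytt_mse_loss(scale_model, 1.0, y, target)
```
</pre>
<p class="ltx_p">With theta = 1 the toy model returns y unchanged, so the loss reduces to the mean power of n_add; in an actual experiment a DNN would take the place of <span class="ltx_text ltx_font_typewriter">scale_model</span>.</p> <p class="ltx_p">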
The prediction error is computed as the MSE in the time domain, under the assumption that the noise has a zero-mean distribution. Despite the lack of theoretical proof, NyTT has been experimentally demonstrated to achieve TSE without clean target signals <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">Fujimura_2021</span>]</cite>.</p> </div> <div class="ltx_para" id="S2.SS3.p2"> <p class="ltx_p" id="S2.SS3.p2.3">Although NyTT and MixIT stem from different conceptual foundations, their resulting training algorithms are similar. The primary difference is which output is selected as the enhanced signal at inference time. MixIT involves a separation task, where <math alttext="\bm{y}" class="ltx_Math" display="inline" id="S2.SS3.p2.1.m1.1"><semantics id="S2.SS3.p2.1.m1.1a"><mi id="S2.SS3.p2.1.m1.1.1" xref="S2.SS3.p2.1.m1.1.1.cmml">𝒚</mi><annotation-xml encoding="MathML-Content" id="S2.SS3.p2.1.m1.1b"><ci id="S2.SS3.p2.1.m1.1.1.cmml" xref="S2.SS3.p2.1.m1.1.1">𝒚</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS3.p2.1.m1.1c">\bm{y}</annotation><annotation encoding="application/x-llamapun" id="S2.SS3.p2.1.m1.1d">bold_italic_y</annotation></semantics></math> is separated into several sound sources, and one of them is selected as the enhanced signal. In contrast, NyTT has only one output slot and uses that output directly as the enhanced signal. 
Therefore, NyTT can be viewed as the simplified version of MixIT, where the DNN is trained with only <math alttext="L(\bm{u}_{1}+\bm{u}_{2},\bm{x})" class="ltx_Math" display="inline" id="S2.SS3.p2.2.m2.2"><semantics id="S2.SS3.p2.2.m2.2a"><mrow id="S2.SS3.p2.2.m2.2.2" xref="S2.SS3.p2.2.m2.2.2.cmml"><mi id="S2.SS3.p2.2.m2.2.2.3" xref="S2.SS3.p2.2.m2.2.2.3.cmml">L</mi><mo id="S2.SS3.p2.2.m2.2.2.2" xref="S2.SS3.p2.2.m2.2.2.2.cmml">⁢</mo><mrow id="S2.SS3.p2.2.m2.2.2.1.1" xref="S2.SS3.p2.2.m2.2.2.1.2.cmml"><mo id="S2.SS3.p2.2.m2.2.2.1.1.2" stretchy="false" xref="S2.SS3.p2.2.m2.2.2.1.2.cmml">(</mo><mrow id="S2.SS3.p2.2.m2.2.2.1.1.1" xref="S2.SS3.p2.2.m2.2.2.1.1.1.cmml"><msub id="S2.SS3.p2.2.m2.2.2.1.1.1.2" xref="S2.SS3.p2.2.m2.2.2.1.1.1.2.cmml"><mi id="S2.SS3.p2.2.m2.2.2.1.1.1.2.2" xref="S2.SS3.p2.2.m2.2.2.1.1.1.2.2.cmml">𝒖</mi><mn id="S2.SS3.p2.2.m2.2.2.1.1.1.2.3" xref="S2.SS3.p2.2.m2.2.2.1.1.1.2.3.cmml">1</mn></msub><mo id="S2.SS3.p2.2.m2.2.2.1.1.1.1" xref="S2.SS3.p2.2.m2.2.2.1.1.1.1.cmml">+</mo><msub id="S2.SS3.p2.2.m2.2.2.1.1.1.3" xref="S2.SS3.p2.2.m2.2.2.1.1.1.3.cmml"><mi id="S2.SS3.p2.2.m2.2.2.1.1.1.3.2" xref="S2.SS3.p2.2.m2.2.2.1.1.1.3.2.cmml">𝒖</mi><mn id="S2.SS3.p2.2.m2.2.2.1.1.1.3.3" xref="S2.SS3.p2.2.m2.2.2.1.1.1.3.3.cmml">2</mn></msub></mrow><mo id="S2.SS3.p2.2.m2.2.2.1.1.3" xref="S2.SS3.p2.2.m2.2.2.1.2.cmml">,</mo><mi id="S2.SS3.p2.2.m2.1.1" xref="S2.SS3.p2.2.m2.1.1.cmml">𝒙</mi><mo id="S2.SS3.p2.2.m2.2.2.1.1.4" stretchy="false" xref="S2.SS3.p2.2.m2.2.2.1.2.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS3.p2.2.m2.2b"><apply id="S2.SS3.p2.2.m2.2.2.cmml" xref="S2.SS3.p2.2.m2.2.2"><times id="S2.SS3.p2.2.m2.2.2.2.cmml" xref="S2.SS3.p2.2.m2.2.2.2"></times><ci id="S2.SS3.p2.2.m2.2.2.3.cmml" xref="S2.SS3.p2.2.m2.2.2.3">𝐿</ci><interval closure="open" id="S2.SS3.p2.2.m2.2.2.1.2.cmml" xref="S2.SS3.p2.2.m2.2.2.1.1"><apply id="S2.SS3.p2.2.m2.2.2.1.1.1.cmml" xref="S2.SS3.p2.2.m2.2.2.1.1.1"><plus id="S2.SS3.p2.2.m2.2.2.1.1.1.1.cmml" 
xref="S2.SS3.p2.2.m2.2.2.1.1.1.1"></plus><apply id="S2.SS3.p2.2.m2.2.2.1.1.1.2.cmml" xref="S2.SS3.p2.2.m2.2.2.1.1.1.2"><csymbol cd="ambiguous" id="S2.SS3.p2.2.m2.2.2.1.1.1.2.1.cmml" xref="S2.SS3.p2.2.m2.2.2.1.1.1.2">subscript</csymbol><ci id="S2.SS3.p2.2.m2.2.2.1.1.1.2.2.cmml" xref="S2.SS3.p2.2.m2.2.2.1.1.1.2.2">𝒖</ci><cn id="S2.SS3.p2.2.m2.2.2.1.1.1.2.3.cmml" type="integer" xref="S2.SS3.p2.2.m2.2.2.1.1.1.2.3">1</cn></apply><apply id="S2.SS3.p2.2.m2.2.2.1.1.1.3.cmml" xref="S2.SS3.p2.2.m2.2.2.1.1.1.3"><csymbol cd="ambiguous" id="S2.SS3.p2.2.m2.2.2.1.1.1.3.1.cmml" xref="S2.SS3.p2.2.m2.2.2.1.1.1.3">subscript</csymbol><ci id="S2.SS3.p2.2.m2.2.2.1.1.1.3.2.cmml" xref="S2.SS3.p2.2.m2.2.2.1.1.1.3.2">𝒖</ci><cn id="S2.SS3.p2.2.m2.2.2.1.1.1.3.3.cmml" type="integer" xref="S2.SS3.p2.2.m2.2.2.1.1.1.3.3">2</cn></apply></apply><ci id="S2.SS3.p2.2.m2.1.1.cmml" xref="S2.SS3.p2.2.m2.1.1">𝒙</ci></interval></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS3.p2.2.m2.2c">L(\bm{u}_{1}+\bm{u}_{2},\bm{x})</annotation><annotation encoding="application/x-llamapun" id="S2.SS3.p2.2.m2.2d">italic_L ( bold_italic_u start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + bold_italic_u start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , bold_italic_x )</annotation></semantics></math> and the enhanced signal is created by <math alttext="\bm{u}_{1}+\bm{u}_{2}" class="ltx_Math" display="inline" id="S2.SS3.p2.3.m3.1"><semantics id="S2.SS3.p2.3.m3.1a"><mrow id="S2.SS3.p2.3.m3.1.1" xref="S2.SS3.p2.3.m3.1.1.cmml"><msub id="S2.SS3.p2.3.m3.1.1.2" xref="S2.SS3.p2.3.m3.1.1.2.cmml"><mi id="S2.SS3.p2.3.m3.1.1.2.2" xref="S2.SS3.p2.3.m3.1.1.2.2.cmml">𝒖</mi><mn id="S2.SS3.p2.3.m3.1.1.2.3" xref="S2.SS3.p2.3.m3.1.1.2.3.cmml">1</mn></msub><mo id="S2.SS3.p2.3.m3.1.1.1" xref="S2.SS3.p2.3.m3.1.1.1.cmml">+</mo><msub id="S2.SS3.p2.3.m3.1.1.3" xref="S2.SS3.p2.3.m3.1.1.3.cmml"><mi id="S2.SS3.p2.3.m3.1.1.3.2" xref="S2.SS3.p2.3.m3.1.1.3.2.cmml">𝒖</mi><mn id="S2.SS3.p2.3.m3.1.1.3.3" 
xref="S2.SS3.p2.3.m3.1.1.3.3.cmml">2</mn></msub></mrow><annotation-xml encoding="MathML-Content" id="S2.SS3.p2.3.m3.1b"><apply id="S2.SS3.p2.3.m3.1.1.cmml" xref="S2.SS3.p2.3.m3.1.1"><plus id="S2.SS3.p2.3.m3.1.1.1.cmml" xref="S2.SS3.p2.3.m3.1.1.1"></plus><apply id="S2.SS3.p2.3.m3.1.1.2.cmml" xref="S2.SS3.p2.3.m3.1.1.2"><csymbol cd="ambiguous" id="S2.SS3.p2.3.m3.1.1.2.1.cmml" xref="S2.SS3.p2.3.m3.1.1.2">subscript</csymbol><ci id="S2.SS3.p2.3.m3.1.1.2.2.cmml" xref="S2.SS3.p2.3.m3.1.1.2.2">𝒖</ci><cn id="S2.SS3.p2.3.m3.1.1.2.3.cmml" type="integer" xref="S2.SS3.p2.3.m3.1.1.2.3">1</cn></apply><apply id="S2.SS3.p2.3.m3.1.1.3.cmml" xref="S2.SS3.p2.3.m3.1.1.3"><csymbol cd="ambiguous" id="S2.SS3.p2.3.m3.1.1.3.1.cmml" xref="S2.SS3.p2.3.m3.1.1.3">subscript</csymbol><ci id="S2.SS3.p2.3.m3.1.1.3.2.cmml" xref="S2.SS3.p2.3.m3.1.1.3.2">𝒖</ci><cn id="S2.SS3.p2.3.m3.1.1.3.3.cmml" type="integer" xref="S2.SS3.p2.3.m3.1.1.3.3">2</cn></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS3.p2.3.m3.1c">\bm{u}_{1}+\bm{u}_{2}</annotation><annotation encoding="application/x-llamapun" id="S2.SS3.p2.3.m3.1d">bold_italic_u start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + bold_italic_u start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT</annotation></semantics></math> during inference. Both MixIT and NyTT are major unsupervised TSE methods, and both provide greater flexibility than specialized unsupervised training algorithms. However, as mentioned above, NyTT has a simpler and more flexible architecture than MixIT. This simplicity makes NyTT easier to improve and extend, and more suitable for analysis. For these reasons, we have chosen NyTT as the target of our analysis.</p> </div> <div class="ltx_para" id="S2.SS3.p3"> <p class="ltx_p" id="S2.SS3.p3.1">One limitation of NyTT is that it relies on the zero-mean noise assumption and the use of MSE, which are the conditions for Noise2Noise. 
Conversely, if NyTT does not actually require these conditions, it can readily be applied to a wider range of tasks and loss functions. In Sec. <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.SS2" title="4.2 Validity of interpretation of NyTT ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">4.2</span></a>, we demonstrate that NyTT indeed works without these conditions, and in Secs. <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S5" title="5 Experimental analysis in the dereverberation task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">5</span></a> and <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S6" title="6 Experimental analysis in the declipping task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">6</span></a>, we further show its effectiveness in dereverberation and declipping tasks.</p> </div> </section> </section> <section class="ltx_section" id="S3"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">3 </span>Motivation and content of the investigation</h2> <section class="ltx_subsection" id="S3.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">3.1 </span>Validity of the interpretation of NyTT</h3> <div class="ltx_para" id="S3.SS1.p1"> <p class="ltx_p" id="S3.SS1.p1.3">NyTT was proposed with inspiration from Noise2Noise, which uses noisy signals as target signals on the basis of the averaging effect of the MSE loss function. 
On the other hand, NyTT can also be interpreted as being trained to remove <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S3.SS1.p1.1.m1.1"><semantics id="S3.SS1.p1.1.m1.1a"><msup id="S3.SS1.p1.1.m1.1.1" xref="S3.SS1.p1.1.m1.1.1.cmml"><mi id="S3.SS1.p1.1.m1.1.1.2" xref="S3.SS1.p1.1.m1.1.1.2.cmml">𝒏</mi><mi id="S3.SS1.p1.1.m1.1.1.3" xref="S3.SS1.p1.1.m1.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.1.m1.1b"><apply id="S3.SS1.p1.1.m1.1.1.cmml" xref="S3.SS1.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S3.SS1.p1.1.m1.1.1.1.cmml" xref="S3.SS1.p1.1.m1.1.1">superscript</csymbol><ci id="S3.SS1.p1.1.m1.1.1.2.cmml" xref="S3.SS1.p1.1.m1.1.1.2">𝒏</ci><ci id="S3.SS1.p1.1.m1.1.1.3.cmml" xref="S3.SS1.p1.1.m1.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.1.m1.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p1.1.m1.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> from the <span class="ltx_text ltx_font_italic" id="S3.SS1.p1.3.1">more noisy</span> signal <math alttext="\bm{y}" class="ltx_Math" display="inline" id="S3.SS1.p1.2.m2.1"><semantics id="S3.SS1.p1.2.m2.1a"><mi id="S3.SS1.p1.2.m2.1.1" xref="S3.SS1.p1.2.m2.1.1.cmml">𝒚</mi><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.2.m2.1b"><ci id="S3.SS1.p1.2.m2.1.1.cmml" xref="S3.SS1.p1.2.m2.1.1">𝒚</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.2.m2.1c">\bm{y}</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p1.2.m2.1d">bold_italic_y</annotation></semantics></math>, performing TSE by removing noise components corresponding to <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S3.SS1.p1.3.m3.1"><semantics id="S3.SS1.p1.3.m3.1a"><msup id="S3.SS1.p1.3.m3.1.1" xref="S3.SS1.p1.3.m3.1.1.cmml"><mi id="S3.SS1.p1.3.m3.1.1.2" xref="S3.SS1.p1.3.m3.1.1.2.cmml">𝒏</mi><mi 
id="S3.SS1.p1.3.m3.1.1.3" xref="S3.SS1.p1.3.m3.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.3.m3.1b"><apply id="S3.SS1.p1.3.m3.1.1.cmml" xref="S3.SS1.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S3.SS1.p1.3.m3.1.1.1.cmml" xref="S3.SS1.p1.3.m3.1.1">superscript</csymbol><ci id="S3.SS1.p1.3.m3.1.1.2.cmml" xref="S3.SS1.p1.3.m3.1.1.2">𝒏</ci><ci id="S3.SS1.p1.3.m3.1.1.3.cmml" xref="S3.SS1.p1.3.m3.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.3.m3.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p1.3.m3.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> from the noisy input signal. Therefore, we investigate whether NyTT can be interpreted as Noise2Noise through 1) an analysis of the signals processed in NyTT (Sec. <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.SS2.SSS1" title="4.2.1 Analysis of signals processed in NyTT ‣ 4.2 Validity of interpretation of NyTT ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">4.2.1</span></a>) and 2) an evaluation of NyTT with a loss function that does not satisfy the conditions of Noise2Noise (Sec. <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.SS2.SSS2" title="4.2.2 Evaluation of NyTT with loss functions that do not satisfy the conditions of Noise2Noise ‣ 4.2 Validity of interpretation of NyTT ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">4.2.2</span></a>). 
If NyTT is not Noise2Noise, the zero-mean noise assumption and the use of MSE will no longer be necessary, allowing us to use various loss functions and apply NyTT to various tasks.</p> </div> <figure class="ltx_figure" id="S3.F2"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="162" id="S3.F2.g1" src="x2.png" width="822"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure">Figure 2: </span>Overview of IterNyTT.</figcaption> </figure> </section> <section class="ltx_subsection" id="S3.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">3.2 </span>Improvement of NyTT through iteration</h3> <div class="ltx_para" id="S3.SS2.p1"> <p class="ltx_p" id="S3.SS2.p1.9">It has been shown that the performance of NyTT improves as the SNR of the noisy target increases <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">Fujimura_2021</span>]</cite>. On the basis of this property, we propose IterNyTT, which achieves better performance through an iterative process (Fig. <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S3.F2" title="Figure 2 ‣ 3.1 Validity of the interpretation of NyTT ‣ 3 Motivation and content of the investigation ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">2</span></a>). 
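</p> <p class="ltx_p">The data flow of this iterative scheme can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the implementation used in this work: the names <span class="ltx_text ltx_font_typewriter">train_nytt</span> and <span class="ltx_text ltx_font_typewriter">iter_nytt</span> are hypothetical, and the DNN is replaced by a single scalar gain fitted by least squares so that the example runs on its own.</p>
<pre class="ltx_verbatim ltx_font_typewriter">
```python
import numpy as np


def train_nytt(noisy_targets, add_noises):
    # Toy stand-in for DNN training (an assumption, not the paper's model):
    # build "more noisy" inputs y = target + n_add and fit one scalar gain
    # that minimises the MSE between gain * y and the noisy target.
    y = noisy_targets + add_noises
    gain = float(np.sum(y * noisy_targets) / np.sum(y * y))
    return lambda signal: gain * signal


def iter_nytt(x, add_noises, num_iterations):
    # x holds the original noisy targets and stays fixed in every iteration.
    targets = x
    models = []
    for _ in range(num_iterations):
        f_i = train_nytt(targets, add_noises)  # train f_i on current targets
        targets = f_i(x)  # enhance the ORIGINAL x, not the previous output
        models.append(f_i)
    return models, targets


rng = np.random.default_rng(0)
s = rng.standard_normal((4, 1000))           # clean signals (unknown in practice)
x = s + 0.3 * rng.standard_normal(s.shape)   # noisy targets x = s + n_obs
n_add = 0.5 * rng.standard_normal(s.shape)   # additional noise n_add
models, s_hat = iter_nytt(x, n_add, num_iterations=3)
```
</pre>
<p class="ltx_p">A scalar gain cannot separate the target signal from noise, so this toy does not reproduce any SNR improvement; it only shows that each iteration retrains on the latest enhanced targets while always enhancing the original noisy target.</p> <p class="ltx_p">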
In the first iteration of IterNyTT, we train a DNN <math alttext="f_{1}(\cdot)" class="ltx_Math" display="inline" id="S3.SS2.p1.1.m1.1"><semantics id="S3.SS2.p1.1.m1.1a"><mrow id="S3.SS2.p1.1.m1.1.2" xref="S3.SS2.p1.1.m1.1.2.cmml"><msub id="S3.SS2.p1.1.m1.1.2.2" xref="S3.SS2.p1.1.m1.1.2.2.cmml"><mi id="S3.SS2.p1.1.m1.1.2.2.2" xref="S3.SS2.p1.1.m1.1.2.2.2.cmml">f</mi><mn id="S3.SS2.p1.1.m1.1.2.2.3" xref="S3.SS2.p1.1.m1.1.2.2.3.cmml">1</mn></msub><mo id="S3.SS2.p1.1.m1.1.2.1" xref="S3.SS2.p1.1.m1.1.2.1.cmml">⁢</mo><mrow id="S3.SS2.p1.1.m1.1.2.3.2" xref="S3.SS2.p1.1.m1.1.2.cmml"><mo id="S3.SS2.p1.1.m1.1.2.3.2.1" stretchy="false" xref="S3.SS2.p1.1.m1.1.2.cmml">(</mo><mo id="S3.SS2.p1.1.m1.1.1" lspace="0em" rspace="0em" xref="S3.SS2.p1.1.m1.1.1.cmml">⋅</mo><mo id="S3.SS2.p1.1.m1.1.2.3.2.2" stretchy="false" xref="S3.SS2.p1.1.m1.1.2.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.1.m1.1b"><apply id="S3.SS2.p1.1.m1.1.2.cmml" xref="S3.SS2.p1.1.m1.1.2"><times id="S3.SS2.p1.1.m1.1.2.1.cmml" xref="S3.SS2.p1.1.m1.1.2.1"></times><apply id="S3.SS2.p1.1.m1.1.2.2.cmml" xref="S3.SS2.p1.1.m1.1.2.2"><csymbol cd="ambiguous" id="S3.SS2.p1.1.m1.1.2.2.1.cmml" xref="S3.SS2.p1.1.m1.1.2.2">subscript</csymbol><ci id="S3.SS2.p1.1.m1.1.2.2.2.cmml" xref="S3.SS2.p1.1.m1.1.2.2.2">𝑓</ci><cn id="S3.SS2.p1.1.m1.1.2.2.3.cmml" type="integer" xref="S3.SS2.p1.1.m1.1.2.2.3">1</cn></apply><ci id="S3.SS2.p1.1.m1.1.1.cmml" xref="S3.SS2.p1.1.m1.1.1">⋅</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.1.m1.1c">f_{1}(\cdot)</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.1.m1.1d">italic_f start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( ⋅ )</annotation></semantics></math> using NyTT with the noisy target <math alttext="\bm{x}" class="ltx_Math" display="inline" id="S3.SS2.p1.2.m2.1"><semantics id="S3.SS2.p1.2.m2.1a"><mi id="S3.SS2.p1.2.m2.1.1" xref="S3.SS2.p1.2.m2.1.1.cmml">𝒙</mi><annotation-xml encoding="MathML-Content" 
id="S3.SS2.p1.2.m2.1b"><ci id="S3.SS2.p1.2.m2.1.1.cmml" xref="S3.SS2.p1.2.m2.1.1">𝒙</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.2.m2.1c">\bm{x}</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.2.m2.1d">bold_italic_x</annotation></semantics></math>. Next, we apply TSE to the noisy target <math alttext="\bm{x}" class="ltx_Math" display="inline" id="S3.SS2.p1.3.m3.1"><semantics id="S3.SS2.p1.3.m3.1a"><mi id="S3.SS2.p1.3.m3.1.1" xref="S3.SS2.p1.3.m3.1.1.cmml">𝒙</mi><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.3.m3.1b"><ci id="S3.SS2.p1.3.m3.1.1.cmml" xref="S3.SS2.p1.3.m3.1.1">𝒙</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.3.m3.1c">\bm{x}</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.3.m3.1d">bold_italic_x</annotation></semantics></math> using <math alttext="f_{1}(\cdot)" class="ltx_Math" display="inline" id="S3.SS2.p1.4.m4.1"><semantics id="S3.SS2.p1.4.m4.1a"><mrow id="S3.SS2.p1.4.m4.1.2" xref="S3.SS2.p1.4.m4.1.2.cmml"><msub id="S3.SS2.p1.4.m4.1.2.2" xref="S3.SS2.p1.4.m4.1.2.2.cmml"><mi id="S3.SS2.p1.4.m4.1.2.2.2" xref="S3.SS2.p1.4.m4.1.2.2.2.cmml">f</mi><mn id="S3.SS2.p1.4.m4.1.2.2.3" xref="S3.SS2.p1.4.m4.1.2.2.3.cmml">1</mn></msub><mo id="S3.SS2.p1.4.m4.1.2.1" xref="S3.SS2.p1.4.m4.1.2.1.cmml">⁢</mo><mrow id="S3.SS2.p1.4.m4.1.2.3.2" xref="S3.SS2.p1.4.m4.1.2.cmml"><mo id="S3.SS2.p1.4.m4.1.2.3.2.1" stretchy="false" xref="S3.SS2.p1.4.m4.1.2.cmml">(</mo><mo id="S3.SS2.p1.4.m4.1.1" lspace="0em" rspace="0em" xref="S3.SS2.p1.4.m4.1.1.cmml">⋅</mo><mo id="S3.SS2.p1.4.m4.1.2.3.2.2" stretchy="false" xref="S3.SS2.p1.4.m4.1.2.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.4.m4.1b"><apply id="S3.SS2.p1.4.m4.1.2.cmml" xref="S3.SS2.p1.4.m4.1.2"><times id="S3.SS2.p1.4.m4.1.2.1.cmml" xref="S3.SS2.p1.4.m4.1.2.1"></times><apply id="S3.SS2.p1.4.m4.1.2.2.cmml" xref="S3.SS2.p1.4.m4.1.2.2"><csymbol cd="ambiguous" 
id="S3.SS2.p1.4.m4.1.2.2.1.cmml" xref="S3.SS2.p1.4.m4.1.2.2">subscript</csymbol><ci id="S3.SS2.p1.4.m4.1.2.2.2.cmml" xref="S3.SS2.p1.4.m4.1.2.2.2">𝑓</ci><cn id="S3.SS2.p1.4.m4.1.2.2.3.cmml" type="integer" xref="S3.SS2.p1.4.m4.1.2.2.3">1</cn></apply><ci id="S3.SS2.p1.4.m4.1.1.cmml" xref="S3.SS2.p1.4.m4.1.1">⋅</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.4.m4.1c">f_{1}(\cdot)</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.4.m4.1d">italic_f start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( ⋅ )</annotation></semantics></math> and obtain the enhanced signal <math alttext="\hat{\bm{s}}_{i-1}" class="ltx_Math" display="inline" id="S3.SS2.p1.5.m5.1"><semantics id="S3.SS2.p1.5.m5.1a"><msub id="S3.SS2.p1.5.m5.1.1" xref="S3.SS2.p1.5.m5.1.1.cmml"><mover accent="true" id="S3.SS2.p1.5.m5.1.1.2" xref="S3.SS2.p1.5.m5.1.1.2.cmml"><mi id="S3.SS2.p1.5.m5.1.1.2.2" xref="S3.SS2.p1.5.m5.1.1.2.2.cmml">𝒔</mi><mo id="S3.SS2.p1.5.m5.1.1.2.1" xref="S3.SS2.p1.5.m5.1.1.2.1.cmml">^</mo></mover><mrow id="S3.SS2.p1.5.m5.1.1.3" xref="S3.SS2.p1.5.m5.1.1.3.cmml"><mi id="S3.SS2.p1.5.m5.1.1.3.2" xref="S3.SS2.p1.5.m5.1.1.3.2.cmml">i</mi><mo id="S3.SS2.p1.5.m5.1.1.3.1" xref="S3.SS2.p1.5.m5.1.1.3.1.cmml">−</mo><mn id="S3.SS2.p1.5.m5.1.1.3.3" xref="S3.SS2.p1.5.m5.1.1.3.3.cmml">1</mn></mrow></msub><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.5.m5.1b"><apply id="S3.SS2.p1.5.m5.1.1.cmml" xref="S3.SS2.p1.5.m5.1.1"><csymbol cd="ambiguous" id="S3.SS2.p1.5.m5.1.1.1.cmml" xref="S3.SS2.p1.5.m5.1.1">subscript</csymbol><apply id="S3.SS2.p1.5.m5.1.1.2.cmml" xref="S3.SS2.p1.5.m5.1.1.2"><ci id="S3.SS2.p1.5.m5.1.1.2.1.cmml" xref="S3.SS2.p1.5.m5.1.1.2.1">^</ci><ci id="S3.SS2.p1.5.m5.1.1.2.2.cmml" xref="S3.SS2.p1.5.m5.1.1.2.2">𝒔</ci></apply><apply id="S3.SS2.p1.5.m5.1.1.3.cmml" xref="S3.SS2.p1.5.m5.1.1.3"><minus id="S3.SS2.p1.5.m5.1.1.3.1.cmml" xref="S3.SS2.p1.5.m5.1.1.3.1"></minus><ci id="S3.SS2.p1.5.m5.1.1.3.2.cmml" 
xref="S3.SS2.p1.5.m5.1.1.3.2">𝑖</ci><cn id="S3.SS2.p1.5.m5.1.1.3.3.cmml" type="integer" xref="S3.SS2.p1.5.m5.1.1.3.3">1</cn></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.5.m5.1c">\hat{\bm{s}}_{i-1}</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.5.m5.1d">over^ start_ARG bold_italic_s end_ARG start_POSTSUBSCRIPT italic_i - 1 end_POSTSUBSCRIPT</annotation></semantics></math>. Then, we train another DNN <math alttext="f_{2}(\cdot)" class="ltx_Math" display="inline" id="S3.SS2.p1.6.m6.1"><semantics id="S3.SS2.p1.6.m6.1a"><mrow id="S3.SS2.p1.6.m6.1.2" xref="S3.SS2.p1.6.m6.1.2.cmml"><msub id="S3.SS2.p1.6.m6.1.2.2" xref="S3.SS2.p1.6.m6.1.2.2.cmml"><mi id="S3.SS2.p1.6.m6.1.2.2.2" xref="S3.SS2.p1.6.m6.1.2.2.2.cmml">f</mi><mn id="S3.SS2.p1.6.m6.1.2.2.3" xref="S3.SS2.p1.6.m6.1.2.2.3.cmml">2</mn></msub><mo id="S3.SS2.p1.6.m6.1.2.1" xref="S3.SS2.p1.6.m6.1.2.1.cmml">⁢</mo><mrow id="S3.SS2.p1.6.m6.1.2.3.2" xref="S3.SS2.p1.6.m6.1.2.cmml"><mo id="S3.SS2.p1.6.m6.1.2.3.2.1" stretchy="false" xref="S3.SS2.p1.6.m6.1.2.cmml">(</mo><mo id="S3.SS2.p1.6.m6.1.1" lspace="0em" rspace="0em" xref="S3.SS2.p1.6.m6.1.1.cmml">⋅</mo><mo id="S3.SS2.p1.6.m6.1.2.3.2.2" stretchy="false" xref="S3.SS2.p1.6.m6.1.2.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.6.m6.1b"><apply id="S3.SS2.p1.6.m6.1.2.cmml" xref="S3.SS2.p1.6.m6.1.2"><times id="S3.SS2.p1.6.m6.1.2.1.cmml" xref="S3.SS2.p1.6.m6.1.2.1"></times><apply id="S3.SS2.p1.6.m6.1.2.2.cmml" xref="S3.SS2.p1.6.m6.1.2.2"><csymbol cd="ambiguous" id="S3.SS2.p1.6.m6.1.2.2.1.cmml" xref="S3.SS2.p1.6.m6.1.2.2">subscript</csymbol><ci id="S3.SS2.p1.6.m6.1.2.2.2.cmml" xref="S3.SS2.p1.6.m6.1.2.2.2">𝑓</ci><cn id="S3.SS2.p1.6.m6.1.2.2.3.cmml" type="integer" xref="S3.SS2.p1.6.m6.1.2.2.3">2</cn></apply><ci id="S3.SS2.p1.6.m6.1.1.cmml" xref="S3.SS2.p1.6.m6.1.1">⋅</ci></apply></annotation-xml><annotation encoding="application/x-tex" 
id="S3.SS2.p1.6.m6.1c">f_{2}(\cdot)</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.6.m6.1d">italic_f start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ( ⋅ )</annotation></semantics></math> using NyTT with <math alttext="\hat{\bm{s}}_{i-1}" class="ltx_Math" display="inline" id="S3.SS2.p1.7.m7.1"><semantics id="S3.SS2.p1.7.m7.1a"><msub id="S3.SS2.p1.7.m7.1.1" xref="S3.SS2.p1.7.m7.1.1.cmml"><mover accent="true" id="S3.SS2.p1.7.m7.1.1.2" xref="S3.SS2.p1.7.m7.1.1.2.cmml"><mi id="S3.SS2.p1.7.m7.1.1.2.2" xref="S3.SS2.p1.7.m7.1.1.2.2.cmml">𝒔</mi><mo id="S3.SS2.p1.7.m7.1.1.2.1" xref="S3.SS2.p1.7.m7.1.1.2.1.cmml">^</mo></mover><mrow id="S3.SS2.p1.7.m7.1.1.3" xref="S3.SS2.p1.7.m7.1.1.3.cmml"><mi id="S3.SS2.p1.7.m7.1.1.3.2" xref="S3.SS2.p1.7.m7.1.1.3.2.cmml">i</mi><mo id="S3.SS2.p1.7.m7.1.1.3.1" xref="S3.SS2.p1.7.m7.1.1.3.1.cmml">−</mo><mn id="S3.SS2.p1.7.m7.1.1.3.3" xref="S3.SS2.p1.7.m7.1.1.3.3.cmml">1</mn></mrow></msub><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.7.m7.1b"><apply id="S3.SS2.p1.7.m7.1.1.cmml" xref="S3.SS2.p1.7.m7.1.1"><csymbol cd="ambiguous" id="S3.SS2.p1.7.m7.1.1.1.cmml" xref="S3.SS2.p1.7.m7.1.1">subscript</csymbol><apply id="S3.SS2.p1.7.m7.1.1.2.cmml" xref="S3.SS2.p1.7.m7.1.1.2"><ci id="S3.SS2.p1.7.m7.1.1.2.1.cmml" xref="S3.SS2.p1.7.m7.1.1.2.1">^</ci><ci id="S3.SS2.p1.7.m7.1.1.2.2.cmml" xref="S3.SS2.p1.7.m7.1.1.2.2">𝒔</ci></apply><apply id="S3.SS2.p1.7.m7.1.1.3.cmml" xref="S3.SS2.p1.7.m7.1.1.3"><minus id="S3.SS2.p1.7.m7.1.1.3.1.cmml" xref="S3.SS2.p1.7.m7.1.1.3.1"></minus><ci id="S3.SS2.p1.7.m7.1.1.3.2.cmml" xref="S3.SS2.p1.7.m7.1.1.3.2">𝑖</ci><cn id="S3.SS2.p1.7.m7.1.1.3.3.cmml" type="integer" xref="S3.SS2.p1.7.m7.1.1.3.3">1</cn></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.7.m7.1c">\hat{\bm{s}}_{i-1}</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.7.m7.1d">over^ start_ARG bold_italic_s end_ARG start_POSTSUBSCRIPT italic_i - 1 
end_POSTSUBSCRIPT</annotation></semantics></math> as the noisy target signal. To prevent the degradation of the target signal component, we apply TSE to the original noisy target <math alttext="\bm{x}" class="ltx_Math" display="inline" id="S3.SS2.p1.8.m8.1"><semantics id="S3.SS2.p1.8.m8.1a"><mi id="S3.SS2.p1.8.m8.1.1" xref="S3.SS2.p1.8.m8.1.1.cmml">𝒙</mi><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.8.m8.1b"><ci id="S3.SS2.p1.8.m8.1.1.cmml" xref="S3.SS2.p1.8.m8.1.1">𝒙</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.8.m8.1c">\bm{x}</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.8.m8.1d">bold_italic_x</annotation></semantics></math>, not to the already enhanced signal <math alttext="\hat{\bm{s}}_{i-1}" class="ltx_Math" display="inline" id="S3.SS2.p1.9.m9.1"><semantics id="S3.SS2.p1.9.m9.1a"><msub id="S3.SS2.p1.9.m9.1.1" xref="S3.SS2.p1.9.m9.1.1.cmml"><mover accent="true" id="S3.SS2.p1.9.m9.1.1.2" xref="S3.SS2.p1.9.m9.1.1.2.cmml"><mi id="S3.SS2.p1.9.m9.1.1.2.2" xref="S3.SS2.p1.9.m9.1.1.2.2.cmml">𝒔</mi><mo id="S3.SS2.p1.9.m9.1.1.2.1" xref="S3.SS2.p1.9.m9.1.1.2.1.cmml">^</mo></mover><mrow id="S3.SS2.p1.9.m9.1.1.3" xref="S3.SS2.p1.9.m9.1.1.3.cmml"><mi id="S3.SS2.p1.9.m9.1.1.3.2" xref="S3.SS2.p1.9.m9.1.1.3.2.cmml">i</mi><mo id="S3.SS2.p1.9.m9.1.1.3.1" xref="S3.SS2.p1.9.m9.1.1.3.1.cmml">−</mo><mn id="S3.SS2.p1.9.m9.1.1.3.3" xref="S3.SS2.p1.9.m9.1.1.3.3.cmml">1</mn></mrow></msub><annotation-xml encoding="MathML-Content" id="S3.SS2.p1.9.m9.1b"><apply id="S3.SS2.p1.9.m9.1.1.cmml" xref="S3.SS2.p1.9.m9.1.1"><csymbol cd="ambiguous" id="S3.SS2.p1.9.m9.1.1.1.cmml" xref="S3.SS2.p1.9.m9.1.1">subscript</csymbol><apply id="S3.SS2.p1.9.m9.1.1.2.cmml" xref="S3.SS2.p1.9.m9.1.1.2"><ci id="S3.SS2.p1.9.m9.1.1.2.1.cmml" xref="S3.SS2.p1.9.m9.1.1.2.1">^</ci><ci id="S3.SS2.p1.9.m9.1.1.2.2.cmml" xref="S3.SS2.p1.9.m9.1.1.2.2">𝒔</ci></apply><apply id="S3.SS2.p1.9.m9.1.1.3.cmml" xref="S3.SS2.p1.9.m9.1.1.3"><minus 
id="S3.SS2.p1.9.m9.1.1.3.1.cmml" xref="S3.SS2.p1.9.m9.1.1.3.1"></minus><ci id="S3.SS2.p1.9.m9.1.1.3.2.cmml" xref="S3.SS2.p1.9.m9.1.1.3.2">𝑖</ci><cn id="S3.SS2.p1.9.m9.1.1.3.3.cmml" type="integer" xref="S3.SS2.p1.9.m9.1.1.3.3">1</cn></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS2.p1.9.m9.1c">\hat{\bm{s}}_{i-1}</annotation><annotation encoding="application/x-llamapun" id="S3.SS2.p1.9.m9.1d">over^ start_ARG bold_italic_s end_ARG start_POSTSUBSCRIPT italic_i - 1 end_POSTSUBSCRIPT</annotation></semantics></math>. Through this iterative process, we can improve the SNR of the noisy targets, which is expected to improve the performance of NyTT. We investigate the effectiveness of IterNyTT in Sec. <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.SS3" title="4.3 Effectiveness of IterNyTT ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">4.3</span></a>.</p> </div> </section> <section class="ltx_subsection" id="S3.SS3"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">3.3 </span>Effects of mismatches between noise signals</h3> <div class="ltx_para" id="S3.SS3.p1"> <p class="ltx_p" id="S3.SS3.p1.11">In the NyTT framework, there are three types of noise signal: <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S3.SS3.p1.1.m1.1"><semantics id="S3.SS3.p1.1.m1.1a"><msup id="S3.SS3.p1.1.m1.1.1" xref="S3.SS3.p1.1.m1.1.1.cmml"><mi id="S3.SS3.p1.1.m1.1.1.2" xref="S3.SS3.p1.1.m1.1.1.2.cmml">𝒏</mi><mi id="S3.SS3.p1.1.m1.1.1.3" xref="S3.SS3.p1.1.m1.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.1.m1.1b"><apply id="S3.SS3.p1.1.m1.1.1.cmml" xref="S3.SS3.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S3.SS3.p1.1.m1.1.1.1.cmml" xref="S3.SS3.p1.1.m1.1.1">superscript</csymbol><ci id="S3.SS3.p1.1.m1.1.1.2.cmml" 
xref="S3.SS3.p1.1.m1.1.1.2">𝒏</ci><ci id="S3.SS3.p1.1.m1.1.1.3.cmml" xref="S3.SS3.p1.1.m1.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.1.m1.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.1.m1.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math>, <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S3.SS3.p1.2.m2.1"><semantics id="S3.SS3.p1.2.m2.1a"><msup id="S3.SS3.p1.2.m2.1.1" xref="S3.SS3.p1.2.m2.1.1.cmml"><mi id="S3.SS3.p1.2.m2.1.1.2" xref="S3.SS3.p1.2.m2.1.1.2.cmml">𝒏</mi><mi id="S3.SS3.p1.2.m2.1.1.3" xref="S3.SS3.p1.2.m2.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.2.m2.1b"><apply id="S3.SS3.p1.2.m2.1.1.cmml" xref="S3.SS3.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S3.SS3.p1.2.m2.1.1.1.cmml" xref="S3.SS3.p1.2.m2.1.1">superscript</csymbol><ci id="S3.SS3.p1.2.m2.1.1.2.cmml" xref="S3.SS3.p1.2.m2.1.1.2">𝒏</ci><ci id="S3.SS3.p1.2.m2.1.1.3.cmml" xref="S3.SS3.p1.2.m2.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.2.m2.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.2.m2.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math>, and noise included in the test data <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S3.SS3.p1.3.m3.1"><semantics id="S3.SS3.p1.3.m3.1a"><msup id="S3.SS3.p1.3.m3.1.1" xref="S3.SS3.p1.3.m3.1.1.cmml"><mi id="S3.SS3.p1.3.m3.1.1.2" xref="S3.SS3.p1.3.m3.1.1.2.cmml">𝒏</mi><mi id="S3.SS3.p1.3.m3.1.1.3" xref="S3.SS3.p1.3.m3.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.3.m3.1b"><apply id="S3.SS3.p1.3.m3.1.1.cmml" xref="S3.SS3.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S3.SS3.p1.3.m3.1.1.1.cmml" xref="S3.SS3.p1.3.m3.1.1">superscript</csymbol><ci 
id="S3.SS3.p1.3.m3.1.1.2.cmml" xref="S3.SS3.p1.3.m3.1.1.2">𝒏</ci><ci id="S3.SS3.p1.3.m3.1.1.3.cmml" xref="S3.SS3.p1.3.m3.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.3.m3.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.3.m3.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math>. In Sec. <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.SS4" title="4.4 Effects of noise mismatches ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">4.4</span></a>, we investigate the effects of mismatches between types of noise signal on the performance of NyTT and IterNyTT. For instance, the performance of CTT is degraded when there is a mismatch between <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S3.SS3.p1.4.m4.1"><semantics id="S3.SS3.p1.4.m4.1a"><msup id="S3.SS3.p1.4.m4.1.1" xref="S3.SS3.p1.4.m4.1.1.cmml"><mi id="S3.SS3.p1.4.m4.1.1.2" xref="S3.SS3.p1.4.m4.1.1.2.cmml">𝒏</mi><mi id="S3.SS3.p1.4.m4.1.1.3" xref="S3.SS3.p1.4.m4.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.4.m4.1b"><apply id="S3.SS3.p1.4.m4.1.1.cmml" xref="S3.SS3.p1.4.m4.1.1"><csymbol cd="ambiguous" id="S3.SS3.p1.4.m4.1.1.1.cmml" xref="S3.SS3.p1.4.m4.1.1">superscript</csymbol><ci id="S3.SS3.p1.4.m4.1.1.2.cmml" xref="S3.SS3.p1.4.m4.1.1.2">𝒏</ci><ci id="S3.SS3.p1.4.m4.1.1.3.cmml" xref="S3.SS3.p1.4.m4.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.4.m4.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.4.m4.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" 
id="S3.SS3.p1.5.m5.1"><semantics id="S3.SS3.p1.5.m5.1a"><msup id="S3.SS3.p1.5.m5.1.1" xref="S3.SS3.p1.5.m5.1.1.cmml"><mi id="S3.SS3.p1.5.m5.1.1.2" xref="S3.SS3.p1.5.m5.1.1.2.cmml">𝒏</mi><mi id="S3.SS3.p1.5.m5.1.1.3" xref="S3.SS3.p1.5.m5.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.5.m5.1b"><apply id="S3.SS3.p1.5.m5.1.1.cmml" xref="S3.SS3.p1.5.m5.1.1"><csymbol cd="ambiguous" id="S3.SS3.p1.5.m5.1.1.1.cmml" xref="S3.SS3.p1.5.m5.1.1">superscript</csymbol><ci id="S3.SS3.p1.5.m5.1.1.2.cmml" xref="S3.SS3.p1.5.m5.1.1.2">𝒏</ci><ci id="S3.SS3.p1.5.m5.1.1.3.cmml" xref="S3.SS3.p1.5.m5.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.5.m5.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.5.m5.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math> <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">Wisdom_2020</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">ito2023audio</span>]</cite>. In the experiments, 1) we evaluate the performance of CTT, NyTT, and IterNyTT under mismatched conditions, and investigate the effects of each mismatch (Sec. <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.SS4.SSS1" title="4.4.1 Effects of mismatches on the performance ‣ 4.4 Effects of noise mismatches ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">4.4.1</span></a>). 
Additionally, considering the effects of mismatch between <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S3.SS3.p1.6.m6.1"><semantics id="S3.SS3.p1.6.m6.1a"><msup id="S3.SS3.p1.6.m6.1.1" xref="S3.SS3.p1.6.m6.1.1.cmml"><mi id="S3.SS3.p1.6.m6.1.1.2" xref="S3.SS3.p1.6.m6.1.1.2.cmml">𝒏</mi><mi id="S3.SS3.p1.6.m6.1.1.3" xref="S3.SS3.p1.6.m6.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.6.m6.1b"><apply id="S3.SS3.p1.6.m6.1.1.cmml" xref="S3.SS3.p1.6.m6.1.1"><csymbol cd="ambiguous" id="S3.SS3.p1.6.m6.1.1.1.cmml" xref="S3.SS3.p1.6.m6.1.1">superscript</csymbol><ci id="S3.SS3.p1.6.m6.1.1.2.cmml" xref="S3.SS3.p1.6.m6.1.1.2">𝒏</ci><ci id="S3.SS3.p1.6.m6.1.1.3.cmml" xref="S3.SS3.p1.6.m6.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.6.m6.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.6.m6.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S3.SS3.p1.7.m7.1"><semantics id="S3.SS3.p1.7.m7.1a"><msup id="S3.SS3.p1.7.m7.1.1" xref="S3.SS3.p1.7.m7.1.1.cmml"><mi id="S3.SS3.p1.7.m7.1.1.2" xref="S3.SS3.p1.7.m7.1.1.2.cmml">𝒏</mi><mi id="S3.SS3.p1.7.m7.1.1.3" xref="S3.SS3.p1.7.m7.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.7.m7.1b"><apply id="S3.SS3.p1.7.m7.1.1.cmml" xref="S3.SS3.p1.7.m7.1.1"><csymbol cd="ambiguous" id="S3.SS3.p1.7.m7.1.1.1.cmml" xref="S3.SS3.p1.7.m7.1.1">superscript</csymbol><ci id="S3.SS3.p1.7.m7.1.1.2.cmml" xref="S3.SS3.p1.7.m7.1.1.2">𝒏</ci><ci id="S3.SS3.p1.7.m7.1.1.3.cmml" xref="S3.SS3.p1.7.m7.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.7.m7.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.7.m7.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test 
end_POSTSUPERSCRIPT</annotation></semantics></math>, we investigate the effects of 2) the SNRs of the noisy targets <math alttext="\bm{x}" class="ltx_Math" display="inline" id="S3.SS3.p1.8.m8.1"><semantics id="S3.SS3.p1.8.m8.1a"><mi id="S3.SS3.p1.8.m8.1.1" xref="S3.SS3.p1.8.m8.1.1.cmml">𝒙</mi><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.8.m8.1b"><ci id="S3.SS3.p1.8.m8.1.1.cmml" xref="S3.SS3.p1.8.m8.1.1">𝒙</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.8.m8.1c">\bm{x}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.8.m8.1d">bold_italic_x</annotation></semantics></math> (<math alttext="\mathrm{SNR}_{\bm{x}}=\log_{10}||\bm{s}||^{2}_{2}/||\bm{n}^{\rm obs}||^{2}_{2}" class="ltx_Math" display="inline" id="S3.SS3.p1.9.m9.2"><semantics id="S3.SS3.p1.9.m9.2a"><mrow id="S3.SS3.p1.9.m9.2.2" xref="S3.SS3.p1.9.m9.2.2.cmml"><msub id="S3.SS3.p1.9.m9.2.2.3" xref="S3.SS3.p1.9.m9.2.2.3.cmml"><mi id="S3.SS3.p1.9.m9.2.2.3.2" xref="S3.SS3.p1.9.m9.2.2.3.2.cmml">SNR</mi><mi id="S3.SS3.p1.9.m9.2.2.3.3" xref="S3.SS3.p1.9.m9.2.2.3.3.cmml">𝒙</mi></msub><mo id="S3.SS3.p1.9.m9.2.2.2" xref="S3.SS3.p1.9.m9.2.2.2.cmml">=</mo><mrow id="S3.SS3.p1.9.m9.2.2.1" xref="S3.SS3.p1.9.m9.2.2.1.cmml"><mrow id="S3.SS3.p1.9.m9.2.2.1.3" xref="S3.SS3.p1.9.m9.2.2.1.3.cmml"><msub id="S3.SS3.p1.9.m9.2.2.1.3.2" xref="S3.SS3.p1.9.m9.2.2.1.3.2.cmml"><mi id="S3.SS3.p1.9.m9.2.2.1.3.2.2" xref="S3.SS3.p1.9.m9.2.2.1.3.2.2.cmml">log</mi><mn id="S3.SS3.p1.9.m9.2.2.1.3.2.3" xref="S3.SS3.p1.9.m9.2.2.1.3.2.3.cmml">10</mn></msub><mo id="S3.SS3.p1.9.m9.2.2.1.3.1" xref="S3.SS3.p1.9.m9.2.2.1.3.1.cmml">⁢</mo><msubsup id="S3.SS3.p1.9.m9.2.2.1.3.3" xref="S3.SS3.p1.9.m9.2.2.1.3.3.cmml"><mrow id="S3.SS3.p1.9.m9.2.2.1.3.3.2.2.2" xref="S3.SS3.p1.9.m9.2.2.1.3.3.2.2.1.cmml"><mo id="S3.SS3.p1.9.m9.2.2.1.3.3.2.2.2.1" stretchy="false" xref="S3.SS3.p1.9.m9.2.2.1.3.3.2.2.1.1.cmml">‖</mo><mi id="S3.SS3.p1.9.m9.1.1" xref="S3.SS3.p1.9.m9.1.1.cmml">𝒔</mi><mo 
id="S3.SS3.p1.9.m9.2.2.1.3.3.2.2.2.2" stretchy="false" xref="S3.SS3.p1.9.m9.2.2.1.3.3.2.2.1.1.cmml">‖</mo></mrow><mn id="S3.SS3.p1.9.m9.2.2.1.3.3.3" xref="S3.SS3.p1.9.m9.2.2.1.3.3.3.cmml">2</mn><mn id="S3.SS3.p1.9.m9.2.2.1.3.3.2.3" xref="S3.SS3.p1.9.m9.2.2.1.3.3.2.3.cmml">2</mn></msubsup></mrow><mo id="S3.SS3.p1.9.m9.2.2.1.2" xref="S3.SS3.p1.9.m9.2.2.1.2.cmml">/</mo><msubsup id="S3.SS3.p1.9.m9.2.2.1.1" xref="S3.SS3.p1.9.m9.2.2.1.1.cmml"><mrow id="S3.SS3.p1.9.m9.2.2.1.1.1.1.1" xref="S3.SS3.p1.9.m9.2.2.1.1.1.1.2.cmml"><mo id="S3.SS3.p1.9.m9.2.2.1.1.1.1.1.2" stretchy="false" xref="S3.SS3.p1.9.m9.2.2.1.1.1.1.2.1.cmml">‖</mo><msup id="S3.SS3.p1.9.m9.2.2.1.1.1.1.1.1" xref="S3.SS3.p1.9.m9.2.2.1.1.1.1.1.1.cmml"><mi id="S3.SS3.p1.9.m9.2.2.1.1.1.1.1.1.2" xref="S3.SS3.p1.9.m9.2.2.1.1.1.1.1.1.2.cmml">𝒏</mi><mi id="S3.SS3.p1.9.m9.2.2.1.1.1.1.1.1.3" xref="S3.SS3.p1.9.m9.2.2.1.1.1.1.1.1.3.cmml">obs</mi></msup><mo id="S3.SS3.p1.9.m9.2.2.1.1.1.1.1.3" stretchy="false" xref="S3.SS3.p1.9.m9.2.2.1.1.1.1.2.1.cmml">‖</mo></mrow><mn id="S3.SS3.p1.9.m9.2.2.1.1.3" xref="S3.SS3.p1.9.m9.2.2.1.1.3.cmml">2</mn><mn id="S3.SS3.p1.9.m9.2.2.1.1.1.3" xref="S3.SS3.p1.9.m9.2.2.1.1.1.3.cmml">2</mn></msubsup></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.9.m9.2b"><apply id="S3.SS3.p1.9.m9.2.2.cmml" xref="S3.SS3.p1.9.m9.2.2"><eq id="S3.SS3.p1.9.m9.2.2.2.cmml" xref="S3.SS3.p1.9.m9.2.2.2"></eq><apply id="S3.SS3.p1.9.m9.2.2.3.cmml" xref="S3.SS3.p1.9.m9.2.2.3"><csymbol cd="ambiguous" id="S3.SS3.p1.9.m9.2.2.3.1.cmml" xref="S3.SS3.p1.9.m9.2.2.3">subscript</csymbol><ci id="S3.SS3.p1.9.m9.2.2.3.2.cmml" xref="S3.SS3.p1.9.m9.2.2.3.2">SNR</ci><ci id="S3.SS3.p1.9.m9.2.2.3.3.cmml" xref="S3.SS3.p1.9.m9.2.2.3.3">𝒙</ci></apply><apply id="S3.SS3.p1.9.m9.2.2.1.cmml" xref="S3.SS3.p1.9.m9.2.2.1"><divide id="S3.SS3.p1.9.m9.2.2.1.2.cmml" xref="S3.SS3.p1.9.m9.2.2.1.2"></divide><apply id="S3.SS3.p1.9.m9.2.2.1.3.cmml" xref="S3.SS3.p1.9.m9.2.2.1.3"><times id="S3.SS3.p1.9.m9.2.2.1.3.1.cmml" 
xref="S3.SS3.p1.9.m9.2.2.1.3.1"></times><apply id="S3.SS3.p1.9.m9.2.2.1.3.2.cmml" xref="S3.SS3.p1.9.m9.2.2.1.3.2"><csymbol cd="ambiguous" id="S3.SS3.p1.9.m9.2.2.1.3.2.1.cmml" xref="S3.SS3.p1.9.m9.2.2.1.3.2">subscript</csymbol><log id="S3.SS3.p1.9.m9.2.2.1.3.2.2.cmml" xref="S3.SS3.p1.9.m9.2.2.1.3.2.2"></log><cn id="S3.SS3.p1.9.m9.2.2.1.3.2.3.cmml" type="integer" xref="S3.SS3.p1.9.m9.2.2.1.3.2.3">10</cn></apply><apply id="S3.SS3.p1.9.m9.2.2.1.3.3.cmml" xref="S3.SS3.p1.9.m9.2.2.1.3.3"><csymbol cd="ambiguous" id="S3.SS3.p1.9.m9.2.2.1.3.3.1.cmml" xref="S3.SS3.p1.9.m9.2.2.1.3.3">subscript</csymbol><apply id="S3.SS3.p1.9.m9.2.2.1.3.3.2.cmml" xref="S3.SS3.p1.9.m9.2.2.1.3.3"><csymbol cd="ambiguous" id="S3.SS3.p1.9.m9.2.2.1.3.3.2.1.cmml" xref="S3.SS3.p1.9.m9.2.2.1.3.3">superscript</csymbol><apply id="S3.SS3.p1.9.m9.2.2.1.3.3.2.2.1.cmml" xref="S3.SS3.p1.9.m9.2.2.1.3.3.2.2.2"><csymbol cd="latexml" id="S3.SS3.p1.9.m9.2.2.1.3.3.2.2.1.1.cmml" xref="S3.SS3.p1.9.m9.2.2.1.3.3.2.2.2.1">norm</csymbol><ci id="S3.SS3.p1.9.m9.1.1.cmml" xref="S3.SS3.p1.9.m9.1.1">𝒔</ci></apply><cn id="S3.SS3.p1.9.m9.2.2.1.3.3.2.3.cmml" type="integer" xref="S3.SS3.p1.9.m9.2.2.1.3.3.2.3">2</cn></apply><cn id="S3.SS3.p1.9.m9.2.2.1.3.3.3.cmml" type="integer" xref="S3.SS3.p1.9.m9.2.2.1.3.3.3">2</cn></apply></apply><apply id="S3.SS3.p1.9.m9.2.2.1.1.cmml" xref="S3.SS3.p1.9.m9.2.2.1.1"><csymbol cd="ambiguous" id="S3.SS3.p1.9.m9.2.2.1.1.2.cmml" xref="S3.SS3.p1.9.m9.2.2.1.1">subscript</csymbol><apply id="S3.SS3.p1.9.m9.2.2.1.1.1.cmml" xref="S3.SS3.p1.9.m9.2.2.1.1"><csymbol cd="ambiguous" id="S3.SS3.p1.9.m9.2.2.1.1.1.2.cmml" xref="S3.SS3.p1.9.m9.2.2.1.1">superscript</csymbol><apply id="S3.SS3.p1.9.m9.2.2.1.1.1.1.2.cmml" xref="S3.SS3.p1.9.m9.2.2.1.1.1.1.1"><csymbol cd="latexml" id="S3.SS3.p1.9.m9.2.2.1.1.1.1.2.1.cmml" xref="S3.SS3.p1.9.m9.2.2.1.1.1.1.1.2">norm</csymbol><apply id="S3.SS3.p1.9.m9.2.2.1.1.1.1.1.1.cmml" xref="S3.SS3.p1.9.m9.2.2.1.1.1.1.1.1"><csymbol cd="ambiguous" 
id="S3.SS3.p1.9.m9.2.2.1.1.1.1.1.1.1.cmml" xref="S3.SS3.p1.9.m9.2.2.1.1.1.1.1.1">superscript</csymbol><ci id="S3.SS3.p1.9.m9.2.2.1.1.1.1.1.1.2.cmml" xref="S3.SS3.p1.9.m9.2.2.1.1.1.1.1.1.2">𝒏</ci><ci id="S3.SS3.p1.9.m9.2.2.1.1.1.1.1.1.3.cmml" xref="S3.SS3.p1.9.m9.2.2.1.1.1.1.1.1.3">obs</ci></apply></apply><cn id="S3.SS3.p1.9.m9.2.2.1.1.1.3.cmml" type="integer" xref="S3.SS3.p1.9.m9.2.2.1.1.1.3">2</cn></apply><cn id="S3.SS3.p1.9.m9.2.2.1.1.3.cmml" type="integer" xref="S3.SS3.p1.9.m9.2.2.1.1.3">2</cn></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.9.m9.2c">\mathrm{SNR}_{\bm{x}}=\log_{10}||\bm{s}||^{2}_{2}/||\bm{n}^{\rm obs}||^{2}_{2}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.9.m9.2d">roman_SNR start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT = roman_log start_POSTSUBSCRIPT 10 end_POSTSUBSCRIPT | | bold_italic_s | | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT / | | bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT | | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT</annotation></semantics></math>) and 3) the SNRs of the <span class="ltx_text ltx_font_italic" id="S3.SS3.p1.11.1">more noisy</span> signals <math alttext="\bm{y}" class="ltx_Math" display="inline" id="S3.SS3.p1.10.m10.1"><semantics id="S3.SS3.p1.10.m10.1a"><mi id="S3.SS3.p1.10.m10.1.1" xref="S3.SS3.p1.10.m10.1.1.cmml">𝒚</mi><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.10.m10.1b"><ci id="S3.SS3.p1.10.m10.1.1.cmml" xref="S3.SS3.p1.10.m10.1.1">𝒚</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.10.m10.1c">\bm{y}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.10.m10.1d">bold_italic_y</annotation></semantics></math> (<math alttext="\mathrm{SNR}_{\bm{y}}=\log_{10}||\bm{x}||^{2}_{2}/||\bm{n}^{\rm add}||^{2}_{2}" class="ltx_Math" display="inline" id="S3.SS3.p1.11.m11.2"><semantics 
id="S3.SS3.p1.11.m11.2a"><mrow id="S3.SS3.p1.11.m11.2.2" xref="S3.SS3.p1.11.m11.2.2.cmml"><msub id="S3.SS3.p1.11.m11.2.2.3" xref="S3.SS3.p1.11.m11.2.2.3.cmml"><mi id="S3.SS3.p1.11.m11.2.2.3.2" xref="S3.SS3.p1.11.m11.2.2.3.2.cmml">SNR</mi><mi id="S3.SS3.p1.11.m11.2.2.3.3" xref="S3.SS3.p1.11.m11.2.2.3.3.cmml">𝒚</mi></msub><mo id="S3.SS3.p1.11.m11.2.2.2" xref="S3.SS3.p1.11.m11.2.2.2.cmml">=</mo><mrow id="S3.SS3.p1.11.m11.2.2.1" xref="S3.SS3.p1.11.m11.2.2.1.cmml"><mrow id="S3.SS3.p1.11.m11.2.2.1.3" xref="S3.SS3.p1.11.m11.2.2.1.3.cmml"><msub id="S3.SS3.p1.11.m11.2.2.1.3.2" xref="S3.SS3.p1.11.m11.2.2.1.3.2.cmml"><mi id="S3.SS3.p1.11.m11.2.2.1.3.2.2" xref="S3.SS3.p1.11.m11.2.2.1.3.2.2.cmml">log</mi><mn id="S3.SS3.p1.11.m11.2.2.1.3.2.3" xref="S3.SS3.p1.11.m11.2.2.1.3.2.3.cmml">10</mn></msub><mo id="S3.SS3.p1.11.m11.2.2.1.3.1" xref="S3.SS3.p1.11.m11.2.2.1.3.1.cmml">⁢</mo><msubsup id="S3.SS3.p1.11.m11.2.2.1.3.3" xref="S3.SS3.p1.11.m11.2.2.1.3.3.cmml"><mrow id="S3.SS3.p1.11.m11.2.2.1.3.3.2.2.2" xref="S3.SS3.p1.11.m11.2.2.1.3.3.2.2.1.cmml"><mo id="S3.SS3.p1.11.m11.2.2.1.3.3.2.2.2.1" stretchy="false" xref="S3.SS3.p1.11.m11.2.2.1.3.3.2.2.1.1.cmml">‖</mo><mi id="S3.SS3.p1.11.m11.1.1" xref="S3.SS3.p1.11.m11.1.1.cmml">𝒙</mi><mo id="S3.SS3.p1.11.m11.2.2.1.3.3.2.2.2.2" stretchy="false" xref="S3.SS3.p1.11.m11.2.2.1.3.3.2.2.1.1.cmml">‖</mo></mrow><mn id="S3.SS3.p1.11.m11.2.2.1.3.3.3" xref="S3.SS3.p1.11.m11.2.2.1.3.3.3.cmml">2</mn><mn id="S3.SS3.p1.11.m11.2.2.1.3.3.2.3" xref="S3.SS3.p1.11.m11.2.2.1.3.3.2.3.cmml">2</mn></msubsup></mrow><mo id="S3.SS3.p1.11.m11.2.2.1.2" xref="S3.SS3.p1.11.m11.2.2.1.2.cmml">/</mo><msubsup id="S3.SS3.p1.11.m11.2.2.1.1" xref="S3.SS3.p1.11.m11.2.2.1.1.cmml"><mrow id="S3.SS3.p1.11.m11.2.2.1.1.1.1.1" xref="S3.SS3.p1.11.m11.2.2.1.1.1.1.2.cmml"><mo id="S3.SS3.p1.11.m11.2.2.1.1.1.1.1.2" stretchy="false" xref="S3.SS3.p1.11.m11.2.2.1.1.1.1.2.1.cmml">‖</mo><msup id="S3.SS3.p1.11.m11.2.2.1.1.1.1.1.1" xref="S3.SS3.p1.11.m11.2.2.1.1.1.1.1.1.cmml"><mi 
id="S3.SS3.p1.11.m11.2.2.1.1.1.1.1.1.2" xref="S3.SS3.p1.11.m11.2.2.1.1.1.1.1.1.2.cmml">𝒏</mi><mi id="S3.SS3.p1.11.m11.2.2.1.1.1.1.1.1.3" xref="S3.SS3.p1.11.m11.2.2.1.1.1.1.1.1.3.cmml">add</mi></msup><mo id="S3.SS3.p1.11.m11.2.2.1.1.1.1.1.3" stretchy="false" xref="S3.SS3.p1.11.m11.2.2.1.1.1.1.2.1.cmml">‖</mo></mrow><mn id="S3.SS3.p1.11.m11.2.2.1.1.3" xref="S3.SS3.p1.11.m11.2.2.1.1.3.cmml">2</mn><mn id="S3.SS3.p1.11.m11.2.2.1.1.1.3" xref="S3.SS3.p1.11.m11.2.2.1.1.1.3.cmml">2</mn></msubsup></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.11.m11.2b"><apply id="S3.SS3.p1.11.m11.2.2.cmml" xref="S3.SS3.p1.11.m11.2.2"><eq id="S3.SS3.p1.11.m11.2.2.2.cmml" xref="S3.SS3.p1.11.m11.2.2.2"></eq><apply id="S3.SS3.p1.11.m11.2.2.3.cmml" xref="S3.SS3.p1.11.m11.2.2.3"><csymbol cd="ambiguous" id="S3.SS3.p1.11.m11.2.2.3.1.cmml" xref="S3.SS3.p1.11.m11.2.2.3">subscript</csymbol><ci id="S3.SS3.p1.11.m11.2.2.3.2.cmml" xref="S3.SS3.p1.11.m11.2.2.3.2">SNR</ci><ci id="S3.SS3.p1.11.m11.2.2.3.3.cmml" xref="S3.SS3.p1.11.m11.2.2.3.3">𝒚</ci></apply><apply id="S3.SS3.p1.11.m11.2.2.1.cmml" xref="S3.SS3.p1.11.m11.2.2.1"><divide id="S3.SS3.p1.11.m11.2.2.1.2.cmml" xref="S3.SS3.p1.11.m11.2.2.1.2"></divide><apply id="S3.SS3.p1.11.m11.2.2.1.3.cmml" xref="S3.SS3.p1.11.m11.2.2.1.3"><times id="S3.SS3.p1.11.m11.2.2.1.3.1.cmml" xref="S3.SS3.p1.11.m11.2.2.1.3.1"></times><apply id="S3.SS3.p1.11.m11.2.2.1.3.2.cmml" xref="S3.SS3.p1.11.m11.2.2.1.3.2"><csymbol cd="ambiguous" id="S3.SS3.p1.11.m11.2.2.1.3.2.1.cmml" xref="S3.SS3.p1.11.m11.2.2.1.3.2">subscript</csymbol><log id="S3.SS3.p1.11.m11.2.2.1.3.2.2.cmml" xref="S3.SS3.p1.11.m11.2.2.1.3.2.2"></log><cn id="S3.SS3.p1.11.m11.2.2.1.3.2.3.cmml" type="integer" xref="S3.SS3.p1.11.m11.2.2.1.3.2.3">10</cn></apply><apply id="S3.SS3.p1.11.m11.2.2.1.3.3.cmml" xref="S3.SS3.p1.11.m11.2.2.1.3.3"><csymbol cd="ambiguous" id="S3.SS3.p1.11.m11.2.2.1.3.3.1.cmml" xref="S3.SS3.p1.11.m11.2.2.1.3.3">subscript</csymbol><apply id="S3.SS3.p1.11.m11.2.2.1.3.3.2.cmml" 
xref="S3.SS3.p1.11.m11.2.2.1.3.3"><csymbol cd="ambiguous" id="S3.SS3.p1.11.m11.2.2.1.3.3.2.1.cmml" xref="S3.SS3.p1.11.m11.2.2.1.3.3">superscript</csymbol><apply id="S3.SS3.p1.11.m11.2.2.1.3.3.2.2.1.cmml" xref="S3.SS3.p1.11.m11.2.2.1.3.3.2.2.2"><csymbol cd="latexml" id="S3.SS3.p1.11.m11.2.2.1.3.3.2.2.1.1.cmml" xref="S3.SS3.p1.11.m11.2.2.1.3.3.2.2.2.1">norm</csymbol><ci id="S3.SS3.p1.11.m11.1.1.cmml" xref="S3.SS3.p1.11.m11.1.1">𝒙</ci></apply><cn id="S3.SS3.p1.11.m11.2.2.1.3.3.2.3.cmml" type="integer" xref="S3.SS3.p1.11.m11.2.2.1.3.3.2.3">2</cn></apply><cn id="S3.SS3.p1.11.m11.2.2.1.3.3.3.cmml" type="integer" xref="S3.SS3.p1.11.m11.2.2.1.3.3.3">2</cn></apply></apply><apply id="S3.SS3.p1.11.m11.2.2.1.1.cmml" xref="S3.SS3.p1.11.m11.2.2.1.1"><csymbol cd="ambiguous" id="S3.SS3.p1.11.m11.2.2.1.1.2.cmml" xref="S3.SS3.p1.11.m11.2.2.1.1">subscript</csymbol><apply id="S3.SS3.p1.11.m11.2.2.1.1.1.cmml" xref="S3.SS3.p1.11.m11.2.2.1.1"><csymbol cd="ambiguous" id="S3.SS3.p1.11.m11.2.2.1.1.1.2.cmml" xref="S3.SS3.p1.11.m11.2.2.1.1">superscript</csymbol><apply id="S3.SS3.p1.11.m11.2.2.1.1.1.1.2.cmml" xref="S3.SS3.p1.11.m11.2.2.1.1.1.1.1"><csymbol cd="latexml" id="S3.SS3.p1.11.m11.2.2.1.1.1.1.2.1.cmml" xref="S3.SS3.p1.11.m11.2.2.1.1.1.1.1.2">norm</csymbol><apply id="S3.SS3.p1.11.m11.2.2.1.1.1.1.1.1.cmml" xref="S3.SS3.p1.11.m11.2.2.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S3.SS3.p1.11.m11.2.2.1.1.1.1.1.1.1.cmml" xref="S3.SS3.p1.11.m11.2.2.1.1.1.1.1.1">superscript</csymbol><ci id="S3.SS3.p1.11.m11.2.2.1.1.1.1.1.1.2.cmml" xref="S3.SS3.p1.11.m11.2.2.1.1.1.1.1.1.2">𝒏</ci><ci id="S3.SS3.p1.11.m11.2.2.1.1.1.1.1.1.3.cmml" xref="S3.SS3.p1.11.m11.2.2.1.1.1.1.1.1.3">add</ci></apply></apply><cn id="S3.SS3.p1.11.m11.2.2.1.1.1.3.cmml" type="integer" xref="S3.SS3.p1.11.m11.2.2.1.1.1.3">2</cn></apply><cn id="S3.SS3.p1.11.m11.2.2.1.1.3.cmml" type="integer" xref="S3.SS3.p1.11.m11.2.2.1.1.3">2</cn></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" 
id="S3.SS3.p1.11.m11.2c">\mathrm{SNR}_{\bm{y}}=\log_{10}||\bm{x}||^{2}_{2}/||\bm{n}^{\rm add}||^{2}_{2}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.11.m11.2d">roman_SNR start_POSTSUBSCRIPT bold_italic_y end_POSTSUBSCRIPT = roman_log start_POSTSUBSCRIPT 10 end_POSTSUBSCRIPT | | bold_italic_x | | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT / | | bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT | | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT</annotation></semantics></math>) on performance.</p> </div> </section> <section class="ltx_subsection" id="S3.SS4"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">3.4 </span>Effectiveness of utilizing noisy signals in a situation where clean target signals are available</h3> <div class="ltx_para" id="S3.SS4.p1"> <p class="ltx_p" id="S3.SS4.p1.1">Collecting a large number of clean target signals is challenging owing to high recording costs. In some cases, no clean target signals may be available, whereas in others, a small number of clean target signals are available, depending on the task. In Sec. 
<a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.SS5" title="4.5 Effectiveness of utilizing noisy signals in a situation where clean target signals are available ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">4.5</span></a>, we assume that only a small number of clean target signals are available and investigate the effectiveness of utilizing a larger number of noisy signals.</p> </div> </section> <section class="ltx_subsection" id="S3.SS5"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">3.5 </span>Capabilities in the dereverberation task</h3> <div class="ltx_para" id="S3.SS5.p1"> <p class="ltx_p" id="S3.SS5.p1.7">The dereverberation task aims to restore an original signal from a reverberant signal. CTT can achieve dereverberation in the same manner as in the denoising task, by feeding reverberant signals to a DNN and training it to predict clean target signals. In Sec. <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S5" title="5 Experimental analysis in the dereverberation task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">5</span></a>, we investigate whether NyTT can also perform dereverberation as CTT does. 
Specifically, we evaluate the performance of CTT, NyTT, and IterNyTT, where CTT predicts a clean target signal <math alttext="\bm{s}" class="ltx_Math" display="inline" id="S3.SS5.p1.1.m1.1"><semantics id="S3.SS5.p1.1.m1.1a"><mi id="S3.SS5.p1.1.m1.1.1" xref="S3.SS5.p1.1.m1.1.1.cmml">𝒔</mi><annotation-xml encoding="MathML-Content" id="S3.SS5.p1.1.m1.1b"><ci id="S3.SS5.p1.1.m1.1.1.cmml" xref="S3.SS5.p1.1.m1.1.1">𝒔</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS5.p1.1.m1.1c">\bm{s}</annotation><annotation encoding="application/x-llamapun" id="S3.SS5.p1.1.m1.1d">bold_italic_s</annotation></semantics></math> from a reverberant signal <math alttext="\bm{s}\ast\bm{r}^{\rm add}" class="ltx_Math" display="inline" id="S3.SS5.p1.2.m2.1"><semantics id="S3.SS5.p1.2.m2.1a"><mrow id="S3.SS5.p1.2.m2.1.1" xref="S3.SS5.p1.2.m2.1.1.cmml"><mi id="S3.SS5.p1.2.m2.1.1.2" xref="S3.SS5.p1.2.m2.1.1.2.cmml">𝒔</mi><mo id="S3.SS5.p1.2.m2.1.1.1" lspace="0.222em" rspace="0.222em" xref="S3.SS5.p1.2.m2.1.1.1.cmml">∗</mo><msup id="S3.SS5.p1.2.m2.1.1.3" xref="S3.SS5.p1.2.m2.1.1.3.cmml"><mi id="S3.SS5.p1.2.m2.1.1.3.2" xref="S3.SS5.p1.2.m2.1.1.3.2.cmml">𝒓</mi><mi id="S3.SS5.p1.2.m2.1.1.3.3" xref="S3.SS5.p1.2.m2.1.1.3.3.cmml">add</mi></msup></mrow><annotation-xml encoding="MathML-Content" id="S3.SS5.p1.2.m2.1b"><apply id="S3.SS5.p1.2.m2.1.1.cmml" xref="S3.SS5.p1.2.m2.1.1"><ci id="S3.SS5.p1.2.m2.1.1.1.cmml" xref="S3.SS5.p1.2.m2.1.1.1">∗</ci><ci id="S3.SS5.p1.2.m2.1.1.2.cmml" xref="S3.SS5.p1.2.m2.1.1.2">𝒔</ci><apply id="S3.SS5.p1.2.m2.1.1.3.cmml" xref="S3.SS5.p1.2.m2.1.1.3"><csymbol cd="ambiguous" id="S3.SS5.p1.2.m2.1.1.3.1.cmml" xref="S3.SS5.p1.2.m2.1.1.3">superscript</csymbol><ci id="S3.SS5.p1.2.m2.1.1.3.2.cmml" xref="S3.SS5.p1.2.m2.1.1.3.2">𝒓</ci><ci id="S3.SS5.p1.2.m2.1.1.3.3.cmml" xref="S3.SS5.p1.2.m2.1.1.3.3">add</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS5.p1.2.m2.1c">\bm{s}\ast\bm{r}^{\rm add}</annotation><annotation 
encoding="application/x-llamapun" id="S3.SS5.p1.2.m2.1d">bold_italic_s ∗ bold_italic_r start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> and NyTT predicts <math alttext="\bm{x}=\bm{s}\ast\bm{r}^{\rm obs}" class="ltx_Math" display="inline" id="S3.SS5.p1.3.m3.1"><semantics id="S3.SS5.p1.3.m3.1a"><mrow id="S3.SS5.p1.3.m3.1.1" xref="S3.SS5.p1.3.m3.1.1.cmml"><mi id="S3.SS5.p1.3.m3.1.1.2" xref="S3.SS5.p1.3.m3.1.1.2.cmml">𝒙</mi><mo id="S3.SS5.p1.3.m3.1.1.1" xref="S3.SS5.p1.3.m3.1.1.1.cmml">=</mo><mrow id="S3.SS5.p1.3.m3.1.1.3" xref="S3.SS5.p1.3.m3.1.1.3.cmml"><mi id="S3.SS5.p1.3.m3.1.1.3.2" xref="S3.SS5.p1.3.m3.1.1.3.2.cmml">𝒔</mi><mo id="S3.SS5.p1.3.m3.1.1.3.1" lspace="0.222em" rspace="0.222em" xref="S3.SS5.p1.3.m3.1.1.3.1.cmml">∗</mo><msup id="S3.SS5.p1.3.m3.1.1.3.3" xref="S3.SS5.p1.3.m3.1.1.3.3.cmml"><mi id="S3.SS5.p1.3.m3.1.1.3.3.2" xref="S3.SS5.p1.3.m3.1.1.3.3.2.cmml">𝒓</mi><mi id="S3.SS5.p1.3.m3.1.1.3.3.3" xref="S3.SS5.p1.3.m3.1.1.3.3.3.cmml">obs</mi></msup></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS5.p1.3.m3.1b"><apply id="S3.SS5.p1.3.m3.1.1.cmml" xref="S3.SS5.p1.3.m3.1.1"><eq id="S3.SS5.p1.3.m3.1.1.1.cmml" xref="S3.SS5.p1.3.m3.1.1.1"></eq><ci id="S3.SS5.p1.3.m3.1.1.2.cmml" xref="S3.SS5.p1.3.m3.1.1.2">𝒙</ci><apply id="S3.SS5.p1.3.m3.1.1.3.cmml" xref="S3.SS5.p1.3.m3.1.1.3"><ci id="S3.SS5.p1.3.m3.1.1.3.1.cmml" xref="S3.SS5.p1.3.m3.1.1.3.1">∗</ci><ci id="S3.SS5.p1.3.m3.1.1.3.2.cmml" xref="S3.SS5.p1.3.m3.1.1.3.2">𝒔</ci><apply id="S3.SS5.p1.3.m3.1.1.3.3.cmml" xref="S3.SS5.p1.3.m3.1.1.3.3"><csymbol cd="ambiguous" id="S3.SS5.p1.3.m3.1.1.3.3.1.cmml" xref="S3.SS5.p1.3.m3.1.1.3.3">superscript</csymbol><ci id="S3.SS5.p1.3.m3.1.1.3.3.2.cmml" xref="S3.SS5.p1.3.m3.1.1.3.3.2">𝒓</ci><ci id="S3.SS5.p1.3.m3.1.1.3.3.3.cmml" xref="S3.SS5.p1.3.m3.1.1.3.3.3">obs</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS5.p1.3.m3.1c">\bm{x}=\bm{s}\ast\bm{r}^{\rm obs}</annotation><annotation 
encoding="application/x-llamapun" id="S3.SS5.p1.3.m3.1d">bold_italic_x = bold_italic_s ∗ bold_italic_r start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> from <math alttext="\bm{x}\ast\bm{r}^{\rm add}" class="ltx_Math" display="inline" id="S3.SS5.p1.4.m4.1"><semantics id="S3.SS5.p1.4.m4.1a"><mrow id="S3.SS5.p1.4.m4.1.1" xref="S3.SS5.p1.4.m4.1.1.cmml"><mi id="S3.SS5.p1.4.m4.1.1.2" xref="S3.SS5.p1.4.m4.1.1.2.cmml">𝒙</mi><mo id="S3.SS5.p1.4.m4.1.1.1" lspace="0.222em" rspace="0.222em" xref="S3.SS5.p1.4.m4.1.1.1.cmml">∗</mo><msup id="S3.SS5.p1.4.m4.1.1.3" xref="S3.SS5.p1.4.m4.1.1.3.cmml"><mi id="S3.SS5.p1.4.m4.1.1.3.2" xref="S3.SS5.p1.4.m4.1.1.3.2.cmml">𝒓</mi><mi id="S3.SS5.p1.4.m4.1.1.3.3" xref="S3.SS5.p1.4.m4.1.1.3.3.cmml">add</mi></msup></mrow><annotation-xml encoding="MathML-Content" id="S3.SS5.p1.4.m4.1b"><apply id="S3.SS5.p1.4.m4.1.1.cmml" xref="S3.SS5.p1.4.m4.1.1"><ci id="S3.SS5.p1.4.m4.1.1.1.cmml" xref="S3.SS5.p1.4.m4.1.1.1">∗</ci><ci id="S3.SS5.p1.4.m4.1.1.2.cmml" xref="S3.SS5.p1.4.m4.1.1.2">𝒙</ci><apply id="S3.SS5.p1.4.m4.1.1.3.cmml" xref="S3.SS5.p1.4.m4.1.1.3"><csymbol cd="ambiguous" id="S3.SS5.p1.4.m4.1.1.3.1.cmml" xref="S3.SS5.p1.4.m4.1.1.3">superscript</csymbol><ci id="S3.SS5.p1.4.m4.1.1.3.2.cmml" xref="S3.SS5.p1.4.m4.1.1.3.2">𝒓</ci><ci id="S3.SS5.p1.4.m4.1.1.3.3.cmml" xref="S3.SS5.p1.4.m4.1.1.3.3">add</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS5.p1.4.m4.1c">\bm{x}\ast\bm{r}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S3.SS5.p1.4.m4.1d">bold_italic_x ∗ bold_italic_r start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math>, where <math alttext="\bm{r}^{\rm obs}" class="ltx_Math" display="inline" id="S3.SS5.p1.5.m5.1"><semantics id="S3.SS5.p1.5.m5.1a"><msup id="S3.SS5.p1.5.m5.1.1" xref="S3.SS5.p1.5.m5.1.1.cmml"><mi id="S3.SS5.p1.5.m5.1.1.2" xref="S3.SS5.p1.5.m5.1.1.2.cmml">𝒓</mi><mi id="S3.SS5.p1.5.m5.1.1.3" 
xref="S3.SS5.p1.5.m5.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S3.SS5.p1.5.m5.1b"><apply id="S3.SS5.p1.5.m5.1.1.cmml" xref="S3.SS5.p1.5.m5.1.1"><csymbol cd="ambiguous" id="S3.SS5.p1.5.m5.1.1.1.cmml" xref="S3.SS5.p1.5.m5.1.1">superscript</csymbol><ci id="S3.SS5.p1.5.m5.1.1.2.cmml" xref="S3.SS5.p1.5.m5.1.1.2">𝒓</ci><ci id="S3.SS5.p1.5.m5.1.1.3.cmml" xref="S3.SS5.p1.5.m5.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS5.p1.5.m5.1c">\bm{r}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S3.SS5.p1.5.m5.1d">bold_italic_r start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{r}^{\rm add}" class="ltx_Math" display="inline" id="S3.SS5.p1.6.m6.1"><semantics id="S3.SS5.p1.6.m6.1a"><msup id="S3.SS5.p1.6.m6.1.1" xref="S3.SS5.p1.6.m6.1.1.cmml"><mi id="S3.SS5.p1.6.m6.1.1.2" xref="S3.SS5.p1.6.m6.1.1.2.cmml">𝒓</mi><mi id="S3.SS5.p1.6.m6.1.1.3" xref="S3.SS5.p1.6.m6.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S3.SS5.p1.6.m6.1b"><apply id="S3.SS5.p1.6.m6.1.1.cmml" xref="S3.SS5.p1.6.m6.1.1"><csymbol cd="ambiguous" id="S3.SS5.p1.6.m6.1.1.1.cmml" xref="S3.SS5.p1.6.m6.1.1">superscript</csymbol><ci id="S3.SS5.p1.6.m6.1.1.2.cmml" xref="S3.SS5.p1.6.m6.1.1.2">𝒓</ci><ci id="S3.SS5.p1.6.m6.1.1.3.cmml" xref="S3.SS5.p1.6.m6.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS5.p1.6.m6.1c">\bm{r}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S3.SS5.p1.6.m6.1d">bold_italic_r start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> are room impulse responses (RIRs), and <math alttext="\ast" class="ltx_Math" display="inline" id="S3.SS5.p1.7.m7.1"><semantics id="S3.SS5.p1.7.m7.1a"><mo id="S3.SS5.p1.7.m7.1.1" xref="S3.SS5.p1.7.m7.1.1.cmml">∗</mo><annotation-xml encoding="MathML-Content" id="S3.SS5.p1.7.m7.1b"><ci 
id="S3.SS5.p1.7.m7.1.1.cmml" xref="S3.SS5.p1.7.m7.1.1">∗</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS5.p1.7.m7.1c">\ast</annotation><annotation encoding="application/x-llamapun" id="S3.SS5.p1.7.m7.1d">∗</annotation></semantics></math> denotes the convolution operation.</p> </div> </section> <section class="ltx_subsection" id="S3.SS6"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">3.6 </span>Capabilities in the declipping task</h3> <div class="ltx_para" id="S3.SS6.p1"> <p class="ltx_p" id="S3.SS6.p1.1">In addition to the dereverberation task, we investigate the capabilities of NyTT in the declipping task. The declipping task aims to restore an original signal from a clipped signal where the clipping function <math alttext="f_{\mathrm{clip}}:\mathbb{R}^{T}\rightarrow\mathbb{R}^{T}" class="ltx_Math" display="inline" id="S3.SS6.p1.1.m1.1"><semantics id="S3.SS6.p1.1.m1.1a"><mrow id="S3.SS6.p1.1.m1.1.1" xref="S3.SS6.p1.1.m1.1.1.cmml"><msub id="S3.SS6.p1.1.m1.1.1.2" xref="S3.SS6.p1.1.m1.1.1.2.cmml"><mi id="S3.SS6.p1.1.m1.1.1.2.2" xref="S3.SS6.p1.1.m1.1.1.2.2.cmml">f</mi><mi id="S3.SS6.p1.1.m1.1.1.2.3" xref="S3.SS6.p1.1.m1.1.1.2.3.cmml">clip</mi></msub><mo id="S3.SS6.p1.1.m1.1.1.1" lspace="0.278em" rspace="0.278em" xref="S3.SS6.p1.1.m1.1.1.1.cmml">:</mo><mrow id="S3.SS6.p1.1.m1.1.1.3" xref="S3.SS6.p1.1.m1.1.1.3.cmml"><msup id="S3.SS6.p1.1.m1.1.1.3.2" xref="S3.SS6.p1.1.m1.1.1.3.2.cmml"><mi id="S3.SS6.p1.1.m1.1.1.3.2.2" xref="S3.SS6.p1.1.m1.1.1.3.2.2.cmml">ℝ</mi><mi id="S3.SS6.p1.1.m1.1.1.3.2.3" xref="S3.SS6.p1.1.m1.1.1.3.2.3.cmml">T</mi></msup><mo id="S3.SS6.p1.1.m1.1.1.3.1" stretchy="false" xref="S3.SS6.p1.1.m1.1.1.3.1.cmml">→</mo><msup id="S3.SS6.p1.1.m1.1.1.3.3" xref="S3.SS6.p1.1.m1.1.1.3.3.cmml"><mi id="S3.SS6.p1.1.m1.1.1.3.3.2" xref="S3.SS6.p1.1.m1.1.1.3.3.2.cmml">ℝ</mi><mi id="S3.SS6.p1.1.m1.1.1.3.3.3" xref="S3.SS6.p1.1.m1.1.1.3.3.3.cmml">T</mi></msup></mrow></mrow><annotation-xml 
encoding="MathML-Content" id="S3.SS6.p1.1.m1.1b"><apply id="S3.SS6.p1.1.m1.1.1.cmml" xref="S3.SS6.p1.1.m1.1.1"><ci id="S3.SS6.p1.1.m1.1.1.1.cmml" xref="S3.SS6.p1.1.m1.1.1.1">:</ci><apply id="S3.SS6.p1.1.m1.1.1.2.cmml" xref="S3.SS6.p1.1.m1.1.1.2"><csymbol cd="ambiguous" id="S3.SS6.p1.1.m1.1.1.2.1.cmml" xref="S3.SS6.p1.1.m1.1.1.2">subscript</csymbol><ci id="S3.SS6.p1.1.m1.1.1.2.2.cmml" xref="S3.SS6.p1.1.m1.1.1.2.2">𝑓</ci><ci id="S3.SS6.p1.1.m1.1.1.2.3.cmml" xref="S3.SS6.p1.1.m1.1.1.2.3">clip</ci></apply><apply id="S3.SS6.p1.1.m1.1.1.3.cmml" xref="S3.SS6.p1.1.m1.1.1.3"><ci id="S3.SS6.p1.1.m1.1.1.3.1.cmml" xref="S3.SS6.p1.1.m1.1.1.3.1">→</ci><apply id="S3.SS6.p1.1.m1.1.1.3.2.cmml" xref="S3.SS6.p1.1.m1.1.1.3.2"><csymbol cd="ambiguous" id="S3.SS6.p1.1.m1.1.1.3.2.1.cmml" xref="S3.SS6.p1.1.m1.1.1.3.2">superscript</csymbol><ci id="S3.SS6.p1.1.m1.1.1.3.2.2.cmml" xref="S3.SS6.p1.1.m1.1.1.3.2.2">ℝ</ci><ci id="S3.SS6.p1.1.m1.1.1.3.2.3.cmml" xref="S3.SS6.p1.1.m1.1.1.3.2.3">𝑇</ci></apply><apply id="S3.SS6.p1.1.m1.1.1.3.3.cmml" xref="S3.SS6.p1.1.m1.1.1.3.3"><csymbol cd="ambiguous" id="S3.SS6.p1.1.m1.1.1.3.3.1.cmml" xref="S3.SS6.p1.1.m1.1.1.3.3">superscript</csymbol><ci id="S3.SS6.p1.1.m1.1.1.3.3.2.cmml" xref="S3.SS6.p1.1.m1.1.1.3.3.2">ℝ</ci><ci id="S3.SS6.p1.1.m1.1.1.3.3.3.cmml" xref="S3.SS6.p1.1.m1.1.1.3.3.3">𝑇</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS6.p1.1.m1.1c">f_{\mathrm{clip}}:\mathbb{R}^{T}\rightarrow\mathbb{R}^{T}</annotation><annotation encoding="application/x-llamapun" id="S3.SS6.p1.1.m1.1d">italic_f start_POSTSUBSCRIPT roman_clip end_POSTSUBSCRIPT : blackboard_R start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT → blackboard_R start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT</annotation></semantics></math> is defined as</p> <table class="ltx_equationgroup ltx_eqn_align ltx_eqn_table" id="Sx2.EGx5"> <tbody id="S3.E12"><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell 
ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_align_right ltx_eqn_cell"><math alttext="\displaystyle f_{\mathrm{clip}}(\bm{s};c)[m]=\left\{\begin{array}[]{ll}\bm{s}[% m]&|\bm{s}[m]|<c\\ c\cdot\mathrm{sgn}(\bm{s}[m])&\textrm{otherwise}\end{array}\right.\,\,," class="ltx_Math" display="inline" id="S3.E12.m1.5"><semantics id="S3.E12.m1.5a"><mrow id="S3.E12.m1.5.5.1" xref="S3.E12.m1.5.5.1.1.cmml"><mrow id="S3.E12.m1.5.5.1.1" xref="S3.E12.m1.5.5.1.1.cmml"><mrow id="S3.E12.m1.5.5.1.1.2" xref="S3.E12.m1.5.5.1.1.2.cmml"><msub id="S3.E12.m1.5.5.1.1.2.2" xref="S3.E12.m1.5.5.1.1.2.2.cmml"><mi id="S3.E12.m1.5.5.1.1.2.2.2" xref="S3.E12.m1.5.5.1.1.2.2.2.cmml">f</mi><mi id="S3.E12.m1.5.5.1.1.2.2.3" xref="S3.E12.m1.5.5.1.1.2.2.3.cmml">clip</mi></msub><mo id="S3.E12.m1.5.5.1.1.2.1" xref="S3.E12.m1.5.5.1.1.2.1.cmml">⁢</mo><mrow id="S3.E12.m1.5.5.1.1.2.3.2" xref="S3.E12.m1.5.5.1.1.2.3.1.cmml"><mo id="S3.E12.m1.5.5.1.1.2.3.2.1" stretchy="false" xref="S3.E12.m1.5.5.1.1.2.3.1.cmml">(</mo><mi id="S3.E12.m1.1.1" xref="S3.E12.m1.1.1.cmml">𝒔</mi><mo id="S3.E12.m1.5.5.1.1.2.3.2.2" xref="S3.E12.m1.5.5.1.1.2.3.1.cmml">;</mo><mi id="S3.E12.m1.2.2" xref="S3.E12.m1.2.2.cmml">c</mi><mo id="S3.E12.m1.5.5.1.1.2.3.2.3" stretchy="false" xref="S3.E12.m1.5.5.1.1.2.3.1.cmml">)</mo></mrow><mo id="S3.E12.m1.5.5.1.1.2.1a" xref="S3.E12.m1.5.5.1.1.2.1.cmml">⁢</mo><mrow id="S3.E12.m1.5.5.1.1.2.4.2" xref="S3.E12.m1.5.5.1.1.2.4.1.cmml"><mo id="S3.E12.m1.5.5.1.1.2.4.2.1" stretchy="false" xref="S3.E12.m1.5.5.1.1.2.4.1.1.cmml">[</mo><mi id="S3.E12.m1.3.3" xref="S3.E12.m1.3.3.cmml">m</mi><mo id="S3.E12.m1.5.5.1.1.2.4.2.2" stretchy="false" xref="S3.E12.m1.5.5.1.1.2.4.1.1.cmml">]</mo></mrow></mrow><mo id="S3.E12.m1.5.5.1.1.1" xref="S3.E12.m1.5.5.1.1.1.cmml">=</mo><mrow id="S3.E12.m1.5.5.1.1.3.2" xref="S3.E12.m1.5.5.1.1.3.1.cmml"><mo id="S3.E12.m1.5.5.1.1.3.2.1" xref="S3.E12.m1.5.5.1.1.3.1.1.cmml">{</mo><mtable columnspacing="5pt" id="S3.E12.m1.4.4" rowspacing="0pt" xref="S3.E12.m1.4.4.cmml"><mtr 
id="S3.E12.m1.4.4a" xref="S3.E12.m1.4.4.cmml"><mtd class="ltx_align_left" columnalign="left" id="S3.E12.m1.4.4b" xref="S3.E12.m1.4.4.cmml"><mrow id="S3.E10.1.1" xref="S3.E10.1.1.cmml"><mi id="S3.E10.1.1.3" xref="S3.E10.1.1.3.cmml">𝒔</mi><mo id="S3.E10.1.1.2" xref="S3.E10.1.1.2.cmml">⁢</mo><mrow id="S3.E10.1.1.4.2" xref="S3.E10.1.1.4.1.cmml"><mo id="S3.E10.1.1.4.2.1" stretchy="false" xref="S3.E10.1.1.4.1.1.cmml">[</mo><mi id="S3.E10.1.1.1" xref="S3.E10.1.1.1.cmml">m</mi><mo id="S3.E10.1.1.4.2.2" stretchy="false" xref="S3.E10.1.1.4.1.1.cmml">]</mo></mrow></mrow></mtd><mtd class="ltx_align_left" columnalign="left" id="S3.E12.m1.4.4c" xref="S3.E12.m1.4.4.cmml"><mrow id="S3.E10.3.2" xref="S3.E10.3.2.cmml"><mrow id="S3.E10.3.2.2.1" xref="S3.E10.3.2.2.2.cmml"><mo id="S3.E10.3.2.2.1.2" stretchy="false" xref="S3.E10.3.2.2.2.1.cmml">|</mo><mrow id="S3.E10.3.2.2.1.1" xref="S3.E10.3.2.2.1.1.cmml"><mi id="S3.E10.3.2.2.1.1.2" xref="S3.E10.3.2.2.1.1.2.cmml">𝒔</mi><mo id="S3.E10.3.2.2.1.1.1" xref="S3.E10.3.2.2.1.1.1.cmml">⁢</mo><mrow id="S3.E10.3.2.2.1.1.3.2" xref="S3.E10.3.2.2.1.1.3.1.cmml"><mo id="S3.E10.3.2.2.1.1.3.2.1" stretchy="false" xref="S3.E10.3.2.2.1.1.3.1.1.cmml">[</mo><mi id="S3.E10.2.1.1" xref="S3.E10.2.1.1.cmml">m</mi><mo id="S3.E10.3.2.2.1.1.3.2.2" stretchy="false" xref="S3.E10.3.2.2.1.1.3.1.1.cmml">]</mo></mrow></mrow><mo id="S3.E10.3.2.2.1.3" stretchy="false" xref="S3.E10.3.2.2.2.1.cmml">|</mo></mrow><mo id="S3.E10.3.2.3" xref="S3.E10.3.2.3.cmml"><</mo><mi id="S3.E10.3.2.4" xref="S3.E10.3.2.4.cmml">c</mi></mrow></mtd></mtr><mtr id="S3.E12.m1.4.4d" xref="S3.E12.m1.4.4.cmml"><mtd class="ltx_align_left" columnalign="left" id="S3.E12.m1.4.4e" xref="S3.E12.m1.4.4.cmml"><mrow id="S3.E11.2.2" xref="S3.E11.2.2.cmml"><mrow id="S3.E11.2.2.4" xref="S3.E11.2.2.4.cmml"><mi id="S3.E11.2.2.4.2" xref="S3.E11.2.2.4.2.cmml">c</mi><mo id="S3.E11.2.2.4.1" lspace="0.222em" rspace="0.222em" xref="S3.E11.2.2.4.1.cmml">⋅</mo><mi id="S3.E11.2.2.4.3" 
xref="S3.E11.2.2.4.3.cmml">sgn</mi></mrow><mo id="S3.E11.2.2.3" xref="S3.E11.2.2.3.cmml">⁢</mo><mrow id="S3.E11.2.2.2.1" xref="S3.E11.2.2.2.1.1.cmml"><mo id="S3.E11.2.2.2.1.2" stretchy="false" xref="S3.E11.2.2.2.1.1.cmml">(</mo><mrow id="S3.E11.2.2.2.1.1" xref="S3.E11.2.2.2.1.1.cmml"><mi id="S3.E11.2.2.2.1.1.2" xref="S3.E11.2.2.2.1.1.2.cmml">𝒔</mi><mo id="S3.E11.2.2.2.1.1.1" xref="S3.E11.2.2.2.1.1.1.cmml">⁢</mo><mrow id="S3.E11.2.2.2.1.1.3.2" xref="S3.E11.2.2.2.1.1.3.1.cmml"><mo id="S3.E11.2.2.2.1.1.3.2.1" stretchy="false" xref="S3.E11.2.2.2.1.1.3.1.1.cmml">[</mo><mi id="S3.E11.1.1.1" xref="S3.E11.1.1.1.cmml">m</mi><mo id="S3.E11.2.2.2.1.1.3.2.2" stretchy="false" xref="S3.E11.2.2.2.1.1.3.1.1.cmml">]</mo></mrow></mrow><mo id="S3.E11.2.2.2.1.3" stretchy="false" xref="S3.E11.2.2.2.1.1.cmml">)</mo></mrow></mrow></mtd><mtd class="ltx_align_left" columnalign="left" id="S3.E12.m1.4.4f" xref="S3.E12.m1.4.4.cmml"><mtext id="S3.E11.3.1" xref="S3.E11.3.1a.cmml">otherwise</mtext></mtd></mtr></mtable><mi id="S3.E12.m1.5.5.1.1.3.2.2" xref="S3.E12.m1.5.5.1.1.3.1.1.cmml"></mi></mrow></mrow><mo id="S3.E12.m1.5.5.1.2" xref="S3.E12.m1.5.5.1.1.cmml">,</mo></mrow><annotation-xml encoding="MathML-Content" id="S3.E12.m1.5b"><apply id="S3.E12.m1.5.5.1.1.cmml" xref="S3.E12.m1.5.5.1"><eq id="S3.E12.m1.5.5.1.1.1.cmml" xref="S3.E12.m1.5.5.1.1.1"></eq><apply id="S3.E12.m1.5.5.1.1.2.cmml" xref="S3.E12.m1.5.5.1.1.2"><times id="S3.E12.m1.5.5.1.1.2.1.cmml" xref="S3.E12.m1.5.5.1.1.2.1"></times><apply id="S3.E12.m1.5.5.1.1.2.2.cmml" xref="S3.E12.m1.5.5.1.1.2.2"><csymbol cd="ambiguous" id="S3.E12.m1.5.5.1.1.2.2.1.cmml" xref="S3.E12.m1.5.5.1.1.2.2">subscript</csymbol><ci id="S3.E12.m1.5.5.1.1.2.2.2.cmml" xref="S3.E12.m1.5.5.1.1.2.2.2">𝑓</ci><ci id="S3.E12.m1.5.5.1.1.2.2.3.cmml" xref="S3.E12.m1.5.5.1.1.2.2.3">clip</ci></apply><list id="S3.E12.m1.5.5.1.1.2.3.1.cmml" xref="S3.E12.m1.5.5.1.1.2.3.2"><ci id="S3.E12.m1.1.1.cmml" xref="S3.E12.m1.1.1">𝒔</ci><ci id="S3.E12.m1.2.2.cmml" 
xref="S3.E12.m1.2.2">𝑐</ci></list><apply id="S3.E12.m1.5.5.1.1.2.4.1.cmml" xref="S3.E12.m1.5.5.1.1.2.4.2"><csymbol cd="latexml" id="S3.E12.m1.5.5.1.1.2.4.1.1.cmml" xref="S3.E12.m1.5.5.1.1.2.4.2.1">delimited-[]</csymbol><ci id="S3.E12.m1.3.3.cmml" xref="S3.E12.m1.3.3">𝑚</ci></apply></apply><apply id="S3.E12.m1.5.5.1.1.3.1.cmml" xref="S3.E12.m1.5.5.1.1.3.2"><csymbol cd="latexml" id="S3.E12.m1.5.5.1.1.3.1.1.cmml" xref="S3.E12.m1.5.5.1.1.3.2.1">cases</csymbol><matrix id="S3.E12.m1.4.4.cmml" xref="S3.E12.m1.4.4"><matrixrow id="S3.E12.m1.4.4a.cmml" xref="S3.E12.m1.4.4"><apply id="S3.E10.1.1.cmml" xref="S3.E10.1.1"><times id="S3.E10.1.1.2.cmml" xref="S3.E10.1.1.2"></times><ci id="S3.E10.1.1.3.cmml" xref="S3.E10.1.1.3">𝒔</ci><apply id="S3.E10.1.1.4.1.cmml" xref="S3.E10.1.1.4.2"><csymbol cd="latexml" id="S3.E10.1.1.4.1.1.cmml" xref="S3.E10.1.1.4.2.1">delimited-[]</csymbol><ci id="S3.E10.1.1.1.cmml" xref="S3.E10.1.1.1">𝑚</ci></apply></apply><apply id="S3.E10.3.2.cmml" xref="S3.E10.3.2"><lt id="S3.E10.3.2.3.cmml" xref="S3.E10.3.2.3"></lt><apply id="S3.E10.3.2.2.2.cmml" xref="S3.E10.3.2.2.1"><abs id="S3.E10.3.2.2.2.1.cmml" xref="S3.E10.3.2.2.1.2"></abs><apply id="S3.E10.3.2.2.1.1.cmml" xref="S3.E10.3.2.2.1.1"><times id="S3.E10.3.2.2.1.1.1.cmml" xref="S3.E10.3.2.2.1.1.1"></times><ci id="S3.E10.3.2.2.1.1.2.cmml" xref="S3.E10.3.2.2.1.1.2">𝒔</ci><apply id="S3.E10.3.2.2.1.1.3.1.cmml" xref="S3.E10.3.2.2.1.1.3.2"><csymbol cd="latexml" id="S3.E10.3.2.2.1.1.3.1.1.cmml" xref="S3.E10.3.2.2.1.1.3.2.1">delimited-[]</csymbol><ci id="S3.E10.2.1.1.cmml" xref="S3.E10.2.1.1">𝑚</ci></apply></apply></apply><ci id="S3.E10.3.2.4.cmml" xref="S3.E10.3.2.4">𝑐</ci></apply></matrixrow><matrixrow id="S3.E12.m1.4.4b.cmml" xref="S3.E12.m1.4.4"><apply id="S3.E11.2.2.cmml" xref="S3.E11.2.2"><times id="S3.E11.2.2.3.cmml" xref="S3.E11.2.2.3"></times><apply id="S3.E11.2.2.4.cmml" xref="S3.E11.2.2.4"><ci id="S3.E11.2.2.4.1.cmml" xref="S3.E11.2.2.4.1">⋅</ci><ci id="S3.E11.2.2.4.2.cmml" 
xref="S3.E11.2.2.4.2">𝑐</ci><ci id="S3.E11.2.2.4.3.cmml" xref="S3.E11.2.2.4.3">sgn</ci></apply><apply id="S3.E11.2.2.2.1.1.cmml" xref="S3.E11.2.2.2.1"><times id="S3.E11.2.2.2.1.1.1.cmml" xref="S3.E11.2.2.2.1.1.1"></times><ci id="S3.E11.2.2.2.1.1.2.cmml" xref="S3.E11.2.2.2.1.1.2">𝒔</ci><apply id="S3.E11.2.2.2.1.1.3.1.cmml" xref="S3.E11.2.2.2.1.1.3.2"><csymbol cd="latexml" id="S3.E11.2.2.2.1.1.3.1.1.cmml" xref="S3.E11.2.2.2.1.1.3.2.1">delimited-[]</csymbol><ci id="S3.E11.1.1.1.cmml" xref="S3.E11.1.1.1">𝑚</ci></apply></apply></apply><ci id="S3.E11.3.1a.cmml" xref="S3.E11.3.1"><mtext id="S3.E11.3.1.cmml" xref="S3.E11.3.1">otherwise</mtext></ci></matrixrow></matrix></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.E12.m1.5c">\displaystyle f_{\mathrm{clip}}(\bm{s};c)[m]=\left\{\begin{array}[]{ll}\bm{s}[% m]&|\bm{s}[m]|<c\\ c\cdot\mathrm{sgn}(\bm{s}[m])&\textrm{otherwise}\end{array}\right.\,\,,</annotation><annotation encoding="application/x-llamapun" id="S3.E12.m1.5d">italic_f start_POSTSUBSCRIPT roman_clip end_POSTSUBSCRIPT ( bold_italic_s ; italic_c ) [ italic_m ] = { start_ARRAY start_ROW start_CELL bold_italic_s [ italic_m ] end_CELL start_CELL | bold_italic_s [ italic_m ] | < italic_c end_CELL end_ROW start_ROW start_CELL italic_c ⋅ roman_sgn ( bold_italic_s [ italic_m ] ) end_CELL start_CELL otherwise end_CELL end_ROW end_ARRAY ,</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(12)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S3.SS6.p1.8">where <math alttext="m" class="ltx_Math" display="inline" id="S3.SS6.p1.2.m1.1"><semantics id="S3.SS6.p1.2.m1.1a"><mi id="S3.SS6.p1.2.m1.1.1" xref="S3.SS6.p1.2.m1.1.1.cmml">m</mi><annotation-xml encoding="MathML-Content" id="S3.SS6.p1.2.m1.1b"><ci id="S3.SS6.p1.2.m1.1.1.cmml" 
xref="S3.SS6.p1.2.m1.1.1">𝑚</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS6.p1.2.m1.1c">m</annotation><annotation encoding="application/x-llamapun" id="S3.SS6.p1.2.m1.1d">italic_m</annotation></semantics></math> is the time index and <math alttext="c" class="ltx_Math" display="inline" id="S3.SS6.p1.3.m2.1"><semantics id="S3.SS6.p1.3.m2.1a"><mi id="S3.SS6.p1.3.m2.1.1" xref="S3.SS6.p1.3.m2.1.1.cmml">c</mi><annotation-xml encoding="MathML-Content" id="S3.SS6.p1.3.m2.1b"><ci id="S3.SS6.p1.3.m2.1.1.cmml" xref="S3.SS6.p1.3.m2.1.1">𝑐</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS6.p1.3.m2.1c">c</annotation><annotation encoding="application/x-llamapun" id="S3.SS6.p1.3.m2.1d">italic_c</annotation></semantics></math> is a clipping threshold. In Sec. <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S6" title="6 Experimental analysis in the declipping task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">6</span></a>, we evaluate the performance of CTT, NyTT, and IterNyTT, where CTT predicts a clean target signal <math alttext="\bm{s}" class="ltx_Math" display="inline" id="S3.SS6.p1.4.m3.1"><semantics id="S3.SS6.p1.4.m3.1a"><mi id="S3.SS6.p1.4.m3.1.1" xref="S3.SS6.p1.4.m3.1.1.cmml">𝒔</mi><annotation-xml encoding="MathML-Content" id="S3.SS6.p1.4.m3.1b"><ci id="S3.SS6.p1.4.m3.1.1.cmml" xref="S3.SS6.p1.4.m3.1.1">𝒔</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS6.p1.4.m3.1c">\bm{s}</annotation><annotation encoding="application/x-llamapun" id="S3.SS6.p1.4.m3.1d">bold_italic_s</annotation></semantics></math> from a clipped signal <math alttext="f_{\mathrm{clip}}(\bm{s};c^{\rm add})" class="ltx_Math" display="inline" id="S3.SS6.p1.5.m4.2"><semantics id="S3.SS6.p1.5.m4.2a"><mrow id="S3.SS6.p1.5.m4.2.2" xref="S3.SS6.p1.5.m4.2.2.cmml"><msub id="S3.SS6.p1.5.m4.2.2.3" xref="S3.SS6.p1.5.m4.2.2.3.cmml"><mi 
id="S3.SS6.p1.5.m4.2.2.3.2" xref="S3.SS6.p1.5.m4.2.2.3.2.cmml">f</mi><mi id="S3.SS6.p1.5.m4.2.2.3.3" xref="S3.SS6.p1.5.m4.2.2.3.3.cmml">clip</mi></msub><mo id="S3.SS6.p1.5.m4.2.2.2" xref="S3.SS6.p1.5.m4.2.2.2.cmml">⁢</mo><mrow id="S3.SS6.p1.5.m4.2.2.1.1" xref="S3.SS6.p1.5.m4.2.2.1.2.cmml"><mo id="S3.SS6.p1.5.m4.2.2.1.1.2" stretchy="false" xref="S3.SS6.p1.5.m4.2.2.1.2.cmml">(</mo><mi id="S3.SS6.p1.5.m4.1.1" xref="S3.SS6.p1.5.m4.1.1.cmml">𝒔</mi><mo id="S3.SS6.p1.5.m4.2.2.1.1.3" xref="S3.SS6.p1.5.m4.2.2.1.2.cmml">;</mo><msup id="S3.SS6.p1.5.m4.2.2.1.1.1" xref="S3.SS6.p1.5.m4.2.2.1.1.1.cmml"><mi id="S3.SS6.p1.5.m4.2.2.1.1.1.2" xref="S3.SS6.p1.5.m4.2.2.1.1.1.2.cmml">c</mi><mi id="S3.SS6.p1.5.m4.2.2.1.1.1.3" xref="S3.SS6.p1.5.m4.2.2.1.1.1.3.cmml">add</mi></msup><mo id="S3.SS6.p1.5.m4.2.2.1.1.4" stretchy="false" xref="S3.SS6.p1.5.m4.2.2.1.2.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS6.p1.5.m4.2b"><apply id="S3.SS6.p1.5.m4.2.2.cmml" xref="S3.SS6.p1.5.m4.2.2"><times id="S3.SS6.p1.5.m4.2.2.2.cmml" xref="S3.SS6.p1.5.m4.2.2.2"></times><apply id="S3.SS6.p1.5.m4.2.2.3.cmml" xref="S3.SS6.p1.5.m4.2.2.3"><csymbol cd="ambiguous" id="S3.SS6.p1.5.m4.2.2.3.1.cmml" xref="S3.SS6.p1.5.m4.2.2.3">subscript</csymbol><ci id="S3.SS6.p1.5.m4.2.2.3.2.cmml" xref="S3.SS6.p1.5.m4.2.2.3.2">𝑓</ci><ci id="S3.SS6.p1.5.m4.2.2.3.3.cmml" xref="S3.SS6.p1.5.m4.2.2.3.3">clip</ci></apply><list id="S3.SS6.p1.5.m4.2.2.1.2.cmml" xref="S3.SS6.p1.5.m4.2.2.1.1"><ci id="S3.SS6.p1.5.m4.1.1.cmml" xref="S3.SS6.p1.5.m4.1.1">𝒔</ci><apply id="S3.SS6.p1.5.m4.2.2.1.1.1.cmml" xref="S3.SS6.p1.5.m4.2.2.1.1.1"><csymbol cd="ambiguous" id="S3.SS6.p1.5.m4.2.2.1.1.1.1.cmml" xref="S3.SS6.p1.5.m4.2.2.1.1.1">superscript</csymbol><ci id="S3.SS6.p1.5.m4.2.2.1.1.1.2.cmml" xref="S3.SS6.p1.5.m4.2.2.1.1.1.2">𝑐</ci><ci id="S3.SS6.p1.5.m4.2.2.1.1.1.3.cmml" xref="S3.SS6.p1.5.m4.2.2.1.1.1.3">add</ci></apply></list></apply></annotation-xml><annotation encoding="application/x-tex" 
id="S3.SS6.p1.5.m4.2c">f_{\mathrm{clip}}(\bm{s};c^{\rm add})</annotation><annotation encoding="application/x-llamapun" id="S3.SS6.p1.5.m4.2d">italic_f start_POSTSUBSCRIPT roman_clip end_POSTSUBSCRIPT ( bold_italic_s ; italic_c start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT )</annotation></semantics></math>, NyTT predicts <math alttext="\bm{x}=f_{\mathrm{clip}}(\bm{s};c^{\rm obs})" class="ltx_Math" display="inline" id="S3.SS6.p1.6.m5.2"><semantics id="S3.SS6.p1.6.m5.2a"><mrow id="S3.SS6.p1.6.m5.2.2" xref="S3.SS6.p1.6.m5.2.2.cmml"><mi id="S3.SS6.p1.6.m5.2.2.3" xref="S3.SS6.p1.6.m5.2.2.3.cmml">𝒙</mi><mo id="S3.SS6.p1.6.m5.2.2.2" xref="S3.SS6.p1.6.m5.2.2.2.cmml">=</mo><mrow id="S3.SS6.p1.6.m5.2.2.1" xref="S3.SS6.p1.6.m5.2.2.1.cmml"><msub id="S3.SS6.p1.6.m5.2.2.1.3" xref="S3.SS6.p1.6.m5.2.2.1.3.cmml"><mi id="S3.SS6.p1.6.m5.2.2.1.3.2" xref="S3.SS6.p1.6.m5.2.2.1.3.2.cmml">f</mi><mi id="S3.SS6.p1.6.m5.2.2.1.3.3" xref="S3.SS6.p1.6.m5.2.2.1.3.3.cmml">clip</mi></msub><mo id="S3.SS6.p1.6.m5.2.2.1.2" xref="S3.SS6.p1.6.m5.2.2.1.2.cmml">⁢</mo><mrow id="S3.SS6.p1.6.m5.2.2.1.1.1" xref="S3.SS6.p1.6.m5.2.2.1.1.2.cmml"><mo id="S3.SS6.p1.6.m5.2.2.1.1.1.2" stretchy="false" xref="S3.SS6.p1.6.m5.2.2.1.1.2.cmml">(</mo><mi id="S3.SS6.p1.6.m5.1.1" xref="S3.SS6.p1.6.m5.1.1.cmml">𝒔</mi><mo id="S3.SS6.p1.6.m5.2.2.1.1.1.3" xref="S3.SS6.p1.6.m5.2.2.1.1.2.cmml">;</mo><msup id="S3.SS6.p1.6.m5.2.2.1.1.1.1" xref="S3.SS6.p1.6.m5.2.2.1.1.1.1.cmml"><mi id="S3.SS6.p1.6.m5.2.2.1.1.1.1.2" xref="S3.SS6.p1.6.m5.2.2.1.1.1.1.2.cmml">c</mi><mi id="S3.SS6.p1.6.m5.2.2.1.1.1.1.3" xref="S3.SS6.p1.6.m5.2.2.1.1.1.1.3.cmml">obs</mi></msup><mo id="S3.SS6.p1.6.m5.2.2.1.1.1.4" stretchy="false" xref="S3.SS6.p1.6.m5.2.2.1.1.2.cmml">)</mo></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS6.p1.6.m5.2b"><apply id="S3.SS6.p1.6.m5.2.2.cmml" xref="S3.SS6.p1.6.m5.2.2"><eq id="S3.SS6.p1.6.m5.2.2.2.cmml" xref="S3.SS6.p1.6.m5.2.2.2"></eq><ci id="S3.SS6.p1.6.m5.2.2.3.cmml"
xref="S3.SS6.p1.6.m5.2.2.3">𝒙</ci><apply id="S3.SS6.p1.6.m5.2.2.1.cmml" xref="S3.SS6.p1.6.m5.2.2.1"><times id="S3.SS6.p1.6.m5.2.2.1.2.cmml" xref="S3.SS6.p1.6.m5.2.2.1.2"></times><apply id="S3.SS6.p1.6.m5.2.2.1.3.cmml" xref="S3.SS6.p1.6.m5.2.2.1.3"><csymbol cd="ambiguous" id="S3.SS6.p1.6.m5.2.2.1.3.1.cmml" xref="S3.SS6.p1.6.m5.2.2.1.3">subscript</csymbol><ci id="S3.SS6.p1.6.m5.2.2.1.3.2.cmml" xref="S3.SS6.p1.6.m5.2.2.1.3.2">𝑓</ci><ci id="S3.SS6.p1.6.m5.2.2.1.3.3.cmml" xref="S3.SS6.p1.6.m5.2.2.1.3.3">clip</ci></apply><list id="S3.SS6.p1.6.m5.2.2.1.1.2.cmml" xref="S3.SS6.p1.6.m5.2.2.1.1.1"><ci id="S3.SS6.p1.6.m5.1.1.cmml" xref="S3.SS6.p1.6.m5.1.1">𝒔</ci><apply id="S3.SS6.p1.6.m5.2.2.1.1.1.1.cmml" xref="S3.SS6.p1.6.m5.2.2.1.1.1.1"><csymbol cd="ambiguous" id="S3.SS6.p1.6.m5.2.2.1.1.1.1.1.cmml" xref="S3.SS6.p1.6.m5.2.2.1.1.1.1">superscript</csymbol><ci id="S3.SS6.p1.6.m5.2.2.1.1.1.1.2.cmml" xref="S3.SS6.p1.6.m5.2.2.1.1.1.1.2">𝑐</ci><ci id="S3.SS6.p1.6.m5.2.2.1.1.1.1.3.cmml" xref="S3.SS6.p1.6.m5.2.2.1.1.1.1.3">obs</ci></apply></list></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS6.p1.6.m5.2c">\bm{x}=f_{\mathrm{clip}}(\bm{s};c^{\rm obs})</annotation><annotation encoding="application/x-llamapun" id="S3.SS6.p1.6.m5.2d">bold_italic_x = italic_f start_POSTSUBSCRIPT roman_clip end_POSTSUBSCRIPT ( bold_italic_s ; italic_c start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT )</annotation></semantics></math> from <math alttext="f_{\mathrm{clip}}(\bm{x};c^{\rm add})" class="ltx_Math" display="inline" id="S3.SS6.p1.7.m6.2"><semantics id="S3.SS6.p1.7.m6.2a"><mrow id="S3.SS6.p1.7.m6.2.2" xref="S3.SS6.p1.7.m6.2.2.cmml"><msub id="S3.SS6.p1.7.m6.2.2.3" xref="S3.SS6.p1.7.m6.2.2.3.cmml"><mi id="S3.SS6.p1.7.m6.2.2.3.2" xref="S3.SS6.p1.7.m6.2.2.3.2.cmml">f</mi><mi id="S3.SS6.p1.7.m6.2.2.3.3" xref="S3.SS6.p1.7.m6.2.2.3.3.cmml">clip</mi></msub><mo id="S3.SS6.p1.7.m6.2.2.2" xref="S3.SS6.p1.7.m6.2.2.2.cmml">⁢</mo><mrow id="S3.SS6.p1.7.m6.2.2.1.1" 
xref="S3.SS6.p1.7.m6.2.2.1.2.cmml"><mo id="S3.SS6.p1.7.m6.2.2.1.1.2" stretchy="false" xref="S3.SS6.p1.7.m6.2.2.1.2.cmml">(</mo><mi id="S3.SS6.p1.7.m6.1.1" xref="S3.SS6.p1.7.m6.1.1.cmml">𝒙</mi><mo id="S3.SS6.p1.7.m6.2.2.1.1.3" xref="S3.SS6.p1.7.m6.2.2.1.2.cmml">;</mo><msup id="S3.SS6.p1.7.m6.2.2.1.1.1" xref="S3.SS6.p1.7.m6.2.2.1.1.1.cmml"><mi id="S3.SS6.p1.7.m6.2.2.1.1.1.2" xref="S3.SS6.p1.7.m6.2.2.1.1.1.2.cmml">c</mi><mi id="S3.SS6.p1.7.m6.2.2.1.1.1.3" xref="S3.SS6.p1.7.m6.2.2.1.1.1.3.cmml">add</mi></msup><mo id="S3.SS6.p1.7.m6.2.2.1.1.4" stretchy="false" xref="S3.SS6.p1.7.m6.2.2.1.2.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS6.p1.7.m6.2b"><apply id="S3.SS6.p1.7.m6.2.2.cmml" xref="S3.SS6.p1.7.m6.2.2"><times id="S3.SS6.p1.7.m6.2.2.2.cmml" xref="S3.SS6.p1.7.m6.2.2.2"></times><apply id="S3.SS6.p1.7.m6.2.2.3.cmml" xref="S3.SS6.p1.7.m6.2.2.3"><csymbol cd="ambiguous" id="S3.SS6.p1.7.m6.2.2.3.1.cmml" xref="S3.SS6.p1.7.m6.2.2.3">subscript</csymbol><ci id="S3.SS6.p1.7.m6.2.2.3.2.cmml" xref="S3.SS6.p1.7.m6.2.2.3.2">𝑓</ci><ci id="S3.SS6.p1.7.m6.2.2.3.3.cmml" xref="S3.SS6.p1.7.m6.2.2.3.3">clip</ci></apply><list id="S3.SS6.p1.7.m6.2.2.1.2.cmml" xref="S3.SS6.p1.7.m6.2.2.1.1"><ci id="S3.SS6.p1.7.m6.1.1.cmml" xref="S3.SS6.p1.7.m6.1.1">𝒙</ci><apply id="S3.SS6.p1.7.m6.2.2.1.1.1.cmml" xref="S3.SS6.p1.7.m6.2.2.1.1.1"><csymbol cd="ambiguous" id="S3.SS6.p1.7.m6.2.2.1.1.1.1.cmml" xref="S3.SS6.p1.7.m6.2.2.1.1.1">superscript</csymbol><ci id="S3.SS6.p1.7.m6.2.2.1.1.1.2.cmml" xref="S3.SS6.p1.7.m6.2.2.1.1.1.2">𝑐</ci><ci id="S3.SS6.p1.7.m6.2.2.1.1.1.3.cmml" xref="S3.SS6.p1.7.m6.2.2.1.1.1.3">add</ci></apply></list></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS6.p1.7.m6.2c">f_{\mathrm{clip}}(\bm{x};c^{\rm add})</annotation><annotation encoding="application/x-llamapun" id="S3.SS6.p1.7.m6.2d">italic_f start_POSTSUBSCRIPT roman_clip end_POSTSUBSCRIPT ( bold_italic_x ; italic_c start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT 
)</annotation></semantics></math>, and <math alttext="c^{\rm add}<c^{\rm obs}" class="ltx_Math" display="inline" id="S3.SS6.p1.8.m7.1"><semantics id="S3.SS6.p1.8.m7.1a"><mrow id="S3.SS6.p1.8.m7.1.1" xref="S3.SS6.p1.8.m7.1.1.cmml"><msup id="S3.SS6.p1.8.m7.1.1.2" xref="S3.SS6.p1.8.m7.1.1.2.cmml"><mi id="S3.SS6.p1.8.m7.1.1.2.2" xref="S3.SS6.p1.8.m7.1.1.2.2.cmml">c</mi><mi id="S3.SS6.p1.8.m7.1.1.2.3" xref="S3.SS6.p1.8.m7.1.1.2.3.cmml">add</mi></msup><mo id="S3.SS6.p1.8.m7.1.1.1" xref="S3.SS6.p1.8.m7.1.1.1.cmml"><</mo><msup id="S3.SS6.p1.8.m7.1.1.3" xref="S3.SS6.p1.8.m7.1.1.3.cmml"><mi id="S3.SS6.p1.8.m7.1.1.3.2" xref="S3.SS6.p1.8.m7.1.1.3.2.cmml">c</mi><mi id="S3.SS6.p1.8.m7.1.1.3.3" xref="S3.SS6.p1.8.m7.1.1.3.3.cmml">obs</mi></msup></mrow><annotation-xml encoding="MathML-Content" id="S3.SS6.p1.8.m7.1b"><apply id="S3.SS6.p1.8.m7.1.1.cmml" xref="S3.SS6.p1.8.m7.1.1"><lt id="S3.SS6.p1.8.m7.1.1.1.cmml" xref="S3.SS6.p1.8.m7.1.1.1"></lt><apply id="S3.SS6.p1.8.m7.1.1.2.cmml" xref="S3.SS6.p1.8.m7.1.1.2"><csymbol cd="ambiguous" id="S3.SS6.p1.8.m7.1.1.2.1.cmml" xref="S3.SS6.p1.8.m7.1.1.2">superscript</csymbol><ci id="S3.SS6.p1.8.m7.1.1.2.2.cmml" xref="S3.SS6.p1.8.m7.1.1.2.2">𝑐</ci><ci id="S3.SS6.p1.8.m7.1.1.2.3.cmml" xref="S3.SS6.p1.8.m7.1.1.2.3">add</ci></apply><apply id="S3.SS6.p1.8.m7.1.1.3.cmml" xref="S3.SS6.p1.8.m7.1.1.3"><csymbol cd="ambiguous" id="S3.SS6.p1.8.m7.1.1.3.1.cmml" xref="S3.SS6.p1.8.m7.1.1.3">superscript</csymbol><ci id="S3.SS6.p1.8.m7.1.1.3.2.cmml" xref="S3.SS6.p1.8.m7.1.1.3.2">𝑐</ci><ci id="S3.SS6.p1.8.m7.1.1.3.3.cmml" xref="S3.SS6.p1.8.m7.1.1.3.3">obs</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS6.p1.8.m7.1c">c^{\rm add}<c^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S3.SS6.p1.8.m7.1d">italic_c start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT < italic_c start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math>.</p> </div> <figure class="ltx_table" id="S3.T1"> 
<figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_table">Table 1: </span>Datasets used in the denoising task</figcaption> <div class="ltx_inline-block ltx_align_center ltx_transformed_outer" id="S3.T1.2.2" style="width:433.6pt;height:96pt;vertical-align:-0.8pt;"><span class="ltx_transformed_inner" style="transform:translate(-41.5pt,9.1pt) scale(0.839234880053318,0.839234880053318) ;"> <table class="ltx_tabular ltx_align_middle" id="S3.T1.2.2.2"> <tr class="ltx_tr" id="S3.T1.2.2.2.3"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_tt" id="S3.T1.2.2.2.3.1" style="padding-top:1pt;padding-bottom:1pt;">Signal type</td> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_tt" id="S3.T1.2.2.2.3.2" style="padding-top:1pt;padding-bottom:1pt;">Original Dataset</td> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_tt" colspan="2" id="S3.T1.2.2.2.3.3" style="padding-top:1pt;padding-bottom:1pt;">Train set</td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S3.T1.2.2.2.3.4" style="padding-top:1pt;padding-bottom:1pt;">Test set</td> </tr> <tr class="ltx_tr" id="S3.T1.1.1.1.1"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T1.1.1.1.1.1" style="padding-top:1pt;padding-bottom:1pt;"> <span class="ltx_text" id="S3.T1.1.1.1.1.1.2"></span><span class="ltx_text" id="S3.T1.1.1.1.1.1.1"> <span class="ltx_tabular ltx_align_middle" id="S3.T1.1.1.1.1.1.1.1"> <span class="ltx_tr" id="S3.T1.1.1.1.1.1.1.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_left" id="S3.T1.1.1.1.1.1.1.1.1.1" style="padding-top:1pt;padding-bottom:1pt;">Clean signal <math alttext="\bm{s}" class="ltx_Math" display="inline" id="S3.T1.1.1.1.1.1.1.1.1.1.m1.1"><semantics id="S3.T1.1.1.1.1.1.1.1.1.1.m1.1a"><mi id="S3.T1.1.1.1.1.1.1.1.1.1.m1.1.1" xref="S3.T1.1.1.1.1.1.1.1.1.1.m1.1.1.cmml">𝒔</mi><annotation-xml encoding="MathML-Content" id="S3.T1.1.1.1.1.1.1.1.1.1.m1.1b"><ci id="S3.T1.1.1.1.1.1.1.1.1.1.m1.1.1.cmml" 
xref="S3.T1.1.1.1.1.1.1.1.1.1.m1.1.1">𝒔</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.T1.1.1.1.1.1.1.1.1.1.m1.1c">\bm{s}</annotation><annotation encoding="application/x-llamapun" id="S3.T1.1.1.1.1.1.1.1.1.1.m1.1d">bold_italic_s</annotation></semantics></math></span></span> <span class="ltx_tr" id="S3.T1.1.1.1.1.1.1.1.2"> <span class="ltx_td ltx_nopad_r ltx_align_left" id="S3.T1.1.1.1.1.1.1.1.2.1" style="padding-top:1pt;padding-bottom:1pt;">(Utterances)</span></span> </span></span><span class="ltx_text" id="S3.T1.1.1.1.1.1.3"></span></td> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T1.1.1.1.1.2" style="padding-top:1pt;padding-bottom:1pt;">LibriSpeech <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">libri</span>]</cite> </td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" colspan="2" id="S3.T1.1.1.1.1.3" style="padding-top:1pt;padding-bottom:1pt;">(10,000)</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.1.1.1.1.4" style="padding-top:1pt;padding-bottom:1pt;">(1,000)</td> </tr> <tr class="ltx_tr" id="S3.T1.2.2.2.2"> <td class="ltx_td ltx_align_left ltx_border_bb ltx_border_r ltx_border_t" id="S3.T1.2.2.2.2.1" rowspan="3" style="padding-top:1pt;padding-bottom:1pt;"><span class="ltx_text" id="S3.T1.2.2.2.2.1.1"><span class="ltx_text" id="S3.T1.2.2.2.2.1.1.2"></span><span class="ltx_text" id="S3.T1.2.2.2.2.1.1.1"> <span class="ltx_tabular ltx_align_middle" id="S3.T1.2.2.2.2.1.1.1.1"> <span class="ltx_tr" id="S3.T1.2.2.2.2.1.1.1.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_left" id="S3.T1.2.2.2.2.1.1.1.1.1.1" style="padding-top:1pt;padding-bottom:1pt;">Noise <math alttext="\bm{n}" class="ltx_Math" display="inline" id="S3.T1.2.2.2.2.1.1.1.1.1.1.m1.1"><semantics id="S3.T1.2.2.2.2.1.1.1.1.1.1.m1.1a"><mi id="S3.T1.2.2.2.2.1.1.1.1.1.1.m1.1.1" xref="S3.T1.2.2.2.2.1.1.1.1.1.1.m1.1.1.cmml">𝒏</mi><annotation-xml encoding="MathML-Content" 
id="S3.T1.2.2.2.2.1.1.1.1.1.1.m1.1b"><ci id="S3.T1.2.2.2.2.1.1.1.1.1.1.m1.1.1.cmml" xref="S3.T1.2.2.2.2.1.1.1.1.1.1.m1.1.1">𝒏</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.T1.2.2.2.2.1.1.1.1.1.1.m1.1c">\bm{n}</annotation><annotation encoding="application/x-llamapun" id="S3.T1.2.2.2.2.1.1.1.1.1.1.m1.1d">bold_italic_n</annotation></semantics></math></span></span> <span class="ltx_tr" id="S3.T1.2.2.2.2.1.1.1.1.2"> <span class="ltx_td ltx_nopad_r ltx_align_left" id="S3.T1.2.2.2.2.1.1.1.1.2.1" style="padding-top:1pt;padding-bottom:1pt;">(Volume [h])</span></span> </span></span> <span class="ltx_text" id="S3.T1.2.2.2.2.1.1.3"></span></span></td> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T1.2.2.2.2.2" style="padding-top:1pt;padding-bottom:1pt;">CHiME3 <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">chime3</span>]</cite> </td> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T1.2.2.2.2.3" style="padding-top:1pt;padding-bottom:1pt;"> <span class="ltx_text ltx_font_typewriter" id="S3.T1.2.2.2.2.3.1">CHiME-A</span> (3.92)</td> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T1.2.2.2.2.4" style="padding-top:1pt;padding-bottom:1pt;"> <span class="ltx_text ltx_font_typewriter" id="S3.T1.2.2.2.2.4.1">CHiME-B</span> (3.92)</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.2.2.2.2.5" style="padding-top:1pt;padding-bottom:1pt;"> <span class="ltx_text ltx_font_typewriter" id="S3.T1.2.2.2.2.5.1">CHiME-C</span> (0.56)</td> </tr> <tr class="ltx_tr" id="S3.T1.2.2.2.4"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T1.2.2.2.4.1" style="padding-top:1pt;padding-bottom:1pt;">VoiceBank-DEMAND <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">vbd</span>]</cite> </td> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T1.2.2.2.4.2" 
style="padding-top:1pt;padding-bottom:1pt;"> <span class="ltx_text ltx_font_typewriter" id="S3.T1.2.2.2.4.2.1">DEMAND-A</span> (4.70)</td> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T1.2.2.2.4.3" style="padding-top:1pt;padding-bottom:1pt;"> <span class="ltx_text ltx_font_typewriter" id="S3.T1.2.2.2.4.3.1">DEMAND-B</span> (4.69)</td> <td class="ltx_td ltx_border_t" id="S3.T1.2.2.2.4.4" style="padding-top:1pt;padding-bottom:1pt;"></td> </tr> <tr class="ltx_tr" id="S3.T1.2.2.2.5"> <td class="ltx_td ltx_align_left ltx_border_bb ltx_border_r ltx_border_t" id="S3.T1.2.2.2.5.1" style="padding-top:1pt;padding-bottom:1pt;">DCASE 2016 Task2 <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">dcase</span>]</cite> </td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r ltx_border_t" colspan="2" id="S3.T1.2.2.2.5.2" style="padding-top:1pt;padding-bottom:1pt;"> <span class="ltx_text ltx_font_typewriter" id="S3.T1.2.2.2.5.2.1">DCASE</span> (0.07)</td> <td class="ltx_td ltx_border_bb ltx_border_t" id="S3.T1.2.2.2.5.3" style="padding-top:1pt;padding-bottom:1pt;"></td> </tr> </table> </span></div> </figure> </section> </section> <section class="ltx_section" id="S4"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">4 </span>Experimental analysis in the denoising task</h2> <div class="ltx_para" id="S4.p1"> <p class="ltx_p" id="S4.p1.1">One promising application of unsupervised TSE methods is the extraction of environmental sounds, as clean human speech corpora have already been created through extensive community efforts <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">libri</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">timit</span>]</cite>. 
However, in our experiments, we used clean human speech as the target signal, because this allows us to control the quality and volume of the noisy signals by distorting the clean speech corpora. We believe that the insights gained from our experiments are equally applicable to other types of target signal.</p> </div> <section class="ltx_subsection" id="S4.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">4.1 </span>Setups</h3> <div class="ltx_para" id="S4.SS1.p1"> <p class="ltx_p" id="S4.SS1.p1.9">In the experiments, we used the datasets listed in Table <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S3.T1" title="Table 1 ‣ 3.6 Capabilities in the declipping task ‣ 3 Motivation and content of the investigation ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">1</span></a>, including LibriSpeech <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">libri</span>]</cite>, CHiME3 <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">chime3</span>]</cite>, noise extracted from the training dataset of VoiceBank-DEMAND <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">vbd</span>]</cite>, and the training set of the DCASE 2016 Challenge Task2 dataset (<span class="ltx_text ltx_font_typewriter" id="S4.SS1.p1.9.1">DCASE</span>) <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">dcase</span>]</cite>. CHiME3 included background noise recorded in a bus, a cafe, a pedestrian area, and a street junction. VoiceBank-DEMAND included noise recorded in a kitchen, an office, a cafe, and a subway, along with artificially synthesized babble and white noise. The DCASE 2016 Task2 dataset included sounds of coughing, door knocking, and telephone ringing. 
Thus, there are differences in the types of noise across these three datasets. The noise signals from CHiME3 were segmented every <math alttext="10\text{\,}\mathrm{s}" class="ltx_Math" display="inline" id="S4.SS1.p1.1.m1.3"><semantics id="S4.SS1.p1.1.m1.3a"><mrow id="S4.SS1.p1.1.m1.3.3" xref="S4.SS1.p1.1.m1.3.3.cmml"><mn id="S4.SS1.p1.1.m1.1.1.1.1.1.1" xref="S4.SS1.p1.1.m1.1.1.1.1.1.1.cmml">10</mn><mtext id="S4.SS1.p1.1.m1.2.2.2.2.2.2" xref="S4.SS1.p1.1.m1.2.2.2.2.2.2.cmml"> </mtext><mi id="S4.SS1.p1.1.m1.3.3.3.3.3.3" mathvariant="normal" xref="S4.SS1.p1.1.m1.3.3.3.3.3.3.cmml">s</mi></mrow><annotation-xml encoding="MathML-Content" id="S4.SS1.p1.1.m1.3b"><apply id="S4.SS1.p1.1.m1.3.3.cmml" xref="S4.SS1.p1.1.m1.3.3"><csymbol cd="latexml" id="S4.SS1.p1.1.m1.2.2.2.2.2.2.cmml" xref="S4.SS1.p1.1.m1.2.2.2.2.2.2">times</csymbol><cn id="S4.SS1.p1.1.m1.1.1.1.1.1.1.cmml" type="integer" xref="S4.SS1.p1.1.m1.1.1.1.1.1.1">10</cn><ci id="S4.SS1.p1.1.m1.3.3.3.3.3.3.cmml" xref="S4.SS1.p1.1.m1.3.3.3.3.3.3">s</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS1.p1.1.m1.3c">10\text{\,}\mathrm{s}</annotation><annotation encoding="application/x-llamapun" id="S4.SS1.p1.1.m1.3d">start_ARG 10 end_ARG start_ARG times end_ARG start_ARG roman_s end_ARG</annotation></semantics></math>, and <math alttext="7.83\text{\,}\mathrm{h}" class="ltx_Math" display="inline" id="S4.SS1.p1.2.m2.3"><semantics id="S4.SS1.p1.2.m2.3a"><mrow id="S4.SS1.p1.2.m2.3.3" xref="S4.SS1.p1.2.m2.3.3.cmml"><mn id="S4.SS1.p1.2.m2.1.1.1.1.1.1" xref="S4.SS1.p1.2.m2.1.1.1.1.1.1.cmml">7.83</mn><mtext id="S4.SS1.p1.2.m2.2.2.2.2.2.2" xref="S4.SS1.p1.2.m2.2.2.2.2.2.2.cmml"> </mtext><mi id="S4.SS1.p1.2.m2.3.3.3.3.3.3" mathvariant="normal" xref="S4.SS1.p1.2.m2.3.3.3.3.3.3.cmml">h</mi></mrow><annotation-xml encoding="MathML-Content" id="S4.SS1.p1.2.m2.3b"><apply id="S4.SS1.p1.2.m2.3.3.cmml" xref="S4.SS1.p1.2.m2.3.3"><csymbol cd="latexml" id="S4.SS1.p1.2.m2.2.2.2.2.2.2.cmml" 
xref="S4.SS1.p1.2.m2.2.2.2.2.2.2">times</csymbol><cn id="S4.SS1.p1.2.m2.1.1.1.1.1.1.cmml" type="float" xref="S4.SS1.p1.2.m2.1.1.1.1.1.1">7.83</cn><ci id="S4.SS1.p1.2.m2.3.3.3.3.3.3.cmml" xref="S4.SS1.p1.2.m2.3.3.3.3.3.3">h</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS1.p1.2.m2.3c">7.83\text{\,}\mathrm{h}</annotation><annotation encoding="application/x-llamapun" id="S4.SS1.p1.2.m2.3d">start_ARG 7.83 end_ARG start_ARG times end_ARG start_ARG roman_h end_ARG</annotation></semantics></math> of data were split into <span class="ltx_text ltx_font_typewriter" id="S4.SS1.p1.9.2">CHiME-A</span> and <span class="ltx_text ltx_font_typewriter" id="S4.SS1.p1.9.3">CHiME-B</span>, and another <math alttext="0.56\text{\,}\mathrm{h}" class="ltx_Math" display="inline" id="S4.SS1.p1.3.m3.3"><semantics id="S4.SS1.p1.3.m3.3a"><mrow id="S4.SS1.p1.3.m3.3.3" xref="S4.SS1.p1.3.m3.3.3.cmml"><mn id="S4.SS1.p1.3.m3.1.1.1.1.1.1" xref="S4.SS1.p1.3.m3.1.1.1.1.1.1.cmml">0.56</mn><mtext id="S4.SS1.p1.3.m3.2.2.2.2.2.2" xref="S4.SS1.p1.3.m3.2.2.2.2.2.2.cmml"> </mtext><mi id="S4.SS1.p1.3.m3.3.3.3.3.3.3" mathvariant="normal" xref="S4.SS1.p1.3.m3.3.3.3.3.3.3.cmml">h</mi></mrow><annotation-xml encoding="MathML-Content" id="S4.SS1.p1.3.m3.3b"><apply id="S4.SS1.p1.3.m3.3.3.cmml" xref="S4.SS1.p1.3.m3.3.3"><csymbol cd="latexml" id="S4.SS1.p1.3.m3.2.2.2.2.2.2.cmml" xref="S4.SS1.p1.3.m3.2.2.2.2.2.2">times</csymbol><cn id="S4.SS1.p1.3.m3.1.1.1.1.1.1.cmml" type="float" xref="S4.SS1.p1.3.m3.1.1.1.1.1.1">0.56</cn><ci id="S4.SS1.p1.3.m3.3.3.3.3.3.3.cmml" xref="S4.SS1.p1.3.m3.3.3.3.3.3.3">h</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS1.p1.3.m3.3c">0.56\text{\,}\mathrm{h}</annotation><annotation encoding="application/x-llamapun" id="S4.SS1.p1.3.m3.3d">start_ARG 0.56 end_ARG start_ARG times end_ARG start_ARG roman_h end_ARG</annotation></semantics></math> of data were used as <span class="ltx_text ltx_font_typewriter" 
id="S4.SS1.p1.9.4">CHiME-C</span>. The 11,572 clips of VoiceBank-DEMAND were split into two subsets, <span class="ltx_text ltx_font_typewriter" id="S4.SS1.p1.9.5">DEMAND-A</span> and <span class="ltx_text ltx_font_typewriter" id="S4.SS1.p1.9.6">DEMAND-B</span>. The noisy-target training dataset was generated by mixing 10,000 utterances of clean target signals from LibriSpeech with noise signals <math alttext="\bm{n}^{\mathrm{obs}}" class="ltx_Math" display="inline" id="S4.SS1.p1.4.m4.1"><semantics id="S4.SS1.p1.4.m4.1a"><msup id="S4.SS1.p1.4.m4.1.1" xref="S4.SS1.p1.4.m4.1.1.cmml"><mi id="S4.SS1.p1.4.m4.1.1.2" xref="S4.SS1.p1.4.m4.1.1.2.cmml">𝒏</mi><mi id="S4.SS1.p1.4.m4.1.1.3" xref="S4.SS1.p1.4.m4.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS1.p1.4.m4.1b"><apply id="S4.SS1.p1.4.m4.1.1.cmml" xref="S4.SS1.p1.4.m4.1.1"><csymbol cd="ambiguous" id="S4.SS1.p1.4.m4.1.1.1.cmml" xref="S4.SS1.p1.4.m4.1.1">superscript</csymbol><ci id="S4.SS1.p1.4.m4.1.1.2.cmml" xref="S4.SS1.p1.4.m4.1.1.2">𝒏</ci><ci id="S4.SS1.p1.4.m4.1.1.3.cmml" xref="S4.SS1.p1.4.m4.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS1.p1.4.m4.1c">\bm{n}^{\mathrm{obs}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS1.p1.4.m4.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> at <math alttext="\mathrm{SNR}_{\bm{x}}" class="ltx_Math" display="inline" id="S4.SS1.p1.5.m5.1"><semantics id="S4.SS1.p1.5.m5.1a"><msub id="S4.SS1.p1.5.m5.1.1" xref="S4.SS1.p1.5.m5.1.1.cmml"><mi id="S4.SS1.p1.5.m5.1.1.2" xref="S4.SS1.p1.5.m5.1.1.2.cmml">SNR</mi><mi id="S4.SS1.p1.5.m5.1.1.3" xref="S4.SS1.p1.5.m5.1.1.3.cmml">𝒙</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS1.p1.5.m5.1b"><apply id="S4.SS1.p1.5.m5.1.1.cmml" xref="S4.SS1.p1.5.m5.1.1"><csymbol cd="ambiguous" id="S4.SS1.p1.5.m5.1.1.1.cmml" xref="S4.SS1.p1.5.m5.1.1">subscript</csymbol><ci id="S4.SS1.p1.5.m5.1.1.2.cmml"
xref="S4.SS1.p1.5.m5.1.1.2">SNR</ci><ci id="S4.SS1.p1.5.m5.1.1.3.cmml" xref="S4.SS1.p1.5.m5.1.1.3">𝒙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS1.p1.5.m5.1c">\mathrm{SNR}_{\bm{x}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS1.p1.5.m5.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT</annotation></semantics></math> randomly selected from 0, 5, 10, and 15 dB. The test dataset was generated by mixing 1,000 utterances of clean target signals from LibriSpeech with <span class="ltx_text ltx_font_typewriter" id="S4.SS1.p1.9.7">CHiME-C</span> noise at an SNR randomly selected from 2.5, 7.5, 12.5, and 17.5 dB. During training, the input signal to the DNN was generated by mixing target signals and additional noise signals, and <math alttext="\mathrm{SNR}_{\bm{y}}" class="ltx_Math" display="inline" id="S4.SS1.p1.6.m6.1"><semantics id="S4.SS1.p1.6.m6.1a"><msub id="S4.SS1.p1.6.m6.1.1" xref="S4.SS1.p1.6.m6.1.1.cmml"><mi id="S4.SS1.p1.6.m6.1.1.2" xref="S4.SS1.p1.6.m6.1.1.2.cmml">SNR</mi><mi id="S4.SS1.p1.6.m6.1.1.3" xref="S4.SS1.p1.6.m6.1.1.3.cmml">𝒚</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS1.p1.6.m6.1b"><apply id="S4.SS1.p1.6.m6.1.1.cmml" xref="S4.SS1.p1.6.m6.1.1"><csymbol cd="ambiguous" id="S4.SS1.p1.6.m6.1.1.1.cmml" xref="S4.SS1.p1.6.m6.1.1">subscript</csymbol><ci id="S4.SS1.p1.6.m6.1.1.2.cmml" xref="S4.SS1.p1.6.m6.1.1.2">SNR</ci><ci id="S4.SS1.p1.6.m6.1.1.3.cmml" xref="S4.SS1.p1.6.m6.1.1.3">𝒚</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS1.p1.6.m6.1c">\mathrm{SNR}_{\bm{y}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS1.p1.6.m6.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_y end_POSTSUBSCRIPT</annotation></semantics></math> ranged from -5 to 5 dB for NyTT and took the values 0, 5, 10, and 15 dB for CTT and for IterNyTT after the second iteration. 
We evaluated the performance using the best validation epoch, where the validation was conducted with 50 pairs of input and target signals generated under the same <math alttext="\mathrm{SNR}_{\bm{y}}" class="ltx_Math" display="inline" id="S4.SS1.p1.7.m7.1"><semantics id="S4.SS1.p1.7.m7.1a"><msub id="S4.SS1.p1.7.m7.1.1" xref="S4.SS1.p1.7.m7.1.1.cmml"><mi id="S4.SS1.p1.7.m7.1.1.2" xref="S4.SS1.p1.7.m7.1.1.2.cmml">SNR</mi><mi id="S4.SS1.p1.7.m7.1.1.3" xref="S4.SS1.p1.7.m7.1.1.3.cmml">𝒚</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS1.p1.7.m7.1b"><apply id="S4.SS1.p1.7.m7.1.1.cmml" xref="S4.SS1.p1.7.m7.1.1"><csymbol cd="ambiguous" id="S4.SS1.p1.7.m7.1.1.1.cmml" xref="S4.SS1.p1.7.m7.1.1">subscript</csymbol><ci id="S4.SS1.p1.7.m7.1.1.2.cmml" xref="S4.SS1.p1.7.m7.1.1.2">SNR</ci><ci id="S4.SS1.p1.7.m7.1.1.3.cmml" xref="S4.SS1.p1.7.m7.1.1.3">𝒚</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS1.p1.7.m7.1c">\mathrm{SNR}_{\bm{y}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS1.p1.7.m7.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_y end_POSTSUBSCRIPT</annotation></semantics></math>, <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS1.p1.8.m8.1"><semantics id="S4.SS1.p1.8.m8.1a"><msup id="S4.SS1.p1.8.m8.1.1" xref="S4.SS1.p1.8.m8.1.1.cmml"><mi id="S4.SS1.p1.8.m8.1.1.2" xref="S4.SS1.p1.8.m8.1.1.2.cmml">𝒏</mi><mi id="S4.SS1.p1.8.m8.1.1.3" xref="S4.SS1.p1.8.m8.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS1.p1.8.m8.1b"><apply id="S4.SS1.p1.8.m8.1.1.cmml" xref="S4.SS1.p1.8.m8.1.1"><csymbol cd="ambiguous" id="S4.SS1.p1.8.m8.1.1.1.cmml" xref="S4.SS1.p1.8.m8.1.1">superscript</csymbol><ci id="S4.SS1.p1.8.m8.1.1.2.cmml" xref="S4.SS1.p1.8.m8.1.1.2">𝒏</ci><ci id="S4.SS1.p1.8.m8.1.1.3.cmml" xref="S4.SS1.p1.8.m8.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS1.p1.8.m8.1c">\bm{n}^{\rm obs}</annotation><annotation 
encoding="application/x-llamapun" id="S4.SS1.p1.8.m8.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math>, and <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS1.p1.9.m9.1"><semantics id="S4.SS1.p1.9.m9.1a"><msup id="S4.SS1.p1.9.m9.1.1" xref="S4.SS1.p1.9.m9.1.1.cmml"><mi id="S4.SS1.p1.9.m9.1.1.2" xref="S4.SS1.p1.9.m9.1.1.2.cmml">𝒏</mi><mi id="S4.SS1.p1.9.m9.1.1.3" xref="S4.SS1.p1.9.m9.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS1.p1.9.m9.1b"><apply id="S4.SS1.p1.9.m9.1.1.cmml" xref="S4.SS1.p1.9.m9.1.1"><csymbol cd="ambiguous" id="S4.SS1.p1.9.m9.1.1.1.cmml" xref="S4.SS1.p1.9.m9.1.1">superscript</csymbol><ci id="S4.SS1.p1.9.m9.1.1.2.cmml" xref="S4.SS1.p1.9.m9.1.1.2">𝒏</ci><ci id="S4.SS1.p1.9.m9.1.1.3.cmml" xref="S4.SS1.p1.9.m9.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS1.p1.9.m9.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS1.p1.9.m9.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> as in the training dataset.</p> </div> <div class="ltx_para" id="S4.SS1.p2"> <p class="ltx_p" id="S4.SS1.p2.1">The DNN architecture was the CNN-BLSTM used in <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">Kawanaka_2020</span>]</cite>. The input feature was the log-amplitude spectrogram, and the network estimated a complex-valued T–F mask. For the short-time Fourier transform (STFT), the frame shift, window size, and DFT size were set to 128, 512, and 512 samples, respectively, with a Hamming window and a sampling frequency of 16 kHz. 
We trained the DNN for 1,500 epochs with a mini-batch size of 50, using the Adam optimizer <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">Kingma_2015</span>]</cite> with a fixed learning rate of 0.0001. The loss function was the MSE calculated in the time domain. As evaluation metrics, we used the scale-invariant signal-to-distortion ratio (SI-SDR) <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">Roux_2019</span>]</cite>, the perceptual evaluation of speech quality (PESQ) <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">pesq</span>]</cite>, and the short-time objective intelligibility (STOI) <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">taal2011algorithm</span>]</cite>.</p> </div> </section> <section class="ltx_subsection" id="S4.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">4.2 </span>Validity of interpretation of NyTT</h3> <div class="ltx_para" id="S4.SS2.p1"> <p class="ltx_p" id="S4.SS2.p1.2">In this section, we trained the DNN using <span class="ltx_text ltx_font_typewriter" id="S4.SS2.p1.2.1">CHiME-A</span> and <span class="ltx_text ltx_font_typewriter" id="S4.SS2.p1.2.2">CHiME-B</span> as <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS2.p1.1.m1.1"><semantics id="S4.SS2.p1.1.m1.1a"><msup id="S4.SS2.p1.1.m1.1.1" xref="S4.SS2.p1.1.m1.1.1.cmml"><mi id="S4.SS2.p1.1.m1.1.1.2" xref="S4.SS2.p1.1.m1.1.1.2.cmml">𝒏</mi><mi id="S4.SS2.p1.1.m1.1.1.3" xref="S4.SS2.p1.1.m1.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS2.p1.1.m1.1b"><apply id="S4.SS2.p1.1.m1.1.1.cmml" xref="S4.SS2.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S4.SS2.p1.1.m1.1.1.1.cmml" xref="S4.SS2.p1.1.m1.1.1">superscript</csymbol><ci id="S4.SS2.p1.1.m1.1.1.2.cmml" xref="S4.SS2.p1.1.m1.1.1.2">𝒏</ci><ci 
id="S4.SS2.p1.1.m1.1.1.3.cmml" xref="S4.SS2.p1.1.m1.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.p1.1.m1.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.p1.1.m1.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS2.p1.2.m2.1"><semantics id="S4.SS2.p1.2.m2.1a"><msup id="S4.SS2.p1.2.m2.1.1" xref="S4.SS2.p1.2.m2.1.1.cmml"><mi id="S4.SS2.p1.2.m2.1.1.2" xref="S4.SS2.p1.2.m2.1.1.2.cmml">𝒏</mi><mi id="S4.SS2.p1.2.m2.1.1.3" xref="S4.SS2.p1.2.m2.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS2.p1.2.m2.1b"><apply id="S4.SS2.p1.2.m2.1.1.cmml" xref="S4.SS2.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S4.SS2.p1.2.m2.1.1.1.cmml" xref="S4.SS2.p1.2.m2.1.1">superscript</csymbol><ci id="S4.SS2.p1.2.m2.1.1.2.cmml" xref="S4.SS2.p1.2.m2.1.1.2">𝒏</ci><ci id="S4.SS2.p1.2.m2.1.1.3.cmml" xref="S4.SS2.p1.2.m2.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.p1.2.m2.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.p1.2.m2.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math>, respectively.</p> </div> <section class="ltx_subsubsection" id="S4.SS2.SSS1"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection">4.2.1 </span>Analysis of signals processed in NyTT</h4> <div class="ltx_para" id="S4.SS2.SSS1.p1"> <p class="ltx_p" id="S4.SS2.SSS1.p1.3">If NyTT is Noise2Noise, the output signal corresponding to the <span class="ltx_text ltx_font_italic" id="S4.SS2.SSS1.p1.3.1">more noisy</span> signal <math alttext="\bm{y}" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p1.1.m1.1"><semantics id="S4.SS2.SSS1.p1.1.m1.1a"><mi id="S4.SS2.SSS1.p1.1.m1.1.1" 
xref="S4.SS2.SSS1.p1.1.m1.1.1.cmml">𝒚</mi><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.1.m1.1b"><ci id="S4.SS2.SSS1.p1.1.m1.1.1.cmml" xref="S4.SS2.SSS1.p1.1.m1.1.1">𝒚</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.1.m1.1c">\bm{y}</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.1.m1.1d">bold_italic_y</annotation></semantics></math> should be the estimate of the clean target signal <math alttext="\bm{s}" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p1.2.m2.1"><semantics id="S4.SS2.SSS1.p1.2.m2.1a"><mi id="S4.SS2.SSS1.p1.2.m2.1.1" xref="S4.SS2.SSS1.p1.2.m2.1.1.cmml">𝒔</mi><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.2.m2.1b"><ci id="S4.SS2.SSS1.p1.2.m2.1.1.cmml" xref="S4.SS2.SSS1.p1.2.m2.1.1">𝒔</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.2.m2.1c">\bm{s}</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.2.m2.1d">bold_italic_s</annotation></semantics></math>. To investigate this, we analyzed the output signals when we input <math alttext="\bm{y}" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p1.3.m3.1"><semantics id="S4.SS2.SSS1.p1.3.m3.1a"><mi id="S4.SS2.SSS1.p1.3.m3.1.1" xref="S4.SS2.SSS1.p1.3.m3.1.1.cmml">𝒚</mi><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.3.m3.1b"><ci id="S4.SS2.SSS1.p1.3.m3.1.1.cmml" xref="S4.SS2.SSS1.p1.3.m3.1.1">𝒚</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.3.m3.1c">\bm{y}</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.3.m3.1d">bold_italic_y</annotation></semantics></math> to the DNN. To analyze the output signals during training, we generated <span class="ltx_text ltx_font_italic" id="S4.SS2.SSS1.p1.3.2">more noisy</span> signals by using the training dataset. 
Additionally, to analyze the output signals for unseen <span class="ltx_text ltx_font_italic" id="S4.SS2.SSS1.p1.3.3">more noisy</span> signals, we generated <span class="ltx_text ltx_font_italic" id="S4.SS2.SSS1.p1.3.4">more noisy</span> signals by mixing noisy signals from the test dataset and <span class="ltx_text ltx_font_typewriter" id="S4.SS2.SSS1.p1.3.5">CHiME-C</span> at SNRs ranging from -5 to 5 dB.</p> </div> <div class="ltx_para" id="S4.SS2.SSS1.p2"> <p class="ltx_p" id="S4.SS2.SSS1.p2.6">Table <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.T2" title="Table 2 ‣ 4.2.1 Analysis of signals processed in NyTT ‣ 4.2 Validity of interpretation of NyTT ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">2</span></a> shows the speech quality of 1,000 utterances of the noisy targets <math alttext="\bm{x}" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p2.1.m1.1"><semantics id="S4.SS2.SSS1.p2.1.m1.1a"><mi id="S4.SS2.SSS1.p2.1.m1.1.1" xref="S4.SS2.SSS1.p2.1.m1.1.1.cmml">𝒙</mi><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p2.1.m1.1b"><ci id="S4.SS2.SSS1.p2.1.m1.1.1.cmml" xref="S4.SS2.SSS1.p2.1.m1.1.1">𝒙</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p2.1.m1.1c">\bm{x}</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p2.1.m1.1d">bold_italic_x</annotation></semantics></math>, the <span class="ltx_text ltx_font_italic" id="S4.SS2.SSS1.p2.6.1">more noisy</span> signals <math alttext="\bm{y}" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p2.2.m2.1"><semantics id="S4.SS2.SSS1.p2.2.m2.1a"><mi id="S4.SS2.SSS1.p2.2.m2.1.1" xref="S4.SS2.SSS1.p2.2.m2.1.1.cmml">𝒚</mi><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p2.2.m2.1b"><ci id="S4.SS2.SSS1.p2.2.m2.1.1.cmml" xref="S4.SS2.SSS1.p2.2.m2.1.1">𝒚</ci></annotation-xml><annotation encoding="application/x-tex" 
id="S4.SS2.SSS1.p2.2.m2.1c">\bm{y}</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p2.2.m2.1d">bold_italic_y</annotation></semantics></math>, and the corresponding output signals <math alttext="f(\bm{y};\theta)" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p2.3.m3.2"><semantics id="S4.SS2.SSS1.p2.3.m3.2a"><mrow id="S4.SS2.SSS1.p2.3.m3.2.3" xref="S4.SS2.SSS1.p2.3.m3.2.3.cmml"><mi id="S4.SS2.SSS1.p2.3.m3.2.3.2" xref="S4.SS2.SSS1.p2.3.m3.2.3.2.cmml">f</mi><mo id="S4.SS2.SSS1.p2.3.m3.2.3.1" xref="S4.SS2.SSS1.p2.3.m3.2.3.1.cmml">⁢</mo><mrow id="S4.SS2.SSS1.p2.3.m3.2.3.3.2" xref="S4.SS2.SSS1.p2.3.m3.2.3.3.1.cmml"><mo id="S4.SS2.SSS1.p2.3.m3.2.3.3.2.1" stretchy="false" xref="S4.SS2.SSS1.p2.3.m3.2.3.3.1.cmml">(</mo><mi id="S4.SS2.SSS1.p2.3.m3.1.1" xref="S4.SS2.SSS1.p2.3.m3.1.1.cmml">𝒚</mi><mo id="S4.SS2.SSS1.p2.3.m3.2.3.3.2.2" xref="S4.SS2.SSS1.p2.3.m3.2.3.3.1.cmml">;</mo><mi id="S4.SS2.SSS1.p2.3.m3.2.2" xref="S4.SS2.SSS1.p2.3.m3.2.2.cmml">θ</mi><mo id="S4.SS2.SSS1.p2.3.m3.2.3.3.2.3" stretchy="false" xref="S4.SS2.SSS1.p2.3.m3.2.3.3.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p2.3.m3.2b"><apply id="S4.SS2.SSS1.p2.3.m3.2.3.cmml" xref="S4.SS2.SSS1.p2.3.m3.2.3"><times id="S4.SS2.SSS1.p2.3.m3.2.3.1.cmml" xref="S4.SS2.SSS1.p2.3.m3.2.3.1"></times><ci id="S4.SS2.SSS1.p2.3.m3.2.3.2.cmml" xref="S4.SS2.SSS1.p2.3.m3.2.3.2">𝑓</ci><list id="S4.SS2.SSS1.p2.3.m3.2.3.3.1.cmml" xref="S4.SS2.SSS1.p2.3.m3.2.3.3.2"><ci id="S4.SS2.SSS1.p2.3.m3.1.1.cmml" xref="S4.SS2.SSS1.p2.3.m3.1.1">𝒚</ci><ci id="S4.SS2.SSS1.p2.3.m3.2.2.cmml" xref="S4.SS2.SSS1.p2.3.m3.2.2">𝜃</ci></list></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p2.3.m3.2c">f(\bm{y};\theta)</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p2.3.m3.2d">italic_f ( bold_italic_y ; italic_θ )</annotation></semantics></math>, for both the training and test datasets. 
Considering that the SI-SDR, PESQ, and STOI of clean speech are <math alttext="\infty" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p2.4.m4.1"><semantics id="S4.SS2.SSS1.p2.4.m4.1a"><mi id="S4.SS2.SSS1.p2.4.m4.1.1" mathvariant="normal" xref="S4.SS2.SSS1.p2.4.m4.1.1.cmml">∞</mi><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p2.4.m4.1b"><infinity id="S4.SS2.SSS1.p2.4.m4.1.1.cmml" xref="S4.SS2.SSS1.p2.4.m4.1.1"></infinity></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p2.4.m4.1c">\infty</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p2.4.m4.1d">∞</annotation></semantics></math> dB, <math alttext="4.64" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p2.5.m5.1"><semantics id="S4.SS2.SSS1.p2.5.m5.1a"><mn id="S4.SS2.SSS1.p2.5.m5.1.1" xref="S4.SS2.SSS1.p2.5.m5.1.1.cmml">4.64</mn><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p2.5.m5.1b"><cn id="S4.SS2.SSS1.p2.5.m5.1.1.cmml" type="float" xref="S4.SS2.SSS1.p2.5.m5.1.1">4.64</cn></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p2.5.m5.1c">4.64</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p2.5.m5.1d">4.64</annotation></semantics></math>, and <math alttext="1.00" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p2.6.m6.1"><semantics id="S4.SS2.SSS1.p2.6.m6.1a"><mn id="S4.SS2.SSS1.p2.6.m6.1.1" xref="S4.SS2.SSS1.p2.6.m6.1.1.cmml">1.00</mn><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p2.6.m6.1b"><cn id="S4.SS2.SSS1.p2.6.m6.1.1.cmml" type="float" xref="S4.SS2.SSS1.p2.6.m6.1.1">1.00</cn></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p2.6.m6.1c">1.00</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p2.6.m6.1d">1.00</annotation></semantics></math>, respectively, the output signals are closer in quality to the noisy targets than to the clean target signals. 
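</p> <p class="ltx_p">For reference, SI-SDR as commonly defined projects the evaluated signal onto the clean reference and compares the energy of that projection with the energy of the residual. The sketch below follows this usual definition; it is an illustration rather than the evaluation code behind Table 2, and the function name, the mean-removal step, and the toy signals are assumptions.</p>
<pre class="ltx_verbatim">
```python
import numpy as np


def si_sdr(estimate, reference):
    """SI-SDR in dB: energy of the scaled reference over energy of the residual."""
    # Mean removal is an assumption; the definition presumes zero-mean signals.
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Optimal scaling of the reference (orthogonal projection of the estimate).
    alpha = np.dot(estimate, reference) / np.dot(reference, reference)
    target = alpha * reference
    residual = estimate - target
    return 10.0 * np.log10(np.dot(target, target) / np.dot(residual, residual))


# Toy example: the disturbance is orthogonal to the reference and has 1/100 of its energy.
s = np.array([1.0, -1.0, 1.0, -1.0])
n = 0.1 * np.array([1.0, 1.0, -1.0, -1.0])
print(si_sdr(s + n, s))        # approximately 20 dB
print(si_sdr(3.0 * (s + n), s))  # same value: the metric ignores overall scaling
```
</pre>
<p class="ltx_p">A perfect estimate leaves no residual energy, which is why the SI-SDR of clean speech is taken to be infinite.</p> <p class="ltx_p">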
Figure <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.F3" title="Figure 3 ‣ 4.2.1 Analysis of signals processed in NyTT ‣ 4.2 Validity of interpretation of NyTT ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">3</span></a> provides examples of spectrograms from the test dataset, further illustrating that the output signal is better interpreted as an estimate of the noisy target than of the clean target signal.</p> </div> <figure class="ltx_table" id="S4.T2"> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_table">Table 2: </span>Quality of signals processed in NyTT. <math alttext="\bm{n}^{\mathrm{obs}}" class="ltx_Math" display="inline" id="S4.T2.3.m1.1"><semantics id="S4.T2.3.m1.1b"><msup id="S4.T2.3.m1.1.1" xref="S4.T2.3.m1.1.1.cmml"><mi id="S4.T2.3.m1.1.1.2" xref="S4.T2.3.m1.1.1.2.cmml">𝒏</mi><mi id="S4.T2.3.m1.1.1.3" xref="S4.T2.3.m1.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.T2.3.m1.1c"><apply id="S4.T2.3.m1.1.1.cmml" xref="S4.T2.3.m1.1.1"><csymbol cd="ambiguous" id="S4.T2.3.m1.1.1.1.cmml" xref="S4.T2.3.m1.1.1">superscript</csymbol><ci id="S4.T2.3.m1.1.1.2.cmml" xref="S4.T2.3.m1.1.1.2">𝒏</ci><ci id="S4.T2.3.m1.1.1.3.cmml" xref="S4.T2.3.m1.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.T2.3.m1.1d">\bm{n}^{\mathrm{obs}}</annotation><annotation encoding="application/x-llamapun" id="S4.T2.3.m1.1e">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\mathrm{add}}" class="ltx_Math" display="inline" id="S4.T2.4.m2.1"><semantics id="S4.T2.4.m2.1b"><msup id="S4.T2.4.m2.1.1" xref="S4.T2.4.m2.1.1.cmml"><mi id="S4.T2.4.m2.1.1.2" xref="S4.T2.4.m2.1.1.2.cmml">𝒏</mi><mi id="S4.T2.4.m2.1.1.3" xref="S4.T2.4.m2.1.1.3.cmml">add</mi></msup><annotation-xml 
encoding="MathML-Content" id="S4.T2.4.m2.1c"><apply id="S4.T2.4.m2.1.1.cmml" xref="S4.T2.4.m2.1.1"><csymbol cd="ambiguous" id="S4.T2.4.m2.1.1.1.cmml" xref="S4.T2.4.m2.1.1">superscript</csymbol><ci id="S4.T2.4.m2.1.1.2.cmml" xref="S4.T2.4.m2.1.1.2">𝒏</ci><ci id="S4.T2.4.m2.1.1.3.cmml" xref="S4.T2.4.m2.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.T2.4.m2.1d">\bm{n}^{\mathrm{add}}</annotation><annotation encoding="application/x-llamapun" id="S4.T2.4.m2.1e">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> were <span class="ltx_text ltx_font_typewriter" id="S4.T2.7.1">CHiME-A</span> and <span class="ltx_text ltx_font_typewriter" id="S4.T2.8.2">CHiME-B</span>, respectively.</figcaption> <div class="ltx_inline-block ltx_align_center ltx_transformed_outer" id="S4.T2.9" style="width:205.2pt;height:94.5pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(-34.2pt,15.8pt) scale(0.75,0.75) ;"> <table class="ltx_tabular ltx_align_middle" id="S4.T2.9.1"> <tr class="ltx_tr" id="S4.T2.9.1.1"> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" id="S4.T2.9.1.1.1">Data</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" id="S4.T2.9.1.1.2">Metric</td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S4.T2.9.1.1.3">Noisy target</td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S4.T2.9.1.1.4">More noisy</td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S4.T2.9.1.1.5">Output</td> </tr> <tr class="ltx_tr" id="S4.T2.9.1.2"> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T2.9.1.2.1" rowspan="3"><span class="ltx_text" id="S4.T2.9.1.2.1.1">Train set</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T2.9.1.2.2">SI-SDR</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.9.1.2.3"><span class="ltx_text ltx_font_bold" id="S4.T2.9.1.2.3.1">7.33</span></td> 
<td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.9.1.2.4">-2.13</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.9.1.2.5">6.16</td> </tr> <tr class="ltx_tr" id="S4.T2.9.1.3"> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.9.1.3.1">PESQ</td> <td class="ltx_td ltx_align_center" id="S4.T2.9.1.3.2">1.35</td> <td class="ltx_td ltx_align_center" id="S4.T2.9.1.3.3">1.07</td> <td class="ltx_td ltx_align_center" id="S4.T2.9.1.3.4"><span class="ltx_text ltx_font_bold" id="S4.T2.9.1.3.4.1">1.37</span></td> </tr> <tr class="ltx_tr" id="S4.T2.9.1.4"> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.9.1.4.1">STOI</td> <td class="ltx_td ltx_align_center" id="S4.T2.9.1.4.2"><span class="ltx_text ltx_font_bold" id="S4.T2.9.1.4.2.1">0.838</span></td> <td class="ltx_td ltx_align_center" id="S4.T2.9.1.4.3">0.666</td> <td class="ltx_td ltx_align_center" id="S4.T2.9.1.4.4">0.776</td> </tr> <tr class="ltx_tr" id="S4.T2.9.1.5"> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r ltx_border_t" id="S4.T2.9.1.5.1" rowspan="3"><span class="ltx_text" id="S4.T2.9.1.5.1.1">Test set</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T2.9.1.5.2">SI-SDR</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.9.1.5.3"><span class="ltx_text ltx_font_bold" id="S4.T2.9.1.5.3.1">9.67</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.9.1.5.4">-1.58</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.9.1.5.5">7.76</td> </tr> <tr class="ltx_tr" id="S4.T2.9.1.6"> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.9.1.6.1">PESQ</td> <td class="ltx_td ltx_align_center" id="S4.T2.9.1.6.2"><span class="ltx_text ltx_font_bold" id="S4.T2.9.1.6.2.1">1.48</span></td> <td class="ltx_td ltx_align_center" id="S4.T2.9.1.6.3">1.08</td> <td class="ltx_td ltx_align_center" id="S4.T2.9.1.6.4">1.47</td> </tr> <tr class="ltx_tr" id="S4.T2.9.1.7"> <td class="ltx_td ltx_align_center 
ltx_border_bb ltx_border_r" id="S4.T2.9.1.7.1">STOI</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T2.9.1.7.2"><span class="ltx_text ltx_font_bold" id="S4.T2.9.1.7.2.1">0.874</span></td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T2.9.1.7.3">0.696</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T2.9.1.7.4">0.811</td> </tr> </table> </span></div> </figure> <figure class="ltx_figure" id="S4.F3"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="296" id="S4.F3.g1" src="x3.png" width="821"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure">Figure 3: </span>Spectrograms of the test dataset. (a) Clean target <math alttext="\bm{s}" class="ltx_Math" display="inline" id="S4.F3.7.m1.1"><semantics id="S4.F3.7.m1.1b"><mi id="S4.F3.7.m1.1.1" xref="S4.F3.7.m1.1.1.cmml">𝒔</mi><annotation-xml encoding="MathML-Content" id="S4.F3.7.m1.1c"><ci id="S4.F3.7.m1.1.1.cmml" xref="S4.F3.7.m1.1.1">𝒔</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.F3.7.m1.1d">\bm{s}</annotation><annotation encoding="application/x-llamapun" id="S4.F3.7.m1.1e">bold_italic_s</annotation></semantics></math>, (b) noisy target <math alttext="\bm{x}" class="ltx_Math" display="inline" id="S4.F3.8.m2.1"><semantics id="S4.F3.8.m2.1b"><mi id="S4.F3.8.m2.1.1" xref="S4.F3.8.m2.1.1.cmml">𝒙</mi><annotation-xml encoding="MathML-Content" id="S4.F3.8.m2.1c"><ci id="S4.F3.8.m2.1.1.cmml" xref="S4.F3.8.m2.1.1">𝒙</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.F3.8.m2.1d">\bm{x}</annotation><annotation encoding="application/x-llamapun" id="S4.F3.8.m2.1e">bold_italic_x</annotation></semantics></math>, (c) <span class="ltx_text ltx_font_italic" id="S4.F3.16.1">more noisy</span> signal <math alttext="\bm{y}" class="ltx_Math" display="inline" id="S4.F3.9.m3.1"><semantics id="S4.F3.9.m3.1b"><mi id="S4.F3.9.m3.1.1" xref="S4.F3.9.m3.1.1.cmml">𝒚</mi><annotation-xml 
encoding="MathML-Content" id="S4.F3.9.m3.1c"><ci id="S4.F3.9.m3.1.1.cmml" xref="S4.F3.9.m3.1.1">𝒚</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.F3.9.m3.1d">\bm{y}</annotation><annotation encoding="application/x-llamapun" id="S4.F3.9.m3.1e">bold_italic_y</annotation></semantics></math>, and (d) output of a DNN <math alttext="f(\bm{y};\theta)" class="ltx_Math" display="inline" id="S4.F3.10.m4.2"><semantics id="S4.F3.10.m4.2b"><mrow id="S4.F3.10.m4.2.3" xref="S4.F3.10.m4.2.3.cmml"><mi id="S4.F3.10.m4.2.3.2" xref="S4.F3.10.m4.2.3.2.cmml">f</mi><mo id="S4.F3.10.m4.2.3.1" xref="S4.F3.10.m4.2.3.1.cmml">⁢</mo><mrow id="S4.F3.10.m4.2.3.3.2" xref="S4.F3.10.m4.2.3.3.1.cmml"><mo id="S4.F3.10.m4.2.3.3.2.1" stretchy="false" xref="S4.F3.10.m4.2.3.3.1.cmml">(</mo><mi id="S4.F3.10.m4.1.1" xref="S4.F3.10.m4.1.1.cmml">𝒚</mi><mo id="S4.F3.10.m4.2.3.3.2.2" xref="S4.F3.10.m4.2.3.3.1.cmml">;</mo><mi id="S4.F3.10.m4.2.2" xref="S4.F3.10.m4.2.2.cmml">θ</mi><mo id="S4.F3.10.m4.2.3.3.2.3" stretchy="false" xref="S4.F3.10.m4.2.3.3.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S4.F3.10.m4.2c"><apply id="S4.F3.10.m4.2.3.cmml" xref="S4.F3.10.m4.2.3"><times id="S4.F3.10.m4.2.3.1.cmml" xref="S4.F3.10.m4.2.3.1"></times><ci id="S4.F3.10.m4.2.3.2.cmml" xref="S4.F3.10.m4.2.3.2">𝑓</ci><list id="S4.F3.10.m4.2.3.3.1.cmml" xref="S4.F3.10.m4.2.3.3.2"><ci id="S4.F3.10.m4.1.1.cmml" xref="S4.F3.10.m4.1.1">𝒚</ci><ci id="S4.F3.10.m4.2.2.cmml" xref="S4.F3.10.m4.2.2">𝜃</ci></list></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.F3.10.m4.2d">f(\bm{y};\theta)</annotation><annotation encoding="application/x-llamapun" id="S4.F3.10.m4.2e">italic_f ( bold_italic_y ; italic_θ )</annotation></semantics></math>. 
<math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.F3.11.m5.1"><semantics id="S4.F3.11.m5.1b"><msup id="S4.F3.11.m5.1.1" xref="S4.F3.11.m5.1.1.cmml"><mi id="S4.F3.11.m5.1.1.2" xref="S4.F3.11.m5.1.1.2.cmml">𝒏</mi><mi id="S4.F3.11.m5.1.1.3" xref="S4.F3.11.m5.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.F3.11.m5.1c"><apply id="S4.F3.11.m5.1.1.cmml" xref="S4.F3.11.m5.1.1"><csymbol cd="ambiguous" id="S4.F3.11.m5.1.1.1.cmml" xref="S4.F3.11.m5.1.1">superscript</csymbol><ci id="S4.F3.11.m5.1.1.2.cmml" xref="S4.F3.11.m5.1.1.2">𝒏</ci><ci id="S4.F3.11.m5.1.1.3.cmml" xref="S4.F3.11.m5.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.F3.11.m5.1d">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.F3.11.m5.1e">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.F3.12.m6.1"><semantics id="S4.F3.12.m6.1b"><msup id="S4.F3.12.m6.1.1" xref="S4.F3.12.m6.1.1.cmml"><mi id="S4.F3.12.m6.1.1.2" xref="S4.F3.12.m6.1.1.2.cmml">𝒏</mi><mi id="S4.F3.12.m6.1.1.3" xref="S4.F3.12.m6.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.F3.12.m6.1c"><apply id="S4.F3.12.m6.1.1.cmml" xref="S4.F3.12.m6.1.1"><csymbol cd="ambiguous" id="S4.F3.12.m6.1.1.1.cmml" xref="S4.F3.12.m6.1.1">superscript</csymbol><ci id="S4.F3.12.m6.1.1.2.cmml" xref="S4.F3.12.m6.1.1.2">𝒏</ci><ci id="S4.F3.12.m6.1.1.3.cmml" xref="S4.F3.12.m6.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.F3.12.m6.1d">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.F3.12.m6.1e">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> used for the training were <span class="ltx_text ltx_font_typewriter" id="S4.F3.17.2">CHiME-A</span> and <span class="ltx_text 
ltx_font_typewriter" id="S4.F3.18.3">CHiME-B</span>, respectively.</figcaption> </figure> </section> <section class="ltx_subsubsection" id="S4.SS2.SSS2"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection">4.2.2 </span>Evaluation of NyTT with loss functions that do not satisfy the conditions of Noise2Noise</h4> <div class="ltx_para" id="S4.SS2.SSS2.p1"> <p class="ltx_p" id="S4.SS2.SSS2.p1.2">If NyTT is Noise2Noise, <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS2.SSS2.p1.1.m1.1"><semantics id="S4.SS2.SSS2.p1.1.m1.1a"><msup id="S4.SS2.SSS2.p1.1.m1.1.1" xref="S4.SS2.SSS2.p1.1.m1.1.1.cmml"><mi id="S4.SS2.SSS2.p1.1.m1.1.1.2" xref="S4.SS2.SSS2.p1.1.m1.1.1.2.cmml">𝒏</mi><mi id="S4.SS2.SSS2.p1.1.m1.1.1.3" xref="S4.SS2.SSS2.p1.1.m1.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS2.p1.1.m1.1b"><apply id="S4.SS2.SSS2.p1.1.m1.1.1.cmml" xref="S4.SS2.SSS2.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S4.SS2.SSS2.p1.1.m1.1.1.1.cmml" xref="S4.SS2.SSS2.p1.1.m1.1.1">superscript</csymbol><ci id="S4.SS2.SSS2.p1.1.m1.1.1.2.cmml" xref="S4.SS2.SSS2.p1.1.m1.1.1.2">𝒏</ci><ci id="S4.SS2.SSS2.p1.1.m1.1.1.3.cmml" xref="S4.SS2.SSS2.p1.1.m1.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS2.p1.1.m1.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS2.p1.1.m1.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> must have a zero-mean distribution, and therefore, the MSE of the loss function should be calculated in the time domain. To analyze the significance of this assumption in NyTT, we compared the performance of NyTT with MSE in the time domain (<span class="ltx_text ltx_font_typewriter" id="S4.SS2.SSS2.p1.2.1">Time</span>) and that with MSE in the amplitude spectrogram domain (<span class="ltx_text ltx_font_typewriter" id="S4.SS2.SSS2.p1.2.2">Spec</span>). 
If NyTT strictly adheres to the Noise2Noise framework, <span class="ltx_text ltx_font_typewriter" id="S4.SS2.SSS2.p1.2.3">Spec</span> should not be able to perform TSE, as the zero-mean distribution for <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS2.SSS2.p1.2.m2.1"><semantics id="S4.SS2.SSS2.p1.2.m2.1a"><msup id="S4.SS2.SSS2.p1.2.m2.1.1" xref="S4.SS2.SSS2.p1.2.m2.1.1.cmml"><mi id="S4.SS2.SSS2.p1.2.m2.1.1.2" xref="S4.SS2.SSS2.p1.2.m2.1.1.2.cmml">𝒏</mi><mi id="S4.SS2.SSS2.p1.2.m2.1.1.3" xref="S4.SS2.SSS2.p1.2.m2.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS2.p1.2.m2.1b"><apply id="S4.SS2.SSS2.p1.2.m2.1.1.cmml" xref="S4.SS2.SSS2.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S4.SS2.SSS2.p1.2.m2.1.1.1.cmml" xref="S4.SS2.SSS2.p1.2.m2.1.1">superscript</csymbol><ci id="S4.SS2.SSS2.p1.2.m2.1.1.2.cmml" xref="S4.SS2.SSS2.p1.2.m2.1.1.2">𝒏</ci><ci id="S4.SS2.SSS2.p1.2.m2.1.1.3.cmml" xref="S4.SS2.SSS2.p1.2.m2.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS2.p1.2.m2.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS2.p1.2.m2.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> cannot be satisfied in the amplitude spectrogram domain. 
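</p> <p class="ltx_p">The two training criteria compared here can be sketched as follows. The code is an illustration under stated assumptions, not the authors' training code: it uses NumPy rather than a deep learning framework, borrows the STFT settings of Sec. 4.1, and all function names are hypothetical. The final lines show why the zero-mean condition fails for <span class="ltx_text ltx_font_typewriter">Spec</span>: white noise has a sample mean near zero in the time domain, whereas its amplitude spectrogram is nonnegative and therefore has a clearly positive mean.</p>
<pre class="ltx_verbatim">
```python
import numpy as np


def amplitude_spectrogram(x, win=512, hop=128):
    """Amplitude of a Hamming-windowed STFT without edge padding (an assumption)."""
    n_frames = 1 + (len(x) - win) // hop
    frames = np.stack([x[i * hop:i * hop + win] for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames * np.hamming(win), axis=1))


def mse_time(output, target):
    """'Time' criterion: MSE between waveforms."""
    return np.mean((output - target) ** 2)


def mse_spec(output, target):
    """'Spec' criterion: MSE between amplitude spectrograms."""
    diff = amplitude_spectrogram(output) - amplitude_spectrogram(target)
    return np.mean(diff ** 2)


rng = np.random.default_rng(0)
noise = rng.standard_normal(16000)  # placeholder zero-mean noise, one second at 16 kHz

print(abs(noise.mean()))                     # close to zero in the time domain
print(amplitude_spectrogram(noise).mean())   # clearly positive in the amplitude domain
```
</pre>
<p class="ltx_p">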
In this experiment, we estimated real-valued T–F masks for the amplitude spectrograms and converted the masked spectrograms back to time-domain signals using the phase of the unprocessed noisy signals.</p> </div> <div class="ltx_para" id="S4.SS2.SSS2.p2"> <p class="ltx_p" id="S4.SS2.SSS2.p2.3">Table <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.T3" title="Table 3 ‣ 4.2.2 Evaluation of NyTT with loss functions that do not satisfy the conditions of Noise2Noise ‣ 4.2 Validity of interpretation of NyTT ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">3</span></a> shows the evaluation results, demonstrating that <span class="ltx_text ltx_font_typewriter" id="S4.SS2.SSS2.p2.3.1">Spec</span> can improve speech quality even though the zero-mean assumption does not hold in the amplitude spectrogram domain for <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS2.SSS2.p2.1.m1.1"><semantics id="S4.SS2.SSS2.p2.1.m1.1a"><msup id="S4.SS2.SSS2.p2.1.m1.1.1" xref="S4.SS2.SSS2.p2.1.m1.1.1.cmml"><mi id="S4.SS2.SSS2.p2.1.m1.1.1.2" xref="S4.SS2.SSS2.p2.1.m1.1.1.2.cmml">𝒏</mi><mi id="S4.SS2.SSS2.p2.1.m1.1.1.3" xref="S4.SS2.SSS2.p2.1.m1.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS2.p2.1.m1.1b"><apply id="S4.SS2.SSS2.p2.1.m1.1.1.cmml" xref="S4.SS2.SSS2.p2.1.m1.1.1"><csymbol cd="ambiguous" id="S4.SS2.SSS2.p2.1.m1.1.1.1.cmml" xref="S4.SS2.SSS2.p2.1.m1.1.1">superscript</csymbol><ci id="S4.SS2.SSS2.p2.1.m1.1.1.2.cmml" xref="S4.SS2.SSS2.p2.1.m1.1.1.2">𝒏</ci><ci id="S4.SS2.SSS2.p2.1.m1.1.1.3.cmml" xref="S4.SS2.SSS2.p2.1.m1.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS2.p2.1.m1.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS2.p2.1.m1.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math>. 
The results in Secs. <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.SS2.SSS1" title="4.2.1 Analysis of signals processed in NyTT ‣ 4.2 Validity of interpretation of NyTT ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">4.2.1</span></a> and <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.SS2.SSS2" title="4.2.2 Evaluation of NyTT with loss functions that do not satisfy the conditions of Noise2Noise ‣ 4.2 Validity of interpretation of NyTT ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">4.2.2</span></a> indicate that NyTT achieves TSE by reducing the noise component corresponding to <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS2.SSS2.p2.2.m2.1"><semantics id="S4.SS2.SSS2.p2.2.m2.1a"><msup id="S4.SS2.SSS2.p2.2.m2.1.1" xref="S4.SS2.SSS2.p2.2.m2.1.1.cmml"><mi id="S4.SS2.SSS2.p2.2.m2.1.1.2" xref="S4.SS2.SSS2.p2.2.m2.1.1.2.cmml">𝒏</mi><mi id="S4.SS2.SSS2.p2.2.m2.1.1.3" xref="S4.SS2.SSS2.p2.2.m2.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS2.p2.2.m2.1b"><apply id="S4.SS2.SSS2.p2.2.m2.1.1.cmml" xref="S4.SS2.SSS2.p2.2.m2.1.1"><csymbol cd="ambiguous" id="S4.SS2.SSS2.p2.2.m2.1.1.1.cmml" xref="S4.SS2.SSS2.p2.2.m2.1.1">superscript</csymbol><ci id="S4.SS2.SSS2.p2.2.m2.1.1.2.cmml" xref="S4.SS2.SSS2.p2.2.m2.1.1.2">𝒏</ci><ci id="S4.SS2.SSS2.p2.2.m2.1.1.3.cmml" xref="S4.SS2.SSS2.p2.2.m2.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS2.p2.2.m2.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS2.p2.2.m2.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math>, rather than by operating as Noise2Noise. 
Therefore, the zero-mean distribution assumption for <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS2.SSS2.p2.3.m3.1"><semantics id="S4.SS2.SSS2.p2.3.m3.1a"><msup id="S4.SS2.SSS2.p2.3.m3.1.1" xref="S4.SS2.SSS2.p2.3.m3.1.1.cmml"><mi id="S4.SS2.SSS2.p2.3.m3.1.1.2" xref="S4.SS2.SSS2.p2.3.m3.1.1.2.cmml">𝒏</mi><mi id="S4.SS2.SSS2.p2.3.m3.1.1.3" xref="S4.SS2.SSS2.p2.3.m3.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS2.p2.3.m3.1b"><apply id="S4.SS2.SSS2.p2.3.m3.1.1.cmml" xref="S4.SS2.SSS2.p2.3.m3.1.1"><csymbol cd="ambiguous" id="S4.SS2.SSS2.p2.3.m3.1.1.1.cmml" xref="S4.SS2.SSS2.p2.3.m3.1.1">superscript</csymbol><ci id="S4.SS2.SSS2.p2.3.m3.1.1.2.cmml" xref="S4.SS2.SSS2.p2.3.m3.1.1.2">𝒏</ci><ci id="S4.SS2.SSS2.p2.3.m3.1.1.3.cmml" xref="S4.SS2.SSS2.p2.3.m3.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS2.p2.3.m3.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS2.p2.3.m3.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and the use of the MSE loss function are unnecessary, making NyTT a more flexible training strategy (this conclusion is also supported by the experimental results in Secs. 
<a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S5" title="5 Experimental analysis in the dereverberation task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">5</span></a> and <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S6" title="6 Experimental analysis in the declipping task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">6</span></a>).</p> </div> <figure class="ltx_table" id="S4.T3"> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_table">Table 3: </span>Comparison of loss functions. <math alttext="\bm{n}^{\mathrm{obs}}" class="ltx_Math" display="inline" id="S4.T3.3.m1.1"><semantics id="S4.T3.3.m1.1b"><msup id="S4.T3.3.m1.1.1" xref="S4.T3.3.m1.1.1.cmml"><mi id="S4.T3.3.m1.1.1.2" xref="S4.T3.3.m1.1.1.2.cmml">𝒏</mi><mi id="S4.T3.3.m1.1.1.3" xref="S4.T3.3.m1.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.T3.3.m1.1c"><apply id="S4.T3.3.m1.1.1.cmml" xref="S4.T3.3.m1.1.1"><csymbol cd="ambiguous" id="S4.T3.3.m1.1.1.1.cmml" xref="S4.T3.3.m1.1.1">superscript</csymbol><ci id="S4.T3.3.m1.1.1.2.cmml" xref="S4.T3.3.m1.1.1.2">𝒏</ci><ci id="S4.T3.3.m1.1.1.3.cmml" xref="S4.T3.3.m1.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.T3.3.m1.1d">\bm{n}^{\mathrm{obs}}</annotation><annotation encoding="application/x-llamapun" id="S4.T3.3.m1.1e">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\mathrm{add}}" class="ltx_Math" display="inline" id="S4.T3.4.m2.1"><semantics id="S4.T3.4.m2.1b"><msup id="S4.T3.4.m2.1.1" xref="S4.T3.4.m2.1.1.cmml"><mi id="S4.T3.4.m2.1.1.2" xref="S4.T3.4.m2.1.1.2.cmml">𝒏</mi><mi id="S4.T3.4.m2.1.1.3" xref="S4.T3.4.m2.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.T3.4.m2.1c"><apply 
id="S4.T3.4.m2.1.1.cmml" xref="S4.T3.4.m2.1.1"><csymbol cd="ambiguous" id="S4.T3.4.m2.1.1.1.cmml" xref="S4.T3.4.m2.1.1">superscript</csymbol><ci id="S4.T3.4.m2.1.1.2.cmml" xref="S4.T3.4.m2.1.1.2">𝒏</ci><ci id="S4.T3.4.m2.1.1.3.cmml" xref="S4.T3.4.m2.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.T3.4.m2.1d">\bm{n}^{\mathrm{add}}</annotation><annotation encoding="application/x-llamapun" id="S4.T3.4.m2.1e">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> were <span class="ltx_text ltx_font_typewriter" id="S4.T3.7.1">CHiME-A</span> and <span class="ltx_text ltx_font_typewriter" id="S4.T3.8.2">CHiME-B</span>, respectively.</figcaption> <div class="ltx_inline-block ltx_align_center ltx_transformed_outer" id="S4.T3.9" style="width:140.9pt;height:54pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(-23.5pt,9.0pt) scale(0.75,0.75) ;"> <table class="ltx_tabular ltx_align_middle" id="S4.T3.9.1"> <tr class="ltx_tr" id="S4.T3.9.1.1"> <td class="ltx_td ltx_border_r ltx_border_tt" id="S4.T3.9.1.1.1"></td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S4.T3.9.1.1.2">Unprocessed</td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S4.T3.9.1.1.3"><span class="ltx_text ltx_font_typewriter" id="S4.T3.9.1.1.3.1">Time</span></td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S4.T3.9.1.1.4"><span class="ltx_text ltx_font_typewriter" id="S4.T3.9.1.1.4.1">Spec</span></td> </tr> <tr class="ltx_tr" id="S4.T3.9.1.2"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S4.T3.9.1.2.1">SI-SDR</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T3.9.1.2.2">9.67</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T3.9.1.2.3"><span class="ltx_text ltx_font_bold" id="S4.T3.9.1.2.3.1">15.89</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T3.9.1.2.4">14.94</td> </tr> <tr class="ltx_tr" id="S4.T3.9.1.3"> 
<td class="ltx_td ltx_align_left ltx_border_r" id="S4.T3.9.1.3.1">PESQ</td> <td class="ltx_td ltx_align_center" id="S4.T3.9.1.3.2">1.48</td> <td class="ltx_td ltx_align_center" id="S4.T3.9.1.3.3"><span class="ltx_text ltx_font_bold" id="S4.T3.9.1.3.3.1">2.33</span></td> <td class="ltx_td ltx_align_center" id="S4.T3.9.1.3.4">2.10</td> </tr> <tr class="ltx_tr" id="S4.T3.9.1.4"> <td class="ltx_td ltx_align_left ltx_border_bb ltx_border_r" id="S4.T3.9.1.4.1">STOI</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T3.9.1.4.2">0.874</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T3.9.1.4.3"><span class="ltx_text ltx_font_bold" id="S4.T3.9.1.4.3.1">0.928</span></td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T3.9.1.4.4">0.923</td> </tr> </table> </span></div> </figure> </section> </section> <section class="ltx_subsection" id="S4.SS3"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">4.3 </span>Effectiveness of IterNyTT</h3> <div class="ltx_para" id="S4.SS3.p1"> <p class="ltx_p" id="S4.SS3.p1.2">To verify the effectiveness of IterNyTT, we evaluated its performance over five iterations. 
In this experiment, <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS3.p1.1.m1.1"><semantics id="S4.SS3.p1.1.m1.1a"><msup id="S4.SS3.p1.1.m1.1.1" xref="S4.SS3.p1.1.m1.1.1.cmml"><mi id="S4.SS3.p1.1.m1.1.1.2" xref="S4.SS3.p1.1.m1.1.1.2.cmml">𝒏</mi><mi id="S4.SS3.p1.1.m1.1.1.3" xref="S4.SS3.p1.1.m1.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS3.p1.1.m1.1b"><apply id="S4.SS3.p1.1.m1.1.1.cmml" xref="S4.SS3.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S4.SS3.p1.1.m1.1.1.1.cmml" xref="S4.SS3.p1.1.m1.1.1">superscript</csymbol><ci id="S4.SS3.p1.1.m1.1.1.2.cmml" xref="S4.SS3.p1.1.m1.1.1.2">𝒏</ci><ci id="S4.SS3.p1.1.m1.1.1.3.cmml" xref="S4.SS3.p1.1.m1.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p1.1.m1.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p1.1.m1.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS3.p1.2.m2.1"><semantics id="S4.SS3.p1.2.m2.1a"><msup id="S4.SS3.p1.2.m2.1.1" xref="S4.SS3.p1.2.m2.1.1.cmml"><mi id="S4.SS3.p1.2.m2.1.1.2" xref="S4.SS3.p1.2.m2.1.1.2.cmml">𝒏</mi><mi id="S4.SS3.p1.2.m2.1.1.3" xref="S4.SS3.p1.2.m2.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS3.p1.2.m2.1b"><apply id="S4.SS3.p1.2.m2.1.1.cmml" xref="S4.SS3.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S4.SS3.p1.2.m2.1.1.1.cmml" xref="S4.SS3.p1.2.m2.1.1">superscript</csymbol><ci id="S4.SS3.p1.2.m2.1.1.2.cmml" xref="S4.SS3.p1.2.m2.1.1.2">𝒏</ci><ci id="S4.SS3.p1.2.m2.1.1.3.cmml" xref="S4.SS3.p1.2.m2.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS3.p1.2.m2.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS3.p1.2.m2.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> were 
<span class="ltx_text ltx_font_typewriter" id="S4.SS3.p1.2.1">CHiME-A</span> and <span class="ltx_text ltx_font_typewriter" id="S4.SS3.p1.2.2">CHiME-B</span>, respectively.</p> </div> <div class="ltx_para" id="S4.SS3.p2"> <p class="ltx_p" id="S4.SS3.p2.1">Figure <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.F4" title="Figure 4 ‣ 4.3 Effectiveness of IterNyTT ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">4</span></a> shows the SI-SDR of the noisy targets, along with the SI-SDR, PESQ, and STOI of the processed test signals, at each iteration of IterNyTT. IterNyTT improves the quality of the noisy targets and, as a result, the performance on the test dataset approaches that of CTT as the number of iterations increases. We also observe that the SI-SDR of the noisy targets improves substantially at the second iteration, and that the performance on the test dataset improves at the same iteration.</p> </div> <figure class="ltx_figure" id="S4.F4"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="288" id="S4.F4.g1" src="x4.png" width="830"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure">Figure 4: </span> Changes in the SI-SDR of the target signals and in the evaluation results for the test dataset across IterNyTT iterations. The first iteration of IterNyTT is equivalent to the original NyTT. Values in parentheses indicate the evaluation results of the unprocessed input signals. 
<math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.F4.3.m1.1"><semantics id="S4.F4.3.m1.1b"><msup id="S4.F4.3.m1.1.1" xref="S4.F4.3.m1.1.1.cmml"><mi id="S4.F4.3.m1.1.1.2" xref="S4.F4.3.m1.1.1.2.cmml">𝒏</mi><mi id="S4.F4.3.m1.1.1.3" xref="S4.F4.3.m1.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.F4.3.m1.1c"><apply id="S4.F4.3.m1.1.1.cmml" xref="S4.F4.3.m1.1.1"><csymbol cd="ambiguous" id="S4.F4.3.m1.1.1.1.cmml" xref="S4.F4.3.m1.1.1">superscript</csymbol><ci id="S4.F4.3.m1.1.1.2.cmml" xref="S4.F4.3.m1.1.1.2">𝒏</ci><ci id="S4.F4.3.m1.1.1.3.cmml" xref="S4.F4.3.m1.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.F4.3.m1.1d">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.F4.3.m1.1e">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.F4.4.m2.1"><semantics id="S4.F4.4.m2.1b"><msup id="S4.F4.4.m2.1.1" xref="S4.F4.4.m2.1.1.cmml"><mi id="S4.F4.4.m2.1.1.2" xref="S4.F4.4.m2.1.1.2.cmml">𝒏</mi><mi id="S4.F4.4.m2.1.1.3" xref="S4.F4.4.m2.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.F4.4.m2.1c"><apply id="S4.F4.4.m2.1.1.cmml" xref="S4.F4.4.m2.1.1"><csymbol cd="ambiguous" id="S4.F4.4.m2.1.1.1.cmml" xref="S4.F4.4.m2.1.1">superscript</csymbol><ci id="S4.F4.4.m2.1.1.2.cmml" xref="S4.F4.4.m2.1.1.2">𝒏</ci><ci id="S4.F4.4.m2.1.1.3.cmml" xref="S4.F4.4.m2.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.F4.4.m2.1d">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.F4.4.m2.1e">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> were <span class="ltx_text ltx_font_typewriter" id="S4.F4.7.1">CHiME-A</span> and <span class="ltx_text ltx_font_typewriter" id="S4.F4.8.2">CHiME-B</span>, respectively. 
</figcaption> </figure> </section> <section class="ltx_subsection" id="S4.SS4"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">4.4 </span>Effects of noise mismatches</h3> <div class="ltx_para" id="S4.SS4.p1"> <p class="ltx_p" id="S4.SS4.p1.5">To investigate the effects of mismatches between noise signals in NyTT (i.e., <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.p1.1.m1.1"><semantics id="S4.SS4.p1.1.m1.1a"><msup id="S4.SS4.p1.1.m1.1.1" xref="S4.SS4.p1.1.m1.1.1.cmml"><mi id="S4.SS4.p1.1.m1.1.1.2" xref="S4.SS4.p1.1.m1.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.p1.1.m1.1.1.3" xref="S4.SS4.p1.1.m1.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.p1.1.m1.1b"><apply id="S4.SS4.p1.1.m1.1.1.cmml" xref="S4.SS4.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S4.SS4.p1.1.m1.1.1.1.cmml" xref="S4.SS4.p1.1.m1.1.1">superscript</csymbol><ci id="S4.SS4.p1.1.m1.1.1.2.cmml" xref="S4.SS4.p1.1.m1.1.1.2">𝒏</ci><ci id="S4.SS4.p1.1.m1.1.1.3.cmml" xref="S4.SS4.p1.1.m1.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.p1.1.m1.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.p1.1.m1.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math>, <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.p1.2.m2.1"><semantics id="S4.SS4.p1.2.m2.1a"><msup id="S4.SS4.p1.2.m2.1.1" xref="S4.SS4.p1.2.m2.1.1.cmml"><mi id="S4.SS4.p1.2.m2.1.1.2" xref="S4.SS4.p1.2.m2.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.p1.2.m2.1.1.3" xref="S4.SS4.p1.2.m2.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.p1.2.m2.1b"><apply id="S4.SS4.p1.2.m2.1.1.cmml" xref="S4.SS4.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S4.SS4.p1.2.m2.1.1.1.cmml" xref="S4.SS4.p1.2.m2.1.1">superscript</csymbol><ci id="S4.SS4.p1.2.m2.1.1.2.cmml" xref="S4.SS4.p1.2.m2.1.1.2">𝒏</ci><ci 
id="S4.SS4.p1.2.m2.1.1.3.cmml" xref="S4.SS4.p1.2.m2.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.p1.2.m2.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.p1.2.m2.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math>, and <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS4.p1.3.m3.1"><semantics id="S4.SS4.p1.3.m3.1a"><msup id="S4.SS4.p1.3.m3.1.1" xref="S4.SS4.p1.3.m3.1.1.cmml"><mi id="S4.SS4.p1.3.m3.1.1.2" xref="S4.SS4.p1.3.m3.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.p1.3.m3.1.1.3" xref="S4.SS4.p1.3.m3.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.p1.3.m3.1b"><apply id="S4.SS4.p1.3.m3.1.1.cmml" xref="S4.SS4.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S4.SS4.p1.3.m3.1.1.1.cmml" xref="S4.SS4.p1.3.m3.1.1">superscript</csymbol><ci id="S4.SS4.p1.3.m3.1.1.2.cmml" xref="S4.SS4.p1.3.m3.1.1.2">𝒏</ci><ci id="S4.SS4.p1.3.m3.1.1.3.cmml" xref="S4.SS4.p1.3.m3.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.p1.3.m3.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.p1.3.m3.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math>), we simulated mismatched conditions using <span class="ltx_text ltx_font_typewriter" id="S4.SS4.p1.5.1">CHiME-A</span>, <span class="ltx_text ltx_font_typewriter" id="S4.SS4.p1.5.2">DEMAND-A</span>, and <span class="ltx_text ltx_font_typewriter" id="S4.SS4.p1.5.3">DCASE</span> as <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.p1.4.m4.1"><semantics id="S4.SS4.p1.4.m4.1a"><msup id="S4.SS4.p1.4.m4.1.1" xref="S4.SS4.p1.4.m4.1.1.cmml"><mi id="S4.SS4.p1.4.m4.1.1.2" xref="S4.SS4.p1.4.m4.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.p1.4.m4.1.1.3" xref="S4.SS4.p1.4.m4.1.1.3.cmml">obs</mi></msup><annotation-xml 
encoding="MathML-Content" id="S4.SS4.p1.4.m4.1b"><apply id="S4.SS4.p1.4.m4.1.1.cmml" xref="S4.SS4.p1.4.m4.1.1"><csymbol cd="ambiguous" id="S4.SS4.p1.4.m4.1.1.1.cmml" xref="S4.SS4.p1.4.m4.1.1">superscript</csymbol><ci id="S4.SS4.p1.4.m4.1.1.2.cmml" xref="S4.SS4.p1.4.m4.1.1.2">𝒏</ci><ci id="S4.SS4.p1.4.m4.1.1.3.cmml" xref="S4.SS4.p1.4.m4.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.p1.4.m4.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.p1.4.m4.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math>, and <span class="ltx_text ltx_font_typewriter" id="S4.SS4.p1.5.4">CHiME-B</span> and <span class="ltx_text ltx_font_typewriter" id="S4.SS4.p1.5.5">DEMAND-B</span> as <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.p1.5.m5.1"><semantics id="S4.SS4.p1.5.m5.1a"><msup id="S4.SS4.p1.5.m5.1.1" xref="S4.SS4.p1.5.m5.1.1.cmml"><mi id="S4.SS4.p1.5.m5.1.1.2" xref="S4.SS4.p1.5.m5.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.p1.5.m5.1.1.3" xref="S4.SS4.p1.5.m5.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.p1.5.m5.1b"><apply id="S4.SS4.p1.5.m5.1.1.cmml" xref="S4.SS4.p1.5.m5.1.1"><csymbol cd="ambiguous" id="S4.SS4.p1.5.m5.1.1.1.cmml" xref="S4.SS4.p1.5.m5.1.1">superscript</csymbol><ci id="S4.SS4.p1.5.m5.1.1.2.cmml" xref="S4.SS4.p1.5.m5.1.1.2">𝒏</ci><ci id="S4.SS4.p1.5.m5.1.1.3.cmml" xref="S4.SS4.p1.5.m5.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.p1.5.m5.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.p1.5.m5.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math>.</p> </div> <div class="ltx_para" id="S4.SS4.p2"> <p class="ltx_p" id="S4.SS4.p2.1">To clarify the noise mismatches, we visualize the distribution of each noise dataset in Fig. 
<a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.F5" title="Figure 5 ‣ 4.4 Effects of noise mismatches ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">5</span></a>. The plots in Fig. <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.F5" title="Figure 5 ‣ 4.4 Effects of noise mismatches ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">5</span></a> were created by extracting features using a pre-trained audio event classification model, VGGish <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">hershey2017cnn</span>]</cite>, and projecting them into a two-dimensional space using UMAP <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">mcinnes2018umap</span>]</cite>. 
For visibility, we randomly selected 200 samples of <math alttext="2\text{\,}\mathrm{s}" class="ltx_Math" display="inline" id="S4.SS4.p2.1.m1.3"><semantics id="S4.SS4.p2.1.m1.3a"><mrow id="S4.SS4.p2.1.m1.3.3" xref="S4.SS4.p2.1.m1.3.3.cmml"><mn id="S4.SS4.p2.1.m1.1.1.1.1.1.1" xref="S4.SS4.p2.1.m1.1.1.1.1.1.1.cmml">2</mn><mtext id="S4.SS4.p2.1.m1.2.2.2.2.2.2" xref="S4.SS4.p2.1.m1.2.2.2.2.2.2.cmml"> </mtext><mi id="S4.SS4.p2.1.m1.3.3.3.3.3.3" mathvariant="normal" xref="S4.SS4.p2.1.m1.3.3.3.3.3.3.cmml">s</mi></mrow><annotation-xml encoding="MathML-Content" id="S4.SS4.p2.1.m1.3b"><apply id="S4.SS4.p2.1.m1.3.3.cmml" xref="S4.SS4.p2.1.m1.3.3"><csymbol cd="latexml" id="S4.SS4.p2.1.m1.2.2.2.2.2.2.cmml" xref="S4.SS4.p2.1.m1.2.2.2.2.2.2">times</csymbol><cn id="S4.SS4.p2.1.m1.1.1.1.1.1.1.cmml" type="integer" xref="S4.SS4.p2.1.m1.1.1.1.1.1.1">2</cn><ci id="S4.SS4.p2.1.m1.3.3.3.3.3.3.cmml" xref="S4.SS4.p2.1.m1.3.3.3.3.3.3">s</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.p2.1.m1.3c">2\text{\,}\mathrm{s}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.p2.1.m1.3d">start_ARG 2 end_ARG start_ARG times end_ARG start_ARG roman_s end_ARG</annotation></semantics></math> noise signals from each dataset. The figure shows that each noise dataset forms a distinct cluster, indicating that they have different characteristics. We can also see that <span class="ltx_text ltx_font_typewriter" id="S4.SS4.p2.1.1">DCASE</span> has particularly different characteristics from the other datasets.</p> </div> <figure class="ltx_figure" id="S4.F5"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="207" id="S4.F5.g1" src="extracted/6291863/fig/vggish_umap_j.png" width="592"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure">Figure 5: </span>Distribution of each noise dataset. 
The UMAP features were calculated jointly using all noise datasets, but each dataset is plotted in a separate panel for visibility. All panels share the same axes.</figcaption> </figure> <section class="ltx_subsubsection" id="S4.SS4.SSS1"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection">4.4.1 </span>Effects of mismatches on the performance</h4> <div class="ltx_para" id="S4.SS4.SSS1.p1"> <p class="ltx_p" id="S4.SS4.SSS1.p1.1">We investigated the effects of mismatches between noise signals on the performance of CTT, NyTT, and IterNyTT. In this experiment, we set the number of iterations for IterNyTT to three.</p> </div> <div class="ltx_para" id="S4.SS4.SSS1.p2"> <p class="ltx_p" id="S4.SS4.SSS1.p2.9">Table <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.T4" title="Table 4 ‣ 4.4.1 Effects of mismatches on the performance ‣ 4.4 Effects of noise mismatches ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">4</span></a> shows the evaluation results of CTT, NyTT, and IterNyTT for each combination of <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p2.1.m1.1"><semantics id="S4.SS4.SSS1.p2.1.m1.1a"><msup id="S4.SS4.SSS1.p2.1.m1.1.1" xref="S4.SS4.SSS1.p2.1.m1.1.1.cmml"><mi id="S4.SS4.SSS1.p2.1.m1.1.1.2" xref="S4.SS4.SSS1.p2.1.m1.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p2.1.m1.1.1.3" xref="S4.SS4.SSS1.p2.1.m1.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p2.1.m1.1b"><apply id="S4.SS4.SSS1.p2.1.m1.1.1.cmml" xref="S4.SS4.SSS1.p2.1.m1.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p2.1.m1.1.1.1.cmml" xref="S4.SS4.SSS1.p2.1.m1.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p2.1.m1.1.1.2.cmml" xref="S4.SS4.SSS1.p2.1.m1.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p2.1.m1.1.1.3.cmml" xref="S4.SS4.SSS1.p2.1.m1.1.1.3">obs</ci></apply></annotation-xml><annotation 
encoding="application/x-tex" id="S4.SS4.SSS1.p2.1.m1.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p2.1.m1.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p2.2.m2.1"><semantics id="S4.SS4.SSS1.p2.2.m2.1a"><msup id="S4.SS4.SSS1.p2.2.m2.1.1" xref="S4.SS4.SSS1.p2.2.m2.1.1.cmml"><mi id="S4.SS4.SSS1.p2.2.m2.1.1.2" xref="S4.SS4.SSS1.p2.2.m2.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p2.2.m2.1.1.3" xref="S4.SS4.SSS1.p2.2.m2.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p2.2.m2.1b"><apply id="S4.SS4.SSS1.p2.2.m2.1.1.cmml" xref="S4.SS4.SSS1.p2.2.m2.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p2.2.m2.1.1.1.cmml" xref="S4.SS4.SSS1.p2.2.m2.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p2.2.m2.1.1.2.cmml" xref="S4.SS4.SSS1.p2.2.m2.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p2.2.m2.1.1.3.cmml" xref="S4.SS4.SSS1.p2.2.m2.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p2.2.m2.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p2.2.m2.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math>. The table also includes the evaluation results of IterNyTT using different noise datasets for the first and second iterations (training for the TSE of the noisy targets) and the third iteration (training for the TSE of the test dataset). 
We analyze these results from three perspectives: mismatches between a) <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p2.3.m3.1"><semantics id="S4.SS4.SSS1.p2.3.m3.1a"><msup id="S4.SS4.SSS1.p2.3.m3.1.1" xref="S4.SS4.SSS1.p2.3.m3.1.1.cmml"><mi id="S4.SS4.SSS1.p2.3.m3.1.1.2" xref="S4.SS4.SSS1.p2.3.m3.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p2.3.m3.1.1.3" xref="S4.SS4.SSS1.p2.3.m3.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p2.3.m3.1b"><apply id="S4.SS4.SSS1.p2.3.m3.1.1.cmml" xref="S4.SS4.SSS1.p2.3.m3.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p2.3.m3.1.1.1.cmml" xref="S4.SS4.SSS1.p2.3.m3.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p2.3.m3.1.1.2.cmml" xref="S4.SS4.SSS1.p2.3.m3.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p2.3.m3.1.1.3.cmml" xref="S4.SS4.SSS1.p2.3.m3.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p2.3.m3.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p2.3.m3.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p2.4.m4.1"><semantics id="S4.SS4.SSS1.p2.4.m4.1a"><msup id="S4.SS4.SSS1.p2.4.m4.1.1" xref="S4.SS4.SSS1.p2.4.m4.1.1.cmml"><mi id="S4.SS4.SSS1.p2.4.m4.1.1.2" xref="S4.SS4.SSS1.p2.4.m4.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p2.4.m4.1.1.3" xref="S4.SS4.SSS1.p2.4.m4.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p2.4.m4.1b"><apply id="S4.SS4.SSS1.p2.4.m4.1.1.cmml" xref="S4.SS4.SSS1.p2.4.m4.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p2.4.m4.1.1.1.cmml" xref="S4.SS4.SSS1.p2.4.m4.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p2.4.m4.1.1.2.cmml" xref="S4.SS4.SSS1.p2.4.m4.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p2.4.m4.1.1.3.cmml" xref="S4.SS4.SSS1.p2.4.m4.1.1.3">test</ci></apply></annotation-xml><annotation 
encoding="application/x-tex" id="S4.SS4.SSS1.p2.4.m4.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p2.4.m4.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math>, b) <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p2.5.m5.1"><semantics id="S4.SS4.SSS1.p2.5.m5.1a"><msup id="S4.SS4.SSS1.p2.5.m5.1.1" xref="S4.SS4.SSS1.p2.5.m5.1.1.cmml"><mi id="S4.SS4.SSS1.p2.5.m5.1.1.2" xref="S4.SS4.SSS1.p2.5.m5.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p2.5.m5.1.1.3" xref="S4.SS4.SSS1.p2.5.m5.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p2.5.m5.1b"><apply id="S4.SS4.SSS1.p2.5.m5.1.1.cmml" xref="S4.SS4.SSS1.p2.5.m5.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p2.5.m5.1.1.1.cmml" xref="S4.SS4.SSS1.p2.5.m5.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p2.5.m5.1.1.2.cmml" xref="S4.SS4.SSS1.p2.5.m5.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p2.5.m5.1.1.3.cmml" xref="S4.SS4.SSS1.p2.5.m5.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p2.5.m5.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p2.5.m5.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p2.6.m6.1"><semantics id="S4.SS4.SSS1.p2.6.m6.1a"><msup id="S4.SS4.SSS1.p2.6.m6.1.1" xref="S4.SS4.SSS1.p2.6.m6.1.1.cmml"><mi id="S4.SS4.SSS1.p2.6.m6.1.1.2" xref="S4.SS4.SSS1.p2.6.m6.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p2.6.m6.1.1.3" xref="S4.SS4.SSS1.p2.6.m6.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p2.6.m6.1b"><apply id="S4.SS4.SSS1.p2.6.m6.1.1.cmml" xref="S4.SS4.SSS1.p2.6.m6.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p2.6.m6.1.1.1.cmml" xref="S4.SS4.SSS1.p2.6.m6.1.1">superscript</csymbol><ci 
id="S4.SS4.SSS1.p2.6.m6.1.1.2.cmml" xref="S4.SS4.SSS1.p2.6.m6.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p2.6.m6.1.1.3.cmml" xref="S4.SS4.SSS1.p2.6.m6.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p2.6.m6.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p2.6.m6.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math>, and c) <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p2.7.m7.1"><semantics id="S4.SS4.SSS1.p2.7.m7.1a"><msup id="S4.SS4.SSS1.p2.7.m7.1.1" xref="S4.SS4.SSS1.p2.7.m7.1.1.cmml"><mi id="S4.SS4.SSS1.p2.7.m7.1.1.2" xref="S4.SS4.SSS1.p2.7.m7.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p2.7.m7.1.1.3" xref="S4.SS4.SSS1.p2.7.m7.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p2.7.m7.1b"><apply id="S4.SS4.SSS1.p2.7.m7.1.1.cmml" xref="S4.SS4.SSS1.p2.7.m7.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p2.7.m7.1.1.1.cmml" xref="S4.SS4.SSS1.p2.7.m7.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p2.7.m7.1.1.2.cmml" xref="S4.SS4.SSS1.p2.7.m7.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p2.7.m7.1.1.3.cmml" xref="S4.SS4.SSS1.p2.7.m7.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p2.7.m7.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p2.7.m7.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p2.8.m8.1"><semantics id="S4.SS4.SSS1.p2.8.m8.1a"><msup id="S4.SS4.SSS1.p2.8.m8.1.1" xref="S4.SS4.SSS1.p2.8.m8.1.1.cmml"><mi id="S4.SS4.SSS1.p2.8.m8.1.1.2" xref="S4.SS4.SSS1.p2.8.m8.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p2.8.m8.1.1.3" xref="S4.SS4.SSS1.p2.8.m8.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p2.8.m8.1b"><apply 
id="S4.SS4.SSS1.p2.8.m8.1.1.cmml" xref="S4.SS4.SSS1.p2.8.m8.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p2.8.m8.1.1.1.cmml" xref="S4.SS4.SSS1.p2.8.m8.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p2.8.m8.1.1.2.cmml" xref="S4.SS4.SSS1.p2.8.m8.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p2.8.m8.1.1.3.cmml" xref="S4.SS4.SSS1.p2.8.m8.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p2.8.m8.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p2.8.m8.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math>, where <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p2.9.m9.1"><semantics id="S4.SS4.SSS1.p2.9.m9.1a"><msup id="S4.SS4.SSS1.p2.9.m9.1.1" xref="S4.SS4.SSS1.p2.9.m9.1.1.cmml"><mi id="S4.SS4.SSS1.p2.9.m9.1.1.2" xref="S4.SS4.SSS1.p2.9.m9.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p2.9.m9.1.1.3" xref="S4.SS4.SSS1.p2.9.m9.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p2.9.m9.1b"><apply id="S4.SS4.SSS1.p2.9.m9.1.1.cmml" xref="S4.SS4.SSS1.p2.9.m9.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p2.9.m9.1.1.1.cmml" xref="S4.SS4.SSS1.p2.9.m9.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p2.9.m9.1.1.2.cmml" xref="S4.SS4.SSS1.p2.9.m9.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p2.9.m9.1.1.3.cmml" xref="S4.SS4.SSS1.p2.9.m9.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p2.9.m9.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p2.9.m9.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math> was <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p2.9.1">CHiME-C</span>.</p> </div> <div class="ltx_para" id="S4.SS4.SSS1.p3"> <p class="ltx_p" id="S4.SS4.SSS1.p3.8">First, we focus on the impact of the mismatch between <math alttext="\bm{n}^{\rm add}" 
class="ltx_Math" display="inline" id="S4.SS4.SSS1.p3.1.m1.1"><semantics id="S4.SS4.SSS1.p3.1.m1.1a"><msup id="S4.SS4.SSS1.p3.1.m1.1.1" xref="S4.SS4.SSS1.p3.1.m1.1.1.cmml"><mi id="S4.SS4.SSS1.p3.1.m1.1.1.2" xref="S4.SS4.SSS1.p3.1.m1.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p3.1.m1.1.1.3" xref="S4.SS4.SSS1.p3.1.m1.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p3.1.m1.1b"><apply id="S4.SS4.SSS1.p3.1.m1.1.1.cmml" xref="S4.SS4.SSS1.p3.1.m1.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p3.1.m1.1.1.1.cmml" xref="S4.SS4.SSS1.p3.1.m1.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p3.1.m1.1.1.2.cmml" xref="S4.SS4.SSS1.p3.1.m1.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p3.1.m1.1.1.3.cmml" xref="S4.SS4.SSS1.p3.1.m1.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p3.1.m1.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p3.1.m1.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p3.2.m2.1"><semantics id="S4.SS4.SSS1.p3.2.m2.1a"><msup id="S4.SS4.SSS1.p3.2.m2.1.1" xref="S4.SS4.SSS1.p3.2.m2.1.1.cmml"><mi id="S4.SS4.SSS1.p3.2.m2.1.1.2" xref="S4.SS4.SSS1.p3.2.m2.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p3.2.m2.1.1.3" xref="S4.SS4.SSS1.p3.2.m2.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p3.2.m2.1b"><apply id="S4.SS4.SSS1.p3.2.m2.1.1.cmml" xref="S4.SS4.SSS1.p3.2.m2.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p3.2.m2.1.1.1.cmml" xref="S4.SS4.SSS1.p3.2.m2.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p3.2.m2.1.1.2.cmml" xref="S4.SS4.SSS1.p3.2.m2.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p3.2.m2.1.1.3.cmml" xref="S4.SS4.SSS1.p3.2.m2.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p3.2.m2.1c">\bm{n}^{\rm test}</annotation><annotation 
encoding="application/x-llamapun" id="S4.SS4.SSS1.p3.2.m2.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math> on the performance of NyTT. For example, we analyze the performance of NyTT when <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p3.3.m3.1"><semantics id="S4.SS4.SSS1.p3.3.m3.1a"><msup id="S4.SS4.SSS1.p3.3.m3.1.1" xref="S4.SS4.SSS1.p3.3.m3.1.1.cmml"><mi id="S4.SS4.SSS1.p3.3.m3.1.1.2" xref="S4.SS4.SSS1.p3.3.m3.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p3.3.m3.1.1.3" xref="S4.SS4.SSS1.p3.3.m3.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p3.3.m3.1b"><apply id="S4.SS4.SSS1.p3.3.m3.1.1.cmml" xref="S4.SS4.SSS1.p3.3.m3.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p3.3.m3.1.1.1.cmml" xref="S4.SS4.SSS1.p3.3.m3.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p3.3.m3.1.1.2.cmml" xref="S4.SS4.SSS1.p3.3.m3.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p3.3.m3.1.1.3.cmml" xref="S4.SS4.SSS1.p3.3.m3.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p3.3.m3.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p3.3.m3.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> is <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p3.8.1">CHiME-A</span>. 
In this case, NyTT achieves SI-SDRs of 15.87 dB when <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p3.4.m4.1"><semantics id="S4.SS4.SSS1.p3.4.m4.1a"><msup id="S4.SS4.SSS1.p3.4.m4.1.1" xref="S4.SS4.SSS1.p3.4.m4.1.1.cmml"><mi id="S4.SS4.SSS1.p3.4.m4.1.1.2" xref="S4.SS4.SSS1.p3.4.m4.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p3.4.m4.1.1.3" xref="S4.SS4.SSS1.p3.4.m4.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p3.4.m4.1b"><apply id="S4.SS4.SSS1.p3.4.m4.1.1.cmml" xref="S4.SS4.SSS1.p3.4.m4.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p3.4.m4.1.1.1.cmml" xref="S4.SS4.SSS1.p3.4.m4.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p3.4.m4.1.1.2.cmml" xref="S4.SS4.SSS1.p3.4.m4.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p3.4.m4.1.1.3.cmml" xref="S4.SS4.SSS1.p3.4.m4.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p3.4.m4.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p3.4.m4.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> is <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p3.8.2">CHiME-B</span> and 10.27 dB when <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p3.5.m5.1"><semantics id="S4.SS4.SSS1.p3.5.m5.1a"><msup id="S4.SS4.SSS1.p3.5.m5.1.1" xref="S4.SS4.SSS1.p3.5.m5.1.1.cmml"><mi id="S4.SS4.SSS1.p3.5.m5.1.1.2" xref="S4.SS4.SSS1.p3.5.m5.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p3.5.m5.1.1.3" xref="S4.SS4.SSS1.p3.5.m5.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p3.5.m5.1b"><apply id="S4.SS4.SSS1.p3.5.m5.1.1.cmml" xref="S4.SS4.SSS1.p3.5.m5.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p3.5.m5.1.1.1.cmml" xref="S4.SS4.SSS1.p3.5.m5.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p3.5.m5.1.1.2.cmml" xref="S4.SS4.SSS1.p3.5.m5.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p3.5.m5.1.1.3.cmml" 
xref="S4.SS4.SSS1.p3.5.m5.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p3.5.m5.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p3.5.m5.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> is <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p3.8.3">DEMAND-B</span>. Similarly, when <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p3.6.m6.1"><semantics id="S4.SS4.SSS1.p3.6.m6.1a"><msup id="S4.SS4.SSS1.p3.6.m6.1.1" xref="S4.SS4.SSS1.p3.6.m6.1.1.cmml"><mi id="S4.SS4.SSS1.p3.6.m6.1.1.2" xref="S4.SS4.SSS1.p3.6.m6.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p3.6.m6.1.1.3" xref="S4.SS4.SSS1.p3.6.m6.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p3.6.m6.1b"><apply id="S4.SS4.SSS1.p3.6.m6.1.1.cmml" xref="S4.SS4.SSS1.p3.6.m6.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p3.6.m6.1.1.1.cmml" xref="S4.SS4.SSS1.p3.6.m6.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p3.6.m6.1.1.2.cmml" xref="S4.SS4.SSS1.p3.6.m6.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p3.6.m6.1.1.3.cmml" xref="S4.SS4.SSS1.p3.6.m6.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p3.6.m6.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p3.6.m6.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> is <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p3.8.4">DEMAND-A</span> or <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p3.8.5">DCASE</span>, and when the metric is PESQ or STOI, NyTT consistently performs better when there is no mismatch between <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p3.7.m7.1"><semantics id="S4.SS4.SSS1.p3.7.m7.1a"><msup id="S4.SS4.SSS1.p3.7.m7.1.1" 
xref="S4.SS4.SSS1.p3.7.m7.1.1.cmml"><mi id="S4.SS4.SSS1.p3.7.m7.1.1.2" xref="S4.SS4.SSS1.p3.7.m7.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p3.7.m7.1.1.3" xref="S4.SS4.SSS1.p3.7.m7.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p3.7.m7.1b"><apply id="S4.SS4.SSS1.p3.7.m7.1.1.cmml" xref="S4.SS4.SSS1.p3.7.m7.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p3.7.m7.1.1.1.cmml" xref="S4.SS4.SSS1.p3.7.m7.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p3.7.m7.1.1.2.cmml" xref="S4.SS4.SSS1.p3.7.m7.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p3.7.m7.1.1.3.cmml" xref="S4.SS4.SSS1.p3.7.m7.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p3.7.m7.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p3.7.m7.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p3.8.m8.1"><semantics id="S4.SS4.SSS1.p3.8.m8.1a"><msup id="S4.SS4.SSS1.p3.8.m8.1.1" xref="S4.SS4.SSS1.p3.8.m8.1.1.cmml"><mi id="S4.SS4.SSS1.p3.8.m8.1.1.2" xref="S4.SS4.SSS1.p3.8.m8.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p3.8.m8.1.1.3" xref="S4.SS4.SSS1.p3.8.m8.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p3.8.m8.1b"><apply id="S4.SS4.SSS1.p3.8.m8.1.1.cmml" xref="S4.SS4.SSS1.p3.8.m8.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p3.8.m8.1.1.1.cmml" xref="S4.SS4.SSS1.p3.8.m8.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p3.8.m8.1.1.2.cmml" xref="S4.SS4.SSS1.p3.8.m8.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p3.8.m8.1.1.3.cmml" xref="S4.SS4.SSS1.p3.8.m8.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p3.8.m8.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p3.8.m8.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test 
end_POSTSUPERSCRIPT</annotation></semantics></math>, as in CTT.</p> </div> <div class="ltx_para" id="S4.SS4.SSS1.p4"> <p class="ltx_p" id="S4.SS4.SSS1.p4.14">Second, we focus on the impact of the mismatch between <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p4.1.m1.1"><semantics id="S4.SS4.SSS1.p4.1.m1.1a"><msup id="S4.SS4.SSS1.p4.1.m1.1.1" xref="S4.SS4.SSS1.p4.1.m1.1.1.cmml"><mi id="S4.SS4.SSS1.p4.1.m1.1.1.2" xref="S4.SS4.SSS1.p4.1.m1.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p4.1.m1.1.1.3" xref="S4.SS4.SSS1.p4.1.m1.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p4.1.m1.1b"><apply id="S4.SS4.SSS1.p4.1.m1.1.1.cmml" xref="S4.SS4.SSS1.p4.1.m1.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p4.1.m1.1.1.1.cmml" xref="S4.SS4.SSS1.p4.1.m1.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p4.1.m1.1.1.2.cmml" xref="S4.SS4.SSS1.p4.1.m1.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p4.1.m1.1.1.3.cmml" xref="S4.SS4.SSS1.p4.1.m1.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p4.1.m1.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p4.1.m1.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p4.2.m2.1"><semantics id="S4.SS4.SSS1.p4.2.m2.1a"><msup id="S4.SS4.SSS1.p4.2.m2.1.1" xref="S4.SS4.SSS1.p4.2.m2.1.1.cmml"><mi id="S4.SS4.SSS1.p4.2.m2.1.1.2" xref="S4.SS4.SSS1.p4.2.m2.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p4.2.m2.1.1.3" xref="S4.SS4.SSS1.p4.2.m2.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p4.2.m2.1b"><apply id="S4.SS4.SSS1.p4.2.m2.1.1.cmml" xref="S4.SS4.SSS1.p4.2.m2.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p4.2.m2.1.1.1.cmml" xref="S4.SS4.SSS1.p4.2.m2.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p4.2.m2.1.1.2.cmml" 
xref="S4.SS4.SSS1.p4.2.m2.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p4.2.m2.1.1.3.cmml" xref="S4.SS4.SSS1.p4.2.m2.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p4.2.m2.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p4.2.m2.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math> on the performance of NyTT. For example, we analyze the performance of NyTT when <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p4.3.m3.1"><semantics id="S4.SS4.SSS1.p4.3.m3.1a"><msup id="S4.SS4.SSS1.p4.3.m3.1.1" xref="S4.SS4.SSS1.p4.3.m3.1.1.cmml"><mi id="S4.SS4.SSS1.p4.3.m3.1.1.2" xref="S4.SS4.SSS1.p4.3.m3.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p4.3.m3.1.1.3" xref="S4.SS4.SSS1.p4.3.m3.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p4.3.m3.1b"><apply id="S4.SS4.SSS1.p4.3.m3.1.1.cmml" xref="S4.SS4.SSS1.p4.3.m3.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p4.3.m3.1.1.1.cmml" xref="S4.SS4.SSS1.p4.3.m3.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p4.3.m3.1.1.2.cmml" xref="S4.SS4.SSS1.p4.3.m3.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p4.3.m3.1.1.3.cmml" xref="S4.SS4.SSS1.p4.3.m3.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p4.3.m3.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p4.3.m3.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> is <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p4.14.1">DEMAND-B</span>. 
In this case, NyTT achieves SI-SDRs of 10.27 dB when <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p4.4.m4.1"><semantics id="S4.SS4.SSS1.p4.4.m4.1a"><msup id="S4.SS4.SSS1.p4.4.m4.1.1" xref="S4.SS4.SSS1.p4.4.m4.1.1.cmml"><mi id="S4.SS4.SSS1.p4.4.m4.1.1.2" xref="S4.SS4.SSS1.p4.4.m4.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p4.4.m4.1.1.3" xref="S4.SS4.SSS1.p4.4.m4.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p4.4.m4.1b"><apply id="S4.SS4.SSS1.p4.4.m4.1.1.cmml" xref="S4.SS4.SSS1.p4.4.m4.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p4.4.m4.1.1.1.cmml" xref="S4.SS4.SSS1.p4.4.m4.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p4.4.m4.1.1.2.cmml" xref="S4.SS4.SSS1.p4.4.m4.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p4.4.m4.1.1.3.cmml" xref="S4.SS4.SSS1.p4.4.m4.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p4.4.m4.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p4.4.m4.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> is <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p4.14.2">CHiME-A</span>, 13.56 dB when <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p4.5.m5.1"><semantics id="S4.SS4.SSS1.p4.5.m5.1a"><msup id="S4.SS4.SSS1.p4.5.m5.1.1" xref="S4.SS4.SSS1.p4.5.m5.1.1.cmml"><mi id="S4.SS4.SSS1.p4.5.m5.1.1.2" xref="S4.SS4.SSS1.p4.5.m5.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p4.5.m5.1.1.3" xref="S4.SS4.SSS1.p4.5.m5.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p4.5.m5.1b"><apply id="S4.SS4.SSS1.p4.5.m5.1.1.cmml" xref="S4.SS4.SSS1.p4.5.m5.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p4.5.m5.1.1.1.cmml" xref="S4.SS4.SSS1.p4.5.m5.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p4.5.m5.1.1.2.cmml" xref="S4.SS4.SSS1.p4.5.m5.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p4.5.m5.1.1.3.cmml" 
xref="S4.SS4.SSS1.p4.5.m5.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p4.5.m5.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p4.5.m5.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> is <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p4.14.3">DEMAND-A</span>, and 14.20 dB when <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p4.6.m6.1"><semantics id="S4.SS4.SSS1.p4.6.m6.1a"><msup id="S4.SS4.SSS1.p4.6.m6.1.1" xref="S4.SS4.SSS1.p4.6.m6.1.1.cmml"><mi id="S4.SS4.SSS1.p4.6.m6.1.1.2" xref="S4.SS4.SSS1.p4.6.m6.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p4.6.m6.1.1.3" xref="S4.SS4.SSS1.p4.6.m6.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p4.6.m6.1b"><apply id="S4.SS4.SSS1.p4.6.m6.1.1.cmml" xref="S4.SS4.SSS1.p4.6.m6.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p4.6.m6.1.1.1.cmml" xref="S4.SS4.SSS1.p4.6.m6.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p4.6.m6.1.1.2.cmml" xref="S4.SS4.SSS1.p4.6.m6.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p4.6.m6.1.1.3.cmml" xref="S4.SS4.SSS1.p4.6.m6.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p4.6.m6.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p4.6.m6.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> is <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p4.14.4">DCASE</span>. 
Similarly, when <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p4.7.m7.1"><semantics id="S4.SS4.SSS1.p4.7.m7.1a"><msup id="S4.SS4.SSS1.p4.7.m7.1.1" xref="S4.SS4.SSS1.p4.7.m7.1.1.cmml"><mi id="S4.SS4.SSS1.p4.7.m7.1.1.2" xref="S4.SS4.SSS1.p4.7.m7.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p4.7.m7.1.1.3" xref="S4.SS4.SSS1.p4.7.m7.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p4.7.m7.1b"><apply id="S4.SS4.SSS1.p4.7.m7.1.1.cmml" xref="S4.SS4.SSS1.p4.7.m7.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p4.7.m7.1.1.1.cmml" xref="S4.SS4.SSS1.p4.7.m7.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p4.7.m7.1.1.2.cmml" xref="S4.SS4.SSS1.p4.7.m7.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p4.7.m7.1.1.3.cmml" xref="S4.SS4.SSS1.p4.7.m7.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p4.7.m7.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p4.7.m7.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> is <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p4.14.5">CHiME-B</span>, and when the metric is PESQ or STOI, NyTT consistently performs better when there is a mismatch between <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p4.8.m8.1"><semantics id="S4.SS4.SSS1.p4.8.m8.1a"><msup id="S4.SS4.SSS1.p4.8.m8.1.1" xref="S4.SS4.SSS1.p4.8.m8.1.1.cmml"><mi id="S4.SS4.SSS1.p4.8.m8.1.1.2" xref="S4.SS4.SSS1.p4.8.m8.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p4.8.m8.1.1.3" xref="S4.SS4.SSS1.p4.8.m8.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p4.8.m8.1b"><apply id="S4.SS4.SSS1.p4.8.m8.1.1.cmml" xref="S4.SS4.SSS1.p4.8.m8.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p4.8.m8.1.1.1.cmml" xref="S4.SS4.SSS1.p4.8.m8.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p4.8.m8.1.1.2.cmml" 
xref="S4.SS4.SSS1.p4.8.m8.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p4.8.m8.1.1.3.cmml" xref="S4.SS4.SSS1.p4.8.m8.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p4.8.m8.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p4.8.m8.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p4.9.m9.1"><semantics id="S4.SS4.SSS1.p4.9.m9.1a"><msup id="S4.SS4.SSS1.p4.9.m9.1.1" xref="S4.SS4.SSS1.p4.9.m9.1.1.cmml"><mi id="S4.SS4.SSS1.p4.9.m9.1.1.2" xref="S4.SS4.SSS1.p4.9.m9.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p4.9.m9.1.1.3" xref="S4.SS4.SSS1.p4.9.m9.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p4.9.m9.1b"><apply id="S4.SS4.SSS1.p4.9.m9.1.1.cmml" xref="S4.SS4.SSS1.p4.9.m9.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p4.9.m9.1.1.1.cmml" xref="S4.SS4.SSS1.p4.9.m9.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p4.9.m9.1.1.2.cmml" xref="S4.SS4.SSS1.p4.9.m9.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p4.9.m9.1.1.3.cmml" xref="S4.SS4.SSS1.p4.9.m9.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p4.9.m9.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p4.9.m9.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math>. 
In particular, NyTT achieves its best performance when <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p4.10.m10.1"><semantics id="S4.SS4.SSS1.p4.10.m10.1a"><msup id="S4.SS4.SSS1.p4.10.m10.1.1" xref="S4.SS4.SSS1.p4.10.m10.1.1.cmml"><mi id="S4.SS4.SSS1.p4.10.m10.1.1.2" xref="S4.SS4.SSS1.p4.10.m10.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p4.10.m10.1.1.3" xref="S4.SS4.SSS1.p4.10.m10.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p4.10.m10.1b"><apply id="S4.SS4.SSS1.p4.10.m10.1.1.cmml" xref="S4.SS4.SSS1.p4.10.m10.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p4.10.m10.1.1.1.cmml" xref="S4.SS4.SSS1.p4.10.m10.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p4.10.m10.1.1.2.cmml" xref="S4.SS4.SSS1.p4.10.m10.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p4.10.m10.1.1.3.cmml" xref="S4.SS4.SSS1.p4.10.m10.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p4.10.m10.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p4.10.m10.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> is <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p4.14.6">DCASE</span>, whose characteristics differ distinctly from those of <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p4.14.7">CHiME-C</span>, the noise used as <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p4.11.m11.1"><semantics id="S4.SS4.SSS1.p4.11.m11.1a"><msup id="S4.SS4.SSS1.p4.11.m11.1.1" xref="S4.SS4.SSS1.p4.11.m11.1.1.cmml"><mi id="S4.SS4.SSS1.p4.11.m11.1.1.2" xref="S4.SS4.SSS1.p4.11.m11.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p4.11.m11.1.1.3" xref="S4.SS4.SSS1.p4.11.m11.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p4.11.m11.1b"><apply id="S4.SS4.SSS1.p4.11.m11.1.1.cmml" xref="S4.SS4.SSS1.p4.11.m11.1.1"><csymbol cd="ambiguous" 
id="S4.SS4.SSS1.p4.11.m11.1.1.1.cmml" xref="S4.SS4.SSS1.p4.11.m11.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p4.11.m11.1.1.2.cmml" xref="S4.SS4.SSS1.p4.11.m11.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p4.11.m11.1.1.3.cmml" xref="S4.SS4.SSS1.p4.11.m11.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p4.11.m11.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p4.11.m11.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math>, as shown in Fig. <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.F5" title="Figure 5 ‣ 4.4 Effects of noise mismatches ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">5</span></a>. Since the DNN is trained to include <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p4.12.m12.1"><semantics id="S4.SS4.SSS1.p4.12.m12.1a"><msup id="S4.SS4.SSS1.p4.12.m12.1.1" xref="S4.SS4.SSS1.p4.12.m12.1.1.cmml"><mi id="S4.SS4.SSS1.p4.12.m12.1.1.2" xref="S4.SS4.SSS1.p4.12.m12.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p4.12.m12.1.1.3" xref="S4.SS4.SSS1.p4.12.m12.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p4.12.m12.1b"><apply id="S4.SS4.SSS1.p4.12.m12.1.1.cmml" xref="S4.SS4.SSS1.p4.12.m12.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p4.12.m12.1.1.1.cmml" xref="S4.SS4.SSS1.p4.12.m12.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p4.12.m12.1.1.2.cmml" xref="S4.SS4.SSS1.p4.12.m12.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p4.12.m12.1.1.3.cmml" xref="S4.SS4.SSS1.p4.12.m12.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p4.12.m12.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p4.12.m12.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs 
end_POSTSUPERSCRIPT</annotation></semantics></math> in the output signals in NyTT, it is expected that noise will remain in the output signals when there is no mismatch between <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p4.13.m13.1"><semantics id="S4.SS4.SSS1.p4.13.m13.1a"><msup id="S4.SS4.SSS1.p4.13.m13.1.1" xref="S4.SS4.SSS1.p4.13.m13.1.1.cmml"><mi id="S4.SS4.SSS1.p4.13.m13.1.1.2" xref="S4.SS4.SSS1.p4.13.m13.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p4.13.m13.1.1.3" xref="S4.SS4.SSS1.p4.13.m13.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p4.13.m13.1b"><apply id="S4.SS4.SSS1.p4.13.m13.1.1.cmml" xref="S4.SS4.SSS1.p4.13.m13.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p4.13.m13.1.1.1.cmml" xref="S4.SS4.SSS1.p4.13.m13.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p4.13.m13.1.1.2.cmml" xref="S4.SS4.SSS1.p4.13.m13.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p4.13.m13.1.1.3.cmml" xref="S4.SS4.SSS1.p4.13.m13.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p4.13.m13.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p4.13.m13.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p4.14.m14.1"><semantics id="S4.SS4.SSS1.p4.14.m14.1a"><msup id="S4.SS4.SSS1.p4.14.m14.1.1" xref="S4.SS4.SSS1.p4.14.m14.1.1.cmml"><mi id="S4.SS4.SSS1.p4.14.m14.1.1.2" xref="S4.SS4.SSS1.p4.14.m14.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p4.14.m14.1.1.3" xref="S4.SS4.SSS1.p4.14.m14.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p4.14.m14.1b"><apply id="S4.SS4.SSS1.p4.14.m14.1.1.cmml" xref="S4.SS4.SSS1.p4.14.m14.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p4.14.m14.1.1.1.cmml" xref="S4.SS4.SSS1.p4.14.m14.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p4.14.m14.1.1.2.cmml" 
xref="S4.SS4.SSS1.p4.14.m14.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p4.14.m14.1.1.3.cmml" xref="S4.SS4.SSS1.p4.14.m14.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p4.14.m14.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p4.14.m14.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math>.</p> </div> <div class="ltx_para" id="S4.SS4.SSS1.p5"> <p class="ltx_p" id="S4.SS4.SSS1.p5.19">Third, we focus on the impact of the mismatch between <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p5.1.m1.1"><semantics id="S4.SS4.SSS1.p5.1.m1.1a"><msup id="S4.SS4.SSS1.p5.1.m1.1.1" xref="S4.SS4.SSS1.p5.1.m1.1.1.cmml"><mi id="S4.SS4.SSS1.p5.1.m1.1.1.2" xref="S4.SS4.SSS1.p5.1.m1.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p5.1.m1.1.1.3" xref="S4.SS4.SSS1.p5.1.m1.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p5.1.m1.1b"><apply id="S4.SS4.SSS1.p5.1.m1.1.1.cmml" xref="S4.SS4.SSS1.p5.1.m1.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p5.1.m1.1.1.1.cmml" xref="S4.SS4.SSS1.p5.1.m1.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p5.1.m1.1.1.2.cmml" xref="S4.SS4.SSS1.p5.1.m1.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p5.1.m1.1.1.3.cmml" xref="S4.SS4.SSS1.p5.1.m1.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p5.1.m1.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p5.1.m1.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p5.2.m2.1"><semantics id="S4.SS4.SSS1.p5.2.m2.1a"><msup id="S4.SS4.SSS1.p5.2.m2.1.1" xref="S4.SS4.SSS1.p5.2.m2.1.1.cmml"><mi id="S4.SS4.SSS1.p5.2.m2.1.1.2" xref="S4.SS4.SSS1.p5.2.m2.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p5.2.m2.1.1.3" 
xref="S4.SS4.SSS1.p5.2.m2.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p5.2.m2.1b"><apply id="S4.SS4.SSS1.p5.2.m2.1.1.cmml" xref="S4.SS4.SSS1.p5.2.m2.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p5.2.m2.1.1.1.cmml" xref="S4.SS4.SSS1.p5.2.m2.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p5.2.m2.1.1.2.cmml" xref="S4.SS4.SSS1.p5.2.m2.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p5.2.m2.1.1.3.cmml" xref="S4.SS4.SSS1.p5.2.m2.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p5.2.m2.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p5.2.m2.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> on the performance of IterNyTT. For example, we analyze the performance of IterNyTT when <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p5.3.m3.1"><semantics id="S4.SS4.SSS1.p5.3.m3.1a"><msup id="S4.SS4.SSS1.p5.3.m3.1.1" xref="S4.SS4.SSS1.p5.3.m3.1.1.cmml"><mi id="S4.SS4.SSS1.p5.3.m3.1.1.2" xref="S4.SS4.SSS1.p5.3.m3.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p5.3.m3.1.1.3" xref="S4.SS4.SSS1.p5.3.m3.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p5.3.m3.1b"><apply id="S4.SS4.SSS1.p5.3.m3.1.1.cmml" xref="S4.SS4.SSS1.p5.3.m3.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p5.3.m3.1.1.1.cmml" xref="S4.SS4.SSS1.p5.3.m3.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p5.3.m3.1.1.2.cmml" xref="S4.SS4.SSS1.p5.3.m3.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p5.3.m3.1.1.3.cmml" xref="S4.SS4.SSS1.p5.3.m3.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p5.3.m3.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p5.3.m3.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> is <span class="ltx_text ltx_font_typewriter" 
id="S4.SS4.SSS1.p5.19.1">CHiME-A</span>. In this case, IterNyTT achieves SI-SDRs of 17.11 dB when <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p5.4.m4.1"><semantics id="S4.SS4.SSS1.p5.4.m4.1a"><msup id="S4.SS4.SSS1.p5.4.m4.1.1" xref="S4.SS4.SSS1.p5.4.m4.1.1.cmml"><mi id="S4.SS4.SSS1.p5.4.m4.1.1.2" xref="S4.SS4.SSS1.p5.4.m4.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p5.4.m4.1.1.3" xref="S4.SS4.SSS1.p5.4.m4.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p5.4.m4.1b"><apply id="S4.SS4.SSS1.p5.4.m4.1.1.cmml" xref="S4.SS4.SSS1.p5.4.m4.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p5.4.m4.1.1.1.cmml" xref="S4.SS4.SSS1.p5.4.m4.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p5.4.m4.1.1.2.cmml" xref="S4.SS4.SSS1.p5.4.m4.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p5.4.m4.1.1.3.cmml" xref="S4.SS4.SSS1.p5.4.m4.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p5.4.m4.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p5.4.m4.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> is <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p5.19.2">CHiME-B</span> and 14.64 dB when <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p5.5.m5.1"><semantics id="S4.SS4.SSS1.p5.5.m5.1a"><msup id="S4.SS4.SSS1.p5.5.m5.1.1" xref="S4.SS4.SSS1.p5.5.m5.1.1.cmml"><mi id="S4.SS4.SSS1.p5.5.m5.1.1.2" xref="S4.SS4.SSS1.p5.5.m5.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p5.5.m5.1.1.3" xref="S4.SS4.SSS1.p5.5.m5.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p5.5.m5.1b"><apply id="S4.SS4.SSS1.p5.5.m5.1.1.cmml" xref="S4.SS4.SSS1.p5.5.m5.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p5.5.m5.1.1.1.cmml" xref="S4.SS4.SSS1.p5.5.m5.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p5.5.m5.1.1.2.cmml" xref="S4.SS4.SSS1.p5.5.m5.1.1.2">𝒏</ci><ci 
id="S4.SS4.SSS1.p5.5.m5.1.1.3.cmml" xref="S4.SS4.SSS1.p5.5.m5.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p5.5.m5.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p5.5.m5.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> is (<span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p5.19.3">DEMAND-B</span>, <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p5.19.4">CHiME-B</span>). When <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p5.6.m6.1"><semantics id="S4.SS4.SSS1.p5.6.m6.1a"><msup id="S4.SS4.SSS1.p5.6.m6.1.1" xref="S4.SS4.SSS1.p5.6.m6.1.1.cmml"><mi id="S4.SS4.SSS1.p5.6.m6.1.1.2" xref="S4.SS4.SSS1.p5.6.m6.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p5.6.m6.1.1.3" xref="S4.SS4.SSS1.p5.6.m6.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p5.6.m6.1b"><apply id="S4.SS4.SSS1.p5.6.m6.1.1.cmml" xref="S4.SS4.SSS1.p5.6.m6.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p5.6.m6.1.1.1.cmml" xref="S4.SS4.SSS1.p5.6.m6.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p5.6.m6.1.1.2.cmml" xref="S4.SS4.SSS1.p5.6.m6.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p5.6.m6.1.1.3.cmml" xref="S4.SS4.SSS1.p5.6.m6.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p5.6.m6.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p5.6.m6.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> is (<span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p5.19.5">DEMAND-B</span>, <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p5.19.6">CHiME-B</span>), in the first and second iterations, the mismatch between <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p5.7.m7.1"><semantics 
id="S4.SS4.SSS1.p5.7.m7.1a"><msup id="S4.SS4.SSS1.p5.7.m7.1.1" xref="S4.SS4.SSS1.p5.7.m7.1.1.cmml"><mi id="S4.SS4.SSS1.p5.7.m7.1.1.2" xref="S4.SS4.SSS1.p5.7.m7.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p5.7.m7.1.1.3" xref="S4.SS4.SSS1.p5.7.m7.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p5.7.m7.1b"><apply id="S4.SS4.SSS1.p5.7.m7.1.1.cmml" xref="S4.SS4.SSS1.p5.7.m7.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p5.7.m7.1.1.1.cmml" xref="S4.SS4.SSS1.p5.7.m7.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p5.7.m7.1.1.2.cmml" xref="S4.SS4.SSS1.p5.7.m7.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p5.7.m7.1.1.3.cmml" xref="S4.SS4.SSS1.p5.7.m7.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p5.7.m7.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p5.7.m7.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math>(=<span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p5.19.7">CHiME-A</span>) and <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p5.8.m8.1"><semantics id="S4.SS4.SSS1.p5.8.m8.1a"><msup id="S4.SS4.SSS1.p5.8.m8.1.1" xref="S4.SS4.SSS1.p5.8.m8.1.1.cmml"><mi id="S4.SS4.SSS1.p5.8.m8.1.1.2" xref="S4.SS4.SSS1.p5.8.m8.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p5.8.m8.1.1.3" xref="S4.SS4.SSS1.p5.8.m8.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p5.8.m8.1b"><apply id="S4.SS4.SSS1.p5.8.m8.1.1.cmml" xref="S4.SS4.SSS1.p5.8.m8.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p5.8.m8.1.1.1.cmml" xref="S4.SS4.SSS1.p5.8.m8.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p5.8.m8.1.1.2.cmml" xref="S4.SS4.SSS1.p5.8.m8.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p5.8.m8.1.1.3.cmml" xref="S4.SS4.SSS1.p5.8.m8.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p5.8.m8.1c">\bm{n}^{\rm add}</annotation><annotation 
encoding="application/x-llamapun" id="S4.SS4.SSS1.p5.8.m8.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math>(=<span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p5.19.8">DEMAND-B</span>) prevents IterNyTT from effectively removing <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p5.9.m9.1"><semantics id="S4.SS4.SSS1.p5.9.m9.1a"><msup id="S4.SS4.SSS1.p5.9.m9.1.1" xref="S4.SS4.SSS1.p5.9.m9.1.1.cmml"><mi id="S4.SS4.SSS1.p5.9.m9.1.1.2" xref="S4.SS4.SSS1.p5.9.m9.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p5.9.m9.1.1.3" xref="S4.SS4.SSS1.p5.9.m9.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p5.9.m9.1b"><apply id="S4.SS4.SSS1.p5.9.m9.1.1.cmml" xref="S4.SS4.SSS1.p5.9.m9.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p5.9.m9.1.1.1.cmml" xref="S4.SS4.SSS1.p5.9.m9.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p5.9.m9.1.1.2.cmml" xref="S4.SS4.SSS1.p5.9.m9.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p5.9.m9.1.1.3.cmml" xref="S4.SS4.SSS1.p5.9.m9.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p5.9.m9.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p5.9.m9.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> from the noisy targets. As a result, the third iteration yields no performance improvement on the test dataset. 
Additionally, IterNyTT achieves a higher SI-SDR when <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p5.10.m10.1"><semantics id="S4.SS4.SSS1.p5.10.m10.1a"><msup id="S4.SS4.SSS1.p5.10.m10.1.1" xref="S4.SS4.SSS1.p5.10.m10.1.1.cmml"><mi id="S4.SS4.SSS1.p5.10.m10.1.1.2" xref="S4.SS4.SSS1.p5.10.m10.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p5.10.m10.1.1.3" xref="S4.SS4.SSS1.p5.10.m10.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p5.10.m10.1b"><apply id="S4.SS4.SSS1.p5.10.m10.1.1.cmml" xref="S4.SS4.SSS1.p5.10.m10.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p5.10.m10.1.1.1.cmml" xref="S4.SS4.SSS1.p5.10.m10.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p5.10.m10.1.1.2.cmml" xref="S4.SS4.SSS1.p5.10.m10.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p5.10.m10.1.1.3.cmml" xref="S4.SS4.SSS1.p5.10.m10.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p5.10.m10.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p5.10.m10.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> is (<span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p5.19.9">CHiME-B</span>, <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p5.19.10">DEMAND-B</span>) (10.67 dB) than when <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p5.11.m11.1"><semantics id="S4.SS4.SSS1.p5.11.m11.1a"><msup id="S4.SS4.SSS1.p5.11.m11.1.1" xref="S4.SS4.SSS1.p5.11.m11.1.1.cmml"><mi id="S4.SS4.SSS1.p5.11.m11.1.1.2" xref="S4.SS4.SSS1.p5.11.m11.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p5.11.m11.1.1.3" xref="S4.SS4.SSS1.p5.11.m11.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p5.11.m11.1b"><apply id="S4.SS4.SSS1.p5.11.m11.1.1.cmml" xref="S4.SS4.SSS1.p5.11.m11.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p5.11.m11.1.1.1.cmml" 
xref="S4.SS4.SSS1.p5.11.m11.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p5.11.m11.1.1.2.cmml" xref="S4.SS4.SSS1.p5.11.m11.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p5.11.m11.1.1.3.cmml" xref="S4.SS4.SSS1.p5.11.m11.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p5.11.m11.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p5.11.m11.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> is <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p5.19.11">DEMAND-B</span> (9.80 dB). Moreover, IterNyTT shows little performance improvement when <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p5.12.m12.1"><semantics id="S4.SS4.SSS1.p5.12.m12.1a"><msup id="S4.SS4.SSS1.p5.12.m12.1.1" xref="S4.SS4.SSS1.p5.12.m12.1.1.cmml"><mi id="S4.SS4.SSS1.p5.12.m12.1.1.2" xref="S4.SS4.SSS1.p5.12.m12.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p5.12.m12.1.1.3" xref="S4.SS4.SSS1.p5.12.m12.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p5.12.m12.1b"><apply id="S4.SS4.SSS1.p5.12.m12.1.1.cmml" xref="S4.SS4.SSS1.p5.12.m12.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p5.12.m12.1.1.1.cmml" xref="S4.SS4.SSS1.p5.12.m12.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p5.12.m12.1.1.2.cmml" xref="S4.SS4.SSS1.p5.12.m12.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p5.12.m12.1.1.3.cmml" xref="S4.SS4.SSS1.p5.12.m12.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p5.12.m12.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p5.12.m12.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> is <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p5.19.12">DCASE</span>. 
Similarly, when <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p5.13.m13.1"><semantics id="S4.SS4.SSS1.p5.13.m13.1a"><msup id="S4.SS4.SSS1.p5.13.m13.1.1" xref="S4.SS4.SSS1.p5.13.m13.1.1.cmml"><mi id="S4.SS4.SSS1.p5.13.m13.1.1.2" xref="S4.SS4.SSS1.p5.13.m13.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p5.13.m13.1.1.3" xref="S4.SS4.SSS1.p5.13.m13.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p5.13.m13.1b"><apply id="S4.SS4.SSS1.p5.13.m13.1.1.cmml" xref="S4.SS4.SSS1.p5.13.m13.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p5.13.m13.1.1.1.cmml" xref="S4.SS4.SSS1.p5.13.m13.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p5.13.m13.1.1.2.cmml" xref="S4.SS4.SSS1.p5.13.m13.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p5.13.m13.1.1.3.cmml" xref="S4.SS4.SSS1.p5.13.m13.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p5.13.m13.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p5.13.m13.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> is <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS1.p5.19.13">DEMAND-A</span>, and also when the metric is PESQ or STOI, IterNyTT consistently performs better when using <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p5.14.m14.1"><semantics id="S4.SS4.SSS1.p5.14.m14.1a"><msup id="S4.SS4.SSS1.p5.14.m14.1.1" xref="S4.SS4.SSS1.p5.14.m14.1.1.cmml"><mi id="S4.SS4.SSS1.p5.14.m14.1.1.2" xref="S4.SS4.SSS1.p5.14.m14.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p5.14.m14.1.1.3" xref="S4.SS4.SSS1.p5.14.m14.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p5.14.m14.1b"><apply id="S4.SS4.SSS1.p5.14.m14.1.1.cmml" xref="S4.SS4.SSS1.p5.14.m14.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p5.14.m14.1.1.1.cmml" xref="S4.SS4.SSS1.p5.14.m14.1.1">superscript</csymbol><ci 
id="S4.SS4.SSS1.p5.14.m14.1.1.2.cmml" xref="S4.SS4.SSS1.p5.14.m14.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p5.14.m14.1.1.3.cmml" xref="S4.SS4.SSS1.p5.14.m14.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p5.14.m14.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p5.14.m14.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> matched with <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p5.15.m15.1"><semantics id="S4.SS4.SSS1.p5.15.m15.1a"><msup id="S4.SS4.SSS1.p5.15.m15.1.1" xref="S4.SS4.SSS1.p5.15.m15.1.1.cmml"><mi id="S4.SS4.SSS1.p5.15.m15.1.1.2" xref="S4.SS4.SSS1.p5.15.m15.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p5.15.m15.1.1.3" xref="S4.SS4.SSS1.p5.15.m15.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p5.15.m15.1b"><apply id="S4.SS4.SSS1.p5.15.m15.1.1.cmml" xref="S4.SS4.SSS1.p5.15.m15.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p5.15.m15.1.1.1.cmml" xref="S4.SS4.SSS1.p5.15.m15.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p5.15.m15.1.1.2.cmml" xref="S4.SS4.SSS1.p5.15.m15.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p5.15.m15.1.1.3.cmml" xref="S4.SS4.SSS1.p5.15.m15.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p5.15.m15.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p5.15.m15.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> in the first and second iterations. 
The impact of the mismatch between <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p5.16.m16.1"><semantics id="S4.SS4.SSS1.p5.16.m16.1a"><msup id="S4.SS4.SSS1.p5.16.m16.1.1" xref="S4.SS4.SSS1.p5.16.m16.1.1.cmml"><mi id="S4.SS4.SSS1.p5.16.m16.1.1.2" xref="S4.SS4.SSS1.p5.16.m16.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p5.16.m16.1.1.3" xref="S4.SS4.SSS1.p5.16.m16.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p5.16.m16.1b"><apply id="S4.SS4.SSS1.p5.16.m16.1.1.cmml" xref="S4.SS4.SSS1.p5.16.m16.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p5.16.m16.1.1.1.cmml" xref="S4.SS4.SSS1.p5.16.m16.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p5.16.m16.1.1.2.cmml" xref="S4.SS4.SSS1.p5.16.m16.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p5.16.m16.1.1.3.cmml" xref="S4.SS4.SSS1.p5.16.m16.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p5.16.m16.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p5.16.m16.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p5.17.m17.1"><semantics id="S4.SS4.SSS1.p5.17.m17.1a"><msup id="S4.SS4.SSS1.p5.17.m17.1.1" xref="S4.SS4.SSS1.p5.17.m17.1.1.cmml"><mi id="S4.SS4.SSS1.p5.17.m17.1.1.2" xref="S4.SS4.SSS1.p5.17.m17.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p5.17.m17.1.1.3" xref="S4.SS4.SSS1.p5.17.m17.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p5.17.m17.1b"><apply id="S4.SS4.SSS1.p5.17.m17.1.1.cmml" xref="S4.SS4.SSS1.p5.17.m17.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p5.17.m17.1.1.1.cmml" xref="S4.SS4.SSS1.p5.17.m17.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p5.17.m17.1.1.2.cmml" xref="S4.SS4.SSS1.p5.17.m17.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p5.17.m17.1.1.3.cmml" 
xref="S4.SS4.SSS1.p5.17.m17.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p5.17.m17.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p5.17.m17.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> on IterNyTT is consistent with that of the mismatch between <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p5.18.m18.1"><semantics id="S4.SS4.SSS1.p5.18.m18.1a"><msup id="S4.SS4.SSS1.p5.18.m18.1.1" xref="S4.SS4.SSS1.p5.18.m18.1.1.cmml"><mi id="S4.SS4.SSS1.p5.18.m18.1.1.2" xref="S4.SS4.SSS1.p5.18.m18.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p5.18.m18.1.1.3" xref="S4.SS4.SSS1.p5.18.m18.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p5.18.m18.1b"><apply id="S4.SS4.SSS1.p5.18.m18.1.1.cmml" xref="S4.SS4.SSS1.p5.18.m18.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p5.18.m18.1.1.1.cmml" xref="S4.SS4.SSS1.p5.18.m18.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p5.18.m18.1.1.2.cmml" xref="S4.SS4.SSS1.p5.18.m18.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p5.18.m18.1.1.3.cmml" xref="S4.SS4.SSS1.p5.18.m18.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p5.18.m18.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p5.18.m18.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p5.19.m19.1"><semantics id="S4.SS4.SSS1.p5.19.m19.1a"><msup id="S4.SS4.SSS1.p5.19.m19.1.1" xref="S4.SS4.SSS1.p5.19.m19.1.1.cmml"><mi id="S4.SS4.SSS1.p5.19.m19.1.1.2" xref="S4.SS4.SSS1.p5.19.m19.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p5.19.m19.1.1.3" xref="S4.SS4.SSS1.p5.19.m19.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p5.19.m19.1b"><apply 
id="S4.SS4.SSS1.p5.19.m19.1.1.cmml" xref="S4.SS4.SSS1.p5.19.m19.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p5.19.m19.1.1.1.cmml" xref="S4.SS4.SSS1.p5.19.m19.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p5.19.m19.1.1.2.cmml" xref="S4.SS4.SSS1.p5.19.m19.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p5.19.m19.1.1.3.cmml" xref="S4.SS4.SSS1.p5.19.m19.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p5.19.m19.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p5.19.m19.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math> on NyTT.</p> </div> <div class="ltx_para" id="S4.SS4.SSS1.p6"> <p class="ltx_p" id="S4.SS4.SSS1.p6.6">Summing up the above results, we derive the desirable conditions for the NyTT framework, as shown in Fig. <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.F6" title="Figure 6 ‣ 4.4.1 Effects of mismatches on the performance ‣ 4.4 Effects of noise mismatches ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">6</span></a>. 
a) NyTT achieves high performance when there is no mismatch between <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p6.1.m1.1"><semantics id="S4.SS4.SSS1.p6.1.m1.1a"><msup id="S4.SS4.SSS1.p6.1.m1.1.1" xref="S4.SS4.SSS1.p6.1.m1.1.1.cmml"><mi id="S4.SS4.SSS1.p6.1.m1.1.1.2" xref="S4.SS4.SSS1.p6.1.m1.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p6.1.m1.1.1.3" xref="S4.SS4.SSS1.p6.1.m1.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p6.1.m1.1b"><apply id="S4.SS4.SSS1.p6.1.m1.1.1.cmml" xref="S4.SS4.SSS1.p6.1.m1.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p6.1.m1.1.1.1.cmml" xref="S4.SS4.SSS1.p6.1.m1.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p6.1.m1.1.1.2.cmml" xref="S4.SS4.SSS1.p6.1.m1.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p6.1.m1.1.1.3.cmml" xref="S4.SS4.SSS1.p6.1.m1.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p6.1.m1.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p6.1.m1.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p6.2.m2.1"><semantics id="S4.SS4.SSS1.p6.2.m2.1a"><msup id="S4.SS4.SSS1.p6.2.m2.1.1" xref="S4.SS4.SSS1.p6.2.m2.1.1.cmml"><mi id="S4.SS4.SSS1.p6.2.m2.1.1.2" xref="S4.SS4.SSS1.p6.2.m2.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p6.2.m2.1.1.3" xref="S4.SS4.SSS1.p6.2.m2.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p6.2.m2.1b"><apply id="S4.SS4.SSS1.p6.2.m2.1.1.cmml" xref="S4.SS4.SSS1.p6.2.m2.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p6.2.m2.1.1.1.cmml" xref="S4.SS4.SSS1.p6.2.m2.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p6.2.m2.1.1.2.cmml" xref="S4.SS4.SSS1.p6.2.m2.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p6.2.m2.1.1.3.cmml" xref="S4.SS4.SSS1.p6.2.m2.1.1.3">test</ci></apply></annotation-xml><annotation 
encoding="application/x-tex" id="S4.SS4.SSS1.p6.2.m2.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p6.2.m2.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math>, as in CTT, b) NyTT achieves high performance when there is a mismatch between <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p6.3.m3.1"><semantics id="S4.SS4.SSS1.p6.3.m3.1a"><msup id="S4.SS4.SSS1.p6.3.m3.1.1" xref="S4.SS4.SSS1.p6.3.m3.1.1.cmml"><mi id="S4.SS4.SSS1.p6.3.m3.1.1.2" xref="S4.SS4.SSS1.p6.3.m3.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p6.3.m3.1.1.3" xref="S4.SS4.SSS1.p6.3.m3.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p6.3.m3.1b"><apply id="S4.SS4.SSS1.p6.3.m3.1.1.cmml" xref="S4.SS4.SSS1.p6.3.m3.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p6.3.m3.1.1.1.cmml" xref="S4.SS4.SSS1.p6.3.m3.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p6.3.m3.1.1.2.cmml" xref="S4.SS4.SSS1.p6.3.m3.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p6.3.m3.1.1.3.cmml" xref="S4.SS4.SSS1.p6.3.m3.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p6.3.m3.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p6.3.m3.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p6.4.m4.1"><semantics id="S4.SS4.SSS1.p6.4.m4.1a"><msup id="S4.SS4.SSS1.p6.4.m4.1.1" xref="S4.SS4.SSS1.p6.4.m4.1.1.cmml"><mi id="S4.SS4.SSS1.p6.4.m4.1.1.2" xref="S4.SS4.SSS1.p6.4.m4.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p6.4.m4.1.1.3" xref="S4.SS4.SSS1.p6.4.m4.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p6.4.m4.1b"><apply id="S4.SS4.SSS1.p6.4.m4.1.1.cmml" xref="S4.SS4.SSS1.p6.4.m4.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p6.4.m4.1.1.1.cmml" 
xref="S4.SS4.SSS1.p6.4.m4.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p6.4.m4.1.1.2.cmml" xref="S4.SS4.SSS1.p6.4.m4.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p6.4.m4.1.1.3.cmml" xref="S4.SS4.SSS1.p6.4.m4.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p6.4.m4.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p6.4.m4.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math>, and c) IterNyTT improves the performance when there is no mismatch between <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p6.5.m5.1"><semantics id="S4.SS4.SSS1.p6.5.m5.1a"><msup id="S4.SS4.SSS1.p6.5.m5.1.1" xref="S4.SS4.SSS1.p6.5.m5.1.1.cmml"><mi id="S4.SS4.SSS1.p6.5.m5.1.1.2" xref="S4.SS4.SSS1.p6.5.m5.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p6.5.m5.1.1.3" xref="S4.SS4.SSS1.p6.5.m5.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p6.5.m5.1b"><apply id="S4.SS4.SSS1.p6.5.m5.1.1.cmml" xref="S4.SS4.SSS1.p6.5.m5.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p6.5.m5.1.1.1.cmml" xref="S4.SS4.SSS1.p6.5.m5.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p6.5.m5.1.1.2.cmml" xref="S4.SS4.SSS1.p6.5.m5.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p6.5.m5.1.1.3.cmml" xref="S4.SS4.SSS1.p6.5.m5.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p6.5.m5.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p6.5.m5.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS1.p6.6.m6.1"><semantics id="S4.SS4.SSS1.p6.6.m6.1a"><msup id="S4.SS4.SSS1.p6.6.m6.1.1" xref="S4.SS4.SSS1.p6.6.m6.1.1.cmml"><mi id="S4.SS4.SSS1.p6.6.m6.1.1.2" xref="S4.SS4.SSS1.p6.6.m6.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS1.p6.6.m6.1.1.3" 
xref="S4.SS4.SSS1.p6.6.m6.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS1.p6.6.m6.1b"><apply id="S4.SS4.SSS1.p6.6.m6.1.1.cmml" xref="S4.SS4.SSS1.p6.6.m6.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS1.p6.6.m6.1.1.1.cmml" xref="S4.SS4.SSS1.p6.6.m6.1.1">superscript</csymbol><ci id="S4.SS4.SSS1.p6.6.m6.1.1.2.cmml" xref="S4.SS4.SSS1.p6.6.m6.1.1.2">𝒏</ci><ci id="S4.SS4.SSS1.p6.6.m6.1.1.3.cmml" xref="S4.SS4.SSS1.p6.6.m6.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS1.p6.6.m6.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS1.p6.6.m6.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math>. Additionally, these results are consistent with the interpretation of NyTT in Sec. <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.SS2" title="4.2 Validity of interpretation of NyTT ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">4.2</span></a>.</p> </div> <figure class="ltx_table" id="S4.T4"> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_table">Table 4: </span>Evaluation results on the test dataset. The SI-SDR, PESQ, and STOI of the unprocessed noisy signals were 10.27 dB, 1.48, and 0.874, respectively. 
<math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.T4.4.m1.1"><semantics id="S4.T4.4.m1.1b"><msup id="S4.T4.4.m1.1.1" xref="S4.T4.4.m1.1.1.cmml"><mi id="S4.T4.4.m1.1.1.2" xref="S4.T4.4.m1.1.1.2.cmml">𝒏</mi><mi id="S4.T4.4.m1.1.1.3" xref="S4.T4.4.m1.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.T4.4.m1.1c"><apply id="S4.T4.4.m1.1.1.cmml" xref="S4.T4.4.m1.1.1"><csymbol cd="ambiguous" id="S4.T4.4.m1.1.1.1.cmml" xref="S4.T4.4.m1.1.1">superscript</csymbol><ci id="S4.T4.4.m1.1.1.2.cmml" xref="S4.T4.4.m1.1.1.2">𝒏</ci><ci id="S4.T4.4.m1.1.1.3.cmml" xref="S4.T4.4.m1.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.T4.4.m1.1d">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.T4.4.m1.1e">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math> was <span class="ltx_text ltx_font_typewriter" id="S4.T4.10.1">CHiME-C</span>. When IterNyTT used two noise datasets as <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.T4.5.m2.1"><semantics id="S4.T4.5.m2.1b"><msup id="S4.T4.5.m2.1.1" xref="S4.T4.5.m2.1.1.cmml"><mi id="S4.T4.5.m2.1.1.2" xref="S4.T4.5.m2.1.1.2.cmml">𝒏</mi><mi id="S4.T4.5.m2.1.1.3" xref="S4.T4.5.m2.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.T4.5.m2.1c"><apply id="S4.T4.5.m2.1.1.cmml" xref="S4.T4.5.m2.1.1"><csymbol cd="ambiguous" id="S4.T4.5.m2.1.1.1.cmml" xref="S4.T4.5.m2.1.1">superscript</csymbol><ci id="S4.T4.5.m2.1.1.2.cmml" xref="S4.T4.5.m2.1.1.2">𝒏</ci><ci id="S4.T4.5.m2.1.1.3.cmml" xref="S4.T4.5.m2.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.T4.5.m2.1d">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.T4.5.m2.1e">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math>, the first noise dataset in parentheses was used in the 
first and second iterations of IterNyTT, whereas the second noise dataset in parentheses was used in the third iteration of IterNyTT. In CTT, the choice of <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.T4.6.m3.1"><semantics id="S4.T4.6.m3.1b"><msup id="S4.T4.6.m3.1.1" xref="S4.T4.6.m3.1.1.cmml"><mi id="S4.T4.6.m3.1.1.2" xref="S4.T4.6.m3.1.1.2.cmml">𝒏</mi><mi id="S4.T4.6.m3.1.1.3" xref="S4.T4.6.m3.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.T4.6.m3.1c"><apply id="S4.T4.6.m3.1.1.cmml" xref="S4.T4.6.m3.1.1"><csymbol cd="ambiguous" id="S4.T4.6.m3.1.1.1.cmml" xref="S4.T4.6.m3.1.1">superscript</csymbol><ci id="S4.T4.6.m3.1.1.2.cmml" xref="S4.T4.6.m3.1.1.2">𝒏</ci><ci id="S4.T4.6.m3.1.1.3.cmml" xref="S4.T4.6.m3.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.T4.6.m3.1d">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.T4.6.m3.1e">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> is not involved.</figcaption> <div class="ltx_inline-block ltx_align_center ltx_transformed_outer" id="S4.T4.8.2" style="width:433.6pt;height:160.8pt;vertical-align:-0.7pt;"><span class="ltx_transformed_inner" style="transform:translate(-75.7pt,27.9pt) scale(0.741229678191826,0.741229678191826) ;"> <table class="ltx_tabular ltx_align_middle" id="S4.T4.8.2.2"> <tr class="ltx_tr" id="S4.T4.8.2.2.2"> <td class="ltx_td ltx_align_center ltx_align_middle ltx_border_tt" id="S4.T4.7.1.1.1.1" rowspan="2"><span class="ltx_text" id="S4.T4.7.1.1.1.1.1"><math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.T4.7.1.1.1.1.1.m1.1"><semantics id="S4.T4.7.1.1.1.1.1.m1.1a"><msup id="S4.T4.7.1.1.1.1.1.m1.1.1" xref="S4.T4.7.1.1.1.1.1.m1.1.1.cmml"><mi id="S4.T4.7.1.1.1.1.1.m1.1.1.2" xref="S4.T4.7.1.1.1.1.1.m1.1.1.2.cmml">𝒏</mi><mi id="S4.T4.7.1.1.1.1.1.m1.1.1.3" 
xref="S4.T4.7.1.1.1.1.1.m1.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.T4.7.1.1.1.1.1.m1.1b"><apply id="S4.T4.7.1.1.1.1.1.m1.1.1.cmml" xref="S4.T4.7.1.1.1.1.1.m1.1.1"><csymbol cd="ambiguous" id="S4.T4.7.1.1.1.1.1.m1.1.1.1.cmml" xref="S4.T4.7.1.1.1.1.1.m1.1.1">superscript</csymbol><ci id="S4.T4.7.1.1.1.1.1.m1.1.1.2.cmml" xref="S4.T4.7.1.1.1.1.1.m1.1.1.2">𝒏</ci><ci id="S4.T4.7.1.1.1.1.1.m1.1.1.3.cmml" xref="S4.T4.7.1.1.1.1.1.m1.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.T4.7.1.1.1.1.1.m1.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.T4.7.1.1.1.1.1.m1.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math></span></td> <td class="ltx_td ltx_align_center ltx_align_middle ltx_border_r ltx_border_tt" id="S4.T4.8.2.2.2.2" rowspan="2"><span class="ltx_text" id="S4.T4.8.2.2.2.2.1"><math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.T4.8.2.2.2.2.1.m1.1"><semantics id="S4.T4.8.2.2.2.2.1.m1.1a"><msup id="S4.T4.8.2.2.2.2.1.m1.1.1" xref="S4.T4.8.2.2.2.2.1.m1.1.1.cmml"><mi id="S4.T4.8.2.2.2.2.1.m1.1.1.2" xref="S4.T4.8.2.2.2.2.1.m1.1.1.2.cmml">𝒏</mi><mi id="S4.T4.8.2.2.2.2.1.m1.1.1.3" xref="S4.T4.8.2.2.2.2.1.m1.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.T4.8.2.2.2.2.1.m1.1b"><apply id="S4.T4.8.2.2.2.2.1.m1.1.1.cmml" xref="S4.T4.8.2.2.2.2.1.m1.1.1"><csymbol cd="ambiguous" id="S4.T4.8.2.2.2.2.1.m1.1.1.1.cmml" xref="S4.T4.8.2.2.2.2.1.m1.1.1">superscript</csymbol><ci id="S4.T4.8.2.2.2.2.1.m1.1.1.2.cmml" xref="S4.T4.8.2.2.2.2.1.m1.1.1.2">𝒏</ci><ci id="S4.T4.8.2.2.2.2.1.m1.1.1.3.cmml" xref="S4.T4.8.2.2.2.2.1.m1.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.T4.8.2.2.2.2.1.m1.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.T4.8.2.2.2.2.1.m1.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add 
end_POSTSUPERSCRIPT</annotation></semantics></math></span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" colspan="3" id="S4.T4.8.2.2.2.3">SI-SDR [dB]</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" colspan="3" id="S4.T4.8.2.2.2.4">PESQ</td> <td class="ltx_td ltx_align_center ltx_border_tt" colspan="3" id="S4.T4.8.2.2.2.5">STOI</td> </tr> <tr class="ltx_tr" id="S4.T4.8.2.2.3"> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.3.1">CTT</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.3.2">NyTT</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T4.8.2.2.3.3">IterNyTT</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.3.4">CTT</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.3.5">NyTT</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T4.8.2.2.3.6">IterNyTT</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.3.7">CTT</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.3.8">NyTT</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.3.9">IterNyTT</td> </tr> <tr class="ltx_tr" id="S4.T4.8.2.2.4"> <td class="ltx_td ltx_align_center ltx_align_middle ltx_border_t" id="S4.T4.8.2.2.4.1" rowspan="4"><span class="ltx_text ltx_font_typewriter" id="S4.T4.8.2.2.4.1.1">CHiME-A</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T4.8.2.2.4.2"><span class="ltx_text ltx_font_typewriter" id="S4.T4.8.2.2.4.2.1">CHiME-B</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T4.8.2.2.4.3"><span class="ltx_text ltx_font_bold" id="S4.T4.8.2.2.4.3.1">17.58</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T4.8.2.2.4.4">15.87</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T4.8.2.2.4.5">17.11</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T4.8.2.2.4.6"><span class="ltx_text ltx_font_bold" id="S4.T4.8.2.2.4.6.1">2.67</span></td> <td class="ltx_td ltx_align_center ltx_border_t" 
id="S4.T4.8.2.2.4.7">2.32</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T4.8.2.2.4.8">2.45</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T4.8.2.2.4.9"><span class="ltx_text ltx_font_bold" id="S4.T4.8.2.2.4.9.1">0.944</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T4.8.2.2.4.10">0.927</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T4.8.2.2.4.11">0.934</td> </tr> <tr class="ltx_tr" id="S4.T4.8.2.2.5"> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T4.8.2.2.5.1"><span class="ltx_text ltx_font_typewriter" id="S4.T4.8.2.2.5.1.1">DEMAND-B</span></td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.5.2"><span class="ltx_text ltx_font_bold" id="S4.T4.8.2.2.5.2.1">15.04</span></td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.5.3">10.27</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T4.8.2.2.5.4">9.80</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.5.5"><span class="ltx_text ltx_font_bold" id="S4.T4.8.2.2.5.5.1">2.16</span></td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.5.6">1.57</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T4.8.2.2.5.7">1.51</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.5.8"><span class="ltx_text ltx_font_bold" id="S4.T4.8.2.2.5.8.1">0.926</span></td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.5.9">0.882</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.5.10">0.877</td> </tr> <tr class="ltx_tr" id="S4.T4.8.2.2.6"> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T4.8.2.2.6.1">(<span class="ltx_text ltx_font_typewriter" id="S4.T4.8.2.2.6.1.1">CHiME-B</span>, <span class="ltx_text ltx_font_typewriter" id="S4.T4.8.2.2.6.1.2">DEMAND-B</span>)</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.6.2">-</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.6.3">-</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T4.8.2.2.6.4">10.67</td> <td class="ltx_td 
ltx_align_center" id="S4.T4.8.2.2.6.5">-</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.6.6">-</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T4.8.2.2.6.7">1.59</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.6.8">-</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.6.9">-</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.6.10">0.881</td> </tr> <tr class="ltx_tr" id="S4.T4.8.2.2.7"> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T4.8.2.2.7.1">(<span class="ltx_text ltx_font_typewriter" id="S4.T4.8.2.2.7.1.1">DEMAND-B</span>, <span class="ltx_text ltx_font_typewriter" id="S4.T4.8.2.2.7.1.2">CHiME-B</span>)</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.7.2">-</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.7.3">-</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T4.8.2.2.7.4">14.64</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.7.5">-</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.7.6">-</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T4.8.2.2.7.7">1.96</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.7.8">-</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.7.9">-</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.7.10">0.915</td> </tr> <tr class="ltx_tr" id="S4.T4.8.2.2.8"> <td class="ltx_td ltx_align_center ltx_align_middle ltx_border_t" id="S4.T4.8.2.2.8.1" rowspan="4"><span class="ltx_text ltx_font_typewriter" id="S4.T4.8.2.2.8.1.1">DEMAND-A</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T4.8.2.2.8.2"><span class="ltx_text ltx_font_typewriter" id="S4.T4.8.2.2.8.2.1">CHiME-B</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T4.8.2.2.8.3"><span class="ltx_text ltx_font_bold" id="S4.T4.8.2.2.8.3.1">17.58</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T4.8.2.2.8.4">16.59</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" 
id="S4.T4.8.2.2.8.5">17.20</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T4.8.2.2.8.6"><span class="ltx_text ltx_font_bold" id="S4.T4.8.2.2.8.6.1">2.67</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T4.8.2.2.8.7">2.53</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T4.8.2.2.8.8">2.48</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T4.8.2.2.8.9"><span class="ltx_text ltx_font_bold" id="S4.T4.8.2.2.8.9.1">0.944</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T4.8.2.2.8.10">0.936</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T4.8.2.2.8.11">0.937</td> </tr> <tr class="ltx_tr" id="S4.T4.8.2.2.9"> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T4.8.2.2.9.1"><span class="ltx_text ltx_font_typewriter" id="S4.T4.8.2.2.9.1.1">DEMAND-B</span></td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.9.2"><span class="ltx_text ltx_font_bold" id="S4.T4.8.2.2.9.2.1">15.04</span></td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.9.3">13.56</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T4.8.2.2.9.4">14.11</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.9.5"><span class="ltx_text ltx_font_bold" id="S4.T4.8.2.2.9.5.1">2.16</span></td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.9.6">1.98</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T4.8.2.2.9.7">2.06</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.9.8"><span class="ltx_text ltx_font_bold" id="S4.T4.8.2.2.9.8.1">0.926</span></td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.9.9">0.905</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.9.10">0.917</td> </tr> <tr class="ltx_tr" id="S4.T4.8.2.2.10"> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T4.8.2.2.10.1">(<span class="ltx_text ltx_font_typewriter" id="S4.T4.8.2.2.10.1.1">CHiME-B</span>, <span class="ltx_text ltx_font_typewriter" id="S4.T4.8.2.2.10.1.2">DEMAND-B</span>)</td> <td 
class="ltx_td ltx_align_center" id="S4.T4.8.2.2.10.2">-</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.10.3">-</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T4.8.2.2.10.4">13.85</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.10.5">-</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.10.6">-</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T4.8.2.2.10.7">1.94</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.10.8">-</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.10.9">-</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.10.10">0.903</td> </tr> <tr class="ltx_tr" id="S4.T4.8.2.2.11"> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T4.8.2.2.11.1">(<span class="ltx_text ltx_font_typewriter" id="S4.T4.8.2.2.11.1.1">DEMAND-B</span>, <span class="ltx_text ltx_font_typewriter" id="S4.T4.8.2.2.11.1.2">CHiME-B</span>)</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.11.2">-</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.11.3">-</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T4.8.2.2.11.4">17.33</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.11.5">-</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.11.6">-</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T4.8.2.2.11.7">2.56</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.11.8">-</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.11.9">-</td> <td class="ltx_td ltx_align_center" id="S4.T4.8.2.2.11.10">0.940</td> </tr> <tr class="ltx_tr" id="S4.T4.8.2.2.12"> <td class="ltx_td ltx_align_center ltx_align_middle ltx_border_bb ltx_border_t" id="S4.T4.8.2.2.12.1" rowspan="2"><span class="ltx_text ltx_font_typewriter" id="S4.T4.8.2.2.12.1.1">DCASE</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T4.8.2.2.12.2"><span class="ltx_text ltx_font_typewriter" id="S4.T4.8.2.2.12.2.1">CHiME-B</span></td> <td class="ltx_td ltx_align_center ltx_border_t" 
id="S4.T4.8.2.2.12.3"><span class="ltx_text ltx_font_bold" id="S4.T4.8.2.2.12.3.1">17.58</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T4.8.2.2.12.4">16.75</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T4.8.2.2.12.5">17.06</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T4.8.2.2.12.6"><span class="ltx_text ltx_font_bold" id="S4.T4.8.2.2.12.6.1">2.67</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T4.8.2.2.12.7">2.47</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T4.8.2.2.12.8">2.45</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T4.8.2.2.12.9"><span class="ltx_text ltx_font_bold" id="S4.T4.8.2.2.12.9.1">0.944</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T4.8.2.2.12.10">0.937</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T4.8.2.2.12.11">0.937</td> </tr> <tr class="ltx_tr" id="S4.T4.8.2.2.13"> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T4.8.2.2.13.1"><span class="ltx_text ltx_font_typewriter" id="S4.T4.8.2.2.13.1.1">DEMAND-B</span></td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T4.8.2.2.13.2"><span class="ltx_text ltx_font_bold" id="S4.T4.8.2.2.13.2.1">15.04</span></td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T4.8.2.2.13.3">14.20</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T4.8.2.2.13.4">13.54</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T4.8.2.2.13.5"><span class="ltx_text ltx_font_bold" id="S4.T4.8.2.2.13.5.1">2.16</span></td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T4.8.2.2.13.6">2.06</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T4.8.2.2.13.7">1.91</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T4.8.2.2.13.8"><span class="ltx_text ltx_font_bold" id="S4.T4.8.2.2.13.8.1">0.926</span></td> <td class="ltx_td ltx_align_center ltx_border_bb" 
id="S4.T4.8.2.2.13.9">0.914</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T4.8.2.2.13.10">0.908</td> </tr> </table> </span></div> </figure> <figure class="ltx_figure" id="S4.F6"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="239" id="S4.F6.g1" src="x5.png" width="664"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure">Figure 6: </span>Desirable condition in NyTT framework.</figcaption> </figure> </section> <section class="ltx_subsubsection" id="S4.SS4.SSS2"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection">4.4.2 </span>Difference in the impact of <math alttext="\mathrm{SNR}_{\textbf{x}}" class="ltx_Math" display="inline" id="S4.SS4.SSS2.1.m1.1"><semantics id="S4.SS4.SSS2.1.m1.1b"><msub id="S4.SS4.SSS2.1.m1.1.1" xref="S4.SS4.SSS2.1.m1.1.1.cmml"><mi id="S4.SS4.SSS2.1.m1.1.1.2" xref="S4.SS4.SSS2.1.m1.1.1.2.cmml">SNR</mi><mtext class="ltx_mathvariant_bold" id="S4.SS4.SSS2.1.m1.1.1.3" xref="S4.SS4.SSS2.1.m1.1.1.3a.cmml">x</mtext></msub><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS2.1.m1.1c"><apply id="S4.SS4.SSS2.1.m1.1.1.cmml" xref="S4.SS4.SSS2.1.m1.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS2.1.m1.1.1.1.cmml" xref="S4.SS4.SSS2.1.m1.1.1">subscript</csymbol><ci id="S4.SS4.SSS2.1.m1.1.1.2.cmml" xref="S4.SS4.SSS2.1.m1.1.1.2">SNR</ci><ci id="S4.SS4.SSS2.1.m1.1.1.3a.cmml" xref="S4.SS4.SSS2.1.m1.1.1.3"><mtext class="ltx_mathvariant_bold" id="S4.SS4.SSS2.1.m1.1.1.3.cmml" mathsize="70%" xref="S4.SS4.SSS2.1.m1.1.1.3">x</mtext></ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS2.1.m1.1d">\mathrm{SNR}_{\textbf{x}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS2.1.m1.1e">roman_SNR start_POSTSUBSCRIPT x end_POSTSUBSCRIPT</annotation></semantics></math> with and without the mismatch between <math alttext="\textbf{n}^{\rm obs}" class="ltx_Math" display="inline" 
id="S4.SS4.SSS2.2.m2.1"><semantics id="S4.SS4.SSS2.2.m2.1b"><msup id="S4.SS4.SSS2.2.m2.1.1" xref="S4.SS4.SSS2.2.m2.1.1.cmml"><mtext class="ltx_mathvariant_bold" id="S4.SS4.SSS2.2.m2.1.1.2" xref="S4.SS4.SSS2.2.m2.1.1.2a.cmml">n</mtext><mi id="S4.SS4.SSS2.2.m2.1.1.3" xref="S4.SS4.SSS2.2.m2.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS2.2.m2.1c"><apply id="S4.SS4.SSS2.2.m2.1.1.cmml" xref="S4.SS4.SSS2.2.m2.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS2.2.m2.1.1.1.cmml" xref="S4.SS4.SSS2.2.m2.1.1">superscript</csymbol><ci id="S4.SS4.SSS2.2.m2.1.1.2a.cmml" xref="S4.SS4.SSS2.2.m2.1.1.2"><mtext class="ltx_mathvariant_bold" id="S4.SS4.SSS2.2.m2.1.1.2.cmml" xref="S4.SS4.SSS2.2.m2.1.1.2">n</mtext></ci><ci id="S4.SS4.SSS2.2.m2.1.1.3.cmml" xref="S4.SS4.SSS2.2.m2.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS2.2.m2.1d">\textbf{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS2.2.m2.1e">n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\textbf{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS4.SSS2.3.m3.1"><semantics id="S4.SS4.SSS2.3.m3.1b"><msup id="S4.SS4.SSS2.3.m3.1.1" xref="S4.SS4.SSS2.3.m3.1.1.cmml"><mtext class="ltx_mathvariant_bold" id="S4.SS4.SSS2.3.m3.1.1.2" xref="S4.SS4.SSS2.3.m3.1.1.2a.cmml">n</mtext><mi id="S4.SS4.SSS2.3.m3.1.1.3" xref="S4.SS4.SSS2.3.m3.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS2.3.m3.1c"><apply id="S4.SS4.SSS2.3.m3.1.1.cmml" xref="S4.SS4.SSS2.3.m3.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS2.3.m3.1.1.1.cmml" xref="S4.SS4.SSS2.3.m3.1.1">superscript</csymbol><ci id="S4.SS4.SSS2.3.m3.1.1.2a.cmml" xref="S4.SS4.SSS2.3.m3.1.1.2"><mtext class="ltx_mathvariant_bold" id="S4.SS4.SSS2.3.m3.1.1.2.cmml" xref="S4.SS4.SSS2.3.m3.1.1.2">n</mtext></ci><ci id="S4.SS4.SSS2.3.m3.1.1.3.cmml" 
xref="S4.SS4.SSS2.3.m3.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS2.3.m3.1d">\textbf{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS2.3.m3.1e">n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math> </h4> <div class="ltx_para" id="S4.SS4.SSS2.p1"> <p class="ltx_p" id="S4.SS4.SSS2.p1.10">NyTT experiences performance degradation when <math alttext="\mathrm{SNR}_{\bm{x}}" class="ltx_Math" display="inline" id="S4.SS4.SSS2.p1.1.m1.1"><semantics id="S4.SS4.SSS2.p1.1.m1.1a"><msub id="S4.SS4.SSS2.p1.1.m1.1.1" xref="S4.SS4.SSS2.p1.1.m1.1.1.cmml"><mi id="S4.SS4.SSS2.p1.1.m1.1.1.2" xref="S4.SS4.SSS2.p1.1.m1.1.1.2.cmml">SNR</mi><mi id="S4.SS4.SSS2.p1.1.m1.1.1.3" xref="S4.SS4.SSS2.p1.1.m1.1.1.3.cmml">𝒙</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS2.p1.1.m1.1b"><apply id="S4.SS4.SSS2.p1.1.m1.1.1.cmml" xref="S4.SS4.SSS2.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS2.p1.1.m1.1.1.1.cmml" xref="S4.SS4.SSS2.p1.1.m1.1.1">subscript</csymbol><ci id="S4.SS4.SSS2.p1.1.m1.1.1.2.cmml" xref="S4.SS4.SSS2.p1.1.m1.1.1.2">SNR</ci><ci id="S4.SS4.SSS2.p1.1.m1.1.1.3.cmml" xref="S4.SS4.SSS2.p1.1.m1.1.1.3">𝒙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS2.p1.1.m1.1c">\mathrm{SNR}_{\bm{x}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS2.p1.1.m1.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT</annotation></semantics></math> is low <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">Fujimura_2021</span>]</cite>. 
However, the impact of <math alttext="\mathrm{SNR}_{\bm{x}}" class="ltx_Math" display="inline" id="S4.SS4.SSS2.p1.2.m2.1"><semantics id="S4.SS4.SSS2.p1.2.m2.1a"><msub id="S4.SS4.SSS2.p1.2.m2.1.1" xref="S4.SS4.SSS2.p1.2.m2.1.1.cmml"><mi id="S4.SS4.SSS2.p1.2.m2.1.1.2" xref="S4.SS4.SSS2.p1.2.m2.1.1.2.cmml">SNR</mi><mi id="S4.SS4.SSS2.p1.2.m2.1.1.3" xref="S4.SS4.SSS2.p1.2.m2.1.1.3.cmml">𝒙</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS2.p1.2.m2.1b"><apply id="S4.SS4.SSS2.p1.2.m2.1.1.cmml" xref="S4.SS4.SSS2.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS2.p1.2.m2.1.1.1.cmml" xref="S4.SS4.SSS2.p1.2.m2.1.1">subscript</csymbol><ci id="S4.SS4.SSS2.p1.2.m2.1.1.2.cmml" xref="S4.SS4.SSS2.p1.2.m2.1.1.2">SNR</ci><ci id="S4.SS4.SSS2.p1.2.m2.1.1.3.cmml" xref="S4.SS4.SSS2.p1.2.m2.1.1.3">𝒙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS2.p1.2.m2.1c">\mathrm{SNR}_{\bm{x}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS2.p1.2.m2.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT</annotation></semantics></math> on performance is expected to vary depending on the mismatch between <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS2.p1.3.m3.1"><semantics id="S4.SS4.SSS2.p1.3.m3.1a"><msup id="S4.SS4.SSS2.p1.3.m3.1.1" xref="S4.SS4.SSS2.p1.3.m3.1.1.cmml"><mi id="S4.SS4.SSS2.p1.3.m3.1.1.2" xref="S4.SS4.SSS2.p1.3.m3.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS2.p1.3.m3.1.1.3" xref="S4.SS4.SSS2.p1.3.m3.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS2.p1.3.m3.1b"><apply id="S4.SS4.SSS2.p1.3.m3.1.1.cmml" xref="S4.SS4.SSS2.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS2.p1.3.m3.1.1.1.cmml" xref="S4.SS4.SSS2.p1.3.m3.1.1">superscript</csymbol><ci id="S4.SS4.SSS2.p1.3.m3.1.1.2.cmml" xref="S4.SS4.SSS2.p1.3.m3.1.1.2">𝒏</ci><ci id="S4.SS4.SSS2.p1.3.m3.1.1.3.cmml" 
xref="S4.SS4.SSS2.p1.3.m3.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS2.p1.3.m3.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS2.p1.3.m3.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS4.SSS2.p1.4.m4.1"><semantics id="S4.SS4.SSS2.p1.4.m4.1a"><msup id="S4.SS4.SSS2.p1.4.m4.1.1" xref="S4.SS4.SSS2.p1.4.m4.1.1.cmml"><mi id="S4.SS4.SSS2.p1.4.m4.1.1.2" xref="S4.SS4.SSS2.p1.4.m4.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS2.p1.4.m4.1.1.3" xref="S4.SS4.SSS2.p1.4.m4.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS2.p1.4.m4.1b"><apply id="S4.SS4.SSS2.p1.4.m4.1.1.cmml" xref="S4.SS4.SSS2.p1.4.m4.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS2.p1.4.m4.1.1.1.cmml" xref="S4.SS4.SSS2.p1.4.m4.1.1">superscript</csymbol><ci id="S4.SS4.SSS2.p1.4.m4.1.1.2.cmml" xref="S4.SS4.SSS2.p1.4.m4.1.1.2">𝒏</ci><ci id="S4.SS4.SSS2.p1.4.m4.1.1.3.cmml" xref="S4.SS4.SSS2.p1.4.m4.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS2.p1.4.m4.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS2.p1.4.m4.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math>. 
To investigate this, we evaluated the performance of NyTT using <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS2.p1.10.1">CHiME-A</span>, <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS2.p1.10.2">DEMAND-A</span>, and <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS2.p1.10.3">DCASE</span> as <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS2.p1.5.m5.1"><semantics id="S4.SS4.SSS2.p1.5.m5.1a"><msup id="S4.SS4.SSS2.p1.5.m5.1.1" xref="S4.SS4.SSS2.p1.5.m5.1.1.cmml"><mi id="S4.SS4.SSS2.p1.5.m5.1.1.2" xref="S4.SS4.SSS2.p1.5.m5.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS2.p1.5.m5.1.1.3" xref="S4.SS4.SSS2.p1.5.m5.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS2.p1.5.m5.1b"><apply id="S4.SS4.SSS2.p1.5.m5.1.1.cmml" xref="S4.SS4.SSS2.p1.5.m5.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS2.p1.5.m5.1.1.1.cmml" xref="S4.SS4.SSS2.p1.5.m5.1.1">superscript</csymbol><ci id="S4.SS4.SSS2.p1.5.m5.1.1.2.cmml" xref="S4.SS4.SSS2.p1.5.m5.1.1.2">𝒏</ci><ci id="S4.SS4.SSS2.p1.5.m5.1.1.3.cmml" xref="S4.SS4.SSS2.p1.5.m5.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS2.p1.5.m5.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS2.p1.5.m5.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math>, and <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS2.p1.10.4">CHiME-B</span> as <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS2.p1.6.m6.1"><semantics id="S4.SS4.SSS2.p1.6.m6.1a"><msup id="S4.SS4.SSS2.p1.6.m6.1.1" xref="S4.SS4.SSS2.p1.6.m6.1.1.cmml"><mi id="S4.SS4.SSS2.p1.6.m6.1.1.2" xref="S4.SS4.SSS2.p1.6.m6.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS2.p1.6.m6.1.1.3" xref="S4.SS4.SSS2.p1.6.m6.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS2.p1.6.m6.1b"><apply id="S4.SS4.SSS2.p1.6.m6.1.1.cmml" 
xref="S4.SS4.SSS2.p1.6.m6.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS2.p1.6.m6.1.1.1.cmml" xref="S4.SS4.SSS2.p1.6.m6.1.1">superscript</csymbol><ci id="S4.SS4.SSS2.p1.6.m6.1.1.2.cmml" xref="S4.SS4.SSS2.p1.6.m6.1.1.2">𝒏</ci><ci id="S4.SS4.SSS2.p1.6.m6.1.1.3.cmml" xref="S4.SS4.SSS2.p1.6.m6.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS2.p1.6.m6.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS2.p1.6.m6.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math>, with <math alttext="\mathrm{SNR}_{\bm{x}}" class="ltx_Math" display="inline" id="S4.SS4.SSS2.p1.7.m7.1"><semantics id="S4.SS4.SSS2.p1.7.m7.1a"><msub id="S4.SS4.SSS2.p1.7.m7.1.1" xref="S4.SS4.SSS2.p1.7.m7.1.1.cmml"><mi id="S4.SS4.SSS2.p1.7.m7.1.1.2" xref="S4.SS4.SSS2.p1.7.m7.1.1.2.cmml">SNR</mi><mi id="S4.SS4.SSS2.p1.7.m7.1.1.3" xref="S4.SS4.SSS2.p1.7.m7.1.1.3.cmml">𝒙</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS2.p1.7.m7.1b"><apply id="S4.SS4.SSS2.p1.7.m7.1.1.cmml" xref="S4.SS4.SSS2.p1.7.m7.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS2.p1.7.m7.1.1.1.cmml" xref="S4.SS4.SSS2.p1.7.m7.1.1">subscript</csymbol><ci id="S4.SS4.SSS2.p1.7.m7.1.1.2.cmml" xref="S4.SS4.SSS2.p1.7.m7.1.1.2">SNR</ci><ci id="S4.SS4.SSS2.p1.7.m7.1.1.3.cmml" xref="S4.SS4.SSS2.p1.7.m7.1.1.3">𝒙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS2.p1.7.m7.1c">\mathrm{SNR}_{\bm{x}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS2.p1.7.m7.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT</annotation></semantics></math> set to -5, 0, 5, 10, 15, and 20 dB. 
Additionally, we evaluated the performance of CTT for the case where <math alttext="\mathrm{SNR}_{\bm{x}}" class="ltx_Math" display="inline" id="S4.SS4.SSS2.p1.8.m8.1"><semantics id="S4.SS4.SSS2.p1.8.m8.1a"><msub id="S4.SS4.SSS2.p1.8.m8.1.1" xref="S4.SS4.SSS2.p1.8.m8.1.1.cmml"><mi id="S4.SS4.SSS2.p1.8.m8.1.1.2" xref="S4.SS4.SSS2.p1.8.m8.1.1.2.cmml">SNR</mi><mi id="S4.SS4.SSS2.p1.8.m8.1.1.3" xref="S4.SS4.SSS2.p1.8.m8.1.1.3.cmml">𝒙</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS2.p1.8.m8.1b"><apply id="S4.SS4.SSS2.p1.8.m8.1.1.cmml" xref="S4.SS4.SSS2.p1.8.m8.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS2.p1.8.m8.1.1.1.cmml" xref="S4.SS4.SSS2.p1.8.m8.1.1">subscript</csymbol><ci id="S4.SS4.SSS2.p1.8.m8.1.1.2.cmml" xref="S4.SS4.SSS2.p1.8.m8.1.1.2">SNR</ci><ci id="S4.SS4.SSS2.p1.8.m8.1.1.3.cmml" xref="S4.SS4.SSS2.p1.8.m8.1.1.3">𝒙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS2.p1.8.m8.1c">\mathrm{SNR}_{\bm{x}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS2.p1.8.m8.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT</annotation></semantics></math> is <math alttext="\infty" class="ltx_Math" display="inline" id="S4.SS4.SSS2.p1.9.m9.1"><semantics id="S4.SS4.SSS2.p1.9.m9.1a"><mi id="S4.SS4.SSS2.p1.9.m9.1.1" mathvariant="normal" xref="S4.SS4.SSS2.p1.9.m9.1.1.cmml">∞</mi><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS2.p1.9.m9.1b"><infinity id="S4.SS4.SSS2.p1.9.m9.1.1.cmml" xref="S4.SS4.SSS2.p1.9.m9.1.1"></infinity></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS2.p1.9.m9.1c">\infty</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS2.p1.9.m9.1d">∞</annotation></semantics></math> dB. 
In this experiment, <math alttext="\mathrm{SNR}_{\bm{y}}" class="ltx_Math" display="inline" id="S4.SS4.SSS2.p1.10.m10.1"><semantics id="S4.SS4.SSS2.p1.10.m10.1a"><msub id="S4.SS4.SSS2.p1.10.m10.1.1" xref="S4.SS4.SSS2.p1.10.m10.1.1.cmml"><mi id="S4.SS4.SSS2.p1.10.m10.1.1.2" xref="S4.SS4.SSS2.p1.10.m10.1.1.2.cmml">SNR</mi><mi id="S4.SS4.SSS2.p1.10.m10.1.1.3" xref="S4.SS4.SSS2.p1.10.m10.1.1.3.cmml">𝒚</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS2.p1.10.m10.1b"><apply id="S4.SS4.SSS2.p1.10.m10.1.1.cmml" xref="S4.SS4.SSS2.p1.10.m10.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS2.p1.10.m10.1.1.1.cmml" xref="S4.SS4.SSS2.p1.10.m10.1.1">subscript</csymbol><ci id="S4.SS4.SSS2.p1.10.m10.1.1.2.cmml" xref="S4.SS4.SSS2.p1.10.m10.1.1.2">SNR</ci><ci id="S4.SS4.SSS2.p1.10.m10.1.1.3.cmml" xref="S4.SS4.SSS2.p1.10.m10.1.1.3">𝒚</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS2.p1.10.m10.1c">\mathrm{SNR}_{\bm{y}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS2.p1.10.m10.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_y end_POSTSUBSCRIPT</annotation></semantics></math> was set to range from -5 to 5 dB for both CTT and NyTT.</p> </div> <div class="ltx_para" id="S4.SS4.SSS2.p2"> <p class="ltx_p" id="S4.SS4.SSS2.p2.8">Figure <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.F7" title='Figure 7 ‣ 4.4.2 Difference in the impact of SNR_"x" with and without the mismatch between "n"ᵒᵇˢ and "n"ᵗᵉˢᵗ ‣ 4.4 Effects of noise mismatches ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement'><span class="ltx_text ltx_ref_tag">7</span></a> shows the SI-SDR, PESQ, and STOI of the processed results for the test dataset at each <math alttext="\mathrm{SNR}_{\bm{x}}" class="ltx_Math" display="inline" id="S4.SS4.SSS2.p2.1.m1.1"><semantics id="S4.SS4.SSS2.p2.1.m1.1a"><msub id="S4.SS4.SSS2.p2.1.m1.1.1" 
xref="S4.SS4.SSS2.p2.1.m1.1.1.cmml"><mi id="S4.SS4.SSS2.p2.1.m1.1.1.2" xref="S4.SS4.SSS2.p2.1.m1.1.1.2.cmml">SNR</mi><mi id="S4.SS4.SSS2.p2.1.m1.1.1.3" xref="S4.SS4.SSS2.p2.1.m1.1.1.3.cmml">𝒙</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS2.p2.1.m1.1b"><apply id="S4.SS4.SSS2.p2.1.m1.1.1.cmml" xref="S4.SS4.SSS2.p2.1.m1.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS2.p2.1.m1.1.1.1.cmml" xref="S4.SS4.SSS2.p2.1.m1.1.1">subscript</csymbol><ci id="S4.SS4.SSS2.p2.1.m1.1.1.2.cmml" xref="S4.SS4.SSS2.p2.1.m1.1.1.2">SNR</ci><ci id="S4.SS4.SSS2.p2.1.m1.1.1.3.cmml" xref="S4.SS4.SSS2.p2.1.m1.1.1.3">𝒙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS2.p2.1.m1.1c">\mathrm{SNR}_{\bm{x}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS2.p2.1.m1.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT</annotation></semantics></math>. Overall, the performance of NyTT degrades as <math alttext="\mathrm{SNR}_{\bm{x}}" class="ltx_Math" display="inline" id="S4.SS4.SSS2.p2.2.m2.1"><semantics id="S4.SS4.SSS2.p2.2.m2.1a"><msub id="S4.SS4.SSS2.p2.2.m2.1.1" xref="S4.SS4.SSS2.p2.2.m2.1.1.cmml"><mi id="S4.SS4.SSS2.p2.2.m2.1.1.2" xref="S4.SS4.SSS2.p2.2.m2.1.1.2.cmml">SNR</mi><mi id="S4.SS4.SSS2.p2.2.m2.1.1.3" xref="S4.SS4.SSS2.p2.2.m2.1.1.3.cmml">𝒙</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS2.p2.2.m2.1b"><apply id="S4.SS4.SSS2.p2.2.m2.1.1.cmml" xref="S4.SS4.SSS2.p2.2.m2.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS2.p2.2.m2.1.1.1.cmml" xref="S4.SS4.SSS2.p2.2.m2.1.1">subscript</csymbol><ci id="S4.SS4.SSS2.p2.2.m2.1.1.2.cmml" xref="S4.SS4.SSS2.p2.2.m2.1.1.2">SNR</ci><ci id="S4.SS4.SSS2.p2.2.m2.1.1.3.cmml" xref="S4.SS4.SSS2.p2.2.m2.1.1.3">𝒙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS2.p2.2.m2.1c">\mathrm{SNR}_{\bm{x}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS2.p2.2.m2.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_x 
end_POSTSUBSCRIPT</annotation></semantics></math> decreases. The size of this effect depends on the noise mismatch: significant degradation occurs when <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS2.p2.3.m3.1"><semantics id="S4.SS4.SSS2.p2.3.m3.1a"><msup id="S4.SS4.SSS2.p2.3.m3.1.1" xref="S4.SS4.SSS2.p2.3.m3.1.1.cmml"><mi id="S4.SS4.SSS2.p2.3.m3.1.1.2" xref="S4.SS4.SSS2.p2.3.m3.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS2.p2.3.m3.1.1.3" xref="S4.SS4.SSS2.p2.3.m3.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS2.p2.3.m3.1b"><apply id="S4.SS4.SSS2.p2.3.m3.1.1.cmml" xref="S4.SS4.SSS2.p2.3.m3.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS2.p2.3.m3.1.1.1.cmml" xref="S4.SS4.SSS2.p2.3.m3.1.1">superscript</csymbol><ci id="S4.SS4.SSS2.p2.3.m3.1.1.2.cmml" xref="S4.SS4.SSS2.p2.3.m3.1.1.2">𝒏</ci><ci id="S4.SS4.SSS2.p2.3.m3.1.1.3.cmml" xref="S4.SS4.SSS2.p2.3.m3.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS2.p2.3.m3.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS2.p2.3.m3.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> is <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS2.p2.8.1">CHiME-A</span>, which has no mismatch with <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS4.SSS2.p2.4.m4.1"><semantics id="S4.SS4.SSS2.p2.4.m4.1a"><msup id="S4.SS4.SSS2.p2.4.m4.1.1" xref="S4.SS4.SSS2.p2.4.m4.1.1.cmml"><mi id="S4.SS4.SSS2.p2.4.m4.1.1.2" xref="S4.SS4.SSS2.p2.4.m4.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS2.p2.4.m4.1.1.3" xref="S4.SS4.SSS2.p2.4.m4.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS2.p2.4.m4.1b"><apply id="S4.SS4.SSS2.p2.4.m4.1.1.cmml" xref="S4.SS4.SSS2.p2.4.m4.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS2.p2.4.m4.1.1.1.cmml" xref="S4.SS4.SSS2.p2.4.m4.1.1">superscript</csymbol><ci id="S4.SS4.SSS2.p2.4.m4.1.1.2.cmml" 
xref="S4.SS4.SSS2.p2.4.m4.1.1.2">𝒏</ci><ci id="S4.SS4.SSS2.p2.4.m4.1.1.3.cmml" xref="S4.SS4.SSS2.p2.4.m4.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS2.p2.4.m4.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS2.p2.4.m4.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math>. In contrast, when there is a mismatch, the performance remains relatively high even at a low <math alttext="\mathrm{SNR}_{\bm{x}}" class="ltx_Math" display="inline" id="S4.SS4.SSS2.p2.5.m5.1"><semantics id="S4.SS4.SSS2.p2.5.m5.1a"><msub id="S4.SS4.SSS2.p2.5.m5.1.1" xref="S4.SS4.SSS2.p2.5.m5.1.1.cmml"><mi id="S4.SS4.SSS2.p2.5.m5.1.1.2" xref="S4.SS4.SSS2.p2.5.m5.1.1.2.cmml">SNR</mi><mi id="S4.SS4.SSS2.p2.5.m5.1.1.3" xref="S4.SS4.SSS2.p2.5.m5.1.1.3.cmml">𝒙</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS2.p2.5.m5.1b"><apply id="S4.SS4.SSS2.p2.5.m5.1.1.cmml" xref="S4.SS4.SSS2.p2.5.m5.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS2.p2.5.m5.1.1.1.cmml" xref="S4.SS4.SSS2.p2.5.m5.1.1">subscript</csymbol><ci id="S4.SS4.SSS2.p2.5.m5.1.1.2.cmml" xref="S4.SS4.SSS2.p2.5.m5.1.1.2">SNR</ci><ci id="S4.SS4.SSS2.p2.5.m5.1.1.3.cmml" xref="S4.SS4.SSS2.p2.5.m5.1.1.3">𝒙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS2.p2.5.m5.1c">\mathrm{SNR}_{\bm{x}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS2.p2.5.m5.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT</annotation></semantics></math>. 
These results confirm that the impact of <math alttext="\mathrm{SNR}_{\bm{x}}" class="ltx_Math" display="inline" id="S4.SS4.SSS2.p2.6.m6.1"><semantics id="S4.SS4.SSS2.p2.6.m6.1a"><msub id="S4.SS4.SSS2.p2.6.m6.1.1" xref="S4.SS4.SSS2.p2.6.m6.1.1.cmml"><mi id="S4.SS4.SSS2.p2.6.m6.1.1.2" xref="S4.SS4.SSS2.p2.6.m6.1.1.2.cmml">SNR</mi><mi id="S4.SS4.SSS2.p2.6.m6.1.1.3" xref="S4.SS4.SSS2.p2.6.m6.1.1.3.cmml">𝒙</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS2.p2.6.m6.1b"><apply id="S4.SS4.SSS2.p2.6.m6.1.1.cmml" xref="S4.SS4.SSS2.p2.6.m6.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS2.p2.6.m6.1.1.1.cmml" xref="S4.SS4.SSS2.p2.6.m6.1.1">subscript</csymbol><ci id="S4.SS4.SSS2.p2.6.m6.1.1.2.cmml" xref="S4.SS4.SSS2.p2.6.m6.1.1.2">SNR</ci><ci id="S4.SS4.SSS2.p2.6.m6.1.1.3.cmml" xref="S4.SS4.SSS2.p2.6.m6.1.1.3">𝒙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS2.p2.6.m6.1c">\mathrm{SNR}_{\bm{x}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS2.p2.6.m6.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT</annotation></semantics></math> depends strongly on the mismatch between <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS2.p2.7.m7.1"><semantics id="S4.SS4.SSS2.p2.7.m7.1a"><msup id="S4.SS4.SSS2.p2.7.m7.1.1" xref="S4.SS4.SSS2.p2.7.m7.1.1.cmml"><mi id="S4.SS4.SSS2.p2.7.m7.1.1.2" xref="S4.SS4.SSS2.p2.7.m7.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS2.p2.7.m7.1.1.3" xref="S4.SS4.SSS2.p2.7.m7.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS2.p2.7.m7.1b"><apply id="S4.SS4.SSS2.p2.7.m7.1.1.cmml" xref="S4.SS4.SSS2.p2.7.m7.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS2.p2.7.m7.1.1.1.cmml" xref="S4.SS4.SSS2.p2.7.m7.1.1">superscript</csymbol><ci id="S4.SS4.SSS2.p2.7.m7.1.1.2.cmml" xref="S4.SS4.SSS2.p2.7.m7.1.1.2">𝒏</ci><ci id="S4.SS4.SSS2.p2.7.m7.1.1.3.cmml" 
xref="S4.SS4.SSS2.p2.7.m7.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS2.p2.7.m7.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS2.p2.7.m7.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS4.SSS2.p2.8.m8.1"><semantics id="S4.SS4.SSS2.p2.8.m8.1a"><msup id="S4.SS4.SSS2.p2.8.m8.1.1" xref="S4.SS4.SSS2.p2.8.m8.1.1.cmml"><mi id="S4.SS4.SSS2.p2.8.m8.1.1.2" xref="S4.SS4.SSS2.p2.8.m8.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS2.p2.8.m8.1.1.3" xref="S4.SS4.SSS2.p2.8.m8.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS2.p2.8.m8.1b"><apply id="S4.SS4.SSS2.p2.8.m8.1.1.cmml" xref="S4.SS4.SSS2.p2.8.m8.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS2.p2.8.m8.1.1.1.cmml" xref="S4.SS4.SSS2.p2.8.m8.1.1">superscript</csymbol><ci id="S4.SS4.SSS2.p2.8.m8.1.1.2.cmml" xref="S4.SS4.SSS2.p2.8.m8.1.1.2">𝒏</ci><ci id="S4.SS4.SSS2.p2.8.m8.1.1.3.cmml" xref="S4.SS4.SSS2.p2.8.m8.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS2.p2.8.m8.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS2.p2.8.m8.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math>.</p> </div> <figure class="ltx_figure" id="S4.F7"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="247" id="S4.F7.g1" src="x6.png" width="814"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure">Figure 7: </span>Relationship between <math alttext="\mathrm{SNR}_{\bm{x}}" class="ltx_Math" display="inline" id="S4.F7.4.m1.1"><semantics id="S4.F7.4.m1.1b"><msub id="S4.F7.4.m1.1.1" xref="S4.F7.4.m1.1.1.cmml"><mi id="S4.F7.4.m1.1.1.2" xref="S4.F7.4.m1.1.1.2.cmml">SNR</mi><mi id="S4.F7.4.m1.1.1.3" 
xref="S4.F7.4.m1.1.1.3.cmml">𝒙</mi></msub><annotation-xml encoding="MathML-Content" id="S4.F7.4.m1.1c"><apply id="S4.F7.4.m1.1.1.cmml" xref="S4.F7.4.m1.1.1"><csymbol cd="ambiguous" id="S4.F7.4.m1.1.1.1.cmml" xref="S4.F7.4.m1.1.1">subscript</csymbol><ci id="S4.F7.4.m1.1.1.2.cmml" xref="S4.F7.4.m1.1.1.2">SNR</ci><ci id="S4.F7.4.m1.1.1.3.cmml" xref="S4.F7.4.m1.1.1.3">𝒙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.F7.4.m1.1d">\mathrm{SNR}_{\bm{x}}</annotation><annotation encoding="application/x-llamapun" id="S4.F7.4.m1.1e">roman_SNR start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT</annotation></semantics></math> and the evaluation results of NyTT. Values in parentheses indicate the evaluation results of unprocessed input signals. <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.F7.5.m2.1"><semantics id="S4.F7.5.m2.1b"><msup id="S4.F7.5.m2.1.1" xref="S4.F7.5.m2.1.1.cmml"><mi id="S4.F7.5.m2.1.1.2" xref="S4.F7.5.m2.1.1.2.cmml">𝒏</mi><mi id="S4.F7.5.m2.1.1.3" xref="S4.F7.5.m2.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.F7.5.m2.1c"><apply id="S4.F7.5.m2.1.1.cmml" xref="S4.F7.5.m2.1.1"><csymbol cd="ambiguous" id="S4.F7.5.m2.1.1.1.cmml" xref="S4.F7.5.m2.1.1">superscript</csymbol><ci id="S4.F7.5.m2.1.1.2.cmml" xref="S4.F7.5.m2.1.1.2">𝒏</ci><ci id="S4.F7.5.m2.1.1.3.cmml" xref="S4.F7.5.m2.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.F7.5.m2.1d">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.F7.5.m2.1e">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> was <span class="ltx_text ltx_font_typewriter" id="S4.F7.8.1">CHiME-B</span> and <math alttext="\mathrm{SNR}_{\bm{y}}" class="ltx_Math" display="inline" id="S4.F7.6.m3.1"><semantics id="S4.F7.6.m3.1b"><msub id="S4.F7.6.m3.1.1" xref="S4.F7.6.m3.1.1.cmml"><mi id="S4.F7.6.m3.1.1.2" 
xref="S4.F7.6.m3.1.1.2.cmml">SNR</mi><mi id="S4.F7.6.m3.1.1.3" xref="S4.F7.6.m3.1.1.3.cmml">𝒚</mi></msub><annotation-xml encoding="MathML-Content" id="S4.F7.6.m3.1c"><apply id="S4.F7.6.m3.1.1.cmml" xref="S4.F7.6.m3.1.1"><csymbol cd="ambiguous" id="S4.F7.6.m3.1.1.1.cmml" xref="S4.F7.6.m3.1.1">subscript</csymbol><ci id="S4.F7.6.m3.1.1.2.cmml" xref="S4.F7.6.m3.1.1.2">SNR</ci><ci id="S4.F7.6.m3.1.1.3.cmml" xref="S4.F7.6.m3.1.1.3">𝒚</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.F7.6.m3.1d">\mathrm{SNR}_{\bm{y}}</annotation><annotation encoding="application/x-llamapun" id="S4.F7.6.m3.1e">roman_SNR start_POSTSUBSCRIPT bold_italic_y end_POSTSUBSCRIPT</annotation></semantics></math> ranged from -5 to 5 dB.</figcaption> </figure> </section> <section class="ltx_subsubsection" id="S4.SS4.SSS3"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection">4.4.3 </span>Difference in the impact of <math alttext="\mathrm{SNR}_{\textbf{y}}" class="ltx_Math" display="inline" id="S4.SS4.SSS3.1.m1.1"><semantics id="S4.SS4.SSS3.1.m1.1b"><msub id="S4.SS4.SSS3.1.m1.1.1" xref="S4.SS4.SSS3.1.m1.1.1.cmml"><mi id="S4.SS4.SSS3.1.m1.1.1.2" xref="S4.SS4.SSS3.1.m1.1.1.2.cmml">SNR</mi><mtext class="ltx_mathvariant_bold" id="S4.SS4.SSS3.1.m1.1.1.3" xref="S4.SS4.SSS3.1.m1.1.1.3a.cmml">y</mtext></msub><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.1.m1.1c"><apply id="S4.SS4.SSS3.1.m1.1.1.cmml" xref="S4.SS4.SSS3.1.m1.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS3.1.m1.1.1.1.cmml" xref="S4.SS4.SSS3.1.m1.1.1">subscript</csymbol><ci id="S4.SS4.SSS3.1.m1.1.1.2.cmml" xref="S4.SS4.SSS3.1.m1.1.1.2">SNR</ci><ci id="S4.SS4.SSS3.1.m1.1.1.3a.cmml" xref="S4.SS4.SSS3.1.m1.1.1.3"><mtext class="ltx_mathvariant_bold" id="S4.SS4.SSS3.1.m1.1.1.3.cmml" mathsize="70%" xref="S4.SS4.SSS3.1.m1.1.1.3">y</mtext></ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.1.m1.1d">\mathrm{SNR}_{\textbf{y}}</annotation><annotation 
encoding="application/x-llamapun" id="S4.SS4.SSS3.1.m1.1e">roman_SNR start_POSTSUBSCRIPT y end_POSTSUBSCRIPT</annotation></semantics></math> with and without the mismatch between <math alttext="\textbf{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS3.2.m2.1"><semantics id="S4.SS4.SSS3.2.m2.1b"><msup id="S4.SS4.SSS3.2.m2.1.1" xref="S4.SS4.SSS3.2.m2.1.1.cmml"><mtext class="ltx_mathvariant_bold" id="S4.SS4.SSS3.2.m2.1.1.2" xref="S4.SS4.SSS3.2.m2.1.1.2a.cmml">n</mtext><mi id="S4.SS4.SSS3.2.m2.1.1.3" xref="S4.SS4.SSS3.2.m2.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.2.m2.1c"><apply id="S4.SS4.SSS3.2.m2.1.1.cmml" xref="S4.SS4.SSS3.2.m2.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS3.2.m2.1.1.1.cmml" xref="S4.SS4.SSS3.2.m2.1.1">superscript</csymbol><ci id="S4.SS4.SSS3.2.m2.1.1.2a.cmml" xref="S4.SS4.SSS3.2.m2.1.1.2"><mtext class="ltx_mathvariant_bold" id="S4.SS4.SSS3.2.m2.1.1.2.cmml" xref="S4.SS4.SSS3.2.m2.1.1.2">n</mtext></ci><ci id="S4.SS4.SSS3.2.m2.1.1.3.cmml" xref="S4.SS4.SSS3.2.m2.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.2.m2.1d">\textbf{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.2.m2.1e">n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\textbf{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS4.SSS3.3.m3.1"><semantics id="S4.SS4.SSS3.3.m3.1b"><msup id="S4.SS4.SSS3.3.m3.1.1" xref="S4.SS4.SSS3.3.m3.1.1.cmml"><mtext class="ltx_mathvariant_bold" id="S4.SS4.SSS3.3.m3.1.1.2" xref="S4.SS4.SSS3.3.m3.1.1.2a.cmml">n</mtext><mi id="S4.SS4.SSS3.3.m3.1.1.3" xref="S4.SS4.SSS3.3.m3.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.3.m3.1c"><apply id="S4.SS4.SSS3.3.m3.1.1.cmml" xref="S4.SS4.SSS3.3.m3.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS3.3.m3.1.1.1.cmml" xref="S4.SS4.SSS3.3.m3.1.1">superscript</csymbol><ci 
id="S4.SS4.SSS3.3.m3.1.1.2a.cmml" xref="S4.SS4.SSS3.3.m3.1.1.2"><mtext class="ltx_mathvariant_bold" id="S4.SS4.SSS3.3.m3.1.1.2.cmml" xref="S4.SS4.SSS3.3.m3.1.1.2">n</mtext></ci><ci id="S4.SS4.SSS3.3.m3.1.1.3.cmml" xref="S4.SS4.SSS3.3.m3.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.3.m3.1d">\textbf{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.3.m3.1e">n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math> </h4> <div class="ltx_para" id="S4.SS4.SSS3.p1"> <p class="ltx_p" id="S4.SS4.SSS3.p1.14">Considering that NyTT trains a DNN to estimate noisy targets by removing <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p1.1.m1.1"><semantics id="S4.SS4.SSS3.p1.1.m1.1a"><msup id="S4.SS4.SSS3.p1.1.m1.1.1" xref="S4.SS4.SSS3.p1.1.m1.1.1.cmml"><mi id="S4.SS4.SSS3.p1.1.m1.1.1.2" xref="S4.SS4.SSS3.p1.1.m1.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS3.p1.1.m1.1.1.3" xref="S4.SS4.SSS3.p1.1.m1.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.p1.1.m1.1b"><apply id="S4.SS4.SSS3.p1.1.m1.1.1.cmml" xref="S4.SS4.SSS3.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS3.p1.1.m1.1.1.1.cmml" xref="S4.SS4.SSS3.p1.1.m1.1.1">superscript</csymbol><ci id="S4.SS4.SSS3.p1.1.m1.1.1.2.cmml" xref="S4.SS4.SSS3.p1.1.m1.1.1.2">𝒏</ci><ci id="S4.SS4.SSS3.p1.1.m1.1.1.3.cmml" xref="S4.SS4.SSS3.p1.1.m1.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.p1.1.m1.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p1.1.m1.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> from <span class="ltx_text ltx_font_italic" id="S4.SS4.SSS3.p1.14.1">more noisy</span> signals, <math alttext="\mathrm{SNR}_{\bm{y}}" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p1.2.m2.1"><semantics 
id="S4.SS4.SSS3.p1.2.m2.1a"><msub id="S4.SS4.SSS3.p1.2.m2.1.1" xref="S4.SS4.SSS3.p1.2.m2.1.1.cmml"><mi id="S4.SS4.SSS3.p1.2.m2.1.1.2" xref="S4.SS4.SSS3.p1.2.m2.1.1.2.cmml">SNR</mi><mi id="S4.SS4.SSS3.p1.2.m2.1.1.3" xref="S4.SS4.SSS3.p1.2.m2.1.1.3.cmml">𝒚</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.p1.2.m2.1b"><apply id="S4.SS4.SSS3.p1.2.m2.1.1.cmml" xref="S4.SS4.SSS3.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS3.p1.2.m2.1.1.1.cmml" xref="S4.SS4.SSS3.p1.2.m2.1.1">subscript</csymbol><ci id="S4.SS4.SSS3.p1.2.m2.1.1.2.cmml" xref="S4.SS4.SSS3.p1.2.m2.1.1.2">SNR</ci><ci id="S4.SS4.SSS3.p1.2.m2.1.1.3.cmml" xref="S4.SS4.SSS3.p1.2.m2.1.1.3">𝒚</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.p1.2.m2.1c">\mathrm{SNR}_{\bm{y}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p1.2.m2.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_y end_POSTSUBSCRIPT</annotation></semantics></math> also affects the performance. 
For example, when there is no mismatch between <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p1.3.m3.1"><semantics id="S4.SS4.SSS3.p1.3.m3.1a"><msup id="S4.SS4.SSS3.p1.3.m3.1.1" xref="S4.SS4.SSS3.p1.3.m3.1.1.cmml"><mi id="S4.SS4.SSS3.p1.3.m3.1.1.2" xref="S4.SS4.SSS3.p1.3.m3.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS3.p1.3.m3.1.1.3" xref="S4.SS4.SSS3.p1.3.m3.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.p1.3.m3.1b"><apply id="S4.SS4.SSS3.p1.3.m3.1.1.cmml" xref="S4.SS4.SSS3.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS3.p1.3.m3.1.1.1.cmml" xref="S4.SS4.SSS3.p1.3.m3.1.1">superscript</csymbol><ci id="S4.SS4.SSS3.p1.3.m3.1.1.2.cmml" xref="S4.SS4.SSS3.p1.3.m3.1.1.2">𝒏</ci><ci id="S4.SS4.SSS3.p1.3.m3.1.1.3.cmml" xref="S4.SS4.SSS3.p1.3.m3.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.p1.3.m3.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p1.3.m3.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p1.4.m4.1"><semantics id="S4.SS4.SSS3.p1.4.m4.1a"><msup id="S4.SS4.SSS3.p1.4.m4.1.1" xref="S4.SS4.SSS3.p1.4.m4.1.1.cmml"><mi id="S4.SS4.SSS3.p1.4.m4.1.1.2" xref="S4.SS4.SSS3.p1.4.m4.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS3.p1.4.m4.1.1.3" xref="S4.SS4.SSS3.p1.4.m4.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.p1.4.m4.1b"><apply id="S4.SS4.SSS3.p1.4.m4.1.1.cmml" xref="S4.SS4.SSS3.p1.4.m4.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS3.p1.4.m4.1.1.1.cmml" xref="S4.SS4.SSS3.p1.4.m4.1.1">superscript</csymbol><ci id="S4.SS4.SSS3.p1.4.m4.1.1.2.cmml" xref="S4.SS4.SSS3.p1.4.m4.1.1.2">𝒏</ci><ci id="S4.SS4.SSS3.p1.4.m4.1.1.3.cmml" xref="S4.SS4.SSS3.p1.4.m4.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" 
id="S4.SS4.SSS3.p1.4.m4.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p1.4.m4.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math>, and <math alttext="\mathrm{SNR}_{\bm{y}}" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p1.5.m5.1"><semantics id="S4.SS4.SSS3.p1.5.m5.1a"><msub id="S4.SS4.SSS3.p1.5.m5.1.1" xref="S4.SS4.SSS3.p1.5.m5.1.1.cmml"><mi id="S4.SS4.SSS3.p1.5.m5.1.1.2" xref="S4.SS4.SSS3.p1.5.m5.1.1.2.cmml">SNR</mi><mi id="S4.SS4.SSS3.p1.5.m5.1.1.3" xref="S4.SS4.SSS3.p1.5.m5.1.1.3.cmml">𝒚</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.p1.5.m5.1b"><apply id="S4.SS4.SSS3.p1.5.m5.1.1.cmml" xref="S4.SS4.SSS3.p1.5.m5.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS3.p1.5.m5.1.1.1.cmml" xref="S4.SS4.SSS3.p1.5.m5.1.1">subscript</csymbol><ci id="S4.SS4.SSS3.p1.5.m5.1.1.2.cmml" xref="S4.SS4.SSS3.p1.5.m5.1.1.2">SNR</ci><ci id="S4.SS4.SSS3.p1.5.m5.1.1.3.cmml" xref="S4.SS4.SSS3.p1.5.m5.1.1.3">𝒚</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.p1.5.m5.1c">\mathrm{SNR}_{\bm{y}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p1.5.m5.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_y end_POSTSUBSCRIPT</annotation></semantics></math> is high, the benefit of noise reduction may be smaller than the adverse effect of the residual noise in the output signal. 
To investigate this, we evaluated the performance of NyTT using <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS3.p1.14.2">CHiME-A</span> and <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS3.p1.14.3">DCASE</span> as <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p1.6.m6.1"><semantics id="S4.SS4.SSS3.p1.6.m6.1a"><msup id="S4.SS4.SSS3.p1.6.m6.1.1" xref="S4.SS4.SSS3.p1.6.m6.1.1.cmml"><mi id="S4.SS4.SSS3.p1.6.m6.1.1.2" xref="S4.SS4.SSS3.p1.6.m6.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS3.p1.6.m6.1.1.3" xref="S4.SS4.SSS3.p1.6.m6.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.p1.6.m6.1b"><apply id="S4.SS4.SSS3.p1.6.m6.1.1.cmml" xref="S4.SS4.SSS3.p1.6.m6.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS3.p1.6.m6.1.1.1.cmml" xref="S4.SS4.SSS3.p1.6.m6.1.1">superscript</csymbol><ci id="S4.SS4.SSS3.p1.6.m6.1.1.2.cmml" xref="S4.SS4.SSS3.p1.6.m6.1.1.2">𝒏</ci><ci id="S4.SS4.SSS3.p1.6.m6.1.1.3.cmml" xref="S4.SS4.SSS3.p1.6.m6.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.p1.6.m6.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p1.6.m6.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math>, and <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS3.p1.14.4">CHiME-B</span> as <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p1.7.m7.1"><semantics id="S4.SS4.SSS3.p1.7.m7.1a"><msup id="S4.SS4.SSS3.p1.7.m7.1.1" xref="S4.SS4.SSS3.p1.7.m7.1.1.cmml"><mi id="S4.SS4.SSS3.p1.7.m7.1.1.2" xref="S4.SS4.SSS3.p1.7.m7.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS3.p1.7.m7.1.1.3" xref="S4.SS4.SSS3.p1.7.m7.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.p1.7.m7.1b"><apply id="S4.SS4.SSS3.p1.7.m7.1.1.cmml" xref="S4.SS4.SSS3.p1.7.m7.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS3.p1.7.m7.1.1.1.cmml" 
xref="S4.SS4.SSS3.p1.7.m7.1.1">superscript</csymbol><ci id="S4.SS4.SSS3.p1.7.m7.1.1.2.cmml" xref="S4.SS4.SSS3.p1.7.m7.1.1.2">𝒏</ci><ci id="S4.SS4.SSS3.p1.7.m7.1.1.3.cmml" xref="S4.SS4.SSS3.p1.7.m7.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.p1.7.m7.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p1.7.m7.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math>, and by varying the <math alttext="\mathrm{SNR}_{\bm{y}}" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p1.8.m8.1"><semantics id="S4.SS4.SSS3.p1.8.m8.1a"><msub id="S4.SS4.SSS3.p1.8.m8.1.1" xref="S4.SS4.SSS3.p1.8.m8.1.1.cmml"><mi id="S4.SS4.SSS3.p1.8.m8.1.1.2" xref="S4.SS4.SSS3.p1.8.m8.1.1.2.cmml">SNR</mi><mi id="S4.SS4.SSS3.p1.8.m8.1.1.3" xref="S4.SS4.SSS3.p1.8.m8.1.1.3.cmml">𝒚</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.p1.8.m8.1b"><apply id="S4.SS4.SSS3.p1.8.m8.1.1.cmml" xref="S4.SS4.SSS3.p1.8.m8.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS3.p1.8.m8.1.1.1.cmml" xref="S4.SS4.SSS3.p1.8.m8.1.1">subscript</csymbol><ci id="S4.SS4.SSS3.p1.8.m8.1.1.2.cmml" xref="S4.SS4.SSS3.p1.8.m8.1.1.2">SNR</ci><ci id="S4.SS4.SSS3.p1.8.m8.1.1.3.cmml" xref="S4.SS4.SSS3.p1.8.m8.1.1.3">𝒚</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.p1.8.m8.1c">\mathrm{SNR}_{\bm{y}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p1.8.m8.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_y end_POSTSUBSCRIPT</annotation></semantics></math> range to <math alttext="[-10,-5)" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p1.9.m9.2"><semantics id="S4.SS4.SSS3.p1.9.m9.2a"><mrow id="S4.SS4.SSS3.p1.9.m9.2.2.2" xref="S4.SS4.SSS3.p1.9.m9.2.2.3.cmml"><mo id="S4.SS4.SSS3.p1.9.m9.2.2.2.3" stretchy="false" xref="S4.SS4.SSS3.p1.9.m9.2.2.3.cmml">[</mo><mrow id="S4.SS4.SSS3.p1.9.m9.1.1.1.1" 
xref="S4.SS4.SSS3.p1.9.m9.1.1.1.1.cmml"><mo id="S4.SS4.SSS3.p1.9.m9.1.1.1.1a" xref="S4.SS4.SSS3.p1.9.m9.1.1.1.1.cmml">−</mo><mn id="S4.SS4.SSS3.p1.9.m9.1.1.1.1.2" xref="S4.SS4.SSS3.p1.9.m9.1.1.1.1.2.cmml">10</mn></mrow><mo id="S4.SS4.SSS3.p1.9.m9.2.2.2.4" xref="S4.SS4.SSS3.p1.9.m9.2.2.3.cmml">,</mo><mrow id="S4.SS4.SSS3.p1.9.m9.2.2.2.2" xref="S4.SS4.SSS3.p1.9.m9.2.2.2.2.cmml"><mo id="S4.SS4.SSS3.p1.9.m9.2.2.2.2a" xref="S4.SS4.SSS3.p1.9.m9.2.2.2.2.cmml">−</mo><mn id="S4.SS4.SSS3.p1.9.m9.2.2.2.2.2" xref="S4.SS4.SSS3.p1.9.m9.2.2.2.2.2.cmml">5</mn></mrow><mo id="S4.SS4.SSS3.p1.9.m9.2.2.2.5" stretchy="false" xref="S4.SS4.SSS3.p1.9.m9.2.2.3.cmml">)</mo></mrow><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.p1.9.m9.2b"><interval closure="closed-open" id="S4.SS4.SSS3.p1.9.m9.2.2.3.cmml" xref="S4.SS4.SSS3.p1.9.m9.2.2.2"><apply id="S4.SS4.SSS3.p1.9.m9.1.1.1.1.cmml" xref="S4.SS4.SSS3.p1.9.m9.1.1.1.1"><minus id="S4.SS4.SSS3.p1.9.m9.1.1.1.1.1.cmml" xref="S4.SS4.SSS3.p1.9.m9.1.1.1.1"></minus><cn id="S4.SS4.SSS3.p1.9.m9.1.1.1.1.2.cmml" type="integer" xref="S4.SS4.SSS3.p1.9.m9.1.1.1.1.2">10</cn></apply><apply id="S4.SS4.SSS3.p1.9.m9.2.2.2.2.cmml" xref="S4.SS4.SSS3.p1.9.m9.2.2.2.2"><minus id="S4.SS4.SSS3.p1.9.m9.2.2.2.2.1.cmml" xref="S4.SS4.SSS3.p1.9.m9.2.2.2.2"></minus><cn id="S4.SS4.SSS3.p1.9.m9.2.2.2.2.2.cmml" type="integer" xref="S4.SS4.SSS3.p1.9.m9.2.2.2.2.2">5</cn></apply></interval></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.p1.9.m9.2c">[-10,-5)</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p1.9.m9.2d">[ - 10 , - 5 )</annotation></semantics></math>, <math alttext="[-5,0)" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p1.10.m10.2"><semantics id="S4.SS4.SSS3.p1.10.m10.2a"><mrow id="S4.SS4.SSS3.p1.10.m10.2.2.1" xref="S4.SS4.SSS3.p1.10.m10.2.2.2.cmml"><mo id="S4.SS4.SSS3.p1.10.m10.2.2.1.2" stretchy="false" xref="S4.SS4.SSS3.p1.10.m10.2.2.2.cmml">[</mo><mrow id="S4.SS4.SSS3.p1.10.m10.2.2.1.1" 
xref="S4.SS4.SSS3.p1.10.m10.2.2.1.1.cmml"><mo id="S4.SS4.SSS3.p1.10.m10.2.2.1.1a" xref="S4.SS4.SSS3.p1.10.m10.2.2.1.1.cmml">−</mo><mn id="S4.SS4.SSS3.p1.10.m10.2.2.1.1.2" xref="S4.SS4.SSS3.p1.10.m10.2.2.1.1.2.cmml">5</mn></mrow><mo id="S4.SS4.SSS3.p1.10.m10.2.2.1.3" xref="S4.SS4.SSS3.p1.10.m10.2.2.2.cmml">,</mo><mn id="S4.SS4.SSS3.p1.10.m10.1.1" xref="S4.SS4.SSS3.p1.10.m10.1.1.cmml">0</mn><mo id="S4.SS4.SSS3.p1.10.m10.2.2.1.4" stretchy="false" xref="S4.SS4.SSS3.p1.10.m10.2.2.2.cmml">)</mo></mrow><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.p1.10.m10.2b"><interval closure="closed-open" id="S4.SS4.SSS3.p1.10.m10.2.2.2.cmml" xref="S4.SS4.SSS3.p1.10.m10.2.2.1"><apply id="S4.SS4.SSS3.p1.10.m10.2.2.1.1.cmml" xref="S4.SS4.SSS3.p1.10.m10.2.2.1.1"><minus id="S4.SS4.SSS3.p1.10.m10.2.2.1.1.1.cmml" xref="S4.SS4.SSS3.p1.10.m10.2.2.1.1"></minus><cn id="S4.SS4.SSS3.p1.10.m10.2.2.1.1.2.cmml" type="integer" xref="S4.SS4.SSS3.p1.10.m10.2.2.1.1.2">5</cn></apply><cn id="S4.SS4.SSS3.p1.10.m10.1.1.cmml" type="integer" xref="S4.SS4.SSS3.p1.10.m10.1.1">0</cn></interval></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.p1.10.m10.2c">[-5,0)</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p1.10.m10.2d">[ - 5 , 0 )</annotation></semantics></math>, <math alttext="[0,5)" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p1.11.m11.2"><semantics id="S4.SS4.SSS3.p1.11.m11.2a"><mrow id="S4.SS4.SSS3.p1.11.m11.2.3.2" xref="S4.SS4.SSS3.p1.11.m11.2.3.1.cmml"><mo id="S4.SS4.SSS3.p1.11.m11.2.3.2.1" stretchy="false" xref="S4.SS4.SSS3.p1.11.m11.2.3.1.cmml">[</mo><mn id="S4.SS4.SSS3.p1.11.m11.1.1" xref="S4.SS4.SSS3.p1.11.m11.1.1.cmml">0</mn><mo id="S4.SS4.SSS3.p1.11.m11.2.3.2.2" xref="S4.SS4.SSS3.p1.11.m11.2.3.1.cmml">,</mo><mn id="S4.SS4.SSS3.p1.11.m11.2.2" xref="S4.SS4.SSS3.p1.11.m11.2.2.cmml">5</mn><mo id="S4.SS4.SSS3.p1.11.m11.2.3.2.3" stretchy="false" xref="S4.SS4.SSS3.p1.11.m11.2.3.1.cmml">)</mo></mrow><annotation-xml 
encoding="MathML-Content" id="S4.SS4.SSS3.p1.11.m11.2b"><interval closure="closed-open" id="S4.SS4.SSS3.p1.11.m11.2.3.1.cmml" xref="S4.SS4.SSS3.p1.11.m11.2.3.2"><cn id="S4.SS4.SSS3.p1.11.m11.1.1.cmml" type="integer" xref="S4.SS4.SSS3.p1.11.m11.1.1">0</cn><cn id="S4.SS4.SSS3.p1.11.m11.2.2.cmml" type="integer" xref="S4.SS4.SSS3.p1.11.m11.2.2">5</cn></interval></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.p1.11.m11.2c">[0,5)</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p1.11.m11.2d">[ 0 , 5 )</annotation></semantics></math>, <math alttext="[5,10)" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p1.12.m12.2"><semantics id="S4.SS4.SSS3.p1.12.m12.2a"><mrow id="S4.SS4.SSS3.p1.12.m12.2.3.2" xref="S4.SS4.SSS3.p1.12.m12.2.3.1.cmml"><mo id="S4.SS4.SSS3.p1.12.m12.2.3.2.1" stretchy="false" xref="S4.SS4.SSS3.p1.12.m12.2.3.1.cmml">[</mo><mn id="S4.SS4.SSS3.p1.12.m12.1.1" xref="S4.SS4.SSS3.p1.12.m12.1.1.cmml">5</mn><mo id="S4.SS4.SSS3.p1.12.m12.2.3.2.2" xref="S4.SS4.SSS3.p1.12.m12.2.3.1.cmml">,</mo><mn id="S4.SS4.SSS3.p1.12.m12.2.2" xref="S4.SS4.SSS3.p1.12.m12.2.2.cmml">10</mn><mo id="S4.SS4.SSS3.p1.12.m12.2.3.2.3" stretchy="false" xref="S4.SS4.SSS3.p1.12.m12.2.3.1.cmml">)</mo></mrow><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.p1.12.m12.2b"><interval closure="closed-open" id="S4.SS4.SSS3.p1.12.m12.2.3.1.cmml" xref="S4.SS4.SSS3.p1.12.m12.2.3.2"><cn id="S4.SS4.SSS3.p1.12.m12.1.1.cmml" type="integer" xref="S4.SS4.SSS3.p1.12.m12.1.1">5</cn><cn id="S4.SS4.SSS3.p1.12.m12.2.2.cmml" type="integer" xref="S4.SS4.SSS3.p1.12.m12.2.2">10</cn></interval></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.p1.12.m12.2c">[5,10)</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p1.12.m12.2d">[ 5 , 10 )</annotation></semantics></math>, and <math alttext="[10,15)" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p1.13.m13.2"><semantics id="S4.SS4.SSS3.p1.13.m13.2a"><mrow 
id="S4.SS4.SSS3.p1.13.m13.2.3.2" xref="S4.SS4.SSS3.p1.13.m13.2.3.1.cmml"><mo id="S4.SS4.SSS3.p1.13.m13.2.3.2.1" stretchy="false" xref="S4.SS4.SSS3.p1.13.m13.2.3.1.cmml">[</mo><mn id="S4.SS4.SSS3.p1.13.m13.1.1" xref="S4.SS4.SSS3.p1.13.m13.1.1.cmml">10</mn><mo id="S4.SS4.SSS3.p1.13.m13.2.3.2.2" xref="S4.SS4.SSS3.p1.13.m13.2.3.1.cmml">,</mo><mn id="S4.SS4.SSS3.p1.13.m13.2.2" xref="S4.SS4.SSS3.p1.13.m13.2.2.cmml">15</mn><mo id="S4.SS4.SSS3.p1.13.m13.2.3.2.3" stretchy="false" xref="S4.SS4.SSS3.p1.13.m13.2.3.1.cmml">)</mo></mrow><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.p1.13.m13.2b"><interval closure="closed-open" id="S4.SS4.SSS3.p1.13.m13.2.3.1.cmml" xref="S4.SS4.SSS3.p1.13.m13.2.3.2"><cn id="S4.SS4.SSS3.p1.13.m13.1.1.cmml" type="integer" xref="S4.SS4.SSS3.p1.13.m13.1.1">10</cn><cn id="S4.SS4.SSS3.p1.13.m13.2.2.cmml" type="integer" xref="S4.SS4.SSS3.p1.13.m13.2.2">15</cn></interval></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.p1.13.m13.2c">[10,15)</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p1.13.m13.2d">[ 10 , 15 )</annotation></semantics></math> dB. We also evaluated the performance of CTT. 
In this experiment, we set <math alttext="\mathrm{SNR}_{\bm{x}}" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p1.14.m14.1"><semantics id="S4.SS4.SSS3.p1.14.m14.1a"><msub id="S4.SS4.SSS3.p1.14.m14.1.1" xref="S4.SS4.SSS3.p1.14.m14.1.1.cmml"><mi id="S4.SS4.SSS3.p1.14.m14.1.1.2" xref="S4.SS4.SSS3.p1.14.m14.1.1.2.cmml">SNR</mi><mi id="S4.SS4.SSS3.p1.14.m14.1.1.3" xref="S4.SS4.SSS3.p1.14.m14.1.1.3.cmml">𝒙</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.p1.14.m14.1b"><apply id="S4.SS4.SSS3.p1.14.m14.1.1.cmml" xref="S4.SS4.SSS3.p1.14.m14.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS3.p1.14.m14.1.1.1.cmml" xref="S4.SS4.SSS3.p1.14.m14.1.1">subscript</csymbol><ci id="S4.SS4.SSS3.p1.14.m14.1.1.2.cmml" xref="S4.SS4.SSS3.p1.14.m14.1.1.2">SNR</ci><ci id="S4.SS4.SSS3.p1.14.m14.1.1.3.cmml" xref="S4.SS4.SSS3.p1.14.m14.1.1.3">𝒙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.p1.14.m14.1c">\mathrm{SNR}_{\bm{x}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p1.14.m14.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT</annotation></semantics></math> to 5 dB for NyTT.</p> </div> <div class="ltx_para" id="S4.SS4.SSS3.p2"> <p class="ltx_p" id="S4.SS4.SSS3.p2.9">Figure <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.F8" title='Figure 8 ‣ 4.4.3 Difference in the impact of SNR_"y" with and without the mismatch between "n"ᵒᵇˢ and "n"ᵗᵉˢᵗ ‣ 4.4 Effects of noise mismatches ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement'><span class="ltx_text ltx_ref_tag">8</span></a> shows the SI-SDR, PESQ, and STOI of the processed results for the test dataset at each <math alttext="\mathrm{SNR}_{\bm{y}}" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p2.1.m1.1"><semantics id="S4.SS4.SSS3.p2.1.m1.1a"><msub id="S4.SS4.SSS3.p2.1.m1.1.1" xref="S4.SS4.SSS3.p2.1.m1.1.1.cmml"><mi 
id="S4.SS4.SSS3.p2.1.m1.1.1.2" xref="S4.SS4.SSS3.p2.1.m1.1.1.2.cmml">SNR</mi><mi id="S4.SS4.SSS3.p2.1.m1.1.1.3" xref="S4.SS4.SSS3.p2.1.m1.1.1.3.cmml">𝒚</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.p2.1.m1.1b"><apply id="S4.SS4.SSS3.p2.1.m1.1.1.cmml" xref="S4.SS4.SSS3.p2.1.m1.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS3.p2.1.m1.1.1.1.cmml" xref="S4.SS4.SSS3.p2.1.m1.1.1">subscript</csymbol><ci id="S4.SS4.SSS3.p2.1.m1.1.1.2.cmml" xref="S4.SS4.SSS3.p2.1.m1.1.1.2">SNR</ci><ci id="S4.SS4.SSS3.p2.1.m1.1.1.3.cmml" xref="S4.SS4.SSS3.p2.1.m1.1.1.3">𝒚</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.p2.1.m1.1c">\mathrm{SNR}_{\bm{y}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p2.1.m1.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_y end_POSTSUBSCRIPT</annotation></semantics></math> range, where the performance of CTT varies depending on the mismatch of SNR between the training and test datasets. When <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p2.2.m2.1"><semantics id="S4.SS4.SSS3.p2.2.m2.1a"><msup id="S4.SS4.SSS3.p2.2.m2.1.1" xref="S4.SS4.SSS3.p2.2.m2.1.1.cmml"><mi id="S4.SS4.SSS3.p2.2.m2.1.1.2" xref="S4.SS4.SSS3.p2.2.m2.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS3.p2.2.m2.1.1.3" xref="S4.SS4.SSS3.p2.2.m2.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.p2.2.m2.1b"><apply id="S4.SS4.SSS3.p2.2.m2.1.1.cmml" xref="S4.SS4.SSS3.p2.2.m2.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS3.p2.2.m2.1.1.1.cmml" xref="S4.SS4.SSS3.p2.2.m2.1.1">superscript</csymbol><ci id="S4.SS4.SSS3.p2.2.m2.1.1.2.cmml" xref="S4.SS4.SSS3.p2.2.m2.1.1.2">𝒏</ci><ci id="S4.SS4.SSS3.p2.2.m2.1.1.3.cmml" xref="S4.SS4.SSS3.p2.2.m2.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.p2.2.m2.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p2.2.m2.1d">bold_italic_n 
start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> is <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS3.p2.9.1">DCASE</span>, which has a mismatch with <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p2.3.m3.1"><semantics id="S4.SS4.SSS3.p2.3.m3.1a"><msup id="S4.SS4.SSS3.p2.3.m3.1.1" xref="S4.SS4.SSS3.p2.3.m3.1.1.cmml"><mi id="S4.SS4.SSS3.p2.3.m3.1.1.2" xref="S4.SS4.SSS3.p2.3.m3.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS3.p2.3.m3.1.1.3" xref="S4.SS4.SSS3.p2.3.m3.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.p2.3.m3.1b"><apply id="S4.SS4.SSS3.p2.3.m3.1.1.cmml" xref="S4.SS4.SSS3.p2.3.m3.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS3.p2.3.m3.1.1.1.cmml" xref="S4.SS4.SSS3.p2.3.m3.1.1">superscript</csymbol><ci id="S4.SS4.SSS3.p2.3.m3.1.1.2.cmml" xref="S4.SS4.SSS3.p2.3.m3.1.1.2">𝒏</ci><ci id="S4.SS4.SSS3.p2.3.m3.1.1.3.cmml" xref="S4.SS4.SSS3.p2.3.m3.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.p2.3.m3.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p2.3.m3.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math>, the performance of NyTT has a similar tendency to that of CTT. 
On the other hand, when <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p2.4.m4.1"><semantics id="S4.SS4.SSS3.p2.4.m4.1a"><msup id="S4.SS4.SSS3.p2.4.m4.1.1" xref="S4.SS4.SSS3.p2.4.m4.1.1.cmml"><mi id="S4.SS4.SSS3.p2.4.m4.1.1.2" xref="S4.SS4.SSS3.p2.4.m4.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS3.p2.4.m4.1.1.3" xref="S4.SS4.SSS3.p2.4.m4.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.p2.4.m4.1b"><apply id="S4.SS4.SSS3.p2.4.m4.1.1.cmml" xref="S4.SS4.SSS3.p2.4.m4.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS3.p2.4.m4.1.1.1.cmml" xref="S4.SS4.SSS3.p2.4.m4.1.1">superscript</csymbol><ci id="S4.SS4.SSS3.p2.4.m4.1.1.2.cmml" xref="S4.SS4.SSS3.p2.4.m4.1.1.2">𝒏</ci><ci id="S4.SS4.SSS3.p2.4.m4.1.1.3.cmml" xref="S4.SS4.SSS3.p2.4.m4.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.p2.4.m4.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p2.4.m4.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> is <span class="ltx_text ltx_font_typewriter" id="S4.SS4.SSS3.p2.9.2">CHiME-A</span>, which has no mismatch with <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p2.5.m5.1"><semantics id="S4.SS4.SSS3.p2.5.m5.1a"><msup id="S4.SS4.SSS3.p2.5.m5.1.1" xref="S4.SS4.SSS3.p2.5.m5.1.1.cmml"><mi id="S4.SS4.SSS3.p2.5.m5.1.1.2" xref="S4.SS4.SSS3.p2.5.m5.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS3.p2.5.m5.1.1.3" xref="S4.SS4.SSS3.p2.5.m5.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.p2.5.m5.1b"><apply id="S4.SS4.SSS3.p2.5.m5.1.1.cmml" xref="S4.SS4.SSS3.p2.5.m5.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS3.p2.5.m5.1.1.1.cmml" xref="S4.SS4.SSS3.p2.5.m5.1.1">superscript</csymbol><ci id="S4.SS4.SSS3.p2.5.m5.1.1.2.cmml" xref="S4.SS4.SSS3.p2.5.m5.1.1.2">𝒏</ci><ci id="S4.SS4.SSS3.p2.5.m5.1.1.3.cmml" 
xref="S4.SS4.SSS3.p2.5.m5.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.p2.5.m5.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p2.5.m5.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math>, the performance of NyTT degrades when <math alttext="\mathrm{SNR}_{\bm{y}}" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p2.6.m6.1"><semantics id="S4.SS4.SSS3.p2.6.m6.1a"><msub id="S4.SS4.SSS3.p2.6.m6.1.1" xref="S4.SS4.SSS3.p2.6.m6.1.1.cmml"><mi id="S4.SS4.SSS3.p2.6.m6.1.1.2" xref="S4.SS4.SSS3.p2.6.m6.1.1.2.cmml">SNR</mi><mi id="S4.SS4.SSS3.p2.6.m6.1.1.3" xref="S4.SS4.SSS3.p2.6.m6.1.1.3.cmml">𝒚</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.p2.6.m6.1b"><apply id="S4.SS4.SSS3.p2.6.m6.1.1.cmml" xref="S4.SS4.SSS3.p2.6.m6.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS3.p2.6.m6.1.1.1.cmml" xref="S4.SS4.SSS3.p2.6.m6.1.1">subscript</csymbol><ci id="S4.SS4.SSS3.p2.6.m6.1.1.2.cmml" xref="S4.SS4.SSS3.p2.6.m6.1.1.2">SNR</ci><ci id="S4.SS4.SSS3.p2.6.m6.1.1.3.cmml" xref="S4.SS4.SSS3.p2.6.m6.1.1.3">𝒚</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.p2.6.m6.1c">\mathrm{SNR}_{\bm{y}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p2.6.m6.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_y end_POSTSUBSCRIPT</annotation></semantics></math> exceeds 5 dB, and this trend differs from that of CTT. 
Thus, when there is no mismatch between <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p2.7.m7.1"><semantics id="S4.SS4.SSS3.p2.7.m7.1a"><msup id="S4.SS4.SSS3.p2.7.m7.1.1" xref="S4.SS4.SSS3.p2.7.m7.1.1.cmml"><mi id="S4.SS4.SSS3.p2.7.m7.1.1.2" xref="S4.SS4.SSS3.p2.7.m7.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS3.p2.7.m7.1.1.3" xref="S4.SS4.SSS3.p2.7.m7.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.p2.7.m7.1b"><apply id="S4.SS4.SSS3.p2.7.m7.1.1.cmml" xref="S4.SS4.SSS3.p2.7.m7.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS3.p2.7.m7.1.1.1.cmml" xref="S4.SS4.SSS3.p2.7.m7.1.1">superscript</csymbol><ci id="S4.SS4.SSS3.p2.7.m7.1.1.2.cmml" xref="S4.SS4.SSS3.p2.7.m7.1.1.2">𝒏</ci><ci id="S4.SS4.SSS3.p2.7.m7.1.1.3.cmml" xref="S4.SS4.SSS3.p2.7.m7.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.p2.7.m7.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p2.7.m7.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p2.8.m8.1"><semantics id="S4.SS4.SSS3.p2.8.m8.1a"><msup id="S4.SS4.SSS3.p2.8.m8.1.1" xref="S4.SS4.SSS3.p2.8.m8.1.1.cmml"><mi id="S4.SS4.SSS3.p2.8.m8.1.1.2" xref="S4.SS4.SSS3.p2.8.m8.1.1.2.cmml">𝒏</mi><mi id="S4.SS4.SSS3.p2.8.m8.1.1.3" xref="S4.SS4.SSS3.p2.8.m8.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.p2.8.m8.1b"><apply id="S4.SS4.SSS3.p2.8.m8.1.1.cmml" xref="S4.SS4.SSS3.p2.8.m8.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS3.p2.8.m8.1.1.1.cmml" xref="S4.SS4.SSS3.p2.8.m8.1.1">superscript</csymbol><ci id="S4.SS4.SSS3.p2.8.m8.1.1.2.cmml" xref="S4.SS4.SSS3.p2.8.m8.1.1.2">𝒏</ci><ci id="S4.SS4.SSS3.p2.8.m8.1.1.3.cmml" xref="S4.SS4.SSS3.p2.8.m8.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" 
id="S4.SS4.SSS3.p2.8.m8.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p2.8.m8.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math>, NyTT effectively acquires the TSE feature by setting <math alttext="\mathrm{SNR}_{\bm{y}}" class="ltx_Math" display="inline" id="S4.SS4.SSS3.p2.9.m9.1"><semantics id="S4.SS4.SSS3.p2.9.m9.1a"><msub id="S4.SS4.SSS3.p2.9.m9.1.1" xref="S4.SS4.SSS3.p2.9.m9.1.1.cmml"><mi id="S4.SS4.SSS3.p2.9.m9.1.1.2" xref="S4.SS4.SSS3.p2.9.m9.1.1.2.cmml">SNR</mi><mi id="S4.SS4.SSS3.p2.9.m9.1.1.3" xref="S4.SS4.SSS3.p2.9.m9.1.1.3.cmml">𝒚</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS4.SSS3.p2.9.m9.1b"><apply id="S4.SS4.SSS3.p2.9.m9.1.1.cmml" xref="S4.SS4.SSS3.p2.9.m9.1.1"><csymbol cd="ambiguous" id="S4.SS4.SSS3.p2.9.m9.1.1.1.cmml" xref="S4.SS4.SSS3.p2.9.m9.1.1">subscript</csymbol><ci id="S4.SS4.SSS3.p2.9.m9.1.1.2.cmml" xref="S4.SS4.SSS3.p2.9.m9.1.1.2">SNR</ci><ci id="S4.SS4.SSS3.p2.9.m9.1.1.3.cmml" xref="S4.SS4.SSS3.p2.9.m9.1.1.3">𝒚</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS4.SSS3.p2.9.m9.1c">\mathrm{SNR}_{\bm{y}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS4.SSS3.p2.9.m9.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_y end_POSTSUBSCRIPT</annotation></semantics></math> to a moderately low, but not excessively low, level.</p> </div> <figure class="ltx_figure" id="S4.F8"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="296" id="S4.F8.g1" src="x7.png" width="813"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure">Figure 8: </span>Relationship between <math alttext="\mathrm{SNR}_{\bm{y}}" class="ltx_Math" display="inline" id="S4.F8.4.m1.1"><semantics id="S4.F8.4.m1.1b"><msub id="S4.F8.4.m1.1.1" xref="S4.F8.4.m1.1.1.cmml"><mi id="S4.F8.4.m1.1.1.2" xref="S4.F8.4.m1.1.1.2.cmml">SNR</mi><mi id="S4.F8.4.m1.1.1.3" 
xref="S4.F8.4.m1.1.1.3.cmml">𝒚</mi></msub><annotation-xml encoding="MathML-Content" id="S4.F8.4.m1.1c"><apply id="S4.F8.4.m1.1.1.cmml" xref="S4.F8.4.m1.1.1"><csymbol cd="ambiguous" id="S4.F8.4.m1.1.1.1.cmml" xref="S4.F8.4.m1.1.1">subscript</csymbol><ci id="S4.F8.4.m1.1.1.2.cmml" xref="S4.F8.4.m1.1.1.2">SNR</ci><ci id="S4.F8.4.m1.1.1.3.cmml" xref="S4.F8.4.m1.1.1.3">𝒚</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.F8.4.m1.1d">\mathrm{SNR}_{\bm{y}}</annotation><annotation encoding="application/x-llamapun" id="S4.F8.4.m1.1e">roman_SNR start_POSTSUBSCRIPT bold_italic_y end_POSTSUBSCRIPT</annotation></semantics></math> and the evaluation results of NyTT. Values in parentheses indicate the evaluation results of unprocessed input signals. <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.F8.5.m2.1"><semantics id="S4.F8.5.m2.1b"><msup id="S4.F8.5.m2.1.1" xref="S4.F8.5.m2.1.1.cmml"><mi id="S4.F8.5.m2.1.1.2" xref="S4.F8.5.m2.1.1.2.cmml">𝒏</mi><mi id="S4.F8.5.m2.1.1.3" xref="S4.F8.5.m2.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.F8.5.m2.1c"><apply id="S4.F8.5.m2.1.1.cmml" xref="S4.F8.5.m2.1.1"><csymbol cd="ambiguous" id="S4.F8.5.m2.1.1.1.cmml" xref="S4.F8.5.m2.1.1">superscript</csymbol><ci id="S4.F8.5.m2.1.1.2.cmml" xref="S4.F8.5.m2.1.1.2">𝒏</ci><ci id="S4.F8.5.m2.1.1.3.cmml" xref="S4.F8.5.m2.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.F8.5.m2.1d">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.F8.5.m2.1e">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> was <span class="ltx_text ltx_font_typewriter" id="S4.F8.8.1">CHiME-B</span> and <math alttext="\mathrm{SNR}_{\bm{x}}" class="ltx_Math" display="inline" id="S4.F8.6.m3.1"><semantics id="S4.F8.6.m3.1b"><msub id="S4.F8.6.m3.1.1" xref="S4.F8.6.m3.1.1.cmml"><mi id="S4.F8.6.m3.1.1.2" 
xref="S4.F8.6.m3.1.1.2.cmml">SNR</mi><mi id="S4.F8.6.m3.1.1.3" xref="S4.F8.6.m3.1.1.3.cmml">𝒙</mi></msub><annotation-xml encoding="MathML-Content" id="S4.F8.6.m3.1c"><apply id="S4.F8.6.m3.1.1.cmml" xref="S4.F8.6.m3.1.1"><csymbol cd="ambiguous" id="S4.F8.6.m3.1.1.1.cmml" xref="S4.F8.6.m3.1.1">subscript</csymbol><ci id="S4.F8.6.m3.1.1.2.cmml" xref="S4.F8.6.m3.1.1.2">SNR</ci><ci id="S4.F8.6.m3.1.1.3.cmml" xref="S4.F8.6.m3.1.1.3">𝒙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.F8.6.m3.1d">\mathrm{SNR}_{\bm{x}}</annotation><annotation encoding="application/x-llamapun" id="S4.F8.6.m3.1e">roman_SNR start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT</annotation></semantics></math> was 5 dB.</figcaption> </figure> </section> </section> <section class="ltx_subsection" id="S4.SS5"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">4.5 </span>Effectiveness of utilizing noisy signals in a situation where clean target signals are available</h3> <div class="ltx_para" id="S4.SS5.p1"> <p class="ltx_p" id="S4.SS5.p1.4">In this experiment, we used 100 utterances from LibriSpeech as <span class="ltx_text ltx_font_typewriter" id="S4.SS5.p1.4.1">Clean-100</span> and 900 utterances as <span class="ltx_text ltx_font_typewriter" id="S4.SS5.p1.4.2">Clean-900</span>. We generated <span class="ltx_text ltx_font_typewriter" id="S4.SS5.p1.4.3">Noisy-900</span> by mixing <span class="ltx_text ltx_font_typewriter" id="S4.SS5.p1.4.4">Clean-900</span> with <span class="ltx_text ltx_font_typewriter" id="S4.SS5.p1.4.5">CHiME-A</span> at an SNR of 5 dB. We investigated the effectiveness of using <span class="ltx_text ltx_font_typewriter" id="S4.SS5.p1.4.6">Noisy-900</span>. 
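</p> <p class="ltx_p">The mixing step described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the helper names <span class="ltx_text ltx_font_typewriter">mean_power</span>, <span class="ltx_text ltx_font_typewriter">mix_at_snr</span>, and <span class="ltx_text ltx_font_typewriter">measure_snr_db</span> are hypothetical, the signals are toy sinusoids rather than LibriSpeech or CHiME audio, and we assume the SNR is defined on the average power of the whole utterance.</p>
<pre class="ltx_verbatim ltx_font_typewriter">
```python
import math


def mean_power(samples):
    """Average power (mean squared amplitude) of a sequence of samples."""
    return sum(v * v for v in samples) / len(samples)


def mix_at_snr(clean, noise, target_snr_db):
    """Scale `noise` so that clean/noise power matches `target_snr_db`, then add it."""
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    noise_power = mean_power(noise)
    if noise_power == 0.0:
        raise ValueError("noise must not be silent")
    # Choose gain g so that 10*log10(P_clean / (g**2 * P_noise)) equals target_snr_db.
    gain = math.sqrt(mean_power(clean) / (noise_power * 10.0 ** (target_snr_db / 10.0)))
    scaled_noise = [gain * v for v in noise]
    mixture = [s + n for s, n in zip(clean, scaled_noise)]
    return mixture, scaled_noise


def measure_snr_db(signal, noise):
    """SNR in dB between a signal and the noise that was added to it."""
    return 10.0 * math.log10(mean_power(signal) / mean_power(noise))


# Toy stand-ins for one clean utterance and one noise excerpt (not real audio).
clean = [math.sin(0.10 * t) for t in range(1600)]
noise = [math.cos(0.37 * t) for t in range(1600)]
noisy, scaled_noise = mix_at_snr(clean, noise, target_snr_db=5.0)
achieved_snr_db = measure_snr_db(clean, scaled_noise)
```
</pre>
<p class="ltx_p">Only the noise is rescaled in this sketch, so the clean signal keeps its original level. Whether the original pipeline scaled the noise, the speech, or both is not stated in the text.</p> <p class="ltx_p">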
Moreover, we investigated the effectiveness of using <span class="ltx_text ltx_font_typewriter" id="S4.SS5.p1.4.7">EnhNoisy-900</span>, which was generated by applying TSE to <span class="ltx_text ltx_font_typewriter" id="S4.SS5.p1.4.8">Noisy-900</span> using the CTT model trained on <span class="ltx_text ltx_font_typewriter" id="S4.SS5.p1.4.9">Clean-100</span>. In this experiment, we used <span class="ltx_text ltx_font_typewriter" id="S4.SS5.p1.4.10">CHiME-B</span> as <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.SS5.p1.1.m1.1"><semantics id="S4.SS5.p1.1.m1.1a"><msup id="S4.SS5.p1.1.m1.1.1" xref="S4.SS5.p1.1.m1.1.1.cmml"><mi id="S4.SS5.p1.1.m1.1.1.2" xref="S4.SS5.p1.1.m1.1.1.2.cmml">𝒏</mi><mi id="S4.SS5.p1.1.m1.1.1.3" xref="S4.SS5.p1.1.m1.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS5.p1.1.m1.1b"><apply id="S4.SS5.p1.1.m1.1.1.cmml" xref="S4.SS5.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S4.SS5.p1.1.m1.1.1.1.cmml" xref="S4.SS5.p1.1.m1.1.1">superscript</csymbol><ci id="S4.SS5.p1.1.m1.1.1.2.cmml" xref="S4.SS5.p1.1.m1.1.1.2">𝒏</ci><ci id="S4.SS5.p1.1.m1.1.1.3.cmml" xref="S4.SS5.p1.1.m1.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS5.p1.1.m1.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.SS5.p1.1.m1.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math>, and we used 50 clean utterances from LibriSpeech for the validation of both CTT and NyTT. 
Note that there was no mismatch between <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS5.p1.2.m2.1"><semantics id="S4.SS5.p1.2.m2.1a"><msup id="S4.SS5.p1.2.m2.1.1" xref="S4.SS5.p1.2.m2.1.1.cmml"><mi id="S4.SS5.p1.2.m2.1.1.2" xref="S4.SS5.p1.2.m2.1.1.2.cmml">𝒏</mi><mi id="S4.SS5.p1.2.m2.1.1.3" xref="S4.SS5.p1.2.m2.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS5.p1.2.m2.1b"><apply id="S4.SS5.p1.2.m2.1.1.cmml" xref="S4.SS5.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S4.SS5.p1.2.m2.1.1.1.cmml" xref="S4.SS5.p1.2.m2.1.1">superscript</csymbol><ci id="S4.SS5.p1.2.m2.1.1.2.cmml" xref="S4.SS5.p1.2.m2.1.1.2">𝒏</ci><ci id="S4.SS5.p1.2.m2.1.1.3.cmml" xref="S4.SS5.p1.2.m2.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS5.p1.2.m2.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.SS5.p1.2.m2.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS5.p1.3.m3.1"><semantics id="S4.SS5.p1.3.m3.1a"><msup id="S4.SS5.p1.3.m3.1.1" xref="S4.SS5.p1.3.m3.1.1.cmml"><mi id="S4.SS5.p1.3.m3.1.1.2" xref="S4.SS5.p1.3.m3.1.1.2.cmml">𝒏</mi><mi id="S4.SS5.p1.3.m3.1.1.3" xref="S4.SS5.p1.3.m3.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS5.p1.3.m3.1b"><apply id="S4.SS5.p1.3.m3.1.1.cmml" xref="S4.SS5.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S4.SS5.p1.3.m3.1.1.1.cmml" xref="S4.SS5.p1.3.m3.1.1">superscript</csymbol><ci id="S4.SS5.p1.3.m3.1.1.2.cmml" xref="S4.SS5.p1.3.m3.1.1.2">𝒏</ci><ci id="S4.SS5.p1.3.m3.1.1.3.cmml" xref="S4.SS5.p1.3.m3.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS5.p1.3.m3.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS5.p1.3.m3.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test 
end_POSTSUPERSCRIPT</annotation></semantics></math>, and <math alttext="\mathrm{SNR}_{\bm{x}}" class="ltx_Math" display="inline" id="S4.SS5.p1.4.m4.1"><semantics id="S4.SS5.p1.4.m4.1a"><msub id="S4.SS5.p1.4.m4.1.1" xref="S4.SS5.p1.4.m4.1.1.cmml"><mi id="S4.SS5.p1.4.m4.1.1.2" xref="S4.SS5.p1.4.m4.1.1.2.cmml">SNR</mi><mi id="S4.SS5.p1.4.m4.1.1.3" xref="S4.SS5.p1.4.m4.1.1.3.cmml">𝒙</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS5.p1.4.m4.1b"><apply id="S4.SS5.p1.4.m4.1.1.cmml" xref="S4.SS5.p1.4.m4.1.1"><csymbol cd="ambiguous" id="S4.SS5.p1.4.m4.1.1.1.cmml" xref="S4.SS5.p1.4.m4.1.1">subscript</csymbol><ci id="S4.SS5.p1.4.m4.1.1.2.cmml" xref="S4.SS5.p1.4.m4.1.1.2">SNR</ci><ci id="S4.SS5.p1.4.m4.1.1.3.cmml" xref="S4.SS5.p1.4.m4.1.1.3">𝒙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS5.p1.4.m4.1c">\mathrm{SNR}_{\bm{x}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS5.p1.4.m4.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT</annotation></semantics></math> was 5 dB, resulting in a challenging condition for NyTT. 
Because the speech datasets differed in size, we trained each DNN until the best epoch had remained unchanged for the last 300 epochs.</p> </div> <div class="ltx_para" id="S4.SS5.p2"> <p class="ltx_p" id="S4.SS5.p2.3">Table <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S4.T5" title="Table 5 ‣ 4.5 Effectiveness of utilizing noisy signals in a situation where clean target signals are available ‣ 4 Experimental analysis in the denoising task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">5</span></a> shows the evaluation results, demonstrating that the combined use of <span class="ltx_text ltx_font_typewriter" id="S4.SS5.p2.3.1">Clean-100</span> and <span class="ltx_text ltx_font_typewriter" id="S4.SS5.p2.3.2">Noisy-900</span> achieves a higher SI-SDR (14.35 dB) than using either dataset alone (13.82 dB and 13.68 dB, respectively). Furthermore, the performance of the combined use of <span class="ltx_text ltx_font_typewriter" id="S4.SS5.p2.3.3">Clean-100</span> and <span class="ltx_text ltx_font_typewriter" id="S4.SS5.p2.3.4">EnhNoisy-900</span> (16.45 dB in SI-SDR) approaches that of the ideal situation (17.13 dB) where both <span class="ltx_text ltx_font_typewriter" id="S4.SS5.p2.3.5">Clean-100</span> and <span class="ltx_text ltx_font_typewriter" id="S4.SS5.p2.3.6">Clean-900</span> are available. 
We can also expect that the performance will improve with the use of noisy targets recorded under better conditions (i.e., higher <math alttext="\mathrm{SNR}_{\bm{x}}" class="ltx_Math" display="inline" id="S4.SS5.p2.1.m1.1"><semantics id="S4.SS5.p2.1.m1.1a"><msub id="S4.SS5.p2.1.m1.1.1" xref="S4.SS5.p2.1.m1.1.1.cmml"><mi id="S4.SS5.p2.1.m1.1.1.2" xref="S4.SS5.p2.1.m1.1.1.2.cmml">SNR</mi><mi id="S4.SS5.p2.1.m1.1.1.3" xref="S4.SS5.p2.1.m1.1.1.3.cmml">𝒙</mi></msub><annotation-xml encoding="MathML-Content" id="S4.SS5.p2.1.m1.1b"><apply id="S4.SS5.p2.1.m1.1.1.cmml" xref="S4.SS5.p2.1.m1.1.1"><csymbol cd="ambiguous" id="S4.SS5.p2.1.m1.1.1.1.cmml" xref="S4.SS5.p2.1.m1.1.1">subscript</csymbol><ci id="S4.SS5.p2.1.m1.1.1.2.cmml" xref="S4.SS5.p2.1.m1.1.1.2">SNR</ci><ci id="S4.SS5.p2.1.m1.1.1.3.cmml" xref="S4.SS5.p2.1.m1.1.1.3">𝒙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS5.p2.1.m1.1c">\mathrm{SNR}_{\bm{x}}</annotation><annotation encoding="application/x-llamapun" id="S4.SS5.p2.1.m1.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT</annotation></semantics></math> or <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.SS5.p2.2.m2.1"><semantics id="S4.SS5.p2.2.m2.1a"><msup id="S4.SS5.p2.2.m2.1.1" xref="S4.SS5.p2.2.m2.1.1.cmml"><mi id="S4.SS5.p2.2.m2.1.1.2" xref="S4.SS5.p2.2.m2.1.1.2.cmml">𝒏</mi><mi id="S4.SS5.p2.2.m2.1.1.3" xref="S4.SS5.p2.2.m2.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS5.p2.2.m2.1b"><apply id="S4.SS5.p2.2.m2.1.1.cmml" xref="S4.SS5.p2.2.m2.1.1"><csymbol cd="ambiguous" id="S4.SS5.p2.2.m2.1.1.1.cmml" xref="S4.SS5.p2.2.m2.1.1">superscript</csymbol><ci id="S4.SS5.p2.2.m2.1.1.2.cmml" xref="S4.SS5.p2.2.m2.1.1.2">𝒏</ci><ci id="S4.SS5.p2.2.m2.1.1.3.cmml" xref="S4.SS5.p2.2.m2.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS5.p2.2.m2.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" 
id="S4.SS5.p2.2.m2.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> mismatched with <math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.SS5.p2.3.m3.1"><semantics id="S4.SS5.p2.3.m3.1a"><msup id="S4.SS5.p2.3.m3.1.1" xref="S4.SS5.p2.3.m3.1.1.cmml"><mi id="S4.SS5.p2.3.m3.1.1.2" xref="S4.SS5.p2.3.m3.1.1.2.cmml">𝒏</mi><mi id="S4.SS5.p2.3.m3.1.1.3" xref="S4.SS5.p2.3.m3.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.SS5.p2.3.m3.1b"><apply id="S4.SS5.p2.3.m3.1.1.cmml" xref="S4.SS5.p2.3.m3.1.1"><csymbol cd="ambiguous" id="S4.SS5.p2.3.m3.1.1.1.cmml" xref="S4.SS5.p2.3.m3.1.1">superscript</csymbol><ci id="S4.SS5.p2.3.m3.1.1.2.cmml" xref="S4.SS5.p2.3.m3.1.1.2">𝒏</ci><ci id="S4.SS5.p2.3.m3.1.1.3.cmml" xref="S4.SS5.p2.3.m3.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.SS5.p2.3.m3.1c">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.SS5.p2.3.m3.1d">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math>). These results indicate that leveraging a large number of noisy signals is beneficial, even when a small number of clean target signals are available.</p> </div> <figure class="ltx_table" id="S4.T5"> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_table">Table 5: </span>Evaluation results of the combined use of the clean and noisy signals. 
<math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S4.T5.5.m1.1"><semantics id="S4.T5.5.m1.1b"><msup id="S4.T5.5.m1.1.1" xref="S4.T5.5.m1.1.1.cmml"><mi id="S4.T5.5.m1.1.1.2" xref="S4.T5.5.m1.1.1.2.cmml">𝒏</mi><mi id="S4.T5.5.m1.1.1.3" xref="S4.T5.5.m1.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S4.T5.5.m1.1c"><apply id="S4.T5.5.m1.1.1.cmml" xref="S4.T5.5.m1.1.1"><csymbol cd="ambiguous" id="S4.T5.5.m1.1.1.1.cmml" xref="S4.T5.5.m1.1.1">superscript</csymbol><ci id="S4.T5.5.m1.1.1.2.cmml" xref="S4.T5.5.m1.1.1.2">𝒏</ci><ci id="S4.T5.5.m1.1.1.3.cmml" xref="S4.T5.5.m1.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.T5.5.m1.1d">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S4.T5.5.m1.1e">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> was <span class="ltx_text ltx_font_typewriter" id="S4.T5.13.1">CHiME-A</span>, <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S4.T5.6.m2.1"><semantics id="S4.T5.6.m2.1b"><msup id="S4.T5.6.m2.1.1" xref="S4.T5.6.m2.1.1.cmml"><mi id="S4.T5.6.m2.1.1.2" xref="S4.T5.6.m2.1.1.2.cmml">𝒏</mi><mi id="S4.T5.6.m2.1.1.3" xref="S4.T5.6.m2.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S4.T5.6.m2.1c"><apply id="S4.T5.6.m2.1.1.cmml" xref="S4.T5.6.m2.1.1"><csymbol cd="ambiguous" id="S4.T5.6.m2.1.1.1.cmml" xref="S4.T5.6.m2.1.1">superscript</csymbol><ci id="S4.T5.6.m2.1.1.2.cmml" xref="S4.T5.6.m2.1.1.2">𝒏</ci><ci id="S4.T5.6.m2.1.1.3.cmml" xref="S4.T5.6.m2.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.T5.6.m2.1d">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S4.T5.6.m2.1e">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> was <span class="ltx_text ltx_font_typewriter" id="S4.T5.14.2">CHiME-B</span>, and <math 
alttext="\mathrm{SNR}_{\bm{x}}" class="ltx_Math" display="inline" id="S4.T5.7.m3.1"><semantics id="S4.T5.7.m3.1b"><msub id="S4.T5.7.m3.1.1" xref="S4.T5.7.m3.1.1.cmml"><mi id="S4.T5.7.m3.1.1.2" xref="S4.T5.7.m3.1.1.2.cmml">SNR</mi><mi id="S4.T5.7.m3.1.1.3" xref="S4.T5.7.m3.1.1.3.cmml">𝒙</mi></msub><annotation-xml encoding="MathML-Content" id="S4.T5.7.m3.1c"><apply id="S4.T5.7.m3.1.1.cmml" xref="S4.T5.7.m3.1.1"><csymbol cd="ambiguous" id="S4.T5.7.m3.1.1.1.cmml" xref="S4.T5.7.m3.1.1">subscript</csymbol><ci id="S4.T5.7.m3.1.1.2.cmml" xref="S4.T5.7.m3.1.1.2">SNR</ci><ci id="S4.T5.7.m3.1.1.3.cmml" xref="S4.T5.7.m3.1.1.3">𝒙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.T5.7.m3.1d">\mathrm{SNR}_{\bm{x}}</annotation><annotation encoding="application/x-llamapun" id="S4.T5.7.m3.1e">roman_SNR start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT</annotation></semantics></math> was 5 dB. SI-SDR, PESQ, and STOI of the unprocessed noisy signals were 10.27 dB, 1.48, and 0.874, respectively. 
<math alttext="\bm{n}^{\rm test}" class="ltx_Math" display="inline" id="S4.T5.8.m4.1"><semantics id="S4.T5.8.m4.1b"><msup id="S4.T5.8.m4.1.1" xref="S4.T5.8.m4.1.1.cmml"><mi id="S4.T5.8.m4.1.1.2" xref="S4.T5.8.m4.1.1.2.cmml">𝒏</mi><mi id="S4.T5.8.m4.1.1.3" xref="S4.T5.8.m4.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S4.T5.8.m4.1c"><apply id="S4.T5.8.m4.1.1.cmml" xref="S4.T5.8.m4.1.1"><csymbol cd="ambiguous" id="S4.T5.8.m4.1.1.1.cmml" xref="S4.T5.8.m4.1.1">superscript</csymbol><ci id="S4.T5.8.m4.1.1.2.cmml" xref="S4.T5.8.m4.1.1.2">𝒏</ci><ci id="S4.T5.8.m4.1.1.3.cmml" xref="S4.T5.8.m4.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S4.T5.8.m4.1d">\bm{n}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S4.T5.8.m4.1e">bold_italic_n start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math> was <span class="ltx_text ltx_font_typewriter" id="S4.T5.15.3">CHiME-C</span>. We assume that <span class="ltx_text ltx_font_typewriter" id="S4.T5.16.4">Clean-900</span> is not available.</figcaption> <div class="ltx_inline-block ltx_align_center ltx_transformed_outer" id="S4.T5.17" style="width:216.6pt;height:81pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(-36.1pt,13.5pt) scale(0.75,0.75) ;"> <table class="ltx_tabular ltx_align_middle" id="S4.T5.17.1"> <tr class="ltx_tr" id="S4.T5.17.1.1"> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" id="S4.T5.17.1.1.1">Training dataset</td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S4.T5.17.1.1.2">SI-SDR</td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S4.T5.17.1.1.3">PESQ</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" id="S4.T5.17.1.1.4">STOI</td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S4.T5.17.1.1.5">Epoch</td> </tr> <tr class="ltx_tr" id="S4.T5.17.1.2"> <td class="ltx_td ltx_align_center ltx_border_r 
ltx_border_t" id="S4.T5.17.1.2.1"><span class="ltx_text ltx_font_typewriter" id="S4.T5.17.1.2.1.1">Clean-100</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T5.17.1.2.2">13.82</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T5.17.1.2.3">2.03</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T5.17.1.2.4">0.910</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T5.17.1.2.5">3,816</td> </tr> <tr class="ltx_tr" id="S4.T5.17.1.3"> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T5.17.1.3.1"><span class="ltx_text ltx_font_typewriter" id="S4.T5.17.1.3.1.1">Noisy-900</span></td> <td class="ltx_td ltx_align_center" id="S4.T5.17.1.3.2">13.68</td> <td class="ltx_td ltx_align_center" id="S4.T5.17.1.3.3">1.88</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T5.17.1.3.4">0.906</td> <td class="ltx_td ltx_align_center" id="S4.T5.17.1.3.5">1,302</td> </tr> <tr class="ltx_tr" id="S4.T5.17.1.4"> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T5.17.1.4.1"> <span class="ltx_text ltx_font_typewriter" id="S4.T5.17.1.4.1.1">Clean-100</span>, <span class="ltx_text ltx_font_typewriter" id="S4.T5.17.1.4.1.2">Noisy-900</span> </td> <td class="ltx_td ltx_align_center" id="S4.T5.17.1.4.2">14.35</td> <td class="ltx_td ltx_align_center" id="S4.T5.17.1.4.3">2.00</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T5.17.1.4.4">0.914</td> <td class="ltx_td ltx_align_center" id="S4.T5.17.1.4.5">2,350</td> </tr> <tr class="ltx_tr" id="S4.T5.17.1.5"> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T5.17.1.5.1"> <span class="ltx_text ltx_font_typewriter" id="S4.T5.17.1.5.1.1">Clean-100</span>, <span class="ltx_text ltx_font_typewriter" id="S4.T5.17.1.5.1.2">EnhNoisy-900</span> </td> <td class="ltx_td ltx_align_center" id="S4.T5.17.1.5.2">16.45</td> <td class="ltx_td ltx_align_center" id="S4.T5.17.1.5.3">2.33</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T5.17.1.5.4">0.935</td> 
<td class="ltx_td ltx_align_center" id="S4.T5.17.1.5.5">11,626</td> </tr> <tr class="ltx_tr" id="S4.T5.17.1.6"> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T5.17.1.6.1"><span class="ltx_text ltx_font_typewriter" id="S4.T5.17.1.6.1.1" style="color:#808080;">Clean-100<span class="ltx_text ltx_font_serif" id="S4.T5.17.1.6.1.1.1" style="color:#808080;">, </span>Clean-900</span></td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T5.17.1.6.2"><span class="ltx_text ltx_font_bold" id="S4.T5.17.1.6.2.1" style="color:#808080;">17.13</span></td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T5.17.1.6.3"><span class="ltx_text ltx_font_bold" id="S4.T5.17.1.6.3.1" style="color:#808080;">2.56</span></td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T5.17.1.6.4"><span class="ltx_text ltx_font_bold" id="S4.T5.17.1.6.4.1" style="color:#808080;">0.942</span></td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T5.17.1.6.5"><span class="ltx_text" id="S4.T5.17.1.6.5.1" style="color:#808080;">11,593</span></td> </tr> </table> </span></div> </figure> </section> </section> <section class="ltx_section" id="S5"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">5 </span>Experimental analysis in the dereverberation task</h2> <section class="ltx_subsection" id="S5.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">5.1 </span>Setups</h3> <div class="ltx_para" id="S5.SS1.p1"> <p class="ltx_p" id="S5.SS1.p1.7">In the experiments, we used RIRs simulated with Pyroomacoustics <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">Scheibler2018pyroomacoustics</span>]</cite>. The room width and depth were randomly selected from 4 to 8 m, the height was randomly selected from 2 to 6 m, and the distance between the microphone and the sound source was set to 1 m. 
The reverberation time <math alttext="\mathrm{RT}_{60}" class="ltx_Math" display="inline" id="S5.SS1.p1.1.m1.1"><semantics id="S5.SS1.p1.1.m1.1a"><msub id="S5.SS1.p1.1.m1.1.1" xref="S5.SS1.p1.1.m1.1.1.cmml"><mi id="S5.SS1.p1.1.m1.1.1.2" xref="S5.SS1.p1.1.m1.1.1.2.cmml">RT</mi><mn id="S5.SS1.p1.1.m1.1.1.3" xref="S5.SS1.p1.1.m1.1.1.3.cmml">60</mn></msub><annotation-xml encoding="MathML-Content" id="S5.SS1.p1.1.m1.1b"><apply id="S5.SS1.p1.1.m1.1.1.cmml" xref="S5.SS1.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S5.SS1.p1.1.m1.1.1.1.cmml" xref="S5.SS1.p1.1.m1.1.1">subscript</csymbol><ci id="S5.SS1.p1.1.m1.1.1.2.cmml" xref="S5.SS1.p1.1.m1.1.1.2">RT</ci><cn id="S5.SS1.p1.1.m1.1.1.3.cmml" type="integer" xref="S5.SS1.p1.1.m1.1.1.3">60</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS1.p1.1.m1.1c">\mathrm{RT}_{60}</annotation><annotation encoding="application/x-llamapun" id="S5.SS1.p1.1.m1.1d">roman_RT start_POSTSUBSCRIPT 60 end_POSTSUBSCRIPT</annotation></semantics></math> ranged from 0.20 to <math alttext="1.10\text{\,}\mathrm{s}" class="ltx_Math" display="inline" id="S5.SS1.p1.2.m2.3"><semantics id="S5.SS1.p1.2.m2.3a"><mrow id="S5.SS1.p1.2.m2.3.3" xref="S5.SS1.p1.2.m2.3.3.cmml"><mn id="S5.SS1.p1.2.m2.1.1.1.1.1.1" xref="S5.SS1.p1.2.m2.1.1.1.1.1.1.cmml">1.10</mn><mtext id="S5.SS1.p1.2.m2.2.2.2.2.2.2" xref="S5.SS1.p1.2.m2.2.2.2.2.2.2.cmml"> </mtext><mi id="S5.SS1.p1.2.m2.3.3.3.3.3.3" mathvariant="normal" xref="S5.SS1.p1.2.m2.3.3.3.3.3.3.cmml">s</mi></mrow><annotation-xml encoding="MathML-Content" id="S5.SS1.p1.2.m2.3b"><apply id="S5.SS1.p1.2.m2.3.3.cmml" xref="S5.SS1.p1.2.m2.3.3"><csymbol cd="latexml" id="S5.SS1.p1.2.m2.2.2.2.2.2.2.cmml" xref="S5.SS1.p1.2.m2.2.2.2.2.2.2">times</csymbol><cn id="S5.SS1.p1.2.m2.1.1.1.1.1.1.cmml" type="float" xref="S5.SS1.p1.2.m2.1.1.1.1.1.1">1.10</cn><ci id="S5.SS1.p1.2.m2.3.3.3.3.3.3.cmml" xref="S5.SS1.p1.2.m2.3.3.3.3.3.3">s</ci></apply></annotation-xml><annotation encoding="application/x-tex" 
id="S5.SS1.p1.2.m2.3c">1.10\text{\,}\mathrm{s}</annotation><annotation encoding="application/x-llamapun" id="S5.SS1.p1.2.m2.3d">start_ARG 1.10 end_ARG start_ARG times end_ARG start_ARG roman_s end_ARG</annotation></semantics></math>. We divided the range into <math alttext="0.05\text{\,}\mathrm{s}" class="ltx_Math" display="inline" id="S5.SS1.p1.3.m3.3"><semantics id="S5.SS1.p1.3.m3.3a"><mrow id="S5.SS1.p1.3.m3.3.3" xref="S5.SS1.p1.3.m3.3.3.cmml"><mn id="S5.SS1.p1.3.m3.1.1.1.1.1.1" xref="S5.SS1.p1.3.m3.1.1.1.1.1.1.cmml">0.05</mn><mtext id="S5.SS1.p1.3.m3.2.2.2.2.2.2" xref="S5.SS1.p1.3.m3.2.2.2.2.2.2.cmml"> </mtext><mi id="S5.SS1.p1.3.m3.3.3.3.3.3.3" mathvariant="normal" xref="S5.SS1.p1.3.m3.3.3.3.3.3.3.cmml">s</mi></mrow><annotation-xml encoding="MathML-Content" id="S5.SS1.p1.3.m3.3b"><apply id="S5.SS1.p1.3.m3.3.3.cmml" xref="S5.SS1.p1.3.m3.3.3"><csymbol cd="latexml" id="S5.SS1.p1.3.m3.2.2.2.2.2.2.cmml" xref="S5.SS1.p1.3.m3.2.2.2.2.2.2">times</csymbol><cn id="S5.SS1.p1.3.m3.1.1.1.1.1.1.cmml" type="float" xref="S5.SS1.p1.3.m3.1.1.1.1.1.1">0.05</cn><ci id="S5.SS1.p1.3.m3.3.3.3.3.3.3.cmml" xref="S5.SS1.p1.3.m3.3.3.3.3.3.3">s</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS1.p1.3.m3.3c">0.05\text{\,}\mathrm{s}</annotation><annotation encoding="application/x-llamapun" id="S5.SS1.p1.3.m3.3d">start_ARG 0.05 end_ARG start_ARG times end_ARG start_ARG roman_s end_ARG</annotation></semantics></math> segments and generated 170 RIRs for each interval. The 170 RIRs in each interval were split into 80, 80, and 10 samples, which were used as <span class="ltx_text ltx_font_typewriter" id="S5.SS1.p1.7.1">RIR-A</span>, <span class="ltx_text ltx_font_typewriter" id="S5.SS1.p1.7.2">RIR-B</span>, and <span class="ltx_text ltx_font_typewriter" id="S5.SS1.p1.7.3">RIR-C</span>, respectively. 
The total numbers of RIRs in <span class="ltx_text ltx_font_typewriter" id="S5.SS1.p1.7.4">RIR-A</span>, <span class="ltx_text ltx_font_typewriter" id="S5.SS1.p1.7.5">RIR-B</span>, and <span class="ltx_text ltx_font_typewriter" id="S5.SS1.p1.7.6">RIR-C</span> were 1,440, 1,440, and 180, respectively. Each of these three datasets covers the same <math alttext="\mathrm{RT}_{60}" class="ltx_Math" display="inline" id="S5.SS1.p1.4.m4.1"><semantics id="S5.SS1.p1.4.m4.1a"><msub id="S5.SS1.p1.4.m4.1.1" xref="S5.SS1.p1.4.m4.1.1.cmml"><mi id="S5.SS1.p1.4.m4.1.1.2" xref="S5.SS1.p1.4.m4.1.1.2.cmml">RT</mi><mn id="S5.SS1.p1.4.m4.1.1.3" xref="S5.SS1.p1.4.m4.1.1.3.cmml">60</mn></msub><annotation-xml encoding="MathML-Content" id="S5.SS1.p1.4.m4.1b"><apply id="S5.SS1.p1.4.m4.1.1.cmml" xref="S5.SS1.p1.4.m4.1.1"><csymbol cd="ambiguous" id="S5.SS1.p1.4.m4.1.1.1.cmml" xref="S5.SS1.p1.4.m4.1.1">subscript</csymbol><ci id="S5.SS1.p1.4.m4.1.1.2.cmml" xref="S5.SS1.p1.4.m4.1.1.2">RT</ci><cn id="S5.SS1.p1.4.m4.1.1.3.cmml" type="integer" xref="S5.SS1.p1.4.m4.1.1.3">60</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS1.p1.4.m4.1c">\mathrm{RT}_{60}</annotation><annotation encoding="application/x-llamapun" id="S5.SS1.p1.4.m4.1d">roman_RT start_POSTSUBSCRIPT 60 end_POSTSUBSCRIPT</annotation></semantics></math> range. 
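</p> <p class="ltx_p">The dataset sizes quoted above follow from simple counting, which the short sketch below reproduces. The number of intervals (18) is not stated explicitly in the text. It is inferred here from the stated range and interval width, and it is consistent with the reported totals. All variable names are illustrative.</p>
<pre class="ltx_verbatim ltx_font_typewriter">
```python
# Values taken from the text: RT60 from 0.20 s to 1.10 s in 0.05 s intervals,
# 170 RIRs per interval, split 80/80/10 into RIR-A/RIR-B/RIR-C.
RT60_MIN_S = 0.20
RT60_MAX_S = 1.10
INTERVAL_S = 0.05
SPLIT_PER_INTERVAL = {"RIR-A": 80, "RIR-B": 80, "RIR-C": 10}

# Inferred, not stated in the text: the number of RT60 intervals.
num_intervals = round((RT60_MAX_S - RT60_MIN_S) / INTERVAL_S)
rirs_per_interval = sum(SPLIT_PER_INTERVAL.values())

# Lower and upper RT60 bound of every interval, rounded to avoid float drift.
intervals = [
    (round(RT60_MIN_S + i * INTERVAL_S, 2), round(RT60_MIN_S + (i + 1) * INTERVAL_S, 2))
    for i in range(num_intervals)
]

# Total size of each RIR dataset across all intervals.
totals = {name: count * num_intervals for name, count in SPLIT_PER_INTERVAL.items()}
```
</pre>
<p class="ltx_p">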
<span class="ltx_text ltx_font_typewriter" id="S5.SS1.p1.7.7">RIR-A</span>, <span class="ltx_text ltx_font_typewriter" id="S5.SS1.p1.7.8">RIR-B</span>, and <span class="ltx_text ltx_font_typewriter" id="S5.SS1.p1.7.9">RIR-C</span> were used as <math alttext="\bm{r}^{\rm obs}" class="ltx_Math" display="inline" id="S5.SS1.p1.5.m5.1"><semantics id="S5.SS1.p1.5.m5.1a"><msup id="S5.SS1.p1.5.m5.1.1" xref="S5.SS1.p1.5.m5.1.1.cmml"><mi id="S5.SS1.p1.5.m5.1.1.2" xref="S5.SS1.p1.5.m5.1.1.2.cmml">𝒓</mi><mi id="S5.SS1.p1.5.m5.1.1.3" xref="S5.SS1.p1.5.m5.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S5.SS1.p1.5.m5.1b"><apply id="S5.SS1.p1.5.m5.1.1.cmml" xref="S5.SS1.p1.5.m5.1.1"><csymbol cd="ambiguous" id="S5.SS1.p1.5.m5.1.1.1.cmml" xref="S5.SS1.p1.5.m5.1.1">superscript</csymbol><ci id="S5.SS1.p1.5.m5.1.1.2.cmml" xref="S5.SS1.p1.5.m5.1.1.2">𝒓</ci><ci id="S5.SS1.p1.5.m5.1.1.3.cmml" xref="S5.SS1.p1.5.m5.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS1.p1.5.m5.1c">\bm{r}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S5.SS1.p1.5.m5.1d">bold_italic_r start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math>, <math alttext="\bm{r}^{\rm add}" class="ltx_Math" display="inline" id="S5.SS1.p1.6.m6.1"><semantics id="S5.SS1.p1.6.m6.1a"><msup id="S5.SS1.p1.6.m6.1.1" xref="S5.SS1.p1.6.m6.1.1.cmml"><mi id="S5.SS1.p1.6.m6.1.1.2" xref="S5.SS1.p1.6.m6.1.1.2.cmml">𝒓</mi><mi id="S5.SS1.p1.6.m6.1.1.3" xref="S5.SS1.p1.6.m6.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S5.SS1.p1.6.m6.1b"><apply id="S5.SS1.p1.6.m6.1.1.cmml" xref="S5.SS1.p1.6.m6.1.1"><csymbol cd="ambiguous" id="S5.SS1.p1.6.m6.1.1.1.cmml" xref="S5.SS1.p1.6.m6.1.1">superscript</csymbol><ci id="S5.SS1.p1.6.m6.1.1.2.cmml" xref="S5.SS1.p1.6.m6.1.1.2">𝒓</ci><ci id="S5.SS1.p1.6.m6.1.1.3.cmml" xref="S5.SS1.p1.6.m6.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" 
id="S5.SS1.p1.6.m6.1c">\bm{r}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S5.SS1.p1.6.m6.1d">bold_italic_r start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math>, and <math alttext="\bm{r}^{\rm test}" class="ltx_Math" display="inline" id="S5.SS1.p1.7.m7.1"><semantics id="S5.SS1.p1.7.m7.1a"><msup id="S5.SS1.p1.7.m7.1.1" xref="S5.SS1.p1.7.m7.1.1.cmml"><mi id="S5.SS1.p1.7.m7.1.1.2" xref="S5.SS1.p1.7.m7.1.1.2.cmml">𝒓</mi><mi id="S5.SS1.p1.7.m7.1.1.3" xref="S5.SS1.p1.7.m7.1.1.3.cmml">test</mi></msup><annotation-xml encoding="MathML-Content" id="S5.SS1.p1.7.m7.1b"><apply id="S5.SS1.p1.7.m7.1.1.cmml" xref="S5.SS1.p1.7.m7.1.1"><csymbol cd="ambiguous" id="S5.SS1.p1.7.m7.1.1.1.cmml" xref="S5.SS1.p1.7.m7.1.1">superscript</csymbol><ci id="S5.SS1.p1.7.m7.1.1.2.cmml" xref="S5.SS1.p1.7.m7.1.1.2">𝒓</ci><ci id="S5.SS1.p1.7.m7.1.1.3.cmml" xref="S5.SS1.p1.7.m7.1.1.3">test</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS1.p1.7.m7.1c">\bm{r}^{\rm test}</annotation><annotation encoding="application/x-llamapun" id="S5.SS1.p1.7.m7.1d">bold_italic_r start_POSTSUPERSCRIPT roman_test end_POSTSUPERSCRIPT</annotation></semantics></math>, respectively. The clean target signals were 10,000 utterances from LibriSpeech and the reverberant target signals were generated by convolving the clean target signals with <span class="ltx_text ltx_font_typewriter" id="S5.SS1.p1.7.10">RIR-A</span>. The test dataset of reverberant signals was generated by convolving 1,000 utterances from LibriSpeech with <span class="ltx_text ltx_font_typewriter" id="S5.SS1.p1.7.11">RIR-C</span>. 
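</p> <p class="ltx_p">Generating a reverberant signal by convolution, as described above, can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' pipeline: the function names <span class="ltx_text ltx_font_typewriter">convolve</span> and <span class="ltx_text ltx_font_typewriter">reverberate</span> are hypothetical, the RIR is a three-tap example rather than a Pyroomacoustics simulation, and truncating the output to the clean length is our assumption. For real audio, an FFT-based routine such as <span class="ltx_text ltx_font_typewriter">scipy.signal.fftconvolve</span> would be far faster than this direct loop.</p>
<pre class="ltx_verbatim ltx_font_typewriter">
```python
def convolve(signal, impulse_response):
    """Direct (time-domain) linear convolution of two sample sequences."""
    output = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            output[i + j] += s * h
    return output


def reverberate(clean, rir):
    """Apply an RIR and keep the first len(clean) samples.

    Truncating to the clean length is an assumption made here so that the
    reverberant and clean signals stay sample-aligned; the source does not
    say how the reverberant tail was handled.
    """
    return convolve(clean, rir)[: len(clean)]


# Toy example: a direct path plus one quarter-amplitude reflection two samples later.
clean = [1.0, 0.0, 0.0, 0.5]
rir = [1.0, 0.0, 0.25]
reverberant = reverberate(clean, rir)
```
</pre>
<p class="ltx_p">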
The sampling frequency was 16 kHz.</p> </div> <div class="ltx_para" id="S5.SS1.p2"> <p class="ltx_p" id="S5.SS1.p2.1">The DNN was Conv-TasNet <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">luo2019conv</span>]</cite> implemented in the Asteroid toolkit <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">pariente2020asteroid</span>]</cite>, and the loss function was SNR. We trained the DNN for 850 epochs with a mini-batch size of 12, using the Adam optimizer with a fixed learning rate of 0.0001. For the validation of both CTT and NyTT, we used 50 clean utterances of LibriSpeech and generated reverberant signals using the same <math alttext="\bm{r}^{\rm add}" class="ltx_Math" display="inline" id="S5.SS1.p2.1.m1.1"><semantics id="S5.SS1.p2.1.m1.1a"><msup id="S5.SS1.p2.1.m1.1.1" xref="S5.SS1.p2.1.m1.1.1.cmml"><mi id="S5.SS1.p2.1.m1.1.1.2" xref="S5.SS1.p2.1.m1.1.1.2.cmml">𝒓</mi><mi id="S5.SS1.p2.1.m1.1.1.3" xref="S5.SS1.p2.1.m1.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S5.SS1.p2.1.m1.1b"><apply id="S5.SS1.p2.1.m1.1.1.cmml" xref="S5.SS1.p2.1.m1.1.1"><csymbol cd="ambiguous" id="S5.SS1.p2.1.m1.1.1.1.cmml" xref="S5.SS1.p2.1.m1.1.1">superscript</csymbol><ci id="S5.SS1.p2.1.m1.1.1.2.cmml" xref="S5.SS1.p2.1.m1.1.1.2">𝒓</ci><ci id="S5.SS1.p2.1.m1.1.1.3.cmml" xref="S5.SS1.p2.1.m1.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS1.p2.1.m1.1c">\bm{r}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S5.SS1.p2.1.m1.1d">bold_italic_r start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math> as in the training. 
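</p> <p class="ltx_p">For reference, an SNR-based loss can be sketched as below. This assumes the common definition in which the loss is the negative SNR, in dB, between a reference waveform and the network output; the exact form used in the experiments is not spelled out in the text, and the names <span class="ltx_text ltx_font_typewriter">snr_db</span> and <span class="ltx_text ltx_font_typewriter">snr_loss</span> are hypothetical. NumPy is used only to show the arithmetic; a training loop would need a differentiable equivalent.</p>
<pre class="ltx_verbatim ltx_font_typewriter">
```python
import numpy as np


def snr_db(reference, estimate, eps=1e-8):
    """SNR in dB of an estimate with respect to a reference waveform."""
    noise = reference - estimate
    return 10.0 * np.log10((np.sum(reference**2) + eps) / (np.sum(noise**2) + eps))


def snr_loss(reference, estimate):
    """Negative SNR, so that minimizing the loss maximizes the SNR."""
    return -snr_db(reference, estimate)


rng = np.random.default_rng(0)
reference = rng.standard_normal(16000)

# Scaling the reference by 0.9 leaves an error of 0.1 * reference,
# i.e. an energy ratio of 100, which is about 20 dB.
estimate = 0.9 * reference
print(snr_db(reference, estimate), snr_loss(reference, estimate))
```
</pre>
<p class="ltx_p">Unlike SI-SDR, this quantity is sensitive to the scale of the estimate, which is why the scaled copy above does not reach an arbitrarily high score.</p> <p class="ltx_p">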
As evaluation metrics, we used the speech-to-reverberation modulation energy ratio (SRMR) <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">falk2010non</span>]</cite> in addition to SI-SDR, PESQ, and STOI.</p> </div> </section> <section class="ltx_subsection" id="S5.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">5.2 </span>Effectiveness of NyTT in the dereverberation task</h3> <div class="ltx_para" id="S5.SS2.p1"> <p class="ltx_p" id="S5.SS2.p1.5">We evaluated NyTT experimentally in the dereverberation task. Table <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S5.T6" title="Table 6 ‣ 5.2 Effectiveness of NyTT in the dereverberation task ‣ 5 Experimental analysis in the dereverberation task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">6</span></a> shows the evaluation results under different <math alttext="\mathrm{RT}_{60}" class="ltx_Math" display="inline" id="S5.SS2.p1.1.m1.1"><semantics id="S5.SS2.p1.1.m1.1a"><msub id="S5.SS2.p1.1.m1.1.1" xref="S5.SS2.p1.1.m1.1.1.cmml"><mi id="S5.SS2.p1.1.m1.1.1.2" xref="S5.SS2.p1.1.m1.1.1.2.cmml">RT</mi><mn id="S5.SS2.p1.1.m1.1.1.3" xref="S5.SS2.p1.1.m1.1.1.3.cmml">60</mn></msub><annotation-xml encoding="MathML-Content" id="S5.SS2.p1.1.m1.1b"><apply id="S5.SS2.p1.1.m1.1.1.cmml" xref="S5.SS2.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S5.SS2.p1.1.m1.1.1.1.cmml" xref="S5.SS2.p1.1.m1.1.1">subscript</csymbol><ci id="S5.SS2.p1.1.m1.1.1.2.cmml" xref="S5.SS2.p1.1.m1.1.1.2">RT</ci><cn id="S5.SS2.p1.1.m1.1.1.3.cmml" type="integer" xref="S5.SS2.p1.1.m1.1.1.3">60</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS2.p1.1.m1.1c">\mathrm{RT}_{60}</annotation><annotation encoding="application/x-llamapun" id="S5.SS2.p1.1.m1.1d">roman_RT start_POSTSUBSCRIPT 60 end_POSTSUBSCRIPT</annotation></semantics></math> conditions of <math alttext="\bm{r}^{\rm obs}" class="ltx_Math" display="inline" id="S5.SS2.p1.2.m2.1"><semantics id="S5.SS2.p1.2.m2.1a"><msup id="S5.SS2.p1.2.m2.1.1" xref="S5.SS2.p1.2.m2.1.1.cmml"><mi id="S5.SS2.p1.2.m2.1.1.2" xref="S5.SS2.p1.2.m2.1.1.2.cmml">𝒓</mi><mi id="S5.SS2.p1.2.m2.1.1.3" xref="S5.SS2.p1.2.m2.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S5.SS2.p1.2.m2.1b"><apply id="S5.SS2.p1.2.m2.1.1.cmml" 
xref="S5.SS2.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S5.SS2.p1.2.m2.1.1.1.cmml" xref="S5.SS2.p1.2.m2.1.1">superscript</csymbol><ci id="S5.SS2.p1.2.m2.1.1.2.cmml" xref="S5.SS2.p1.2.m2.1.1.2">𝒓</ci><ci id="S5.SS2.p1.2.m2.1.1.3.cmml" xref="S5.SS2.p1.2.m2.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS2.p1.2.m2.1c">\bm{r}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S5.SS2.p1.2.m2.1d">bold_italic_r start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math>. First, NyTT achieves higher scores than the unprocessed signals in all cases except PESQ for the longest reverberation-time range (1.57 versus 1.59), demonstrating its capability in this task. This result also shows that NyTT is not Noise2Noise, since the degradation here is not even caused by additive noise. Additionally, the performance improves with shorter <math alttext="\mathrm{RT}_{60}" class="ltx_Math" display="inline" id="S5.SS2.p1.3.m3.1"><semantics id="S5.SS2.p1.3.m3.1a"><msub id="S5.SS2.p1.3.m3.1.1" xref="S5.SS2.p1.3.m3.1.1.cmml"><mi id="S5.SS2.p1.3.m3.1.1.2" xref="S5.SS2.p1.3.m3.1.1.2.cmml">RT</mi><mn id="S5.SS2.p1.3.m3.1.1.3" xref="S5.SS2.p1.3.m3.1.1.3.cmml">60</mn></msub><annotation-xml encoding="MathML-Content" id="S5.SS2.p1.3.m3.1b"><apply id="S5.SS2.p1.3.m3.1.1.cmml" xref="S5.SS2.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S5.SS2.p1.3.m3.1.1.1.cmml" xref="S5.SS2.p1.3.m3.1.1">subscript</csymbol><ci id="S5.SS2.p1.3.m3.1.1.2.cmml" xref="S5.SS2.p1.3.m3.1.1.2">RT</ci><cn id="S5.SS2.p1.3.m3.1.1.3.cmml" type="integer" xref="S5.SS2.p1.3.m3.1.1.3">60</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS2.p1.3.m3.1c">\mathrm{RT}_{60}</annotation><annotation encoding="application/x-llamapun" id="S5.SS2.p1.3.m3.1d">roman_RT start_POSTSUBSCRIPT 60 end_POSTSUBSCRIPT</annotation></semantics></math> values and degrades with longer <math alttext="\mathrm{RT}_{60}" class="ltx_Math" display="inline" id="S5.SS2.p1.4.m4.1"><semantics id="S5.SS2.p1.4.m4.1a"><msub id="S5.SS2.p1.4.m4.1.1" xref="S5.SS2.p1.4.m4.1.1.cmml"><mi id="S5.SS2.p1.4.m4.1.1.2" xref="S5.SS2.p1.4.m4.1.1.2.cmml">RT</mi><mn id="S5.SS2.p1.4.m4.1.1.3" xref="S5.SS2.p1.4.m4.1.1.3.cmml">60</mn></msub><annotation-xml encoding="MathML-Content" id="S5.SS2.p1.4.m4.1b"><apply id="S5.SS2.p1.4.m4.1.1.cmml" xref="S5.SS2.p1.4.m4.1.1"><csymbol cd="ambiguous" id="S5.SS2.p1.4.m4.1.1.1.cmml" xref="S5.SS2.p1.4.m4.1.1">subscript</csymbol><ci id="S5.SS2.p1.4.m4.1.1.2.cmml" xref="S5.SS2.p1.4.m4.1.1.2">RT</ci><cn id="S5.SS2.p1.4.m4.1.1.3.cmml" type="integer" xref="S5.SS2.p1.4.m4.1.1.3">60</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS2.p1.4.m4.1c">\mathrm{RT}_{60}</annotation><annotation encoding="application/x-llamapun" id="S5.SS2.p1.4.m4.1d">roman_RT start_POSTSUBSCRIPT 60 end_POSTSUBSCRIPT</annotation></semantics></math> values. 
This trend is consistent with the results of the denoising task, where target signals of higher quality (higher <math alttext="\mathrm{SNR}_{\bm{x}}" class="ltx_Math" display="inline" id="S5.SS2.p1.5.m5.1"><semantics id="S5.SS2.p1.5.m5.1a"><msub id="S5.SS2.p1.5.m5.1.1" xref="S5.SS2.p1.5.m5.1.1.cmml"><mi id="S5.SS2.p1.5.m5.1.1.2" xref="S5.SS2.p1.5.m5.1.1.2.cmml">SNR</mi><mi id="S5.SS2.p1.5.m5.1.1.3" xref="S5.SS2.p1.5.m5.1.1.3.cmml">𝒙</mi></msub><annotation-xml encoding="MathML-Content" id="S5.SS2.p1.5.m5.1b"><apply id="S5.SS2.p1.5.m5.1.1.cmml" xref="S5.SS2.p1.5.m5.1.1"><csymbol cd="ambiguous" id="S5.SS2.p1.5.m5.1.1.1.cmml" xref="S5.SS2.p1.5.m5.1.1">subscript</csymbol><ci id="S5.SS2.p1.5.m5.1.1.2.cmml" xref="S5.SS2.p1.5.m5.1.1.2">SNR</ci><ci id="S5.SS2.p1.5.m5.1.1.3.cmml" xref="S5.SS2.p1.5.m5.1.1.3">𝒙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS2.p1.5.m5.1c">\mathrm{SNR}_{\bm{x}}</annotation><annotation encoding="application/x-llamapun" id="S5.SS2.p1.5.m5.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT</annotation></semantics></math>) lead to better performance.</p> </div> <figure class="ltx_table" id="S5.T6"> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_table">Table 6: </span>Evaluation results of CTT and NyTT in the dereverberation task.</figcaption> <div class="ltx_inline-block ltx_align_center ltx_transformed_outer" id="S5.T6.5.5" style="width:236.7pt;height:81pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(-39.4pt,13.5pt) scale(0.75,0.75) ;"> <table class="ltx_tabular ltx_align_middle" id="S5.T6.5.5.5"> <tr class="ltx_tr" id="S5.T6.2.2.2.2"> <td class="ltx_td ltx_align_center ltx_border_tt" id="S5.T6.2.2.2.2.3">Method</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" id="S5.T6.2.2.2.2.2"> <math alttext="\mathrm{RT}_{60}" class="ltx_Math" display="inline" id="S5.T6.1.1.1.1.1.m1.1"><semantics id="S5.T6.1.1.1.1.1.m1.1a"><msub id="S5.T6.1.1.1.1.1.m1.1.1" 
xref="S5.T6.1.1.1.1.1.m1.1.1.cmml"><mi id="S5.T6.1.1.1.1.1.m1.1.1.2" xref="S5.T6.1.1.1.1.1.m1.1.1.2.cmml">RT</mi><mn id="S5.T6.1.1.1.1.1.m1.1.1.3" xref="S5.T6.1.1.1.1.1.m1.1.1.3.cmml">60</mn></msub><annotation-xml encoding="MathML-Content" id="S5.T6.1.1.1.1.1.m1.1b"><apply id="S5.T6.1.1.1.1.1.m1.1.1.cmml" xref="S5.T6.1.1.1.1.1.m1.1.1"><csymbol cd="ambiguous" id="S5.T6.1.1.1.1.1.m1.1.1.1.cmml" xref="S5.T6.1.1.1.1.1.m1.1.1">subscript</csymbol><ci id="S5.T6.1.1.1.1.1.m1.1.1.2.cmml" xref="S5.T6.1.1.1.1.1.m1.1.1.2">RT</ci><cn id="S5.T6.1.1.1.1.1.m1.1.1.3.cmml" type="integer" xref="S5.T6.1.1.1.1.1.m1.1.1.3">60</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.T6.1.1.1.1.1.m1.1c">\mathrm{RT}_{60}</annotation><annotation encoding="application/x-llamapun" id="S5.T6.1.1.1.1.1.m1.1d">roman_RT start_POSTSUBSCRIPT 60 end_POSTSUBSCRIPT</annotation></semantics></math> of <math alttext="\bm{r}^{\rm obs}" class="ltx_Math" display="inline" id="S5.T6.2.2.2.2.2.m2.1"><semantics id="S5.T6.2.2.2.2.2.m2.1a"><msup id="S5.T6.2.2.2.2.2.m2.1.1" xref="S5.T6.2.2.2.2.2.m2.1.1.cmml"><mi id="S5.T6.2.2.2.2.2.m2.1.1.2" xref="S5.T6.2.2.2.2.2.m2.1.1.2.cmml">𝒓</mi><mi id="S5.T6.2.2.2.2.2.m2.1.1.3" xref="S5.T6.2.2.2.2.2.m2.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S5.T6.2.2.2.2.2.m2.1b"><apply id="S5.T6.2.2.2.2.2.m2.1.1.cmml" xref="S5.T6.2.2.2.2.2.m2.1.1"><csymbol cd="ambiguous" id="S5.T6.2.2.2.2.2.m2.1.1.1.cmml" xref="S5.T6.2.2.2.2.2.m2.1.1">superscript</csymbol><ci id="S5.T6.2.2.2.2.2.m2.1.1.2.cmml" xref="S5.T6.2.2.2.2.2.m2.1.1.2">𝒓</ci><ci id="S5.T6.2.2.2.2.2.m2.1.1.3.cmml" xref="S5.T6.2.2.2.2.2.m2.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.T6.2.2.2.2.2.m2.1c">\bm{r}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S5.T6.2.2.2.2.2.m2.1d">bold_italic_r start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> [sec]</td> <td class="ltx_td 
ltx_align_center ltx_border_tt" id="S5.T6.2.2.2.2.4">SI-SDR</td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S5.T6.2.2.2.2.5">PESQ</td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S5.T6.2.2.2.2.6">STOI</td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S5.T6.2.2.2.2.7">SRMR</td> </tr> <tr class="ltx_tr" id="S5.T6.5.5.5.6"> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T6.5.5.5.6.1">Unprocessed</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S5.T6.5.5.5.6.2">-</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T6.5.5.5.6.3">-5.32</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T6.5.5.5.6.4">1.59</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T6.5.5.5.6.5">0.834</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S5.T6.5.5.5.6.6">4.84</td> </tr> <tr class="ltx_tr" id="S5.T6.5.5.5.7"> <td class="ltx_td ltx_align_center" id="S5.T6.5.5.5.7.1">CTT</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S5.T6.5.5.5.7.2">0.0</td> <td class="ltx_td ltx_align_center" id="S5.T6.5.5.5.7.3"><span class="ltx_text ltx_font_bold" id="S5.T6.5.5.5.7.3.1">3.83</span></td> <td class="ltx_td ltx_align_center" id="S5.T6.5.5.5.7.4"><span class="ltx_text ltx_font_bold" id="S5.T6.5.5.5.7.4.1">2.23</span></td> <td class="ltx_td ltx_align_center" id="S5.T6.5.5.5.7.5"><span class="ltx_text ltx_font_bold" id="S5.T6.5.5.5.7.5.1">0.918</span></td> <td class="ltx_td ltx_align_center" id="S5.T6.5.5.5.7.6"><span class="ltx_text ltx_font_bold" id="S5.T6.5.5.5.7.6.1">8.81</span></td> </tr> <tr class="ltx_tr" id="S5.T6.3.3.3.3"> <td class="ltx_td ltx_align_center" id="S5.T6.3.3.3.3.2">NyTT</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S5.T6.3.3.3.3.1"><math alttext="[0.20,0.50)" class="ltx_Math" display="inline" id="S5.T6.3.3.3.3.1.m1.2"><semantics id="S5.T6.3.3.3.3.1.m1.2a"><mrow id="S5.T6.3.3.3.3.1.m1.2.3.2" xref="S5.T6.3.3.3.3.1.m1.2.3.1.cmml"><mo id="S5.T6.3.3.3.3.1.m1.2.3.2.1" 
stretchy="false" xref="S5.T6.3.3.3.3.1.m1.2.3.1.cmml">[</mo><mn id="S5.T6.3.3.3.3.1.m1.1.1" xref="S5.T6.3.3.3.3.1.m1.1.1.cmml">0.20</mn><mo id="S5.T6.3.3.3.3.1.m1.2.3.2.2" xref="S5.T6.3.3.3.3.1.m1.2.3.1.cmml">,</mo><mn id="S5.T6.3.3.3.3.1.m1.2.2" xref="S5.T6.3.3.3.3.1.m1.2.2.cmml">0.50</mn><mo id="S5.T6.3.3.3.3.1.m1.2.3.2.3" stretchy="false" xref="S5.T6.3.3.3.3.1.m1.2.3.1.cmml">)</mo></mrow><annotation-xml encoding="MathML-Content" id="S5.T6.3.3.3.3.1.m1.2b"><interval closure="closed-open" id="S5.T6.3.3.3.3.1.m1.2.3.1.cmml" xref="S5.T6.3.3.3.3.1.m1.2.3.2"><cn id="S5.T6.3.3.3.3.1.m1.1.1.cmml" type="float" xref="S5.T6.3.3.3.3.1.m1.1.1">0.20</cn><cn id="S5.T6.3.3.3.3.1.m1.2.2.cmml" type="float" xref="S5.T6.3.3.3.3.1.m1.2.2">0.50</cn></interval></annotation-xml><annotation encoding="application/x-tex" id="S5.T6.3.3.3.3.1.m1.2c">[0.20,0.50)</annotation><annotation encoding="application/x-llamapun" id="S5.T6.3.3.3.3.1.m1.2d">[ 0.20 , 0.50 )</annotation></semantics></math></td> <td class="ltx_td ltx_align_center" id="S5.T6.3.3.3.3.3">1.69</td> <td class="ltx_td ltx_align_center" id="S5.T6.3.3.3.3.4">1.95</td> <td class="ltx_td ltx_align_center" id="S5.T6.3.3.3.3.5">0.902</td> <td class="ltx_td ltx_align_center" id="S5.T6.3.3.3.3.6">7.59</td> </tr> <tr class="ltx_tr" id="S5.T6.4.4.4.4"> <td class="ltx_td ltx_align_center" id="S5.T6.4.4.4.4.2">NyTT</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S5.T6.4.4.4.4.1"><math alttext="[0.50,0.80)" class="ltx_Math" display="inline" id="S5.T6.4.4.4.4.1.m1.2"><semantics id="S5.T6.4.4.4.4.1.m1.2a"><mrow id="S5.T6.4.4.4.4.1.m1.2.3.2" xref="S5.T6.4.4.4.4.1.m1.2.3.1.cmml"><mo id="S5.T6.4.4.4.4.1.m1.2.3.2.1" stretchy="false" xref="S5.T6.4.4.4.4.1.m1.2.3.1.cmml">[</mo><mn id="S5.T6.4.4.4.4.1.m1.1.1" xref="S5.T6.4.4.4.4.1.m1.1.1.cmml">0.50</mn><mo id="S5.T6.4.4.4.4.1.m1.2.3.2.2" xref="S5.T6.4.4.4.4.1.m1.2.3.1.cmml">,</mo><mn id="S5.T6.4.4.4.4.1.m1.2.2" xref="S5.T6.4.4.4.4.1.m1.2.2.cmml">0.80</mn><mo id="S5.T6.4.4.4.4.1.m1.2.3.2.3" 
stretchy="false" xref="S5.T6.4.4.4.4.1.m1.2.3.1.cmml">)</mo></mrow><annotation-xml encoding="MathML-Content" id="S5.T6.4.4.4.4.1.m1.2b"><interval closure="closed-open" id="S5.T6.4.4.4.4.1.m1.2.3.1.cmml" xref="S5.T6.4.4.4.4.1.m1.2.3.2"><cn id="S5.T6.4.4.4.4.1.m1.1.1.cmml" type="float" xref="S5.T6.4.4.4.4.1.m1.1.1">0.50</cn><cn id="S5.T6.4.4.4.4.1.m1.2.2.cmml" type="float" xref="S5.T6.4.4.4.4.1.m1.2.2">0.80</cn></interval></annotation-xml><annotation encoding="application/x-tex" id="S5.T6.4.4.4.4.1.m1.2c">[0.50,0.80)</annotation><annotation encoding="application/x-llamapun" id="S5.T6.4.4.4.4.1.m1.2d">[ 0.50 , 0.80 )</annotation></semantics></math></td> <td class="ltx_td ltx_align_center" id="S5.T6.4.4.4.4.3">0.51</td> <td class="ltx_td ltx_align_center" id="S5.T6.4.4.4.4.4">1.74</td> <td class="ltx_td ltx_align_center" id="S5.T6.4.4.4.4.5">0.882</td> <td class="ltx_td ltx_align_center" id="S5.T6.4.4.4.4.6">6.01</td> </tr> <tr class="ltx_tr" id="S5.T6.5.5.5.5"> <td class="ltx_td ltx_align_center ltx_border_bb" id="S5.T6.5.5.5.5.2">NyTT</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S5.T6.5.5.5.5.1"><math alttext="[0.80,1.10)" class="ltx_Math" display="inline" id="S5.T6.5.5.5.5.1.m1.2"><semantics id="S5.T6.5.5.5.5.1.m1.2a"><mrow id="S5.T6.5.5.5.5.1.m1.2.3.2" xref="S5.T6.5.5.5.5.1.m1.2.3.1.cmml"><mo id="S5.T6.5.5.5.5.1.m1.2.3.2.1" stretchy="false" xref="S5.T6.5.5.5.5.1.m1.2.3.1.cmml">[</mo><mn id="S5.T6.5.5.5.5.1.m1.1.1" xref="S5.T6.5.5.5.5.1.m1.1.1.cmml">0.80</mn><mo id="S5.T6.5.5.5.5.1.m1.2.3.2.2" xref="S5.T6.5.5.5.5.1.m1.2.3.1.cmml">,</mo><mn id="S5.T6.5.5.5.5.1.m1.2.2" xref="S5.T6.5.5.5.5.1.m1.2.2.cmml">1.10</mn><mo id="S5.T6.5.5.5.5.1.m1.2.3.2.3" stretchy="false" xref="S5.T6.5.5.5.5.1.m1.2.3.1.cmml">)</mo></mrow><annotation-xml encoding="MathML-Content" id="S5.T6.5.5.5.5.1.m1.2b"><interval closure="closed-open" id="S5.T6.5.5.5.5.1.m1.2.3.1.cmml" xref="S5.T6.5.5.5.5.1.m1.2.3.2"><cn id="S5.T6.5.5.5.5.1.m1.1.1.cmml" type="float" 
xref="S5.T6.5.5.5.5.1.m1.1.1">0.80</cn><cn id="S5.T6.5.5.5.5.1.m1.2.2.cmml" type="float" xref="S5.T6.5.5.5.5.1.m1.2.2">1.10</cn></interval></annotation-xml><annotation encoding="application/x-tex" id="S5.T6.5.5.5.5.1.m1.2c">[0.80,1.10)</annotation><annotation encoding="application/x-llamapun" id="S5.T6.5.5.5.5.1.m1.2d">[ 0.80 , 1.10 )</annotation></semantics></math></td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S5.T6.5.5.5.5.3">-1.39</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S5.T6.5.5.5.5.4">1.57</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S5.T6.5.5.5.5.5">0.840</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S5.T6.5.5.5.5.6">4.85</td> </tr> </table> </span></div> </figure> </section> <section class="ltx_subsection" id="S5.SS3"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">5.3 </span>Effectiveness of IterNyTT in the dereverberation task</h3> <div class="ltx_para" id="S5.SS3.p1"> <p class="ltx_p" id="S5.SS3.p1.11">To verify the effectiveness of IterNyTT in the dereverberation task, we evaluated the performance over five iterations under different <math alttext="\mathrm{RT}_{60}" class="ltx_Math" display="inline" id="S5.SS3.p1.1.m1.1"><semantics id="S5.SS3.p1.1.m1.1a"><msub id="S5.SS3.p1.1.m1.1.1" xref="S5.SS3.p1.1.m1.1.1.cmml"><mi id="S5.SS3.p1.1.m1.1.1.2" xref="S5.SS3.p1.1.m1.1.1.2.cmml">RT</mi><mn id="S5.SS3.p1.1.m1.1.1.3" xref="S5.SS3.p1.1.m1.1.1.3.cmml">60</mn></msub><annotation-xml encoding="MathML-Content" id="S5.SS3.p1.1.m1.1b"><apply id="S5.SS3.p1.1.m1.1.1.cmml" xref="S5.SS3.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S5.SS3.p1.1.m1.1.1.1.cmml" xref="S5.SS3.p1.1.m1.1.1">subscript</csymbol><ci id="S5.SS3.p1.1.m1.1.1.2.cmml" xref="S5.SS3.p1.1.m1.1.1.2">RT</ci><cn id="S5.SS3.p1.1.m1.1.1.3.cmml" type="integer" xref="S5.SS3.p1.1.m1.1.1.3">60</cn></apply></annotation-xml><annotation encoding="application/x-tex" 
id="S5.SS3.p1.1.m1.1c">\mathrm{RT}_{60}</annotation><annotation encoding="application/x-llamapun" id="S5.SS3.p1.1.m1.1d">roman_RT start_POSTSUBSCRIPT 60 end_POSTSUBSCRIPT</annotation></semantics></math> conditions of <math alttext="\bm{r}^{\rm obs}" class="ltx_Math" display="inline" id="S5.SS3.p1.2.m2.1"><semantics id="S5.SS3.p1.2.m2.1a"><msup id="S5.SS3.p1.2.m2.1.1" xref="S5.SS3.p1.2.m2.1.1.cmml"><mi id="S5.SS3.p1.2.m2.1.1.2" xref="S5.SS3.p1.2.m2.1.1.2.cmml">𝒓</mi><mi id="S5.SS3.p1.2.m2.1.1.3" xref="S5.SS3.p1.2.m2.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S5.SS3.p1.2.m2.1b"><apply id="S5.SS3.p1.2.m2.1.1.cmml" xref="S5.SS3.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S5.SS3.p1.2.m2.1.1.1.cmml" xref="S5.SS3.p1.2.m2.1.1">superscript</csymbol><ci id="S5.SS3.p1.2.m2.1.1.2.cmml" xref="S5.SS3.p1.2.m2.1.1.2">𝒓</ci><ci id="S5.SS3.p1.2.m2.1.1.3.cmml" xref="S5.SS3.p1.2.m2.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS3.p1.2.m2.1c">\bm{r}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S5.SS3.p1.2.m2.1d">bold_italic_r start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math>. Figure <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S5.F9" title="Figure 9 ‣ 5.3 Effectiveness of IterNyTT in the dereverberation task ‣ 5 Experimental analysis in the dereverberation task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">9</span></a> illustrates the SI-SDR of the reverberant targets, along with the SI-SDR, PESQ, STOI, and SRMR of the processed results for the test dataset at each iteration of IterNyTT. From this figure, we observe that the performance of IterNyTT tends to improve over the iterations. 
When <math alttext="\mathrm{RT}_{60}" class="ltx_Math" display="inline" id="S5.SS3.p1.3.m3.1"><semantics id="S5.SS3.p1.3.m3.1a"><msub id="S5.SS3.p1.3.m3.1.1" xref="S5.SS3.p1.3.m3.1.1.cmml"><mi id="S5.SS3.p1.3.m3.1.1.2" xref="S5.SS3.p1.3.m3.1.1.2.cmml">RT</mi><mn id="S5.SS3.p1.3.m3.1.1.3" xref="S5.SS3.p1.3.m3.1.1.3.cmml">60</mn></msub><annotation-xml encoding="MathML-Content" id="S5.SS3.p1.3.m3.1b"><apply id="S5.SS3.p1.3.m3.1.1.cmml" xref="S5.SS3.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S5.SS3.p1.3.m3.1.1.1.cmml" xref="S5.SS3.p1.3.m3.1.1">subscript</csymbol><ci id="S5.SS3.p1.3.m3.1.1.2.cmml" xref="S5.SS3.p1.3.m3.1.1.2">RT</ci><cn id="S5.SS3.p1.3.m3.1.1.3.cmml" type="integer" xref="S5.SS3.p1.3.m3.1.1.3">60</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS3.p1.3.m3.1c">\mathrm{RT}_{60}</annotation><annotation encoding="application/x-llamapun" id="S5.SS3.p1.3.m3.1d">roman_RT start_POSTSUBSCRIPT 60 end_POSTSUBSCRIPT</annotation></semantics></math> of <math alttext="\bm{r}^{\rm obs}" class="ltx_Math" display="inline" id="S5.SS3.p1.4.m4.1"><semantics id="S5.SS3.p1.4.m4.1a"><msup id="S5.SS3.p1.4.m4.1.1" xref="S5.SS3.p1.4.m4.1.1.cmml"><mi id="S5.SS3.p1.4.m4.1.1.2" xref="S5.SS3.p1.4.m4.1.1.2.cmml">𝒓</mi><mi id="S5.SS3.p1.4.m4.1.1.3" xref="S5.SS3.p1.4.m4.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S5.SS3.p1.4.m4.1b"><apply id="S5.SS3.p1.4.m4.1.1.cmml" xref="S5.SS3.p1.4.m4.1.1"><csymbol cd="ambiguous" id="S5.SS3.p1.4.m4.1.1.1.cmml" xref="S5.SS3.p1.4.m4.1.1">superscript</csymbol><ci id="S5.SS3.p1.4.m4.1.1.2.cmml" xref="S5.SS3.p1.4.m4.1.1.2">𝒓</ci><ci id="S5.SS3.p1.4.m4.1.1.3.cmml" xref="S5.SS3.p1.4.m4.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS3.p1.4.m4.1c">\bm{r}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S5.SS3.p1.4.m4.1d">bold_italic_r start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> is <math 
alttext="[0.8,1.1)" class="ltx_Math" display="inline" id="S5.SS3.p1.5.m5.2"><semantics id="S5.SS3.p1.5.m5.2a"><mrow id="S5.SS3.p1.5.m5.2.3.2" xref="S5.SS3.p1.5.m5.2.3.1.cmml"><mo id="S5.SS3.p1.5.m5.2.3.2.1" stretchy="false" xref="S5.SS3.p1.5.m5.2.3.1.cmml">[</mo><mn id="S5.SS3.p1.5.m5.1.1" xref="S5.SS3.p1.5.m5.1.1.cmml">0.8</mn><mo id="S5.SS3.p1.5.m5.2.3.2.2" xref="S5.SS3.p1.5.m5.2.3.1.cmml">,</mo><mn id="S5.SS3.p1.5.m5.2.2" xref="S5.SS3.p1.5.m5.2.2.cmml">1.1</mn><mo id="S5.SS3.p1.5.m5.2.3.2.3" stretchy="false" xref="S5.SS3.p1.5.m5.2.3.1.cmml">)</mo></mrow><annotation-xml encoding="MathML-Content" id="S5.SS3.p1.5.m5.2b"><interval closure="closed-open" id="S5.SS3.p1.5.m5.2.3.1.cmml" xref="S5.SS3.p1.5.m5.2.3.2"><cn id="S5.SS3.p1.5.m5.1.1.cmml" type="float" xref="S5.SS3.p1.5.m5.1.1">0.8</cn><cn id="S5.SS3.p1.5.m5.2.2.cmml" type="float" xref="S5.SS3.p1.5.m5.2.2">1.1</cn></interval></annotation-xml><annotation encoding="application/x-tex" id="S5.SS3.p1.5.m5.2c">[0.8,1.1)</annotation><annotation encoding="application/x-llamapun" id="S5.SS3.p1.5.m5.2d">[ 0.8 , 1.1 )</annotation></semantics></math>, IterNyTT does not perform well in the first iteration, and its performance does not improve in the subsequent iterations. 
We can also see that IterNyTT works stably when the <math alttext="\mathrm{RT}_{60}" class="ltx_Math" display="inline" id="S5.SS3.p1.6.m6.1"><semantics id="S5.SS3.p1.6.m6.1a"><msub id="S5.SS3.p1.6.m6.1.1" xref="S5.SS3.p1.6.m6.1.1.cmml"><mi id="S5.SS3.p1.6.m6.1.1.2" xref="S5.SS3.p1.6.m6.1.1.2.cmml">RT</mi><mn id="S5.SS3.p1.6.m6.1.1.3" xref="S5.SS3.p1.6.m6.1.1.3.cmml">60</mn></msub><annotation-xml encoding="MathML-Content" id="S5.SS3.p1.6.m6.1b"><apply id="S5.SS3.p1.6.m6.1.1.cmml" xref="S5.SS3.p1.6.m6.1.1"><csymbol cd="ambiguous" id="S5.SS3.p1.6.m6.1.1.1.cmml" xref="S5.SS3.p1.6.m6.1.1">subscript</csymbol><ci id="S5.SS3.p1.6.m6.1.1.2.cmml" xref="S5.SS3.p1.6.m6.1.1.2">RT</ci><cn id="S5.SS3.p1.6.m6.1.1.3.cmml" type="integer" xref="S5.SS3.p1.6.m6.1.1.3">60</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS3.p1.6.m6.1c">\mathrm{RT}_{60}</annotation><annotation encoding="application/x-llamapun" id="S5.SS3.p1.6.m6.1d">roman_RT start_POSTSUBSCRIPT 60 end_POSTSUBSCRIPT</annotation></semantics></math> of <math alttext="\bm{r}^{\rm obs}" class="ltx_Math" display="inline" id="S5.SS3.p1.7.m7.1"><semantics id="S5.SS3.p1.7.m7.1a"><msup id="S5.SS3.p1.7.m7.1.1" xref="S5.SS3.p1.7.m7.1.1.cmml"><mi id="S5.SS3.p1.7.m7.1.1.2" xref="S5.SS3.p1.7.m7.1.1.2.cmml">𝒓</mi><mi id="S5.SS3.p1.7.m7.1.1.3" xref="S5.SS3.p1.7.m7.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S5.SS3.p1.7.m7.1b"><apply id="S5.SS3.p1.7.m7.1.1.cmml" xref="S5.SS3.p1.7.m7.1.1"><csymbol cd="ambiguous" id="S5.SS3.p1.7.m7.1.1.1.cmml" xref="S5.SS3.p1.7.m7.1.1">superscript</csymbol><ci id="S5.SS3.p1.7.m7.1.1.2.cmml" xref="S5.SS3.p1.7.m7.1.1.2">𝒓</ci><ci id="S5.SS3.p1.7.m7.1.1.3.cmml" xref="S5.SS3.p1.7.m7.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS3.p1.7.m7.1c">\bm{r}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S5.SS3.p1.7.m7.1d">bold_italic_r start_POSTSUPERSCRIPT roman_obs 
end_POSTSUPERSCRIPT</annotation></semantics></math> is <math alttext="[0.5,0.8)" class="ltx_Math" display="inline" id="S5.SS3.p1.8.m8.2"><semantics id="S5.SS3.p1.8.m8.2a"><mrow id="S5.SS3.p1.8.m8.2.3.2" xref="S5.SS3.p1.8.m8.2.3.1.cmml"><mo id="S5.SS3.p1.8.m8.2.3.2.1" stretchy="false" xref="S5.SS3.p1.8.m8.2.3.1.cmml">[</mo><mn id="S5.SS3.p1.8.m8.1.1" xref="S5.SS3.p1.8.m8.1.1.cmml">0.5</mn><mo id="S5.SS3.p1.8.m8.2.3.2.2" xref="S5.SS3.p1.8.m8.2.3.1.cmml">,</mo><mn id="S5.SS3.p1.8.m8.2.2" xref="S5.SS3.p1.8.m8.2.2.cmml">0.8</mn><mo id="S5.SS3.p1.8.m8.2.3.2.3" stretchy="false" xref="S5.SS3.p1.8.m8.2.3.1.cmml">)</mo></mrow><annotation-xml encoding="MathML-Content" id="S5.SS3.p1.8.m8.2b"><interval closure="closed-open" id="S5.SS3.p1.8.m8.2.3.1.cmml" xref="S5.SS3.p1.8.m8.2.3.2"><cn id="S5.SS3.p1.8.m8.1.1.cmml" type="float" xref="S5.SS3.p1.8.m8.1.1">0.5</cn><cn id="S5.SS3.p1.8.m8.2.2.cmml" type="float" xref="S5.SS3.p1.8.m8.2.2">0.8</cn></interval></annotation-xml><annotation encoding="application/x-tex" id="S5.SS3.p1.8.m8.2c">[0.5,0.8)</annotation><annotation encoding="application/x-llamapun" id="S5.SS3.p1.8.m8.2d">[ 0.5 , 0.8 )</annotation></semantics></math>, whereas it becomes unstable when the <math alttext="\mathrm{RT}_{60}" class="ltx_Math" display="inline" id="S5.SS3.p1.9.m9.1"><semantics id="S5.SS3.p1.9.m9.1a"><msub id="S5.SS3.p1.9.m9.1.1" xref="S5.SS3.p1.9.m9.1.1.cmml"><mi id="S5.SS3.p1.9.m9.1.1.2" xref="S5.SS3.p1.9.m9.1.1.2.cmml">RT</mi><mn id="S5.SS3.p1.9.m9.1.1.3" xref="S5.SS3.p1.9.m9.1.1.3.cmml">60</mn></msub><annotation-xml encoding="MathML-Content" id="S5.SS3.p1.9.m9.1b"><apply id="S5.SS3.p1.9.m9.1.1.cmml" xref="S5.SS3.p1.9.m9.1.1"><csymbol cd="ambiguous" id="S5.SS3.p1.9.m9.1.1.1.cmml" xref="S5.SS3.p1.9.m9.1.1">subscript</csymbol><ci id="S5.SS3.p1.9.m9.1.1.2.cmml" xref="S5.SS3.p1.9.m9.1.1.2">RT</ci><cn id="S5.SS3.p1.9.m9.1.1.3.cmml" type="integer" xref="S5.SS3.p1.9.m9.1.1.3">60</cn></apply></annotation-xml><annotation encoding="application/x-tex" 
id="S5.SS3.p1.9.m9.1c">\mathrm{RT}_{60}</annotation><annotation encoding="application/x-llamapun" id="S5.SS3.p1.9.m9.1d">roman_RT start_POSTSUBSCRIPT 60 end_POSTSUBSCRIPT</annotation></semantics></math> of <math alttext="\bm{r}^{\rm obs}" class="ltx_Math" display="inline" id="S5.SS3.p1.10.m10.1"><semantics id="S5.SS3.p1.10.m10.1a"><msup id="S5.SS3.p1.10.m10.1.1" xref="S5.SS3.p1.10.m10.1.1.cmml"><mi id="S5.SS3.p1.10.m10.1.1.2" xref="S5.SS3.p1.10.m10.1.1.2.cmml">𝒓</mi><mi id="S5.SS3.p1.10.m10.1.1.3" xref="S5.SS3.p1.10.m10.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S5.SS3.p1.10.m10.1b"><apply id="S5.SS3.p1.10.m10.1.1.cmml" xref="S5.SS3.p1.10.m10.1.1"><csymbol cd="ambiguous" id="S5.SS3.p1.10.m10.1.1.1.cmml" xref="S5.SS3.p1.10.m10.1.1">superscript</csymbol><ci id="S5.SS3.p1.10.m10.1.1.2.cmml" xref="S5.SS3.p1.10.m10.1.1.2">𝒓</ci><ci id="S5.SS3.p1.10.m10.1.1.3.cmml" xref="S5.SS3.p1.10.m10.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.SS3.p1.10.m10.1c">\bm{r}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S5.SS3.p1.10.m10.1d">bold_italic_r start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> is <math alttext="[0.2,0.5)" class="ltx_Math" display="inline" id="S5.SS3.p1.11.m11.2"><semantics id="S5.SS3.p1.11.m11.2a"><mrow id="S5.SS3.p1.11.m11.2.3.2" xref="S5.SS3.p1.11.m11.2.3.1.cmml"><mo id="S5.SS3.p1.11.m11.2.3.2.1" stretchy="false" xref="S5.SS3.p1.11.m11.2.3.1.cmml">[</mo><mn id="S5.SS3.p1.11.m11.1.1" xref="S5.SS3.p1.11.m11.1.1.cmml">0.2</mn><mo id="S5.SS3.p1.11.m11.2.3.2.2" xref="S5.SS3.p1.11.m11.2.3.1.cmml">,</mo><mn id="S5.SS3.p1.11.m11.2.2" xref="S5.SS3.p1.11.m11.2.2.cmml">0.5</mn><mo id="S5.SS3.p1.11.m11.2.3.2.3" stretchy="false" xref="S5.SS3.p1.11.m11.2.3.1.cmml">)</mo></mrow><annotation-xml encoding="MathML-Content" id="S5.SS3.p1.11.m11.2b"><interval closure="closed-open" id="S5.SS3.p1.11.m11.2.3.1.cmml" xref="S5.SS3.p1.11.m11.2.3.2"><cn 
id="S5.SS3.p1.11.m11.1.1.cmml" type="float" xref="S5.SS3.p1.11.m11.1.1">0.2</cn><cn id="S5.SS3.p1.11.m11.2.2.cmml" type="float" xref="S5.SS3.p1.11.m11.2.2">0.5</cn></interval></annotation-xml><annotation encoding="application/x-tex" id="S5.SS3.p1.11.m11.2c">[0.2,0.5)</annotation><annotation encoding="application/x-llamapun" id="S5.SS3.p1.11.m11.2d">[ 0.2 , 0.5 )</annotation></semantics></math>. Although there are cases where IterNyTT does not work well, especially when the target signals are of very low quality, we can conclude that IterNyTT is generally effective even in the dereverberation task.</p> </div> <figure class="ltx_figure" id="S5.F9"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="230" id="S5.F9.g1" src="x8.png" width="813"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure">Figure 9: </span>Changes in SI-SDR of the target signals and evaluation results on the test dataset through IterNyTT under different <math alttext="\mathrm{RT}_{60}" class="ltx_Math" display="inline" id="S5.F9.3.m1.1"><semantics id="S5.F9.3.m1.1b"><msub id="S5.F9.3.m1.1.1" xref="S5.F9.3.m1.1.1.cmml"><mi id="S5.F9.3.m1.1.1.2" xref="S5.F9.3.m1.1.1.2.cmml">RT</mi><mn id="S5.F9.3.m1.1.1.3" xref="S5.F9.3.m1.1.1.3.cmml">60</mn></msub><annotation-xml encoding="MathML-Content" id="S5.F9.3.m1.1c"><apply id="S5.F9.3.m1.1.1.cmml" xref="S5.F9.3.m1.1.1"><csymbol cd="ambiguous" id="S5.F9.3.m1.1.1.1.cmml" xref="S5.F9.3.m1.1.1">subscript</csymbol><ci id="S5.F9.3.m1.1.1.2.cmml" xref="S5.F9.3.m1.1.1.2">RT</ci><cn id="S5.F9.3.m1.1.1.3.cmml" type="integer" xref="S5.F9.3.m1.1.1.3">60</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.F9.3.m1.1d">\mathrm{RT}_{60}</annotation><annotation encoding="application/x-llamapun" id="S5.F9.3.m1.1e">roman_RT start_POSTSUBSCRIPT 60 end_POSTSUBSCRIPT</annotation></semantics></math> conditions of <math alttext="\bm{r}^{\rm obs}" class="ltx_Math" display="inline" 
id="S5.F9.4.m2.1"><semantics id="S5.F9.4.m2.1b"><msup id="S5.F9.4.m2.1.1" xref="S5.F9.4.m2.1.1.cmml"><mi id="S5.F9.4.m2.1.1.2" xref="S5.F9.4.m2.1.1.2.cmml">𝒓</mi><mi id="S5.F9.4.m2.1.1.3" xref="S5.F9.4.m2.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S5.F9.4.m2.1c"><apply id="S5.F9.4.m2.1.1.cmml" xref="S5.F9.4.m2.1.1"><csymbol cd="ambiguous" id="S5.F9.4.m2.1.1.1.cmml" xref="S5.F9.4.m2.1.1">superscript</csymbol><ci id="S5.F9.4.m2.1.1.2.cmml" xref="S5.F9.4.m2.1.1.2">𝒓</ci><ci id="S5.F9.4.m2.1.1.3.cmml" xref="S5.F9.4.m2.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S5.F9.4.m2.1d">\bm{r}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S5.F9.4.m2.1e">bold_italic_r start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math>. The first iteration of IterNyTT is equivalent to the original NyTT. Values in parentheses indicate the evaluation results of unprocessed input signals.</figcaption> </figure> </section> </section> <section class="ltx_section" id="S6"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">6 </span>Experimental analysis in the declipping task</h2> <section class="ltx_subsection" id="S6.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">6.1 </span>Setups</h3> <div class="ltx_para" id="S6.SS1.p1"> <p class="ltx_p" id="S6.SS1.p1.3">In the experiments, the clean target signals were 10,000 utterances of LibriSpeech, and we generated the clipped target signals <math alttext="\bm{x}" class="ltx_Math" display="inline" id="S6.SS1.p1.1.m1.1"><semantics id="S6.SS1.p1.1.m1.1a"><mi id="S6.SS1.p1.1.m1.1.1" xref="S6.SS1.p1.1.m1.1.1.cmml">𝒙</mi><annotation-xml encoding="MathML-Content" id="S6.SS1.p1.1.m1.1b"><ci id="S6.SS1.p1.1.m1.1.1.cmml" xref="S6.SS1.p1.1.m1.1.1">𝒙</ci></annotation-xml><annotation encoding="application/x-tex" id="S6.SS1.p1.1.m1.1c">\bm{x}</annotation><annotation 
encoding="application/x-llamapun" id="S6.SS1.p1.1.m1.1d">bold_italic_x</annotation></semantics></math> by clipping them with an <math alttext="\mathrm{SNR}_{\bm{x}}" class="ltx_Math" display="inline" id="S6.SS1.p1.2.m2.1"><semantics id="S6.SS1.p1.2.m2.1a"><msub id="S6.SS1.p1.2.m2.1.1" xref="S6.SS1.p1.2.m2.1.1.cmml"><mi id="S6.SS1.p1.2.m2.1.1.2" xref="S6.SS1.p1.2.m2.1.1.2.cmml">SNR</mi><mi id="S6.SS1.p1.2.m2.1.1.3" xref="S6.SS1.p1.2.m2.1.1.3.cmml">𝒙</mi></msub><annotation-xml encoding="MathML-Content" id="S6.SS1.p1.2.m2.1b"><apply id="S6.SS1.p1.2.m2.1.1.cmml" xref="S6.SS1.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S6.SS1.p1.2.m2.1.1.1.cmml" xref="S6.SS1.p1.2.m2.1.1">subscript</csymbol><ci id="S6.SS1.p1.2.m2.1.1.2.cmml" xref="S6.SS1.p1.2.m2.1.1.2">SNR</ci><ci id="S6.SS1.p1.2.m2.1.1.3.cmml" xref="S6.SS1.p1.2.m2.1.1.3">𝒙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S6.SS1.p1.2.m2.1c">\mathrm{SNR}_{\bm{x}}</annotation><annotation encoding="application/x-llamapun" id="S6.SS1.p1.2.m2.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT</annotation></semantics></math> of 3, 7, or 15 dB. The clipped signals of the test dataset were generated by clipping 1,000 utterances of LibriSpeech with the SNR randomly selected from 1, 3, 7, and 15 dB. 
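</p> </div> <div class="ltx_para"> <p class="ltx_p">As a rough illustration of clipping at a prescribed SNR, the following Python sketch searches for a hard-clipping threshold by bisection. It assumes that the SNR is the energy ratio between the clean signal and the clipping distortion; this definition, the bisection search, and the function names <span class="ltx_text ltx_font_typewriter">clipping_snr_db</span> and <span class="ltx_text ltx_font_typewriter">clip_to_snr</span> are illustrative assumptions rather than a description of the actual implementation.</p>
<pre class="ltx_verbatim ltx_font_typewriter">
```python
import numpy as np


def clipping_snr_db(clean, threshold):
    """SNR in dB between a clean signal and its hard-clipped version.

    Assumption: SNR = 10 log10(clean energy / clipping-distortion energy).
    """
    clipped = np.clip(clean, -threshold, threshold)
    distortion = clean - clipped
    signal_power = np.sum(clean ** 2)
    distortion_power = np.sum(distortion ** 2) + 1e-12  # avoid division by zero
    return 10.0 * np.log10(signal_power / distortion_power)


def clip_to_snr(clean, target_snr_db, num_steps=60):
    """Hard-clip `clean` so that the clipping SNR is close to `target_snr_db`.

    The SNR grows monotonically with the threshold, so a bisection over
    the interval [0, max|clean|] is sufficient for positive target SNRs.
    """
    low, high = 0.0, float(np.max(np.abs(clean)))
    for _ in range(num_steps):
        mid = 0.5 * (low + high)
        if clipping_snr_db(clean, mid) > target_snr_db:
            high = mid  # too little clipping: lower the threshold
        else:
            low = mid   # too much clipping: raise the threshold
    threshold = 0.5 * (low + high)
    return np.clip(clean, -threshold, threshold), threshold
```
</pre>
<p class="ltx_p">For example, <span class="ltx_text ltx_font_typewriter">clip_to_snr(clean, 7.0)</span> would return a clipped copy of a waveform together with the threshold that yields an SNR of approximately 7 dB under the assumed definition.</p> </div> <div class="ltx_para"> <p class="ltx_p">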
During training, for both CTT and NyTT, the clipping threshold was determined from an <math alttext="\mathrm{SNR}_{\bm{y}}" class="ltx_Math" display="inline" id="S6.SS1.p1.3.m3.1"><semantics id="S6.SS1.p1.3.m3.1a"><msub id="S6.SS1.p1.3.m3.1.1" xref="S6.SS1.p1.3.m3.1.1.cmml"><mi id="S6.SS1.p1.3.m3.1.1.2" xref="S6.SS1.p1.3.m3.1.1.2.cmml">SNR</mi><mi id="S6.SS1.p1.3.m3.1.1.3" xref="S6.SS1.p1.3.m3.1.1.3.cmml">𝒚</mi></msub><annotation-xml encoding="MathML-Content" id="S6.SS1.p1.3.m3.1b"><apply id="S6.SS1.p1.3.m3.1.1.cmml" xref="S6.SS1.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S6.SS1.p1.3.m3.1.1.1.cmml" xref="S6.SS1.p1.3.m3.1.1">subscript</csymbol><ci id="S6.SS1.p1.3.m3.1.1.2.cmml" xref="S6.SS1.p1.3.m3.1.1.2">SNR</ci><ci id="S6.SS1.p1.3.m3.1.1.3.cmml" xref="S6.SS1.p1.3.m3.1.1.3">𝒚</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S6.SS1.p1.3.m3.1c">\mathrm{SNR}_{\bm{y}}</annotation><annotation encoding="application/x-llamapun" id="S6.SS1.p1.3.m3.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_y end_POSTSUBSCRIPT</annotation></semantics></math> randomly selected from 1 to 9 dB.</p> </div> <div class="ltx_para" id="S6.SS1.p2"> <p class="ltx_p" id="S6.SS1.p2.1">The DNN was a causal Demucs <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">yi2024ddd</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">defossez20_interspeech</span>]</cite>, and the loss function was a weighted sum of the L1 waveform and multi-resolution STFT losses, as in <cite class="ltx_cite ltx_citemacro_cite">[<span class="ltx_ref ltx_missing_citation ltx_ref_self">kwon2024speech</span>, <span class="ltx_ref ltx_missing_citation ltx_ref_self">yi2024ddd</span>]</cite>. The two losses were weighted by 10 and 0.1, respectively. We trained the DNN for 400 epochs with a mini-batch size of 12, using the Adam optimizer with a fixed learning rate of 0.0001. 
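</p> </div> <div class="ltx_para"> <p class="ltx_p">The weighted loss described above can be sketched in NumPy as follows. Only the weights of 10 and 0.1 come from the setup; the STFT resolutions, the Hann window, and the composition of the multi-resolution STFT term (spectral convergence plus log-magnitude L1) are assumptions based on common practice, and all function names are hypothetical. Actual training would require a differentiable framework, so this sketch only illustrates how the terms are combined.</p>
<pre class="ltx_verbatim ltx_font_typewriter">
```python
import numpy as np


def stft_magnitude(signal, fft_size, hop_size):
    """Magnitude spectrogram with a Hann window (frames x frequency bins)."""
    window = np.hanning(fft_size)
    num_frames = 1 + (len(signal) - fft_size) // hop_size
    frames = np.stack([
        signal[i * hop_size:i * hop_size + fft_size] * window
        for i in range(num_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1))


def multi_resolution_stft_loss(estimate, target,
                               resolutions=((512, 128), (1024, 256), (2048, 512))):
    """Mean over resolutions of spectral convergence + log-magnitude L1.

    Assumption: the (fft_size, hop_size) pairs are illustrative defaults.
    """
    eps = 1e-8
    total = 0.0
    for fft_size, hop_size in resolutions:
        est_mag = stft_magnitude(estimate, fft_size, hop_size)
        tgt_mag = stft_magnitude(target, fft_size, hop_size)
        spectral_convergence = (np.linalg.norm(tgt_mag - est_mag)
                                / (np.linalg.norm(tgt_mag) + eps))
        log_magnitude = np.mean(
            np.abs(np.log(tgt_mag + eps) - np.log(est_mag + eps)))
        total += spectral_convergence + log_magnitude
    return total / len(resolutions)


def training_loss(estimate, target, l1_weight=10.0, stft_weight=0.1):
    """Weighted sum of the L1 waveform loss and the multi-resolution STFT loss."""
    l1_waveform = np.mean(np.abs(estimate - target))
    stft_term = multi_resolution_stft_loss(estimate, target)
    return l1_weight * l1_waveform + stft_weight * stft_term
```
</pre>
<p class="ltx_p">In this sketch, both waveforms are one-dimensional arrays of equal length that must be at least as long as the largest FFT size.</p> </div> <div class="ltx_para"> <p class="ltx_p">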
For the validation, we used 50 clean utterances of LibriSpeech and generated clipped signals with the SNR randomly selected from 1, 3, 7, and 15 dB. As the metrics, we used SI-SDR, PESQ, and STOI.</p> </div> <figure class="ltx_table" id="S6.T7"> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_table">Table 7: </span>Evaluation results of CTT and NyTT in the declipping task.</figcaption> <div class="ltx_inline-block ltx_align_center ltx_transformed_outer" id="S6.T7.2.2" style="width:187.9pt;height:81pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(-31.3pt,13.5pt) scale(0.75,0.75) ;"> <table class="ltx_tabular ltx_align_middle" id="S6.T7.2.2.2"> <tr class="ltx_tr" id="S6.T7.1.1.1.1"> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" id="S6.T7.1.1.1.1.2">Method</td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S6.T7.1.1.1.1.1"> <math alttext="\mathrm{SNR}_{\bm{x}}" class="ltx_Math" display="inline" id="S6.T7.1.1.1.1.1.m1.1"><semantics id="S6.T7.1.1.1.1.1.m1.1a"><msub id="S6.T7.1.1.1.1.1.m1.1.1" xref="S6.T7.1.1.1.1.1.m1.1.1.cmml"><mi id="S6.T7.1.1.1.1.1.m1.1.1.2" xref="S6.T7.1.1.1.1.1.m1.1.1.2.cmml">SNR</mi><mi id="S6.T7.1.1.1.1.1.m1.1.1.3" xref="S6.T7.1.1.1.1.1.m1.1.1.3.cmml">𝒙</mi></msub><annotation-xml encoding="MathML-Content" id="S6.T7.1.1.1.1.1.m1.1b"><apply id="S6.T7.1.1.1.1.1.m1.1.1.cmml" xref="S6.T7.1.1.1.1.1.m1.1.1"><csymbol cd="ambiguous" id="S6.T7.1.1.1.1.1.m1.1.1.1.cmml" xref="S6.T7.1.1.1.1.1.m1.1.1">subscript</csymbol><ci id="S6.T7.1.1.1.1.1.m1.1.1.2.cmml" xref="S6.T7.1.1.1.1.1.m1.1.1.2">SNR</ci><ci id="S6.T7.1.1.1.1.1.m1.1.1.3.cmml" xref="S6.T7.1.1.1.1.1.m1.1.1.3">𝒙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S6.T7.1.1.1.1.1.m1.1c">\mathrm{SNR}_{\bm{x}}</annotation><annotation encoding="application/x-llamapun" id="S6.T7.1.1.1.1.1.m1.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT</annotation></semantics></math> [dB]</td> <td 
class="ltx_td ltx_align_center ltx_border_tt" id="S6.T7.1.1.1.1.3">SI-SDR</td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S6.T7.1.1.1.1.4">PESQ</td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S6.T7.1.1.1.1.5">STOI</td> </tr> <tr class="ltx_tr" id="S6.T7.2.2.2.3"> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S6.T7.2.2.2.3.1">Unprocessed</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S6.T7.2.2.2.3.2">-</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S6.T7.2.2.2.3.3">6.41</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S6.T7.2.2.2.3.4">1.89</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S6.T7.2.2.2.3.5">0.866</td> </tr> <tr class="ltx_tr" id="S6.T7.2.2.2.2"> <td class="ltx_td ltx_align_center ltx_border_r" id="S6.T7.2.2.2.2.2">CTT</td> <td class="ltx_td ltx_align_center" id="S6.T7.2.2.2.2.1"><math alttext="\infty" class="ltx_Math" display="inline" id="S6.T7.2.2.2.2.1.m1.1"><semantics id="S6.T7.2.2.2.2.1.m1.1a"><mi id="S6.T7.2.2.2.2.1.m1.1.1" mathvariant="normal" xref="S6.T7.2.2.2.2.1.m1.1.1.cmml">∞</mi><annotation-xml encoding="MathML-Content" id="S6.T7.2.2.2.2.1.m1.1b"><infinity id="S6.T7.2.2.2.2.1.m1.1.1.cmml" xref="S6.T7.2.2.2.2.1.m1.1.1"></infinity></annotation-xml><annotation encoding="application/x-tex" id="S6.T7.2.2.2.2.1.m1.1c">\infty</annotation><annotation encoding="application/x-llamapun" id="S6.T7.2.2.2.2.1.m1.1d">∞</annotation></semantics></math></td> <td class="ltx_td ltx_align_center" id="S6.T7.2.2.2.2.3"><span class="ltx_text ltx_font_bold" id="S6.T7.2.2.2.2.3.1">16.59</span></td> <td class="ltx_td ltx_align_center" id="S6.T7.2.2.2.2.4"><span class="ltx_text ltx_font_bold" id="S6.T7.2.2.2.2.4.1">3.53</span></td> <td class="ltx_td ltx_align_center" id="S6.T7.2.2.2.2.5"><span class="ltx_text ltx_font_bold" id="S6.T7.2.2.2.2.5.1">0.965</span></td> </tr> <tr class="ltx_tr" id="S6.T7.2.2.2.4"> <td class="ltx_td ltx_align_center ltx_border_r" id="S6.T7.2.2.2.4.1">NyTT</td> 
<td class="ltx_td ltx_align_center" id="S6.T7.2.2.2.4.2">15</td> <td class="ltx_td ltx_align_center" id="S6.T7.2.2.2.4.3">15.27</td> <td class="ltx_td ltx_align_center" id="S6.T7.2.2.2.4.4">3.21</td> <td class="ltx_td ltx_align_center" id="S6.T7.2.2.2.4.5">0.959</td> </tr> <tr class="ltx_tr" id="S6.T7.2.2.2.5"> <td class="ltx_td ltx_align_center ltx_border_r" id="S6.T7.2.2.2.5.1">NyTT</td> <td class="ltx_td ltx_align_center" id="S6.T7.2.2.2.5.2">7</td> <td class="ltx_td ltx_align_center" id="S6.T7.2.2.2.5.3">12.44</td> <td class="ltx_td ltx_align_center" id="S6.T7.2.2.2.5.4">2.65</td> <td class="ltx_td ltx_align_center" id="S6.T7.2.2.2.5.5">0.941</td> </tr> <tr class="ltx_tr" id="S6.T7.2.2.2.6"> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S6.T7.2.2.2.6.1">NyTT</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S6.T7.2.2.2.6.2">3</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S6.T7.2.2.2.6.3">10.09</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S6.T7.2.2.2.6.4">2.28</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S6.T7.2.2.2.6.5">0.915</td> </tr> </table> </span></div> </figure> </section> <section class="ltx_subsection" id="S6.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">6.2 </span>Effectiveness of NyTT in the declipping task</h3> <div class="ltx_para" id="S6.SS2.p1"> <p class="ltx_p" id="S6.SS2.p1.1">We conducted experimental evaluations of NyTT in the declipping task. 
Table <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S6.T7" title="Table 7 ‣ 6.1 Setups ‣ 6 Experimental analysis in the declipping task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">7</span></a> shows the evaluation results under different <math alttext="\mathrm{SNR}_{\bm{x}}" class="ltx_Math" display="inline" id="S6.SS2.p1.1.m1.1"><semantics id="S6.SS2.p1.1.m1.1a"><msub id="S6.SS2.p1.1.m1.1.1" xref="S6.SS2.p1.1.m1.1.1.cmml"><mi id="S6.SS2.p1.1.m1.1.1.2" xref="S6.SS2.p1.1.m1.1.1.2.cmml">SNR</mi><mi id="S6.SS2.p1.1.m1.1.1.3" xref="S6.SS2.p1.1.m1.1.1.3.cmml">𝒙</mi></msub><annotation-xml encoding="MathML-Content" id="S6.SS2.p1.1.m1.1b"><apply id="S6.SS2.p1.1.m1.1.1.cmml" xref="S6.SS2.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S6.SS2.p1.1.m1.1.1.1.cmml" xref="S6.SS2.p1.1.m1.1.1">subscript</csymbol><ci id="S6.SS2.p1.1.m1.1.1.2.cmml" xref="S6.SS2.p1.1.m1.1.1.2">SNR</ci><ci id="S6.SS2.p1.1.m1.1.1.3.cmml" xref="S6.SS2.p1.1.m1.1.1.3">𝒙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S6.SS2.p1.1.m1.1c">\mathrm{SNR}_{\bm{x}}</annotation><annotation encoding="application/x-llamapun" id="S6.SS2.p1.1.m1.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT</annotation></semantics></math> conditions. 
As in the denoising and dereverberation tasks, NyTT is effective in the declipping task: every NyTT model improves on the unprocessed signals (SI-SDR of 6.41 dB), reaching 10.09, 12.44, and 15.27 dB when the target signals are clipped at 3, 7, and 15 dB, respectively. We can also see that NyTT works without satisfying the Noise2Noise conditions and that its performance improves with higher target signal quality, although it remains below that of CTT (16.59 dB).</p> </div> <figure class="ltx_figure" id="S6.F10"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="282" id="S6.F10.g1" src="x9.png" width="814"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure">Figure 10: </span> Changes in SI-SDR of the target signals and evaluation results on the test dataset through IterNyTT under different <math alttext="\mathrm{SNR}_{\bm{x}}" class="ltx_Math" display="inline" id="S6.F10.2.m1.1"><semantics id="S6.F10.2.m1.1b"><msub id="S6.F10.2.m1.1.1" xref="S6.F10.2.m1.1.1.cmml"><mi id="S6.F10.2.m1.1.1.2" xref="S6.F10.2.m1.1.1.2.cmml">SNR</mi><mi id="S6.F10.2.m1.1.1.3" xref="S6.F10.2.m1.1.1.3.cmml">𝒙</mi></msub><annotation-xml encoding="MathML-Content" id="S6.F10.2.m1.1c"><apply id="S6.F10.2.m1.1.1.cmml" xref="S6.F10.2.m1.1.1"><csymbol cd="ambiguous" id="S6.F10.2.m1.1.1.1.cmml" xref="S6.F10.2.m1.1.1">subscript</csymbol><ci id="S6.F10.2.m1.1.1.2.cmml" xref="S6.F10.2.m1.1.1.2">SNR</ci><ci id="S6.F10.2.m1.1.1.3.cmml" xref="S6.F10.2.m1.1.1.3">𝒙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S6.F10.2.m1.1d">\mathrm{SNR}_{\bm{x}}</annotation><annotation encoding="application/x-llamapun" id="S6.F10.2.m1.1e">roman_SNR start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT</annotation></semantics></math> conditions. The first iteration of IterNyTT is equivalent to the original NyTT. 
Values in parentheses indicate the evaluation results of unprocessed input signals.</figcaption> </figure> </section> <section class="ltx_subsection" id="S6.SS3"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">6.3 </span>Effectiveness of IterNyTT in the declipping task</h3> <div class="ltx_para" id="S6.SS3.p1"> <p class="ltx_p" id="S6.SS3.p1.2">To verify the effectiveness of IterNyTT in the declipping task, we evaluated the performance over five iterations under different <math alttext="\mathrm{SNR}_{\bm{x}}" class="ltx_Math" display="inline" id="S6.SS3.p1.1.m1.1"><semantics id="S6.SS3.p1.1.m1.1a"><msub id="S6.SS3.p1.1.m1.1.1" xref="S6.SS3.p1.1.m1.1.1.cmml"><mi id="S6.SS3.p1.1.m1.1.1.2" xref="S6.SS3.p1.1.m1.1.1.2.cmml">SNR</mi><mi id="S6.SS3.p1.1.m1.1.1.3" xref="S6.SS3.p1.1.m1.1.1.3.cmml">𝒙</mi></msub><annotation-xml encoding="MathML-Content" id="S6.SS3.p1.1.m1.1b"><apply id="S6.SS3.p1.1.m1.1.1.cmml" xref="S6.SS3.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S6.SS3.p1.1.m1.1.1.1.cmml" xref="S6.SS3.p1.1.m1.1.1">subscript</csymbol><ci id="S6.SS3.p1.1.m1.1.1.2.cmml" xref="S6.SS3.p1.1.m1.1.1.2">SNR</ci><ci id="S6.SS3.p1.1.m1.1.1.3.cmml" xref="S6.SS3.p1.1.m1.1.1.3">𝒙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S6.SS3.p1.1.m1.1c">\mathrm{SNR}_{\bm{x}}</annotation><annotation encoding="application/x-llamapun" id="S6.SS3.p1.1.m1.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT</annotation></semantics></math> conditions. 
Figure <a class="ltx_ref" href="https://arxiv.org/html/2503.14854v1#S6.F10" title="Figure 10 ‣ 6.2 Effectiveness of NyTT in the declipping task ‣ 6 Experimental analysis in the declipping task ‣ Analysis and Extension of Noisy-target Training for Unsupervised Target Signal Enhancement"><span class="ltx_text ltx_ref_tag">10</span></a> illustrates the SI-SDR of the clipped targets, along with the SI-SDR, PESQ, and STOI of the processed results for the test dataset at each iteration of IterNyTT. Here, we can again observe a consistent trend: IterNyTT improves performance, although its effectiveness is affected by the quality of the target signals. Specifically, when <math alttext="\mathrm{SNR}_{\bm{x}}" class="ltx_Math" display="inline" id="S6.SS3.p1.2.m2.1"><semantics id="S6.SS3.p1.2.m2.1a"><msub id="S6.SS3.p1.2.m2.1.1" xref="S6.SS3.p1.2.m2.1.1.cmml"><mi id="S6.SS3.p1.2.m2.1.1.2" xref="S6.SS3.p1.2.m2.1.1.2.cmml">SNR</mi><mi id="S6.SS3.p1.2.m2.1.1.3" xref="S6.SS3.p1.2.m2.1.1.3.cmml">𝒙</mi></msub><annotation-xml encoding="MathML-Content" id="S6.SS3.p1.2.m2.1b"><apply id="S6.SS3.p1.2.m2.1.1.cmml" xref="S6.SS3.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S6.SS3.p1.2.m2.1.1.1.cmml" xref="S6.SS3.p1.2.m2.1.1">subscript</csymbol><ci id="S6.SS3.p1.2.m2.1.1.2.cmml" xref="S6.SS3.p1.2.m2.1.1.2">SNR</ci><ci id="S6.SS3.p1.2.m2.1.1.3.cmml" xref="S6.SS3.p1.2.m2.1.1.3">𝒙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S6.SS3.p1.2.m2.1c">\mathrm{SNR}_{\bm{x}}</annotation><annotation encoding="application/x-llamapun" id="S6.SS3.p1.2.m2.1d">roman_SNR start_POSTSUBSCRIPT bold_italic_x end_POSTSUBSCRIPT</annotation></semantics></math> is 15 dB, IterNyTT achieves performance comparable to that of CTT.</p> </div> </section> </section> <section class="ltx_section" id="S7"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">7 </span>Conclusion</h2> <div class="ltx_para" id="S7.p1"> <p class="ltx_p" id="S7.p1.3">In this study, we conducted 
comprehensive experimental analyses of NyTT to elucidate its detailed properties. Our experiments revealed the following key findings: 1) NyTT can be interpreted as a training method to estimate the noisy target <math alttext="\bm{x}" class="ltx_Math" display="inline" id="S7.p1.1.m1.1"><semantics id="S7.p1.1.m1.1a"><mi id="S7.p1.1.m1.1.1" xref="S7.p1.1.m1.1.1.cmml">𝒙</mi><annotation-xml encoding="MathML-Content" id="S7.p1.1.m1.1b"><ci id="S7.p1.1.m1.1.1.cmml" xref="S7.p1.1.m1.1.1">𝒙</ci></annotation-xml><annotation encoding="application/x-tex" id="S7.p1.1.m1.1c">\bm{x}</annotation><annotation encoding="application/x-llamapun" id="S7.p1.1.m1.1d">bold_italic_x</annotation></semantics></math> by removing <math alttext="\bm{n}^{\rm add}" class="ltx_Math" display="inline" id="S7.p1.2.m2.1"><semantics id="S7.p1.2.m2.1a"><msup id="S7.p1.2.m2.1.1" xref="S7.p1.2.m2.1.1.cmml"><mi id="S7.p1.2.m2.1.1.2" xref="S7.p1.2.m2.1.1.2.cmml">𝒏</mi><mi id="S7.p1.2.m2.1.1.3" xref="S7.p1.2.m2.1.1.3.cmml">add</mi></msup><annotation-xml encoding="MathML-Content" id="S7.p1.2.m2.1b"><apply id="S7.p1.2.m2.1.1.cmml" xref="S7.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S7.p1.2.m2.1.1.1.cmml" xref="S7.p1.2.m2.1.1">superscript</csymbol><ci id="S7.p1.2.m2.1.1.2.cmml" xref="S7.p1.2.m2.1.1.2">𝒏</ci><ci id="S7.p1.2.m2.1.1.3.cmml" xref="S7.p1.2.m2.1.1.3">add</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.p1.2.m2.1c">\bm{n}^{\rm add}</annotation><annotation encoding="application/x-llamapun" id="S7.p1.2.m2.1d">bold_italic_n start_POSTSUPERSCRIPT roman_add end_POSTSUPERSCRIPT</annotation></semantics></math>, rather than strictly adhering to the Noise2Noise framework. 
This indicates that the Noise2Noise conditions (i.e., the zero-mean distribution assumption for <math alttext="\bm{n}^{\rm obs}" class="ltx_Math" display="inline" id="S7.p1.3.m3.1"><semantics id="S7.p1.3.m3.1a"><msup id="S7.p1.3.m3.1.1" xref="S7.p1.3.m3.1.1.cmml"><mi id="S7.p1.3.m3.1.1.2" xref="S7.p1.3.m3.1.1.2.cmml">𝒏</mi><mi id="S7.p1.3.m3.1.1.3" xref="S7.p1.3.m3.1.1.3.cmml">obs</mi></msup><annotation-xml encoding="MathML-Content" id="S7.p1.3.m3.1b"><apply id="S7.p1.3.m3.1.1.cmml" xref="S7.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S7.p1.3.m3.1.1.1.cmml" xref="S7.p1.3.m3.1.1">superscript</csymbol><ci id="S7.p1.3.m3.1.1.2.cmml" xref="S7.p1.3.m3.1.1.2">𝒏</ci><ci id="S7.p1.3.m3.1.1.3.cmml" xref="S7.p1.3.m3.1.1.3">obs</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.p1.3.m3.1c">\bm{n}^{\rm obs}</annotation><annotation encoding="application/x-llamapun" id="S7.p1.3.m3.1d">bold_italic_n start_POSTSUPERSCRIPT roman_obs end_POSTSUPERSCRIPT</annotation></semantics></math> and the use of the MSE loss function) are not necessary, demonstrating the flexibility of NyTT. 2) IterNyTT improved performance by enhancing the quality of noisy target signals, demonstrating its potential to achieve performance comparable to that of CTT. 3) By investigating the effects of noise mismatches, we derived desirable noise conditions. 4) Even when a small number of clean target signals were available, the combined use of noisy and clean target signals improved performance. 5) NyTT was also effective in the dereverberation and declipping tasks. Furthermore, both NyTT and IterNyTT exhibited similar behaviors across the denoising, dereverberation, and declipping tasks, implying their general applicability. 
</p> </div> </section> <section class="ltx_section" id="Sx1"> <h2 class="ltx_title ltx_title_section">Acknowledgements</h2> <div class="ltx_para" id="Sx1.p1"> <p class="ltx_p" id="Sx1.p1.1">This work was partly supported by JST CREST Grant Number JPMJCR19A3, JSPS KAKENHI Grant Number JP20H00102, and JST SPRING Grant Number JPMJSP2125.</p> </div> </section> <section class="ltx_section" id="Sx2"> <h2 class="ltx_title ltx_title_section">References</h2> <div class="ltx_para" id="Sx2.p1"> </div> </section> </article> </div> <footer class="ltx_page_footer"> <div class="ltx_page_logo">Generated on Wed Mar 19 03:19:58 2025 by <a class="ltx_LaTeXML_logo" href="http://dlmf.nist.gov/LaTeXML/"><span style="letter-spacing:-0.2em; margin-right:0.1em;">L<span class="ltx_font_smallcaps" style="position:relative; bottom:2.2pt;">a</span>T<span class="ltx_font_smallcaps" style="font-size:120%;position:relative; bottom:-0.2ex;">e</span></span><span style="font-size:90%; position:relative; bottom:-0.2ex;">XML</span></a> </div></footer> </div> </body> </html>