SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation

<!DOCTYPE html> <html lang="en"> <head> <meta content="text/html; charset=utf-8" http-equiv="content-type"/> <title>SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation</title> <!--Generated on Sun Mar 16 04:02:48 2025 by LaTeXML (version 0.8.8) http://dlmf.nist.gov/LaTeXML/.--> <meta content="width=device-width, initial-scale=1, shrink-to-fit=no" name="viewport"/> <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/ar5iv.0.7.9.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/ar5iv-fonts.0.7.9.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/latexml_styles.css" rel="stylesheet" type="text/css"/> <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/js/bootstrap.bundle.min.js"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/html2canvas/1.3.3/html2canvas.min.js"></script> <script src="/static/browse/0.3.4/js/addons_new.js"></script> <script src="/static/browse/0.3.4/js/feedbackOverlay.js"></script> <base href="/html/2411.19921v2/"/></head> <body> <nav class="ltx_page_navbar"> <nav class="ltx_TOC"> <ol class="ltx_toclist"> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S1" title="In SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">1 </span>Introduction</span></a></li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S2" title="In SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">2 </span>Related Works</span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry 
ltx_tocentry_paragraph"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S2.SS0.SSS0.Px1" title="In 2 Related Works ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title">Kinematic-based Human Scene Interaction</span></a></li> <li class="ltx_tocentry ltx_tocentry_paragraph"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S2.SS0.SSS0.Px2" title="In 2 Related Works ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title">Physics-based Human-Scene Interaction</span></a></li> <li class="ltx_tocentry ltx_tocentry_paragraph"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S2.SS0.SSS0.Px3" title="In 2 Related Works ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title">Comparison with Previous HSI Methods</span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S3" title="In SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">3 </span>Method</span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S3.SS1" title="In 3 Method ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">3.1 </span>Short Script Database Construction</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S3.SS2" title="In 3 Method ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text 
ltx_ref_title"><span class="ltx_tag ltx_tag_ref">3.2 </span>Retrieval Augmented Script Generation</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"> <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S3.SS3" title="In 3 Method ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">3.3 </span>Multi-Condition Controller</span></a> <ol class="ltx_toclist ltx_toclist_subsection"> <li class="ltx_tocentry ltx_tocentry_paragraph"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S3.SS3.SSS0.Px1" title="In 3.3 Multi-Condition Controller ‣ 3 Method ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title">Overview</span></a></li> <li class="ltx_tocentry ltx_tocentry_paragraph"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S3.SS3.SSS0.Px2" title="In 3.3 Multi-Condition Controller ‣ 3 Method ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title">Finite State Machine</span></a></li> <li class="ltx_tocentry ltx_tocentry_paragraph"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S3.SS3.SSS0.Px3" title="In 3.3 Multi-Condition Controller ‣ 3 Method ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title">Language Condition</span></a></li> <li class="ltx_tocentry ltx_tocentry_paragraph"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S3.SS3.SSS0.Px4" title="In 3.3 Multi-Condition Controller ‣ 3 Method ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title">Scene Condition</span></a></li> <li class="ltx_tocentry ltx_tocentry_paragraph"><a class="ltx_ref" 
href="https://arxiv.org/html/2411.19921v2#S3.SS3.SSS0.Px5" title="In 3.3 Multi-Condition Controller ‣ 3 Method ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title">Universal Goal Condition</span></a></li> <li class="ltx_tocentry ltx_tocentry_paragraph"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S3.SS3.SSS0.Px6" title="In 3.3 Multi-Condition Controller ‣ 3 Method ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title">Policy Training</span></a></li> </ol> </li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4" title="In SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4 </span>Experiments</span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.SS1" title="In 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.1 </span>Dataset</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.SS2" title="In 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.2 </span>Motion Metrics</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"> <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.SS3" title="In 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span 
class="ltx_tag ltx_tag_ref">4.3 </span>Comparison with SOTA methods</span></a> <ol class="ltx_toclist ltx_toclist_subsection"> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.SS3.SSS1" title="In 4.3 Comparison with SOTA methods ‣ 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.3.1 </span>Physical Performance for Different Skills</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.SS3.SSS2" title="In 4.3 Comparison with SOTA methods ‣ 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.3.2 </span>Motion Diversity for Different Skills</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.SS3.SSS3" title="In 4.3 Comparison with SOTA methods ‣ 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.3.3 </span>User Study on SOTA Long-Term HSI Methods</span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_subsection"> <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.SS4" title="In 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.4 </span>Ablation Study on SIMS</span></a> <ol class="ltx_toclist ltx_toclist_subsection"> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.SS4.SSS1" title="In 4.4 Ablation Study on SIMS ‣ 4 Experiments ‣ SIMS: 
Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.4.1 </span>Direct Generation <span class="ltx_text ltx_font_italic">vs.</span> RASG.</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.SS4.SSS2" title="In 4.4 Ablation Study on SIMS ‣ 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.4.2 </span>Generalization on Unseen Objects</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.SS4.SSS3" title="In 4.4 Ablation Study on SIMS ‣ 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.4.3 </span>Scale Up on New Motion Datasets</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.SS4.SSS4" title="In 4.4 Ablation Study on SIMS ‣ 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.4.4 </span>Ablation of Policy Settings</span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.SS5" title="In 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4.5 </span>Qualitative Results</span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S5" title="In SIMS: 
Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">5 </span>Conclusion</span></a></li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S6" title="In SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">6 </span>Future Work</span></a></li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S7" title="In SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">7 </span>Reward Templates</span></a></li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S8" title="In SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">8 </span>Re-implemented MotionCLIP</span></a></li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S9" title="In SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">9 </span>New Skill Scalability</span></a></li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S10" title="In SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">10 </span>ViconStyle Dataset</span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" 
href="https://arxiv.org/html/2411.19921v2#S10.SS1" title="In 10 ViconStyle Dataset ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">10.1 </span>Capture Setting</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S10.SS2" title="In 10 ViconStyle Dataset ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">10.2 </span>Dataset Statistics</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S10.SS3" title="In 10 ViconStyle Dataset ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">10.3 </span>Qualitative Results</span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S11" title="In SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">11 </span>Short Script Examples</span></a></li> </ol></nav> </nav> <div class="ltx_page_main"> <div class="ltx_page_content"> <article class="ltx_document ltx_authors_1line ltx_pruned_first" lang="en"> <h1 class="ltx_title ltx_title_document">SIMS: Simulating Stylized Human-Scene Interactions <br class="ltx_break"/>with Retrieval-Augmented Script Generation</h1> <div class="ltx_authors"> <span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Wenjia Wang<sup class="ltx_sup" id="id15.13.id1"><span class="ltx_text ltx_font_italic" id="id15.13.id1.1">1</span></sup>    Liang Pan<sup class="ltx_sup" id="id16.14.id2"><span class="ltx_text 
ltx_font_italic" id="id16.14.id2.1">1,2</span></sup>    Zhiyang Dou<sup class="ltx_sup" id="id17.15.id3">1</sup>    Jidong Mei<sup class="ltx_sup" id="id18.16.id4">1</sup>    Zhouyingcheng Liao<sup class="ltx_sup" id="id19.17.id5"><span class="ltx_text ltx_font_italic" id="id19.17.id5.1">1</span></sup>    <br class="ltx_break"/>Yuke Lou<sup class="ltx_sup" id="id20.18.id6">1</sup>    Yifan Wu<sup class="ltx_sup" id="id21.19.id7">1</sup>    Lei Yang<sup class="ltx_sup" id="id22.20.id8">2</sup>    Jingbo Wang<sup class="ltx_sup" id="id23.21.id9"><span class="ltx_text ltx_font_italic" id="id23.21.id9.1">2†</span></sup>    Taku Komura<sup class="ltx_sup" id="id24.22.id10"><span class="ltx_text ltx_font_italic" id="id24.22.id10.1">1†</span></sup> <br class="ltx_break"/> <sup class="ltx_sup" id="id25.23.id11">1</sup> The University of Hong Kong  <sup class="ltx_sup" id="id26.24.id12">2</sup> Shanghai AI Laboratory </span></span> </div> <div class="ltx_abstract"> <h6 class="ltx_title ltx_title_abstract">Abstract</h6> <p class="ltx_p" id="id27.id1"><span class="ltx_text" id="id27.id1.1">Simulating stylized human-scene interactions (HSI) in physical environments is a challenging yet fascinating task. Prior works emphasize long-term execution but fall short in achieving both diverse style and physical plausibility. To tackle this challenge, we introduce a novel hierarchical framework named SIMS that seamlessly bridges high-level script-driven intent with a low-level control policy, enabling more expressive and diverse human-scene interactions. Specifically, we employ Large Language Models with Retrieval-Augmented Generation (RAG) to generate coherent and diverse long-form scripts, providing a rich foundation for motion planning. A versatile multi-condition physics-based control policy is also developed, which leverages text embeddings from the generated scripts to encode stylistic cues, simultaneously perceiving environmental geometries and accomplishing task goals. 
By integrating retrieval-augmented script generation with the multi-condition controller, our approach provides a unified solution for generating stylized HSI motions. We further introduce a comprehensive planning dataset produced by RAG and a stylized motion dataset featuring diverse locomotion and interactions. Extensive experiments demonstrate SIMS’s effectiveness in executing various tasks and generalizing across different scenarios, significantly outperforming previous methods. Project page: <a class="ltx_ref ltx_href" href="https://wenjiawang0312.github.io/projects/sims/" title="">https://wenjiawang0312.github.io/projects/sims/</a>.</span></p> </div> <div class="ltx_logical-block" id="id14"> <div class="ltx_para" id="id14.p1"> <img alt="[Uncaptioned image]" class="ltx_graphics ltx_centering ltx_img_landscape" height="254" id="id13.g1" src="x1.png" width="797"/> </div> <figure class="ltx_figure ltx_align_center" id="S0.F1"> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S0.F1.2.1.1" style="font-size:90%;">Figure 1</span>: </span><span class="ltx_text" id="S0.F1.3.2" style="font-size:90%;"> SIMS enables physically simulated characters to perform diverse skills within complex 3D scenes given long-term daily narratives and scene inputs. The character can perform versatile skills, including locomotion, human-scene interaction, and dynamic object interaction, in diverse styles while maintaining physically plausible contacts and avoiding obstacles. Left: a dialogue-based retrieval-augmented script generation process. 
Right: a skillful humanoid performing diverse stylized interactions in a 3D scene.</span></figcaption> </figure> </div> <span class="ltx_note ltx_role_footnotetext" id="footnotex1"><sup class="ltx_note_mark">1</sup><span class="ltx_note_outer"><span class="ltx_note_content"><sup class="ltx_note_mark">1</sup><span class="ltx_note_type">footnotetext: </span><math alttext="\dagger" class="ltx_Math" display="inline" id="footnotex1.m1.1"><semantics id="footnotex1.m1.1b"><mo id="footnotex1.m1.1.1" xref="footnotex1.m1.1.1.cmml">†</mo><annotation-xml encoding="MathML-Content" id="footnotex1.m1.1c"><ci id="footnotex1.m1.1.1.cmml" xref="footnotex1.m1.1.1">†</ci></annotation-xml><annotation encoding="application/x-tex" id="footnotex1.m1.1d">\dagger</annotation><annotation encoding="application/x-llamapun" id="footnotex1.m1.1e">†</annotation></semantics></math>: equal advising.</span></span></span> <section class="ltx_section" id="S1"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">1 </span>Introduction</h2> <figure class="ltx_table" id="S1.T1"> <div class="ltx_inline-block ltx_transformed_outer" id="S1.T1.2" style="width:433.6pt;height:94.3pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(-212.3pt,46.2pt) scale(0.50529,0.50529) ;"> <p class="ltx_p" id="S1.T1.2.1"><span class="ltx_text" id="S1.T1.2.1.1"> <span class="ltx_inline-block ltx_transformed_outer" id="S1.T1.2.1.1.1" style="width:858.2pt;height:186.7pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(0.0pt,0.0pt) scale(1,1) ;"> <span class="ltx_p" id="S1.T1.2.1.1.1.1"><span class="ltx_text" id="S1.T1.2.1.1.1.1.1"> <span class="ltx_tabular ltx_align_middle" id="S1.T1.2.1.1.1.1.1.1"> <span class="ltx_tr" id="S1.T1.2.1.1.1.1.1.1.1"> <span class="ltx_td ltx_align_left ltx_border_r ltx_border_tt ltx_rowspan ltx_rowspan_2" id="S1.T1.2.1.1.1.1.1.1.1.1"><span class="ltx_text ltx_font_bold" 
id="S1.T1.2.1.1.1.1.1.1.1.1.1">Method</span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_tt" id="S1.T1.2.1.1.1.1.1.1.1.2"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.1.2.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.1.2.1.1" style="width:75.0pt;"><span class="ltx_text ltx_font_bold" id="S1.T1.2.1.1.1.1.1.1.1.2.1.1.1">Physically Plausible</span></span> </span></span> <span class="ltx_td ltx_align_center ltx_align_top ltx_border_r ltx_border_tt ltx_colspan ltx_colspan_2" id="S1.T1.2.1.1.1.1.1.1.1.3"><span class="ltx_text ltx_font_bold" id="S1.T1.2.1.1.1.1.1.1.1.3.1">Planner</span></span> <span class="ltx_td ltx_align_center ltx_align_top ltx_border_r ltx_border_tt ltx_colspan ltx_colspan_3" id="S1.T1.2.1.1.1.1.1.1.1.4"><span class="ltx_text ltx_font_bold" id="S1.T1.2.1.1.1.1.1.1.1.4.1">Controller</span></span> <span class="ltx_td ltx_align_center ltx_align_top ltx_border_tt ltx_colspan ltx_colspan_7" id="S1.T1.2.1.1.1.1.1.1.1.5"><span class="ltx_text ltx_font_bold" id="S1.T1.2.1.1.1.1.1.1.1.5.1">Incorporated Skills</span></span></span> <span class="ltx_tr" id="S1.T1.2.1.1.1.1.1.1.2"> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r" id="S1.T1.2.1.1.1.1.1.1.2.1"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.2.1.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.2.1.1.1" style="width:75.0pt;"></span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.2.2"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.2.2.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.2.2.1.1" style="width:50.0pt;">Automatic</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r" id="S1.T1.2.1.1.1.1.1.1.2.3"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.2.3.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.2.3.1.1" style="width:65.0pt;">Style-Diversity</span> </span></span> <span 
class="ltx_td ltx_align_justify ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.2.4"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.2.4.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.2.4.1.1" style="width:50.0pt;">Text-Aware</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.2.5"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.2.5.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.2.5.1.1" style="width:55.0pt;">Scene-Aware</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r" id="S1.T1.2.1.1.1.1.1.1.2.6"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.2.6.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.2.6.1.1" style="width:75.0pt;">Skill-Scalability</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.2.7"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.2.7.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.2.7.1.1" style="width:20.0pt;">Walk</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.2.8"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.2.8.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.2.8.1.1" style="width:20.0pt;">Sit</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.2.9"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.2.9.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.2.9.1.1" style="width:20.0pt;">Lie</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.2.10"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.2.10.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.2.10.1.1" style="width:20.0pt;">GetUp</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.2.11"> <span class="ltx_inline-block ltx_align_top" 
id="S1.T1.2.1.1.1.1.1.1.2.11.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.2.11.1.1" style="width:20.0pt;">Reach</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.2.12"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.2.12.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.2.12.1.1" style="width:20.0pt;">Idle</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.2.13"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.2.13.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.2.13.1.1" style="width:20.0pt;">Carry</span> </span></span></span> <span class="ltx_tr" id="S1.T1.2.1.1.1.1.1.1.3"> <span class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.3.1">NSM<cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib33" title=""><span class="ltx_text" style="font-size:90%;">33</span></a>]</cite></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.3.2"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.3.2.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.3.2.1.1" style="width:75.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.3.3"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.3.3.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.3.3.1.1" style="width:50.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.3.4"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.3.4.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.3.4.1.1" style="width:65.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.3.5"> <span class="ltx_inline-block ltx_align_top" 
id="S1.T1.2.1.1.1.1.1.1.3.5.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.3.5.1.1" style="width:50.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.3.6"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.3.6.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.3.6.1.1" style="width:55.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.3.7"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.3.7.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.3.7.1.1" style="width:75.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.3.8"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.3.8.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.3.8.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.3.9"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.3.9.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.3.9.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.3.10"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.3.10.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.3.10.1.1" style="width:20.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.3.11"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.3.11.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.3.11.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.3.12"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.3.12.1"> <span class="ltx_p" 
id="S1.T1.2.1.1.1.1.1.1.3.12.1.1" style="width:20.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.3.13"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.3.13.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.3.13.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.3.14"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.3.14.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.3.14.1.1" style="width:20.0pt;">✓</span> </span></span></span> <span class="ltx_tr" id="S1.T1.2.1.1.1.1.1.1.4"> <span class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.4.1">SAMP<cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib12" title=""><span class="ltx_text" style="font-size:90%;">12</span></a>]</cite></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.4.2"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.4.2.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.4.2.1.1" style="width:75.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.4.3"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.4.3.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.4.3.1.1" style="width:50.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.4.4"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.4.4.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.4.4.1.1" style="width:65.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.4.5"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.4.5.1"> 
<span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.4.5.1.1" style="width:50.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.4.6"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.4.6.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.4.6.1.1" style="width:55.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.4.7"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.4.7.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.4.7.1.1" style="width:75.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.4.8"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.4.8.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.4.8.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.4.9"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.4.9.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.4.9.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.4.10"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.4.10.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.4.10.1.1" style="width:20.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.4.11"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.4.11.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.4.11.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.4.12"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.4.12.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.4.12.1.1" 
style="width:20.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.4.13"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.4.13.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.4.13.1.1" style="width:20.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.4.14"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.4.14.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.4.14.1.1" style="width:20.0pt;">✗</span> </span></span></span> <span class="ltx_tr" id="S1.T1.2.1.1.1.1.1.1.5"> <span class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.5.1">Humanise<cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib44" title=""><span class="ltx_text" style="font-size:90%;">44</span></a>]</cite></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.5.2"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.5.2.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.5.2.1.1" style="width:75.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.5.3"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.5.3.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.5.3.1.1" style="width:50.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.5.4"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.5.4.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.5.4.1.1" style="width:65.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.5.5"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.5.5.1"> <span class="ltx_p" 
id="S1.T1.2.1.1.1.1.1.1.5.5.1.1" style="width:50.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.5.6"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.5.6.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.5.6.1.1" style="width:55.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.5.7"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.5.7.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.5.7.1.1" style="width:75.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.5.8"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.5.8.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.5.8.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.5.9"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.5.9.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.5.9.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.5.10"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.5.10.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.5.10.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.5.11"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.5.11.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.5.11.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.5.12"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.5.12.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.5.12.1.1" style="width:20.0pt;">✗</span> 
</span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.5.13"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.5.13.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.5.13.1.1" style="width:20.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.5.14"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.5.14.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.5.14.1.1" style="width:20.0pt;">✗</span> </span></span></span> <span class="ltx_tr" id="S1.T1.2.1.1.1.1.1.1.6"> <span class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.6.1">AffordMotion<cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib45" title=""><span class="ltx_text" style="font-size:90%;">45</span></a>]</cite></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.6.2"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.6.2.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.6.2.1.1" style="width:75.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.6.3"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.6.3.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.6.3.1.1" style="width:50.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.6.4"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.6.4.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.6.4.1.1" style="width:65.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.6.5"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.6.5.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.6.5.1.1" 
style="width:50.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.6.6"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.6.6.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.6.6.1.1" style="width:55.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.6.7"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.6.7.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.6.7.1.1" style="width:75.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.6.8"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.6.8.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.6.8.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.6.9"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.6.9.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.6.9.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.6.10"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.6.10.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.6.10.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.6.11"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.6.11.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.6.11.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.6.12"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.6.12.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.6.12.1.1" style="width:20.0pt;">✗</span> </span></span> <span class="ltx_td 
ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.6.13"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.6.13.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.6.13.1.1" style="width:20.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.6.14"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.6.14.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.6.14.1.1" style="width:20.0pt;">✗</span> </span></span></span> <span class="ltx_tr" id="S1.T1.2.1.1.1.1.1.1.7"> <span class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.7.1">TesMo<cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib50" title=""><span class="ltx_text" style="font-size:90%;">50</span></a>]</cite></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.7.2"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.7.2.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.7.2.1.1" style="width:75.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.7.3"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.7.3.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.7.3.1.1" style="width:50.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.7.4"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.7.4.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.7.4.1.1" style="width:65.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.7.5"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.7.5.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.7.5.1.1" style="width:50.0pt;">✓</span> </span></span> 
<span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.7.6"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.7.6.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.7.6.1.1" style="width:55.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.7.7"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.7.7.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.7.7.1.1" style="width:75.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.7.8"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.7.8.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.7.8.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.7.9"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.7.9.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.7.9.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.7.10"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.7.10.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.7.10.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.7.11"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.7.11.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.7.11.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.7.12"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.7.12.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.7.12.1.1" style="width:20.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" 
id="S1.T1.2.1.1.1.1.1.1.7.13"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.7.13.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.7.13.1.1" style="width:20.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.7.14"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.7.14.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.7.14.1.1" style="width:20.0pt;">✗</span> </span></span></span> <span class="ltx_tr" id="S1.T1.2.1.1.1.1.1.1.8"> <span class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.8.1">InterScene<cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib26" title=""><span class="ltx_text" style="font-size:90%;">26</span></a>]</cite></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.8.2"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.8.2.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.8.2.1.1" style="width:75.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.8.3"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.8.3.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.8.3.1.1" style="width:50.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.8.4"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.8.4.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.8.4.1.1" style="width:65.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.8.5"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.8.5.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.8.5.1.1" style="width:50.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify 
ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.8.6"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.8.6.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.8.6.1.1" style="width:55.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.8.7"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.8.7.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.8.7.1.1" style="width:75.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.8.8"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.8.8.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.8.8.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.8.9"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.8.9.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.8.9.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.8.10"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.8.10.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.8.10.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.8.11"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.8.11.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.8.11.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.8.12"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.8.12.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.8.12.1.1" style="width:20.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.8.13"> <span 
class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.8.13.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.8.13.1.1" style="width:20.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.8.14"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.8.14.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.8.14.1.1" style="width:20.0pt;">✗</span> </span></span></span> <span class="ltx_tr" id="S1.T1.2.1.1.1.1.1.1.9"> <span class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.9.1">UniHSI<cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib47" title=""><span class="ltx_text" style="font-size:90%;">47</span></a>]</cite></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.9.2"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.9.2.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.9.2.1.1" style="width:75.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.9.3"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.9.3.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.9.3.1.1" style="width:50.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.9.4"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.9.4.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.9.4.1.1" style="width:65.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.9.5"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.9.5.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.9.5.1.1" style="width:50.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" 
id="S1.T1.2.1.1.1.1.1.1.9.6"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.9.6.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.9.6.1.1" style="width:55.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.9.7"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.9.7.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.9.7.1.1" style="width:75.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.9.8"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.9.8.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.9.8.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.9.9"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.9.9.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.9.9.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.9.10"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.9.10.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.9.10.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.9.11"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.9.11.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.9.11.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.9.12"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.9.12.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.9.12.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.9.13"> <span class="ltx_inline-block 
ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.9.13.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.9.13.1.1" style="width:20.0pt;">✗</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.9.14"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.9.14.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.9.14.1.1" style="width:20.0pt;">✗</span> </span></span></span> <span class="ltx_tr" id="S1.T1.2.1.1.1.1.1.1.10"> <span class="ltx_td ltx_align_left ltx_border_bb ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.10.1"><span class="ltx_text ltx_font_bold" id="S1.T1.2.1.1.1.1.1.1.10.1.1">SIMS (ours)</span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.10.2"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.10.2.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.10.2.1.1" style="width:75.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.10.3"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.10.3.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.10.3.1.1" style="width:50.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.10.4"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.10.4.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.10.4.1.1" style="width:65.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.10.5"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.10.5.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.10.5.1.1" style="width:50.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.10.6"> <span 
class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.10.6.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.10.6.1.1" style="width:55.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.10.7"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.10.7.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.10.7.1.1" style="width:75.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.10.8"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.10.8.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.10.8.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.10.9"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.10.9.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.10.9.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.10.10"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.10.10.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.10.10.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.10.11"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.10.11.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.10.11.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.10.12"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.10.12.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.10.12.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb 
ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.10.13"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.10.13.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.10.13.1.1" style="width:20.0pt;">✓</span> </span></span> <span class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_t" id="S1.T1.2.1.1.1.1.1.1.10.14"> <span class="ltx_inline-block ltx_align_top" id="S1.T1.2.1.1.1.1.1.1.10.14.1"> <span class="ltx_p" id="S1.T1.2.1.1.1.1.1.1.10.14.1.1" style="width:20.0pt;">✓</span> </span></span></span> </span></span></span> </span></span></span></p> </span></div> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_table"><span class="ltx_text" id="S1.T1.3.1.1" style="font-size:90%;">Table 1</span>: </span><span class="ltx_text" id="S1.T1.4.2" style="font-size:90%;">Comparison of Kinematics-Based (upper 5) and Physics-Based (lower 3) Long-term Human-Scene Interaction methods.</span></figcaption> </figure> <div class="ltx_para" id="S1.p1"> <p class="ltx_p" id="S1.p1.1">Developing skillful characters with a broad repertoire of motor skills, such as walking, sitting, and reaching, while facilitating rich interactions with their environments, has long been a desirable goal for animation, robotics, and VR/AR applications. 
In particular, achieving <span class="ltx_text ltx_font_italic" id="S1.p1.1.1">long-term, stylized, and physically plausible</span> interactions with diverse styles and intricate details is crucial for bringing characters and narratives to life.</p> </div> <div class="ltx_para" id="S1.p2"> <p class="ltx_p" id="S1.p2.1">Previous works <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib33" title=""><span class="ltx_text" style="font-size:90%;">33</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib44" title=""><span class="ltx_text" style="font-size:90%;">44</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib12" title=""><span class="ltx_text" style="font-size:90%;">12</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib41" title=""><span class="ltx_text" style="font-size:90%;">41</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib54" title=""><span class="ltx_text" style="font-size:90%;">54</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib55" title=""><span class="ltx_text" style="font-size:90%;">55</span></a>]</cite> have explored long-term motion generation for kinematics-based human-scene interactions. However, they typically suffer from severe physical artifacts such as penetration and foot skating. 
To address these issues, recent studies <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib13" title=""><span class="ltx_text" style="font-size:90%;">13</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib26" title=""><span class="ltx_text" style="font-size:90%;">26</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib47" title=""><span class="ltx_text" style="font-size:90%;">47</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib51" title=""><span class="ltx_text" style="font-size:90%;">51</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib17" title=""><span class="ltx_text" style="font-size:90%;">17</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib34" title=""><span class="ltx_text" style="font-size:90%;">34</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib48" title=""><span class="ltx_text" style="font-size:90%;">48</span></a>]</cite> have started incorporating physics simulators (e.g., <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib23" title=""><span class="ltx_text" style="font-size:90%;">23</span></a>]</cite>) to produce more physically plausible motions. Despite these advancements, these frameworks are limited to a small number of specific skills and task objectives and therefore lack diversity. 
Moreover, their planning results are often simplistic, either following chronological lists <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib26" title=""><span class="ltx_text" style="font-size:90%;">26</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib48" title=""><span class="ltx_text" style="font-size:90%;">48</span></a>]</cite> or focusing solely on contacts <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib47" title=""><span class="ltx_text" style="font-size:90%;">47</span></a>]</cite>. This stands in contrast to real-world situations, where body language in human motion and interactions directly conveys a wide range of <span class="ltx_text ltx_font_bold ltx_font_italic" id="S1.p2.1.1">emotional or stylized</span> states. For example, a person sitting on a chair with their head down and supporting it with their hands often conveys a sense of depression.</p> </div> <div class="ltx_para" id="S1.p3"> <p class="ltx_p" id="S1.p3.1">To address the aforementioned challenges, we propose a novel framework termed SIMS (<span class="ltx_text ltx_font_bold" id="S1.p3.1.1">S</span>imulating styl<span class="ltx_text ltx_font_bold" id="S1.p3.1.2">I</span>zed hu<span class="ltx_text ltx_font_bold" id="S1.p3.1.3">M</span>an <span class="ltx_text ltx_font_bold" id="S1.p3.1.4">S</span>cene interactions). Specifically, SIMS utilizes an LLM <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib1" title=""><span class="ltx_text" style="font-size:90%;">1</span></a>]</cite> as a powerful high-level motion planner and physical policies as low-level controllers equipped with diverse motor skills. 
Inspired by Retrieval-Augmented Generation <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib18" title=""><span class="ltx_text" style="font-size:90%;">18</span></a>]</cite>, we generate semantically rich scripts by first creating a database of short scripts and then retrieving from it to generate longer scripts. Each short script includes several keyframes detailing stylized interactions that the low-level control policy can effectively execute. We then retrieve the top-<span class="ltx_text ltx_font_italic" id="S1.p3.1.5">k</span> short scripts via the CLIP <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib31" title=""><span class="ltx_text" style="font-size:90%;">31</span></a>]</cite> similarity between short script summaries and the user-provided story themes. Finally, we prompt the LLM to retrieve and generate stylized long-term scripts based on the short script inputs. Given the planned keyframes, a low-level control policy is employed to obtain the detailed body motions in the physical simulator, producing natural, diverse, and high-quality interactions. To ensure stylized motions are adaptable to various furniture shapes within a complex indoor environment, we propose a multi-condition control policy that is attuned to scene geometries, task goal observations, and text embeddings from the CLIP model <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib31" title=""><span class="ltx_text" style="font-size:90%;">31</span></a>]</cite> for high-fidelity motion generation. Our multi-condition design not only facilitates effective scene perception but also captures fine-grained body movements, enabling a better grasp of stylized motor skills, i.e., the policy learns to perform more skills during imitation learning. 
Compared to previous policies <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib13" title=""><span class="ltx_text" style="font-size:90%;">13</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib26" title=""><span class="ltx_text" style="font-size:90%;">26</span></a>]</cite> that lack style control and UniHSI <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib47" title=""><span class="ltx_text" style="font-size:90%;">47</span></a>]</cite>, which relies on accurate references, our approach supports flexible multi-condition control while mitigating mode collapse in AMP-based methods <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib29" title=""><span class="ltx_text" style="font-size:90%;">29</span></a>]</cite>. We incorporate a finite state machine (FSM) to manage multiple policies guided by specified keyframes, enabling the synthesis of physics-based animation that aligns with real-world distributions while improving scalability. To address the scarcity of motion data in the field of stylized motion generation, we collect and annotate captions and style labels from five existing motion capture datasets. Additionally, we capture a new dataset named ViconStyle to address the limitations in both the categories and the quantity of stylized motion data. </p> </div> <div class="ltx_para" id="S1.p4"> <p class="ltx_p" id="S1.p4.1">We conduct an extensive evaluation of our method to validate its effectiveness. 
To provide a more comprehensive overview, we compare five SOTA kinematics-based <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib33" title=""><span class="ltx_text" style="font-size:90%;">33</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib12" title=""><span class="ltx_text" style="font-size:90%;">12</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib44" title=""><span class="ltx_text" style="font-size:90%;">44</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib45" title=""><span class="ltx_text" style="font-size:90%;">45</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib50" title=""><span class="ltx_text" style="font-size:90%;">50</span></a>]</cite> and two physics-based <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib26" title=""><span class="ltx_text" style="font-size:90%;">26</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib47" title=""><span class="ltx_text" style="font-size:90%;">47</span></a>]</cite> long-term HSI methods with SIMS to explain our task setting in <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S1.T1" title="In 1 Introduction ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Tab.</span> <span class="ltx_text ltx_ref_tag">1</span></a>. Our method, SIMS, surpasses existing approaches with a fully automatic framework that integrates style diversity, text awareness, scene awareness, and physics plausibility for realistic human-scene interactions. Unlike prior methods, it supports easy extension, ensuring scalability and adaptability. 
SIMS also achieves the most comprehensive skill coverage, making it a state-of-the-art solution for versatile and controllable motion synthesis.</p> </div> <div class="ltx_para" id="S1.p5"> <p class="ltx_p" id="S1.p5.1">In summary, our contributions are threefold: </p> <ol class="ltx_enumerate" id="S1.I1"> <li class="ltx_item" id="S1.I1.i1" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">1.</span> <div class="ltx_para" id="S1.I1.i1.p1"> <p class="ltx_p" id="S1.I1.i1.p1.1">We propose a framework for physically simulated characters to perform stylized 3D interactions using RAG-based script generation and a multi-condition control policy that encodes style from text while adapting to the environment, featuring: <span class="ltx_text ltx_font_italic" id="S1.I1.i1.p1.1.1">(a) Stylized Control</span>: A script planner for coherent storytelling and a text-conditioned controller for expressive, style-consistent motion. <span class="ltx_text ltx_font_italic" id="S1.I1.i1.p1.1.2">(b) Automatic Generation</span>: A planner that generates executable keyframes from theme descriptions. 
<span class="ltx_text ltx_font_italic" id="S1.I1.i1.p1.1.3">(c) Scalability</span>: New skills and styles can be integrated by updating the script database and training a new policy.</p> </div> </li> <li class="ltx_item" id="S1.I1.i2" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">2.</span> <div class="ltx_para" id="S1.I1.i2.p1"> <p class="ltx_p" id="S1.I1.i2.p1.1">We provide a comprehensive dataset of restructured motion clips with captions, emotional labels, and a short script database for stylized interactions.</p> </div> </li> <li class="ltx_item" id="S1.I1.i3" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">3.</span> <div class="ltx_para" id="S1.I1.i3.p1"> <p class="ltx_p" id="S1.I1.i3.p1.1">Our method outperforms previous approaches across multiple metrics, achieving high-quality, diverse, and physically plausible long-term motion generation.</p> </div> </li> </ol> </div> </section> <section class="ltx_section ltx_pruned_first" id="S2"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">2 </span>Related Works</h2> <section class="ltx_paragraph" id="S2.SS0.SSS0.Px1"> <h5 class="ltx_title ltx_title_paragraph">Kinematic-based Human Scene Interaction</h5> <div class="ltx_para" id="S2.SS0.SSS0.Px1.p1"> <p class="ltx_p" id="S2.SS0.SSS0.Px1.p1.1">Synthesizing realistic human behavior has been a long-standing challenge. 
While most methods enhance the quality and diversity of humanoid movements <cite class="ltx_cite ltx_citemacro_citep">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib52" title=""><span class="ltx_text" style="font-size:90%;">52</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib38" title=""><span class="ltx_text" style="font-size:90%;">38</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib39" title=""><span class="ltx_text" style="font-size:90%;">39</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib56" title=""><span class="ltx_text" style="font-size:90%;">56</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib15" title=""><span class="ltx_text" style="font-size:90%;">15</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib40" title=""><span class="ltx_text" style="font-size:90%;">40</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib3" title=""><span class="ltx_text" style="font-size:90%;">3</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib20" title=""><span class="ltx_text" style="font-size:90%;">20</span></a>]</cite>, they often overlook scene interactions. Recently, there has been growing interest in integrating human-scene interactions, which are crucial for applications like embodied AI and virtual reality. 
Many previous approaches <cite class="ltx_cite ltx_citemacro_citep">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib33" title=""><span class="ltx_text" style="font-size:90%;">33</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib12" title=""><span class="ltx_text" style="font-size:90%;">12</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib41" title=""><span class="ltx_text" style="font-size:90%;">41</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib54" title=""><span class="ltx_text" style="font-size:90%;">54</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib44" title=""><span class="ltx_text" style="font-size:90%;">44</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib16" title=""><span class="ltx_text" style="font-size:90%;">16</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib45" title=""><span class="ltx_text" style="font-size:90%;">45</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib50" title=""><span class="ltx_text" style="font-size:90%;">50</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib3" title=""><span class="ltx_text" style="font-size:90%;">3</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib53" title=""><span class="ltx_text" style="font-size:90%;">53</span></a>]</cite> rely on data-driven kinematic models <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib35" title=""><span class="ltx_text" style="font-size:90%;">35</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib43" title=""><span class="ltx_text" style="font-size:90%;">43</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib6" title=""><span class="ltx_text" style="font-size:90%;">6</span></a>, 
<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib9" title=""><span class="ltx_text" style="font-size:90%;">9</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib42" title=""><span class="ltx_text" style="font-size:90%;">42</span></a>]</cite> for static or dynamic interactions. However, these often lack physical plausibility, resulting in artifacts like penetration, floating, and sliding, and require additional post-processing, limiting real-time use.</p> </div> </section> <section class="ltx_paragraph" id="S2.SS0.SSS0.Px2"> <h5 class="ltx_title ltx_title_paragraph">Physics-based Human-Scene Interaction</h5> <div class="ltx_para" id="S2.SS0.SSS0.Px2.p1"> <p class="ltx_p" id="S2.SS0.SSS0.Px2.p1.1">While previous physics-based animation approaches mainly focused on human motion alone <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib30" title=""><span class="ltx_text" style="font-size:90%;">30</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib29" title=""><span class="ltx_text" style="font-size:90%;">29</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib28" title=""><span class="ltx_text" style="font-size:90%;">28</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib5" title=""><span class="ltx_text" style="font-size:90%;">5</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib14" title=""><span class="ltx_text" style="font-size:90%;">14</span></a>]</cite>, InterPhys <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib13" title=""><span class="ltx_text" style="font-size:90%;">13</span></a>]</cite> presents a framework extending AMP to include character and object dynamics, using a scene-conditioned discriminator for superior performance compared to previous methods. 
Additionally, InterScene <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib26" title=""><span class="ltx_text" style="font-size:90%;">26</span></a>]</cite> effectively synthesizes physically plausible long-term human motions in complex 3D scenes by decomposing interactions into Interacting and Navigating processes. This method uses reusable controllers trained in simple environments to generalize across diverse scenarios. With the development of LLMs, UniHSI <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib47" title=""><span class="ltx_text" style="font-size:90%;">47</span></a>]</cite> introduces a unified framework for human-object interaction via language commands, featuring an LLM Planner and Unified Controller, which reduces training labor with LLM-generated plans. The effectiveness of this approach is evaluated using the ScenePlan dataset.</p> </div> </section> <section class="ltx_paragraph" id="S2.SS0.SSS0.Px3"> <h5 class="ltx_title ltx_title_paragraph">Comparison with Previous HSI Methods</h5> <div class="ltx_para" id="S2.SS0.SSS0.Px3.p1"> <p class="ltx_p" id="S2.SS0.SSS0.Px3.p1.1">We compare five kinematics-based SOTA and two physics-based long-term HSI methods with SIMS to explain our task setting in <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S1.T1" title="In 1 Introduction ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Tab.</span> <span class="ltx_text ltx_ref_tag">1</span></a>. 
NSM <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib33" title=""><span class="ltx_text" style="font-size:90%;">33</span></a>]</cite> and SAMP <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib12" title=""><span class="ltx_text" style="font-size:90%;">12</span></a>]</cite> use goal positions for planning. Humanise <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib44" title=""><span class="ltx_text" style="font-size:90%;">44</span></a>]</cite>, AffordMotion <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib45" title=""><span class="ltx_text" style="font-size:90%;">45</span></a>]</cite>, and TeSMo <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib50" title=""><span class="ltx_text" style="font-size:90%;">50</span></a>]</cite> utilize text-based control for human motion, with the latter two leveraging textual annotations from datasets like HumanML3D <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib11" title=""><span class="ltx_text" style="font-size:90%;">11</span></a>]</cite>, enabling some details in motion expression. All five kinematics-based methods rely on continuous keyframe control, requiring frequent user input updates. 
In contrast, InterScene <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib26" title=""><span class="ltx_text" style="font-size:90%;">26</span></a>]</cite> automates control by setting long-term keyframes for an FSM to switch skills, and UniHSI <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib47" title=""><span class="ltx_text" style="font-size:90%;">47</span></a>]</cite> applies long-term keyframes of body-object contacts. Our planning uses RAG to generate long-term scripts, enabling automation and diversity. For HSI skills, we focus on two locomotion skills (walk and idle), four common human-scene interaction skills (sit, lie, get up, and touch), and one dynamic object interaction skill (carry). Regarding control extensibility, only InterScene and our approach allow training solely for new skills without retraining the entire controller. In the supplementary material, we demonstrate how to easily integrate new interaction skills with specific styles into our framework.</p> </div> <figure class="ltx_figure" id="S2.F2"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="403" id="S2.F2.g1" src="x2.png" width="797"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S2.F2.2.1.1" style="font-size:90%;">Figure 2</span>: </span><span class="ltx_text" id="S2.F2.3.2" style="font-size:90%;">(a) Our main pipeline. We prompt LLMs to generate new short scripts following their emotion and interaction logic. The retrieval process includes two stages. We first retrieve the top-k short scripts by semantic similarity, then ask the LLM to retrieve useful samples from the short scripts and concatenate them into a fluent long-term story. (b) The Finite State Machine. 
We parse skills, captions, and scene geometry from each keyframe into task goals, language embeddings, and heightmap conditions to drive the low-level physical control policy. (c) The multi-condition physics policy. We divide common skills into 3 categories: Locomotion, HSI, and DOI. Skills in the same category share similar task observations and reward computations.</span></figcaption> </figure> </section> </section> <section class="ltx_section" id="S3"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">3 </span>Method</h2> <div class="ltx_para" id="S3.p1"> <p class="ltx_p" id="S3.p1.1">We present SIMS as a hierarchical character animation system that leverages LLMs for high-level long-term script planning, multi-condition policies for low-level character control, and a finite state machine to bridge the two levels. In <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S3.SS1" title="3.1 Short Script Database Construction ‣ 3 Method ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Sec.</span> <span class="ltx_text ltx_ref_tag">3.1</span></a>, we first describe the construction of short script databases. <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S3.SS2" title="3.2 Retrieval Augmented Script Generation ‣ 3 Method ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Sec.</span> <span class="ltx_text ltx_ref_tag">3.2</span></a> then describes the generation of stylized long-term scripts using Retrieval-Augmented Script Generation (RASG). 
Finally, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S3.SS3" title="3.3 Multi-Condition Controller ‣ 3 Method ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Sec.</span> <span class="ltx_text ltx_ref_tag">3.3</span></a> explains the training of multi-condition policies and their scheduling through the finite state machine based on key frames. The supplementary material demonstrates our system’s extensibility in adding new scene interaction skills.</p> </div> <section class="ltx_subsection" id="S3.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">3.1 </span>Short Script Database Construction</h3> <div class="ltx_para" id="S3.SS1.p1"> <p class="ltx_p" id="S3.SS1.p1.11">A short script <math alttext="p" class="ltx_Math" display="inline" id="S3.SS1.p1.1.m1.1"><semantics id="S3.SS1.p1.1.m1.1a"><mi id="S3.SS1.p1.1.m1.1.1" xref="S3.SS1.p1.1.m1.1.1.cmml">p</mi><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.1.m1.1b"><ci id="S3.SS1.p1.1.m1.1.1.cmml" xref="S3.SS1.p1.1.m1.1.1">𝑝</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.1.m1.1c">p</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p1.1.m1.1d">italic_p</annotation></semantics></math> consists of a sequence of key frames <math alttext="\{f_{0},f_{1},...,f_{N}\}" class="ltx_Math" display="inline" id="S3.SS1.p1.2.m2.4"><semantics id="S3.SS1.p1.2.m2.4a"><mrow id="S3.SS1.p1.2.m2.4.4.3" xref="S3.SS1.p1.2.m2.4.4.4.cmml"><mo id="S3.SS1.p1.2.m2.4.4.3.4" stretchy="false" xref="S3.SS1.p1.2.m2.4.4.4.cmml">{</mo><msub id="S3.SS1.p1.2.m2.2.2.1.1" xref="S3.SS1.p1.2.m2.2.2.1.1.cmml"><mi id="S3.SS1.p1.2.m2.2.2.1.1.2" xref="S3.SS1.p1.2.m2.2.2.1.1.2.cmml">f</mi><mn id="S3.SS1.p1.2.m2.2.2.1.1.3" xref="S3.SS1.p1.2.m2.2.2.1.1.3.cmml">0</mn></msub><mo id="S3.SS1.p1.2.m2.4.4.3.5" xref="S3.SS1.p1.2.m2.4.4.4.cmml">,</mo><msub id="S3.SS1.p1.2.m2.3.3.2.2" 
xref="S3.SS1.p1.2.m2.3.3.2.2.cmml"><mi id="S3.SS1.p1.2.m2.3.3.2.2.2" xref="S3.SS1.p1.2.m2.3.3.2.2.2.cmml">f</mi><mn id="S3.SS1.p1.2.m2.3.3.2.2.3" xref="S3.SS1.p1.2.m2.3.3.2.2.3.cmml">1</mn></msub><mo id="S3.SS1.p1.2.m2.4.4.3.6" xref="S3.SS1.p1.2.m2.4.4.4.cmml">,</mo><mi id="S3.SS1.p1.2.m2.1.1" mathvariant="normal" xref="S3.SS1.p1.2.m2.1.1.cmml">…</mi><mo id="S3.SS1.p1.2.m2.4.4.3.7" xref="S3.SS1.p1.2.m2.4.4.4.cmml">,</mo><msub id="S3.SS1.p1.2.m2.4.4.3.3" xref="S3.SS1.p1.2.m2.4.4.3.3.cmml"><mi id="S3.SS1.p1.2.m2.4.4.3.3.2" xref="S3.SS1.p1.2.m2.4.4.3.3.2.cmml">f</mi><mi id="S3.SS1.p1.2.m2.4.4.3.3.3" xref="S3.SS1.p1.2.m2.4.4.3.3.3.cmml">N</mi></msub><mo id="S3.SS1.p1.2.m2.4.4.3.8" stretchy="false" xref="S3.SS1.p1.2.m2.4.4.4.cmml">}</mo></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.2.m2.4b"><set id="S3.SS1.p1.2.m2.4.4.4.cmml" xref="S3.SS1.p1.2.m2.4.4.3"><apply id="S3.SS1.p1.2.m2.2.2.1.1.cmml" xref="S3.SS1.p1.2.m2.2.2.1.1"><csymbol cd="ambiguous" id="S3.SS1.p1.2.m2.2.2.1.1.1.cmml" xref="S3.SS1.p1.2.m2.2.2.1.1">subscript</csymbol><ci id="S3.SS1.p1.2.m2.2.2.1.1.2.cmml" xref="S3.SS1.p1.2.m2.2.2.1.1.2">𝑓</ci><cn id="S3.SS1.p1.2.m2.2.2.1.1.3.cmml" type="integer" xref="S3.SS1.p1.2.m2.2.2.1.1.3">0</cn></apply><apply id="S3.SS1.p1.2.m2.3.3.2.2.cmml" xref="S3.SS1.p1.2.m2.3.3.2.2"><csymbol cd="ambiguous" id="S3.SS1.p1.2.m2.3.3.2.2.1.cmml" xref="S3.SS1.p1.2.m2.3.3.2.2">subscript</csymbol><ci id="S3.SS1.p1.2.m2.3.3.2.2.2.cmml" xref="S3.SS1.p1.2.m2.3.3.2.2.2">𝑓</ci><cn id="S3.SS1.p1.2.m2.3.3.2.2.3.cmml" type="integer" xref="S3.SS1.p1.2.m2.3.3.2.2.3">1</cn></apply><ci id="S3.SS1.p1.2.m2.1.1.cmml" xref="S3.SS1.p1.2.m2.1.1">…</ci><apply id="S3.SS1.p1.2.m2.4.4.3.3.cmml" xref="S3.SS1.p1.2.m2.4.4.3.3"><csymbol cd="ambiguous" id="S3.SS1.p1.2.m2.4.4.3.3.1.cmml" xref="S3.SS1.p1.2.m2.4.4.3.3">subscript</csymbol><ci id="S3.SS1.p1.2.m2.4.4.3.3.2.cmml" xref="S3.SS1.p1.2.m2.4.4.3.3.2">𝑓</ci><ci id="S3.SS1.p1.2.m2.4.4.3.3.3.cmml" 
xref="S3.SS1.p1.2.m2.4.4.3.3.3">𝑁</ci></apply></set></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.2.m2.4c">\{f_{0},f_{1},...,f_{N}\}</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p1.2.m2.4d">{ italic_f start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_f start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_f start_POSTSUBSCRIPT italic_N end_POSTSUBSCRIPT }</annotation></semantics></math>. Each key frame <math alttext="f=(s,o,c,e)" class="ltx_Math" display="inline" id="S3.SS1.p1.3.m3.4"><semantics id="S3.SS1.p1.3.m3.4a"><mrow id="S3.SS1.p1.3.m3.4.5" xref="S3.SS1.p1.3.m3.4.5.cmml"><mi id="S3.SS1.p1.3.m3.4.5.2" xref="S3.SS1.p1.3.m3.4.5.2.cmml">f</mi><mo id="S3.SS1.p1.3.m3.4.5.1" xref="S3.SS1.p1.3.m3.4.5.1.cmml">=</mo><mrow id="S3.SS1.p1.3.m3.4.5.3.2" xref="S3.SS1.p1.3.m3.4.5.3.1.cmml"><mo id="S3.SS1.p1.3.m3.4.5.3.2.1" stretchy="false" xref="S3.SS1.p1.3.m3.4.5.3.1.cmml">(</mo><mi id="S3.SS1.p1.3.m3.1.1" xref="S3.SS1.p1.3.m3.1.1.cmml">s</mi><mo id="S3.SS1.p1.3.m3.4.5.3.2.2" xref="S3.SS1.p1.3.m3.4.5.3.1.cmml">,</mo><mi id="S3.SS1.p1.3.m3.2.2" xref="S3.SS1.p1.3.m3.2.2.cmml">o</mi><mo id="S3.SS1.p1.3.m3.4.5.3.2.3" xref="S3.SS1.p1.3.m3.4.5.3.1.cmml">,</mo><mi id="S3.SS1.p1.3.m3.3.3" xref="S3.SS1.p1.3.m3.3.3.cmml">c</mi><mo id="S3.SS1.p1.3.m3.4.5.3.2.4" xref="S3.SS1.p1.3.m3.4.5.3.1.cmml">,</mo><mi id="S3.SS1.p1.3.m3.4.4" xref="S3.SS1.p1.3.m3.4.4.cmml">e</mi><mo id="S3.SS1.p1.3.m3.4.5.3.2.5" stretchy="false" xref="S3.SS1.p1.3.m3.4.5.3.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.3.m3.4b"><apply id="S3.SS1.p1.3.m3.4.5.cmml" xref="S3.SS1.p1.3.m3.4.5"><eq id="S3.SS1.p1.3.m3.4.5.1.cmml" xref="S3.SS1.p1.3.m3.4.5.1"></eq><ci id="S3.SS1.p1.3.m3.4.5.2.cmml" xref="S3.SS1.p1.3.m3.4.5.2">𝑓</ci><vector id="S3.SS1.p1.3.m3.4.5.3.1.cmml" xref="S3.SS1.p1.3.m3.4.5.3.2"><ci id="S3.SS1.p1.3.m3.1.1.cmml" xref="S3.SS1.p1.3.m3.1.1">𝑠</ci><ci id="S3.SS1.p1.3.m3.2.2.cmml" xref="S3.SS1.p1.3.m3.2.2">𝑜</ci><ci 
id="S3.SS1.p1.3.m3.3.3.cmml" xref="S3.SS1.p1.3.m3.3.3">𝑐</ci><ci id="S3.SS1.p1.3.m3.4.4.cmml" xref="S3.SS1.p1.3.m3.4.4">𝑒</ci></vector></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.3.m3.4c">f=(s,o,c,e)</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p1.3.m3.4d">italic_f = ( italic_s , italic_o , italic_c , italic_e )</annotation></semantics></math> specifies (1) a skill <math alttext="s" class="ltx_Math" display="inline" id="S3.SS1.p1.4.m4.1"><semantics id="S3.SS1.p1.4.m4.1a"><mi id="S3.SS1.p1.4.m4.1.1" xref="S3.SS1.p1.4.m4.1.1.cmml">s</mi><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.4.m4.1b"><ci id="S3.SS1.p1.4.m4.1.1.cmml" xref="S3.SS1.p1.4.m4.1.1">𝑠</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.4.m4.1c">s</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p1.4.m4.1d">italic_s</annotation></semantics></math> to execute, (2) a target object <math alttext="o" class="ltx_Math" display="inline" id="S3.SS1.p1.5.m5.1"><semantics id="S3.SS1.p1.5.m5.1a"><mi id="S3.SS1.p1.5.m5.1.1" xref="S3.SS1.p1.5.m5.1.1.cmml">o</mi><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.5.m5.1b"><ci id="S3.SS1.p1.5.m5.1.1.cmml" xref="S3.SS1.p1.5.m5.1.1">𝑜</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.5.m5.1c">o</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p1.5.m5.1d">italic_o</annotation></semantics></math> to interact with, (3) captions <math alttext="c" class="ltx_Math" display="inline" id="S3.SS1.p1.6.m6.1"><semantics id="S3.SS1.p1.6.m6.1a"><mi id="S3.SS1.p1.6.m6.1.1" xref="S3.SS1.p1.6.m6.1.1.cmml">c</mi><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.6.m6.1b"><ci id="S3.SS1.p1.6.m6.1.1.cmml" xref="S3.SS1.p1.6.m6.1.1">𝑐</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.6.m6.1c">c</annotation><annotation encoding="application/x-llamapun" 
id="S3.SS1.p1.6.m6.1d">italic_c</annotation></semantics></math> that describe motion attributes, and (4) the emotion or style <math alttext="e" class="ltx_Math" display="inline" id="S3.SS1.p1.7.m7.1"><semantics id="S3.SS1.p1.7.m7.1a"><mi id="S3.SS1.p1.7.m7.1.1" xref="S3.SS1.p1.7.m7.1.1.cmml">e</mi><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.7.m7.1b"><ci id="S3.SS1.p1.7.m7.1.1.cmml" xref="S3.SS1.p1.7.m7.1.1">𝑒</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.7.m7.1c">e</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p1.7.m7.1d">italic_e</annotation></semantics></math> the motion expresses. Inspired by filmmaking, the short script uses only a few key frames to represent a short segment of daily human-scene interaction. We add a concise one-sentence summary <math alttext="u" class="ltx_Math" display="inline" id="S3.SS1.p1.8.m8.1"><semantics id="S3.SS1.p1.8.m8.1a"><mi id="S3.SS1.p1.8.m8.1.1" xref="S3.SS1.p1.8.m8.1.1.cmml">u</mi><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.8.m8.1b"><ci id="S3.SS1.p1.8.m8.1.1.cmml" xref="S3.SS1.p1.8.m8.1.1">𝑢</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.8.m8.1c">u</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p1.8.m8.1d">italic_u</annotation></semantics></math> that encapsulates the core style or emotion and interaction events of the short script. 
We further extract the style or emotion keyword as a distinct label <math alttext="d" class="ltx_Math" display="inline" id="S3.SS1.p1.9.m9.1"><semantics id="S3.SS1.p1.9.m9.1a"><mi id="S3.SS1.p1.9.m9.1.1" xref="S3.SS1.p1.9.m9.1.1.cmml">d</mi><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.9.m9.1b"><ci id="S3.SS1.p1.9.m9.1.1.cmml" xref="S3.SS1.p1.9.m9.1.1">𝑑</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.9.m9.1c">d</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p1.9.m9.1d">italic_d</annotation></semantics></math>, which summarizes the keyframe style labels. Thus, the final form of the short script is <math alttext="p=[\{f_{0},f_{1},...,f_{N}\},u,d]" class="ltx_Math" display="inline" id="S3.SS1.p1.10.m10.4"><semantics id="S3.SS1.p1.10.m10.4a"><mrow id="S3.SS1.p1.10.m10.4.4" xref="S3.SS1.p1.10.m10.4.4.cmml"><mi id="S3.SS1.p1.10.m10.4.4.3" xref="S3.SS1.p1.10.m10.4.4.3.cmml">p</mi><mo id="S3.SS1.p1.10.m10.4.4.2" xref="S3.SS1.p1.10.m10.4.4.2.cmml">=</mo><mrow id="S3.SS1.p1.10.m10.4.4.1.1" xref="S3.SS1.p1.10.m10.4.4.1.2.cmml"><mo id="S3.SS1.p1.10.m10.4.4.1.1.2" stretchy="false" xref="S3.SS1.p1.10.m10.4.4.1.2.cmml">[</mo><mrow id="S3.SS1.p1.10.m10.4.4.1.1.1.3" xref="S3.SS1.p1.10.m10.4.4.1.1.1.4.cmml"><mo id="S3.SS1.p1.10.m10.4.4.1.1.1.3.4" stretchy="false" xref="S3.SS1.p1.10.m10.4.4.1.1.1.4.cmml">{</mo><msub id="S3.SS1.p1.10.m10.4.4.1.1.1.1.1" xref="S3.SS1.p1.10.m10.4.4.1.1.1.1.1.cmml"><mi id="S3.SS1.p1.10.m10.4.4.1.1.1.1.1.2" xref="S3.SS1.p1.10.m10.4.4.1.1.1.1.1.2.cmml">f</mi><mn id="S3.SS1.p1.10.m10.4.4.1.1.1.1.1.3" xref="S3.SS1.p1.10.m10.4.4.1.1.1.1.1.3.cmml">0</mn></msub><mo id="S3.SS1.p1.10.m10.4.4.1.1.1.3.5" xref="S3.SS1.p1.10.m10.4.4.1.1.1.4.cmml">,</mo><msub id="S3.SS1.p1.10.m10.4.4.1.1.1.2.2" xref="S3.SS1.p1.10.m10.4.4.1.1.1.2.2.cmml"><mi id="S3.SS1.p1.10.m10.4.4.1.1.1.2.2.2" xref="S3.SS1.p1.10.m10.4.4.1.1.1.2.2.2.cmml">f</mi><mn id="S3.SS1.p1.10.m10.4.4.1.1.1.2.2.3" 
xref="S3.SS1.p1.10.m10.4.4.1.1.1.2.2.3.cmml">1</mn></msub><mo id="S3.SS1.p1.10.m10.4.4.1.1.1.3.6" xref="S3.SS1.p1.10.m10.4.4.1.1.1.4.cmml">,</mo><mi id="S3.SS1.p1.10.m10.1.1" mathvariant="normal" xref="S3.SS1.p1.10.m10.1.1.cmml">…</mi><mo id="S3.SS1.p1.10.m10.4.4.1.1.1.3.7" xref="S3.SS1.p1.10.m10.4.4.1.1.1.4.cmml">,</mo><msub id="S3.SS1.p1.10.m10.4.4.1.1.1.3.3" xref="S3.SS1.p1.10.m10.4.4.1.1.1.3.3.cmml"><mi id="S3.SS1.p1.10.m10.4.4.1.1.1.3.3.2" xref="S3.SS1.p1.10.m10.4.4.1.1.1.3.3.2.cmml">f</mi><mi id="S3.SS1.p1.10.m10.4.4.1.1.1.3.3.3" xref="S3.SS1.p1.10.m10.4.4.1.1.1.3.3.3.cmml">N</mi></msub><mo id="S3.SS1.p1.10.m10.4.4.1.1.1.3.8" stretchy="false" xref="S3.SS1.p1.10.m10.4.4.1.1.1.4.cmml">}</mo></mrow><mo id="S3.SS1.p1.10.m10.4.4.1.1.3" xref="S3.SS1.p1.10.m10.4.4.1.2.cmml">,</mo><mi id="S3.SS1.p1.10.m10.2.2" xref="S3.SS1.p1.10.m10.2.2.cmml">u</mi><mo id="S3.SS1.p1.10.m10.4.4.1.1.4" xref="S3.SS1.p1.10.m10.4.4.1.2.cmml">,</mo><mi id="S3.SS1.p1.10.m10.3.3" xref="S3.SS1.p1.10.m10.3.3.cmml">d</mi><mo id="S3.SS1.p1.10.m10.4.4.1.1.5" stretchy="false" xref="S3.SS1.p1.10.m10.4.4.1.2.cmml">]</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.10.m10.4b"><apply id="S3.SS1.p1.10.m10.4.4.cmml" xref="S3.SS1.p1.10.m10.4.4"><eq id="S3.SS1.p1.10.m10.4.4.2.cmml" xref="S3.SS1.p1.10.m10.4.4.2"></eq><ci id="S3.SS1.p1.10.m10.4.4.3.cmml" xref="S3.SS1.p1.10.m10.4.4.3">𝑝</ci><list id="S3.SS1.p1.10.m10.4.4.1.2.cmml" xref="S3.SS1.p1.10.m10.4.4.1.1"><set id="S3.SS1.p1.10.m10.4.4.1.1.1.4.cmml" xref="S3.SS1.p1.10.m10.4.4.1.1.1.3"><apply id="S3.SS1.p1.10.m10.4.4.1.1.1.1.1.cmml" xref="S3.SS1.p1.10.m10.4.4.1.1.1.1.1"><csymbol cd="ambiguous" id="S3.SS1.p1.10.m10.4.4.1.1.1.1.1.1.cmml" xref="S3.SS1.p1.10.m10.4.4.1.1.1.1.1">subscript</csymbol><ci id="S3.SS1.p1.10.m10.4.4.1.1.1.1.1.2.cmml" xref="S3.SS1.p1.10.m10.4.4.1.1.1.1.1.2">𝑓</ci><cn id="S3.SS1.p1.10.m10.4.4.1.1.1.1.1.3.cmml" type="integer" xref="S3.SS1.p1.10.m10.4.4.1.1.1.1.1.3">0</cn></apply><apply 
id="S3.SS1.p1.10.m10.4.4.1.1.1.2.2.cmml" xref="S3.SS1.p1.10.m10.4.4.1.1.1.2.2"><csymbol cd="ambiguous" id="S3.SS1.p1.10.m10.4.4.1.1.1.2.2.1.cmml" xref="S3.SS1.p1.10.m10.4.4.1.1.1.2.2">subscript</csymbol><ci id="S3.SS1.p1.10.m10.4.4.1.1.1.2.2.2.cmml" xref="S3.SS1.p1.10.m10.4.4.1.1.1.2.2.2">𝑓</ci><cn id="S3.SS1.p1.10.m10.4.4.1.1.1.2.2.3.cmml" type="integer" xref="S3.SS1.p1.10.m10.4.4.1.1.1.2.2.3">1</cn></apply><ci id="S3.SS1.p1.10.m10.1.1.cmml" xref="S3.SS1.p1.10.m10.1.1">…</ci><apply id="S3.SS1.p1.10.m10.4.4.1.1.1.3.3.cmml" xref="S3.SS1.p1.10.m10.4.4.1.1.1.3.3"><csymbol cd="ambiguous" id="S3.SS1.p1.10.m10.4.4.1.1.1.3.3.1.cmml" xref="S3.SS1.p1.10.m10.4.4.1.1.1.3.3">subscript</csymbol><ci id="S3.SS1.p1.10.m10.4.4.1.1.1.3.3.2.cmml" xref="S3.SS1.p1.10.m10.4.4.1.1.1.3.3.2">𝑓</ci><ci id="S3.SS1.p1.10.m10.4.4.1.1.1.3.3.3.cmml" xref="S3.SS1.p1.10.m10.4.4.1.1.1.3.3.3">𝑁</ci></apply></set><ci id="S3.SS1.p1.10.m10.2.2.cmml" xref="S3.SS1.p1.10.m10.2.2">𝑢</ci><ci id="S3.SS1.p1.10.m10.3.3.cmml" xref="S3.SS1.p1.10.m10.3.3">𝑑</ci></list></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.10.m10.4c">p=[\{f_{0},f_{1},...,f_{N}\},u,d]</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p1.10.m10.4d">italic_p = [ { italic_f start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_f start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_f start_POSTSUBSCRIPT italic_N end_POSTSUBSCRIPT } , italic_u , italic_d ]</annotation></semantics></math>, serving as the foundational building block in the database. We prompt a Large Language Model (LLM) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib1" title=""><span class="ltx_text" style="font-size:90%;">1</span></a>]</cite> to generate a wide range of short scripts by providing it with the available skills, text captions, specific styles, and available objects. 
The LLM is tasked not only with creating coherent and lifelike key frame sequences but also with generating matching summaries <math alttext="u" class="ltx_Math" display="inline" id="S3.SS1.p1.11.m11.1"><semantics id="S3.SS1.p1.11.m11.1a"><mi id="S3.SS1.p1.11.m11.1.1" xref="S3.SS1.p1.11.m11.1.1.cmml">u</mi><annotation-xml encoding="MathML-Content" id="S3.SS1.p1.11.m11.1b"><ci id="S3.SS1.p1.11.m11.1.1.cmml" xref="S3.SS1.p1.11.m11.1.1">𝑢</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS1.p1.11.m11.1c">u</annotation><annotation encoding="application/x-llamapun" id="S3.SS1.p1.11.m11.1d">italic_u</annotation></semantics></math>. These short scripts are further categorized based on their distinct emotion or style labels for better modular organization. To enable retrieval, we employ CLIP <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib31" title=""><span class="ltx_text" style="font-size:90%;">31</span></a>]</cite> to extract embeddings from the summaries of the short scripts. The extracted embeddings act as keys for efficient and precise retrieval within the database.</p> </div> </section> <section class="ltx_subsection" id="S3.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">3.2 </span>Retrieval Augmented Script Generation</h3> <div class="ltx_para" id="S3.SS2.p1"> <p class="ltx_p" id="S3.SS2.p1.1">Long-term script generation with LLMs faces challenges such as redundancy, lack of diversity, and insufficient guidance in maintaining coherent narratives. Previous works, such as <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib47" title=""><span class="ltx_text" style="font-size:90%;">47</span></a>]</cite>, focus on generating limited keyframes with minimal diversity, which constrains their ability to create engaging and robust long-term stories. 
Inspired by Retrieval-Augmented Generation (RAG) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib18" title=""><span class="ltx_text" style="font-size:90%;">18</span></a>]</cite>, we propose a novel Retrieval-Augmented Script Generation (RASG) method to address these issues.</p> </div> <div class="ltx_para" id="S3.SS2.p2"> <p class="ltx_p" id="S3.SS2.p2.1">To enhance long-term script generation, the LLM retrieves and builds upon the pre-generated short scripts based on the user-provided theme in the following steps:</p> </div> <div class="ltx_para" id="S3.SS2.p3"> <p class="ltx_p" id="S3.SS2.p3.1">1) The LLM identifies the <span class="ltx_text ltx_font_italic" id="S3.SS2.p3.1.1">M</span> styles most relevant to the theme, narrowing the scope of retrieval. 2) Semantic Similarity Retrieval: The user-provided theme sentence is encoded as a CLIP feature, which serves as the retrieval query. By computing the cosine distance between the query and the keys, the LLM retrieves the top-<span class="ltx_text ltx_font_italic" id="S3.SS2.p3.1.2">k</span> short scripts for each style, resulting in <span class="ltx_text ltx_font_italic" id="S3.SS2.p3.1.3">M</span> × <span class="ltx_text ltx_font_italic" id="S3.SS2.p3.1.4">k</span> retrieved summaries for further processing. 3) Summary Filtering and Long Script Creation: The retrieved summaries are passed to the LLM. 
Then, based on the given scene layout, the LLM selects and combines suitable summaries into a cohesive narrative by logically concatenating their keyframes.</p> </div> <div class="ltx_para" id="S3.SS2.p4"> <p class="ltx_p" id="S3.SS2.p4.1">To ensure that the resulting skill orderings are executable, we structure skills into tuples, such as <span class="ltx_text ltx_font_italic" id="S3.SS2.p4.1.1">(sit, getup)</span>, <span class="ltx_text ltx_font_italic" id="S3.SS2.p4.1.2">(lie, getup)</span>, <span class="ltx_text ltx_font_italic" id="S3.SS2.p4.1.3">(idle)</span>, <span class="ltx_text ltx_font_italic" id="S3.SS2.p4.1.4">(walk, carry)</span>, <span class="ltx_text ltx_font_italic" id="S3.SS2.p4.1.5">(walk, reach)</span>, etc. Notably, the <span class="ltx_text ltx_font_italic" id="S3.SS2.p4.1.6">walk</span> skill can serve as a transition motion between any two skill tuples, enabling seamless connections across sequences. We use this rule to process the generated keyframes and add transitions for interaction skills.</p> </div> </section> <section class="ltx_subsection ltx_pruned_first" id="S3.SS3"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">3.3 </span>Multi-Condition Controller</h3> <section class="ltx_paragraph" id="S3.SS3.SSS0.Px1"> <h5 class="ltx_title ltx_title_paragraph">Overview</h5> <div class="ltx_para" id="S3.SS3.SSS0.Px1.p1"> <p class="ltx_p" id="S3.SS3.SSS0.Px1.p1.21">Once a long-term script is generated, our goal is to direct a simulated character to perform the key frame sequence in complex 3D scenes. To train characters to complete tasks in a lifelike and stylized manner, we adopt a goal-conditioned RL framework with a text-conditioned discriminator <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib29" title=""><span class="ltx_text" style="font-size:90%;">29</span></a>]</cite>. 
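</p> </div> <div class="ltx_para"> <p class="ltx_p">Returning briefly to the retrieval procedure of Section 3.2, the following is a minimal sketch of steps 1) and 2), not the authors' implementation. It assumes the <span class="ltx_text ltx_font_italic">M</span> style labels have already been chosen, and it replaces CLIP features with toy two-dimensional vectors. The names <span class="ltx_text ltx_font_typewriter">cosine_similarity</span>, <span class="ltx_text ltx_font_typewriter">retrieve_summaries</span>, and <span class="ltx_text ltx_font_typewriter">toy_database</span>, as well as the example summaries, are hypothetical. Ranking by descending cosine similarity is equivalent to ranking by ascending cosine distance.</p> <pre class="ltx_verbatim">
```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_summaries(query, database, styles, k):
    """Return up to k summaries per selected style, most similar first.

    query:    embedding of the theme sentence (stand-in for a CLIP feature)
    database: dict mapping a style label to a list of (key_embedding, summary)
    styles:   the M style labels chosen in step 1 (assumed given here)
    """
    retrieved = []
    for style in styles:
        entries = database.get(style, [])
        ranked = sorted(
            entries,
            key=lambda entry: cosine_similarity(query, entry[0]),
            reverse=True,
        )
        retrieved.extend((style, summary) for _, summary in ranked[:k])
    return retrieved


# Toy 2-D embeddings and made-up summaries; real keys would be
# text-encoder features of the short-script summaries.
toy_database = {
    "happy": [
        ([1.0, 0.0], "walks briskly to the sofa and sits"),
        ([0.0, 1.0], "carries a box to the table"),
        ([0.7, 0.7], "reaches for the shelf"),
    ],
    "tired": [
        ([0.9, 0.1], "shuffles to the bed and lies down"),
        ([0.1, 0.9], "gets up slowly from the chair"),
    ],
}

results = retrieve_summaries([1.0, 0.1], toy_database, ["happy", "tired"], k=2)
for style, summary in results:
    print(style, "-", summary)
```
</pre> <p class="ltx_p">With <span class="ltx_text ltx_font_italic">M</span> = 2 styles and <span class="ltx_text ltx_font_italic">k</span> = 2, the call returns four (style, summary) pairs, which in the pipeline described above would be passed to the LLM for filtering in step 3).</p> </div> <div class="ltx_para"> <p class="ltx_p">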
At each time step <math alttext="t" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px1.p1.1.m1.1"><semantics id="S3.SS3.SSS0.Px1.p1.1.m1.1a"><mi id="S3.SS3.SSS0.Px1.p1.1.m1.1.1" xref="S3.SS3.SSS0.Px1.p1.1.m1.1.1.cmml">t</mi><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px1.p1.1.m1.1b"><ci id="S3.SS3.SSS0.Px1.p1.1.m1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.1.m1.1.1">𝑡</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px1.p1.1.m1.1c">t</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px1.p1.1.m1.1d">italic_t</annotation></semantics></math>, the policy <math alttext="\pi({\mathbf{a}}_{t}|{\mathbf{s}}_{t},{\mathbf{h}}_{t},{\mathbf{g}}_{t},{% \mathbf{z}})" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px1.p1.2.m2.2"><semantics id="S3.SS3.SSS0.Px1.p1.2.m2.2a"><mrow id="S3.SS3.SSS0.Px1.p1.2.m2.2.2" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.cmml"><mi id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.3" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.3.cmml">π</mi><mo id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.2" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.2.cmml">⁢</mo><mrow id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.cmml"><mo id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.2" stretchy="false" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.cmml">(</mo><mrow id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.cmml"><msub id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.5" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.5.cmml"><mi id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.5.2" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.5.2.cmml">𝐚</mi><mi id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.5.3" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.5.3.cmml">t</mi></msub><mo fence="false" id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.4" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.4.cmml">|</mo><mrow id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.3" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.4.cmml"><msub id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.1.1.1" 
xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.1.1.1.cmml"><mi id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.1.1.1.2" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.1.1.1.2.cmml">𝐬</mi><mi id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.1.1.1.3" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.1.1.1.3.cmml">t</mi></msub><mo id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.3.4" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.4.cmml">,</mo><msub id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.2.2.2" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.2.2.2.cmml"><mi id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.2.2.2.2" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.2.2.2.2.cmml">𝐡</mi><mi id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.2.2.2.3" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.2.2.2.3.cmml">t</mi></msub><mo id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.3.5" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.4.cmml">,</mo><msub id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.3.3" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.3.3.cmml"><mi id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.3.3.2" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.3.3.2.cmml">𝐠</mi><mi id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.3.3.3" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.3.3.3.cmml">t</mi></msub><mo id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.3.6" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.4.cmml">,</mo><mi id="S3.SS3.SSS0.Px1.p1.2.m2.1.1" xref="S3.SS3.SSS0.Px1.p1.2.m2.1.1.cmml">𝐳</mi></mrow></mrow><mo id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.3" stretchy="false" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px1.p1.2.m2.2b"><apply id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.cmml" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2"><times id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.2.cmml" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.2"></times><ci id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.3.cmml" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.3">𝜋</ci><apply id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1"><csymbol cd="latexml" id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.4.cmml" 
xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.4">conditional</csymbol><apply id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.5.cmml" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.5"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.5.1.cmml" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.5">subscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.5.2.cmml" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.5.2">𝐚</ci><ci id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.5.3.cmml" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.5.3">𝑡</ci></apply><list id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.4.cmml" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.3"><apply id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.1.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.1.1.1">subscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.1.1.1.2.cmml" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.1.1.1.2">𝐬</ci><ci id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.1.1.1.3.cmml" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.1.1.1.3">𝑡</ci></apply><apply id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.2.2.2.cmml" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.2.2.2"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.2.2.2.1.cmml" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.2.2.2">subscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.2.2.2.2.cmml" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.2.2.2.2">𝐡</ci><ci id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.2.2.2.3.cmml" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.2.2.2.3">𝑡</ci></apply><apply id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.3.3.cmml" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.3.3"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.3.3.1.cmml" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.3.3">subscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.3.3.2.cmml" xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.3.3.2">𝐠</ci><ci id="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.3.3.3.cmml" 
xref="S3.SS3.SSS0.Px1.p1.2.m2.2.2.1.1.1.3.3.3.3">𝑡</ci></apply><ci id="S3.SS3.SSS0.Px1.p1.2.m2.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.2.m2.1.1">𝐳</ci></list></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px1.p1.2.m2.2c">\pi({\mathbf{a}}_{t}|{\mathbf{s}}_{t},{\mathbf{h}}_{t},{\mathbf{g}}_{t},{% \mathbf{z}})</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px1.p1.2.m2.2d">italic_π ( bold_a start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT | bold_s start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , bold_h start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , bold_g start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , bold_z )</annotation></semantics></math> receives the humanoid proprioception <math alttext="{\mathbf{s}}_{t}\in\mathcal{S}" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px1.p1.3.m3.1"><semantics id="S3.SS3.SSS0.Px1.p1.3.m3.1a"><mrow id="S3.SS3.SSS0.Px1.p1.3.m3.1.1" xref="S3.SS3.SSS0.Px1.p1.3.m3.1.1.cmml"><msub id="S3.SS3.SSS0.Px1.p1.3.m3.1.1.2" xref="S3.SS3.SSS0.Px1.p1.3.m3.1.1.2.cmml"><mi id="S3.SS3.SSS0.Px1.p1.3.m3.1.1.2.2" xref="S3.SS3.SSS0.Px1.p1.3.m3.1.1.2.2.cmml">𝐬</mi><mi id="S3.SS3.SSS0.Px1.p1.3.m3.1.1.2.3" xref="S3.SS3.SSS0.Px1.p1.3.m3.1.1.2.3.cmml">t</mi></msub><mo id="S3.SS3.SSS0.Px1.p1.3.m3.1.1.1" xref="S3.SS3.SSS0.Px1.p1.3.m3.1.1.1.cmml">∈</mo><mi class="ltx_font_mathcaligraphic" id="S3.SS3.SSS0.Px1.p1.3.m3.1.1.3" xref="S3.SS3.SSS0.Px1.p1.3.m3.1.1.3.cmml">𝒮</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px1.p1.3.m3.1b"><apply id="S3.SS3.SSS0.Px1.p1.3.m3.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.3.m3.1.1"><in id="S3.SS3.SSS0.Px1.p1.3.m3.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.3.m3.1.1.1"></in><apply id="S3.SS3.SSS0.Px1.p1.3.m3.1.1.2.cmml" xref="S3.SS3.SSS0.Px1.p1.3.m3.1.1.2"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.3.m3.1.1.2.1.cmml" xref="S3.SS3.SSS0.Px1.p1.3.m3.1.1.2">subscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.3.m3.1.1.2.2.cmml" 
xref="S3.SS3.SSS0.Px1.p1.3.m3.1.1.2.2">𝐬</ci><ci id="S3.SS3.SSS0.Px1.p1.3.m3.1.1.2.3.cmml" xref="S3.SS3.SSS0.Px1.p1.3.m3.1.1.2.3">𝑡</ci></apply><ci id="S3.SS3.SSS0.Px1.p1.3.m3.1.1.3.cmml" xref="S3.SS3.SSS0.Px1.p1.3.m3.1.1.3">𝒮</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px1.p1.3.m3.1c">{\mathbf{s}}_{t}\in\mathcal{S}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px1.p1.3.m3.1d">bold_s start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ∈ caligraphic_S</annotation></semantics></math>, an egocentric heightmap <math alttext="{\mathbf{h}}_{t}\in\mathcal{H}" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px1.p1.4.m4.1"><semantics id="S3.SS3.SSS0.Px1.p1.4.m4.1a"><mrow id="S3.SS3.SSS0.Px1.p1.4.m4.1.1" xref="S3.SS3.SSS0.Px1.p1.4.m4.1.1.cmml"><msub id="S3.SS3.SSS0.Px1.p1.4.m4.1.1.2" xref="S3.SS3.SSS0.Px1.p1.4.m4.1.1.2.cmml"><mi id="S3.SS3.SSS0.Px1.p1.4.m4.1.1.2.2" xref="S3.SS3.SSS0.Px1.p1.4.m4.1.1.2.2.cmml">𝐡</mi><mi id="S3.SS3.SSS0.Px1.p1.4.m4.1.1.2.3" xref="S3.SS3.SSS0.Px1.p1.4.m4.1.1.2.3.cmml">t</mi></msub><mo id="S3.SS3.SSS0.Px1.p1.4.m4.1.1.1" xref="S3.SS3.SSS0.Px1.p1.4.m4.1.1.1.cmml">∈</mo><mi class="ltx_font_mathcaligraphic" id="S3.SS3.SSS0.Px1.p1.4.m4.1.1.3" xref="S3.SS3.SSS0.Px1.p1.4.m4.1.1.3.cmml">ℋ</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px1.p1.4.m4.1b"><apply id="S3.SS3.SSS0.Px1.p1.4.m4.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.4.m4.1.1"><in id="S3.SS3.SSS0.Px1.p1.4.m4.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.4.m4.1.1.1"></in><apply id="S3.SS3.SSS0.Px1.p1.4.m4.1.1.2.cmml" xref="S3.SS3.SSS0.Px1.p1.4.m4.1.1.2"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.4.m4.1.1.2.1.cmml" xref="S3.SS3.SSS0.Px1.p1.4.m4.1.1.2">subscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.4.m4.1.1.2.2.cmml" xref="S3.SS3.SSS0.Px1.p1.4.m4.1.1.2.2">𝐡</ci><ci id="S3.SS3.SSS0.Px1.p1.4.m4.1.1.2.3.cmml" xref="S3.SS3.SSS0.Px1.p1.4.m4.1.1.2.3">𝑡</ci></apply><ci id="S3.SS3.SSS0.Px1.p1.4.m4.1.1.3.cmml" 
xref="S3.SS3.SSS0.Px1.p1.4.m4.1.1.3">ℋ</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px1.p1.4.m4.1c">{\mathbf{h}}_{t}\in\mathcal{H}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px1.p1.4.m4.1d">bold_h start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ∈ caligraphic_H</annotation></semantics></math>, a task-specific goal state <math alttext="{\mathbf{g}}_{t}\in\mathcal{G}" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px1.p1.5.m5.1"><semantics id="S3.SS3.SSS0.Px1.p1.5.m5.1a"><mrow id="S3.SS3.SSS0.Px1.p1.5.m5.1.1" xref="S3.SS3.SSS0.Px1.p1.5.m5.1.1.cmml"><msub id="S3.SS3.SSS0.Px1.p1.5.m5.1.1.2" xref="S3.SS3.SSS0.Px1.p1.5.m5.1.1.2.cmml"><mi id="S3.SS3.SSS0.Px1.p1.5.m5.1.1.2.2" xref="S3.SS3.SSS0.Px1.p1.5.m5.1.1.2.2.cmml">𝐠</mi><mi id="S3.SS3.SSS0.Px1.p1.5.m5.1.1.2.3" xref="S3.SS3.SSS0.Px1.p1.5.m5.1.1.2.3.cmml">t</mi></msub><mo id="S3.SS3.SSS0.Px1.p1.5.m5.1.1.1" xref="S3.SS3.SSS0.Px1.p1.5.m5.1.1.1.cmml">∈</mo><mi class="ltx_font_mathcaligraphic" id="S3.SS3.SSS0.Px1.p1.5.m5.1.1.3" xref="S3.SS3.SSS0.Px1.p1.5.m5.1.1.3.cmml">𝒢</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px1.p1.5.m5.1b"><apply id="S3.SS3.SSS0.Px1.p1.5.m5.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.5.m5.1.1"><in id="S3.SS3.SSS0.Px1.p1.5.m5.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.5.m5.1.1.1"></in><apply id="S3.SS3.SSS0.Px1.p1.5.m5.1.1.2.cmml" xref="S3.SS3.SSS0.Px1.p1.5.m5.1.1.2"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.5.m5.1.1.2.1.cmml" xref="S3.SS3.SSS0.Px1.p1.5.m5.1.1.2">subscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.5.m5.1.1.2.2.cmml" xref="S3.SS3.SSS0.Px1.p1.5.m5.1.1.2.2">𝐠</ci><ci id="S3.SS3.SSS0.Px1.p1.5.m5.1.1.2.3.cmml" xref="S3.SS3.SSS0.Px1.p1.5.m5.1.1.2.3">𝑡</ci></apply><ci id="S3.SS3.SSS0.Px1.p1.5.m5.1.1.3.cmml" xref="S3.SS3.SSS0.Px1.p1.5.m5.1.1.3">𝒢</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px1.p1.5.m5.1c">{\mathbf{g}}_{t}\in\mathcal{G}</annotation><annotation 
encoding="application/x-llamapun" id="S3.SS3.SSS0.Px1.p1.5.m5.1d">bold_g start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ∈ caligraphic_G</annotation></semantics></math>, and a language embedding <math alttext="{\mathbf{z}}\in\mathcal{Z}" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px1.p1.6.m6.1"><semantics id="S3.SS3.SSS0.Px1.p1.6.m6.1a"><mrow id="S3.SS3.SSS0.Px1.p1.6.m6.1.1" xref="S3.SS3.SSS0.Px1.p1.6.m6.1.1.cmml"><mi id="S3.SS3.SSS0.Px1.p1.6.m6.1.1.2" xref="S3.SS3.SSS0.Px1.p1.6.m6.1.1.2.cmml">𝐳</mi><mo id="S3.SS3.SSS0.Px1.p1.6.m6.1.1.1" xref="S3.SS3.SSS0.Px1.p1.6.m6.1.1.1.cmml">∈</mo><mi class="ltx_font_mathcaligraphic" id="S3.SS3.SSS0.Px1.p1.6.m6.1.1.3" xref="S3.SS3.SSS0.Px1.p1.6.m6.1.1.3.cmml">𝒵</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px1.p1.6.m6.1b"><apply id="S3.SS3.SSS0.Px1.p1.6.m6.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.6.m6.1.1"><in id="S3.SS3.SSS0.Px1.p1.6.m6.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.6.m6.1.1.1"></in><ci id="S3.SS3.SSS0.Px1.p1.6.m6.1.1.2.cmml" xref="S3.SS3.SSS0.Px1.p1.6.m6.1.1.2">𝐳</ci><ci id="S3.SS3.SSS0.Px1.p1.6.m6.1.1.3.cmml" xref="S3.SS3.SSS0.Px1.p1.6.m6.1.1.3">𝒵</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px1.p1.6.m6.1c">{\mathbf{z}}\in\mathcal{Z}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px1.p1.6.m6.1d">bold_z ∈ caligraphic_Z</annotation></semantics></math>. 
The goal <math alttext="{\mathbf{g}}_{t}" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px1.p1.7.m7.1"><semantics id="S3.SS3.SSS0.Px1.p1.7.m7.1a"><msub id="S3.SS3.SSS0.Px1.p1.7.m7.1.1" xref="S3.SS3.SSS0.Px1.p1.7.m7.1.1.cmml"><mi id="S3.SS3.SSS0.Px1.p1.7.m7.1.1.2" xref="S3.SS3.SSS0.Px1.p1.7.m7.1.1.2.cmml">𝐠</mi><mi id="S3.SS3.SSS0.Px1.p1.7.m7.1.1.3" xref="S3.SS3.SSS0.Px1.p1.7.m7.1.1.3.cmml">t</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px1.p1.7.m7.1b"><apply id="S3.SS3.SSS0.Px1.p1.7.m7.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.7.m7.1.1"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.7.m7.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.7.m7.1.1">subscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.7.m7.1.1.2.cmml" xref="S3.SS3.SSS0.Px1.p1.7.m7.1.1.2">𝐠</ci><ci id="S3.SS3.SSS0.Px1.p1.7.m7.1.1.3.cmml" xref="S3.SS3.SSS0.Px1.p1.7.m7.1.1.3">𝑡</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px1.p1.7.m7.1c">{\mathbf{g}}_{t}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px1.p1.7.m7.1d">bold_g start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math> specifies the high-level task objective that the character should achieve, such as making contact with a piece of furniture or moving an object to a target coordinate. 
The egocentric heightmap <math alttext="{\mathbf{h}}_{t}" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px1.p1.8.m8.1"><semantics id="S3.SS3.SSS0.Px1.p1.8.m8.1a"><msub id="S3.SS3.SSS0.Px1.p1.8.m8.1.1" xref="S3.SS3.SSS0.Px1.p1.8.m8.1.1.cmml"><mi id="S3.SS3.SSS0.Px1.p1.8.m8.1.1.2" xref="S3.SS3.SSS0.Px1.p1.8.m8.1.1.2.cmml">𝐡</mi><mi id="S3.SS3.SSS0.Px1.p1.8.m8.1.1.3" xref="S3.SS3.SSS0.Px1.p1.8.m8.1.1.3.cmml">t</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px1.p1.8.m8.1b"><apply id="S3.SS3.SSS0.Px1.p1.8.m8.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.8.m8.1.1"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.8.m8.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.8.m8.1.1">subscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.8.m8.1.1.2.cmml" xref="S3.SS3.SSS0.Px1.p1.8.m8.1.1.2">𝐡</ci><ci id="S3.SS3.SSS0.Px1.p1.8.m8.1.1.3.cmml" xref="S3.SS3.SSS0.Px1.p1.8.m8.1.1.3">𝑡</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px1.p1.8.m8.1c">{\mathbf{h}}_{t}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px1.p1.8.m8.1d">bold_h start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math> represents the geometry surrounding the character. The language embedding <math alttext="{\mathbf{z}}" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px1.p1.9.m9.1"><semantics id="S3.SS3.SSS0.Px1.p1.9.m9.1a"><mi id="S3.SS3.SSS0.Px1.p1.9.m9.1.1" xref="S3.SS3.SSS0.Px1.p1.9.m9.1.1.cmml">𝐳</mi><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px1.p1.9.m9.1b"><ci id="S3.SS3.SSS0.Px1.p1.9.m9.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.9.m9.1.1">𝐳</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px1.p1.9.m9.1c">{\mathbf{z}}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px1.p1.9.m9.1d">bold_z</annotation></semantics></math> specifies the style in which the character should perform the task, such as walking excitedly or sitting with legs crossed. 
The policy <math alttext="\pi" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px1.p1.10.m10.1"><semantics id="S3.SS3.SSS0.Px1.p1.10.m10.1a"><mi id="S3.SS3.SSS0.Px1.p1.10.m10.1.1" xref="S3.SS3.SSS0.Px1.p1.10.m10.1.1.cmml">π</mi><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px1.p1.10.m10.1b"><ci id="S3.SS3.SSS0.Px1.p1.10.m10.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.10.m10.1.1">𝜋</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px1.p1.10.m10.1c">\pi</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px1.p1.10.m10.1d">italic_π</annotation></semantics></math> then samples an action <math alttext="\mathbf{a}_{t}\in\mathcal{A}" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px1.p1.11.m11.1"><semantics id="S3.SS3.SSS0.Px1.p1.11.m11.1a"><mrow id="S3.SS3.SSS0.Px1.p1.11.m11.1.1" xref="S3.SS3.SSS0.Px1.p1.11.m11.1.1.cmml"><msub id="S3.SS3.SSS0.Px1.p1.11.m11.1.1.2" xref="S3.SS3.SSS0.Px1.p1.11.m11.1.1.2.cmml"><mi id="S3.SS3.SSS0.Px1.p1.11.m11.1.1.2.2" xref="S3.SS3.SSS0.Px1.p1.11.m11.1.1.2.2.cmml">𝐚</mi><mi id="S3.SS3.SSS0.Px1.p1.11.m11.1.1.2.3" xref="S3.SS3.SSS0.Px1.p1.11.m11.1.1.2.3.cmml">t</mi></msub><mo id="S3.SS3.SSS0.Px1.p1.11.m11.1.1.1" xref="S3.SS3.SSS0.Px1.p1.11.m11.1.1.1.cmml">∈</mo><mi class="ltx_font_mathcaligraphic" id="S3.SS3.SSS0.Px1.p1.11.m11.1.1.3" xref="S3.SS3.SSS0.Px1.p1.11.m11.1.1.3.cmml">𝒜</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px1.p1.11.m11.1b"><apply id="S3.SS3.SSS0.Px1.p1.11.m11.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.11.m11.1.1"><in id="S3.SS3.SSS0.Px1.p1.11.m11.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.11.m11.1.1.1"></in><apply id="S3.SS3.SSS0.Px1.p1.11.m11.1.1.2.cmml" xref="S3.SS3.SSS0.Px1.p1.11.m11.1.1.2"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.11.m11.1.1.2.1.cmml" xref="S3.SS3.SSS0.Px1.p1.11.m11.1.1.2">subscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.11.m11.1.1.2.2.cmml" xref="S3.SS3.SSS0.Px1.p1.11.m11.1.1.2.2">𝐚</ci><ci id="S3.SS3.SSS0.Px1.p1.11.m11.1.1.2.3.cmml" 
xref="S3.SS3.SSS0.Px1.p1.11.m11.1.1.2.3">𝑡</ci></apply><ci id="S3.SS3.SSS0.Px1.p1.11.m11.1.1.3.cmml" xref="S3.SS3.SSS0.Px1.p1.11.m11.1.1.3">𝒜</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px1.p1.11.m11.1c">\mathbf{a}_{t}\in\mathcal{A}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px1.p1.11.m11.1d">bold_a start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ∈ caligraphic_A</annotation></semantics></math>. After the action <math alttext="\mathbf{a}_{t}" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px1.p1.12.m12.1"><semantics id="S3.SS3.SSS0.Px1.p1.12.m12.1a"><msub id="S3.SS3.SSS0.Px1.p1.12.m12.1.1" xref="S3.SS3.SSS0.Px1.p1.12.m12.1.1.cmml"><mi id="S3.SS3.SSS0.Px1.p1.12.m12.1.1.2" xref="S3.SS3.SSS0.Px1.p1.12.m12.1.1.2.cmml">𝐚</mi><mi id="S3.SS3.SSS0.Px1.p1.12.m12.1.1.3" xref="S3.SS3.SSS0.Px1.p1.12.m12.1.1.3.cmml">t</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px1.p1.12.m12.1b"><apply id="S3.SS3.SSS0.Px1.p1.12.m12.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.12.m12.1.1"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.12.m12.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.12.m12.1.1">subscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.12.m12.1.1.2.cmml" xref="S3.SS3.SSS0.Px1.p1.12.m12.1.1.2">𝐚</ci><ci id="S3.SS3.SSS0.Px1.p1.12.m12.1.1.3.cmml" xref="S3.SS3.SSS0.Px1.p1.12.m12.1.1.3">𝑡</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px1.p1.12.m12.1c">\mathbf{a}_{t}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px1.p1.12.m12.1d">bold_a start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math> is applied, the environment transitions to the next state and the policy receives a reward <math alttext="r_{t}" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px1.p1.13.m13.1"><semantics id="S3.SS3.SSS0.Px1.p1.13.m13.1a"><msub id="S3.SS3.SSS0.Px1.p1.13.m13.1.1" xref="S3.SS3.SSS0.Px1.p1.13.m13.1.1.cmml"><mi id="S3.SS3.SSS0.Px1.p1.13.m13.1.1.2" 
xref="S3.SS3.SSS0.Px1.p1.13.m13.1.1.2.cmml">r</mi><mi id="S3.SS3.SSS0.Px1.p1.13.m13.1.1.3" xref="S3.SS3.SSS0.Px1.p1.13.m13.1.1.3.cmml">t</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px1.p1.13.m13.1b"><apply id="S3.SS3.SSS0.Px1.p1.13.m13.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.13.m13.1.1"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.13.m13.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.13.m13.1.1">subscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.13.m13.1.1.2.cmml" xref="S3.SS3.SSS0.Px1.p1.13.m13.1.1.2">𝑟</ci><ci id="S3.SS3.SSS0.Px1.p1.13.m13.1.1.3.cmml" xref="S3.SS3.SSS0.Px1.p1.13.m13.1.1.3">𝑡</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px1.p1.13.m13.1c">r_{t}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px1.p1.13.m13.1d">italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math>. The objective is to learn a policy that maximizes the expected discounted return <math alttext="J(\pi)=\mathbb{E}_{p(\tau|\pi)}\left[\sum_{t=0}^{T-1}\gamma^{t}r_{t}\right]" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px1.p1.14.m14.3"><semantics id="S3.SS3.SSS0.Px1.p1.14.m14.3a"><mrow id="S3.SS3.SSS0.Px1.p1.14.m14.3.3" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.cmml"><mrow id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.3" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.3.cmml"><mi id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.3.2" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.3.2.cmml">J</mi><mo id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.3.1" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.3.1.cmml">⁢</mo><mrow id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.3.3.2" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.3.cmml"><mo id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.3.3.2.1" stretchy="false" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.3.cmml">(</mo><mi id="S3.SS3.SSS0.Px1.p1.14.m14.2.2" xref="S3.SS3.SSS0.Px1.p1.14.m14.2.2.cmml">π</mi><mo id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.3.3.2.2" stretchy="false" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.3.cmml">)</mo></mrow></mrow><mo 
id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.2" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.2.cmml">=</mo><mrow id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.cmml"><msub id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.3" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.3.cmml"><mi id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.3.2" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.3.2.cmml">𝔼</mi><mrow id="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1" xref="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.cmml"><mi id="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.3" xref="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.3.cmml">p</mi><mo id="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.2" xref="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.2.cmml">⁢</mo><mrow id="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.1.1" xref="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.1.1.1.cmml"><mo id="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.1.1.2" stretchy="false" xref="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.1.1.1.cmml">(</mo><mrow id="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.1.1.1" xref="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.1.1.1.cmml"><mi id="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.1.1.1.2" xref="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.1.1.1.2.cmml">τ</mi><mo fence="false" id="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.1.1.1.1" xref="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.1.1.1.1.cmml">|</mo><mi id="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.1.1.1.3" xref="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.1.1.1.3.cmml">π</mi></mrow><mo id="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.1.1.3" stretchy="false" xref="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.1.1.1.cmml">)</mo></mrow></mrow></msub><mo id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.2" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.2.cmml">⁢</mo><mrow id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.2.cmml"><mo id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.2" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.2.1.cmml">[</mo><mrow id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.cmml"><msubsup id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.cmml"><mo id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.2.2" 
lspace="0em" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.2.2.cmml">∑</mo><mrow id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.2.3" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.2.3.cmml"><mi id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.2.3.2" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.2.3.2.cmml">t</mi><mo id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.2.3.1" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.2.3.1.cmml">=</mo><mn id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.2.3.3" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.2.3.3.cmml">0</mn></mrow><mrow id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.3" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.3.cmml"><mi id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.3.2" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.3.2.cmml">T</mi><mo id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.3.1" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.3.1.cmml">−</mo><mn id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.3.3" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.3.3.cmml">1</mn></mrow></msubsup><mrow id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.cmml"><msup id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.2" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.2.cmml"><mi id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.2.2" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.2.2.cmml">γ</mi><mi id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.2.3" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.2.3.cmml">t</mi></msup><mo id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.1" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.1.cmml">⁢</mo><msub id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.3" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.3.cmml"><mi id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.3.2" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.3.2.cmml">r</mi><mi id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.3.3" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.3.3.cmml">t</mi></msub></mrow></mrow><mo id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.3" 
xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.2.1.cmml">]</mo></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px1.p1.14.m14.3b"><apply id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3"><eq id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.2.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.2"></eq><apply id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.3.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.3"><times id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.3.1.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.3.1"></times><ci id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.3.2.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.3.2">𝐽</ci><ci id="S3.SS3.SSS0.Px1.p1.14.m14.2.2.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.2.2">𝜋</ci></apply><apply id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1"><times id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.2.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.2"></times><apply id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.3.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.3"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.3.1.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.3">subscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.3.2.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.3.2">𝔼</ci><apply id="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1"><times id="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.2.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.2"></times><ci id="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.3.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.3">𝑝</ci><apply id="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.1.1"><csymbol cd="latexml" id="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.1.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.1.1.1.1">conditional</csymbol><ci id="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.1.1.1.2.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.1.1.1.2">𝜏</ci><ci id="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.1.1.1.3.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.1.1.1.1.1.1.3">𝜋</ci></apply></apply></apply><apply 
id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.2.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1"><csymbol cd="latexml" id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.2.1.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.2">delimited-[]</csymbol><apply id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1"><apply id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1">superscript</csymbol><apply id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.2.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.2.1.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1">subscript</csymbol><sum id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.2.2.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.2.2"></sum><apply id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.2.3.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.2.3"><eq id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.2.3.1.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.2.3.1"></eq><ci id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.2.3.2.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.2.3.2">𝑡</ci><cn id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.2.3.3.cmml" type="integer" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.2.3.3">0</cn></apply></apply><apply id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.3.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.3"><minus id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.3.1.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.3.1"></minus><ci id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.3.2.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.3.2">𝑇</ci><cn id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.3.3.cmml" type="integer" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.1.3.3">1</cn></apply></apply><apply id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.cmml" 
xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2"><times id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.1.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.1"></times><apply id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.2.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.2"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.2.1.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.2">superscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.2.2.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.2.2">𝛾</ci><ci id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.2.3.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.2.3">𝑡</ci></apply><apply id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.3.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.3"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.3.1.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.3">subscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.3.2.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.3.2">𝑟</ci><ci id="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.3.3.cmml" xref="S3.SS3.SSS0.Px1.p1.14.m14.3.3.1.1.1.1.2.3.3">𝑡</ci></apply></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px1.p1.14.m14.3c">J(\pi)=\mathbb{E}_{p(\tau|\pi)}\left[\sum_{t=0}^{T-1}\gamma^{t}r_{t}\right]</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px1.p1.14.m14.3d">italic_J ( italic_π ) = blackboard_E start_POSTSUBSCRIPT italic_p ( italic_τ | italic_π ) end_POSTSUBSCRIPT [ ∑ start_POSTSUBSCRIPT italic_t = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T - 1 end_POSTSUPERSCRIPT italic_γ start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ]</annotation></semantics></math>, where <math alttext="T" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px1.p1.15.m15.1"><semantics id="S3.SS3.SSS0.Px1.p1.15.m15.1a"><mi id="S3.SS3.SSS0.Px1.p1.15.m15.1.1" 
xref="S3.SS3.SSS0.Px1.p1.15.m15.1.1.cmml">T</mi><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px1.p1.15.m15.1b"><ci id="S3.SS3.SSS0.Px1.p1.15.m15.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.15.m15.1.1">𝑇</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px1.p1.15.m15.1c">T</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px1.p1.15.m15.1d">italic_T</annotation></semantics></math> is the horizon length and <math alttext="\gamma\in[0,1]" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px1.p1.16.m16.2"><semantics id="S3.SS3.SSS0.Px1.p1.16.m16.2a"><mrow id="S3.SS3.SSS0.Px1.p1.16.m16.2.3" xref="S3.SS3.SSS0.Px1.p1.16.m16.2.3.cmml"><mi id="S3.SS3.SSS0.Px1.p1.16.m16.2.3.2" xref="S3.SS3.SSS0.Px1.p1.16.m16.2.3.2.cmml">γ</mi><mo id="S3.SS3.SSS0.Px1.p1.16.m16.2.3.1" xref="S3.SS3.SSS0.Px1.p1.16.m16.2.3.1.cmml">∈</mo><mrow id="S3.SS3.SSS0.Px1.p1.16.m16.2.3.3.2" xref="S3.SS3.SSS0.Px1.p1.16.m16.2.3.3.1.cmml"><mo id="S3.SS3.SSS0.Px1.p1.16.m16.2.3.3.2.1" stretchy="false" xref="S3.SS3.SSS0.Px1.p1.16.m16.2.3.3.1.cmml">[</mo><mn id="S3.SS3.SSS0.Px1.p1.16.m16.1.1" xref="S3.SS3.SSS0.Px1.p1.16.m16.1.1.cmml">0</mn><mo id="S3.SS3.SSS0.Px1.p1.16.m16.2.3.3.2.2" xref="S3.SS3.SSS0.Px1.p1.16.m16.2.3.3.1.cmml">,</mo><mn id="S3.SS3.SSS0.Px1.p1.16.m16.2.2" xref="S3.SS3.SSS0.Px1.p1.16.m16.2.2.cmml">1</mn><mo id="S3.SS3.SSS0.Px1.p1.16.m16.2.3.3.2.3" stretchy="false" xref="S3.SS3.SSS0.Px1.p1.16.m16.2.3.3.1.cmml">]</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px1.p1.16.m16.2b"><apply id="S3.SS3.SSS0.Px1.p1.16.m16.2.3.cmml" xref="S3.SS3.SSS0.Px1.p1.16.m16.2.3"><in id="S3.SS3.SSS0.Px1.p1.16.m16.2.3.1.cmml" xref="S3.SS3.SSS0.Px1.p1.16.m16.2.3.1"></in><ci id="S3.SS3.SSS0.Px1.p1.16.m16.2.3.2.cmml" xref="S3.SS3.SSS0.Px1.p1.16.m16.2.3.2">𝛾</ci><interval closure="closed" id="S3.SS3.SSS0.Px1.p1.16.m16.2.3.3.1.cmml" xref="S3.SS3.SSS0.Px1.p1.16.m16.2.3.3.2"><cn id="S3.SS3.SSS0.Px1.p1.16.m16.1.1.cmml" type="integer" 
xref="S3.SS3.SSS0.Px1.p1.16.m16.1.1">0</cn><cn id="S3.SS3.SSS0.Px1.p1.16.m16.2.2.cmml" type="integer" xref="S3.SS3.SSS0.Px1.p1.16.m16.2.2">1</cn></interval></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px1.p1.16.m16.2c">\gamma\in[0,1]</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px1.p1.16.m16.2d">italic_γ ∈ [ 0 , 1 ]</annotation></semantics></math> defines the discount factor. In order to train the policy <math alttext="\pi" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px1.p1.17.m17.1"><semantics id="S3.SS3.SSS0.Px1.p1.17.m17.1a"><mi id="S3.SS3.SSS0.Px1.p1.17.m17.1.1" xref="S3.SS3.SSS0.Px1.p1.17.m17.1.1.cmml">π</mi><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px1.p1.17.m17.1b"><ci id="S3.SS3.SSS0.Px1.p1.17.m17.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.17.m17.1.1">𝜋</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px1.p1.17.m17.1c">\pi</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px1.p1.17.m17.1d">italic_π</annotation></semantics></math> to perform the task using diverse motion styles, we utilize a reward function consisting of two components: <math alttext="r_{t}=\lambda^{\text{style}}r^{\text{style}}_{t}+\lambda^{\text{task}}r^{\text% {task}}_{t}," class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px1.p1.18.m18.1"><semantics id="S3.SS3.SSS0.Px1.p1.18.m18.1a"><mrow id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.cmml"><mrow id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.cmml"><msub id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.2" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.2.cmml"><mi id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.2.2" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.2.2.cmml">r</mi><mi id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.2.3" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.2.3.cmml">t</mi></msub><mo id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.1" 
xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.1.cmml">=</mo><mrow id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.cmml"><mrow id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.cmml"><msup id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.2" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.2.cmml"><mi id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.2.2" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.2.2.cmml">λ</mi><mtext id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.2.3" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.2.3a.cmml">style</mtext></msup><mo id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.1" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.1.cmml">⁢</mo><msubsup id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3.cmml"><mi id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3.2.2" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3.2.2.cmml">r</mi><mi id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3.3" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3.3.cmml">t</mi><mtext id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3.2.3" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3.2.3a.cmml">style</mtext></msubsup></mrow><mo id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.1" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.1.cmml">+</mo><mrow id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.cmml"><msup id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.2" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.2.cmml"><mi id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.2.2" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.2.2.cmml">λ</mi><mtext id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.2.3" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.2.3a.cmml">task</mtext></msup><mo id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.1" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.1.cmml">⁢</mo><msubsup id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3.cmml"><mi id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3.2.2" 
xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3.2.2.cmml">r</mi><mi id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3.3" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3.3.cmml">t</mi><mtext id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3.2.3" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3.2.3a.cmml">task</mtext></msubsup></mrow></mrow></mrow><mo id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.2" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.cmml">,</mo></mrow><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px1.p1.18.m18.1b"><apply id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1"><eq id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.1"></eq><apply id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.2.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.2"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.2.1.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.2">subscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.2.2.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.2.2">𝑟</ci><ci id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.2.3.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.2.3">𝑡</ci></apply><apply id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3"><plus id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.1.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.1"></plus><apply id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2"><times id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.1.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.1"></times><apply id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.2.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.2"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.2.1.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.2">superscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.2.2.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.2.2">𝜆</ci><ci id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.2.3a.cmml" 
xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.2.3"><mtext id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.2.3.cmml" mathsize="70%" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.2.3">style</mtext></ci></apply><apply id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3.1.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3">subscript</csymbol><apply id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3.2.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3.2.1.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3">superscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3.2.2.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3.2.2">𝑟</ci><ci id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3.2.3a.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3.2.3"><mtext id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3.2.3.cmml" mathsize="70%" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3.2.3">style</mtext></ci></apply><ci id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3.3.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.2.3.3">𝑡</ci></apply></apply><apply id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3"><times id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.1.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.1"></times><apply id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.2.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.2"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.2.1.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.2">superscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.2.2.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.2.2">𝜆</ci><ci id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.2.3a.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.2.3"><mtext id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.2.3.cmml" mathsize="70%" 
xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.2.3">task</mtext></ci></apply><apply id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3.1.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3">subscript</csymbol><apply id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3.2.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3.2.1.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3">superscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3.2.2.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3.2.2">𝑟</ci><ci id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3.2.3a.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3.2.3"><mtext id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3.2.3.cmml" mathsize="70%" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3.2.3">task</mtext></ci></apply><ci id="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3.3.cmml" xref="S3.SS3.SSS0.Px1.p1.18.m18.1.1.1.1.3.3.3.3">𝑡</ci></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px1.p1.18.m18.1c">r_{t}=\lambda^{\text{style}}r^{\text{style}}_{t}+\lambda^{\text{task}}r^{\text% {task}}_{t},</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px1.p1.18.m18.1d">italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = italic_λ start_POSTSUPERSCRIPT style end_POSTSUPERSCRIPT italic_r start_POSTSUPERSCRIPT style end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT + italic_λ start_POSTSUPERSCRIPT task end_POSTSUPERSCRIPT italic_r start_POSTSUPERSCRIPT task end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ,</annotation></semantics></math> where <math alttext="r^{\text{style}}_{t}" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px1.p1.19.m19.1"><semantics id="S3.SS3.SSS0.Px1.p1.19.m19.1a"><msubsup id="S3.SS3.SSS0.Px1.p1.19.m19.1.1" 
xref="S3.SS3.SSS0.Px1.p1.19.m19.1.1.cmml"><mi id="S3.SS3.SSS0.Px1.p1.19.m19.1.1.2.2" xref="S3.SS3.SSS0.Px1.p1.19.m19.1.1.2.2.cmml">r</mi><mi id="S3.SS3.SSS0.Px1.p1.19.m19.1.1.3" xref="S3.SS3.SSS0.Px1.p1.19.m19.1.1.3.cmml">t</mi><mtext id="S3.SS3.SSS0.Px1.p1.19.m19.1.1.2.3" xref="S3.SS3.SSS0.Px1.p1.19.m19.1.1.2.3a.cmml">style</mtext></msubsup><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px1.p1.19.m19.1b"><apply id="S3.SS3.SSS0.Px1.p1.19.m19.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.19.m19.1.1"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.19.m19.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.19.m19.1.1">subscript</csymbol><apply id="S3.SS3.SSS0.Px1.p1.19.m19.1.1.2.cmml" xref="S3.SS3.SSS0.Px1.p1.19.m19.1.1"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.19.m19.1.1.2.1.cmml" xref="S3.SS3.SSS0.Px1.p1.19.m19.1.1">superscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.19.m19.1.1.2.2.cmml" xref="S3.SS3.SSS0.Px1.p1.19.m19.1.1.2.2">𝑟</ci><ci id="S3.SS3.SSS0.Px1.p1.19.m19.1.1.2.3a.cmml" xref="S3.SS3.SSS0.Px1.p1.19.m19.1.1.2.3"><mtext id="S3.SS3.SSS0.Px1.p1.19.m19.1.1.2.3.cmml" mathsize="70%" xref="S3.SS3.SSS0.Px1.p1.19.m19.1.1.2.3">style</mtext></ci></apply><ci id="S3.SS3.SSS0.Px1.p1.19.m19.1.1.3.cmml" xref="S3.SS3.SSS0.Px1.p1.19.m19.1.1.3">𝑡</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px1.p1.19.m19.1c">r^{\text{style}}_{t}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px1.p1.19.m19.1d">italic_r start_POSTSUPERSCRIPT style end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math> is a style reward modeled by the text-conditioned motion discriminator, and <math alttext="r^{\text{task}}_{t}" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px1.p1.20.m20.1"><semantics id="S3.SS3.SSS0.Px1.p1.20.m20.1a"><msubsup id="S3.SS3.SSS0.Px1.p1.20.m20.1.1" xref="S3.SS3.SSS0.Px1.p1.20.m20.1.1.cmml"><mi id="S3.SS3.SSS0.Px1.p1.20.m20.1.1.2.2" 
xref="S3.SS3.SSS0.Px1.p1.20.m20.1.1.2.2.cmml">r</mi><mi id="S3.SS3.SSS0.Px1.p1.20.m20.1.1.3" xref="S3.SS3.SSS0.Px1.p1.20.m20.1.1.3.cmml">t</mi><mtext id="S3.SS3.SSS0.Px1.p1.20.m20.1.1.2.3" xref="S3.SS3.SSS0.Px1.p1.20.m20.1.1.2.3a.cmml">task</mtext></msubsup><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px1.p1.20.m20.1b"><apply id="S3.SS3.SSS0.Px1.p1.20.m20.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.20.m20.1.1"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.20.m20.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.20.m20.1.1">subscript</csymbol><apply id="S3.SS3.SSS0.Px1.p1.20.m20.1.1.2.cmml" xref="S3.SS3.SSS0.Px1.p1.20.m20.1.1"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.20.m20.1.1.2.1.cmml" xref="S3.SS3.SSS0.Px1.p1.20.m20.1.1">superscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.20.m20.1.1.2.2.cmml" xref="S3.SS3.SSS0.Px1.p1.20.m20.1.1.2.2">𝑟</ci><ci id="S3.SS3.SSS0.Px1.p1.20.m20.1.1.2.3a.cmml" xref="S3.SS3.SSS0.Px1.p1.20.m20.1.1.2.3"><mtext id="S3.SS3.SSS0.Px1.p1.20.m20.1.1.2.3.cmml" mathsize="70%" xref="S3.SS3.SSS0.Px1.p1.20.m20.1.1.2.3">task</mtext></ci></apply><ci id="S3.SS3.SSS0.Px1.p1.20.m20.1.1.3.cmml" xref="S3.SS3.SSS0.Px1.p1.20.m20.1.1.3">𝑡</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px1.p1.20.m20.1c">r^{\text{task}}_{t}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px1.p1.20.m20.1d">italic_r start_POSTSUPERSCRIPT task end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math> is a task-specific reward with coefficient <math alttext="\lambda^{\text{task}}" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px1.p1.21.m21.1"><semantics id="S3.SS3.SSS0.Px1.p1.21.m21.1a"><msup id="S3.SS3.SSS0.Px1.p1.21.m21.1.1" xref="S3.SS3.SSS0.Px1.p1.21.m21.1.1.cmml"><mi id="S3.SS3.SSS0.Px1.p1.21.m21.1.1.2" xref="S3.SS3.SSS0.Px1.p1.21.m21.1.1.2.cmml">λ</mi><mtext id="S3.SS3.SSS0.Px1.p1.21.m21.1.1.3" xref="S3.SS3.SSS0.Px1.p1.21.m21.1.1.3a.cmml">task</mtext></msup><annotation-xml 
encoding="MathML-Content" id="S3.SS3.SSS0.Px1.p1.21.m21.1b"><apply id="S3.SS3.SSS0.Px1.p1.21.m21.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.21.m21.1.1"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px1.p1.21.m21.1.1.1.cmml" xref="S3.SS3.SSS0.Px1.p1.21.m21.1.1">superscript</csymbol><ci id="S3.SS3.SSS0.Px1.p1.21.m21.1.1.2.cmml" xref="S3.SS3.SSS0.Px1.p1.21.m21.1.1.2">𝜆</ci><ci id="S3.SS3.SSS0.Px1.p1.21.m21.1.1.3a.cmml" xref="S3.SS3.SSS0.Px1.p1.21.m21.1.1.3"><mtext id="S3.SS3.SSS0.Px1.p1.21.m21.1.1.3.cmml" mathsize="70%" xref="S3.SS3.SSS0.Px1.p1.21.m21.1.1.3">task</mtext></ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px1.p1.21.m21.1c">\lambda^{\text{task}}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px1.p1.21.m21.1d">italic_λ start_POSTSUPERSCRIPT task end_POSTSUPERSCRIPT</annotation></semantics></math>.</p> </div> </section> <section class="ltx_paragraph" id="S3.SS3.SSS0.Px2"> <h5 class="ltx_title ltx_title_paragraph">Finite State Machine</h5> <div class="ltx_para" id="S3.SS3.SSS0.Px2.p1"> <p class="ltx_p" id="S3.SS3.SSS0.Px2.p1.7">As illustrated in Fig. <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S2.F2" title="Figure 2 ‣ Comparison with Previous HSI Methods ‣ 2 Related Works ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">2</span></a>, our framework integrates several reusable policies that serve as low-level controllers. 
We have trained 7 policies: the Walk policy <math alttext="\pi_{w}" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px2.p1.1.m1.1"><semantics id="S3.SS3.SSS0.Px2.p1.1.m1.1a"><msub id="S3.SS3.SSS0.Px2.p1.1.m1.1.1" xref="S3.SS3.SSS0.Px2.p1.1.m1.1.1.cmml"><mi id="S3.SS3.SSS0.Px2.p1.1.m1.1.1.2" xref="S3.SS3.SSS0.Px2.p1.1.m1.1.1.2.cmml">π</mi><mi id="S3.SS3.SSS0.Px2.p1.1.m1.1.1.3" xref="S3.SS3.SSS0.Px2.p1.1.m1.1.1.3.cmml">w</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px2.p1.1.m1.1b"><apply id="S3.SS3.SSS0.Px2.p1.1.m1.1.1.cmml" xref="S3.SS3.SSS0.Px2.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px2.p1.1.m1.1.1.1.cmml" xref="S3.SS3.SSS0.Px2.p1.1.m1.1.1">subscript</csymbol><ci id="S3.SS3.SSS0.Px2.p1.1.m1.1.1.2.cmml" xref="S3.SS3.SSS0.Px2.p1.1.m1.1.1.2">𝜋</ci><ci id="S3.SS3.SSS0.Px2.p1.1.m1.1.1.3.cmml" xref="S3.SS3.SSS0.Px2.p1.1.m1.1.1.3">𝑤</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px2.p1.1.m1.1c">\pi_{w}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px2.p1.1.m1.1d">italic_π start_POSTSUBSCRIPT italic_w end_POSTSUBSCRIPT</annotation></semantics></math>, Idle policy <math alttext="\pi_{i}" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px2.p1.2.m2.1"><semantics id="S3.SS3.SSS0.Px2.p1.2.m2.1a"><msub id="S3.SS3.SSS0.Px2.p1.2.m2.1.1" xref="S3.SS3.SSS0.Px2.p1.2.m2.1.1.cmml"><mi id="S3.SS3.SSS0.Px2.p1.2.m2.1.1.2" xref="S3.SS3.SSS0.Px2.p1.2.m2.1.1.2.cmml">π</mi><mi id="S3.SS3.SSS0.Px2.p1.2.m2.1.1.3" xref="S3.SS3.SSS0.Px2.p1.2.m2.1.1.3.cmml">i</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px2.p1.2.m2.1b"><apply id="S3.SS3.SSS0.Px2.p1.2.m2.1.1.cmml" xref="S3.SS3.SSS0.Px2.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px2.p1.2.m2.1.1.1.cmml" xref="S3.SS3.SSS0.Px2.p1.2.m2.1.1">subscript</csymbol><ci id="S3.SS3.SSS0.Px2.p1.2.m2.1.1.2.cmml" xref="S3.SS3.SSS0.Px2.p1.2.m2.1.1.2">𝜋</ci><ci id="S3.SS3.SSS0.Px2.p1.2.m2.1.1.3.cmml" 
xref="S3.SS3.SSS0.Px2.p1.2.m2.1.1.3">𝑖</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px2.p1.2.m2.1c">\pi_{i}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px2.p1.2.m2.1d">italic_π start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT</annotation></semantics></math>, Sit policy <math alttext="\pi_{s}" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px2.p1.3.m3.1"><semantics id="S3.SS3.SSS0.Px2.p1.3.m3.1a"><msub id="S3.SS3.SSS0.Px2.p1.3.m3.1.1" xref="S3.SS3.SSS0.Px2.p1.3.m3.1.1.cmml"><mi id="S3.SS3.SSS0.Px2.p1.3.m3.1.1.2" xref="S3.SS3.SSS0.Px2.p1.3.m3.1.1.2.cmml">π</mi><mi id="S3.SS3.SSS0.Px2.p1.3.m3.1.1.3" xref="S3.SS3.SSS0.Px2.p1.3.m3.1.1.3.cmml">s</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px2.p1.3.m3.1b"><apply id="S3.SS3.SSS0.Px2.p1.3.m3.1.1.cmml" xref="S3.SS3.SSS0.Px2.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px2.p1.3.m3.1.1.1.cmml" xref="S3.SS3.SSS0.Px2.p1.3.m3.1.1">subscript</csymbol><ci id="S3.SS3.SSS0.Px2.p1.3.m3.1.1.2.cmml" xref="S3.SS3.SSS0.Px2.p1.3.m3.1.1.2">𝜋</ci><ci id="S3.SS3.SSS0.Px2.p1.3.m3.1.1.3.cmml" xref="S3.SS3.SSS0.Px2.p1.3.m3.1.1.3">𝑠</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px2.p1.3.m3.1c">\pi_{s}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px2.p1.3.m3.1d">italic_π start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT</annotation></semantics></math>, Lie policy <math alttext="\pi_{l}" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px2.p1.4.m4.1"><semantics id="S3.SS3.SSS0.Px2.p1.4.m4.1a"><msub id="S3.SS3.SSS0.Px2.p1.4.m4.1.1" xref="S3.SS3.SSS0.Px2.p1.4.m4.1.1.cmml"><mi id="S3.SS3.SSS0.Px2.p1.4.m4.1.1.2" xref="S3.SS3.SSS0.Px2.p1.4.m4.1.1.2.cmml">π</mi><mi id="S3.SS3.SSS0.Px2.p1.4.m4.1.1.3" xref="S3.SS3.SSS0.Px2.p1.4.m4.1.1.3.cmml">l</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px2.p1.4.m4.1b"><apply id="S3.SS3.SSS0.Px2.p1.4.m4.1.1.cmml" 
xref="S3.SS3.SSS0.Px2.p1.4.m4.1.1"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px2.p1.4.m4.1.1.1.cmml" xref="S3.SS3.SSS0.Px2.p1.4.m4.1.1">subscript</csymbol><ci id="S3.SS3.SSS0.Px2.p1.4.m4.1.1.2.cmml" xref="S3.SS3.SSS0.Px2.p1.4.m4.1.1.2">𝜋</ci><ci id="S3.SS3.SSS0.Px2.p1.4.m4.1.1.3.cmml" xref="S3.SS3.SSS0.Px2.p1.4.m4.1.1.3">𝑙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px2.p1.4.m4.1c">\pi_{l}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px2.p1.4.m4.1d">italic_π start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT</annotation></semantics></math>, Reach policy <math alttext="\pi_{r}" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px2.p1.5.m5.1"><semantics id="S3.SS3.SSS0.Px2.p1.5.m5.1a"><msub id="S3.SS3.SSS0.Px2.p1.5.m5.1.1" xref="S3.SS3.SSS0.Px2.p1.5.m5.1.1.cmml"><mi id="S3.SS3.SSS0.Px2.p1.5.m5.1.1.2" xref="S3.SS3.SSS0.Px2.p1.5.m5.1.1.2.cmml">π</mi><mi id="S3.SS3.SSS0.Px2.p1.5.m5.1.1.3" xref="S3.SS3.SSS0.Px2.p1.5.m5.1.1.3.cmml">r</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px2.p1.5.m5.1b"><apply id="S3.SS3.SSS0.Px2.p1.5.m5.1.1.cmml" xref="S3.SS3.SSS0.Px2.p1.5.m5.1.1"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px2.p1.5.m5.1.1.1.cmml" xref="S3.SS3.SSS0.Px2.p1.5.m5.1.1">subscript</csymbol><ci id="S3.SS3.SSS0.Px2.p1.5.m5.1.1.2.cmml" xref="S3.SS3.SSS0.Px2.p1.5.m5.1.1.2">𝜋</ci><ci id="S3.SS3.SSS0.Px2.p1.5.m5.1.1.3.cmml" xref="S3.SS3.SSS0.Px2.p1.5.m5.1.1.3">𝑟</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px2.p1.5.m5.1c">\pi_{r}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px2.p1.5.m5.1d">italic_π start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT</annotation></semantics></math>, GetUp policy <math alttext="\pi_{g}" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px2.p1.6.m6.1"><semantics id="S3.SS3.SSS0.Px2.p1.6.m6.1a"><msub id="S3.SS3.SSS0.Px2.p1.6.m6.1.1" xref="S3.SS3.SSS0.Px2.p1.6.m6.1.1.cmml"><mi 
id="S3.SS3.SSS0.Px2.p1.6.m6.1.1.2" xref="S3.SS3.SSS0.Px2.p1.6.m6.1.1.2.cmml">π</mi><mi id="S3.SS3.SSS0.Px2.p1.6.m6.1.1.3" xref="S3.SS3.SSS0.Px2.p1.6.m6.1.1.3.cmml">g</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px2.p1.6.m6.1b"><apply id="S3.SS3.SSS0.Px2.p1.6.m6.1.1.cmml" xref="S3.SS3.SSS0.Px2.p1.6.m6.1.1"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px2.p1.6.m6.1.1.1.cmml" xref="S3.SS3.SSS0.Px2.p1.6.m6.1.1">subscript</csymbol><ci id="S3.SS3.SSS0.Px2.p1.6.m6.1.1.2.cmml" xref="S3.SS3.SSS0.Px2.p1.6.m6.1.1.2">𝜋</ci><ci id="S3.SS3.SSS0.Px2.p1.6.m6.1.1.3.cmml" xref="S3.SS3.SSS0.Px2.p1.6.m6.1.1.3">𝑔</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px2.p1.6.m6.1c">\pi_{g}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px2.p1.6.m6.1d">italic_π start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT</annotation></semantics></math> and Carry policy <math alttext="\pi_{c}" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px2.p1.7.m7.1"><semantics id="S3.SS3.SSS0.Px2.p1.7.m7.1a"><msub id="S3.SS3.SSS0.Px2.p1.7.m7.1.1" xref="S3.SS3.SSS0.Px2.p1.7.m7.1.1.cmml"><mi id="S3.SS3.SSS0.Px2.p1.7.m7.1.1.2" xref="S3.SS3.SSS0.Px2.p1.7.m7.1.1.2.cmml">π</mi><mi id="S3.SS3.SSS0.Px2.p1.7.m7.1.1.3" xref="S3.SS3.SSS0.Px2.p1.7.m7.1.1.3.cmml">c</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px2.p1.7.m7.1b"><apply id="S3.SS3.SSS0.Px2.p1.7.m7.1.1.cmml" xref="S3.SS3.SSS0.Px2.p1.7.m7.1.1"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px2.p1.7.m7.1.1.1.cmml" xref="S3.SS3.SSS0.Px2.p1.7.m7.1.1">subscript</csymbol><ci id="S3.SS3.SSS0.Px2.p1.7.m7.1.1.2.cmml" xref="S3.SS3.SSS0.Px2.p1.7.m7.1.1.2">𝜋</ci><ci id="S3.SS3.SSS0.Px2.p1.7.m7.1.1.3.cmml" xref="S3.SS3.SSS0.Px2.p1.7.m7.1.1.3">𝑐</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px2.p1.7.m7.1c">\pi_{c}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px2.p1.7.m7.1d">italic_π start_POSTSUBSCRIPT italic_c 
end_POSTSUBSCRIPT</annotation></semantics></math>.</p> </div> <div class="ltx_para" id="S3.SS3.SSS0.Px2.p2"> <p class="ltx_p" id="S3.SS3.SSS0.Px2.p2.1">Following <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib26" title=""><span class="ltx_text" style="font-size:90%;">26</span></a>]</cite>, the FSM determines when to transition between skills. For instance, it initiates the next skill once the character’s root has overlapped its target position for longer than a set threshold. This simple rule-based FSM allows users to achieve the desired long-term human motions in complex 3D scenes. Compared to the recent work InterScene <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib26" title=""><span class="ltx_text" style="font-size:90%;">26</span></a>]</cite>, our FSM additionally provides a per-frame egocentric heightmap and a per-skill text embedding, which support scene understanding and semantic control.</p> </div> </section> <section class="ltx_paragraph" id="S3.SS3.SSS0.Px3"> <h5 class="ltx_title ltx_title_paragraph">Language Condition</h5> <div class="ltx_para" id="S3.SS3.SSS0.Px3.p1"> <p class="ltx_p" id="S3.SS3.SSS0.Px3.p1.4">To condition the policy on language, we build an embedding space in which motion representations are aligned with natural-language descriptions. 
Given a motion clip <math alttext="\hat{\mathbf{m}}=(\hat{\mathbf{q}}_{1},\ldots,\hat{\mathbf{q}}_{n})" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px3.p1.1.m1.3"><semantics id="S3.SS3.SSS0.Px3.p1.1.m1.3a"><mrow id="S3.SS3.SSS0.Px3.p1.1.m1.3.3" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.cmml"><mover accent="true" id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.4" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.4.cmml"><mi id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.4.2" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.4.2.cmml">𝐦</mi><mo id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.4.1" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.4.1.cmml">^</mo></mover><mo id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.3" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.3.cmml">=</mo><mrow id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.3.cmml"><mo id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.3" stretchy="false" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.3.cmml">(</mo><msub id="S3.SS3.SSS0.Px3.p1.1.m1.2.2.1.1.1" xref="S3.SS3.SSS0.Px3.p1.1.m1.2.2.1.1.1.cmml"><mover accent="true" id="S3.SS3.SSS0.Px3.p1.1.m1.2.2.1.1.1.2" xref="S3.SS3.SSS0.Px3.p1.1.m1.2.2.1.1.1.2.cmml"><mi id="S3.SS3.SSS0.Px3.p1.1.m1.2.2.1.1.1.2.2" xref="S3.SS3.SSS0.Px3.p1.1.m1.2.2.1.1.1.2.2.cmml">𝐪</mi><mo id="S3.SS3.SSS0.Px3.p1.1.m1.2.2.1.1.1.2.1" xref="S3.SS3.SSS0.Px3.p1.1.m1.2.2.1.1.1.2.1.cmml">^</mo></mover><mn id="S3.SS3.SSS0.Px3.p1.1.m1.2.2.1.1.1.3" xref="S3.SS3.SSS0.Px3.p1.1.m1.2.2.1.1.1.3.cmml">1</mn></msub><mo id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.4" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.3.cmml">,</mo><mi id="S3.SS3.SSS0.Px3.p1.1.m1.1.1" mathvariant="normal" xref="S3.SS3.SSS0.Px3.p1.1.m1.1.1.cmml">…</mi><mo id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.5" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.3.cmml">,</mo><msub id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.2" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.2.cmml"><mover accent="true" id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.2.2" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.2.2.cmml"><mi id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.2.2.2" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.2.2.2.cmml">𝐪</mi><mo 
id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.2.2.1" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.2.2.1.cmml">^</mo></mover><mi id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.2.3" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.2.3.cmml">n</mi></msub><mo id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.6" stretchy="false" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.3.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px3.p1.1.m1.3b"><apply id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.cmml" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3"><eq id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.3.cmml" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.3"></eq><apply id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.4.cmml" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.4"><ci id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.4.1.cmml" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.4.1">^</ci><ci id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.4.2.cmml" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.4.2">𝐦</ci></apply><vector id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.3.cmml" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2"><apply id="S3.SS3.SSS0.Px3.p1.1.m1.2.2.1.1.1.cmml" xref="S3.SS3.SSS0.Px3.p1.1.m1.2.2.1.1.1"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px3.p1.1.m1.2.2.1.1.1.1.cmml" xref="S3.SS3.SSS0.Px3.p1.1.m1.2.2.1.1.1">subscript</csymbol><apply id="S3.SS3.SSS0.Px3.p1.1.m1.2.2.1.1.1.2.cmml" xref="S3.SS3.SSS0.Px3.p1.1.m1.2.2.1.1.1.2"><ci id="S3.SS3.SSS0.Px3.p1.1.m1.2.2.1.1.1.2.1.cmml" xref="S3.SS3.SSS0.Px3.p1.1.m1.2.2.1.1.1.2.1">^</ci><ci id="S3.SS3.SSS0.Px3.p1.1.m1.2.2.1.1.1.2.2.cmml" xref="S3.SS3.SSS0.Px3.p1.1.m1.2.2.1.1.1.2.2">𝐪</ci></apply><cn id="S3.SS3.SSS0.Px3.p1.1.m1.2.2.1.1.1.3.cmml" type="integer" xref="S3.SS3.SSS0.Px3.p1.1.m1.2.2.1.1.1.3">1</cn></apply><ci id="S3.SS3.SSS0.Px3.p1.1.m1.1.1.cmml" xref="S3.SS3.SSS0.Px3.p1.1.m1.1.1">…</ci><apply id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.2.cmml" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.2"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.2.1.cmml" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.2">subscript</csymbol><apply id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.2.2.cmml" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.2.2"><ci 
id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.2.2.1.cmml" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.2.2.1">^</ci><ci id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.2.2.2.cmml" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.2.2.2">𝐪</ci></apply><ci id="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.2.3.cmml" xref="S3.SS3.SSS0.Px3.p1.1.m1.3.3.2.2.2.3">𝑛</ci></apply></vector></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px3.p1.1.m1.3c">\hat{\mathbf{m}}=(\hat{\mathbf{q}}_{1},\ldots,\hat{\mathbf{q}}_{n})</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px3.p1.1.m1.3d">over^ start_ARG bold_m end_ARG = ( over^ start_ARG bold_q end_ARG start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , over^ start_ARG bold_q end_ARG start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT )</annotation></semantics></math>, the motion encoder <math alttext="{\mathbf{z}}=\text{Enc}_{m}(\hat{\mathbf{m}})" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px3.p1.2.m2.1"><semantics id="S3.SS3.SSS0.Px3.p1.2.m2.1a"><mrow id="S3.SS3.SSS0.Px3.p1.2.m2.1.2" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.2.cmml"><mi id="S3.SS3.SSS0.Px3.p1.2.m2.1.2.2" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.2.2.cmml">𝐳</mi><mo id="S3.SS3.SSS0.Px3.p1.2.m2.1.2.1" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.2.1.cmml">=</mo><mrow id="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.cmml"><msub id="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.2" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.2.cmml"><mtext id="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.2.2" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.2.2a.cmml">Enc</mtext><mi id="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.2.3" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.2.3.cmml">m</mi></msub><mo id="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.1" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.1.cmml">⁢</mo><mrow id="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.3.2" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.1.cmml"><mo id="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.3.2.1" stretchy="false" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.1.cmml">(</mo><mover accent="true" id="S3.SS3.SSS0.Px3.p1.2.m2.1.1" 
xref="S3.SS3.SSS0.Px3.p1.2.m2.1.1.cmml"><mi id="S3.SS3.SSS0.Px3.p1.2.m2.1.1.2" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.1.2.cmml">𝐦</mi><mo id="S3.SS3.SSS0.Px3.p1.2.m2.1.1.1" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.1.1.cmml">^</mo></mover><mo id="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.3.2.2" stretchy="false" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.1.cmml">)</mo></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px3.p1.2.m2.1b"><apply id="S3.SS3.SSS0.Px3.p1.2.m2.1.2.cmml" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.2"><eq id="S3.SS3.SSS0.Px3.p1.2.m2.1.2.1.cmml" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.2.1"></eq><ci id="S3.SS3.SSS0.Px3.p1.2.m2.1.2.2.cmml" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.2.2">𝐳</ci><apply id="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.cmml" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3"><times id="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.1.cmml" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.1"></times><apply id="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.2.cmml" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.2"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.2.1.cmml" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.2">subscript</csymbol><ci id="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.2.2a.cmml" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.2.2"><mtext id="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.2.2.cmml" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.2.2">Enc</mtext></ci><ci id="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.2.3.cmml" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.2.3">𝑚</ci></apply><apply id="S3.SS3.SSS0.Px3.p1.2.m2.1.1.cmml" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.2.3.3.2"><ci id="S3.SS3.SSS0.Px3.p1.2.m2.1.1.1.cmml" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.1.1">^</ci><ci id="S3.SS3.SSS0.Px3.p1.2.m2.1.1.2.cmml" xref="S3.SS3.SSS0.Px3.p1.2.m2.1.1.2">𝐦</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px3.p1.2.m2.1c">{\mathbf{z}}=\text{Enc}_{m}(\hat{\mathbf{m}})</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px3.p1.2.m2.1d">bold_z = Enc start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( over^ start_ARG bold_m end_ARG 
)</annotation></semantics></math> maps the motion to a unit sphere embedding <math alttext="\|{\mathbf{z}}\|=1" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px3.p1.3.m3.1"><semantics id="S3.SS3.SSS0.Px3.p1.3.m3.1a"><mrow id="S3.SS3.SSS0.Px3.p1.3.m3.1.2" xref="S3.SS3.SSS0.Px3.p1.3.m3.1.2.cmml"><mrow id="S3.SS3.SSS0.Px3.p1.3.m3.1.2.2.2" xref="S3.SS3.SSS0.Px3.p1.3.m3.1.2.2.1.cmml"><mo id="S3.SS3.SSS0.Px3.p1.3.m3.1.2.2.2.1" stretchy="false" xref="S3.SS3.SSS0.Px3.p1.3.m3.1.2.2.1.1.cmml">‖</mo><mi id="S3.SS3.SSS0.Px3.p1.3.m3.1.1" xref="S3.SS3.SSS0.Px3.p1.3.m3.1.1.cmml">𝐳</mi><mo id="S3.SS3.SSS0.Px3.p1.3.m3.1.2.2.2.2" stretchy="false" xref="S3.SS3.SSS0.Px3.p1.3.m3.1.2.2.1.1.cmml">‖</mo></mrow><mo id="S3.SS3.SSS0.Px3.p1.3.m3.1.2.1" xref="S3.SS3.SSS0.Px3.p1.3.m3.1.2.1.cmml">=</mo><mn id="S3.SS3.SSS0.Px3.p1.3.m3.1.2.3" xref="S3.SS3.SSS0.Px3.p1.3.m3.1.2.3.cmml">1</mn></mrow><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px3.p1.3.m3.1b"><apply id="S3.SS3.SSS0.Px3.p1.3.m3.1.2.cmml" xref="S3.SS3.SSS0.Px3.p1.3.m3.1.2"><eq id="S3.SS3.SSS0.Px3.p1.3.m3.1.2.1.cmml" xref="S3.SS3.SSS0.Px3.p1.3.m3.1.2.1"></eq><apply id="S3.SS3.SSS0.Px3.p1.3.m3.1.2.2.1.cmml" xref="S3.SS3.SSS0.Px3.p1.3.m3.1.2.2.2"><csymbol cd="latexml" id="S3.SS3.SSS0.Px3.p1.3.m3.1.2.2.1.1.cmml" xref="S3.SS3.SSS0.Px3.p1.3.m3.1.2.2.2.1">norm</csymbol><ci id="S3.SS3.SSS0.Px3.p1.3.m3.1.1.cmml" xref="S3.SS3.SSS0.Px3.p1.3.m3.1.1">𝐳</ci></apply><cn id="S3.SS3.SSS0.Px3.p1.3.m3.1.2.3.cmml" type="integer" xref="S3.SS3.SSS0.Px3.p1.3.m3.1.2.3">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px3.p1.3.m3.1c">\|{\mathbf{z}}\|=1</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px3.p1.3.m3.1d">∥ bold_z ∥ = 1</annotation></semantics></math>, while corresponding text captions are processed through a pre-trained CLIP <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib31" title=""><span class="ltx_text" 
style="font-size:90%;">31</span></a>]</cite> encoder <math alttext="\text{Enc}_{l}" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px3.p1.4.m4.1"><semantics id="S3.SS3.SSS0.Px3.p1.4.m4.1a"><msub id="S3.SS3.SSS0.Px3.p1.4.m4.1.1" xref="S3.SS3.SSS0.Px3.p1.4.m4.1.1.cmml"><mtext id="S3.SS3.SSS0.Px3.p1.4.m4.1.1.2" xref="S3.SS3.SSS0.Px3.p1.4.m4.1.1.2a.cmml">Enc</mtext><mi id="S3.SS3.SSS0.Px3.p1.4.m4.1.1.3" xref="S3.SS3.SSS0.Px3.p1.4.m4.1.1.3.cmml">l</mi></msub><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px3.p1.4.m4.1b"><apply id="S3.SS3.SSS0.Px3.p1.4.m4.1.1.cmml" xref="S3.SS3.SSS0.Px3.p1.4.m4.1.1"><csymbol cd="ambiguous" id="S3.SS3.SSS0.Px3.p1.4.m4.1.1.1.cmml" xref="S3.SS3.SSS0.Px3.p1.4.m4.1.1">subscript</csymbol><ci id="S3.SS3.SSS0.Px3.p1.4.m4.1.1.2a.cmml" xref="S3.SS3.SSS0.Px3.p1.4.m4.1.1.2"><mtext id="S3.SS3.SSS0.Px3.p1.4.m4.1.1.2.cmml" xref="S3.SS3.SSS0.Px3.p1.4.m4.1.1.2">Enc</mtext></ci><ci id="S3.SS3.SSS0.Px3.p1.4.m4.1.1.3.cmml" xref="S3.SS3.SSS0.Px3.p1.4.m4.1.1.3">𝑙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px3.p1.4.m4.1c">\text{Enc}_{l}</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px3.p1.4.m4.1d">Enc start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT</annotation></semantics></math> followed by fully connected layers that match the latent dimensionality. Training combines reconstruction and alignment losses so that motion and text embeddings correspond to each other. For further details on the network architecture and training losses, please refer to the Supp. Mat.</p> </div> </section> <section class="ltx_paragraph" id="S3.SS3.SSS0.Px4"> <h5 class="ltx_title ltx_title_paragraph">Scene Condition</h5> <div class="ltx_para" id="S3.SS3.SSS0.Px4.p1"> <p class="ltx_p" id="S3.SS3.SSS0.Px4.p1.1">To navigate and interact without collisions, the humanoid must remain aware of its environment. 
We draw inspiration from methods such as <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib41" title=""><span class="ltx_text" style="font-size:90%;">41</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib46" title=""><span class="ltx_text" style="font-size:90%;">46</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib33" title=""><span class="ltx_text" style="font-size:90%;">33</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib47" title=""><span class="ltx_text" style="font-size:90%;">47</span></a>]</cite>, which sample the environment to build humanoid observations. A square, egocentric heightmap captures the elevation of surrounding objects (see <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S2.F2" title="In Comparison with Previous HSI Methods ‣ 2 Related Works ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Fig.</span> <span class="ltx_text ltx_ref_tag">2</span></a>). Consistent with UniHSI <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib47" title=""><span class="ltx_text" style="font-size:90%;">47</span></a>]</cite>, we pre-generate point clouds for each scene. However, creating detailed point clouds that preserve surface intricacies is computationally intensive. To improve the humanoid’s understanding of complex surfaces for sitting or lying, we pre-generate scene point clouds by voxelizing the objects within the bounding box range. The egocentric heightmap is updated from the nearest object’s point cloud only when that object is sufficiently close to the humanoid’s root position. 
The heightmap is a 12<math alttext="\times" class="ltx_Math" display="inline" id="S3.SS3.SSS0.Px4.p1.1.m1.1"><semantics id="S3.SS3.SSS0.Px4.p1.1.m1.1a"><mo id="S3.SS3.SSS0.Px4.p1.1.m1.1.1" xref="S3.SS3.SSS0.Px4.p1.1.m1.1.1.cmml">×</mo><annotation-xml encoding="MathML-Content" id="S3.SS3.SSS0.Px4.p1.1.m1.1b"><times id="S3.SS3.SSS0.Px4.p1.1.m1.1.1.cmml" xref="S3.SS3.SSS0.Px4.p1.1.m1.1.1"></times></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.SSS0.Px4.p1.1.m1.1c">\times</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.SSS0.Px4.p1.1.m1.1d">×</annotation></semantics></math>12 grid with a spacing of 0.15 meters between adjacent points. We flatten the heightmap grid into a vector and concatenate it to the observation.</p> </div> </section> <section class="ltx_paragraph" id="S3.SS3.SSS0.Px5"> <h5 class="ltx_title ltx_title_paragraph">Universal Goal Condition</h5> <div class="ltx_para" id="S3.SS3.SSS0.Px5.p1"> <p class="ltx_p" id="S3.SS3.SSS0.Px5.p1.1">We consider 7 distinct scene interaction skills. To reduce the development overhead of diverse task-specific configurations, we implement all interaction tasks based on 3 task templates: Loco (Walk and Idle), HSI (Sit, Lie, Reach and GetUp) and DOI (Carry). 
The implementation details are as follows:</p> <ul class="ltx_itemize" id="S3.I1"> <li class="ltx_item" id="S3.I1.i1" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S3.I1.i1.p1"> <p class="ltx_p" id="S3.I1.i1.p1.2"><span class="ltx_text ltx_font_bold" id="S3.I1.i1.p1.2.1">Loco tasks</span> require the humanoid to position its pelvis at a target 2D location <math alttext="{\mathbf{g}}\in\mathbb{R}^{2}" class="ltx_Math" display="inline" id="S3.I1.i1.p1.1.m1.1"><semantics id="S3.I1.i1.p1.1.m1.1a"><mrow id="S3.I1.i1.p1.1.m1.1.1" xref="S3.I1.i1.p1.1.m1.1.1.cmml"><mi id="S3.I1.i1.p1.1.m1.1.1.2" xref="S3.I1.i1.p1.1.m1.1.1.2.cmml">𝐠</mi><mo id="S3.I1.i1.p1.1.m1.1.1.1" xref="S3.I1.i1.p1.1.m1.1.1.1.cmml">∈</mo><msup id="S3.I1.i1.p1.1.m1.1.1.3" xref="S3.I1.i1.p1.1.m1.1.1.3.cmml"><mi id="S3.I1.i1.p1.1.m1.1.1.3.2" xref="S3.I1.i1.p1.1.m1.1.1.3.2.cmml">ℝ</mi><mn id="S3.I1.i1.p1.1.m1.1.1.3.3" xref="S3.I1.i1.p1.1.m1.1.1.3.3.cmml">2</mn></msup></mrow><annotation-xml encoding="MathML-Content" id="S3.I1.i1.p1.1.m1.1b"><apply id="S3.I1.i1.p1.1.m1.1.1.cmml" xref="S3.I1.i1.p1.1.m1.1.1"><in id="S3.I1.i1.p1.1.m1.1.1.1.cmml" xref="S3.I1.i1.p1.1.m1.1.1.1"></in><ci id="S3.I1.i1.p1.1.m1.1.1.2.cmml" xref="S3.I1.i1.p1.1.m1.1.1.2">𝐠</ci><apply id="S3.I1.i1.p1.1.m1.1.1.3.cmml" xref="S3.I1.i1.p1.1.m1.1.1.3"><csymbol cd="ambiguous" id="S3.I1.i1.p1.1.m1.1.1.3.1.cmml" xref="S3.I1.i1.p1.1.m1.1.1.3">superscript</csymbol><ci id="S3.I1.i1.p1.1.m1.1.1.3.2.cmml" xref="S3.I1.i1.p1.1.m1.1.1.3.2">ℝ</ci><cn id="S3.I1.i1.p1.1.m1.1.1.3.3.cmml" type="integer" xref="S3.I1.i1.p1.1.m1.1.1.3.3">2</cn></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.I1.i1.p1.1.m1.1c">{\mathbf{g}}\in\mathbb{R}^{2}</annotation><annotation encoding="application/x-llamapun" id="S3.I1.i1.p1.1.m1.1d">bold_g ∈ blackboard_R start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT</annotation></semantics></math> . 
For Walk, the location is set <math alttext="\geq 1m" class="ltx_Math" display="inline" id="S3.I1.i1.p1.2.m2.1"><semantics id="S3.I1.i1.p1.2.m2.1a"><mrow id="S3.I1.i1.p1.2.m2.1.1" xref="S3.I1.i1.p1.2.m2.1.1.cmml"><mi id="S3.I1.i1.p1.2.m2.1.1.2" xref="S3.I1.i1.p1.2.m2.1.1.2.cmml"></mi><mo id="S3.I1.i1.p1.2.m2.1.1.1" xref="S3.I1.i1.p1.2.m2.1.1.1.cmml">≥</mo><mrow id="S3.I1.i1.p1.2.m2.1.1.3" xref="S3.I1.i1.p1.2.m2.1.1.3.cmml"><mn id="S3.I1.i1.p1.2.m2.1.1.3.2" xref="S3.I1.i1.p1.2.m2.1.1.3.2.cmml">1</mn><mo id="S3.I1.i1.p1.2.m2.1.1.3.1" xref="S3.I1.i1.p1.2.m2.1.1.3.1.cmml">⁢</mo><mi id="S3.I1.i1.p1.2.m2.1.1.3.3" xref="S3.I1.i1.p1.2.m2.1.1.3.3.cmml">m</mi></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.I1.i1.p1.2.m2.1b"><apply id="S3.I1.i1.p1.2.m2.1.1.cmml" xref="S3.I1.i1.p1.2.m2.1.1"><geq id="S3.I1.i1.p1.2.m2.1.1.1.cmml" xref="S3.I1.i1.p1.2.m2.1.1.1"></geq><csymbol cd="latexml" id="S3.I1.i1.p1.2.m2.1.1.2.cmml" xref="S3.I1.i1.p1.2.m2.1.1.2">absent</csymbol><apply id="S3.I1.i1.p1.2.m2.1.1.3.cmml" xref="S3.I1.i1.p1.2.m2.1.1.3"><times id="S3.I1.i1.p1.2.m2.1.1.3.1.cmml" xref="S3.I1.i1.p1.2.m2.1.1.3.1"></times><cn id="S3.I1.i1.p1.2.m2.1.1.3.2.cmml" type="integer" xref="S3.I1.i1.p1.2.m2.1.1.3.2">1</cn><ci id="S3.I1.i1.p1.2.m2.1.1.3.3.cmml" xref="S3.I1.i1.p1.2.m2.1.1.3.3">𝑚</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.I1.i1.p1.2.m2.1c">\geq 1m</annotation><annotation encoding="application/x-llamapun" id="S3.I1.i1.p1.2.m2.1d">≥ 1 italic_m</annotation></semantics></math> from the humanoid’s initial position, whereas the location of Idle is identical to the humanoid’s current position, encouraging pacing in place.</p> </div> </li> <li class="ltx_item" id="S3.I1.i2" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S3.I1.i2.p1"> <p class="ltx_p" id="S3.I1.i2.p1.1"><span class="ltx_text ltx_font_bold" id="S3.I1.i2.p1.1.1">HSI tasks</span> require a specific body joint to contact 
with the surface of a target object. We constrain the pelvis joint in Sit, Lie, and GetUp, and use either the left or right hand for Reach. The target location <math alttext="{\mathbf{g}}\in\mathbb{R}^{3}" class="ltx_Math" display="inline" id="S3.I1.i2.p1.1.m1.1"><semantics id="S3.I1.i2.p1.1.m1.1a"><mrow id="S3.I1.i2.p1.1.m1.1.1" xref="S3.I1.i2.p1.1.m1.1.1.cmml"><mi id="S3.I1.i2.p1.1.m1.1.1.2" xref="S3.I1.i2.p1.1.m1.1.1.2.cmml">𝐠</mi><mo id="S3.I1.i2.p1.1.m1.1.1.1" xref="S3.I1.i2.p1.1.m1.1.1.1.cmml">∈</mo><msup id="S3.I1.i2.p1.1.m1.1.1.3" xref="S3.I1.i2.p1.1.m1.1.1.3.cmml"><mi id="S3.I1.i2.p1.1.m1.1.1.3.2" xref="S3.I1.i2.p1.1.m1.1.1.3.2.cmml">ℝ</mi><mn id="S3.I1.i2.p1.1.m1.1.1.3.3" xref="S3.I1.i2.p1.1.m1.1.1.3.3.cmml">3</mn></msup></mrow><annotation-xml encoding="MathML-Content" id="S3.I1.i2.p1.1.m1.1b"><apply id="S3.I1.i2.p1.1.m1.1.1.cmml" xref="S3.I1.i2.p1.1.m1.1.1"><in id="S3.I1.i2.p1.1.m1.1.1.1.cmml" xref="S3.I1.i2.p1.1.m1.1.1.1"></in><ci id="S3.I1.i2.p1.1.m1.1.1.2.cmml" xref="S3.I1.i2.p1.1.m1.1.1.2">𝐠</ci><apply id="S3.I1.i2.p1.1.m1.1.1.3.cmml" xref="S3.I1.i2.p1.1.m1.1.1.3"><csymbol cd="ambiguous" id="S3.I1.i2.p1.1.m1.1.1.3.1.cmml" xref="S3.I1.i2.p1.1.m1.1.1.3">superscript</csymbol><ci id="S3.I1.i2.p1.1.m1.1.1.3.2.cmml" xref="S3.I1.i2.p1.1.m1.1.1.3.2">ℝ</ci><cn id="S3.I1.i2.p1.1.m1.1.1.3.3.cmml" type="integer" xref="S3.I1.i2.p1.1.m1.1.1.3.3">3</cn></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.I1.i2.p1.1.m1.1c">{\mathbf{g}}\in\mathbb{R}^{3}</annotation><annotation encoding="application/x-llamapun" id="S3.I1.i2.p1.1.m1.1d">bold_g ∈ blackboard_R start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT</annotation></semantics></math> is determined by the nearest 3D point on the object’s interactable surface.</p> </div> </li> <li class="ltx_item" id="S3.I1.i3" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S3.I1.i3.p1"> <p class="ltx_p" id="S3.I1.i3.p1.3"><span class="ltx_text 
ltx_font_bold" id="S3.I1.i3.p1.3.1">DOI tasks</span> no longer constrain body joints, but encourage the character to move the dynamic object’s root to a target 3D location. We use the bounding box coordinates of the object <math alttext="{\mathbf{g}}^{bbox}\in\mathbb{R}^{3\times 8}" class="ltx_Math" display="inline" id="S3.I1.i3.p1.1.m1.1"><semantics id="S3.I1.i3.p1.1.m1.1a"><mrow id="S3.I1.i3.p1.1.m1.1.1" xref="S3.I1.i3.p1.1.m1.1.1.cmml"><msup id="S3.I1.i3.p1.1.m1.1.1.2" xref="S3.I1.i3.p1.1.m1.1.1.2.cmml"><mi id="S3.I1.i3.p1.1.m1.1.1.2.2" xref="S3.I1.i3.p1.1.m1.1.1.2.2.cmml">𝐠</mi><mrow id="S3.I1.i3.p1.1.m1.1.1.2.3" xref="S3.I1.i3.p1.1.m1.1.1.2.3.cmml"><mi id="S3.I1.i3.p1.1.m1.1.1.2.3.2" xref="S3.I1.i3.p1.1.m1.1.1.2.3.2.cmml">b</mi><mo id="S3.I1.i3.p1.1.m1.1.1.2.3.1" xref="S3.I1.i3.p1.1.m1.1.1.2.3.1.cmml">⁢</mo><mi id="S3.I1.i3.p1.1.m1.1.1.2.3.3" xref="S3.I1.i3.p1.1.m1.1.1.2.3.3.cmml">b</mi><mo id="S3.I1.i3.p1.1.m1.1.1.2.3.1a" xref="S3.I1.i3.p1.1.m1.1.1.2.3.1.cmml">⁢</mo><mi id="S3.I1.i3.p1.1.m1.1.1.2.3.4" xref="S3.I1.i3.p1.1.m1.1.1.2.3.4.cmml">o</mi><mo id="S3.I1.i3.p1.1.m1.1.1.2.3.1b" xref="S3.I1.i3.p1.1.m1.1.1.2.3.1.cmml">⁢</mo><mi id="S3.I1.i3.p1.1.m1.1.1.2.3.5" xref="S3.I1.i3.p1.1.m1.1.1.2.3.5.cmml">x</mi></mrow></msup><mo id="S3.I1.i3.p1.1.m1.1.1.1" xref="S3.I1.i3.p1.1.m1.1.1.1.cmml">∈</mo><msup id="S3.I1.i3.p1.1.m1.1.1.3" xref="S3.I1.i3.p1.1.m1.1.1.3.cmml"><mi id="S3.I1.i3.p1.1.m1.1.1.3.2" xref="S3.I1.i3.p1.1.m1.1.1.3.2.cmml">ℝ</mi><mrow id="S3.I1.i3.p1.1.m1.1.1.3.3" xref="S3.I1.i3.p1.1.m1.1.1.3.3.cmml"><mn id="S3.I1.i3.p1.1.m1.1.1.3.3.2" xref="S3.I1.i3.p1.1.m1.1.1.3.3.2.cmml">3</mn><mo id="S3.I1.i3.p1.1.m1.1.1.3.3.1" lspace="0.222em" rspace="0.222em" xref="S3.I1.i3.p1.1.m1.1.1.3.3.1.cmml">×</mo><mn id="S3.I1.i3.p1.1.m1.1.1.3.3.3" xref="S3.I1.i3.p1.1.m1.1.1.3.3.3.cmml">8</mn></mrow></msup></mrow><annotation-xml encoding="MathML-Content" id="S3.I1.i3.p1.1.m1.1b"><apply id="S3.I1.i3.p1.1.m1.1.1.cmml" xref="S3.I1.i3.p1.1.m1.1.1"><in 
id="S3.I1.i3.p1.1.m1.1.1.1.cmml" xref="S3.I1.i3.p1.1.m1.1.1.1"></in><apply id="S3.I1.i3.p1.1.m1.1.1.2.cmml" xref="S3.I1.i3.p1.1.m1.1.1.2"><csymbol cd="ambiguous" id="S3.I1.i3.p1.1.m1.1.1.2.1.cmml" xref="S3.I1.i3.p1.1.m1.1.1.2">superscript</csymbol><ci id="S3.I1.i3.p1.1.m1.1.1.2.2.cmml" xref="S3.I1.i3.p1.1.m1.1.1.2.2">𝐠</ci><apply id="S3.I1.i3.p1.1.m1.1.1.2.3.cmml" xref="S3.I1.i3.p1.1.m1.1.1.2.3"><times id="S3.I1.i3.p1.1.m1.1.1.2.3.1.cmml" xref="S3.I1.i3.p1.1.m1.1.1.2.3.1"></times><ci id="S3.I1.i3.p1.1.m1.1.1.2.3.2.cmml" xref="S3.I1.i3.p1.1.m1.1.1.2.3.2">𝑏</ci><ci id="S3.I1.i3.p1.1.m1.1.1.2.3.3.cmml" xref="S3.I1.i3.p1.1.m1.1.1.2.3.3">𝑏</ci><ci id="S3.I1.i3.p1.1.m1.1.1.2.3.4.cmml" xref="S3.I1.i3.p1.1.m1.1.1.2.3.4">𝑜</ci><ci id="S3.I1.i3.p1.1.m1.1.1.2.3.5.cmml" xref="S3.I1.i3.p1.1.m1.1.1.2.3.5">𝑥</ci></apply></apply><apply id="S3.I1.i3.p1.1.m1.1.1.3.cmml" xref="S3.I1.i3.p1.1.m1.1.1.3"><csymbol cd="ambiguous" id="S3.I1.i3.p1.1.m1.1.1.3.1.cmml" xref="S3.I1.i3.p1.1.m1.1.1.3">superscript</csymbol><ci id="S3.I1.i3.p1.1.m1.1.1.3.2.cmml" xref="S3.I1.i3.p1.1.m1.1.1.3.2">ℝ</ci><apply id="S3.I1.i3.p1.1.m1.1.1.3.3.cmml" xref="S3.I1.i3.p1.1.m1.1.1.3.3"><times id="S3.I1.i3.p1.1.m1.1.1.3.3.1.cmml" xref="S3.I1.i3.p1.1.m1.1.1.3.3.1"></times><cn id="S3.I1.i3.p1.1.m1.1.1.3.3.2.cmml" type="integer" xref="S3.I1.i3.p1.1.m1.1.1.3.3.2">3</cn><cn id="S3.I1.i3.p1.1.m1.1.1.3.3.3.cmml" type="integer" xref="S3.I1.i3.p1.1.m1.1.1.3.3.3">8</cn></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.I1.i3.p1.1.m1.1c">{\mathbf{g}}^{bbox}\in\mathbb{R}^{3\times 8}</annotation><annotation encoding="application/x-llamapun" id="S3.I1.i3.p1.1.m1.1d">bold_g start_POSTSUPERSCRIPT italic_b italic_b italic_o italic_x end_POSTSUPERSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT 3 × 8 end_POSTSUPERSCRIPT</annotation></semantics></math> and the target location <math alttext="{\mathbf{g}}^{tar}\in\mathbb{R}^{3}" class="ltx_Math" display="inline" id="S3.I1.i3.p1.2.m2.1"><semantics 
id="S3.I1.i3.p1.2.m2.1a"><mrow id="S3.I1.i3.p1.2.m2.1.1" xref="S3.I1.i3.p1.2.m2.1.1.cmml"><msup id="S3.I1.i3.p1.2.m2.1.1.2" xref="S3.I1.i3.p1.2.m2.1.1.2.cmml"><mi id="S3.I1.i3.p1.2.m2.1.1.2.2" xref="S3.I1.i3.p1.2.m2.1.1.2.2.cmml">𝐠</mi><mrow id="S3.I1.i3.p1.2.m2.1.1.2.3" xref="S3.I1.i3.p1.2.m2.1.1.2.3.cmml"><mi id="S3.I1.i3.p1.2.m2.1.1.2.3.2" xref="S3.I1.i3.p1.2.m2.1.1.2.3.2.cmml">t</mi><mo id="S3.I1.i3.p1.2.m2.1.1.2.3.1" xref="S3.I1.i3.p1.2.m2.1.1.2.3.1.cmml">⁢</mo><mi id="S3.I1.i3.p1.2.m2.1.1.2.3.3" xref="S3.I1.i3.p1.2.m2.1.1.2.3.3.cmml">a</mi><mo id="S3.I1.i3.p1.2.m2.1.1.2.3.1a" xref="S3.I1.i3.p1.2.m2.1.1.2.3.1.cmml">⁢</mo><mi id="S3.I1.i3.p1.2.m2.1.1.2.3.4" xref="S3.I1.i3.p1.2.m2.1.1.2.3.4.cmml">r</mi></mrow></msup><mo id="S3.I1.i3.p1.2.m2.1.1.1" xref="S3.I1.i3.p1.2.m2.1.1.1.cmml">∈</mo><msup id="S3.I1.i3.p1.2.m2.1.1.3" xref="S3.I1.i3.p1.2.m2.1.1.3.cmml"><mi id="S3.I1.i3.p1.2.m2.1.1.3.2" xref="S3.I1.i3.p1.2.m2.1.1.3.2.cmml">ℝ</mi><mn id="S3.I1.i3.p1.2.m2.1.1.3.3" xref="S3.I1.i3.p1.2.m2.1.1.3.3.cmml">3</mn></msup></mrow><annotation-xml encoding="MathML-Content" id="S3.I1.i3.p1.2.m2.1b"><apply id="S3.I1.i3.p1.2.m2.1.1.cmml" xref="S3.I1.i3.p1.2.m2.1.1"><in id="S3.I1.i3.p1.2.m2.1.1.1.cmml" xref="S3.I1.i3.p1.2.m2.1.1.1"></in><apply id="S3.I1.i3.p1.2.m2.1.1.2.cmml" xref="S3.I1.i3.p1.2.m2.1.1.2"><csymbol cd="ambiguous" id="S3.I1.i3.p1.2.m2.1.1.2.1.cmml" xref="S3.I1.i3.p1.2.m2.1.1.2">superscript</csymbol><ci id="S3.I1.i3.p1.2.m2.1.1.2.2.cmml" xref="S3.I1.i3.p1.2.m2.1.1.2.2">𝐠</ci><apply id="S3.I1.i3.p1.2.m2.1.1.2.3.cmml" xref="S3.I1.i3.p1.2.m2.1.1.2.3"><times id="S3.I1.i3.p1.2.m2.1.1.2.3.1.cmml" xref="S3.I1.i3.p1.2.m2.1.1.2.3.1"></times><ci id="S3.I1.i3.p1.2.m2.1.1.2.3.2.cmml" xref="S3.I1.i3.p1.2.m2.1.1.2.3.2">𝑡</ci><ci id="S3.I1.i3.p1.2.m2.1.1.2.3.3.cmml" xref="S3.I1.i3.p1.2.m2.1.1.2.3.3">𝑎</ci><ci id="S3.I1.i3.p1.2.m2.1.1.2.3.4.cmml" xref="S3.I1.i3.p1.2.m2.1.1.2.3.4">𝑟</ci></apply></apply><apply id="S3.I1.i3.p1.2.m2.1.1.3.cmml" xref="S3.I1.i3.p1.2.m2.1.1.3"><csymbol 
cd="ambiguous" id="S3.I1.i3.p1.2.m2.1.1.3.1.cmml" xref="S3.I1.i3.p1.2.m2.1.1.3">superscript</csymbol><ci id="S3.I1.i3.p1.2.m2.1.1.3.2.cmml" xref="S3.I1.i3.p1.2.m2.1.1.3.2">ℝ</ci><cn id="S3.I1.i3.p1.2.m2.1.1.3.3.cmml" type="integer" xref="S3.I1.i3.p1.2.m2.1.1.3.3">3</cn></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.I1.i3.p1.2.m2.1c">{\mathbf{g}}^{tar}\in\mathbb{R}^{3}</annotation><annotation encoding="application/x-llamapun" id="S3.I1.i3.p1.2.m2.1d">bold_g start_POSTSUPERSCRIPT italic_t italic_a italic_r end_POSTSUPERSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT</annotation></semantics></math> as the goal condition <math alttext="{\mathbf{g}}=\{{\mathbf{g}}^{bbox},{\mathbf{g}}^{tar}\}" class="ltx_Math" display="inline" id="S3.I1.i3.p1.3.m3.2"><semantics id="S3.I1.i3.p1.3.m3.2a"><mrow id="S3.I1.i3.p1.3.m3.2.2" xref="S3.I1.i3.p1.3.m3.2.2.cmml"><mi id="S3.I1.i3.p1.3.m3.2.2.4" xref="S3.I1.i3.p1.3.m3.2.2.4.cmml">𝐠</mi><mo id="S3.I1.i3.p1.3.m3.2.2.3" xref="S3.I1.i3.p1.3.m3.2.2.3.cmml">=</mo><mrow id="S3.I1.i3.p1.3.m3.2.2.2.2" xref="S3.I1.i3.p1.3.m3.2.2.2.3.cmml"><mo id="S3.I1.i3.p1.3.m3.2.2.2.2.3" stretchy="false" xref="S3.I1.i3.p1.3.m3.2.2.2.3.cmml">{</mo><msup id="S3.I1.i3.p1.3.m3.1.1.1.1.1" xref="S3.I1.i3.p1.3.m3.1.1.1.1.1.cmml"><mi id="S3.I1.i3.p1.3.m3.1.1.1.1.1.2" xref="S3.I1.i3.p1.3.m3.1.1.1.1.1.2.cmml">𝐠</mi><mrow id="S3.I1.i3.p1.3.m3.1.1.1.1.1.3" xref="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.cmml"><mi id="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.2" xref="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.2.cmml">b</mi><mo id="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.1" xref="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.1.cmml">⁢</mo><mi id="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.3" xref="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.3.cmml">b</mi><mo id="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.1a" xref="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.1.cmml">⁢</mo><mi id="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.4" xref="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.4.cmml">o</mi><mo id="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.1b" 
xref="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.1.cmml">⁢</mo><mi id="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.5" xref="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.5.cmml">x</mi></mrow></msup><mo id="S3.I1.i3.p1.3.m3.2.2.2.2.4" xref="S3.I1.i3.p1.3.m3.2.2.2.3.cmml">,</mo><msup id="S3.I1.i3.p1.3.m3.2.2.2.2.2" xref="S3.I1.i3.p1.3.m3.2.2.2.2.2.cmml"><mi id="S3.I1.i3.p1.3.m3.2.2.2.2.2.2" xref="S3.I1.i3.p1.3.m3.2.2.2.2.2.2.cmml">𝐠</mi><mrow id="S3.I1.i3.p1.3.m3.2.2.2.2.2.3" xref="S3.I1.i3.p1.3.m3.2.2.2.2.2.3.cmml"><mi id="S3.I1.i3.p1.3.m3.2.2.2.2.2.3.2" xref="S3.I1.i3.p1.3.m3.2.2.2.2.2.3.2.cmml">t</mi><mo id="S3.I1.i3.p1.3.m3.2.2.2.2.2.3.1" xref="S3.I1.i3.p1.3.m3.2.2.2.2.2.3.1.cmml">⁢</mo><mi id="S3.I1.i3.p1.3.m3.2.2.2.2.2.3.3" xref="S3.I1.i3.p1.3.m3.2.2.2.2.2.3.3.cmml">a</mi><mo id="S3.I1.i3.p1.3.m3.2.2.2.2.2.3.1a" xref="S3.I1.i3.p1.3.m3.2.2.2.2.2.3.1.cmml">⁢</mo><mi id="S3.I1.i3.p1.3.m3.2.2.2.2.2.3.4" xref="S3.I1.i3.p1.3.m3.2.2.2.2.2.3.4.cmml">r</mi></mrow></msup><mo id="S3.I1.i3.p1.3.m3.2.2.2.2.5" stretchy="false" xref="S3.I1.i3.p1.3.m3.2.2.2.3.cmml">}</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S3.I1.i3.p1.3.m3.2b"><apply id="S3.I1.i3.p1.3.m3.2.2.cmml" xref="S3.I1.i3.p1.3.m3.2.2"><eq id="S3.I1.i3.p1.3.m3.2.2.3.cmml" xref="S3.I1.i3.p1.3.m3.2.2.3"></eq><ci id="S3.I1.i3.p1.3.m3.2.2.4.cmml" xref="S3.I1.i3.p1.3.m3.2.2.4">𝐠</ci><set id="S3.I1.i3.p1.3.m3.2.2.2.3.cmml" xref="S3.I1.i3.p1.3.m3.2.2.2.2"><apply id="S3.I1.i3.p1.3.m3.1.1.1.1.1.cmml" xref="S3.I1.i3.p1.3.m3.1.1.1.1.1"><csymbol cd="ambiguous" id="S3.I1.i3.p1.3.m3.1.1.1.1.1.1.cmml" xref="S3.I1.i3.p1.3.m3.1.1.1.1.1">superscript</csymbol><ci id="S3.I1.i3.p1.3.m3.1.1.1.1.1.2.cmml" xref="S3.I1.i3.p1.3.m3.1.1.1.1.1.2">𝐠</ci><apply id="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.cmml" xref="S3.I1.i3.p1.3.m3.1.1.1.1.1.3"><times id="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.1.cmml" xref="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.1"></times><ci id="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.2.cmml" xref="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.2">𝑏</ci><ci id="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.3.cmml" 
xref="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.3">𝑏</ci><ci id="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.4.cmml" xref="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.4">𝑜</ci><ci id="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.5.cmml" xref="S3.I1.i3.p1.3.m3.1.1.1.1.1.3.5">𝑥</ci></apply></apply><apply id="S3.I1.i3.p1.3.m3.2.2.2.2.2.cmml" xref="S3.I1.i3.p1.3.m3.2.2.2.2.2"><csymbol cd="ambiguous" id="S3.I1.i3.p1.3.m3.2.2.2.2.2.1.cmml" xref="S3.I1.i3.p1.3.m3.2.2.2.2.2">superscript</csymbol><ci id="S3.I1.i3.p1.3.m3.2.2.2.2.2.2.cmml" xref="S3.I1.i3.p1.3.m3.2.2.2.2.2.2">𝐠</ci><apply id="S3.I1.i3.p1.3.m3.2.2.2.2.2.3.cmml" xref="S3.I1.i3.p1.3.m3.2.2.2.2.2.3"><times id="S3.I1.i3.p1.3.m3.2.2.2.2.2.3.1.cmml" xref="S3.I1.i3.p1.3.m3.2.2.2.2.2.3.1"></times><ci id="S3.I1.i3.p1.3.m3.2.2.2.2.2.3.2.cmml" xref="S3.I1.i3.p1.3.m3.2.2.2.2.2.3.2">𝑡</ci><ci id="S3.I1.i3.p1.3.m3.2.2.2.2.2.3.3.cmml" xref="S3.I1.i3.p1.3.m3.2.2.2.2.2.3.3">𝑎</ci><ci id="S3.I1.i3.p1.3.m3.2.2.2.2.2.3.4.cmml" xref="S3.I1.i3.p1.3.m3.2.2.2.2.2.3.4">𝑟</ci></apply></apply></set></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.I1.i3.p1.3.m3.2c">{\mathbf{g}}=\{{\mathbf{g}}^{bbox},{\mathbf{g}}^{tar}\}</annotation><annotation encoding="application/x-llamapun" id="S3.I1.i3.p1.3.m3.2d">bold_g = { bold_g start_POSTSUPERSCRIPT italic_b italic_b italic_o italic_x end_POSTSUPERSCRIPT , bold_g start_POSTSUPERSCRIPT italic_t italic_a italic_r end_POSTSUPERSCRIPT }</annotation></semantics></math>.</p> </div> </li> </ul> </div> <div class="ltx_para ltx_noindent" id="S3.SS3.SSS0.Px5.p2"> <p class="ltx_p" id="S3.SS3.SSS0.Px5.p2.1">Using sparse goal conditions can effectively train policies to perform scene interaction tasks <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib13" title=""><span class="ltx_text" style="font-size:90%;">13</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib26" title=""><span class="ltx_text" style="font-size:90%;">26</span></a>, <a class="ltx_ref" 
href="https://arxiv.org/html/2411.19921v2#bib.bib8" title=""><span class="ltx_text" style="font-size:90%;">8</span></a>]</cite>. However, we cannot control motion styles via these conditions. Tracking-based methods <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib47" title=""><span class="ltx_text" style="font-size:90%;">47</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib21" title=""><span class="ltx_text" style="font-size:90%;">21</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib37" title=""><span class="ltx_text" style="font-size:90%;">37</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib49" title=""><span class="ltx_text" style="font-size:90%;">49</span></a>]</cite> enable fine-grained control of each frame but require accurate stylized reference motions as dense input conditions. We employ a conditional discriminator <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib5" title=""><span class="ltx_text" style="font-size:90%;">5</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib36" title=""><span class="ltx_text" style="font-size:90%;">36</span></a>]</cite> to inject text-based style control into policies. 
Unlike motion <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib36" title=""><span class="ltx_text" style="font-size:90%;">36</span></a>]</cite> or one-hot  <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib5" title=""><span class="ltx_text" style="font-size:90%;">5</span></a>]</cite> conditions, language is a more intuitive interface for LLMs and users.</p> </div> </section> <section class="ltx_paragraph" id="S3.SS3.SSS0.Px6"> <h5 class="ltx_title ltx_title_paragraph">Policy Training</h5> <div class="ltx_para" id="S3.SS3.SSS0.Px6.p1"> <p class="ltx_p" id="S3.SS3.SSS0.Px6.p1.1">We train 7 task-specific policies: (1) Walk, (2) Idle, (3) Sit, (4) Lie, (5) Reach, (6) GetUp, and (7) Carry. We provide Walk, Idle, Sit, Lie, Carry policies with text conditions since these behaviors contain diverse interaction styles that represent vivid emotions. For Reach and GetUp, we do not use text conditions.</p> <ul class="ltx_itemize" id="S3.I2"> <li class="ltx_item" id="S3.I2.i1" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S3.I2.i1.p1"> <p class="ltx_p" id="S3.I2.i1.p1.1"><span class="ltx_text ltx_font_bold" id="S3.I2.i1.p1.1.1">Initialization.</span> Following UniHSI <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib47" title=""><span class="ltx_text" style="font-size:90%;">47</span></a>]</cite>, we create the environment by randomly sampling objects from 3DFront <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib7" title=""><span class="ltx_text" style="font-size:90%;">7</span></a>]</cite>. 
For HSI skills, we initialize characters using reference state initialization <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib27" title=""><span class="ltx_text" style="font-size:90%;">27</span></a>]</cite> and default pose initialization with a random global rotation and location <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib47" title=""><span class="ltx_text" style="font-size:90%;">47</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib26" title=""><span class="ltx_text" style="font-size:90%;">26</span></a>]</cite> near the object. For locomotion skills, we randomly sample positions on the whole ground plane while checking for collisions with the objects. For DOI skills, we randomly sample target positions on the whole ground plane and initialize objects in the humanoid’s hands from reference object motion. Notably, we add Walk motion data to the reference state initialization data when training all skills, because we use Walk as the transition between different interactions.</p> </div> </li> <li class="ltx_item" id="S3.I2.i2" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S3.I2.i2.p1"> <p class="ltx_p" id="S3.I2.i2.p1.1"><span class="ltx_text ltx_font_bold" id="S3.I2.i2.p1.1.1">Rewards.</span> See the detailed reward function in Supp.Mat.</p> </div> </li> <li class="ltx_item" id="S3.I2.i3" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S3.I2.i3.p1"> <p class="ltx_p" id="S3.I2.i3.p1.1"><span class="ltx_text ltx_font_bold" id="S3.I2.i3.p1.1.1">Reset and early termination conditions.</span> Following <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib29" title=""><span class="ltx_text" style="font-size:90%;">29</span></a>]</cite>, we use a fixed 
episode length and fall detection as early termination triggers. We also use early termination when the task is accomplished for a certain time <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib26" title=""><span class="ltx_text" style="font-size:90%;">26</span></a>]</cite> or the contact forces are extremely large <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib47" title=""><span class="ltx_text" style="font-size:90%;">47</span></a>]</cite>.</p> </div> </li> </ul> </div> </section> </section> </section> <section class="ltx_section" id="S4"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">4 </span>Experiments</h2> <section class="ltx_subsection" id="S4.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">4.1 </span>Dataset</h3> <figure class="ltx_table" id="S4.T2"> <div class="ltx_inline-block ltx_align_center ltx_transformed_outer" id="S4.T2.2" style="width:411.9pt;height:154.3pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(13.8pt,-5.2pt) scale(1.07162849959132,1.07162849959132) ;"> <table class="ltx_tabular ltx_align_middle" id="S4.T2.2.1"> <tr class="ltx_tr" id="S4.T2.2.1.1"> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" id="S4.T2.2.1.1.1" rowspan="2"><span class="ltx_text" id="S4.T2.2.1.1.1.1">Datasets</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" colspan="2" id="S4.T2.2.1.1.2">Loco</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" colspan="4" id="S4.T2.2.1.1.3">HSI</td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S4.T2.2.1.1.4">DOI</td> </tr> <tr class="ltx_tr" id="S4.T2.2.1.2"> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.2.1">Walk</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.2.1.2.2">Idle</td> <td class="ltx_td ltx_align_center" 
id="S4.T2.2.1.2.3">Sit</td> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.2.4">Lie</td> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.2.5">Getup</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.2.1.2.6">Reach</td> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.2.7">Carry</td> </tr> <tr class="ltx_tr" id="S4.T2.2.1.3"> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T2.2.1.3.1">SAMP <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib12" title=""><span class="ltx_text" style="font-size:90%;">12</span></a>]</cite> </td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.2.1.3.2"><span class="ltx_text ltx_framed ltx_framed_rectangle" id="S4.T2.2.1.3.2.1" style="border-color: #000000;">20.6</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T2.2.1.3.3">-</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.2.1.3.4">35.2</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.2.1.3.5">14.8</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.2.1.3.6"><span class="ltx_text ltx_framed ltx_framed_rectangle" id="S4.T2.2.1.3.6.1" style="border-color: #000000;">11.2</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T2.2.1.3.7">-</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.2.1.3.8">-</td> </tr> <tr class="ltx_tr" id="S4.T2.2.1.4"> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.2.1.4.1">COUCH <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib54" title=""><span class="ltx_text" style="font-size:90%;">54</span></a>]</cite> </td> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.4.2">-</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.2.1.4.3">-</td> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.4.4">36.4</td> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.4.5">-</td> 
<td class="ltx_td ltx_align_center" id="S4.T2.2.1.4.6"><span class="ltx_text ltx_framed ltx_framed_rectangle" id="S4.T2.2.1.4.6.1" style="border-color: #000000;">23.4</span></td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.2.1.4.7">-</td> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.4.8">-</td> </tr> <tr class="ltx_tr" id="S4.T2.2.1.5"> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.2.1.5.1">Circles <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib2" title=""><span class="ltx_text" style="font-size:90%;">2</span></a>]</cite> </td> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.5.2">-</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.2.1.5.3">-</td> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.5.4">-</td> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.5.5">-</td> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.5.6">-</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.2.1.5.7"><span class="ltx_text ltx_framed ltx_framed_rectangle" id="S4.T2.2.1.5.7.1" style="border-color: #000000;">3.6</span></td> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.5.8">-</td> </tr> <tr class="ltx_tr" id="S4.T2.2.1.6"> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.2.1.6.1">100Style <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib24" title=""><span class="ltx_text" style="font-size:90%;">24</span></a>]</cite> </td> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.6.2">203.1</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.2.1.6.3">-</td> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.6.4">-</td> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.6.5">-</td> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.6.6">-</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.2.1.6.7">-</td> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.6.8">-</td> </tr> <tr 
class="ltx_tr" id="S4.T2.2.1.7"> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.2.1.7.1">AMASS <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib22" title=""><span class="ltx_text" style="font-size:90%;">22</span></a>]</cite> </td> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.7.2"><span class="ltx_text ltx_framed ltx_framed_rectangle" id="S4.T2.2.1.7.2.1" style="border-color: #000000;">8.2</span></td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.2.1.7.3">-</td> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.7.4">-</td> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.7.5">-</td> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.7.6">-</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T2.2.1.7.7">-</td> <td class="ltx_td ltx_align_center" id="S4.T2.2.1.7.8"><span class="ltx_text ltx_framed ltx_framed_rectangle" id="S4.T2.2.1.7.8.1" style="border-color: #000000;">3.4</span></td> </tr> <tr class="ltx_tr" id="S4.T2.2.1.8"> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r ltx_border_t" id="S4.T2.2.1.8.1">ViconStyle</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_t" id="S4.T2.2.1.8.2">-</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r ltx_border_t" id="S4.T2.2.1.8.3">12.0</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_t" id="S4.T2.2.1.8.4">-</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_t" id="S4.T2.2.1.8.5">21.9</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_t" id="S4.T2.2.1.8.6"><span class="ltx_text ltx_framed ltx_framed_rectangle" id="S4.T2.2.1.8.6.1" style="border-color: #000000;">11.7</span></td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r ltx_border_t" id="S4.T2.2.1.8.7">-</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_t" id="S4.T2.2.1.8.8">26.0</td> </tr> </table> </span></div> <figcaption class="ltx_caption 
ltx_centering"><span class="ltx_tag ltx_tag_table"><span class="ltx_text" id="S4.T2.3.1.1" style="font-size:90%;">Table 2</span>: </span><span class="ltx_text" id="S4.T2.4.2" style="font-size:90%;">Mixture of collected stylized motion datasets.</span></figcaption> </figure> <div class="ltx_para" id="S4.SS1.p1"> <p class="ltx_p" id="S4.SS1.p1.1">We show our collected mixture of 6 motion datasets in <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.T2" title="In 4.1 Dataset ‣ 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Tab.</span> <span class="ltx_text ltx_ref_tag">2</span></a>. For each dataset, the table lists the skills it is used to train and the motion duration in minutes. A number in a black box, such as <span class="ltx_text ltx_framed ltx_framed_rectangle" id="S4.SS1.p1.1.1" style="border-color: #000000;">20.6</span>, means that those 20.6 minutes of motion have no style diversity and are counted only as <span class="ltx_text ltx_font_italic" id="S4.SS1.p1.1.2">neutral</span>. ViconStyle is our captured dataset, which supplements both the quantity and the categories of stylized motions. See details in Supp.Mat. We annotate all motion clips with captions and style labels. For each caption, we provide 5 synonymous sentences with the help of an LLM <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib1" title=""><span class="ltx_text" style="font-size:90%;">1</span></a>]</cite>. 
Besides neutral, we categorize the emotion or style of the remaining motions into 8 categories: <span class="ltx_text ltx_font_italic" id="S4.SS1.p1.1.3">happy, angry, hurried, tired, sad, stressed, drunk, and relaxed.</span> We mirror all motions left-to-right, which doubles the amount of data, and flip the captions accordingly with respect to body joint symmetry.</p> </div> <div class="ltx_para" id="S4.SS1.p2"> <p class="ltx_p" id="S4.SS1.p2.1">For 3D objects, we use the furniture and scene layouts from the 3DFront <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib7" title=""><span class="ltx_text" style="font-size:90%;">7</span></a>]</cite> dataset for training. Since 3DFront does not provide segmentation information, we voxelize the object meshes and segment the point clouds based on normal vectors to get the affordance surface.</p> </div> </section> <section class="ltx_subsection" id="S4.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">4.2 </span>Motion Metrics</h3> <div class="ltx_para" id="S4.SS2.p1"> <p class="ltx_p" id="S4.SS2.p1.1">To evaluate motion quality and diversity, we use two metrics from previous work: Fréchet Inception Distance (FID) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib39" title=""><span class="ltx_text" style="font-size:90%;">39</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib5" title=""><span class="ltx_text" style="font-size:90%;">5</span></a>]</cite> and Average Pairwise Distance (APD) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib5" title=""><span class="ltx_text" style="font-size:90%;">5</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib41" title=""><span class="ltx_text" style="font-size:90%;">41</span></a>]</cite>. 
FID measures the similarity between the distributions of generated and real data in a feature space, reflecting the realism and quality of the generated motions. Lower FID values indicate closer alignment with real data. APD, on the other hand, quantifies the diversity within the generated motions by calculating the average pairwise distance between samples. Higher APD values indicate greater diversity in the generated motions. We calculate FID and APD on joint rotations and positions.</p> </div> <div class="ltx_para" id="S4.SS2.p2"> <p class="ltx_p" id="S4.SS2.p2.1">Following <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib12" title=""><span class="ltx_text" style="font-size:90%;">12</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib47" title=""><span class="ltx_text" style="font-size:90%;">47</span></a>]</cite>, we use <em class="ltx_emph ltx_font_italic" id="S4.SS2.p2.1.1">Success Rate</em> and <em class="ltx_emph ltx_font_italic" id="S4.SS2.p2.1.2">Contact Error</em> as the main metrics to quantitatively measure the quality of interactions. Success Rate records the percentage of trials in which the humanoid successfully completes the contact within a certain threshold. 
We follow <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib47" title=""><span class="ltx_text" style="font-size:90%;">47</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib26" title=""><span class="ltx_text" style="font-size:90%;">26</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib13" title=""><span class="ltx_text" style="font-size:90%;">13</span></a>]</cite> and set the threshold to 20 cm for Sit, 20 cm for Reach, 30 cm for Lie, and 20 cm for Carry.</p> </div> <div class="ltx_para" id="S4.SS2.p3"> <p class="ltx_p" id="S4.SS2.p3.1">To evaluate the generation quality of long-term scripts, we also use a user study and the SBERT <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib32" title=""><span class="ltx_text" style="font-size:90%;">32</span></a>]</cite> model; the metrics are described in the corresponding part.</p> </div> </section> <section class="ltx_subsection" id="S4.SS3"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">4.3 </span>Comparison with SOTA methods</h3> <section class="ltx_subsubsection" id="S4.SS3.SSS1"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection">4.3.1 </span>Physical Performance for Different Skills</h4> <div class="ltx_para" id="S4.SS3.SSS1.p1"> <p class="ltx_p" id="S4.SS3.SSS1.p1.1">Our method achieves better or comparable results across various metrics in <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.T3" title="In 4.3.1 Physical Performance for Different Skills ‣ 4.3 Comparison with SOTA methods ‣ 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Tab.</span> <span class="ltx_text ltx_ref_tag">3</span></a>. 
Unlike previous physics-based methods <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib47" title=""><span class="ltx_text" style="font-size:90%;">47</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib26" title=""><span class="ltx_text" style="font-size:90%;">26</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib13" title=""><span class="ltx_text" style="font-size:90%;">13</span></a>]</cite>, which consider only contact and not style, our results are obtained with 4096 random text conditions sampled from the datasets. The previous methods can be viewed as a special case of our model. Given this, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.T3" title="In 4.3.1 Physical Performance for Different Skills ‣ 4.3 Comparison with SOTA methods ‣ 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Tab.</span> <span class="ltx_text ltx_ref_tag">3</span></a> shows that our results are only slightly below the best methods on the Reach and Carry skills. 
Since InterPhys <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib13" title=""><span class="ltx_text" style="font-size:90%;">13</span></a>]</cite> has not released its code or carry motion data, we train only on the small amount of carry motion in AMASS <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib22" title=""><span class="ltx_text" style="font-size:90%;">22</span></a>]</cite> for <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.T3" title="In 4.3.1 Physical Performance for Different Skills ‣ 4.3 Comparison with SOTA methods ‣ 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Tab.</span> <span class="ltx_text ltx_ref_tag">3</span></a>.</p> </div> <figure class="ltx_table" id="S4.T3"> <div class="ltx_inline-block ltx_align_center ltx_transformed_outer" id="S4.T3.2.2" style="width:433.6pt;height:120.8pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(-9.2pt,2.6pt) scale(0.959099041468634,0.959099041468634) ;"> <table class="ltx_tabular ltx_align_middle" id="S4.T3.2.2.2"> <tr class="ltx_tr" id="S4.T3.2.2.2.2"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_tt" id="S4.T3.2.2.2.2.3" rowspan="2"><span class="ltx_text" id="S4.T3.2.2.2.2.3.1">Methods</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" colspan="4" id="S4.T3.1.1.1.1.1">Success Rate (%) <math alttext="\uparrow" class="ltx_Math" display="inline" id="S4.T3.1.1.1.1.1.m1.1"><semantics id="S4.T3.1.1.1.1.1.m1.1a"><mo id="S4.T3.1.1.1.1.1.m1.1.1" stretchy="false" xref="S4.T3.1.1.1.1.1.m1.1.1.cmml">↑</mo><annotation-xml encoding="MathML-Content" id="S4.T3.1.1.1.1.1.m1.1b"><ci id="S4.T3.1.1.1.1.1.m1.1.1.cmml" xref="S4.T3.1.1.1.1.1.m1.1.1">↑</ci></annotation-xml><annotation encoding="application/x-tex" 
id="S4.T3.1.1.1.1.1.m1.1c">\uparrow</annotation><annotation encoding="application/x-llamapun" id="S4.T3.1.1.1.1.1.m1.1d">↑</annotation></semantics></math> </td> <td class="ltx_td ltx_align_center ltx_border_tt" colspan="4" id="S4.T3.2.2.2.2.2">Contact Error <math alttext="\downarrow" class="ltx_Math" display="inline" id="S4.T3.2.2.2.2.2.m1.1"><semantics id="S4.T3.2.2.2.2.2.m1.1a"><mo id="S4.T3.2.2.2.2.2.m1.1.1" stretchy="false" xref="S4.T3.2.2.2.2.2.m1.1.1.cmml">↓</mo><annotation-xml encoding="MathML-Content" id="S4.T3.2.2.2.2.2.m1.1b"><ci id="S4.T3.2.2.2.2.2.m1.1.1.cmml" xref="S4.T3.2.2.2.2.2.m1.1.1">↓</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.T3.2.2.2.2.2.m1.1c">\downarrow</annotation><annotation encoding="application/x-llamapun" id="S4.T3.2.2.2.2.2.m1.1d">↓</annotation></semantics></math> </td> </tr> <tr class="ltx_tr" id="S4.T3.2.2.2.3"> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T3.2.2.2.3.1">Sit</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T3.2.2.2.3.2">Lie</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T3.2.2.2.3.3">Reach</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T3.2.2.2.3.4">Carry</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T3.2.2.2.3.5">Sit</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T3.2.2.2.3.6">Lie</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T3.2.2.2.3.7">Reach</td> <td class="ltx_td ltx_align_center" id="S4.T3.2.2.2.3.8">Carry</td> </tr> <tr class="ltx_tr" id="S4.T3.2.2.2.4"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S4.T3.2.2.2.4.1">InterPhys <cite class="ltx_cite ltx_citemacro_citep">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib13" title=""><span class="ltx_text" style="font-size:90%;">13</span></a>]</cite> </td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T3.2.2.2.4.2">93.7</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" 
id="S4.T3.2.2.2.4.3">80.0</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T3.2.2.2.4.4">-</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T3.2.2.2.4.5">94.3</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T3.2.2.2.4.6">0.09</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T3.2.2.2.4.7">0.30</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T3.2.2.2.4.8">-</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T3.2.2.2.4.9"><span class="ltx_text ltx_font_bold" id="S4.T3.2.2.2.4.9.1">0.08</span></td> </tr> <tr class="ltx_tr" id="S4.T3.2.2.2.5"> <td class="ltx_td ltx_align_left ltx_border_r" id="S4.T3.2.2.2.5.1">InterScene <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib26" title=""><span class="ltx_text" style="font-size:90%;">26</span></a>]</cite> </td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T3.2.2.2.5.2">97.8</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T3.2.2.2.5.3">-</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T3.2.2.2.5.4">-</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T3.2.2.2.5.5">-</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T3.2.2.2.5.6">0.04</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T3.2.2.2.5.7">-</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T3.2.2.2.5.8">-</td> <td class="ltx_td ltx_align_center" id="S4.T3.2.2.2.5.9">-</td> </tr> <tr class="ltx_tr" id="S4.T3.2.2.2.6"> <td class="ltx_td ltx_align_left ltx_border_r" id="S4.T3.2.2.2.6.1">UniHSI<cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib47" title=""><span class="ltx_text" style="font-size:90%;">47</span></a>]</cite> </td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T3.2.2.2.6.2">94.3</td> <td class="ltx_td ltx_align_center 
ltx_border_r" id="S4.T3.2.2.2.6.3">81.5</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T3.2.2.2.6.4"><span class="ltx_text ltx_font_bold" id="S4.T3.2.2.2.6.4.1">97.5</span></td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T3.2.2.2.6.5">-</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T3.2.2.2.6.6"><span class="ltx_text ltx_framed ltx_framed_underline" id="S4.T3.2.2.2.6.6.1">0.032</span></td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T3.2.2.2.6.7">0.061</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T3.2.2.2.6.8"><span class="ltx_text ltx_font_bold" id="S4.T3.2.2.2.6.8.1">0.016</span></td> <td class="ltx_td ltx_align_center" id="S4.T3.2.2.2.6.9">-</td> </tr> <tr class="ltx_tr" id="S4.T3.2.2.2.7"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S4.T3.2.2.2.7.1">SIMS</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T3.2.2.2.7.2"><span class="ltx_text ltx_framed ltx_framed_underline" id="S4.T3.2.2.2.7.2.1">98.1</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T3.2.2.2.7.3"><span class="ltx_text ltx_framed ltx_framed_underline" id="S4.T3.2.2.2.7.3.1">87.6</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T3.2.2.2.7.4"><span class="ltx_text ltx_framed ltx_framed_underline" id="S4.T3.2.2.2.7.4.1">95.2</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T3.2.2.2.7.5">92.9</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T3.2.2.2.7.6"><span class="ltx_text ltx_font_bold" id="S4.T3.2.2.2.7.6.1">0.028</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T3.2.2.2.7.7"><span class="ltx_text ltx_framed ltx_framed_underline" id="S4.T3.2.2.2.7.7.1">0.049</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T3.2.2.2.7.8"><span class="ltx_text ltx_framed ltx_framed_underline" 
id="S4.T3.2.2.2.7.8.1">0.026</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T3.2.2.2.7.9">0.099</td> </tr> <tr class="ltx_tr" id="S4.T3.2.2.2.8"> <td class="ltx_td ltx_align_left ltx_border_bb ltx_border_r" id="S4.T3.2.2.2.8.1">SIMS (+data)</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T3.2.2.2.8.2"><span class="ltx_text ltx_font_bold" id="S4.T3.2.2.2.8.2.1">98.4</span></td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T3.2.2.2.8.3"><span class="ltx_text ltx_font_bold" id="S4.T3.2.2.2.8.3.1">89.6</span></td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T3.2.2.2.8.4">-</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T3.2.2.2.8.5"><span class="ltx_text ltx_font_bold" id="S4.T3.2.2.2.8.5.1">96.4</span></td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T3.2.2.2.8.6">0.033</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T3.2.2.2.8.7"><span class="ltx_text ltx_font_bold" id="S4.T3.2.2.2.8.7.1">0.048</span></td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T3.2.2.2.8.8">-</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T3.2.2.2.8.9"><span class="ltx_text ltx_framed ltx_framed_underline" id="S4.T3.2.2.2.8.9.1">0.085</span></td> </tr> </table> </span></div> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table"><span class="ltx_text" id="S4.T3.4.1.1" style="font-size:90%;">Table 3</span>: </span><span class="ltx_text" id="S4.T3.5.2" style="font-size:90%;">Comparison with Baseline Models. For fair comparison, our Sit, Lie, and Reach policies are trained only on SAMP <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib12" title=""><span class="ltx_text" style="font-size:90%;">12</span></a>]</cite> here. 
Our Carry policy is trained on the small amount of carry motions from AMASS <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib22" title=""><span class="ltx_text" style="font-size:90%;">22</span></a>]</cite>. (+data) denotes our results when trained on the available motions from the mixture of 6 datasets.</span></figcaption> </figure> <figure class="ltx_figure" id="S4.F3"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="434" id="S4.F3.g1" src="x3.png" width="822"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F3.2.1.1" style="font-size:90%;">Figure 3</span>: </span><span class="ltx_text" id="S4.F3.3.2" style="font-size:90%;">Long-term scripts with detailed keyframes and vivid final stories in two complex 3D scenes generated by our complete system. Upper: character in the bedroom and living room. Lower: character in the living room, dining room, and study room. We briefly show the retrieved summaries, keyframes, and part of the final long stories.</span></figcaption> </figure> </section> <section class="ltx_subsubsection" id="S4.SS3.SSS2"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection">4.3.2 </span>Motion Diversity for Different Skills</h4> <div class="ltx_para" id="S4.SS3.SSS2.p1"> <p class="ltx_p" id="S4.SS3.SSS2.p1.1">We compare motion diversity in the Sit and Lie skills with UniHSI <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib47" title=""><span class="ltx_text" style="font-size:90%;">47</span></a>]</cite> and our re-implemented InterPhys <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib13" title=""><span class="ltx_text" style="font-size:90%;">13</span></a>]</cite>. 
All experiments are conducted on a single RTX 4090 GPU, running 1024 sequences and aggregating the results over 10 trials. For each sequence, the text condition is randomly sampled from the dataset. To test UniHSI <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib47" title=""><span class="ltx_text" style="font-size:90%;">47</span></a>]</cite>, we randomly sample contact pairs from the chains of contacts provided in the generated ScenePlan dataset. We measure the FID between the generated motions and the reference motions from SAMP <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib12" title=""><span class="ltx_text" style="font-size:90%;">12</span></a>]</cite>. The APD measures the diversity among the generated motion sequences. As shown in <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.T4" title="In 4.3.2 Motion Diversity for Different Skills ‣ 4.3 Comparison with SOTA methods ‣ 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Tab.</span> <span class="ltx_text ltx_ref_tag">4</span></a>, our results significantly outperform UniHSI in both FID and APD metrics. Our method achieves lower FID, indicating that the motions it generates are closer to the distribution of reference motions. Notably, the APD results highlight that the motions generated by UniHSI are nearly identical, demonstrating a lack of diversity. 
Our method also surpasses the re-implemented InterPhys <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib13" title=""><span class="ltx_text" style="font-size:90%;">13</span></a>]</cite>.</p> </div> <figure class="ltx_table" id="S4.T4"> <div class="ltx_inline-block ltx_transformed_outer" id="S4.T4.8" style="width:433.6pt;height:107.9pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(35.4pt,-8.8pt) scale(1.19508049925066,1.19508049925066) ;"> <table class="ltx_tabular ltx_align_middle" id="S4.T4.8.8"> <tr class="ltx_tr" id="S4.T4.2.2.2"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_tt" id="S4.T4.2.2.2.3"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.2.2.2.3.1"> <span class="ltx_p" id="S4.T4.2.2.2.3.1.1" style="width:56.9pt;"><span class="ltx_text" id="S4.T4.2.2.2.3.1.1.1">Method</span></span> </span> </td> <td class="ltx_td ltx_align_center ltx_align_top ltx_border_r ltx_border_tt" colspan="3" id="S4.T4.1.1.1.1">FID<math alttext="\downarrow" class="ltx_Math" display="inline" id="S4.T4.1.1.1.1.m1.1"><semantics id="S4.T4.1.1.1.1.m1.1a"><mo id="S4.T4.1.1.1.1.m1.1.1" stretchy="false" xref="S4.T4.1.1.1.1.m1.1.1.cmml">↓</mo><annotation-xml encoding="MathML-Content" id="S4.T4.1.1.1.1.m1.1b"><ci id="S4.T4.1.1.1.1.m1.1.1.cmml" xref="S4.T4.1.1.1.1.m1.1.1">↓</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.T4.1.1.1.1.m1.1c">\downarrow</annotation><annotation encoding="application/x-llamapun" id="S4.T4.1.1.1.1.m1.1d">↓</annotation></semantics></math> </td> <td class="ltx_td ltx_align_center ltx_align_top ltx_border_tt" colspan="3" id="S4.T4.2.2.2.2">APD<math alttext="\uparrow" class="ltx_Math" display="inline" id="S4.T4.2.2.2.2.m1.1"><semantics id="S4.T4.2.2.2.2.m1.1a"><mo id="S4.T4.2.2.2.2.m1.1.1" stretchy="false" xref="S4.T4.2.2.2.2.m1.1.1.cmml">↑</mo><annotation-xml encoding="MathML-Content" id="S4.T4.2.2.2.2.m1.1b"><ci 
id="S4.T4.2.2.2.2.m1.1.1.cmml" xref="S4.T4.2.2.2.2.m1.1.1">↑</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.T4.2.2.2.2.m1.1c">\uparrow</annotation><annotation encoding="application/x-llamapun" id="S4.T4.2.2.2.2.m1.1d">↑</annotation></semantics></math> </td> </tr> <tr class="ltx_tr" id="S4.T4.8.8.9"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r" id="S4.T4.8.8.9.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.8.8.9.1.1"> <span class="ltx_p" id="S4.T4.8.8.9.1.1.1" style="width:56.9pt;"></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r" id="S4.T4.8.8.9.2"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.8.8.9.2.1"> <span class="ltx_p" id="S4.T4.8.8.9.2.1.1" style="width:28.5pt;">Sit</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r" id="S4.T4.8.8.9.3"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.8.8.9.3.1"> <span class="ltx_p" id="S4.T4.8.8.9.3.1.1" style="width:28.5pt;">Lie</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r" id="S4.T4.8.8.9.4"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.8.8.9.4.1"> <span class="ltx_p" id="S4.T4.8.8.9.4.1.1" style="width:28.5pt;">Carry</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r" id="S4.T4.8.8.9.5"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.8.8.9.5.1"> <span class="ltx_p" id="S4.T4.8.8.9.5.1.1" style="width:45.5pt;">Sit</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r" id="S4.T4.8.8.9.6"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.8.8.9.6.1"> <span class="ltx_p" id="S4.T4.8.8.9.6.1.1" style="width:45.5pt;">Lie</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top" id="S4.T4.8.8.9.7"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.8.8.9.7.1"> <span class="ltx_p" id="S4.T4.8.8.9.7.1.1" style="width:45.5pt;">Carry</span> </span> </td> 
</tr> <tr class="ltx_tr" id="S4.T4.3.3.3"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S4.T4.3.3.3.2"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.3.3.3.2.1"> <span class="ltx_p" id="S4.T4.3.3.3.2.1.1" style="width:56.9pt;">InterPhys* <cite class="ltx_cite ltx_centering ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib13" title=""><span class="ltx_text" style="font-size:90%;">13</span></a>]</cite></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S4.T4.3.3.3.3"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.3.3.3.3.1"> <span class="ltx_p" id="S4.T4.3.3.3.3.1.1" style="width:28.5pt;">-</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S4.T4.3.3.3.4"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.3.3.3.4.1"> <span class="ltx_p" id="S4.T4.3.3.3.4.1.1" style="width:28.5pt;">-</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S4.T4.3.3.3.5"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.3.3.3.5.1"> <span class="ltx_p" id="S4.T4.3.3.3.5.1.1" style="width:28.5pt;">81.0</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S4.T4.3.3.3.6"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.3.3.3.6.1"> <span class="ltx_p" id="S4.T4.3.3.3.6.1.1" style="width:45.5pt;">-</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S4.T4.3.3.3.7"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.3.3.3.7.1"> <span class="ltx_p" id="S4.T4.3.3.3.7.1.1" style="width:45.5pt;">-</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S4.T4.3.3.3.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.3.3.3.1.1"> <span class="ltx_p" id="S4.T4.3.3.3.1.1.1" style="width:45.5pt;">12.41<math 
alttext="\pm" class="ltx_centering" display="inline" id="S4.T4.3.3.3.1.1.1.m1.1"><semantics id="S4.T4.3.3.3.1.1.1.m1.1a"><mo id="S4.T4.3.3.3.1.1.1.m1.1.1" xref="S4.T4.3.3.3.1.1.1.m1.1.1.cmml">±</mo><annotation-xml encoding="MathML-Content" id="S4.T4.3.3.3.1.1.1.m1.1b"><csymbol cd="latexml" id="S4.T4.3.3.3.1.1.1.m1.1.1.cmml" xref="S4.T4.3.3.3.1.1.1.m1.1.1">plus-or-minus</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.T4.3.3.3.1.1.1.m1.1c">\pm</annotation><annotation encoding="application/x-llamapun" id="S4.T4.3.3.3.1.1.1.m1.1d">±</annotation></semantics></math>0.19</span> </span> </td> </tr> <tr class="ltx_tr" id="S4.T4.5.5.5"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r" id="S4.T4.5.5.5.3"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.5.5.5.3.1"> <span class="ltx_p" id="S4.T4.5.5.5.3.1.1" style="width:56.9pt;">UniHSI <cite class="ltx_cite ltx_centering ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib47" title=""><span class="ltx_text" style="font-size:90%;">47</span></a>]</cite></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r" id="S4.T4.5.5.5.4"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.5.5.5.4.1"> <span class="ltx_p" id="S4.T4.5.5.5.4.1.1" style="width:28.5pt;">153.84</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r" id="S4.T4.5.5.5.5"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.5.5.5.5.1"> <span class="ltx_p" id="S4.T4.5.5.5.5.1.1" style="width:28.5pt;">211.22</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r" id="S4.T4.5.5.5.6"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.5.5.5.6.1"> <span class="ltx_p" id="S4.T4.5.5.5.6.1.1" style="width:28.5pt;">-</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r" id="S4.T4.4.4.4.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.4.4.4.1.1"> <span 
class="ltx_p" id="S4.T4.4.4.4.1.1.1" style="width:45.5pt;">1.14<math alttext="\pm" class="ltx_centering" display="inline" id="S4.T4.4.4.4.1.1.1.m1.1"><semantics id="S4.T4.4.4.4.1.1.1.m1.1a"><mo id="S4.T4.4.4.4.1.1.1.m1.1.1" xref="S4.T4.4.4.4.1.1.1.m1.1.1.cmml">±</mo><annotation-xml encoding="MathML-Content" id="S4.T4.4.4.4.1.1.1.m1.1b"><csymbol cd="latexml" id="S4.T4.4.4.4.1.1.1.m1.1.1.cmml" xref="S4.T4.4.4.4.1.1.1.m1.1.1">plus-or-minus</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.T4.4.4.4.1.1.1.m1.1c">\pm</annotation><annotation encoding="application/x-llamapun" id="S4.T4.4.4.4.1.1.1.m1.1d">±</annotation></semantics></math>0.01</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r" id="S4.T4.5.5.5.2"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.5.5.5.2.1"> <span class="ltx_p" id="S4.T4.5.5.5.2.1.1" style="width:45.5pt;">1.35<math alttext="\pm" class="ltx_centering" display="inline" id="S4.T4.5.5.5.2.1.1.m1.1"><semantics id="S4.T4.5.5.5.2.1.1.m1.1a"><mo id="S4.T4.5.5.5.2.1.1.m1.1.1" xref="S4.T4.5.5.5.2.1.1.m1.1.1.cmml">±</mo><annotation-xml encoding="MathML-Content" id="S4.T4.5.5.5.2.1.1.m1.1b"><csymbol cd="latexml" id="S4.T4.5.5.5.2.1.1.m1.1.1.cmml" xref="S4.T4.5.5.5.2.1.1.m1.1.1">plus-or-minus</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.T4.5.5.5.2.1.1.m1.1c">\pm</annotation><annotation encoding="application/x-llamapun" id="S4.T4.5.5.5.2.1.1.m1.1d">±</annotation></semantics></math>0.02</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top" id="S4.T4.5.5.5.7"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.5.5.5.7.1"> <span class="ltx_p" id="S4.T4.5.5.5.7.1.1" style="width:45.5pt;">-</span> </span> </td> </tr> <tr class="ltx_tr" id="S4.T4.8.8.8"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S4.T4.8.8.8.4"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.8.8.8.4.1"> <span class="ltx_p" 
id="S4.T4.8.8.8.4.1.1" style="width:56.9pt;">SIMS</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S4.T4.8.8.8.5"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.8.8.8.5.1"> <span class="ltx_p" id="S4.T4.8.8.8.5.1.1" style="width:28.5pt;"><span class="ltx_text ltx_font_bold" id="S4.T4.8.8.8.5.1.1.1">125.66</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S4.T4.8.8.8.6"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.8.8.8.6.1"> <span class="ltx_p" id="S4.T4.8.8.8.6.1.1" style="width:28.5pt;"><span class="ltx_text ltx_font_bold" id="S4.T4.8.8.8.6.1.1.1">171.24</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S4.T4.8.8.8.7"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.8.8.8.7.1"> <span class="ltx_p" id="S4.T4.8.8.8.7.1.1" style="width:28.5pt;"><span class="ltx_text ltx_font_bold" id="S4.T4.8.8.8.7.1.1.1">65.14</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S4.T4.6.6.6.1"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.6.6.6.1.1"> <span class="ltx_p" id="S4.T4.6.6.6.1.1.1" style="width:45.5pt;"><span class="ltx_text ltx_font_bold" id="S4.T4.6.6.6.1.1.1.1">16.55<math alttext="\pm" class="ltx_Math" display="inline" id="S4.T4.6.6.6.1.1.1.1.m1.1"><semantics id="S4.T4.6.6.6.1.1.1.1.m1.1a"><mo id="S4.T4.6.6.6.1.1.1.1.m1.1.1" xref="S4.T4.6.6.6.1.1.1.1.m1.1.1.cmml">±</mo><annotation-xml encoding="MathML-Content" id="S4.T4.6.6.6.1.1.1.1.m1.1b"><csymbol cd="latexml" id="S4.T4.6.6.6.1.1.1.1.m1.1.1.cmml" xref="S4.T4.6.6.6.1.1.1.1.m1.1.1">plus-or-minus</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.T4.6.6.6.1.1.1.1.m1.1c">\pm</annotation><annotation encoding="application/x-llamapun" 
id="S4.T4.6.6.6.1.1.1.1.m1.1d">±</annotation></semantics></math>0.54</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S4.T4.7.7.7.2"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.7.7.7.2.1"> <span class="ltx_p" id="S4.T4.7.7.7.2.1.1" style="width:45.5pt;"><span class="ltx_text ltx_font_bold" id="S4.T4.7.7.7.2.1.1.1">16.40<math alttext="\pm" class="ltx_Math" display="inline" id="S4.T4.7.7.7.2.1.1.1.m1.1"><semantics id="S4.T4.7.7.7.2.1.1.1.m1.1a"><mo id="S4.T4.7.7.7.2.1.1.1.m1.1.1" xref="S4.T4.7.7.7.2.1.1.1.m1.1.1.cmml">±</mo><annotation-xml encoding="MathML-Content" id="S4.T4.7.7.7.2.1.1.1.m1.1b"><csymbol cd="latexml" id="S4.T4.7.7.7.2.1.1.1.m1.1.1.cmml" xref="S4.T4.7.7.7.2.1.1.1.m1.1.1">plus-or-minus</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.T4.7.7.7.2.1.1.1.m1.1c">\pm</annotation><annotation encoding="application/x-llamapun" id="S4.T4.7.7.7.2.1.1.1.m1.1d">±</annotation></semantics></math>0.94</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_t" id="S4.T4.8.8.8.3"> <span class="ltx_inline-block ltx_align_top" id="S4.T4.8.8.8.3.1"> <span class="ltx_p" id="S4.T4.8.8.8.3.1.1" style="width:45.5pt;"><span class="ltx_text ltx_font_bold" id="S4.T4.8.8.8.3.1.1.1">14.36<math alttext="\pm" class="ltx_Math" display="inline" id="S4.T4.8.8.8.3.1.1.1.m1.1"><semantics id="S4.T4.8.8.8.3.1.1.1.m1.1a"><mo id="S4.T4.8.8.8.3.1.1.1.m1.1.1" xref="S4.T4.8.8.8.3.1.1.1.m1.1.1.cmml">±</mo><annotation-xml encoding="MathML-Content" id="S4.T4.8.8.8.3.1.1.1.m1.1b"><csymbol cd="latexml" id="S4.T4.8.8.8.3.1.1.1.m1.1.1.cmml" xref="S4.T4.8.8.8.3.1.1.1.m1.1.1">plus-or-minus</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.T4.8.8.8.3.1.1.1.m1.1c">\pm</annotation><annotation encoding="application/x-llamapun" id="S4.T4.8.8.8.3.1.1.1.m1.1d">±</annotation></semantics></math>0.12</span></span> </span> </td> </tr> 
</table> </span></div> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_table"><span class="ltx_text" id="S4.T4.10.1.1" style="font-size:90%;">Table 4</span>: </span><span class="ltx_text" id="S4.T4.11.2" style="font-size:90%;">Motion diversity results. InterPhys <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib13" title=""><span class="ltx_text" style="font-size:90%;">13</span></a>]</cite> has not been released, so we report our re-implementation (marked *) here. For a fair comparison, our Sit, Lie, and Reach policies are trained only on SAMP <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib12" title=""><span class="ltx_text" style="font-size:90%;">12</span></a>]</cite> here, while the Carry policy and the re-implemented InterPhys are both trained on the carry motions from ViconStyle. </span></figcaption> </figure> </section> <section class="ltx_subsubsection" id="S4.SS3.SSS3"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection">4.3.3 </span>User Study on SOTA Long-Term HSI Methods</h4> <div class="ltx_para" id="S4.SS3.SSS3.p1"> <p class="ltx_p" id="S4.SS3.SSS3.p1.1">To further evaluate the control capabilities of the long-term scripts, we conducted a user study on rendered videos generated by the different methods. We use the same categories of interactions to drive the characters in the scenes. Thirty participants were asked to rate the physical realism, motion diversity, plot engagement, and emotional resonance of the videos produced by each method on a scale from 1 (poor) to 5 (excellent). 
As shown in <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.T5" title="In 4.3.3 User Study on SOTA Long-Term HSI Methods ‣ 4.3 Comparison with SOTA methods ‣ 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Tab.</span> <span class="ltx_text ltx_ref_tag">5</span></a>, our approach receives higher ratings than UniHSI on all four criteria, covering both body motion quality (physical realism, motion diversity) and script quality (plot engagement, emotional resonance).</p> </div> <figure class="ltx_table" id="S4.T5"> <div class="ltx_inline-block ltx_transformed_outer" id="S4.T5.4" style="width:346.9pt;height:105.8pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(25.8pt,-7.9pt) scale(1.1751257272861,1.1751257272861) ;"> <table class="ltx_tabular ltx_align_middle" id="S4.T5.4.4"> <tr class="ltx_tr" id="S4.T5.4.4.5"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_tt" colspan="2" id="S4.T5.4.4.5.1">Metrics</td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_tt" id="S4.T5.4.4.5.2"> <span class="ltx_inline-block ltx_align_top" id="S4.T5.4.4.5.2.1"> <span class="ltx_p" id="S4.T5.4.4.5.2.1.1" style="width:56.9pt;">UniHSI</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_tt" id="S4.T5.4.4.5.3"> <span class="ltx_inline-block ltx_align_top" id="S4.T5.4.4.5.3.1"> <span class="ltx_p" id="S4.T5.4.4.5.3.1.1" style="width:56.9pt;">SIMS</span> </span> </td> </tr> <tr class="ltx_tr" id="S4.T5.1.1.1"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S4.T5.1.1.1.2" rowspan="2"><span class="ltx_text" id="S4.T5.1.1.1.2.1">Motion</span></td> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S4.T5.1.1.1.1">Physical Realism <math alttext="\uparrow" class="ltx_Math" display="inline" id="S4.T5.1.1.1.1.m1.1"><semantics id="S4.T5.1.1.1.1.m1.1a"><mo 
id="S4.T5.1.1.1.1.m1.1.1" stretchy="false" xref="S4.T5.1.1.1.1.m1.1.1.cmml">↑</mo><annotation-xml encoding="MathML-Content" id="S4.T5.1.1.1.1.m1.1b"><ci id="S4.T5.1.1.1.1.m1.1.1.cmml" xref="S4.T5.1.1.1.1.m1.1.1">↑</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.T5.1.1.1.1.m1.1c">\uparrow</annotation><annotation encoding="application/x-llamapun" id="S4.T5.1.1.1.1.m1.1d">↑</annotation></semantics></math> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S4.T5.1.1.1.3"> <span class="ltx_inline-block ltx_align_top" id="S4.T5.1.1.1.3.1"> <span class="ltx_p" id="S4.T5.1.1.1.3.1.1" style="width:56.9pt;">2.6</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S4.T5.1.1.1.4"> <span class="ltx_inline-block ltx_align_top" id="S4.T5.1.1.1.4.1"> <span class="ltx_p" id="S4.T5.1.1.1.4.1.1" style="width:56.9pt;"><span class="ltx_text ltx_font_bold" id="S4.T5.1.1.1.4.1.1.1">3.4</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S4.T5.2.2.2"> <td class="ltx_td ltx_align_left ltx_border_r" id="S4.T5.2.2.2.1">Motion Diversity <math alttext="\uparrow" class="ltx_Math" display="inline" id="S4.T5.2.2.2.1.m1.1"><semantics id="S4.T5.2.2.2.1.m1.1a"><mo id="S4.T5.2.2.2.1.m1.1.1" stretchy="false" xref="S4.T5.2.2.2.1.m1.1.1.cmml">↑</mo><annotation-xml encoding="MathML-Content" id="S4.T5.2.2.2.1.m1.1b"><ci id="S4.T5.2.2.2.1.m1.1.1.cmml" xref="S4.T5.2.2.2.1.m1.1.1">↑</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.T5.2.2.2.1.m1.1c">\uparrow</annotation><annotation encoding="application/x-llamapun" id="S4.T5.2.2.2.1.m1.1d">↑</annotation></semantics></math> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r" id="S4.T5.2.2.2.2"> <span class="ltx_inline-block ltx_align_top" id="S4.T5.2.2.2.2.1"> <span class="ltx_p" id="S4.T5.2.2.2.2.1.1" style="width:56.9pt;">2.9</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top" id="S4.T5.2.2.2.3"> <span 
class="ltx_inline-block ltx_align_top" id="S4.T5.2.2.2.3.1"> <span class="ltx_p" id="S4.T5.2.2.2.3.1.1" style="width:56.9pt;"><span class="ltx_text ltx_font_bold" id="S4.T5.2.2.2.3.1.1.1">3.6</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S4.T5.3.3.3"> <td class="ltx_td ltx_align_left ltx_border_bb ltx_border_r ltx_border_t" id="S4.T5.3.3.3.2" rowspan="2"><span class="ltx_text" id="S4.T5.3.3.3.2.1">Script</span></td> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S4.T5.3.3.3.1">Plot Engagement <math alttext="\uparrow" class="ltx_Math" display="inline" id="S4.T5.3.3.3.1.m1.1"><semantics id="S4.T5.3.3.3.1.m1.1a"><mo id="S4.T5.3.3.3.1.m1.1.1" stretchy="false" xref="S4.T5.3.3.3.1.m1.1.1.cmml">↑</mo><annotation-xml encoding="MathML-Content" id="S4.T5.3.3.3.1.m1.1b"><ci id="S4.T5.3.3.3.1.m1.1.1.cmml" xref="S4.T5.3.3.3.1.m1.1.1">↑</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.T5.3.3.3.1.m1.1c">\uparrow</annotation><annotation encoding="application/x-llamapun" id="S4.T5.3.3.3.1.m1.1d">↑</annotation></semantics></math> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S4.T5.3.3.3.3"> <span class="ltx_inline-block ltx_align_top" id="S4.T5.3.3.3.3.1"> <span class="ltx_p" id="S4.T5.3.3.3.3.1.1" style="width:56.9pt;">2.4</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S4.T5.3.3.3.4"> <span class="ltx_inline-block ltx_align_top" id="S4.T5.3.3.3.4.1"> <span class="ltx_p" id="S4.T5.3.3.3.4.1.1" style="width:56.9pt;"><span class="ltx_text ltx_font_bold" id="S4.T5.3.3.3.4.1.1.1">3.0</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S4.T5.4.4.4"> <td class="ltx_td ltx_align_left ltx_border_bb ltx_border_r" id="S4.T5.4.4.4.1">Emotional Resonance <math alttext="\uparrow" class="ltx_Math" display="inline" id="S4.T5.4.4.4.1.m1.1"><semantics id="S4.T5.4.4.4.1.m1.1a"><mo id="S4.T5.4.4.4.1.m1.1.1" stretchy="false" 
xref="S4.T5.4.4.4.1.m1.1.1.cmml">↑</mo><annotation-xml encoding="MathML-Content" id="S4.T5.4.4.4.1.m1.1b"><ci id="S4.T5.4.4.4.1.m1.1.1.cmml" xref="S4.T5.4.4.4.1.m1.1.1">↑</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.T5.4.4.4.1.m1.1c">\uparrow</annotation><annotation encoding="application/x-llamapun" id="S4.T5.4.4.4.1.m1.1d">↑</annotation></semantics></math> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r" id="S4.T5.4.4.4.2"> <span class="ltx_inline-block ltx_align_top" id="S4.T5.4.4.4.2.1"> <span class="ltx_p" id="S4.T5.4.4.4.2.1.1" style="width:56.9pt;">3.0</span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb" id="S4.T5.4.4.4.3"> <span class="ltx_inline-block ltx_align_top" id="S4.T5.4.4.4.3.1"> <span class="ltx_p" id="S4.T5.4.4.4.3.1.1" style="width:56.9pt;"><span class="ltx_text ltx_font_bold" id="S4.T5.4.4.4.3.1.1.1">3.8</span></span> </span> </td> </tr> </table> </span></div> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_table"><span class="ltx_text" id="S4.T5.6.1.1" style="font-size:90%;">Table 5</span>: </span><span class="ltx_text" id="S4.T5.7.2" style="font-size:90%;">User Study on SOTA long-term HSI methods. 
SIMS outperforms the SOTA method UniHSI by a significant margin.</span></figcaption> </figure> <figure class="ltx_figure" id="S4.F4"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="494" id="S4.F4.1.g1" src="x4.png" width="797"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F4.3.1.1" style="font-size:90%;">Figure 4</span>: </span><span class="ltx_text" id="S4.F4.4.2" style="font-size:90%;">Qualitative results for skills with different text conditions.</span></figcaption> </figure> </section> </section> <section class="ltx_subsection" id="S4.SS4"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">4.4 </span>Ablation Study on SIMS</h3> <figure class="ltx_table" id="S4.T6"> <div class="ltx_inline-block ltx_align_center ltx_transformed_outer" id="S4.T6.2" style="width:346.9pt;height:50.3pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(-12.7pt,1.8pt) scale(0.931787383603231,0.931787383603231) ;"> <table class="ltx_tabular ltx_align_middle" id="S4.T6.2.2"> <tr class="ltx_tr" id="S4.T6.2.2.2"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_tt" id="S4.T6.2.2.2.3">Method</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" id="S4.T6.1.1.1.1">SBERT Similarity <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib32" title=""><span class="ltx_text" style="font-size:90%;">32</span></a>]</cite><math alttext="\downarrow" class="ltx_Math" display="inline" id="S4.T6.1.1.1.1.m1.1"><semantics id="S4.T6.1.1.1.1.m1.1a"><mo id="S4.T6.1.1.1.1.m1.1.1" stretchy="false" xref="S4.T6.1.1.1.1.m1.1.1.cmml">↓</mo><annotation-xml encoding="MathML-Content" id="S4.T6.1.1.1.1.m1.1b"><ci id="S4.T6.1.1.1.1.m1.1.1.cmml" xref="S4.T6.1.1.1.1.m1.1.1">↓</ci></annotation-xml><annotation encoding="application/x-tex" 
id="S4.T6.1.1.1.1.m1.1c">\downarrow</annotation><annotation encoding="application/x-llamapun" id="S4.T6.1.1.1.1.m1.1d">↓</annotation></semantics></math> </td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S4.T6.2.2.2.2">Average Generation Time(s)<math alttext="\downarrow" class="ltx_Math" display="inline" id="S4.T6.2.2.2.2.m1.1"><semantics id="S4.T6.2.2.2.2.m1.1a"><mo id="S4.T6.2.2.2.2.m1.1.1" stretchy="false" xref="S4.T6.2.2.2.2.m1.1.1.cmml">↓</mo><annotation-xml encoding="MathML-Content" id="S4.T6.2.2.2.2.m1.1b"><ci id="S4.T6.2.2.2.2.m1.1.1.cmml" xref="S4.T6.2.2.2.2.m1.1.1">↓</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.T6.2.2.2.2.m1.1c">\downarrow</annotation><annotation encoding="application/x-llamapun" id="S4.T6.2.2.2.2.m1.1d">↓</annotation></semantics></math> </td> </tr> <tr class="ltx_tr" id="S4.T6.2.2.3"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S4.T6.2.2.3.1">LLM</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T6.2.2.3.2">0.8167</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T6.2.2.3.3">12.2</td> </tr> <tr class="ltx_tr" id="S4.T6.2.2.4"> <td class="ltx_td ltx_align_left ltx_border_bb ltx_border_r" id="S4.T6.2.2.4.1">RASG</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T6.2.2.4.2"><span class="ltx_text ltx_font_bold" id="S4.T6.2.2.4.2.1">0.7759</span></td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T6.2.2.4.3"><span class="ltx_text ltx_font_bold" id="S4.T6.2.2.4.3.1">7.32</span></td> </tr> </table> </span></div> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table"><span class="ltx_text" id="S4.T6.4.1.1" style="font-size:90%;">Table 6</span>: </span><span class="ltx_text" id="S4.T6.5.2" style="font-size:90%;">Ablation on script generation methods.</span></figcaption> </figure> <section class="ltx_subsubsection" id="S4.SS4.SSS1"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag 
ltx_tag_subsubsection">4.4.1 </span>Direct Generation <span class="ltx_text ltx_font_italic" id="S4.SS4.SSS1.1.1">vs.</span> RASG.</h4> <div class="ltx_para" id="S4.SS4.SSS1.p1"> <p class="ltx_p" id="S4.SS4.SSS1.p1.1">We compare our RASG method with direct LLM generation using GPT-4 <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib1" title=""><span class="ltx_text" style="font-size:90%;">1</span></a>]</cite>. For direct LLM generation, we provide the LLM with all the available skills as input. To evaluate the narrative diversity and generation efficiency of our approach, we measure the cosine similarity of SBERT <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib32" title=""><span class="ltx_text" style="font-size:90%;">32</span></a>]</cite> embeddings and the generation time. Our method achieves lower cosine similarity among the generated stories (0.7759 vs. 0.8167), indicating that it produces more diverse scripts. For generation time, we require the LLM to generate approximately 20 keyframes in the direct generation setting. For the RASG method, we ask the LLM to retrieve 4-5 short scripts, which total approximately 20 keyframes; the average generation time is 7.32 s for RASG versus 12.2 s for direct generation. 
The results are evaluated on 200 generated samples separately.</p> </div> </section> <section class="ltx_subsubsection" id="S4.SS4.SSS2"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection">4.4.2 </span>Generalization on Unseen Objects</h4> <figure class="ltx_table" id="S4.T7"> <div class="ltx_inline-block ltx_align_center ltx_transformed_outer" id="S4.T7.2" style="width:303.5pt;height:78.8pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(13.1pt,-3.4pt) scale(1.09457633742172,1.09457633742172) ;"> <table class="ltx_tabular ltx_align_middle" id="S4.T7.2.2"> <tr class="ltx_tr" id="S4.T7.2.2.2"> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" id="S4.T7.2.2.2.3" rowspan="2"><span class="ltx_text" id="S4.T7.2.2.2.3.1">Datasets</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" colspan="2" id="S4.T7.1.1.1.1">Success Rate(%)<math alttext="\uparrow" class="ltx_Math" display="inline" id="S4.T7.1.1.1.1.m1.1"><semantics id="S4.T7.1.1.1.1.m1.1a"><mo id="S4.T7.1.1.1.1.m1.1.1" stretchy="false" xref="S4.T7.1.1.1.1.m1.1.1.cmml">↑</mo><annotation-xml encoding="MathML-Content" id="S4.T7.1.1.1.1.m1.1b"><ci id="S4.T7.1.1.1.1.m1.1.1.cmml" xref="S4.T7.1.1.1.1.m1.1.1">↑</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.T7.1.1.1.1.m1.1c">\uparrow</annotation><annotation encoding="application/x-llamapun" id="S4.T7.1.1.1.1.m1.1d">↑</annotation></semantics></math> </td> <td class="ltx_td ltx_align_center ltx_border_tt" colspan="2" id="S4.T7.2.2.2.2">Contact Error<math alttext="\downarrow" class="ltx_Math" display="inline" id="S4.T7.2.2.2.2.m1.1"><semantics id="S4.T7.2.2.2.2.m1.1a"><mo id="S4.T7.2.2.2.2.m1.1.1" stretchy="false" xref="S4.T7.2.2.2.2.m1.1.1.cmml">↓</mo><annotation-xml encoding="MathML-Content" id="S4.T7.2.2.2.2.m1.1b"><ci id="S4.T7.2.2.2.2.m1.1.1.cmml" xref="S4.T7.2.2.2.2.m1.1.1">↓</ci></annotation-xml><annotation encoding="application/x-tex" 
id="S4.T7.2.2.2.2.m1.1c">\downarrow</annotation><annotation encoding="application/x-llamapun" id="S4.T7.2.2.2.2.m1.1d">↓</annotation></semantics></math> </td> </tr> <tr class="ltx_tr" id="S4.T7.2.2.3"> <td class="ltx_td ltx_align_center" id="S4.T7.2.2.3.1">Sit</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T7.2.2.3.2">Lie</td> <td class="ltx_td ltx_align_center" id="S4.T7.2.2.3.3">Sit</td> <td class="ltx_td ltx_align_center" id="S4.T7.2.2.3.4">Lie</td> </tr> <tr class="ltx_tr" id="S4.T7.2.2.4"> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T7.2.2.4.1">PartNet <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib25" title=""><span class="ltx_text" style="font-size:90%;">25</span></a>]</cite> </td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T7.2.2.4.2">98.7</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T7.2.2.4.3">87.6</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T7.2.2.4.4">0.028</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T7.2.2.4.5">0.065</td> </tr> <tr class="ltx_tr" id="S4.T7.2.2.5"> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T7.2.2.5.1">3DFront <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib7" title=""><span class="ltx_text" style="font-size:90%;">7</span></a>]</cite> </td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T7.2.2.5.2">96.9</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T7.2.2.5.3">89.7</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T7.2.2.5.4">0.014</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T7.2.2.5.5">0.030</td> </tr> </table> </span></div> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table"><span class="ltx_text" id="S4.T7.4.1.1" style="font-size:90%;">Table 7</span>: </span><span 
class="ltx_text" id="S4.T7.5.2" style="font-size:90%;">Results on PartNet and 3DFront. The policies are trained on 3DFront’s furniture only.</span></figcaption> </figure> <div class="ltx_para" id="S4.SS4.SSS2.p1"> <p class="ltx_p" id="S4.SS4.SSS2.p1.1">In <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.T7" title="In 4.4.2 Generalization on Unseen Objects ‣ 4.4 Ablation Study on SIMS ‣ 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Tab.</span> <span class="ltx_text ltx_ref_tag">7</span></a>, we show the physical performance of interaction skills on PartNet <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib25" title=""><span class="ltx_text" style="font-size:90%;">25</span></a>]</cite> and 3DFront <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib7" title=""><span class="ltx_text" style="font-size:90%;">7</span></a>]</cite>. Note that our policies are only trained on the objects from 3DFront. 
The table shows that the policies reach comparable success rates on the unseen PartNet objects (98.7% vs. 96.9% for Sit, 87.6% vs. 89.7% for Lie), with somewhat higher contact error, which we attribute mainly to the generalization ability of the heightmap design.</p> </div> </section> <section class="ltx_subsubsection" id="S4.SS4.SSS3"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection">4.4.3 </span>Scale Up on New Motion Datasets</h4> <div class="ltx_para" id="S4.SS4.SSS3.p1"> <p class="ltx_p" id="S4.SS4.SSS3.p1.1">To demonstrate the reliability of the proposed datasets and the generality of our text-conditioned policy, we report the Success Rate and APD for the Walk, Carry, Sit, and Lie skills in <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.T9" title="In 4.4.3 Scale Up on New Motion Datasets ‣ 4.4 Ablation Study on SIMS ‣ 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Tab.</span> <span class="ltx_text ltx_ref_tag">9</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.T9" title="In 4.4.3 Scale Up on New Motion Datasets ‣ 4.4 Ablation Study on SIMS ‣ 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Tab.</span> <span class="ltx_text ltx_ref_tag">9</span></a>, and <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.T10" title="In 4.4.3 Scale Up on New Motion Datasets ‣ 4.4 Ablation Study on SIMS ‣ 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Tab.</span> <span class="ltx_text ltx_ref_tag">10</span></a>. The tables show that, with more data, Walk achieves a higher success rate (95.1% vs. 92.6%), mainly because AMASS provides stable neutral walking and running motions. The APD changes little (14.83 to 14.88) because 100Style also contains neutral walking styles. 
For the Carry skill, since ViconStyle is the first dataset containing stylized carrying motion, both metrics increase (Success Rate from 92.9% to 96.4%, APD from 14.36 to 14.92). For the HSI skills, Sit and Lie both improve slightly with the introduction of the COUCH and ViconStyle datasets: COUCH provides more stylized sitting motions, and ViconStyle provides more stylized lying motions.</p> </div> <figure class="ltx_table" id="S4.T9"> <div class="ltx_flex_figure ltx_flex_table"> <div class="ltx_flex_cell ltx_flex_size_1"> <figure class="ltx_figure ltx_figure_panel ltx_minipage ltx_align_center ltx_align_middle" id="S4.T9.4" style="width:199.5pt;"> <div class="ltx_inline-block ltx_transformed_outer" id="S4.T9.4.4" style="width:433.6pt;height:156.5pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(117.0pt,-42.2pt) scale(2.17292519842723,2.17292519842723) ;"> <table class="ltx_tabular ltx_align_middle" id="S4.T9.4.4.4"> <tr class="ltx_tr" id="S4.T9.2.2.2.2"> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" id="S4.T9.2.2.2.2.3" rowspan="2"><span class="ltx_text" id="S4.T9.2.2.2.2.3.1">Datasets</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" id="S4.T9.1.1.1.1.1">Success Rate(%)<math alttext="\uparrow" class="ltx_Math" display="inline" id="S4.T9.1.1.1.1.1.m1.1"><semantics id="S4.T9.1.1.1.1.1.m1.1a"><mo id="S4.T9.1.1.1.1.1.m1.1.1" stretchy="false" xref="S4.T9.1.1.1.1.1.m1.1.1.cmml">↑</mo><annotation-xml encoding="MathML-Content" id="S4.T9.1.1.1.1.1.m1.1b"><ci id="S4.T9.1.1.1.1.1.m1.1.1.cmml" xref="S4.T9.1.1.1.1.1.m1.1.1">↑</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.T9.1.1.1.1.1.m1.1c">\uparrow</annotation><annotation encoding="application/x-llamapun" id="S4.T9.1.1.1.1.1.m1.1d">↑</annotation></semantics></math> </td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S4.T9.2.2.2.2.2">APD<math alttext="\uparrow" class="ltx_Math" display="inline" id="S4.T9.2.2.2.2.2.m1.1"><semantics id="S4.T9.2.2.2.2.2.m1.1a"><mo
id="S4.T9.2.2.2.2.2.m1.1.1" stretchy="false" xref="S4.T9.2.2.2.2.2.m1.1.1.cmml">↑</mo><annotation-xml encoding="MathML-Content" id="S4.T9.2.2.2.2.2.m1.1b"><ci id="S4.T9.2.2.2.2.2.m1.1.1.cmml" xref="S4.T9.2.2.2.2.2.m1.1.1">↑</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.T9.2.2.2.2.2.m1.1c">\uparrow</annotation><annotation encoding="application/x-llamapun" id="S4.T9.2.2.2.2.2.m1.1d">↑</annotation></semantics></math> </td> </tr> <tr class="ltx_tr" id="S4.T9.4.4.4.5"> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T9.4.4.4.5.1">Walk</td> <td class="ltx_td ltx_align_center" id="S4.T9.4.4.4.5.2">Walk</td> </tr> <tr class="ltx_tr" id="S4.T9.3.3.3.3"> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T9.3.3.3.3.2">100S</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T9.3.3.3.3.3">92.6</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T9.3.3.3.3.1">14.83<math alttext="\pm" class="ltx_Math" display="inline" id="S4.T9.3.3.3.3.1.m1.1"><semantics id="S4.T9.3.3.3.3.1.m1.1a"><mo id="S4.T9.3.3.3.3.1.m1.1.1" xref="S4.T9.3.3.3.3.1.m1.1.1.cmml">±</mo><annotation-xml encoding="MathML-Content" id="S4.T9.3.3.3.3.1.m1.1b"><csymbol cd="latexml" id="S4.T9.3.3.3.3.1.m1.1.1.cmml" xref="S4.T9.3.3.3.3.1.m1.1.1">plus-or-minus</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.T9.3.3.3.3.1.m1.1c">\pm</annotation><annotation encoding="application/x-llamapun" id="S4.T9.3.3.3.3.1.m1.1d">±</annotation></semantics></math>0.35</td> </tr> <tr class="ltx_tr" id="S4.T9.4.4.4.4"> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T9.4.4.4.4.2">A+100S</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T9.4.4.4.4.3">95.1</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T9.4.4.4.4.1">14.88<math alttext="\pm" class="ltx_Math" display="inline" id="S4.T9.4.4.4.4.1.m1.1"><semantics id="S4.T9.4.4.4.4.1.m1.1a"><mo id="S4.T9.4.4.4.4.1.m1.1.1" 
xref="S4.T9.4.4.4.4.1.m1.1.1.cmml">±</mo><annotation-xml encoding="MathML-Content" id="S4.T9.4.4.4.4.1.m1.1b"><csymbol cd="latexml" id="S4.T9.4.4.4.4.1.m1.1.1.cmml" xref="S4.T9.4.4.4.4.1.m1.1.1">plus-or-minus</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.T9.4.4.4.4.1.m1.1c">\pm</annotation><annotation encoding="application/x-llamapun" id="S4.T9.4.4.4.4.1.m1.1d">±</annotation></semantics></math>0.29</td> </tr> </table> </span></div> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.T9.4.5.1.1" style="font-size:90%;">Table 8</span>: </span><span class="ltx_text" id="S4.T9.4.6.2" style="font-size:90%;">Dataset ablation on Walk Skill. 100S: 100Style , A: AMASS.</span></figcaption> </figure> </div> <div class="ltx_flex_break"></div> <div class="ltx_flex_cell ltx_flex_size_1"> <figure class="ltx_figure ltx_figure_panel ltx_minipage ltx_align_center ltx_align_middle" id="S4.T9.8" style="width:199.5pt;"> <div class="ltx_inline-block ltx_transformed_outer" id="S4.T9.8.4" style="width:433.6pt;height:156.5pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(117.0pt,-42.2pt) scale(2.17292519842723,2.17292519842723) ;"> <table class="ltx_tabular ltx_align_middle" id="S4.T9.8.4.4"> <tr class="ltx_tr" id="S4.T9.6.2.2.2"> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" id="S4.T9.6.2.2.2.3" rowspan="2"><span class="ltx_text" id="S4.T9.6.2.2.2.3.1">Datasets</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" id="S4.T9.5.1.1.1.1">Success Rate(%)<math alttext="\uparrow" class="ltx_Math" display="inline" id="S4.T9.5.1.1.1.1.m1.1"><semantics id="S4.T9.5.1.1.1.1.m1.1a"><mo id="S4.T9.5.1.1.1.1.m1.1.1" stretchy="false" xref="S4.T9.5.1.1.1.1.m1.1.1.cmml">↑</mo><annotation-xml encoding="MathML-Content" id="S4.T9.5.1.1.1.1.m1.1b"><ci id="S4.T9.5.1.1.1.1.m1.1.1.cmml" xref="S4.T9.5.1.1.1.1.m1.1.1">↑</ci></annotation-xml><annotation 
encoding="application/x-tex" id="S4.T9.5.1.1.1.1.m1.1c">\uparrow</annotation><annotation encoding="application/x-llamapun" id="S4.T9.5.1.1.1.1.m1.1d">↑</annotation></semantics></math> </td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S4.T9.6.2.2.2.2">APD<math alttext="\uparrow" class="ltx_Math" display="inline" id="S4.T9.6.2.2.2.2.m1.1"><semantics id="S4.T9.6.2.2.2.2.m1.1a"><mo id="S4.T9.6.2.2.2.2.m1.1.1" stretchy="false" xref="S4.T9.6.2.2.2.2.m1.1.1.cmml">↑</mo><annotation-xml encoding="MathML-Content" id="S4.T9.6.2.2.2.2.m1.1b"><ci id="S4.T9.6.2.2.2.2.m1.1.1.cmml" xref="S4.T9.6.2.2.2.2.m1.1.1">↑</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.T9.6.2.2.2.2.m1.1c">\uparrow</annotation><annotation encoding="application/x-llamapun" id="S4.T9.6.2.2.2.2.m1.1d">↑</annotation></semantics></math> </td> </tr> <tr class="ltx_tr" id="S4.T9.8.4.4.5"> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T9.8.4.4.5.1">carry</td> <td class="ltx_td ltx_align_center" id="S4.T9.8.4.4.5.2">carry</td> </tr> <tr class="ltx_tr" id="S4.T9.7.3.3.3"> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T9.7.3.3.3.2">A</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T9.7.3.3.3.3">92.9</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T9.7.3.3.3.1">14.36<math alttext="\pm" class="ltx_Math" display="inline" id="S4.T9.7.3.3.3.1.m1.1"><semantics id="S4.T9.7.3.3.3.1.m1.1a"><mo id="S4.T9.7.3.3.3.1.m1.1.1" xref="S4.T9.7.3.3.3.1.m1.1.1.cmml">±</mo><annotation-xml encoding="MathML-Content" id="S4.T9.7.3.3.3.1.m1.1b"><csymbol cd="latexml" id="S4.T9.7.3.3.3.1.m1.1.1.cmml" xref="S4.T9.7.3.3.3.1.m1.1.1">plus-or-minus</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.T9.7.3.3.3.1.m1.1c">\pm</annotation><annotation encoding="application/x-llamapun" id="S4.T9.7.3.3.3.1.m1.1d">±</annotation></semantics></math>0.12</td> </tr> <tr class="ltx_tr" id="S4.T9.8.4.4.4"> <td class="ltx_td ltx_align_center 
ltx_border_bb ltx_border_r" id="S4.T9.8.4.4.4.2">A+VS</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T9.8.4.4.4.3">96.4</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T9.8.4.4.4.1">14.92<math alttext="\pm" class="ltx_Math" display="inline" id="S4.T9.8.4.4.4.1.m1.1"><semantics id="S4.T9.8.4.4.4.1.m1.1a"><mo id="S4.T9.8.4.4.4.1.m1.1.1" xref="S4.T9.8.4.4.4.1.m1.1.1.cmml">±</mo><annotation-xml encoding="MathML-Content" id="S4.T9.8.4.4.4.1.m1.1b"><csymbol cd="latexml" id="S4.T9.8.4.4.4.1.m1.1.1.cmml" xref="S4.T9.8.4.4.4.1.m1.1.1">plus-or-minus</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.T9.8.4.4.4.1.m1.1c">\pm</annotation><annotation encoding="application/x-llamapun" id="S4.T9.8.4.4.4.1.m1.1d">±</annotation></semantics></math>0.23</td> </tr> </table> </span></div> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.T9.8.5.1.1" style="font-size:90%;">Table 9</span>: </span><span class="ltx_text" id="S4.T9.8.6.2" style="font-size:90%;">Dataset ablation on Carry Skill. A: AMASS. 
VS: ViconStyle.</span></figcaption> </figure> </div> </div> </figure> <figure class="ltx_table" id="S4.T10"> <div class="ltx_inline-block ltx_align_center ltx_transformed_outer" id="S4.T10.7" style="width:390.3pt;height:115.9pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(43.5pt,-12.9pt) scale(1.28726061263295,1.28726061263295) ;"> <table class="ltx_tabular ltx_align_middle" id="S4.T10.7.7"> <tr class="ltx_tr" id="S4.T10.3.3.3"> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" id="S4.T10.3.3.3.4" rowspan="2"><span class="ltx_text" id="S4.T10.3.3.3.4.1">Datasets</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" colspan="2" id="S4.T10.1.1.1.1">Success Rate(%)<math alttext="\uparrow" class="ltx_Math" display="inline" id="S4.T10.1.1.1.1.m1.1"><semantics id="S4.T10.1.1.1.1.m1.1a"><mo id="S4.T10.1.1.1.1.m1.1.1" stretchy="false" xref="S4.T10.1.1.1.1.m1.1.1.cmml">↑</mo><annotation-xml encoding="MathML-Content" id="S4.T10.1.1.1.1.m1.1b"><ci id="S4.T10.1.1.1.1.m1.1.1.cmml" xref="S4.T10.1.1.1.1.m1.1.1">↑</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.T10.1.1.1.1.m1.1c">\uparrow</annotation><annotation encoding="application/x-llamapun" id="S4.T10.1.1.1.1.m1.1d">↑</annotation></semantics></math> </td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" colspan="2" id="S4.T10.2.2.2.2">Contact error<math alttext="\downarrow" class="ltx_Math" display="inline" id="S4.T10.2.2.2.2.m1.1"><semantics id="S4.T10.2.2.2.2.m1.1a"><mo id="S4.T10.2.2.2.2.m1.1.1" stretchy="false" xref="S4.T10.2.2.2.2.m1.1.1.cmml">↓</mo><annotation-xml encoding="MathML-Content" id="S4.T10.2.2.2.2.m1.1b"><ci id="S4.T10.2.2.2.2.m1.1.1.cmml" xref="S4.T10.2.2.2.2.m1.1.1">↓</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.T10.2.2.2.2.m1.1c">\downarrow</annotation><annotation encoding="application/x-llamapun" id="S4.T10.2.2.2.2.m1.1d">↓</annotation></semantics></math> </td> <td 
class="ltx_td ltx_align_center ltx_border_tt" colspan="2" id="S4.T10.3.3.3.3">APD<math alttext="\uparrow" class="ltx_Math" display="inline" id="S4.T10.3.3.3.3.m1.1"><semantics id="S4.T10.3.3.3.3.m1.1a"><mo id="S4.T10.3.3.3.3.m1.1.1" stretchy="false" xref="S4.T10.3.3.3.3.m1.1.1.cmml">↑</mo><annotation-xml encoding="MathML-Content" id="S4.T10.3.3.3.3.m1.1b"><ci id="S4.T10.3.3.3.3.m1.1.1.cmml" xref="S4.T10.3.3.3.3.m1.1.1">↑</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.T10.3.3.3.3.m1.1c">\uparrow</annotation><annotation encoding="application/x-llamapun" id="S4.T10.3.3.3.3.m1.1d">↑</annotation></semantics></math> </td> </tr> <tr class="ltx_tr" id="S4.T10.7.7.8"> <td class="ltx_td ltx_align_center" id="S4.T10.7.7.8.1">Sit</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T10.7.7.8.2">Lie</td> <td class="ltx_td ltx_align_center" id="S4.T10.7.7.8.3">Sit</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T10.7.7.8.4">Lie</td> <td class="ltx_td ltx_align_center" id="S4.T10.7.7.8.5">Sit</td> <td class="ltx_td ltx_align_center" id="S4.T10.7.7.8.6">Lie</td> </tr> <tr class="ltx_tr" id="S4.T10.5.5.5"> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T10.5.5.5.3">S</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T10.5.5.5.4">95.5</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T10.5.5.5.5">86.9</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T10.5.5.5.6">0.040</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T10.5.5.5.7">0.055</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T10.4.4.4.1">16.43<math alttext="\pm" class="ltx_Math" display="inline" id="S4.T10.4.4.4.1.m1.1"><semantics id="S4.T10.4.4.4.1.m1.1a"><mo id="S4.T10.4.4.4.1.m1.1.1" xref="S4.T10.4.4.4.1.m1.1.1.cmml">±</mo><annotation-xml encoding="MathML-Content" id="S4.T10.4.4.4.1.m1.1b"><csymbol cd="latexml" id="S4.T10.4.4.4.1.m1.1.1.cmml" 
xref="S4.T10.4.4.4.1.m1.1.1">plus-or-minus</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.T10.4.4.4.1.m1.1c">\pm</annotation><annotation encoding="application/x-llamapun" id="S4.T10.4.4.4.1.m1.1d">±</annotation></semantics></math>0.90</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T10.5.5.5.2">16.40<math alttext="\pm" class="ltx_Math" display="inline" id="S4.T10.5.5.5.2.m1.1"><semantics id="S4.T10.5.5.5.2.m1.1a"><mo id="S4.T10.5.5.5.2.m1.1.1" xref="S4.T10.5.5.5.2.m1.1.1.cmml">±</mo><annotation-xml encoding="MathML-Content" id="S4.T10.5.5.5.2.m1.1b"><csymbol cd="latexml" id="S4.T10.5.5.5.2.m1.1.1.cmml" xref="S4.T10.5.5.5.2.m1.1.1">plus-or-minus</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.T10.5.5.5.2.m1.1c">\pm</annotation><annotation encoding="application/x-llamapun" id="S4.T10.5.5.5.2.m1.1d">±</annotation></semantics></math>0.94</td> </tr> <tr class="ltx_tr" id="S4.T10.6.6.6"> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T10.6.6.6.2">S+C</td> <td class="ltx_td ltx_align_center" id="S4.T10.6.6.6.3">96.9</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T10.6.6.6.4">-</td> <td class="ltx_td ltx_align_center" id="S4.T10.6.6.6.5">0.014</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T10.6.6.6.6">-</td> <td class="ltx_td ltx_align_center" id="S4.T10.6.6.6.1">16.52<math alttext="\pm" class="ltx_Math" display="inline" id="S4.T10.6.6.6.1.m1.1"><semantics id="S4.T10.6.6.6.1.m1.1a"><mo id="S4.T10.6.6.6.1.m1.1.1" xref="S4.T10.6.6.6.1.m1.1.1.cmml">±</mo><annotation-xml encoding="MathML-Content" id="S4.T10.6.6.6.1.m1.1b"><csymbol cd="latexml" id="S4.T10.6.6.6.1.m1.1.1.cmml" xref="S4.T10.6.6.6.1.m1.1.1">plus-or-minus</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.T10.6.6.6.1.m1.1c">\pm</annotation><annotation encoding="application/x-llamapun" id="S4.T10.6.6.6.1.m1.1d">±</annotation></semantics></math>0.47</td> <td class="ltx_td ltx_align_center" 
id="S4.T10.6.6.6.7">-</td> </tr> <tr class="ltx_tr" id="S4.T10.7.7.7"> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T10.7.7.7.2">S+C+VS</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T10.7.7.7.3">-</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T10.7.7.7.4">89.7</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T10.7.7.7.5">-</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T10.7.7.7.6">0.030</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T10.7.7.7.7">-</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T10.7.7.7.1">16.84<math alttext="\pm" class="ltx_Math" display="inline" id="S4.T10.7.7.7.1.m1.1"><semantics id="S4.T10.7.7.7.1.m1.1a"><mo id="S4.T10.7.7.7.1.m1.1.1" xref="S4.T10.7.7.7.1.m1.1.1.cmml">±</mo><annotation-xml encoding="MathML-Content" id="S4.T10.7.7.7.1.m1.1b"><csymbol cd="latexml" id="S4.T10.7.7.7.1.m1.1.1.cmml" xref="S4.T10.7.7.7.1.m1.1.1">plus-or-minus</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.T10.7.7.7.1.m1.1c">\pm</annotation><annotation encoding="application/x-llamapun" id="S4.T10.7.7.7.1.m1.1d">±</annotation></semantics></math>1.28</td> </tr> </table> </span></div> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table"><span class="ltx_text" id="S4.T10.9.1.1" style="font-size:90%;">Table 10</span>: </span><span class="ltx_text" id="S4.T10.10.2" style="font-size:90%;">Dataset ablation on HSI Skills. 
S: SAMP <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib12" title=""><span class="ltx_text" style="font-size:90%;">12</span></a>]</cite>, C: COUCH <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib54" title=""><span class="ltx_text" style="font-size:90%;">54</span></a>]</cite>, VS: ViconStyle</span></figcaption> </figure> </section> <section class="ltx_subsubsection" id="S4.SS4.SSS4"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection">4.4.4 </span>Ablation of Policy Settings</h4> <div class="ltx_para" id="S4.SS4.SSS4.p1"> <p class="ltx_p" id="S4.SS4.SSS4.p1.1">We conduct an ablation study on different settings of our control policy, comparing the <em class="ltx_emph ltx_font_italic" id="S4.SS4.SSS4.p1.1.1">Success Rate</em> and <em class="ltx_emph ltx_font_italic" id="S4.SS4.SSS4.p1.1.2">Contact Error</em> for variants without the heightmap and without the text embedding. Both variants show degraded performance. The heightmap provides essential information about the surrounding environment, so removing it degrades performance when interacting with objects.
When trained without text embedding, the APD metric shows an obvious degradation.</p> </div> <figure class="ltx_table" id="S4.T11"> <div class="ltx_inline-block ltx_align_center ltx_transformed_outer" id="S4.T11.10" style="width:390.3pt;height:103.3pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(25.1pt,-6.7pt) scale(1.14790116383108,1.14790116383108) ;"> <table class="ltx_tabular ltx_align_middle" id="S4.T11.10.10"> <tr class="ltx_tr" id="S4.T11.2.2.2"> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" id="S4.T11.2.2.2.3" rowspan="2"><span class="ltx_text" id="S4.T11.2.2.2.3.1">Setting</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" colspan="3" id="S4.T11.1.1.1.1">Success Rate(%)<math alttext="\uparrow" class="ltx_Math" display="inline" id="S4.T11.1.1.1.1.m1.1"><semantics id="S4.T11.1.1.1.1.m1.1a"><mo id="S4.T11.1.1.1.1.m1.1.1" stretchy="false" xref="S4.T11.1.1.1.1.m1.1.1.cmml">↑</mo><annotation-xml encoding="MathML-Content" id="S4.T11.1.1.1.1.m1.1b"><ci id="S4.T11.1.1.1.1.m1.1.1.cmml" xref="S4.T11.1.1.1.1.m1.1.1">↑</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.T11.1.1.1.1.m1.1c">\uparrow</annotation><annotation encoding="application/x-llamapun" id="S4.T11.1.1.1.1.m1.1d">↑</annotation></semantics></math> </td> <td class="ltx_td ltx_align_center ltx_border_tt" colspan="3" id="S4.T11.2.2.2.2">APD<math alttext="\uparrow" class="ltx_Math" display="inline" id="S4.T11.2.2.2.2.m1.1"><semantics id="S4.T11.2.2.2.2.m1.1a"><mo id="S4.T11.2.2.2.2.m1.1.1" stretchy="false" xref="S4.T11.2.2.2.2.m1.1.1.cmml">↑</mo><annotation-xml encoding="MathML-Content" id="S4.T11.2.2.2.2.m1.1b"><ci id="S4.T11.2.2.2.2.m1.1.1.cmml" xref="S4.T11.2.2.2.2.m1.1.1">↑</ci></annotation-xml><annotation encoding="application/x-tex" id="S4.T11.2.2.2.2.m1.1c">\uparrow</annotation><annotation encoding="application/x-llamapun" id="S4.T11.2.2.2.2.m1.1d">↑</annotation></semantics></math> </td> </tr> <tr 
class="ltx_tr" id="S4.T11.10.10.11"> <td class="ltx_td ltx_align_center" id="S4.T11.10.10.11.1">Sit</td> <td class="ltx_td ltx_align_center" id="S4.T11.10.10.11.2">Lie</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T11.10.10.11.3">Carry</td> <td class="ltx_td ltx_align_center" id="S4.T11.10.10.11.4">Sit</td> <td class="ltx_td ltx_align_center" id="S4.T11.10.10.11.5">Lie</td> <td class="ltx_td ltx_align_center" id="S4.T11.10.10.11.6">Carry</td> </tr> <tr class="ltx_tr" id="S4.T11.5.5.5"> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T11.5.5.5.4">w/o text</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T11.5.5.5.5">89.7</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T11.5.5.5.6">89.6</td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S4.T11.5.5.5.7">92.4</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T11.3.3.3.1">16.29<math alttext="\pm" class="ltx_Math" display="inline" id="S4.T11.3.3.3.1.m1.1"><semantics id="S4.T11.3.3.3.1.m1.1a"><mo id="S4.T11.3.3.3.1.m1.1.1" xref="S4.T11.3.3.3.1.m1.1.1.cmml">±</mo><annotation-xml encoding="MathML-Content" id="S4.T11.3.3.3.1.m1.1b"><csymbol cd="latexml" id="S4.T11.3.3.3.1.m1.1.1.cmml" xref="S4.T11.3.3.3.1.m1.1.1">plus-or-minus</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.T11.3.3.3.1.m1.1c">\pm</annotation><annotation encoding="application/x-llamapun" id="S4.T11.3.3.3.1.m1.1d">±</annotation></semantics></math>0.22</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T11.4.4.4.2">16.59<math alttext="\pm" class="ltx_Math" display="inline" id="S4.T11.4.4.4.2.m1.1"><semantics id="S4.T11.4.4.4.2.m1.1a"><mo id="S4.T11.4.4.4.2.m1.1.1" xref="S4.T11.4.4.4.2.m1.1.1.cmml">±</mo><annotation-xml encoding="MathML-Content" id="S4.T11.4.4.4.2.m1.1b"><csymbol cd="latexml" id="S4.T11.4.4.4.2.m1.1.1.cmml" xref="S4.T11.4.4.4.2.m1.1.1">plus-or-minus</csymbol></annotation-xml><annotation encoding="application/x-tex" 
id="S4.T11.4.4.4.2.m1.1c">\pm</annotation><annotation encoding="application/x-llamapun" id="S4.T11.4.4.4.2.m1.1d">±</annotation></semantics></math>0.28</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T11.5.5.5.3">12.41<math alttext="\pm" class="ltx_Math" display="inline" id="S4.T11.5.5.5.3.m1.1"><semantics id="S4.T11.5.5.5.3.m1.1a"><mo id="S4.T11.5.5.5.3.m1.1.1" xref="S4.T11.5.5.5.3.m1.1.1.cmml">±</mo><annotation-xml encoding="MathML-Content" id="S4.T11.5.5.5.3.m1.1b"><csymbol cd="latexml" id="S4.T11.5.5.5.3.m1.1.1.cmml" xref="S4.T11.5.5.5.3.m1.1.1">plus-or-minus</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.T11.5.5.5.3.m1.1c">\pm</annotation><annotation encoding="application/x-llamapun" id="S4.T11.5.5.5.3.m1.1d">±</annotation></semantics></math>0.19</td> </tr> <tr class="ltx_tr" id="S4.T11.7.7.7"> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T11.7.7.7.3">w/o htmp</td> <td class="ltx_td ltx_align_center" id="S4.T11.7.7.7.4">88.7</td> <td class="ltx_td ltx_align_center" id="S4.T11.7.7.7.5">79.8</td> <td class="ltx_td ltx_align_center ltx_border_r" id="S4.T11.7.7.7.6">-</td> <td class="ltx_td ltx_align_center" id="S4.T11.6.6.6.1">16.18<math alttext="\pm" class="ltx_Math" display="inline" id="S4.T11.6.6.6.1.m1.1"><semantics id="S4.T11.6.6.6.1.m1.1a"><mo id="S4.T11.6.6.6.1.m1.1.1" xref="S4.T11.6.6.6.1.m1.1.1.cmml">±</mo><annotation-xml encoding="MathML-Content" id="S4.T11.6.6.6.1.m1.1b"><csymbol cd="latexml" id="S4.T11.6.6.6.1.m1.1.1.cmml" xref="S4.T11.6.6.6.1.m1.1.1">plus-or-minus</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.T11.6.6.6.1.m1.1c">\pm</annotation><annotation encoding="application/x-llamapun" id="S4.T11.6.6.6.1.m1.1d">±</annotation></semantics></math>0.19</td> <td class="ltx_td ltx_align_center" id="S4.T11.7.7.7.2">16.94<math alttext="\pm" class="ltx_Math" display="inline" id="S4.T11.7.7.7.2.m1.1"><semantics id="S4.T11.7.7.7.2.m1.1a"><mo id="S4.T11.7.7.7.2.m1.1.1" 
xref="S4.T11.7.7.7.2.m1.1.1.cmml">±</mo><annotation-xml encoding="MathML-Content" id="S4.T11.7.7.7.2.m1.1b"><csymbol cd="latexml" id="S4.T11.7.7.7.2.m1.1.1.cmml" xref="S4.T11.7.7.7.2.m1.1.1">plus-or-minus</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.T11.7.7.7.2.m1.1c">\pm</annotation><annotation encoding="application/x-llamapun" id="S4.T11.7.7.7.2.m1.1d">±</annotation></semantics></math>0.29</td> <td class="ltx_td ltx_align_center" id="S4.T11.7.7.7.7">-</td> </tr> <tr class="ltx_tr" id="S4.T11.10.10.10"> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T11.10.10.10.4">SIMS(ours)</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T11.10.10.10.5">96.9</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T11.10.10.10.6">89.7</td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S4.T11.10.10.10.7">96.4</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T11.8.8.8.1">16.52<math alttext="\pm" class="ltx_Math" display="inline" id="S4.T11.8.8.8.1.m1.1"><semantics id="S4.T11.8.8.8.1.m1.1a"><mo id="S4.T11.8.8.8.1.m1.1.1" xref="S4.T11.8.8.8.1.m1.1.1.cmml">±</mo><annotation-xml encoding="MathML-Content" id="S4.T11.8.8.8.1.m1.1b"><csymbol cd="latexml" id="S4.T11.8.8.8.1.m1.1.1.cmml" xref="S4.T11.8.8.8.1.m1.1.1">plus-or-minus</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.T11.8.8.8.1.m1.1c">\pm</annotation><annotation encoding="application/x-llamapun" id="S4.T11.8.8.8.1.m1.1d">±</annotation></semantics></math>0.47</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T11.9.9.9.2">16.99<math alttext="\pm" class="ltx_Math" display="inline" id="S4.T11.9.9.9.2.m1.1"><semantics id="S4.T11.9.9.9.2.m1.1a"><mo id="S4.T11.9.9.9.2.m1.1.1" xref="S4.T11.9.9.9.2.m1.1.1.cmml">±</mo><annotation-xml encoding="MathML-Content" id="S4.T11.9.9.9.2.m1.1b"><csymbol cd="latexml" id="S4.T11.9.9.9.2.m1.1.1.cmml" 
xref="S4.T11.9.9.9.2.m1.1.1">plus-or-minus</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.T11.9.9.9.2.m1.1c">\pm</annotation><annotation encoding="application/x-llamapun" id="S4.T11.9.9.9.2.m1.1d">±</annotation></semantics></math>1.28</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T11.10.10.10.3">14.92<math alttext="\pm" class="ltx_Math" display="inline" id="S4.T11.10.10.10.3.m1.1"><semantics id="S4.T11.10.10.10.3.m1.1a"><mo id="S4.T11.10.10.10.3.m1.1.1" xref="S4.T11.10.10.10.3.m1.1.1.cmml">±</mo><annotation-xml encoding="MathML-Content" id="S4.T11.10.10.10.3.m1.1b"><csymbol cd="latexml" id="S4.T11.10.10.10.3.m1.1.1.cmml" xref="S4.T11.10.10.10.3.m1.1.1">plus-or-minus</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.T11.10.10.10.3.m1.1c">\pm</annotation><annotation encoding="application/x-llamapun" id="S4.T11.10.10.10.3.m1.1d">±</annotation></semantics></math>0.23</td> </tr> </table> </span></div> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_table"><span class="ltx_text" id="S4.T11.12.1.1" style="font-size:90%;">Table 11</span>: </span><span class="ltx_text" id="S4.T11.13.2" style="font-size:90%;">Ablation on different policy settings.</span></figcaption> </figure> </section> </section> <section class="ltx_subsection" id="S4.SS5"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">4.5 </span>Qualitative Results</h3> <div class="ltx_para" id="S4.SS5.p1"> <p class="ltx_p" id="S4.SS5.p1.1">We show 4 generated long narratives executed by our policies in two large indoor scenes. 
The details are shown in <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.F3" title="In 4.3.1 Physical Performance for Different Skills ‣ 4.3 Comparison with SOTA methods ‣ 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Fig.</span> <span class="ltx_text ltx_ref_tag">3</span></a>. In <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.F4" title="In 4.3.3 User Study on SOTA Long-Term HSI Methods ‣ 4.3 Comparison with SOTA methods ‣ 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Fig.</span> <span class="ltx_text ltx_ref_tag">4</span></a>, we also show qualitative samples for five skills: Carry, Idle, Walk, Sit, and Lie. We encourage readers to watch the demonstration videos for a better sense of our method's ability to generate long-term stylized motions.</p> </div> </section> </section> <section class="ltx_section" id="S5"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">5 </span>Conclusion</h2> <div class="ltx_para" id="S5.p1"> <p class="ltx_p" id="S5.p1.1">In this paper, we analyze and compare current advances in long-term human-scene interaction tasks, highlighting the lack of methods that generate animations that are both physically plausible and stylistically expressive. To address this, we propose a novel framework for synthesizing long-term human-scene interactions that uses Retrieval-Augmented Generation as the high-level planner and a multi-condition control policy as the low-level controller. By incorporating both stylized script generation and a stylized control policy, our approach facilitates the creation of diverse, expressive, and physically coherent long-term animations.
Furthermore, the processed datasets open up new possibilities and directions for future research in this field.</p> </div> </section> <section class="ltx_section" id="S6"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">6 </span>Future Work</h2> <div class="ltx_para" id="S6.p1"> <p class="ltx_p" id="S6.p1.1">In the future, it will be essential to collect more human motion data that captures realistic emotions and diverse styles. Additionally, exploring humanoid models with articulated fingers presents a promising avenue for research. Introducing multi-agent settings into HSI could also broaden the possibilities for physical animations.</p> <div class="ltx_pagination ltx_role_newpage"></div> </div> </section> <section class="ltx_bibliography" id="bib"> <h2 class="ltx_title ltx_title_bibliography" style="font-size:90%;">References</h2> <ul class="ltx_biblist"> <li class="ltx_bibitem" id="bib.bib1"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib1.5.5.1" style="font-size:90%;">Achiam et al. [2023]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib1.7.1" style="font-size:90%;"> Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, et al. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib1.8.1" style="font-size:90%;">Gpt-4 technical report. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib1.9.1" style="font-size:90%;">arXiv preprint arXiv:2303.08774</em><span class="ltx_text" id="bib.bib1.10.2" style="font-size:90%;">, 2023. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib2"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib2.5.5.1" style="font-size:90%;">Araújo et al.
[2023]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib2.7.1" style="font-size:90%;"> Joao Pedro Araújo, Jiaman Li, Karthik Vetrivel, Rishi Agarwal, Jiajun Wu, Deepak Gopinath, Alexander William Clegg, and Karen Liu. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib2.8.1" style="font-size:90%;">Circle: Capture in rich contextual environments. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib2.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib2.10.2" style="font-size:90%;">CVPR</em><span class="ltx_text" id="bib.bib2.11.3" style="font-size:90%;">, 2023. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib3"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib3.5.5.1" style="font-size:90%;">Cong et al. [2024]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib3.7.1" style="font-size:90%;"> Peishan Cong, Ziyi Wang, Zhiyang Dou, Yiming Ren, Wei Yin, Kai Cheng, Yujing Sun, Xiaoxiao Long, Xinge Zhu, and Yuexin Ma. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib3.8.1" style="font-size:90%;">Laserhuman: language-guided scene-aware human motion generation in free environment. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib3.9.1" style="font-size:90%;">arXiv preprint arXiv:2403.13307</em><span class="ltx_text" id="bib.bib3.10.2" style="font-size:90%;">, 2024. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib4"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib4.4.4.1" style="font-size:90%;">Devlin [2018]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib4.6.1" style="font-size:90%;"> Jacob Devlin. 
</span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib4.7.1" style="font-size:90%;">Bert: Pre-training of deep bidirectional transformers for language understanding. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib4.8.1" style="font-size:90%;">arXiv preprint arXiv:1810.04805</em><span class="ltx_text" id="bib.bib4.9.2" style="font-size:90%;">, 2018. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib5"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib5.5.5.1" style="font-size:90%;">Dou et al. [2023a]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib5.7.1" style="font-size:90%;"> Zhiyang Dou, Xuelin Chen, Qingnan Fan, Taku Komura, and Wenping Wang. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib5.8.1" style="font-size:90%;">C·ase: Learning conditional adversarial skill embeddings for physics-based characters. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib5.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib5.10.2" style="font-size:90%;">SIGGRAPH 2023</em><span class="ltx_text" id="bib.bib5.11.3" style="font-size:90%;">, 2023a. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib6"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib6.5.5.1" style="font-size:90%;">Dou et al. [2023b]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib6.7.1" style="font-size:90%;"> Zhiyang Dou, Qingxuan Wu, Cheng Lin, Zeyu Cao, Qiangqiang Wu, Weilin Wan, Taku Komura, and Wenping Wang. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib6.8.1" style="font-size:90%;">Tore: Token reduction for efficient human mesh recovery with transformer. 
</span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib6.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib6.10.2" style="font-size:90%;">ICCV</em><span class="ltx_text" id="bib.bib6.11.3" style="font-size:90%;">, 2023b. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib7"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib7.5.5.1" style="font-size:90%;">Fu et al. [2021]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib7.7.1" style="font-size:90%;"> Huan Fu, Bowen Cai, Lin Gao, Ling-Xiao Zhang, Jiaming Wang, Cao Li, Qixun Zeng, Chengyue Sun, Rongfei Jia, Binqiang Zhao, et al. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib7.8.1" style="font-size:90%;">3d-front: 3d furnished rooms with layouts and semantics. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib7.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib7.10.2" style="font-size:90%;">ICCV</em><span class="ltx_text" id="bib.bib7.11.3" style="font-size:90%;">, 2021. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib8"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib8.5.5.1" style="font-size:90%;">Gao et al. [2024]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib8.7.1" style="font-size:90%;"> Jiawei Gao, Ziqin Wang, Zeqi Xiao, Jingbo Wang, Tai Wang, Jinkun Cao, Xiaolin Hu, Si Liu, Jifeng Dai, and Jiangmiao Pang. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib8.8.1" style="font-size:90%;">Coohoi: Learning cooperative human-object interaction with manipulated object dynamics. 
</span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib8.9.1" style="font-size:90%;">Advances in Neural Information Processing Systems</em><span class="ltx_text" id="bib.bib8.10.2" style="font-size:90%;">, 37, 2024. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib9"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib9.5.5.1" style="font-size:90%;">Ge et al. [2024]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib9.7.1" style="font-size:90%;"> Yongtao Ge, Wenjia Wang, Yongfan Chen, Hao Chen, and Chunhua Shen. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib9.8.1" style="font-size:90%;">3d human reconstruction in the wild with synthetic data using generative models. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib9.9.1" style="font-size:90%;">arXiv preprint arXiv:2403.11111</em><span class="ltx_text" id="bib.bib9.10.2" style="font-size:90%;">, 2024. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib10"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib10.4.4.1" style="font-size:90%;">Ghorbani and Black [2021]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib10.6.1" style="font-size:90%;"> Nima Ghorbani and Michael J. Black. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib10.7.1" style="font-size:90%;">Soma: Solving optical marker-based mocap automatically. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib10.8.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib10.9.2" style="font-size:90%;">ICCV</em><span class="ltx_text" id="bib.bib10.10.3" style="font-size:90%;">, 2021. 
</span> </span> </li> <li class="ltx_bibitem" id="bib.bib11"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib11.5.5.1" style="font-size:90%;">Guo et al. [2022]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib11.7.1" style="font-size:90%;"> Chuan Guo, Shihao Zou, Xinxin Zuo, Sen Wang, Wei Ji, Xingyu Li, and Li Cheng. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib11.8.1" style="font-size:90%;">Generating diverse and natural 3d human motions from text. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib11.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib11.10.2" style="font-size:90%;">CVPR</em><span class="ltx_text" id="bib.bib11.11.3" style="font-size:90%;">, 2022. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib12"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib12.5.5.1" style="font-size:90%;">Hassan et al. [2021]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib12.7.1" style="font-size:90%;"> Mohamed Hassan, Duygu Ceylan, Ruben Villegas, Jun Saito, Jimei Yang, Yi Zhou, and Michael J Black. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib12.8.1" style="font-size:90%;">Stochastic scene-aware motion prediction. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib12.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib12.10.2" style="font-size:90%;">ICCV</em><span class="ltx_text" id="bib.bib12.11.3" style="font-size:90%;">, 2021. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib13"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib13.5.5.1" style="font-size:90%;">Hassan et al. 
[2023]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib13.7.1" style="font-size:90%;"> Mohamed Hassan, Yunrong Guo, Tingwu Wang, Michael Black, Sanja Fidler, and Xue Bin Peng. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib13.8.1" style="font-size:90%;">Synthesizing physical character-scene interactions. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib13.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib13.10.2" style="font-size:90%;">SIGGRAPH 2023</em><span class="ltx_text" id="bib.bib13.11.3" style="font-size:90%;">, 2023. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib14"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib14.5.5.1" style="font-size:90%;">Huang et al. [2025]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib14.7.1" style="font-size:90%;"> Yiming Huang, Zhiyang Dou, and Lingjie Liu. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib14.8.1" style="font-size:90%;">Modskill: Physical character skill modularization. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib14.9.1" style="font-size:90%;">arXiv preprint arXiv:2502.14140</em><span class="ltx_text" id="bib.bib14.10.2" style="font-size:90%;">, 2025. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib15"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib15.5.5.1" style="font-size:90%;">Jiang et al. [2023]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib15.7.1" style="font-size:90%;"> Biao Jiang, Xin Chen, Wen Liu, Jingyi Yu, Gang Yu, and Tao Chen. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib15.8.1" style="font-size:90%;">Motiongpt: Human motion as a foreign language. 
</span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib15.9.1" style="font-size:90%;">NeurIPS</em><span class="ltx_text" id="bib.bib15.10.2" style="font-size:90%;">, 36, 2023. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib16"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib16.5.5.1" style="font-size:90%;">Jiang et al. [2024]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib16.7.1" style="font-size:90%;"> Nan Jiang, Zhiyuan Zhang, Hongjie Li, Xiaoxuan Ma, Zan Wang, Yixin Chen, Tengyu Liu, Yixin Zhu, and Siyuan Huang. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib16.8.1" style="font-size:90%;">Scaling up dynamic human-scene interaction modeling. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib16.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib16.10.2" style="font-size:90%;">CVPR</em><span class="ltx_text" id="bib.bib16.11.3" style="font-size:90%;">, 2024. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib17"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib17.5.5.1" style="font-size:90%;">Juravsky et al. [2022]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib17.7.1" style="font-size:90%;"> Jordan Juravsky, Yunrong Guo, Sanja Fidler, and Xue Bin Peng. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib17.8.1" style="font-size:90%;">Padl: Language-directed physics-based character control. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib17.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib17.10.2" style="font-size:90%;">SIGGRAPH Asia 2022 Conference Papers</em><span class="ltx_text" id="bib.bib17.11.3" style="font-size:90%;">, 2022. 
</span> </span> </li> <li class="ltx_bibitem" id="bib.bib18"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib18.5.5.1" style="font-size:90%;">Lewis et al. [2020]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib18.7.1" style="font-size:90%;"> Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, et al. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib18.8.1" style="font-size:90%;">Retrieval-augmented generation for knowledge-intensive nlp tasks. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib18.9.1" style="font-size:90%;">NeurIPS</em><span class="ltx_text" id="bib.bib18.10.2" style="font-size:90%;">, 33, 2020. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib19"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib19.5.5.1" style="font-size:90%;">Loper et al. [2015]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib19.7.1" style="font-size:90%;"> Matthew Loper, Naureen Mahmood, Javier Romero, Gerard Pons-Moll, and Michael J. Black. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib19.8.1" style="font-size:90%;">Smpl: a skinned multi-person linear model. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib19.9.1" style="font-size:90%;">TOG</em><span class="ltx_text" id="bib.bib19.10.2" style="font-size:90%;">, 34(6), 2015. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib20"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib20.5.5.1" style="font-size:90%;">Lu et al. 
[2024]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib20.7.1" style="font-size:90%;"> Shunlin Lu, Jingbo Wang, Zeyu Lu, Ling-Hao Chen, Wenxun Dai, Junting Dong, Zhiyang Dou, Bo Dai, and Ruimao Zhang. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib20.8.1" style="font-size:90%;">Scamo: Exploring the scaling law in autoregressive motion generation model. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib20.9.1" style="font-size:90%;">arXiv preprint arXiv:2412.14559</em><span class="ltx_text" id="bib.bib20.10.2" style="font-size:90%;">, 2024. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib21"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib21.5.5.1" style="font-size:90%;">Luo et al. [2023]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib21.7.1" style="font-size:90%;"> Zhengyi Luo, Jinkun Cao, Kris Kitani, Weipeng Xu, et al. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib21.8.1" style="font-size:90%;">Perpetual humanoid control for real-time simulated avatars. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib21.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib21.10.2" style="font-size:90%;">ICCV</em><span class="ltx_text" id="bib.bib21.11.3" style="font-size:90%;">, 2023. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib22"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib22.5.5.1" style="font-size:90%;">Mahmood et al. [2019]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib22.7.1" style="font-size:90%;"> Naureen Mahmood, Nima Ghorbani, Nikolaus F Troje, Gerard Pons-Moll, and Michael J Black. 
</span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib22.8.1" style="font-size:90%;">Amass: Archive of motion capture as surface shapes. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib22.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib22.10.2" style="font-size:90%;">ICCV</em><span class="ltx_text" id="bib.bib22.11.3" style="font-size:90%;">, 2019. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib23"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib23.5.5.1" style="font-size:90%;">Makoviychuk et al. [2021]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib23.7.1" style="font-size:90%;"> Viktor Makoviychuk, Lukasz Wawrzyniak, Yunrong Guo, Michelle Lu, Kier Storey, Miles Macklin, David Hoeller, Nikita Rudin, Arthur Allshire, Ankur Handa, et al. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib23.8.1" style="font-size:90%;">Isaac gym: High performance gpu-based physics simulation for robot learning. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib23.9.1" style="font-size:90%;">arXiv preprint arXiv:2108.10470</em><span class="ltx_text" id="bib.bib23.10.2" style="font-size:90%;">, 2021. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib24"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib24.5.5.1" style="font-size:90%;">Mason et al. [2022]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib24.7.1" style="font-size:90%;"> Ian Mason, Sebastian Starke, and Taku Komura. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib24.8.1" style="font-size:90%;">Real-time style modelling of human locomotion via feature-wise transformations and local motion phases. 
</span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib24.9.1" style="font-size:90%;">Proceedings of the ACM on Computer Graphics and Interactive Techniques</em><span class="ltx_text" id="bib.bib24.10.2" style="font-size:90%;">, 5(1), 2022. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib25"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib25.5.5.1" style="font-size:90%;">Mo et al. [2019]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib25.7.1" style="font-size:90%;"> Kaichun Mo, Shilin Zhu, Angel X Chang, Li Yi, Subarna Tripathi, Leonidas J Guibas, and Hao Su. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib25.8.1" style="font-size:90%;">Partnet: A large-scale benchmark for fine-grained and hierarchical part-level 3d object understanding. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib25.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib25.10.2" style="font-size:90%;">CVPR</em><span class="ltx_text" id="bib.bib25.11.3" style="font-size:90%;">, 2019. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib26"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib26.5.5.1" style="font-size:90%;">Pan et al. [2024]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib26.7.1" style="font-size:90%;"> Liang Pan, Jingbo Wang, Buzhen Huang, Junyu Zhang, Haofan Wang, Xu Tang, and Yangang Wang. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib26.8.1" style="font-size:90%;">Synthesizing physically plausible human motions in 3d scenes. 
</span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib26.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib26.10.2" style="font-size:90%;">3DV</em><span class="ltx_text" id="bib.bib26.11.3" style="font-size:90%;">, 2024. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib27"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib27.5.5.1" style="font-size:90%;">Peng et al. [2018a]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib27.7.1" style="font-size:90%;"> Xue Bin Peng, Pieter Abbeel, Sergey Levine, and Michiel Van de Panne. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib27.8.1" style="font-size:90%;">Deepmimic: Example-guided deep reinforcement learning of physics-based character skills. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib27.9.1" style="font-size:90%;">TOG</em><span class="ltx_text" id="bib.bib27.10.2" style="font-size:90%;">, 37(4), 2018a. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib28"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib28.5.5.1" style="font-size:90%;">Peng et al. [2018b]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib28.7.1" style="font-size:90%;"> Xue Bin Peng, Pieter Abbeel, Sergey Levine, and Michiel Van de Panne. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib28.8.1" style="font-size:90%;">Deepmimic: Example-guided deep reinforcement learning of physics-based character skills. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib28.9.1" style="font-size:90%;">TOG</em><span class="ltx_text" id="bib.bib28.10.2" style="font-size:90%;">, 37(4), 2018b. 
</span> </span> </li> <li class="ltx_bibitem" id="bib.bib29"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib29.5.5.1" style="font-size:90%;">Peng et al. [2021]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib29.7.1" style="font-size:90%;"> Xue Bin Peng, Ze Ma, Pieter Abbeel, Sergey Levine, and Angjoo Kanazawa. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib29.8.1" style="font-size:90%;">Amp: Adversarial motion priors for stylized physics-based character control. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib29.9.1" style="font-size:90%;">TOG</em><span class="ltx_text" id="bib.bib29.10.2" style="font-size:90%;">, 2021. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib30"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib30.5.5.1" style="font-size:90%;">Peng et al. [2022]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib30.7.1" style="font-size:90%;"> Xue Bin Peng, Yunrong Guo, Lina Halper, Sergey Levine, and Sanja Fidler. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib30.8.1" style="font-size:90%;">Ase: Large-scale reusable adversarial skill embeddings for physically simulated characters. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib30.9.1" style="font-size:90%;">TOG</em><span class="ltx_text" id="bib.bib30.10.2" style="font-size:90%;">, 41(4), 2022. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib31"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib31.5.5.1" style="font-size:90%;">Radford et al. 
[2021]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib31.7.1" style="font-size:90%;"> Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib31.8.1" style="font-size:90%;">Learning transferable visual models from natural language supervision. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib31.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib31.10.2" style="font-size:90%;">ICML</em><span class="ltx_text" id="bib.bib31.11.3" style="font-size:90%;">, 2021. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib32"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib32.4.4.1" style="font-size:90%;">Reimers and Gurevych [2019]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib32.6.1" style="font-size:90%;"> Nils Reimers and Iryna Gurevych. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib32.7.1" style="font-size:90%;">Sentence-bert: Sentence embeddings using siamese bert-networks. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib32.8.1" style="font-size:90%;">arXiv preprint arXiv:1908.10084</em><span class="ltx_text" id="bib.bib32.9.2" style="font-size:90%;">, 2019. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib33"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib33.5.5.1" style="font-size:90%;">Starke et al. [2019]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib33.7.1" style="font-size:90%;"> Sebastian Starke, He Zhang, Taku Komura, and Jun Saito. 
</span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib33.8.1" style="font-size:90%;">Neural state machine for character-scene interactions. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib33.9.1" style="font-size:90%;">TOG</em><span class="ltx_text" id="bib.bib33.10.2" style="font-size:90%;">, 38(6), 2019. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib34"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib34.5.5.1" style="font-size:90%;">Sun et al. [2024a]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib34.7.1" style="font-size:90%;"> Jingkai Sun, Qiang Zhang, Yiqun Duan, Xiaoyang Jiang, Chong Cheng, and Renjing Xu. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib34.8.1" style="font-size:90%;">Prompt, plan, perform: Llm-based humanoid control via quantized imitation learning. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib34.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib34.10.2" style="font-size:90%;">ICRA</em><span class="ltx_text" id="bib.bib34.11.3" style="font-size:90%;">, 2024a. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib35"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib35.5.5.1" style="font-size:90%;">Sun et al. [2024b]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib35.7.1" style="font-size:90%;"> Qingping Sun, Yanjun Wang, Ailing Zeng, Wanqi Yin, Chen Wei, Wenjia Wang, Haiyi Mei, Chi-Sing Leung, Ziwei Liu, Lei Yang, et al. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib35.8.1" style="font-size:90%;">Aios: All-in-one-stage expressive human pose and shape estimation. 
</span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib35.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib35.10.2" style="font-size:90%;">CVPR</em><span class="ltx_text" id="bib.bib35.11.3" style="font-size:90%;">, 2024b. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib36"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib36.5.5.1" style="font-size:90%;">Tessler et al. [2023]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib36.7.1" style="font-size:90%;"> Chen Tessler, Yoni Kasten, Yunrong Guo, Shie Mannor, Gal Chechik, and Xue Bin Peng. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib36.8.1" style="font-size:90%;">Calm: Conditional adversarial latent models for directable virtual characters. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib36.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib36.10.2" style="font-size:90%;">SIGGRAPH 2023</em><span class="ltx_text" id="bib.bib36.11.3" style="font-size:90%;">, 2023. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib37"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib37.5.5.1" style="font-size:90%;">Tessler et al. [2024]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib37.7.1" style="font-size:90%;"> Chen Tessler, Yunrong Guo, Ofir Nabati, Gal Chechik, and Xue Bin Peng. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib37.8.1" style="font-size:90%;">Maskedmimic: Unified physics-based character control through masked motion inpainting. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib37.9.1" style="font-size:90%;">TOG</em><span class="ltx_text" id="bib.bib37.10.2" style="font-size:90%;">, 43(6), 2024. 
</span> </span> </li> <li class="ltx_bibitem" id="bib.bib38"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib38.5.5.1" style="font-size:90%;">Tevet et al. [2022]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib38.7.1" style="font-size:90%;"> Guy Tevet, Brian Gordon, Amir Hertz, Amit H Bermano, and Daniel Cohen-Or. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib38.8.1" style="font-size:90%;">Motionclip: Exposing human motion generation to clip space. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib38.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib38.10.2" style="font-size:90%;">ECCV</em><span class="ltx_text" id="bib.bib38.11.3" style="font-size:90%;">, 2022. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib39"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib39.5.5.1" style="font-size:90%;">Tevet et al. [2023]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib39.7.1" style="font-size:90%;"> Guy Tevet, Sigal Raab, Brian Gordon, Yoni Shafir, Daniel Cohen-or, and Amit Haim Bermano. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib39.8.1" style="font-size:90%;">Human motion diffusion model. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib39.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib39.10.2" style="font-size:90%;">ICLR</em><span class="ltx_text" id="bib.bib39.11.3" style="font-size:90%;">, 2023. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib40"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib40.5.5.1" style="font-size:90%;">Wan et al. 
[2023]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib40.7.1" style="font-size:90%;"> Weilin Wan, Zhiyang Dou, Taku Komura, Wenping Wang, Dinesh Jayaraman, and Lingjie Liu. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib40.8.1" style="font-size:90%;">Tlcontrol: Trajectory and language control for human motion synthesis. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib40.9.1" style="font-size:90%;">arXiv preprint arXiv:2311.17135</em><span class="ltx_text" id="bib.bib40.10.2" style="font-size:90%;">, 2023. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib41"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib41.5.5.1" style="font-size:90%;">Wang et al. [2022a]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib41.7.1" style="font-size:90%;"> Jingbo Wang, Yu Rong, Jingyuan Liu, Sijie Yan, Dahua Lin, and Bo Dai. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib41.8.1" style="font-size:90%;">Towards diverse and natural scene-aware 3d human motion synthesis. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib41.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib41.10.2" style="font-size:90%;">CVPR</em><span class="ltx_text" id="bib.bib41.11.3" style="font-size:90%;">, 2022a. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib42"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib42.5.5.1" style="font-size:90%;">Wang et al. [2023a]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib42.7.1" style="font-size:90%;"> Jionghao Wang, Yuan Liu, Zhiyang Dou, Zhengming Yu, Yongqing Liang, Cheng Lin, Xin Li, Wenping Wang, Rong Xie, and Li Song. 
</span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib42.8.1" style="font-size:90%;">Disentangled clothed avatar generation from text descriptions. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib42.9.1" style="font-size:90%;">arXiv preprint arXiv:2312.05295</em><span class="ltx_text" id="bib.bib42.10.2" style="font-size:90%;">, 2023a. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib43"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib43.5.5.1" style="font-size:90%;">Wang et al. [2023b]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib43.7.1" style="font-size:90%;"> Wenjia Wang, Yongtao Ge, Haiyi Mei, Zhongang Cai, Qingping Sun, Yanjun Wang, Chunhua Shen, Lei Yang, and Taku Komura. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib43.8.1" style="font-size:90%;">Zolly: Zoom focal length correctly for perspective-distorted human mesh reconstruction. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib43.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib43.10.2" style="font-size:90%;">ICCV</em><span class="ltx_text" id="bib.bib43.11.3" style="font-size:90%;">, 2023b. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib44"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib44.5.5.1" style="font-size:90%;">Wang et al. [2022b]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib44.7.1" style="font-size:90%;"> Zan Wang, Yixin Chen, Tengyu Liu, Yixin Zhu, Wei Liang, and Siyuan Huang. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib44.8.1" style="font-size:90%;">Humanise: Language-conditioned human motion generation in 3d scenes. 
</span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib44.9.1" style="font-size:90%;">NeurIPS</em><span class="ltx_text" id="bib.bib44.10.2" style="font-size:90%;">, 35, 2022b. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib45"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib45.5.5.1" style="font-size:90%;">Wang et al. [2024]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib45.7.1" style="font-size:90%;"> Zan Wang, Yixin Chen, Baoxiong Jia, Puhao Li, Jinlu Zhang, Jingze Zhang, Tengyu Liu, Yixin Zhu, Wei Liang, and Siyuan Huang. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib45.8.1" style="font-size:90%;">Move as you say, interact as you can: Language-guided human motion generation with scene affordance. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib45.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib45.10.2" style="font-size:90%;">CVPR</em><span class="ltx_text" id="bib.bib45.11.3" style="font-size:90%;">, 2024. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib46"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib46.5.5.1" style="font-size:90%;">Won et al. [2022]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib46.7.1" style="font-size:90%;"> Jungdam Won, Deepak Gopinath, and Jessica Hodgins. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib46.8.1" style="font-size:90%;">Physics-based character controllers using conditional vaes. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib46.9.1" style="font-size:90%;">TOG</em><span class="ltx_text" id="bib.bib46.10.2" style="font-size:90%;">, 41(4), 2022. 
</span> </span> </li> <li class="ltx_bibitem" id="bib.bib47"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib47.5.5.1" style="font-size:90%;">Xiao et al. [2024]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib47.7.1" style="font-size:90%;"> Zeqi Xiao, Tai Wang, Jingbo Wang, Jinkun Cao, Wenwei Zhang, Bo Dai, Dahua Lin, and Jiangmiao Pang. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib47.8.1" style="font-size:90%;">Unified human-scene interaction via prompted chain-of-contacts. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib47.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib47.10.2" style="font-size:90%;">ICLR</em><span class="ltx_text" id="bib.bib47.11.3" style="font-size:90%;">, 2024. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib48"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib48.5.5.1" style="font-size:90%;">Xie et al. [2023]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib48.7.1" style="font-size:90%;"> Zhaoming Xie, Jonathan Tseng, Sebastian Starke, Michiel van de Panne, and C Karen Liu. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib48.8.1" style="font-size:90%;">Hierarchical planning and control for box loco-manipulation. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib48.9.1" style="font-size:90%;">Proceedings of the ACM on Computer Graphics and Interactive Techniques</em><span class="ltx_text" id="bib.bib48.10.2" style="font-size:90%;">, 6(3), 2023. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib49"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib49.5.5.1" style="font-size:90%;">Xu et al. 
[2025]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib49.7.1" style="font-size:90%;"> Sirui Xu, Hung Yu Ling, Yu-Xiong Wang, and Liang-Yan Gui. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib49.8.1" style="font-size:90%;">Intermimic: Towards universal whole-body control for physics-based human-object interactions. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib49.9.1" style="font-size:90%;">arXiv preprint arXiv:2502.20390</em><span class="ltx_text" id="bib.bib49.10.2" style="font-size:90%;">, 2025. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib50"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib50.5.5.1" style="font-size:90%;">Yi et al. [2024]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib50.7.1" style="font-size:90%;"> Hongwei Yi, Justus Thies, Michael J. Black, Xue Bin Peng, and Davis Rempe. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib50.8.1" style="font-size:90%;">Generating human interaction motions in scenes with text control. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib50.9.1" style="font-size:90%;">ECCV</em><span class="ltx_text" id="bib.bib50.10.2" style="font-size:90%;">, 2024. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib51"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib51.5.5.1" style="font-size:90%;">Yuan et al. [2023]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib51.7.1" style="font-size:90%;"> Ye Yuan, Jiaming Song, Umar Iqbal, Arash Vahdat, and Jan Kautz. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib51.8.1" style="font-size:90%;">Physdiff: Physics-guided human motion diffusion model. 
</span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib51.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib51.10.2" style="font-size:90%;">ICCV</em><span class="ltx_text" id="bib.bib51.11.3" style="font-size:90%;">, 2023. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib52"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib52.5.5.1" style="font-size:90%;">Zhang et al. [2022a]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib52.7.1" style="font-size:90%;"> Mingyuan Zhang, Zhongang Cai, Liang Pan, Fangzhou Hong, Xinying Guo, Lei Yang, and Ziwei Liu. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib52.8.1" style="font-size:90%;">Motiondiffuse: Text-driven human motion generation with diffusion model. </span> </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib52.9.1" style="font-size:90%;">arXiv preprint arXiv:2208.15001</em><span class="ltx_text" id="bib.bib52.10.2" style="font-size:90%;">, 2022a. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib53"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib53.5.5.1" style="font-size:90%;">Zhang et al. [2024]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib53.7.1" style="font-size:90%;"> Wanyue Zhang, Rishabh Dabral, Thomas Leimkühler, Vladislav Golyanik, Marc Habermann, and Christian Theobalt. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib53.8.1" style="font-size:90%;">Roam: Robust and object-aware motion generation using neural pose descriptors. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib53.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib53.10.2" style="font-size:90%;">3DV</em><span class="ltx_text" id="bib.bib53.11.3" style="font-size:90%;">, 2024. 
</span> </span> </li> <li class="ltx_bibitem" id="bib.bib54"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib54.5.5.1" style="font-size:90%;">Zhang et al. [2022b]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib54.7.1" style="font-size:90%;"> Xiaohan Zhang, Bharat Lal Bhatnagar, Sebastian Starke, Vladimir Guzov, and Gerard Pons-Moll. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib54.8.1" style="font-size:90%;">Couch: Towards controllable human-chair interactions. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib54.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib54.10.2" style="font-size:90%;">ECCV</em><span class="ltx_text" id="bib.bib54.11.3" style="font-size:90%;">, 2022b. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib55"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib55.5.5.1" style="font-size:90%;">Zhao et al. [2023]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib55.7.1" style="font-size:90%;"> Kaifeng Zhao, Yan Zhang, Shaofei Wang, Thabo Beeler, and Siyu Tang. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib55.8.1" style="font-size:90%;">Synthesizing diverse human motions in 3d indoor scenes. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib55.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib55.10.2" style="font-size:90%;">ICCV</em><span class="ltx_text" id="bib.bib55.11.3" style="font-size:90%;">, 2023. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib56"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib56.5.5.1" style="font-size:90%;">Zhou et al. 
[2025]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib56.7.1" style="font-size:90%;"> Wenyang Zhou, Zhiyang Dou, Zeyu Cao, Zhouyingcheng Liao, Jingbo Wang, Wenjia Wang, Yuan Liu, Taku Komura, Wenping Wang, and Lingjie Liu. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib56.8.1" style="font-size:90%;">Emdm: Efficient motion diffusion model for fast and high-quality motion generation. </span> </span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib56.9.1" style="font-size:90%;">In </span><em class="ltx_emph ltx_font_italic" id="bib.bib56.10.2" style="font-size:90%;">ECCV</em><span class="ltx_text" id="bib.bib56.11.3" style="font-size:90%;">, 2025. </span> </span> </li> </ul> </section> <div class="ltx_pagination ltx_role_newpage"></div> <div class="ltx_para ltx_align_center" id="p2"> <span class="ltx_text" id="p2.1">SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation</span> <br class="ltx_break"/> <p class="ltx_p" id="p2.2"><span class="ltx_text" id="p2.2.1" style="font-size:144%;">Supplementary Material <br class="ltx_break"/></span></p> </div> <figure class="ltx_figure ltx_align_center" id="S6.F5"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="290" id="S6.F5.g1" src="x5.png" width="814"/> <figcaption class="ltx_caption ltx_centering" style="font-size:144%;"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S6.F5.4.1.1" style="font-size:63%;">Figure 5</span>: </span><span class="ltx_text" id="S6.F5.5.2" style="font-size:63%;">ViconStyle demos.</span></figcaption> </figure> <section class="ltx_section ltx_centering" id="S7"> <h2 class="ltx_title ltx_title_section" style="font-size:144%;"> <span class="ltx_tag ltx_tag_section">7 </span>Reward Templates</h2> <div class="ltx_para" id="S7.p1"> <p class="ltx_p" id="S7.p1.1"><span class="ltx_text" id="S7.p1.1.1" style="font-size:144%;">In this section, we introduce the reward 
functions in 3 parts: locomotion (Loco), human-scene interaction (HSI), and dynamic object interaction (DOI).</span></p> <ul class="ltx_itemize" id="S7.I1"> <li class="ltx_item" id="S7.I1.i1" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S7.I1.i1.p1"> <p class="ltx_p" id="S7.I1.i1.p1.7"><span class="ltx_text ltx_font_bold" id="S7.I1.i1.p1.7.1" style="font-size:144%;">Loco Reward.</span><span class="ltx_text" id="S7.I1.i1.p1.7.2" style="font-size:144%;"> The locomotion reward is defined in Equation </span><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S7.E1" style="font-size:144%;" title="Equation 1 ‣ 1st item ‣ 7 Reward Templates ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">1</span></a><span class="ltx_text" id="S7.I1.i1.p1.7.3" style="font-size:144%;">. The overall reward comprises the far </span><math alttext="r_{t}^{far}" class="ltx_Math" display="inline" id="S7.I1.i1.p1.1.m1.1"><semantics id="S7.I1.i1.p1.1.m1.1a"><msubsup id="S7.I1.i1.p1.1.m1.1.1" xref="S7.I1.i1.p1.1.m1.1.1.cmml"><mi id="S7.I1.i1.p1.1.m1.1.1.2.2" mathsize="144%" xref="S7.I1.i1.p1.1.m1.1.1.2.2.cmml">r</mi><mi id="S7.I1.i1.p1.1.m1.1.1.2.3" mathsize="144%" xref="S7.I1.i1.p1.1.m1.1.1.2.3.cmml">t</mi><mrow id="S7.I1.i1.p1.1.m1.1.1.3" xref="S7.I1.i1.p1.1.m1.1.1.3.cmml"><mi id="S7.I1.i1.p1.1.m1.1.1.3.2" mathsize="144%" xref="S7.I1.i1.p1.1.m1.1.1.3.2.cmml">f</mi><mo id="S7.I1.i1.p1.1.m1.1.1.3.1" xref="S7.I1.i1.p1.1.m1.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i1.p1.1.m1.1.1.3.3" mathsize="144%" xref="S7.I1.i1.p1.1.m1.1.1.3.3.cmml">a</mi><mo id="S7.I1.i1.p1.1.m1.1.1.3.1a" xref="S7.I1.i1.p1.1.m1.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i1.p1.1.m1.1.1.3.4" mathsize="144%" xref="S7.I1.i1.p1.1.m1.1.1.3.4.cmml">r</mi></mrow></msubsup><annotation-xml encoding="MathML-Content" id="S7.I1.i1.p1.1.m1.1b"><apply id="S7.I1.i1.p1.1.m1.1.1.cmml" 
xref="S7.I1.i1.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S7.I1.i1.p1.1.m1.1.1.1.cmml" xref="S7.I1.i1.p1.1.m1.1.1">superscript</csymbol><apply id="S7.I1.i1.p1.1.m1.1.1.2.cmml" xref="S7.I1.i1.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S7.I1.i1.p1.1.m1.1.1.2.1.cmml" xref="S7.I1.i1.p1.1.m1.1.1">subscript</csymbol><ci id="S7.I1.i1.p1.1.m1.1.1.2.2.cmml" xref="S7.I1.i1.p1.1.m1.1.1.2.2">𝑟</ci><ci id="S7.I1.i1.p1.1.m1.1.1.2.3.cmml" xref="S7.I1.i1.p1.1.m1.1.1.2.3">𝑡</ci></apply><apply id="S7.I1.i1.p1.1.m1.1.1.3.cmml" xref="S7.I1.i1.p1.1.m1.1.1.3"><times id="S7.I1.i1.p1.1.m1.1.1.3.1.cmml" xref="S7.I1.i1.p1.1.m1.1.1.3.1"></times><ci id="S7.I1.i1.p1.1.m1.1.1.3.2.cmml" xref="S7.I1.i1.p1.1.m1.1.1.3.2">𝑓</ci><ci id="S7.I1.i1.p1.1.m1.1.1.3.3.cmml" xref="S7.I1.i1.p1.1.m1.1.1.3.3">𝑎</ci><ci id="S7.I1.i1.p1.1.m1.1.1.3.4.cmml" xref="S7.I1.i1.p1.1.m1.1.1.3.4">𝑟</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.I1.i1.p1.1.m1.1c">r_{t}^{far}</annotation><annotation encoding="application/x-llamapun" id="S7.I1.i1.p1.1.m1.1d">italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_f italic_a italic_r end_POSTSUPERSCRIPT</annotation></semantics></math><span class="ltx_text" id="S7.I1.i1.p1.7.4" style="font-size:144%;">, near </span><math alttext="r_{t}^{near}" class="ltx_Math" display="inline" id="S7.I1.i1.p1.2.m2.1"><semantics id="S7.I1.i1.p1.2.m2.1a"><msubsup id="S7.I1.i1.p1.2.m2.1.1" xref="S7.I1.i1.p1.2.m2.1.1.cmml"><mi id="S7.I1.i1.p1.2.m2.1.1.2.2" mathsize="144%" xref="S7.I1.i1.p1.2.m2.1.1.2.2.cmml">r</mi><mi id="S7.I1.i1.p1.2.m2.1.1.2.3" mathsize="144%" xref="S7.I1.i1.p1.2.m2.1.1.2.3.cmml">t</mi><mrow id="S7.I1.i1.p1.2.m2.1.1.3" xref="S7.I1.i1.p1.2.m2.1.1.3.cmml"><mi id="S7.I1.i1.p1.2.m2.1.1.3.2" mathsize="144%" xref="S7.I1.i1.p1.2.m2.1.1.3.2.cmml">n</mi><mo id="S7.I1.i1.p1.2.m2.1.1.3.1" xref="S7.I1.i1.p1.2.m2.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i1.p1.2.m2.1.1.3.3" mathsize="144%" xref="S7.I1.i1.p1.2.m2.1.1.3.3.cmml">e</mi><mo 
id="S7.I1.i1.p1.2.m2.1.1.3.1a" xref="S7.I1.i1.p1.2.m2.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i1.p1.2.m2.1.1.3.4" mathsize="144%" xref="S7.I1.i1.p1.2.m2.1.1.3.4.cmml">a</mi><mo id="S7.I1.i1.p1.2.m2.1.1.3.1b" xref="S7.I1.i1.p1.2.m2.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i1.p1.2.m2.1.1.3.5" mathsize="144%" xref="S7.I1.i1.p1.2.m2.1.1.3.5.cmml">r</mi></mrow></msubsup><annotation-xml encoding="MathML-Content" id="S7.I1.i1.p1.2.m2.1b"><apply id="S7.I1.i1.p1.2.m2.1.1.cmml" xref="S7.I1.i1.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S7.I1.i1.p1.2.m2.1.1.1.cmml" xref="S7.I1.i1.p1.2.m2.1.1">superscript</csymbol><apply id="S7.I1.i1.p1.2.m2.1.1.2.cmml" xref="S7.I1.i1.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S7.I1.i1.p1.2.m2.1.1.2.1.cmml" xref="S7.I1.i1.p1.2.m2.1.1">subscript</csymbol><ci id="S7.I1.i1.p1.2.m2.1.1.2.2.cmml" xref="S7.I1.i1.p1.2.m2.1.1.2.2">𝑟</ci><ci id="S7.I1.i1.p1.2.m2.1.1.2.3.cmml" xref="S7.I1.i1.p1.2.m2.1.1.2.3">𝑡</ci></apply><apply id="S7.I1.i1.p1.2.m2.1.1.3.cmml" xref="S7.I1.i1.p1.2.m2.1.1.3"><times id="S7.I1.i1.p1.2.m2.1.1.3.1.cmml" xref="S7.I1.i1.p1.2.m2.1.1.3.1"></times><ci id="S7.I1.i1.p1.2.m2.1.1.3.2.cmml" xref="S7.I1.i1.p1.2.m2.1.1.3.2">𝑛</ci><ci id="S7.I1.i1.p1.2.m2.1.1.3.3.cmml" xref="S7.I1.i1.p1.2.m2.1.1.3.3">𝑒</ci><ci id="S7.I1.i1.p1.2.m2.1.1.3.4.cmml" xref="S7.I1.i1.p1.2.m2.1.1.3.4">𝑎</ci><ci id="S7.I1.i1.p1.2.m2.1.1.3.5.cmml" xref="S7.I1.i1.p1.2.m2.1.1.3.5">𝑟</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.I1.i1.p1.2.m2.1c">r_{t}^{near}</annotation><annotation encoding="application/x-llamapun" id="S7.I1.i1.p1.2.m2.1d">italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n italic_e italic_a italic_r end_POSTSUPERSCRIPT</annotation></semantics></math><span class="ltx_text" id="S7.I1.i1.p1.7.5" style="font-size:144%;">, and standstill </span><math alttext="r_{t}^{still}" class="ltx_Math" display="inline" id="S7.I1.i1.p1.3.m3.1"><semantics id="S7.I1.i1.p1.3.m3.1a"><msubsup 
id="S7.I1.i1.p1.3.m3.1.1" xref="S7.I1.i1.p1.3.m3.1.1.cmml"><mi id="S7.I1.i1.p1.3.m3.1.1.2.2" mathsize="144%" xref="S7.I1.i1.p1.3.m3.1.1.2.2.cmml">r</mi><mi id="S7.I1.i1.p1.3.m3.1.1.2.3" mathsize="144%" xref="S7.I1.i1.p1.3.m3.1.1.2.3.cmml">t</mi><mrow id="S7.I1.i1.p1.3.m3.1.1.3" xref="S7.I1.i1.p1.3.m3.1.1.3.cmml"><mi id="S7.I1.i1.p1.3.m3.1.1.3.2" mathsize="144%" xref="S7.I1.i1.p1.3.m3.1.1.3.2.cmml">s</mi><mo id="S7.I1.i1.p1.3.m3.1.1.3.1" xref="S7.I1.i1.p1.3.m3.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i1.p1.3.m3.1.1.3.3" mathsize="144%" xref="S7.I1.i1.p1.3.m3.1.1.3.3.cmml">t</mi><mo id="S7.I1.i1.p1.3.m3.1.1.3.1a" xref="S7.I1.i1.p1.3.m3.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i1.p1.3.m3.1.1.3.4" mathsize="144%" xref="S7.I1.i1.p1.3.m3.1.1.3.4.cmml">i</mi><mo id="S7.I1.i1.p1.3.m3.1.1.3.1b" xref="S7.I1.i1.p1.3.m3.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i1.p1.3.m3.1.1.3.5" mathsize="144%" xref="S7.I1.i1.p1.3.m3.1.1.3.5.cmml">l</mi><mo id="S7.I1.i1.p1.3.m3.1.1.3.1c" xref="S7.I1.i1.p1.3.m3.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i1.p1.3.m3.1.1.3.6" mathsize="144%" xref="S7.I1.i1.p1.3.m3.1.1.3.6.cmml">l</mi></mrow></msubsup><annotation-xml encoding="MathML-Content" id="S7.I1.i1.p1.3.m3.1b"><apply id="S7.I1.i1.p1.3.m3.1.1.cmml" xref="S7.I1.i1.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S7.I1.i1.p1.3.m3.1.1.1.cmml" xref="S7.I1.i1.p1.3.m3.1.1">superscript</csymbol><apply id="S7.I1.i1.p1.3.m3.1.1.2.cmml" xref="S7.I1.i1.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S7.I1.i1.p1.3.m3.1.1.2.1.cmml" xref="S7.I1.i1.p1.3.m3.1.1">subscript</csymbol><ci id="S7.I1.i1.p1.3.m3.1.1.2.2.cmml" xref="S7.I1.i1.p1.3.m3.1.1.2.2">𝑟</ci><ci id="S7.I1.i1.p1.3.m3.1.1.2.3.cmml" xref="S7.I1.i1.p1.3.m3.1.1.2.3">𝑡</ci></apply><apply id="S7.I1.i1.p1.3.m3.1.1.3.cmml" xref="S7.I1.i1.p1.3.m3.1.1.3"><times id="S7.I1.i1.p1.3.m3.1.1.3.1.cmml" xref="S7.I1.i1.p1.3.m3.1.1.3.1"></times><ci id="S7.I1.i1.p1.3.m3.1.1.3.2.cmml" xref="S7.I1.i1.p1.3.m3.1.1.3.2">𝑠</ci><ci id="S7.I1.i1.p1.3.m3.1.1.3.3.cmml" xref="S7.I1.i1.p1.3.m3.1.1.3.3">𝑡</ci><ci 
id="S7.I1.i1.p1.3.m3.1.1.3.4.cmml" xref="S7.I1.i1.p1.3.m3.1.1.3.4">𝑖</ci><ci id="S7.I1.i1.p1.3.m3.1.1.3.5.cmml" xref="S7.I1.i1.p1.3.m3.1.1.3.5">𝑙</ci><ci id="S7.I1.i1.p1.3.m3.1.1.3.6.cmml" xref="S7.I1.i1.p1.3.m3.1.1.3.6">𝑙</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.I1.i1.p1.3.m3.1c">r_{t}^{still}</annotation><annotation encoding="application/x-llamapun" id="S7.I1.i1.p1.3.m3.1d">italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_s italic_t italic_i italic_l italic_l end_POSTSUPERSCRIPT</annotation></semantics></math><span class="ltx_text" id="S7.I1.i1.p1.7.6" style="font-size:144%;"> rewards. The standstill reward ensures that the humanoid remains static once the target position has been reached. Given a target position </span><math alttext="x^{*}" class="ltx_Math" display="inline" id="S7.I1.i1.p1.4.m4.1"><semantics id="S7.I1.i1.p1.4.m4.1a"><msup id="S7.I1.i1.p1.4.m4.1.1" xref="S7.I1.i1.p1.4.m4.1.1.cmml"><mi id="S7.I1.i1.p1.4.m4.1.1.2" mathsize="144%" xref="S7.I1.i1.p1.4.m4.1.1.2.cmml">x</mi><mo id="S7.I1.i1.p1.4.m4.1.1.3" mathsize="144%" xref="S7.I1.i1.p1.4.m4.1.1.3.cmml">∗</mo></msup><annotation-xml encoding="MathML-Content" id="S7.I1.i1.p1.4.m4.1b"><apply id="S7.I1.i1.p1.4.m4.1.1.cmml" xref="S7.I1.i1.p1.4.m4.1.1"><csymbol cd="ambiguous" id="S7.I1.i1.p1.4.m4.1.1.1.cmml" xref="S7.I1.i1.p1.4.m4.1.1">superscript</csymbol><ci id="S7.I1.i1.p1.4.m4.1.1.2.cmml" xref="S7.I1.i1.p1.4.m4.1.1.2">𝑥</ci><times id="S7.I1.i1.p1.4.m4.1.1.3.cmml" xref="S7.I1.i1.p1.4.m4.1.1.3"></times></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.I1.i1.p1.4.m4.1c">x^{*}</annotation><annotation encoding="application/x-llamapun" id="S7.I1.i1.p1.4.m4.1d">italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT</annotation></semantics></math><span class="ltx_text" id="S7.I1.i1.p1.7.7" style="font-size:144%;"> of the character’s root </span><math alttext="x^{root}" class="ltx_Math" display="inline" 
id="S7.I1.i1.p1.5.m5.1"><semantics id="S7.I1.i1.p1.5.m5.1a"><msup id="S7.I1.i1.p1.5.m5.1.1" xref="S7.I1.i1.p1.5.m5.1.1.cmml"><mi id="S7.I1.i1.p1.5.m5.1.1.2" mathsize="144%" xref="S7.I1.i1.p1.5.m5.1.1.2.cmml">x</mi><mrow id="S7.I1.i1.p1.5.m5.1.1.3" xref="S7.I1.i1.p1.5.m5.1.1.3.cmml"><mi id="S7.I1.i1.p1.5.m5.1.1.3.2" mathsize="144%" xref="S7.I1.i1.p1.5.m5.1.1.3.2.cmml">r</mi><mo id="S7.I1.i1.p1.5.m5.1.1.3.1" xref="S7.I1.i1.p1.5.m5.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i1.p1.5.m5.1.1.3.3" mathsize="144%" xref="S7.I1.i1.p1.5.m5.1.1.3.3.cmml">o</mi><mo id="S7.I1.i1.p1.5.m5.1.1.3.1a" xref="S7.I1.i1.p1.5.m5.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i1.p1.5.m5.1.1.3.4" mathsize="144%" xref="S7.I1.i1.p1.5.m5.1.1.3.4.cmml">o</mi><mo id="S7.I1.i1.p1.5.m5.1.1.3.1b" xref="S7.I1.i1.p1.5.m5.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i1.p1.5.m5.1.1.3.5" mathsize="144%" xref="S7.I1.i1.p1.5.m5.1.1.3.5.cmml">t</mi></mrow></msup><annotation-xml encoding="MathML-Content" id="S7.I1.i1.p1.5.m5.1b"><apply id="S7.I1.i1.p1.5.m5.1.1.cmml" xref="S7.I1.i1.p1.5.m5.1.1"><csymbol cd="ambiguous" id="S7.I1.i1.p1.5.m5.1.1.1.cmml" xref="S7.I1.i1.p1.5.m5.1.1">superscript</csymbol><ci id="S7.I1.i1.p1.5.m5.1.1.2.cmml" xref="S7.I1.i1.p1.5.m5.1.1.2">𝑥</ci><apply id="S7.I1.i1.p1.5.m5.1.1.3.cmml" xref="S7.I1.i1.p1.5.m5.1.1.3"><times id="S7.I1.i1.p1.5.m5.1.1.3.1.cmml" xref="S7.I1.i1.p1.5.m5.1.1.3.1"></times><ci id="S7.I1.i1.p1.5.m5.1.1.3.2.cmml" xref="S7.I1.i1.p1.5.m5.1.1.3.2">𝑟</ci><ci id="S7.I1.i1.p1.5.m5.1.1.3.3.cmml" xref="S7.I1.i1.p1.5.m5.1.1.3.3">𝑜</ci><ci id="S7.I1.i1.p1.5.m5.1.1.3.4.cmml" xref="S7.I1.i1.p1.5.m5.1.1.3.4">𝑜</ci><ci id="S7.I1.i1.p1.5.m5.1.1.3.5.cmml" xref="S7.I1.i1.p1.5.m5.1.1.3.5">𝑡</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.I1.i1.p1.5.m5.1c">x^{root}</annotation><annotation encoding="application/x-llamapun" id="S7.I1.i1.p1.5.m5.1d">italic_x start_POSTSUPERSCRIPT italic_r italic_o italic_o italic_t end_POSTSUPERSCRIPT</annotation></semantics></math><span 
class="ltx_text" id="S7.I1.i1.p1.7.8" style="font-size:144%;">, a target direction </span><math alttext="d^{*}_{t}" class="ltx_Math" display="inline" id="S7.I1.i1.p1.6.m6.1"><semantics id="S7.I1.i1.p1.6.m6.1a"><msubsup id="S7.I1.i1.p1.6.m6.1.1" xref="S7.I1.i1.p1.6.m6.1.1.cmml"><mi id="S7.I1.i1.p1.6.m6.1.1.2.2" mathsize="144%" xref="S7.I1.i1.p1.6.m6.1.1.2.2.cmml">d</mi><mi id="S7.I1.i1.p1.6.m6.1.1.3" mathsize="144%" xref="S7.I1.i1.p1.6.m6.1.1.3.cmml">t</mi><mo id="S7.I1.i1.p1.6.m6.1.1.2.3" mathsize="144%" xref="S7.I1.i1.p1.6.m6.1.1.2.3.cmml">∗</mo></msubsup><annotation-xml encoding="MathML-Content" id="S7.I1.i1.p1.6.m6.1b"><apply id="S7.I1.i1.p1.6.m6.1.1.cmml" xref="S7.I1.i1.p1.6.m6.1.1"><csymbol cd="ambiguous" id="S7.I1.i1.p1.6.m6.1.1.1.cmml" xref="S7.I1.i1.p1.6.m6.1.1">subscript</csymbol><apply id="S7.I1.i1.p1.6.m6.1.1.2.cmml" xref="S7.I1.i1.p1.6.m6.1.1"><csymbol cd="ambiguous" id="S7.I1.i1.p1.6.m6.1.1.2.1.cmml" xref="S7.I1.i1.p1.6.m6.1.1">superscript</csymbol><ci id="S7.I1.i1.p1.6.m6.1.1.2.2.cmml" xref="S7.I1.i1.p1.6.m6.1.1.2.2">𝑑</ci><times id="S7.I1.i1.p1.6.m6.1.1.2.3.cmml" xref="S7.I1.i1.p1.6.m6.1.1.2.3"></times></apply><ci id="S7.I1.i1.p1.6.m6.1.1.3.cmml" xref="S7.I1.i1.p1.6.m6.1.1.3">𝑡</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.I1.i1.p1.6.m6.1c">d^{*}_{t}</annotation><annotation encoding="application/x-llamapun" id="S7.I1.i1.p1.6.m6.1d">italic_d start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math><span class="ltx_text" id="S7.I1.i1.p1.7.9" style="font-size:144%;">, and a target scalar velocity </span><math alttext="g_{t}^{vel}" class="ltx_Math" display="inline" id="S7.I1.i1.p1.7.m7.1"><semantics id="S7.I1.i1.p1.7.m7.1a"><msubsup id="S7.I1.i1.p1.7.m7.1.1" xref="S7.I1.i1.p1.7.m7.1.1.cmml"><mi id="S7.I1.i1.p1.7.m7.1.1.2.2" mathsize="144%" xref="S7.I1.i1.p1.7.m7.1.1.2.2.cmml">g</mi><mi id="S7.I1.i1.p1.7.m7.1.1.2.3" mathsize="144%" 
xref="S7.I1.i1.p1.7.m7.1.1.2.3.cmml">t</mi><mrow id="S7.I1.i1.p1.7.m7.1.1.3" xref="S7.I1.i1.p1.7.m7.1.1.3.cmml"><mi id="S7.I1.i1.p1.7.m7.1.1.3.2" mathsize="144%" xref="S7.I1.i1.p1.7.m7.1.1.3.2.cmml">v</mi><mo id="S7.I1.i1.p1.7.m7.1.1.3.1" xref="S7.I1.i1.p1.7.m7.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i1.p1.7.m7.1.1.3.3" mathsize="144%" xref="S7.I1.i1.p1.7.m7.1.1.3.3.cmml">e</mi><mo id="S7.I1.i1.p1.7.m7.1.1.3.1a" xref="S7.I1.i1.p1.7.m7.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i1.p1.7.m7.1.1.3.4" mathsize="144%" xref="S7.I1.i1.p1.7.m7.1.1.3.4.cmml">l</mi></mrow></msubsup><annotation-xml encoding="MathML-Content" id="S7.I1.i1.p1.7.m7.1b"><apply id="S7.I1.i1.p1.7.m7.1.1.cmml" xref="S7.I1.i1.p1.7.m7.1.1"><csymbol cd="ambiguous" id="S7.I1.i1.p1.7.m7.1.1.1.cmml" xref="S7.I1.i1.p1.7.m7.1.1">superscript</csymbol><apply id="S7.I1.i1.p1.7.m7.1.1.2.cmml" xref="S7.I1.i1.p1.7.m7.1.1"><csymbol cd="ambiguous" id="S7.I1.i1.p1.7.m7.1.1.2.1.cmml" xref="S7.I1.i1.p1.7.m7.1.1">subscript</csymbol><ci id="S7.I1.i1.p1.7.m7.1.1.2.2.cmml" xref="S7.I1.i1.p1.7.m7.1.1.2.2">𝑔</ci><ci id="S7.I1.i1.p1.7.m7.1.1.2.3.cmml" xref="S7.I1.i1.p1.7.m7.1.1.2.3">𝑡</ci></apply><apply id="S7.I1.i1.p1.7.m7.1.1.3.cmml" xref="S7.I1.i1.p1.7.m7.1.1.3"><times id="S7.I1.i1.p1.7.m7.1.1.3.1.cmml" xref="S7.I1.i1.p1.7.m7.1.1.3.1"></times><ci id="S7.I1.i1.p1.7.m7.1.1.3.2.cmml" xref="S7.I1.i1.p1.7.m7.1.1.3.2">𝑣</ci><ci id="S7.I1.i1.p1.7.m7.1.1.3.3.cmml" xref="S7.I1.i1.p1.7.m7.1.1.3.3">𝑒</ci><ci id="S7.I1.i1.p1.7.m7.1.1.3.4.cmml" xref="S7.I1.i1.p1.7.m7.1.1.3.4">𝑙</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.I1.i1.p1.7.m7.1c">g_{t}^{vel}</annotation><annotation encoding="application/x-llamapun" id="S7.I1.i1.p1.7.m7.1d">italic_g start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_v italic_e italic_l end_POSTSUPERSCRIPT</annotation></semantics></math><span class="ltx_text" id="S7.I1.i1.p1.7.10" style="font-size:144%;">, the task reward is defined as:</span></p> </div> <div 
class="ltx_para" id="S7.I1.i1.p2"> <table class="ltx_equation ltx_eqn_table" id="S7.E1"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="r^{G}_{t}=\left\{\begin{aligned} 0.4\ r_{t}^{near}&amp;+0.5\ r_{t}^{far}+0,\left\|% x^{*}-x^{root}_{t}\right\|^{2}&gt;0.5,\\ 0.4\ r_{t}^{near}&amp;+0.5+0.1\ r_{t}^{still},\text{otherwise.}\end{aligned}\right." class="ltx_math_unparsed" display="block" id="S7.E1.m1.3"><semantics id="S7.E1.m1.3a"><mrow id="S7.E1.m1.3b"><msubsup id="S7.E1.m1.3.4"><mi id="S7.E1.m1.3.4.2.2" mathsize="144%">r</mi><mi id="S7.E1.m1.3.4.3" mathsize="144%">t</mi><mi id="S7.E1.m1.3.4.2.3" mathsize="144%">G</mi></msubsup><mo id="S7.E1.m1.3.5" mathsize="144%">=</mo><mrow id="S7.E1.m1.3.6"><mo id="S7.E1.m1.3.6.1">{</mo><mtable columnspacing="0pt" displaystyle="true" id="S7.E1.m1.3.3" rowspacing="0pt"><mtr id="S7.E1.m1.3.3a"><mtd class="ltx_align_right" columnalign="right" id="S7.E1.m1.3.3b"><mrow id="S7.E1.m1.1.1.1.2.1"><mn id="S7.E1.m1.1.1.1.2.1.2" mathsize="144%">0.4</mn><mo id="S7.E1.m1.1.1.1.2.1.1" lspace="0.720em">⁢</mo><msubsup id="S7.E1.m1.1.1.1.2.1.3"><mi id="S7.E1.m1.1.1.1.2.1.3.2.2" mathsize="144%">r</mi><mi id="S7.E1.m1.1.1.1.2.1.3.2.3" mathsize="144%">t</mi><mrow id="S7.E1.m1.1.1.1.2.1.3.3"><mi id="S7.E1.m1.1.1.1.2.1.3.3.2" mathsize="144%">n</mi><mo id="S7.E1.m1.1.1.1.2.1.3.3.1">⁢</mo><mi id="S7.E1.m1.1.1.1.2.1.3.3.3" mathsize="144%">e</mi><mo id="S7.E1.m1.1.1.1.2.1.3.3.1a">⁢</mo><mi id="S7.E1.m1.1.1.1.2.1.3.3.4" mathsize="144%">a</mi><mo id="S7.E1.m1.1.1.1.2.1.3.3.1b">⁢</mo><mi id="S7.E1.m1.1.1.1.2.1.3.3.5" mathsize="144%">r</mi></mrow></msubsup></mrow></mtd><mtd class="ltx_align_left" columnalign="left" id="S7.E1.m1.3.3c"><mrow id="S7.E1.m1.1.1.1.1.1.1"><mrow id="S7.E1.m1.1.1.1.1.1.1.1"><mrow id="S7.E1.m1.1.1.1.1.1.1.1.2.2"><mrow id="S7.E1.m1.1.1.1.1.1.1.1.1.1.1"><mrow id="S7.E1.m1.1.1.1.1.1.1.1.1.1.1.2"><mo 
id="S7.E1.m1.1.1.1.1.1.1.1.1.1.1.2a" mathsize="144%">+</mo><mrow id="S7.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2"><mn id="S7.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.2" mathsize="144%">0.5</mn><mo id="S7.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.1" lspace="0.720em">⁢</mo><msubsup id="S7.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.3"><mi id="S7.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.3.2.2" mathsize="144%">r</mi><mi id="S7.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.3.2.3" mathsize="144%">t</mi><mrow id="S7.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.3.3"><mi id="S7.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.3.3.2" mathsize="144%">f</mi><mo id="S7.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.3.3.1">⁢</mo><mi id="S7.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.3.3.3" mathsize="144%">a</mi><mo id="S7.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.3.3.1a">⁢</mo><mi id="S7.E1.m1.1.1.1.1.1.1.1.1.1.1.2.2.3.3.4" mathsize="144%">r</mi></mrow></msubsup></mrow></mrow><mo id="S7.E1.m1.1.1.1.1.1.1.1.1.1.1.1" mathsize="144%">+</mo><mn id="S7.E1.m1.1.1.1.1.1.1.1.1.1.1.3" mathsize="144%">0</mn></mrow><mo id="S7.E1.m1.1.1.1.1.1.1.1.2.2.3" mathsize="144%">,</mo><msup id="S7.E1.m1.1.1.1.1.1.1.1.2.2.2"><mrow id="S7.E1.m1.1.1.1.1.1.1.1.2.2.2.1.1"><mo id="S7.E1.m1.1.1.1.1.1.1.1.2.2.2.1.1.2">‖</mo><mrow id="S7.E1.m1.1.1.1.1.1.1.1.2.2.2.1.1.1"><msup id="S7.E1.m1.1.1.1.1.1.1.1.2.2.2.1.1.1.2"><mi id="S7.E1.m1.1.1.1.1.1.1.1.2.2.2.1.1.1.2.2" mathsize="144%">x</mi><mo id="S7.E1.m1.1.1.1.1.1.1.1.2.2.2.1.1.1.2.3" mathsize="144%">∗</mo></msup><mo id="S7.E1.m1.1.1.1.1.1.1.1.2.2.2.1.1.1.1" mathsize="144%">−</mo><msubsup id="S7.E1.m1.1.1.1.1.1.1.1.2.2.2.1.1.1.3"><mi id="S7.E1.m1.1.1.1.1.1.1.1.2.2.2.1.1.1.3.2.2" mathsize="144%">x</mi><mi id="S7.E1.m1.1.1.1.1.1.1.1.2.2.2.1.1.1.3.3" mathsize="144%">t</mi><mrow id="S7.E1.m1.1.1.1.1.1.1.1.2.2.2.1.1.1.3.2.3"><mi id="S7.E1.m1.1.1.1.1.1.1.1.2.2.2.1.1.1.3.2.3.2" mathsize="144%">r</mi><mo id="S7.E1.m1.1.1.1.1.1.1.1.2.2.2.1.1.1.3.2.3.1">⁢</mo><mi id="S7.E1.m1.1.1.1.1.1.1.1.2.2.2.1.1.1.3.2.3.3" mathsize="144%">o</mi><mo id="S7.E1.m1.1.1.1.1.1.1.1.2.2.2.1.1.1.3.2.3.1a">⁢</mo><mi 
id="S7.E1.m1.1.1.1.1.1.1.1.2.2.2.1.1.1.3.2.3.4" mathsize="144%">o</mi><mo id="S7.E1.m1.1.1.1.1.1.1.1.2.2.2.1.1.1.3.2.3.1b">⁢</mo><mi id="S7.E1.m1.1.1.1.1.1.1.1.2.2.2.1.1.1.3.2.3.5" mathsize="144%">t</mi></mrow></msubsup></mrow><mo id="S7.E1.m1.1.1.1.1.1.1.1.2.2.2.1.1.3">‖</mo></mrow><mn id="S7.E1.m1.1.1.1.1.1.1.1.2.2.2.3" mathsize="144%">2</mn></msup></mrow><mo id="S7.E1.m1.1.1.1.1.1.1.1.3" mathsize="144%">&gt;</mo><mn id="S7.E1.m1.1.1.1.1.1.1.1.4" mathsize="144%">0.5</mn></mrow><mo id="S7.E1.m1.1.1.1.1.1.1.2" mathsize="144%">,</mo></mrow></mtd></mtr><mtr id="S7.E1.m1.3.3d"><mtd class="ltx_align_right" columnalign="right" id="S7.E1.m1.3.3e"><mrow id="S7.E1.m1.3.3.3.3.1"><mn id="S7.E1.m1.3.3.3.3.1.2" mathsize="144%">0.4</mn><mo id="S7.E1.m1.3.3.3.3.1.1" lspace="0.720em">⁢</mo><msubsup id="S7.E1.m1.3.3.3.3.1.3"><mi id="S7.E1.m1.3.3.3.3.1.3.2.2" mathsize="144%">r</mi><mi id="S7.E1.m1.3.3.3.3.1.3.2.3" mathsize="144%">t</mi><mrow id="S7.E1.m1.3.3.3.3.1.3.3"><mi id="S7.E1.m1.3.3.3.3.1.3.3.2" mathsize="144%">n</mi><mo id="S7.E1.m1.3.3.3.3.1.3.3.1">⁢</mo><mi id="S7.E1.m1.3.3.3.3.1.3.3.3" mathsize="144%">e</mi><mo id="S7.E1.m1.3.3.3.3.1.3.3.1a">⁢</mo><mi id="S7.E1.m1.3.3.3.3.1.3.3.4" mathsize="144%">a</mi><mo id="S7.E1.m1.3.3.3.3.1.3.3.1b">⁢</mo><mi id="S7.E1.m1.3.3.3.3.1.3.3.5" mathsize="144%">r</mi></mrow></msubsup></mrow></mtd><mtd class="ltx_align_left" columnalign="left" id="S7.E1.m1.3.3f"><mrow id="S7.E1.m1.3.3.3.2.2.2"><mrow id="S7.E1.m1.3.3.3.2.2.2.1"><mrow id="S7.E1.m1.3.3.3.2.2.2.1.2"><mo id="S7.E1.m1.3.3.3.2.2.2.1.2a" mathsize="144%">+</mo><mn id="S7.E1.m1.3.3.3.2.2.2.1.2.2" mathsize="144%">0.5</mn></mrow><mo id="S7.E1.m1.3.3.3.2.2.2.1.1" mathsize="144%">+</mo><mrow id="S7.E1.m1.3.3.3.2.2.2.1.3"><mn id="S7.E1.m1.3.3.3.2.2.2.1.3.2" mathsize="144%">0.1</mn><mo id="S7.E1.m1.3.3.3.2.2.2.1.3.1" lspace="0.720em">⁢</mo><msubsup id="S7.E1.m1.3.3.3.2.2.2.1.3.3"><mi id="S7.E1.m1.3.3.3.2.2.2.1.3.3.2.2" mathsize="144%">r</mi><mi id="S7.E1.m1.3.3.3.2.2.2.1.3.3.2.3" 
mathsize="144%">t</mi><mrow id="S7.E1.m1.3.3.3.2.2.2.1.3.3.3"><mi id="S7.E1.m1.3.3.3.2.2.2.1.3.3.3.2" mathsize="144%">s</mi><mo id="S7.E1.m1.3.3.3.2.2.2.1.3.3.3.1">⁢</mo><mi id="S7.E1.m1.3.3.3.2.2.2.1.3.3.3.3" mathsize="144%">t</mi><mo id="S7.E1.m1.3.3.3.2.2.2.1.3.3.3.1a">⁢</mo><mi id="S7.E1.m1.3.3.3.2.2.2.1.3.3.3.4" mathsize="144%">i</mi><mo id="S7.E1.m1.3.3.3.2.2.2.1.3.3.3.1b">⁢</mo><mi id="S7.E1.m1.3.3.3.2.2.2.1.3.3.3.5" mathsize="144%">l</mi><mo id="S7.E1.m1.3.3.3.2.2.2.1.3.3.3.1c">⁢</mo><mi id="S7.E1.m1.3.3.3.2.2.2.1.3.3.3.6" mathsize="144%">l</mi></mrow></msubsup></mrow></mrow><mo id="S7.E1.m1.3.3.3.2.2.2.2" mathsize="144%">,</mo><mtext id="S7.E1.m1.2.2.2.1.1.1" mathsize="144%">otherwise.</mtext></mrow></mtd></mtr></mtable></mrow></mrow><annotation encoding="application/x-tex" id="S7.E1.m1.3c">r^{G}_{t}=\left\{\begin{aligned} 0.4\ r_{t}^{near}&amp;+0.5\ r_{t}^{far}+0,\left\|% x^{*}-x^{root}_{t}\right\|^{2}&gt;0.5,\\ 0.4\ r_{t}^{near}&amp;+0.5+0.1\ r_{t}^{still},\text{otherwise.}\end{aligned}\right.</annotation><annotation encoding="application/x-llamapun" id="S7.E1.m1.3d">italic_r start_POSTSUPERSCRIPT italic_G end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = { start_ROW start_CELL 0.4 italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n italic_e italic_a italic_r end_POSTSUPERSCRIPT end_CELL start_CELL + 0.5 italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_f italic_a italic_r end_POSTSUPERSCRIPT + 0 , ∥ italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT - italic_x start_POSTSUPERSCRIPT italic_r italic_o italic_o italic_t end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT &gt; 0.5 , end_CELL end_ROW start_ROW start_CELL 0.4 italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n italic_e italic_a italic_r end_POSTSUPERSCRIPT end_CELL start_CELL + 0.5 + 0.1 italic_r 
start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_s italic_t italic_i italic_l italic_l end_POSTSUPERSCRIPT , otherwise. end_CELL end_ROW</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(1)</span></td> </tr></tbody> </table> </div> <div class="ltx_para" id="S7.I1.i1.p3"> <table class="ltx_equationgroup ltx_eqn_table" id="S7.E2"> <tbody> <tr class="ltx_equation ltx_eqn_row ltx_align_baseline" id="S7.E2X"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_align_right ltx_eqn_cell"><math alttext="\displaystyle r_{t}^{far}" class="ltx_Math" display="inline" id="S7.E2X.2.1.1.m1.1"><semantics id="S7.E2X.2.1.1.m1.1a"><msubsup id="S7.E2X.2.1.1.m1.1.1" xref="S7.E2X.2.1.1.m1.1.1.cmml"><mi id="S7.E2X.2.1.1.m1.1.1.2.2" mathsize="144%" xref="S7.E2X.2.1.1.m1.1.1.2.2.cmml">r</mi><mi id="S7.E2X.2.1.1.m1.1.1.2.3" mathsize="144%" xref="S7.E2X.2.1.1.m1.1.1.2.3.cmml">t</mi><mrow id="S7.E2X.2.1.1.m1.1.1.3" xref="S7.E2X.2.1.1.m1.1.1.3.cmml"><mi id="S7.E2X.2.1.1.m1.1.1.3.2" mathsize="144%" xref="S7.E2X.2.1.1.m1.1.1.3.2.cmml">f</mi><mo id="S7.E2X.2.1.1.m1.1.1.3.1" xref="S7.E2X.2.1.1.m1.1.1.3.1.cmml">⁢</mo><mi id="S7.E2X.2.1.1.m1.1.1.3.3" mathsize="144%" xref="S7.E2X.2.1.1.m1.1.1.3.3.cmml">a</mi><mo id="S7.E2X.2.1.1.m1.1.1.3.1a" xref="S7.E2X.2.1.1.m1.1.1.3.1.cmml">⁢</mo><mi id="S7.E2X.2.1.1.m1.1.1.3.4" mathsize="144%" xref="S7.E2X.2.1.1.m1.1.1.3.4.cmml">r</mi></mrow></msubsup><annotation-xml encoding="MathML-Content" id="S7.E2X.2.1.1.m1.1b"><apply id="S7.E2X.2.1.1.m1.1.1.cmml" xref="S7.E2X.2.1.1.m1.1.1"><csymbol cd="ambiguous" id="S7.E2X.2.1.1.m1.1.1.1.cmml" xref="S7.E2X.2.1.1.m1.1.1">superscript</csymbol><apply id="S7.E2X.2.1.1.m1.1.1.2.cmml" xref="S7.E2X.2.1.1.m1.1.1"><csymbol cd="ambiguous" id="S7.E2X.2.1.1.m1.1.1.2.1.cmml" 
xref="S7.E2X.2.1.1.m1.1.1">subscript</csymbol><ci id="S7.E2X.2.1.1.m1.1.1.2.2.cmml" xref="S7.E2X.2.1.1.m1.1.1.2.2">𝑟</ci><ci id="S7.E2X.2.1.1.m1.1.1.2.3.cmml" xref="S7.E2X.2.1.1.m1.1.1.2.3">𝑡</ci></apply><apply id="S7.E2X.2.1.1.m1.1.1.3.cmml" xref="S7.E2X.2.1.1.m1.1.1.3"><times id="S7.E2X.2.1.1.m1.1.1.3.1.cmml" xref="S7.E2X.2.1.1.m1.1.1.3.1"></times><ci id="S7.E2X.2.1.1.m1.1.1.3.2.cmml" xref="S7.E2X.2.1.1.m1.1.1.3.2">𝑓</ci><ci id="S7.E2X.2.1.1.m1.1.1.3.3.cmml" xref="S7.E2X.2.1.1.m1.1.1.3.3">𝑎</ci><ci id="S7.E2X.2.1.1.m1.1.1.3.4.cmml" xref="S7.E2X.2.1.1.m1.1.1.3.4">𝑟</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.E2X.2.1.1.m1.1c">\displaystyle r_{t}^{far}</annotation><annotation encoding="application/x-llamapun" id="S7.E2X.2.1.1.m1.1d">italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_f italic_a italic_r end_POSTSUPERSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_left ltx_eqn_cell"><math alttext="\displaystyle=0.6\ \text{exp}\big{(}-0.5\left\|x^{*}-x_{t}^{root}\right\|^{2}% \big{)}" class="ltx_Math" display="inline" id="S7.E2X.3.2.2.m1.1"><semantics id="S7.E2X.3.2.2.m1.1a"><mrow id="S7.E2X.3.2.2.m1.1.1" xref="S7.E2X.3.2.2.m1.1.1.cmml"><mi id="S7.E2X.3.2.2.m1.1.1.3" xref="S7.E2X.3.2.2.m1.1.1.3.cmml"></mi><mo id="S7.E2X.3.2.2.m1.1.1.2" mathsize="144%" xref="S7.E2X.3.2.2.m1.1.1.2.cmml">=</mo><mrow id="S7.E2X.3.2.2.m1.1.1.1" xref="S7.E2X.3.2.2.m1.1.1.1.cmml"><mn id="S7.E2X.3.2.2.m1.1.1.1.3" mathsize="144%" xref="S7.E2X.3.2.2.m1.1.1.1.3.cmml">0.6</mn><mo id="S7.E2X.3.2.2.m1.1.1.1.2" lspace="0.720em" xref="S7.E2X.3.2.2.m1.1.1.1.2.cmml">⁢</mo><mtext id="S7.E2X.3.2.2.m1.1.1.1.4" mathsize="144%" xref="S7.E2X.3.2.2.m1.1.1.1.4a.cmml">exp</mtext><mo id="S7.E2X.3.2.2.m1.1.1.1.2a" xref="S7.E2X.3.2.2.m1.1.1.1.2.cmml">⁢</mo><mrow id="S7.E2X.3.2.2.m1.1.1.1.1.1" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.cmml"><mo id="S7.E2X.3.2.2.m1.1.1.1.1.1.2" maxsize="120%" minsize="120%" 
xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.cmml">(</mo><mrow id="S7.E2X.3.2.2.m1.1.1.1.1.1.1" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.cmml"><mo id="S7.E2X.3.2.2.m1.1.1.1.1.1.1a" mathsize="144%" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.cmml">−</mo><mrow id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.cmml"><mn id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.3" mathsize="144%" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.3.cmml">0.5</mn><mo id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.2" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.2.cmml">⁢</mo><msup id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.cmml"><mrow id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.2.cmml"><mo id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.2" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.2.1.cmml">‖</mo><mrow id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.cmml"><msup id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.cmml"><mi id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.2" mathsize="144%" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.cmml">x</mi><mo id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3" mathsize="144%" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.cmml">∗</mo></msup><mo id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.1" mathsize="144%" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.1.cmml">−</mo><msubsup id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.cmml"><mi id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2" mathsize="144%" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.cmml">x</mi><mi id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3" mathsize="144%" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.cmml">t</mi><mrow id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml"><mi id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2" mathsize="144%" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.cmml">r</mi><mo 
id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3" mathsize="144%" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3.cmml">o</mi><mo id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1a" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.4" mathsize="144%" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.4.cmml">o</mi><mo id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1b" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.5" mathsize="144%" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.5.cmml">t</mi></mrow></msubsup></mrow><mo id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.3" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.2.1.cmml">‖</mo></mrow><mn id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.3" mathsize="144%" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.3.cmml">2</mn></msup></mrow></mrow><mo id="S7.E2X.3.2.2.m1.1.1.1.1.1.3" maxsize="120%" minsize="120%" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.cmml">)</mo></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S7.E2X.3.2.2.m1.1b"><apply id="S7.E2X.3.2.2.m1.1.1.cmml" xref="S7.E2X.3.2.2.m1.1.1"><eq id="S7.E2X.3.2.2.m1.1.1.2.cmml" xref="S7.E2X.3.2.2.m1.1.1.2"></eq><csymbol cd="latexml" id="S7.E2X.3.2.2.m1.1.1.3.cmml" xref="S7.E2X.3.2.2.m1.1.1.3">absent</csymbol><apply id="S7.E2X.3.2.2.m1.1.1.1.cmml" xref="S7.E2X.3.2.2.m1.1.1.1"><times id="S7.E2X.3.2.2.m1.1.1.1.2.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.2"></times><cn id="S7.E2X.3.2.2.m1.1.1.1.3.cmml" type="float" xref="S7.E2X.3.2.2.m1.1.1.1.3">0.6</cn><ci id="S7.E2X.3.2.2.m1.1.1.1.4a.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.4"><mtext id="S7.E2X.3.2.2.m1.1.1.1.4.cmml" mathsize="144%" xref="S7.E2X.3.2.2.m1.1.1.1.4">exp</mtext></ci><apply id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1"><minus id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.2.cmml" 
xref="S7.E2X.3.2.2.m1.1.1.1.1.1"></minus><apply id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1"><times id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.2.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.2"></times><cn id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.3.cmml" type="float" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.3">0.5</cn><apply id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1">superscript</csymbol><apply id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1"><csymbol cd="latexml" id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.2">norm</csymbol><apply id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1"><minus id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.1"></minus><apply id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2">superscript</csymbol><ci id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.2">𝑥</ci><times id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3"></times></apply><apply id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.1.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3">superscript</csymbol><apply id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.1.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3">subscript</csymbol><ci 
id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2">𝑥</ci><ci id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3">𝑡</ci></apply><apply id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3"><times id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1"></times><ci id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2">𝑟</ci><ci id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3">𝑜</ci><ci id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.4.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.4">𝑜</ci><ci id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.5.cmml" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.5">𝑡</ci></apply></apply></apply></apply><cn id="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.3.cmml" type="integer" xref="S7.E2X.3.2.2.m1.1.1.1.1.1.1.1.1.3">2</cn></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.E2X.3.2.2.m1.1c">\displaystyle=0.6\ \text{exp}\big{(}-0.5\left\|x^{*}-x_{t}^{root}\right\|^{2}% \big{)}</annotation><annotation encoding="application/x-llamapun" id="S7.E2X.3.2.2.m1.1d">= 0.6 exp ( - 0.5 ∥ italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT - italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_r italic_o italic_o italic_t end_POSTSUPERSCRIPT ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT )</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="3"><span class="ltx_tag ltx_tag_equationgroup ltx_align_right">(2)</span></td> </tr> <tr class="ltx_equation ltx_eqn_row ltx_align_baseline" id="S7.E2Xa"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td 
class="ltx_td ltx_eqn_cell"></td> <td class="ltx_td ltx_align_left ltx_eqn_cell"><math alttext="\displaystyle+0.2\ \text{exp}\big{(}-2.0\left\|g_{t}^{vel}-d_{t}^{*}\cdot\dot{% x}^{root}_{t}\right\|^{2}\big{)}" class="ltx_Math" display="inline" id="S7.E2Xa.2.1.1.m1.1"><semantics id="S7.E2Xa.2.1.1.m1.1a"><mrow id="S7.E2Xa.2.1.1.m1.1.1" xref="S7.E2Xa.2.1.1.m1.1.1.cmml"><mo id="S7.E2Xa.2.1.1.m1.1.1a" mathsize="144%" xref="S7.E2Xa.2.1.1.m1.1.1.cmml">+</mo><mrow id="S7.E2Xa.2.1.1.m1.1.1.1" xref="S7.E2Xa.2.1.1.m1.1.1.1.cmml"><mn id="S7.E2Xa.2.1.1.m1.1.1.1.3" mathsize="144%" xref="S7.E2Xa.2.1.1.m1.1.1.1.3.cmml">0.2</mn><mo id="S7.E2Xa.2.1.1.m1.1.1.1.2" lspace="0.720em" xref="S7.E2Xa.2.1.1.m1.1.1.1.2.cmml">⁢</mo><mtext id="S7.E2Xa.2.1.1.m1.1.1.1.4" mathsize="144%" xref="S7.E2Xa.2.1.1.m1.1.1.1.4a.cmml">exp</mtext><mo id="S7.E2Xa.2.1.1.m1.1.1.1.2a" xref="S7.E2Xa.2.1.1.m1.1.1.1.2.cmml">⁢</mo><mrow id="S7.E2Xa.2.1.1.m1.1.1.1.1.1" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.cmml"><mo id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.2" maxsize="120%" minsize="120%" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.cmml">(</mo><mrow id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.cmml"><mo id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1a" mathsize="144%" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.cmml">−</mo><mrow id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.cmml"><mn id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.3" mathsize="144%" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.3.cmml">2.0</mn><mo id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.2" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.2.cmml">⁢</mo><msup id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.cmml"><mrow id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.2.cmml"><mo id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.2" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.2.1.cmml">‖</mo><mrow id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.cmml"><msubsup 
id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.cmml"><mi id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2" mathsize="144%" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2.cmml">g</mi><mi id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3" mathsize="144%" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.cmml">t</mi><mrow id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.3" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.cmml"><mi id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.2" mathsize="144%" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.2.cmml">v</mi><mo id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.1" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.3" mathsize="144%" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.3.cmml">e</mi><mo id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.1a" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.4" mathsize="144%" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.4.cmml">l</mi></mrow></msubsup><mo id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.1" mathsize="144%" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.1.cmml">−</mo><mrow id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.cmml"><msubsup id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.2" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.cmml"><mi id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.2" mathsize="144%" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.2.cmml">d</mi><mi id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.3" mathsize="144%" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.3.cmml">t</mi><mo id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3" mathsize="144%" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.cmml">∗</mo></msubsup><mo id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.1" lspace="0.222em" mathsize="144%" rspace="0.222em" 
xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.1.cmml">⋅</mo><msubsup id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml"><mover accent="true" id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2.cmml"><mi id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2.2" mathsize="144%" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2.2.cmml">x</mi><mo id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2.1" mathsize="144%" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2.1.cmml">˙</mo></mover><mi id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3" mathsize="144%" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3.cmml">t</mi><mrow id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.cmml"><mi id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.2" mathsize="144%" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.2.cmml">r</mi><mo id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.1" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.1.cmml">⁢</mo><mi id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.3" mathsize="144%" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.3.cmml">o</mi><mo id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.1a" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.1.cmml">⁢</mo><mi id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.4" mathsize="144%" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.4.cmml">o</mi><mo id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.1b" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.1.cmml">⁢</mo><mi id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.5" mathsize="144%" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.5.cmml">t</mi></mrow></msubsup></mrow></mrow><mo id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.3" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.2.1.cmml">‖</mo></mrow><mn id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.3" mathsize="144%" 
xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.3.cmml">2</mn></msup></mrow></mrow><mo id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.3" maxsize="120%" minsize="120%" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.cmml">)</mo></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S7.E2Xa.2.1.1.m1.1b"><apply id="S7.E2Xa.2.1.1.m1.1.1.cmml" xref="S7.E2Xa.2.1.1.m1.1.1"><plus id="S7.E2Xa.2.1.1.m1.1.1.2.cmml" xref="S7.E2Xa.2.1.1.m1.1.1"></plus><apply id="S7.E2Xa.2.1.1.m1.1.1.1.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1"><times id="S7.E2Xa.2.1.1.m1.1.1.1.2.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.2"></times><cn id="S7.E2Xa.2.1.1.m1.1.1.1.3.cmml" type="float" xref="S7.E2Xa.2.1.1.m1.1.1.1.3">0.2</cn><ci id="S7.E2Xa.2.1.1.m1.1.1.1.4a.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.4"><mtext id="S7.E2Xa.2.1.1.m1.1.1.1.4.cmml" mathsize="144%" xref="S7.E2Xa.2.1.1.m1.1.1.1.4">exp</mtext></ci><apply id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1"><minus id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.2.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1"></minus><apply id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1"><times id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.2.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.2"></times><cn id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.3.cmml" type="float" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.3">2.0</cn><apply id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1">superscript</csymbol><apply id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1"><csymbol cd="latexml" id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.2">norm</csymbol><apply id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1"><minus id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.1"></minus><apply 
id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2">superscript</csymbol><apply id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.1.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2">subscript</csymbol><ci id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2">𝑔</ci><ci id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3">𝑡</ci></apply><apply id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.3"><times id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.1.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.1"></times><ci id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.2.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.2">𝑣</ci><ci id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.3.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.3">𝑒</ci><ci id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.4.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.4">𝑙</ci></apply></apply><apply id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3"><ci id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.1.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.1">⋅</ci><apply id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.2"><csymbol cd="ambiguous" id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.1.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.2">superscript</csymbol><apply id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.2"><csymbol cd="ambiguous" 
id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.1.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.2">subscript</csymbol><ci id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.2.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.2">𝑑</ci><ci id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.3.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.3">𝑡</ci></apply><times id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3"></times></apply><apply id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3"><csymbol cd="ambiguous" id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3">subscript</csymbol><apply id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3"><csymbol cd="ambiguous" id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.1.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3">superscript</csymbol><apply id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2"><ci id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2.1.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2.1">˙</ci><ci id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2.2.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2.2">𝑥</ci></apply><apply id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3"><times id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.1.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.1"></times><ci id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.2.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.2">𝑟</ci><ci id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.3.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.3">𝑜</ci><ci id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.4.cmml" 
xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.4">𝑜</ci><ci id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.5.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.5">𝑡</ci></apply></apply><ci id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3.cmml" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3">𝑡</ci></apply></apply></apply></apply><cn id="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.3.cmml" type="integer" xref="S7.E2Xa.2.1.1.m1.1.1.1.1.1.1.1.1.3">2</cn></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.E2Xa.2.1.1.m1.1c">\displaystyle+0.2\ \text{exp}\big{(}-2.0\left\|g_{t}^{vel}-d_{t}^{*}\cdot\dot{% x}^{root}_{t}\right\|^{2}\big{)}</annotation><annotation encoding="application/x-llamapun" id="S7.E2Xa.2.1.1.m1.1d">+ 0.2 exp ( - 2.0 ∥ italic_g start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_v italic_e italic_l end_POSTSUPERSCRIPT - italic_d start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ⋅ over˙ start_ARG italic_x end_ARG start_POSTSUPERSCRIPT italic_r italic_o italic_o italic_t end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT )</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> </tr> <tr class="ltx_equation ltx_eqn_row ltx_align_baseline" id="S7.E2Xb"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_eqn_cell"></td> <td class="ltx_td ltx_align_left ltx_eqn_cell"><math alttext="\displaystyle+0.2\ \left\|d^{*}_{t}\cdot d^{facing}_{t}\right\|^{2}" class="ltx_Math" display="inline" id="S7.E2Xb.2.1.1.m1.1"><semantics id="S7.E2Xb.2.1.1.m1.1a"><mrow id="S7.E2Xb.2.1.1.m1.1.1" xref="S7.E2Xb.2.1.1.m1.1.1.cmml"><mo id="S7.E2Xb.2.1.1.m1.1.1a" mathsize="144%" xref="S7.E2Xb.2.1.1.m1.1.1.cmml">+</mo><mrow id="S7.E2Xb.2.1.1.m1.1.1.1" xref="S7.E2Xb.2.1.1.m1.1.1.1.cmml"><mn id="S7.E2Xb.2.1.1.m1.1.1.1.3" mathsize="144%" 
xref="S7.E2Xb.2.1.1.m1.1.1.1.3.cmml">0.2</mn><mo id="S7.E2Xb.2.1.1.m1.1.1.1.2" lspace="0.720em" xref="S7.E2Xb.2.1.1.m1.1.1.1.2.cmml">⁢</mo><msup id="S7.E2Xb.2.1.1.m1.1.1.1.1" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.cmml"><mrow id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.2.cmml"><mo id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.2" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.2.1.cmml">‖</mo><mrow id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.cmml"><msubsup id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.2" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.2.cmml"><mi id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.2.2.2" mathsize="144%" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.2.2.2.cmml">d</mi><mi id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.2.3" mathsize="144%" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.2.3.cmml">t</mi><mo id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.2.2.3" mathsize="144%" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.2.2.3.cmml">∗</mo></msubsup><mo id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.1" lspace="0.222em" mathsize="144%" rspace="0.222em" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.1.cmml">⋅</mo><msubsup id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.cmml"><mi id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.2" mathsize="144%" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.2.cmml">d</mi><mi id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.3" mathsize="144%" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.3.cmml">t</mi><mrow id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.cmml"><mi id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.2" mathsize="144%" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.2.cmml">f</mi><mo id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.1" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.1.cmml">⁢</mo><mi id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.3" mathsize="144%" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.3.cmml">a</mi><mo id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.1a" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.1.cmml">⁢</mo><mi id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.4" mathsize="144%" 
xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.4.cmml">c</mi><mo id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.1b" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.1.cmml">⁢</mo><mi id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.5" mathsize="144%" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.5.cmml">i</mi><mo id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.1c" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.1.cmml">⁢</mo><mi id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.6" mathsize="144%" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.6.cmml">n</mi><mo id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.1d" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.1.cmml">⁢</mo><mi id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.7" mathsize="144%" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.7.cmml">g</mi></mrow></msubsup></mrow><mo id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.3" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.2.1.cmml">‖</mo></mrow><mn id="S7.E2Xb.2.1.1.m1.1.1.1.1.3" mathsize="144%" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.3.cmml">2</mn></msup></mrow></mrow><annotation-xml encoding="MathML-Content" id="S7.E2Xb.2.1.1.m1.1b"><apply id="S7.E2Xb.2.1.1.m1.1.1.cmml" xref="S7.E2Xb.2.1.1.m1.1.1"><plus id="S7.E2Xb.2.1.1.m1.1.1.2.cmml" xref="S7.E2Xb.2.1.1.m1.1.1"></plus><apply id="S7.E2Xb.2.1.1.m1.1.1.1.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1"><times id="S7.E2Xb.2.1.1.m1.1.1.1.2.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.2"></times><cn id="S7.E2Xb.2.1.1.m1.1.1.1.3.cmml" type="float" xref="S7.E2Xb.2.1.1.m1.1.1.1.3">0.2</cn><apply id="S7.E2Xb.2.1.1.m1.1.1.1.1.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1"><csymbol cd="ambiguous" id="S7.E2Xb.2.1.1.m1.1.1.1.1.2.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1">superscript</csymbol><apply id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.2.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1"><csymbol cd="latexml" id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.2.1.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.2">norm</csymbol><apply id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1"><ci id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.1.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.1">⋅</ci><apply 
id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.2.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.2.1.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.2">subscript</csymbol><apply id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.2.2.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.2.2.1.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.2">superscript</csymbol><ci id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.2.2.2.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.2.2.2">𝑑</ci><times id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.2.2.3.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.2.2.3"></times></apply><ci id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.2.3.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.2.3">𝑡</ci></apply><apply id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.1.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3">subscript</csymbol><apply id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.1.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3">superscript</csymbol><ci id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.2.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.2">𝑑</ci><apply id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3"><times id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.1.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.1"></times><ci id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.2.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.2">𝑓</ci><ci id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.3.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.3">𝑎</ci><ci id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.4.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.4">𝑐</ci><ci id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.5.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.5">𝑖</ci><ci id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.6.cmml" 
xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.6">𝑛</ci><ci id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.7.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.2.3.7">𝑔</ci></apply></apply><ci id="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.3.cmml" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.1.1.1.3.3">𝑡</ci></apply></apply></apply><cn id="S7.E2Xb.2.1.1.m1.1.1.1.1.3.cmml" type="integer" xref="S7.E2Xb.2.1.1.m1.1.1.1.1.3">2</cn></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.E2Xb.2.1.1.m1.1c">\displaystyle+0.2\ \left\|d^{*}_{t}\cdot d^{facing}_{t}\right\|^{2}</annotation><annotation encoding="application/x-llamapun" id="S7.E2Xb.2.1.1.m1.1d">+ 0.2 ∥ italic_d start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ⋅ italic_d start_POSTSUPERSCRIPT italic_f italic_a italic_c italic_i italic_n italic_g end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> </tr> </tbody> </table> </div> <div class="ltx_para" id="S7.I1.i1.p4"> <table class="ltx_equation ltx_eqn_table" id="S7.E3"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="r_{t}^{near}=\text{exp}\big{(}-10.0\left\|x^{*}-x_{t}^{root}\right\|^{2}\big{)}" class="ltx_Math" display="block" id="S7.E3.m1.1"><semantics id="S7.E3.m1.1a"><mrow id="S7.E3.m1.1.1" xref="S7.E3.m1.1.1.cmml"><msubsup id="S7.E3.m1.1.1.3" xref="S7.E3.m1.1.1.3.cmml"><mi id="S7.E3.m1.1.1.3.2.2" mathsize="144%" xref="S7.E3.m1.1.1.3.2.2.cmml">r</mi><mi id="S7.E3.m1.1.1.3.2.3" mathsize="144%" xref="S7.E3.m1.1.1.3.2.3.cmml">t</mi><mrow id="S7.E3.m1.1.1.3.3" xref="S7.E3.m1.1.1.3.3.cmml"><mi id="S7.E3.m1.1.1.3.3.2" mathsize="144%" xref="S7.E3.m1.1.1.3.3.2.cmml">n</mi><mo id="S7.E3.m1.1.1.3.3.1" xref="S7.E3.m1.1.1.3.3.1.cmml">⁢</mo><mi 
id="S7.E3.m1.1.1.3.3.3" mathsize="144%" xref="S7.E3.m1.1.1.3.3.3.cmml">e</mi><mo id="S7.E3.m1.1.1.3.3.1a" xref="S7.E3.m1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E3.m1.1.1.3.3.4" mathsize="144%" xref="S7.E3.m1.1.1.3.3.4.cmml">a</mi><mo id="S7.E3.m1.1.1.3.3.1b" xref="S7.E3.m1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E3.m1.1.1.3.3.5" mathsize="144%" xref="S7.E3.m1.1.1.3.3.5.cmml">r</mi></mrow></msubsup><mo id="S7.E3.m1.1.1.2" mathsize="144%" xref="S7.E3.m1.1.1.2.cmml">=</mo><mrow id="S7.E3.m1.1.1.1" xref="S7.E3.m1.1.1.1.cmml"><mtext id="S7.E3.m1.1.1.1.3" mathsize="144%" xref="S7.E3.m1.1.1.1.3a.cmml">exp</mtext><mo id="S7.E3.m1.1.1.1.2" xref="S7.E3.m1.1.1.1.2.cmml">⁢</mo><mrow id="S7.E3.m1.1.1.1.1.1" xref="S7.E3.m1.1.1.1.1.1.1.cmml"><mo id="S7.E3.m1.1.1.1.1.1.2" maxsize="120%" minsize="120%" xref="S7.E3.m1.1.1.1.1.1.1.cmml">(</mo><mrow id="S7.E3.m1.1.1.1.1.1.1" xref="S7.E3.m1.1.1.1.1.1.1.cmml"><mo id="S7.E3.m1.1.1.1.1.1.1a" mathsize="144%" xref="S7.E3.m1.1.1.1.1.1.1.cmml">−</mo><mrow id="S7.E3.m1.1.1.1.1.1.1.1" xref="S7.E3.m1.1.1.1.1.1.1.1.cmml"><mn id="S7.E3.m1.1.1.1.1.1.1.1.3" mathsize="144%" xref="S7.E3.m1.1.1.1.1.1.1.1.3.cmml">10.0</mn><mo id="S7.E3.m1.1.1.1.1.1.1.1.2" xref="S7.E3.m1.1.1.1.1.1.1.1.2.cmml">⁢</mo><msup id="S7.E3.m1.1.1.1.1.1.1.1.1" xref="S7.E3.m1.1.1.1.1.1.1.1.1.cmml"><mrow id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.2.cmml"><mo id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.2" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.2.1.cmml">‖</mo><mrow id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.cmml"><msup id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.2" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.2.cmml"><mi id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.2.2" mathsize="144%" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.cmml">x</mi><mo id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.2.3" mathsize="144%" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.cmml">∗</mo></msup><mo id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1" mathsize="144%" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.cmml">−</mo><msubsup 
id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.cmml"><mi id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2" mathsize="144%" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.cmml">x</mi><mi id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3" mathsize="144%" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.cmml">t</mi><mrow id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml"><mi id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2" mathsize="144%" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.cmml">r</mi><mo id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3" mathsize="144%" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3.cmml">o</mi><mo id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1a" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.4" mathsize="144%" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.4.cmml">o</mi><mo id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1b" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.5" mathsize="144%" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.5.cmml">t</mi></mrow></msubsup></mrow><mo id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.3" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.2.1.cmml">‖</mo></mrow><mn id="S7.E3.m1.1.1.1.1.1.1.1.1.3" mathsize="144%" xref="S7.E3.m1.1.1.1.1.1.1.1.1.3.cmml">2</mn></msup></mrow></mrow><mo id="S7.E3.m1.1.1.1.1.1.3" maxsize="120%" minsize="120%" xref="S7.E3.m1.1.1.1.1.1.1.cmml">)</mo></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S7.E3.m1.1b"><apply id="S7.E3.m1.1.1.cmml" xref="S7.E3.m1.1.1"><eq id="S7.E3.m1.1.1.2.cmml" xref="S7.E3.m1.1.1.2"></eq><apply id="S7.E3.m1.1.1.3.cmml" xref="S7.E3.m1.1.1.3"><csymbol cd="ambiguous" id="S7.E3.m1.1.1.3.1.cmml" xref="S7.E3.m1.1.1.3">superscript</csymbol><apply id="S7.E3.m1.1.1.3.2.cmml" xref="S7.E3.m1.1.1.3"><csymbol cd="ambiguous" id="S7.E3.m1.1.1.3.2.1.cmml" 
xref="S7.E3.m1.1.1.3">subscript</csymbol><ci id="S7.E3.m1.1.1.3.2.2.cmml" xref="S7.E3.m1.1.1.3.2.2">𝑟</ci><ci id="S7.E3.m1.1.1.3.2.3.cmml" xref="S7.E3.m1.1.1.3.2.3">𝑡</ci></apply><apply id="S7.E3.m1.1.1.3.3.cmml" xref="S7.E3.m1.1.1.3.3"><times id="S7.E3.m1.1.1.3.3.1.cmml" xref="S7.E3.m1.1.1.3.3.1"></times><ci id="S7.E3.m1.1.1.3.3.2.cmml" xref="S7.E3.m1.1.1.3.3.2">𝑛</ci><ci id="S7.E3.m1.1.1.3.3.3.cmml" xref="S7.E3.m1.1.1.3.3.3">𝑒</ci><ci id="S7.E3.m1.1.1.3.3.4.cmml" xref="S7.E3.m1.1.1.3.3.4">𝑎</ci><ci id="S7.E3.m1.1.1.3.3.5.cmml" xref="S7.E3.m1.1.1.3.3.5">𝑟</ci></apply></apply><apply id="S7.E3.m1.1.1.1.cmml" xref="S7.E3.m1.1.1.1"><times id="S7.E3.m1.1.1.1.2.cmml" xref="S7.E3.m1.1.1.1.2"></times><ci id="S7.E3.m1.1.1.1.3a.cmml" xref="S7.E3.m1.1.1.1.3"><mtext id="S7.E3.m1.1.1.1.3.cmml" mathsize="144%" xref="S7.E3.m1.1.1.1.3">exp</mtext></ci><apply id="S7.E3.m1.1.1.1.1.1.1.cmml" xref="S7.E3.m1.1.1.1.1.1"><minus id="S7.E3.m1.1.1.1.1.1.1.2.cmml" xref="S7.E3.m1.1.1.1.1.1"></minus><apply id="S7.E3.m1.1.1.1.1.1.1.1.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1"><times id="S7.E3.m1.1.1.1.1.1.1.1.2.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1.2"></times><cn id="S7.E3.m1.1.1.1.1.1.1.1.3.cmml" type="float" xref="S7.E3.m1.1.1.1.1.1.1.1.3">10.0</cn><apply id="S7.E3.m1.1.1.1.1.1.1.1.1.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S7.E3.m1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1.1">superscript</csymbol><apply id="S7.E3.m1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1"><csymbol cd="latexml" id="S7.E3.m1.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.2">norm</csymbol><apply id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1"><minus id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.1"></minus><apply id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.2.1.cmml" 
xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.2">superscript</csymbol><ci id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.2.2">𝑥</ci><times id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.2.3"></times></apply><apply id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.1.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3">superscript</csymbol><apply id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.1.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3">subscript</csymbol><ci id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2">𝑥</ci><ci id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3">𝑡</ci></apply><apply id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3"><times id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1"></times><ci id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2">𝑟</ci><ci id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3">𝑜</ci><ci id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.4.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.4">𝑜</ci><ci id="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.5.cmml" xref="S7.E3.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.5">𝑡</ci></apply></apply></apply></apply><cn id="S7.E3.m1.1.1.1.1.1.1.1.1.3.cmml" type="integer" xref="S7.E3.m1.1.1.1.1.1.1.1.1.3">2</cn></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.E3.m1.1c">r_{t}^{near}=\text{exp}\big{(}-10.0\left\|x^{*}-x_{t}^{root}\right\|^{2}\big{)}</annotation><annotation encoding="application/x-llamapun" id="S7.E3.m1.1d">italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 
italic_n italic_e italic_a italic_r end_POSTSUPERSCRIPT = exp ( - 10.0 ∥ italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT - italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_r italic_o italic_o italic_t end_POSTSUPERSCRIPT ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT )</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(3)</span></td> </tr></tbody> </table> </div> <div class="ltx_para" id="S7.I1.i1.p5"> <table class="ltx_equation ltx_eqn_table" id="S7.E4"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="r_{t}^{still}=\text{exp}\big{(}-2.0\left\|\dot{x}^{root}_{t}-\dot{x}^{root}_{t% -1}\right\|^{2})" class="ltx_Math" display="block" id="S7.E4.m1.1"><semantics id="S7.E4.m1.1a"><mrow id="S7.E4.m1.1.1" xref="S7.E4.m1.1.1.cmml"><msubsup id="S7.E4.m1.1.1.3" xref="S7.E4.m1.1.1.3.cmml"><mi id="S7.E4.m1.1.1.3.2.2" mathsize="144%" xref="S7.E4.m1.1.1.3.2.2.cmml">r</mi><mi id="S7.E4.m1.1.1.3.2.3" mathsize="144%" xref="S7.E4.m1.1.1.3.2.3.cmml">t</mi><mrow id="S7.E4.m1.1.1.3.3" xref="S7.E4.m1.1.1.3.3.cmml"><mi id="S7.E4.m1.1.1.3.3.2" mathsize="144%" xref="S7.E4.m1.1.1.3.3.2.cmml">s</mi><mo id="S7.E4.m1.1.1.3.3.1" xref="S7.E4.m1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E4.m1.1.1.3.3.3" mathsize="144%" xref="S7.E4.m1.1.1.3.3.3.cmml">t</mi><mo id="S7.E4.m1.1.1.3.3.1a" xref="S7.E4.m1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E4.m1.1.1.3.3.4" mathsize="144%" xref="S7.E4.m1.1.1.3.3.4.cmml">i</mi><mo id="S7.E4.m1.1.1.3.3.1b" xref="S7.E4.m1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E4.m1.1.1.3.3.5" mathsize="144%" xref="S7.E4.m1.1.1.3.3.5.cmml">l</mi><mo id="S7.E4.m1.1.1.3.3.1c" xref="S7.E4.m1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E4.m1.1.1.3.3.6" mathsize="144%" 
xref="S7.E4.m1.1.1.3.3.6.cmml">l</mi></mrow></msubsup><mo id="S7.E4.m1.1.1.2" mathsize="144%" xref="S7.E4.m1.1.1.2.cmml">=</mo><mrow id="S7.E4.m1.1.1.1" xref="S7.E4.m1.1.1.1.cmml"><mtext id="S7.E4.m1.1.1.1.3" mathsize="144%" xref="S7.E4.m1.1.1.1.3a.cmml">exp</mtext><mo id="S7.E4.m1.1.1.1.2" xref="S7.E4.m1.1.1.1.2.cmml">⁢</mo><mrow id="S7.E4.m1.1.1.1.1.1" xref="S7.E4.m1.1.1.1.1.1.1.cmml"><mo id="S7.E4.m1.1.1.1.1.1.2" maxsize="120%" minsize="120%" xref="S7.E4.m1.1.1.1.1.1.1.cmml">(</mo><mrow id="S7.E4.m1.1.1.1.1.1.1" xref="S7.E4.m1.1.1.1.1.1.1.cmml"><mo id="S7.E4.m1.1.1.1.1.1.1a" mathsize="144%" xref="S7.E4.m1.1.1.1.1.1.1.cmml">−</mo><mrow id="S7.E4.m1.1.1.1.1.1.1.1" xref="S7.E4.m1.1.1.1.1.1.1.1.cmml"><mn id="S7.E4.m1.1.1.1.1.1.1.1.3" mathsize="144%" xref="S7.E4.m1.1.1.1.1.1.1.1.3.cmml">2.0</mn><mo id="S7.E4.m1.1.1.1.1.1.1.1.2" xref="S7.E4.m1.1.1.1.1.1.1.1.2.cmml">⁢</mo><msup id="S7.E4.m1.1.1.1.1.1.1.1.1" xref="S7.E4.m1.1.1.1.1.1.1.1.1.cmml"><mrow id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.2.cmml"><mo id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.2" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.2.1.cmml">‖</mo><mrow id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.cmml"><msubsup id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.cmml"><mover accent="true" id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2.cmml"><mi id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2.2" mathsize="144%" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2.2.cmml">x</mi><mo id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2.1" mathsize="144%" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2.1.cmml">˙</mo></mover><mi id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.3" mathsize="144%" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.cmml">t</mi><mrow id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.cmml"><mi id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.2" mathsize="144%" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.2.cmml">r</mi><mo 
id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.1" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.1.cmml">⁢</mo><mi id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.3" mathsize="144%" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.3.cmml">o</mi><mo id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.1a" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.1.cmml">⁢</mo><mi id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.4" mathsize="144%" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.4.cmml">o</mi><mo id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.1b" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.1.cmml">⁢</mo><mi id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.5" mathsize="144%" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.5.cmml">t</mi></mrow></msubsup><mo id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1" mathsize="144%" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.cmml">−</mo><msubsup id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.cmml"><mover accent="true" id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.cmml"><mi id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.2" mathsize="144%" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.2.cmml">x</mi><mo id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.1" mathsize="144%" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.1.cmml">˙</mo></mover><mrow id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.3" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml"><mi id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2" mathsize="144%" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.cmml">t</mi><mo id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1" mathsize="144%" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml">−</mo><mn id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3" mathsize="144%" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3.cmml">1</mn></mrow><mrow id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.cmml"><mi id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.2" mathsize="144%" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.2.cmml">r</mi><mo id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.1" 
xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.1.cmml">⁢</mo><mi id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.3" mathsize="144%" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.3.cmml">o</mi><mo id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.1a" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.1.cmml">⁢</mo><mi id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.4" mathsize="144%" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.4.cmml">o</mi><mo id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.1b" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.1.cmml">⁢</mo><mi id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.5" mathsize="144%" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.5.cmml">t</mi></mrow></msubsup></mrow><mo id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.3" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.2.1.cmml">‖</mo></mrow><mn id="S7.E4.m1.1.1.1.1.1.1.1.1.3" mathsize="144%" xref="S7.E4.m1.1.1.1.1.1.1.1.1.3.cmml">2</mn></msup></mrow></mrow><mo id="S7.E4.m1.1.1.1.1.1.3" maxsize="144%" minsize="144%" xref="S7.E4.m1.1.1.1.1.1.1.cmml">)</mo></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S7.E4.m1.1b"><apply id="S7.E4.m1.1.1.cmml" xref="S7.E4.m1.1.1"><eq id="S7.E4.m1.1.1.2.cmml" xref="S7.E4.m1.1.1.2"></eq><apply id="S7.E4.m1.1.1.3.cmml" xref="S7.E4.m1.1.1.3"><csymbol cd="ambiguous" id="S7.E4.m1.1.1.3.1.cmml" xref="S7.E4.m1.1.1.3">superscript</csymbol><apply id="S7.E4.m1.1.1.3.2.cmml" xref="S7.E4.m1.1.1.3"><csymbol cd="ambiguous" id="S7.E4.m1.1.1.3.2.1.cmml" xref="S7.E4.m1.1.1.3">subscript</csymbol><ci id="S7.E4.m1.1.1.3.2.2.cmml" xref="S7.E4.m1.1.1.3.2.2">𝑟</ci><ci id="S7.E4.m1.1.1.3.2.3.cmml" xref="S7.E4.m1.1.1.3.2.3">𝑡</ci></apply><apply id="S7.E4.m1.1.1.3.3.cmml" xref="S7.E4.m1.1.1.3.3"><times id="S7.E4.m1.1.1.3.3.1.cmml" xref="S7.E4.m1.1.1.3.3.1"></times><ci id="S7.E4.m1.1.1.3.3.2.cmml" xref="S7.E4.m1.1.1.3.3.2">𝑠</ci><ci id="S7.E4.m1.1.1.3.3.3.cmml" xref="S7.E4.m1.1.1.3.3.3">𝑡</ci><ci id="S7.E4.m1.1.1.3.3.4.cmml" xref="S7.E4.m1.1.1.3.3.4">𝑖</ci><ci id="S7.E4.m1.1.1.3.3.5.cmml" xref="S7.E4.m1.1.1.3.3.5">𝑙</ci><ci 
id="S7.E4.m1.1.1.3.3.6.cmml" xref="S7.E4.m1.1.1.3.3.6">𝑙</ci></apply></apply><apply id="S7.E4.m1.1.1.1.cmml" xref="S7.E4.m1.1.1.1"><times id="S7.E4.m1.1.1.1.2.cmml" xref="S7.E4.m1.1.1.1.2"></times><ci id="S7.E4.m1.1.1.1.3a.cmml" xref="S7.E4.m1.1.1.1.3"><mtext id="S7.E4.m1.1.1.1.3.cmml" mathsize="144%" xref="S7.E4.m1.1.1.1.3">exp</mtext></ci><apply id="S7.E4.m1.1.1.1.1.1.1.cmml" xref="S7.E4.m1.1.1.1.1.1"><minus id="S7.E4.m1.1.1.1.1.1.1.2.cmml" xref="S7.E4.m1.1.1.1.1.1"></minus><apply id="S7.E4.m1.1.1.1.1.1.1.1.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1"><times id="S7.E4.m1.1.1.1.1.1.1.1.2.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.2"></times><cn id="S7.E4.m1.1.1.1.1.1.1.1.3.cmml" type="float" xref="S7.E4.m1.1.1.1.1.1.1.1.3">2.0</cn><apply id="S7.E4.m1.1.1.1.1.1.1.1.1.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S7.E4.m1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1">superscript</csymbol><apply id="S7.E4.m1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1"><csymbol cd="latexml" id="S7.E4.m1.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.2">norm</csymbol><apply id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1"><minus id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.1"></minus><apply id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2">subscript</csymbol><apply id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.1.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2">superscript</csymbol><apply id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2"><ci id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2.1.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2.1">˙</ci><ci id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2.2.cmml" 
xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2.2">𝑥</ci></apply><apply id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3"><times id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.1.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.1"></times><ci id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.2.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.2">𝑟</ci><ci id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.3.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.3">𝑜</ci><ci id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.4.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.4">𝑜</ci><ci id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.5.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.5">𝑡</ci></apply></apply><ci id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.2.3">𝑡</ci></apply><apply id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.1.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3">subscript</csymbol><apply id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.1.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3">superscript</csymbol><apply id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2"><ci id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.1.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.1">˙</ci><ci id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.2.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.2">𝑥</ci></apply><apply id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3"><times id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.1.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.1"></times><ci id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.2.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.2">𝑟</ci><ci id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.3.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.3">𝑜</ci><ci 
id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.4.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.4">𝑜</ci><ci id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.5.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.5">𝑡</ci></apply></apply><apply id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.3"><minus id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1"></minus><ci id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.cmml" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2">𝑡</ci><cn id="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3.cmml" type="integer" xref="S7.E4.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3">1</cn></apply></apply></apply></apply><cn id="S7.E4.m1.1.1.1.1.1.1.1.1.3.cmml" type="integer" xref="S7.E4.m1.1.1.1.1.1.1.1.1.3">2</cn></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.E4.m1.1c">r_{t}^{still}=\text{exp}\big{(}-2.0\left\|\dot{x}^{root}_{t}-\dot{x}^{root}_{t% -1}\right\|^{2})</annotation><annotation encoding="application/x-llamapun" id="S7.E4.m1.1d">italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_s italic_t italic_i italic_l italic_l end_POSTSUPERSCRIPT = exp ( - 2.0 ∥ over˙ start_ARG italic_x end_ARG start_POSTSUPERSCRIPT italic_r italic_o italic_o italic_t end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT - over˙ start_ARG italic_x end_ARG start_POSTSUPERSCRIPT italic_r italic_o italic_o italic_t end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t - 1 end_POSTSUBSCRIPT ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT )</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(4)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S7.I1.i1.p5.1"><span class="ltx_text" id="S7.I1.i1.p5.1.1" style="font-size:144%;">The main difference between Walk and Idle 
rewards is that Idle allows a larger distance threshold. The Walk skill is required to reach the target coordinate as closely as possible, whereas Idle is only required to remain within 3 meters of it.</span></p> </div> </li> <li class="ltx_item" id="S7.I1.i2" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S7.I1.i2.p1"> <p class="ltx_p" id="S7.I1.i2.p1.7"><span class="ltx_text ltx_font_bold" id="S7.I1.i2.p1.7.1" style="font-size:144%;">HSI Reward.</span><span class="ltx_text" id="S7.I1.i2.p1.7.2" style="font-size:144%;"> The HSI reward is defined in Eq. </span><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S7.E5" style="font-size:144%;" title="Equation 5 ‣ 2nd item ‣ 7 Reward Templates ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">5</span></a><span class="ltx_text" id="S7.I1.i2.p1.7.3" style="font-size:144%;">. The far reward </span><math alttext="r_{t}^{far}" class="ltx_Math" display="inline" id="S7.I1.i2.p1.1.m1.1"><semantics id="S7.I1.i2.p1.1.m1.1a"><msubsup id="S7.I1.i2.p1.1.m1.1.1" xref="S7.I1.i2.p1.1.m1.1.1.cmml"><mi id="S7.I1.i2.p1.1.m1.1.1.2.2" mathsize="144%" xref="S7.I1.i2.p1.1.m1.1.1.2.2.cmml">r</mi><mi id="S7.I1.i2.p1.1.m1.1.1.2.3" mathsize="144%" xref="S7.I1.i2.p1.1.m1.1.1.2.3.cmml">t</mi><mrow id="S7.I1.i2.p1.1.m1.1.1.3" xref="S7.I1.i2.p1.1.m1.1.1.3.cmml"><mi id="S7.I1.i2.p1.1.m1.1.1.3.2" mathsize="144%" xref="S7.I1.i2.p1.1.m1.1.1.3.2.cmml">f</mi><mo id="S7.I1.i2.p1.1.m1.1.1.3.1" xref="S7.I1.i2.p1.1.m1.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i2.p1.1.m1.1.1.3.3" mathsize="144%" xref="S7.I1.i2.p1.1.m1.1.1.3.3.cmml">a</mi><mo id="S7.I1.i2.p1.1.m1.1.1.3.1a" xref="S7.I1.i2.p1.1.m1.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i2.p1.1.m1.1.1.3.4" mathsize="144%" xref="S7.I1.i2.p1.1.m1.1.1.3.4.cmml">r</mi></mrow></msubsup><annotation-xml encoding="MathML-Content" id="S7.I1.i2.p1.1.m1.1b"><apply id="S7.I1.i2.p1.1.m1.1.1.cmml" 
xref="S7.I1.i2.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S7.I1.i2.p1.1.m1.1.1.1.cmml" xref="S7.I1.i2.p1.1.m1.1.1">superscript</csymbol><apply id="S7.I1.i2.p1.1.m1.1.1.2.cmml" xref="S7.I1.i2.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S7.I1.i2.p1.1.m1.1.1.2.1.cmml" xref="S7.I1.i2.p1.1.m1.1.1">subscript</csymbol><ci id="S7.I1.i2.p1.1.m1.1.1.2.2.cmml" xref="S7.I1.i2.p1.1.m1.1.1.2.2">𝑟</ci><ci id="S7.I1.i2.p1.1.m1.1.1.2.3.cmml" xref="S7.I1.i2.p1.1.m1.1.1.2.3">𝑡</ci></apply><apply id="S7.I1.i2.p1.1.m1.1.1.3.cmml" xref="S7.I1.i2.p1.1.m1.1.1.3"><times id="S7.I1.i2.p1.1.m1.1.1.3.1.cmml" xref="S7.I1.i2.p1.1.m1.1.1.3.1"></times><ci id="S7.I1.i2.p1.1.m1.1.1.3.2.cmml" xref="S7.I1.i2.p1.1.m1.1.1.3.2">𝑓</ci><ci id="S7.I1.i2.p1.1.m1.1.1.3.3.cmml" xref="S7.I1.i2.p1.1.m1.1.1.3.3">𝑎</ci><ci id="S7.I1.i2.p1.1.m1.1.1.3.4.cmml" xref="S7.I1.i2.p1.1.m1.1.1.3.4">𝑟</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.I1.i2.p1.1.m1.1c">r_{t}^{far}</annotation><annotation encoding="application/x-llamapun" id="S7.I1.i2.p1.1.m1.1d">italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_f italic_a italic_r end_POSTSUPERSCRIPT</annotation></semantics></math><span class="ltx_text" id="S7.I1.i2.p1.7.4" style="font-size:144%;"> encourages the humanoid’s pelvis </span><math alttext="x^{root}" class="ltx_Math" display="inline" id="S7.I1.i2.p1.2.m2.1"><semantics id="S7.I1.i2.p1.2.m2.1a"><msup id="S7.I1.i2.p1.2.m2.1.1" xref="S7.I1.i2.p1.2.m2.1.1.cmml"><mi id="S7.I1.i2.p1.2.m2.1.1.2" mathsize="144%" xref="S7.I1.i2.p1.2.m2.1.1.2.cmml">x</mi><mrow id="S7.I1.i2.p1.2.m2.1.1.3" xref="S7.I1.i2.p1.2.m2.1.1.3.cmml"><mi id="S7.I1.i2.p1.2.m2.1.1.3.2" mathsize="144%" xref="S7.I1.i2.p1.2.m2.1.1.3.2.cmml">r</mi><mo id="S7.I1.i2.p1.2.m2.1.1.3.1" xref="S7.I1.i2.p1.2.m2.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i2.p1.2.m2.1.1.3.3" mathsize="144%" xref="S7.I1.i2.p1.2.m2.1.1.3.3.cmml">o</mi><mo id="S7.I1.i2.p1.2.m2.1.1.3.1a" 
xref="S7.I1.i2.p1.2.m2.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i2.p1.2.m2.1.1.3.4" mathsize="144%" xref="S7.I1.i2.p1.2.m2.1.1.3.4.cmml">o</mi><mo id="S7.I1.i2.p1.2.m2.1.1.3.1b" xref="S7.I1.i2.p1.2.m2.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i2.p1.2.m2.1.1.3.5" mathsize="144%" xref="S7.I1.i2.p1.2.m2.1.1.3.5.cmml">t</mi></mrow></msup><annotation-xml encoding="MathML-Content" id="S7.I1.i2.p1.2.m2.1b"><apply id="S7.I1.i2.p1.2.m2.1.1.cmml" xref="S7.I1.i2.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S7.I1.i2.p1.2.m2.1.1.1.cmml" xref="S7.I1.i2.p1.2.m2.1.1">superscript</csymbol><ci id="S7.I1.i2.p1.2.m2.1.1.2.cmml" xref="S7.I1.i2.p1.2.m2.1.1.2">𝑥</ci><apply id="S7.I1.i2.p1.2.m2.1.1.3.cmml" xref="S7.I1.i2.p1.2.m2.1.1.3"><times id="S7.I1.i2.p1.2.m2.1.1.3.1.cmml" xref="S7.I1.i2.p1.2.m2.1.1.3.1"></times><ci id="S7.I1.i2.p1.2.m2.1.1.3.2.cmml" xref="S7.I1.i2.p1.2.m2.1.1.3.2">𝑟</ci><ci id="S7.I1.i2.p1.2.m2.1.1.3.3.cmml" xref="S7.I1.i2.p1.2.m2.1.1.3.3">𝑜</ci><ci id="S7.I1.i2.p1.2.m2.1.1.3.4.cmml" xref="S7.I1.i2.p1.2.m2.1.1.3.4">𝑜</ci><ci id="S7.I1.i2.p1.2.m2.1.1.3.5.cmml" xref="S7.I1.i2.p1.2.m2.1.1.3.5">𝑡</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.I1.i2.p1.2.m2.1c">x^{root}</annotation><annotation encoding="application/x-llamapun" id="S7.I1.i2.p1.2.m2.1d">italic_x start_POSTSUPERSCRIPT italic_r italic_o italic_o italic_t end_POSTSUPERSCRIPT</annotation></semantics></math><span class="ltx_text" id="S7.I1.i2.p1.7.5" style="font-size:144%;"> to reach the target coordinate </span><math alttext="x^{*}" class="ltx_Math" display="inline" id="S7.I1.i2.p1.3.m3.1"><semantics id="S7.I1.i2.p1.3.m3.1a"><msup id="S7.I1.i2.p1.3.m3.1.1" xref="S7.I1.i2.p1.3.m3.1.1.cmml"><mi id="S7.I1.i2.p1.3.m3.1.1.2" mathsize="144%" xref="S7.I1.i2.p1.3.m3.1.1.2.cmml">x</mi><mo id="S7.I1.i2.p1.3.m3.1.1.3" mathsize="144%" xref="S7.I1.i2.p1.3.m3.1.1.3.cmml">∗</mo></msup><annotation-xml encoding="MathML-Content" id="S7.I1.i2.p1.3.m3.1b"><apply id="S7.I1.i2.p1.3.m3.1.1.cmml" 
xref="S7.I1.i2.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S7.I1.i2.p1.3.m3.1.1.1.cmml" xref="S7.I1.i2.p1.3.m3.1.1">superscript</csymbol><ci id="S7.I1.i2.p1.3.m3.1.1.2.cmml" xref="S7.I1.i2.p1.3.m3.1.1.2">𝑥</ci><times id="S7.I1.i2.p1.3.m3.1.1.3.cmml" xref="S7.I1.i2.p1.3.m3.1.1.3"></times></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.I1.i2.p1.3.m3.1c">x^{*}</annotation><annotation encoding="application/x-llamapun" id="S7.I1.i2.p1.3.m3.1d">italic_x start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT</annotation></semantics></math><span class="ltx_text" id="S7.I1.i2.p1.7.6" style="font-size:144%;"> with the target speed </span><math alttext="g_{t}^{vel}" class="ltx_Math" display="inline" id="S7.I1.i2.p1.4.m4.1"><semantics id="S7.I1.i2.p1.4.m4.1a"><msubsup id="S7.I1.i2.p1.4.m4.1.1" xref="S7.I1.i2.p1.4.m4.1.1.cmml"><mi id="S7.I1.i2.p1.4.m4.1.1.2.2" mathsize="144%" xref="S7.I1.i2.p1.4.m4.1.1.2.2.cmml">g</mi><mi id="S7.I1.i2.p1.4.m4.1.1.2.3" mathsize="144%" xref="S7.I1.i2.p1.4.m4.1.1.2.3.cmml">t</mi><mrow id="S7.I1.i2.p1.4.m4.1.1.3" xref="S7.I1.i2.p1.4.m4.1.1.3.cmml"><mi id="S7.I1.i2.p1.4.m4.1.1.3.2" mathsize="144%" xref="S7.I1.i2.p1.4.m4.1.1.3.2.cmml">v</mi><mo id="S7.I1.i2.p1.4.m4.1.1.3.1" xref="S7.I1.i2.p1.4.m4.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i2.p1.4.m4.1.1.3.3" mathsize="144%" xref="S7.I1.i2.p1.4.m4.1.1.3.3.cmml">e</mi><mo id="S7.I1.i2.p1.4.m4.1.1.3.1a" xref="S7.I1.i2.p1.4.m4.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i2.p1.4.m4.1.1.3.4" mathsize="144%" xref="S7.I1.i2.p1.4.m4.1.1.3.4.cmml">l</mi></mrow></msubsup><annotation-xml encoding="MathML-Content" id="S7.I1.i2.p1.4.m4.1b"><apply id="S7.I1.i2.p1.4.m4.1.1.cmml" xref="S7.I1.i2.p1.4.m4.1.1"><csymbol cd="ambiguous" id="S7.I1.i2.p1.4.m4.1.1.1.cmml" xref="S7.I1.i2.p1.4.m4.1.1">superscript</csymbol><apply id="S7.I1.i2.p1.4.m4.1.1.2.cmml" xref="S7.I1.i2.p1.4.m4.1.1"><csymbol cd="ambiguous" id="S7.I1.i2.p1.4.m4.1.1.2.1.cmml" xref="S7.I1.i2.p1.4.m4.1.1">subscript</csymbol><ci id="S7.I1.i2.p1.4.m4.1.1.2.2.cmml" 
xref="S7.I1.i2.p1.4.m4.1.1.2.2">𝑔</ci><ci id="S7.I1.i2.p1.4.m4.1.1.2.3.cmml" xref="S7.I1.i2.p1.4.m4.1.1.2.3">𝑡</ci></apply><apply id="S7.I1.i2.p1.4.m4.1.1.3.cmml" xref="S7.I1.i2.p1.4.m4.1.1.3"><times id="S7.I1.i2.p1.4.m4.1.1.3.1.cmml" xref="S7.I1.i2.p1.4.m4.1.1.3.1"></times><ci id="S7.I1.i2.p1.4.m4.1.1.3.2.cmml" xref="S7.I1.i2.p1.4.m4.1.1.3.2">𝑣</ci><ci id="S7.I1.i2.p1.4.m4.1.1.3.3.cmml" xref="S7.I1.i2.p1.4.m4.1.1.3.3">𝑒</ci><ci id="S7.I1.i2.p1.4.m4.1.1.3.4.cmml" xref="S7.I1.i2.p1.4.m4.1.1.3.4">𝑙</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.I1.i2.p1.4.m4.1c">g_{t}^{vel}</annotation><annotation encoding="application/x-llamapun" id="S7.I1.i2.p1.4.m4.1d">italic_g start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_v italic_e italic_l end_POSTSUPERSCRIPT</annotation></semantics></math><span class="ltx_text" id="S7.I1.i2.p1.7.7" style="font-size:144%;"> and target direction </span><math alttext="d^{*}_{t}" class="ltx_Math" display="inline" id="S7.I1.i2.p1.5.m5.1"><semantics id="S7.I1.i2.p1.5.m5.1a"><msubsup id="S7.I1.i2.p1.5.m5.1.1" xref="S7.I1.i2.p1.5.m5.1.1.cmml"><mi id="S7.I1.i2.p1.5.m5.1.1.2.2" mathsize="144%" xref="S7.I1.i2.p1.5.m5.1.1.2.2.cmml">d</mi><mi id="S7.I1.i2.p1.5.m5.1.1.3" mathsize="144%" xref="S7.I1.i2.p1.5.m5.1.1.3.cmml">t</mi><mo id="S7.I1.i2.p1.5.m5.1.1.2.3" mathsize="144%" xref="S7.I1.i2.p1.5.m5.1.1.2.3.cmml">∗</mo></msubsup><annotation-xml encoding="MathML-Content" id="S7.I1.i2.p1.5.m5.1b"><apply id="S7.I1.i2.p1.5.m5.1.1.cmml" xref="S7.I1.i2.p1.5.m5.1.1"><csymbol cd="ambiguous" id="S7.I1.i2.p1.5.m5.1.1.1.cmml" xref="S7.I1.i2.p1.5.m5.1.1">subscript</csymbol><apply id="S7.I1.i2.p1.5.m5.1.1.2.cmml" xref="S7.I1.i2.p1.5.m5.1.1"><csymbol cd="ambiguous" id="S7.I1.i2.p1.5.m5.1.1.2.1.cmml" xref="S7.I1.i2.p1.5.m5.1.1">superscript</csymbol><ci id="S7.I1.i2.p1.5.m5.1.1.2.2.cmml" xref="S7.I1.i2.p1.5.m5.1.1.2.2">𝑑</ci><times id="S7.I1.i2.p1.5.m5.1.1.2.3.cmml" 
xref="S7.I1.i2.p1.5.m5.1.1.2.3"></times></apply><ci id="S7.I1.i2.p1.5.m5.1.1.3.cmml" xref="S7.I1.i2.p1.5.m5.1.1.3">𝑡</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.I1.i2.p1.5.m5.1c">d^{*}_{t}</annotation><annotation encoding="application/x-llamapun" id="S7.I1.i2.p1.5.m5.1d">italic_d start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math><span class="ltx_text" id="S7.I1.i2.p1.7.8" style="font-size:144%;">. Like UniHSI </span><cite class="ltx_cite ltx_citemacro_cite"><span class="ltx_text" id="S7.I1.i2.p1.7.9.1" style="font-size:144%;">[</span><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib47" title=""><span class="ltx_text" style="font-size:90%;">47</span></a><span class="ltx_text" id="S7.I1.i2.p1.7.10.2" style="font-size:144%;">]</span></cite><span class="ltx_text" id="S7.I1.i2.p1.7.11" style="font-size:144%;">, the near reward </span><math alttext="r_{t}^{near}" class="ltx_Math" display="inline" id="S7.I1.i2.p1.6.m6.1"><semantics id="S7.I1.i2.p1.6.m6.1a"><msubsup id="S7.I1.i2.p1.6.m6.1.1" xref="S7.I1.i2.p1.6.m6.1.1.cmml"><mi id="S7.I1.i2.p1.6.m6.1.1.2.2" mathsize="144%" xref="S7.I1.i2.p1.6.m6.1.1.2.2.cmml">r</mi><mi id="S7.I1.i2.p1.6.m6.1.1.2.3" mathsize="144%" xref="S7.I1.i2.p1.6.m6.1.1.2.3.cmml">t</mi><mrow id="S7.I1.i2.p1.6.m6.1.1.3" xref="S7.I1.i2.p1.6.m6.1.1.3.cmml"><mi id="S7.I1.i2.p1.6.m6.1.1.3.2" mathsize="144%" xref="S7.I1.i2.p1.6.m6.1.1.3.2.cmml">n</mi><mo id="S7.I1.i2.p1.6.m6.1.1.3.1" xref="S7.I1.i2.p1.6.m6.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i2.p1.6.m6.1.1.3.3" mathsize="144%" xref="S7.I1.i2.p1.6.m6.1.1.3.3.cmml">e</mi><mo id="S7.I1.i2.p1.6.m6.1.1.3.1a" xref="S7.I1.i2.p1.6.m6.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i2.p1.6.m6.1.1.3.4" mathsize="144%" xref="S7.I1.i2.p1.6.m6.1.1.3.4.cmml">a</mi><mo id="S7.I1.i2.p1.6.m6.1.1.3.1b" xref="S7.I1.i2.p1.6.m6.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i2.p1.6.m6.1.1.3.5" mathsize="144%" 
xref="S7.I1.i2.p1.6.m6.1.1.3.5.cmml">r</mi></mrow></msubsup><annotation-xml encoding="MathML-Content" id="S7.I1.i2.p1.6.m6.1b"><apply id="S7.I1.i2.p1.6.m6.1.1.cmml" xref="S7.I1.i2.p1.6.m6.1.1"><csymbol cd="ambiguous" id="S7.I1.i2.p1.6.m6.1.1.1.cmml" xref="S7.I1.i2.p1.6.m6.1.1">superscript</csymbol><apply id="S7.I1.i2.p1.6.m6.1.1.2.cmml" xref="S7.I1.i2.p1.6.m6.1.1"><csymbol cd="ambiguous" id="S7.I1.i2.p1.6.m6.1.1.2.1.cmml" xref="S7.I1.i2.p1.6.m6.1.1">subscript</csymbol><ci id="S7.I1.i2.p1.6.m6.1.1.2.2.cmml" xref="S7.I1.i2.p1.6.m6.1.1.2.2">𝑟</ci><ci id="S7.I1.i2.p1.6.m6.1.1.2.3.cmml" xref="S7.I1.i2.p1.6.m6.1.1.2.3">𝑡</ci></apply><apply id="S7.I1.i2.p1.6.m6.1.1.3.cmml" xref="S7.I1.i2.p1.6.m6.1.1.3"><times id="S7.I1.i2.p1.6.m6.1.1.3.1.cmml" xref="S7.I1.i2.p1.6.m6.1.1.3.1"></times><ci id="S7.I1.i2.p1.6.m6.1.1.3.2.cmml" xref="S7.I1.i2.p1.6.m6.1.1.3.2">𝑛</ci><ci id="S7.I1.i2.p1.6.m6.1.1.3.3.cmml" xref="S7.I1.i2.p1.6.m6.1.1.3.3">𝑒</ci><ci id="S7.I1.i2.p1.6.m6.1.1.3.4.cmml" xref="S7.I1.i2.p1.6.m6.1.1.3.4">𝑎</ci><ci id="S7.I1.i2.p1.6.m6.1.1.3.5.cmml" xref="S7.I1.i2.p1.6.m6.1.1.3.5">𝑟</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.I1.i2.p1.6.m6.1c">r_{t}^{near}</annotation><annotation encoding="application/x-llamapun" id="S7.I1.i2.p1.6.m6.1d">italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n italic_e italic_a italic_r end_POSTSUPERSCRIPT</annotation></semantics></math><span class="ltx_text" id="S7.I1.i2.p1.7.12" style="font-size:144%;"> encourages a designated joint of the humanoid to contact the nearest point in an interactable part </span><math alttext="p" class="ltx_Math" display="inline" id="S7.I1.i2.p1.7.m7.1"><semantics id="S7.I1.i2.p1.7.m7.1a"><mi id="S7.I1.i2.p1.7.m7.1.1" mathsize="144%" xref="S7.I1.i2.p1.7.m7.1.1.cmml">p</mi><annotation-xml encoding="MathML-Content" id="S7.I1.i2.p1.7.m7.1b"><ci id="S7.I1.i2.p1.7.m7.1.1.cmml" xref="S7.I1.i2.p1.7.m7.1.1">𝑝</ci></annotation-xml><annotation 
encoding="application/x-tex" id="S7.I1.i2.p1.7.m7.1c">p</annotation><annotation encoding="application/x-llamapun" id="S7.I1.i2.p1.7.m7.1d">italic_p</annotation></semantics></math><span class="ltx_text" id="S7.I1.i2.p1.7.13" style="font-size:144%;"> of the target object. For Sit, we require the pelvis to contact the target sitting point, while for Lie, we require the pelvis to reach the nearest point on the bed’s surface. For Reach, either the left or the right hand must reach the object’s surface. The task reward is defined as:</span></p> </div> <div class="ltx_para" id="S7.I1.i2.p2"> <table class="ltx_equation ltx_eqn_table" id="S7.E5"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="r^{G}_{t}=\left\{\begin{aligned} 0.7\ r_{t}^{near}&amp;+0.3\ r_{t}^{far},\left\|x_% {t}^{*}-x^{root}_{t}\right\|^{2}&gt;0.5\\ 0.7\ r_{t}^{near}&amp;+0.3,\text{otherwise}\end{aligned}\right." 
class="ltx_math_unparsed" display="block" id="S7.E5.m1.4"><semantics id="S7.E5.m1.4a"><mrow id="S7.E5.m1.4b"><msubsup id="S7.E5.m1.4.5"><mi id="S7.E5.m1.4.5.2.2" mathsize="144%">r</mi><mi id="S7.E5.m1.4.5.3" mathsize="144%">t</mi><mi id="S7.E5.m1.4.5.2.3" mathsize="144%">G</mi></msubsup><mo id="S7.E5.m1.4.6" mathsize="144%">=</mo><mrow id="S7.E5.m1.4.7"><mo id="S7.E5.m1.4.7.1">{</mo><mtable columnspacing="0pt" displaystyle="true" id="S7.E5.m1.4.4" rowspacing="0pt"><mtr id="S7.E5.m1.4.4a"><mtd class="ltx_align_right" columnalign="right" id="S7.E5.m1.4.4b"><mrow id="S7.E5.m1.2.2.2.3.1"><mn id="S7.E5.m1.2.2.2.3.1.2" mathsize="144%">0.7</mn><mo id="S7.E5.m1.2.2.2.3.1.1" lspace="0.720em">⁢</mo><msubsup id="S7.E5.m1.2.2.2.3.1.3"><mi id="S7.E5.m1.2.2.2.3.1.3.2.2" mathsize="144%">r</mi><mi id="S7.E5.m1.2.2.2.3.1.3.2.3" mathsize="144%">t</mi><mrow id="S7.E5.m1.2.2.2.3.1.3.3"><mi id="S7.E5.m1.2.2.2.3.1.3.3.2" mathsize="144%">n</mi><mo id="S7.E5.m1.2.2.2.3.1.3.3.1">⁢</mo><mi id="S7.E5.m1.2.2.2.3.1.3.3.3" mathsize="144%">e</mi><mo id="S7.E5.m1.2.2.2.3.1.3.3.1a">⁢</mo><mi id="S7.E5.m1.2.2.2.3.1.3.3.4" mathsize="144%">a</mi><mo id="S7.E5.m1.2.2.2.3.1.3.3.1b">⁢</mo><mi id="S7.E5.m1.2.2.2.3.1.3.3.5" mathsize="144%">r</mi></mrow></msubsup></mrow></mtd><mtd class="ltx_align_left" columnalign="left" id="S7.E5.m1.4.4c"><mrow id="S7.E5.m1.2.2.2.2.2"><mrow id="S7.E5.m1.2.2.2.2.2.2.2"><mrow id="S7.E5.m1.1.1.1.1.1.1.1.1"><mo id="S7.E5.m1.1.1.1.1.1.1.1.1a" mathsize="144%">+</mo><mrow id="S7.E5.m1.1.1.1.1.1.1.1.1.2"><mn id="S7.E5.m1.1.1.1.1.1.1.1.1.2.2" mathsize="144%">0.3</mn><mo id="S7.E5.m1.1.1.1.1.1.1.1.1.2.1" lspace="0.720em">⁢</mo><msubsup id="S7.E5.m1.1.1.1.1.1.1.1.1.2.3"><mi id="S7.E5.m1.1.1.1.1.1.1.1.1.2.3.2.2" mathsize="144%">r</mi><mi id="S7.E5.m1.1.1.1.1.1.1.1.1.2.3.2.3" mathsize="144%">t</mi><mrow id="S7.E5.m1.1.1.1.1.1.1.1.1.2.3.3"><mi id="S7.E5.m1.1.1.1.1.1.1.1.1.2.3.3.2" mathsize="144%">f</mi><mo id="S7.E5.m1.1.1.1.1.1.1.1.1.2.3.3.1">⁢</mo><mi 
id="S7.E5.m1.1.1.1.1.1.1.1.1.2.3.3.3" mathsize="144%">a</mi><mo id="S7.E5.m1.1.1.1.1.1.1.1.1.2.3.3.1a">⁢</mo><mi id="S7.E5.m1.1.1.1.1.1.1.1.1.2.3.3.4" mathsize="144%">r</mi></mrow></msubsup></mrow></mrow><mo id="S7.E5.m1.2.2.2.2.2.2.2.3" mathsize="144%">,</mo><msup id="S7.E5.m1.2.2.2.2.2.2.2.2"><mrow id="S7.E5.m1.2.2.2.2.2.2.2.2.1.1"><mo id="S7.E5.m1.2.2.2.2.2.2.2.2.1.1.2">‖</mo><mrow id="S7.E5.m1.2.2.2.2.2.2.2.2.1.1.1"><msubsup id="S7.E5.m1.2.2.2.2.2.2.2.2.1.1.1.2"><mi id="S7.E5.m1.2.2.2.2.2.2.2.2.1.1.1.2.2.2" mathsize="144%">x</mi><mi id="S7.E5.m1.2.2.2.2.2.2.2.2.1.1.1.2.2.3" mathsize="144%">t</mi><mo id="S7.E5.m1.2.2.2.2.2.2.2.2.1.1.1.2.3" mathsize="144%">∗</mo></msubsup><mo id="S7.E5.m1.2.2.2.2.2.2.2.2.1.1.1.1" mathsize="144%">−</mo><msubsup id="S7.E5.m1.2.2.2.2.2.2.2.2.1.1.1.3"><mi id="S7.E5.m1.2.2.2.2.2.2.2.2.1.1.1.3.2.2" mathsize="144%">x</mi><mi id="S7.E5.m1.2.2.2.2.2.2.2.2.1.1.1.3.3" mathsize="144%">t</mi><mrow id="S7.E5.m1.2.2.2.2.2.2.2.2.1.1.1.3.2.3"><mi id="S7.E5.m1.2.2.2.2.2.2.2.2.1.1.1.3.2.3.2" mathsize="144%">r</mi><mo id="S7.E5.m1.2.2.2.2.2.2.2.2.1.1.1.3.2.3.1">⁢</mo><mi id="S7.E5.m1.2.2.2.2.2.2.2.2.1.1.1.3.2.3.3" mathsize="144%">o</mi><mo id="S7.E5.m1.2.2.2.2.2.2.2.2.1.1.1.3.2.3.1a">⁢</mo><mi id="S7.E5.m1.2.2.2.2.2.2.2.2.1.1.1.3.2.3.4" mathsize="144%">o</mi><mo id="S7.E5.m1.2.2.2.2.2.2.2.2.1.1.1.3.2.3.1b">⁢</mo><mi id="S7.E5.m1.2.2.2.2.2.2.2.2.1.1.1.3.2.3.5" mathsize="144%">t</mi></mrow></msubsup></mrow><mo id="S7.E5.m1.2.2.2.2.2.2.2.2.1.1.3">‖</mo></mrow><mn id="S7.E5.m1.2.2.2.2.2.2.2.2.3" mathsize="144%">2</mn></msup></mrow><mo id="S7.E5.m1.2.2.2.2.2.3" mathsize="144%">&gt;</mo><mn id="S7.E5.m1.2.2.2.2.2.4" mathsize="144%">0.5</mn></mrow></mtd></mtr><mtr id="S7.E5.m1.4.4d"><mtd class="ltx_align_right" columnalign="right" id="S7.E5.m1.4.4e"><mrow id="S7.E5.m1.4.4.4.3.1"><mn id="S7.E5.m1.4.4.4.3.1.2" mathsize="144%">0.7</mn><mo id="S7.E5.m1.4.4.4.3.1.1" lspace="0.720em">⁢</mo><msubsup id="S7.E5.m1.4.4.4.3.1.3"><mi id="S7.E5.m1.4.4.4.3.1.3.2.2" 
mathsize="144%">r</mi><mi id="S7.E5.m1.4.4.4.3.1.3.2.3" mathsize="144%">t</mi><mrow id="S7.E5.m1.4.4.4.3.1.3.3"><mi id="S7.E5.m1.4.4.4.3.1.3.3.2" mathsize="144%">n</mi><mo id="S7.E5.m1.4.4.4.3.1.3.3.1">⁢</mo><mi id="S7.E5.m1.4.4.4.3.1.3.3.3" mathsize="144%">e</mi><mo id="S7.E5.m1.4.4.4.3.1.3.3.1a">⁢</mo><mi id="S7.E5.m1.4.4.4.3.1.3.3.4" mathsize="144%">a</mi><mo id="S7.E5.m1.4.4.4.3.1.3.3.1b">⁢</mo><mi id="S7.E5.m1.4.4.4.3.1.3.3.5" mathsize="144%">r</mi></mrow></msubsup></mrow></mtd><mtd class="ltx_align_left" columnalign="left" id="S7.E5.m1.4.4f"><mrow id="S7.E5.m1.4.4.4.2.2.2"><mrow id="S7.E5.m1.4.4.4.2.2.2.1"><mo id="S7.E5.m1.4.4.4.2.2.2.1a" mathsize="144%">+</mo><mn id="S7.E5.m1.4.4.4.2.2.2.1.2" mathsize="144%">0.3</mn></mrow><mo id="S7.E5.m1.4.4.4.2.2.2.2" mathsize="144%">,</mo><mtext id="S7.E5.m1.3.3.3.1.1.1" mathsize="144%">otherwise</mtext></mrow></mtd></mtr></mtable></mrow></mrow><annotation encoding="application/x-tex" id="S7.E5.m1.4c">r^{G}_{t}=\left\{\begin{aligned} 0.7\ r_{t}^{near}&amp;+0.3\ r_{t}^{far},\left\|x_% {t}^{*}-x^{root}_{t}\right\|^{2}&gt;0.5\\ 0.7\ r_{t}^{near}&amp;+0.3,\text{otherwise}\end{aligned}\right.</annotation><annotation encoding="application/x-llamapun" id="S7.E5.m1.4d">italic_r start_POSTSUPERSCRIPT italic_G end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = { start_ROW start_CELL 0.7 italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n italic_e italic_a italic_r end_POSTSUPERSCRIPT end_CELL start_CELL + 0.3 italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_f italic_a italic_r end_POSTSUPERSCRIPT , ∥ italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT - italic_x start_POSTSUPERSCRIPT italic_r italic_o italic_o italic_t end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT &gt; 0.5 end_CELL end_ROW start_ROW start_CELL 0.7 italic_r 
start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n italic_e italic_a italic_r end_POSTSUPERSCRIPT end_CELL start_CELL + 0.3 , otherwise end_CELL end_ROW</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(5)</span></td> </tr></tbody> </table> <table class="ltx_equationgroup ltx_eqn_table" id="S7.E6"> <tbody> <tr class="ltx_equation ltx_eqn_row ltx_align_baseline" id="S7.E6X"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_align_right ltx_eqn_cell"><math alttext="\displaystyle r_{t}^{far}" class="ltx_Math" display="inline" id="S7.E6X.2.1.1.m1.1"><semantics id="S7.E6X.2.1.1.m1.1a"><msubsup id="S7.E6X.2.1.1.m1.1.1" xref="S7.E6X.2.1.1.m1.1.1.cmml"><mi id="S7.E6X.2.1.1.m1.1.1.2.2" mathsize="144%" xref="S7.E6X.2.1.1.m1.1.1.2.2.cmml">r</mi><mi id="S7.E6X.2.1.1.m1.1.1.2.3" mathsize="144%" xref="S7.E6X.2.1.1.m1.1.1.2.3.cmml">t</mi><mrow id="S7.E6X.2.1.1.m1.1.1.3" xref="S7.E6X.2.1.1.m1.1.1.3.cmml"><mi id="S7.E6X.2.1.1.m1.1.1.3.2" mathsize="144%" xref="S7.E6X.2.1.1.m1.1.1.3.2.cmml">f</mi><mo id="S7.E6X.2.1.1.m1.1.1.3.1" xref="S7.E6X.2.1.1.m1.1.1.3.1.cmml">⁢</mo><mi id="S7.E6X.2.1.1.m1.1.1.3.3" mathsize="144%" xref="S7.E6X.2.1.1.m1.1.1.3.3.cmml">a</mi><mo id="S7.E6X.2.1.1.m1.1.1.3.1a" xref="S7.E6X.2.1.1.m1.1.1.3.1.cmml">⁢</mo><mi id="S7.E6X.2.1.1.m1.1.1.3.4" mathsize="144%" xref="S7.E6X.2.1.1.m1.1.1.3.4.cmml">r</mi></mrow></msubsup><annotation-xml encoding="MathML-Content" id="S7.E6X.2.1.1.m1.1b"><apply id="S7.E6X.2.1.1.m1.1.1.cmml" xref="S7.E6X.2.1.1.m1.1.1"><csymbol cd="ambiguous" id="S7.E6X.2.1.1.m1.1.1.1.cmml" xref="S7.E6X.2.1.1.m1.1.1">superscript</csymbol><apply id="S7.E6X.2.1.1.m1.1.1.2.cmml" xref="S7.E6X.2.1.1.m1.1.1"><csymbol cd="ambiguous" id="S7.E6X.2.1.1.m1.1.1.2.1.cmml" xref="S7.E6X.2.1.1.m1.1.1">subscript</csymbol><ci 
id="S7.E6X.2.1.1.m1.1.1.2.2.cmml" xref="S7.E6X.2.1.1.m1.1.1.2.2">𝑟</ci><ci id="S7.E6X.2.1.1.m1.1.1.2.3.cmml" xref="S7.E6X.2.1.1.m1.1.1.2.3">𝑡</ci></apply><apply id="S7.E6X.2.1.1.m1.1.1.3.cmml" xref="S7.E6X.2.1.1.m1.1.1.3"><times id="S7.E6X.2.1.1.m1.1.1.3.1.cmml" xref="S7.E6X.2.1.1.m1.1.1.3.1"></times><ci id="S7.E6X.2.1.1.m1.1.1.3.2.cmml" xref="S7.E6X.2.1.1.m1.1.1.3.2">𝑓</ci><ci id="S7.E6X.2.1.1.m1.1.1.3.3.cmml" xref="S7.E6X.2.1.1.m1.1.1.3.3">𝑎</ci><ci id="S7.E6X.2.1.1.m1.1.1.3.4.cmml" xref="S7.E6X.2.1.1.m1.1.1.3.4">𝑟</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.E6X.2.1.1.m1.1c">\displaystyle r_{t}^{far}</annotation><annotation encoding="application/x-llamapun" id="S7.E6X.2.1.1.m1.1d">italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_f italic_a italic_r end_POSTSUPERSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_left ltx_eqn_cell"><math alttext="\displaystyle=\text{exp}\big{(}-2.0\left\|g_{t}^{vel}-d_{t}^{*}\cdot\dot{x}^{% root}_{t}\right\|^{2}\big{)}" class="ltx_Math" display="inline" id="S7.E6X.3.2.2.m1.1"><semantics id="S7.E6X.3.2.2.m1.1a"><mrow id="S7.E6X.3.2.2.m1.1.1" xref="S7.E6X.3.2.2.m1.1.1.cmml"><mi id="S7.E6X.3.2.2.m1.1.1.3" xref="S7.E6X.3.2.2.m1.1.1.3.cmml"></mi><mo id="S7.E6X.3.2.2.m1.1.1.2" mathsize="144%" xref="S7.E6X.3.2.2.m1.1.1.2.cmml">=</mo><mrow id="S7.E6X.3.2.2.m1.1.1.1" xref="S7.E6X.3.2.2.m1.1.1.1.cmml"><mtext id="S7.E6X.3.2.2.m1.1.1.1.3" mathsize="144%" xref="S7.E6X.3.2.2.m1.1.1.1.3a.cmml">exp</mtext><mo id="S7.E6X.3.2.2.m1.1.1.1.2" xref="S7.E6X.3.2.2.m1.1.1.1.2.cmml">⁢</mo><mrow id="S7.E6X.3.2.2.m1.1.1.1.1.1" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.cmml"><mo id="S7.E6X.3.2.2.m1.1.1.1.1.1.2" maxsize="120%" minsize="120%" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.cmml">(</mo><mrow id="S7.E6X.3.2.2.m1.1.1.1.1.1.1" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.cmml"><mo id="S7.E6X.3.2.2.m1.1.1.1.1.1.1a" mathsize="144%" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.cmml">−</mo><mrow 
id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.cmml"><mn id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.3" mathsize="144%" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.3.cmml">2.0</mn><mo id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.2" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.2.cmml">⁢</mo><msup id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.cmml"><mrow id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.2.cmml"><mo id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.2" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.2.1.cmml">‖</mo><mrow id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.cmml"><msubsup id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.cmml"><mi id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2" mathsize="144%" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2.cmml">g</mi><mi id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3" mathsize="144%" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.cmml">t</mi><mrow id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.cmml"><mi id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.2" mathsize="144%" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.2.cmml">v</mi><mo id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.1" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.3" mathsize="144%" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.3.cmml">e</mi><mo id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.1a" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.4" mathsize="144%" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.4.cmml">l</mi></mrow></msubsup><mo id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.1" mathsize="144%" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.1.cmml">−</mo><mrow id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3" 
xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.cmml"><msubsup id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.cmml"><mi id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.2" mathsize="144%" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.2.cmml">d</mi><mi id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.3" mathsize="144%" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.3.cmml">t</mi><mo id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3" mathsize="144%" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.cmml">∗</mo></msubsup><mo id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.1" lspace="0.222em" mathsize="144%" rspace="0.222em" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.1.cmml">⋅</mo><msubsup id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml"><mover accent="true" id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2.cmml"><mi id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2.2" mathsize="144%" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2.2.cmml">x</mi><mo id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2.1" mathsize="144%" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2.1.cmml">˙</mo></mover><mi id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3" mathsize="144%" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3.cmml">t</mi><mrow id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.cmml"><mi id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.2" mathsize="144%" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.2.cmml">r</mi><mo id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.1" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.1.cmml">⁢</mo><mi id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.3" mathsize="144%" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.3.cmml">o</mi><mo id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.1a" 
xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.1.cmml">⁢</mo><mi id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.4" mathsize="144%" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.4.cmml">o</mi><mo id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.1b" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.1.cmml">⁢</mo><mi id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.5" mathsize="144%" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.5.cmml">t</mi></mrow></msubsup></mrow></mrow><mo id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.3" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.2.1.cmml">‖</mo></mrow><mn id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.3" mathsize="144%" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.3.cmml">2</mn></msup></mrow></mrow><mo id="S7.E6X.3.2.2.m1.1.1.1.1.1.3" maxsize="120%" minsize="120%" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.cmml">)</mo></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S7.E6X.3.2.2.m1.1b"><apply id="S7.E6X.3.2.2.m1.1.1.cmml" xref="S7.E6X.3.2.2.m1.1.1"><eq id="S7.E6X.3.2.2.m1.1.1.2.cmml" xref="S7.E6X.3.2.2.m1.1.1.2"></eq><csymbol cd="latexml" id="S7.E6X.3.2.2.m1.1.1.3.cmml" xref="S7.E6X.3.2.2.m1.1.1.3">absent</csymbol><apply id="S7.E6X.3.2.2.m1.1.1.1.cmml" xref="S7.E6X.3.2.2.m1.1.1.1"><times id="S7.E6X.3.2.2.m1.1.1.1.2.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.2"></times><ci id="S7.E6X.3.2.2.m1.1.1.1.3a.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.3"><mtext id="S7.E6X.3.2.2.m1.1.1.1.3.cmml" mathsize="144%" xref="S7.E6X.3.2.2.m1.1.1.1.3">exp</mtext></ci><apply id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1"><minus id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.2.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1"></minus><apply id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1"><times id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.2.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.2"></times><cn id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.3.cmml" type="float" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.3">2.0</cn><apply id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.cmml" 
xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1">superscript</csymbol><apply id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1"><csymbol cd="latexml" id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.2">norm</csymbol><apply id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1"><minus id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.1"></minus><apply id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2">superscript</csymbol><apply id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.1.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2">subscript</csymbol><ci id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2">𝑔</ci><ci id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3">𝑡</ci></apply><apply id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3"><times id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.1.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.1"></times><ci id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.2.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.2">𝑣</ci><ci id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.3.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.3">𝑒</ci><ci id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.4.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.4">𝑙</ci></apply></apply><apply id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.cmml" 
xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3"><ci id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.1.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.1">⋅</ci><apply id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2"><csymbol cd="ambiguous" id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.1.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2">superscript</csymbol><apply id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2"><csymbol cd="ambiguous" id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.1.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2">subscript</csymbol><ci id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.2.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.2">𝑑</ci><ci id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.3.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.3">𝑡</ci></apply><times id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3"></times></apply><apply id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3"><csymbol cd="ambiguous" id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3">subscript</csymbol><apply id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3"><csymbol cd="ambiguous" id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.1.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3">superscript</csymbol><apply id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2"><ci id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2.1.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2.1">˙</ci><ci id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2.2.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.2.2">𝑥</ci></apply><apply id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.cmml" 
xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3"><times id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.1.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.1"></times><ci id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.2.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.2">𝑟</ci><ci id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.3.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.3">𝑜</ci><ci id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.4.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.4">𝑜</ci><ci id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.5.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.3.5">𝑡</ci></apply></apply><ci id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3.cmml" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3">𝑡</ci></apply></apply></apply></apply><cn id="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.3.cmml" type="integer" xref="S7.E6X.3.2.2.m1.1.1.1.1.1.1.1.1.3">2</cn></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.E6X.3.2.2.m1.1c">\displaystyle=\text{exp}\big{(}-2.0\left\|g_{t}^{vel}-d_{t}^{*}\cdot\dot{x}^{% root}_{t}\right\|^{2}\big{)}</annotation><annotation encoding="application/x-llamapun" id="S7.E6X.3.2.2.m1.1d">= exp ( - 2.0 ∥ italic_g start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_v italic_e italic_l end_POSTSUPERSCRIPT - italic_d start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ⋅ over˙ start_ARG italic_x end_ARG start_POSTSUPERSCRIPT italic_r italic_o italic_o italic_t end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT )</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equationgroup ltx_align_right">(6)</span></td> </tr> </tbody> </table> <table 
class="ltx_equation ltx_eqn_table" id="S7.E7"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="r_{t}^{near}=\text{exp}\big{(}-10.0\left\|x_{t}^{*}-x_{t}^{root}\right\|^{2}% \big{)}" class="ltx_Math" display="block" id="S7.E7.m1.1"><semantics id="S7.E7.m1.1a"><mrow id="S7.E7.m1.1.1" xref="S7.E7.m1.1.1.cmml"><msubsup id="S7.E7.m1.1.1.3" xref="S7.E7.m1.1.1.3.cmml"><mi id="S7.E7.m1.1.1.3.2.2" mathsize="144%" xref="S7.E7.m1.1.1.3.2.2.cmml">r</mi><mi id="S7.E7.m1.1.1.3.2.3" mathsize="144%" xref="S7.E7.m1.1.1.3.2.3.cmml">t</mi><mrow id="S7.E7.m1.1.1.3.3" xref="S7.E7.m1.1.1.3.3.cmml"><mi id="S7.E7.m1.1.1.3.3.2" mathsize="144%" xref="S7.E7.m1.1.1.3.3.2.cmml">n</mi><mo id="S7.E7.m1.1.1.3.3.1" xref="S7.E7.m1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E7.m1.1.1.3.3.3" mathsize="144%" xref="S7.E7.m1.1.1.3.3.3.cmml">e</mi><mo id="S7.E7.m1.1.1.3.3.1a" xref="S7.E7.m1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E7.m1.1.1.3.3.4" mathsize="144%" xref="S7.E7.m1.1.1.3.3.4.cmml">a</mi><mo id="S7.E7.m1.1.1.3.3.1b" xref="S7.E7.m1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E7.m1.1.1.3.3.5" mathsize="144%" xref="S7.E7.m1.1.1.3.3.5.cmml">r</mi></mrow></msubsup><mo id="S7.E7.m1.1.1.2" mathsize="144%" xref="S7.E7.m1.1.1.2.cmml">=</mo><mrow id="S7.E7.m1.1.1.1" xref="S7.E7.m1.1.1.1.cmml"><mtext id="S7.E7.m1.1.1.1.3" mathsize="144%" xref="S7.E7.m1.1.1.1.3a.cmml">exp</mtext><mo id="S7.E7.m1.1.1.1.2" xref="S7.E7.m1.1.1.1.2.cmml">⁢</mo><mrow id="S7.E7.m1.1.1.1.1.1" xref="S7.E7.m1.1.1.1.1.1.1.cmml"><mo id="S7.E7.m1.1.1.1.1.1.2" maxsize="120%" minsize="120%" xref="S7.E7.m1.1.1.1.1.1.1.cmml">(</mo><mrow id="S7.E7.m1.1.1.1.1.1.1" xref="S7.E7.m1.1.1.1.1.1.1.cmml"><mo id="S7.E7.m1.1.1.1.1.1.1a" mathsize="144%" xref="S7.E7.m1.1.1.1.1.1.1.cmml">−</mo><mrow id="S7.E7.m1.1.1.1.1.1.1.1" xref="S7.E7.m1.1.1.1.1.1.1.1.cmml"><mn id="S7.E7.m1.1.1.1.1.1.1.1.3" mathsize="144%" 
xref="S7.E7.m1.1.1.1.1.1.1.1.3.cmml">10.0</mn><mo id="S7.E7.m1.1.1.1.1.1.1.1.2" xref="S7.E7.m1.1.1.1.1.1.1.1.2.cmml">⁢</mo><msup id="S7.E7.m1.1.1.1.1.1.1.1.1" xref="S7.E7.m1.1.1.1.1.1.1.1.1.cmml"><mrow id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.2.cmml"><mo id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.2" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.2.1.cmml">‖</mo><mrow id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.cmml"><msubsup id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.2" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.2.cmml"><mi id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2" mathsize="144%" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2.cmml">x</mi><mi id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3" mathsize="144%" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.cmml">t</mi><mo id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.2.3" mathsize="144%" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.cmml">∗</mo></msubsup><mo id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.1" mathsize="144%" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.1.cmml">−</mo><msubsup id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.cmml"><mi id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2" mathsize="144%" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.cmml">x</mi><mi id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3" mathsize="144%" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.cmml">t</mi><mrow id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml"><mi id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2" mathsize="144%" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.cmml">r</mi><mo id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3" mathsize="144%" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3.cmml">o</mi><mo id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1a" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.4" mathsize="144%" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.4.cmml">o</mi><mo 
id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1b" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.5" mathsize="144%" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.5.cmml">t</mi></mrow></msubsup></mrow><mo id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.3" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.2.1.cmml">‖</mo></mrow><mn id="S7.E7.m1.1.1.1.1.1.1.1.1.3" mathsize="144%" xref="S7.E7.m1.1.1.1.1.1.1.1.1.3.cmml">2</mn></msup></mrow></mrow><mo id="S7.E7.m1.1.1.1.1.1.3" maxsize="120%" minsize="120%" xref="S7.E7.m1.1.1.1.1.1.1.cmml">)</mo></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S7.E7.m1.1b"><apply id="S7.E7.m1.1.1.cmml" xref="S7.E7.m1.1.1"><eq id="S7.E7.m1.1.1.2.cmml" xref="S7.E7.m1.1.1.2"></eq><apply id="S7.E7.m1.1.1.3.cmml" xref="S7.E7.m1.1.1.3"><csymbol cd="ambiguous" id="S7.E7.m1.1.1.3.1.cmml" xref="S7.E7.m1.1.1.3">superscript</csymbol><apply id="S7.E7.m1.1.1.3.2.cmml" xref="S7.E7.m1.1.1.3"><csymbol cd="ambiguous" id="S7.E7.m1.1.1.3.2.1.cmml" xref="S7.E7.m1.1.1.3">subscript</csymbol><ci id="S7.E7.m1.1.1.3.2.2.cmml" xref="S7.E7.m1.1.1.3.2.2">𝑟</ci><ci id="S7.E7.m1.1.1.3.2.3.cmml" xref="S7.E7.m1.1.1.3.2.3">𝑡</ci></apply><apply id="S7.E7.m1.1.1.3.3.cmml" xref="S7.E7.m1.1.1.3.3"><times id="S7.E7.m1.1.1.3.3.1.cmml" xref="S7.E7.m1.1.1.3.3.1"></times><ci id="S7.E7.m1.1.1.3.3.2.cmml" xref="S7.E7.m1.1.1.3.3.2">𝑛</ci><ci id="S7.E7.m1.1.1.3.3.3.cmml" xref="S7.E7.m1.1.1.3.3.3">𝑒</ci><ci id="S7.E7.m1.1.1.3.3.4.cmml" xref="S7.E7.m1.1.1.3.3.4">𝑎</ci><ci id="S7.E7.m1.1.1.3.3.5.cmml" xref="S7.E7.m1.1.1.3.3.5">𝑟</ci></apply></apply><apply id="S7.E7.m1.1.1.1.cmml" xref="S7.E7.m1.1.1.1"><times id="S7.E7.m1.1.1.1.2.cmml" xref="S7.E7.m1.1.1.1.2"></times><ci id="S7.E7.m1.1.1.1.3a.cmml" xref="S7.E7.m1.1.1.1.3"><mtext id="S7.E7.m1.1.1.1.3.cmml" mathsize="144%" xref="S7.E7.m1.1.1.1.3">exp</mtext></ci><apply id="S7.E7.m1.1.1.1.1.1.1.cmml" xref="S7.E7.m1.1.1.1.1.1"><minus id="S7.E7.m1.1.1.1.1.1.1.2.cmml" xref="S7.E7.m1.1.1.1.1.1"></minus><apply 
id="S7.E7.m1.1.1.1.1.1.1.1.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1"><times id="S7.E7.m1.1.1.1.1.1.1.1.2.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.2"></times><cn id="S7.E7.m1.1.1.1.1.1.1.1.3.cmml" type="float" xref="S7.E7.m1.1.1.1.1.1.1.1.3">10.0</cn><apply id="S7.E7.m1.1.1.1.1.1.1.1.1.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S7.E7.m1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1">superscript</csymbol><apply id="S7.E7.m1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1"><csymbol cd="latexml" id="S7.E7.m1.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.2">norm</csymbol><apply id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1"><minus id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.1"></minus><apply id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.2">superscript</csymbol><apply id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.1.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.2">subscript</csymbol><ci id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.2">𝑥</ci><ci id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.2.2.3">𝑡</ci></apply><times id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.2.3.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.2.3"></times></apply><apply id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.1.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3">superscript</csymbol><apply id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.1.cmml" 
xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3">subscript</csymbol><ci id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.2">𝑥</ci><ci id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.2.3">𝑡</ci></apply><apply id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3"><times id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.1"></times><ci id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.2">𝑟</ci><ci id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.3">𝑜</ci><ci id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.4.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.4">𝑜</ci><ci id="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.5.cmml" xref="S7.E7.m1.1.1.1.1.1.1.1.1.1.1.1.3.3.5">𝑡</ci></apply></apply></apply></apply><cn id="S7.E7.m1.1.1.1.1.1.1.1.1.3.cmml" type="integer" xref="S7.E7.m1.1.1.1.1.1.1.1.1.3">2</cn></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.E7.m1.1c">r_{t}^{near}=\text{exp}\big{(}-10.0\left\|x_{t}^{*}-x_{t}^{root}\right\|^{2}% \big{)}</annotation><annotation encoding="application/x-llamapun" id="S7.E7.m1.1d">italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n italic_e italic_a italic_r end_POSTSUPERSCRIPT = exp ( - 10.0 ∥ italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT - italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_r italic_o italic_o italic_t end_POSTSUPERSCRIPT ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT )</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(7)</span></td> </tr></tbody> </table> </div> <div 
class="ltx_para" id="S7.I1.i2.p3"> <p class="ltx_p" id="S7.I1.i2.p3.1"><span class="ltx_text ltx_font_bold" id="S7.I1.i2.p3.1.1" style="font-size:144%;">Getup Reward.</span><span class="ltx_text" id="S7.I1.i2.p3.1.2" style="font-size:144%;"> The GetUp skill is developed through step goals, which combine walk and contact rewards. If the contact goal has not been reached, the reward encourages the humanoid to sit or lie on the object. Conversely, when the contact goal is achieved, the reward motivates the humanoid to elevate its pelvis to a standing position. The formulation for this reward system aligns with that of the contact reward </span><math alttext="r^{near}_{t}" class="ltx_Math" display="inline" id="S7.I1.i2.p3.1.m1.1"><semantics id="S7.I1.i2.p3.1.m1.1a"><msubsup id="S7.I1.i2.p3.1.m1.1.1" xref="S7.I1.i2.p3.1.m1.1.1.cmml"><mi id="S7.I1.i2.p3.1.m1.1.1.2.2" mathsize="144%" xref="S7.I1.i2.p3.1.m1.1.1.2.2.cmml">r</mi><mi id="S7.I1.i2.p3.1.m1.1.1.3" mathsize="144%" xref="S7.I1.i2.p3.1.m1.1.1.3.cmml">t</mi><mrow id="S7.I1.i2.p3.1.m1.1.1.2.3" xref="S7.I1.i2.p3.1.m1.1.1.2.3.cmml"><mi id="S7.I1.i2.p3.1.m1.1.1.2.3.2" mathsize="144%" xref="S7.I1.i2.p3.1.m1.1.1.2.3.2.cmml">n</mi><mo id="S7.I1.i2.p3.1.m1.1.1.2.3.1" xref="S7.I1.i2.p3.1.m1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.I1.i2.p3.1.m1.1.1.2.3.3" mathsize="144%" xref="S7.I1.i2.p3.1.m1.1.1.2.3.3.cmml">e</mi><mo id="S7.I1.i2.p3.1.m1.1.1.2.3.1a" xref="S7.I1.i2.p3.1.m1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.I1.i2.p3.1.m1.1.1.2.3.4" mathsize="144%" xref="S7.I1.i2.p3.1.m1.1.1.2.3.4.cmml">a</mi><mo id="S7.I1.i2.p3.1.m1.1.1.2.3.1b" xref="S7.I1.i2.p3.1.m1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.I1.i2.p3.1.m1.1.1.2.3.5" mathsize="144%" xref="S7.I1.i2.p3.1.m1.1.1.2.3.5.cmml">r</mi></mrow></msubsup><annotation-xml encoding="MathML-Content" id="S7.I1.i2.p3.1.m1.1b"><apply id="S7.I1.i2.p3.1.m1.1.1.cmml" xref="S7.I1.i2.p3.1.m1.1.1"><csymbol cd="ambiguous" id="S7.I1.i2.p3.1.m1.1.1.1.cmml" xref="S7.I1.i2.p3.1.m1.1.1">subscript</csymbol><apply 
id="S7.I1.i2.p3.1.m1.1.1.2.cmml" xref="S7.I1.i2.p3.1.m1.1.1"><csymbol cd="ambiguous" id="S7.I1.i2.p3.1.m1.1.1.2.1.cmml" xref="S7.I1.i2.p3.1.m1.1.1">superscript</csymbol><ci id="S7.I1.i2.p3.1.m1.1.1.2.2.cmml" xref="S7.I1.i2.p3.1.m1.1.1.2.2">𝑟</ci><apply id="S7.I1.i2.p3.1.m1.1.1.2.3.cmml" xref="S7.I1.i2.p3.1.m1.1.1.2.3"><times id="S7.I1.i2.p3.1.m1.1.1.2.3.1.cmml" xref="S7.I1.i2.p3.1.m1.1.1.2.3.1"></times><ci id="S7.I1.i2.p3.1.m1.1.1.2.3.2.cmml" xref="S7.I1.i2.p3.1.m1.1.1.2.3.2">𝑛</ci><ci id="S7.I1.i2.p3.1.m1.1.1.2.3.3.cmml" xref="S7.I1.i2.p3.1.m1.1.1.2.3.3">𝑒</ci><ci id="S7.I1.i2.p3.1.m1.1.1.2.3.4.cmml" xref="S7.I1.i2.p3.1.m1.1.1.2.3.4">𝑎</ci><ci id="S7.I1.i2.p3.1.m1.1.1.2.3.5.cmml" xref="S7.I1.i2.p3.1.m1.1.1.2.3.5">𝑟</ci></apply></apply><ci id="S7.I1.i2.p3.1.m1.1.1.3.cmml" xref="S7.I1.i2.p3.1.m1.1.1.3">𝑡</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.I1.i2.p3.1.m1.1c">r^{near}_{t}</annotation><annotation encoding="application/x-llamapun" id="S7.I1.i2.p3.1.m1.1d">italic_r start_POSTSUPERSCRIPT italic_n italic_e italic_a italic_r end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math><span class="ltx_text" id="S7.I1.i2.p3.1.3" style="font-size:144%;">.</span></p> </div> </li> <li class="ltx_item" id="S7.I1.i3" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S7.I1.i3.p1"> <p class="ltx_p" id="S7.I1.i3.p1.3"><span class="ltx_text ltx_font_bold" id="S7.I1.i3.p1.3.1" style="font-size:144%;">DOI Reward.</span><span class="ltx_text" id="S7.I1.i3.p1.3.2" style="font-size:144%;"> In this version, we implement only the Carry skill for the DOI task. However, our DOI reward could serve as a universal template for other dynamic object interactions, such as pushing or throwing. 
The reward is split into 3 parts: walk reward </span><math alttext="r^{walk}_{t}" class="ltx_Math" display="inline" id="S7.I1.i3.p1.1.m1.1"><semantics id="S7.I1.i3.p1.1.m1.1a"><msubsup id="S7.I1.i3.p1.1.m1.1.1" xref="S7.I1.i3.p1.1.m1.1.1.cmml"><mi id="S7.I1.i3.p1.1.m1.1.1.2.2" mathsize="144%" xref="S7.I1.i3.p1.1.m1.1.1.2.2.cmml">r</mi><mi id="S7.I1.i3.p1.1.m1.1.1.3" mathsize="144%" xref="S7.I1.i3.p1.1.m1.1.1.3.cmml">t</mi><mrow id="S7.I1.i3.p1.1.m1.1.1.2.3" xref="S7.I1.i3.p1.1.m1.1.1.2.3.cmml"><mi id="S7.I1.i3.p1.1.m1.1.1.2.3.2" mathsize="144%" xref="S7.I1.i3.p1.1.m1.1.1.2.3.2.cmml">w</mi><mo id="S7.I1.i3.p1.1.m1.1.1.2.3.1" xref="S7.I1.i3.p1.1.m1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.I1.i3.p1.1.m1.1.1.2.3.3" mathsize="144%" xref="S7.I1.i3.p1.1.m1.1.1.2.3.3.cmml">a</mi><mo id="S7.I1.i3.p1.1.m1.1.1.2.3.1a" xref="S7.I1.i3.p1.1.m1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.I1.i3.p1.1.m1.1.1.2.3.4" mathsize="144%" xref="S7.I1.i3.p1.1.m1.1.1.2.3.4.cmml">l</mi><mo id="S7.I1.i3.p1.1.m1.1.1.2.3.1b" xref="S7.I1.i3.p1.1.m1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.I1.i3.p1.1.m1.1.1.2.3.5" mathsize="144%" xref="S7.I1.i3.p1.1.m1.1.1.2.3.5.cmml">k</mi></mrow></msubsup><annotation-xml encoding="MathML-Content" id="S7.I1.i3.p1.1.m1.1b"><apply id="S7.I1.i3.p1.1.m1.1.1.cmml" xref="S7.I1.i3.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S7.I1.i3.p1.1.m1.1.1.1.cmml" xref="S7.I1.i3.p1.1.m1.1.1">subscript</csymbol><apply id="S7.I1.i3.p1.1.m1.1.1.2.cmml" xref="S7.I1.i3.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S7.I1.i3.p1.1.m1.1.1.2.1.cmml" xref="S7.I1.i3.p1.1.m1.1.1">superscript</csymbol><ci id="S7.I1.i3.p1.1.m1.1.1.2.2.cmml" xref="S7.I1.i3.p1.1.m1.1.1.2.2">𝑟</ci><apply id="S7.I1.i3.p1.1.m1.1.1.2.3.cmml" xref="S7.I1.i3.p1.1.m1.1.1.2.3"><times id="S7.I1.i3.p1.1.m1.1.1.2.3.1.cmml" xref="S7.I1.i3.p1.1.m1.1.1.2.3.1"></times><ci id="S7.I1.i3.p1.1.m1.1.1.2.3.2.cmml" xref="S7.I1.i3.p1.1.m1.1.1.2.3.2">𝑤</ci><ci id="S7.I1.i3.p1.1.m1.1.1.2.3.3.cmml" xref="S7.I1.i3.p1.1.m1.1.1.2.3.3">𝑎</ci><ci 
id="S7.I1.i3.p1.1.m1.1.1.2.3.4.cmml" xref="S7.I1.i3.p1.1.m1.1.1.2.3.4">𝑙</ci><ci id="S7.I1.i3.p1.1.m1.1.1.2.3.5.cmml" xref="S7.I1.i3.p1.1.m1.1.1.2.3.5">𝑘</ci></apply></apply><ci id="S7.I1.i3.p1.1.m1.1.1.3.cmml" xref="S7.I1.i3.p1.1.m1.1.1.3">𝑡</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.I1.i3.p1.1.m1.1c">r^{walk}_{t}</annotation><annotation encoding="application/x-llamapun" id="S7.I1.i3.p1.1.m1.1d">italic_r start_POSTSUPERSCRIPT italic_w italic_a italic_l italic_k end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math><span class="ltx_text" id="S7.I1.i3.p1.3.3" style="font-size:144%;">, which encourages the humanoid to walk to the object first; hand contact reward </span><math alttext="r_{t}^{hand}" class="ltx_Math" display="inline" id="S7.I1.i3.p1.2.m2.1"><semantics id="S7.I1.i3.p1.2.m2.1a"><msubsup id="S7.I1.i3.p1.2.m2.1.1" xref="S7.I1.i3.p1.2.m2.1.1.cmml"><mi id="S7.I1.i3.p1.2.m2.1.1.2.2" mathsize="144%" xref="S7.I1.i3.p1.2.m2.1.1.2.2.cmml">r</mi><mi id="S7.I1.i3.p1.2.m2.1.1.2.3" mathsize="144%" xref="S7.I1.i3.p1.2.m2.1.1.2.3.cmml">t</mi><mrow id="S7.I1.i3.p1.2.m2.1.1.3" xref="S7.I1.i3.p1.2.m2.1.1.3.cmml"><mi id="S7.I1.i3.p1.2.m2.1.1.3.2" mathsize="144%" xref="S7.I1.i3.p1.2.m2.1.1.3.2.cmml">h</mi><mo id="S7.I1.i3.p1.2.m2.1.1.3.1" xref="S7.I1.i3.p1.2.m2.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i3.p1.2.m2.1.1.3.3" mathsize="144%" xref="S7.I1.i3.p1.2.m2.1.1.3.3.cmml">a</mi><mo id="S7.I1.i3.p1.2.m2.1.1.3.1a" xref="S7.I1.i3.p1.2.m2.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i3.p1.2.m2.1.1.3.4" mathsize="144%" xref="S7.I1.i3.p1.2.m2.1.1.3.4.cmml">n</mi><mo id="S7.I1.i3.p1.2.m2.1.1.3.1b" xref="S7.I1.i3.p1.2.m2.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i3.p1.2.m2.1.1.3.5" mathsize="144%" xref="S7.I1.i3.p1.2.m2.1.1.3.5.cmml">d</mi></mrow></msubsup><annotation-xml encoding="MathML-Content" id="S7.I1.i3.p1.2.m2.1b"><apply id="S7.I1.i3.p1.2.m2.1.1.cmml" xref="S7.I1.i3.p1.2.m2.1.1"><csymbol cd="ambiguous" 
id="S7.I1.i3.p1.2.m2.1.1.1.cmml" xref="S7.I1.i3.p1.2.m2.1.1">superscript</csymbol><apply id="S7.I1.i3.p1.2.m2.1.1.2.cmml" xref="S7.I1.i3.p1.2.m2.1.1"><csymbol cd="ambiguous" id="S7.I1.i3.p1.2.m2.1.1.2.1.cmml" xref="S7.I1.i3.p1.2.m2.1.1">subscript</csymbol><ci id="S7.I1.i3.p1.2.m2.1.1.2.2.cmml" xref="S7.I1.i3.p1.2.m2.1.1.2.2">𝑟</ci><ci id="S7.I1.i3.p1.2.m2.1.1.2.3.cmml" xref="S7.I1.i3.p1.2.m2.1.1.2.3">𝑡</ci></apply><apply id="S7.I1.i3.p1.2.m2.1.1.3.cmml" xref="S7.I1.i3.p1.2.m2.1.1.3"><times id="S7.I1.i3.p1.2.m2.1.1.3.1.cmml" xref="S7.I1.i3.p1.2.m2.1.1.3.1"></times><ci id="S7.I1.i3.p1.2.m2.1.1.3.2.cmml" xref="S7.I1.i3.p1.2.m2.1.1.3.2">ℎ</ci><ci id="S7.I1.i3.p1.2.m2.1.1.3.3.cmml" xref="S7.I1.i3.p1.2.m2.1.1.3.3">𝑎</ci><ci id="S7.I1.i3.p1.2.m2.1.1.3.4.cmml" xref="S7.I1.i3.p1.2.m2.1.1.3.4">𝑛</ci><ci id="S7.I1.i3.p1.2.m2.1.1.3.5.cmml" xref="S7.I1.i3.p1.2.m2.1.1.3.5">𝑑</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.I1.i3.p1.2.m2.1c">r_{t}^{hand}</annotation><annotation encoding="application/x-llamapun" id="S7.I1.i3.p1.2.m2.1d">italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_h italic_a italic_n italic_d end_POSTSUPERSCRIPT</annotation></semantics></math><span class="ltx_text" id="S7.I1.i3.p1.3.4" style="font-size:144%;">, which encourages the humanoid to place its hand on the object before the task has been completed; moving reward </span><math alttext="r_{t}^{carry}" class="ltx_Math" display="inline" id="S7.I1.i3.p1.3.m3.1"><semantics id="S7.I1.i3.p1.3.m3.1a"><msubsup id="S7.I1.i3.p1.3.m3.1.1" xref="S7.I1.i3.p1.3.m3.1.1.cmml"><mi id="S7.I1.i3.p1.3.m3.1.1.2.2" mathsize="144%" xref="S7.I1.i3.p1.3.m3.1.1.2.2.cmml">r</mi><mi id="S7.I1.i3.p1.3.m3.1.1.2.3" mathsize="144%" xref="S7.I1.i3.p1.3.m3.1.1.2.3.cmml">t</mi><mrow id="S7.I1.i3.p1.3.m3.1.1.3" xref="S7.I1.i3.p1.3.m3.1.1.3.cmml"><mi id="S7.I1.i3.p1.3.m3.1.1.3.2" mathsize="144%" xref="S7.I1.i3.p1.3.m3.1.1.3.2.cmml">c</mi><mo id="S7.I1.i3.p1.3.m3.1.1.3.1" 
xref="S7.I1.i3.p1.3.m3.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i3.p1.3.m3.1.1.3.3" mathsize="144%" xref="S7.I1.i3.p1.3.m3.1.1.3.3.cmml">a</mi><mo id="S7.I1.i3.p1.3.m3.1.1.3.1a" xref="S7.I1.i3.p1.3.m3.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i3.p1.3.m3.1.1.3.4" mathsize="144%" xref="S7.I1.i3.p1.3.m3.1.1.3.4.cmml">r</mi><mo id="S7.I1.i3.p1.3.m3.1.1.3.1b" xref="S7.I1.i3.p1.3.m3.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i3.p1.3.m3.1.1.3.5" mathsize="144%" xref="S7.I1.i3.p1.3.m3.1.1.3.5.cmml">r</mi><mo id="S7.I1.i3.p1.3.m3.1.1.3.1c" xref="S7.I1.i3.p1.3.m3.1.1.3.1.cmml">⁢</mo><mi id="S7.I1.i3.p1.3.m3.1.1.3.6" mathsize="144%" xref="S7.I1.i3.p1.3.m3.1.1.3.6.cmml">y</mi></mrow></msubsup><annotation-xml encoding="MathML-Content" id="S7.I1.i3.p1.3.m3.1b"><apply id="S7.I1.i3.p1.3.m3.1.1.cmml" xref="S7.I1.i3.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S7.I1.i3.p1.3.m3.1.1.1.cmml" xref="S7.I1.i3.p1.3.m3.1.1">superscript</csymbol><apply id="S7.I1.i3.p1.3.m3.1.1.2.cmml" xref="S7.I1.i3.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S7.I1.i3.p1.3.m3.1.1.2.1.cmml" xref="S7.I1.i3.p1.3.m3.1.1">subscript</csymbol><ci id="S7.I1.i3.p1.3.m3.1.1.2.2.cmml" xref="S7.I1.i3.p1.3.m3.1.1.2.2">𝑟</ci><ci id="S7.I1.i3.p1.3.m3.1.1.2.3.cmml" xref="S7.I1.i3.p1.3.m3.1.1.2.3">𝑡</ci></apply><apply id="S7.I1.i3.p1.3.m3.1.1.3.cmml" xref="S7.I1.i3.p1.3.m3.1.1.3"><times id="S7.I1.i3.p1.3.m3.1.1.3.1.cmml" xref="S7.I1.i3.p1.3.m3.1.1.3.1"></times><ci id="S7.I1.i3.p1.3.m3.1.1.3.2.cmml" xref="S7.I1.i3.p1.3.m3.1.1.3.2">𝑐</ci><ci id="S7.I1.i3.p1.3.m3.1.1.3.3.cmml" xref="S7.I1.i3.p1.3.m3.1.1.3.3">𝑎</ci><ci id="S7.I1.i3.p1.3.m3.1.1.3.4.cmml" xref="S7.I1.i3.p1.3.m3.1.1.3.4">𝑟</ci><ci id="S7.I1.i3.p1.3.m3.1.1.3.5.cmml" xref="S7.I1.i3.p1.3.m3.1.1.3.5">𝑟</ci><ci id="S7.I1.i3.p1.3.m3.1.1.3.6.cmml" xref="S7.I1.i3.p1.3.m3.1.1.3.6">𝑦</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.I1.i3.p1.3.m3.1c">r_{t}^{carry}</annotation><annotation encoding="application/x-llamapun" id="S7.I1.i3.p1.3.m3.1d">italic_r 
start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_c italic_a italic_r italic_r italic_y end_POSTSUPERSCRIPT</annotation></semantics></math><span class="ltx_text" id="S7.I1.i3.p1.3.5" style="font-size:144%;">, which encourages the humanoid to carry the object to the target position.</span></p> </div> <div class="ltx_para" id="S7.I1.i3.p2"> <table class="ltx_equation ltx_eqn_table" id="S7.E8"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="r^{G}_{t}=\left\{\begin{aligned} &amp;0.3\ r_{t}^{walk}+0.5\ r_{t}^{carry}+0.2\ r_% {t}^{hand},&amp;&amp;\|x_{t}^{obj}-x_{t}^{goal}\|^{2}&gt;0.5,\\ &amp;0.3\ r_{t}^{walk}+0.5\ r_{t}^{carry}+0.2,&amp;&amp;\text{otherwise}.\end{aligned}\right." class="ltx_math_unparsed" display="block" id="S7.E8.m1.4"><semantics id="S7.E8.m1.4a"><mrow id="S7.E8.m1.4b"><msubsup id="S7.E8.m1.4.5"><mi id="S7.E8.m1.4.5.2.2" mathsize="144%">r</mi><mi id="S7.E8.m1.4.5.3" mathsize="144%">t</mi><mi id="S7.E8.m1.4.5.2.3" mathsize="144%">G</mi></msubsup><mo id="S7.E8.m1.4.6" mathsize="144%">=</mo><mrow id="S7.E8.m1.4.7"><mo id="S7.E8.m1.4.7.1">{</mo><mtable columnspacing="0pt" displaystyle="true" id="S7.E8.m1.4.4" rowspacing="0pt"><mtr id="S7.E8.m1.4.4a"><mtd id="S7.E8.m1.4.4b"></mtd><mtd class="ltx_align_left" columnalign="left" id="S7.E8.m1.4.4c"><mrow id="S7.E8.m1.1.1.1.1.1.1"><mrow id="S7.E8.m1.1.1.1.1.1.1.1"><mrow id="S7.E8.m1.1.1.1.1.1.1.1.2"><mn id="S7.E8.m1.1.1.1.1.1.1.1.2.2" mathsize="144%">0.3</mn><mo id="S7.E8.m1.1.1.1.1.1.1.1.2.1" lspace="0.720em">⁢</mo><msubsup id="S7.E8.m1.1.1.1.1.1.1.1.2.3"><mi id="S7.E8.m1.1.1.1.1.1.1.1.2.3.2.2" mathsize="144%">r</mi><mi id="S7.E8.m1.1.1.1.1.1.1.1.2.3.2.3" mathsize="144%">t</mi><mrow id="S7.E8.m1.1.1.1.1.1.1.1.2.3.3"><mi id="S7.E8.m1.1.1.1.1.1.1.1.2.3.3.2" mathsize="144%">w</mi><mo id="S7.E8.m1.1.1.1.1.1.1.1.2.3.3.1">⁢</mo><mi id="S7.E8.m1.1.1.1.1.1.1.1.2.3.3.3" 
mathsize="144%">a</mi><mo id="S7.E8.m1.1.1.1.1.1.1.1.2.3.3.1a">⁢</mo><mi id="S7.E8.m1.1.1.1.1.1.1.1.2.3.3.4" mathsize="144%">l</mi><mo id="S7.E8.m1.1.1.1.1.1.1.1.2.3.3.1b">⁢</mo><mi id="S7.E8.m1.1.1.1.1.1.1.1.2.3.3.5" mathsize="144%">k</mi></mrow></msubsup></mrow><mo id="S7.E8.m1.1.1.1.1.1.1.1.1" mathsize="144%">+</mo><mrow id="S7.E8.m1.1.1.1.1.1.1.1.3"><mn id="S7.E8.m1.1.1.1.1.1.1.1.3.2" mathsize="144%">0.5</mn><mo id="S7.E8.m1.1.1.1.1.1.1.1.3.1" lspace="0.720em">⁢</mo><msubsup id="S7.E8.m1.1.1.1.1.1.1.1.3.3"><mi id="S7.E8.m1.1.1.1.1.1.1.1.3.3.2.2" mathsize="144%">r</mi><mi id="S7.E8.m1.1.1.1.1.1.1.1.3.3.2.3" mathsize="144%">t</mi><mrow id="S7.E8.m1.1.1.1.1.1.1.1.3.3.3"><mi id="S7.E8.m1.1.1.1.1.1.1.1.3.3.3.2" mathsize="144%">c</mi><mo id="S7.E8.m1.1.1.1.1.1.1.1.3.3.3.1">⁢</mo><mi id="S7.E8.m1.1.1.1.1.1.1.1.3.3.3.3" mathsize="144%">a</mi><mo id="S7.E8.m1.1.1.1.1.1.1.1.3.3.3.1a">⁢</mo><mi id="S7.E8.m1.1.1.1.1.1.1.1.3.3.3.4" mathsize="144%">r</mi><mo id="S7.E8.m1.1.1.1.1.1.1.1.3.3.3.1b">⁢</mo><mi id="S7.E8.m1.1.1.1.1.1.1.1.3.3.3.5" mathsize="144%">r</mi><mo id="S7.E8.m1.1.1.1.1.1.1.1.3.3.3.1c">⁢</mo><mi id="S7.E8.m1.1.1.1.1.1.1.1.3.3.3.6" mathsize="144%">y</mi></mrow></msubsup></mrow><mo id="S7.E8.m1.1.1.1.1.1.1.1.1a" mathsize="144%">+</mo><mrow id="S7.E8.m1.1.1.1.1.1.1.1.4"><mn id="S7.E8.m1.1.1.1.1.1.1.1.4.2" mathsize="144%">0.2</mn><mo id="S7.E8.m1.1.1.1.1.1.1.1.4.1" lspace="0.720em">⁢</mo><msubsup id="S7.E8.m1.1.1.1.1.1.1.1.4.3"><mi id="S7.E8.m1.1.1.1.1.1.1.1.4.3.2.2" mathsize="144%">r</mi><mi id="S7.E8.m1.1.1.1.1.1.1.1.4.3.2.3" mathsize="144%">t</mi><mrow id="S7.E8.m1.1.1.1.1.1.1.1.4.3.3"><mi id="S7.E8.m1.1.1.1.1.1.1.1.4.3.3.2" mathsize="144%">h</mi><mo id="S7.E8.m1.1.1.1.1.1.1.1.4.3.3.1">⁢</mo><mi id="S7.E8.m1.1.1.1.1.1.1.1.4.3.3.3" mathsize="144%">a</mi><mo id="S7.E8.m1.1.1.1.1.1.1.1.4.3.3.1a">⁢</mo><mi id="S7.E8.m1.1.1.1.1.1.1.1.4.3.3.4" mathsize="144%">n</mi><mo id="S7.E8.m1.1.1.1.1.1.1.1.4.3.3.1b">⁢</mo><mi id="S7.E8.m1.1.1.1.1.1.1.1.4.3.3.5" 
mathsize="144%">d</mi></mrow></msubsup></mrow></mrow><mo id="S7.E8.m1.1.1.1.1.1.1.2" mathsize="144%">,</mo></mrow></mtd><mtd id="S7.E8.m1.4.4d"></mtd><mtd class="ltx_align_left" columnalign="left" id="S7.E8.m1.4.4e"><mrow id="S7.E8.m1.2.2.2.2.1.1"><mrow id="S7.E8.m1.2.2.2.2.1.1.1"><msup id="S7.E8.m1.2.2.2.2.1.1.1.1"><mrow id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1"><mo id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.2" maxsize="144%" minsize="144%">‖</mo><mrow id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.1"><msubsup id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.1.2"><mi id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.1.2.2.2" mathsize="144%">x</mi><mi id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.1.2.2.3" mathsize="144%">t</mi><mrow id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.1.2.3"><mi id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.1.2.3.2" mathsize="144%">o</mi><mo id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.1.2.3.1">⁢</mo><mi id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.1.2.3.3" mathsize="144%">b</mi><mo id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.1.2.3.1a">⁢</mo><mi id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.1.2.3.4" mathsize="144%">j</mi></mrow></msubsup><mo id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.1.1" mathsize="144%">−</mo><msubsup id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.1.3"><mi id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.1.3.2.2" mathsize="144%">x</mi><mi id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.1.3.2.3" mathsize="144%">t</mi><mrow id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.1.3.3"><mi id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.1.3.3.2" mathsize="144%">g</mi><mo id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.1.3.3.1">⁢</mo><mi id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.1.3.3.3" mathsize="144%">o</mi><mo id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.1.3.3.1a">⁢</mo><mi id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.1.3.3.4" mathsize="144%">a</mi><mo id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.1.3.3.1b">⁢</mo><mi id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.1.3.3.5" mathsize="144%">l</mi></mrow></msubsup></mrow><mo id="S7.E8.m1.2.2.2.2.1.1.1.1.1.1.3" maxsize="144%" minsize="144%">‖</mo></mrow><mn id="S7.E8.m1.2.2.2.2.1.1.1.1.3" mathsize="144%">2</mn></msup><mo id="S7.E8.m1.2.2.2.2.1.1.1.2" mathsize="144%">&gt;</mo><mn 
id="S7.E8.m1.2.2.2.2.1.1.1.3" mathsize="144%">0.5</mn></mrow><mo id="S7.E8.m1.2.2.2.2.1.1.2" mathsize="144%">,</mo></mrow></mtd></mtr><mtr id="S7.E8.m1.4.4f"><mtd id="S7.E8.m1.4.4g"></mtd><mtd class="ltx_align_left" columnalign="left" id="S7.E8.m1.4.4h"><mrow id="S7.E8.m1.3.3.3.1.1.1"><mrow id="S7.E8.m1.3.3.3.1.1.1.1"><mrow id="S7.E8.m1.3.3.3.1.1.1.1.2"><mn id="S7.E8.m1.3.3.3.1.1.1.1.2.2" mathsize="144%">0.3</mn><mo id="S7.E8.m1.3.3.3.1.1.1.1.2.1" lspace="0.720em">⁢</mo><msubsup id="S7.E8.m1.3.3.3.1.1.1.1.2.3"><mi id="S7.E8.m1.3.3.3.1.1.1.1.2.3.2.2" mathsize="144%">r</mi><mi id="S7.E8.m1.3.3.3.1.1.1.1.2.3.2.3" mathsize="144%">t</mi><mrow id="S7.E8.m1.3.3.3.1.1.1.1.2.3.3"><mi id="S7.E8.m1.3.3.3.1.1.1.1.2.3.3.2" mathsize="144%">w</mi><mo id="S7.E8.m1.3.3.3.1.1.1.1.2.3.3.1">⁢</mo><mi id="S7.E8.m1.3.3.3.1.1.1.1.2.3.3.3" mathsize="144%">a</mi><mo id="S7.E8.m1.3.3.3.1.1.1.1.2.3.3.1a">⁢</mo><mi id="S7.E8.m1.3.3.3.1.1.1.1.2.3.3.4" mathsize="144%">l</mi><mo id="S7.E8.m1.3.3.3.1.1.1.1.2.3.3.1b">⁢</mo><mi id="S7.E8.m1.3.3.3.1.1.1.1.2.3.3.5" mathsize="144%">k</mi></mrow></msubsup></mrow><mo id="S7.E8.m1.3.3.3.1.1.1.1.1" mathsize="144%">+</mo><mrow id="S7.E8.m1.3.3.3.1.1.1.1.3"><mn id="S7.E8.m1.3.3.3.1.1.1.1.3.2" mathsize="144%">0.5</mn><mo id="S7.E8.m1.3.3.3.1.1.1.1.3.1" lspace="0.720em">⁢</mo><msubsup id="S7.E8.m1.3.3.3.1.1.1.1.3.3"><mi id="S7.E8.m1.3.3.3.1.1.1.1.3.3.2.2" mathsize="144%">r</mi><mi id="S7.E8.m1.3.3.3.1.1.1.1.3.3.2.3" mathsize="144%">t</mi><mrow id="S7.E8.m1.3.3.3.1.1.1.1.3.3.3"><mi id="S7.E8.m1.3.3.3.1.1.1.1.3.3.3.2" mathsize="144%">c</mi><mo id="S7.E8.m1.3.3.3.1.1.1.1.3.3.3.1">⁢</mo><mi id="S7.E8.m1.3.3.3.1.1.1.1.3.3.3.3" mathsize="144%">a</mi><mo id="S7.E8.m1.3.3.3.1.1.1.1.3.3.3.1a">⁢</mo><mi id="S7.E8.m1.3.3.3.1.1.1.1.3.3.3.4" mathsize="144%">r</mi><mo id="S7.E8.m1.3.3.3.1.1.1.1.3.3.3.1b">⁢</mo><mi id="S7.E8.m1.3.3.3.1.1.1.1.3.3.3.5" mathsize="144%">r</mi><mo id="S7.E8.m1.3.3.3.1.1.1.1.3.3.3.1c">⁢</mo><mi id="S7.E8.m1.3.3.3.1.1.1.1.3.3.3.6" 
mathsize="144%">y</mi></mrow></msubsup></mrow><mo id="S7.E8.m1.3.3.3.1.1.1.1.1a" mathsize="144%">+</mo><mn id="S7.E8.m1.3.3.3.1.1.1.1.4" mathsize="144%">0.2</mn></mrow><mo id="S7.E8.m1.3.3.3.1.1.1.2" mathsize="144%">,</mo></mrow></mtd><mtd id="S7.E8.m1.4.4i"></mtd><mtd class="ltx_align_left" columnalign="left" id="S7.E8.m1.4.4j"><mrow id="S7.E8.m1.4.4.4.2.1.3"><mtext id="S7.E8.m1.4.4.4.2.1.1" mathsize="144%">otherwise</mtext><mo id="S7.E8.m1.4.4.4.2.1.3.1" lspace="0em" mathsize="144%">.</mo></mrow></mtd></mtr></mtable></mrow></mrow><annotation encoding="application/x-tex" id="S7.E8.m1.4c">r^{G}_{t}=\left\{\begin{aligned} &amp;0.3\ r_{t}^{walk}+0.5\ r_{t}^{carry}+0.2\ r_% {t}^{hand},&amp;&amp;\|x_{t}^{obj}-x_{t}^{goal}\|^{2}&gt;0.5,\\ &amp;0.3\ r_{t}^{walk}+0.5\ r_{t}^{carry}+0.2,&amp;&amp;\text{otherwise}.\end{aligned}\right.</annotation><annotation encoding="application/x-llamapun" id="S7.E8.m1.4d">italic_r start_POSTSUPERSCRIPT italic_G end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = { start_ROW start_CELL end_CELL start_CELL 0.3 italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_w italic_a italic_l italic_k end_POSTSUPERSCRIPT + 0.5 italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_c italic_a italic_r italic_r italic_y end_POSTSUPERSCRIPT + 0.2 italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_h italic_a italic_n italic_d end_POSTSUPERSCRIPT , end_CELL start_CELL end_CELL start_CELL ∥ italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_o italic_b italic_j end_POSTSUPERSCRIPT - italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g italic_o italic_a italic_l end_POSTSUPERSCRIPT ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT &gt; 0.5 , end_CELL end_ROW start_ROW start_CELL end_CELL start_CELL 0.3 italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 
italic_w italic_a italic_l italic_k end_POSTSUPERSCRIPT + 0.5 italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_c italic_a italic_r italic_r italic_y end_POSTSUPERSCRIPT + 0.2 , end_CELL start_CELL end_CELL start_CELL otherwise . end_CELL end_ROW</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(8)</span></td> </tr></tbody> </table> </div> <div class="ltx_para" id="S7.I1.i3.p3"> <table class="ltx_equationgroup ltx_eqn_table" id="S7.E9"> <tbody> <tr class="ltx_equation ltx_eqn_row ltx_align_baseline" id="S7.E9X"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_align_right ltx_eqn_cell"><math alttext="\displaystyle r_{t}^{walk}" class="ltx_Math" display="inline" id="S7.E9X.2.1.1.m1.1"><semantics id="S7.E9X.2.1.1.m1.1a"><msubsup id="S7.E9X.2.1.1.m1.1.1" xref="S7.E9X.2.1.1.m1.1.1.cmml"><mi id="S7.E9X.2.1.1.m1.1.1.2.2" mathsize="144%" xref="S7.E9X.2.1.1.m1.1.1.2.2.cmml">r</mi><mi id="S7.E9X.2.1.1.m1.1.1.2.3" mathsize="144%" xref="S7.E9X.2.1.1.m1.1.1.2.3.cmml">t</mi><mrow id="S7.E9X.2.1.1.m1.1.1.3" xref="S7.E9X.2.1.1.m1.1.1.3.cmml"><mi id="S7.E9X.2.1.1.m1.1.1.3.2" mathsize="144%" xref="S7.E9X.2.1.1.m1.1.1.3.2.cmml">w</mi><mo id="S7.E9X.2.1.1.m1.1.1.3.1" xref="S7.E9X.2.1.1.m1.1.1.3.1.cmml">⁢</mo><mi id="S7.E9X.2.1.1.m1.1.1.3.3" mathsize="144%" xref="S7.E9X.2.1.1.m1.1.1.3.3.cmml">a</mi><mo id="S7.E9X.2.1.1.m1.1.1.3.1a" xref="S7.E9X.2.1.1.m1.1.1.3.1.cmml">⁢</mo><mi id="S7.E9X.2.1.1.m1.1.1.3.4" mathsize="144%" xref="S7.E9X.2.1.1.m1.1.1.3.4.cmml">l</mi><mo id="S7.E9X.2.1.1.m1.1.1.3.1b" xref="S7.E9X.2.1.1.m1.1.1.3.1.cmml">⁢</mo><mi id="S7.E9X.2.1.1.m1.1.1.3.5" mathsize="144%" xref="S7.E9X.2.1.1.m1.1.1.3.5.cmml">k</mi></mrow></msubsup><annotation-xml encoding="MathML-Content" id="S7.E9X.2.1.1.m1.1b"><apply id="S7.E9X.2.1.1.m1.1.1.cmml" 
xref="S7.E9X.2.1.1.m1.1.1"><csymbol cd="ambiguous" id="S7.E9X.2.1.1.m1.1.1.1.cmml" xref="S7.E9X.2.1.1.m1.1.1">superscript</csymbol><apply id="S7.E9X.2.1.1.m1.1.1.2.cmml" xref="S7.E9X.2.1.1.m1.1.1"><csymbol cd="ambiguous" id="S7.E9X.2.1.1.m1.1.1.2.1.cmml" xref="S7.E9X.2.1.1.m1.1.1">subscript</csymbol><ci id="S7.E9X.2.1.1.m1.1.1.2.2.cmml" xref="S7.E9X.2.1.1.m1.1.1.2.2">𝑟</ci><ci id="S7.E9X.2.1.1.m1.1.1.2.3.cmml" xref="S7.E9X.2.1.1.m1.1.1.2.3">𝑡</ci></apply><apply id="S7.E9X.2.1.1.m1.1.1.3.cmml" xref="S7.E9X.2.1.1.m1.1.1.3"><times id="S7.E9X.2.1.1.m1.1.1.3.1.cmml" xref="S7.E9X.2.1.1.m1.1.1.3.1"></times><ci id="S7.E9X.2.1.1.m1.1.1.3.2.cmml" xref="S7.E9X.2.1.1.m1.1.1.3.2">𝑤</ci><ci id="S7.E9X.2.1.1.m1.1.1.3.3.cmml" xref="S7.E9X.2.1.1.m1.1.1.3.3">𝑎</ci><ci id="S7.E9X.2.1.1.m1.1.1.3.4.cmml" xref="S7.E9X.2.1.1.m1.1.1.3.4">𝑙</ci><ci id="S7.E9X.2.1.1.m1.1.1.3.5.cmml" xref="S7.E9X.2.1.1.m1.1.1.3.5">𝑘</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.E9X.2.1.1.m1.1c">\displaystyle r_{t}^{walk}</annotation><annotation encoding="application/x-llamapun" id="S7.E9X.2.1.1.m1.1d">italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_w italic_a italic_l italic_k end_POSTSUPERSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_left ltx_eqn_cell"><math alttext="\displaystyle=0.8\cdot\exp\big{(}-10.0\cdot\|x_{t}^{root}-x_{t}^{obj}\|^{2}% \big{)}" class="ltx_Math" display="inline" id="S7.E9X.3.2.2.m1.2"><semantics id="S7.E9X.3.2.2.m1.2a"><mrow id="S7.E9X.3.2.2.m1.2.2" xref="S7.E9X.3.2.2.m1.2.2.cmml"><mi id="S7.E9X.3.2.2.m1.2.2.3" xref="S7.E9X.3.2.2.m1.2.2.3.cmml"></mi><mo id="S7.E9X.3.2.2.m1.2.2.2" mathsize="144%" xref="S7.E9X.3.2.2.m1.2.2.2.cmml">=</mo><mrow id="S7.E9X.3.2.2.m1.2.2.1" xref="S7.E9X.3.2.2.m1.2.2.1.cmml"><mn id="S7.E9X.3.2.2.m1.2.2.1.3" mathsize="144%" xref="S7.E9X.3.2.2.m1.2.2.1.3.cmml">0.8</mn><mo id="S7.E9X.3.2.2.m1.2.2.1.2" lspace="0.222em" mathsize="144%" rspace="0.222em" 
xref="S7.E9X.3.2.2.m1.2.2.1.2.cmml">⋅</mo><mrow id="S7.E9X.3.2.2.m1.2.2.1.1.1" xref="S7.E9X.3.2.2.m1.2.2.1.1.2.cmml"><mi id="S7.E9X.3.2.2.m1.1.1" mathsize="144%" xref="S7.E9X.3.2.2.m1.1.1.cmml">exp</mi><mo id="S7.E9X.3.2.2.m1.2.2.1.1.1a" xref="S7.E9X.3.2.2.m1.2.2.1.1.2.cmml">⁡</mo><mrow id="S7.E9X.3.2.2.m1.2.2.1.1.1.1" xref="S7.E9X.3.2.2.m1.2.2.1.1.2.cmml"><mo id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.2" maxsize="120%" minsize="120%" xref="S7.E9X.3.2.2.m1.2.2.1.1.2.cmml">(</mo><mrow id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.cmml"><mo id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1a" mathsize="144%" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.cmml">−</mo><mrow id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.cmml"><mn id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.3" mathsize="144%" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.3.cmml">10.0</mn><mo id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.2" lspace="0.222em" mathsize="144%" rspace="0.222em" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.2.cmml">⋅</mo><msup id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.cmml"><mrow id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.2.cmml"><mo id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2" maxsize="144%" minsize="144%" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.2.1.cmml">‖</mo><mrow id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.cmml"><msubsup id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.cmml"><mi id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.2.2" mathsize="144%" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.2.2.cmml">x</mi><mi id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.2.3" mathsize="144%" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.2.3.cmml">t</mi><mrow id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.cmml"><mi id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.2" mathsize="144%" 
xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.2.cmml">r</mi><mo id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.1" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.3" mathsize="144%" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.3.cmml">o</mi><mo id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.1a" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.4" mathsize="144%" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.4.cmml">o</mi><mo id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.1b" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.5" mathsize="144%" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.5.cmml">t</mi></mrow></msubsup><mo id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.1" mathsize="144%" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.cmml">−</mo><msubsup id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.cmml"><mi id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.2.2" mathsize="144%" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.2.2.cmml">x</mi><mi id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.2.3" mathsize="144%" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.2.3.cmml">t</mi><mrow id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.cmml"><mi id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.2" mathsize="144%" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.2.cmml">o</mi><mo id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.1" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.3" mathsize="144%" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.3.cmml">b</mi><mo id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.1a" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml">⁢</mo><mi 
id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.4" mathsize="144%" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.4.cmml">j</mi></mrow></msubsup></mrow><mo id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3" maxsize="144%" minsize="144%" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.2.1.cmml">‖</mo></mrow><mn id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.3" mathsize="144%" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.3.cmml">2</mn></msup></mrow></mrow><mo id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.3" maxsize="120%" minsize="120%" xref="S7.E9X.3.2.2.m1.2.2.1.1.2.cmml">)</mo></mrow></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S7.E9X.3.2.2.m1.2b"><apply id="S7.E9X.3.2.2.m1.2.2.cmml" xref="S7.E9X.3.2.2.m1.2.2"><eq id="S7.E9X.3.2.2.m1.2.2.2.cmml" xref="S7.E9X.3.2.2.m1.2.2.2"></eq><csymbol cd="latexml" id="S7.E9X.3.2.2.m1.2.2.3.cmml" xref="S7.E9X.3.2.2.m1.2.2.3">absent</csymbol><apply id="S7.E9X.3.2.2.m1.2.2.1.cmml" xref="S7.E9X.3.2.2.m1.2.2.1"><ci id="S7.E9X.3.2.2.m1.2.2.1.2.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.2">⋅</ci><cn id="S7.E9X.3.2.2.m1.2.2.1.3.cmml" type="float" xref="S7.E9X.3.2.2.m1.2.2.1.3">0.8</cn><apply id="S7.E9X.3.2.2.m1.2.2.1.1.2.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1"><exp id="S7.E9X.3.2.2.m1.1.1.cmml" xref="S7.E9X.3.2.2.m1.1.1"></exp><apply id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1"><minus id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.2.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1"></minus><apply id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1"><ci id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.2.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.2">⋅</ci><cn id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.3.cmml" type="float" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.3">10.0</cn><apply id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.2.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1">superscript</csymbol><apply 
id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1"><csymbol cd="latexml" id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.2.1.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2">norm</csymbol><apply id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1"><minus id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.1"></minus><apply id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2">superscript</csymbol><apply id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.2.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.2.1.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2">subscript</csymbol><ci id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.2.2.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.2.2">𝑥</ci><ci id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.2.3.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.2.3">𝑡</ci></apply><apply id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3"><times id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.1.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.1"></times><ci id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.2.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.2">𝑟</ci><ci id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.3.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.3">𝑜</ci><ci id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.4.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.4">𝑜</ci><ci id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.5.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.5">𝑡</ci></apply></apply><apply id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.cmml" 
xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.1.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3">superscript</csymbol><apply id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.2.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.2.1.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3">subscript</csymbol><ci id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.2.2.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.2.2">𝑥</ci><ci id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.2.3.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.2.3">𝑡</ci></apply><apply id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3"><times id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.1"></times><ci id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.2.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.2">𝑜</ci><ci id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.3.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.3">𝑏</ci><ci id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.4.cmml" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.4">𝑗</ci></apply></apply></apply></apply><cn id="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.3.cmml" type="integer" xref="S7.E9X.3.2.2.m1.2.2.1.1.1.1.1.1.1.3">2</cn></apply></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.E9X.3.2.2.m1.2c">\displaystyle=0.8\cdot\exp\big{(}-10.0\cdot\|x_{t}^{root}-x_{t}^{obj}\|^{2}% \big{)}</annotation><annotation encoding="application/x-llamapun" id="S7.E9X.3.2.2.m1.2d">= 0.8 ⋅ roman_exp ( - 10.0 ⋅ ∥ italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_r italic_o italic_o italic_t end_POSTSUPERSCRIPT - italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_o italic_b 
italic_j end_POSTSUPERSCRIPT ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT )</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="2"><span class="ltx_tag ltx_tag_equationgroup ltx_align_right">(9)</span></td> </tr> <tr class="ltx_equation ltx_eqn_row ltx_align_baseline" id="S7.E9Xa"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_eqn_cell"></td> <td class="ltx_td ltx_align_left ltx_eqn_cell"><math alttext="\displaystyle+0.2\cdot\exp\big{(}-2.0\cdot\|v_{t}^{root}-v_{t}^{goal}\|^{2}% \big{)}," class="ltx_Math" display="inline" id="S7.E9Xa.2.1.1.m1.2"><semantics id="S7.E9Xa.2.1.1.m1.2a"><mrow id="S7.E9Xa.2.1.1.m1.2.2.1" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.cmml"><mrow id="S7.E9Xa.2.1.1.m1.2.2.1.1" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.cmml"><mo id="S7.E9Xa.2.1.1.m1.2.2.1.1a" mathsize="144%" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.cmml">+</mo><mrow id="S7.E9Xa.2.1.1.m1.2.2.1.1.1" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.cmml"><mn id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.3" mathsize="144%" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.3.cmml">0.2</mn><mo id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.2" lspace="0.222em" mathsize="144%" rspace="0.222em" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.2.cmml">⋅</mo><mrow id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.2.cmml"><mi id="S7.E9Xa.2.1.1.m1.1.1" mathsize="144%" xref="S7.E9Xa.2.1.1.m1.1.1.cmml">exp</mi><mo id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1a" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.2.cmml">⁡</mo><mrow id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.2.cmml"><mo id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.2" maxsize="120%" minsize="120%" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.2.cmml">(</mo><mrow id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.cmml"><mo id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1a" mathsize="144%" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.cmml">−</mo><mrow 
id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.cmml"><mn id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.3" mathsize="144%" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.3.cmml">2.0</mn><mo id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.2" lspace="0.222em" mathsize="144%" rspace="0.222em" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.2.cmml">⋅</mo><msup id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.cmml"><mrow id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.cmml"><mo id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.2" maxsize="144%" minsize="144%" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.1.cmml">‖</mo><mrow id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.cmml"><msubsup id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml"><mi id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.2.2" mathsize="144%" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.2.2.cmml">v</mi><mi id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.2.3" mathsize="144%" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.cmml">t</mi><mrow id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.cmml"><mi id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.2" mathsize="144%" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.2.cmml">r</mi><mo id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.1" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.3" mathsize="144%" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.3.cmml">o</mi><mo id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.1a" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.4" mathsize="144%" 
xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.4.cmml">o</mi><mo id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.1b" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.5" mathsize="144%" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.5.cmml">t</mi></mrow></msubsup><mo id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.1" mathsize="144%" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml">−</mo><msubsup id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.cmml"><mi id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.2.2" mathsize="144%" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.cmml">v</mi><mi id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.2.3" mathsize="144%" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.cmml">t</mi><mrow id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml"><mi id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.2" mathsize="144%" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.cmml">g</mi><mo id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.1" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.3" mathsize="144%" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.3.cmml">o</mi><mo id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.1a" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.4" mathsize="144%" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.4.cmml">a</mi><mo id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.1b" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.5" mathsize="144%" 
xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.5.cmml">l</mi></mrow></msubsup></mrow><mo id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.3" maxsize="144%" minsize="144%" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.1.cmml">‖</mo></mrow><mn id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.3" mathsize="144%" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.3.cmml">2</mn></msup></mrow></mrow><mo id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.3" maxsize="120%" minsize="120%" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.2.cmml">)</mo></mrow></mrow></mrow></mrow><mo id="S7.E9Xa.2.1.1.m1.2.2.1.2" mathsize="144%" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.cmml">,</mo></mrow><annotation-xml encoding="MathML-Content" id="S7.E9Xa.2.1.1.m1.2b"><apply id="S7.E9Xa.2.1.1.m1.2.2.1.1.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1"><plus id="S7.E9Xa.2.1.1.m1.2.2.1.1.2.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1"></plus><apply id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1"><ci id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.2.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.2">⋅</ci><cn id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.3.cmml" type="float" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.3">0.2</cn><apply id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.2.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1"><exp id="S7.E9Xa.2.1.1.m1.1.1.cmml" xref="S7.E9Xa.2.1.1.m1.1.1"></exp><apply id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1"><minus id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.2.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1"></minus><apply id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1"><ci id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.2">⋅</ci><cn id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.3.cmml" type="float" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.3">2.0</cn><apply id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.2.cmml" 
xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1">superscript</csymbol><apply id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1"><csymbol cd="latexml" id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.2">norm</csymbol><apply id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1"><minus id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.1"></minus><apply id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2">superscript</csymbol><apply id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.2.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.2.1.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2">subscript</csymbol><ci id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.2.2.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.2.2">𝑣</ci><ci id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.2.3">𝑡</ci></apply><apply id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3"><times id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.1.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.1"></times><ci id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.2.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.2">𝑟</ci><ci id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.3.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.3">𝑜</ci><ci id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.4.cmml" 
xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.4">𝑜</ci><ci id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.5.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.5">𝑡</ci></apply></apply><apply id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.1.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3">superscript</csymbol><apply id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.2.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.2.1.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3">subscript</csymbol><ci id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.2.2">𝑣</ci><ci id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.2.3">𝑡</ci></apply><apply id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3"><times id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.1"></times><ci id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.2">𝑔</ci><ci id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.3.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.3">𝑜</ci><ci id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.4.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.4">𝑎</ci><ci id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.5.cmml" xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.5">𝑙</ci></apply></apply></apply></apply><cn id="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.3.cmml" type="integer" 
xref="S7.E9Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.3">2</cn></apply></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.E9Xa.2.1.1.m1.2c">\displaystyle+0.2\cdot\exp\big{(}-2.0\cdot\|v_{t}^{root}-v_{t}^{goal}\|^{2}% \big{)},</annotation><annotation encoding="application/x-llamapun" id="S7.E9Xa.2.1.1.m1.2d">+ 0.2 ⋅ roman_exp ( - 2.0 ⋅ ∥ italic_v start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_r italic_o italic_o italic_t end_POSTSUPERSCRIPT - italic_v start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g italic_o italic_a italic_l end_POSTSUPERSCRIPT ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) ,</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> </tr> </tbody> </table> </div> <div class="ltx_para" id="S7.I1.i3.p4"> <table class="ltx_equationgroup ltx_eqn_table" id="S7.E10"> <tbody> <tr class="ltx_equation ltx_eqn_row ltx_align_baseline" id="S7.E10X"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_align_right ltx_eqn_cell"><math alttext="\displaystyle r_{t}^{hand}" class="ltx_Math" display="inline" id="S7.E10X.2.1.1.m1.1"><semantics id="S7.E10X.2.1.1.m1.1a"><msubsup id="S7.E10X.2.1.1.m1.1.1" xref="S7.E10X.2.1.1.m1.1.1.cmml"><mi id="S7.E10X.2.1.1.m1.1.1.2.2" mathsize="144%" xref="S7.E10X.2.1.1.m1.1.1.2.2.cmml">r</mi><mi id="S7.E10X.2.1.1.m1.1.1.2.3" mathsize="144%" xref="S7.E10X.2.1.1.m1.1.1.2.3.cmml">t</mi><mrow id="S7.E10X.2.1.1.m1.1.1.3" xref="S7.E10X.2.1.1.m1.1.1.3.cmml"><mi id="S7.E10X.2.1.1.m1.1.1.3.2" mathsize="144%" xref="S7.E10X.2.1.1.m1.1.1.3.2.cmml">h</mi><mo id="S7.E10X.2.1.1.m1.1.1.3.1" xref="S7.E10X.2.1.1.m1.1.1.3.1.cmml">⁢</mo><mi id="S7.E10X.2.1.1.m1.1.1.3.3" mathsize="144%" xref="S7.E10X.2.1.1.m1.1.1.3.3.cmml">a</mi><mo id="S7.E10X.2.1.1.m1.1.1.3.1a" xref="S7.E10X.2.1.1.m1.1.1.3.1.cmml">⁢</mo><mi id="S7.E10X.2.1.1.m1.1.1.3.4" mathsize="144%" 
xref="S7.E10X.2.1.1.m1.1.1.3.4.cmml">n</mi><mo id="S7.E10X.2.1.1.m1.1.1.3.1b" xref="S7.E10X.2.1.1.m1.1.1.3.1.cmml">⁢</mo><mi id="S7.E10X.2.1.1.m1.1.1.3.5" mathsize="144%" xref="S7.E10X.2.1.1.m1.1.1.3.5.cmml">d</mi></mrow></msubsup><annotation-xml encoding="MathML-Content" id="S7.E10X.2.1.1.m1.1b"><apply id="S7.E10X.2.1.1.m1.1.1.cmml" xref="S7.E10X.2.1.1.m1.1.1"><csymbol cd="ambiguous" id="S7.E10X.2.1.1.m1.1.1.1.cmml" xref="S7.E10X.2.1.1.m1.1.1">superscript</csymbol><apply id="S7.E10X.2.1.1.m1.1.1.2.cmml" xref="S7.E10X.2.1.1.m1.1.1"><csymbol cd="ambiguous" id="S7.E10X.2.1.1.m1.1.1.2.1.cmml" xref="S7.E10X.2.1.1.m1.1.1">subscript</csymbol><ci id="S7.E10X.2.1.1.m1.1.1.2.2.cmml" xref="S7.E10X.2.1.1.m1.1.1.2.2">𝑟</ci><ci id="S7.E10X.2.1.1.m1.1.1.2.3.cmml" xref="S7.E10X.2.1.1.m1.1.1.2.3">𝑡</ci></apply><apply id="S7.E10X.2.1.1.m1.1.1.3.cmml" xref="S7.E10X.2.1.1.m1.1.1.3"><times id="S7.E10X.2.1.1.m1.1.1.3.1.cmml" xref="S7.E10X.2.1.1.m1.1.1.3.1"></times><ci id="S7.E10X.2.1.1.m1.1.1.3.2.cmml" xref="S7.E10X.2.1.1.m1.1.1.3.2">ℎ</ci><ci id="S7.E10X.2.1.1.m1.1.1.3.3.cmml" xref="S7.E10X.2.1.1.m1.1.1.3.3">𝑎</ci><ci id="S7.E10X.2.1.1.m1.1.1.3.4.cmml" xref="S7.E10X.2.1.1.m1.1.1.3.4">𝑛</ci><ci id="S7.E10X.2.1.1.m1.1.1.3.5.cmml" xref="S7.E10X.2.1.1.m1.1.1.3.5">𝑑</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.E10X.2.1.1.m1.1c">\displaystyle r_{t}^{hand}</annotation><annotation encoding="application/x-llamapun" id="S7.E10X.2.1.1.m1.1d">italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_h italic_a italic_n italic_d end_POSTSUPERSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_left ltx_eqn_cell"><math alttext="\displaystyle=\exp\big{(}-0.5\cdot\|x_{t}^{hand}-x_{t}^{obj}\|^{2}\big{)}" class="ltx_Math" display="inline" id="S7.E10X.3.2.2.m1.2"><semantics id="S7.E10X.3.2.2.m1.2a"><mrow id="S7.E10X.3.2.2.m1.2.2" xref="S7.E10X.3.2.2.m1.2.2.cmml"><mi id="S7.E10X.3.2.2.m1.2.2.3" 
xref="S7.E10X.3.2.2.m1.2.2.3.cmml"></mi><mo id="S7.E10X.3.2.2.m1.2.2.2" mathsize="144%" xref="S7.E10X.3.2.2.m1.2.2.2.cmml">=</mo><mrow id="S7.E10X.3.2.2.m1.2.2.1.1" xref="S7.E10X.3.2.2.m1.2.2.1.2.cmml"><mi id="S7.E10X.3.2.2.m1.1.1" mathsize="144%" xref="S7.E10X.3.2.2.m1.1.1.cmml">exp</mi><mo id="S7.E10X.3.2.2.m1.2.2.1.1a" xref="S7.E10X.3.2.2.m1.2.2.1.2.cmml">⁡</mo><mrow id="S7.E10X.3.2.2.m1.2.2.1.1.1" xref="S7.E10X.3.2.2.m1.2.2.1.2.cmml"><mo id="S7.E10X.3.2.2.m1.2.2.1.1.1.2" maxsize="120%" minsize="120%" xref="S7.E10X.3.2.2.m1.2.2.1.2.cmml">(</mo><mrow id="S7.E10X.3.2.2.m1.2.2.1.1.1.1" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.cmml"><mo id="S7.E10X.3.2.2.m1.2.2.1.1.1.1a" mathsize="144%" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.cmml">−</mo><mrow id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.cmml"><mn id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.3" mathsize="144%" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.3.cmml">0.5</mn><mo id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.2" lspace="0.222em" mathsize="144%" rspace="0.222em" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.2.cmml">⋅</mo><msup id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.cmml"><mrow id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.2.cmml"><mo id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.2" maxsize="144%" minsize="144%" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.2.1.cmml">‖</mo><mrow id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.cmml"><msubsup id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.cmml"><mi id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.2.2" mathsize="144%" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.2.2.cmml">x</mi><mi id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.2.3" mathsize="144%" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.2.3.cmml">t</mi><mrow id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.cmml"><mi 
id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.2" mathsize="144%" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.2.cmml">h</mi><mo id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.1" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.3" mathsize="144%" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.3.cmml">a</mi><mo id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.1a" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.4" mathsize="144%" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.4.cmml">n</mi><mo id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.1b" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.5" mathsize="144%" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.5.cmml">d</mi></mrow></msubsup><mo id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1" mathsize="144%" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.cmml">−</mo><msubsup id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.cmml"><mi id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.2.2" mathsize="144%" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.2.2.cmml">x</mi><mi id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.2.3" mathsize="144%" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.2.3.cmml">t</mi><mrow id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.3" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.3.cmml"><mi id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.3.2" mathsize="144%" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.3.2.cmml">o</mi><mo id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.3.1" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.3.3" mathsize="144%" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.3.3.cmml">b</mi><mo id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.3.1a" 
xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.3.4" mathsize="144%" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.3.4.cmml">j</mi></mrow></msubsup></mrow><mo id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.3" maxsize="144%" minsize="144%" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.2.1.cmml">‖</mo></mrow><mn id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.3" mathsize="144%" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.3.cmml">2</mn></msup></mrow></mrow><mo id="S7.E10X.3.2.2.m1.2.2.1.1.1.3" maxsize="120%" minsize="120%" xref="S7.E10X.3.2.2.m1.2.2.1.2.cmml">)</mo></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S7.E10X.3.2.2.m1.2b"><apply id="S7.E10X.3.2.2.m1.2.2.cmml" xref="S7.E10X.3.2.2.m1.2.2"><eq id="S7.E10X.3.2.2.m1.2.2.2.cmml" xref="S7.E10X.3.2.2.m1.2.2.2"></eq><csymbol cd="latexml" id="S7.E10X.3.2.2.m1.2.2.3.cmml" xref="S7.E10X.3.2.2.m1.2.2.3">absent</csymbol><apply id="S7.E10X.3.2.2.m1.2.2.1.2.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1"><exp id="S7.E10X.3.2.2.m1.1.1.cmml" xref="S7.E10X.3.2.2.m1.1.1"></exp><apply id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1"><minus id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.2.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1"></minus><apply id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1"><ci id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.2.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.2">⋅</ci><cn id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.3.cmml" type="float" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.3">0.5</cn><apply id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.2.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1">superscript</csymbol><apply id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.2.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1"><csymbol cd="latexml" id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.2.1.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.2">norm</csymbol><apply 
id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1"><minus id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1"></minus><apply id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2">superscript</csymbol><apply id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.2.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.2.1.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2">subscript</csymbol><ci id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.2.2.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.2.2">𝑥</ci><ci id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.2.3.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.2.3">𝑡</ci></apply><apply id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3"><times id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.1.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.1"></times><ci id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.2.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.2">ℎ</ci><ci id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.3.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.3">𝑎</ci><ci id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.4.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.4">𝑛</ci><ci id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.5.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2.3.5">𝑑</ci></apply></apply><apply id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.1.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3">superscript</csymbol><apply id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.2.cmml" 
xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.2.1.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3">subscript</csymbol><ci id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.2.2.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.2.2">𝑥</ci><ci id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.2.3.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.2.3">𝑡</ci></apply><apply id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.3.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.3"><times id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.3.1.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.3.1"></times><ci id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.3.2.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.3.2">𝑜</ci><ci id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.3.3.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.3.3">𝑏</ci><ci id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.3.4.cmml" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3.3.4">𝑗</ci></apply></apply></apply></apply><cn id="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.3.cmml" type="integer" xref="S7.E10X.3.2.2.m1.2.2.1.1.1.1.1.1.3">2</cn></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.E10X.3.2.2.m1.2c">\displaystyle=\exp\big{(}-0.5\cdot\|x_{t}^{hand}-x_{t}^{obj}\|^{2}\big{)}</annotation><annotation encoding="application/x-llamapun" id="S7.E10X.3.2.2.m1.2d">= roman_exp ( - 0.5 ⋅ ∥ italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_h italic_a italic_n italic_d end_POSTSUPERSCRIPT - italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_o italic_b italic_j end_POSTSUPERSCRIPT ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT )</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equationgroup 
ltx_align_right">(10)</span></td> </tr> </tbody> </table> </div> <div class="ltx_para" id="S7.I1.i3.p5"> <table class="ltx_equationgroup ltx_eqn_table" id="S7.E11"> <tbody> <tr class="ltx_equation ltx_eqn_row ltx_align_baseline" id="S7.E11X"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_align_right ltx_eqn_cell"><math alttext="\displaystyle r_{t}^{carry}" class="ltx_Math" display="inline" id="S7.E11X.2.1.1.m1.1"><semantics id="S7.E11X.2.1.1.m1.1a"><msubsup id="S7.E11X.2.1.1.m1.1.1" xref="S7.E11X.2.1.1.m1.1.1.cmml"><mi id="S7.E11X.2.1.1.m1.1.1.2.2" mathsize="144%" xref="S7.E11X.2.1.1.m1.1.1.2.2.cmml">r</mi><mi id="S7.E11X.2.1.1.m1.1.1.2.3" mathsize="144%" xref="S7.E11X.2.1.1.m1.1.1.2.3.cmml">t</mi><mrow id="S7.E11X.2.1.1.m1.1.1.3" xref="S7.E11X.2.1.1.m1.1.1.3.cmml"><mi id="S7.E11X.2.1.1.m1.1.1.3.2" mathsize="144%" xref="S7.E11X.2.1.1.m1.1.1.3.2.cmml">c</mi><mo id="S7.E11X.2.1.1.m1.1.1.3.1" xref="S7.E11X.2.1.1.m1.1.1.3.1.cmml">⁢</mo><mi id="S7.E11X.2.1.1.m1.1.1.3.3" mathsize="144%" xref="S7.E11X.2.1.1.m1.1.1.3.3.cmml">a</mi><mo id="S7.E11X.2.1.1.m1.1.1.3.1a" xref="S7.E11X.2.1.1.m1.1.1.3.1.cmml">⁢</mo><mi id="S7.E11X.2.1.1.m1.1.1.3.4" mathsize="144%" xref="S7.E11X.2.1.1.m1.1.1.3.4.cmml">r</mi><mo id="S7.E11X.2.1.1.m1.1.1.3.1b" xref="S7.E11X.2.1.1.m1.1.1.3.1.cmml">⁢</mo><mi id="S7.E11X.2.1.1.m1.1.1.3.5" mathsize="144%" xref="S7.E11X.2.1.1.m1.1.1.3.5.cmml">r</mi><mo id="S7.E11X.2.1.1.m1.1.1.3.1c" xref="S7.E11X.2.1.1.m1.1.1.3.1.cmml">⁢</mo><mi id="S7.E11X.2.1.1.m1.1.1.3.6" mathsize="144%" xref="S7.E11X.2.1.1.m1.1.1.3.6.cmml">y</mi></mrow></msubsup><annotation-xml encoding="MathML-Content" id="S7.E11X.2.1.1.m1.1b"><apply id="S7.E11X.2.1.1.m1.1.1.cmml" xref="S7.E11X.2.1.1.m1.1.1"><csymbol cd="ambiguous" id="S7.E11X.2.1.1.m1.1.1.1.cmml" xref="S7.E11X.2.1.1.m1.1.1">superscript</csymbol><apply id="S7.E11X.2.1.1.m1.1.1.2.cmml" xref="S7.E11X.2.1.1.m1.1.1"><csymbol cd="ambiguous" id="S7.E11X.2.1.1.m1.1.1.2.1.cmml" 
xref="S7.E11X.2.1.1.m1.1.1">subscript</csymbol><ci id="S7.E11X.2.1.1.m1.1.1.2.2.cmml" xref="S7.E11X.2.1.1.m1.1.1.2.2">𝑟</ci><ci id="S7.E11X.2.1.1.m1.1.1.2.3.cmml" xref="S7.E11X.2.1.1.m1.1.1.2.3">𝑡</ci></apply><apply id="S7.E11X.2.1.1.m1.1.1.3.cmml" xref="S7.E11X.2.1.1.m1.1.1.3"><times id="S7.E11X.2.1.1.m1.1.1.3.1.cmml" xref="S7.E11X.2.1.1.m1.1.1.3.1"></times><ci id="S7.E11X.2.1.1.m1.1.1.3.2.cmml" xref="S7.E11X.2.1.1.m1.1.1.3.2">𝑐</ci><ci id="S7.E11X.2.1.1.m1.1.1.3.3.cmml" xref="S7.E11X.2.1.1.m1.1.1.3.3">𝑎</ci><ci id="S7.E11X.2.1.1.m1.1.1.3.4.cmml" xref="S7.E11X.2.1.1.m1.1.1.3.4">𝑟</ci><ci id="S7.E11X.2.1.1.m1.1.1.3.5.cmml" xref="S7.E11X.2.1.1.m1.1.1.3.5">𝑟</ci><ci id="S7.E11X.2.1.1.m1.1.1.3.6.cmml" xref="S7.E11X.2.1.1.m1.1.1.3.6">𝑦</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.E11X.2.1.1.m1.1c">\displaystyle r_{t}^{carry}</annotation><annotation encoding="application/x-llamapun" id="S7.E11X.2.1.1.m1.1d">italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_c italic_a italic_r italic_r italic_y end_POSTSUPERSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_left ltx_eqn_cell"><math alttext="\displaystyle=0.7\cdot\exp\big{(}-10.0\cdot\|x_{t}^{obj}-x_{t}^{goal}\|^{2}% \big{)}" class="ltx_Math" display="inline" id="S7.E11X.3.2.2.m1.2"><semantics id="S7.E11X.3.2.2.m1.2a"><mrow id="S7.E11X.3.2.2.m1.2.2" xref="S7.E11X.3.2.2.m1.2.2.cmml"><mi id="S7.E11X.3.2.2.m1.2.2.3" xref="S7.E11X.3.2.2.m1.2.2.3.cmml"></mi><mo id="S7.E11X.3.2.2.m1.2.2.2" mathsize="144%" xref="S7.E11X.3.2.2.m1.2.2.2.cmml">=</mo><mrow id="S7.E11X.3.2.2.m1.2.2.1" xref="S7.E11X.3.2.2.m1.2.2.1.cmml"><mn id="S7.E11X.3.2.2.m1.2.2.1.3" mathsize="144%" xref="S7.E11X.3.2.2.m1.2.2.1.3.cmml">0.7</mn><mo id="S7.E11X.3.2.2.m1.2.2.1.2" lspace="0.222em" mathsize="144%" rspace="0.222em" xref="S7.E11X.3.2.2.m1.2.2.1.2.cmml">⋅</mo><mrow id="S7.E11X.3.2.2.m1.2.2.1.1.1" xref="S7.E11X.3.2.2.m1.2.2.1.1.2.cmml"><mi 
id="S7.E11X.3.2.2.m1.1.1" mathsize="144%" xref="S7.E11X.3.2.2.m1.1.1.cmml">exp</mi><mo id="S7.E11X.3.2.2.m1.2.2.1.1.1a" xref="S7.E11X.3.2.2.m1.2.2.1.1.2.cmml">⁡</mo><mrow id="S7.E11X.3.2.2.m1.2.2.1.1.1.1" xref="S7.E11X.3.2.2.m1.2.2.1.1.2.cmml"><mo id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.2" maxsize="120%" minsize="120%" xref="S7.E11X.3.2.2.m1.2.2.1.1.2.cmml">(</mo><mrow id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.cmml"><mo id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1a" mathsize="144%" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.cmml">−</mo><mrow id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.cmml"><mn id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.3" mathsize="144%" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.3.cmml">10.0</mn><mo id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.2" lspace="0.222em" mathsize="144%" rspace="0.222em" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.2.cmml">⋅</mo><msup id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.cmml"><mrow id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.2.cmml"><mo id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2" maxsize="144%" minsize="144%" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.2.1.cmml">‖</mo><mrow id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.cmml"><msubsup id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.cmml"><mi id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.2.2" mathsize="144%" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.2.2.cmml">x</mi><mi id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.2.3" mathsize="144%" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.2.3.cmml">t</mi><mrow id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.cmml"><mi id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.2" mathsize="144%" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.2.cmml">o</mi><mo 
id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.1" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.3" mathsize="144%" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.3.cmml">b</mi><mo id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.1a" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.4" mathsize="144%" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.4.cmml">j</mi></mrow></msubsup><mo id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.1" mathsize="144%" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.cmml">−</mo><msubsup id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.cmml"><mi id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.2.2" mathsize="144%" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.2.2.cmml">x</mi><mi id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.2.3" mathsize="144%" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.2.3.cmml">t</mi><mrow id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.cmml"><mi id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.2" mathsize="144%" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.2.cmml">g</mi><mo id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.1" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.3" mathsize="144%" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.3.cmml">o</mi><mo id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.1a" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.4" mathsize="144%" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.4.cmml">a</mi><mo id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.1b" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.5" 
mathsize="144%" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.5.cmml">l</mi></mrow></msubsup></mrow><mo id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.3" maxsize="144%" minsize="144%" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.2.1.cmml">‖</mo></mrow><mn id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.3" mathsize="144%" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.3.cmml">2</mn></msup></mrow></mrow><mo id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.3" maxsize="120%" minsize="120%" xref="S7.E11X.3.2.2.m1.2.2.1.1.2.cmml">)</mo></mrow></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S7.E11X.3.2.2.m1.2b"><apply id="S7.E11X.3.2.2.m1.2.2.cmml" xref="S7.E11X.3.2.2.m1.2.2"><eq id="S7.E11X.3.2.2.m1.2.2.2.cmml" xref="S7.E11X.3.2.2.m1.2.2.2"></eq><csymbol cd="latexml" id="S7.E11X.3.2.2.m1.2.2.3.cmml" xref="S7.E11X.3.2.2.m1.2.2.3">absent</csymbol><apply id="S7.E11X.3.2.2.m1.2.2.1.cmml" xref="S7.E11X.3.2.2.m1.2.2.1"><ci id="S7.E11X.3.2.2.m1.2.2.1.2.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.2">⋅</ci><cn id="S7.E11X.3.2.2.m1.2.2.1.3.cmml" type="float" xref="S7.E11X.3.2.2.m1.2.2.1.3">0.7</cn><apply id="S7.E11X.3.2.2.m1.2.2.1.1.2.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1"><exp id="S7.E11X.3.2.2.m1.1.1.cmml" xref="S7.E11X.3.2.2.m1.1.1"></exp><apply id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1"><minus id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.2.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1"></minus><apply id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1"><ci id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.2.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.2">⋅</ci><cn id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.3.cmml" type="float" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.3">10.0</cn><apply id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.2.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1">superscript</csymbol><apply id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.2.cmml" 
xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1"><csymbol cd="latexml" id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.2.1.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.2">norm</csymbol><apply id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1"><minus id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.1"></minus><apply id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2">superscript</csymbol><apply id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.2.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.2.1.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2">subscript</csymbol><ci id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.2.2.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.2.2">𝑥</ci><ci id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.2.3.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.2.3">𝑡</ci></apply><apply id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3"><times id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.1.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.1"></times><ci id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.2.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.2">𝑜</ci><ci id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.3.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.3">𝑏</ci><ci id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.4.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.3.4">𝑗</ci></apply></apply><apply id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.1.cmml" 
xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3">superscript</csymbol><apply id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.2.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.2.1.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3">subscript</csymbol><ci id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.2.2.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.2.2">𝑥</ci><ci id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.2.3.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.2.3">𝑡</ci></apply><apply id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3"><times id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.1"></times><ci id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.2.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.2">𝑔</ci><ci id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.3.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.3">𝑜</ci><ci id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.4.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.4">𝑎</ci><ci id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.5.cmml" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.1.1.1.3.3.5">𝑙</ci></apply></apply></apply></apply><cn id="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.3.cmml" type="integer" xref="S7.E11X.3.2.2.m1.2.2.1.1.1.1.1.1.1.3">2</cn></apply></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.E11X.3.2.2.m1.2c">\displaystyle=0.7\cdot\exp\big{(}-10.0\cdot\|x_{t}^{obj}-x_{t}^{goal}\|^{2}% \big{)}</annotation><annotation encoding="application/x-llamapun" id="S7.E11X.3.2.2.m1.2d">= 0.7 ⋅ roman_exp ( - 10.0 ⋅ ∥ italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_o italic_b italic_j end_POSTSUPERSCRIPT - italic_x start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g 
italic_o italic_a italic_l end_POSTSUPERSCRIPT ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT )</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="2"><span class="ltx_tag ltx_tag_equationgroup ltx_align_right">(11)</span></td> </tr> <tr class="ltx_equation ltx_eqn_row ltx_align_baseline" id="S7.E11Xa"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_eqn_cell"></td> <td class="ltx_td ltx_align_left ltx_eqn_cell"><math alttext="\displaystyle+0.3\cdot\exp\big{(}-2.0\cdot\|v_{t}^{obj}-v_{t}^{goal}\|^{2}\big% {)}." class="ltx_Math" display="inline" id="S7.E11Xa.2.1.1.m1.2"><semantics id="S7.E11Xa.2.1.1.m1.2a"><mrow id="S7.E11Xa.2.1.1.m1.2.2.1" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.cmml"><mrow id="S7.E11Xa.2.1.1.m1.2.2.1.1" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.cmml"><mo id="S7.E11Xa.2.1.1.m1.2.2.1.1a" mathsize="144%" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.cmml">+</mo><mrow id="S7.E11Xa.2.1.1.m1.2.2.1.1.1" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.cmml"><mn id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.3" mathsize="144%" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.3.cmml">0.3</mn><mo id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.2" lspace="0.222em" mathsize="144%" rspace="0.222em" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.2.cmml">⋅</mo><mrow id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.2.cmml"><mi id="S7.E11Xa.2.1.1.m1.1.1" mathsize="144%" xref="S7.E11Xa.2.1.1.m1.1.1.cmml">exp</mi><mo id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1a" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.2.cmml">⁡</mo><mrow id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.2.cmml"><mo id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.2" maxsize="120%" minsize="120%" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.2.cmml">(</mo><mrow id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.cmml"><mo id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1a" mathsize="144%" 
xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.cmml">−</mo><mrow id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.cmml"><mn id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.3" mathsize="144%" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.3.cmml">2.0</mn><mo id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.2" lspace="0.222em" mathsize="144%" rspace="0.222em" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.2.cmml">⋅</mo><msup id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.cmml"><mrow id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.cmml"><mo id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.2" maxsize="144%" minsize="144%" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.1.cmml">‖</mo><mrow id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.cmml"><msubsup id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml"><mi id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.2.2" mathsize="144%" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.2.2.cmml">v</mi><mi id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.2.3" mathsize="144%" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.cmml">t</mi><mrow id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.cmml"><mi id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.2" mathsize="144%" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.2.cmml">o</mi><mo id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.1" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.3" mathsize="144%" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.3.cmml">b</mi><mo id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.1a" 
xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.1.cmml">⁢</mo><mi id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.4" mathsize="144%" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.4.cmml">j</mi></mrow></msubsup><mo id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.1" mathsize="144%" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml">−</mo><msubsup id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.cmml"><mi id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.2.2" mathsize="144%" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.cmml">v</mi><mi id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.2.3" mathsize="144%" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.cmml">t</mi><mrow id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml"><mi id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.2" mathsize="144%" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.cmml">g</mi><mo id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.1" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.3" mathsize="144%" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.3.cmml">o</mi><mo id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.1a" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.4" mathsize="144%" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.4.cmml">a</mi><mo id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.1b" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml">⁢</mo><mi id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.5" mathsize="144%" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.5.cmml">l</mi></mrow></msubsup></mrow><mo id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.3" maxsize="144%" 
minsize="144%" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.1.cmml">‖</mo></mrow><mn id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.3" mathsize="144%" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.3.cmml">2</mn></msup></mrow></mrow><mo id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.3" maxsize="120%" minsize="120%" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.2.cmml">)</mo></mrow></mrow></mrow></mrow><mo id="S7.E11Xa.2.1.1.m1.2.2.1.2" lspace="0em" mathsize="144%" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.cmml">.</mo></mrow><annotation-xml encoding="MathML-Content" id="S7.E11Xa.2.1.1.m1.2b"><apply id="S7.E11Xa.2.1.1.m1.2.2.1.1.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1"><plus id="S7.E11Xa.2.1.1.m1.2.2.1.1.2.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1"></plus><apply id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1"><ci id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.2.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.2">⋅</ci><cn id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.3.cmml" type="float" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.3">0.3</cn><apply id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.2.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1"><exp id="S7.E11Xa.2.1.1.m1.1.1.cmml" xref="S7.E11Xa.2.1.1.m1.1.1"></exp><apply id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1"><minus id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.2.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1"></minus><apply id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1"><ci id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.2">⋅</ci><cn id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.3.cmml" type="float" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.3">2.0</cn><apply id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1">superscript</csymbol><apply id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.cmml" 
xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1"><csymbol cd="latexml" id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.2">norm</csymbol><apply id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1"><minus id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.1"></minus><apply id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2">superscript</csymbol><apply id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.2.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.2.1.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2">subscript</csymbol><ci id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.2.2.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.2.2">𝑣</ci><ci id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.2.3.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.2.3">𝑡</ci></apply><apply id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3"><times id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.1.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.1"></times><ci id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.2.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.2">𝑜</ci><ci id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.3.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.3">𝑏</ci><ci id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.4.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.2.3.4">𝑗</ci></apply></apply><apply 
id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.1.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3">superscript</csymbol><apply id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.2.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.2.1.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3">subscript</csymbol><ci id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.2.2.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.2.2">𝑣</ci><ci id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.2.3.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.2.3">𝑡</ci></apply><apply id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3"><times id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.1.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.1"></times><ci id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.2.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.2">𝑔</ci><ci id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.3.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.3">𝑜</ci><ci id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.4.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.4">𝑎</ci><ci id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.5.cmml" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.1.1.1.3.3.5">𝑙</ci></apply></apply></apply></apply><cn id="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.3.cmml" type="integer" xref="S7.E11Xa.2.1.1.m1.2.2.1.1.1.1.1.1.1.1.1.3">2</cn></apply></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S7.E11Xa.2.1.1.m1.2c">\displaystyle+0.3\cdot\exp\big{(}-2.0\cdot\|v_{t}^{obj}-v_{t}^{goal}\|^{2}\big% 
{)}.</annotation><annotation encoding="application/x-llamapun" id="S7.E11Xa.2.1.1.m1.2d">+ 0.3 ⋅ roman_exp ( - 2.0 ⋅ ∥ italic_v start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_o italic_b italic_j end_POSTSUPERSCRIPT - italic_v start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_g italic_o italic_a italic_l end_POSTSUPERSCRIPT ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) .</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> </tr> </tbody> </table> </div> </li> </ul> </div> <figure class="ltx_figure" id="S7.F6"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_portrait" height="859" id="S7.F6.g1" src="x6.png" width="497"/> <figcaption class="ltx_caption ltx_centering" style="font-size:144%;"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S7.F6.4.1.1" style="font-size:63%;">Figure 6</span>: </span><span class="ltx_text" id="S7.F6.5.2" style="font-size:63%;">Scalability on new skills.</span></figcaption> </figure> </section> <section class="ltx_section ltx_centering" id="S8"> <h2 class="ltx_title ltx_title_section" style="font-size:144%;"> <span class="ltx_tag ltx_tag_section">8 </span>Re-implemented MotionCLIP</h2> <div class="ltx_para" id="S8.p1"> <p class="ltx_p" id="S8.p1.10"><span class="ltx_text" id="S8.p1.10.1" style="font-size:144%;">To control the policy with language constraints, we construct an embedding space, fed into the policy network, that aligns motion representations with their corresponding natural language descriptions. 
To do this, we follow </span><cite class="ltx_cite ltx_citemacro_cite"><span class="ltx_text" id="S8.p1.10.2.1" style="font-size:144%;">[</span><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib17" title=""><span class="ltx_text" style="font-size:90%;">17</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib38" title=""><span class="ltx_text" style="font-size:90%;">38</span></a><span class="ltx_text" id="S8.p1.10.3.2" style="font-size:144%;">]</span></cite><span class="ltx_text" id="S8.p1.10.4" style="font-size:144%;">, where a transformer auto-encoder is trained to encode motion sequences into a latent representation that aligns with the language embedding from a pre-trained CLIP text encoder </span><cite class="ltx_cite ltx_citemacro_cite"><span class="ltx_text" id="S8.p1.10.5.1" style="font-size:144%;">[</span><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib31" title=""><span class="ltx_text" style="font-size:90%;">31</span></a><span class="ltx_text" id="S8.p1.10.6.2" style="font-size:144%;">]</span></cite><span class="ltx_text" id="S8.p1.10.7" style="font-size:144%;">. 
Given a motion clip </span><math alttext="\hat{\mathbf{m}}=(\hat{\mathbf{q}}_{1},\ldots,\hat{\mathbf{q}}_{n})" class="ltx_Math" display="inline" id="S8.p1.1.m1.3"><semantics id="S8.p1.1.m1.3a"><mrow id="S8.p1.1.m1.3.3" xref="S8.p1.1.m1.3.3.cmml"><mover accent="true" id="S8.p1.1.m1.3.3.4" xref="S8.p1.1.m1.3.3.4.cmml"><mi id="S8.p1.1.m1.3.3.4.2" mathsize="144%" xref="S8.p1.1.m1.3.3.4.2.cmml">𝐦</mi><mo id="S8.p1.1.m1.3.3.4.1" mathsize="144%" xref="S8.p1.1.m1.3.3.4.1.cmml">^</mo></mover><mo id="S8.p1.1.m1.3.3.3" mathsize="144%" xref="S8.p1.1.m1.3.3.3.cmml">=</mo><mrow id="S8.p1.1.m1.3.3.2.2" xref="S8.p1.1.m1.3.3.2.3.cmml"><mo id="S8.p1.1.m1.3.3.2.2.3" maxsize="144%" minsize="144%" xref="S8.p1.1.m1.3.3.2.3.cmml">(</mo><msub id="S8.p1.1.m1.2.2.1.1.1" xref="S8.p1.1.m1.2.2.1.1.1.cmml"><mover accent="true" id="S8.p1.1.m1.2.2.1.1.1.2" xref="S8.p1.1.m1.2.2.1.1.1.2.cmml"><mi id="S8.p1.1.m1.2.2.1.1.1.2.2" mathsize="144%" xref="S8.p1.1.m1.2.2.1.1.1.2.2.cmml">𝐪</mi><mo id="S8.p1.1.m1.2.2.1.1.1.2.1" mathsize="144%" xref="S8.p1.1.m1.2.2.1.1.1.2.1.cmml">^</mo></mover><mn id="S8.p1.1.m1.2.2.1.1.1.3" mathsize="144%" xref="S8.p1.1.m1.2.2.1.1.1.3.cmml">1</mn></msub><mo id="S8.p1.1.m1.3.3.2.2.4" mathsize="144%" xref="S8.p1.1.m1.3.3.2.3.cmml">,</mo><mi id="S8.p1.1.m1.1.1" mathsize="144%" mathvariant="normal" xref="S8.p1.1.m1.1.1.cmml">…</mi><mo id="S8.p1.1.m1.3.3.2.2.5" mathsize="144%" xref="S8.p1.1.m1.3.3.2.3.cmml">,</mo><msub id="S8.p1.1.m1.3.3.2.2.2" xref="S8.p1.1.m1.3.3.2.2.2.cmml"><mover accent="true" id="S8.p1.1.m1.3.3.2.2.2.2" xref="S8.p1.1.m1.3.3.2.2.2.2.cmml"><mi id="S8.p1.1.m1.3.3.2.2.2.2.2" mathsize="144%" xref="S8.p1.1.m1.3.3.2.2.2.2.2.cmml">𝐪</mi><mo id="S8.p1.1.m1.3.3.2.2.2.2.1" mathsize="144%" xref="S8.p1.1.m1.3.3.2.2.2.2.1.cmml">^</mo></mover><mi id="S8.p1.1.m1.3.3.2.2.2.3" mathsize="144%" xref="S8.p1.1.m1.3.3.2.2.2.3.cmml">n</mi></msub><mo id="S8.p1.1.m1.3.3.2.2.6" maxsize="144%" minsize="144%" xref="S8.p1.1.m1.3.3.2.3.cmml">)</mo></mrow></mrow><annotation-xml 
encoding="MathML-Content" id="S8.p1.1.m1.3b"><apply id="S8.p1.1.m1.3.3.cmml" xref="S8.p1.1.m1.3.3"><eq id="S8.p1.1.m1.3.3.3.cmml" xref="S8.p1.1.m1.3.3.3"></eq><apply id="S8.p1.1.m1.3.3.4.cmml" xref="S8.p1.1.m1.3.3.4"><ci id="S8.p1.1.m1.3.3.4.1.cmml" xref="S8.p1.1.m1.3.3.4.1">^</ci><ci id="S8.p1.1.m1.3.3.4.2.cmml" xref="S8.p1.1.m1.3.3.4.2">𝐦</ci></apply><vector id="S8.p1.1.m1.3.3.2.3.cmml" xref="S8.p1.1.m1.3.3.2.2"><apply id="S8.p1.1.m1.2.2.1.1.1.cmml" xref="S8.p1.1.m1.2.2.1.1.1"><csymbol cd="ambiguous" id="S8.p1.1.m1.2.2.1.1.1.1.cmml" xref="S8.p1.1.m1.2.2.1.1.1">subscript</csymbol><apply id="S8.p1.1.m1.2.2.1.1.1.2.cmml" xref="S8.p1.1.m1.2.2.1.1.1.2"><ci id="S8.p1.1.m1.2.2.1.1.1.2.1.cmml" xref="S8.p1.1.m1.2.2.1.1.1.2.1">^</ci><ci id="S8.p1.1.m1.2.2.1.1.1.2.2.cmml" xref="S8.p1.1.m1.2.2.1.1.1.2.2">𝐪</ci></apply><cn id="S8.p1.1.m1.2.2.1.1.1.3.cmml" type="integer" xref="S8.p1.1.m1.2.2.1.1.1.3">1</cn></apply><ci id="S8.p1.1.m1.1.1.cmml" xref="S8.p1.1.m1.1.1">…</ci><apply id="S8.p1.1.m1.3.3.2.2.2.cmml" xref="S8.p1.1.m1.3.3.2.2.2"><csymbol cd="ambiguous" id="S8.p1.1.m1.3.3.2.2.2.1.cmml" xref="S8.p1.1.m1.3.3.2.2.2">subscript</csymbol><apply id="S8.p1.1.m1.3.3.2.2.2.2.cmml" xref="S8.p1.1.m1.3.3.2.2.2.2"><ci id="S8.p1.1.m1.3.3.2.2.2.2.1.cmml" xref="S8.p1.1.m1.3.3.2.2.2.2.1">^</ci><ci id="S8.p1.1.m1.3.3.2.2.2.2.2.cmml" xref="S8.p1.1.m1.3.3.2.2.2.2.2">𝐪</ci></apply><ci id="S8.p1.1.m1.3.3.2.2.2.3.cmml" xref="S8.p1.1.m1.3.3.2.2.2.3">𝑛</ci></apply></vector></apply></annotation-xml><annotation encoding="application/x-tex" id="S8.p1.1.m1.3c">\hat{\mathbf{m}}=(\hat{\mathbf{q}}_{1},\ldots,\hat{\mathbf{q}}_{n})</annotation><annotation encoding="application/x-llamapun" id="S8.p1.1.m1.3d">over^ start_ARG bold_m end_ARG = ( over^ start_ARG bold_q end_ARG start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , over^ start_ARG bold_q end_ARG start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT )</annotation></semantics></math><span class="ltx_text" id="S8.p1.10.8" style="font-size:144%;">, a motion encoder 
</span><math alttext="{\mathbf{z}}=\text{Enc}_{m}(\hat{\mathbf{m}})" class="ltx_Math" display="inline" id="S8.p1.2.m2.1"><semantics id="S8.p1.2.m2.1a"><mrow id="S8.p1.2.m2.1.2" xref="S8.p1.2.m2.1.2.cmml"><mi id="S8.p1.2.m2.1.2.2" mathsize="144%" xref="S8.p1.2.m2.1.2.2.cmml">𝐳</mi><mo id="S8.p1.2.m2.1.2.1" mathsize="144%" xref="S8.p1.2.m2.1.2.1.cmml">=</mo><mrow id="S8.p1.2.m2.1.2.3" xref="S8.p1.2.m2.1.2.3.cmml"><msub id="S8.p1.2.m2.1.2.3.2" xref="S8.p1.2.m2.1.2.3.2.cmml"><mtext id="S8.p1.2.m2.1.2.3.2.2" mathsize="144%" xref="S8.p1.2.m2.1.2.3.2.2a.cmml">Enc</mtext><mi id="S8.p1.2.m2.1.2.3.2.3" mathsize="144%" xref="S8.p1.2.m2.1.2.3.2.3.cmml">m</mi></msub><mo id="S8.p1.2.m2.1.2.3.1" xref="S8.p1.2.m2.1.2.3.1.cmml">⁢</mo><mrow id="S8.p1.2.m2.1.2.3.3.2" xref="S8.p1.2.m2.1.1.cmml"><mo id="S8.p1.2.m2.1.2.3.3.2.1" maxsize="144%" minsize="144%" xref="S8.p1.2.m2.1.1.cmml">(</mo><mover accent="true" id="S8.p1.2.m2.1.1" xref="S8.p1.2.m2.1.1.cmml"><mi id="S8.p1.2.m2.1.1.2" mathsize="144%" xref="S8.p1.2.m2.1.1.2.cmml">𝐦</mi><mo id="S8.p1.2.m2.1.1.1" mathsize="144%" xref="S8.p1.2.m2.1.1.1.cmml">^</mo></mover><mo id="S8.p1.2.m2.1.2.3.3.2.2" maxsize="144%" minsize="144%" xref="S8.p1.2.m2.1.1.cmml">)</mo></mrow></mrow></mrow><annotation-xml encoding="MathML-Content" id="S8.p1.2.m2.1b"><apply id="S8.p1.2.m2.1.2.cmml" xref="S8.p1.2.m2.1.2"><eq id="S8.p1.2.m2.1.2.1.cmml" xref="S8.p1.2.m2.1.2.1"></eq><ci id="S8.p1.2.m2.1.2.2.cmml" xref="S8.p1.2.m2.1.2.2">𝐳</ci><apply id="S8.p1.2.m2.1.2.3.cmml" xref="S8.p1.2.m2.1.2.3"><times id="S8.p1.2.m2.1.2.3.1.cmml" xref="S8.p1.2.m2.1.2.3.1"></times><apply id="S8.p1.2.m2.1.2.3.2.cmml" xref="S8.p1.2.m2.1.2.3.2"><csymbol cd="ambiguous" id="S8.p1.2.m2.1.2.3.2.1.cmml" xref="S8.p1.2.m2.1.2.3.2">subscript</csymbol><ci id="S8.p1.2.m2.1.2.3.2.2a.cmml" xref="S8.p1.2.m2.1.2.3.2.2"><mtext id="S8.p1.2.m2.1.2.3.2.2.cmml" mathsize="144%" xref="S8.p1.2.m2.1.2.3.2.2">Enc</mtext></ci><ci id="S8.p1.2.m2.1.2.3.2.3.cmml" xref="S8.p1.2.m2.1.2.3.2.3">𝑚</ci></apply><apply 
id="S8.p1.2.m2.1.1.cmml" xref="S8.p1.2.m2.1.2.3.3.2"><ci id="S8.p1.2.m2.1.1.1.cmml" xref="S8.p1.2.m2.1.1.1">^</ci><ci id="S8.p1.2.m2.1.1.2.cmml" xref="S8.p1.2.m2.1.1.2">𝐦</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S8.p1.2.m2.1c">{\mathbf{z}}=\text{Enc}_{m}(\hat{\mathbf{m}})</annotation><annotation encoding="application/x-llamapun" id="S8.p1.2.m2.1d">bold_z = Enc start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( over^ start_ARG bold_m end_ARG )</annotation></semantics></math><span class="ltx_text" id="S8.p1.10.9" style="font-size:144%;"> maps the motion to an embedding </span><math alttext="{\mathbf{z}}" class="ltx_Math" display="inline" id="S8.p1.3.m3.1"><semantics id="S8.p1.3.m3.1a"><mi id="S8.p1.3.m3.1.1" mathsize="144%" xref="S8.p1.3.m3.1.1.cmml">𝐳</mi><annotation-xml encoding="MathML-Content" id="S8.p1.3.m3.1b"><ci id="S8.p1.3.m3.1.1.cmml" xref="S8.p1.3.m3.1.1">𝐳</ci></annotation-xml><annotation encoding="application/x-tex" id="S8.p1.3.m3.1c">{\mathbf{z}}</annotation><annotation encoding="application/x-llamapun" id="S8.p1.3.m3.1d">bold_z</annotation></semantics></math><span class="ltx_text" id="S8.p1.10.10" style="font-size:144%;">. 
The embedding is normalized to lie on a unit sphere </span><math alttext="\|{\mathbf{z}}\|=1" class="ltx_Math" display="inline" id="S8.p1.4.m4.1"><semantics id="S8.p1.4.m4.1a"><mrow id="S8.p1.4.m4.1.2" xref="S8.p1.4.m4.1.2.cmml"><mrow id="S8.p1.4.m4.1.2.2.2" xref="S8.p1.4.m4.1.2.2.1.cmml"><mo id="S8.p1.4.m4.1.2.2.2.1" maxsize="144%" minsize="144%" xref="S8.p1.4.m4.1.2.2.1.1.cmml">‖</mo><mi id="S8.p1.4.m4.1.1" mathsize="144%" xref="S8.p1.4.m4.1.1.cmml">𝐳</mi><mo id="S8.p1.4.m4.1.2.2.2.2" maxsize="144%" minsize="144%" xref="S8.p1.4.m4.1.2.2.1.1.cmml">‖</mo></mrow><mo id="S8.p1.4.m4.1.2.1" mathsize="144%" xref="S8.p1.4.m4.1.2.1.cmml">=</mo><mn id="S8.p1.4.m4.1.2.3" mathsize="144%" xref="S8.p1.4.m4.1.2.3.cmml">1</mn></mrow><annotation-xml encoding="MathML-Content" id="S8.p1.4.m4.1b"><apply id="S8.p1.4.m4.1.2.cmml" xref="S8.p1.4.m4.1.2"><eq id="S8.p1.4.m4.1.2.1.cmml" xref="S8.p1.4.m4.1.2.1"></eq><apply id="S8.p1.4.m4.1.2.2.1.cmml" xref="S8.p1.4.m4.1.2.2.2"><csymbol cd="latexml" id="S8.p1.4.m4.1.2.2.1.1.cmml" xref="S8.p1.4.m4.1.2.2.2.1">norm</csymbol><ci id="S8.p1.4.m4.1.1.cmml" xref="S8.p1.4.m4.1.1">𝐳</ci></apply><cn id="S8.p1.4.m4.1.2.3.cmml" type="integer" xref="S8.p1.4.m4.1.2.3">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S8.p1.4.m4.1c">\|{\mathbf{z}}\|=1</annotation><annotation encoding="application/x-llamapun" id="S8.p1.4.m4.1d">∥ bold_z ∥ = 1</annotation></semantics></math><span class="ltx_text" id="S8.p1.10.11" style="font-size:144%;">. 
We set the dimension of the embedding </span><math alttext="{\mathbf{z}}" class="ltx_Math" display="inline" id="S8.p1.5.m5.1"><semantics id="S8.p1.5.m5.1a"><mi id="S8.p1.5.m5.1.1" mathsize="144%" xref="S8.p1.5.m5.1.1.cmml">𝐳</mi><annotation-xml encoding="MathML-Content" id="S8.p1.5.m5.1b"><ci id="S8.p1.5.m5.1.1.cmml" xref="S8.p1.5.m5.1.1">𝐳</ci></annotation-xml><annotation encoding="application/x-tex" id="S8.p1.5.m5.1c">{\mathbf{z}}</annotation><annotation encoding="application/x-llamapun" id="S8.p1.5.m5.1d">bold_z</annotation></semantics></math><span class="ltx_text" id="S8.p1.10.12" style="font-size:144%;"> to 64 to reduce the computation cost. For the text embedding, we first extract a feature with the CLIP encoder </span><cite class="ltx_cite ltx_citemacro_cite"><span class="ltx_text" id="S8.p1.10.13.1" style="font-size:144%;">[</span><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib31" title=""><span class="ltx_text" style="font-size:90%;">31</span></a><span class="ltx_text" id="S8.p1.10.14.2" style="font-size:144%;">]</span></cite><span class="ltx_text" id="S8.p1.10.15" style="font-size:144%;"> </span><math alttext="\text{Enc}_{l}" class="ltx_Math" display="inline" id="S8.p1.6.m6.1"><semantics id="S8.p1.6.m6.1a"><msub id="S8.p1.6.m6.1.1" xref="S8.p1.6.m6.1.1.cmml"><mtext id="S8.p1.6.m6.1.1.2" mathsize="144%" xref="S8.p1.6.m6.1.1.2a.cmml">Enc</mtext><mi id="S8.p1.6.m6.1.1.3" mathsize="144%" xref="S8.p1.6.m6.1.1.3.cmml">l</mi></msub><annotation-xml encoding="MathML-Content" id="S8.p1.6.m6.1b"><apply id="S8.p1.6.m6.1.1.cmml" xref="S8.p1.6.m6.1.1"><csymbol cd="ambiguous" id="S8.p1.6.m6.1.1.1.cmml" xref="S8.p1.6.m6.1.1">subscript</csymbol><ci id="S8.p1.6.m6.1.1.2a.cmml" xref="S8.p1.6.m6.1.1.2"><mtext id="S8.p1.6.m6.1.1.2.cmml" mathsize="144%" xref="S8.p1.6.m6.1.1.2">Enc</mtext></ci><ci id="S8.p1.6.m6.1.1.3.cmml" xref="S8.p1.6.m6.1.1.3">𝑙</ci></apply></annotation-xml><annotation encoding="application/x-tex" 
id="S8.p1.6.m6.1c">\text{Enc}_{l}</annotation><annotation encoding="application/x-llamapun" id="S8.p1.6.m6.1d">Enc start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT</annotation></semantics></math><span class="ltx_text" id="S8.p1.10.16" style="font-size:144%;"> from the caption </span><math alttext="{\mathbf{c}}" class="ltx_Math" display="inline" id="S8.p1.7.m7.1"><semantics id="S8.p1.7.m7.1a"><mi id="S8.p1.7.m7.1.1" mathsize="144%" xref="S8.p1.7.m7.1.1.cmml">𝐜</mi><annotation-xml encoding="MathML-Content" id="S8.p1.7.m7.1b"><ci id="S8.p1.7.m7.1.1.cmml" xref="S8.p1.7.m7.1.1">𝐜</ci></annotation-xml><annotation encoding="application/x-tex" id="S8.p1.7.m7.1c">{\mathbf{c}}</annotation><annotation encoding="application/x-llamapun" id="S8.p1.7.m7.1d">bold_c</annotation></semantics></math><span class="ltx_text" id="S8.p1.10.17" style="font-size:144%;">, then use a multilayer perceptron </span><math alttext="\text{MLP}_{d}" class="ltx_Math" display="inline" id="S8.p1.8.m8.1"><semantics id="S8.p1.8.m8.1a"><msub id="S8.p1.8.m8.1.1" xref="S8.p1.8.m8.1.1.cmml"><mtext id="S8.p1.8.m8.1.1.2" mathsize="144%" xref="S8.p1.8.m8.1.1.2a.cmml">MLP</mtext><mi id="S8.p1.8.m8.1.1.3" mathsize="144%" xref="S8.p1.8.m8.1.1.3.cmml">d</mi></msub><annotation-xml encoding="MathML-Content" id="S8.p1.8.m8.1b"><apply id="S8.p1.8.m8.1.1.cmml" xref="S8.p1.8.m8.1.1"><csymbol cd="ambiguous" id="S8.p1.8.m8.1.1.1.cmml" xref="S8.p1.8.m8.1.1">subscript</csymbol><ci id="S8.p1.8.m8.1.1.2a.cmml" xref="S8.p1.8.m8.1.1.2"><mtext id="S8.p1.8.m8.1.1.2.cmml" mathsize="144%" xref="S8.p1.8.m8.1.1.2">MLP</mtext></ci><ci id="S8.p1.8.m8.1.1.3.cmml" xref="S8.p1.8.m8.1.1.3">𝑑</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S8.p1.8.m8.1c">\text{MLP}_{d}</annotation><annotation encoding="application/x-llamapun" id="S8.p1.8.m8.1d">MLP start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT</annotation></semantics></math><span class="ltx_text" id="S8.p1.10.18" style="font-size:144%;"> to downsize the 512-dimensional CLIP 
feature to 64 dimensions, and a second one, </span><math alttext="\text{MLP}_{u}" class="ltx_Math" display="inline" id="S8.p1.9.m9.1"><semantics id="S8.p1.9.m9.1a"><msub id="S8.p1.9.m9.1.1" xref="S8.p1.9.m9.1.1.cmml"><mtext id="S8.p1.9.m9.1.1.2" mathsize="144%" xref="S8.p1.9.m9.1.1.2a.cmml">MLP</mtext><mi id="S8.p1.9.m9.1.1.3" mathsize="144%" xref="S8.p1.9.m9.1.1.3.cmml">u</mi></msub><annotation-xml encoding="MathML-Content" id="S8.p1.9.m9.1b"><apply id="S8.p1.9.m9.1.1.cmml" xref="S8.p1.9.m9.1.1"><csymbol cd="ambiguous" id="S8.p1.9.m9.1.1.1.cmml" xref="S8.p1.9.m9.1.1">subscript</csymbol><ci id="S8.p1.9.m9.1.1.2a.cmml" xref="S8.p1.9.m9.1.1.2"><mtext id="S8.p1.9.m9.1.1.2.cmml" mathsize="144%" xref="S8.p1.9.m9.1.1.2">MLP</mtext></ci><ci id="S8.p1.9.m9.1.1.3.cmml" xref="S8.p1.9.m9.1.1.3">𝑢</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S8.p1.9.m9.1c">\text{MLP}_{u}</annotation><annotation encoding="application/x-llamapun" id="S8.p1.9.m9.1d">MLP start_POSTSUBSCRIPT italic_u end_POSTSUBSCRIPT</annotation></semantics></math><span class="ltx_text" id="S8.p1.10.19" style="font-size:144%;">, to upsample it back to 512 dimensions to preserve the semantic feature. The embedding </span><math alttext="{\mathbf{z}}" class="ltx_Math" display="inline" id="S8.p1.10.m10.1"><semantics id="S8.p1.10.m10.1a"><mi id="S8.p1.10.m10.1.1" mathsize="144%" xref="S8.p1.10.m10.1.1.cmml">𝐳</mi><annotation-xml encoding="MathML-Content" id="S8.p1.10.m10.1b"><ci id="S8.p1.10.m10.1.1.cmml" xref="S8.p1.10.m10.1.1">𝐳</ci></annotation-xml><annotation encoding="application/x-tex" id="S8.p1.10.m10.1c">{\mathbf{z}}</annotation><annotation encoding="application/x-llamapun" id="S8.p1.10.m10.1d">bold_z</annotation></semantics></math><span class="ltx_text" id="S8.p1.10.20" style="font-size:144%;"> should be aligned with the downsized CLIP feature. 
See details in </span><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S8.F7" style="font-size:144%;" title="In 8 Re-implemented MotionCLIP ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Fig.</span> <span class="ltx_text ltx_ref_tag">7</span></a><span class="ltx_text" id="S8.p1.10.21" style="font-size:144%;">.</span></p> </div> <figure class="ltx_figure" id="S8.F7"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="348" id="S8.F7.g1" src="x7.png" width="664"/> <figcaption class="ltx_caption ltx_centering" style="font-size:144%;"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S8.F7.4.1.1" style="font-size:63%;">Figure 7</span>: </span><span class="ltx_text" id="S8.F7.5.2" style="font-size:63%;">Our re-implemented MotionCLIP.</span></figcaption> </figure> <div class="ltx_para" id="S8.p2"> <p class="ltx_p" id="S8.p2.5"><span class="ltx_text" id="S8.p2.5.1" style="font-size:144%;">Following </span><cite class="ltx_cite ltx_citemacro_cite"><span class="ltx_text" id="S8.p2.5.2.1" style="font-size:144%;">[</span><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib38" title=""><span class="ltx_text" style="font-size:90%;">38</span></a><span class="ltx_text" id="S8.p2.5.3.2" style="font-size:144%;">]</span></cite><span class="ltx_text" id="S8.p2.5.4" style="font-size:144%;">, </span><math alttext="\text{Enc}_{m}\left(\mathbf{m}\right)" class="ltx_Math" display="inline" id="S8.p2.1.m1.1"><semantics id="S8.p2.1.m1.1a"><mrow id="S8.p2.1.m1.1.2" xref="S8.p2.1.m1.1.2.cmml"><msub id="S8.p2.1.m1.1.2.2" xref="S8.p2.1.m1.1.2.2.cmml"><mtext id="S8.p2.1.m1.1.2.2.2" mathsize="144%" xref="S8.p2.1.m1.1.2.2.2a.cmml">Enc</mtext><mi id="S8.p2.1.m1.1.2.2.3" mathsize="144%" xref="S8.p2.1.m1.1.2.2.3.cmml">m</mi></msub><mo id="S8.p2.1.m1.1.2.1" xref="S8.p2.1.m1.1.2.1.cmml">⁢</mo><mrow id="S8.p2.1.m1.1.2.3.2" 
xref="S8.p2.1.m1.1.2.cmml"><mo id="S8.p2.1.m1.1.2.3.2.1" xref="S8.p2.1.m1.1.2.cmml">(</mo><mi id="S8.p2.1.m1.1.1" mathsize="144%" xref="S8.p2.1.m1.1.1.cmml">𝐦</mi><mo id="S8.p2.1.m1.1.2.3.2.2" xref="S8.p2.1.m1.1.2.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S8.p2.1.m1.1b"><apply id="S8.p2.1.m1.1.2.cmml" xref="S8.p2.1.m1.1.2"><times id="S8.p2.1.m1.1.2.1.cmml" xref="S8.p2.1.m1.1.2.1"></times><apply id="S8.p2.1.m1.1.2.2.cmml" xref="S8.p2.1.m1.1.2.2"><csymbol cd="ambiguous" id="S8.p2.1.m1.1.2.2.1.cmml" xref="S8.p2.1.m1.1.2.2">subscript</csymbol><ci id="S8.p2.1.m1.1.2.2.2a.cmml" xref="S8.p2.1.m1.1.2.2.2"><mtext id="S8.p2.1.m1.1.2.2.2.cmml" mathsize="144%" xref="S8.p2.1.m1.1.2.2.2">Enc</mtext></ci><ci id="S8.p2.1.m1.1.2.2.3.cmml" xref="S8.p2.1.m1.1.2.2.3">𝑚</ci></apply><ci id="S8.p2.1.m1.1.1.cmml" xref="S8.p2.1.m1.1.1">𝐦</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S8.p2.1.m1.1c">\text{Enc}_{m}\left(\mathbf{m}\right)</annotation><annotation encoding="application/x-llamapun" id="S8.p2.1.m1.1d">Enc start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( bold_m )</annotation></semantics></math><span class="ltx_text" id="S8.p2.5.5" style="font-size:144%;"> is modeled by a bidirectional transformer </span><cite class="ltx_cite ltx_citemacro_cite"><span class="ltx_text" id="S8.p2.5.6.1" style="font-size:144%;">[</span><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib4" title=""><span class="ltx_text" style="font-size:90%;">4</span></a><span class="ltx_text" id="S8.p2.5.7.2" style="font-size:144%;">]</span></cite><span class="ltx_text" id="S8.p2.5.8" style="font-size:144%;">. 
The motion decoder is jointly trained with the encoder to produce a reconstruction sequence </span><math alttext="\mathbf{m}=(\mathbf{q}_{1},\ldots,\mathbf{q}_{n})" class="ltx_Math" display="inline" id="S8.p2.2.m2.3"><semantics id="S8.p2.2.m2.3a"><mrow id="S8.p2.2.m2.3.3" xref="S8.p2.2.m2.3.3.cmml"><mi id="S8.p2.2.m2.3.3.4" mathsize="144%" xref="S8.p2.2.m2.3.3.4.cmml">𝐦</mi><mo id="S8.p2.2.m2.3.3.3" mathsize="144%" xref="S8.p2.2.m2.3.3.3.cmml">=</mo><mrow id="S8.p2.2.m2.3.3.2.2" xref="S8.p2.2.m2.3.3.2.3.cmml"><mo id="S8.p2.2.m2.3.3.2.2.3" maxsize="144%" minsize="144%" xref="S8.p2.2.m2.3.3.2.3.cmml">(</mo><msub id="S8.p2.2.m2.2.2.1.1.1" xref="S8.p2.2.m2.2.2.1.1.1.cmml"><mi id="S8.p2.2.m2.2.2.1.1.1.2" mathsize="144%" xref="S8.p2.2.m2.2.2.1.1.1.2.cmml">𝐪</mi><mn id="S8.p2.2.m2.2.2.1.1.1.3" mathsize="144%" xref="S8.p2.2.m2.2.2.1.1.1.3.cmml">1</mn></msub><mo id="S8.p2.2.m2.3.3.2.2.4" mathsize="144%" xref="S8.p2.2.m2.3.3.2.3.cmml">,</mo><mi id="S8.p2.2.m2.1.1" mathsize="144%" mathvariant="normal" xref="S8.p2.2.m2.1.1.cmml">…</mi><mo id="S8.p2.2.m2.3.3.2.2.5" mathsize="144%" xref="S8.p2.2.m2.3.3.2.3.cmml">,</mo><msub id="S8.p2.2.m2.3.3.2.2.2" xref="S8.p2.2.m2.3.3.2.2.2.cmml"><mi id="S8.p2.2.m2.3.3.2.2.2.2" mathsize="144%" xref="S8.p2.2.m2.3.3.2.2.2.2.cmml">𝐪</mi><mi id="S8.p2.2.m2.3.3.2.2.2.3" mathsize="144%" xref="S8.p2.2.m2.3.3.2.2.2.3.cmml">n</mi></msub><mo id="S8.p2.2.m2.3.3.2.2.6" maxsize="144%" minsize="144%" xref="S8.p2.2.m2.3.3.2.3.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S8.p2.2.m2.3b"><apply id="S8.p2.2.m2.3.3.cmml" xref="S8.p2.2.m2.3.3"><eq id="S8.p2.2.m2.3.3.3.cmml" xref="S8.p2.2.m2.3.3.3"></eq><ci id="S8.p2.2.m2.3.3.4.cmml" xref="S8.p2.2.m2.3.3.4">𝐦</ci><vector id="S8.p2.2.m2.3.3.2.3.cmml" xref="S8.p2.2.m2.3.3.2.2"><apply id="S8.p2.2.m2.2.2.1.1.1.cmml" xref="S8.p2.2.m2.2.2.1.1.1"><csymbol cd="ambiguous" id="S8.p2.2.m2.2.2.1.1.1.1.cmml" xref="S8.p2.2.m2.2.2.1.1.1">subscript</csymbol><ci id="S8.p2.2.m2.2.2.1.1.1.2.cmml" 
xref="S8.p2.2.m2.2.2.1.1.1.2">𝐪</ci><cn id="S8.p2.2.m2.2.2.1.1.1.3.cmml" type="integer" xref="S8.p2.2.m2.2.2.1.1.1.3">1</cn></apply><ci id="S8.p2.2.m2.1.1.cmml" xref="S8.p2.2.m2.1.1">…</ci><apply id="S8.p2.2.m2.3.3.2.2.2.cmml" xref="S8.p2.2.m2.3.3.2.2.2"><csymbol cd="ambiguous" id="S8.p2.2.m2.3.3.2.2.2.1.cmml" xref="S8.p2.2.m2.3.3.2.2.2">subscript</csymbol><ci id="S8.p2.2.m2.3.3.2.2.2.2.cmml" xref="S8.p2.2.m2.3.3.2.2.2.2">𝐪</ci><ci id="S8.p2.2.m2.3.3.2.2.2.3.cmml" xref="S8.p2.2.m2.3.3.2.2.2.3">𝑛</ci></apply></vector></apply></annotation-xml><annotation encoding="application/x-tex" id="S8.p2.2.m2.3c">\mathbf{m}=(\mathbf{q}_{1},\ldots,\mathbf{q}_{n})</annotation><annotation encoding="application/x-llamapun" id="S8.p2.2.m2.3d">bold_m = ( bold_q start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , bold_q start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT )</annotation></semantics></math><span class="ltx_text" id="S8.p2.5.9" style="font-size:144%;"> to recover </span><math alttext="\hat{\mathbf{m}}" class="ltx_Math" display="inline" id="S8.p2.3.m3.1"><semantics id="S8.p2.3.m3.1a"><mover accent="true" id="S8.p2.3.m3.1.1" xref="S8.p2.3.m3.1.1.cmml"><mi id="S8.p2.3.m3.1.1.2" mathsize="144%" xref="S8.p2.3.m3.1.1.2.cmml">𝐦</mi><mo id="S8.p2.3.m3.1.1.1" mathsize="144%" xref="S8.p2.3.m3.1.1.1.cmml">^</mo></mover><annotation-xml encoding="MathML-Content" id="S8.p2.3.m3.1b"><apply id="S8.p2.3.m3.1.1.cmml" xref="S8.p2.3.m3.1.1"><ci id="S8.p2.3.m3.1.1.1.cmml" xref="S8.p2.3.m3.1.1.1">^</ci><ci id="S8.p2.3.m3.1.1.2.cmml" xref="S8.p2.3.m3.1.1.2">𝐦</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S8.p2.3.m3.1c">\hat{\mathbf{m}}</annotation><annotation encoding="application/x-llamapun" id="S8.p2.3.m3.1d">over^ start_ARG bold_m end_ARG</annotation></semantics></math><span class="ltx_text" id="S8.p2.5.10" style="font-size:144%;"> from </span><math alttext="{\mathbf{z}}" class="ltx_Math" display="inline" id="S8.p2.4.m4.1"><semantics id="S8.p2.4.m4.1a"><mi id="S8.p2.4.m4.1.1" 
mathsize="144%" xref="S8.p2.4.m4.1.1.cmml">𝐳</mi><annotation-xml encoding="MathML-Content" id="S8.p2.4.m4.1b"><ci id="S8.p2.4.m4.1.1.cmml" xref="S8.p2.4.m4.1.1">𝐳</ci></annotation-xml><annotation encoding="application/x-tex" id="S8.p2.4.m4.1c">{\mathbf{z}}</annotation><annotation encoding="application/x-llamapun" id="S8.p2.4.m4.1d">bold_z</annotation></semantics></math><span class="ltx_text" id="S8.p2.5.11" style="font-size:144%;">. The motion representation </span><math alttext="\mathbf{q}" class="ltx_Math" display="inline" id="S8.p2.5.m5.1"><semantics id="S8.p2.5.m5.1a"><mi id="S8.p2.5.m5.1.1" mathsize="144%" xref="S8.p2.5.m5.1.1.cmml">𝐪</mi><annotation-xml encoding="MathML-Content" id="S8.p2.5.m5.1b"><ci id="S8.p2.5.m5.1.1.cmml" xref="S8.p2.5.m5.1.1">𝐪</ci></annotation-xml><annotation encoding="application/x-tex" id="S8.p2.5.m5.1c">\mathbf{q}</annotation><annotation encoding="application/x-llamapun" id="S8.p2.5.m5.1d">bold_q</annotation></semantics></math><span class="ltx_text" id="S8.p2.5.12" style="font-size:144%;"> we use is a set of character motion features, following the discriminator observation used in AMP </span><cite class="ltx_cite ltx_citemacro_cite"><span class="ltx_text" id="S8.p2.5.13.1" style="font-size:144%;">[</span><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib29" title=""><span class="ltx_text" style="font-size:90%;">29</span></a><span class="ltx_text" id="S8.p2.5.14.2" style="font-size:144%;">]</span></cite><span class="ltx_text" id="S8.p2.5.15" style="font-size:144%;">. 
The auto-encoder is trained with the loss:</span></p> <table class="ltx_equationgroup ltx_eqn_align ltx_eqn_table" id="S11.EGx1"> <tbody id="S8.E12"><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_align_right ltx_eqn_cell"><math alttext="\displaystyle\mathcal{L}_{\text{AE}}=\mathcal{L}_{\text{recon}}^{m}+\mathcal{L% }_{\text{align}}^{m,t}+\mathcal{L}_{\text{recon}}^{t}." class="ltx_Math" display="inline" id="S8.E12.m1.3"><semantics id="S8.E12.m1.3a"><mrow id="S8.E12.m1.3.3.1" xref="S8.E12.m1.3.3.1.1.cmml"><mrow id="S8.E12.m1.3.3.1.1" xref="S8.E12.m1.3.3.1.1.cmml"><msub id="S8.E12.m1.3.3.1.1.2" xref="S8.E12.m1.3.3.1.1.2.cmml"><mi class="ltx_font_mathcaligraphic" id="S8.E12.m1.3.3.1.1.2.2" mathsize="144%" xref="S8.E12.m1.3.3.1.1.2.2.cmml">ℒ</mi><mtext id="S8.E12.m1.3.3.1.1.2.3" mathsize="144%" xref="S8.E12.m1.3.3.1.1.2.3a.cmml">AE</mtext></msub><mo id="S8.E12.m1.3.3.1.1.1" mathsize="144%" xref="S8.E12.m1.3.3.1.1.1.cmml">=</mo><mrow id="S8.E12.m1.3.3.1.1.3" xref="S8.E12.m1.3.3.1.1.3.cmml"><msubsup id="S8.E12.m1.3.3.1.1.3.2" xref="S8.E12.m1.3.3.1.1.3.2.cmml"><mi class="ltx_font_mathcaligraphic" id="S8.E12.m1.3.3.1.1.3.2.2.2" mathsize="144%" xref="S8.E12.m1.3.3.1.1.3.2.2.2.cmml">ℒ</mi><mtext id="S8.E12.m1.3.3.1.1.3.2.2.3" mathsize="144%" xref="S8.E12.m1.3.3.1.1.3.2.2.3a.cmml">recon</mtext><mi id="S8.E12.m1.3.3.1.1.3.2.3" mathsize="144%" xref="S8.E12.m1.3.3.1.1.3.2.3.cmml">m</mi></msubsup><mo id="S8.E12.m1.3.3.1.1.3.1" mathsize="144%" xref="S8.E12.m1.3.3.1.1.3.1.cmml">+</mo><msubsup id="S8.E12.m1.3.3.1.1.3.3" xref="S8.E12.m1.3.3.1.1.3.3.cmml"><mi class="ltx_font_mathcaligraphic" id="S8.E12.m1.3.3.1.1.3.3.2.2" mathsize="144%" xref="S8.E12.m1.3.3.1.1.3.3.2.2.cmml">ℒ</mi><mtext id="S8.E12.m1.3.3.1.1.3.3.2.3" mathsize="144%" xref="S8.E12.m1.3.3.1.1.3.3.2.3a.cmml">align</mtext><mrow id="S8.E12.m1.2.2.2.4" xref="S8.E12.m1.2.2.2.3.cmml"><mi id="S8.E12.m1.1.1.1.1" mathsize="144%" 
xref="S8.E12.m1.1.1.1.1.cmml">m</mi><mo id="S8.E12.m1.2.2.2.4.1" mathsize="144%" xref="S8.E12.m1.2.2.2.3.cmml">,</mo><mi id="S8.E12.m1.2.2.2.2" mathsize="144%" xref="S8.E12.m1.2.2.2.2.cmml">t</mi></mrow></msubsup><mo id="S8.E12.m1.3.3.1.1.3.1a" mathsize="144%" xref="S8.E12.m1.3.3.1.1.3.1.cmml">+</mo><msubsup id="S8.E12.m1.3.3.1.1.3.4" xref="S8.E12.m1.3.3.1.1.3.4.cmml"><mi class="ltx_font_mathcaligraphic" id="S8.E12.m1.3.3.1.1.3.4.2.2" mathsize="144%" xref="S8.E12.m1.3.3.1.1.3.4.2.2.cmml">ℒ</mi><mtext id="S8.E12.m1.3.3.1.1.3.4.2.3" mathsize="144%" xref="S8.E12.m1.3.3.1.1.3.4.2.3a.cmml">recon</mtext><mi id="S8.E12.m1.3.3.1.1.3.4.3" mathsize="144%" xref="S8.E12.m1.3.3.1.1.3.4.3.cmml">t</mi></msubsup></mrow></mrow><mo id="S8.E12.m1.3.3.1.2" lspace="0em" mathsize="144%" xref="S8.E12.m1.3.3.1.1.cmml">.</mo></mrow><annotation-xml encoding="MathML-Content" id="S8.E12.m1.3b"><apply id="S8.E12.m1.3.3.1.1.cmml" xref="S8.E12.m1.3.3.1"><eq id="S8.E12.m1.3.3.1.1.1.cmml" xref="S8.E12.m1.3.3.1.1.1"></eq><apply id="S8.E12.m1.3.3.1.1.2.cmml" xref="S8.E12.m1.3.3.1.1.2"><csymbol cd="ambiguous" id="S8.E12.m1.3.3.1.1.2.1.cmml" xref="S8.E12.m1.3.3.1.1.2">subscript</csymbol><ci id="S8.E12.m1.3.3.1.1.2.2.cmml" xref="S8.E12.m1.3.3.1.1.2.2">ℒ</ci><ci id="S8.E12.m1.3.3.1.1.2.3a.cmml" xref="S8.E12.m1.3.3.1.1.2.3"><mtext id="S8.E12.m1.3.3.1.1.2.3.cmml" mathsize="101%" xref="S8.E12.m1.3.3.1.1.2.3">AE</mtext></ci></apply><apply id="S8.E12.m1.3.3.1.1.3.cmml" xref="S8.E12.m1.3.3.1.1.3"><plus id="S8.E12.m1.3.3.1.1.3.1.cmml" xref="S8.E12.m1.3.3.1.1.3.1"></plus><apply id="S8.E12.m1.3.3.1.1.3.2.cmml" xref="S8.E12.m1.3.3.1.1.3.2"><csymbol cd="ambiguous" id="S8.E12.m1.3.3.1.1.3.2.1.cmml" xref="S8.E12.m1.3.3.1.1.3.2">superscript</csymbol><apply id="S8.E12.m1.3.3.1.1.3.2.2.cmml" xref="S8.E12.m1.3.3.1.1.3.2"><csymbol cd="ambiguous" id="S8.E12.m1.3.3.1.1.3.2.2.1.cmml" xref="S8.E12.m1.3.3.1.1.3.2">subscript</csymbol><ci id="S8.E12.m1.3.3.1.1.3.2.2.2.cmml" xref="S8.E12.m1.3.3.1.1.3.2.2.2">ℒ</ci><ci 
id="S8.E12.m1.3.3.1.1.3.2.2.3a.cmml" xref="S8.E12.m1.3.3.1.1.3.2.2.3"><mtext id="S8.E12.m1.3.3.1.1.3.2.2.3.cmml" mathsize="101%" xref="S8.E12.m1.3.3.1.1.3.2.2.3">recon</mtext></ci></apply><ci id="S8.E12.m1.3.3.1.1.3.2.3.cmml" xref="S8.E12.m1.3.3.1.1.3.2.3">𝑚</ci></apply><apply id="S8.E12.m1.3.3.1.1.3.3.cmml" xref="S8.E12.m1.3.3.1.1.3.3"><csymbol cd="ambiguous" id="S8.E12.m1.3.3.1.1.3.3.1.cmml" xref="S8.E12.m1.3.3.1.1.3.3">superscript</csymbol><apply id="S8.E12.m1.3.3.1.1.3.3.2.cmml" xref="S8.E12.m1.3.3.1.1.3.3"><csymbol cd="ambiguous" id="S8.E12.m1.3.3.1.1.3.3.2.1.cmml" xref="S8.E12.m1.3.3.1.1.3.3">subscript</csymbol><ci id="S8.E12.m1.3.3.1.1.3.3.2.2.cmml" xref="S8.E12.m1.3.3.1.1.3.3.2.2">ℒ</ci><ci id="S8.E12.m1.3.3.1.1.3.3.2.3a.cmml" xref="S8.E12.m1.3.3.1.1.3.3.2.3"><mtext id="S8.E12.m1.3.3.1.1.3.3.2.3.cmml" mathsize="101%" xref="S8.E12.m1.3.3.1.1.3.3.2.3">align</mtext></ci></apply><list id="S8.E12.m1.2.2.2.3.cmml" xref="S8.E12.m1.2.2.2.4"><ci id="S8.E12.m1.1.1.1.1.cmml" xref="S8.E12.m1.1.1.1.1">𝑚</ci><ci id="S8.E12.m1.2.2.2.2.cmml" xref="S8.E12.m1.2.2.2.2">𝑡</ci></list></apply><apply id="S8.E12.m1.3.3.1.1.3.4.cmml" xref="S8.E12.m1.3.3.1.1.3.4"><csymbol cd="ambiguous" id="S8.E12.m1.3.3.1.1.3.4.1.cmml" xref="S8.E12.m1.3.3.1.1.3.4">superscript</csymbol><apply id="S8.E12.m1.3.3.1.1.3.4.2.cmml" xref="S8.E12.m1.3.3.1.1.3.4"><csymbol cd="ambiguous" id="S8.E12.m1.3.3.1.1.3.4.2.1.cmml" xref="S8.E12.m1.3.3.1.1.3.4">subscript</csymbol><ci id="S8.E12.m1.3.3.1.1.3.4.2.2.cmml" xref="S8.E12.m1.3.3.1.1.3.4.2.2">ℒ</ci><ci id="S8.E12.m1.3.3.1.1.3.4.2.3a.cmml" xref="S8.E12.m1.3.3.1.1.3.4.2.3"><mtext id="S8.E12.m1.3.3.1.1.3.4.2.3.cmml" mathsize="101%" xref="S8.E12.m1.3.3.1.1.3.4.2.3">recon</mtext></ci></apply><ci id="S8.E12.m1.3.3.1.1.3.4.3.cmml" xref="S8.E12.m1.3.3.1.1.3.4.3">𝑡</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S8.E12.m1.3c">\displaystyle\mathcal{L}_{\text{AE}}=\mathcal{L}_{\text{recon}}^{m}+\mathcal{L% 
}_{\text{align}}^{m,t}+\mathcal{L}_{\text{recon}}^{t}.</annotation><annotation encoding="application/x-llamapun" id="S8.E12.m1.3d">caligraphic_L start_POSTSUBSCRIPT AE end_POSTSUBSCRIPT = caligraphic_L start_POSTSUBSCRIPT recon end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT + caligraphic_L start_POSTSUBSCRIPT align end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m , italic_t end_POSTSUPERSCRIPT + caligraphic_L start_POSTSUBSCRIPT recon end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT .</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(12)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S8.p2.6"><span class="ltx_text" id="S8.p2.6.1" style="font-size:144%;">The reconstruction loss </span><math alttext="\mathcal{L}_{\text{recon}}^{m}" class="ltx_Math" display="inline" id="S8.p2.6.m1.1"><semantics id="S8.p2.6.m1.1a"><msubsup id="S8.p2.6.m1.1.1" xref="S8.p2.6.m1.1.1.cmml"><mi class="ltx_font_mathcaligraphic" id="S8.p2.6.m1.1.1.2.2" mathsize="144%" xref="S8.p2.6.m1.1.1.2.2.cmml">ℒ</mi><mtext id="S8.p2.6.m1.1.1.2.3" mathsize="144%" xref="S8.p2.6.m1.1.1.2.3a.cmml">recon</mtext><mi id="S8.p2.6.m1.1.1.3" mathsize="144%" xref="S8.p2.6.m1.1.1.3.cmml">m</mi></msubsup><annotation-xml encoding="MathML-Content" id="S8.p2.6.m1.1b"><apply id="S8.p2.6.m1.1.1.cmml" xref="S8.p2.6.m1.1.1"><csymbol cd="ambiguous" id="S8.p2.6.m1.1.1.1.cmml" xref="S8.p2.6.m1.1.1">superscript</csymbol><apply id="S8.p2.6.m1.1.1.2.cmml" xref="S8.p2.6.m1.1.1"><csymbol cd="ambiguous" id="S8.p2.6.m1.1.1.2.1.cmml" xref="S8.p2.6.m1.1.1">subscript</csymbol><ci id="S8.p2.6.m1.1.1.2.2.cmml" xref="S8.p2.6.m1.1.1.2.2">ℒ</ci><ci id="S8.p2.6.m1.1.1.2.3a.cmml" xref="S8.p2.6.m1.1.1.2.3"><mtext id="S8.p2.6.m1.1.1.2.3.cmml" mathsize="101%" xref="S8.p2.6.m1.1.1.2.3">recon</mtext></ci></apply><ci 
id="S8.p2.6.m1.1.1.3.cmml" xref="S8.p2.6.m1.1.1.3">𝑚</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S8.p2.6.m1.1c">\mathcal{L}_{\text{recon}}^{m}</annotation><annotation encoding="application/x-llamapun" id="S8.p2.6.m1.1d">caligraphic_L start_POSTSUBSCRIPT recon end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT</annotation></semantics></math><span class="ltx_text" id="S8.p2.6.2" style="font-size:144%;"> measures the mean squared error (MSE) between the reconstructed sequence and the original motion.</span></p> </div> <div class="ltx_para" id="S8.p3"> <p class="ltx_p" id="S8.p3.1"><span class="ltx_text" id="S8.p3.1.1" style="font-size:144%;">The alignment loss </span><math alttext="\mathcal{L}_{\text{align}}^{m,t}" class="ltx_Math" display="inline" id="S8.p3.1.m1.2"><semantics id="S8.p3.1.m1.2a"><msubsup id="S8.p3.1.m1.2.3" xref="S8.p3.1.m1.2.3.cmml"><mi class="ltx_font_mathcaligraphic" id="S8.p3.1.m1.2.3.2.2" mathsize="144%" xref="S8.p3.1.m1.2.3.2.2.cmml">ℒ</mi><mtext id="S8.p3.1.m1.2.3.2.3" mathsize="144%" xref="S8.p3.1.m1.2.3.2.3a.cmml">align</mtext><mrow id="S8.p3.1.m1.2.2.2.4" xref="S8.p3.1.m1.2.2.2.3.cmml"><mi id="S8.p3.1.m1.1.1.1.1" mathsize="144%" xref="S8.p3.1.m1.1.1.1.1.cmml">m</mi><mo id="S8.p3.1.m1.2.2.2.4.1" mathsize="144%" xref="S8.p3.1.m1.2.2.2.3.cmml">,</mo><mi id="S8.p3.1.m1.2.2.2.2" mathsize="144%" xref="S8.p3.1.m1.2.2.2.2.cmml">t</mi></mrow></msubsup><annotation-xml encoding="MathML-Content" id="S8.p3.1.m1.2b"><apply id="S8.p3.1.m1.2.3.cmml" xref="S8.p3.1.m1.2.3"><csymbol cd="ambiguous" id="S8.p3.1.m1.2.3.1.cmml" xref="S8.p3.1.m1.2.3">superscript</csymbol><apply id="S8.p3.1.m1.2.3.2.cmml" xref="S8.p3.1.m1.2.3"><csymbol cd="ambiguous" id="S8.p3.1.m1.2.3.2.1.cmml" xref="S8.p3.1.m1.2.3">subscript</csymbol><ci id="S8.p3.1.m1.2.3.2.2.cmml" xref="S8.p3.1.m1.2.3.2.2">ℒ</ci><ci id="S8.p3.1.m1.2.3.2.3a.cmml" xref="S8.p3.1.m1.2.3.2.3"><mtext id="S8.p3.1.m1.2.3.2.3.cmml" mathsize="101%" 
xref="S8.p3.1.m1.2.3.2.3">align</mtext></ci></apply><list id="S8.p3.1.m1.2.2.2.3.cmml" xref="S8.p3.1.m1.2.2.2.4"><ci id="S8.p3.1.m1.1.1.1.1.cmml" xref="S8.p3.1.m1.1.1.1.1">𝑚</ci><ci id="S8.p3.1.m1.2.2.2.2.cmml" xref="S8.p3.1.m1.2.2.2.2">𝑡</ci></list></apply></annotation-xml><annotation encoding="application/x-tex" id="S8.p3.1.m1.2c">\mathcal{L}_{\text{align}}^{m,t}</annotation><annotation encoding="application/x-llamapun" id="S8.p3.1.m1.2d">caligraphic_L start_POSTSUBSCRIPT align end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m , italic_t end_POSTSUPERSCRIPT</annotation></semantics></math><span class="ltx_text" id="S8.p3.1.2" style="font-size:144%;"> measures the cosine distance between the motion embedding and the downsized CLIP feature:</span></p> <table class="ltx_equationgroup ltx_eqn_align ltx_eqn_table" id="S11.EGx2"> <tbody id="S8.E13"><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_align_right ltx_eqn_cell"><math alttext="\displaystyle\mathcal{L}_{\text{align}}^{m,t}=1-d_{\text{cos}}\left(\text{Enc}% _{m}\left(\hat{\mathbf{m}}\right),\text{MLP}_{d}(\text{Enc}_{l}({\mathbf{c}})% \right))." 
class="ltx_Math" display="inline" id="S8.E13.m1.5"><semantics id="S8.E13.m1.5a"><mrow id="S8.E13.m1.5.5.1" xref="S8.E13.m1.5.5.1.1.cmml"><mrow id="S8.E13.m1.5.5.1.1" xref="S8.E13.m1.5.5.1.1.cmml"><msubsup id="S8.E13.m1.5.5.1.1.4" xref="S8.E13.m1.5.5.1.1.4.cmml"><mi class="ltx_font_mathcaligraphic" id="S8.E13.m1.5.5.1.1.4.2.2" mathsize="144%" xref="S8.E13.m1.5.5.1.1.4.2.2.cmml">ℒ</mi><mtext id="S8.E13.m1.5.5.1.1.4.2.3" mathsize="144%" xref="S8.E13.m1.5.5.1.1.4.2.3a.cmml">align</mtext><mrow id="S8.E13.m1.2.2.2.4" xref="S8.E13.m1.2.2.2.3.cmml"><mi id="S8.E13.m1.1.1.1.1" mathsize="144%" xref="S8.E13.m1.1.1.1.1.cmml">m</mi><mo id="S8.E13.m1.2.2.2.4.1" mathsize="144%" xref="S8.E13.m1.2.2.2.3.cmml">,</mo><mi id="S8.E13.m1.2.2.2.2" mathsize="144%" xref="S8.E13.m1.2.2.2.2.cmml">t</mi></mrow></msubsup><mo id="S8.E13.m1.5.5.1.1.3" mathsize="144%" xref="S8.E13.m1.5.5.1.1.3.cmml">=</mo><mrow id="S8.E13.m1.5.5.1.1.2" xref="S8.E13.m1.5.5.1.1.2.cmml"><mn id="S8.E13.m1.5.5.1.1.2.4" mathsize="144%" xref="S8.E13.m1.5.5.1.1.2.4.cmml">1</mn><mo id="S8.E13.m1.5.5.1.1.2.3" mathsize="144%" xref="S8.E13.m1.5.5.1.1.2.3.cmml">−</mo><mrow id="S8.E13.m1.5.5.1.1.2.2" xref="S8.E13.m1.5.5.1.1.2.2.cmml"><msub id="S8.E13.m1.5.5.1.1.2.2.4" xref="S8.E13.m1.5.5.1.1.2.2.4.cmml"><mi id="S8.E13.m1.5.5.1.1.2.2.4.2" mathsize="144%" xref="S8.E13.m1.5.5.1.1.2.2.4.2.cmml">d</mi><mtext id="S8.E13.m1.5.5.1.1.2.2.4.3" mathsize="144%" xref="S8.E13.m1.5.5.1.1.2.2.4.3a.cmml">cos</mtext></msub><mo id="S8.E13.m1.5.5.1.1.2.2.3" xref="S8.E13.m1.5.5.1.1.2.2.3.cmml">⁢</mo><mrow id="S8.E13.m1.5.5.1.1.2.2.2.2" xref="S8.E13.m1.5.5.1.1.2.2.2.3.cmml"><mo id="S8.E13.m1.5.5.1.1.2.2.2.2.3" xref="S8.E13.m1.5.5.1.1.2.2.2.3.cmml">(</mo><mrow id="S8.E13.m1.5.5.1.1.1.1.1.1.1" xref="S8.E13.m1.5.5.1.1.1.1.1.1.1.cmml"><msub id="S8.E13.m1.5.5.1.1.1.1.1.1.1.2" xref="S8.E13.m1.5.5.1.1.1.1.1.1.1.2.cmml"><mtext id="S8.E13.m1.5.5.1.1.1.1.1.1.1.2.2" mathsize="144%" xref="S8.E13.m1.5.5.1.1.1.1.1.1.1.2.2a.cmml">Enc</mtext><mi 
id="S8.E13.m1.5.5.1.1.1.1.1.1.1.2.3" mathsize="144%" xref="S8.E13.m1.5.5.1.1.1.1.1.1.1.2.3.cmml">m</mi></msub><mo id="S8.E13.m1.5.5.1.1.1.1.1.1.1.1" xref="S8.E13.m1.5.5.1.1.1.1.1.1.1.1.cmml">⁢</mo><mrow id="S8.E13.m1.5.5.1.1.1.1.1.1.1.3.2" xref="S8.E13.m1.3.3.cmml"><mo id="S8.E13.m1.5.5.1.1.1.1.1.1.1.3.2.1" xref="S8.E13.m1.3.3.cmml">(</mo><mover accent="true" id="S8.E13.m1.3.3" xref="S8.E13.m1.3.3.cmml"><mi id="S8.E13.m1.3.3.2" mathsize="144%" xref="S8.E13.m1.3.3.2.cmml">𝐦</mi><mo id="S8.E13.m1.3.3.1" mathsize="144%" xref="S8.E13.m1.3.3.1.cmml">^</mo></mover><mo id="S8.E13.m1.5.5.1.1.1.1.1.1.1.3.2.2" xref="S8.E13.m1.3.3.cmml">)</mo></mrow></mrow><mo id="S8.E13.m1.5.5.1.1.2.2.2.2.4" mathsize="144%" xref="S8.E13.m1.5.5.1.1.2.2.2.3.cmml">,</mo><mrow id="S8.E13.m1.5.5.1.1.2.2.2.2.2" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.cmml"><msub id="S8.E13.m1.5.5.1.1.2.2.2.2.2.3" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.3.cmml"><mtext id="S8.E13.m1.5.5.1.1.2.2.2.2.2.3.2" mathsize="144%" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.3.2a.cmml">MLP</mtext><mi id="S8.E13.m1.5.5.1.1.2.2.2.2.2.3.3" mathsize="144%" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.3.3.cmml">d</mi></msub><mo id="S8.E13.m1.5.5.1.1.2.2.2.2.2.2" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.2.cmml">⁢</mo><mrow id="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.cmml"><mo id="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.2" maxsize="144%" minsize="144%" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.cmml">(</mo><mrow id="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.cmml"><msub id="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.2" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.2.cmml"><mtext id="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.2.2" mathsize="144%" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.2.2a.cmml">Enc</mtext><mi id="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.2.3" mathsize="144%" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.2.3.cmml">l</mi></msub><mo id="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.1" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.1.cmml">⁢</mo><mrow 
id="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.3.2" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.cmml"><mo id="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.3.2.1" maxsize="144%" minsize="144%" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.cmml">(</mo><mi id="S8.E13.m1.4.4" mathsize="144%" xref="S8.E13.m1.4.4.cmml">𝐜</mi><mo id="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.3.2.2" maxsize="144%" minsize="144%" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.cmml">)</mo></mrow></mrow><mo id="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.3" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.cmml">)</mo></mrow></mrow><mo id="S8.E13.m1.5.5.1.1.2.2.2.2.5" maxsize="144%" minsize="144%" xref="S8.E13.m1.5.5.1.1.2.2.2.3.cmml">)</mo></mrow></mrow></mrow></mrow><mo id="S8.E13.m1.5.5.1.2" lspace="0em" mathsize="144%" xref="S8.E13.m1.5.5.1.1.cmml">.</mo></mrow><annotation-xml encoding="MathML-Content" id="S8.E13.m1.5b"><apply id="S8.E13.m1.5.5.1.1.cmml" xref="S8.E13.m1.5.5.1"><eq id="S8.E13.m1.5.5.1.1.3.cmml" xref="S8.E13.m1.5.5.1.1.3"></eq><apply id="S8.E13.m1.5.5.1.1.4.cmml" xref="S8.E13.m1.5.5.1.1.4"><csymbol cd="ambiguous" id="S8.E13.m1.5.5.1.1.4.1.cmml" xref="S8.E13.m1.5.5.1.1.4">superscript</csymbol><apply id="S8.E13.m1.5.5.1.1.4.2.cmml" xref="S8.E13.m1.5.5.1.1.4"><csymbol cd="ambiguous" id="S8.E13.m1.5.5.1.1.4.2.1.cmml" xref="S8.E13.m1.5.5.1.1.4">subscript</csymbol><ci id="S8.E13.m1.5.5.1.1.4.2.2.cmml" xref="S8.E13.m1.5.5.1.1.4.2.2">ℒ</ci><ci id="S8.E13.m1.5.5.1.1.4.2.3a.cmml" xref="S8.E13.m1.5.5.1.1.4.2.3"><mtext id="S8.E13.m1.5.5.1.1.4.2.3.cmml" mathsize="101%" xref="S8.E13.m1.5.5.1.1.4.2.3">align</mtext></ci></apply><list id="S8.E13.m1.2.2.2.3.cmml" xref="S8.E13.m1.2.2.2.4"><ci id="S8.E13.m1.1.1.1.1.cmml" xref="S8.E13.m1.1.1.1.1">𝑚</ci><ci id="S8.E13.m1.2.2.2.2.cmml" xref="S8.E13.m1.2.2.2.2">𝑡</ci></list></apply><apply id="S8.E13.m1.5.5.1.1.2.cmml" xref="S8.E13.m1.5.5.1.1.2"><minus id="S8.E13.m1.5.5.1.1.2.3.cmml" xref="S8.E13.m1.5.5.1.1.2.3"></minus><cn id="S8.E13.m1.5.5.1.1.2.4.cmml" type="integer" xref="S8.E13.m1.5.5.1.1.2.4">1</cn><apply 
id="S8.E13.m1.5.5.1.1.2.2.cmml" xref="S8.E13.m1.5.5.1.1.2.2"><times id="S8.E13.m1.5.5.1.1.2.2.3.cmml" xref="S8.E13.m1.5.5.1.1.2.2.3"></times><apply id="S8.E13.m1.5.5.1.1.2.2.4.cmml" xref="S8.E13.m1.5.5.1.1.2.2.4"><csymbol cd="ambiguous" id="S8.E13.m1.5.5.1.1.2.2.4.1.cmml" xref="S8.E13.m1.5.5.1.1.2.2.4">subscript</csymbol><ci id="S8.E13.m1.5.5.1.1.2.2.4.2.cmml" xref="S8.E13.m1.5.5.1.1.2.2.4.2">𝑑</ci><ci id="S8.E13.m1.5.5.1.1.2.2.4.3a.cmml" xref="S8.E13.m1.5.5.1.1.2.2.4.3"><mtext id="S8.E13.m1.5.5.1.1.2.2.4.3.cmml" mathsize="101%" xref="S8.E13.m1.5.5.1.1.2.2.4.3">cos</mtext></ci></apply><interval closure="open" id="S8.E13.m1.5.5.1.1.2.2.2.3.cmml" xref="S8.E13.m1.5.5.1.1.2.2.2.2"><apply id="S8.E13.m1.5.5.1.1.1.1.1.1.1.cmml" xref="S8.E13.m1.5.5.1.1.1.1.1.1.1"><times id="S8.E13.m1.5.5.1.1.1.1.1.1.1.1.cmml" xref="S8.E13.m1.5.5.1.1.1.1.1.1.1.1"></times><apply id="S8.E13.m1.5.5.1.1.1.1.1.1.1.2.cmml" xref="S8.E13.m1.5.5.1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S8.E13.m1.5.5.1.1.1.1.1.1.1.2.1.cmml" xref="S8.E13.m1.5.5.1.1.1.1.1.1.1.2">subscript</csymbol><ci id="S8.E13.m1.5.5.1.1.1.1.1.1.1.2.2a.cmml" xref="S8.E13.m1.5.5.1.1.1.1.1.1.1.2.2"><mtext id="S8.E13.m1.5.5.1.1.1.1.1.1.1.2.2.cmml" mathsize="144%" xref="S8.E13.m1.5.5.1.1.1.1.1.1.1.2.2">Enc</mtext></ci><ci id="S8.E13.m1.5.5.1.1.1.1.1.1.1.2.3.cmml" xref="S8.E13.m1.5.5.1.1.1.1.1.1.1.2.3">𝑚</ci></apply><apply id="S8.E13.m1.3.3.cmml" xref="S8.E13.m1.5.5.1.1.1.1.1.1.1.3.2"><ci id="S8.E13.m1.3.3.1.cmml" xref="S8.E13.m1.3.3.1">^</ci><ci id="S8.E13.m1.3.3.2.cmml" xref="S8.E13.m1.3.3.2">𝐦</ci></apply></apply><apply id="S8.E13.m1.5.5.1.1.2.2.2.2.2.cmml" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2"><times id="S8.E13.m1.5.5.1.1.2.2.2.2.2.2.cmml" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.2"></times><apply id="S8.E13.m1.5.5.1.1.2.2.2.2.2.3.cmml" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.3"><csymbol cd="ambiguous" id="S8.E13.m1.5.5.1.1.2.2.2.2.2.3.1.cmml" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.3">subscript</csymbol><ci id="S8.E13.m1.5.5.1.1.2.2.2.2.2.3.2a.cmml" 
xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.3.2"><mtext id="S8.E13.m1.5.5.1.1.2.2.2.2.2.3.2.cmml" mathsize="144%" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.3.2">MLP</mtext></ci><ci id="S8.E13.m1.5.5.1.1.2.2.2.2.2.3.3.cmml" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.3.3">𝑑</ci></apply><apply id="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.cmml" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1"><times id="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.1.cmml" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.1"></times><apply id="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.2.cmml" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.2"><csymbol cd="ambiguous" id="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.2.1.cmml" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.2">subscript</csymbol><ci id="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.2.2a.cmml" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.2.2"><mtext id="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.2.2.cmml" mathsize="144%" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.2.2">Enc</mtext></ci><ci id="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.2.3.cmml" xref="S8.E13.m1.5.5.1.1.2.2.2.2.2.1.1.1.2.3">𝑙</ci></apply><ci id="S8.E13.m1.4.4.cmml" xref="S8.E13.m1.4.4">𝐜</ci></apply></apply></interval></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S8.E13.m1.5c">\displaystyle\mathcal{L}_{\text{align}}^{m,t}=1-d_{\text{cos}}\left(\text{Enc}% _{m}\left(\hat{\mathbf{m}}\right),\text{MLP}_{d}(\text{Enc}_{l}({\mathbf{c}})% \right)).</annotation><annotation encoding="application/x-llamapun" id="S8.E13.m1.5d">caligraphic_L start_POSTSUBSCRIPT align end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m , italic_t end_POSTSUPERSCRIPT = 1 - italic_d start_POSTSUBSCRIPT cos end_POSTSUBSCRIPT ( Enc start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( over^ start_ARG bold_m end_ARG ) , MLP start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT ( Enc start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT ( bold_c ) ) ) .</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle 
ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(13)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S8.p3.2"><span class="ltx_text" id="S8.p3.2.1" style="font-size:144%;">The text embedding reconstruction loss </span><math alttext="\mathcal{L}_{\text{recon}}^{t}" class="ltx_Math" display="inline" id="S8.p3.2.m1.1"><semantics id="S8.p3.2.m1.1a"><msubsup id="S8.p3.2.m1.1.1" xref="S8.p3.2.m1.1.1.cmml"><mi class="ltx_font_mathcaligraphic" id="S8.p3.2.m1.1.1.2.2" mathsize="144%" xref="S8.p3.2.m1.1.1.2.2.cmml">ℒ</mi><mtext id="S8.p3.2.m1.1.1.2.3" mathsize="144%" xref="S8.p3.2.m1.1.1.2.3a.cmml">recon</mtext><mi id="S8.p3.2.m1.1.1.3" mathsize="144%" xref="S8.p3.2.m1.1.1.3.cmml">t</mi></msubsup><annotation-xml encoding="MathML-Content" id="S8.p3.2.m1.1b"><apply id="S8.p3.2.m1.1.1.cmml" xref="S8.p3.2.m1.1.1"><csymbol cd="ambiguous" id="S8.p3.2.m1.1.1.1.cmml" xref="S8.p3.2.m1.1.1">superscript</csymbol><apply id="S8.p3.2.m1.1.1.2.cmml" xref="S8.p3.2.m1.1.1"><csymbol cd="ambiguous" id="S8.p3.2.m1.1.1.2.1.cmml" xref="S8.p3.2.m1.1.1">subscript</csymbol><ci id="S8.p3.2.m1.1.1.2.2.cmml" xref="S8.p3.2.m1.1.1.2.2">ℒ</ci><ci id="S8.p3.2.m1.1.1.2.3a.cmml" xref="S8.p3.2.m1.1.1.2.3"><mtext id="S8.p3.2.m1.1.1.2.3.cmml" mathsize="101%" xref="S8.p3.2.m1.1.1.2.3">recon</mtext></ci></apply><ci id="S8.p3.2.m1.1.1.3.cmml" xref="S8.p3.2.m1.1.1.3">𝑡</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S8.p3.2.m1.1c">\mathcal{L}_{\text{recon}}^{t}</annotation><annotation encoding="application/x-llamapun" id="S8.p3.2.m1.1d">caligraphic_L start_POSTSUBSCRIPT recon end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT</annotation></semantics></math><span class="ltx_text" id="S8.p3.2.2" style="font-size:144%;"> measures the MSE distance between the reconstructed CLIP embedding and the original one:</span></p> <table class="ltx_equationgroup ltx_eqn_align ltx_eqn_table" id="S11.EGx3"> <tbody id="S8.E14"><tr 
class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_align_right ltx_eqn_cell"><math alttext="\displaystyle\mathcal{L}_{\text{recon}}^{t}=\|\text{MLP}_{u}(\text{MLP}_{d}(% \text{Enc}_{l}({\mathbf{c}})))-\text{Enc}_{l}({\mathbf{c}})\|_{2}" class="ltx_math_unparsed" display="inline" id="S8.E14.m1.1"><semantics id="S8.E14.m1.1a"><mrow id="S8.E14.m1.1b"><msubsup id="S8.E14.m1.1.2"><mi class="ltx_font_mathcaligraphic" id="S8.E14.m1.1.2.2.2" mathsize="144%">ℒ</mi><mtext id="S8.E14.m1.1.2.2.3" mathsize="144%">recon</mtext><mi id="S8.E14.m1.1.2.3" mathsize="144%">t</mi></msubsup><mo id="S8.E14.m1.1.3" mathsize="144%" rspace="0em">=</mo><mo id="S8.E14.m1.1.4" lspace="0em" mathsize="144%" rspace="0.167em">∥</mo><msub id="S8.E14.m1.1.5"><mtext id="S8.E14.m1.1.5.2" mathsize="144%">MLP</mtext><mi id="S8.E14.m1.1.5.3" mathsize="144%">u</mi></msub><mrow id="S8.E14.m1.1.6"><mo id="S8.E14.m1.1.6.1" maxsize="144%" minsize="144%">(</mo><msub id="S8.E14.m1.1.6.2"><mtext id="S8.E14.m1.1.6.2.2" mathsize="144%">MLP</mtext><mi id="S8.E14.m1.1.6.2.3" mathsize="144%">d</mi></msub><mrow id="S8.E14.m1.1.6.3"><mo id="S8.E14.m1.1.6.3.1" maxsize="144%" minsize="144%">(</mo><msub id="S8.E14.m1.1.6.3.2"><mtext id="S8.E14.m1.1.6.3.2.2" mathsize="144%">Enc</mtext><mi id="S8.E14.m1.1.6.3.2.3" mathsize="144%">l</mi></msub><mrow id="S8.E14.m1.1.6.3.3"><mo id="S8.E14.m1.1.6.3.3.1" maxsize="144%" minsize="144%">(</mo><mi id="S8.E14.m1.1.1" mathsize="144%">𝐜</mi><mo id="S8.E14.m1.1.6.3.3.2" maxsize="144%" minsize="144%">)</mo></mrow><mo id="S8.E14.m1.1.6.3.4" maxsize="144%" minsize="144%">)</mo></mrow><mo id="S8.E14.m1.1.6.4" maxsize="144%" minsize="144%">)</mo></mrow><mo id="S8.E14.m1.1.8" mathsize="144%">−</mo><mtext id="S8.E14.m1.1.9" mathsize="144%">Enc</mtext><msub id="S8.E14.m1.1.10"><mi id="S8.E14.m1.1.10a"></mi><mi id="S8.E14.m1.1.10.1" 
mathsize="144%">l</mi></msub><mo id="S8.E14.m1.1.11" maxsize="144%" minsize="144%">(</mo><mi id="S8.E14.m1.1.12" mathsize="144%">𝐜</mi><mo id="S8.E14.m1.1.13" maxsize="144%" minsize="144%">)</mo><mo id="S8.E14.m1.1.14" lspace="0em" mathsize="144%" rspace="0.167em">∥</mo><msub id="S8.E14.m1.1.15"><mi id="S8.E14.m1.1.15a"></mi><mn id="S8.E14.m1.1.15.1" mathsize="144%">2</mn></msub></mrow><annotation encoding="application/x-tex" id="S8.E14.m1.1c">\displaystyle\mathcal{L}_{\text{recon}}^{t}=\|\text{MLP}_{u}(\text{MLP}_{d}(% \text{Enc}_{l}({\mathbf{c}})))-\text{Enc}_{l}({\mathbf{c}})\|_{2}</annotation><annotation encoding="application/x-llamapun" id="S8.E14.m1.1d">caligraphic_L start_POSTSUBSCRIPT recon end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT = ∥ MLP start_POSTSUBSCRIPT italic_u end_POSTSUBSCRIPT ( MLP start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT ( Enc start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT ( bold_c ) ) ) - Enc start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT ( bold_c ) ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(14)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S8.p3.3"><span class="ltx_text" id="S8.p3.3.1" style="font-size:144%;">The weights of </span><math alttext="\text{Enc}_{l}" class="ltx_Math" display="inline" id="S8.p3.3.m1.1"><semantics id="S8.p3.3.m1.1a"><msub id="S8.p3.3.m1.1.1" xref="S8.p3.3.m1.1.1.cmml"><mtext id="S8.p3.3.m1.1.1.2" mathsize="144%" xref="S8.p3.3.m1.1.1.2a.cmml">Enc</mtext><mi id="S8.p3.3.m1.1.1.3" mathsize="144%" xref="S8.p3.3.m1.1.1.3.cmml">l</mi></msub><annotation-xml encoding="MathML-Content" id="S8.p3.3.m1.1b"><apply id="S8.p3.3.m1.1.1.cmml" xref="S8.p3.3.m1.1.1"><csymbol cd="ambiguous" id="S8.p3.3.m1.1.1.1.cmml" xref="S8.p3.3.m1.1.1">subscript</csymbol><ci 
id="S8.p3.3.m1.1.1.2a.cmml" xref="S8.p3.3.m1.1.1.2"><mtext id="S8.p3.3.m1.1.1.2.cmml" mathsize="144%" xref="S8.p3.3.m1.1.1.2">Enc</mtext></ci><ci id="S8.p3.3.m1.1.1.3.cmml" xref="S8.p3.3.m1.1.1.3">𝑙</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S8.p3.3.m1.1c">\text{Enc}_{l}</annotation><annotation encoding="application/x-llamapun" id="S8.p3.3.m1.1d">Enc start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT</annotation></semantics></math><span class="ltx_text" id="S8.p3.3.2" style="font-size:144%;"> are fixed during training. To preserve the semantic information, we follow the sampling strategy used in MotionCLIP </span><cite class="ltx_cite ltx_citemacro_cite"><span class="ltx_text" id="S8.p3.3.3.1" style="font-size:144%;">[</span><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib38" title=""><span class="ltx_text" style="font-size:90%;">38</span></a><span class="ltx_text" id="S8.p3.3.4.2" style="font-size:144%;">]</span></cite><span class="ltx_text" id="S8.p3.3.5" style="font-size:144%;">. We sample 300 frames from the 30 fps motion data and apply skip sampling to motion clips longer than 10 seconds so that the whole clip is covered.</span></p> </div> </section> <section class="ltx_section ltx_centering" id="S9"> <h2 class="ltx_title ltx_title_section" style="font-size:144%;"> <span class="ltx_tag ltx_tag_section">9 </span>New Skill Scalability</h2> <div class="ltx_para" id="S9.p1"> <p class="ltx_p" id="S9.p1.1"><span class="ltx_text" id="S9.p1.1.1" style="font-size:144%;">In </span><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S7.F6" style="font-size:144%;" title="In 7 Reward Templates ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Fig.</span> <span class="ltx_text ltx_ref_tag">6</span></a><span class="ltx_text" id="S9.p1.1.2" style="font-size:144%;">, we show that our framework scales easily to new skills. 
When new skills or styles are introduced, we need to train the corresponding skill based on the 3 kinds of templates and expand the script database following the instructions in </span><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S3.SS1" style="font-size:144%;" title="3.1 Short Script Database Construction ‣ 3 Method ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Sec.</span> <span class="ltx_text ltx_ref_tag">3.1</span></a><span class="ltx_text" id="S9.p1.1.3" style="font-size:144%;">.</span></p> </div> </section> <section class="ltx_section ltx_centering" id="S10"> <h2 class="ltx_title ltx_title_section" style="font-size:144%;"> <span class="ltx_tag ltx_tag_section">10 </span>ViconStyle Dataset</h2> <div class="ltx_para" id="S10.p1"> <p class="ltx_p" id="S10.p1.1"><span class="ltx_text" id="S10.p1.1.1" style="font-size:144%;">We introduce ViconStyle, a comprehensive motion dataset that provides well-labeled, reconstructed motion clips with diverse styles and multiple skills.</span></p> </div> <figure class="ltx_figure" id="S10.F8"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_portrait" height="551" id="S10.F8.g1" src="x8.png" width="415"/> <figcaption class="ltx_caption ltx_centering" style="font-size:144%;"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S10.F8.4.1.1" style="font-size:63%;">Figure 8</span>: </span><span class="ltx_text" id="S10.F8.5.2" style="font-size:63%;">The motion capture environment of the Vicon optical motion capture system.</span></figcaption> </figure> <section class="ltx_subsection" id="S10.SS1"> <h3 class="ltx_title ltx_title_subsection" style="font-size:144%;"> <span class="ltx_tag ltx_tag_subsection">10.1 </span>Capture Setting</h3> <div class="ltx_para" id="S10.SS1.p1"> <p class="ltx_p" id="S10.SS1.p1.1"><span class="ltx_text" id="S10.SS1.p1.1.1" style="font-size:144%;">The motion clips are 
captured with Vicon, an optical motion capture system, as shown in Figure </span><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S10.F8" style="font-size:144%;" title="Figure 8 ‣ 10 ViconStyle Dataset ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">8</span></a><span class="ltx_text" id="S10.SS1.p1.1.2" style="font-size:144%;">. All motion clips are captured at 120 fps. During capture, we asked the actors to interact with scene objects of different sizes and weights, for example by lying on the sofa or carrying boxes.</span></p> </div> <div class="ltx_para" id="S10.SS1.p2"> <p class="ltx_p" id="S10.SS1.p2.1"><span class="ltx_text" id="S10.SS1.p2.1.1" style="font-size:144%;">We used SOMA </span><cite class="ltx_cite ltx_citemacro_cite"><span class="ltx_text" id="S10.SS1.p2.1.2.1" style="font-size:144%;">[</span><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib10" title=""><span class="ltx_text" style="font-size:90%;">10</span></a><span class="ltx_text" id="S10.SS1.p2.1.3.2" style="font-size:144%;">]</span></cite><span class="ltx_text" id="S10.SS1.p2.1.4" style="font-size:144%;"> to fit the SMPL </span><cite class="ltx_cite ltx_citemacro_cite"><span class="ltx_text" id="S10.SS1.p2.1.5.1" style="font-size:144%;">[</span><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#bib.bib19" title=""><span class="ltx_text" style="font-size:90%;">19</span></a><span class="ltx_text" id="S10.SS1.p2.1.6.2" style="font-size:144%;">]</span></cite><span class="ltx_text" id="S10.SS1.p2.1.7" style="font-size:144%;"> body model and its pose parameters. 
The mocap data are then annotated with text descriptions that contain motion details, such as “hands on the thighs” and “lean back”, as well as motion styles and emotions.</span></p> </div> <div class="ltx_para" id="S10.SS1.p3"> <p class="ltx_p" id="S10.SS1.p3.1"><span class="ltx_text" id="S10.SS1.p3.1.1" style="font-size:144%;">We also estimate the translation, orientation, and size of the captured scene objects. We divide this reconstruction problem into two stages. In the first stage, we approximate the initial state of the scene objects. Since the scene objects are mainly boxes, the state estimation problem can be converted into an axis regression problem. We first regress the most suitable local coordinate frame by rotating the axis to minimize the maximum distance from the captured marker points to the axis. We then move the origin to the center of the bounding box of the marker points, after which the scale can also be calculated easily. In the second stage, we simply represent the subsequent translation and orientation as displacements and rotations relative to the initial frame.</span></p> </div> </section> <section class="ltx_subsection" id="S10.SS2"> <h3 class="ltx_title ltx_title_subsection" style="font-size:144%;"> <span class="ltx_tag ltx_tag_subsection">10.2 </span>Dataset Statistics</h3> <div class="ltx_para" id="S10.SS2.p1"> <p class="ltx_p" id="S10.SS2.p1.1"><span class="ltx_text" id="S10.SS2.p1.1.1" style="font-size:144%;">We recruited three actors to capture the dataset. The captured motion clips cover 7 skills, and the actors were asked to perform in different styles and to add details in every motion clip. The dataset is 71.6 minutes long and contains 415 clips in total. 
The actors' information is listed in </span><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S10.T12" style="font-size:144%;" title="In 10.2 Dataset Statistics ‣ 10 ViconStyle Dataset ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Tab.</span> <span class="ltx_text ltx_ref_tag">12</span></a><span class="ltx_text" id="S10.SS2.p1.1.2" style="font-size:144%;">, and the detailed statistics of the dataset are listed in </span><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S4.T2" style="font-size:144%;" title="In 4.1 Dataset ‣ 4 Experiments ‣ SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Tab.</span> <span class="ltx_text ltx_ref_tag">2</span></a><span class="ltx_text" id="S10.SS2.p1.1.3" style="font-size:144%;">.</span></p> </div> <figure class="ltx_table" id="S10.T12"> <div class="ltx_inline-block ltx_align_center ltx_transformed_outer" id="S10.T12.2" style="width:346.9pt;height:86.4pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(28.9pt,-7.2pt) scale(1.20008585366142,1.20008585366142) ;"> <table class="ltx_tabular ltx_align_middle" id="S10.T12.2.1"> <tr class="ltx_tr" id="S10.T12.2.1.1"> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" id="S10.T12.2.1.1.1"><span class="ltx_text" id="S10.T12.2.1.1.1.1" style="font-size:144%;">Actor No.</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" id="S10.T12.2.1.1.2"><span class="ltx_text" id="S10.T12.2.1.1.2.1" style="font-size:144%;">Age</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" id="S10.T12.2.1.1.3"><span class="ltx_text" id="S10.T12.2.1.1.3.1" style="font-size:144%;">Gender</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_tt" id="S10.T12.2.1.1.4"><span class="ltx_text" 
id="S10.T12.2.1.1.4.1" style="font-size:144%;">Height</span></td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S10.T12.2.1.1.5"><span class="ltx_text" id="S10.T12.2.1.1.5.1" style="font-size:144%;">Weight</span></td> </tr> <tr class="ltx_tr" id="S10.T12.2.1.2"> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S10.T12.2.1.2.1"><span class="ltx_text" id="S10.T12.2.1.2.1.1" style="font-size:144%;">1</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S10.T12.2.1.2.2"><span class="ltx_text" id="S10.T12.2.1.2.2.1" style="font-size:144%;">22</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S10.T12.2.1.2.3"><span class="ltx_text" id="S10.T12.2.1.2.3.1" style="font-size:144%;">Female</span></td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S10.T12.2.1.2.4"><span class="ltx_text" id="S10.T12.2.1.2.4.1" style="font-size:144%;">168</span></td> <td class="ltx_td ltx_align_center ltx_border_t" id="S10.T12.2.1.2.5"><span class="ltx_text" id="S10.T12.2.1.2.5.1" style="font-size:144%;">55</span></td> </tr> <tr class="ltx_tr" id="S10.T12.2.1.3"> <td class="ltx_td ltx_align_center ltx_border_r" id="S10.T12.2.1.3.1"><span class="ltx_text" id="S10.T12.2.1.3.1.1" style="font-size:144%;">2</span></td> <td class="ltx_td ltx_align_center ltx_border_r" id="S10.T12.2.1.3.2"><span class="ltx_text" id="S10.T12.2.1.3.2.1" style="font-size:144%;">22</span></td> <td class="ltx_td ltx_align_center ltx_border_r" id="S10.T12.2.1.3.3"><span class="ltx_text" id="S10.T12.2.1.3.3.1" style="font-size:144%;">Male</span></td> <td class="ltx_td ltx_align_center ltx_border_r" id="S10.T12.2.1.3.4"><span class="ltx_text" id="S10.T12.2.1.3.4.1" style="font-size:144%;">182</span></td> <td class="ltx_td ltx_align_center" id="S10.T12.2.1.3.5"><span class="ltx_text" id="S10.T12.2.1.3.5.1" style="font-size:144%;">71</span></td> </tr> <tr class="ltx_tr" id="S10.T12.2.1.4"> <td class="ltx_td ltx_align_center 
ltx_border_bb ltx_border_r" id="S10.T12.2.1.4.1"><span class="ltx_text" id="S10.T12.2.1.4.1.1" style="font-size:144%;">3</span></td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S10.T12.2.1.4.2"><span class="ltx_text" id="S10.T12.2.1.4.2.1" style="font-size:144%;">30</span></td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S10.T12.2.1.4.3"><span class="ltx_text" id="S10.T12.2.1.4.3.1" style="font-size:144%;">Male</span></td> <td class="ltx_td ltx_align_center ltx_border_bb ltx_border_r" id="S10.T12.2.1.4.4"><span class="ltx_text" id="S10.T12.2.1.4.4.1" style="font-size:144%;">175</span></td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S10.T12.2.1.4.5"><span class="ltx_text" id="S10.T12.2.1.4.5.1" style="font-size:144%;">85</span></td> </tr> </table> </span></div> <figcaption class="ltx_caption ltx_centering" style="font-size:144%;"><span class="ltx_tag ltx_tag_table"><span class="ltx_text" id="S10.T12.5.1.1" style="font-size:63%;">Table 12</span>: </span><span class="ltx_text" id="S10.T12.6.2" style="font-size:63%;">Actor information.</span></figcaption> </figure> </section> <section class="ltx_subsection" id="S10.SS3"> <h3 class="ltx_title ltx_title_subsection" style="font-size:144%;"> <span class="ltx_tag ltx_tag_subsection">10.3 </span>Qualitative Results</h3> <div class="ltx_para" id="S10.SS3.p1"> <p class="ltx_p" id="S10.SS3.p1.1"><span class="ltx_text" id="S10.SS3.p1.1.1" style="font-size:144%;">The captured motion contains diverse styles of Idle, Lie, Carry, and GetUp skills. 
See  </span><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S6.F5" style="font-size:144%;" title="In SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Fig.</span> <span class="ltx_text ltx_ref_tag">5</span></a><span class="ltx_text" id="S10.SS3.p1.1.2" style="font-size:144%;"> for demonstration.</span></p> </div> </section> </section> <section class="ltx_section ltx_centering" id="S11"> <h2 class="ltx_title ltx_title_section" style="font-size:144%;"> <span class="ltx_tag ltx_tag_section">11 </span>Short Script Examples</h2> <figure class="ltx_table" id="S11.T13"> <div class="ltx_flex_figure ltx_flex_table"> <div class="ltx_flex_cell ltx_flex_size_1"> <div class="ltx_inline-block ltx_figure_panel ltx_transformed_outer" id="S11.T13.2" style="width:372.6pt;height:95.8pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(-58.8pt,15.1pt) scale(0.76,0.76) ;"> <table class="ltx_tabular ltx_align_middle" id="S11.T13.2.1"> <tr class="ltx_tr" id="S11.T13.2.1.1"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_tt" colspan="4" id="S11.T13.2.1.1.1"> <span class="ltx_text" id="S11.T13.2.1.1.1.1" style="font-size:90%;">Summary: The character enjoys a </span><span class="ltx_text ltx_font_bold" id="S11.T13.2.1.1.1.2" style="font-size:90%;">relaxed</span><span class="ltx_text" id="S11.T13.2.1.1.1.3" style="font-size:90%;"> afternoon in the living room.</span> </td> </tr> <tr class="ltx_tr" id="S11.T13.2.1.2"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.2.1.2.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.2.1.1"> <span class="ltx_p" id="S11.T13.2.1.2.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.2.1.2.1.1.1.1" style="font-size:90%;">skill</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.2.1.2.2"> 
<span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.2.2.1"> <span class="ltx_p" id="S11.T13.2.1.2.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.2.1.2.2.1.1.1" style="font-size:90%;">style</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.2.1.2.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.2.3.1"> <span class="ltx_p" id="S11.T13.2.1.2.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.2.1.2.3.1.1.1" style="font-size:90%;">object</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.2.1.2.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.2.4.1"> <span class="ltx_p" id="S11.T13.2.1.2.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.2.1.2.4.1.1.1" style="font-size:90%;">captions</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.2.1.3"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.2.1.3.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.3.1.1"> <span class="ltx_p" id="S11.T13.2.1.3.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.2.1.3.1.1.1.1" style="font-size:90%;">loco</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.2.1.3.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.3.2.1"> <span class="ltx_p" id="S11.T13.2.1.3.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.2.1.3.2.1.1.1" style="font-size:90%;">neutral</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.2.1.3.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.3.3.1"> <span class="ltx_p" id="S11.T13.2.1.3.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.2.1.3.3.1.1.1" style="font-size:90%;">-</span></span> </span> </td> <td 
class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.2.1.3.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.3.4.1"> <span class="ltx_p" id="S11.T13.2.1.3.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.2.1.3.4.1.1.1" style="font-size:90%;">smoothly forward walk</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.2.1.4"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.2.1.4.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.4.1.1"> <span class="ltx_p" id="S11.T13.2.1.4.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.2.1.4.1.1.1.1" style="font-size:90%;">idle</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.2.1.4.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.4.2.1"> <span class="ltx_p" id="S11.T13.2.1.4.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.2.1.4.2.1.1.1" style="font-size:90%;">relaxed</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.2.1.4.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.4.3.1"> <span class="ltx_p" id="S11.T13.2.1.4.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.2.1.4.3.1.1.1" style="font-size:90%;">-</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.2.1.4.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.4.4.1"> <span class="ltx_p" id="S11.T13.2.1.4.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.2.1.4.4.1.1.1" style="font-size:90%;">relaxing body</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.2.1.5"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.2.1.5.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.5.1.1"> <span class="ltx_p" 
id="S11.T13.2.1.5.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.2.1.5.1.1.1.1" style="font-size:90%;">sit</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.2.1.5.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.5.2.1"> <span class="ltx_p" id="S11.T13.2.1.5.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.2.1.5.2.1.1.1" style="font-size:90%;">relaxed</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.2.1.5.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.5.3.1"> <span class="ltx_p" id="S11.T13.2.1.5.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.2.1.5.3.1.1.1" style="font-size:90%;">sofa</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.2.1.5.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.5.4.1"> <span class="ltx_p" id="S11.T13.2.1.5.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.2.1.5.4.1.1.1" style="font-size:90%;">leaning back, legs straight, hands supporting head</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.2.1.6"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.2.1.6.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.6.1.1"> <span class="ltx_p" id="S11.T13.2.1.6.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.2.1.6.1.1.1.1" style="font-size:90%;">getup</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.2.1.6.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.6.2.1"> <span class="ltx_p" id="S11.T13.2.1.6.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.2.1.6.2.1.1.1" style="font-size:90%;">neutral</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top 
ltx_border_r ltx_border_t" id="S11.T13.2.1.6.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.6.3.1"> <span class="ltx_p" id="S11.T13.2.1.6.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.2.1.6.3.1.1.1" style="font-size:90%;">sofa</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.2.1.6.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.6.4.1"> <span class="ltx_p" id="S11.T13.2.1.6.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.2.1.6.4.1.1.1" style="font-size:90%;">-</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.2.1.7"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.2.1.7.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.7.1.1"> <span class="ltx_p" id="S11.T13.2.1.7.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.2.1.7.1.1.1.1" style="font-size:90%;">touch</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.2.1.7.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.7.2.1"> <span class="ltx_p" id="S11.T13.2.1.7.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.2.1.7.2.1.1.1" style="font-size:90%;">-</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.2.1.7.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.7.3.1"> <span class="ltx_p" id="S11.T13.2.1.7.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.2.1.7.3.1.1.1" style="font-size:90%;">shelf</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_t" id="S11.T13.2.1.7.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.2.1.7.4.1"> <span class="ltx_p" id="S11.T13.2.1.7.4.1.1" style="width:234.2pt;"><span class="ltx_text" 
id="S11.T13.2.1.7.4.1.1.1" style="font-size:90%;">-</span></span> </span> </td> </tr> </table> </span></div> </div> <div class="ltx_flex_break"></div> <div class="ltx_flex_cell ltx_flex_size_1"> <div class="ltx_inline-block ltx_figure_panel ltx_transformed_outer" id="S11.T13.3" style="width:372.6pt;height:82.1pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(-58.8pt,13.0pt) scale(0.76,0.76) ;"> <table class="ltx_tabular ltx_align_middle" id="S11.T13.3.1"> <tr class="ltx_tr" id="S11.T13.3.1.1"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_tt" colspan="4" id="S11.T13.3.1.1.1"> <span class="ltx_text" id="S11.T13.3.1.1.1.1" style="font-size:90%;">Summary: The character rushed </span><span class="ltx_text ltx_font_bold" id="S11.T13.3.1.1.1.2" style="font-size:90%;">anxiously</span><span class="ltx_text" id="S11.T13.3.1.1.1.3" style="font-size:90%;"> through the living room.</span> </td> </tr> <tr class="ltx_tr" id="S11.T13.3.1.2"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.3.1.2.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.3.1.2.1.1"> <span class="ltx_p" id="S11.T13.3.1.2.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.3.1.2.1.1.1.1" style="font-size:90%;">skill</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.3.1.2.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.3.1.2.2.1"> <span class="ltx_p" id="S11.T13.3.1.2.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.3.1.2.2.1.1.1" style="font-size:90%;">style</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.3.1.2.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.3.1.2.3.1"> <span class="ltx_p" id="S11.T13.3.1.2.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.3.1.2.3.1.1.1" 
style="font-size:90%;">object</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.3.1.2.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.3.1.2.4.1"> <span class="ltx_p" id="S11.T13.3.1.2.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.3.1.2.4.1.1.1" style="font-size:90%;">captions</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.3.1.3"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.3.1.3.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.3.1.3.1.1"> <span class="ltx_p" id="S11.T13.3.1.3.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.3.1.3.1.1.1.1" style="font-size:90%;">loco</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.3.1.3.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.3.1.3.2.1"> <span class="ltx_p" id="S11.T13.3.1.3.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.3.1.3.2.1.1.1" style="font-size:90%;">anxious</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.3.1.3.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.3.1.3.3.1"> <span class="ltx_p" id="S11.T13.3.1.3.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.3.1.3.3.1.1.1" style="font-size:90%;">-</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.3.1.3.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.3.1.3.4.1"> <span class="ltx_p" id="S11.T13.3.1.3.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.3.1.3.4.1.1.1" style="font-size:90%;">rush anxiously forward</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.3.1.4"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.3.1.4.1"> <span class="ltx_inline-block ltx_align_top" 
id="S11.T13.3.1.4.1.1"> <span class="ltx_p" id="S11.T13.3.1.4.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.3.1.4.1.1.1.1" style="font-size:90%;">touch</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.3.1.4.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.3.1.4.2.1"> <span class="ltx_p" id="S11.T13.3.1.4.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.3.1.4.2.1.1.1" style="font-size:90%;">-</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.3.1.4.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.3.1.4.3.1"> <span class="ltx_p" id="S11.T13.3.1.4.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.3.1.4.3.1.1.1" style="font-size:90%;">shelf</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.3.1.4.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.3.1.4.4.1"> <span class="ltx_p" id="S11.T13.3.1.4.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.3.1.4.4.1.1.1" style="font-size:90%;">-</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.3.1.5"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.3.1.5.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.3.1.5.1.1"> <span class="ltx_p" id="S11.T13.3.1.5.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.3.1.5.1.1.1.1" style="font-size:90%;">idle</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.3.1.5.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.3.1.5.2.1"> <span class="ltx_p" id="S11.T13.3.1.5.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.3.1.5.2.1.1.1" style="font-size:90%;">anxious</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r 
ltx_border_t" id="S11.T13.3.1.5.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.3.1.5.3.1"> <span class="ltx_p" id="S11.T13.3.1.5.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.3.1.5.3.1.1.1" style="font-size:90%;">-</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.3.1.5.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.3.1.5.4.1"> <span class="ltx_p" id="S11.T13.3.1.5.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.3.1.5.4.1.1.1" style="font-size:90%;">pace around nervously</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.3.1.6"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.3.1.6.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.3.1.6.1.1"> <span class="ltx_p" id="S11.T13.3.1.6.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.3.1.6.1.1.1.1" style="font-size:90%;">loco</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.3.1.6.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.3.1.6.2.1"> <span class="ltx_p" id="S11.T13.3.1.6.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.3.1.6.2.1.1.1" style="font-size:90%;">hurried</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.3.1.6.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.3.1.6.3.1"> <span class="ltx_p" id="S11.T13.3.1.6.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.3.1.6.3.1.1.1" style="font-size:90%;">table</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_t" id="S11.T13.3.1.6.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.3.1.6.4.1"> <span class="ltx_p" id="S11.T13.3.1.6.4.1.1" style="width:234.2pt;"><span class="ltx_text" 
id="S11.T13.3.1.6.4.1.1.1" style="font-size:90%;">walk with large steps</span></span> </span> </td> </tr> </table> </span></div> </div> <div class="ltx_flex_break"></div> <div class="ltx_flex_cell ltx_flex_size_1"> <div class="ltx_inline-block ltx_figure_panel ltx_transformed_outer" id="S11.T13.4" style="width:372.6pt;height:95.8pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(-58.8pt,15.1pt) scale(0.76,0.76) ;"> <table class="ltx_tabular ltx_align_middle" id="S11.T13.4.1"> <tr class="ltx_tr" id="S11.T13.4.1.1"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_tt" colspan="4" id="S11.T13.4.1.1.1"> <span class="ltx_text" id="S11.T13.4.1.1.1.1" style="font-size:90%;">Summary: The character felt utterly </span><span class="ltx_text ltx_font_bold" id="S11.T13.4.1.1.1.2" style="font-size:90%;">tired</span><span class="ltx_text" id="S11.T13.4.1.1.1.3" style="font-size:90%;"> and slept in the bedroom.</span> </td> </tr> <tr class="ltx_tr" id="S11.T13.4.1.2"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.4.1.2.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.2.1.1"> <span class="ltx_p" id="S11.T13.4.1.2.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.4.1.2.1.1.1.1" style="font-size:90%;">skill</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.4.1.2.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.2.2.1"> <span class="ltx_p" id="S11.T13.4.1.2.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.4.1.2.2.1.1.1" style="font-size:90%;">style</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.4.1.2.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.2.3.1"> <span class="ltx_p" id="S11.T13.4.1.2.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.4.1.2.3.1.1.1" 
style="font-size:90%;">object</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.4.1.2.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.2.4.1"> <span class="ltx_p" id="S11.T13.4.1.2.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.4.1.2.4.1.1.1" style="font-size:90%;">captions</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.4.1.3"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.4.1.3.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.3.1.1"> <span class="ltx_p" id="S11.T13.4.1.3.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.4.1.3.1.1.1.1" style="font-size:90%;">idle</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.4.1.3.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.3.2.1"> <span class="ltx_p" id="S11.T13.4.1.3.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.4.1.3.2.1.1.1" style="font-size:90%;">tired</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.4.1.3.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.3.3.1"> <span class="ltx_p" id="S11.T13.4.1.3.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.4.1.3.3.1.1.1" style="font-size:90%;">-</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.4.1.3.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.3.4.1"> <span class="ltx_p" id="S11.T13.4.1.3.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.4.1.3.4.1.1.1" style="font-size:90%;">bent over with hands on knees</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.4.1.4"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.4.1.4.1"> <span class="ltx_inline-block ltx_align_top" 
id="S11.T13.4.1.4.1.1"> <span class="ltx_p" id="S11.T13.4.1.4.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.4.1.4.1.1.1.1" style="font-size:90%;">loco</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.4.1.4.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.4.2.1"> <span class="ltx_p" id="S11.T13.4.1.4.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.4.1.4.2.1.1.1" style="font-size:90%;">tired</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.4.1.4.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.4.3.1"> <span class="ltx_p" id="S11.T13.4.1.4.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.4.1.4.3.1.1.1" style="font-size:90%;">lamp</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.4.1.4.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.4.4.1"> <span class="ltx_p" id="S11.T13.4.1.4.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.4.1.4.4.1.1.1" style="font-size:90%;">head bowed and body bent while walking</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.4.1.5"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.4.1.5.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.5.1.1"> <span class="ltx_p" id="S11.T13.4.1.5.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.4.1.5.1.1.1.1" style="font-size:90%;">touch</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.4.1.5.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.5.2.1"> <span class="ltx_p" id="S11.T13.4.1.5.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.4.1.5.2.1.1.1" style="font-size:90%;">-</span></span> </span> </td> <td class="ltx_td 
ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.4.1.5.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.5.3.1"> <span class="ltx_p" id="S11.T13.4.1.5.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.4.1.5.3.1.1.1" style="font-size:90%;">lamp</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.4.1.5.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.5.4.1"> <span class="ltx_p" id="S11.T13.4.1.5.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.4.1.5.4.1.1.1" style="font-size:90%;">-</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.4.1.6"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.4.1.6.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.6.1.1"> <span class="ltx_p" id="S11.T13.4.1.6.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.4.1.6.1.1.1.1" style="font-size:90%;">loco</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.4.1.6.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.6.2.1"> <span class="ltx_p" id="S11.T13.4.1.6.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.4.1.6.2.1.1.1" style="font-size:90%;">neutral</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.4.1.6.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.6.3.1"> <span class="ltx_p" id="S11.T13.4.1.6.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.4.1.6.3.1.1.1" style="font-size:90%;">-</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.4.1.6.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.6.4.1"> <span class="ltx_p" id="S11.T13.4.1.6.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.4.1.6.4.1.1.1" 
style="font-size:90%;">moving backward while walking</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.4.1.7"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.4.1.7.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.7.1.1"> <span class="ltx_p" id="S11.T13.4.1.7.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.4.1.7.1.1.1.1" style="font-size:90%;">lie</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.4.1.7.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.7.2.1"> <span class="ltx_p" id="S11.T13.4.1.7.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.4.1.7.2.1.1.1" style="font-size:90%;">tired</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.4.1.7.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.7.3.1"> <span class="ltx_p" id="S11.T13.4.1.7.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.4.1.7.3.1.1.1" style="font-size:90%;">bed</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_t" id="S11.T13.4.1.7.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.4.1.7.4.1"> <span class="ltx_p" id="S11.T13.4.1.7.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.4.1.7.4.1.1.1" style="font-size:90%;">lying down, legs straight</span></span> </span> </td> </tr> </table> </span></div> </div> <div class="ltx_flex_break"></div> <div class="ltx_flex_cell ltx_flex_size_1"> <div class="ltx_inline-block ltx_figure_panel ltx_transformed_outer" id="S11.T13.5" style="width:372.6pt;height:82.1pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(-58.8pt,13.0pt) scale(0.76,0.76) ;"> <table class="ltx_tabular ltx_align_middle" id="S11.T13.5.1"> <tr class="ltx_tr" 
id="S11.T13.5.1.1"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_tt" colspan="4" id="S11.T13.5.1.1.1"> <span class="ltx_text" id="S11.T13.5.1.1.1.1" style="font-size:90%;">Summary: The character </span><span class="ltx_text ltx_font_bold" id="S11.T13.5.1.1.1.2" style="font-size:90%;">happily</span><span class="ltx_text" id="S11.T13.5.1.1.1.3" style="font-size:90%;"> played and relaxed around the bedroom.</span> </td> </tr> <tr class="ltx_tr" id="S11.T13.5.1.2"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.5.1.2.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.5.1.2.1.1"> <span class="ltx_p" id="S11.T13.5.1.2.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.5.1.2.1.1.1.1" style="font-size:90%;">skill</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.5.1.2.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.5.1.2.2.1"> <span class="ltx_p" id="S11.T13.5.1.2.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.5.1.2.2.1.1.1" style="font-size:90%;">style</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.5.1.2.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.5.1.2.3.1"> <span class="ltx_p" id="S11.T13.5.1.2.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.5.1.2.3.1.1.1" style="font-size:90%;">object</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.5.1.2.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.5.1.2.4.1"> <span class="ltx_p" id="S11.T13.5.1.2.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.5.1.2.4.1.1.1" style="font-size:90%;">captions</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.5.1.3"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.5.1.3.1"> <span 
class="ltx_inline-block ltx_align_top" id="S11.T13.5.1.3.1.1"> <span class="ltx_p" id="S11.T13.5.1.3.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.5.1.3.1.1.1.1" style="font-size:90%;">loco</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.5.1.3.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.5.1.3.2.1"> <span class="ltx_p" id="S11.T13.5.1.3.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.5.1.3.2.1.1.1" style="font-size:90%;">happy</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.5.1.3.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.5.1.3.3.1"> <span class="ltx_p" id="S11.T13.5.1.3.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.5.1.3.3.1.1.1" style="font-size:90%;">wardrobe</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.5.1.3.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.5.1.3.4.1"> <span class="ltx_p" id="S11.T13.5.1.3.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.5.1.3.4.1.1.1" style="font-size:90%;">excited walk</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.5.1.4"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.5.1.4.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.5.1.4.1.1"> <span class="ltx_p" id="S11.T13.5.1.4.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.5.1.4.1.1.1.1" style="font-size:90%;">carry</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.5.1.4.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.5.1.4.2.1"> <span class="ltx_p" id="S11.T13.5.1.4.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.5.1.4.2.1.1.1" style="font-size:90%;">happy</span></span> </span> </td> <td 
class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.5.1.4.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.5.1.4.3.1"> <span class="ltx_p" id="S11.T13.5.1.4.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.5.1.4.3.1.1.1" style="font-size:90%;">toy</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.5.1.4.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.5.1.4.4.1"> <span class="ltx_p" id="S11.T13.5.1.4.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.5.1.4.4.1.1.1" style="font-size:90%;">carry object happily</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.5.1.5"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.5.1.5.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.5.1.5.1.1"> <span class="ltx_p" id="S11.T13.5.1.5.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.5.1.5.1.1.1.1" style="font-size:90%;">loco</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.5.1.5.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.5.1.5.2.1"> <span class="ltx_p" id="S11.T13.5.1.5.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.5.1.5.2.1.1.1" style="font-size:90%;">happy</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.5.1.5.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.5.1.5.3.1"> <span class="ltx_p" id="S11.T13.5.1.5.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.5.1.5.3.1.1.1" style="font-size:90%;">sofa</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.5.1.5.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.5.1.5.4.1"> <span class="ltx_p" id="S11.T13.5.1.5.4.1.1" style="width:234.2pt;"><span class="ltx_text" 
id="S11.T13.5.1.5.4.1.1.1" style="font-size:90%;">excited walk</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.5.1.6"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.5.1.6.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.5.1.6.1.1"> <span class="ltx_p" id="S11.T13.5.1.6.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.5.1.6.1.1.1.1" style="font-size:90%;">sitdown</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.5.1.6.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.5.1.6.2.1"> <span class="ltx_p" id="S11.T13.5.1.6.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.5.1.6.2.1.1.1" style="font-size:90%;">relaxed</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.5.1.6.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.5.1.6.3.1"> <span class="ltx_p" id="S11.T13.5.1.6.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.5.1.6.3.1.1.1" style="font-size:90%;">sofa</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_t" id="S11.T13.5.1.6.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.5.1.6.4.1"> <span class="ltx_p" id="S11.T13.5.1.6.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.5.1.6.4.1.1.1" style="font-size:90%;">hands support body, cross-legged</span></span> </span> </td> </tr> </table> </span></div> </div> <div class="ltx_flex_break"></div> <div class="ltx_flex_cell ltx_flex_size_1"> <div class="ltx_inline-block ltx_figure_panel ltx_transformed_outer" id="S11.T13.6" style="width:372.6pt;height:82.1pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(-58.8pt,13.0pt) scale(0.76,0.76) ;"> <table class="ltx_tabular ltx_align_middle" id="S11.T13.6.1"> 
<tr class="ltx_tr" id="S11.T13.6.1.1"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_tt" colspan="4" id="S11.T13.6.1.1.1"> <span class="ltx_text" id="S11.T13.6.1.1.1.1" style="font-size:90%;">Summary: The character is </span><span class="ltx_text ltx_font_bold" id="S11.T13.6.1.1.1.2" style="font-size:90%;">angry</span><span class="ltx_text" id="S11.T13.6.1.1.1.3" style="font-size:90%;"> and knocks on the table, then sits.</span> </td> </tr> <tr class="ltx_tr" id="S11.T13.6.1.2"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.6.1.2.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.6.1.2.1.1"> <span class="ltx_p" id="S11.T13.6.1.2.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.6.1.2.1.1.1.1" style="font-size:90%;">skill</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.6.1.2.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.6.1.2.2.1"> <span class="ltx_p" id="S11.T13.6.1.2.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.6.1.2.2.1.1.1" style="font-size:90%;">style</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.6.1.2.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.6.1.2.3.1"> <span class="ltx_p" id="S11.T13.6.1.2.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.6.1.2.3.1.1.1" style="font-size:90%;">object</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.6.1.2.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.6.1.2.4.1"> <span class="ltx_p" id="S11.T13.6.1.2.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.6.1.2.4.1.1.1" style="font-size:90%;">captions</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.6.1.3"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" 
id="S11.T13.6.1.3.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.6.1.3.1.1"> <span class="ltx_p" id="S11.T13.6.1.3.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.6.1.3.1.1.1.1" style="font-size:90%;">loco</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.6.1.3.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.6.1.3.2.1"> <span class="ltx_p" id="S11.T13.6.1.3.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.6.1.3.2.1.1.1" style="font-size:90%;">angry</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.6.1.3.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.6.1.3.3.1"> <span class="ltx_p" id="S11.T13.6.1.3.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.6.1.3.3.1.1.1" style="font-size:90%;">-</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.6.1.3.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.6.1.3.4.1"> <span class="ltx_p" id="S11.T13.6.1.3.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.6.1.3.4.1.1.1" style="font-size:90%;">angrily walking</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.6.1.4"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.6.1.4.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.6.1.4.1.1"> <span class="ltx_p" id="S11.T13.6.1.4.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.6.1.4.1.1.1.1" style="font-size:90%;">idle</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.6.1.4.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.6.1.4.2.1"> <span class="ltx_p" id="S11.T13.6.1.4.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.6.1.4.2.1.1.1" style="font-size:90%;">angry</span></span> 
</span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.6.1.4.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.6.1.4.3.1"> <span class="ltx_p" id="S11.T13.6.1.4.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.6.1.4.3.1.1.1" style="font-size:90%;">-</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.6.1.4.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.6.1.4.4.1"> <span class="ltx_p" id="S11.T13.6.1.4.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.6.1.4.4.1.1.1" style="font-size:90%;">stomp angrily against the ground</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.6.1.5"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.6.1.5.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.6.1.5.1.1"> <span class="ltx_p" id="S11.T13.6.1.5.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.6.1.5.1.1.1.1" style="font-size:90%;">touch</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.6.1.5.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.6.1.5.2.1"> <span class="ltx_p" id="S11.T13.6.1.5.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.6.1.5.2.1.1.1" style="font-size:90%;">-</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.6.1.5.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.6.1.5.3.1"> <span class="ltx_p" id="S11.T13.6.1.5.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.6.1.5.3.1.1.1" style="font-size:90%;">table</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.6.1.5.4"></td> </tr> <tr class="ltx_tr" id="S11.T13.6.1.6"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r 
ltx_border_t" id="S11.T13.6.1.6.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.6.1.6.1.1"> <span class="ltx_p" id="S11.T13.6.1.6.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.6.1.6.1.1.1.1" style="font-size:90%;">sit</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.6.1.6.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.6.1.6.2.1"> <span class="ltx_p" id="S11.T13.6.1.6.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.6.1.6.2.1.1.1" style="font-size:90%;">angry</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.6.1.6.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.6.1.6.3.1"> <span class="ltx_p" id="S11.T13.6.1.6.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.6.1.6.3.1.1.1" style="font-size:90%;">armchair</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_t" id="S11.T13.6.1.6.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.6.1.6.4.1"> <span class="ltx_p" id="S11.T13.6.1.6.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.6.1.6.4.1.1.1" style="font-size:90%;">crossing arms</span></span> </span> </td> </tr> </table> </span></div> </div> <div class="ltx_flex_break"></div> <div class="ltx_flex_cell ltx_flex_size_1"> <div class="ltx_inline-block ltx_figure_panel ltx_transformed_outer" id="S11.T13.7" style="width:372.6pt;height:109.4pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(-58.8pt,17.3pt) scale(0.76,0.76) ;"> <table class="ltx_tabular ltx_align_middle" id="S11.T13.7.1"> <tr class="ltx_tr" id="S11.T13.7.1.1"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_tt" colspan="4" id="S11.T13.7.1.1.1"> <span class="ltx_text" id="S11.T13.7.1.1.1.1" style="font-size:90%;">Summary: The character gets 
</span><span class="ltx_text ltx_font_bold" id="S11.T13.7.1.1.1.2" style="font-size:90%;">drunk</span><span class="ltx_text" id="S11.T13.7.1.1.1.3" style="font-size:90%;"> and stumbles around the living room.</span> </td> </tr> <tr class="ltx_tr" id="S11.T13.7.1.2"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.7.1.2.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.2.1.1"> <span class="ltx_p" id="S11.T13.7.1.2.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.7.1.2.1.1.1.1" style="font-size:90%;">skill</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.7.1.2.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.2.2.1"> <span class="ltx_p" id="S11.T13.7.1.2.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.7.1.2.2.1.1.1" style="font-size:90%;">style</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.7.1.2.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.2.3.1"> <span class="ltx_p" id="S11.T13.7.1.2.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.7.1.2.3.1.1.1" style="font-size:90%;">object</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.7.1.2.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.2.4.1"> <span class="ltx_p" id="S11.T13.7.1.2.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.7.1.2.4.1.1.1" style="font-size:90%;">captions</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.7.1.3"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.7.1.3.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.3.1.1"> <span class="ltx_p" id="S11.T13.7.1.3.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.7.1.3.1.1.1.1" style="font-size:90%;">idle</span></span> 
</span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.7.1.3.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.3.2.1"> <span class="ltx_p" id="S11.T13.7.1.3.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.7.1.3.2.1.1.1" style="font-size:90%;">drunk</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.7.1.3.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.3.3.1"> <span class="ltx_p" id="S11.T13.7.1.3.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.7.1.3.3.1.1.1" style="font-size:90%;">-</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.7.1.3.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.3.4.1"> <span class="ltx_p" id="S11.T13.7.1.3.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.7.1.3.4.1.1.1" style="font-size:90%;">stand drunkenly</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.7.1.4"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.7.1.4.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.4.1.1"> <span class="ltx_p" id="S11.T13.7.1.4.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.7.1.4.1.1.1.1" style="font-size:90%;">loco</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.7.1.4.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.4.2.1"> <span class="ltx_p" id="S11.T13.7.1.4.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.7.1.4.2.1.1.1" style="font-size:90%;">drunk</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.7.1.4.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.4.3.1"> <span class="ltx_p" id="S11.T13.7.1.4.3.1.1" 
style="width:69.4pt;"><span class="ltx_text" id="S11.T13.7.1.4.3.1.1.1" style="font-size:90%;">sofa</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.7.1.4.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.4.4.1"> <span class="ltx_p" id="S11.T13.7.1.4.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.7.1.4.4.1.1.1" style="font-size:90%;">walking drunkenly</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.7.1.5"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.7.1.5.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.5.1.1"> <span class="ltx_p" id="S11.T13.7.1.5.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.7.1.5.1.1.1.1" style="font-size:90%;">sit</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.7.1.5.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.5.2.1"> <span class="ltx_p" id="S11.T13.7.1.5.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.7.1.5.2.1.1.1" style="font-size:90%;">drunk</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.7.1.5.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.5.3.1"> <span class="ltx_p" id="S11.T13.7.1.5.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.7.1.5.3.1.1.1" style="font-size:90%;">sofa</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.7.1.5.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.5.4.1"> <span class="ltx_p" id="S11.T13.7.1.5.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.7.1.5.4.1.1.1" style="font-size:90%;">right leg held, left leg stretched out</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.7.1.6"> <td class="ltx_td ltx_align_justify ltx_align_top 
ltx_border_r ltx_border_t" id="S11.T13.7.1.6.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.6.1.1"> <span class="ltx_p" id="S11.T13.7.1.6.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.7.1.6.1.1.1.1" style="font-size:90%;">touch</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.7.1.6.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.6.2.1"> <span class="ltx_p" id="S11.T13.7.1.6.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.7.1.6.2.1.1.1" style="font-size:90%;">sofa</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.7.1.6.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.6.3.1"> <span class="ltx_p" id="S11.T13.7.1.6.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.7.1.6.3.1.1.1" style="font-size:90%;">-</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.7.1.6.4"></td> </tr> <tr class="ltx_tr" id="S11.T13.7.1.7"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.7.1.7.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.7.1.1"> <span class="ltx_p" id="S11.T13.7.1.7.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.7.1.7.1.1.1.1" style="font-size:90%;">loco</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.7.1.7.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.7.2.1"> <span class="ltx_p" id="S11.T13.7.1.7.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.7.1.7.2.1.1.1" style="font-size:90%;">drunk</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.7.1.7.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.7.3.1"> <span class="ltx_p" 
id="S11.T13.7.1.7.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.7.1.7.3.1.1.1" style="font-size:90%;">sofa</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.7.1.7.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.7.4.1"> <span class="ltx_p" id="S11.T13.7.1.7.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.7.1.7.4.1.1.1" style="font-size:90%;">walking drunkenly</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.7.1.8"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.7.1.8.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.8.1.1"> <span class="ltx_p" id="S11.T13.7.1.8.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.7.1.8.1.1.1.1" style="font-size:90%;">lie</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.7.1.8.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.8.2.1"> <span class="ltx_p" id="S11.T13.7.1.8.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.7.1.8.2.1.1.1" style="font-size:90%;">tired</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.7.1.8.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.8.3.1"> <span class="ltx_p" id="S11.T13.7.1.8.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.7.1.8.3.1.1.1" style="font-size:90%;">sofa</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_t" id="S11.T13.7.1.8.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.7.1.8.4.1"> <span class="ltx_p" id="S11.T13.7.1.8.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.7.1.8.4.1.1.1" style="font-size:90%;">lying down, legs straight</span></span> </span> </td> </tr> </table> </span></div> 
</div> <div class="ltx_flex_break"></div> <div class="ltx_flex_cell ltx_flex_size_1"> <div class="ltx_inline-block ltx_figure_panel ltx_transformed_outer" id="S11.T13.8" style="width:372.6pt;height:82.1pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(-58.8pt,13.0pt) scale(0.76,0.76) ;"> <table class="ltx_tabular ltx_align_middle" id="S11.T13.8.1"> <tr class="ltx_tr" id="S11.T13.8.1.1"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_tt" colspan="4" id="S11.T13.8.1.1.1"> <span class="ltx_text" id="S11.T13.8.1.1.1.1" style="font-size:90%;">Summary: The character feels </span><span class="ltx_text ltx_font_bold" id="S11.T13.8.1.1.1.2" style="font-size:90%;">stressed</span><span class="ltx_text" id="S11.T13.8.1.1.1.3" style="font-size:90%;"> and seeks comfort in the living room.</span> </td> </tr> <tr class="ltx_tr" id="S11.T13.8.1.2"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.8.1.2.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.8.1.2.1.1"> <span class="ltx_p" id="S11.T13.8.1.2.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.8.1.2.1.1.1.1" style="font-size:90%;">skill</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.8.1.2.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.8.1.2.2.1"> <span class="ltx_p" id="S11.T13.8.1.2.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.8.1.2.2.1.1.1" style="font-size:90%;">style</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.8.1.2.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.8.1.2.3.1"> <span class="ltx_p" id="S11.T13.8.1.2.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.8.1.2.3.1.1.1" style="font-size:90%;">object</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" 
id="S11.T13.8.1.2.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.8.1.2.4.1"> <span class="ltx_p" id="S11.T13.8.1.2.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.8.1.2.4.1.1.1" style="font-size:90%;">captions</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.8.1.3"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.8.1.3.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.8.1.3.1.1"> <span class="ltx_p" id="S11.T13.8.1.3.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.8.1.3.1.1.1.1" style="font-size:90%;">sit</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.8.1.3.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.8.1.3.2.1"> <span class="ltx_p" id="S11.T13.8.1.3.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.8.1.3.2.1.1.1" style="font-size:90%;">stressed</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.8.1.3.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.8.1.3.3.1"> <span class="ltx_p" id="S11.T13.8.1.3.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.8.1.3.3.1.1.1" style="font-size:90%;">armchair</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.8.1.3.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.8.1.3.4.1"> <span class="ltx_p" id="S11.T13.8.1.3.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.8.1.3.4.1.1.1" style="font-size:90%;">sitting with head bowed, hands resting on thighs</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.8.1.4"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.8.1.4.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.8.1.4.1.1"> <span class="ltx_p" id="S11.T13.8.1.4.1.1.1" 
style="width:69.4pt;"><span class="ltx_text" id="S11.T13.8.1.4.1.1.1.1" style="font-size:90%;">touch</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.8.1.4.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.8.1.4.2.1"> <span class="ltx_p" id="S11.T13.8.1.4.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.8.1.4.2.1.1.1" style="font-size:90%;">armchair</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.8.1.4.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.8.1.4.3.1"> <span class="ltx_p" id="S11.T13.8.1.4.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.8.1.4.3.1.1.1" style="font-size:90%;">-</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.8.1.4.4"></td> </tr> <tr class="ltx_tr" id="S11.T13.8.1.5"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.8.1.5.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.8.1.5.1.1"> <span class="ltx_p" id="S11.T13.8.1.5.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.8.1.5.1.1.1.1" style="font-size:90%;">loco</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.8.1.5.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.8.1.5.2.1"> <span class="ltx_p" id="S11.T13.8.1.5.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.8.1.5.2.1.1.1" style="font-size:90%;">stressed</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.8.1.5.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.8.1.5.3.1"> <span class="ltx_p" id="S11.T13.8.1.5.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.8.1.5.3.1.1.1" style="font-size:90%;">sofa</span></span> </span> </td> <td class="ltx_td 
ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.8.1.5.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.8.1.5.4.1"> <span class="ltx_p" id="S11.T13.8.1.5.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.8.1.5.4.1.1.1" style="font-size:90%;">walking slowly, hands behind back</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.8.1.6"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.8.1.6.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.8.1.6.1.1"> <span class="ltx_p" id="S11.T13.8.1.6.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.8.1.6.1.1.1.1" style="font-size:90%;">lie</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.8.1.6.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.8.1.6.2.1"> <span class="ltx_p" id="S11.T13.8.1.6.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.8.1.6.2.1.1.1" style="font-size:90%;">stressed</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.8.1.6.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.8.1.6.3.1"> <span class="ltx_p" id="S11.T13.8.1.6.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.8.1.6.3.1.1.1" style="font-size:90%;">sofa</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_t" id="S11.T13.8.1.6.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.8.1.6.4.1"> <span class="ltx_p" id="S11.T13.8.1.6.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.8.1.6.4.1.1.1" style="font-size:90%;">side-lie on left with left arm as pillow, legs bent</span></span> </span> </td> </tr> </table> </span></div> </div> <div class="ltx_flex_break"></div> <div class="ltx_flex_cell ltx_flex_size_1"> <div class="ltx_inline-block 
ltx_figure_panel ltx_transformed_outer" id="S11.T13.9" style="width:372.6pt;height:82.1pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(-58.8pt,13.0pt) scale(0.76,0.76) ;"> <table class="ltx_tabular ltx_align_middle" id="S11.T13.9.1"> <tr class="ltx_tr" id="S11.T13.9.1.1"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_tt" colspan="4" id="S11.T13.9.1.1.1"><span class="ltx_text" id="S11.T13.9.1.1.1.1" style="font-size:90%;">Summary: The character discovered an old vase on the shelf, settled on the sofa.</span></td> </tr> <tr class="ltx_tr" id="S11.T13.9.1.2"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.9.1.2.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.9.1.2.1.1"> <span class="ltx_p" id="S11.T13.9.1.2.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.9.1.2.1.1.1.1" style="font-size:90%;">skill</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.9.1.2.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.9.1.2.2.1"> <span class="ltx_p" id="S11.T13.9.1.2.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.9.1.2.2.1.1.1" style="font-size:90%;">style</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.9.1.2.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.9.1.2.3.1"> <span class="ltx_p" id="S11.T13.9.1.2.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.9.1.2.3.1.1.1" style="font-size:90%;">object</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.9.1.2.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.9.1.2.4.1"> <span class="ltx_p" id="S11.T13.9.1.2.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.9.1.2.4.1.1.1" style="font-size:90%;">captions</span></span> </span> </td> </tr> <tr 
class="ltx_tr" id="S11.T13.9.1.3"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.9.1.3.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.9.1.3.1.1"> <span class="ltx_p" id="S11.T13.9.1.3.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.9.1.3.1.1.1.1" style="font-size:90%;">loco</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.9.1.3.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.9.1.3.2.1"> <span class="ltx_p" id="S11.T13.9.1.3.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.9.1.3.2.1.1.1" style="font-size:90%;">neutral</span></span> </span> </td> <td class="ltx_td ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.9.1.3.3"></td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.9.1.3.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.9.1.3.4.1"> <span class="ltx_p" id="S11.T13.9.1.3.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.9.1.3.4.1.1.1" style="font-size:90%;">side-stepping</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.9.1.4"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.9.1.4.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.9.1.4.1.1"> <span class="ltx_p" id="S11.T13.9.1.4.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.9.1.4.1.1.1.1" style="font-size:90%;">touch</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.9.1.4.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.9.1.4.2.1"> <span class="ltx_p" id="S11.T13.9.1.4.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.9.1.4.2.1.1.1" style="font-size:90%;">neutral</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.9.1.4.3"> <span 
class="ltx_inline-block ltx_align_top" id="S11.T13.9.1.4.3.1"> <span class="ltx_p" id="S11.T13.9.1.4.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.9.1.4.3.1.1.1" style="font-size:90%;">shelf</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.9.1.4.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.9.1.4.4.1"> <span class="ltx_p" id="S11.T13.9.1.4.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.9.1.4.4.1.1.1" style="font-size:90%;">-</span></span> </span> </td> </tr> <tr class="ltx_tr" id="S11.T13.9.1.5"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.9.1.5.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.9.1.5.1.1"> <span class="ltx_p" id="S11.T13.9.1.5.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.9.1.5.1.1.1.1" style="font-size:90%;">carry</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.9.1.5.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.9.1.5.2.1"> <span class="ltx_p" id="S11.T13.9.1.5.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.9.1.5.2.1.1.1" style="font-size:90%;">neutral</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_r ltx_border_t" id="S11.T13.9.1.5.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.9.1.5.3.1"> <span class="ltx_p" id="S11.T13.9.1.5.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.9.1.5.3.1.1.1" style="font-size:90%;">vase</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_t" id="S11.T13.9.1.5.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.9.1.5.4.1"> <span class="ltx_p" id="S11.T13.9.1.5.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.9.1.5.4.1.1.1" style="font-size:90%;">carry object calmly</span></span> </span> </td> </tr> <tr 
class="ltx_tr" id="S11.T13.9.1.6"> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.9.1.6.1"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.9.1.6.1.1"> <span class="ltx_p" id="S11.T13.9.1.6.1.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.9.1.6.1.1.1.1" style="font-size:90%;">liedown</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.9.1.6.2"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.9.1.6.2.1"> <span class="ltx_p" id="S11.T13.9.1.6.2.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.9.1.6.2.1.1.1" style="font-size:90%;">neutral</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_r ltx_border_t" id="S11.T13.9.1.6.3"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.9.1.6.3.1"> <span class="ltx_p" id="S11.T13.9.1.6.3.1.1" style="width:69.4pt;"><span class="ltx_text" id="S11.T13.9.1.6.3.1.1.1" style="font-size:90%;">sofa</span></span> </span> </td> <td class="ltx_td ltx_align_justify ltx_align_top ltx_border_bb ltx_border_t" id="S11.T13.9.1.6.4"> <span class="ltx_inline-block ltx_align_top" id="S11.T13.9.1.6.4.1"> <span class="ltx_p" id="S11.T13.9.1.6.4.1.1" style="width:234.2pt;"><span class="ltx_text" id="S11.T13.9.1.6.4.1.1.1" style="font-size:90%;">legs bend</span></span> </span> </td> </tr> </table> </span></div> </div> </div> <figcaption class="ltx_caption" style="font-size:90%;"><span class="ltx_tag ltx_tag_table">Table 13: </span>Examples in the Short Script Database.</figcaption> </figure> <div class="ltx_para" id="S11.p1"> <p class="ltx_p" id="S11.p1.1"><span class="ltx_text" id="S11.p1.1.1" style="font-size:144%;">We show some vivid examples in </span><a class="ltx_ref" href="https://arxiv.org/html/2411.19921v2#S11.T13" style="font-size:144%;" title="In 11 Short Script Examples ‣ SIMS: Simulating Stylized 
Human-Scene Interactions with Retrieval-Augmented Script Generation"><span class="ltx_text ltx_ref_tag">Tab.</span> <span class="ltx_text ltx_ref_tag">13</span></a><span class="ltx_text" id="S11.p1.1.2" style="font-size:144%;"> for all the emotions/styles we use. Each row gives the skill, style label, object type, and captions, which are essential for FSM control.</span></p> </div> <div class="ltx_pagination ltx_role_newpage"></div> </section> </article> </div> </div> </body> </html>
