Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies

<!DOCTYPE html> <html lang="en"> <head> <meta content="text/html; charset=utf-8" http-equiv="content-type"/> <title>Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies</title> <!--Generated on Thu Mar 20 01:09:40 2025 by LaTeXML (version 0.8.8) http://dlmf.nist.gov/LaTeXML/.--> <meta content="width=device-width, initial-scale=1, shrink-to-fit=no" name="viewport"/> <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/ar5iv.0.7.9.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/ar5iv-fonts.0.7.9.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/latexml_styles.css" rel="stylesheet" type="text/css"/> <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/js/bootstrap.bundle.min.js"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/html2canvas/1.3.3/html2canvas.min.js"></script> <script src="/static/browse/0.3.4/js/addons_new.js"></script> <script src="/static/browse/0.3.4/js/feedbackOverlay.js"></script> <meta content=" Multi-Agent Systems, Autonomous Vehicles, Reinforcement Learning, Digital Twins, Real2Sim, Sim2Real" lang="en" name="keywords"/> <base href="/html/2403.10996v5/"/></head> <body> <nav class="ltx_page_navbar"> <nav class="ltx_TOC"> <ol class="ltx_toclist"> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S1" title="In Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">I </span><span class="ltx_text ltx_font_smallcaps">Introduction</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" 
href="https://arxiv.org/html/2403.10996v5#S2" title="In Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">II </span><span class="ltx_text ltx_font_smallcaps">Case Studies</span></span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"> <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S2.SS1" title="In II Case Studies ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">II-A</span> </span><span class="ltx_text ltx_font_italic">Cooperative Multi-Agent Scenario</span></span></a> <ol class="ltx_toclist ltx_toclist_subsection"> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S2.SS1.SSS1" title="In II-A Cooperative Multi-Agent Scenario ‣ II Case Studies ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">II-A</span>1 </span>Observation Space</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S2.SS1.SSS2" title="In II-A Cooperative Multi-Agent Scenario ‣ II Case Studies ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">II-A</span>2 </span>Action Space</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a 
class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S2.SS1.SSS3" title="In II-A Cooperative Multi-Agent Scenario ‣ II Case Studies ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">II-A</span>3 </span>Reward Function</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S2.SS1.SSS4" title="In II-A Cooperative Multi-Agent Scenario ‣ II Case Studies ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">II-A</span>4 </span>Optimization Problem</span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_subsection"> <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S2.SS2" title="In II Case Studies ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">II-B</span> </span><span class="ltx_text ltx_font_italic">Competitive Multi-Agent Scenario</span></span></a> <ol class="ltx_toclist ltx_toclist_subsection"> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S2.SS2.SSS1" title="In II-B Competitive Multi-Agent Scenario ‣ II Case Studies ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">II-B</span>1 </span>Observation Space</span></a></li> <li 
class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S2.SS2.SSS2" title="In II-B Competitive Multi-Agent Scenario ‣ II Case Studies ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">II-B</span>2 </span>Action Space</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S2.SS2.SSS3" title="In II-B Competitive Multi-Agent Scenario ‣ II Case Studies ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">II-B</span>3 </span>Reward Function</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S2.SS2.SSS4" title="In II-B Competitive Multi-Agent Scenario ‣ II Case Studies ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">II-B</span>4 </span>Optimization Problem</span></a></li> </ol> </li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S3" title="In Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">III </span><span class="ltx_text ltx_font_smallcaps">Methodology</span></span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry 
ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S3.SS1" title="In III Methodology ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">III-A</span> </span><span class="ltx_text ltx_font_italic">Simulation Parallelization</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S3.SS2" title="In III Methodology ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">III-B</span> </span><span class="ltx_text ltx_font_italic">Learning Architecture</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S3.SS3" title="In III Methodology ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">III-C</span> </span><span class="ltx_text ltx_font_italic">Domain Randomization</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S3.SS4" title="In III Methodology ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">III-D</span> </span><span class="ltx_text ltx_font_italic">Hybrid Sim2Real Transfer</span></span></a></li> </ol> </li> <li class="ltx_tocentry 
ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S4" title="In Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">IV </span><span class="ltx_text ltx_font_smallcaps">Results</span></span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"> <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S4.SS1" title="In IV Results ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">IV-A</span> </span><span class="ltx_text ltx_font_italic">Cooperative Multi-Agent Scenario</span></span></a> <ol class="ltx_toclist ltx_toclist_subsection"> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S4.SS1.SSS1" title="In IV-A Cooperative Multi-Agent Scenario ‣ IV Results ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">IV-A</span>1 </span>Training and Simulation Parallelization</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S4.SS1.SSS2" title="In IV-A Cooperative Multi-Agent Scenario ‣ IV Results ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">IV-A</span>2 </span>Deployment and Sim2Real Transfer</span></a></li> 
</ol> </li> <li class="ltx_tocentry ltx_tocentry_subsection"> <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S4.SS2" title="In IV Results ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">IV-B</span> </span><span class="ltx_text ltx_font_italic">Competitive Multi-Agent Scenario</span></span></a> <ol class="ltx_toclist ltx_toclist_subsection"> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S4.SS2.SSS1" title="In IV-B Competitive Multi-Agent Scenario ‣ IV Results ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">IV-B</span>1 </span>Training and Simulation Parallelization</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S4.SS2.SSS2" title="In IV-B Competitive Multi-Agent Scenario ‣ IV Results ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">IV-B</span>2 </span>Deployment and Sim2Real Transfer</span></a></li> </ol> </li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S5" title="In Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">V </span><span class="ltx_text 
ltx_font_smallcaps">Conclusion</span></span></a></li> </ol></nav> </nav> <div class="ltx_page_main"> <div class="ltx_page_content"> <article class="ltx_document ltx_authors_1line"> <h1 class="ltx_title ltx_title_document"> Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies </h1> <div class="ltx_authors"> <span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Chinmay V. Samak<sup class="ltx_sup" id="id4.4.id1"><span class="ltx_text ltx_font_italic" id="id4.4.id1.1">∗</span></sup> (ORCID: 0000-0002-6455-6716), Tanmay V. Samak<sup class="ltx_sup" id="id6.6.id3"><span class="ltx_text ltx_font_italic" id="id6.6.id3.1">∗</span></sup> (ORCID: 0000-0002-9717-0764), and Venkat N. Krovi (ORCID: 0000-0003-2539-896X) </span><span class="ltx_author_notes"><sup class="ltx_sup" id="id9.9.id1"><span class="ltx_text ltx_font_italic" id="id9.9.id1.1">∗</span></sup>These authors contributed equally. This work was supported in part by the U.S. National Science Foundation under NSF IIS-1925500 and NSF CNS-1939058. C. V. Samak, T. V. Samak, and V. N. Krovi are with the Department of Automotive Engineering, Clemson University International Center for Automotive Research (CU-ICAR), Greenville, SC 29607, USA. 
Email: <span class="ltx_text ltx_font_typewriter" id="id10.10.id1" style="font-size:90%;">{<a class="ltx_ref ltx_href" href="mailto:csamak@clemson.edu" title="">csamak</a>, <a class="ltx_ref ltx_href" href="mailto:tsamak@clemson.edu" title="">tsamak</a>, <a class="ltx_ref ltx_href" href="mailto:vkrovi@clemson.edu" title="">vkrovi</a>}@clemson.edu</span></span></span> </div> <div class="ltx_abstract"> <h6 class="ltx_title ltx_title_abstract">Abstract</h6> <p class="ltx_p" id="id11.id1">Multi-agent reinforcement learning (MARL) for cyber-physical vehicle systems usually requires significantly long training times due to the inherent complexity of these systems. Furthermore, deploying the trained policies in the real world demands a feature-rich environment along with multiple physically embodied agents, which may not be feasible due to monetary, physical, energy, or safety constraints. This work seeks to address these pain points by presenting a mixed-reality digital twin framework capable of: (i) selectively scaling parallelized workloads on demand, and (ii) evaluating the trained policies across simulation-to-reality (sim2real) experiments. The viability and performance of the proposed framework are highlighted through two representative use cases, which cover cooperative as well as competitive classes of MARL problems. We study the effect of: (i) agent and environment parallelization on training time, and (ii) systematic domain randomization on zero-shot sim2real transfer across both case studies. Results indicate up to a 76.3% reduction in training time with the proposed parallelization scheme and a sim2real gap as low as 2.9% using the proposed deployment method. 
<br class="ltx_break"/></p> </div> <div class="ltx_keywords"> <h6 class="ltx_title ltx_title_keywords">Index Terms: </h6> Multi-Agent Systems, Autonomous Vehicles, Reinforcement Learning, Digital Twins, Real2Sim, Sim2Real <br class="ltx_break"/> </div> <section class="ltx_section" id="S1"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">I </span><span class="ltx_text ltx_font_smallcaps" id="S1.1.1">Introduction</span> </h2> <div class="ltx_para" id="S1.p1"> <p class="ltx_p" id="S1.p1.1">Connected autonomous vehicles (CAVs) are exemplars of cyber-physical systems (CPS) operating within an environment with other agents. The development and deployment of such systems present a formidable challenge due to the growing complexity of multi-agent interactions. In this setting, multi-agent learning stands out as a promising avenue for developing autonomous vehicles capable of navigating complex and dynamic environments while considering the nature of interactions with their peers. 
In particular, multi-agent reinforcement learning (MARL) offers the promise of learning through self-exploration, which can capture intricate cooperative and competitive multi-agent interactions.</p> </div> <figure class="ltx_figure" id="S1.F1"> <div class="ltx_flex_figure"> <div class="ltx_flex_cell ltx_flex_size_1"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S1.F1.sf1"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="168" id="S1.F1.sf1.g1" src="extracted/6294916/fig1a.jpg" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S1.F1.sf1.2.1.1" style="font-size:90%;">(a)</span> </span><span class="ltx_text" id="S1.F1.sf1.3.2" style="font-size:90%;">Cooperative MARL using Nigel.</span></figcaption> </figure> </div> <div class="ltx_flex_break"></div> <div class="ltx_flex_cell ltx_flex_size_1"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S1.F1.sf2"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="168" id="S1.F1.sf2.g1" src="extracted/6294916/fig1b.jpg" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S1.F1.sf2.2.1.1" style="font-size:90%;">(b)</span> </span><span class="ltx_text" id="S1.F1.sf2.3.2" style="font-size:90%;">Competitive MARL using F1TENTH.</span></figcaption> </figure> </div> </div> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S1.F1.2.1.1" style="font-size:90%;">Figure 1</span>: </span><span class="ltx_text" id="S1.F1.3.2" style="font-size:90%;">Mixed-reality digital twin framework for hybrid sim2real transfer of MARL systems: (1) observe, (2) decide, (3) act, (4) estimate, and (5) update.</span></figcaption> </figure> <figure class="ltx_figure" id="S1.F2"> <div class="ltx_flex_figure"> <div 
class="ltx_flex_cell ltx_flex_size_4"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S1.F2.sf1"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="337" id="S1.F2.sf1.g1" src="extracted/6294916/fig2a.jpg" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S1.F2.sf1.2.1.1" style="font-size:90%;">(a)</span> </span></figcaption> </figure> </div> <div class="ltx_flex_cell ltx_flex_size_4"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S1.F2.sf2"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="337" id="S1.F2.sf2.g1" src="extracted/6294916/fig2b.jpg" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S1.F2.sf2.2.1.1" style="font-size:90%;">(b)</span> </span></figcaption> </figure> </div> <div class="ltx_flex_cell ltx_flex_size_4"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S1.F2.sf3"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="337" id="S1.F2.sf3.g1" src="extracted/6294916/fig2c.jpg" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S1.F2.sf3.2.1.1" style="font-size:90%;">(c)</span> </span></figcaption> </figure> </div> <div class="ltx_flex_cell ltx_flex_size_4"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S1.F2.sf4"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="337" id="S1.F2.sf4.g1" src="extracted/6294916/fig2d.jpg" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S1.F2.sf4.2.1.1" style="font-size:90%;">(d)</span> </span></figcaption> </figure> </div> </div> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag 
ltx_tag_figure"><span class="ltx_text" id="S1.F2.4.2.1" style="font-size:90%;">Figure 2</span>: </span><span class="ltx_text" id="S1.F2.2.1" style="font-size:90%;">Simulation parallelization: (a) depicts a snapshot of 25 cooperative MARL environments training in parallel, and (b) denotes the training time for different levels of environment parallelization. Similarly, (c) depicts a snapshot of 10<math alttext="\times" class="ltx_Math" display="inline" id="S1.F2.2.1.m1.1"><semantics id="S1.F2.2.1.m1.1b"><mo id="S1.F2.2.1.m1.1.1" xref="S1.F2.2.1.m1.1.1.cmml">×</mo><annotation-xml encoding="MathML-Content" id="S1.F2.2.1.m1.1c"><times id="S1.F2.2.1.m1.1.1.cmml" xref="S1.F2.2.1.m1.1.1"></times></annotation-xml><annotation encoding="application/x-tex" id="S1.F2.2.1.m1.1d">\times</annotation><annotation encoding="application/x-llamapun" id="S1.F2.2.1.m1.1e">×</annotation></semantics></math>2 competitive MARL agents training in parallel, and (d) denotes the training time for different levels of agent parallelization.</span></figcaption> </figure> <div class="ltx_para" id="S1.p2"> <p class="ltx_p" id="S1.p2.1">Cooperative MARL <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib1" title="">1</a>, <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib2" title="">2</a>, <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib3" title="">3</a>, <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib4" title="">4</a>]</cite> fosters an environment where autonomous vehicles collaborate and share information to accomplish collective objectives such as optimizing traffic flow, enhancing safety, and efficiently navigating road networks. It mirrors traffic situations where vehicles must work together, such as intersection management or platooning scenarios. 
Challenges in cooperative MARL include coordinating vehicle actions to minimize congestion, maintaining safety margins, and ensuring smooth interactions with peers.</p> </div> <div class="ltx_para" id="S1.p3"> <p class="ltx_p" id="S1.p3.1">On the other hand, competitive MARL <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib5" title="">5</a>, <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib6" title="">6</a>, <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib7" title="">7</a>, <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib8" title="">8</a>]</cite> introduces elements of privacy and rivalry in autonomous driving, simulating scenarios such as overtaking, merging in congested traffic, or racing. In this paradigm, agents strive to outperform their counterparts, prioritizing individual success over coordination. Challenges in competitive MARL encompass strategic decision-making, opponent modeling, and adapting to aggressive driving behaviors while preserving safety.</p> </div> <div class="ltx_para" id="S1.p4"> <p class="ltx_p" id="S1.p4.1">Irrespective of the problem formulation, one of the key challenges in training MARL policies is the low sample efficiency of the environment, which translates to longer training times. 
This is often addressed by (a) training in low-fidelity simulations that can run faster than real-time <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib1" title="">1</a>, <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib2" title="">2</a>, <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib3" title="">3</a>]</cite>, (b) parallelizing simulations to accelerate data collection <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib7" title="">7</a>, <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib8" title="">8</a>]</cite>, or (c) both <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib6" title="">6</a>]</cite>. Adopting the first approach usually leads to a wider sim2real gap, as evidenced in the literature by simulation-only deployments or explicit remarks on real-world experiments <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib6" title="">6</a>]</cite>. Adopting the second approach, on the other hand, usually requires extensive computational resources <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib7" title="">7</a>, <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib8" title="">8</a>, <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib9" title="">9</a>]</cite>, which may not be sustainable. Additionally, none of the prior works addresses the challenge of selectively isolating collision, interaction, and perception between parallelized replicas of multi-agent systems.</p> </div> <div class="ltx_para" id="S1.p5"> <p class="ltx_p" id="S1.p5.1">Another important challenge arises during the real-world deployment of MARL policies. 
While generic RL has studied the sim2real transfer of trained policies using techniques such as domain adaptation <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib10" title="">10</a>]</cite>, identification <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib11" title="">11</a>]</cite>, or augmentation <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib12" title="">12</a>]</cite>, the sim2real transfer of MARL systems has typically been under-explored. Although recent works such as <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib4" title="">4</a>]</cite> and <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib6" title="">6</a>]</cite> have tried to fill this gap, they simplified the formulation by adopting extensive observation spaces with ground-truth information about the environment and employed benchmark equipment such as motion capture (mocap) for real-time feedback during sim2real transfer. 
Additionally, these works used multiple physical vehicles, albeit scaled, within a synthetically constructed physical test environment for real-world deployments, which may not always be feasible due to monetary, spatial, energy, or safety constraints.</p> </div> <div class="ltx_para" id="S1.p6"> <p class="ltx_p" id="S1.p6.1">This work tries to address these two MARL pain points through the following open-source<span class="ltx_note ltx_role_footnote" id="footnote1"><sup class="ltx_note_mark">1</sup><span class="ltx_note_outer"><span class="ltx_note_content"><sup class="ltx_note_mark">1</sup><span class="ltx_tag ltx_tag_note">1</span>GitHub: <a class="ltx_ref ltx_url ltx_font_typewriter" href="https://github.com/autodrive-ecosystem" title="">https://github.com/autodrive-ecosystem</a></span></span></span> contributions:</p> </div> <div class="ltx_para" id="S1.p7"> <ul class="ltx_itemize" id="S1.I1"> <li class="ltx_item" id="S1.I1.i1" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S1.I1.i1.p1"> <p class="ltx_p" id="S1.I1.i1.p1.1"><span class="ltx_text ltx_font_bold" id="S1.I1.i1.p1.1.1">Parallel Training:</span> This work contributes a modular simulation parallelization framework, which allows selectively isolating the exteroceptive perception, collision, and interaction within or among MARL system(s).</p> </div> </li> <li class="ltx_item" id="S1.I1.i2" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S1.I1.i2.p1"> <p class="ltx_p" id="S1.I1.i2.p1.1"><span class="ltx_text ltx_font_bold" id="S1.I1.i2.p1.1.1">Sim2Real Transfer:</span> We introduce a bi-directional digital twinning framework to immerse a limited number of physical agents within a digital environment running virtual peer(s) to evaluate the MARL policies.</p> </div> </li> <li class="ltx_item" id="S1.I1.i3" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" 
id="S1.I1.i3.p1"> <p class="ltx_p" id="S1.I1.i3.p1.1"><span class="ltx_text ltx_font_bold" id="S1.I1.i3.p1.1.1">Case Studies:</span> This work presents a cooperative non-zero-sum use-case of intersection traversal and a competitive zero-sum use-case of head-to-head autonomous racing. The agents are provided with realistically sparse observation spaces and employ onboard state estimation for real-time feedback during sim2real transfer.</p> </div> </li> <li class="ltx_item" id="S1.I1.i4" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S1.I1.i4.p1"> <p class="ltx_p" id="S1.I1.i4.p1.1"><span class="ltx_text ltx_font_bold" id="S1.I1.i4.p1.1.1">MARL Analysis:</span> The proposed MARL case studies are benchmarked against a decentralized reactive planning algorithm to assess their efficacy. We numerically evaluate the sim2real gap to analyze the effect of systematic domain randomization introduced in this work.</p> </div> </li> </ul> </div> <div class="ltx_para" id="S1.p8"> <p class="ltx_p" id="S1.p8.1">The remainder of this paper is organized as follows: Section <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S2" title="II Case Studies ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_tag">II</span></a> elucidates the two MARL case studies, including their mathematical formulation. Section <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S3" title="III Methodology ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_tag">III</span></a> describes the workflow adopted to train and deploy the MARL policies, including their sim2real transfer. 
Section <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S4" title="IV Results ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_tag">IV</span></a> analyzes the MARL training and deployment results for both case studies. Finally, Section <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S5" title="V Conclusion ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_tag">V</span></a> provides concluding remarks and future research directions.</p> </div> </section> <section class="ltx_section" id="S2"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">II </span><span class="ltx_text ltx_font_smallcaps" id="S2.1.1">Case Studies</span> </h2> <section class="ltx_subsection" id="S2.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S2.SS1.5.1.1">II-A</span> </span><span class="ltx_text ltx_font_italic" id="S2.SS1.6.2">Cooperative Multi-Agent Scenario</span> </h3> <div class="ltx_para" id="S2.SS1.p1"> <p class="ltx_p" id="S2.SS1.p1.1">We formulated a decentralized 4-agent collaborative scenario, wherein each agent’s objective was to traverse a 2+2 lane, 4-way intersection without colliding or overstepping lane bounds. This scenario is representative of a standard uncontrolled traffic intersection. Here, each agent perceived its intrinsic states and received limited information from its peers (via V2V communication); no external sensing modalities were employed. Each agent was reset independently, resulting in highly stochastic initial conditions. The exact structure/map of the environment was not known to any agent. 
Consequently, this problem was framed as a partially observable Markov decision process (POMDP), which captured hidden state information through limited observations.</p> </div> <section class="ltx_subsubsection" id="S2.SS1.SSS1"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection"><span class="ltx_text" id="S2.SS1.SSS1.5.1.1">II-A</span>1 </span>Observation Space</h4> <div class="ltx_para" id="S2.SS1.SSS1.p1"> <p class="ltx_p" id="S2.SS1.SSS1.p1.12">Each agent, <math alttext="i" class="ltx_Math" display="inline" id="S2.SS1.SSS1.p1.1.m1.1"><semantics id="S2.SS1.SSS1.p1.1.m1.1a"><mi id="S2.SS1.SSS1.p1.1.m1.1.1" xref="S2.SS1.SSS1.p1.1.m1.1.1.cmml">i</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.SSS1.p1.1.m1.1b"><ci id="S2.SS1.SSS1.p1.1.m1.1.1.cmml" xref="S2.SS1.SSS1.p1.1.m1.1.1">𝑖</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.SSS1.p1.1.m1.1c">i</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.SSS1.p1.1.m1.1d">italic_i</annotation></semantics></math> (<math alttext="0&lt;i&lt;N" class="ltx_Math" display="inline" id="S2.SS1.SSS1.p1.2.m2.1"><semantics id="S2.SS1.SSS1.p1.2.m2.1a"><mrow id="S2.SS1.SSS1.p1.2.m2.1.1" xref="S2.SS1.SSS1.p1.2.m2.1.1.cmml"><mn id="S2.SS1.SSS1.p1.2.m2.1.1.2" xref="S2.SS1.SSS1.p1.2.m2.1.1.2.cmml">0</mn><mo id="S2.SS1.SSS1.p1.2.m2.1.1.3" xref="S2.SS1.SSS1.p1.2.m2.1.1.3.cmml">&lt;</mo><mi id="S2.SS1.SSS1.p1.2.m2.1.1.4" xref="S2.SS1.SSS1.p1.2.m2.1.1.4.cmml">i</mi><mo id="S2.SS1.SSS1.p1.2.m2.1.1.5" xref="S2.SS1.SSS1.p1.2.m2.1.1.5.cmml">&lt;</mo><mi id="S2.SS1.SSS1.p1.2.m2.1.1.6" xref="S2.SS1.SSS1.p1.2.m2.1.1.6.cmml">N</mi></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.SSS1.p1.2.m2.1b"><apply id="S2.SS1.SSS1.p1.2.m2.1.1.cmml" xref="S2.SS1.SSS1.p1.2.m2.1.1"><and id="S2.SS1.SSS1.p1.2.m2.1.1a.cmml" xref="S2.SS1.SSS1.p1.2.m2.1.1"></and><apply id="S2.SS1.SSS1.p1.2.m2.1.1b.cmml" xref="S2.SS1.SSS1.p1.2.m2.1.1"><lt id="S2.SS1.SSS1.p1.2.m2.1.1.3.cmml" 
xref="S2.SS1.SSS1.p1.2.m2.1.1.3"></lt><cn id="S2.SS1.SSS1.p1.2.m2.1.1.2.cmml" type="integer" xref="S2.SS1.SSS1.p1.2.m2.1.1.2">0</cn><ci id="S2.SS1.SSS1.p1.2.m2.1.1.4.cmml" xref="S2.SS1.SSS1.p1.2.m2.1.1.4">𝑖</ci></apply><apply id="S2.SS1.SSS1.p1.2.m2.1.1c.cmml" xref="S2.SS1.SSS1.p1.2.m2.1.1"><lt id="S2.SS1.SSS1.p1.2.m2.1.1.5.cmml" xref="S2.SS1.SSS1.p1.2.m2.1.1.5"></lt><share href="https://arxiv.org/html/2403.10996v5#S2.SS1.SSS1.p1.2.m2.1.1.4.cmml" id="S2.SS1.SSS1.p1.2.m2.1.1d.cmml" xref="S2.SS1.SSS1.p1.2.m2.1.1"></share><ci id="S2.SS1.SSS1.p1.2.m2.1.1.6.cmml" xref="S2.SS1.SSS1.p1.2.m2.1.1.6">𝑁</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.SSS1.p1.2.m2.1c">0&lt;i&lt;N</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.SSS1.p1.2.m2.1d">0 &lt; italic_i &lt; italic_N</annotation></semantics></math>), employed an appropriate subset of its sensor suite to collect observations: <math alttext="o_{t}^{i}=\left[g^{i},\tilde{p}^{i},\tilde{\psi}^{i},\tilde{v}^{i}\right]_{t}% \in\mathbb{R}^{2+4(N-1)}" class="ltx_Math" display="inline" id="S2.SS1.SSS1.p1.3.m3.5"><semantics id="S2.SS1.SSS1.p1.3.m3.5a"><mrow id="S2.SS1.SSS1.p1.3.m3.5.5" xref="S2.SS1.SSS1.p1.3.m3.5.5.cmml"><msubsup id="S2.SS1.SSS1.p1.3.m3.5.5.6" xref="S2.SS1.SSS1.p1.3.m3.5.5.6.cmml"><mi id="S2.SS1.SSS1.p1.3.m3.5.5.6.2.2" xref="S2.SS1.SSS1.p1.3.m3.5.5.6.2.2.cmml">o</mi><mi id="S2.SS1.SSS1.p1.3.m3.5.5.6.2.3" xref="S2.SS1.SSS1.p1.3.m3.5.5.6.2.3.cmml">t</mi><mi id="S2.SS1.SSS1.p1.3.m3.5.5.6.3" xref="S2.SS1.SSS1.p1.3.m3.5.5.6.3.cmml">i</mi></msubsup><mo id="S2.SS1.SSS1.p1.3.m3.5.5.7" xref="S2.SS1.SSS1.p1.3.m3.5.5.7.cmml">=</mo><msub id="S2.SS1.SSS1.p1.3.m3.5.5.4" xref="S2.SS1.SSS1.p1.3.m3.5.5.4.cmml"><mrow id="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4" xref="S2.SS1.SSS1.p1.3.m3.5.5.4.4.5.cmml"><mo id="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.5" xref="S2.SS1.SSS1.p1.3.m3.5.5.4.4.5.cmml">[</mo><msup id="S2.SS1.SSS1.p1.3.m3.2.2.1.1.1.1" xref="S2.SS1.SSS1.p1.3.m3.2.2.1.1.1.1.cmml"><mi 
id="S2.SS1.SSS1.p1.3.m3.2.2.1.1.1.1.2" xref="S2.SS1.SSS1.p1.3.m3.2.2.1.1.1.1.2.cmml">g</mi><mi id="S2.SS1.SSS1.p1.3.m3.2.2.1.1.1.1.3" xref="S2.SS1.SSS1.p1.3.m3.2.2.1.1.1.1.3.cmml">i</mi></msup><mo id="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.6" xref="S2.SS1.SSS1.p1.3.m3.5.5.4.4.5.cmml">,</mo><msup id="S2.SS1.SSS1.p1.3.m3.3.3.2.2.2.2" xref="S2.SS1.SSS1.p1.3.m3.3.3.2.2.2.2.cmml"><mover accent="true" id="S2.SS1.SSS1.p1.3.m3.3.3.2.2.2.2.2" xref="S2.SS1.SSS1.p1.3.m3.3.3.2.2.2.2.2.cmml"><mi id="S2.SS1.SSS1.p1.3.m3.3.3.2.2.2.2.2.2" xref="S2.SS1.SSS1.p1.3.m3.3.3.2.2.2.2.2.2.cmml">p</mi><mo id="S2.SS1.SSS1.p1.3.m3.3.3.2.2.2.2.2.1" xref="S2.SS1.SSS1.p1.3.m3.3.3.2.2.2.2.2.1.cmml">~</mo></mover><mi id="S2.SS1.SSS1.p1.3.m3.3.3.2.2.2.2.3" xref="S2.SS1.SSS1.p1.3.m3.3.3.2.2.2.2.3.cmml">i</mi></msup><mo id="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.7" xref="S2.SS1.SSS1.p1.3.m3.5.5.4.4.5.cmml">,</mo><msup id="S2.SS1.SSS1.p1.3.m3.4.4.3.3.3.3" xref="S2.SS1.SSS1.p1.3.m3.4.4.3.3.3.3.cmml"><mover accent="true" id="S2.SS1.SSS1.p1.3.m3.4.4.3.3.3.3.2" xref="S2.SS1.SSS1.p1.3.m3.4.4.3.3.3.3.2.cmml"><mi id="S2.SS1.SSS1.p1.3.m3.4.4.3.3.3.3.2.2" xref="S2.SS1.SSS1.p1.3.m3.4.4.3.3.3.3.2.2.cmml">ψ</mi><mo id="S2.SS1.SSS1.p1.3.m3.4.4.3.3.3.3.2.1" xref="S2.SS1.SSS1.p1.3.m3.4.4.3.3.3.3.2.1.cmml">~</mo></mover><mi id="S2.SS1.SSS1.p1.3.m3.4.4.3.3.3.3.3" xref="S2.SS1.SSS1.p1.3.m3.4.4.3.3.3.3.3.cmml">i</mi></msup><mo id="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.8" xref="S2.SS1.SSS1.p1.3.m3.5.5.4.4.5.cmml">,</mo><msup id="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.4" xref="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.4.cmml"><mover accent="true" id="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.4.2" xref="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.4.2.cmml"><mi id="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.4.2.2" xref="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.4.2.2.cmml">v</mi><mo id="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.4.2.1" xref="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.4.2.1.cmml">~</mo></mover><mi id="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.4.3" xref="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.4.3.cmml">i</mi></msup><mo 
id="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.9" xref="S2.SS1.SSS1.p1.3.m3.5.5.4.4.5.cmml">]</mo></mrow><mi id="S2.SS1.SSS1.p1.3.m3.5.5.4.6" xref="S2.SS1.SSS1.p1.3.m3.5.5.4.6.cmml">t</mi></msub><mo id="S2.SS1.SSS1.p1.3.m3.5.5.8" xref="S2.SS1.SSS1.p1.3.m3.5.5.8.cmml">∈</mo><msup id="S2.SS1.SSS1.p1.3.m3.5.5.9" xref="S2.SS1.SSS1.p1.3.m3.5.5.9.cmml"><mi id="S2.SS1.SSS1.p1.3.m3.5.5.9.2" xref="S2.SS1.SSS1.p1.3.m3.5.5.9.2.cmml">ℝ</mi><mrow id="S2.SS1.SSS1.p1.3.m3.1.1.1" xref="S2.SS1.SSS1.p1.3.m3.1.1.1.cmml"><mn id="S2.SS1.SSS1.p1.3.m3.1.1.1.3" xref="S2.SS1.SSS1.p1.3.m3.1.1.1.3.cmml">2</mn><mo id="S2.SS1.SSS1.p1.3.m3.1.1.1.2" xref="S2.SS1.SSS1.p1.3.m3.1.1.1.2.cmml">+</mo><mrow id="S2.SS1.SSS1.p1.3.m3.1.1.1.1" xref="S2.SS1.SSS1.p1.3.m3.1.1.1.1.cmml"><mn id="S2.SS1.SSS1.p1.3.m3.1.1.1.1.3" xref="S2.SS1.SSS1.p1.3.m3.1.1.1.1.3.cmml">4</mn><mo id="S2.SS1.SSS1.p1.3.m3.1.1.1.1.2" xref="S2.SS1.SSS1.p1.3.m3.1.1.1.1.2.cmml">⁢</mo><mrow id="S2.SS1.SSS1.p1.3.m3.1.1.1.1.1.1" xref="S2.SS1.SSS1.p1.3.m3.1.1.1.1.1.1.1.cmml"><mo id="S2.SS1.SSS1.p1.3.m3.1.1.1.1.1.1.2" stretchy="false" xref="S2.SS1.SSS1.p1.3.m3.1.1.1.1.1.1.1.cmml">(</mo><mrow id="S2.SS1.SSS1.p1.3.m3.1.1.1.1.1.1.1" xref="S2.SS1.SSS1.p1.3.m3.1.1.1.1.1.1.1.cmml"><mi id="S2.SS1.SSS1.p1.3.m3.1.1.1.1.1.1.1.2" xref="S2.SS1.SSS1.p1.3.m3.1.1.1.1.1.1.1.2.cmml">N</mi><mo id="S2.SS1.SSS1.p1.3.m3.1.1.1.1.1.1.1.1" xref="S2.SS1.SSS1.p1.3.m3.1.1.1.1.1.1.1.1.cmml">−</mo><mn id="S2.SS1.SSS1.p1.3.m3.1.1.1.1.1.1.1.3" xref="S2.SS1.SSS1.p1.3.m3.1.1.1.1.1.1.1.3.cmml">1</mn></mrow><mo id="S2.SS1.SSS1.p1.3.m3.1.1.1.1.1.1.3" stretchy="false" xref="S2.SS1.SSS1.p1.3.m3.1.1.1.1.1.1.1.cmml">)</mo></mrow></mrow></mrow></msup></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.SSS1.p1.3.m3.5b"><apply id="S2.SS1.SSS1.p1.3.m3.5.5.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5"><and id="S2.SS1.SSS1.p1.3.m3.5.5a.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5"></and><apply id="S2.SS1.SSS1.p1.3.m3.5.5b.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5"><eq id="S2.SS1.SSS1.p1.3.m3.5.5.7.cmml" 
xref="S2.SS1.SSS1.p1.3.m3.5.5.7"></eq><apply id="S2.SS1.SSS1.p1.3.m3.5.5.6.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5.6"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.3.m3.5.5.6.1.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5.6">superscript</csymbol><apply id="S2.SS1.SSS1.p1.3.m3.5.5.6.2.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5.6"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.3.m3.5.5.6.2.1.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5.6">subscript</csymbol><ci id="S2.SS1.SSS1.p1.3.m3.5.5.6.2.2.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5.6.2.2">𝑜</ci><ci id="S2.SS1.SSS1.p1.3.m3.5.5.6.2.3.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5.6.2.3">𝑡</ci></apply><ci id="S2.SS1.SSS1.p1.3.m3.5.5.6.3.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5.6.3">𝑖</ci></apply><apply id="S2.SS1.SSS1.p1.3.m3.5.5.4.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5.4"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.3.m3.5.5.4.5.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5.4">subscript</csymbol><list id="S2.SS1.SSS1.p1.3.m3.5.5.4.4.5.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4"><apply id="S2.SS1.SSS1.p1.3.m3.2.2.1.1.1.1.cmml" xref="S2.SS1.SSS1.p1.3.m3.2.2.1.1.1.1"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.3.m3.2.2.1.1.1.1.1.cmml" xref="S2.SS1.SSS1.p1.3.m3.2.2.1.1.1.1">superscript</csymbol><ci id="S2.SS1.SSS1.p1.3.m3.2.2.1.1.1.1.2.cmml" xref="S2.SS1.SSS1.p1.3.m3.2.2.1.1.1.1.2">𝑔</ci><ci id="S2.SS1.SSS1.p1.3.m3.2.2.1.1.1.1.3.cmml" xref="S2.SS1.SSS1.p1.3.m3.2.2.1.1.1.1.3">𝑖</ci></apply><apply id="S2.SS1.SSS1.p1.3.m3.3.3.2.2.2.2.cmml" xref="S2.SS1.SSS1.p1.3.m3.3.3.2.2.2.2"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.3.m3.3.3.2.2.2.2.1.cmml" xref="S2.SS1.SSS1.p1.3.m3.3.3.2.2.2.2">superscript</csymbol><apply id="S2.SS1.SSS1.p1.3.m3.3.3.2.2.2.2.2.cmml" xref="S2.SS1.SSS1.p1.3.m3.3.3.2.2.2.2.2"><ci id="S2.SS1.SSS1.p1.3.m3.3.3.2.2.2.2.2.1.cmml" xref="S2.SS1.SSS1.p1.3.m3.3.3.2.2.2.2.2.1">~</ci><ci id="S2.SS1.SSS1.p1.3.m3.3.3.2.2.2.2.2.2.cmml" xref="S2.SS1.SSS1.p1.3.m3.3.3.2.2.2.2.2.2">𝑝</ci></apply><ci id="S2.SS1.SSS1.p1.3.m3.3.3.2.2.2.2.3.cmml" 
xref="S2.SS1.SSS1.p1.3.m3.3.3.2.2.2.2.3">𝑖</ci></apply><apply id="S2.SS1.SSS1.p1.3.m3.4.4.3.3.3.3.cmml" xref="S2.SS1.SSS1.p1.3.m3.4.4.3.3.3.3"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.3.m3.4.4.3.3.3.3.1.cmml" xref="S2.SS1.SSS1.p1.3.m3.4.4.3.3.3.3">superscript</csymbol><apply id="S2.SS1.SSS1.p1.3.m3.4.4.3.3.3.3.2.cmml" xref="S2.SS1.SSS1.p1.3.m3.4.4.3.3.3.3.2"><ci id="S2.SS1.SSS1.p1.3.m3.4.4.3.3.3.3.2.1.cmml" xref="S2.SS1.SSS1.p1.3.m3.4.4.3.3.3.3.2.1">~</ci><ci id="S2.SS1.SSS1.p1.3.m3.4.4.3.3.3.3.2.2.cmml" xref="S2.SS1.SSS1.p1.3.m3.4.4.3.3.3.3.2.2">𝜓</ci></apply><ci id="S2.SS1.SSS1.p1.3.m3.4.4.3.3.3.3.3.cmml" xref="S2.SS1.SSS1.p1.3.m3.4.4.3.3.3.3.3">𝑖</ci></apply><apply id="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.4.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.4"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.4.1.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.4">superscript</csymbol><apply id="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.4.2.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.4.2"><ci id="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.4.2.1.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.4.2.1">~</ci><ci id="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.4.2.2.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.4.2.2">𝑣</ci></apply><ci id="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.4.3.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5.4.4.4.4.3">𝑖</ci></apply></list><ci id="S2.SS1.SSS1.p1.3.m3.5.5.4.6.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5.4.6">𝑡</ci></apply></apply><apply id="S2.SS1.SSS1.p1.3.m3.5.5c.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5"><in id="S2.SS1.SSS1.p1.3.m3.5.5.8.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5.8"></in><share href="https://arxiv.org/html/2403.10996v5#S2.SS1.SSS1.p1.3.m3.5.5.4.cmml" id="S2.SS1.SSS1.p1.3.m3.5.5d.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5"></share><apply id="S2.SS1.SSS1.p1.3.m3.5.5.9.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5.9"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.3.m3.5.5.9.1.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5.9">superscript</csymbol><ci id="S2.SS1.SSS1.p1.3.m3.5.5.9.2.cmml" xref="S2.SS1.SSS1.p1.3.m3.5.5.9.2">ℝ</ci><apply 
id="S2.SS1.SSS1.p1.3.m3.1.1.1.cmml" xref="S2.SS1.SSS1.p1.3.m3.1.1.1"><plus id="S2.SS1.SSS1.p1.3.m3.1.1.1.2.cmml" xref="S2.SS1.SSS1.p1.3.m3.1.1.1.2"></plus><cn id="S2.SS1.SSS1.p1.3.m3.1.1.1.3.cmml" type="integer" xref="S2.SS1.SSS1.p1.3.m3.1.1.1.3">2</cn><apply id="S2.SS1.SSS1.p1.3.m3.1.1.1.1.cmml" xref="S2.SS1.SSS1.p1.3.m3.1.1.1.1"><times id="S2.SS1.SSS1.p1.3.m3.1.1.1.1.2.cmml" xref="S2.SS1.SSS1.p1.3.m3.1.1.1.1.2"></times><cn id="S2.SS1.SSS1.p1.3.m3.1.1.1.1.3.cmml" type="integer" xref="S2.SS1.SSS1.p1.3.m3.1.1.1.1.3">4</cn><apply id="S2.SS1.SSS1.p1.3.m3.1.1.1.1.1.1.1.cmml" xref="S2.SS1.SSS1.p1.3.m3.1.1.1.1.1.1"><minus id="S2.SS1.SSS1.p1.3.m3.1.1.1.1.1.1.1.1.cmml" xref="S2.SS1.SSS1.p1.3.m3.1.1.1.1.1.1.1.1"></minus><ci id="S2.SS1.SSS1.p1.3.m3.1.1.1.1.1.1.1.2.cmml" xref="S2.SS1.SSS1.p1.3.m3.1.1.1.1.1.1.1.2">𝑁</ci><cn id="S2.SS1.SSS1.p1.3.m3.1.1.1.1.1.1.1.3.cmml" type="integer" xref="S2.SS1.SSS1.p1.3.m3.1.1.1.1.1.1.1.3">1</cn></apply></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.SSS1.p1.3.m3.5c">o_{t}^{i}=\left[g^{i},\tilde{p}^{i},\tilde{\psi}^{i},\tilde{v}^{i}\right]_{t}% \in\mathbb{R}^{2+4(N-1)}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.SSS1.p1.3.m3.5d">italic_o start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT = [ italic_g start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT , over~ start_ARG italic_p end_ARG start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT , over~ start_ARG italic_ψ end_ARG start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT , over~ start_ARG italic_v end_ARG start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ] start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT 2 + 4 ( italic_N - 1 ) end_POSTSUPERSCRIPT</annotation></semantics></math>. 
This included IPS for positional coordinates <math alttext="\left[p_{x},p_{y}\right]_{t}\in\mathbb{R}^{2}" class="ltx_Math" display="inline" id="S2.SS1.SSS1.p1.4.m4.2"><semantics id="S2.SS1.SSS1.p1.4.m4.2a"><mrow id="S2.SS1.SSS1.p1.4.m4.2.2" xref="S2.SS1.SSS1.p1.4.m4.2.2.cmml"><msub id="S2.SS1.SSS1.p1.4.m4.2.2.2" xref="S2.SS1.SSS1.p1.4.m4.2.2.2.cmml"><mrow id="S2.SS1.SSS1.p1.4.m4.2.2.2.2.2" xref="S2.SS1.SSS1.p1.4.m4.2.2.2.2.3.cmml"><mo id="S2.SS1.SSS1.p1.4.m4.2.2.2.2.2.3" xref="S2.SS1.SSS1.p1.4.m4.2.2.2.2.3.cmml">[</mo><msub id="S2.SS1.SSS1.p1.4.m4.1.1.1.1.1.1" xref="S2.SS1.SSS1.p1.4.m4.1.1.1.1.1.1.cmml"><mi id="S2.SS1.SSS1.p1.4.m4.1.1.1.1.1.1.2" xref="S2.SS1.SSS1.p1.4.m4.1.1.1.1.1.1.2.cmml">p</mi><mi id="S2.SS1.SSS1.p1.4.m4.1.1.1.1.1.1.3" xref="S2.SS1.SSS1.p1.4.m4.1.1.1.1.1.1.3.cmml">x</mi></msub><mo id="S2.SS1.SSS1.p1.4.m4.2.2.2.2.2.4" xref="S2.SS1.SSS1.p1.4.m4.2.2.2.2.3.cmml">,</mo><msub id="S2.SS1.SSS1.p1.4.m4.2.2.2.2.2.2" xref="S2.SS1.SSS1.p1.4.m4.2.2.2.2.2.2.cmml"><mi id="S2.SS1.SSS1.p1.4.m4.2.2.2.2.2.2.2" xref="S2.SS1.SSS1.p1.4.m4.2.2.2.2.2.2.2.cmml">p</mi><mi id="S2.SS1.SSS1.p1.4.m4.2.2.2.2.2.2.3" xref="S2.SS1.SSS1.p1.4.m4.2.2.2.2.2.2.3.cmml">y</mi></msub><mo id="S2.SS1.SSS1.p1.4.m4.2.2.2.2.2.5" xref="S2.SS1.SSS1.p1.4.m4.2.2.2.2.3.cmml">]</mo></mrow><mi id="S2.SS1.SSS1.p1.4.m4.2.2.2.4" xref="S2.SS1.SSS1.p1.4.m4.2.2.2.4.cmml">t</mi></msub><mo id="S2.SS1.SSS1.p1.4.m4.2.2.3" xref="S2.SS1.SSS1.p1.4.m4.2.2.3.cmml">∈</mo><msup id="S2.SS1.SSS1.p1.4.m4.2.2.4" xref="S2.SS1.SSS1.p1.4.m4.2.2.4.cmml"><mi id="S2.SS1.SSS1.p1.4.m4.2.2.4.2" xref="S2.SS1.SSS1.p1.4.m4.2.2.4.2.cmml">ℝ</mi><mn id="S2.SS1.SSS1.p1.4.m4.2.2.4.3" xref="S2.SS1.SSS1.p1.4.m4.2.2.4.3.cmml">2</mn></msup></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.SSS1.p1.4.m4.2b"><apply id="S2.SS1.SSS1.p1.4.m4.2.2.cmml" xref="S2.SS1.SSS1.p1.4.m4.2.2"><in id="S2.SS1.SSS1.p1.4.m4.2.2.3.cmml" xref="S2.SS1.SSS1.p1.4.m4.2.2.3"></in><apply id="S2.SS1.SSS1.p1.4.m4.2.2.2.cmml" 
xref="S2.SS1.SSS1.p1.4.m4.2.2.2"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.4.m4.2.2.2.3.cmml" xref="S2.SS1.SSS1.p1.4.m4.2.2.2">subscript</csymbol><interval closure="closed" id="S2.SS1.SSS1.p1.4.m4.2.2.2.2.3.cmml" xref="S2.SS1.SSS1.p1.4.m4.2.2.2.2.2"><apply id="S2.SS1.SSS1.p1.4.m4.1.1.1.1.1.1.cmml" xref="S2.SS1.SSS1.p1.4.m4.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.4.m4.1.1.1.1.1.1.1.cmml" xref="S2.SS1.SSS1.p1.4.m4.1.1.1.1.1.1">subscript</csymbol><ci id="S2.SS1.SSS1.p1.4.m4.1.1.1.1.1.1.2.cmml" xref="S2.SS1.SSS1.p1.4.m4.1.1.1.1.1.1.2">𝑝</ci><ci id="S2.SS1.SSS1.p1.4.m4.1.1.1.1.1.1.3.cmml" xref="S2.SS1.SSS1.p1.4.m4.1.1.1.1.1.1.3">𝑥</ci></apply><apply id="S2.SS1.SSS1.p1.4.m4.2.2.2.2.2.2.cmml" xref="S2.SS1.SSS1.p1.4.m4.2.2.2.2.2.2"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.4.m4.2.2.2.2.2.2.1.cmml" xref="S2.SS1.SSS1.p1.4.m4.2.2.2.2.2.2">subscript</csymbol><ci id="S2.SS1.SSS1.p1.4.m4.2.2.2.2.2.2.2.cmml" xref="S2.SS1.SSS1.p1.4.m4.2.2.2.2.2.2.2">𝑝</ci><ci id="S2.SS1.SSS1.p1.4.m4.2.2.2.2.2.2.3.cmml" xref="S2.SS1.SSS1.p1.4.m4.2.2.2.2.2.2.3">𝑦</ci></apply></interval><ci id="S2.SS1.SSS1.p1.4.m4.2.2.2.4.cmml" xref="S2.SS1.SSS1.p1.4.m4.2.2.2.4">𝑡</ci></apply><apply id="S2.SS1.SSS1.p1.4.m4.2.2.4.cmml" xref="S2.SS1.SSS1.p1.4.m4.2.2.4"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.4.m4.2.2.4.1.cmml" xref="S2.SS1.SSS1.p1.4.m4.2.2.4">superscript</csymbol><ci id="S2.SS1.SSS1.p1.4.m4.2.2.4.2.cmml" xref="S2.SS1.SSS1.p1.4.m4.2.2.4.2">ℝ</ci><cn id="S2.SS1.SSS1.p1.4.m4.2.2.4.3.cmml" type="integer" xref="S2.SS1.SSS1.p1.4.m4.2.2.4.3">2</cn></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.SSS1.p1.4.m4.2c">\left[p_{x},p_{y}\right]_{t}\in\mathbb{R}^{2}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.SSS1.p1.4.m4.2d">[ italic_p start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT , italic_p start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT ] start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT 2 
end_POSTSUPERSCRIPT</annotation></semantics></math>, IMU for yaw <math alttext="\psi_{t}\in\mathbb{R}^{1}" class="ltx_Math" display="inline" id="S2.SS1.SSS1.p1.5.m5.1"><semantics id="S2.SS1.SSS1.p1.5.m5.1a"><mrow id="S2.SS1.SSS1.p1.5.m5.1.1" xref="S2.SS1.SSS1.p1.5.m5.1.1.cmml"><msub id="S2.SS1.SSS1.p1.5.m5.1.1.2" xref="S2.SS1.SSS1.p1.5.m5.1.1.2.cmml"><mi id="S2.SS1.SSS1.p1.5.m5.1.1.2.2" xref="S2.SS1.SSS1.p1.5.m5.1.1.2.2.cmml">ψ</mi><mi id="S2.SS1.SSS1.p1.5.m5.1.1.2.3" xref="S2.SS1.SSS1.p1.5.m5.1.1.2.3.cmml">t</mi></msub><mo id="S2.SS1.SSS1.p1.5.m5.1.1.1" xref="S2.SS1.SSS1.p1.5.m5.1.1.1.cmml">∈</mo><msup id="S2.SS1.SSS1.p1.5.m5.1.1.3" xref="S2.SS1.SSS1.p1.5.m5.1.1.3.cmml"><mi id="S2.SS1.SSS1.p1.5.m5.1.1.3.2" xref="S2.SS1.SSS1.p1.5.m5.1.1.3.2.cmml">ℝ</mi><mn id="S2.SS1.SSS1.p1.5.m5.1.1.3.3" xref="S2.SS1.SSS1.p1.5.m5.1.1.3.3.cmml">1</mn></msup></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.SSS1.p1.5.m5.1b"><apply id="S2.SS1.SSS1.p1.5.m5.1.1.cmml" xref="S2.SS1.SSS1.p1.5.m5.1.1"><in id="S2.SS1.SSS1.p1.5.m5.1.1.1.cmml" xref="S2.SS1.SSS1.p1.5.m5.1.1.1"></in><apply id="S2.SS1.SSS1.p1.5.m5.1.1.2.cmml" xref="S2.SS1.SSS1.p1.5.m5.1.1.2"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.5.m5.1.1.2.1.cmml" xref="S2.SS1.SSS1.p1.5.m5.1.1.2">subscript</csymbol><ci id="S2.SS1.SSS1.p1.5.m5.1.1.2.2.cmml" xref="S2.SS1.SSS1.p1.5.m5.1.1.2.2">𝜓</ci><ci id="S2.SS1.SSS1.p1.5.m5.1.1.2.3.cmml" xref="S2.SS1.SSS1.p1.5.m5.1.1.2.3">𝑡</ci></apply><apply id="S2.SS1.SSS1.p1.5.m5.1.1.3.cmml" xref="S2.SS1.SSS1.p1.5.m5.1.1.3"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.5.m5.1.1.3.1.cmml" xref="S2.SS1.SSS1.p1.5.m5.1.1.3">superscript</csymbol><ci id="S2.SS1.SSS1.p1.5.m5.1.1.3.2.cmml" xref="S2.SS1.SSS1.p1.5.m5.1.1.3.2">ℝ</ci><cn id="S2.SS1.SSS1.p1.5.m5.1.1.3.3.cmml" type="integer" xref="S2.SS1.SSS1.p1.5.m5.1.1.3.3">1</cn></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.SSS1.p1.5.m5.1c">\psi_{t}\in\mathbb{R}^{1}</annotation><annotation 
encoding="application/x-llamapun" id="S2.SS1.SSS1.p1.5.m5.1d">italic_ψ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT</annotation></semantics></math>, and incremental encoders for estimating vehicle velocity <math alttext="v_{t}\in\mathbb{R}^{1}" class="ltx_Math" display="inline" id="S2.SS1.SSS1.p1.6.m6.1"><semantics id="S2.SS1.SSS1.p1.6.m6.1a"><mrow id="S2.SS1.SSS1.p1.6.m6.1.1" xref="S2.SS1.SSS1.p1.6.m6.1.1.cmml"><msub id="S2.SS1.SSS1.p1.6.m6.1.1.2" xref="S2.SS1.SSS1.p1.6.m6.1.1.2.cmml"><mi id="S2.SS1.SSS1.p1.6.m6.1.1.2.2" xref="S2.SS1.SSS1.p1.6.m6.1.1.2.2.cmml">v</mi><mi id="S2.SS1.SSS1.p1.6.m6.1.1.2.3" xref="S2.SS1.SSS1.p1.6.m6.1.1.2.3.cmml">t</mi></msub><mo id="S2.SS1.SSS1.p1.6.m6.1.1.1" xref="S2.SS1.SSS1.p1.6.m6.1.1.1.cmml">∈</mo><msup id="S2.SS1.SSS1.p1.6.m6.1.1.3" xref="S2.SS1.SSS1.p1.6.m6.1.1.3.cmml"><mi id="S2.SS1.SSS1.p1.6.m6.1.1.3.2" xref="S2.SS1.SSS1.p1.6.m6.1.1.3.2.cmml">ℝ</mi><mn id="S2.SS1.SSS1.p1.6.m6.1.1.3.3" xref="S2.SS1.SSS1.p1.6.m6.1.1.3.3.cmml">1</mn></msup></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.SSS1.p1.6.m6.1b"><apply id="S2.SS1.SSS1.p1.6.m6.1.1.cmml" xref="S2.SS1.SSS1.p1.6.m6.1.1"><in id="S2.SS1.SSS1.p1.6.m6.1.1.1.cmml" xref="S2.SS1.SSS1.p1.6.m6.1.1.1"></in><apply id="S2.SS1.SSS1.p1.6.m6.1.1.2.cmml" xref="S2.SS1.SSS1.p1.6.m6.1.1.2"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.6.m6.1.1.2.1.cmml" xref="S2.SS1.SSS1.p1.6.m6.1.1.2">subscript</csymbol><ci id="S2.SS1.SSS1.p1.6.m6.1.1.2.2.cmml" xref="S2.SS1.SSS1.p1.6.m6.1.1.2.2">𝑣</ci><ci id="S2.SS1.SSS1.p1.6.m6.1.1.2.3.cmml" xref="S2.SS1.SSS1.p1.6.m6.1.1.2.3">𝑡</ci></apply><apply id="S2.SS1.SSS1.p1.6.m6.1.1.3.cmml" xref="S2.SS1.SSS1.p1.6.m6.1.1.3"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.6.m6.1.1.3.1.cmml" xref="S2.SS1.SSS1.p1.6.m6.1.1.3">superscript</csymbol><ci id="S2.SS1.SSS1.p1.6.m6.1.1.3.2.cmml" xref="S2.SS1.SSS1.p1.6.m6.1.1.3.2">ℝ</ci><cn id="S2.SS1.SSS1.p1.6.m6.1.1.3.3.cmml" type="integer" 
xref="S2.SS1.SSS1.p1.6.m6.1.1.3.3">1</cn></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.SSS1.p1.6.m6.1c">v_{t}\in\mathbb{R}^{1}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.SSS1.p1.6.m6.1d">italic_v start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT</annotation></semantics></math>. Here, <math alttext="g_{t}^{i}=\left[g_{x}^{i}-p_{x}^{i},g_{y}^{i}-p_{y}^{i}\right]_{t}\in\mathbb{R% }^{2}" class="ltx_Math" display="inline" id="S2.SS1.SSS1.p1.7.m7.2"><semantics id="S2.SS1.SSS1.p1.7.m7.2a"><mrow id="S2.SS1.SSS1.p1.7.m7.2.2" xref="S2.SS1.SSS1.p1.7.m7.2.2.cmml"><msubsup id="S2.SS1.SSS1.p1.7.m7.2.2.4" xref="S2.SS1.SSS1.p1.7.m7.2.2.4.cmml"><mi id="S2.SS1.SSS1.p1.7.m7.2.2.4.2.2" xref="S2.SS1.SSS1.p1.7.m7.2.2.4.2.2.cmml">g</mi><mi id="S2.SS1.SSS1.p1.7.m7.2.2.4.2.3" xref="S2.SS1.SSS1.p1.7.m7.2.2.4.2.3.cmml">t</mi><mi id="S2.SS1.SSS1.p1.7.m7.2.2.4.3" xref="S2.SS1.SSS1.p1.7.m7.2.2.4.3.cmml">i</mi></msubsup><mo id="S2.SS1.SSS1.p1.7.m7.2.2.5" xref="S2.SS1.SSS1.p1.7.m7.2.2.5.cmml">=</mo><msub id="S2.SS1.SSS1.p1.7.m7.2.2.2" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.cmml"><mrow id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.3.cmml"><mo id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.3" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.3.cmml">[</mo><mrow id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.cmml"><msubsup id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.2" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.2.cmml"><mi id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.2.2.2" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.2.2.2.cmml">g</mi><mi id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.2.2.3" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.2.2.3.cmml">x</mi><mi id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.2.3" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.2.3.cmml">i</mi></msubsup><mo id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.1" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.1.cmml">−</mo><msubsup 
id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.3" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.3.cmml"><mi id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.3.2.2" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.3.2.2.cmml">p</mi><mi id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.3.2.3" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.3.2.3.cmml">x</mi><mi id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.3.3" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.3.3.cmml">i</mi></msubsup></mrow><mo id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.4" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.3.cmml">,</mo><mrow id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.cmml"><msubsup id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.2" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.2.cmml"><mi id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.2.2.2" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.2.2.2.cmml">g</mi><mi id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.2.2.3" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.2.2.3.cmml">y</mi><mi id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.2.3" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.2.3.cmml">i</mi></msubsup><mo id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.1" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.1.cmml">−</mo><msubsup id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.3" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.3.cmml"><mi id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.3.2.2" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.3.2.2.cmml">p</mi><mi id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.3.2.3" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.3.2.3.cmml">y</mi><mi id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.3.3" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.3.3.cmml">i</mi></msubsup></mrow><mo id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.5" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.3.cmml">]</mo></mrow><mi id="S2.SS1.SSS1.p1.7.m7.2.2.2.4" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.4.cmml">t</mi></msub><mo id="S2.SS1.SSS1.p1.7.m7.2.2.6" xref="S2.SS1.SSS1.p1.7.m7.2.2.6.cmml">∈</mo><msup id="S2.SS1.SSS1.p1.7.m7.2.2.7" xref="S2.SS1.SSS1.p1.7.m7.2.2.7.cmml"><mi id="S2.SS1.SSS1.p1.7.m7.2.2.7.2" xref="S2.SS1.SSS1.p1.7.m7.2.2.7.2.cmml">ℝ</mi><mn id="S2.SS1.SSS1.p1.7.m7.2.2.7.3" 
xref="S2.SS1.SSS1.p1.7.m7.2.2.7.3.cmml">2</mn></msup></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.SSS1.p1.7.m7.2b"><apply id="S2.SS1.SSS1.p1.7.m7.2.2.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2"><and id="S2.SS1.SSS1.p1.7.m7.2.2a.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2"></and><apply id="S2.SS1.SSS1.p1.7.m7.2.2b.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2"><eq id="S2.SS1.SSS1.p1.7.m7.2.2.5.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.5"></eq><apply id="S2.SS1.SSS1.p1.7.m7.2.2.4.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.4"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.7.m7.2.2.4.1.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.4">superscript</csymbol><apply id="S2.SS1.SSS1.p1.7.m7.2.2.4.2.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.4"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.7.m7.2.2.4.2.1.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.4">subscript</csymbol><ci id="S2.SS1.SSS1.p1.7.m7.2.2.4.2.2.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.4.2.2">𝑔</ci><ci id="S2.SS1.SSS1.p1.7.m7.2.2.4.2.3.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.4.2.3">𝑡</ci></apply><ci id="S2.SS1.SSS1.p1.7.m7.2.2.4.3.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.4.3">𝑖</ci></apply><apply id="S2.SS1.SSS1.p1.7.m7.2.2.2.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.2"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.7.m7.2.2.2.3.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.2">subscript</csymbol><interval closure="closed" id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.3.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2"><apply id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.cmml" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1"><minus id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.1.cmml" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.1"></minus><apply id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.2.cmml" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.2.1.cmml" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.2">superscript</csymbol><apply id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.2.2.cmml" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.2.2.1.cmml" 
xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.2">subscript</csymbol><ci id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.2.2.2.cmml" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.2.2.2">𝑔</ci><ci id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.2.2.3.cmml" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.2.2.3">𝑥</ci></apply><ci id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.2.3.cmml" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.2.3">𝑖</ci></apply><apply id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.3.cmml" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.3.1.cmml" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.3">superscript</csymbol><apply id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.3.2.cmml" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.3.2.1.cmml" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.3">subscript</csymbol><ci id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.3.2.2.cmml" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.3.2.2">𝑝</ci><ci id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.3.2.3.cmml" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.3.2.3">𝑥</ci></apply><ci id="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.3.3.cmml" xref="S2.SS1.SSS1.p1.7.m7.1.1.1.1.1.1.3.3">𝑖</ci></apply></apply><apply id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2"><minus id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.1.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.1"></minus><apply id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.2.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.2"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.2.1.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.2">superscript</csymbol><apply id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.2.2.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.2"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.2.2.1.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.2">subscript</csymbol><ci id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.2.2.2.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.2.2.2">𝑔</ci><ci id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.2.2.3.cmml" 
xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.2.2.3">𝑦</ci></apply><ci id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.2.3.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.2.3">𝑖</ci></apply><apply id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.3.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.3"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.3.1.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.3">superscript</csymbol><apply id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.3.2.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.3"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.3.2.1.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.3">subscript</csymbol><ci id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.3.2.2.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.3.2.2">𝑝</ci><ci id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.3.2.3.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.3.2.3">𝑦</ci></apply><ci id="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.3.3.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.2.2.2.3.3">𝑖</ci></apply></apply></interval><ci id="S2.SS1.SSS1.p1.7.m7.2.2.2.4.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.2.4">𝑡</ci></apply></apply><apply id="S2.SS1.SSS1.p1.7.m7.2.2c.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2"><in id="S2.SS1.SSS1.p1.7.m7.2.2.6.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.6"></in><share href="https://arxiv.org/html/2403.10996v5#S2.SS1.SSS1.p1.7.m7.2.2.2.cmml" id="S2.SS1.SSS1.p1.7.m7.2.2d.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2"></share><apply id="S2.SS1.SSS1.p1.7.m7.2.2.7.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.7"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.7.m7.2.2.7.1.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.7">superscript</csymbol><ci id="S2.SS1.SSS1.p1.7.m7.2.2.7.2.cmml" xref="S2.SS1.SSS1.p1.7.m7.2.2.7.2">ℝ</ci><cn id="S2.SS1.SSS1.p1.7.m7.2.2.7.3.cmml" type="integer" xref="S2.SS1.SSS1.p1.7.m7.2.2.7.3">2</cn></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.SSS1.p1.7.m7.2c">g_{t}^{i}=\left[g_{x}^{i}-p_{x}^{i},g_{y}^{i}-p_{y}^{i}\right]_{t}\in\mathbb{R% }^{2}</annotation><annotation encoding="application/x-llamapun" 
id="S2.SS1.SSS1.p1.7.m7.2d">italic_g start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT = [ italic_g start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT - italic_p start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT , italic_g start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT - italic_p start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ] start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT</annotation></semantics></math> was the ego agent’s goal location relative to itself, <math alttext="\tilde{p}_{t}^{i}=\left[p_{x}^{j}-p_{x}^{i},p_{y}^{j}-p_{y}^{i}\right]_{t}\in% \mathbb{R}^{2(N-1)}" class="ltx_Math" display="inline" id="S2.SS1.SSS1.p1.8.m8.3"><semantics id="S2.SS1.SSS1.p1.8.m8.3a"><mrow id="S2.SS1.SSS1.p1.8.m8.3.3" xref="S2.SS1.SSS1.p1.8.m8.3.3.cmml"><msubsup id="S2.SS1.SSS1.p1.8.m8.3.3.4" xref="S2.SS1.SSS1.p1.8.m8.3.3.4.cmml"><mover accent="true" id="S2.SS1.SSS1.p1.8.m8.3.3.4.2.2" xref="S2.SS1.SSS1.p1.8.m8.3.3.4.2.2.cmml"><mi id="S2.SS1.SSS1.p1.8.m8.3.3.4.2.2.2" xref="S2.SS1.SSS1.p1.8.m8.3.3.4.2.2.2.cmml">p</mi><mo id="S2.SS1.SSS1.p1.8.m8.3.3.4.2.2.1" xref="S2.SS1.SSS1.p1.8.m8.3.3.4.2.2.1.cmml">~</mo></mover><mi id="S2.SS1.SSS1.p1.8.m8.3.3.4.2.3" xref="S2.SS1.SSS1.p1.8.m8.3.3.4.2.3.cmml">t</mi><mi id="S2.SS1.SSS1.p1.8.m8.3.3.4.3" xref="S2.SS1.SSS1.p1.8.m8.3.3.4.3.cmml">i</mi></msubsup><mo id="S2.SS1.SSS1.p1.8.m8.3.3.5" xref="S2.SS1.SSS1.p1.8.m8.3.3.5.cmml">=</mo><msub id="S2.SS1.SSS1.p1.8.m8.3.3.2" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.cmml"><mrow id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.3.cmml"><mo id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.3" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.3.cmml">[</mo><mrow id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1" 
xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.cmml"><msubsup id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.2" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.2.cmml"><mi id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.2.2.2" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.2.2.2.cmml">p</mi><mi id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.2.2.3" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.2.2.3.cmml">x</mi><mi id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.2.3" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.2.3.cmml">j</mi></msubsup><mo id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.1" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.1.cmml">−</mo><msubsup id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.3" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.3.cmml"><mi id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.3.2.2" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.3.2.2.cmml">p</mi><mi id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.3.2.3" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.3.2.3.cmml">x</mi><mi id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.3.3" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.3.3.cmml">i</mi></msubsup></mrow><mo id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.4" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.3.cmml">,</mo><mrow id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.cmml"><msubsup id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.2" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.2.cmml"><mi id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.2.2.2" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.2.2.2.cmml">p</mi><mi id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.2.2.3" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.2.2.3.cmml">y</mi><mi id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.2.3" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.2.3.cmml">j</mi></msubsup><mo id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.1" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.1.cmml">−</mo><msubsup id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.3" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.3.cmml"><mi id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.3.2.2" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.3.2.2.cmml">p</mi><mi id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.3.2.3" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.3.2.3.cmml">y</mi><mi id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.3.3" 
xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.3.3.cmml">i</mi></msubsup></mrow><mo id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.5" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.3.cmml">]</mo></mrow><mi id="S2.SS1.SSS1.p1.8.m8.3.3.2.4" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.4.cmml">t</mi></msub><mo id="S2.SS1.SSS1.p1.8.m8.3.3.6" xref="S2.SS1.SSS1.p1.8.m8.3.3.6.cmml">∈</mo><msup id="S2.SS1.SSS1.p1.8.m8.3.3.7" xref="S2.SS1.SSS1.p1.8.m8.3.3.7.cmml"><mi id="S2.SS1.SSS1.p1.8.m8.3.3.7.2" xref="S2.SS1.SSS1.p1.8.m8.3.3.7.2.cmml">ℝ</mi><mrow id="S2.SS1.SSS1.p1.8.m8.1.1.1" xref="S2.SS1.SSS1.p1.8.m8.1.1.1.cmml"><mn id="S2.SS1.SSS1.p1.8.m8.1.1.1.3" xref="S2.SS1.SSS1.p1.8.m8.1.1.1.3.cmml">2</mn><mo id="S2.SS1.SSS1.p1.8.m8.1.1.1.2" xref="S2.SS1.SSS1.p1.8.m8.1.1.1.2.cmml">⁢</mo><mrow id="S2.SS1.SSS1.p1.8.m8.1.1.1.1.1" xref="S2.SS1.SSS1.p1.8.m8.1.1.1.1.1.1.cmml"><mo id="S2.SS1.SSS1.p1.8.m8.1.1.1.1.1.2" stretchy="false" xref="S2.SS1.SSS1.p1.8.m8.1.1.1.1.1.1.cmml">(</mo><mrow id="S2.SS1.SSS1.p1.8.m8.1.1.1.1.1.1" xref="S2.SS1.SSS1.p1.8.m8.1.1.1.1.1.1.cmml"><mi id="S2.SS1.SSS1.p1.8.m8.1.1.1.1.1.1.2" xref="S2.SS1.SSS1.p1.8.m8.1.1.1.1.1.1.2.cmml">N</mi><mo id="S2.SS1.SSS1.p1.8.m8.1.1.1.1.1.1.1" xref="S2.SS1.SSS1.p1.8.m8.1.1.1.1.1.1.1.cmml">−</mo><mn id="S2.SS1.SSS1.p1.8.m8.1.1.1.1.1.1.3" xref="S2.SS1.SSS1.p1.8.m8.1.1.1.1.1.1.3.cmml">1</mn></mrow><mo id="S2.SS1.SSS1.p1.8.m8.1.1.1.1.1.3" stretchy="false" xref="S2.SS1.SSS1.p1.8.m8.1.1.1.1.1.1.cmml">)</mo></mrow></mrow></msup></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.SSS1.p1.8.m8.3b"><apply id="S2.SS1.SSS1.p1.8.m8.3.3.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3"><and id="S2.SS1.SSS1.p1.8.m8.3.3a.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3"></and><apply id="S2.SS1.SSS1.p1.8.m8.3.3b.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3"><eq id="S2.SS1.SSS1.p1.8.m8.3.3.5.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.5"></eq><apply id="S2.SS1.SSS1.p1.8.m8.3.3.4.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.4"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.8.m8.3.3.4.1.cmml" 
xref="S2.SS1.SSS1.p1.8.m8.3.3.4">superscript</csymbol><apply id="S2.SS1.SSS1.p1.8.m8.3.3.4.2.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.4"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.8.m8.3.3.4.2.1.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.4">subscript</csymbol><apply id="S2.SS1.SSS1.p1.8.m8.3.3.4.2.2.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.4.2.2"><ci id="S2.SS1.SSS1.p1.8.m8.3.3.4.2.2.1.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.4.2.2.1">~</ci><ci id="S2.SS1.SSS1.p1.8.m8.3.3.4.2.2.2.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.4.2.2.2">𝑝</ci></apply><ci id="S2.SS1.SSS1.p1.8.m8.3.3.4.2.3.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.4.2.3">𝑡</ci></apply><ci id="S2.SS1.SSS1.p1.8.m8.3.3.4.3.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.4.3">𝑖</ci></apply><apply id="S2.SS1.SSS1.p1.8.m8.3.3.2.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.2"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.8.m8.3.3.2.3.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.2">subscript</csymbol><interval closure="closed" id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.3.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2"><apply id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.cmml" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1"><minus id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.1.cmml" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.1"></minus><apply id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.2.cmml" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.2"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.2.1.cmml" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.2">superscript</csymbol><apply id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.2.2.cmml" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.2"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.2.2.1.cmml" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.2">subscript</csymbol><ci id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.2.2.2.cmml" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.2.2.2">𝑝</ci><ci id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.2.2.3.cmml" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.2.2.3">𝑥</ci></apply><ci id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.2.3.cmml" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.2.3">𝑗</ci></apply><apply 
id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.3.cmml" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.3"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.3.1.cmml" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.3">superscript</csymbol><apply id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.3.2.cmml" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.3"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.3.2.1.cmml" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.3">subscript</csymbol><ci id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.3.2.2.cmml" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.3.2.2">𝑝</ci><ci id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.3.2.3.cmml" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.3.2.3">𝑥</ci></apply><ci id="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.3.3.cmml" xref="S2.SS1.SSS1.p1.8.m8.2.2.1.1.1.1.3.3">𝑖</ci></apply></apply><apply id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2"><minus id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.1.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.1"></minus><apply id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.2.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.2"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.2.1.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.2">superscript</csymbol><apply id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.2.2.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.2"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.2.2.1.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.2">subscript</csymbol><ci id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.2.2.2.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.2.2.2">𝑝</ci><ci id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.2.2.3.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.2.2.3">𝑦</ci></apply><ci id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.2.3.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.2.3">𝑗</ci></apply><apply id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.3.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.3"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.3.1.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.3">superscript</csymbol><apply 
id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.3.2.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.3"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.3.2.1.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.3">subscript</csymbol><ci id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.3.2.2.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.3.2.2">𝑝</ci><ci id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.3.2.3.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.3.2.3">𝑦</ci></apply><ci id="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.3.3.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.2.2.2.3.3">𝑖</ci></apply></apply></interval><ci id="S2.SS1.SSS1.p1.8.m8.3.3.2.4.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.2.4">𝑡</ci></apply></apply><apply id="S2.SS1.SSS1.p1.8.m8.3.3c.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3"><in id="S2.SS1.SSS1.p1.8.m8.3.3.6.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.6"></in><share href="https://arxiv.org/html/2403.10996v5#S2.SS1.SSS1.p1.8.m8.3.3.2.cmml" id="S2.SS1.SSS1.p1.8.m8.3.3d.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3"></share><apply id="S2.SS1.SSS1.p1.8.m8.3.3.7.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.7"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.8.m8.3.3.7.1.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.7">superscript</csymbol><ci id="S2.SS1.SSS1.p1.8.m8.3.3.7.2.cmml" xref="S2.SS1.SSS1.p1.8.m8.3.3.7.2">ℝ</ci><apply id="S2.SS1.SSS1.p1.8.m8.1.1.1.cmml" xref="S2.SS1.SSS1.p1.8.m8.1.1.1"><times id="S2.SS1.SSS1.p1.8.m8.1.1.1.2.cmml" xref="S2.SS1.SSS1.p1.8.m8.1.1.1.2"></times><cn id="S2.SS1.SSS1.p1.8.m8.1.1.1.3.cmml" type="integer" xref="S2.SS1.SSS1.p1.8.m8.1.1.1.3">2</cn><apply id="S2.SS1.SSS1.p1.8.m8.1.1.1.1.1.1.cmml" xref="S2.SS1.SSS1.p1.8.m8.1.1.1.1.1"><minus id="S2.SS1.SSS1.p1.8.m8.1.1.1.1.1.1.1.cmml" xref="S2.SS1.SSS1.p1.8.m8.1.1.1.1.1.1.1"></minus><ci id="S2.SS1.SSS1.p1.8.m8.1.1.1.1.1.1.2.cmml" xref="S2.SS1.SSS1.p1.8.m8.1.1.1.1.1.1.2">𝑁</ci><cn id="S2.SS1.SSS1.p1.8.m8.1.1.1.1.1.1.3.cmml" type="integer" xref="S2.SS1.SSS1.p1.8.m8.1.1.1.1.1.1.3">1</cn></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" 
id="S2.SS1.SSS1.p1.8.m8.3c">\tilde{p}_{t}^{i}=\left[p_{x}^{j}-p_{x}^{i},p_{y}^{j}-p_{y}^{i}\right]_{t}\in% \mathbb{R}^{2(N-1)}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.SSS1.p1.8.m8.3d">over~ start_ARG italic_p end_ARG start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT = [ italic_p start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_j end_POSTSUPERSCRIPT - italic_p start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT , italic_p start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_j end_POSTSUPERSCRIPT - italic_p start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ] start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT 2 ( italic_N - 1 ) end_POSTSUPERSCRIPT</annotation></semantics></math> was the position of every peer agent relative to the ego agent, <math alttext="\tilde{\psi}_{t}^{i}=\psi_{t}^{j}-\psi_{t}^{i}\in\mathbb{R}^{N-1}" class="ltx_Math" display="inline" id="S2.SS1.SSS1.p1.9.m9.1"><semantics id="S2.SS1.SSS1.p1.9.m9.1a"><mrow id="S2.SS1.SSS1.p1.9.m9.1.1" xref="S2.SS1.SSS1.p1.9.m9.1.1.cmml"><msubsup id="S2.SS1.SSS1.p1.9.m9.1.1.2" xref="S2.SS1.SSS1.p1.9.m9.1.1.2.cmml"><mover accent="true" id="S2.SS1.SSS1.p1.9.m9.1.1.2.2.2" xref="S2.SS1.SSS1.p1.9.m9.1.1.2.2.2.cmml"><mi id="S2.SS1.SSS1.p1.9.m9.1.1.2.2.2.2" xref="S2.SS1.SSS1.p1.9.m9.1.1.2.2.2.2.cmml">ψ</mi><mo id="S2.SS1.SSS1.p1.9.m9.1.1.2.2.2.1" xref="S2.SS1.SSS1.p1.9.m9.1.1.2.2.2.1.cmml">~</mo></mover><mi id="S2.SS1.SSS1.p1.9.m9.1.1.2.2.3" xref="S2.SS1.SSS1.p1.9.m9.1.1.2.2.3.cmml">t</mi><mi id="S2.SS1.SSS1.p1.9.m9.1.1.2.3" xref="S2.SS1.SSS1.p1.9.m9.1.1.2.3.cmml">i</mi></msubsup><mo id="S2.SS1.SSS1.p1.9.m9.1.1.3" xref="S2.SS1.SSS1.p1.9.m9.1.1.3.cmml">=</mo><mrow id="S2.SS1.SSS1.p1.9.m9.1.1.4" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.cmml"><msubsup id="S2.SS1.SSS1.p1.9.m9.1.1.4.2" 
xref="S2.SS1.SSS1.p1.9.m9.1.1.4.2.cmml"><mi id="S2.SS1.SSS1.p1.9.m9.1.1.4.2.2.2" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.2.2.2.cmml">ψ</mi><mi id="S2.SS1.SSS1.p1.9.m9.1.1.4.2.2.3" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.2.2.3.cmml">t</mi><mi id="S2.SS1.SSS1.p1.9.m9.1.1.4.2.3" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.2.3.cmml">j</mi></msubsup><mo id="S2.SS1.SSS1.p1.9.m9.1.1.4.1" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.1.cmml">−</mo><msubsup id="S2.SS1.SSS1.p1.9.m9.1.1.4.3" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.3.cmml"><mi id="S2.SS1.SSS1.p1.9.m9.1.1.4.3.2.2" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.3.2.2.cmml">ψ</mi><mi id="S2.SS1.SSS1.p1.9.m9.1.1.4.3.2.3" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.3.2.3.cmml">t</mi><mi id="S2.SS1.SSS1.p1.9.m9.1.1.4.3.3" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.3.3.cmml">i</mi></msubsup></mrow><mo id="S2.SS1.SSS1.p1.9.m9.1.1.5" xref="S2.SS1.SSS1.p1.9.m9.1.1.5.cmml">∈</mo><msup id="S2.SS1.SSS1.p1.9.m9.1.1.6" xref="S2.SS1.SSS1.p1.9.m9.1.1.6.cmml"><mi id="S2.SS1.SSS1.p1.9.m9.1.1.6.2" xref="S2.SS1.SSS1.p1.9.m9.1.1.6.2.cmml">ℝ</mi><mrow id="S2.SS1.SSS1.p1.9.m9.1.1.6.3" xref="S2.SS1.SSS1.p1.9.m9.1.1.6.3.cmml"><mi id="S2.SS1.SSS1.p1.9.m9.1.1.6.3.2" xref="S2.SS1.SSS1.p1.9.m9.1.1.6.3.2.cmml">N</mi><mo id="S2.SS1.SSS1.p1.9.m9.1.1.6.3.1" xref="S2.SS1.SSS1.p1.9.m9.1.1.6.3.1.cmml">−</mo><mn id="S2.SS1.SSS1.p1.9.m9.1.1.6.3.3" xref="S2.SS1.SSS1.p1.9.m9.1.1.6.3.3.cmml">1</mn></mrow></msup></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.SSS1.p1.9.m9.1b"><apply id="S2.SS1.SSS1.p1.9.m9.1.1.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1"><and id="S2.SS1.SSS1.p1.9.m9.1.1a.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1"></and><apply id="S2.SS1.SSS1.p1.9.m9.1.1b.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1"><eq id="S2.SS1.SSS1.p1.9.m9.1.1.3.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.3"></eq><apply id="S2.SS1.SSS1.p1.9.m9.1.1.2.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.2"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.9.m9.1.1.2.1.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.2">superscript</csymbol><apply id="S2.SS1.SSS1.p1.9.m9.1.1.2.2.cmml" 
xref="S2.SS1.SSS1.p1.9.m9.1.1.2"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.9.m9.1.1.2.2.1.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.2">subscript</csymbol><apply id="S2.SS1.SSS1.p1.9.m9.1.1.2.2.2.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.2.2.2"><ci id="S2.SS1.SSS1.p1.9.m9.1.1.2.2.2.1.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.2.2.2.1">~</ci><ci id="S2.SS1.SSS1.p1.9.m9.1.1.2.2.2.2.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.2.2.2.2">𝜓</ci></apply><ci id="S2.SS1.SSS1.p1.9.m9.1.1.2.2.3.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.2.2.3">𝑡</ci></apply><ci id="S2.SS1.SSS1.p1.9.m9.1.1.2.3.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.2.3">𝑖</ci></apply><apply id="S2.SS1.SSS1.p1.9.m9.1.1.4.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.4"><minus id="S2.SS1.SSS1.p1.9.m9.1.1.4.1.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.1"></minus><apply id="S2.SS1.SSS1.p1.9.m9.1.1.4.2.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.2"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.9.m9.1.1.4.2.1.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.2">superscript</csymbol><apply id="S2.SS1.SSS1.p1.9.m9.1.1.4.2.2.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.2"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.9.m9.1.1.4.2.2.1.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.2">subscript</csymbol><ci id="S2.SS1.SSS1.p1.9.m9.1.1.4.2.2.2.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.2.2.2">𝜓</ci><ci id="S2.SS1.SSS1.p1.9.m9.1.1.4.2.2.3.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.2.2.3">𝑡</ci></apply><ci id="S2.SS1.SSS1.p1.9.m9.1.1.4.2.3.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.2.3">𝑗</ci></apply><apply id="S2.SS1.SSS1.p1.9.m9.1.1.4.3.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.3"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.9.m9.1.1.4.3.1.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.3">superscript</csymbol><apply id="S2.SS1.SSS1.p1.9.m9.1.1.4.3.2.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.3"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.9.m9.1.1.4.3.2.1.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.3">subscript</csymbol><ci id="S2.SS1.SSS1.p1.9.m9.1.1.4.3.2.2.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.3.2.2">𝜓</ci><ci id="S2.SS1.SSS1.p1.9.m9.1.1.4.3.2.3.cmml" 
xref="S2.SS1.SSS1.p1.9.m9.1.1.4.3.2.3">𝑡</ci></apply><ci id="S2.SS1.SSS1.p1.9.m9.1.1.4.3.3.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.4.3.3">𝑖</ci></apply></apply></apply><apply id="S2.SS1.SSS1.p1.9.m9.1.1c.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1"><in id="S2.SS1.SSS1.p1.9.m9.1.1.5.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.5"></in><share href="https://arxiv.org/html/2403.10996v5#S2.SS1.SSS1.p1.9.m9.1.1.4.cmml" id="S2.SS1.SSS1.p1.9.m9.1.1d.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1"></share><apply id="S2.SS1.SSS1.p1.9.m9.1.1.6.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.6"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.9.m9.1.1.6.1.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.6">superscript</csymbol><ci id="S2.SS1.SSS1.p1.9.m9.1.1.6.2.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.6.2">ℝ</ci><apply id="S2.SS1.SSS1.p1.9.m9.1.1.6.3.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.6.3"><minus id="S2.SS1.SSS1.p1.9.m9.1.1.6.3.1.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.6.3.1"></minus><ci id="S2.SS1.SSS1.p1.9.m9.1.1.6.3.2.cmml" xref="S2.SS1.SSS1.p1.9.m9.1.1.6.3.2">𝑁</ci><cn id="S2.SS1.SSS1.p1.9.m9.1.1.6.3.3.cmml" type="integer" xref="S2.SS1.SSS1.p1.9.m9.1.1.6.3.3">1</cn></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.SSS1.p1.9.m9.1c">\tilde{\psi}_{t}^{i}=\psi_{t}^{j}-\psi_{t}^{i}\in\mathbb{R}^{N-1}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.SSS1.p1.9.m9.1d">over~ start_ARG italic_ψ end_ARG start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT = italic_ψ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_j end_POSTSUPERSCRIPT - italic_ψ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT italic_N - 1 end_POSTSUPERSCRIPT</annotation></semantics></math> was the yaw of every peer agent relative to the ego agent, and <math alttext="\tilde{v}_{t}^{i}=v_{t}^{j}\in\mathbb{R}^{N-1}" class="ltx_Math" display="inline" 
id="S2.SS1.SSS1.p1.10.m10.1"><semantics id="S2.SS1.SSS1.p1.10.m10.1a"><mrow id="S2.SS1.SSS1.p1.10.m10.1.1" xref="S2.SS1.SSS1.p1.10.m10.1.1.cmml"><msubsup id="S2.SS1.SSS1.p1.10.m10.1.1.2" xref="S2.SS1.SSS1.p1.10.m10.1.1.2.cmml"><mover accent="true" id="S2.SS1.SSS1.p1.10.m10.1.1.2.2.2" xref="S2.SS1.SSS1.p1.10.m10.1.1.2.2.2.cmml"><mi id="S2.SS1.SSS1.p1.10.m10.1.1.2.2.2.2" xref="S2.SS1.SSS1.p1.10.m10.1.1.2.2.2.2.cmml">v</mi><mo id="S2.SS1.SSS1.p1.10.m10.1.1.2.2.2.1" xref="S2.SS1.SSS1.p1.10.m10.1.1.2.2.2.1.cmml">~</mo></mover><mi id="S2.SS1.SSS1.p1.10.m10.1.1.2.2.3" xref="S2.SS1.SSS1.p1.10.m10.1.1.2.2.3.cmml">t</mi><mi id="S2.SS1.SSS1.p1.10.m10.1.1.2.3" xref="S2.SS1.SSS1.p1.10.m10.1.1.2.3.cmml">i</mi></msubsup><mo id="S2.SS1.SSS1.p1.10.m10.1.1.3" xref="S2.SS1.SSS1.p1.10.m10.1.1.3.cmml">=</mo><msubsup id="S2.SS1.SSS1.p1.10.m10.1.1.4" xref="S2.SS1.SSS1.p1.10.m10.1.1.4.cmml"><mi id="S2.SS1.SSS1.p1.10.m10.1.1.4.2.2" xref="S2.SS1.SSS1.p1.10.m10.1.1.4.2.2.cmml">v</mi><mi id="S2.SS1.SSS1.p1.10.m10.1.1.4.2.3" xref="S2.SS1.SSS1.p1.10.m10.1.1.4.2.3.cmml">t</mi><mi id="S2.SS1.SSS1.p1.10.m10.1.1.4.3" xref="S2.SS1.SSS1.p1.10.m10.1.1.4.3.cmml">j</mi></msubsup><mo id="S2.SS1.SSS1.p1.10.m10.1.1.5" xref="S2.SS1.SSS1.p1.10.m10.1.1.5.cmml">∈</mo><msup id="S2.SS1.SSS1.p1.10.m10.1.1.6" xref="S2.SS1.SSS1.p1.10.m10.1.1.6.cmml"><mi id="S2.SS1.SSS1.p1.10.m10.1.1.6.2" xref="S2.SS1.SSS1.p1.10.m10.1.1.6.2.cmml">ℝ</mi><mrow id="S2.SS1.SSS1.p1.10.m10.1.1.6.3" xref="S2.SS1.SSS1.p1.10.m10.1.1.6.3.cmml"><mi id="S2.SS1.SSS1.p1.10.m10.1.1.6.3.2" xref="S2.SS1.SSS1.p1.10.m10.1.1.6.3.2.cmml">N</mi><mo id="S2.SS1.SSS1.p1.10.m10.1.1.6.3.1" xref="S2.SS1.SSS1.p1.10.m10.1.1.6.3.1.cmml">−</mo><mn id="S2.SS1.SSS1.p1.10.m10.1.1.6.3.3" xref="S2.SS1.SSS1.p1.10.m10.1.1.6.3.3.cmml">1</mn></mrow></msup></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.SSS1.p1.10.m10.1b"><apply id="S2.SS1.SSS1.p1.10.m10.1.1.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1"><and id="S2.SS1.SSS1.p1.10.m10.1.1a.cmml" 
xref="S2.SS1.SSS1.p1.10.m10.1.1"></and><apply id="S2.SS1.SSS1.p1.10.m10.1.1b.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1"><eq id="S2.SS1.SSS1.p1.10.m10.1.1.3.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.3"></eq><apply id="S2.SS1.SSS1.p1.10.m10.1.1.2.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.2"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.10.m10.1.1.2.1.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.2">superscript</csymbol><apply id="S2.SS1.SSS1.p1.10.m10.1.1.2.2.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.2"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.10.m10.1.1.2.2.1.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.2">subscript</csymbol><apply id="S2.SS1.SSS1.p1.10.m10.1.1.2.2.2.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.2.2.2"><ci id="S2.SS1.SSS1.p1.10.m10.1.1.2.2.2.1.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.2.2.2.1">~</ci><ci id="S2.SS1.SSS1.p1.10.m10.1.1.2.2.2.2.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.2.2.2.2">𝑣</ci></apply><ci id="S2.SS1.SSS1.p1.10.m10.1.1.2.2.3.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.2.2.3">𝑡</ci></apply><ci id="S2.SS1.SSS1.p1.10.m10.1.1.2.3.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.2.3">𝑖</ci></apply><apply id="S2.SS1.SSS1.p1.10.m10.1.1.4.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.4"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.10.m10.1.1.4.1.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.4">superscript</csymbol><apply id="S2.SS1.SSS1.p1.10.m10.1.1.4.2.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.4"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.10.m10.1.1.4.2.1.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.4">subscript</csymbol><ci id="S2.SS1.SSS1.p1.10.m10.1.1.4.2.2.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.4.2.2">𝑣</ci><ci id="S2.SS1.SSS1.p1.10.m10.1.1.4.2.3.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.4.2.3">𝑡</ci></apply><ci id="S2.SS1.SSS1.p1.10.m10.1.1.4.3.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.4.3">𝑗</ci></apply></apply><apply id="S2.SS1.SSS1.p1.10.m10.1.1c.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1"><in id="S2.SS1.SSS1.p1.10.m10.1.1.5.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.5"></in><share 
href="https://arxiv.org/html/2403.10996v5#S2.SS1.SSS1.p1.10.m10.1.1.4.cmml" id="S2.SS1.SSS1.p1.10.m10.1.1d.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1"></share><apply id="S2.SS1.SSS1.p1.10.m10.1.1.6.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.6"><csymbol cd="ambiguous" id="S2.SS1.SSS1.p1.10.m10.1.1.6.1.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.6">superscript</csymbol><ci id="S2.SS1.SSS1.p1.10.m10.1.1.6.2.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.6.2">ℝ</ci><apply id="S2.SS1.SSS1.p1.10.m10.1.1.6.3.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.6.3"><minus id="S2.SS1.SSS1.p1.10.m10.1.1.6.3.1.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.6.3.1"></minus><ci id="S2.SS1.SSS1.p1.10.m10.1.1.6.3.2.cmml" xref="S2.SS1.SSS1.p1.10.m10.1.1.6.3.2">𝑁</ci><cn id="S2.SS1.SSS1.p1.10.m10.1.1.6.3.3.cmml" type="integer" xref="S2.SS1.SSS1.p1.10.m10.1.1.6.3.3">1</cn></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.SSS1.p1.10.m10.1c">\tilde{v}_{t}^{i}=v_{t}^{j}\in\mathbb{R}^{N-1}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.SSS1.p1.10.m10.1d">over~ start_ARG italic_v end_ARG start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT = italic_v start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_j end_POSTSUPERSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT italic_N - 1 end_POSTSUPERSCRIPT</annotation></semantics></math> was the velocity of every peer agent. 
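</p> <p class="ltx_p">As a rough illustration, the observation components defined above could be assembled as in the following sketch. It is not the authors' implementation: the function name <code>build_observation</code>, its arguments, and the concatenation order (relative goal, relative peer positions, relative peer yaws, peer velocities) are assumptions made for this example. The relative yaw is kept as a plain difference without angle wrapping, matching the definition above.</p>

```python
def build_observation(ego_index, positions, yaws, speeds, goal):
    """Assemble one agent's observation vector (hypothetical helper).

    positions: list of (x, y) tuples, one per agent
    yaws:      list of yaw angles in radians, one per agent
    speeds:    list of scalar velocities, one per agent
    goal:      (x, y) goal location of the ego agent
    """
    ego_x, ego_y = positions[ego_index]

    # Goal location relative to the ego agent (2 values).
    rel_goal = [goal[0] - ego_x, goal[1] - ego_y]

    rel_pos, rel_yaw, peer_speed = [], [], []
    for j in range(len(positions)):
        if j == ego_index:
            continue  # peers only: skip the ego agent itself
        # Peer position relative to the ego agent (2 values per peer).
        rel_pos += [positions[j][0] - ego_x, positions[j][1] - ego_y]
        # Peer yaw relative to the ego agent (1 value per peer).
        rel_yaw.append(yaws[j] - yaws[ego_index])
        # Peer velocity, taken as-is (1 value per peer).
        peer_speed.append(speeds[j])

    # Assumed ordering: goal, positions, yaws, velocities.
    return rel_goal + rel_pos + rel_yaw + peer_speed


if __name__ == "__main__":
    obs = build_observation(
        ego_index=0,
        positions=[(0.0, 0.0), (1.0, 2.0), (-3.0, 4.0)],
        yaws=[0.5, 1.0, -0.5],
        speeds=[0.1, 0.2, 0.3],
        goal=(5.0, 5.0),
    )
    print(len(obs), obs)
```

<p class="ltx_p">Under these assumptions, an environment with N agents yields 2 + 4(N-1) values per agent, which follows from the dimensions of the four components.</p> <p class="ltx_p">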
Here, <math alttext="i" class="ltx_Math" display="inline" id="S2.SS1.SSS1.p1.11.m11.1"><semantics id="S2.SS1.SSS1.p1.11.m11.1a"><mi id="S2.SS1.SSS1.p1.11.m11.1.1" xref="S2.SS1.SSS1.p1.11.m11.1.1.cmml">i</mi><annotation-xml encoding="MathML-Content" id="S2.SS1.SSS1.p1.11.m11.1b"><ci id="S2.SS1.SSS1.p1.11.m11.1.1.cmml" xref="S2.SS1.SSS1.p1.11.m11.1.1">𝑖</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.SSS1.p1.11.m11.1c">i</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.SSS1.p1.11.m11.1d">italic_i</annotation></semantics></math> represents the ego agent and <math alttext="j\in\left[0,N-1\right]" class="ltx_Math" display="inline" id="S2.SS1.SSS1.p1.12.m12.2"><semantics id="S2.SS1.SSS1.p1.12.m12.2a"><mrow id="S2.SS1.SSS1.p1.12.m12.2.2" xref="S2.SS1.SSS1.p1.12.m12.2.2.cmml"><mi id="S2.SS1.SSS1.p1.12.m12.2.2.3" xref="S2.SS1.SSS1.p1.12.m12.2.2.3.cmml">j</mi><mo id="S2.SS1.SSS1.p1.12.m12.2.2.2" xref="S2.SS1.SSS1.p1.12.m12.2.2.2.cmml">∈</mo><mrow id="S2.SS1.SSS1.p1.12.m12.2.2.1.1" xref="S2.SS1.SSS1.p1.12.m12.2.2.1.2.cmml"><mo id="S2.SS1.SSS1.p1.12.m12.2.2.1.1.2" xref="S2.SS1.SSS1.p1.12.m12.2.2.1.2.cmml">[</mo><mn id="S2.SS1.SSS1.p1.12.m12.1.1" xref="S2.SS1.SSS1.p1.12.m12.1.1.cmml">0</mn><mo id="S2.SS1.SSS1.p1.12.m12.2.2.1.1.3" xref="S2.SS1.SSS1.p1.12.m12.2.2.1.2.cmml">,</mo><mrow id="S2.SS1.SSS1.p1.12.m12.2.2.1.1.1" xref="S2.SS1.SSS1.p1.12.m12.2.2.1.1.1.cmml"><mi id="S2.SS1.SSS1.p1.12.m12.2.2.1.1.1.2" xref="S2.SS1.SSS1.p1.12.m12.2.2.1.1.1.2.cmml">N</mi><mo id="S2.SS1.SSS1.p1.12.m12.2.2.1.1.1.1" xref="S2.SS1.SSS1.p1.12.m12.2.2.1.1.1.1.cmml">−</mo><mn id="S2.SS1.SSS1.p1.12.m12.2.2.1.1.1.3" xref="S2.SS1.SSS1.p1.12.m12.2.2.1.1.1.3.cmml">1</mn></mrow><mo id="S2.SS1.SSS1.p1.12.m12.2.2.1.1.4" xref="S2.SS1.SSS1.p1.12.m12.2.2.1.2.cmml">]</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.SSS1.p1.12.m12.2b"><apply id="S2.SS1.SSS1.p1.12.m12.2.2.cmml" xref="S2.SS1.SSS1.p1.12.m12.2.2"><in 
id="S2.SS1.SSS1.p1.12.m12.2.2.2.cmml" xref="S2.SS1.SSS1.p1.12.m12.2.2.2"></in><ci id="S2.SS1.SSS1.p1.12.m12.2.2.3.cmml" xref="S2.SS1.SSS1.p1.12.m12.2.2.3">𝑗</ci><interval closure="closed" id="S2.SS1.SSS1.p1.12.m12.2.2.1.2.cmml" xref="S2.SS1.SSS1.p1.12.m12.2.2.1.1"><cn id="S2.SS1.SSS1.p1.12.m12.1.1.cmml" type="integer" xref="S2.SS1.SSS1.p1.12.m12.1.1">0</cn><apply id="S2.SS1.SSS1.p1.12.m12.2.2.1.1.1.cmml" xref="S2.SS1.SSS1.p1.12.m12.2.2.1.1.1"><minus id="S2.SS1.SSS1.p1.12.m12.2.2.1.1.1.1.cmml" xref="S2.SS1.SSS1.p1.12.m12.2.2.1.1.1.1"></minus><ci id="S2.SS1.SSS1.p1.12.m12.2.2.1.1.1.2.cmml" xref="S2.SS1.SSS1.p1.12.m12.2.2.1.1.1.2">𝑁</ci><cn id="S2.SS1.SSS1.p1.12.m12.2.2.1.1.1.3.cmml" type="integer" xref="S2.SS1.SSS1.p1.12.m12.2.2.1.1.1.3">1</cn></apply></interval></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.SSS1.p1.12.m12.2c">j\in\left[0,N-1\right]</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.SSS1.p1.12.m12.2d">italic_j ∈ [ 0 , italic_N - 1 ]</annotation></semantics></math> represents every other (peer) agent.</p> </div> </section> <section class="ltx_subsubsection" id="S2.SS1.SSS2"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection"><span class="ltx_text" id="S2.SS1.SSS2.5.1.1">II-A</span>2 </span>Action Space</h4> <div class="ltx_para" id="S2.SS1.SSS2.p1"> <p class="ltx_p" id="S2.SS1.SSS2.p1.1">The Ackermann-steered vehicles were controlled using throttle and steering commands: <math alttext="a_{t}^{i}=\left[\tau_{t}^{i},\delta_{t}^{i}\right]\in\mathbb{R}^{2}" class="ltx_Math" display="inline" id="S2.SS1.SSS2.p1.1.m1.2"><semantics id="S2.SS1.SSS2.p1.1.m1.2a"><mrow id="S2.SS1.SSS2.p1.1.m1.2.2" xref="S2.SS1.SSS2.p1.1.m1.2.2.cmml"><msubsup id="S2.SS1.SSS2.p1.1.m1.2.2.4" xref="S2.SS1.SSS2.p1.1.m1.2.2.4.cmml"><mi id="S2.SS1.SSS2.p1.1.m1.2.2.4.2.2" xref="S2.SS1.SSS2.p1.1.m1.2.2.4.2.2.cmml">a</mi><mi id="S2.SS1.SSS2.p1.1.m1.2.2.4.2.3" 
xref="S2.SS1.SSS2.p1.1.m1.2.2.4.2.3.cmml">t</mi><mi id="S2.SS1.SSS2.p1.1.m1.2.2.4.3" xref="S2.SS1.SSS2.p1.1.m1.2.2.4.3.cmml">i</mi></msubsup><mo id="S2.SS1.SSS2.p1.1.m1.2.2.5" xref="S2.SS1.SSS2.p1.1.m1.2.2.5.cmml">=</mo><mrow id="S2.SS1.SSS2.p1.1.m1.2.2.2.2" xref="S2.SS1.SSS2.p1.1.m1.2.2.2.3.cmml"><mo id="S2.SS1.SSS2.p1.1.m1.2.2.2.2.3" xref="S2.SS1.SSS2.p1.1.m1.2.2.2.3.cmml">[</mo><msubsup id="S2.SS1.SSS2.p1.1.m1.1.1.1.1.1" xref="S2.SS1.SSS2.p1.1.m1.1.1.1.1.1.cmml"><mi id="S2.SS1.SSS2.p1.1.m1.1.1.1.1.1.2.2" xref="S2.SS1.SSS2.p1.1.m1.1.1.1.1.1.2.2.cmml">τ</mi><mi id="S2.SS1.SSS2.p1.1.m1.1.1.1.1.1.2.3" xref="S2.SS1.SSS2.p1.1.m1.1.1.1.1.1.2.3.cmml">t</mi><mi id="S2.SS1.SSS2.p1.1.m1.1.1.1.1.1.3" xref="S2.SS1.SSS2.p1.1.m1.1.1.1.1.1.3.cmml">i</mi></msubsup><mo id="S2.SS1.SSS2.p1.1.m1.2.2.2.2.4" xref="S2.SS1.SSS2.p1.1.m1.2.2.2.3.cmml">,</mo><msubsup id="S2.SS1.SSS2.p1.1.m1.2.2.2.2.2" xref="S2.SS1.SSS2.p1.1.m1.2.2.2.2.2.cmml"><mi id="S2.SS1.SSS2.p1.1.m1.2.2.2.2.2.2.2" xref="S2.SS1.SSS2.p1.1.m1.2.2.2.2.2.2.2.cmml">δ</mi><mi id="S2.SS1.SSS2.p1.1.m1.2.2.2.2.2.2.3" xref="S2.SS1.SSS2.p1.1.m1.2.2.2.2.2.2.3.cmml">t</mi><mi id="S2.SS1.SSS2.p1.1.m1.2.2.2.2.2.3" xref="S2.SS1.SSS2.p1.1.m1.2.2.2.2.2.3.cmml">i</mi></msubsup><mo id="S2.SS1.SSS2.p1.1.m1.2.2.2.2.5" xref="S2.SS1.SSS2.p1.1.m1.2.2.2.3.cmml">]</mo></mrow><mo id="S2.SS1.SSS2.p1.1.m1.2.2.6" xref="S2.SS1.SSS2.p1.1.m1.2.2.6.cmml">∈</mo><msup id="S2.SS1.SSS2.p1.1.m1.2.2.7" xref="S2.SS1.SSS2.p1.1.m1.2.2.7.cmml"><mi id="S2.SS1.SSS2.p1.1.m1.2.2.7.2" xref="S2.SS1.SSS2.p1.1.m1.2.2.7.2.cmml">ℝ</mi><mn id="S2.SS1.SSS2.p1.1.m1.2.2.7.3" xref="S2.SS1.SSS2.p1.1.m1.2.2.7.3.cmml">2</mn></msup></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.SSS2.p1.1.m1.2b"><apply id="S2.SS1.SSS2.p1.1.m1.2.2.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2"><and id="S2.SS1.SSS2.p1.1.m1.2.2a.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2"></and><apply id="S2.SS1.SSS2.p1.1.m1.2.2b.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2"><eq id="S2.SS1.SSS2.p1.1.m1.2.2.5.cmml" 
xref="S2.SS1.SSS2.p1.1.m1.2.2.5"></eq><apply id="S2.SS1.SSS2.p1.1.m1.2.2.4.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2.4"><csymbol cd="ambiguous" id="S2.SS1.SSS2.p1.1.m1.2.2.4.1.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2.4">superscript</csymbol><apply id="S2.SS1.SSS2.p1.1.m1.2.2.4.2.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2.4"><csymbol cd="ambiguous" id="S2.SS1.SSS2.p1.1.m1.2.2.4.2.1.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2.4">subscript</csymbol><ci id="S2.SS1.SSS2.p1.1.m1.2.2.4.2.2.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2.4.2.2">𝑎</ci><ci id="S2.SS1.SSS2.p1.1.m1.2.2.4.2.3.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2.4.2.3">𝑡</ci></apply><ci id="S2.SS1.SSS2.p1.1.m1.2.2.4.3.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2.4.3">𝑖</ci></apply><interval closure="closed" id="S2.SS1.SSS2.p1.1.m1.2.2.2.3.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2.2.2"><apply id="S2.SS1.SSS2.p1.1.m1.1.1.1.1.1.cmml" xref="S2.SS1.SSS2.p1.1.m1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.SS1.SSS2.p1.1.m1.1.1.1.1.1.1.cmml" xref="S2.SS1.SSS2.p1.1.m1.1.1.1.1.1">superscript</csymbol><apply id="S2.SS1.SSS2.p1.1.m1.1.1.1.1.1.2.cmml" xref="S2.SS1.SSS2.p1.1.m1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.SS1.SSS2.p1.1.m1.1.1.1.1.1.2.1.cmml" xref="S2.SS1.SSS2.p1.1.m1.1.1.1.1.1">subscript</csymbol><ci id="S2.SS1.SSS2.p1.1.m1.1.1.1.1.1.2.2.cmml" xref="S2.SS1.SSS2.p1.1.m1.1.1.1.1.1.2.2">𝜏</ci><ci id="S2.SS1.SSS2.p1.1.m1.1.1.1.1.1.2.3.cmml" xref="S2.SS1.SSS2.p1.1.m1.1.1.1.1.1.2.3">𝑡</ci></apply><ci id="S2.SS1.SSS2.p1.1.m1.1.1.1.1.1.3.cmml" xref="S2.SS1.SSS2.p1.1.m1.1.1.1.1.1.3">𝑖</ci></apply><apply id="S2.SS1.SSS2.p1.1.m1.2.2.2.2.2.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2.2.2.2"><csymbol cd="ambiguous" id="S2.SS1.SSS2.p1.1.m1.2.2.2.2.2.1.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2.2.2.2">superscript</csymbol><apply id="S2.SS1.SSS2.p1.1.m1.2.2.2.2.2.2.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2.2.2.2"><csymbol cd="ambiguous" id="S2.SS1.SSS2.p1.1.m1.2.2.2.2.2.2.1.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2.2.2.2">subscript</csymbol><ci id="S2.SS1.SSS2.p1.1.m1.2.2.2.2.2.2.2.cmml" 
xref="S2.SS1.SSS2.p1.1.m1.2.2.2.2.2.2.2">𝛿</ci><ci id="S2.SS1.SSS2.p1.1.m1.2.2.2.2.2.2.3.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2.2.2.2.2.3">𝑡</ci></apply><ci id="S2.SS1.SSS2.p1.1.m1.2.2.2.2.2.3.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2.2.2.2.3">𝑖</ci></apply></interval></apply><apply id="S2.SS1.SSS2.p1.1.m1.2.2c.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2"><in id="S2.SS1.SSS2.p1.1.m1.2.2.6.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2.6"></in><share href="https://arxiv.org/html/2403.10996v5#S2.SS1.SSS2.p1.1.m1.2.2.2.cmml" id="S2.SS1.SSS2.p1.1.m1.2.2d.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2"></share><apply id="S2.SS1.SSS2.p1.1.m1.2.2.7.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2.7"><csymbol cd="ambiguous" id="S2.SS1.SSS2.p1.1.m1.2.2.7.1.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2.7">superscript</csymbol><ci id="S2.SS1.SSS2.p1.1.m1.2.2.7.2.cmml" xref="S2.SS1.SSS2.p1.1.m1.2.2.7.2">ℝ</ci><cn id="S2.SS1.SSS2.p1.1.m1.2.2.7.3.cmml" type="integer" xref="S2.SS1.SSS2.p1.1.m1.2.2.7.3">2</cn></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.SSS2.p1.1.m1.2c">a_{t}^{i}=\left[\tau_{t}^{i},\delta_{t}^{i}\right]\in\mathbb{R}^{2}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.SSS2.p1.1.m1.2d">italic_a start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT = [ italic_τ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT , italic_δ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ] ∈ blackboard_R start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT</annotation></semantics></math>. 
From a collision avoidance perspective, the throttle command allowed the agents to speed up or slow down, and the steering command allowed the agents to dodge their peers.</p> </div> </section> <section class="ltx_subsubsection" id="S2.SS1.SSS3"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection"><span class="ltx_text" id="S2.SS1.SSS3.5.1.1">II-A</span>3 </span>Reward Function</h4> <div class="ltx_para" id="S2.SS1.SSS3.p1"> <p class="ltx_p" id="S2.SS1.SSS3.p1.1">Extrinsic reward was formulated as:</p> <table class="ltx_equation ltx_eqn_table" id="S2.E1"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="r_{t}^{i}=\begin{cases}r_{goal}&amp;\text{if safe traversal}\\ -k_{p}*\left\|g_{t}^{i}\right\|_{2}&amp;\text{if traffic violation}\\ k_{r}*(0.001+\left\|g_{t}^{i}\right\|_{2})^{-1}&amp;\text{otherwise}\end{cases}" class="ltx_Math" display="block" id="S2.E1.m1.6"><semantics id="S2.E1.m1.6a"><mrow id="S2.E1.m1.6.7" xref="S2.E1.m1.6.7.cmml"><msubsup id="S2.E1.m1.6.7.2" xref="S2.E1.m1.6.7.2.cmml"><mi id="S2.E1.m1.6.7.2.2.2" xref="S2.E1.m1.6.7.2.2.2.cmml">r</mi><mi id="S2.E1.m1.6.7.2.2.3" xref="S2.E1.m1.6.7.2.2.3.cmml">t</mi><mi id="S2.E1.m1.6.7.2.3" xref="S2.E1.m1.6.7.2.3.cmml">i</mi></msubsup><mo id="S2.E1.m1.6.7.1" xref="S2.E1.m1.6.7.1.cmml">=</mo><mrow id="S2.E1.m1.6.6" xref="S2.E1.m1.6.7.3.1.cmml"><mo id="S2.E1.m1.6.6.7" xref="S2.E1.m1.6.7.3.1.1.cmml">{</mo><mtable columnspacing="5pt" displaystyle="true" id="S2.E1.m1.6.6.6" rowspacing="0pt" xref="S2.E1.m1.6.7.3.1.cmml"><mtr id="S2.E1.m1.6.6.6a" xref="S2.E1.m1.6.7.3.1.cmml"><mtd class="ltx_align_left" columnalign="left" id="S2.E1.m1.6.6.6b" xref="S2.E1.m1.6.7.3.1.cmml"><msub id="S2.E1.m1.1.1.1.1.1.1" xref="S2.E1.m1.1.1.1.1.1.1.cmml"><mi id="S2.E1.m1.1.1.1.1.1.1.2" xref="S2.E1.m1.1.1.1.1.1.1.2.cmml">r</mi><mrow id="S2.E1.m1.1.1.1.1.1.1.3" 
xref="S2.E1.m1.1.1.1.1.1.1.3.cmml"><mi id="S2.E1.m1.1.1.1.1.1.1.3.2" xref="S2.E1.m1.1.1.1.1.1.1.3.2.cmml">g</mi><mo id="S2.E1.m1.1.1.1.1.1.1.3.1" xref="S2.E1.m1.1.1.1.1.1.1.3.1.cmml">⁢</mo><mi id="S2.E1.m1.1.1.1.1.1.1.3.3" xref="S2.E1.m1.1.1.1.1.1.1.3.3.cmml">o</mi><mo id="S2.E1.m1.1.1.1.1.1.1.3.1a" xref="S2.E1.m1.1.1.1.1.1.1.3.1.cmml">⁢</mo><mi id="S2.E1.m1.1.1.1.1.1.1.3.4" xref="S2.E1.m1.1.1.1.1.1.1.3.4.cmml">a</mi><mo id="S2.E1.m1.1.1.1.1.1.1.3.1b" xref="S2.E1.m1.1.1.1.1.1.1.3.1.cmml">⁢</mo><mi id="S2.E1.m1.1.1.1.1.1.1.3.5" xref="S2.E1.m1.1.1.1.1.1.1.3.5.cmml">l</mi></mrow></msub></mtd><mtd class="ltx_align_left" columnalign="left" id="S2.E1.m1.6.6.6c" xref="S2.E1.m1.6.7.3.1.cmml"><mtext id="S2.E1.m1.2.2.2.2.2.1" xref="S2.E1.m1.2.2.2.2.2.1a.cmml">if safe traversal</mtext></mtd></mtr><mtr id="S2.E1.m1.6.6.6d" xref="S2.E1.m1.6.7.3.1.cmml"><mtd class="ltx_align_left" columnalign="left" id="S2.E1.m1.6.6.6e" xref="S2.E1.m1.6.7.3.1.cmml"><mrow id="S2.E1.m1.3.3.3.3.1.1" xref="S2.E1.m1.3.3.3.3.1.1.cmml"><mo id="S2.E1.m1.3.3.3.3.1.1a" xref="S2.E1.m1.3.3.3.3.1.1.cmml">−</mo><mrow id="S2.E1.m1.3.3.3.3.1.1.1" xref="S2.E1.m1.3.3.3.3.1.1.1.cmml"><msub id="S2.E1.m1.3.3.3.3.1.1.1.3" xref="S2.E1.m1.3.3.3.3.1.1.1.3.cmml"><mi id="S2.E1.m1.3.3.3.3.1.1.1.3.2" xref="S2.E1.m1.3.3.3.3.1.1.1.3.2.cmml">k</mi><mi id="S2.E1.m1.3.3.3.3.1.1.1.3.3" xref="S2.E1.m1.3.3.3.3.1.1.1.3.3.cmml">p</mi></msub><mo id="S2.E1.m1.3.3.3.3.1.1.1.2" lspace="0.222em" rspace="0.222em" xref="S2.E1.m1.3.3.3.3.1.1.1.2.cmml">∗</mo><msub id="S2.E1.m1.3.3.3.3.1.1.1.1" xref="S2.E1.m1.3.3.3.3.1.1.1.1.cmml"><mrow id="S2.E1.m1.3.3.3.3.1.1.1.1.1.1" xref="S2.E1.m1.3.3.3.3.1.1.1.1.1.2.cmml"><mo id="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.2" xref="S2.E1.m1.3.3.3.3.1.1.1.1.1.2.1.cmml">‖</mo><msubsup id="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.1" xref="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.1.cmml"><mi id="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.1.2.2" xref="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.1.2.2.cmml">g</mi><mi id="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.1.2.3" 
xref="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.1.2.3.cmml">t</mi><mi id="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.1.3" xref="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.1.3.cmml">i</mi></msubsup><mo id="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.3" xref="S2.E1.m1.3.3.3.3.1.1.1.1.1.2.1.cmml">‖</mo></mrow><mn id="S2.E1.m1.3.3.3.3.1.1.1.1.3" xref="S2.E1.m1.3.3.3.3.1.1.1.1.3.cmml">2</mn></msub></mrow></mrow></mtd><mtd class="ltx_align_left" columnalign="left" id="S2.E1.m1.6.6.6f" xref="S2.E1.m1.6.7.3.1.cmml"><mtext id="S2.E1.m1.4.4.4.4.2.1" xref="S2.E1.m1.4.4.4.4.2.1a.cmml">if traffic violation</mtext></mtd></mtr><mtr id="S2.E1.m1.6.6.6g" xref="S2.E1.m1.6.7.3.1.cmml"><mtd class="ltx_align_left" columnalign="left" id="S2.E1.m1.6.6.6h" xref="S2.E1.m1.6.7.3.1.cmml"><mrow id="S2.E1.m1.5.5.5.5.1.1" xref="S2.E1.m1.5.5.5.5.1.1.cmml"><msub id="S2.E1.m1.5.5.5.5.1.1.3" xref="S2.E1.m1.5.5.5.5.1.1.3.cmml"><mi id="S2.E1.m1.5.5.5.5.1.1.3.2" xref="S2.E1.m1.5.5.5.5.1.1.3.2.cmml">k</mi><mi id="S2.E1.m1.5.5.5.5.1.1.3.3" xref="S2.E1.m1.5.5.5.5.1.1.3.3.cmml">r</mi></msub><mo id="S2.E1.m1.5.5.5.5.1.1.2" lspace="0.222em" rspace="0.222em" xref="S2.E1.m1.5.5.5.5.1.1.2.cmml">∗</mo><msup id="S2.E1.m1.5.5.5.5.1.1.1" xref="S2.E1.m1.5.5.5.5.1.1.1.cmml"><mrow id="S2.E1.m1.5.5.5.5.1.1.1.1.1" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.cmml"><mo id="S2.E1.m1.5.5.5.5.1.1.1.1.1.2" stretchy="false" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.cmml">(</mo><mrow id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.cmml"><mn id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.3" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.3.cmml">0.001</mn><mo id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.2" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.2.cmml">+</mo><msub id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.cmml"><mrow id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.2.cmml"><mo id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.2" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.2.1.cmml">‖</mo><msubsup id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.1" 
xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.1.cmml"><mi id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.1.2.2" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.1.2.2.cmml">g</mi><mi id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.1.2.3" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.1.2.3.cmml">t</mi><mi id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.1.3" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.1.3.cmml">i</mi></msubsup><mo id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.3" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.2.1.cmml">‖</mo></mrow><mn id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.3" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.3.cmml">2</mn></msub></mrow><mo id="S2.E1.m1.5.5.5.5.1.1.1.1.1.3" stretchy="false" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.cmml">)</mo></mrow><mrow id="S2.E1.m1.5.5.5.5.1.1.1.3" xref="S2.E1.m1.5.5.5.5.1.1.1.3.cmml"><mo id="S2.E1.m1.5.5.5.5.1.1.1.3a" xref="S2.E1.m1.5.5.5.5.1.1.1.3.cmml">−</mo><mn id="S2.E1.m1.5.5.5.5.1.1.1.3.2" xref="S2.E1.m1.5.5.5.5.1.1.1.3.2.cmml">1</mn></mrow></msup></mrow></mtd><mtd class="ltx_align_left" columnalign="left" id="S2.E1.m1.6.6.6i" xref="S2.E1.m1.6.7.3.1.cmml"><mtext id="S2.E1.m1.6.6.6.6.2.1" xref="S2.E1.m1.6.6.6.6.2.1a.cmml">otherwise</mtext></mtd></mtr></mtable></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.E1.m1.6b"><apply id="S2.E1.m1.6.7.cmml" xref="S2.E1.m1.6.7"><eq id="S2.E1.m1.6.7.1.cmml" xref="S2.E1.m1.6.7.1"></eq><apply id="S2.E1.m1.6.7.2.cmml" xref="S2.E1.m1.6.7.2"><csymbol cd="ambiguous" id="S2.E1.m1.6.7.2.1.cmml" xref="S2.E1.m1.6.7.2">superscript</csymbol><apply id="S2.E1.m1.6.7.2.2.cmml" xref="S2.E1.m1.6.7.2"><csymbol cd="ambiguous" id="S2.E1.m1.6.7.2.2.1.cmml" xref="S2.E1.m1.6.7.2">subscript</csymbol><ci id="S2.E1.m1.6.7.2.2.2.cmml" xref="S2.E1.m1.6.7.2.2.2">𝑟</ci><ci id="S2.E1.m1.6.7.2.2.3.cmml" xref="S2.E1.m1.6.7.2.2.3">𝑡</ci></apply><ci id="S2.E1.m1.6.7.2.3.cmml" xref="S2.E1.m1.6.7.2.3">𝑖</ci></apply><apply id="S2.E1.m1.6.7.3.1.cmml" xref="S2.E1.m1.6.6"><csymbol cd="latexml" id="S2.E1.m1.6.7.3.1.1.cmml" xref="S2.E1.m1.6.6.7">cases</csymbol><apply 
id="S2.E1.m1.1.1.1.1.1.1.cmml" xref="S2.E1.m1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.E1.m1.1.1.1.1.1.1.1.cmml" xref="S2.E1.m1.1.1.1.1.1.1">subscript</csymbol><ci id="S2.E1.m1.1.1.1.1.1.1.2.cmml" xref="S2.E1.m1.1.1.1.1.1.1.2">𝑟</ci><apply id="S2.E1.m1.1.1.1.1.1.1.3.cmml" xref="S2.E1.m1.1.1.1.1.1.1.3"><times id="S2.E1.m1.1.1.1.1.1.1.3.1.cmml" xref="S2.E1.m1.1.1.1.1.1.1.3.1"></times><ci id="S2.E1.m1.1.1.1.1.1.1.3.2.cmml" xref="S2.E1.m1.1.1.1.1.1.1.3.2">𝑔</ci><ci id="S2.E1.m1.1.1.1.1.1.1.3.3.cmml" xref="S2.E1.m1.1.1.1.1.1.1.3.3">𝑜</ci><ci id="S2.E1.m1.1.1.1.1.1.1.3.4.cmml" xref="S2.E1.m1.1.1.1.1.1.1.3.4">𝑎</ci><ci id="S2.E1.m1.1.1.1.1.1.1.3.5.cmml" xref="S2.E1.m1.1.1.1.1.1.1.3.5">𝑙</ci></apply></apply><ci id="S2.E1.m1.2.2.2.2.2.1a.cmml" xref="S2.E1.m1.2.2.2.2.2.1"><mtext id="S2.E1.m1.2.2.2.2.2.1.cmml" xref="S2.E1.m1.2.2.2.2.2.1">if safe traversal</mtext></ci><apply id="S2.E1.m1.3.3.3.3.1.1.cmml" xref="S2.E1.m1.3.3.3.3.1.1"><minus id="S2.E1.m1.3.3.3.3.1.1.2.cmml" xref="S2.E1.m1.3.3.3.3.1.1"></minus><apply id="S2.E1.m1.3.3.3.3.1.1.1.cmml" xref="S2.E1.m1.3.3.3.3.1.1.1"><times id="S2.E1.m1.3.3.3.3.1.1.1.2.cmml" xref="S2.E1.m1.3.3.3.3.1.1.1.2"></times><apply id="S2.E1.m1.3.3.3.3.1.1.1.3.cmml" xref="S2.E1.m1.3.3.3.3.1.1.1.3"><csymbol cd="ambiguous" id="S2.E1.m1.3.3.3.3.1.1.1.3.1.cmml" xref="S2.E1.m1.3.3.3.3.1.1.1.3">subscript</csymbol><ci id="S2.E1.m1.3.3.3.3.1.1.1.3.2.cmml" xref="S2.E1.m1.3.3.3.3.1.1.1.3.2">𝑘</ci><ci id="S2.E1.m1.3.3.3.3.1.1.1.3.3.cmml" xref="S2.E1.m1.3.3.3.3.1.1.1.3.3">𝑝</ci></apply><apply id="S2.E1.m1.3.3.3.3.1.1.1.1.cmml" xref="S2.E1.m1.3.3.3.3.1.1.1.1"><csymbol cd="ambiguous" id="S2.E1.m1.3.3.3.3.1.1.1.1.2.cmml" xref="S2.E1.m1.3.3.3.3.1.1.1.1">subscript</csymbol><apply id="S2.E1.m1.3.3.3.3.1.1.1.1.1.2.cmml" xref="S2.E1.m1.3.3.3.3.1.1.1.1.1.1"><csymbol cd="latexml" id="S2.E1.m1.3.3.3.3.1.1.1.1.1.2.1.cmml" xref="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.2">norm</csymbol><apply id="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.1.cmml" xref="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.1"><csymbol 
cd="ambiguous" id="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.1.1.cmml" xref="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.1">superscript</csymbol><apply id="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.1.2.cmml" xref="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.1.2.1.cmml" xref="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.1">subscript</csymbol><ci id="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.1.2.2.cmml" xref="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.1.2.2">𝑔</ci><ci id="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.1.2.3.cmml" xref="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.1.2.3">𝑡</ci></apply><ci id="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.1.3.cmml" xref="S2.E1.m1.3.3.3.3.1.1.1.1.1.1.1.3">𝑖</ci></apply></apply><cn id="S2.E1.m1.3.3.3.3.1.1.1.1.3.cmml" type="integer" xref="S2.E1.m1.3.3.3.3.1.1.1.1.3">2</cn></apply></apply></apply><ci id="S2.E1.m1.4.4.4.4.2.1a.cmml" xref="S2.E1.m1.4.4.4.4.2.1"><mtext id="S2.E1.m1.4.4.4.4.2.1.cmml" xref="S2.E1.m1.4.4.4.4.2.1">if traffic violation</mtext></ci><apply id="S2.E1.m1.5.5.5.5.1.1.cmml" xref="S2.E1.m1.5.5.5.5.1.1"><times id="S2.E1.m1.5.5.5.5.1.1.2.cmml" xref="S2.E1.m1.5.5.5.5.1.1.2"></times><apply id="S2.E1.m1.5.5.5.5.1.1.3.cmml" xref="S2.E1.m1.5.5.5.5.1.1.3"><csymbol cd="ambiguous" id="S2.E1.m1.5.5.5.5.1.1.3.1.cmml" xref="S2.E1.m1.5.5.5.5.1.1.3">subscript</csymbol><ci id="S2.E1.m1.5.5.5.5.1.1.3.2.cmml" xref="S2.E1.m1.5.5.5.5.1.1.3.2">𝑘</ci><ci id="S2.E1.m1.5.5.5.5.1.1.3.3.cmml" xref="S2.E1.m1.5.5.5.5.1.1.3.3">𝑟</ci></apply><apply id="S2.E1.m1.5.5.5.5.1.1.1.cmml" xref="S2.E1.m1.5.5.5.5.1.1.1"><csymbol cd="ambiguous" id="S2.E1.m1.5.5.5.5.1.1.1.2.cmml" xref="S2.E1.m1.5.5.5.5.1.1.1">superscript</csymbol><apply id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.cmml" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1"><plus id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.2.cmml" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.2"></plus><cn id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.3.cmml" type="float" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.3">0.001</cn><apply id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.cmml" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1"><csymbol cd="ambiguous" 
id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.2.cmml" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1">subscript</csymbol><apply id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.2.cmml" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1"><csymbol cd="latexml" id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.2.1.cmml" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.2">norm</csymbol><apply id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.1">superscript</csymbol><apply id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.1">subscript</csymbol><ci id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.1.2.2.cmml" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.1.2.2">𝑔</ci><ci id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.1.2.3.cmml" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.1.2.3">𝑡</ci></apply><ci id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.1.1.1.3">𝑖</ci></apply></apply><cn id="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.3.cmml" type="integer" xref="S2.E1.m1.5.5.5.5.1.1.1.1.1.1.1.3">2</cn></apply></apply><apply id="S2.E1.m1.5.5.5.5.1.1.1.3.cmml" xref="S2.E1.m1.5.5.5.5.1.1.1.3"><minus id="S2.E1.m1.5.5.5.5.1.1.1.3.1.cmml" xref="S2.E1.m1.5.5.5.5.1.1.1.3"></minus><cn id="S2.E1.m1.5.5.5.5.1.1.1.3.2.cmml" type="integer" xref="S2.E1.m1.5.5.5.5.1.1.1.3.2">1</cn></apply></apply></apply><ci id="S2.E1.m1.6.6.6.6.2.1a.cmml" xref="S2.E1.m1.6.6.6.6.2.1"><mtext id="S2.E1.m1.6.6.6.6.2.1.cmml" xref="S2.E1.m1.6.6.6.6.2.1">otherwise</mtext></ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E1.m1.6c">r_{t}^{i}=\begin{cases}r_{goal}&amp;\text{if safe traversal}\\ -k_{p}*\left\|g_{t}^{i}\right\|_{2}&amp;\text{if traffic violation}\\ 
k_{r}*(0.001+\left\|g_{t}^{i}\right\|_{2})^{-1}&amp;\text{otherwise}\end{cases}</annotation><annotation encoding="application/x-llamapun" id="S2.E1.m1.6d">italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT = { start_ROW start_CELL italic_r start_POSTSUBSCRIPT italic_g italic_o italic_a italic_l end_POSTSUBSCRIPT end_CELL start_CELL if safe traversal end_CELL end_ROW start_ROW start_CELL - italic_k start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ∗ ∥ italic_g start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_CELL start_CELL if traffic violation end_CELL end_ROW start_ROW start_CELL italic_k start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT ∗ ( 0.001 + ∥ italic_g start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT end_CELL start_CELL otherwise end_CELL end_ROW</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(1)</span></td> </tr></tbody> </table> </div> <div class="ltx_para" id="S2.SS1.SSS3.p2"> <p class="ltx_p" id="S2.SS1.SSS3.p2.6">This function rewarded each agent with <math alttext="r_{goal}=1" class="ltx_Math" display="inline" id="S2.SS1.SSS3.p2.1.m1.1"><semantics id="S2.SS1.SSS3.p2.1.m1.1a"><mrow id="S2.SS1.SSS3.p2.1.m1.1.1" xref="S2.SS1.SSS3.p2.1.m1.1.1.cmml"><msub id="S2.SS1.SSS3.p2.1.m1.1.1.2" xref="S2.SS1.SSS3.p2.1.m1.1.1.2.cmml"><mi id="S2.SS1.SSS3.p2.1.m1.1.1.2.2" xref="S2.SS1.SSS3.p2.1.m1.1.1.2.2.cmml">r</mi><mrow id="S2.SS1.SSS3.p2.1.m1.1.1.2.3" xref="S2.SS1.SSS3.p2.1.m1.1.1.2.3.cmml"><mi id="S2.SS1.SSS3.p2.1.m1.1.1.2.3.2" xref="S2.SS1.SSS3.p2.1.m1.1.1.2.3.2.cmml">g</mi><mo 
id="S2.SS1.SSS3.p2.1.m1.1.1.2.3.1" xref="S2.SS1.SSS3.p2.1.m1.1.1.2.3.1.cmml">⁢</mo><mi id="S2.SS1.SSS3.p2.1.m1.1.1.2.3.3" xref="S2.SS1.SSS3.p2.1.m1.1.1.2.3.3.cmml">o</mi><mo id="S2.SS1.SSS3.p2.1.m1.1.1.2.3.1a" xref="S2.SS1.SSS3.p2.1.m1.1.1.2.3.1.cmml">⁢</mo><mi id="S2.SS1.SSS3.p2.1.m1.1.1.2.3.4" xref="S2.SS1.SSS3.p2.1.m1.1.1.2.3.4.cmml">a</mi><mo id="S2.SS1.SSS3.p2.1.m1.1.1.2.3.1b" xref="S2.SS1.SSS3.p2.1.m1.1.1.2.3.1.cmml">⁢</mo><mi id="S2.SS1.SSS3.p2.1.m1.1.1.2.3.5" xref="S2.SS1.SSS3.p2.1.m1.1.1.2.3.5.cmml">l</mi></mrow></msub><mo id="S2.SS1.SSS3.p2.1.m1.1.1.1" xref="S2.SS1.SSS3.p2.1.m1.1.1.1.cmml">=</mo><mn id="S2.SS1.SSS3.p2.1.m1.1.1.3" xref="S2.SS1.SSS3.p2.1.m1.1.1.3.cmml">1</mn></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.SSS3.p2.1.m1.1b"><apply id="S2.SS1.SSS3.p2.1.m1.1.1.cmml" xref="S2.SS1.SSS3.p2.1.m1.1.1"><eq id="S2.SS1.SSS3.p2.1.m1.1.1.1.cmml" xref="S2.SS1.SSS3.p2.1.m1.1.1.1"></eq><apply id="S2.SS1.SSS3.p2.1.m1.1.1.2.cmml" xref="S2.SS1.SSS3.p2.1.m1.1.1.2"><csymbol cd="ambiguous" id="S2.SS1.SSS3.p2.1.m1.1.1.2.1.cmml" xref="S2.SS1.SSS3.p2.1.m1.1.1.2">subscript</csymbol><ci id="S2.SS1.SSS3.p2.1.m1.1.1.2.2.cmml" xref="S2.SS1.SSS3.p2.1.m1.1.1.2.2">𝑟</ci><apply id="S2.SS1.SSS3.p2.1.m1.1.1.2.3.cmml" xref="S2.SS1.SSS3.p2.1.m1.1.1.2.3"><times id="S2.SS1.SSS3.p2.1.m1.1.1.2.3.1.cmml" xref="S2.SS1.SSS3.p2.1.m1.1.1.2.3.1"></times><ci id="S2.SS1.SSS3.p2.1.m1.1.1.2.3.2.cmml" xref="S2.SS1.SSS3.p2.1.m1.1.1.2.3.2">𝑔</ci><ci id="S2.SS1.SSS3.p2.1.m1.1.1.2.3.3.cmml" xref="S2.SS1.SSS3.p2.1.m1.1.1.2.3.3">𝑜</ci><ci id="S2.SS1.SSS3.p2.1.m1.1.1.2.3.4.cmml" xref="S2.SS1.SSS3.p2.1.m1.1.1.2.3.4">𝑎</ci><ci id="S2.SS1.SSS3.p2.1.m1.1.1.2.3.5.cmml" xref="S2.SS1.SSS3.p2.1.m1.1.1.2.3.5">𝑙</ci></apply></apply><cn id="S2.SS1.SSS3.p2.1.m1.1.1.3.cmml" type="integer" xref="S2.SS1.SSS3.p2.1.m1.1.1.3">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.SSS3.p2.1.m1.1c">r_{goal}=1</annotation><annotation encoding="application/x-llamapun" 
id="S2.SS1.SSS3.p2.1.m1.1d">italic_r start_POSTSUBSCRIPT italic_g italic_o italic_a italic_l end_POSTSUBSCRIPT = 1</annotation></semantics></math> for successfully traversing the intersection and penalized them in proportion to their distance from the goal, represented as <math alttext="k_{p}*\left\|g_{t}^{i}\right\|_{2}" class="ltx_Math" display="inline" id="S2.SS1.SSS3.p2.2.m2.1"><semantics id="S2.SS1.SSS3.p2.2.m2.1a"><mrow id="S2.SS1.SSS3.p2.2.m2.1.1" xref="S2.SS1.SSS3.p2.2.m2.1.1.cmml"><msub id="S2.SS1.SSS3.p2.2.m2.1.1.3" xref="S2.SS1.SSS3.p2.2.m2.1.1.3.cmml"><mi id="S2.SS1.SSS3.p2.2.m2.1.1.3.2" xref="S2.SS1.SSS3.p2.2.m2.1.1.3.2.cmml">k</mi><mi id="S2.SS1.SSS3.p2.2.m2.1.1.3.3" xref="S2.SS1.SSS3.p2.2.m2.1.1.3.3.cmml">p</mi></msub><mo id="S2.SS1.SSS3.p2.2.m2.1.1.2" lspace="0.222em" rspace="0.222em" xref="S2.SS1.SSS3.p2.2.m2.1.1.2.cmml">∗</mo><msub id="S2.SS1.SSS3.p2.2.m2.1.1.1" xref="S2.SS1.SSS3.p2.2.m2.1.1.1.cmml"><mrow id="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1" xref="S2.SS1.SSS3.p2.2.m2.1.1.1.1.2.cmml"><mo id="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.2" xref="S2.SS1.SSS3.p2.2.m2.1.1.1.1.2.1.cmml">‖</mo><msubsup id="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.1" xref="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.1.cmml"><mi id="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.1.2.2" xref="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.1.2.2.cmml">g</mi><mi id="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.1.2.3" xref="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.1.2.3.cmml">t</mi><mi id="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.1.3" xref="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.1.3.cmml">i</mi></msubsup><mo id="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.3" xref="S2.SS1.SSS3.p2.2.m2.1.1.1.1.2.1.cmml">‖</mo></mrow><mn id="S2.SS1.SSS3.p2.2.m2.1.1.1.3" xref="S2.SS1.SSS3.p2.2.m2.1.1.1.3.cmml">2</mn></msub></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.SSS3.p2.2.m2.1b"><apply id="S2.SS1.SSS3.p2.2.m2.1.1.cmml" xref="S2.SS1.SSS3.p2.2.m2.1.1"><times id="S2.SS1.SSS3.p2.2.m2.1.1.2.cmml" xref="S2.SS1.SSS3.p2.2.m2.1.1.2"></times><apply id="S2.SS1.SSS3.p2.2.m2.1.1.3.cmml" 
xref="S2.SS1.SSS3.p2.2.m2.1.1.3"><csymbol cd="ambiguous" id="S2.SS1.SSS3.p2.2.m2.1.1.3.1.cmml" xref="S2.SS1.SSS3.p2.2.m2.1.1.3">subscript</csymbol><ci id="S2.SS1.SSS3.p2.2.m2.1.1.3.2.cmml" xref="S2.SS1.SSS3.p2.2.m2.1.1.3.2">𝑘</ci><ci id="S2.SS1.SSS3.p2.2.m2.1.1.3.3.cmml" xref="S2.SS1.SSS3.p2.2.m2.1.1.3.3">𝑝</ci></apply><apply id="S2.SS1.SSS3.p2.2.m2.1.1.1.cmml" xref="S2.SS1.SSS3.p2.2.m2.1.1.1"><csymbol cd="ambiguous" id="S2.SS1.SSS3.p2.2.m2.1.1.1.2.cmml" xref="S2.SS1.SSS3.p2.2.m2.1.1.1">subscript</csymbol><apply id="S2.SS1.SSS3.p2.2.m2.1.1.1.1.2.cmml" xref="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1"><csymbol cd="latexml" id="S2.SS1.SSS3.p2.2.m2.1.1.1.1.2.1.cmml" xref="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.2">norm</csymbol><apply id="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.1.cmml" xref="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.1.1.cmml" xref="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.1">superscript</csymbol><apply id="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.1.2.cmml" xref="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.1.2.1.cmml" xref="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.1">subscript</csymbol><ci id="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.1.2.2.cmml" xref="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.1.2.2">𝑔</ci><ci id="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.1.2.3.cmml" xref="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.1.2.3">𝑡</ci></apply><ci id="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.1.3.cmml" xref="S2.SS1.SSS3.p2.2.m2.1.1.1.1.1.1.3">𝑖</ci></apply></apply><cn id="S2.SS1.SSS3.p2.2.m2.1.1.1.3.cmml" type="integer" xref="S2.SS1.SSS3.p2.2.m2.1.1.1.3">2</cn></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.SSS3.p2.2.m2.1c">k_{p}*\left\|g_{t}^{i}\right\|_{2}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.SSS3.p2.2.m2.1d">italic_k start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ∗ ∥ italic_g start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT 2 
end_POSTSUBSCRIPT</annotation></semantics></math>, for collisions or lane-boundary violations. The agents were also continuously rewarded in inverse proportion to their distance to the goal, which mitigated the sparse-reward problem. The reward (<math alttext="k_{r}" class="ltx_Math" display="inline" id="S2.SS1.SSS3.p2.3.m3.1"><semantics id="S2.SS1.SSS3.p2.3.m3.1a"><msub id="S2.SS1.SSS3.p2.3.m3.1.1" xref="S2.SS1.SSS3.p2.3.m3.1.1.cmml"><mi id="S2.SS1.SSS3.p2.3.m3.1.1.2" xref="S2.SS1.SSS3.p2.3.m3.1.1.2.cmml">k</mi><mi id="S2.SS1.SSS3.p2.3.m3.1.1.3" xref="S2.SS1.SSS3.p2.3.m3.1.1.3.cmml">r</mi></msub><annotation-xml encoding="MathML-Content" id="S2.SS1.SSS3.p2.3.m3.1b"><apply id="S2.SS1.SSS3.p2.3.m3.1.1.cmml" xref="S2.SS1.SSS3.p2.3.m3.1.1"><csymbol cd="ambiguous" id="S2.SS1.SSS3.p2.3.m3.1.1.1.cmml" xref="S2.SS1.SSS3.p2.3.m3.1.1">subscript</csymbol><ci id="S2.SS1.SSS3.p2.3.m3.1.1.2.cmml" xref="S2.SS1.SSS3.p2.3.m3.1.1.2">𝑘</ci><ci id="S2.SS1.SSS3.p2.3.m3.1.1.3.cmml" xref="S2.SS1.SSS3.p2.3.m3.1.1.3">𝑟</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.SSS3.p2.3.m3.1c">k_{r}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.SSS3.p2.3.m3.1d">italic_k start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT</annotation></semantics></math>) and penalty (<math alttext="k_{p}" class="ltx_Math" display="inline" id="S2.SS1.SSS3.p2.4.m4.1"><semantics id="S2.SS1.SSS3.p2.4.m4.1a"><msub id="S2.SS1.SSS3.p2.4.m4.1.1" xref="S2.SS1.SSS3.p2.4.m4.1.1.cmml"><mi id="S2.SS1.SSS3.p2.4.m4.1.1.2" xref="S2.SS1.SSS3.p2.4.m4.1.1.2.cmml">k</mi><mi id="S2.SS1.SSS3.p2.4.m4.1.1.3" xref="S2.SS1.SSS3.p2.4.m4.1.1.3.cmml">p</mi></msub><annotation-xml encoding="MathML-Content" id="S2.SS1.SSS3.p2.4.m4.1b"><apply id="S2.SS1.SSS3.p2.4.m4.1.1.cmml" xref="S2.SS1.SSS3.p2.4.m4.1.1"><csymbol cd="ambiguous" id="S2.SS1.SSS3.p2.4.m4.1.1.1.cmml" xref="S2.SS1.SSS3.p2.4.m4.1.1">subscript</csymbol><ci id="S2.SS1.SSS3.p2.4.m4.1.1.2.cmml" xref="S2.SS1.SSS3.p2.4.m4.1.1.2">𝑘</ci><ci 
id="S2.SS1.SSS3.p2.4.m4.1.1.3.cmml" xref="S2.SS1.SSS3.p2.4.m4.1.1.3">𝑝</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.SSS3.p2.4.m4.1c">k_{p}</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.SSS3.p2.4.m4.1d">italic_k start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT</annotation></semantics></math>) constants were set to <math alttext="0.01" class="ltx_Math" display="inline" id="S2.SS1.SSS3.p2.5.m5.1"><semantics id="S2.SS1.SSS3.p2.5.m5.1a"><mn id="S2.SS1.SSS3.p2.5.m5.1.1" xref="S2.SS1.SSS3.p2.5.m5.1.1.cmml">0.01</mn><annotation-xml encoding="MathML-Content" id="S2.SS1.SSS3.p2.5.m5.1b"><cn id="S2.SS1.SSS3.p2.5.m5.1.1.cmml" type="float" xref="S2.SS1.SSS3.p2.5.m5.1.1">0.01</cn></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.SSS3.p2.5.m5.1c">0.01</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.SSS3.p2.5.m5.1d">0.01</annotation></semantics></math> and <math alttext="0.425" class="ltx_Math" display="inline" id="S2.SS1.SSS3.p2.6.m6.1"><semantics id="S2.SS1.SSS3.p2.6.m6.1a"><mn id="S2.SS1.SSS3.p2.6.m6.1.1" xref="S2.SS1.SSS3.p2.6.m6.1.1.cmml">0.425</mn><annotation-xml encoding="MathML-Content" id="S2.SS1.SSS3.p2.6.m6.1b"><cn id="S2.SS1.SSS3.p2.6.m6.1.1.cmml" type="float" xref="S2.SS1.SSS3.p2.6.m6.1.1">0.425</cn></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.SSS3.p2.6.m6.1c">0.425</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.SSS3.p2.6.m6.1d">0.425</annotation></semantics></math> respectively.</p> </div> </section> <section class="ltx_subsubsection" id="S2.SS1.SSS4"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection"><span class="ltx_text" id="S2.SS1.SSS4.5.1.1">II-A</span>4 </span>Optimization Problem</h4> <div class="ltx_para" id="S2.SS1.SSS4.p1"> <p class="ltx_p" id="S2.SS1.SSS4.p1.1">The extrinsic reward function motivated each agent to maximize its expected future discounted reward by learning an 
optimal policy <math alttext="\pi^{*}_{\theta}\left(a_{t}|o_{t}\right)" class="ltx_Math" display="inline" id="S2.SS1.SSS4.p1.1.m1.1"><semantics id="S2.SS1.SSS4.p1.1.m1.1a"><mrow id="S2.SS1.SSS4.p1.1.m1.1.1" xref="S2.SS1.SSS4.p1.1.m1.1.1.cmml"><msubsup id="S2.SS1.SSS4.p1.1.m1.1.1.3" xref="S2.SS1.SSS4.p1.1.m1.1.1.3.cmml"><mi id="S2.SS1.SSS4.p1.1.m1.1.1.3.2.2" xref="S2.SS1.SSS4.p1.1.m1.1.1.3.2.2.cmml">π</mi><mi id="S2.SS1.SSS4.p1.1.m1.1.1.3.3" xref="S2.SS1.SSS4.p1.1.m1.1.1.3.3.cmml">θ</mi><mo id="S2.SS1.SSS4.p1.1.m1.1.1.3.2.3" xref="S2.SS1.SSS4.p1.1.m1.1.1.3.2.3.cmml">∗</mo></msubsup><mo id="S2.SS1.SSS4.p1.1.m1.1.1.2" xref="S2.SS1.SSS4.p1.1.m1.1.1.2.cmml">⁢</mo><mrow id="S2.SS1.SSS4.p1.1.m1.1.1.1.1" xref="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.cmml"><mo id="S2.SS1.SSS4.p1.1.m1.1.1.1.1.2" xref="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.cmml">(</mo><mrow id="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1" xref="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.cmml"><msub id="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.2" xref="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.2.cmml"><mi id="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.2.2" xref="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.2.2.cmml">a</mi><mi id="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.2.3" xref="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.2.3.cmml">t</mi></msub><mo fence="false" id="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.1" xref="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.1.cmml">|</mo><msub id="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.3" xref="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.3.cmml"><mi id="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.3.2" xref="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.3.2.cmml">o</mi><mi id="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.3.3" xref="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.3.3.cmml">t</mi></msub></mrow><mo id="S2.SS1.SSS4.p1.1.m1.1.1.1.1.3" xref="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.cmml">)</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS1.SSS4.p1.1.m1.1b"><apply id="S2.SS1.SSS4.p1.1.m1.1.1.cmml" xref="S2.SS1.SSS4.p1.1.m1.1.1"><times id="S2.SS1.SSS4.p1.1.m1.1.1.2.cmml" xref="S2.SS1.SSS4.p1.1.m1.1.1.2"></times><apply id="S2.SS1.SSS4.p1.1.m1.1.1.3.cmml" 
xref="S2.SS1.SSS4.p1.1.m1.1.1.3"><csymbol cd="ambiguous" id="S2.SS1.SSS4.p1.1.m1.1.1.3.1.cmml" xref="S2.SS1.SSS4.p1.1.m1.1.1.3">subscript</csymbol><apply id="S2.SS1.SSS4.p1.1.m1.1.1.3.2.cmml" xref="S2.SS1.SSS4.p1.1.m1.1.1.3"><csymbol cd="ambiguous" id="S2.SS1.SSS4.p1.1.m1.1.1.3.2.1.cmml" xref="S2.SS1.SSS4.p1.1.m1.1.1.3">superscript</csymbol><ci id="S2.SS1.SSS4.p1.1.m1.1.1.3.2.2.cmml" xref="S2.SS1.SSS4.p1.1.m1.1.1.3.2.2">𝜋</ci><times id="S2.SS1.SSS4.p1.1.m1.1.1.3.2.3.cmml" xref="S2.SS1.SSS4.p1.1.m1.1.1.3.2.3"></times></apply><ci id="S2.SS1.SSS4.p1.1.m1.1.1.3.3.cmml" xref="S2.SS1.SSS4.p1.1.m1.1.1.3.3">𝜃</ci></apply><apply id="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.cmml" xref="S2.SS1.SSS4.p1.1.m1.1.1.1.1"><csymbol cd="latexml" id="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.1.cmml" xref="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.1">conditional</csymbol><apply id="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.2.cmml" xref="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.2.1.cmml" xref="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.2">subscript</csymbol><ci id="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.2.2.cmml" xref="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.2.2">𝑎</ci><ci id="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.2.3.cmml" xref="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.2.3">𝑡</ci></apply><apply id="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.3.cmml" xref="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.3.1.cmml" xref="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.3">subscript</csymbol><ci id="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.3.2.cmml" xref="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.3.2">𝑜</ci><ci id="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.3.3.cmml" xref="S2.SS1.SSS4.p1.1.m1.1.1.1.1.1.3.3">𝑡</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS1.SSS4.p1.1.m1.1c">\pi^{*}_{\theta}\left(a_{t}|o_{t}\right)</annotation><annotation encoding="application/x-llamapun" id="S2.SS1.SSS4.p1.1.m1.1d">italic_π start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_θ end_POSTSUBSCRIPT ( italic_a 
start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT | italic_o start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT )</annotation></semantics></math>.</p> <table class="ltx_equationgroup ltx_eqn_align ltx_eqn_table" id="S5.EGx1"> <tbody id="S2.E2"><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_align_right ltx_eqn_cell"><math alttext="\displaystyle\operatorname*{\arg\!\max}_{\pi_{\theta}\left(a_{t}|o_{t}\right)}\quad" class="ltx_Math" display="inline" id="S2.E2.m1.2"><semantics id="S2.E2.m1.2a"><mrow id="S2.E2.m1.2.2.1" xref="S2.E2.m1.2.2.1.1.cmml"><munder id="S2.E2.m1.2.2.1.1" xref="S2.E2.m1.2.2.1.1.cmml"><mrow id="S2.E2.m1.2.2.1.1.2" xref="S2.E2.m1.2.2.1.1.2.cmml"><mi id="S2.E2.m1.2.2.1.1.2.1" xref="S2.E2.m1.2.2.1.1.2.1.cmml">arg</mi><mo id="S2.E2.m1.2.2.1.1.2a" xref="S2.E2.m1.2.2.1.1.2.cmml">⁡</mo><mi id="S2.E2.m1.2.2.1.1.2.2" xref="S2.E2.m1.2.2.1.1.2.2.cmml">max</mi></mrow><mrow id="S2.E2.m1.1.1.1" xref="S2.E2.m1.1.1.1.cmml"><msub id="S2.E2.m1.1.1.1.3" xref="S2.E2.m1.1.1.1.3.cmml"><mi id="S2.E2.m1.1.1.1.3.2" xref="S2.E2.m1.1.1.1.3.2.cmml">π</mi><mi id="S2.E2.m1.1.1.1.3.3" xref="S2.E2.m1.1.1.1.3.3.cmml">θ</mi></msub><mo id="S2.E2.m1.1.1.1.2" xref="S2.E2.m1.1.1.1.2.cmml">⁢</mo><mrow id="S2.E2.m1.1.1.1.1.1" xref="S2.E2.m1.1.1.1.1.1.1.cmml"><mo id="S2.E2.m1.1.1.1.1.1.2" xref="S2.E2.m1.1.1.1.1.1.1.cmml">(</mo><mrow id="S2.E2.m1.1.1.1.1.1.1" xref="S2.E2.m1.1.1.1.1.1.1.cmml"><msub id="S2.E2.m1.1.1.1.1.1.1.2" xref="S2.E2.m1.1.1.1.1.1.1.2.cmml"><mi id="S2.E2.m1.1.1.1.1.1.1.2.2" xref="S2.E2.m1.1.1.1.1.1.1.2.2.cmml">a</mi><mi id="S2.E2.m1.1.1.1.1.1.1.2.3" xref="S2.E2.m1.1.1.1.1.1.1.2.3.cmml">t</mi></msub><mo fence="false" id="S2.E2.m1.1.1.1.1.1.1.1" xref="S2.E2.m1.1.1.1.1.1.1.1.cmml">|</mo><msub id="S2.E2.m1.1.1.1.1.1.1.3" xref="S2.E2.m1.1.1.1.1.1.1.3.cmml"><mi id="S2.E2.m1.1.1.1.1.1.1.3.2" xref="S2.E2.m1.1.1.1.1.1.1.3.2.cmml">o</mi><mi id="S2.E2.m1.1.1.1.1.1.1.3.3" 
xref="S2.E2.m1.1.1.1.1.1.1.3.3.cmml">t</mi></msub></mrow><mo id="S2.E2.m1.1.1.1.1.1.3" xref="S2.E2.m1.1.1.1.1.1.1.cmml">)</mo></mrow></mrow></munder><mspace id="S2.E2.m1.2.2.1.2" width="1.167em" xref="S2.E2.m1.2.2.1.1.cmml"></mspace></mrow><annotation-xml encoding="MathML-Content" id="S2.E2.m1.2b"><apply id="S2.E2.m1.2.2.1.1.cmml" xref="S2.E2.m1.2.2.1"><csymbol cd="ambiguous" id="S2.E2.m1.2.2.1.1.1.cmml" xref="S2.E2.m1.2.2.1">subscript</csymbol><apply id="S2.E2.m1.2.2.1.1.2.cmml" xref="S2.E2.m1.2.2.1.1.2"><arg id="S2.E2.m1.2.2.1.1.2.1.cmml" xref="S2.E2.m1.2.2.1.1.2.1"></arg><max id="S2.E2.m1.2.2.1.1.2.2.cmml" xref="S2.E2.m1.2.2.1.1.2.2"></max></apply><apply id="S2.E2.m1.1.1.1.cmml" xref="S2.E2.m1.1.1.1"><times id="S2.E2.m1.1.1.1.2.cmml" xref="S2.E2.m1.1.1.1.2"></times><apply id="S2.E2.m1.1.1.1.3.cmml" xref="S2.E2.m1.1.1.1.3"><csymbol cd="ambiguous" id="S2.E2.m1.1.1.1.3.1.cmml" xref="S2.E2.m1.1.1.1.3">subscript</csymbol><ci id="S2.E2.m1.1.1.1.3.2.cmml" xref="S2.E2.m1.1.1.1.3.2">𝜋</ci><ci id="S2.E2.m1.1.1.1.3.3.cmml" xref="S2.E2.m1.1.1.1.3.3">𝜃</ci></apply><apply id="S2.E2.m1.1.1.1.1.1.1.cmml" xref="S2.E2.m1.1.1.1.1.1"><csymbol cd="latexml" id="S2.E2.m1.1.1.1.1.1.1.1.cmml" xref="S2.E2.m1.1.1.1.1.1.1.1">conditional</csymbol><apply id="S2.E2.m1.1.1.1.1.1.1.2.cmml" xref="S2.E2.m1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S2.E2.m1.1.1.1.1.1.1.2.1.cmml" xref="S2.E2.m1.1.1.1.1.1.1.2">subscript</csymbol><ci id="S2.E2.m1.1.1.1.1.1.1.2.2.cmml" xref="S2.E2.m1.1.1.1.1.1.1.2.2">𝑎</ci><ci id="S2.E2.m1.1.1.1.1.1.1.2.3.cmml" xref="S2.E2.m1.1.1.1.1.1.1.2.3">𝑡</ci></apply><apply id="S2.E2.m1.1.1.1.1.1.1.3.cmml" xref="S2.E2.m1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S2.E2.m1.1.1.1.1.1.1.3.1.cmml" xref="S2.E2.m1.1.1.1.1.1.1.3">subscript</csymbol><ci id="S2.E2.m1.1.1.1.1.1.1.3.2.cmml" xref="S2.E2.m1.1.1.1.1.1.1.3.2">𝑜</ci><ci id="S2.E2.m1.1.1.1.1.1.1.3.3.cmml" xref="S2.E2.m1.1.1.1.1.1.1.3.3">𝑡</ci></apply></apply></apply></apply></annotation-xml><annotation 
encoding="application/x-tex" id="S2.E2.m1.2c">\displaystyle\operatorname*{\arg\!\max}_{\pi_{\theta}\left(a_{t}|o_{t}\right)}\quad</annotation><annotation encoding="application/x-llamapun" id="S2.E2.m1.2d">start_OPERATOR roman_arg roman_max end_OPERATOR start_POSTSUBSCRIPT italic_π start_POSTSUBSCRIPT italic_θ end_POSTSUBSCRIPT ( italic_a start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT | italic_o start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_left ltx_eqn_cell"><math alttext="\displaystyle\mathbb{E}\left[\sum_{t=0}^{\infty}\gamma^{t}r_{t}\right]" class="ltx_Math" display="inline" id="S2.E2.m2.1"><semantics id="S2.E2.m2.1a"><mrow id="S2.E2.m2.1.1" xref="S2.E2.m2.1.1.cmml"><mi id="S2.E2.m2.1.1.3" xref="S2.E2.m2.1.1.3.cmml">𝔼</mi><mo id="S2.E2.m2.1.1.2" xref="S2.E2.m2.1.1.2.cmml">⁢</mo><mrow id="S2.E2.m2.1.1.1.1" xref="S2.E2.m2.1.1.1.2.cmml"><mo id="S2.E2.m2.1.1.1.1.2" xref="S2.E2.m2.1.1.1.2.1.cmml">[</mo><mrow id="S2.E2.m2.1.1.1.1.1" xref="S2.E2.m2.1.1.1.1.1.cmml"><mstyle displaystyle="true" id="S2.E2.m2.1.1.1.1.1.1" xref="S2.E2.m2.1.1.1.1.1.1.cmml"><munderover id="S2.E2.m2.1.1.1.1.1.1a" xref="S2.E2.m2.1.1.1.1.1.1.cmml"><mo id="S2.E2.m2.1.1.1.1.1.1.2.2" movablelimits="false" xref="S2.E2.m2.1.1.1.1.1.1.2.2.cmml">∑</mo><mrow id="S2.E2.m2.1.1.1.1.1.1.2.3" xref="S2.E2.m2.1.1.1.1.1.1.2.3.cmml"><mi id="S2.E2.m2.1.1.1.1.1.1.2.3.2" xref="S2.E2.m2.1.1.1.1.1.1.2.3.2.cmml">t</mi><mo id="S2.E2.m2.1.1.1.1.1.1.2.3.1" xref="S2.E2.m2.1.1.1.1.1.1.2.3.1.cmml">=</mo><mn id="S2.E2.m2.1.1.1.1.1.1.2.3.3" xref="S2.E2.m2.1.1.1.1.1.1.2.3.3.cmml">0</mn></mrow><mi id="S2.E2.m2.1.1.1.1.1.1.3" mathvariant="normal" xref="S2.E2.m2.1.1.1.1.1.1.3.cmml">∞</mi></munderover></mstyle><mrow id="S2.E2.m2.1.1.1.1.1.2" xref="S2.E2.m2.1.1.1.1.1.2.cmml"><msup id="S2.E2.m2.1.1.1.1.1.2.2" xref="S2.E2.m2.1.1.1.1.1.2.2.cmml"><mi id="S2.E2.m2.1.1.1.1.1.2.2.2" xref="S2.E2.m2.1.1.1.1.1.2.2.2.cmml">γ</mi><mi 
id="S2.E2.m2.1.1.1.1.1.2.2.3" xref="S2.E2.m2.1.1.1.1.1.2.2.3.cmml">t</mi></msup><mo id="S2.E2.m2.1.1.1.1.1.2.1" xref="S2.E2.m2.1.1.1.1.1.2.1.cmml">⁢</mo><msub id="S2.E2.m2.1.1.1.1.1.2.3" xref="S2.E2.m2.1.1.1.1.1.2.3.cmml"><mi id="S2.E2.m2.1.1.1.1.1.2.3.2" xref="S2.E2.m2.1.1.1.1.1.2.3.2.cmml">r</mi><mi id="S2.E2.m2.1.1.1.1.1.2.3.3" xref="S2.E2.m2.1.1.1.1.1.2.3.3.cmml">t</mi></msub></mrow></mrow><mo id="S2.E2.m2.1.1.1.1.3" xref="S2.E2.m2.1.1.1.2.1.cmml">]</mo></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.E2.m2.1b"><apply id="S2.E2.m2.1.1.cmml" xref="S2.E2.m2.1.1"><times id="S2.E2.m2.1.1.2.cmml" xref="S2.E2.m2.1.1.2"></times><ci id="S2.E2.m2.1.1.3.cmml" xref="S2.E2.m2.1.1.3">𝔼</ci><apply id="S2.E2.m2.1.1.1.2.cmml" xref="S2.E2.m2.1.1.1.1"><csymbol cd="latexml" id="S2.E2.m2.1.1.1.2.1.cmml" xref="S2.E2.m2.1.1.1.1.2">delimited-[]</csymbol><apply id="S2.E2.m2.1.1.1.1.1.cmml" xref="S2.E2.m2.1.1.1.1.1"><apply id="S2.E2.m2.1.1.1.1.1.1.cmml" xref="S2.E2.m2.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.E2.m2.1.1.1.1.1.1.1.cmml" xref="S2.E2.m2.1.1.1.1.1.1">superscript</csymbol><apply id="S2.E2.m2.1.1.1.1.1.1.2.cmml" xref="S2.E2.m2.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.E2.m2.1.1.1.1.1.1.2.1.cmml" xref="S2.E2.m2.1.1.1.1.1.1">subscript</csymbol><sum id="S2.E2.m2.1.1.1.1.1.1.2.2.cmml" xref="S2.E2.m2.1.1.1.1.1.1.2.2"></sum><apply id="S2.E2.m2.1.1.1.1.1.1.2.3.cmml" xref="S2.E2.m2.1.1.1.1.1.1.2.3"><eq id="S2.E2.m2.1.1.1.1.1.1.2.3.1.cmml" xref="S2.E2.m2.1.1.1.1.1.1.2.3.1"></eq><ci id="S2.E2.m2.1.1.1.1.1.1.2.3.2.cmml" xref="S2.E2.m2.1.1.1.1.1.1.2.3.2">𝑡</ci><cn id="S2.E2.m2.1.1.1.1.1.1.2.3.3.cmml" type="integer" xref="S2.E2.m2.1.1.1.1.1.1.2.3.3">0</cn></apply></apply><infinity id="S2.E2.m2.1.1.1.1.1.1.3.cmml" xref="S2.E2.m2.1.1.1.1.1.1.3"></infinity></apply><apply id="S2.E2.m2.1.1.1.1.1.2.cmml" xref="S2.E2.m2.1.1.1.1.1.2"><times id="S2.E2.m2.1.1.1.1.1.2.1.cmml" xref="S2.E2.m2.1.1.1.1.1.2.1"></times><apply id="S2.E2.m2.1.1.1.1.1.2.2.cmml" 
xref="S2.E2.m2.1.1.1.1.1.2.2"><csymbol cd="ambiguous" id="S2.E2.m2.1.1.1.1.1.2.2.1.cmml" xref="S2.E2.m2.1.1.1.1.1.2.2">superscript</csymbol><ci id="S2.E2.m2.1.1.1.1.1.2.2.2.cmml" xref="S2.E2.m2.1.1.1.1.1.2.2.2">𝛾</ci><ci id="S2.E2.m2.1.1.1.1.1.2.2.3.cmml" xref="S2.E2.m2.1.1.1.1.1.2.2.3">𝑡</ci></apply><apply id="S2.E2.m2.1.1.1.1.1.2.3.cmml" xref="S2.E2.m2.1.1.1.1.1.2.3"><csymbol cd="ambiguous" id="S2.E2.m2.1.1.1.1.1.2.3.1.cmml" xref="S2.E2.m2.1.1.1.1.1.2.3">subscript</csymbol><ci id="S2.E2.m2.1.1.1.1.1.2.3.2.cmml" xref="S2.E2.m2.1.1.1.1.1.2.3.2">𝑟</ci><ci id="S2.E2.m2.1.1.1.1.1.2.3.3.cmml" xref="S2.E2.m2.1.1.1.1.1.2.3.3">𝑡</ci></apply></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E2.m2.1c">\displaystyle\mathbb{E}\left[\sum_{t=0}^{\infty}\gamma^{t}r_{t}\right]</annotation><annotation encoding="application/x-llamapun" id="S2.E2.m2.1d">blackboard_E [ ∑ start_POSTSUBSCRIPT italic_t = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT italic_γ start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ]</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(2)</span></td> </tr></tbody> </table> </div> </section> </section> <section class="ltx_subsection" id="S2.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S2.SS2.5.1.1">II-B</span> </span><span class="ltx_text ltx_font_italic" id="S2.SS2.6.2">Competitive Multi-Agent Scenario</span> </h3> <div class="ltx_para" id="S2.SS2.p1"> <p class="ltx_p" id="S2.SS2.p1.1">We formulated a 2-agent autonomous racing scenario, wherein each agent’s objective was to minimize its lap time without colliding with the track or its opponent. 
This scenario is representative of a standard F1TENTH (a.k.a. RoboRacer) autonomous racing competition <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib13" title="">13</a>]</cite>. Here, each agent collected its own observations; no information was shared among the agents. The exact map of the environment was not known to any agent. Consequently, this problem was also framed as a POMDP. However, unlike the cooperative MARL scenario, this problem adopted a hybrid imitation-reinforcement learning architecture to guide the agents’ exploration, thereby reducing training time. To this end, we recorded 5 laps’ worth of single-agent demonstrations for each agent by manually driving it along sub-optimal trajectories.</p> </div> <section class="ltx_subsubsection" id="S2.SS2.SSS1"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection"><span class="ltx_text" id="S2.SS2.SSS1.5.1.1">II-B</span>1 </span>Observation Space</h4> <div class="ltx_para" id="S2.SS2.SSS1.p1"> <p class="ltx_p" id="S2.SS2.SSS1.p1.5">Each agent collected a vectorized observation: <math alttext="o_{t}^{i}=\left[v_{t}^{i},m_{t}^{i}\right]\in\mathbb{R}^{28}" class="ltx_Math" display="inline" id="S2.SS2.SSS1.p1.1.m1.2"><semantics id="S2.SS2.SSS1.p1.1.m1.2a"><mrow id="S2.SS2.SSS1.p1.1.m1.2.2" xref="S2.SS2.SSS1.p1.1.m1.2.2.cmml"><msubsup id="S2.SS2.SSS1.p1.1.m1.2.2.4" xref="S2.SS2.SSS1.p1.1.m1.2.2.4.cmml"><mi id="S2.SS2.SSS1.p1.1.m1.2.2.4.2.2" xref="S2.SS2.SSS1.p1.1.m1.2.2.4.2.2.cmml">o</mi><mi id="S2.SS2.SSS1.p1.1.m1.2.2.4.2.3" xref="S2.SS2.SSS1.p1.1.m1.2.2.4.2.3.cmml">t</mi><mi id="S2.SS2.SSS1.p1.1.m1.2.2.4.3" xref="S2.SS2.SSS1.p1.1.m1.2.2.4.3.cmml">i</mi></msubsup><mo id="S2.SS2.SSS1.p1.1.m1.2.2.5" xref="S2.SS2.SSS1.p1.1.m1.2.2.5.cmml">=</mo><mrow id="S2.SS2.SSS1.p1.1.m1.2.2.2.2" xref="S2.SS2.SSS1.p1.1.m1.2.2.2.3.cmml"><mo id="S2.SS2.SSS1.p1.1.m1.2.2.2.2.3" xref="S2.SS2.SSS1.p1.1.m1.2.2.2.3.cmml">[</mo><msubsup 
id="S2.SS2.SSS1.p1.1.m1.1.1.1.1.1" xref="S2.SS2.SSS1.p1.1.m1.1.1.1.1.1.cmml"><mi id="S2.SS2.SSS1.p1.1.m1.1.1.1.1.1.2.2" xref="S2.SS2.SSS1.p1.1.m1.1.1.1.1.1.2.2.cmml">v</mi><mi id="S2.SS2.SSS1.p1.1.m1.1.1.1.1.1.2.3" xref="S2.SS2.SSS1.p1.1.m1.1.1.1.1.1.2.3.cmml">t</mi><mi id="S2.SS2.SSS1.p1.1.m1.1.1.1.1.1.3" xref="S2.SS2.SSS1.p1.1.m1.1.1.1.1.1.3.cmml">i</mi></msubsup><mo id="S2.SS2.SSS1.p1.1.m1.2.2.2.2.4" xref="S2.SS2.SSS1.p1.1.m1.2.2.2.3.cmml">,</mo><msubsup id="S2.SS2.SSS1.p1.1.m1.2.2.2.2.2" xref="S2.SS2.SSS1.p1.1.m1.2.2.2.2.2.cmml"><mi id="S2.SS2.SSS1.p1.1.m1.2.2.2.2.2.2.2" xref="S2.SS2.SSS1.p1.1.m1.2.2.2.2.2.2.2.cmml">m</mi><mi id="S2.SS2.SSS1.p1.1.m1.2.2.2.2.2.2.3" xref="S2.SS2.SSS1.p1.1.m1.2.2.2.2.2.2.3.cmml">t</mi><mi id="S2.SS2.SSS1.p1.1.m1.2.2.2.2.2.3" xref="S2.SS2.SSS1.p1.1.m1.2.2.2.2.2.3.cmml">i</mi></msubsup><mo id="S2.SS2.SSS1.p1.1.m1.2.2.2.2.5" xref="S2.SS2.SSS1.p1.1.m1.2.2.2.3.cmml">]</mo></mrow><mo id="S2.SS2.SSS1.p1.1.m1.2.2.6" xref="S2.SS2.SSS1.p1.1.m1.2.2.6.cmml">∈</mo><msup id="S2.SS2.SSS1.p1.1.m1.2.2.7" xref="S2.SS2.SSS1.p1.1.m1.2.2.7.cmml"><mi id="S2.SS2.SSS1.p1.1.m1.2.2.7.2" xref="S2.SS2.SSS1.p1.1.m1.2.2.7.2.cmml">ℝ</mi><mn id="S2.SS2.SSS1.p1.1.m1.2.2.7.3" xref="S2.SS2.SSS1.p1.1.m1.2.2.7.3.cmml">28</mn></msup></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.SSS1.p1.1.m1.2b"><apply id="S2.SS2.SSS1.p1.1.m1.2.2.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2"><and id="S2.SS2.SSS1.p1.1.m1.2.2a.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2"></and><apply id="S2.SS2.SSS1.p1.1.m1.2.2b.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2"><eq id="S2.SS2.SSS1.p1.1.m1.2.2.5.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2.5"></eq><apply id="S2.SS2.SSS1.p1.1.m1.2.2.4.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2.4"><csymbol cd="ambiguous" id="S2.SS2.SSS1.p1.1.m1.2.2.4.1.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2.4">superscript</csymbol><apply id="S2.SS2.SSS1.p1.1.m1.2.2.4.2.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2.4"><csymbol cd="ambiguous" id="S2.SS2.SSS1.p1.1.m1.2.2.4.2.1.cmml" 
xref="S2.SS2.SSS1.p1.1.m1.2.2.4">subscript</csymbol><ci id="S2.SS2.SSS1.p1.1.m1.2.2.4.2.2.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2.4.2.2">𝑜</ci><ci id="S2.SS2.SSS1.p1.1.m1.2.2.4.2.3.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2.4.2.3">𝑡</ci></apply><ci id="S2.SS2.SSS1.p1.1.m1.2.2.4.3.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2.4.3">𝑖</ci></apply><interval closure="closed" id="S2.SS2.SSS1.p1.1.m1.2.2.2.3.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2.2.2"><apply id="S2.SS2.SSS1.p1.1.m1.1.1.1.1.1.cmml" xref="S2.SS2.SSS1.p1.1.m1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.SS2.SSS1.p1.1.m1.1.1.1.1.1.1.cmml" xref="S2.SS2.SSS1.p1.1.m1.1.1.1.1.1">superscript</csymbol><apply id="S2.SS2.SSS1.p1.1.m1.1.1.1.1.1.2.cmml" xref="S2.SS2.SSS1.p1.1.m1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.SS2.SSS1.p1.1.m1.1.1.1.1.1.2.1.cmml" xref="S2.SS2.SSS1.p1.1.m1.1.1.1.1.1">subscript</csymbol><ci id="S2.SS2.SSS1.p1.1.m1.1.1.1.1.1.2.2.cmml" xref="S2.SS2.SSS1.p1.1.m1.1.1.1.1.1.2.2">𝑣</ci><ci id="S2.SS2.SSS1.p1.1.m1.1.1.1.1.1.2.3.cmml" xref="S2.SS2.SSS1.p1.1.m1.1.1.1.1.1.2.3">𝑡</ci></apply><ci id="S2.SS2.SSS1.p1.1.m1.1.1.1.1.1.3.cmml" xref="S2.SS2.SSS1.p1.1.m1.1.1.1.1.1.3">𝑖</ci></apply><apply id="S2.SS2.SSS1.p1.1.m1.2.2.2.2.2.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2.2.2.2"><csymbol cd="ambiguous" id="S2.SS2.SSS1.p1.1.m1.2.2.2.2.2.1.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2.2.2.2">superscript</csymbol><apply id="S2.SS2.SSS1.p1.1.m1.2.2.2.2.2.2.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2.2.2.2"><csymbol cd="ambiguous" id="S2.SS2.SSS1.p1.1.m1.2.2.2.2.2.2.1.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2.2.2.2">subscript</csymbol><ci id="S2.SS2.SSS1.p1.1.m1.2.2.2.2.2.2.2.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2.2.2.2.2.2">𝑚</ci><ci id="S2.SS2.SSS1.p1.1.m1.2.2.2.2.2.2.3.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2.2.2.2.2.3">𝑡</ci></apply><ci id="S2.SS2.SSS1.p1.1.m1.2.2.2.2.2.3.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2.2.2.2.3">𝑖</ci></apply></interval></apply><apply id="S2.SS2.SSS1.p1.1.m1.2.2c.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2"><in id="S2.SS2.SSS1.p1.1.m1.2.2.6.cmml" 
xref="S2.SS2.SSS1.p1.1.m1.2.2.6"></in><share href="https://arxiv.org/html/2403.10996v5#S2.SS2.SSS1.p1.1.m1.2.2.2.cmml" id="S2.SS2.SSS1.p1.1.m1.2.2d.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2"></share><apply id="S2.SS2.SSS1.p1.1.m1.2.2.7.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2.7"><csymbol cd="ambiguous" id="S2.SS2.SSS1.p1.1.m1.2.2.7.1.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2.7">superscript</csymbol><ci id="S2.SS2.SSS1.p1.1.m1.2.2.7.2.cmml" xref="S2.SS2.SSS1.p1.1.m1.2.2.7.2">ℝ</ci><cn id="S2.SS2.SSS1.p1.1.m1.2.2.7.3.cmml" type="integer" xref="S2.SS2.SSS1.p1.1.m1.2.2.7.3">28</cn></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.SSS1.p1.1.m1.2c">o_{t}^{i}=\left[v_{t}^{i},m_{t}^{i}\right]\in\mathbb{R}^{28}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.SSS1.p1.1.m1.2d">italic_o start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT = [ italic_v start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT , italic_m start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ] ∈ blackboard_R start_POSTSUPERSCRIPT 28 end_POSTSUPERSCRIPT</annotation></semantics></math>. 
Here, <math alttext="v_{t}^{i}\in\mathbb{R}^{1}" class="ltx_Math" display="inline" id="S2.SS2.SSS1.p1.2.m2.1"><semantics id="S2.SS2.SSS1.p1.2.m2.1a"><mrow id="S2.SS2.SSS1.p1.2.m2.1.1" xref="S2.SS2.SSS1.p1.2.m2.1.1.cmml"><msubsup id="S2.SS2.SSS1.p1.2.m2.1.1.2" xref="S2.SS2.SSS1.p1.2.m2.1.1.2.cmml"><mi id="S2.SS2.SSS1.p1.2.m2.1.1.2.2.2" xref="S2.SS2.SSS1.p1.2.m2.1.1.2.2.2.cmml">v</mi><mi id="S2.SS2.SSS1.p1.2.m2.1.1.2.2.3" xref="S2.SS2.SSS1.p1.2.m2.1.1.2.2.3.cmml">t</mi><mi id="S2.SS2.SSS1.p1.2.m2.1.1.2.3" xref="S2.SS2.SSS1.p1.2.m2.1.1.2.3.cmml">i</mi></msubsup><mo id="S2.SS2.SSS1.p1.2.m2.1.1.1" xref="S2.SS2.SSS1.p1.2.m2.1.1.1.cmml">∈</mo><msup id="S2.SS2.SSS1.p1.2.m2.1.1.3" xref="S2.SS2.SSS1.p1.2.m2.1.1.3.cmml"><mi id="S2.SS2.SSS1.p1.2.m2.1.1.3.2" xref="S2.SS2.SSS1.p1.2.m2.1.1.3.2.cmml">ℝ</mi><mn id="S2.SS2.SSS1.p1.2.m2.1.1.3.3" xref="S2.SS2.SSS1.p1.2.m2.1.1.3.3.cmml">1</mn></msup></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.SSS1.p1.2.m2.1b"><apply id="S2.SS2.SSS1.p1.2.m2.1.1.cmml" xref="S2.SS2.SSS1.p1.2.m2.1.1"><in id="S2.SS2.SSS1.p1.2.m2.1.1.1.cmml" xref="S2.SS2.SSS1.p1.2.m2.1.1.1"></in><apply id="S2.SS2.SSS1.p1.2.m2.1.1.2.cmml" xref="S2.SS2.SSS1.p1.2.m2.1.1.2"><csymbol cd="ambiguous" id="S2.SS2.SSS1.p1.2.m2.1.1.2.1.cmml" xref="S2.SS2.SSS1.p1.2.m2.1.1.2">superscript</csymbol><apply id="S2.SS2.SSS1.p1.2.m2.1.1.2.2.cmml" xref="S2.SS2.SSS1.p1.2.m2.1.1.2"><csymbol cd="ambiguous" id="S2.SS2.SSS1.p1.2.m2.1.1.2.2.1.cmml" xref="S2.SS2.SSS1.p1.2.m2.1.1.2">subscript</csymbol><ci id="S2.SS2.SSS1.p1.2.m2.1.1.2.2.2.cmml" xref="S2.SS2.SSS1.p1.2.m2.1.1.2.2.2">𝑣</ci><ci id="S2.SS2.SSS1.p1.2.m2.1.1.2.2.3.cmml" xref="S2.SS2.SSS1.p1.2.m2.1.1.2.2.3">𝑡</ci></apply><ci id="S2.SS2.SSS1.p1.2.m2.1.1.2.3.cmml" xref="S2.SS2.SSS1.p1.2.m2.1.1.2.3">𝑖</ci></apply><apply id="S2.SS2.SSS1.p1.2.m2.1.1.3.cmml" xref="S2.SS2.SSS1.p1.2.m2.1.1.3"><csymbol cd="ambiguous" id="S2.SS2.SSS1.p1.2.m2.1.1.3.1.cmml" xref="S2.SS2.SSS1.p1.2.m2.1.1.3">superscript</csymbol><ci 
id="S2.SS2.SSS1.p1.2.m2.1.1.3.2.cmml" xref="S2.SS2.SSS1.p1.2.m2.1.1.3.2">ℝ</ci><cn id="S2.SS2.SSS1.p1.2.m2.1.1.3.3.cmml" type="integer" xref="S2.SS2.SSS1.p1.2.m2.1.1.3.3">1</cn></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.SSS1.p1.2.m2.1c">v_{t}^{i}\in\mathbb{R}^{1}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.SSS1.p1.2.m2.1d">italic_v start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ∈ blackboard_R start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT</annotation></semantics></math> represents the estimated forward velocity of <math alttext="i" class="ltx_Math" display="inline" id="S2.SS2.SSS1.p1.3.m3.1"><semantics id="S2.SS2.SSS1.p1.3.m3.1a"><mi id="S2.SS2.SSS1.p1.3.m3.1.1" xref="S2.SS2.SSS1.p1.3.m3.1.1.cmml">i</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.SSS1.p1.3.m3.1b"><ci id="S2.SS2.SSS1.p1.3.m3.1.1.cmml" xref="S2.SS2.SSS1.p1.3.m3.1.1">𝑖</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.SSS1.p1.3.m3.1c">i</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.SSS1.p1.3.m3.1d">italic_i</annotation></semantics></math>-th agent, and <math alttext="m_{t}^{i}=\left[{}^{1}m_{t}^{i},^{2}m_{t}^{i},\cdots,^{27}m_{t}^{i}\right]\in% \mathbb{R}^{27}" class="ltx_math_unparsed" display="inline" id="S2.SS2.SSS1.p1.4.m4.1"><semantics id="S2.SS2.SSS1.p1.4.m4.1a"><mrow id="S2.SS2.SSS1.p1.4.m4.1b"><msubsup id="S2.SS2.SSS1.p1.4.m4.1.1"><mi id="S2.SS2.SSS1.p1.4.m4.1.1.2.2">m</mi><mi id="S2.SS2.SSS1.p1.4.m4.1.1.2.3">t</mi><mi id="S2.SS2.SSS1.p1.4.m4.1.1.3">i</mi></msubsup><mo id="S2.SS2.SSS1.p1.4.m4.1.2">=</mo><mrow id="S2.SS2.SSS1.p1.4.m4.1.3"><mo id="S2.SS2.SSS1.p1.4.m4.1.3.1">[</mo><mmultiscripts id="S2.SS2.SSS1.p1.4.m4.1.3.2"><mi id="S2.SS2.SSS1.p1.4.m4.1.3.2.2.2.2">m</mi><mi id="S2.SS2.SSS1.p1.4.m4.1.3.2.2.2.3">t</mi><mi id="S2.SS2.SSS1.p1.4.m4.1.3.2.2.3">i</mi><mprescripts id="S2.SS2.SSS1.p1.4.m4.1.3.2a"></mprescripts><mrow 
id="S2.SS2.SSS1.p1.4.m4.1.3.2b"></mrow><mn id="S2.SS2.SSS1.p1.4.m4.1.3.2.3">1</mn></mmultiscripts><msup id="S2.SS2.SSS1.p1.4.m4.1.3.3"><mo id="S2.SS2.SSS1.p1.4.m4.1.3.3.2">,</mo><mn id="S2.SS2.SSS1.p1.4.m4.1.3.3.3">2</mn></msup><msubsup id="S2.SS2.SSS1.p1.4.m4.1.3.4"><mi id="S2.SS2.SSS1.p1.4.m4.1.3.4.2.2">m</mi><mi id="S2.SS2.SSS1.p1.4.m4.1.3.4.2.3">t</mi><mi id="S2.SS2.SSS1.p1.4.m4.1.3.4.3">i</mi></msubsup><mo id="S2.SS2.SSS1.p1.4.m4.1.3.5">,</mo><mi id="S2.SS2.SSS1.p1.4.m4.1.3.6" mathvariant="normal">⋯</mi><msup id="S2.SS2.SSS1.p1.4.m4.1.3.7"><mo id="S2.SS2.SSS1.p1.4.m4.1.3.7.2">,</mo><mn id="S2.SS2.SSS1.p1.4.m4.1.3.7.3">27</mn></msup><msubsup id="S2.SS2.SSS1.p1.4.m4.1.3.8"><mi id="S2.SS2.SSS1.p1.4.m4.1.3.8.2.2">m</mi><mi id="S2.SS2.SSS1.p1.4.m4.1.3.8.2.3">t</mi><mi id="S2.SS2.SSS1.p1.4.m4.1.3.8.3">i</mi></msubsup><mo id="S2.SS2.SSS1.p1.4.m4.1.3.9">]</mo></mrow><mo id="S2.SS2.SSS1.p1.4.m4.1.4">∈</mo><msup id="S2.SS2.SSS1.p1.4.m4.1.5"><mi id="S2.SS2.SSS1.p1.4.m4.1.5.2">ℝ</mi><mn id="S2.SS2.SSS1.p1.4.m4.1.5.3">27</mn></msup></mrow><annotation encoding="application/x-tex" id="S2.SS2.SSS1.p1.4.m4.1c">m_{t}^{i}=\left[{}^{1}m_{t}^{i},^{2}m_{t}^{i},\cdots,^{27}m_{t}^{i}\right]\in% \mathbb{R}^{27}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.SSS1.p1.4.m4.1d">italic_m start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT = [ start_FLOATSUPERSCRIPT 1 end_FLOATSUPERSCRIPT italic_m start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT , start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_m start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT , ⋯ , start_POSTSUPERSCRIPT 27 end_POSTSUPERSCRIPT italic_m start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ] ∈ blackboard_R start_POSTSUPERSCRIPT 27 end_POSTSUPERSCRIPT</annotation></semantics></math> is the LIDAR range array with 27 
ranging measurements uniformly spaced 270<sup class="ltx_sup" id="S2.SS2.SSS1.p1.5.1"><span class="ltx_text ltx_font_italic" id="S2.SS2.SSS1.p1.5.1.1">∘</span></sup> around each side of the heading vector, within a 10 m radius.</p> </div> </section> <section class="ltx_subsubsection" id="S2.SS2.SSS2"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection"><span class="ltx_text" id="S2.SS2.SSS2.5.1.1">II-B</span>2 </span>Action Space</h4> <div class="ltx_para" id="S2.SS2.SSS2.p1"> <p class="ltx_p" id="S2.SS2.SSS2.p1.1">The action space to control Ackermann-steered vehicles was <math alttext="a_{t}^{i}=\left[\tau_{t}^{i},\delta_{t}^{i}\right]\in\mathbb{R}^{2}" class="ltx_Math" display="inline" id="S2.SS2.SSS2.p1.1.m1.2"><semantics id="S2.SS2.SSS2.p1.1.m1.2a"><mrow id="S2.SS2.SSS2.p1.1.m1.2.2" xref="S2.SS2.SSS2.p1.1.m1.2.2.cmml"><msubsup id="S2.SS2.SSS2.p1.1.m1.2.2.4" xref="S2.SS2.SSS2.p1.1.m1.2.2.4.cmml"><mi id="S2.SS2.SSS2.p1.1.m1.2.2.4.2.2" xref="S2.SS2.SSS2.p1.1.m1.2.2.4.2.2.cmml">a</mi><mi id="S2.SS2.SSS2.p1.1.m1.2.2.4.2.3" xref="S2.SS2.SSS2.p1.1.m1.2.2.4.2.3.cmml">t</mi><mi id="S2.SS2.SSS2.p1.1.m1.2.2.4.3" xref="S2.SS2.SSS2.p1.1.m1.2.2.4.3.cmml">i</mi></msubsup><mo id="S2.SS2.SSS2.p1.1.m1.2.2.5" xref="S2.SS2.SSS2.p1.1.m1.2.2.5.cmml">=</mo><mrow id="S2.SS2.SSS2.p1.1.m1.2.2.2.2" xref="S2.SS2.SSS2.p1.1.m1.2.2.2.3.cmml"><mo id="S2.SS2.SSS2.p1.1.m1.2.2.2.2.3" xref="S2.SS2.SSS2.p1.1.m1.2.2.2.3.cmml">[</mo><msubsup id="S2.SS2.SSS2.p1.1.m1.1.1.1.1.1" xref="S2.SS2.SSS2.p1.1.m1.1.1.1.1.1.cmml"><mi id="S2.SS2.SSS2.p1.1.m1.1.1.1.1.1.2.2" xref="S2.SS2.SSS2.p1.1.m1.1.1.1.1.1.2.2.cmml">τ</mi><mi id="S2.SS2.SSS2.p1.1.m1.1.1.1.1.1.2.3" xref="S2.SS2.SSS2.p1.1.m1.1.1.1.1.1.2.3.cmml">t</mi><mi id="S2.SS2.SSS2.p1.1.m1.1.1.1.1.1.3" xref="S2.SS2.SSS2.p1.1.m1.1.1.1.1.1.3.cmml">i</mi></msubsup><mo id="S2.SS2.SSS2.p1.1.m1.2.2.2.2.4" xref="S2.SS2.SSS2.p1.1.m1.2.2.2.3.cmml">,</mo><msubsup id="S2.SS2.SSS2.p1.1.m1.2.2.2.2.2" 
xref="S2.SS2.SSS2.p1.1.m1.2.2.2.2.2.cmml"><mi id="S2.SS2.SSS2.p1.1.m1.2.2.2.2.2.2.2" xref="S2.SS2.SSS2.p1.1.m1.2.2.2.2.2.2.2.cmml">δ</mi><mi id="S2.SS2.SSS2.p1.1.m1.2.2.2.2.2.2.3" xref="S2.SS2.SSS2.p1.1.m1.2.2.2.2.2.2.3.cmml">t</mi><mi id="S2.SS2.SSS2.p1.1.m1.2.2.2.2.2.3" xref="S2.SS2.SSS2.p1.1.m1.2.2.2.2.2.3.cmml">i</mi></msubsup><mo id="S2.SS2.SSS2.p1.1.m1.2.2.2.2.5" xref="S2.SS2.SSS2.p1.1.m1.2.2.2.3.cmml">]</mo></mrow><mo id="S2.SS2.SSS2.p1.1.m1.2.2.6" xref="S2.SS2.SSS2.p1.1.m1.2.2.6.cmml">∈</mo><msup id="S2.SS2.SSS2.p1.1.m1.2.2.7" xref="S2.SS2.SSS2.p1.1.m1.2.2.7.cmml"><mi id="S2.SS2.SSS2.p1.1.m1.2.2.7.2" xref="S2.SS2.SSS2.p1.1.m1.2.2.7.2.cmml">ℝ</mi><mn id="S2.SS2.SSS2.p1.1.m1.2.2.7.3" xref="S2.SS2.SSS2.p1.1.m1.2.2.7.3.cmml">2</mn></msup></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.SSS2.p1.1.m1.2b"><apply id="S2.SS2.SSS2.p1.1.m1.2.2.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2"><and id="S2.SS2.SSS2.p1.1.m1.2.2a.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2"></and><apply id="S2.SS2.SSS2.p1.1.m1.2.2b.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2"><eq id="S2.SS2.SSS2.p1.1.m1.2.2.5.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2.5"></eq><apply id="S2.SS2.SSS2.p1.1.m1.2.2.4.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2.4"><csymbol cd="ambiguous" id="S2.SS2.SSS2.p1.1.m1.2.2.4.1.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2.4">superscript</csymbol><apply id="S2.SS2.SSS2.p1.1.m1.2.2.4.2.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2.4"><csymbol cd="ambiguous" id="S2.SS2.SSS2.p1.1.m1.2.2.4.2.1.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2.4">subscript</csymbol><ci id="S2.SS2.SSS2.p1.1.m1.2.2.4.2.2.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2.4.2.2">𝑎</ci><ci id="S2.SS2.SSS2.p1.1.m1.2.2.4.2.3.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2.4.2.3">𝑡</ci></apply><ci id="S2.SS2.SSS2.p1.1.m1.2.2.4.3.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2.4.3">𝑖</ci></apply><interval closure="closed" id="S2.SS2.SSS2.p1.1.m1.2.2.2.3.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2.2.2"><apply id="S2.SS2.SSS2.p1.1.m1.1.1.1.1.1.cmml" xref="S2.SS2.SSS2.p1.1.m1.1.1.1.1.1"><csymbol cd="ambiguous" 
id="S2.SS2.SSS2.p1.1.m1.1.1.1.1.1.1.cmml" xref="S2.SS2.SSS2.p1.1.m1.1.1.1.1.1">superscript</csymbol><apply id="S2.SS2.SSS2.p1.1.m1.1.1.1.1.1.2.cmml" xref="S2.SS2.SSS2.p1.1.m1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.SS2.SSS2.p1.1.m1.1.1.1.1.1.2.1.cmml" xref="S2.SS2.SSS2.p1.1.m1.1.1.1.1.1">subscript</csymbol><ci id="S2.SS2.SSS2.p1.1.m1.1.1.1.1.1.2.2.cmml" xref="S2.SS2.SSS2.p1.1.m1.1.1.1.1.1.2.2">𝜏</ci><ci id="S2.SS2.SSS2.p1.1.m1.1.1.1.1.1.2.3.cmml" xref="S2.SS2.SSS2.p1.1.m1.1.1.1.1.1.2.3">𝑡</ci></apply><ci id="S2.SS2.SSS2.p1.1.m1.1.1.1.1.1.3.cmml" xref="S2.SS2.SSS2.p1.1.m1.1.1.1.1.1.3">𝑖</ci></apply><apply id="S2.SS2.SSS2.p1.1.m1.2.2.2.2.2.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2.2.2.2"><csymbol cd="ambiguous" id="S2.SS2.SSS2.p1.1.m1.2.2.2.2.2.1.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2.2.2.2">superscript</csymbol><apply id="S2.SS2.SSS2.p1.1.m1.2.2.2.2.2.2.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2.2.2.2"><csymbol cd="ambiguous" id="S2.SS2.SSS2.p1.1.m1.2.2.2.2.2.2.1.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2.2.2.2">subscript</csymbol><ci id="S2.SS2.SSS2.p1.1.m1.2.2.2.2.2.2.2.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2.2.2.2.2.2">𝛿</ci><ci id="S2.SS2.SSS2.p1.1.m1.2.2.2.2.2.2.3.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2.2.2.2.2.3">𝑡</ci></apply><ci id="S2.SS2.SSS2.p1.1.m1.2.2.2.2.2.3.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2.2.2.2.3">𝑖</ci></apply></interval></apply><apply id="S2.SS2.SSS2.p1.1.m1.2.2c.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2"><in id="S2.SS2.SSS2.p1.1.m1.2.2.6.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2.6"></in><share href="https://arxiv.org/html/2403.10996v5#S2.SS2.SSS2.p1.1.m1.2.2.2.cmml" id="S2.SS2.SSS2.p1.1.m1.2.2d.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2"></share><apply id="S2.SS2.SSS2.p1.1.m1.2.2.7.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2.7"><csymbol cd="ambiguous" id="S2.SS2.SSS2.p1.1.m1.2.2.7.1.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2.7">superscript</csymbol><ci id="S2.SS2.SSS2.p1.1.m1.2.2.7.2.cmml" xref="S2.SS2.SSS2.p1.1.m1.2.2.7.2">ℝ</ci><cn id="S2.SS2.SSS2.p1.1.m1.2.2.7.3.cmml" type="integer" 
xref="S2.SS2.SSS2.p1.1.m1.2.2.7.3">2</cn></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.SSS2.p1.1.m1.2c">a_{t}^{i}=\left[\tau_{t}^{i},\delta_{t}^{i}\right]\in\mathbb{R}^{2}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.SSS2.p1.1.m1.2d">italic_a start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT = [ italic_τ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT , italic_δ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT ] ∈ blackboard_R start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT</annotation></semantics></math>. Here, the throttle command allowed the agents to optimize their speed profile, and the steering command allowed the agents to optimize their race line, overtake their peers, and avoid collisions.</p> </div> </section> <section class="ltx_subsubsection" id="S2.SS2.SSS3"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection"><span class="ltx_text" id="S2.SS2.SSS3.5.1.1">II-B</span>3 </span>Reward Function</h4> <div class="ltx_para" id="S2.SS2.SSS3.p1"> <p class="ltx_p" id="S2.SS2.SSS3.p1.1">The following signals guided the agents:</p> </div> <div class="ltx_para" id="S2.SS2.SSS3.p2"> <ul class="ltx_itemize" id="S2.I1"> <li class="ltx_item" id="S2.I1.i1" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S2.I1.i1.p1"> <p class="ltx_p" id="S2.I1.i1.p1.1"><span class="ltx_text ltx_font_bold" id="S2.I1.i1.p1.1.1">Behavioral Cloning:</span> The behavioral cloning (BC) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib14" title="">14</a>]</cite> algorithm updated the policy in a supervised fashion with respect to the recorded demonstrations, mutually exclusive of the reinforcement learning update.</p> </div> </li> <li 
class="ltx_item" id="S2.I1.i2" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S2.I1.i2.p1"> <p class="ltx_p" id="S2.I1.i2.p1.1"><span class="ltx_text ltx_font_bold" id="S2.I1.i2.p1.1.1">GAIL Reward:</span> The generative adversarial imitation learning (GAIL) reward <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib15" title="">15</a>]</cite> <math alttext="{}^{g}r_{t}" class="ltx_Math" display="inline" id="S2.I1.i2.p1.1.m1.1"><semantics id="S2.I1.i2.p1.1.m1.1a"><mmultiscripts id="S2.I1.i2.p1.1.m1.1.1" xref="S2.I1.i2.p1.1.m1.1.1.cmml"><mi id="S2.I1.i2.p1.1.m1.1.1.2.2" xref="S2.I1.i2.p1.1.m1.1.1.2.2.cmml">r</mi><mi id="S2.I1.i2.p1.1.m1.1.1.2.3" xref="S2.I1.i2.p1.1.m1.1.1.2.3.cmml">t</mi><mrow id="S2.I1.i2.p1.1.m1.1.1a" xref="S2.I1.i2.p1.1.m1.1.1.cmml"></mrow><mprescripts id="S2.I1.i2.p1.1.m1.1.1b" xref="S2.I1.i2.p1.1.m1.1.1.cmml"></mprescripts><mrow id="S2.I1.i2.p1.1.m1.1.1c" xref="S2.I1.i2.p1.1.m1.1.1.cmml"></mrow><mi id="S2.I1.i2.p1.1.m1.1.1.3" xref="S2.I1.i2.p1.1.m1.1.1.3.cmml">g</mi></mmultiscripts><annotation-xml encoding="MathML-Content" id="S2.I1.i2.p1.1.m1.1b"><apply id="S2.I1.i2.p1.1.m1.1.1.cmml" xref="S2.I1.i2.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S2.I1.i2.p1.1.m1.1.1.1.cmml" xref="S2.I1.i2.p1.1.m1.1.1">superscript</csymbol><apply id="S2.I1.i2.p1.1.m1.1.1.2.cmml" xref="S2.I1.i2.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S2.I1.i2.p1.1.m1.1.1.2.1.cmml" xref="S2.I1.i2.p1.1.m1.1.1">subscript</csymbol><ci id="S2.I1.i2.p1.1.m1.1.1.2.2.cmml" xref="S2.I1.i2.p1.1.m1.1.1.2.2">𝑟</ci><ci id="S2.I1.i2.p1.1.m1.1.1.2.3.cmml" xref="S2.I1.i2.p1.1.m1.1.1.2.3">𝑡</ci></apply><ci id="S2.I1.i2.p1.1.m1.1.1.3.cmml" xref="S2.I1.i2.p1.1.m1.1.1.3">𝑔</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.I1.i2.p1.1.m1.1c">{}^{g}r_{t}</annotation><annotation encoding="application/x-llamapun" id="S2.I1.i2.p1.1.m1.1d">start_FLOATSUPERSCRIPT italic_g 
end_FLOATSUPERSCRIPT italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math> ensured that the agent optimized its actions safely and ethically by rewarding it in proportion to the closeness of new observation-action pairs to those from the recorded demonstrations.</p> </div> </li> <li class="ltx_item" id="S2.I1.i3" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S2.I1.i3.p1"> <p class="ltx_p" id="S2.I1.i3.p1.1"><span class="ltx_text ltx_font_bold" id="S2.I1.i3.p1.1.1">Curiosity Reward:</span> The curiosity reward <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib16" title="">16</a>]</cite> <math alttext="{}^{c}r_{t}" class="ltx_Math" display="inline" id="S2.I1.i3.p1.1.m1.1"><semantics id="S2.I1.i3.p1.1.m1.1a"><mmultiscripts id="S2.I1.i3.p1.1.m1.1.1" xref="S2.I1.i3.p1.1.m1.1.1.cmml"><mi id="S2.I1.i3.p1.1.m1.1.1.2.2" xref="S2.I1.i3.p1.1.m1.1.1.2.2.cmml">r</mi><mi id="S2.I1.i3.p1.1.m1.1.1.2.3" xref="S2.I1.i3.p1.1.m1.1.1.2.3.cmml">t</mi><mrow id="S2.I1.i3.p1.1.m1.1.1a" xref="S2.I1.i3.p1.1.m1.1.1.cmml"></mrow><mprescripts id="S2.I1.i3.p1.1.m1.1.1b" xref="S2.I1.i3.p1.1.m1.1.1.cmml"></mprescripts><mrow id="S2.I1.i3.p1.1.m1.1.1c" xref="S2.I1.i3.p1.1.m1.1.1.cmml"></mrow><mi id="S2.I1.i3.p1.1.m1.1.1.3" xref="S2.I1.i3.p1.1.m1.1.1.3.cmml">c</mi></mmultiscripts><annotation-xml encoding="MathML-Content" id="S2.I1.i3.p1.1.m1.1b"><apply id="S2.I1.i3.p1.1.m1.1.1.cmml" xref="S2.I1.i3.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S2.I1.i3.p1.1.m1.1.1.1.cmml" xref="S2.I1.i3.p1.1.m1.1.1">superscript</csymbol><apply id="S2.I1.i3.p1.1.m1.1.1.2.cmml" xref="S2.I1.i3.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S2.I1.i3.p1.1.m1.1.1.2.1.cmml" xref="S2.I1.i3.p1.1.m1.1.1">subscript</csymbol><ci id="S2.I1.i3.p1.1.m1.1.1.2.2.cmml" xref="S2.I1.i3.p1.1.m1.1.1.2.2">𝑟</ci><ci id="S2.I1.i3.p1.1.m1.1.1.2.3.cmml" xref="S2.I1.i3.p1.1.m1.1.1.2.3">𝑡</ci></apply><ci 
id="S2.I1.i3.p1.1.m1.1.1.3.cmml" xref="S2.I1.i3.p1.1.m1.1.1.3">𝑐</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.I1.i3.p1.1.m1.1c">{}^{c}r_{t}</annotation><annotation encoding="application/x-llamapun" id="S2.I1.i3.p1.1.m1.1d">start_FLOATSUPERSCRIPT italic_c end_FLOATSUPERSCRIPT italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math> promoted exploration by granting a reward proportional to the difference between the predicted and actual encoded observations.</p> </div> </li> <li class="ltx_item" id="S2.I1.i4" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S2.I1.i4.p1"> <p class="ltx_p" id="S2.I1.i4.p1.7"><span class="ltx_text ltx_font_bold" id="S2.I1.i4.p1.7.1">Extrinsic Reward:</span> The objectives of reducing lap time and respecting motion constraints were handled by an extrinsic reward function <math alttext="{}^{e}r_{t}" class="ltx_Math" display="inline" id="S2.I1.i4.p1.1.m1.1"><semantics id="S2.I1.i4.p1.1.m1.1a"><mmultiscripts id="S2.I1.i4.p1.1.m1.1.1" xref="S2.I1.i4.p1.1.m1.1.1.cmml"><mi id="S2.I1.i4.p1.1.m1.1.1.2.2" xref="S2.I1.i4.p1.1.m1.1.1.2.2.cmml">r</mi><mi id="S2.I1.i4.p1.1.m1.1.1.2.3" xref="S2.I1.i4.p1.1.m1.1.1.2.3.cmml">t</mi><mrow id="S2.I1.i4.p1.1.m1.1.1a" xref="S2.I1.i4.p1.1.m1.1.1.cmml"></mrow><mprescripts id="S2.I1.i4.p1.1.m1.1.1b" xref="S2.I1.i4.p1.1.m1.1.1.cmml"></mprescripts><mrow id="S2.I1.i4.p1.1.m1.1.1c" xref="S2.I1.i4.p1.1.m1.1.1.cmml"></mrow><mi id="S2.I1.i4.p1.1.m1.1.1.3" xref="S2.I1.i4.p1.1.m1.1.1.3.cmml">e</mi></mmultiscripts><annotation-xml encoding="MathML-Content" id="S2.I1.i4.p1.1.m1.1b"><apply id="S2.I1.i4.p1.1.m1.1.1.cmml" xref="S2.I1.i4.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S2.I1.i4.p1.1.m1.1.1.1.cmml" xref="S2.I1.i4.p1.1.m1.1.1">superscript</csymbol><apply id="S2.I1.i4.p1.1.m1.1.1.2.cmml" xref="S2.I1.i4.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S2.I1.i4.p1.1.m1.1.1.2.1.cmml" xref="S2.I1.i4.p1.1.m1.1.1">subscript</csymbol><ci 
id="S2.I1.i4.p1.1.m1.1.1.2.2.cmml" xref="S2.I1.i4.p1.1.m1.1.1.2.2">𝑟</ci><ci id="S2.I1.i4.p1.1.m1.1.1.2.3.cmml" xref="S2.I1.i4.p1.1.m1.1.1.2.3">𝑡</ci></apply><ci id="S2.I1.i4.p1.1.m1.1.1.3.cmml" xref="S2.I1.i4.p1.1.m1.1.1.3">𝑒</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.I1.i4.p1.1.m1.1c">{}^{e}r_{t}</annotation><annotation encoding="application/x-llamapun" id="S2.I1.i4.p1.1.m1.1d">start_FLOATSUPERSCRIPT italic_e end_FLOATSUPERSCRIPT italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math>. The agents received a reward of <math alttext="r_{checkpoint}=0.01" class="ltx_Math" display="inline" id="S2.I1.i4.p1.2.m2.1"><semantics id="S2.I1.i4.p1.2.m2.1a"><mrow id="S2.I1.i4.p1.2.m2.1.1" xref="S2.I1.i4.p1.2.m2.1.1.cmml"><msub id="S2.I1.i4.p1.2.m2.1.1.2" xref="S2.I1.i4.p1.2.m2.1.1.2.cmml"><mi id="S2.I1.i4.p1.2.m2.1.1.2.2" xref="S2.I1.i4.p1.2.m2.1.1.2.2.cmml">r</mi><mrow id="S2.I1.i4.p1.2.m2.1.1.2.3" xref="S2.I1.i4.p1.2.m2.1.1.2.3.cmml"><mi id="S2.I1.i4.p1.2.m2.1.1.2.3.2" xref="S2.I1.i4.p1.2.m2.1.1.2.3.2.cmml">c</mi><mo id="S2.I1.i4.p1.2.m2.1.1.2.3.1" xref="S2.I1.i4.p1.2.m2.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.2.m2.1.1.2.3.3" xref="S2.I1.i4.p1.2.m2.1.1.2.3.3.cmml">h</mi><mo id="S2.I1.i4.p1.2.m2.1.1.2.3.1a" xref="S2.I1.i4.p1.2.m2.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.2.m2.1.1.2.3.4" xref="S2.I1.i4.p1.2.m2.1.1.2.3.4.cmml">e</mi><mo id="S2.I1.i4.p1.2.m2.1.1.2.3.1b" xref="S2.I1.i4.p1.2.m2.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.2.m2.1.1.2.3.5" xref="S2.I1.i4.p1.2.m2.1.1.2.3.5.cmml">c</mi><mo id="S2.I1.i4.p1.2.m2.1.1.2.3.1c" xref="S2.I1.i4.p1.2.m2.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.2.m2.1.1.2.3.6" xref="S2.I1.i4.p1.2.m2.1.1.2.3.6.cmml">k</mi><mo id="S2.I1.i4.p1.2.m2.1.1.2.3.1d" xref="S2.I1.i4.p1.2.m2.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.2.m2.1.1.2.3.7" xref="S2.I1.i4.p1.2.m2.1.1.2.3.7.cmml">p</mi><mo id="S2.I1.i4.p1.2.m2.1.1.2.3.1e" xref="S2.I1.i4.p1.2.m2.1.1.2.3.1.cmml">⁢</mo><mi 
id="S2.I1.i4.p1.2.m2.1.1.2.3.8" xref="S2.I1.i4.p1.2.m2.1.1.2.3.8.cmml">o</mi><mo id="S2.I1.i4.p1.2.m2.1.1.2.3.1f" xref="S2.I1.i4.p1.2.m2.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.2.m2.1.1.2.3.9" xref="S2.I1.i4.p1.2.m2.1.1.2.3.9.cmml">i</mi><mo id="S2.I1.i4.p1.2.m2.1.1.2.3.1g" xref="S2.I1.i4.p1.2.m2.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.2.m2.1.1.2.3.10" xref="S2.I1.i4.p1.2.m2.1.1.2.3.10.cmml">n</mi><mo id="S2.I1.i4.p1.2.m2.1.1.2.3.1h" xref="S2.I1.i4.p1.2.m2.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.2.m2.1.1.2.3.11" xref="S2.I1.i4.p1.2.m2.1.1.2.3.11.cmml">t</mi></mrow></msub><mo id="S2.I1.i4.p1.2.m2.1.1.1" xref="S2.I1.i4.p1.2.m2.1.1.1.cmml">=</mo><mn id="S2.I1.i4.p1.2.m2.1.1.3" xref="S2.I1.i4.p1.2.m2.1.1.3.cmml">0.01</mn></mrow><annotation-xml encoding="MathML-Content" id="S2.I1.i4.p1.2.m2.1b"><apply id="S2.I1.i4.p1.2.m2.1.1.cmml" xref="S2.I1.i4.p1.2.m2.1.1"><eq id="S2.I1.i4.p1.2.m2.1.1.1.cmml" xref="S2.I1.i4.p1.2.m2.1.1.1"></eq><apply id="S2.I1.i4.p1.2.m2.1.1.2.cmml" xref="S2.I1.i4.p1.2.m2.1.1.2"><csymbol cd="ambiguous" id="S2.I1.i4.p1.2.m2.1.1.2.1.cmml" xref="S2.I1.i4.p1.2.m2.1.1.2">subscript</csymbol><ci id="S2.I1.i4.p1.2.m2.1.1.2.2.cmml" xref="S2.I1.i4.p1.2.m2.1.1.2.2">𝑟</ci><apply id="S2.I1.i4.p1.2.m2.1.1.2.3.cmml" xref="S2.I1.i4.p1.2.m2.1.1.2.3"><times id="S2.I1.i4.p1.2.m2.1.1.2.3.1.cmml" xref="S2.I1.i4.p1.2.m2.1.1.2.3.1"></times><ci id="S2.I1.i4.p1.2.m2.1.1.2.3.2.cmml" xref="S2.I1.i4.p1.2.m2.1.1.2.3.2">𝑐</ci><ci id="S2.I1.i4.p1.2.m2.1.1.2.3.3.cmml" xref="S2.I1.i4.p1.2.m2.1.1.2.3.3">ℎ</ci><ci id="S2.I1.i4.p1.2.m2.1.1.2.3.4.cmml" xref="S2.I1.i4.p1.2.m2.1.1.2.3.4">𝑒</ci><ci id="S2.I1.i4.p1.2.m2.1.1.2.3.5.cmml" xref="S2.I1.i4.p1.2.m2.1.1.2.3.5">𝑐</ci><ci id="S2.I1.i4.p1.2.m2.1.1.2.3.6.cmml" xref="S2.I1.i4.p1.2.m2.1.1.2.3.6">𝑘</ci><ci id="S2.I1.i4.p1.2.m2.1.1.2.3.7.cmml" xref="S2.I1.i4.p1.2.m2.1.1.2.3.7">𝑝</ci><ci id="S2.I1.i4.p1.2.m2.1.1.2.3.8.cmml" xref="S2.I1.i4.p1.2.m2.1.1.2.3.8">𝑜</ci><ci id="S2.I1.i4.p1.2.m2.1.1.2.3.9.cmml" 
xref="S2.I1.i4.p1.2.m2.1.1.2.3.9">𝑖</ci><ci id="S2.I1.i4.p1.2.m2.1.1.2.3.10.cmml" xref="S2.I1.i4.p1.2.m2.1.1.2.3.10">𝑛</ci><ci id="S2.I1.i4.p1.2.m2.1.1.2.3.11.cmml" xref="S2.I1.i4.p1.2.m2.1.1.2.3.11">𝑡</ci></apply></apply><cn id="S2.I1.i4.p1.2.m2.1.1.3.cmml" type="float" xref="S2.I1.i4.p1.2.m2.1.1.3">0.01</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.I1.i4.p1.2.m2.1c">r_{checkpoint}=0.01</annotation><annotation encoding="application/x-llamapun" id="S2.I1.i4.p1.2.m2.1d">italic_r start_POSTSUBSCRIPT italic_c italic_h italic_e italic_c italic_k italic_p italic_o italic_i italic_n italic_t end_POSTSUBSCRIPT = 0.01</annotation></semantics></math> for passing each of the 19 checkpoints <math alttext="c_{i}" class="ltx_Math" display="inline" id="S2.I1.i4.p1.3.m3.1"><semantics id="S2.I1.i4.p1.3.m3.1a"><msub id="S2.I1.i4.p1.3.m3.1.1" xref="S2.I1.i4.p1.3.m3.1.1.cmml"><mi id="S2.I1.i4.p1.3.m3.1.1.2" xref="S2.I1.i4.p1.3.m3.1.1.2.cmml">c</mi><mi id="S2.I1.i4.p1.3.m3.1.1.3" xref="S2.I1.i4.p1.3.m3.1.1.3.cmml">i</mi></msub><annotation-xml encoding="MathML-Content" id="S2.I1.i4.p1.3.m3.1b"><apply id="S2.I1.i4.p1.3.m3.1.1.cmml" xref="S2.I1.i4.p1.3.m3.1.1"><csymbol cd="ambiguous" id="S2.I1.i4.p1.3.m3.1.1.1.cmml" xref="S2.I1.i4.p1.3.m3.1.1">subscript</csymbol><ci id="S2.I1.i4.p1.3.m3.1.1.2.cmml" xref="S2.I1.i4.p1.3.m3.1.1.2">𝑐</ci><ci id="S2.I1.i4.p1.3.m3.1.1.3.cmml" xref="S2.I1.i4.p1.3.m3.1.1.3">𝑖</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.I1.i4.p1.3.m3.1c">c_{i}</annotation><annotation encoding="application/x-llamapun" id="S2.I1.i4.p1.3.m3.1d">italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT</annotation></semantics></math> on the race track, <math alttext="r_{lap}=0.1" class="ltx_Math" display="inline" id="S2.I1.i4.p1.4.m4.1"><semantics id="S2.I1.i4.p1.4.m4.1a"><mrow id="S2.I1.i4.p1.4.m4.1.1" xref="S2.I1.i4.p1.4.m4.1.1.cmml"><msub id="S2.I1.i4.p1.4.m4.1.1.2" xref="S2.I1.i4.p1.4.m4.1.1.2.cmml"><mi 
id="S2.I1.i4.p1.4.m4.1.1.2.2" xref="S2.I1.i4.p1.4.m4.1.1.2.2.cmml">r</mi><mrow id="S2.I1.i4.p1.4.m4.1.1.2.3" xref="S2.I1.i4.p1.4.m4.1.1.2.3.cmml"><mi id="S2.I1.i4.p1.4.m4.1.1.2.3.2" xref="S2.I1.i4.p1.4.m4.1.1.2.3.2.cmml">l</mi><mo id="S2.I1.i4.p1.4.m4.1.1.2.3.1" xref="S2.I1.i4.p1.4.m4.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.4.m4.1.1.2.3.3" xref="S2.I1.i4.p1.4.m4.1.1.2.3.3.cmml">a</mi><mo id="S2.I1.i4.p1.4.m4.1.1.2.3.1a" xref="S2.I1.i4.p1.4.m4.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.4.m4.1.1.2.3.4" xref="S2.I1.i4.p1.4.m4.1.1.2.3.4.cmml">p</mi></mrow></msub><mo id="S2.I1.i4.p1.4.m4.1.1.1" xref="S2.I1.i4.p1.4.m4.1.1.1.cmml">=</mo><mn id="S2.I1.i4.p1.4.m4.1.1.3" xref="S2.I1.i4.p1.4.m4.1.1.3.cmml">0.1</mn></mrow><annotation-xml encoding="MathML-Content" id="S2.I1.i4.p1.4.m4.1b"><apply id="S2.I1.i4.p1.4.m4.1.1.cmml" xref="S2.I1.i4.p1.4.m4.1.1"><eq id="S2.I1.i4.p1.4.m4.1.1.1.cmml" xref="S2.I1.i4.p1.4.m4.1.1.1"></eq><apply id="S2.I1.i4.p1.4.m4.1.1.2.cmml" xref="S2.I1.i4.p1.4.m4.1.1.2"><csymbol cd="ambiguous" id="S2.I1.i4.p1.4.m4.1.1.2.1.cmml" xref="S2.I1.i4.p1.4.m4.1.1.2">subscript</csymbol><ci id="S2.I1.i4.p1.4.m4.1.1.2.2.cmml" xref="S2.I1.i4.p1.4.m4.1.1.2.2">𝑟</ci><apply id="S2.I1.i4.p1.4.m4.1.1.2.3.cmml" xref="S2.I1.i4.p1.4.m4.1.1.2.3"><times id="S2.I1.i4.p1.4.m4.1.1.2.3.1.cmml" xref="S2.I1.i4.p1.4.m4.1.1.2.3.1"></times><ci id="S2.I1.i4.p1.4.m4.1.1.2.3.2.cmml" xref="S2.I1.i4.p1.4.m4.1.1.2.3.2">𝑙</ci><ci id="S2.I1.i4.p1.4.m4.1.1.2.3.3.cmml" xref="S2.I1.i4.p1.4.m4.1.1.2.3.3">𝑎</ci><ci id="S2.I1.i4.p1.4.m4.1.1.2.3.4.cmml" xref="S2.I1.i4.p1.4.m4.1.1.2.3.4">𝑝</ci></apply></apply><cn id="S2.I1.i4.p1.4.m4.1.1.3.cmml" type="float" xref="S2.I1.i4.p1.4.m4.1.1.3">0.1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.I1.i4.p1.4.m4.1c">r_{lap}=0.1</annotation><annotation encoding="application/x-llamapun" id="S2.I1.i4.p1.4.m4.1d">italic_r start_POSTSUBSCRIPT italic_l italic_a italic_p end_POSTSUBSCRIPT = 0.1</annotation></semantics></math> upon 
completing a lap, <math alttext="r_{best\&gt;lap}=0.7" class="ltx_Math" display="inline" id="S2.I1.i4.p1.5.m5.1"><semantics id="S2.I1.i4.p1.5.m5.1a"><mrow id="S2.I1.i4.p1.5.m5.1.1" xref="S2.I1.i4.p1.5.m5.1.1.cmml"><msub id="S2.I1.i4.p1.5.m5.1.1.2" xref="S2.I1.i4.p1.5.m5.1.1.2.cmml"><mi id="S2.I1.i4.p1.5.m5.1.1.2.2" xref="S2.I1.i4.p1.5.m5.1.1.2.2.cmml">r</mi><mrow id="S2.I1.i4.p1.5.m5.1.1.2.3" xref="S2.I1.i4.p1.5.m5.1.1.2.3.cmml"><mi id="S2.I1.i4.p1.5.m5.1.1.2.3.2" xref="S2.I1.i4.p1.5.m5.1.1.2.3.2.cmml">b</mi><mo id="S2.I1.i4.p1.5.m5.1.1.2.3.1" xref="S2.I1.i4.p1.5.m5.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.5.m5.1.1.2.3.3" xref="S2.I1.i4.p1.5.m5.1.1.2.3.3.cmml">e</mi><mo id="S2.I1.i4.p1.5.m5.1.1.2.3.1a" xref="S2.I1.i4.p1.5.m5.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.5.m5.1.1.2.3.4" xref="S2.I1.i4.p1.5.m5.1.1.2.3.4.cmml">s</mi><mo id="S2.I1.i4.p1.5.m5.1.1.2.3.1b" xref="S2.I1.i4.p1.5.m5.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.5.m5.1.1.2.3.5" xref="S2.I1.i4.p1.5.m5.1.1.2.3.5.cmml">t</mi><mo id="S2.I1.i4.p1.5.m5.1.1.2.3.1c" lspace="0.220em" xref="S2.I1.i4.p1.5.m5.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.5.m5.1.1.2.3.6" xref="S2.I1.i4.p1.5.m5.1.1.2.3.6.cmml">l</mi><mo id="S2.I1.i4.p1.5.m5.1.1.2.3.1d" xref="S2.I1.i4.p1.5.m5.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.5.m5.1.1.2.3.7" xref="S2.I1.i4.p1.5.m5.1.1.2.3.7.cmml">a</mi><mo id="S2.I1.i4.p1.5.m5.1.1.2.3.1e" xref="S2.I1.i4.p1.5.m5.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.5.m5.1.1.2.3.8" xref="S2.I1.i4.p1.5.m5.1.1.2.3.8.cmml">p</mi></mrow></msub><mo id="S2.I1.i4.p1.5.m5.1.1.1" xref="S2.I1.i4.p1.5.m5.1.1.1.cmml">=</mo><mn id="S2.I1.i4.p1.5.m5.1.1.3" xref="S2.I1.i4.p1.5.m5.1.1.3.cmml">0.7</mn></mrow><annotation-xml encoding="MathML-Content" id="S2.I1.i4.p1.5.m5.1b"><apply id="S2.I1.i4.p1.5.m5.1.1.cmml" xref="S2.I1.i4.p1.5.m5.1.1"><eq id="S2.I1.i4.p1.5.m5.1.1.1.cmml" xref="S2.I1.i4.p1.5.m5.1.1.1"></eq><apply id="S2.I1.i4.p1.5.m5.1.1.2.cmml" xref="S2.I1.i4.p1.5.m5.1.1.2"><csymbol cd="ambiguous" 
id="S2.I1.i4.p1.5.m5.1.1.2.1.cmml" xref="S2.I1.i4.p1.5.m5.1.1.2">subscript</csymbol><ci id="S2.I1.i4.p1.5.m5.1.1.2.2.cmml" xref="S2.I1.i4.p1.5.m5.1.1.2.2">𝑟</ci><apply id="S2.I1.i4.p1.5.m5.1.1.2.3.cmml" xref="S2.I1.i4.p1.5.m5.1.1.2.3"><times id="S2.I1.i4.p1.5.m5.1.1.2.3.1.cmml" xref="S2.I1.i4.p1.5.m5.1.1.2.3.1"></times><ci id="S2.I1.i4.p1.5.m5.1.1.2.3.2.cmml" xref="S2.I1.i4.p1.5.m5.1.1.2.3.2">𝑏</ci><ci id="S2.I1.i4.p1.5.m5.1.1.2.3.3.cmml" xref="S2.I1.i4.p1.5.m5.1.1.2.3.3">𝑒</ci><ci id="S2.I1.i4.p1.5.m5.1.1.2.3.4.cmml" xref="S2.I1.i4.p1.5.m5.1.1.2.3.4">𝑠</ci><ci id="S2.I1.i4.p1.5.m5.1.1.2.3.5.cmml" xref="S2.I1.i4.p1.5.m5.1.1.2.3.5">𝑡</ci><ci id="S2.I1.i4.p1.5.m5.1.1.2.3.6.cmml" xref="S2.I1.i4.p1.5.m5.1.1.2.3.6">𝑙</ci><ci id="S2.I1.i4.p1.5.m5.1.1.2.3.7.cmml" xref="S2.I1.i4.p1.5.m5.1.1.2.3.7">𝑎</ci><ci id="S2.I1.i4.p1.5.m5.1.1.2.3.8.cmml" xref="S2.I1.i4.p1.5.m5.1.1.2.3.8">𝑝</ci></apply></apply><cn id="S2.I1.i4.p1.5.m5.1.1.3.cmml" type="float" xref="S2.I1.i4.p1.5.m5.1.1.3">0.7</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.I1.i4.p1.5.m5.1c">r_{best\&gt;lap}=0.7</annotation><annotation encoding="application/x-llamapun" id="S2.I1.i4.p1.5.m5.1d">italic_r start_POSTSUBSCRIPT italic_b italic_e italic_s italic_t italic_l italic_a italic_p end_POSTSUBSCRIPT = 0.7</annotation></semantics></math> upon achieving a new best lap time, and a penalty of <math alttext="r_{collision}=-1" class="ltx_Math" display="inline" id="S2.I1.i4.p1.6.m6.1"><semantics id="S2.I1.i4.p1.6.m6.1a"><mrow id="S2.I1.i4.p1.6.m6.1.1" xref="S2.I1.i4.p1.6.m6.1.1.cmml"><msub id="S2.I1.i4.p1.6.m6.1.1.2" xref="S2.I1.i4.p1.6.m6.1.1.2.cmml"><mi id="S2.I1.i4.p1.6.m6.1.1.2.2" xref="S2.I1.i4.p1.6.m6.1.1.2.2.cmml">r</mi><mrow id="S2.I1.i4.p1.6.m6.1.1.2.3" xref="S2.I1.i4.p1.6.m6.1.1.2.3.cmml"><mi id="S2.I1.i4.p1.6.m6.1.1.2.3.2" xref="S2.I1.i4.p1.6.m6.1.1.2.3.2.cmml">c</mi><mo id="S2.I1.i4.p1.6.m6.1.1.2.3.1" xref="S2.I1.i4.p1.6.m6.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.6.m6.1.1.2.3.3" 
xref="S2.I1.i4.p1.6.m6.1.1.2.3.3.cmml">o</mi><mo id="S2.I1.i4.p1.6.m6.1.1.2.3.1a" xref="S2.I1.i4.p1.6.m6.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.6.m6.1.1.2.3.4" xref="S2.I1.i4.p1.6.m6.1.1.2.3.4.cmml">l</mi><mo id="S2.I1.i4.p1.6.m6.1.1.2.3.1b" xref="S2.I1.i4.p1.6.m6.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.6.m6.1.1.2.3.5" xref="S2.I1.i4.p1.6.m6.1.1.2.3.5.cmml">l</mi><mo id="S2.I1.i4.p1.6.m6.1.1.2.3.1c" xref="S2.I1.i4.p1.6.m6.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.6.m6.1.1.2.3.6" xref="S2.I1.i4.p1.6.m6.1.1.2.3.6.cmml">i</mi><mo id="S2.I1.i4.p1.6.m6.1.1.2.3.1d" xref="S2.I1.i4.p1.6.m6.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.6.m6.1.1.2.3.7" xref="S2.I1.i4.p1.6.m6.1.1.2.3.7.cmml">s</mi><mo id="S2.I1.i4.p1.6.m6.1.1.2.3.1e" xref="S2.I1.i4.p1.6.m6.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.6.m6.1.1.2.3.8" xref="S2.I1.i4.p1.6.m6.1.1.2.3.8.cmml">i</mi><mo id="S2.I1.i4.p1.6.m6.1.1.2.3.1f" xref="S2.I1.i4.p1.6.m6.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.6.m6.1.1.2.3.9" xref="S2.I1.i4.p1.6.m6.1.1.2.3.9.cmml">o</mi><mo id="S2.I1.i4.p1.6.m6.1.1.2.3.1g" xref="S2.I1.i4.p1.6.m6.1.1.2.3.1.cmml">⁢</mo><mi id="S2.I1.i4.p1.6.m6.1.1.2.3.10" xref="S2.I1.i4.p1.6.m6.1.1.2.3.10.cmml">n</mi></mrow></msub><mo id="S2.I1.i4.p1.6.m6.1.1.1" xref="S2.I1.i4.p1.6.m6.1.1.1.cmml">=</mo><mrow id="S2.I1.i4.p1.6.m6.1.1.3" xref="S2.I1.i4.p1.6.m6.1.1.3.cmml"><mo id="S2.I1.i4.p1.6.m6.1.1.3a" xref="S2.I1.i4.p1.6.m6.1.1.3.cmml">−</mo><mn id="S2.I1.i4.p1.6.m6.1.1.3.2" xref="S2.I1.i4.p1.6.m6.1.1.3.2.cmml">1</mn></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.I1.i4.p1.6.m6.1b"><apply id="S2.I1.i4.p1.6.m6.1.1.cmml" xref="S2.I1.i4.p1.6.m6.1.1"><eq id="S2.I1.i4.p1.6.m6.1.1.1.cmml" xref="S2.I1.i4.p1.6.m6.1.1.1"></eq><apply id="S2.I1.i4.p1.6.m6.1.1.2.cmml" xref="S2.I1.i4.p1.6.m6.1.1.2"><csymbol cd="ambiguous" id="S2.I1.i4.p1.6.m6.1.1.2.1.cmml" xref="S2.I1.i4.p1.6.m6.1.1.2">subscript</csymbol><ci id="S2.I1.i4.p1.6.m6.1.1.2.2.cmml" xref="S2.I1.i4.p1.6.m6.1.1.2.2">𝑟</ci><apply 
id="S2.I1.i4.p1.6.m6.1.1.2.3.cmml" xref="S2.I1.i4.p1.6.m6.1.1.2.3"><times id="S2.I1.i4.p1.6.m6.1.1.2.3.1.cmml" xref="S2.I1.i4.p1.6.m6.1.1.2.3.1"></times><ci id="S2.I1.i4.p1.6.m6.1.1.2.3.2.cmml" xref="S2.I1.i4.p1.6.m6.1.1.2.3.2">𝑐</ci><ci id="S2.I1.i4.p1.6.m6.1.1.2.3.3.cmml" xref="S2.I1.i4.p1.6.m6.1.1.2.3.3">𝑜</ci><ci id="S2.I1.i4.p1.6.m6.1.1.2.3.4.cmml" xref="S2.I1.i4.p1.6.m6.1.1.2.3.4">𝑙</ci><ci id="S2.I1.i4.p1.6.m6.1.1.2.3.5.cmml" xref="S2.I1.i4.p1.6.m6.1.1.2.3.5">𝑙</ci><ci id="S2.I1.i4.p1.6.m6.1.1.2.3.6.cmml" xref="S2.I1.i4.p1.6.m6.1.1.2.3.6">𝑖</ci><ci id="S2.I1.i4.p1.6.m6.1.1.2.3.7.cmml" xref="S2.I1.i4.p1.6.m6.1.1.2.3.7">𝑠</ci><ci id="S2.I1.i4.p1.6.m6.1.1.2.3.8.cmml" xref="S2.I1.i4.p1.6.m6.1.1.2.3.8">𝑖</ci><ci id="S2.I1.i4.p1.6.m6.1.1.2.3.9.cmml" xref="S2.I1.i4.p1.6.m6.1.1.2.3.9">𝑜</ci><ci id="S2.I1.i4.p1.6.m6.1.1.2.3.10.cmml" xref="S2.I1.i4.p1.6.m6.1.1.2.3.10">𝑛</ci></apply></apply><apply id="S2.I1.i4.p1.6.m6.1.1.3.cmml" xref="S2.I1.i4.p1.6.m6.1.1.3"><minus id="S2.I1.i4.p1.6.m6.1.1.3.1.cmml" xref="S2.I1.i4.p1.6.m6.1.1.3"></minus><cn id="S2.I1.i4.p1.6.m6.1.1.3.2.cmml" type="integer" xref="S2.I1.i4.p1.6.m6.1.1.3.2">1</cn></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.I1.i4.p1.6.m6.1c">r_{collision}=-1</annotation><annotation encoding="application/x-llamapun" id="S2.I1.i4.p1.6.m6.1d">italic_r start_POSTSUBSCRIPT italic_c italic_o italic_l italic_l italic_i italic_s italic_i italic_o italic_n end_POSTSUBSCRIPT = - 1</annotation></semantics></math> for colliding with the track bounds or with the peer agent (in which case both agents were penalized equally). 
Additionally, a continuous reward promoted higher velocities <math alttext="v_{t}" class="ltx_Math" display="inline" id="S2.I1.i4.p1.7.m7.1"><semantics id="S2.I1.i4.p1.7.m7.1a"><msub id="S2.I1.i4.p1.7.m7.1.1" xref="S2.I1.i4.p1.7.m7.1.1.cmml"><mi id="S2.I1.i4.p1.7.m7.1.1.2" xref="S2.I1.i4.p1.7.m7.1.1.2.cmml">v</mi><mi id="S2.I1.i4.p1.7.m7.1.1.3" xref="S2.I1.i4.p1.7.m7.1.1.3.cmml">t</mi></msub><annotation-xml encoding="MathML-Content" id="S2.I1.i4.p1.7.m7.1b"><apply id="S2.I1.i4.p1.7.m7.1.1.cmml" xref="S2.I1.i4.p1.7.m7.1.1"><csymbol cd="ambiguous" id="S2.I1.i4.p1.7.m7.1.1.1.cmml" xref="S2.I1.i4.p1.7.m7.1.1">subscript</csymbol><ci id="S2.I1.i4.p1.7.m7.1.1.2.cmml" xref="S2.I1.i4.p1.7.m7.1.1.2">𝑣</ci><ci id="S2.I1.i4.p1.7.m7.1.1.3.cmml" xref="S2.I1.i4.p1.7.m7.1.1.3">𝑡</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.I1.i4.p1.7.m7.1c">v_{t}</annotation><annotation encoding="application/x-llamapun" id="S2.I1.i4.p1.7.m7.1d">italic_v start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math>.</p> <table class="ltx_equation ltx_eqn_table" id="S2.E3"> <tbody><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_eqn_cell ltx_align_center"><math alttext="{}^{e}r_{t}^{i}=\begin{cases}r_{collision}&amp;\text{if collision}\\ r_{checkpoint}&amp;\text{if checkpoint passed}\\ r_{lap}&amp;\text{if lap completed}\\ r_{best\&gt;lap}&amp;\text{if best lap time}\\ 0.01*v_{t}^{i}&amp;\text{otherwise}\end{cases}" class="ltx_Math" display="block" id="S2.E3.m1.10"><semantics id="S2.E3.m1.10a"><mrow id="S2.E3.m1.10.11" xref="S2.E3.m1.10.11.cmml"><mmultiscripts id="S2.E3.m1.10.11.2" xref="S2.E3.m1.10.11.2.cmml"><mi id="S2.E3.m1.10.11.2.2.2.2" xref="S2.E3.m1.10.11.2.2.2.2.cmml">r</mi><mi id="S2.E3.m1.10.11.2.2.2.3" xref="S2.E3.m1.10.11.2.2.2.3.cmml">t</mi><mi id="S2.E3.m1.10.11.2.2.3" xref="S2.E3.m1.10.11.2.2.3.cmml">i</mi><mprescripts id="S2.E3.m1.10.11.2a" 
xref="S2.E3.m1.10.11.2.cmml"></mprescripts><mrow id="S2.E3.m1.10.11.2b" xref="S2.E3.m1.10.11.2.cmml"></mrow><mi id="S2.E3.m1.10.11.2.3" xref="S2.E3.m1.10.11.2.3.cmml">e</mi></mmultiscripts><mo id="S2.E3.m1.10.11.1" xref="S2.E3.m1.10.11.1.cmml">=</mo><mrow id="S2.E3.m1.10.10" xref="S2.E3.m1.10.11.3.1.cmml"><mo id="S2.E3.m1.10.10.11" xref="S2.E3.m1.10.11.3.1.1.cmml">{</mo><mtable columnspacing="5pt" displaystyle="true" id="S2.E3.m1.10.10.10" rowspacing="0pt" xref="S2.E3.m1.10.11.3.1.cmml"><mtr id="S2.E3.m1.10.10.10a" xref="S2.E3.m1.10.11.3.1.cmml"><mtd class="ltx_align_left" columnalign="left" id="S2.E3.m1.10.10.10b" xref="S2.E3.m1.10.11.3.1.cmml"><msub id="S2.E3.m1.1.1.1.1.1.1" xref="S2.E3.m1.1.1.1.1.1.1.cmml"><mi id="S2.E3.m1.1.1.1.1.1.1.2" xref="S2.E3.m1.1.1.1.1.1.1.2.cmml">r</mi><mrow id="S2.E3.m1.1.1.1.1.1.1.3" xref="S2.E3.m1.1.1.1.1.1.1.3.cmml"><mi id="S2.E3.m1.1.1.1.1.1.1.3.2" xref="S2.E3.m1.1.1.1.1.1.1.3.2.cmml">c</mi><mo id="S2.E3.m1.1.1.1.1.1.1.3.1" xref="S2.E3.m1.1.1.1.1.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.1.1.1.1.1.1.3.3" xref="S2.E3.m1.1.1.1.1.1.1.3.3.cmml">o</mi><mo id="S2.E3.m1.1.1.1.1.1.1.3.1a" xref="S2.E3.m1.1.1.1.1.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.1.1.1.1.1.1.3.4" xref="S2.E3.m1.1.1.1.1.1.1.3.4.cmml">l</mi><mo id="S2.E3.m1.1.1.1.1.1.1.3.1b" xref="S2.E3.m1.1.1.1.1.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.1.1.1.1.1.1.3.5" xref="S2.E3.m1.1.1.1.1.1.1.3.5.cmml">l</mi><mo id="S2.E3.m1.1.1.1.1.1.1.3.1c" xref="S2.E3.m1.1.1.1.1.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.1.1.1.1.1.1.3.6" xref="S2.E3.m1.1.1.1.1.1.1.3.6.cmml">i</mi><mo id="S2.E3.m1.1.1.1.1.1.1.3.1d" xref="S2.E3.m1.1.1.1.1.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.1.1.1.1.1.1.3.7" xref="S2.E3.m1.1.1.1.1.1.1.3.7.cmml">s</mi><mo id="S2.E3.m1.1.1.1.1.1.1.3.1e" xref="S2.E3.m1.1.1.1.1.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.1.1.1.1.1.1.3.8" xref="S2.E3.m1.1.1.1.1.1.1.3.8.cmml">i</mi><mo id="S2.E3.m1.1.1.1.1.1.1.3.1f" xref="S2.E3.m1.1.1.1.1.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.1.1.1.1.1.1.3.9" 
xref="S2.E3.m1.1.1.1.1.1.1.3.9.cmml">o</mi><mo id="S2.E3.m1.1.1.1.1.1.1.3.1g" xref="S2.E3.m1.1.1.1.1.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.1.1.1.1.1.1.3.10" xref="S2.E3.m1.1.1.1.1.1.1.3.10.cmml">n</mi></mrow></msub></mtd><mtd class="ltx_align_left" columnalign="left" id="S2.E3.m1.10.10.10c" xref="S2.E3.m1.10.11.3.1.cmml"><mtext id="S2.E3.m1.2.2.2.2.2.1" xref="S2.E3.m1.2.2.2.2.2.1a.cmml">if collision</mtext></mtd></mtr><mtr id="S2.E3.m1.10.10.10d" xref="S2.E3.m1.10.11.3.1.cmml"><mtd class="ltx_align_left" columnalign="left" id="S2.E3.m1.10.10.10e" xref="S2.E3.m1.10.11.3.1.cmml"><msub id="S2.E3.m1.3.3.3.3.1.1" xref="S2.E3.m1.3.3.3.3.1.1.cmml"><mi id="S2.E3.m1.3.3.3.3.1.1.2" xref="S2.E3.m1.3.3.3.3.1.1.2.cmml">r</mi><mrow id="S2.E3.m1.3.3.3.3.1.1.3" xref="S2.E3.m1.3.3.3.3.1.1.3.cmml"><mi id="S2.E3.m1.3.3.3.3.1.1.3.2" xref="S2.E3.m1.3.3.3.3.1.1.3.2.cmml">c</mi><mo id="S2.E3.m1.3.3.3.3.1.1.3.1" xref="S2.E3.m1.3.3.3.3.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.3.3.3.3.1.1.3.3" xref="S2.E3.m1.3.3.3.3.1.1.3.3.cmml">h</mi><mo id="S2.E3.m1.3.3.3.3.1.1.3.1a" xref="S2.E3.m1.3.3.3.3.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.3.3.3.3.1.1.3.4" xref="S2.E3.m1.3.3.3.3.1.1.3.4.cmml">e</mi><mo id="S2.E3.m1.3.3.3.3.1.1.3.1b" xref="S2.E3.m1.3.3.3.3.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.3.3.3.3.1.1.3.5" xref="S2.E3.m1.3.3.3.3.1.1.3.5.cmml">c</mi><mo id="S2.E3.m1.3.3.3.3.1.1.3.1c" xref="S2.E3.m1.3.3.3.3.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.3.3.3.3.1.1.3.6" xref="S2.E3.m1.3.3.3.3.1.1.3.6.cmml">k</mi><mo id="S2.E3.m1.3.3.3.3.1.1.3.1d" xref="S2.E3.m1.3.3.3.3.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.3.3.3.3.1.1.3.7" xref="S2.E3.m1.3.3.3.3.1.1.3.7.cmml">p</mi><mo id="S2.E3.m1.3.3.3.3.1.1.3.1e" xref="S2.E3.m1.3.3.3.3.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.3.3.3.3.1.1.3.8" xref="S2.E3.m1.3.3.3.3.1.1.3.8.cmml">o</mi><mo id="S2.E3.m1.3.3.3.3.1.1.3.1f" xref="S2.E3.m1.3.3.3.3.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.3.3.3.3.1.1.3.9" xref="S2.E3.m1.3.3.3.3.1.1.3.9.cmml">i</mi><mo id="S2.E3.m1.3.3.3.3.1.1.3.1g" 
xref="S2.E3.m1.3.3.3.3.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.3.3.3.3.1.1.3.10" xref="S2.E3.m1.3.3.3.3.1.1.3.10.cmml">n</mi><mo id="S2.E3.m1.3.3.3.3.1.1.3.1h" xref="S2.E3.m1.3.3.3.3.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.3.3.3.3.1.1.3.11" xref="S2.E3.m1.3.3.3.3.1.1.3.11.cmml">t</mi></mrow></msub></mtd><mtd class="ltx_align_left" columnalign="left" id="S2.E3.m1.10.10.10f" xref="S2.E3.m1.10.11.3.1.cmml"><mtext id="S2.E3.m1.4.4.4.4.2.1" xref="S2.E3.m1.4.4.4.4.2.1a.cmml">if checkpoint passed</mtext></mtd></mtr><mtr id="S2.E3.m1.10.10.10g" xref="S2.E3.m1.10.11.3.1.cmml"><mtd class="ltx_align_left" columnalign="left" id="S2.E3.m1.10.10.10h" xref="S2.E3.m1.10.11.3.1.cmml"><msub id="S2.E3.m1.5.5.5.5.1.1" xref="S2.E3.m1.5.5.5.5.1.1.cmml"><mi id="S2.E3.m1.5.5.5.5.1.1.2" xref="S2.E3.m1.5.5.5.5.1.1.2.cmml">r</mi><mrow id="S2.E3.m1.5.5.5.5.1.1.3" xref="S2.E3.m1.5.5.5.5.1.1.3.cmml"><mi id="S2.E3.m1.5.5.5.5.1.1.3.2" xref="S2.E3.m1.5.5.5.5.1.1.3.2.cmml">l</mi><mo id="S2.E3.m1.5.5.5.5.1.1.3.1" xref="S2.E3.m1.5.5.5.5.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.5.5.5.5.1.1.3.3" xref="S2.E3.m1.5.5.5.5.1.1.3.3.cmml">a</mi><mo id="S2.E3.m1.5.5.5.5.1.1.3.1a" xref="S2.E3.m1.5.5.5.5.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.5.5.5.5.1.1.3.4" xref="S2.E3.m1.5.5.5.5.1.1.3.4.cmml">p</mi></mrow></msub></mtd><mtd class="ltx_align_left" columnalign="left" id="S2.E3.m1.10.10.10i" xref="S2.E3.m1.10.11.3.1.cmml"><mtext id="S2.E3.m1.6.6.6.6.2.1" xref="S2.E3.m1.6.6.6.6.2.1a.cmml">if lap completed</mtext></mtd></mtr><mtr id="S2.E3.m1.10.10.10j" xref="S2.E3.m1.10.11.3.1.cmml"><mtd class="ltx_align_left" columnalign="left" id="S2.E3.m1.10.10.10k" xref="S2.E3.m1.10.11.3.1.cmml"><msub id="S2.E3.m1.7.7.7.7.1.1" xref="S2.E3.m1.7.7.7.7.1.1.cmml"><mi id="S2.E3.m1.7.7.7.7.1.1.2" xref="S2.E3.m1.7.7.7.7.1.1.2.cmml">r</mi><mrow id="S2.E3.m1.7.7.7.7.1.1.3" xref="S2.E3.m1.7.7.7.7.1.1.3.cmml"><mi id="S2.E3.m1.7.7.7.7.1.1.3.2" xref="S2.E3.m1.7.7.7.7.1.1.3.2.cmml">b</mi><mo id="S2.E3.m1.7.7.7.7.1.1.3.1" 
xref="S2.E3.m1.7.7.7.7.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.7.7.7.7.1.1.3.3" xref="S2.E3.m1.7.7.7.7.1.1.3.3.cmml">e</mi><mo id="S2.E3.m1.7.7.7.7.1.1.3.1a" xref="S2.E3.m1.7.7.7.7.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.7.7.7.7.1.1.3.4" xref="S2.E3.m1.7.7.7.7.1.1.3.4.cmml">s</mi><mo id="S2.E3.m1.7.7.7.7.1.1.3.1b" xref="S2.E3.m1.7.7.7.7.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.7.7.7.7.1.1.3.5" xref="S2.E3.m1.7.7.7.7.1.1.3.5.cmml">t</mi><mo id="S2.E3.m1.7.7.7.7.1.1.3.1c" lspace="0.220em" xref="S2.E3.m1.7.7.7.7.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.7.7.7.7.1.1.3.6" xref="S2.E3.m1.7.7.7.7.1.1.3.6.cmml">l</mi><mo id="S2.E3.m1.7.7.7.7.1.1.3.1d" xref="S2.E3.m1.7.7.7.7.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.7.7.7.7.1.1.3.7" xref="S2.E3.m1.7.7.7.7.1.1.3.7.cmml">a</mi><mo id="S2.E3.m1.7.7.7.7.1.1.3.1e" xref="S2.E3.m1.7.7.7.7.1.1.3.1.cmml">⁢</mo><mi id="S2.E3.m1.7.7.7.7.1.1.3.8" xref="S2.E3.m1.7.7.7.7.1.1.3.8.cmml">p</mi></mrow></msub></mtd><mtd class="ltx_align_left" columnalign="left" id="S2.E3.m1.10.10.10l" xref="S2.E3.m1.10.11.3.1.cmml"><mtext id="S2.E3.m1.8.8.8.8.2.1" xref="S2.E3.m1.8.8.8.8.2.1a.cmml">if best lap time</mtext></mtd></mtr><mtr id="S2.E3.m1.10.10.10m" xref="S2.E3.m1.10.11.3.1.cmml"><mtd class="ltx_align_left" columnalign="left" id="S2.E3.m1.10.10.10n" xref="S2.E3.m1.10.11.3.1.cmml"><mrow id="S2.E3.m1.9.9.9.9.1.1" xref="S2.E3.m1.9.9.9.9.1.1.cmml"><mn id="S2.E3.m1.9.9.9.9.1.1.2" xref="S2.E3.m1.9.9.9.9.1.1.2.cmml">0.01</mn><mo id="S2.E3.m1.9.9.9.9.1.1.1" lspace="0.222em" rspace="0.222em" xref="S2.E3.m1.9.9.9.9.1.1.1.cmml">∗</mo><msubsup id="S2.E3.m1.9.9.9.9.1.1.3" xref="S2.E3.m1.9.9.9.9.1.1.3.cmml"><mi id="S2.E3.m1.9.9.9.9.1.1.3.2.2" xref="S2.E3.m1.9.9.9.9.1.1.3.2.2.cmml">v</mi><mi id="S2.E3.m1.9.9.9.9.1.1.3.2.3" xref="S2.E3.m1.9.9.9.9.1.1.3.2.3.cmml">t</mi><mi id="S2.E3.m1.9.9.9.9.1.1.3.3" xref="S2.E3.m1.9.9.9.9.1.1.3.3.cmml">i</mi></msubsup></mrow></mtd><mtd class="ltx_align_left" columnalign="left" id="S2.E3.m1.10.10.10o" xref="S2.E3.m1.10.11.3.1.cmml"><mtext 
id="S2.E3.m1.10.10.10.10.2.1" xref="S2.E3.m1.10.10.10.10.2.1a.cmml">otherwise</mtext></mtd></mtr></mtable></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.E3.m1.10b"><apply id="S2.E3.m1.10.11.cmml" xref="S2.E3.m1.10.11"><eq id="S2.E3.m1.10.11.1.cmml" xref="S2.E3.m1.10.11.1"></eq><apply id="S2.E3.m1.10.11.2.cmml" xref="S2.E3.m1.10.11.2"><csymbol cd="ambiguous" id="S2.E3.m1.10.11.2.1.cmml" xref="S2.E3.m1.10.11.2">superscript</csymbol><apply id="S2.E3.m1.10.11.2.2.cmml" xref="S2.E3.m1.10.11.2"><csymbol cd="ambiguous" id="S2.E3.m1.10.11.2.2.1.cmml" xref="S2.E3.m1.10.11.2">superscript</csymbol><apply id="S2.E3.m1.10.11.2.2.2.cmml" xref="S2.E3.m1.10.11.2"><csymbol cd="ambiguous" id="S2.E3.m1.10.11.2.2.2.1.cmml" xref="S2.E3.m1.10.11.2">subscript</csymbol><ci id="S2.E3.m1.10.11.2.2.2.2.cmml" xref="S2.E3.m1.10.11.2.2.2.2">𝑟</ci><ci id="S2.E3.m1.10.11.2.2.2.3.cmml" xref="S2.E3.m1.10.11.2.2.2.3">𝑡</ci></apply><ci id="S2.E3.m1.10.11.2.2.3.cmml" xref="S2.E3.m1.10.11.2.2.3">𝑖</ci></apply><ci id="S2.E3.m1.10.11.2.3.cmml" xref="S2.E3.m1.10.11.2.3">𝑒</ci></apply><apply id="S2.E3.m1.10.11.3.1.cmml" xref="S2.E3.m1.10.10"><csymbol cd="latexml" id="S2.E3.m1.10.11.3.1.1.cmml" xref="S2.E3.m1.10.10.11">cases</csymbol><apply id="S2.E3.m1.1.1.1.1.1.1.cmml" xref="S2.E3.m1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.E3.m1.1.1.1.1.1.1.1.cmml" xref="S2.E3.m1.1.1.1.1.1.1">subscript</csymbol><ci id="S2.E3.m1.1.1.1.1.1.1.2.cmml" xref="S2.E3.m1.1.1.1.1.1.1.2">𝑟</ci><apply id="S2.E3.m1.1.1.1.1.1.1.3.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3"><times id="S2.E3.m1.1.1.1.1.1.1.3.1.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3.1"></times><ci id="S2.E3.m1.1.1.1.1.1.1.3.2.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3.2">𝑐</ci><ci id="S2.E3.m1.1.1.1.1.1.1.3.3.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3.3">𝑜</ci><ci id="S2.E3.m1.1.1.1.1.1.1.3.4.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3.4">𝑙</ci><ci id="S2.E3.m1.1.1.1.1.1.1.3.5.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3.5">𝑙</ci><ci id="S2.E3.m1.1.1.1.1.1.1.3.6.cmml" 
xref="S2.E3.m1.1.1.1.1.1.1.3.6">𝑖</ci><ci id="S2.E3.m1.1.1.1.1.1.1.3.7.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3.7">𝑠</ci><ci id="S2.E3.m1.1.1.1.1.1.1.3.8.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3.8">𝑖</ci><ci id="S2.E3.m1.1.1.1.1.1.1.3.9.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3.9">𝑜</ci><ci id="S2.E3.m1.1.1.1.1.1.1.3.10.cmml" xref="S2.E3.m1.1.1.1.1.1.1.3.10">𝑛</ci></apply></apply><ci id="S2.E3.m1.2.2.2.2.2.1a.cmml" xref="S2.E3.m1.2.2.2.2.2.1"><mtext id="S2.E3.m1.2.2.2.2.2.1.cmml" xref="S2.E3.m1.2.2.2.2.2.1">if collision</mtext></ci><apply id="S2.E3.m1.3.3.3.3.1.1.cmml" xref="S2.E3.m1.3.3.3.3.1.1"><csymbol cd="ambiguous" id="S2.E3.m1.3.3.3.3.1.1.1.cmml" xref="S2.E3.m1.3.3.3.3.1.1">subscript</csymbol><ci id="S2.E3.m1.3.3.3.3.1.1.2.cmml" xref="S2.E3.m1.3.3.3.3.1.1.2">𝑟</ci><apply id="S2.E3.m1.3.3.3.3.1.1.3.cmml" xref="S2.E3.m1.3.3.3.3.1.1.3"><times id="S2.E3.m1.3.3.3.3.1.1.3.1.cmml" xref="S2.E3.m1.3.3.3.3.1.1.3.1"></times><ci id="S2.E3.m1.3.3.3.3.1.1.3.2.cmml" xref="S2.E3.m1.3.3.3.3.1.1.3.2">𝑐</ci><ci id="S2.E3.m1.3.3.3.3.1.1.3.3.cmml" xref="S2.E3.m1.3.3.3.3.1.1.3.3">ℎ</ci><ci id="S2.E3.m1.3.3.3.3.1.1.3.4.cmml" xref="S2.E3.m1.3.3.3.3.1.1.3.4">𝑒</ci><ci id="S2.E3.m1.3.3.3.3.1.1.3.5.cmml" xref="S2.E3.m1.3.3.3.3.1.1.3.5">𝑐</ci><ci id="S2.E3.m1.3.3.3.3.1.1.3.6.cmml" xref="S2.E3.m1.3.3.3.3.1.1.3.6">𝑘</ci><ci id="S2.E3.m1.3.3.3.3.1.1.3.7.cmml" xref="S2.E3.m1.3.3.3.3.1.1.3.7">𝑝</ci><ci id="S2.E3.m1.3.3.3.3.1.1.3.8.cmml" xref="S2.E3.m1.3.3.3.3.1.1.3.8">𝑜</ci><ci id="S2.E3.m1.3.3.3.3.1.1.3.9.cmml" xref="S2.E3.m1.3.3.3.3.1.1.3.9">𝑖</ci><ci id="S2.E3.m1.3.3.3.3.1.1.3.10.cmml" xref="S2.E3.m1.3.3.3.3.1.1.3.10">𝑛</ci><ci id="S2.E3.m1.3.3.3.3.1.1.3.11.cmml" xref="S2.E3.m1.3.3.3.3.1.1.3.11">𝑡</ci></apply></apply><ci id="S2.E3.m1.4.4.4.4.2.1a.cmml" xref="S2.E3.m1.4.4.4.4.2.1"><mtext id="S2.E3.m1.4.4.4.4.2.1.cmml" xref="S2.E3.m1.4.4.4.4.2.1">if checkpoint passed</mtext></ci><apply id="S2.E3.m1.5.5.5.5.1.1.cmml" xref="S2.E3.m1.5.5.5.5.1.1"><csymbol cd="ambiguous" id="S2.E3.m1.5.5.5.5.1.1.1.cmml" 
xref="S2.E3.m1.5.5.5.5.1.1">subscript</csymbol><ci id="S2.E3.m1.5.5.5.5.1.1.2.cmml" xref="S2.E3.m1.5.5.5.5.1.1.2">𝑟</ci><apply id="S2.E3.m1.5.5.5.5.1.1.3.cmml" xref="S2.E3.m1.5.5.5.5.1.1.3"><times id="S2.E3.m1.5.5.5.5.1.1.3.1.cmml" xref="S2.E3.m1.5.5.5.5.1.1.3.1"></times><ci id="S2.E3.m1.5.5.5.5.1.1.3.2.cmml" xref="S2.E3.m1.5.5.5.5.1.1.3.2">𝑙</ci><ci id="S2.E3.m1.5.5.5.5.1.1.3.3.cmml" xref="S2.E3.m1.5.5.5.5.1.1.3.3">𝑎</ci><ci id="S2.E3.m1.5.5.5.5.1.1.3.4.cmml" xref="S2.E3.m1.5.5.5.5.1.1.3.4">𝑝</ci></apply></apply><ci id="S2.E3.m1.6.6.6.6.2.1a.cmml" xref="S2.E3.m1.6.6.6.6.2.1"><mtext id="S2.E3.m1.6.6.6.6.2.1.cmml" xref="S2.E3.m1.6.6.6.6.2.1">if lap completed</mtext></ci><apply id="S2.E3.m1.7.7.7.7.1.1.cmml" xref="S2.E3.m1.7.7.7.7.1.1"><csymbol cd="ambiguous" id="S2.E3.m1.7.7.7.7.1.1.1.cmml" xref="S2.E3.m1.7.7.7.7.1.1">subscript</csymbol><ci id="S2.E3.m1.7.7.7.7.1.1.2.cmml" xref="S2.E3.m1.7.7.7.7.1.1.2">𝑟</ci><apply id="S2.E3.m1.7.7.7.7.1.1.3.cmml" xref="S2.E3.m1.7.7.7.7.1.1.3"><times id="S2.E3.m1.7.7.7.7.1.1.3.1.cmml" xref="S2.E3.m1.7.7.7.7.1.1.3.1"></times><ci id="S2.E3.m1.7.7.7.7.1.1.3.2.cmml" xref="S2.E3.m1.7.7.7.7.1.1.3.2">𝑏</ci><ci id="S2.E3.m1.7.7.7.7.1.1.3.3.cmml" xref="S2.E3.m1.7.7.7.7.1.1.3.3">𝑒</ci><ci id="S2.E3.m1.7.7.7.7.1.1.3.4.cmml" xref="S2.E3.m1.7.7.7.7.1.1.3.4">𝑠</ci><ci id="S2.E3.m1.7.7.7.7.1.1.3.5.cmml" xref="S2.E3.m1.7.7.7.7.1.1.3.5">𝑡</ci><ci id="S2.E3.m1.7.7.7.7.1.1.3.6.cmml" xref="S2.E3.m1.7.7.7.7.1.1.3.6">𝑙</ci><ci id="S2.E3.m1.7.7.7.7.1.1.3.7.cmml" xref="S2.E3.m1.7.7.7.7.1.1.3.7">𝑎</ci><ci id="S2.E3.m1.7.7.7.7.1.1.3.8.cmml" xref="S2.E3.m1.7.7.7.7.1.1.3.8">𝑝</ci></apply></apply><ci id="S2.E3.m1.8.8.8.8.2.1a.cmml" xref="S2.E3.m1.8.8.8.8.2.1"><mtext id="S2.E3.m1.8.8.8.8.2.1.cmml" xref="S2.E3.m1.8.8.8.8.2.1">if best lap time</mtext></ci><apply id="S2.E3.m1.9.9.9.9.1.1.cmml" xref="S2.E3.m1.9.9.9.9.1.1"><times id="S2.E3.m1.9.9.9.9.1.1.1.cmml" xref="S2.E3.m1.9.9.9.9.1.1.1"></times><cn id="S2.E3.m1.9.9.9.9.1.1.2.cmml" type="float" 
xref="S2.E3.m1.9.9.9.9.1.1.2">0.01</cn><apply id="S2.E3.m1.9.9.9.9.1.1.3.cmml" xref="S2.E3.m1.9.9.9.9.1.1.3"><csymbol cd="ambiguous" id="S2.E3.m1.9.9.9.9.1.1.3.1.cmml" xref="S2.E3.m1.9.9.9.9.1.1.3">superscript</csymbol><apply id="S2.E3.m1.9.9.9.9.1.1.3.2.cmml" xref="S2.E3.m1.9.9.9.9.1.1.3"><csymbol cd="ambiguous" id="S2.E3.m1.9.9.9.9.1.1.3.2.1.cmml" xref="S2.E3.m1.9.9.9.9.1.1.3">subscript</csymbol><ci id="S2.E3.m1.9.9.9.9.1.1.3.2.2.cmml" xref="S2.E3.m1.9.9.9.9.1.1.3.2.2">𝑣</ci><ci id="S2.E3.m1.9.9.9.9.1.1.3.2.3.cmml" xref="S2.E3.m1.9.9.9.9.1.1.3.2.3">𝑡</ci></apply><ci id="S2.E3.m1.9.9.9.9.1.1.3.3.cmml" xref="S2.E3.m1.9.9.9.9.1.1.3.3">𝑖</ci></apply></apply><ci id="S2.E3.m1.10.10.10.10.2.1a.cmml" xref="S2.E3.m1.10.10.10.10.2.1"><mtext id="S2.E3.m1.10.10.10.10.2.1.cmml" xref="S2.E3.m1.10.10.10.10.2.1">otherwise</mtext></ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E3.m1.10c">{}^{e}r_{t}^{i}=\begin{cases}r_{collision}&amp;\text{if collision}\\ r_{checkpoint}&amp;\text{if checkpoint passed}\\ r_{lap}&amp;\text{if lap completed}\\ r_{best\&gt;lap}&amp;\text{if best lap time}\\ 0.01*v_{t}^{i}&amp;\text{otherwise}\end{cases}</annotation><annotation encoding="application/x-llamapun" id="S2.E3.m1.10d">start_FLOATSUPERSCRIPT italic_e end_FLOATSUPERSCRIPT italic_r start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT = { start_ROW start_CELL italic_r start_POSTSUBSCRIPT italic_c italic_o italic_l italic_l italic_i italic_s italic_i italic_o italic_n end_POSTSUBSCRIPT end_CELL start_CELL if collision end_CELL end_ROW start_ROW start_CELL italic_r start_POSTSUBSCRIPT italic_c italic_h italic_e italic_c italic_k italic_p italic_o italic_i italic_n italic_t end_POSTSUBSCRIPT end_CELL start_CELL if checkpoint passed end_CELL end_ROW start_ROW start_CELL italic_r start_POSTSUBSCRIPT italic_l italic_a italic_p end_POSTSUBSCRIPT end_CELL start_CELL if lap completed end_CELL end_ROW start_ROW start_CELL 
italic_r start_POSTSUBSCRIPT italic_b italic_e italic_s italic_t italic_l italic_a italic_p end_POSTSUBSCRIPT end_CELL start_CELL if best lap time end_CELL end_ROW start_ROW start_CELL 0.01 ∗ italic_v start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT end_CELL start_CELL otherwise end_CELL end_ROW</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(3)</span></td> </tr></tbody> </table> </div> </li> </ul> </div> </section> <section class="ltx_subsubsection" id="S2.SS2.SSS4"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection"><span class="ltx_text" id="S2.SS2.SSS4.5.1.1">II-B</span>4 </span>Optimization Problem</h4> <div class="ltx_para" id="S2.SS2.SSS4.p1"> <p class="ltx_p" id="S2.SS2.SSS4.p1.1">The multi-objective problem of maximizing the expected future discounted reward while minimizing the behavioral cloning loss <math alttext="\mathcal{L}_{BC}" class="ltx_Math" display="inline" id="S2.SS2.SSS4.p1.1.m1.1"><semantics id="S2.SS2.SSS4.p1.1.m1.1a"><msub id="S2.SS2.SSS4.p1.1.m1.1.1" xref="S2.SS2.SSS4.p1.1.m1.1.1.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.SS2.SSS4.p1.1.m1.1.1.2" xref="S2.SS2.SSS4.p1.1.m1.1.1.2.cmml">ℒ</mi><mrow id="S2.SS2.SSS4.p1.1.m1.1.1.3" xref="S2.SS2.SSS4.p1.1.m1.1.1.3.cmml"><mi id="S2.SS2.SSS4.p1.1.m1.1.1.3.2" xref="S2.SS2.SSS4.p1.1.m1.1.1.3.2.cmml">B</mi><mo id="S2.SS2.SSS4.p1.1.m1.1.1.3.1" xref="S2.SS2.SSS4.p1.1.m1.1.1.3.1.cmml">⁢</mo><mi id="S2.SS2.SSS4.p1.1.m1.1.1.3.3" xref="S2.SS2.SSS4.p1.1.m1.1.1.3.3.cmml">C</mi></mrow></msub><annotation-xml encoding="MathML-Content" id="S2.SS2.SSS4.p1.1.m1.1b"><apply id="S2.SS2.SSS4.p1.1.m1.1.1.cmml" xref="S2.SS2.SSS4.p1.1.m1.1.1"><csymbol cd="ambiguous" id="S2.SS2.SSS4.p1.1.m1.1.1.1.cmml" 
xref="S2.SS2.SSS4.p1.1.m1.1.1">subscript</csymbol><ci id="S2.SS2.SSS4.p1.1.m1.1.1.2.cmml" xref="S2.SS2.SSS4.p1.1.m1.1.1.2">ℒ</ci><apply id="S2.SS2.SSS4.p1.1.m1.1.1.3.cmml" xref="S2.SS2.SSS4.p1.1.m1.1.1.3"><times id="S2.SS2.SSS4.p1.1.m1.1.1.3.1.cmml" xref="S2.SS2.SSS4.p1.1.m1.1.1.3.1"></times><ci id="S2.SS2.SSS4.p1.1.m1.1.1.3.2.cmml" xref="S2.SS2.SSS4.p1.1.m1.1.1.3.2">𝐵</ci><ci id="S2.SS2.SSS4.p1.1.m1.1.1.3.3.cmml" xref="S2.SS2.SSS4.p1.1.m1.1.1.3.3">𝐶</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.SSS4.p1.1.m1.1c">\mathcal{L}_{BC}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.SSS4.p1.1.m1.1d">caligraphic_L start_POSTSUBSCRIPT italic_B italic_C end_POSTSUBSCRIPT</annotation></semantics></math> is defined as:</p> <table class="ltx_equationgroup ltx_eqn_align ltx_eqn_table" id="S5.EGx2"> <tbody id="S2.E4"><tr class="ltx_equation ltx_eqn_row ltx_align_baseline"> <td class="ltx_eqn_cell ltx_eqn_center_padleft"></td> <td class="ltx_td ltx_align_right ltx_eqn_cell"><math alttext="\displaystyle\operatorname*{\arg\!\max}_{\pi^{i}_{\theta}\left(a_{t}|o_{t}% \right)}\quad" class="ltx_Math" display="inline" id="S2.E4.m1.2"><semantics id="S2.E4.m1.2a"><mrow id="S2.E4.m1.2.2.1" xref="S2.E4.m1.2.2.1.1.cmml"><munder id="S2.E4.m1.2.2.1.1" xref="S2.E4.m1.2.2.1.1.cmml"><mrow id="S2.E4.m1.2.2.1.1.2" xref="S2.E4.m1.2.2.1.1.2.cmml"><mi id="S2.E4.m1.2.2.1.1.2.1" xref="S2.E4.m1.2.2.1.1.2.1.cmml">arg</mi><mo id="S2.E4.m1.2.2.1.1.2a" xref="S2.E4.m1.2.2.1.1.2.cmml">⁡</mo><mi id="S2.E4.m1.2.2.1.1.2.2" xref="S2.E4.m1.2.2.1.1.2.2.cmml">max</mi></mrow><mrow id="S2.E4.m1.1.1.1" xref="S2.E4.m1.1.1.1.cmml"><msubsup id="S2.E4.m1.1.1.1.3" xref="S2.E4.m1.1.1.1.3.cmml"><mi id="S2.E4.m1.1.1.1.3.2.2" xref="S2.E4.m1.1.1.1.3.2.2.cmml">π</mi><mi id="S2.E4.m1.1.1.1.3.3" xref="S2.E4.m1.1.1.1.3.3.cmml">θ</mi><mi id="S2.E4.m1.1.1.1.3.2.3" xref="S2.E4.m1.1.1.1.3.2.3.cmml">i</mi></msubsup><mo id="S2.E4.m1.1.1.1.2" 
xref="S2.E4.m1.1.1.1.2.cmml">⁢</mo><mrow id="S2.E4.m1.1.1.1.1.1" xref="S2.E4.m1.1.1.1.1.1.1.cmml"><mo id="S2.E4.m1.1.1.1.1.1.2" xref="S2.E4.m1.1.1.1.1.1.1.cmml">(</mo><mrow id="S2.E4.m1.1.1.1.1.1.1" xref="S2.E4.m1.1.1.1.1.1.1.cmml"><msub id="S2.E4.m1.1.1.1.1.1.1.2" xref="S2.E4.m1.1.1.1.1.1.1.2.cmml"><mi id="S2.E4.m1.1.1.1.1.1.1.2.2" xref="S2.E4.m1.1.1.1.1.1.1.2.2.cmml">a</mi><mi id="S2.E4.m1.1.1.1.1.1.1.2.3" xref="S2.E4.m1.1.1.1.1.1.1.2.3.cmml">t</mi></msub><mo fence="false" id="S2.E4.m1.1.1.1.1.1.1.1" xref="S2.E4.m1.1.1.1.1.1.1.1.cmml">|</mo><msub id="S2.E4.m1.1.1.1.1.1.1.3" xref="S2.E4.m1.1.1.1.1.1.1.3.cmml"><mi id="S2.E4.m1.1.1.1.1.1.1.3.2" xref="S2.E4.m1.1.1.1.1.1.1.3.2.cmml">o</mi><mi id="S2.E4.m1.1.1.1.1.1.1.3.3" xref="S2.E4.m1.1.1.1.1.1.1.3.3.cmml">t</mi></msub></mrow><mo id="S2.E4.m1.1.1.1.1.1.3" xref="S2.E4.m1.1.1.1.1.1.1.cmml">)</mo></mrow></mrow></munder><mspace id="S2.E4.m1.2.2.1.2" width="1.167em" xref="S2.E4.m1.2.2.1.1.cmml"></mspace></mrow><annotation-xml encoding="MathML-Content" id="S2.E4.m1.2b"><apply id="S2.E4.m1.2.2.1.1.cmml" xref="S2.E4.m1.2.2.1"><csymbol cd="ambiguous" id="S2.E4.m1.2.2.1.1.1.cmml" xref="S2.E4.m1.2.2.1">subscript</csymbol><apply id="S2.E4.m1.2.2.1.1.2.cmml" xref="S2.E4.m1.2.2.1.1.2"><arg id="S2.E4.m1.2.2.1.1.2.1.cmml" xref="S2.E4.m1.2.2.1.1.2.1"></arg><max id="S2.E4.m1.2.2.1.1.2.2.cmml" xref="S2.E4.m1.2.2.1.1.2.2"></max></apply><apply id="S2.E4.m1.1.1.1.cmml" xref="S2.E4.m1.1.1.1"><times id="S2.E4.m1.1.1.1.2.cmml" xref="S2.E4.m1.1.1.1.2"></times><apply id="S2.E4.m1.1.1.1.3.cmml" xref="S2.E4.m1.1.1.1.3"><csymbol cd="ambiguous" id="S2.E4.m1.1.1.1.3.1.cmml" xref="S2.E4.m1.1.1.1.3">subscript</csymbol><apply id="S2.E4.m1.1.1.1.3.2.cmml" xref="S2.E4.m1.1.1.1.3"><csymbol cd="ambiguous" id="S2.E4.m1.1.1.1.3.2.1.cmml" xref="S2.E4.m1.1.1.1.3">superscript</csymbol><ci id="S2.E4.m1.1.1.1.3.2.2.cmml" xref="S2.E4.m1.1.1.1.3.2.2">𝜋</ci><ci id="S2.E4.m1.1.1.1.3.2.3.cmml" xref="S2.E4.m1.1.1.1.3.2.3">𝑖</ci></apply><ci 
id="S2.E4.m1.1.1.1.3.3.cmml" xref="S2.E4.m1.1.1.1.3.3">𝜃</ci></apply><apply id="S2.E4.m1.1.1.1.1.1.1.cmml" xref="S2.E4.m1.1.1.1.1.1"><csymbol cd="latexml" id="S2.E4.m1.1.1.1.1.1.1.1.cmml" xref="S2.E4.m1.1.1.1.1.1.1.1">conditional</csymbol><apply id="S2.E4.m1.1.1.1.1.1.1.2.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2"><csymbol cd="ambiguous" id="S2.E4.m1.1.1.1.1.1.1.2.1.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2">subscript</csymbol><ci id="S2.E4.m1.1.1.1.1.1.1.2.2.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.2">𝑎</ci><ci id="S2.E4.m1.1.1.1.1.1.1.2.3.cmml" xref="S2.E4.m1.1.1.1.1.1.1.2.3">𝑡</ci></apply><apply id="S2.E4.m1.1.1.1.1.1.1.3.cmml" xref="S2.E4.m1.1.1.1.1.1.1.3"><csymbol cd="ambiguous" id="S2.E4.m1.1.1.1.1.1.1.3.1.cmml" xref="S2.E4.m1.1.1.1.1.1.1.3">subscript</csymbol><ci id="S2.E4.m1.1.1.1.1.1.1.3.2.cmml" xref="S2.E4.m1.1.1.1.1.1.1.3.2">𝑜</ci><ci id="S2.E4.m1.1.1.1.1.1.1.3.3.cmml" xref="S2.E4.m1.1.1.1.1.1.1.3.3">𝑡</ci></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E4.m1.2c">\displaystyle\operatorname*{\arg\!\max}_{\pi^{i}_{\theta}\left(a_{t}|o_{t}% \right)}\quad</annotation><annotation encoding="application/x-llamapun" id="S2.E4.m1.2d">start_OPERATOR roman_arg roman_max end_OPERATOR start_POSTSUBSCRIPT italic_π start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_θ end_POSTSUBSCRIPT ( italic_a start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT | italic_o start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ) end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_td ltx_align_left ltx_eqn_cell"><math alttext="\displaystyle\eta\left(\mathbb{E}\left[\sum_{t=0}^{\infty}\gamma^{t}r^{i}_{t}% \right]\right)-(1-\eta)\mathcal{L}_{BC}" class="ltx_Math" display="inline" id="S2.E4.m2.2"><semantics id="S2.E4.m2.2a"><mrow id="S2.E4.m2.2.2" xref="S2.E4.m2.2.2.cmml"><mrow id="S2.E4.m2.1.1.1" xref="S2.E4.m2.1.1.1.cmml"><mi id="S2.E4.m2.1.1.1.3" xref="S2.E4.m2.1.1.1.3.cmml">η</mi><mo id="S2.E4.m2.1.1.1.2" 
xref="S2.E4.m2.1.1.1.2.cmml">⁢</mo><mrow id="S2.E4.m2.1.1.1.1.1" xref="S2.E4.m2.1.1.1.1.1.1.cmml"><mo id="S2.E4.m2.1.1.1.1.1.2" xref="S2.E4.m2.1.1.1.1.1.1.cmml">(</mo><mrow id="S2.E4.m2.1.1.1.1.1.1" xref="S2.E4.m2.1.1.1.1.1.1.cmml"><mi id="S2.E4.m2.1.1.1.1.1.1.3" xref="S2.E4.m2.1.1.1.1.1.1.3.cmml">𝔼</mi><mo id="S2.E4.m2.1.1.1.1.1.1.2" xref="S2.E4.m2.1.1.1.1.1.1.2.cmml">⁢</mo><mrow id="S2.E4.m2.1.1.1.1.1.1.1.1" xref="S2.E4.m2.1.1.1.1.1.1.1.2.cmml"><mo id="S2.E4.m2.1.1.1.1.1.1.1.1.2" xref="S2.E4.m2.1.1.1.1.1.1.1.2.1.cmml">[</mo><mrow id="S2.E4.m2.1.1.1.1.1.1.1.1.1" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.cmml"><mstyle displaystyle="true" id="S2.E4.m2.1.1.1.1.1.1.1.1.1.1" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.cmml"><munderover id="S2.E4.m2.1.1.1.1.1.1.1.1.1.1a" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.cmml"><mo id="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.2.2" movablelimits="false" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.2.2.cmml">∑</mo><mrow id="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.2.3" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.2.3.cmml"><mi id="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.2.3.2" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.2.3.2.cmml">t</mi><mo id="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.2.3.1" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.2.3.1.cmml">=</mo><mn id="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.2.3.3" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.2.3.3.cmml">0</mn></mrow><mi id="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.3" mathvariant="normal" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.3.cmml">∞</mi></munderover></mstyle><mrow id="S2.E4.m2.1.1.1.1.1.1.1.1.1.2" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.cmml"><msup id="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.2" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.2.cmml"><mi id="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.2.2" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.2.2.cmml">γ</mi><mi id="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.2.3" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.2.3.cmml">t</mi></msup><mo id="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.1" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.1.cmml">⁢</mo><msubsup id="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.3" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.3.cmml"><mi 
id="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.3.2.2" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.3.2.2.cmml">r</mi><mi id="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.3.3" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.3.3.cmml">t</mi><mi id="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.3.2.3" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.3.2.3.cmml">i</mi></msubsup></mrow></mrow><mo id="S2.E4.m2.1.1.1.1.1.1.1.1.3" xref="S2.E4.m2.1.1.1.1.1.1.1.2.1.cmml">]</mo></mrow></mrow><mo id="S2.E4.m2.1.1.1.1.1.3" xref="S2.E4.m2.1.1.1.1.1.1.cmml">)</mo></mrow></mrow><mo id="S2.E4.m2.2.2.3" xref="S2.E4.m2.2.2.3.cmml">−</mo><mrow id="S2.E4.m2.2.2.2" xref="S2.E4.m2.2.2.2.cmml"><mrow id="S2.E4.m2.2.2.2.1.1" xref="S2.E4.m2.2.2.2.1.1.1.cmml"><mo id="S2.E4.m2.2.2.2.1.1.2" stretchy="false" xref="S2.E4.m2.2.2.2.1.1.1.cmml">(</mo><mrow id="S2.E4.m2.2.2.2.1.1.1" xref="S2.E4.m2.2.2.2.1.1.1.cmml"><mn id="S2.E4.m2.2.2.2.1.1.1.2" xref="S2.E4.m2.2.2.2.1.1.1.2.cmml">1</mn><mo id="S2.E4.m2.2.2.2.1.1.1.1" xref="S2.E4.m2.2.2.2.1.1.1.1.cmml">−</mo><mi id="S2.E4.m2.2.2.2.1.1.1.3" xref="S2.E4.m2.2.2.2.1.1.1.3.cmml">η</mi></mrow><mo id="S2.E4.m2.2.2.2.1.1.3" stretchy="false" xref="S2.E4.m2.2.2.2.1.1.1.cmml">)</mo></mrow><mo id="S2.E4.m2.2.2.2.2" xref="S2.E4.m2.2.2.2.2.cmml">⁢</mo><msub id="S2.E4.m2.2.2.2.3" xref="S2.E4.m2.2.2.2.3.cmml"><mi class="ltx_font_mathcaligraphic" id="S2.E4.m2.2.2.2.3.2" xref="S2.E4.m2.2.2.2.3.2.cmml">ℒ</mi><mrow id="S2.E4.m2.2.2.2.3.3" xref="S2.E4.m2.2.2.2.3.3.cmml"><mi id="S2.E4.m2.2.2.2.3.3.2" xref="S2.E4.m2.2.2.2.3.3.2.cmml">B</mi><mo id="S2.E4.m2.2.2.2.3.3.1" xref="S2.E4.m2.2.2.2.3.3.1.cmml">⁢</mo><mi id="S2.E4.m2.2.2.2.3.3.3" xref="S2.E4.m2.2.2.2.3.3.3.cmml">C</mi></mrow></msub></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.E4.m2.2b"><apply id="S2.E4.m2.2.2.cmml" xref="S2.E4.m2.2.2"><minus id="S2.E4.m2.2.2.3.cmml" xref="S2.E4.m2.2.2.3"></minus><apply id="S2.E4.m2.1.1.1.cmml" xref="S2.E4.m2.1.1.1"><times id="S2.E4.m2.1.1.1.2.cmml" xref="S2.E4.m2.1.1.1.2"></times><ci id="S2.E4.m2.1.1.1.3.cmml" xref="S2.E4.m2.1.1.1.3">𝜂</ci><apply 
id="S2.E4.m2.1.1.1.1.1.1.cmml" xref="S2.E4.m2.1.1.1.1.1"><times id="S2.E4.m2.1.1.1.1.1.1.2.cmml" xref="S2.E4.m2.1.1.1.1.1.1.2"></times><ci id="S2.E4.m2.1.1.1.1.1.1.3.cmml" xref="S2.E4.m2.1.1.1.1.1.1.3">𝔼</ci><apply id="S2.E4.m2.1.1.1.1.1.1.1.2.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1"><csymbol cd="latexml" id="S2.E4.m2.1.1.1.1.1.1.1.2.1.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.2">delimited-[]</csymbol><apply id="S2.E4.m2.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1"><apply id="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.1.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.1">superscript</csymbol><apply id="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.2.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.1"><csymbol cd="ambiguous" id="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.1">subscript</csymbol><sum id="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.2.2.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.2.2"></sum><apply id="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.2.3.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.2.3"><eq id="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.2.3.1.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.2.3.1"></eq><ci id="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.2.3.2.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.2.3.2">𝑡</ci><cn id="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.2.3.3.cmml" type="integer" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.2.3.3">0</cn></apply></apply><infinity id="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.3.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.1.3"></infinity></apply><apply id="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.2"><times id="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.1.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.1"></times><apply id="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.2.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.2"><csymbol cd="ambiguous" id="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.2.1.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.2">superscript</csymbol><ci id="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.2.2.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.2.2">𝛾</ci><ci 
id="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.2.3.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.2.3">𝑡</ci></apply><apply id="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.3.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.3"><csymbol cd="ambiguous" id="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.3.1.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.3">subscript</csymbol><apply id="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.3.2.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.3"><csymbol cd="ambiguous" id="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.3.2.1.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.3">superscript</csymbol><ci id="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.3.2.2.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.3.2.2">𝑟</ci><ci id="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.3.2.3.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.3.2.3">𝑖</ci></apply><ci id="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.3.3.cmml" xref="S2.E4.m2.1.1.1.1.1.1.1.1.1.2.3.3">𝑡</ci></apply></apply></apply></apply></apply></apply><apply id="S2.E4.m2.2.2.2.cmml" xref="S2.E4.m2.2.2.2"><times id="S2.E4.m2.2.2.2.2.cmml" xref="S2.E4.m2.2.2.2.2"></times><apply id="S2.E4.m2.2.2.2.1.1.1.cmml" xref="S2.E4.m2.2.2.2.1.1"><minus id="S2.E4.m2.2.2.2.1.1.1.1.cmml" xref="S2.E4.m2.2.2.2.1.1.1.1"></minus><cn id="S2.E4.m2.2.2.2.1.1.1.2.cmml" type="integer" xref="S2.E4.m2.2.2.2.1.1.1.2">1</cn><ci id="S2.E4.m2.2.2.2.1.1.1.3.cmml" xref="S2.E4.m2.2.2.2.1.1.1.3">𝜂</ci></apply><apply id="S2.E4.m2.2.2.2.3.cmml" xref="S2.E4.m2.2.2.2.3"><csymbol cd="ambiguous" id="S2.E4.m2.2.2.2.3.1.cmml" xref="S2.E4.m2.2.2.2.3">subscript</csymbol><ci id="S2.E4.m2.2.2.2.3.2.cmml" xref="S2.E4.m2.2.2.2.3.2">ℒ</ci><apply id="S2.E4.m2.2.2.2.3.3.cmml" xref="S2.E4.m2.2.2.2.3.3"><times id="S2.E4.m2.2.2.2.3.3.1.cmml" xref="S2.E4.m2.2.2.2.3.3.1"></times><ci id="S2.E4.m2.2.2.2.3.3.2.cmml" xref="S2.E4.m2.2.2.2.3.3.2">𝐵</ci><ci id="S2.E4.m2.2.2.2.3.3.3.cmml" xref="S2.E4.m2.2.2.2.3.3.3">𝐶</ci></apply></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.E4.m2.2c">\displaystyle\eta\left(\mathbb{E}\left[\sum_{t=0}^{\infty}\gamma^{t}r^{i}_{t}% 
\right]\right)-(1-\eta)\mathcal{L}_{BC}</annotation><annotation encoding="application/x-llamapun" id="S2.E4.m2.2d">italic_η ( blackboard_E [ ∑ start_POSTSUBSCRIPT italic_t = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT italic_γ start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT italic_r start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ] ) - ( 1 - italic_η ) caligraphic_L start_POSTSUBSCRIPT italic_B italic_C end_POSTSUBSCRIPT</annotation></semantics></math></td> <td class="ltx_eqn_cell ltx_eqn_center_padright"></td> <td class="ltx_eqn_cell ltx_eqn_eqno ltx_align_middle ltx_align_right" rowspan="1"><span class="ltx_tag ltx_tag_equation ltx_align_right">(4)</span></td> </tr></tbody> </table> <p class="ltx_p" id="S2.SS2.SSS4.p1.3">where <math alttext="\eta" class="ltx_Math" display="inline" id="S2.SS2.SSS4.p1.2.m1.1"><semantics id="S2.SS2.SSS4.p1.2.m1.1a"><mi id="S2.SS2.SSS4.p1.2.m1.1.1" xref="S2.SS2.SSS4.p1.2.m1.1.1.cmml">η</mi><annotation-xml encoding="MathML-Content" id="S2.SS2.SSS4.p1.2.m1.1b"><ci id="S2.SS2.SSS4.p1.2.m1.1.1.cmml" xref="S2.SS2.SSS4.p1.2.m1.1.1">𝜂</ci></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.SSS4.p1.2.m1.1c">\eta</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.SSS4.p1.2.m1.1d">italic_η</annotation></semantics></math> weighs the degree of imitation and reinforcement learning updates, and <math alttext="r^{i}_{t}=\,^{g}r^{i}_{t}+^{c}r^{i}_{t}+^{e}r^{i}_{t}" class="ltx_Math" display="inline" id="S2.SS2.SSS4.p1.3.m2.1"><semantics id="S2.SS2.SSS4.p1.3.m2.1a"><mrow id="S2.SS2.SSS4.p1.3.m2.1.1" xref="S2.SS2.SSS4.p1.3.m2.1.1.cmml"><msubsup id="S2.SS2.SSS4.p1.3.m2.1.1.2" xref="S2.SS2.SSS4.p1.3.m2.1.1.2.cmml"><mi id="S2.SS2.SSS4.p1.3.m2.1.1.2.2.2" xref="S2.SS2.SSS4.p1.3.m2.1.1.2.2.2.cmml">r</mi><mi id="S2.SS2.SSS4.p1.3.m2.1.1.2.3" xref="S2.SS2.SSS4.p1.3.m2.1.1.2.3.cmml">t</mi><mi id="S2.SS2.SSS4.p1.3.m2.1.1.2.2.3" 
xref="S2.SS2.SSS4.p1.3.m2.1.1.2.2.3.cmml">i</mi></msubsup><msup id="S2.SS2.SSS4.p1.3.m2.1.1.1" xref="S2.SS2.SSS4.p1.3.m2.1.1.1.cmml"><mo id="S2.SS2.SSS4.p1.3.m2.1.1.1.2" rspace="0.448em" xref="S2.SS2.SSS4.p1.3.m2.1.1.1.2.cmml">=</mo><mi id="S2.SS2.SSS4.p1.3.m2.1.1.1.3" xref="S2.SS2.SSS4.p1.3.m2.1.1.1.3.cmml">g</mi></msup><mrow id="S2.SS2.SSS4.p1.3.m2.1.1.3" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.cmml"><mrow id="S2.SS2.SSS4.p1.3.m2.1.1.3.2" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.cmml"><msubsup id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.2" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.2.cmml"><mi id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.2.2.2" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.2.2.2.cmml">r</mi><mi id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.2.3" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.2.3.cmml">t</mi><mi id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.2.2.3" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.2.2.3.cmml">i</mi></msubsup><msup id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.1" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.1.cmml"><mo id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.1.2" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.1.2.cmml">+</mo><mi id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.1.3" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.1.3.cmml">c</mi></msup><msubsup id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.3" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.3.cmml"><mi id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.3.2.2" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.3.2.2.cmml">r</mi><mi id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.3.3" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.3.3.cmml">t</mi><mi id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.3.2.3" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.3.2.3.cmml">i</mi></msubsup></mrow><msup id="S2.SS2.SSS4.p1.3.m2.1.1.3.1" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.1.cmml"><mo id="S2.SS2.SSS4.p1.3.m2.1.1.3.1.2" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.1.2.cmml">+</mo><mi id="S2.SS2.SSS4.p1.3.m2.1.1.3.1.3" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.1.3.cmml">e</mi></msup><msubsup id="S2.SS2.SSS4.p1.3.m2.1.1.3.3" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.3.cmml"><mi id="S2.SS2.SSS4.p1.3.m2.1.1.3.3.2.2" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.3.2.2.cmml">r</mi><mi id="S2.SS2.SSS4.p1.3.m2.1.1.3.3.3" 
xref="S2.SS2.SSS4.p1.3.m2.1.1.3.3.3.cmml">t</mi><mi id="S2.SS2.SSS4.p1.3.m2.1.1.3.3.2.3" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.3.2.3.cmml">i</mi></msubsup></mrow></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.SSS4.p1.3.m2.1b"><apply id="S2.SS2.SSS4.p1.3.m2.1.1.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1"><apply id="S2.SS2.SSS4.p1.3.m2.1.1.1.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.1"><csymbol cd="ambiguous" id="S2.SS2.SSS4.p1.3.m2.1.1.1.1.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.1">superscript</csymbol><eq id="S2.SS2.SSS4.p1.3.m2.1.1.1.2.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.1.2"></eq><ci id="S2.SS2.SSS4.p1.3.m2.1.1.1.3.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.1.3">𝑔</ci></apply><apply id="S2.SS2.SSS4.p1.3.m2.1.1.2.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.2"><csymbol cd="ambiguous" id="S2.SS2.SSS4.p1.3.m2.1.1.2.1.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.2">subscript</csymbol><apply id="S2.SS2.SSS4.p1.3.m2.1.1.2.2.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.2"><csymbol cd="ambiguous" id="S2.SS2.SSS4.p1.3.m2.1.1.2.2.1.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.2">superscript</csymbol><ci id="S2.SS2.SSS4.p1.3.m2.1.1.2.2.2.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.2.2.2">𝑟</ci><ci id="S2.SS2.SSS4.p1.3.m2.1.1.2.2.3.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.2.2.3">𝑖</ci></apply><ci id="S2.SS2.SSS4.p1.3.m2.1.1.2.3.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.2.3">𝑡</ci></apply><apply id="S2.SS2.SSS4.p1.3.m2.1.1.3.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3"><apply id="S2.SS2.SSS4.p1.3.m2.1.1.3.1.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.1"><csymbol cd="ambiguous" id="S2.SS2.SSS4.p1.3.m2.1.1.3.1.1.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.1">superscript</csymbol><plus id="S2.SS2.SSS4.p1.3.m2.1.1.3.1.2.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.1.2"></plus><ci id="S2.SS2.SSS4.p1.3.m2.1.1.3.1.3.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.1.3">𝑒</ci></apply><apply id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2"><apply id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.1.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.1"><csymbol cd="ambiguous" 
id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.1.1.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.1">superscript</csymbol><plus id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.1.2.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.1.2"></plus><ci id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.1.3.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.1.3">𝑐</ci></apply><apply id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.2.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.2"><csymbol cd="ambiguous" id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.2.1.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.2">subscript</csymbol><apply id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.2.2.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.2"><csymbol cd="ambiguous" id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.2.2.1.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.2">superscript</csymbol><ci id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.2.2.2.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.2.2.2">𝑟</ci><ci id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.2.2.3.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.2.2.3">𝑖</ci></apply><ci id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.2.3.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.2.3">𝑡</ci></apply><apply id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.3.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.3"><csymbol cd="ambiguous" id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.3.1.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.3">subscript</csymbol><apply id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.3.2.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.3"><csymbol cd="ambiguous" id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.3.2.1.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.3">superscript</csymbol><ci id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.3.2.2.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.3.2.2">𝑟</ci><ci id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.3.2.3.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.3.2.3">𝑖</ci></apply><ci id="S2.SS2.SSS4.p1.3.m2.1.1.3.2.3.3.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.2.3.3">𝑡</ci></apply></apply><apply id="S2.SS2.SSS4.p1.3.m2.1.1.3.3.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.3"><csymbol cd="ambiguous" id="S2.SS2.SSS4.p1.3.m2.1.1.3.3.1.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.3">subscript</csymbol><apply id="S2.SS2.SSS4.p1.3.m2.1.1.3.3.2.cmml" 
xref="S2.SS2.SSS4.p1.3.m2.1.1.3.3"><csymbol cd="ambiguous" id="S2.SS2.SSS4.p1.3.m2.1.1.3.3.2.1.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.3">superscript</csymbol><ci id="S2.SS2.SSS4.p1.3.m2.1.1.3.3.2.2.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.3.2.2">𝑟</ci><ci id="S2.SS2.SSS4.p1.3.m2.1.1.3.3.2.3.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.3.2.3">𝑖</ci></apply><ci id="S2.SS2.SSS4.p1.3.m2.1.1.3.3.3.cmml" xref="S2.SS2.SSS4.p1.3.m2.1.1.3.3.3">𝑡</ci></apply></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.SSS4.p1.3.m2.1c">r^{i}_{t}=\,^{g}r^{i}_{t}+^{c}r^{i}_{t}+^{e}r^{i}_{t}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.SSS4.p1.3.m2.1d">italic_r start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = start_POSTSUPERSCRIPT italic_g end_POSTSUPERSCRIPT italic_r start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT + start_POSTSUPERSCRIPT italic_c end_POSTSUPERSCRIPT italic_r start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT + start_POSTSUPERSCRIPT italic_e end_POSTSUPERSCRIPT italic_r start_POSTSUPERSCRIPT italic_i end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT</annotation></semantics></math>.</p> </div> </section> </section> </section> <section class="ltx_section" id="S3"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">III </span><span class="ltx_text ltx_font_smallcaps" id="S3.1.1">Methodology</span> </h2> <div class="ltx_para" id="S3.p1"> <p class="ltx_p" id="S3.p1.1">In this work, we adopted and adapted the AutoDRIVE Ecosystem <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib17" title="">17</a>]</cite> to model, simulate, train, and deploy two MARL case studies. 
This choice was based on the comparative analysis presented in <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib17" title="">17</a>]</cite>, which indicated that the ecosystem satisfied all the requirements of this study. From a digital twinning perspective, data-driven system identification and calibration were used to customize the models of the Nigel <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib18" title="">18</a>]</cite> and F1TENTH <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib19" title="">19</a>]</cite> vehicles from real-world data to ensure reliable simulation.</p> </div> <section class="ltx_subsection" id="S3.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S3.SS1.5.1.1">III-A</span> </span><span class="ltx_text ltx_font_italic" id="S3.SS1.6.2">Simulation Parallelization</span> </h3> <div class="ltx_para" id="S3.SS1.p1"> <p class="ltx_p" id="S3.SS1.p1.1">We leveraged the open-source nature of AutoDRIVE Simulator to implement a selectively scalable agent/environment parallelization framework. The simulator was configured to take advantage of CPU multi-threading as well as GPU instancing (where available) to efficiently parallelize various simulation objects and processes while maintaining cross-platform support. 
The following is an overview of the simulation parallelization schemes:</p> </div> <div class="ltx_para" id="S3.SS1.p2"> <ul class="ltx_itemize" id="S3.I1"> <li class="ltx_item" id="S3.I1.i1" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S3.I1.i1.p1"> <p class="ltx_p" id="S3.I1.i1.p1.1"><span class="ltx_text ltx_font_bold" id="S3.I1.i1.p1.1.1">Parallel Instances:</span> Multiple simulation instances can be spun up to train families of multi-agent systems, each isolated within its own simulation instance. This is a brute-force parallelization technique, which can cause unnecessary computational overhead.</p> </div> </li> <li class="ltx_item" id="S3.I1.i2" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S3.I1.i2.p1"> <p class="ltx_p" id="S3.I1.i2.p1.1"><span class="ltx_text ltx_font_bold" id="S3.I1.i2.p1.1.1">Parallel Environments:</span> Isolated agents can learn the same task in parallel environments, within the same simulation instance. This method can help train single/multiple agents under different environmental conditions, with slight variations between environments.</p> </div> </li> <li class="ltx_item" id="S3.I1.i3" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S3.I1.i3.p1"> <p class="ltx_p" id="S3.I1.i3.p1.1"><span class="ltx_text ltx_font_bold" id="S3.I1.i3.p1.1.1">Parallel Agents:</span> Parallel agents can learn the same task in the same environment, within the same simulation instance. The parallel agents may collide/perceive/interact with selected peers/opponents. 
Additionally, the parallel agents need not be exactly identical, which makes the learned policy robust to minor parametric variations.</p> </div> </li> </ul> </div> </section> <section class="ltx_subsection" id="S3.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S3.SS2.5.1.1">III-B</span> </span><span class="ltx_text ltx_font_italic" id="S3.SS2.6.2">Learning Architecture</span> </h3> <div class="ltx_para" id="S3.SS2.p1"> <p class="ltx_p" id="S3.SS2.p1.1">We use the proximal policy optimization (PPO) algorithm <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib20" title="">20</a>]</cite> for MARL training, for the following reasons. PPO is an on-policy method that is empirically as effective as its off-policy counterparts <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib21" title="">21</a>]</cite>. Moreover, PPO promotes stable and efficient learning by imposing two complementary constraints: (a) a clipped surrogate objective to limit each action-probability update, and (b) a KL-divergence early-stopping criterion to limit the overall policy change.</p> </div> <div class="ltx_para" id="S3.SS2.p2"> <p class="ltx_p" id="S3.SS2.p2.1">In terms of policy updates, cooperative MARL uses the collective experience of all agents to update a common policy. In contrast, competitive MARL uses the independent experience of each agent to update its individual policy. Nevertheless, in both cases, the parallelized agents contribute their experiences to update their respective herd’s policy. 
This results in distributed sampling, which improves the speed and diversity of data collection, so that the collected samples correlate better with the true state-action distribution and training is more stable.</p> </div> <div class="ltx_para" id="S3.SS2.p3"> <p class="ltx_p" id="S3.SS2.p3.1">Table <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S3.T1" title="TABLE I ‣ III-B Learning Architecture ‣ III Methodology ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_tag">I</span></a> lists the detailed training configurations adopted for the cooperative and competitive MARL scenarios. The listed parameter values were chosen by qualitatively checking that the agents’ behaviors satisfied the intended objectives while the learning process remained stable. MARL training was carried out on a single laptop PC with a 12th Gen Intel Core i9-12900H 2.50 GHz CPU, an NVIDIA GeForce RTX 3080 Ti GPU, and 32.0 GB RAM.</p> </div> <figure class="ltx_table" id="S3.T1"> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_table"><span class="ltx_text" id="S3.T1.14.1.1" style="font-size:90%;">TABLE I</span>: </span><span class="ltx_text" id="S3.T1.15.2" style="font-size:90%;">Training Configurations</span></figcaption> <div class="ltx_inline-block ltx_align_center ltx_transformed_outer" id="S3.T1.12.12" style="width:433.6pt;height:664.8pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(50.7pt,-77.7pt) scale(1.30509946762641,1.30509946762641) ;"> <table class="ltx_tabular ltx_guessed_headers ltx_align_middle" id="S3.T1.12.12.12"> <tbody class="ltx_tbody"> <tr class="ltx_tr" id="S3.T1.12.12.12.13.1"> <th class="ltx_td ltx_align_center ltx_th ltx_th_row ltx_border_r ltx_border_t" id="S3.T1.12.12.12.13.1.1"> <table class="ltx_tabular ltx_align_middle" id="S3.T1.12.12.12.13.1.1.1"> <tr class="ltx_tr" 
id="S3.T1.12.12.12.13.1.1.1.1"> <td class="ltx_td ltx_nopad_r ltx_align_center" id="S3.T1.12.12.12.13.1.1.1.1.1"><span class="ltx_text ltx_font_bold" id="S3.T1.12.12.12.13.1.1.1.1.1.1">PARAMETER</span></td> </tr> <tr class="ltx_tr" id="S3.T1.12.12.12.13.1.1.1.2"> <td class="ltx_td ltx_nopad_r ltx_align_center" id="S3.T1.12.12.12.13.1.1.1.2.1"><span class="ltx_text ltx_font_bold" id="S3.T1.12.12.12.13.1.1.1.2.1.1">DESCRIPTION</span></td> </tr> </table> </th> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S3.T1.12.12.12.13.1.2"> <table class="ltx_tabular ltx_align_middle" id="S3.T1.12.12.12.13.1.2.1"> <tr class="ltx_tr" id="S3.T1.12.12.12.13.1.2.1.1"> <td class="ltx_td ltx_nopad_r ltx_align_center" id="S3.T1.12.12.12.13.1.2.1.1.1"><span class="ltx_text ltx_font_bold" id="S3.T1.12.12.12.13.1.2.1.1.1.1">COOPERATIVE</span></td> </tr> <tr class="ltx_tr" id="S3.T1.12.12.12.13.1.2.1.2"> <td class="ltx_td ltx_nopad_r ltx_align_center" id="S3.T1.12.12.12.13.1.2.1.2.1"><span class="ltx_text ltx_font_bold" id="S3.T1.12.12.12.13.1.2.1.2.1.1">MARL</span></td> </tr> </table> </td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.12.12.12.13.1.3"> <table class="ltx_tabular ltx_align_middle" id="S3.T1.12.12.12.13.1.3.1"> <tr class="ltx_tr" id="S3.T1.12.12.12.13.1.3.1.1"> <td class="ltx_td ltx_nopad_r ltx_align_center" id="S3.T1.12.12.12.13.1.3.1.1.1"><span class="ltx_text ltx_font_bold" id="S3.T1.12.12.12.13.1.3.1.1.1.1">COMPETITIVE</span></td> </tr> <tr class="ltx_tr" id="S3.T1.12.12.12.13.1.3.1.2"> <td class="ltx_td ltx_nopad_r ltx_align_center" id="S3.T1.12.12.12.13.1.3.1.2.1"><span class="ltx_text ltx_font_bold" id="S3.T1.12.12.12.13.1.3.1.2.1.1">MARL</span></td> </tr> </table> </td> </tr> <tr class="ltx_tr" id="S3.T1.12.12.12.14.2"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_t" colspan="2" id="S3.T1.12.12.12.14.2.1"><span class="ltx_text ltx_font_bold" id="S3.T1.12.12.12.14.2.1.1">Hyperparameters</span></th> <td class="ltx_td 
ltx_border_t" id="S3.T1.12.12.12.14.2.2"></td> </tr> <tr class="ltx_tr" id="S3.T1.1.1.1.1"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_r ltx_border_t" id="S3.T1.1.1.1.1.2">Neural network architecture</th> <td class="ltx_td ltx_align_left ltx_border_t" colspan="2" id="S3.T1.1.1.1.1.1">3-layer FCNN <math alttext="\times" class="ltx_Math" display="inline" id="S3.T1.1.1.1.1.1.m1.1"><semantics id="S3.T1.1.1.1.1.1.m1.1a"><mo id="S3.T1.1.1.1.1.1.m1.1.1" xref="S3.T1.1.1.1.1.1.m1.1.1.cmml">×</mo><annotation-xml encoding="MathML-Content" id="S3.T1.1.1.1.1.1.m1.1b"><times id="S3.T1.1.1.1.1.1.m1.1.1.cmml" xref="S3.T1.1.1.1.1.1.m1.1.1"></times></annotation-xml><annotation encoding="application/x-tex" id="S3.T1.1.1.1.1.1.m1.1c">\times</annotation><annotation encoding="application/x-llamapun" id="S3.T1.1.1.1.1.1.m1.1d">×</annotation></semantics></math> {128, Swish <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib22" title="">22</a>]</cite>}</td> </tr> <tr class="ltx_tr" id="S3.T1.12.12.12.15.3"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_r" id="S3.T1.12.12.12.15.3.1">Batch size</th> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T1.12.12.12.15.3.2">64</td> <td class="ltx_td ltx_align_left" id="S3.T1.12.12.12.15.3.3">64</td> </tr> <tr class="ltx_tr" id="S3.T1.12.12.12.16.4"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_r" id="S3.T1.12.12.12.16.4.1">Buffer size</th> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T1.12.12.12.16.4.2">1024</td> <td class="ltx_td ltx_align_left" id="S3.T1.12.12.12.16.4.3">1024</td> </tr> <tr class="ltx_tr" id="S3.T1.2.2.2.2"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_r" id="S3.T1.2.2.2.2.1">Learning rate (<math alttext="\alpha" class="ltx_Math" display="inline" id="S3.T1.2.2.2.2.1.m1.1"><semantics id="S3.T1.2.2.2.2.1.m1.1a"><mi id="S3.T1.2.2.2.2.1.m1.1.1" 
xref="S3.T1.2.2.2.2.1.m1.1.1.cmml">α</mi><annotation-xml encoding="MathML-Content" id="S3.T1.2.2.2.2.1.m1.1b"><ci id="S3.T1.2.2.2.2.1.m1.1.1.cmml" xref="S3.T1.2.2.2.2.1.m1.1.1">𝛼</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.T1.2.2.2.2.1.m1.1c">\alpha</annotation><annotation encoding="application/x-llamapun" id="S3.T1.2.2.2.2.1.m1.1d">italic_α</annotation></semantics></math>)</th> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T1.2.2.2.2.2">3e-4</td> <td class="ltx_td ltx_align_left" id="S3.T1.2.2.2.2.3">3e-4</td> </tr> <tr class="ltx_tr" id="S3.T1.12.12.12.17.5"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_r" id="S3.T1.12.12.12.17.5.1">Learning rate schedule</th> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T1.12.12.12.17.5.2">Linear</td> <td class="ltx_td ltx_align_left" id="S3.T1.12.12.12.17.5.3">Linear</td> </tr> <tr class="ltx_tr" id="S3.T1.3.3.3.3"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_r" id="S3.T1.3.3.3.3.1">Entropy regularization (<math alttext="\beta" class="ltx_Math" display="inline" id="S3.T1.3.3.3.3.1.m1.1"><semantics id="S3.T1.3.3.3.3.1.m1.1a"><mi id="S3.T1.3.3.3.3.1.m1.1.1" xref="S3.T1.3.3.3.3.1.m1.1.1.cmml">β</mi><annotation-xml encoding="MathML-Content" id="S3.T1.3.3.3.3.1.m1.1b"><ci id="S3.T1.3.3.3.3.1.m1.1.1.cmml" xref="S3.T1.3.3.3.3.1.m1.1.1">𝛽</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.T1.3.3.3.3.1.m1.1c">\beta</annotation><annotation encoding="application/x-llamapun" id="S3.T1.3.3.3.3.1.m1.1d">italic_β</annotation></semantics></math>)</th> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T1.3.3.3.3.2">1e-3</td> <td class="ltx_td ltx_align_left" id="S3.T1.3.3.3.3.3">1e-3</td> </tr> <tr class="ltx_tr" id="S3.T1.4.4.4.4"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_r" id="S3.T1.4.4.4.4.1">Policy update (<math alttext="\epsilon" class="ltx_Math" display="inline" id="S3.T1.4.4.4.4.1.m1.1"><semantics 
id="S3.T1.4.4.4.4.1.m1.1a"><mi id="S3.T1.4.4.4.4.1.m1.1.1" xref="S3.T1.4.4.4.4.1.m1.1.1.cmml">ϵ</mi><annotation-xml encoding="MathML-Content" id="S3.T1.4.4.4.4.1.m1.1b"><ci id="S3.T1.4.4.4.4.1.m1.1.1.cmml" xref="S3.T1.4.4.4.4.1.m1.1.1">italic-ϵ</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.T1.4.4.4.4.1.m1.1c">\epsilon</annotation><annotation encoding="application/x-llamapun" id="S3.T1.4.4.4.4.1.m1.1d">italic_ϵ</annotation></semantics></math>)</th> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T1.4.4.4.4.2">2e-1</td> <td class="ltx_td ltx_align_left" id="S3.T1.4.4.4.4.3">2e-1</td> </tr> <tr class="ltx_tr" id="S3.T1.5.5.5.5"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_r" id="S3.T1.5.5.5.5.1">Regularization parameter (<math alttext="\lambda" class="ltx_Math" display="inline" id="S3.T1.5.5.5.5.1.m1.1"><semantics id="S3.T1.5.5.5.5.1.m1.1a"><mi id="S3.T1.5.5.5.5.1.m1.1.1" xref="S3.T1.5.5.5.5.1.m1.1.1.cmml">λ</mi><annotation-xml encoding="MathML-Content" id="S3.T1.5.5.5.5.1.m1.1b"><ci id="S3.T1.5.5.5.5.1.m1.1.1.cmml" xref="S3.T1.5.5.5.5.1.m1.1.1">𝜆</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.T1.5.5.5.5.1.m1.1c">\lambda</annotation><annotation encoding="application/x-llamapun" id="S3.T1.5.5.5.5.1.m1.1d">italic_λ</annotation></semantics></math>)</th> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T1.5.5.5.5.2">9.8e-1</td> <td class="ltx_td ltx_align_left" id="S3.T1.5.5.5.5.3">9.8e-1</td> </tr> <tr class="ltx_tr" id="S3.T1.12.12.12.18.6"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_r" id="S3.T1.12.12.12.18.6.1">Epochs</th> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T1.12.12.12.18.6.2">3</td> <td class="ltx_td ltx_align_left" id="S3.T1.12.12.12.18.6.3">3</td> </tr> <tr class="ltx_tr" id="S3.T1.6.6.6.6"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_r" id="S3.T1.6.6.6.6.1">Maximum steps (<math alttext="n_{max}" class="ltx_Math" display="inline" 
id="S3.T1.6.6.6.6.1.m1.1"><semantics id="S3.T1.6.6.6.6.1.m1.1a"><msub id="S3.T1.6.6.6.6.1.m1.1.1" xref="S3.T1.6.6.6.6.1.m1.1.1.cmml"><mi id="S3.T1.6.6.6.6.1.m1.1.1.2" xref="S3.T1.6.6.6.6.1.m1.1.1.2.cmml">n</mi><mrow id="S3.T1.6.6.6.6.1.m1.1.1.3" xref="S3.T1.6.6.6.6.1.m1.1.1.3.cmml"><mi id="S3.T1.6.6.6.6.1.m1.1.1.3.2" xref="S3.T1.6.6.6.6.1.m1.1.1.3.2.cmml">m</mi><mo id="S3.T1.6.6.6.6.1.m1.1.1.3.1" xref="S3.T1.6.6.6.6.1.m1.1.1.3.1.cmml">⁢</mo><mi id="S3.T1.6.6.6.6.1.m1.1.1.3.3" xref="S3.T1.6.6.6.6.1.m1.1.1.3.3.cmml">a</mi><mo id="S3.T1.6.6.6.6.1.m1.1.1.3.1a" xref="S3.T1.6.6.6.6.1.m1.1.1.3.1.cmml">⁢</mo><mi id="S3.T1.6.6.6.6.1.m1.1.1.3.4" xref="S3.T1.6.6.6.6.1.m1.1.1.3.4.cmml">x</mi></mrow></msub><annotation-xml encoding="MathML-Content" id="S3.T1.6.6.6.6.1.m1.1b"><apply id="S3.T1.6.6.6.6.1.m1.1.1.cmml" xref="S3.T1.6.6.6.6.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T1.6.6.6.6.1.m1.1.1.1.cmml" xref="S3.T1.6.6.6.6.1.m1.1.1">subscript</csymbol><ci id="S3.T1.6.6.6.6.1.m1.1.1.2.cmml" xref="S3.T1.6.6.6.6.1.m1.1.1.2">𝑛</ci><apply id="S3.T1.6.6.6.6.1.m1.1.1.3.cmml" xref="S3.T1.6.6.6.6.1.m1.1.1.3"><times id="S3.T1.6.6.6.6.1.m1.1.1.3.1.cmml" xref="S3.T1.6.6.6.6.1.m1.1.1.3.1"></times><ci id="S3.T1.6.6.6.6.1.m1.1.1.3.2.cmml" xref="S3.T1.6.6.6.6.1.m1.1.1.3.2">𝑚</ci><ci id="S3.T1.6.6.6.6.1.m1.1.1.3.3.cmml" xref="S3.T1.6.6.6.6.1.m1.1.1.3.3">𝑎</ci><ci id="S3.T1.6.6.6.6.1.m1.1.1.3.4.cmml" xref="S3.T1.6.6.6.6.1.m1.1.1.3.4">𝑥</ci></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T1.6.6.6.6.1.m1.1c">n_{max}</annotation><annotation encoding="application/x-llamapun" id="S3.T1.6.6.6.6.1.m1.1d">italic_n start_POSTSUBSCRIPT italic_m italic_a italic_x end_POSTSUBSCRIPT</annotation></semantics></math>)</th> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T1.6.6.6.6.2">1e6</td> <td class="ltx_td ltx_align_left" id="S3.T1.6.6.6.6.3">1e6</td> </tr> <tr class="ltx_tr" id="S3.T1.12.12.12.19.7"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_t" 
colspan="2" id="S3.T1.12.12.12.19.7.1"><span class="ltx_text ltx_font_bold" id="S3.T1.12.12.12.19.7.1.1">Behavioral Cloning</span></th> <td class="ltx_td ltx_border_t" id="S3.T1.12.12.12.19.7.2"></td> </tr> <tr class="ltx_tr" id="S3.T1.7.7.7.7"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_r ltx_border_t" id="S3.T1.7.7.7.7.1">Strength (<math alttext="\eta" class="ltx_Math" display="inline" id="S3.T1.7.7.7.7.1.m1.1"><semantics id="S3.T1.7.7.7.7.1.m1.1a"><mi id="S3.T1.7.7.7.7.1.m1.1.1" xref="S3.T1.7.7.7.7.1.m1.1.1.cmml">η</mi><annotation-xml encoding="MathML-Content" id="S3.T1.7.7.7.7.1.m1.1b"><ci id="S3.T1.7.7.7.7.1.m1.1.1.cmml" xref="S3.T1.7.7.7.7.1.m1.1.1">𝜂</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.T1.7.7.7.7.1.m1.1c">\eta</annotation><annotation encoding="application/x-llamapun" id="S3.T1.7.7.7.7.1.m1.1d">italic_η</annotation></semantics></math>)</th> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T1.7.7.7.7.2">–</td> <td class="ltx_td ltx_align_left ltx_border_t" id="S3.T1.7.7.7.7.3">5e-1</td> </tr> <tr class="ltx_tr" id="S3.T1.12.12.12.20.8"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_t" colspan="2" id="S3.T1.12.12.12.20.8.1"><span class="ltx_text ltx_font_bold" id="S3.T1.12.12.12.20.8.1.1">GAIL Reward</span></th> <td class="ltx_td ltx_border_t" id="S3.T1.12.12.12.20.8.2"></td> </tr> <tr class="ltx_tr" id="S3.T1.8.8.8.8"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_r ltx_border_t" id="S3.T1.8.8.8.8.1">Discount factor (<math alttext="{}^{g}\gamma" class="ltx_Math" display="inline" id="S3.T1.8.8.8.8.1.m1.1"><semantics id="S3.T1.8.8.8.8.1.m1.1a"><mmultiscripts id="S3.T1.8.8.8.8.1.m1.1.1" xref="S3.T1.8.8.8.8.1.m1.1.1.cmml"><mi id="S3.T1.8.8.8.8.1.m1.1.1.2" xref="S3.T1.8.8.8.8.1.m1.1.1.2.cmml">γ</mi><mprescripts id="S3.T1.8.8.8.8.1.m1.1.1a" xref="S3.T1.8.8.8.8.1.m1.1.1.cmml"></mprescripts><mrow id="S3.T1.8.8.8.8.1.m1.1.1b" 
xref="S3.T1.8.8.8.8.1.m1.1.1.cmml"></mrow><mi id="S3.T1.8.8.8.8.1.m1.1.1.3" xref="S3.T1.8.8.8.8.1.m1.1.1.3.cmml">g</mi></mmultiscripts><annotation-xml encoding="MathML-Content" id="S3.T1.8.8.8.8.1.m1.1b"><apply id="S3.T1.8.8.8.8.1.m1.1.1.cmml" xref="S3.T1.8.8.8.8.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T1.8.8.8.8.1.m1.1.1.1.cmml" xref="S3.T1.8.8.8.8.1.m1.1.1">superscript</csymbol><ci id="S3.T1.8.8.8.8.1.m1.1.1.2.cmml" xref="S3.T1.8.8.8.8.1.m1.1.1.2">𝛾</ci><ci id="S3.T1.8.8.8.8.1.m1.1.1.3.cmml" xref="S3.T1.8.8.8.8.1.m1.1.1.3">𝑔</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T1.8.8.8.8.1.m1.1c">{}^{g}\gamma</annotation><annotation encoding="application/x-llamapun" id="S3.T1.8.8.8.8.1.m1.1d">start_FLOATSUPERSCRIPT italic_g end_FLOATSUPERSCRIPT italic_γ</annotation></semantics></math>)</th> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T1.8.8.8.8.2">–</td> <td class="ltx_td ltx_align_left ltx_border_t" id="S3.T1.8.8.8.8.3">9.9e-1</td> </tr> <tr class="ltx_tr" id="S3.T1.12.12.12.21.9"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_r" id="S3.T1.12.12.12.21.9.1">Strength</th> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T1.12.12.12.21.9.2">–</td> <td class="ltx_td ltx_align_left" id="S3.T1.12.12.12.21.9.3">1e-2</td> </tr> <tr class="ltx_tr" id="S3.T1.12.12.12.22.10"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_r" id="S3.T1.12.12.12.22.10.1">Encoding size</th> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T1.12.12.12.22.10.2">–</td> <td class="ltx_td ltx_align_left" id="S3.T1.12.12.12.22.10.3">128</td> </tr> <tr class="ltx_tr" id="S3.T1.9.9.9.9"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_r" id="S3.T1.9.9.9.9.1">Learning rate (<math alttext="{}^{g}\alpha" class="ltx_Math" display="inline" id="S3.T1.9.9.9.9.1.m1.1"><semantics id="S3.T1.9.9.9.9.1.m1.1a"><mmultiscripts id="S3.T1.9.9.9.9.1.m1.1.1" xref="S3.T1.9.9.9.9.1.m1.1.1.cmml"><mi 
id="S3.T1.9.9.9.9.1.m1.1.1.2" xref="S3.T1.9.9.9.9.1.m1.1.1.2.cmml">α</mi><mprescripts id="S3.T1.9.9.9.9.1.m1.1.1a" xref="S3.T1.9.9.9.9.1.m1.1.1.cmml"></mprescripts><mrow id="S3.T1.9.9.9.9.1.m1.1.1b" xref="S3.T1.9.9.9.9.1.m1.1.1.cmml"></mrow><mi id="S3.T1.9.9.9.9.1.m1.1.1.3" xref="S3.T1.9.9.9.9.1.m1.1.1.3.cmml">g</mi></mmultiscripts><annotation-xml encoding="MathML-Content" id="S3.T1.9.9.9.9.1.m1.1b"><apply id="S3.T1.9.9.9.9.1.m1.1.1.cmml" xref="S3.T1.9.9.9.9.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T1.9.9.9.9.1.m1.1.1.1.cmml" xref="S3.T1.9.9.9.9.1.m1.1.1">superscript</csymbol><ci id="S3.T1.9.9.9.9.1.m1.1.1.2.cmml" xref="S3.T1.9.9.9.9.1.m1.1.1.2">𝛼</ci><ci id="S3.T1.9.9.9.9.1.m1.1.1.3.cmml" xref="S3.T1.9.9.9.9.1.m1.1.1.3">𝑔</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T1.9.9.9.9.1.m1.1c">{}^{g}\alpha</annotation><annotation encoding="application/x-llamapun" id="S3.T1.9.9.9.9.1.m1.1d">start_FLOATSUPERSCRIPT italic_g end_FLOATSUPERSCRIPT italic_α</annotation></semantics></math>)</th> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T1.9.9.9.9.2">–</td> <td class="ltx_td ltx_align_left" id="S3.T1.9.9.9.9.3">3e-4</td> </tr> <tr class="ltx_tr" id="S3.T1.12.12.12.23.11"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_t" colspan="2" id="S3.T1.12.12.12.23.11.1"><span class="ltx_text ltx_font_bold" id="S3.T1.12.12.12.23.11.1.1">Curiosity Reward</span></th> <td class="ltx_td ltx_border_t" id="S3.T1.12.12.12.23.11.2"></td> </tr> <tr class="ltx_tr" id="S3.T1.10.10.10.10"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_r ltx_border_t" id="S3.T1.10.10.10.10.1">Discount factor (<math alttext="{}^{c}\gamma" class="ltx_Math" display="inline" id="S3.T1.10.10.10.10.1.m1.1"><semantics id="S3.T1.10.10.10.10.1.m1.1a"><mmultiscripts id="S3.T1.10.10.10.10.1.m1.1.1" xref="S3.T1.10.10.10.10.1.m1.1.1.cmml"><mi id="S3.T1.10.10.10.10.1.m1.1.1.2" xref="S3.T1.10.10.10.10.1.m1.1.1.2.cmml">γ</mi><mprescripts 
id="S3.T1.10.10.10.10.1.m1.1.1a" xref="S3.T1.10.10.10.10.1.m1.1.1.cmml"></mprescripts><mrow id="S3.T1.10.10.10.10.1.m1.1.1b" xref="S3.T1.10.10.10.10.1.m1.1.1.cmml"></mrow><mi id="S3.T1.10.10.10.10.1.m1.1.1.3" xref="S3.T1.10.10.10.10.1.m1.1.1.3.cmml">c</mi></mmultiscripts><annotation-xml encoding="MathML-Content" id="S3.T1.10.10.10.10.1.m1.1b"><apply id="S3.T1.10.10.10.10.1.m1.1.1.cmml" xref="S3.T1.10.10.10.10.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T1.10.10.10.10.1.m1.1.1.1.cmml" xref="S3.T1.10.10.10.10.1.m1.1.1">superscript</csymbol><ci id="S3.T1.10.10.10.10.1.m1.1.1.2.cmml" xref="S3.T1.10.10.10.10.1.m1.1.1.2">𝛾</ci><ci id="S3.T1.10.10.10.10.1.m1.1.1.3.cmml" xref="S3.T1.10.10.10.10.1.m1.1.1.3">𝑐</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T1.10.10.10.10.1.m1.1c">{}^{c}\gamma</annotation><annotation encoding="application/x-llamapun" id="S3.T1.10.10.10.10.1.m1.1d">start_FLOATSUPERSCRIPT italic_c end_FLOATSUPERSCRIPT italic_γ</annotation></semantics></math>)</th> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T1.10.10.10.10.2">–</td> <td class="ltx_td ltx_align_left ltx_border_t" id="S3.T1.10.10.10.10.3">9.9e-1</td> </tr> <tr class="ltx_tr" id="S3.T1.12.12.12.24.12"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_r" id="S3.T1.12.12.12.24.12.1">Strength</th> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T1.12.12.12.24.12.2">–</td> <td class="ltx_td ltx_align_left" id="S3.T1.12.12.12.24.12.3">2e-2</td> </tr> <tr class="ltx_tr" id="S3.T1.12.12.12.25.13"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_r" id="S3.T1.12.12.12.25.13.1">Encoding size</th> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T1.12.12.12.25.13.2">–</td> <td class="ltx_td ltx_align_left" id="S3.T1.12.12.12.25.13.3">256</td> </tr> <tr class="ltx_tr" id="S3.T1.11.11.11.11"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_r" id="S3.T1.11.11.11.11.1">Learning rate (<math alttext="{}^{c}\alpha" 
class="ltx_Math" display="inline" id="S3.T1.11.11.11.11.1.m1.1"><semantics id="S3.T1.11.11.11.11.1.m1.1a"><mmultiscripts id="S3.T1.11.11.11.11.1.m1.1.1" xref="S3.T1.11.11.11.11.1.m1.1.1.cmml"><mi id="S3.T1.11.11.11.11.1.m1.1.1.2" xref="S3.T1.11.11.11.11.1.m1.1.1.2.cmml">α</mi><mprescripts id="S3.T1.11.11.11.11.1.m1.1.1a" xref="S3.T1.11.11.11.11.1.m1.1.1.cmml"></mprescripts><mrow id="S3.T1.11.11.11.11.1.m1.1.1b" xref="S3.T1.11.11.11.11.1.m1.1.1.cmml"></mrow><mi id="S3.T1.11.11.11.11.1.m1.1.1.3" xref="S3.T1.11.11.11.11.1.m1.1.1.3.cmml">c</mi></mmultiscripts><annotation-xml encoding="MathML-Content" id="S3.T1.11.11.11.11.1.m1.1b"><apply id="S3.T1.11.11.11.11.1.m1.1.1.cmml" xref="S3.T1.11.11.11.11.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T1.11.11.11.11.1.m1.1.1.1.cmml" xref="S3.T1.11.11.11.11.1.m1.1.1">superscript</csymbol><ci id="S3.T1.11.11.11.11.1.m1.1.1.2.cmml" xref="S3.T1.11.11.11.11.1.m1.1.1.2">𝛼</ci><ci id="S3.T1.11.11.11.11.1.m1.1.1.3.cmml" xref="S3.T1.11.11.11.11.1.m1.1.1.3">𝑐</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T1.11.11.11.11.1.m1.1c">{}^{c}\alpha</annotation><annotation encoding="application/x-llamapun" id="S3.T1.11.11.11.11.1.m1.1d">start_FLOATSUPERSCRIPT italic_c end_FLOATSUPERSCRIPT italic_α</annotation></semantics></math>)</th> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T1.11.11.11.11.2">–</td> <td class="ltx_td ltx_align_left" id="S3.T1.11.11.11.11.3">3e-4</td> </tr> <tr class="ltx_tr" id="S3.T1.12.12.12.26.14"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_t" colspan="2" id="S3.T1.12.12.12.26.14.1"><span class="ltx_text ltx_font_bold" id="S3.T1.12.12.12.26.14.1.1">Extrinsic Reward</span></th> <td class="ltx_td ltx_border_t" id="S3.T1.12.12.12.26.14.2"></td> </tr> <tr class="ltx_tr" id="S3.T1.12.12.12.12"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_r ltx_border_t" id="S3.T1.12.12.12.12.1">Discount factor (<math alttext="{}^{e}\gamma" class="ltx_Math" display="inline" 
id="S3.T1.12.12.12.12.1.m1.1"><semantics id="S3.T1.12.12.12.12.1.m1.1a"><mmultiscripts id="S3.T1.12.12.12.12.1.m1.1.1" xref="S3.T1.12.12.12.12.1.m1.1.1.cmml"><mi id="S3.T1.12.12.12.12.1.m1.1.1.2" xref="S3.T1.12.12.12.12.1.m1.1.1.2.cmml">γ</mi><mprescripts id="S3.T1.12.12.12.12.1.m1.1.1a" xref="S3.T1.12.12.12.12.1.m1.1.1.cmml"></mprescripts><mrow id="S3.T1.12.12.12.12.1.m1.1.1b" xref="S3.T1.12.12.12.12.1.m1.1.1.cmml"></mrow><mi id="S3.T1.12.12.12.12.1.m1.1.1.3" xref="S3.T1.12.12.12.12.1.m1.1.1.3.cmml">e</mi></mmultiscripts><annotation-xml encoding="MathML-Content" id="S3.T1.12.12.12.12.1.m1.1b"><apply id="S3.T1.12.12.12.12.1.m1.1.1.cmml" xref="S3.T1.12.12.12.12.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T1.12.12.12.12.1.m1.1.1.1.cmml" xref="S3.T1.12.12.12.12.1.m1.1.1">superscript</csymbol><ci id="S3.T1.12.12.12.12.1.m1.1.1.2.cmml" xref="S3.T1.12.12.12.12.1.m1.1.1.2">𝛾</ci><ci id="S3.T1.12.12.12.12.1.m1.1.1.3.cmml" xref="S3.T1.12.12.12.12.1.m1.1.1.3">𝑒</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T1.12.12.12.12.1.m1.1c">{}^{e}\gamma</annotation><annotation encoding="application/x-llamapun" id="S3.T1.12.12.12.12.1.m1.1d">start_FLOATSUPERSCRIPT italic_e end_FLOATSUPERSCRIPT italic_γ</annotation></semantics></math>)</th> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T1.12.12.12.12.2">–</td> <td class="ltx_td ltx_align_left ltx_border_t" id="S3.T1.12.12.12.12.3">9.9e-1</td> </tr> <tr class="ltx_tr" id="S3.T1.12.12.12.27.15"> <th class="ltx_td ltx_align_left ltx_th ltx_th_row ltx_border_b ltx_border_r" id="S3.T1.12.12.12.27.15.1">Strength</th> <td class="ltx_td ltx_align_left ltx_border_b ltx_border_r" id="S3.T1.12.12.12.27.15.2">–</td> <td class="ltx_td ltx_align_left ltx_border_b" id="S3.T1.12.12.12.27.15.3">1.0</td> </tr> </tbody> </table> </span></div> </figure> </section> <section class="ltx_subsection" id="S3.SS3"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span 
class="ltx_text" id="S3.SS3.5.1.1">III-C</span> </span><span class="ltx_text ltx_font_italic" id="S3.SS3.6.2">Domain Randomization</span> </h3> <div class="ltx_para" id="S3.SS3.p1"> <p class="ltx_p" id="S3.SS3.p1.5">We leveraged the simulation parallelization architecture to introduce systematic domain randomization <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib12" title="">12</a>]</cite> across <math alttext="k" class="ltx_Math" display="inline" id="S3.SS3.p1.1.m1.1"><semantics id="S3.SS3.p1.1.m1.1a"><mi id="S3.SS3.p1.1.m1.1.1" xref="S3.SS3.p1.1.m1.1.1.cmml">k</mi><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.1.m1.1b"><ci id="S3.SS3.p1.1.m1.1.1.cmml" xref="S3.SS3.p1.1.m1.1.1">𝑘</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.1.m1.1c">k</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.1.m1.1d">italic_k</annotation></semantics></math> agent/environment replicas. This allowed us to maintain solver consistency across simulation time steps while introducing dynamical perturbations, an approach that has not been explored in the literature. Table <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S3.T2" title="TABLE II ‣ III-C Domain Randomization ‣ III Methodology ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_tag">II</span></a> lists the detailed domain randomization parameters for the cooperative and competitive MARL scenarios. In particular, since the cooperative MARL scenario was environment-parallelized, we varied the dynamics of each environment replica in this case. In contrast, since the competitive MARL scenario was agent-parallelized, we varied the dynamics of each agent replica in this case. In both cases, we also added noise to the agents’ observations and actions at each time step. 
Here, the parameter <math alttext="\xi" class="ltx_Math" display="inline" id="S3.SS3.p1.2.m2.1"><semantics id="S3.SS3.p1.2.m2.1a"><mi id="S3.SS3.p1.2.m2.1.1" xref="S3.SS3.p1.2.m2.1.1.cmml">ξ</mi><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.2.m2.1b"><ci id="S3.SS3.p1.2.m2.1.1.cmml" xref="S3.SS3.p1.2.m2.1.1">𝜉</ci></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.2.m2.1c">\xi</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.2.m2.1d">italic_ξ</annotation></semantics></math> denotes the degree of domain randomization. In this work, we analyze the effect of no domain randomization (NDR), i.e. <math alttext="\xi=0" class="ltx_Math" display="inline" id="S3.SS3.p1.3.m3.1"><semantics id="S3.SS3.p1.3.m3.1a"><mrow id="S3.SS3.p1.3.m3.1.1" xref="S3.SS3.p1.3.m3.1.1.cmml"><mi id="S3.SS3.p1.3.m3.1.1.2" xref="S3.SS3.p1.3.m3.1.1.2.cmml">ξ</mi><mo id="S3.SS3.p1.3.m3.1.1.1" xref="S3.SS3.p1.3.m3.1.1.1.cmml">=</mo><mn id="S3.SS3.p1.3.m3.1.1.3" xref="S3.SS3.p1.3.m3.1.1.3.cmml">0</mn></mrow><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.3.m3.1b"><apply id="S3.SS3.p1.3.m3.1.1.cmml" xref="S3.SS3.p1.3.m3.1.1"><eq id="S3.SS3.p1.3.m3.1.1.1.cmml" xref="S3.SS3.p1.3.m3.1.1.1"></eq><ci id="S3.SS3.p1.3.m3.1.1.2.cmml" xref="S3.SS3.p1.3.m3.1.1.2">𝜉</ci><cn id="S3.SS3.p1.3.m3.1.1.3.cmml" type="integer" xref="S3.SS3.p1.3.m3.1.1.3">0</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.3.m3.1c">\xi=0</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.3.m3.1d">italic_ξ = 0</annotation></semantics></math>, low domain randomization (LDR), i.e. 
<math alttext="\xi=1" class="ltx_Math" display="inline" id="S3.SS3.p1.4.m4.1"><semantics id="S3.SS3.p1.4.m4.1a"><mrow id="S3.SS3.p1.4.m4.1.1" xref="S3.SS3.p1.4.m4.1.1.cmml"><mi id="S3.SS3.p1.4.m4.1.1.2" xref="S3.SS3.p1.4.m4.1.1.2.cmml">ξ</mi><mo id="S3.SS3.p1.4.m4.1.1.1" xref="S3.SS3.p1.4.m4.1.1.1.cmml">=</mo><mn id="S3.SS3.p1.4.m4.1.1.3" xref="S3.SS3.p1.4.m4.1.1.3.cmml">1</mn></mrow><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.4.m4.1b"><apply id="S3.SS3.p1.4.m4.1.1.cmml" xref="S3.SS3.p1.4.m4.1.1"><eq id="S3.SS3.p1.4.m4.1.1.1.cmml" xref="S3.SS3.p1.4.m4.1.1.1"></eq><ci id="S3.SS3.p1.4.m4.1.1.2.cmml" xref="S3.SS3.p1.4.m4.1.1.2">𝜉</ci><cn id="S3.SS3.p1.4.m4.1.1.3.cmml" type="integer" xref="S3.SS3.p1.4.m4.1.1.3">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.4.m4.1c">\xi=1</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.4.m4.1d">italic_ξ = 1</annotation></semantics></math>, and high domain randomization (HDR), i.e. <math alttext="\xi=2" class="ltx_Math" display="inline" id="S3.SS3.p1.5.m5.1"><semantics id="S3.SS3.p1.5.m5.1a"><mrow id="S3.SS3.p1.5.m5.1.1" xref="S3.SS3.p1.5.m5.1.1.cmml"><mi id="S3.SS3.p1.5.m5.1.1.2" xref="S3.SS3.p1.5.m5.1.1.2.cmml">ξ</mi><mo id="S3.SS3.p1.5.m5.1.1.1" xref="S3.SS3.p1.5.m5.1.1.1.cmml">=</mo><mn id="S3.SS3.p1.5.m5.1.1.3" xref="S3.SS3.p1.5.m5.1.1.3.cmml">2</mn></mrow><annotation-xml encoding="MathML-Content" id="S3.SS3.p1.5.m5.1b"><apply id="S3.SS3.p1.5.m5.1.1.cmml" xref="S3.SS3.p1.5.m5.1.1"><eq id="S3.SS3.p1.5.m5.1.1.1.cmml" xref="S3.SS3.p1.5.m5.1.1.1"></eq><ci id="S3.SS3.p1.5.m5.1.1.2.cmml" xref="S3.SS3.p1.5.m5.1.1.2">𝜉</ci><cn id="S3.SS3.p1.5.m5.1.1.3.cmml" type="integer" xref="S3.SS3.p1.5.m5.1.1.3">2</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.SS3.p1.5.m5.1c">\xi=2</annotation><annotation encoding="application/x-llamapun" id="S3.SS3.p1.5.m5.1d">italic_ξ = 2</annotation></semantics></math>.</p> </div> <figure class="ltx_table" 
id="S3.T2"> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_table"><span class="ltx_text" id="S3.T2.30.1.1" style="font-size:90%;">TABLE II</span>: </span><span class="ltx_text" id="S3.T2.31.2" style="font-size:90%;">Domain Randomization</span></figcaption> <div class="ltx_inline-block ltx_align_center ltx_transformed_outer" id="S3.T2.28.28" style="width:433.6pt;height:361.2pt;vertical-align:-0.0pt;"><span class="ltx_transformed_inner" style="transform:translate(29.9pt,-24.9pt) scale(1.15980142051566,1.15980142051566) ;"> <table class="ltx_tabular ltx_align_middle" id="S3.T2.28.28.28"> <tbody class="ltx_tbody"> <tr class="ltx_tr" id="S3.T2.28.28.28.29.1"> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S3.T2.28.28.28.29.1.1"> <table class="ltx_tabular ltx_align_middle" id="S3.T2.28.28.28.29.1.1.1"> <tr class="ltx_tr" id="S3.T2.28.28.28.29.1.1.1.1"> <td class="ltx_td ltx_nopad_r ltx_align_center" id="S3.T2.28.28.28.29.1.1.1.1.1"><span class="ltx_text ltx_font_bold" id="S3.T2.28.28.28.29.1.1.1.1.1.1">PARAMETER</span></td> </tr> <tr class="ltx_tr" id="S3.T2.28.28.28.29.1.1.1.2"> <td class="ltx_td ltx_nopad_r ltx_align_center" id="S3.T2.28.28.28.29.1.1.1.2.1"><span class="ltx_text ltx_font_bold" id="S3.T2.28.28.28.29.1.1.1.2.1.1">DESCRIPTION</span></td> </tr> </table> </td> <td class="ltx_td ltx_align_center ltx_border_r ltx_border_t" id="S3.T2.28.28.28.29.1.2"> <table class="ltx_tabular ltx_align_middle" id="S3.T2.28.28.28.29.1.2.1"> <tr class="ltx_tr" id="S3.T2.28.28.28.29.1.2.1.1"> <td class="ltx_td ltx_nopad_r ltx_align_center" id="S3.T2.28.28.28.29.1.2.1.1.1"><span class="ltx_text ltx_font_bold" id="S3.T2.28.28.28.29.1.2.1.1.1.1">COOPERATIVE</span></td> </tr> <tr class="ltx_tr" id="S3.T2.28.28.28.29.1.2.1.2"> <td class="ltx_td ltx_nopad_r ltx_align_center" id="S3.T2.28.28.28.29.1.2.1.2.1"><span class="ltx_text ltx_font_bold" id="S3.T2.28.28.28.29.1.2.1.2.1.1">MARL</span></td> </tr> </table> </td> <td class="ltx_td ltx_align_center 
ltx_border_t" id="S3.T2.28.28.28.29.1.3"> <table class="ltx_tabular ltx_align_middle" id="S3.T2.28.28.28.29.1.3.1"> <tr class="ltx_tr" id="S3.T2.28.28.28.29.1.3.1.1"> <td class="ltx_td ltx_nopad_r ltx_align_center" id="S3.T2.28.28.28.29.1.3.1.1.1"><span class="ltx_text ltx_font_bold" id="S3.T2.28.28.28.29.1.3.1.1.1.1">COMPETITIVE</span></td> </tr> <tr class="ltx_tr" id="S3.T2.28.28.28.29.1.3.1.2"> <td class="ltx_td ltx_nopad_r ltx_align_center" id="S3.T2.28.28.28.29.1.3.1.2.1"><span class="ltx_text ltx_font_bold" id="S3.T2.28.28.28.29.1.3.1.2.1.1">MARL</span></td> </tr> </table> </td> </tr> <tr class="ltx_tr" id="S3.T2.28.28.28.30.2"> <td class="ltx_td ltx_align_left ltx_border_t" colspan="2" id="S3.T2.28.28.28.30.2.1"><span class="ltx_text ltx_font_bold" id="S3.T2.28.28.28.30.2.1.1">Observation Noise</span></td> <td class="ltx_td ltx_border_t" id="S3.T2.28.28.28.30.2.2"></td> </tr> <tr class="ltx_tr" id="S3.T2.3.3.3.3"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T2.2.2.2.2.2">Position (<math alttext="w_{x}^{k}" class="ltx_Math" display="inline" id="S3.T2.1.1.1.1.1.m1.1"><semantics id="S3.T2.1.1.1.1.1.m1.1a"><msubsup id="S3.T2.1.1.1.1.1.m1.1.1" xref="S3.T2.1.1.1.1.1.m1.1.1.cmml"><mi id="S3.T2.1.1.1.1.1.m1.1.1.2.2" xref="S3.T2.1.1.1.1.1.m1.1.1.2.2.cmml">w</mi><mi id="S3.T2.1.1.1.1.1.m1.1.1.2.3" xref="S3.T2.1.1.1.1.1.m1.1.1.2.3.cmml">x</mi><mi id="S3.T2.1.1.1.1.1.m1.1.1.3" xref="S3.T2.1.1.1.1.1.m1.1.1.3.cmml">k</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.T2.1.1.1.1.1.m1.1b"><apply id="S3.T2.1.1.1.1.1.m1.1.1.cmml" xref="S3.T2.1.1.1.1.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T2.1.1.1.1.1.m1.1.1.1.cmml" xref="S3.T2.1.1.1.1.1.m1.1.1">superscript</csymbol><apply id="S3.T2.1.1.1.1.1.m1.1.1.2.cmml" xref="S3.T2.1.1.1.1.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T2.1.1.1.1.1.m1.1.1.2.1.cmml" xref="S3.T2.1.1.1.1.1.m1.1.1">subscript</csymbol><ci id="S3.T2.1.1.1.1.1.m1.1.1.2.2.cmml" xref="S3.T2.1.1.1.1.1.m1.1.1.2.2">𝑤</ci><ci 
id="S3.T2.1.1.1.1.1.m1.1.1.2.3.cmml" xref="S3.T2.1.1.1.1.1.m1.1.1.2.3">𝑥</ci></apply><ci id="S3.T2.1.1.1.1.1.m1.1.1.3.cmml" xref="S3.T2.1.1.1.1.1.m1.1.1.3">𝑘</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T2.1.1.1.1.1.m1.1c">w_{x}^{k}</annotation><annotation encoding="application/x-llamapun" id="S3.T2.1.1.1.1.1.m1.1d">italic_w start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT</annotation></semantics></math>, <math alttext="w_{y}^{k}" class="ltx_Math" display="inline" id="S3.T2.2.2.2.2.2.m2.1"><semantics id="S3.T2.2.2.2.2.2.m2.1a"><msubsup id="S3.T2.2.2.2.2.2.m2.1.1" xref="S3.T2.2.2.2.2.2.m2.1.1.cmml"><mi id="S3.T2.2.2.2.2.2.m2.1.1.2.2" xref="S3.T2.2.2.2.2.2.m2.1.1.2.2.cmml">w</mi><mi id="S3.T2.2.2.2.2.2.m2.1.1.2.3" xref="S3.T2.2.2.2.2.2.m2.1.1.2.3.cmml">y</mi><mi id="S3.T2.2.2.2.2.2.m2.1.1.3" xref="S3.T2.2.2.2.2.2.m2.1.1.3.cmml">k</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.T2.2.2.2.2.2.m2.1b"><apply id="S3.T2.2.2.2.2.2.m2.1.1.cmml" xref="S3.T2.2.2.2.2.2.m2.1.1"><csymbol cd="ambiguous" id="S3.T2.2.2.2.2.2.m2.1.1.1.cmml" xref="S3.T2.2.2.2.2.2.m2.1.1">superscript</csymbol><apply id="S3.T2.2.2.2.2.2.m2.1.1.2.cmml" xref="S3.T2.2.2.2.2.2.m2.1.1"><csymbol cd="ambiguous" id="S3.T2.2.2.2.2.2.m2.1.1.2.1.cmml" xref="S3.T2.2.2.2.2.2.m2.1.1">subscript</csymbol><ci id="S3.T2.2.2.2.2.2.m2.1.1.2.2.cmml" xref="S3.T2.2.2.2.2.2.m2.1.1.2.2">𝑤</ci><ci id="S3.T2.2.2.2.2.2.m2.1.1.2.3.cmml" xref="S3.T2.2.2.2.2.2.m2.1.1.2.3">𝑦</ci></apply><ci id="S3.T2.2.2.2.2.2.m2.1.1.3.cmml" xref="S3.T2.2.2.2.2.2.m2.1.1.3">𝑘</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T2.2.2.2.2.2.m2.1c">w_{y}^{k}</annotation><annotation encoding="application/x-llamapun" id="S3.T2.2.2.2.2.2.m2.1d">italic_w start_POSTSUBSCRIPT italic_y end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT</annotation></semantics></math>)</td> <td class="ltx_td ltx_align_left ltx_border_r 
ltx_border_t" id="S3.T2.3.3.3.3.3"> <math alttext="\xi\cdot N" class="ltx_Math" display="inline" id="S3.T2.3.3.3.3.3.m1.1"><semantics id="S3.T2.3.3.3.3.3.m1.1a"><mrow id="S3.T2.3.3.3.3.3.m1.1.1" xref="S3.T2.3.3.3.3.3.m1.1.1.cmml"><mi id="S3.T2.3.3.3.3.3.m1.1.1.2" xref="S3.T2.3.3.3.3.3.m1.1.1.2.cmml">ξ</mi><mo id="S3.T2.3.3.3.3.3.m1.1.1.1" lspace="0.222em" rspace="0.222em" xref="S3.T2.3.3.3.3.3.m1.1.1.1.cmml">⋅</mo><mi id="S3.T2.3.3.3.3.3.m1.1.1.3" xref="S3.T2.3.3.3.3.3.m1.1.1.3.cmml">N</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.T2.3.3.3.3.3.m1.1b"><apply id="S3.T2.3.3.3.3.3.m1.1.1.cmml" xref="S3.T2.3.3.3.3.3.m1.1.1"><ci id="S3.T2.3.3.3.3.3.m1.1.1.1.cmml" xref="S3.T2.3.3.3.3.3.m1.1.1.1">⋅</ci><ci id="S3.T2.3.3.3.3.3.m1.1.1.2.cmml" xref="S3.T2.3.3.3.3.3.m1.1.1.2">𝜉</ci><ci id="S3.T2.3.3.3.3.3.m1.1.1.3.cmml" xref="S3.T2.3.3.3.3.3.m1.1.1.3">𝑁</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T2.3.3.3.3.3.m1.1c">\xi\cdot N</annotation><annotation encoding="application/x-llamapun" id="S3.T2.3.3.3.3.3.m1.1d">italic_ξ ⋅ italic_N</annotation></semantics></math>(0,1e-4) m</td> <td class="ltx_td ltx_align_left ltx_border_t" id="S3.T2.3.3.3.3.4">–</td> </tr> <tr class="ltx_tr" id="S3.T2.5.5.5.5"> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T2.4.4.4.4.1">Orientation (<math alttext="w_{\psi}^{k}" class="ltx_Math" display="inline" id="S3.T2.4.4.4.4.1.m1.1"><semantics id="S3.T2.4.4.4.4.1.m1.1a"><msubsup id="S3.T2.4.4.4.4.1.m1.1.1" xref="S3.T2.4.4.4.4.1.m1.1.1.cmml"><mi id="S3.T2.4.4.4.4.1.m1.1.1.2.2" xref="S3.T2.4.4.4.4.1.m1.1.1.2.2.cmml">w</mi><mi id="S3.T2.4.4.4.4.1.m1.1.1.2.3" xref="S3.T2.4.4.4.4.1.m1.1.1.2.3.cmml">ψ</mi><mi id="S3.T2.4.4.4.4.1.m1.1.1.3" xref="S3.T2.4.4.4.4.1.m1.1.1.3.cmml">k</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.T2.4.4.4.4.1.m1.1b"><apply id="S3.T2.4.4.4.4.1.m1.1.1.cmml" xref="S3.T2.4.4.4.4.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T2.4.4.4.4.1.m1.1.1.1.cmml" 
xref="S3.T2.4.4.4.4.1.m1.1.1">superscript</csymbol><apply id="S3.T2.4.4.4.4.1.m1.1.1.2.cmml" xref="S3.T2.4.4.4.4.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T2.4.4.4.4.1.m1.1.1.2.1.cmml" xref="S3.T2.4.4.4.4.1.m1.1.1">subscript</csymbol><ci id="S3.T2.4.4.4.4.1.m1.1.1.2.2.cmml" xref="S3.T2.4.4.4.4.1.m1.1.1.2.2">𝑤</ci><ci id="S3.T2.4.4.4.4.1.m1.1.1.2.3.cmml" xref="S3.T2.4.4.4.4.1.m1.1.1.2.3">𝜓</ci></apply><ci id="S3.T2.4.4.4.4.1.m1.1.1.3.cmml" xref="S3.T2.4.4.4.4.1.m1.1.1.3">𝑘</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T2.4.4.4.4.1.m1.1c">w_{\psi}^{k}</annotation><annotation encoding="application/x-llamapun" id="S3.T2.4.4.4.4.1.m1.1d">italic_w start_POSTSUBSCRIPT italic_ψ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT</annotation></semantics></math>)</td> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T2.5.5.5.5.2"> <math alttext="\xi\cdot N" class="ltx_Math" display="inline" id="S3.T2.5.5.5.5.2.m1.1"><semantics id="S3.T2.5.5.5.5.2.m1.1a"><mrow id="S3.T2.5.5.5.5.2.m1.1.1" xref="S3.T2.5.5.5.5.2.m1.1.1.cmml"><mi id="S3.T2.5.5.5.5.2.m1.1.1.2" xref="S3.T2.5.5.5.5.2.m1.1.1.2.cmml">ξ</mi><mo id="S3.T2.5.5.5.5.2.m1.1.1.1" lspace="0.222em" rspace="0.222em" xref="S3.T2.5.5.5.5.2.m1.1.1.1.cmml">⋅</mo><mi id="S3.T2.5.5.5.5.2.m1.1.1.3" xref="S3.T2.5.5.5.5.2.m1.1.1.3.cmml">N</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.T2.5.5.5.5.2.m1.1b"><apply id="S3.T2.5.5.5.5.2.m1.1.1.cmml" xref="S3.T2.5.5.5.5.2.m1.1.1"><ci id="S3.T2.5.5.5.5.2.m1.1.1.1.cmml" xref="S3.T2.5.5.5.5.2.m1.1.1.1">⋅</ci><ci id="S3.T2.5.5.5.5.2.m1.1.1.2.cmml" xref="S3.T2.5.5.5.5.2.m1.1.1.2">𝜉</ci><ci id="S3.T2.5.5.5.5.2.m1.1.1.3.cmml" xref="S3.T2.5.5.5.5.2.m1.1.1.3">𝑁</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T2.5.5.5.5.2.m1.1c">\xi\cdot N</annotation><annotation encoding="application/x-llamapun" id="S3.T2.5.5.5.5.2.m1.1d">italic_ξ ⋅ italic_N</annotation></semantics></math>(0,3.0625e-4) rad</td> <td 
class="ltx_td ltx_align_left" id="S3.T2.5.5.5.5.3">–</td> </tr> <tr class="ltx_tr" id="S3.T2.8.8.8.8"> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T2.6.6.6.6.1">Velocity (<math alttext="w_{v}^{k}" class="ltx_Math" display="inline" id="S3.T2.6.6.6.6.1.m1.1"><semantics id="S3.T2.6.6.6.6.1.m1.1a"><msubsup id="S3.T2.6.6.6.6.1.m1.1.1" xref="S3.T2.6.6.6.6.1.m1.1.1.cmml"><mi id="S3.T2.6.6.6.6.1.m1.1.1.2.2" xref="S3.T2.6.6.6.6.1.m1.1.1.2.2.cmml">w</mi><mi id="S3.T2.6.6.6.6.1.m1.1.1.2.3" xref="S3.T2.6.6.6.6.1.m1.1.1.2.3.cmml">v</mi><mi id="S3.T2.6.6.6.6.1.m1.1.1.3" xref="S3.T2.6.6.6.6.1.m1.1.1.3.cmml">k</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.T2.6.6.6.6.1.m1.1b"><apply id="S3.T2.6.6.6.6.1.m1.1.1.cmml" xref="S3.T2.6.6.6.6.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T2.6.6.6.6.1.m1.1.1.1.cmml" xref="S3.T2.6.6.6.6.1.m1.1.1">superscript</csymbol><apply id="S3.T2.6.6.6.6.1.m1.1.1.2.cmml" xref="S3.T2.6.6.6.6.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T2.6.6.6.6.1.m1.1.1.2.1.cmml" xref="S3.T2.6.6.6.6.1.m1.1.1">subscript</csymbol><ci id="S3.T2.6.6.6.6.1.m1.1.1.2.2.cmml" xref="S3.T2.6.6.6.6.1.m1.1.1.2.2">𝑤</ci><ci id="S3.T2.6.6.6.6.1.m1.1.1.2.3.cmml" xref="S3.T2.6.6.6.6.1.m1.1.1.2.3">𝑣</ci></apply><ci id="S3.T2.6.6.6.6.1.m1.1.1.3.cmml" xref="S3.T2.6.6.6.6.1.m1.1.1.3">𝑘</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T2.6.6.6.6.1.m1.1c">w_{v}^{k}</annotation><annotation encoding="application/x-llamapun" id="S3.T2.6.6.6.6.1.m1.1d">italic_w start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT</annotation></semantics></math>)</td> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T2.7.7.7.7.2"> <math alttext="\xi\cdot N" class="ltx_Math" display="inline" id="S3.T2.7.7.7.7.2.m1.1"><semantics id="S3.T2.7.7.7.7.2.m1.1a"><mrow id="S3.T2.7.7.7.7.2.m1.1.1" xref="S3.T2.7.7.7.7.2.m1.1.1.cmml"><mi id="S3.T2.7.7.7.7.2.m1.1.1.2" xref="S3.T2.7.7.7.7.2.m1.1.1.2.cmml">ξ</mi><mo 
id="S3.T2.7.7.7.7.2.m1.1.1.1" lspace="0.222em" rspace="0.222em" xref="S3.T2.7.7.7.7.2.m1.1.1.1.cmml">⋅</mo><mi id="S3.T2.7.7.7.7.2.m1.1.1.3" xref="S3.T2.7.7.7.7.2.m1.1.1.3.cmml">N</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.T2.7.7.7.7.2.m1.1b"><apply id="S3.T2.7.7.7.7.2.m1.1.1.cmml" xref="S3.T2.7.7.7.7.2.m1.1.1"><ci id="S3.T2.7.7.7.7.2.m1.1.1.1.cmml" xref="S3.T2.7.7.7.7.2.m1.1.1.1">⋅</ci><ci id="S3.T2.7.7.7.7.2.m1.1.1.2.cmml" xref="S3.T2.7.7.7.7.2.m1.1.1.2">𝜉</ci><ci id="S3.T2.7.7.7.7.2.m1.1.1.3.cmml" xref="S3.T2.7.7.7.7.2.m1.1.1.3">𝑁</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T2.7.7.7.7.2.m1.1c">\xi\cdot N</annotation><annotation encoding="application/x-llamapun" id="S3.T2.7.7.7.7.2.m1.1d">italic_ξ ⋅ italic_N</annotation></semantics></math>(0,1e-4) m/s</td> <td class="ltx_td ltx_align_left" id="S3.T2.8.8.8.8.3"> <math alttext="\xi\cdot N" class="ltx_Math" display="inline" id="S3.T2.8.8.8.8.3.m1.1"><semantics id="S3.T2.8.8.8.8.3.m1.1a"><mrow id="S3.T2.8.8.8.8.3.m1.1.1" xref="S3.T2.8.8.8.8.3.m1.1.1.cmml"><mi id="S3.T2.8.8.8.8.3.m1.1.1.2" xref="S3.T2.8.8.8.8.3.m1.1.1.2.cmml">ξ</mi><mo id="S3.T2.8.8.8.8.3.m1.1.1.1" lspace="0.222em" rspace="0.222em" xref="S3.T2.8.8.8.8.3.m1.1.1.1.cmml">⋅</mo><mi id="S3.T2.8.8.8.8.3.m1.1.1.3" xref="S3.T2.8.8.8.8.3.m1.1.1.3.cmml">N</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.T2.8.8.8.8.3.m1.1b"><apply id="S3.T2.8.8.8.8.3.m1.1.1.cmml" xref="S3.T2.8.8.8.8.3.m1.1.1"><ci id="S3.T2.8.8.8.8.3.m1.1.1.1.cmml" xref="S3.T2.8.8.8.8.3.m1.1.1.1">⋅</ci><ci id="S3.T2.8.8.8.8.3.m1.1.1.2.cmml" xref="S3.T2.8.8.8.8.3.m1.1.1.2">𝜉</ci><ci id="S3.T2.8.8.8.8.3.m1.1.1.3.cmml" xref="S3.T2.8.8.8.8.3.m1.1.1.3">𝑁</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T2.8.8.8.8.3.m1.1c">\xi\cdot N</annotation><annotation encoding="application/x-llamapun" id="S3.T2.8.8.8.8.3.m1.1d">italic_ξ ⋅ italic_N</annotation></semantics></math>(0,1e-4) m/s</td> </tr> <tr class="ltx_tr" 
id="S3.T2.10.10.10.10"> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T2.9.9.9.9.1">LIDAR Scan (<math alttext="w_{m}^{k}" class="ltx_Math" display="inline" id="S3.T2.9.9.9.9.1.m1.1"><semantics id="S3.T2.9.9.9.9.1.m1.1a"><msubsup id="S3.T2.9.9.9.9.1.m1.1.1" xref="S3.T2.9.9.9.9.1.m1.1.1.cmml"><mi id="S3.T2.9.9.9.9.1.m1.1.1.2.2" xref="S3.T2.9.9.9.9.1.m1.1.1.2.2.cmml">w</mi><mi id="S3.T2.9.9.9.9.1.m1.1.1.2.3" xref="S3.T2.9.9.9.9.1.m1.1.1.2.3.cmml">m</mi><mi id="S3.T2.9.9.9.9.1.m1.1.1.3" xref="S3.T2.9.9.9.9.1.m1.1.1.3.cmml">k</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.T2.9.9.9.9.1.m1.1b"><apply id="S3.T2.9.9.9.9.1.m1.1.1.cmml" xref="S3.T2.9.9.9.9.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T2.9.9.9.9.1.m1.1.1.1.cmml" xref="S3.T2.9.9.9.9.1.m1.1.1">superscript</csymbol><apply id="S3.T2.9.9.9.9.1.m1.1.1.2.cmml" xref="S3.T2.9.9.9.9.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T2.9.9.9.9.1.m1.1.1.2.1.cmml" xref="S3.T2.9.9.9.9.1.m1.1.1">subscript</csymbol><ci id="S3.T2.9.9.9.9.1.m1.1.1.2.2.cmml" xref="S3.T2.9.9.9.9.1.m1.1.1.2.2">𝑤</ci><ci id="S3.T2.9.9.9.9.1.m1.1.1.2.3.cmml" xref="S3.T2.9.9.9.9.1.m1.1.1.2.3">𝑚</ci></apply><ci id="S3.T2.9.9.9.9.1.m1.1.1.3.cmml" xref="S3.T2.9.9.9.9.1.m1.1.1.3">𝑘</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T2.9.9.9.9.1.m1.1c">w_{m}^{k}</annotation><annotation encoding="application/x-llamapun" id="S3.T2.9.9.9.9.1.m1.1d">italic_w start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT</annotation></semantics></math>)</td> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T2.10.10.10.10.3">–</td> <td class="ltx_td ltx_align_left" id="S3.T2.10.10.10.10.2"> <math alttext="\xi\cdot N" class="ltx_Math" display="inline" id="S3.T2.10.10.10.10.2.m1.1"><semantics id="S3.T2.10.10.10.10.2.m1.1a"><mrow id="S3.T2.10.10.10.10.2.m1.1.1" xref="S3.T2.10.10.10.10.2.m1.1.1.cmml"><mi id="S3.T2.10.10.10.10.2.m1.1.1.2" xref="S3.T2.10.10.10.10.2.m1.1.1.2.cmml">ξ</mi><mo 
id="S3.T2.10.10.10.10.2.m1.1.1.1" lspace="0.222em" rspace="0.222em" xref="S3.T2.10.10.10.10.2.m1.1.1.1.cmml">⋅</mo><mi id="S3.T2.10.10.10.10.2.m1.1.1.3" xref="S3.T2.10.10.10.10.2.m1.1.1.3.cmml">N</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.T2.10.10.10.10.2.m1.1b"><apply id="S3.T2.10.10.10.10.2.m1.1.1.cmml" xref="S3.T2.10.10.10.10.2.m1.1.1"><ci id="S3.T2.10.10.10.10.2.m1.1.1.1.cmml" xref="S3.T2.10.10.10.10.2.m1.1.1.1">⋅</ci><ci id="S3.T2.10.10.10.10.2.m1.1.1.2.cmml" xref="S3.T2.10.10.10.10.2.m1.1.1.2">𝜉</ci><ci id="S3.T2.10.10.10.10.2.m1.1.1.3.cmml" xref="S3.T2.10.10.10.10.2.m1.1.1.3">𝑁</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T2.10.10.10.10.2.m1.1c">\xi\cdot N</annotation><annotation encoding="application/x-llamapun" id="S3.T2.10.10.10.10.2.m1.1d">italic_ξ ⋅ italic_N</annotation></semantics></math>(0,1e-6) m</td> </tr> <tr class="ltx_tr" id="S3.T2.28.28.28.31.3"> <td class="ltx_td ltx_align_left ltx_border_t" colspan="2" id="S3.T2.28.28.28.31.3.1"><span class="ltx_text ltx_font_bold" id="S3.T2.28.28.28.31.3.1.1">Action Noise</span></td> <td class="ltx_td ltx_border_t" id="S3.T2.28.28.28.31.3.2"></td> </tr> <tr class="ltx_tr" id="S3.T2.13.13.13.13"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T2.11.11.11.11.1">Throttle (<math alttext="w_{\tau}^{k}" class="ltx_Math" display="inline" id="S3.T2.11.11.11.11.1.m1.1"><semantics id="S3.T2.11.11.11.11.1.m1.1a"><msubsup id="S3.T2.11.11.11.11.1.m1.1.1" xref="S3.T2.11.11.11.11.1.m1.1.1.cmml"><mi id="S3.T2.11.11.11.11.1.m1.1.1.2.2" xref="S3.T2.11.11.11.11.1.m1.1.1.2.2.cmml">w</mi><mi id="S3.T2.11.11.11.11.1.m1.1.1.2.3" xref="S3.T2.11.11.11.11.1.m1.1.1.2.3.cmml">τ</mi><mi id="S3.T2.11.11.11.11.1.m1.1.1.3" xref="S3.T2.11.11.11.11.1.m1.1.1.3.cmml">k</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.T2.11.11.11.11.1.m1.1b"><apply id="S3.T2.11.11.11.11.1.m1.1.1.cmml" xref="S3.T2.11.11.11.11.1.m1.1.1"><csymbol cd="ambiguous" 
id="S3.T2.11.11.11.11.1.m1.1.1.1.cmml" xref="S3.T2.11.11.11.11.1.m1.1.1">superscript</csymbol><apply id="S3.T2.11.11.11.11.1.m1.1.1.2.cmml" xref="S3.T2.11.11.11.11.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T2.11.11.11.11.1.m1.1.1.2.1.cmml" xref="S3.T2.11.11.11.11.1.m1.1.1">subscript</csymbol><ci id="S3.T2.11.11.11.11.1.m1.1.1.2.2.cmml" xref="S3.T2.11.11.11.11.1.m1.1.1.2.2">𝑤</ci><ci id="S3.T2.11.11.11.11.1.m1.1.1.2.3.cmml" xref="S3.T2.11.11.11.11.1.m1.1.1.2.3">𝜏</ci></apply><ci id="S3.T2.11.11.11.11.1.m1.1.1.3.cmml" xref="S3.T2.11.11.11.11.1.m1.1.1.3">𝑘</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T2.11.11.11.11.1.m1.1c">w_{\tau}^{k}</annotation><annotation encoding="application/x-llamapun" id="S3.T2.11.11.11.11.1.m1.1d">italic_w start_POSTSUBSCRIPT italic_τ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT</annotation></semantics></math>)</td> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T2.12.12.12.12.2"> <math alttext="\xi\cdot N" class="ltx_Math" display="inline" id="S3.T2.12.12.12.12.2.m1.1"><semantics id="S3.T2.12.12.12.12.2.m1.1a"><mrow id="S3.T2.12.12.12.12.2.m1.1.1" xref="S3.T2.12.12.12.12.2.m1.1.1.cmml"><mi id="S3.T2.12.12.12.12.2.m1.1.1.2" xref="S3.T2.12.12.12.12.2.m1.1.1.2.cmml">ξ</mi><mo id="S3.T2.12.12.12.12.2.m1.1.1.1" lspace="0.222em" rspace="0.222em" xref="S3.T2.12.12.12.12.2.m1.1.1.1.cmml">⋅</mo><mi id="S3.T2.12.12.12.12.2.m1.1.1.3" xref="S3.T2.12.12.12.12.2.m1.1.1.3.cmml">N</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.T2.12.12.12.12.2.m1.1b"><apply id="S3.T2.12.12.12.12.2.m1.1.1.cmml" xref="S3.T2.12.12.12.12.2.m1.1.1"><ci id="S3.T2.12.12.12.12.2.m1.1.1.1.cmml" xref="S3.T2.12.12.12.12.2.m1.1.1.1">⋅</ci><ci id="S3.T2.12.12.12.12.2.m1.1.1.2.cmml" xref="S3.T2.12.12.12.12.2.m1.1.1.2">𝜉</ci><ci id="S3.T2.12.12.12.12.2.m1.1.1.3.cmml" xref="S3.T2.12.12.12.12.2.m1.1.1.3">𝑁</ci></apply></annotation-xml><annotation encoding="application/x-tex" 
id="S3.T2.12.12.12.12.2.m1.1c">\xi\cdot N</annotation><annotation encoding="application/x-llamapun" id="S3.T2.12.12.12.12.2.m1.1d">italic_ξ ⋅ italic_N</annotation></semantics></math>(0,2.5e-3) norm%</td> <td class="ltx_td ltx_align_left ltx_border_t" id="S3.T2.13.13.13.13.3"> <math alttext="\xi\cdot N" class="ltx_Math" display="inline" id="S3.T2.13.13.13.13.3.m1.1"><semantics id="S3.T2.13.13.13.13.3.m1.1a"><mrow id="S3.T2.13.13.13.13.3.m1.1.1" xref="S3.T2.13.13.13.13.3.m1.1.1.cmml"><mi id="S3.T2.13.13.13.13.3.m1.1.1.2" xref="S3.T2.13.13.13.13.3.m1.1.1.2.cmml">ξ</mi><mo id="S3.T2.13.13.13.13.3.m1.1.1.1" lspace="0.222em" rspace="0.222em" xref="S3.T2.13.13.13.13.3.m1.1.1.1.cmml">⋅</mo><mi id="S3.T2.13.13.13.13.3.m1.1.1.3" xref="S3.T2.13.13.13.13.3.m1.1.1.3.cmml">N</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.T2.13.13.13.13.3.m1.1b"><apply id="S3.T2.13.13.13.13.3.m1.1.1.cmml" xref="S3.T2.13.13.13.13.3.m1.1.1"><ci id="S3.T2.13.13.13.13.3.m1.1.1.1.cmml" xref="S3.T2.13.13.13.13.3.m1.1.1.1">⋅</ci><ci id="S3.T2.13.13.13.13.3.m1.1.1.2.cmml" xref="S3.T2.13.13.13.13.3.m1.1.1.2">𝜉</ci><ci id="S3.T2.13.13.13.13.3.m1.1.1.3.cmml" xref="S3.T2.13.13.13.13.3.m1.1.1.3">𝑁</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T2.13.13.13.13.3.m1.1c">\xi\cdot N</annotation><annotation encoding="application/x-llamapun" id="S3.T2.13.13.13.13.3.m1.1d">italic_ξ ⋅ italic_N</annotation></semantics></math>(0,2.5e-3) norm%</td> </tr> <tr class="ltx_tr" id="S3.T2.16.16.16.16"> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T2.14.14.14.14.1">Steering (<math alttext="w_{\delta}^{k}" class="ltx_Math" display="inline" id="S3.T2.14.14.14.14.1.m1.1"><semantics id="S3.T2.14.14.14.14.1.m1.1a"><msubsup id="S3.T2.14.14.14.14.1.m1.1.1" xref="S3.T2.14.14.14.14.1.m1.1.1.cmml"><mi id="S3.T2.14.14.14.14.1.m1.1.1.2.2" xref="S3.T2.14.14.14.14.1.m1.1.1.2.2.cmml">w</mi><mi id="S3.T2.14.14.14.14.1.m1.1.1.2.3" xref="S3.T2.14.14.14.14.1.m1.1.1.2.3.cmml">δ</mi><mi 
id="S3.T2.14.14.14.14.1.m1.1.1.3" xref="S3.T2.14.14.14.14.1.m1.1.1.3.cmml">k</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.T2.14.14.14.14.1.m1.1b"><apply id="S3.T2.14.14.14.14.1.m1.1.1.cmml" xref="S3.T2.14.14.14.14.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T2.14.14.14.14.1.m1.1.1.1.cmml" xref="S3.T2.14.14.14.14.1.m1.1.1">superscript</csymbol><apply id="S3.T2.14.14.14.14.1.m1.1.1.2.cmml" xref="S3.T2.14.14.14.14.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T2.14.14.14.14.1.m1.1.1.2.1.cmml" xref="S3.T2.14.14.14.14.1.m1.1.1">subscript</csymbol><ci id="S3.T2.14.14.14.14.1.m1.1.1.2.2.cmml" xref="S3.T2.14.14.14.14.1.m1.1.1.2.2">𝑤</ci><ci id="S3.T2.14.14.14.14.1.m1.1.1.2.3.cmml" xref="S3.T2.14.14.14.14.1.m1.1.1.2.3">𝛿</ci></apply><ci id="S3.T2.14.14.14.14.1.m1.1.1.3.cmml" xref="S3.T2.14.14.14.14.1.m1.1.1.3">𝑘</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T2.14.14.14.14.1.m1.1c">w_{\delta}^{k}</annotation><annotation encoding="application/x-llamapun" id="S3.T2.14.14.14.14.1.m1.1d">italic_w start_POSTSUBSCRIPT italic_δ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT</annotation></semantics></math>)</td> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T2.15.15.15.15.2"> <math alttext="\xi\cdot N" class="ltx_Math" display="inline" id="S3.T2.15.15.15.15.2.m1.1"><semantics id="S3.T2.15.15.15.15.2.m1.1a"><mrow id="S3.T2.15.15.15.15.2.m1.1.1" xref="S3.T2.15.15.15.15.2.m1.1.1.cmml"><mi id="S3.T2.15.15.15.15.2.m1.1.1.2" xref="S3.T2.15.15.15.15.2.m1.1.1.2.cmml">ξ</mi><mo id="S3.T2.15.15.15.15.2.m1.1.1.1" lspace="0.222em" rspace="0.222em" xref="S3.T2.15.15.15.15.2.m1.1.1.1.cmml">⋅</mo><mi id="S3.T2.15.15.15.15.2.m1.1.1.3" xref="S3.T2.15.15.15.15.2.m1.1.1.3.cmml">N</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.T2.15.15.15.15.2.m1.1b"><apply id="S3.T2.15.15.15.15.2.m1.1.1.cmml" xref="S3.T2.15.15.15.15.2.m1.1.1"><ci id="S3.T2.15.15.15.15.2.m1.1.1.1.cmml" xref="S3.T2.15.15.15.15.2.m1.1.1.1">⋅</ci><ci 
id="S3.T2.15.15.15.15.2.m1.1.1.2.cmml" xref="S3.T2.15.15.15.15.2.m1.1.1.2">𝜉</ci><ci id="S3.T2.15.15.15.15.2.m1.1.1.3.cmml" xref="S3.T2.15.15.15.15.2.m1.1.1.3">𝑁</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T2.15.15.15.15.2.m1.1c">\xi\cdot N</annotation><annotation encoding="application/x-llamapun" id="S3.T2.15.15.15.15.2.m1.1d">italic_ξ ⋅ italic_N</annotation></semantics></math>(0,2.5e-3) norm%</td> <td class="ltx_td ltx_align_left" id="S3.T2.16.16.16.16.3"> <math alttext="\xi\cdot N" class="ltx_Math" display="inline" id="S3.T2.16.16.16.16.3.m1.1"><semantics id="S3.T2.16.16.16.16.3.m1.1a"><mrow id="S3.T2.16.16.16.16.3.m1.1.1" xref="S3.T2.16.16.16.16.3.m1.1.1.cmml"><mi id="S3.T2.16.16.16.16.3.m1.1.1.2" xref="S3.T2.16.16.16.16.3.m1.1.1.2.cmml">ξ</mi><mo id="S3.T2.16.16.16.16.3.m1.1.1.1" lspace="0.222em" rspace="0.222em" xref="S3.T2.16.16.16.16.3.m1.1.1.1.cmml">⋅</mo><mi id="S3.T2.16.16.16.16.3.m1.1.1.3" xref="S3.T2.16.16.16.16.3.m1.1.1.3.cmml">N</mi></mrow><annotation-xml encoding="MathML-Content" id="S3.T2.16.16.16.16.3.m1.1b"><apply id="S3.T2.16.16.16.16.3.m1.1.1.cmml" xref="S3.T2.16.16.16.16.3.m1.1.1"><ci id="S3.T2.16.16.16.16.3.m1.1.1.1.cmml" xref="S3.T2.16.16.16.16.3.m1.1.1.1">⋅</ci><ci id="S3.T2.16.16.16.16.3.m1.1.1.2.cmml" xref="S3.T2.16.16.16.16.3.m1.1.1.2">𝜉</ci><ci id="S3.T2.16.16.16.16.3.m1.1.1.3.cmml" xref="S3.T2.16.16.16.16.3.m1.1.1.3">𝑁</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T2.16.16.16.16.3.m1.1c">\xi\cdot N</annotation><annotation encoding="application/x-llamapun" id="S3.T2.16.16.16.16.3.m1.1d">italic_ξ ⋅ italic_N</annotation></semantics></math>(0,2.5e-3) norm%</td> </tr> <tr class="ltx_tr" id="S3.T2.28.28.28.32.4"> <td class="ltx_td ltx_align_left ltx_border_t" colspan="2" id="S3.T2.28.28.28.32.4.1"><span class="ltx_text ltx_font_bold" id="S3.T2.28.28.28.32.4.1.1">Agent Dynamics</span></td> <td class="ltx_td ltx_border_t" id="S3.T2.28.28.28.32.4.2"></td> </tr> <tr class="ltx_tr" 
id="S3.T2.20.20.20.20"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T2.19.19.19.19.3">Center of Mass (<math alttext="w_{x_{cg}}^{k}" class="ltx_Math" display="inline" id="S3.T2.17.17.17.17.1.m1.1"><semantics id="S3.T2.17.17.17.17.1.m1.1a"><msubsup id="S3.T2.17.17.17.17.1.m1.1.1" xref="S3.T2.17.17.17.17.1.m1.1.1.cmml"><mi id="S3.T2.17.17.17.17.1.m1.1.1.2.2" xref="S3.T2.17.17.17.17.1.m1.1.1.2.2.cmml">w</mi><msub id="S3.T2.17.17.17.17.1.m1.1.1.2.3" xref="S3.T2.17.17.17.17.1.m1.1.1.2.3.cmml"><mi id="S3.T2.17.17.17.17.1.m1.1.1.2.3.2" xref="S3.T2.17.17.17.17.1.m1.1.1.2.3.2.cmml">x</mi><mrow id="S3.T2.17.17.17.17.1.m1.1.1.2.3.3" xref="S3.T2.17.17.17.17.1.m1.1.1.2.3.3.cmml"><mi id="S3.T2.17.17.17.17.1.m1.1.1.2.3.3.2" xref="S3.T2.17.17.17.17.1.m1.1.1.2.3.3.2.cmml">c</mi><mo id="S3.T2.17.17.17.17.1.m1.1.1.2.3.3.1" xref="S3.T2.17.17.17.17.1.m1.1.1.2.3.3.1.cmml">⁢</mo><mi id="S3.T2.17.17.17.17.1.m1.1.1.2.3.3.3" xref="S3.T2.17.17.17.17.1.m1.1.1.2.3.3.3.cmml">g</mi></mrow></msub><mi id="S3.T2.17.17.17.17.1.m1.1.1.3" xref="S3.T2.17.17.17.17.1.m1.1.1.3.cmml">k</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.T2.17.17.17.17.1.m1.1b"><apply id="S3.T2.17.17.17.17.1.m1.1.1.cmml" xref="S3.T2.17.17.17.17.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T2.17.17.17.17.1.m1.1.1.1.cmml" xref="S3.T2.17.17.17.17.1.m1.1.1">superscript</csymbol><apply id="S3.T2.17.17.17.17.1.m1.1.1.2.cmml" xref="S3.T2.17.17.17.17.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T2.17.17.17.17.1.m1.1.1.2.1.cmml" xref="S3.T2.17.17.17.17.1.m1.1.1">subscript</csymbol><ci id="S3.T2.17.17.17.17.1.m1.1.1.2.2.cmml" xref="S3.T2.17.17.17.17.1.m1.1.1.2.2">𝑤</ci><apply id="S3.T2.17.17.17.17.1.m1.1.1.2.3.cmml" xref="S3.T2.17.17.17.17.1.m1.1.1.2.3"><csymbol cd="ambiguous" id="S3.T2.17.17.17.17.1.m1.1.1.2.3.1.cmml" xref="S3.T2.17.17.17.17.1.m1.1.1.2.3">subscript</csymbol><ci id="S3.T2.17.17.17.17.1.m1.1.1.2.3.2.cmml" xref="S3.T2.17.17.17.17.1.m1.1.1.2.3.2">𝑥</ci><apply 
id="S3.T2.17.17.17.17.1.m1.1.1.2.3.3.cmml" xref="S3.T2.17.17.17.17.1.m1.1.1.2.3.3"><times id="S3.T2.17.17.17.17.1.m1.1.1.2.3.3.1.cmml" xref="S3.T2.17.17.17.17.1.m1.1.1.2.3.3.1"></times><ci id="S3.T2.17.17.17.17.1.m1.1.1.2.3.3.2.cmml" xref="S3.T2.17.17.17.17.1.m1.1.1.2.3.3.2">𝑐</ci><ci id="S3.T2.17.17.17.17.1.m1.1.1.2.3.3.3.cmml" xref="S3.T2.17.17.17.17.1.m1.1.1.2.3.3.3">𝑔</ci></apply></apply></apply><ci id="S3.T2.17.17.17.17.1.m1.1.1.3.cmml" xref="S3.T2.17.17.17.17.1.m1.1.1.3">𝑘</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T2.17.17.17.17.1.m1.1c">w_{x_{cg}}^{k}</annotation><annotation encoding="application/x-llamapun" id="S3.T2.17.17.17.17.1.m1.1d">italic_w start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_c italic_g end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT</annotation></semantics></math>, <math alttext="w_{y_{cg}}^{k}" class="ltx_Math" display="inline" id="S3.T2.18.18.18.18.2.m2.1"><semantics id="S3.T2.18.18.18.18.2.m2.1a"><msubsup id="S3.T2.18.18.18.18.2.m2.1.1" xref="S3.T2.18.18.18.18.2.m2.1.1.cmml"><mi id="S3.T2.18.18.18.18.2.m2.1.1.2.2" xref="S3.T2.18.18.18.18.2.m2.1.1.2.2.cmml">w</mi><msub id="S3.T2.18.18.18.18.2.m2.1.1.2.3" xref="S3.T2.18.18.18.18.2.m2.1.1.2.3.cmml"><mi id="S3.T2.18.18.18.18.2.m2.1.1.2.3.2" xref="S3.T2.18.18.18.18.2.m2.1.1.2.3.2.cmml">y</mi><mrow id="S3.T2.18.18.18.18.2.m2.1.1.2.3.3" xref="S3.T2.18.18.18.18.2.m2.1.1.2.3.3.cmml"><mi id="S3.T2.18.18.18.18.2.m2.1.1.2.3.3.2" xref="S3.T2.18.18.18.18.2.m2.1.1.2.3.3.2.cmml">c</mi><mo id="S3.T2.18.18.18.18.2.m2.1.1.2.3.3.1" xref="S3.T2.18.18.18.18.2.m2.1.1.2.3.3.1.cmml">⁢</mo><mi id="S3.T2.18.18.18.18.2.m2.1.1.2.3.3.3" xref="S3.T2.18.18.18.18.2.m2.1.1.2.3.3.3.cmml">g</mi></mrow></msub><mi id="S3.T2.18.18.18.18.2.m2.1.1.3" xref="S3.T2.18.18.18.18.2.m2.1.1.3.cmml">k</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.T2.18.18.18.18.2.m2.1b"><apply id="S3.T2.18.18.18.18.2.m2.1.1.cmml" 
xref="S3.T2.18.18.18.18.2.m2.1.1"><csymbol cd="ambiguous" id="S3.T2.18.18.18.18.2.m2.1.1.1.cmml" xref="S3.T2.18.18.18.18.2.m2.1.1">superscript</csymbol><apply id="S3.T2.18.18.18.18.2.m2.1.1.2.cmml" xref="S3.T2.18.18.18.18.2.m2.1.1"><csymbol cd="ambiguous" id="S3.T2.18.18.18.18.2.m2.1.1.2.1.cmml" xref="S3.T2.18.18.18.18.2.m2.1.1">subscript</csymbol><ci id="S3.T2.18.18.18.18.2.m2.1.1.2.2.cmml" xref="S3.T2.18.18.18.18.2.m2.1.1.2.2">𝑤</ci><apply id="S3.T2.18.18.18.18.2.m2.1.1.2.3.cmml" xref="S3.T2.18.18.18.18.2.m2.1.1.2.3"><csymbol cd="ambiguous" id="S3.T2.18.18.18.18.2.m2.1.1.2.3.1.cmml" xref="S3.T2.18.18.18.18.2.m2.1.1.2.3">subscript</csymbol><ci id="S3.T2.18.18.18.18.2.m2.1.1.2.3.2.cmml" xref="S3.T2.18.18.18.18.2.m2.1.1.2.3.2">𝑦</ci><apply id="S3.T2.18.18.18.18.2.m2.1.1.2.3.3.cmml" xref="S3.T2.18.18.18.18.2.m2.1.1.2.3.3"><times id="S3.T2.18.18.18.18.2.m2.1.1.2.3.3.1.cmml" xref="S3.T2.18.18.18.18.2.m2.1.1.2.3.3.1"></times><ci id="S3.T2.18.18.18.18.2.m2.1.1.2.3.3.2.cmml" xref="S3.T2.18.18.18.18.2.m2.1.1.2.3.3.2">𝑐</ci><ci id="S3.T2.18.18.18.18.2.m2.1.1.2.3.3.3.cmml" xref="S3.T2.18.18.18.18.2.m2.1.1.2.3.3.3">𝑔</ci></apply></apply></apply><ci id="S3.T2.18.18.18.18.2.m2.1.1.3.cmml" xref="S3.T2.18.18.18.18.2.m2.1.1.3">𝑘</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T2.18.18.18.18.2.m2.1c">w_{y_{cg}}^{k}</annotation><annotation encoding="application/x-llamapun" id="S3.T2.18.18.18.18.2.m2.1d">italic_w start_POSTSUBSCRIPT italic_y start_POSTSUBSCRIPT italic_c italic_g end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT</annotation></semantics></math>, <math alttext="w_{z_{cg}}^{k}" class="ltx_Math" display="inline" id="S3.T2.19.19.19.19.3.m3.1"><semantics id="S3.T2.19.19.19.19.3.m3.1a"><msubsup id="S3.T2.19.19.19.19.3.m3.1.1" xref="S3.T2.19.19.19.19.3.m3.1.1.cmml"><mi id="S3.T2.19.19.19.19.3.m3.1.1.2.2" xref="S3.T2.19.19.19.19.3.m3.1.1.2.2.cmml">w</mi><msub id="S3.T2.19.19.19.19.3.m3.1.1.2.3" 
xref="S3.T2.19.19.19.19.3.m3.1.1.2.3.cmml"><mi id="S3.T2.19.19.19.19.3.m3.1.1.2.3.2" xref="S3.T2.19.19.19.19.3.m3.1.1.2.3.2.cmml">z</mi><mrow id="S3.T2.19.19.19.19.3.m3.1.1.2.3.3" xref="S3.T2.19.19.19.19.3.m3.1.1.2.3.3.cmml"><mi id="S3.T2.19.19.19.19.3.m3.1.1.2.3.3.2" xref="S3.T2.19.19.19.19.3.m3.1.1.2.3.3.2.cmml">c</mi><mo id="S3.T2.19.19.19.19.3.m3.1.1.2.3.3.1" xref="S3.T2.19.19.19.19.3.m3.1.1.2.3.3.1.cmml">⁢</mo><mi id="S3.T2.19.19.19.19.3.m3.1.1.2.3.3.3" xref="S3.T2.19.19.19.19.3.m3.1.1.2.3.3.3.cmml">g</mi></mrow></msub><mi id="S3.T2.19.19.19.19.3.m3.1.1.3" xref="S3.T2.19.19.19.19.3.m3.1.1.3.cmml">k</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.T2.19.19.19.19.3.m3.1b"><apply id="S3.T2.19.19.19.19.3.m3.1.1.cmml" xref="S3.T2.19.19.19.19.3.m3.1.1"><csymbol cd="ambiguous" id="S3.T2.19.19.19.19.3.m3.1.1.1.cmml" xref="S3.T2.19.19.19.19.3.m3.1.1">superscript</csymbol><apply id="S3.T2.19.19.19.19.3.m3.1.1.2.cmml" xref="S3.T2.19.19.19.19.3.m3.1.1"><csymbol cd="ambiguous" id="S3.T2.19.19.19.19.3.m3.1.1.2.1.cmml" xref="S3.T2.19.19.19.19.3.m3.1.1">subscript</csymbol><ci id="S3.T2.19.19.19.19.3.m3.1.1.2.2.cmml" xref="S3.T2.19.19.19.19.3.m3.1.1.2.2">𝑤</ci><apply id="S3.T2.19.19.19.19.3.m3.1.1.2.3.cmml" xref="S3.T2.19.19.19.19.3.m3.1.1.2.3"><csymbol cd="ambiguous" id="S3.T2.19.19.19.19.3.m3.1.1.2.3.1.cmml" xref="S3.T2.19.19.19.19.3.m3.1.1.2.3">subscript</csymbol><ci id="S3.T2.19.19.19.19.3.m3.1.1.2.3.2.cmml" xref="S3.T2.19.19.19.19.3.m3.1.1.2.3.2">𝑧</ci><apply id="S3.T2.19.19.19.19.3.m3.1.1.2.3.3.cmml" xref="S3.T2.19.19.19.19.3.m3.1.1.2.3.3"><times id="S3.T2.19.19.19.19.3.m3.1.1.2.3.3.1.cmml" xref="S3.T2.19.19.19.19.3.m3.1.1.2.3.3.1"></times><ci id="S3.T2.19.19.19.19.3.m3.1.1.2.3.3.2.cmml" xref="S3.T2.19.19.19.19.3.m3.1.1.2.3.3.2">𝑐</ci><ci id="S3.T2.19.19.19.19.3.m3.1.1.2.3.3.3.cmml" xref="S3.T2.19.19.19.19.3.m3.1.1.2.3.3.3">𝑔</ci></apply></apply></apply><ci id="S3.T2.19.19.19.19.3.m3.1.1.3.cmml" 
xref="S3.T2.19.19.19.19.3.m3.1.1.3">𝑘</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T2.19.19.19.19.3.m3.1c">w_{z_{cg}}^{k}</annotation><annotation encoding="application/x-llamapun" id="S3.T2.19.19.19.19.3.m3.1d">italic_w start_POSTSUBSCRIPT italic_z start_POSTSUBSCRIPT italic_c italic_g end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT</annotation></semantics></math>)</td> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T2.20.20.20.20.5">–</td> <td class="ltx_td ltx_align_left ltx_border_t" id="S3.T2.20.20.20.20.4"> <math alttext="\xi\cdot" class="ltx_math_unparsed" display="inline" id="S3.T2.20.20.20.20.4.m1.1"><semantics id="S3.T2.20.20.20.20.4.m1.1a"><mrow id="S3.T2.20.20.20.20.4.m1.1b"><mi id="S3.T2.20.20.20.20.4.m1.1.1">ξ</mi><mo id="S3.T2.20.20.20.20.4.m1.1.2" lspace="0.222em">⋅</mo></mrow><annotation encoding="application/x-tex" id="S3.T2.20.20.20.20.4.m1.1c">\xi\cdot</annotation><annotation encoding="application/x-llamapun" id="S3.T2.20.20.20.20.4.m1.1d">italic_ξ ⋅</annotation></semantics></math>[-5e-2:1.11e-2:5e-2] m</td> </tr> <tr class="ltx_tr" id="S3.T2.22.22.22.22"> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T2.21.21.21.21.1">Suspension Stiffness (<math alttext="w_{K}^{k}" class="ltx_Math" display="inline" id="S3.T2.21.21.21.21.1.m1.1"><semantics id="S3.T2.21.21.21.21.1.m1.1a"><msubsup id="S3.T2.21.21.21.21.1.m1.1.1" xref="S3.T2.21.21.21.21.1.m1.1.1.cmml"><mi id="S3.T2.21.21.21.21.1.m1.1.1.2.2" xref="S3.T2.21.21.21.21.1.m1.1.1.2.2.cmml">w</mi><mi id="S3.T2.21.21.21.21.1.m1.1.1.2.3" xref="S3.T2.21.21.21.21.1.m1.1.1.2.3.cmml">K</mi><mi id="S3.T2.21.21.21.21.1.m1.1.1.3" xref="S3.T2.21.21.21.21.1.m1.1.1.3.cmml">k</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.T2.21.21.21.21.1.m1.1b"><apply id="S3.T2.21.21.21.21.1.m1.1.1.cmml" xref="S3.T2.21.21.21.21.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T2.21.21.21.21.1.m1.1.1.1.cmml" 
xref="S3.T2.21.21.21.21.1.m1.1.1">superscript</csymbol><apply id="S3.T2.21.21.21.21.1.m1.1.1.2.cmml" xref="S3.T2.21.21.21.21.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T2.21.21.21.21.1.m1.1.1.2.1.cmml" xref="S3.T2.21.21.21.21.1.m1.1.1">subscript</csymbol><ci id="S3.T2.21.21.21.21.1.m1.1.1.2.2.cmml" xref="S3.T2.21.21.21.21.1.m1.1.1.2.2">𝑤</ci><ci id="S3.T2.21.21.21.21.1.m1.1.1.2.3.cmml" xref="S3.T2.21.21.21.21.1.m1.1.1.2.3">𝐾</ci></apply><ci id="S3.T2.21.21.21.21.1.m1.1.1.3.cmml" xref="S3.T2.21.21.21.21.1.m1.1.1.3">𝑘</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T2.21.21.21.21.1.m1.1c">w_{K}^{k}</annotation><annotation encoding="application/x-llamapun" id="S3.T2.21.21.21.21.1.m1.1d">italic_w start_POSTSUBSCRIPT italic_K end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT</annotation></semantics></math>)</td> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T2.22.22.22.22.3">–</td> <td class="ltx_td ltx_align_left" id="S3.T2.22.22.22.22.2"> <math alttext="\xi\cdot" class="ltx_math_unparsed" display="inline" id="S3.T2.22.22.22.22.2.m1.1"><semantics id="S3.T2.22.22.22.22.2.m1.1a"><mrow id="S3.T2.22.22.22.22.2.m1.1b"><mi id="S3.T2.22.22.22.22.2.m1.1.1">ξ</mi><mo id="S3.T2.22.22.22.22.2.m1.1.2" lspace="0.222em">⋅</mo></mrow><annotation encoding="application/x-tex" id="S3.T2.22.22.22.22.2.m1.1c">\xi\cdot</annotation><annotation encoding="application/x-llamapun" id="S3.T2.22.22.22.22.2.m1.1d">italic_ξ ⋅</annotation></semantics></math>[-100:22.22:100] N/m</td> </tr> <tr class="ltx_tr" id="S3.T2.24.24.24.24"> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T2.23.23.23.23.1">Tire Stiffness (<math alttext="w_{c_{\alpha}}^{k}" class="ltx_Math" display="inline" id="S3.T2.23.23.23.23.1.m1.1"><semantics id="S3.T2.23.23.23.23.1.m1.1a"><msubsup id="S3.T2.23.23.23.23.1.m1.1.1" xref="S3.T2.23.23.23.23.1.m1.1.1.cmml"><mi id="S3.T2.23.23.23.23.1.m1.1.1.2.2" xref="S3.T2.23.23.23.23.1.m1.1.1.2.2.cmml">w</mi><msub 
id="S3.T2.23.23.23.23.1.m1.1.1.2.3" xref="S3.T2.23.23.23.23.1.m1.1.1.2.3.cmml"><mi id="S3.T2.23.23.23.23.1.m1.1.1.2.3.2" xref="S3.T2.23.23.23.23.1.m1.1.1.2.3.2.cmml">c</mi><mi id="S3.T2.23.23.23.23.1.m1.1.1.2.3.3" xref="S3.T2.23.23.23.23.1.m1.1.1.2.3.3.cmml">α</mi></msub><mi id="S3.T2.23.23.23.23.1.m1.1.1.3" xref="S3.T2.23.23.23.23.1.m1.1.1.3.cmml">k</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.T2.23.23.23.23.1.m1.1b"><apply id="S3.T2.23.23.23.23.1.m1.1.1.cmml" xref="S3.T2.23.23.23.23.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T2.23.23.23.23.1.m1.1.1.1.cmml" xref="S3.T2.23.23.23.23.1.m1.1.1">superscript</csymbol><apply id="S3.T2.23.23.23.23.1.m1.1.1.2.cmml" xref="S3.T2.23.23.23.23.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T2.23.23.23.23.1.m1.1.1.2.1.cmml" xref="S3.T2.23.23.23.23.1.m1.1.1">subscript</csymbol><ci id="S3.T2.23.23.23.23.1.m1.1.1.2.2.cmml" xref="S3.T2.23.23.23.23.1.m1.1.1.2.2">𝑤</ci><apply id="S3.T2.23.23.23.23.1.m1.1.1.2.3.cmml" xref="S3.T2.23.23.23.23.1.m1.1.1.2.3"><csymbol cd="ambiguous" id="S3.T2.23.23.23.23.1.m1.1.1.2.3.1.cmml" xref="S3.T2.23.23.23.23.1.m1.1.1.2.3">subscript</csymbol><ci id="S3.T2.23.23.23.23.1.m1.1.1.2.3.2.cmml" xref="S3.T2.23.23.23.23.1.m1.1.1.2.3.2">𝑐</ci><ci id="S3.T2.23.23.23.23.1.m1.1.1.2.3.3.cmml" xref="S3.T2.23.23.23.23.1.m1.1.1.2.3.3">𝛼</ci></apply></apply><ci id="S3.T2.23.23.23.23.1.m1.1.1.3.cmml" xref="S3.T2.23.23.23.23.1.m1.1.1.3">𝑘</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T2.23.23.23.23.1.m1.1c">w_{c_{\alpha}}^{k}</annotation><annotation encoding="application/x-llamapun" id="S3.T2.23.23.23.23.1.m1.1d">italic_w start_POSTSUBSCRIPT italic_c start_POSTSUBSCRIPT italic_α end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT</annotation></semantics></math>)</td> <td class="ltx_td ltx_align_left ltx_border_r" id="S3.T2.24.24.24.24.3">–</td> <td class="ltx_td ltx_align_left" id="S3.T2.24.24.24.24.2"> <math alttext="\xi\cdot" 
class="ltx_math_unparsed" display="inline" id="S3.T2.24.24.24.24.2.m1.1"><semantics id="S3.T2.24.24.24.24.2.m1.1a"><mrow id="S3.T2.24.24.24.24.2.m1.1b"><mi id="S3.T2.24.24.24.24.2.m1.1.1">ξ</mi><mo id="S3.T2.24.24.24.24.2.m1.1.2" lspace="0.222em">⋅</mo></mrow><annotation encoding="application/x-tex" id="S3.T2.24.24.24.24.2.m1.1c">\xi\cdot</annotation><annotation encoding="application/x-llamapun" id="S3.T2.24.24.24.24.2.m1.1d">italic_ξ ⋅</annotation></semantics></math>[-2.5:5.6e-1:2.5] N/rad</td> </tr> <tr class="ltx_tr" id="S3.T2.28.28.28.33.5"> <td class="ltx_td ltx_align_left ltx_border_t" colspan="2" id="S3.T2.28.28.28.33.5.1"><span class="ltx_text ltx_font_bold" id="S3.T2.28.28.28.33.5.1.1">Environment Dynamics</span></td> <td class="ltx_td ltx_border_t" id="S3.T2.28.28.28.33.5.2"></td> </tr> <tr class="ltx_tr" id="S3.T2.26.26.26.26"> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T2.25.25.25.25.1">Surface Friction (<math alttext="w_{\mu}^{k}" class="ltx_Math" display="inline" id="S3.T2.25.25.25.25.1.m1.1"><semantics id="S3.T2.25.25.25.25.1.m1.1a"><msubsup id="S3.T2.25.25.25.25.1.m1.1.1" xref="S3.T2.25.25.25.25.1.m1.1.1.cmml"><mi id="S3.T2.25.25.25.25.1.m1.1.1.2.2" xref="S3.T2.25.25.25.25.1.m1.1.1.2.2.cmml">w</mi><mi id="S3.T2.25.25.25.25.1.m1.1.1.2.3" xref="S3.T2.25.25.25.25.1.m1.1.1.2.3.cmml">μ</mi><mi id="S3.T2.25.25.25.25.1.m1.1.1.3" xref="S3.T2.25.25.25.25.1.m1.1.1.3.cmml">k</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.T2.25.25.25.25.1.m1.1b"><apply id="S3.T2.25.25.25.25.1.m1.1.1.cmml" xref="S3.T2.25.25.25.25.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T2.25.25.25.25.1.m1.1.1.1.cmml" xref="S3.T2.25.25.25.25.1.m1.1.1">superscript</csymbol><apply id="S3.T2.25.25.25.25.1.m1.1.1.2.cmml" xref="S3.T2.25.25.25.25.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T2.25.25.25.25.1.m1.1.1.2.1.cmml" xref="S3.T2.25.25.25.25.1.m1.1.1">subscript</csymbol><ci id="S3.T2.25.25.25.25.1.m1.1.1.2.2.cmml" 
xref="S3.T2.25.25.25.25.1.m1.1.1.2.2">𝑤</ci><ci id="S3.T2.25.25.25.25.1.m1.1.1.2.3.cmml" xref="S3.T2.25.25.25.25.1.m1.1.1.2.3">𝜇</ci></apply><ci id="S3.T2.25.25.25.25.1.m1.1.1.3.cmml" xref="S3.T2.25.25.25.25.1.m1.1.1.3">𝑘</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T2.25.25.25.25.1.m1.1c">w_{\mu}^{k}</annotation><annotation encoding="application/x-llamapun" id="S3.T2.25.25.25.25.1.m1.1d">italic_w start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT</annotation></semantics></math>)</td> <td class="ltx_td ltx_align_left ltx_border_r ltx_border_t" id="S3.T2.26.26.26.26.2"> <math alttext="\xi\cdot" class="ltx_math_unparsed" display="inline" id="S3.T2.26.26.26.26.2.m1.1"><semantics id="S3.T2.26.26.26.26.2.m1.1a"><mrow id="S3.T2.26.26.26.26.2.m1.1b"><mi id="S3.T2.26.26.26.26.2.m1.1.1">ξ</mi><mo id="S3.T2.26.26.26.26.2.m1.1.2" lspace="0.222em">⋅</mo></mrow><annotation encoding="application/x-tex" id="S3.T2.26.26.26.26.2.m1.1c">\xi\cdot</annotation><annotation encoding="application/x-llamapun" id="S3.T2.26.26.26.26.2.m1.1d">italic_ξ ⋅</annotation></semantics></math>[-1e-1:8.33e-3:1e-1]</td> <td class="ltx_td ltx_align_left ltx_border_t" id="S3.T2.26.26.26.26.3">–</td> </tr> <tr class="ltx_tr" id="S3.T2.28.28.28.28"> <td class="ltx_td ltx_align_left ltx_border_b ltx_border_r" id="S3.T2.27.27.27.27.1">Communication Delay (<math alttext="w_{d}^{k}" class="ltx_Math" display="inline" id="S3.T2.27.27.27.27.1.m1.1"><semantics id="S3.T2.27.27.27.27.1.m1.1a"><msubsup id="S3.T2.27.27.27.27.1.m1.1.1" xref="S3.T2.27.27.27.27.1.m1.1.1.cmml"><mi id="S3.T2.27.27.27.27.1.m1.1.1.2.2" xref="S3.T2.27.27.27.27.1.m1.1.1.2.2.cmml">w</mi><mi id="S3.T2.27.27.27.27.1.m1.1.1.2.3" xref="S3.T2.27.27.27.27.1.m1.1.1.2.3.cmml">d</mi><mi id="S3.T2.27.27.27.27.1.m1.1.1.3" xref="S3.T2.27.27.27.27.1.m1.1.1.3.cmml">k</mi></msubsup><annotation-xml encoding="MathML-Content" id="S3.T2.27.27.27.27.1.m1.1b"><apply 
id="S3.T2.27.27.27.27.1.m1.1.1.cmml" xref="S3.T2.27.27.27.27.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T2.27.27.27.27.1.m1.1.1.1.cmml" xref="S3.T2.27.27.27.27.1.m1.1.1">superscript</csymbol><apply id="S3.T2.27.27.27.27.1.m1.1.1.2.cmml" xref="S3.T2.27.27.27.27.1.m1.1.1"><csymbol cd="ambiguous" id="S3.T2.27.27.27.27.1.m1.1.1.2.1.cmml" xref="S3.T2.27.27.27.27.1.m1.1.1">subscript</csymbol><ci id="S3.T2.27.27.27.27.1.m1.1.1.2.2.cmml" xref="S3.T2.27.27.27.27.1.m1.1.1.2.2">𝑤</ci><ci id="S3.T2.27.27.27.27.1.m1.1.1.2.3.cmml" xref="S3.T2.27.27.27.27.1.m1.1.1.2.3">𝑑</ci></apply><ci id="S3.T2.27.27.27.27.1.m1.1.1.3.cmml" xref="S3.T2.27.27.27.27.1.m1.1.1.3">𝑘</ci></apply></annotation-xml><annotation encoding="application/x-tex" id="S3.T2.27.27.27.27.1.m1.1c">w_{d}^{k}</annotation><annotation encoding="application/x-llamapun" id="S3.T2.27.27.27.27.1.m1.1d">italic_w start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT</annotation></semantics></math>)</td> <td class="ltx_td ltx_align_left ltx_border_b ltx_border_r" id="S3.T2.28.28.28.28.2"> <math alttext="\xi\cdot" class="ltx_math_unparsed" display="inline" id="S3.T2.28.28.28.28.2.m1.1"><semantics id="S3.T2.28.28.28.28.2.m1.1a"><mrow id="S3.T2.28.28.28.28.2.m1.1b"><mi id="S3.T2.28.28.28.28.2.m1.1.1">ξ</mi><mo id="S3.T2.28.28.28.28.2.m1.1.2" lspace="0.222em">⋅</mo></mrow><annotation encoding="application/x-tex" id="S3.T2.28.28.28.28.2.m1.1c">\xi\cdot</annotation><annotation encoding="application/x-llamapun" id="S3.T2.28.28.28.28.2.m1.1d">italic_ξ ⋅</annotation></semantics></math>[0:4.17e-4:1e-2] s</td> <td class="ltx_td ltx_align_left ltx_border_b" id="S3.T2.28.28.28.28.3">–</td> </tr> </tbody> </table> </span></div> </figure> </section> <section class="ltx_subsection" id="S3.SS4"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S3.SS4.5.1.1">III-D</span> </span><span class="ltx_text ltx_font_italic" id="S3.SS4.6.2">Hybrid 
Sim2Real Transfer</span> </h3> <div class="ltx_para" id="S3.SS4.p1"> <p class="ltx_p" id="S3.SS4.p1.1">We propose a hybrid method for transferring the trained MARL policies from simulation to reality. The term <span class="ltx_text ltx_font_italic" id="S3.SS4.p1.1.1">“hybrid”</span> refers to a mixed-reality digital twin framework, which establishes a real-time bi-directional synchronization between the physical and virtual worlds. The intention is to minimize the number of physical agent(s) and environmental element(s) needed to deploy and validate MARL systems in the real world. Fig. <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S1.F1" title="Figure 1 ‣ I Introduction ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_tag">1</span></a> (captured at 1 Hz) depicts the sim2real transfer of MARL policies discussed in this work using the proposed framework. Here, we deploy a single physical agent in an open space and connect it with its digital twin. The “ego” digital twin operates in a virtual environment with virtual peers, collects observations, and uses the trained policy to plan actions in the digital space. The planned action sequences are relayed back to the physical twin, which executes them in the real world and thereby updates its physical state. Finally, the ego digital twin is updated based on real-time state estimates of its physical twin to close the loop. This process is repeated iteratively until the experiment is completed. This way, we can exploit the real-world characteristics of vehicle dynamics and tire-road interactions while conserving physical resources by augmenting environmental element(s) and peer agent(s) in the digital space. 
This also mitigates the risk of the experimental vehicles colliding with each other or the environmental element(s).</p> </div> </section> </section> <section class="ltx_section" id="S4"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">IV </span><span class="ltx_text ltx_font_smallcaps" id="S4.1.1">Results</span> </h2> <div class="ltx_para" id="S4.p1"> <p class="ltx_p" id="S4.p1.1">We use the follow-the-gap method (FGM) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#bib.bib23" title="">23</a>]</cite> as a benchmark for the MARL policies discussed in this work. We chose FGM because (a) it is capable of negotiating static and dynamic obstacles, which is essential for multi-agent systems, (b) it works with the non-holonomic Ackermann-steered vehicles adopted in this work, and (c) it is a decentralized reactive algorithm, which makes for an <span class="ltx_text ltx_font_italic" id="S4.p1.1.1">“apples-to-apples”</span> comparison with the MARL policies.</p> </div> <section class="ltx_subsection" id="S4.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S4.SS1.5.1.1">IV-A</span> </span><span class="ltx_text ltx_font_italic" id="S4.SS1.6.2">Cooperative Multi-Agent Scenario</span> </h3> <section class="ltx_subsubsection" id="S4.SS1.SSS1"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection"><span class="ltx_text" id="S4.SS1.SSS1.5.1.1">IV-A</span>1 </span>Training and Simulation Parallelization</h4> <div class="ltx_para" id="S4.SS1.SSS1.p1"> <p class="ltx_p" id="S4.SS1.SSS1.p1.9">Fig. 
<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S4.F3" title="Figure 3 ‣ IV-A1 Training and Simulation Parallelization ‣ IV-A Cooperative Multi-Agent Scenario ‣ IV Results ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_tag">3</span></a> depicts the key performance indicators (KPIs) used to analyze the cooperative MARL training without any domain randomization (i.e., NDR). The agents took over 600k steps to learn the collective objective of safe intersection traversal. This is marked by a sustained increase in the cumulative reward (from <math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS1.SSS1.p1.1.m1.1"><semantics id="S4.SS1.SSS1.p1.1.m1.1a"><mo id="S4.SS1.SSS1.p1.1.m1.1.1" xref="S4.SS1.SSS1.p1.1.m1.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS1.SSS1.p1.1.m1.1b"><csymbol cd="latexml" id="S4.SS1.SSS1.p1.1.m1.1.1.cmml" xref="S4.SS1.SSS1.p1.1.m1.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS1.SSS1.p1.1.m1.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS1.SSS1.p1.1.m1.1d">∼</annotation></semantics></math>3 to <math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS1.SSS1.p1.2.m2.1"><semantics id="S4.SS1.SSS1.p1.2.m2.1a"><mo id="S4.SS1.SSS1.p1.2.m2.1.1" xref="S4.SS1.SSS1.p1.2.m2.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS1.SSS1.p1.2.m2.1b"><csymbol cd="latexml" id="S4.SS1.SSS1.p1.2.m2.1.1.cmml" xref="S4.SS1.SSS1.p1.2.m2.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS1.SSS1.p1.2.m2.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS1.SSS1.p1.2.m2.1d">∼</annotation></semantics></math>8) as well as episode length (from <math alttext="\sim" class="ltx_Math" display="inline" 
id="S4.SS1.SSS1.p1.3.m3.1"><semantics id="S4.SS1.SSS1.p1.3.m3.1a"><mo id="S4.SS1.SSS1.p1.3.m3.1.1" xref="S4.SS1.SSS1.p1.3.m3.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS1.SSS1.p1.3.m3.1b"><csymbol cd="latexml" id="S4.SS1.SSS1.p1.3.m3.1.1.cmml" xref="S4.SS1.SSS1.p1.3.m3.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS1.SSS1.p1.3.m3.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS1.SSS1.p1.3.m3.1d">∼</annotation></semantics></math>470 to <math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS1.SSS1.p1.4.m4.1"><semantics id="S4.SS1.SSS1.p1.4.m4.1a"><mo id="S4.SS1.SSS1.p1.4.m4.1.1" xref="S4.SS1.SSS1.p1.4.m4.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS1.SSS1.p1.4.m4.1b"><csymbol cd="latexml" id="S4.SS1.SSS1.p1.4.m4.1.1.cmml" xref="S4.SS1.SSS1.p1.4.m4.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS1.SSS1.p1.4.m4.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS1.SSS1.p1.4.m4.1d">∼</annotation></semantics></math>600 steps). This is also when the policy entropy (i.e., randomness) fluctuated significantly, signifying that the agents were still learning. After this initial exploration, the agents tried reward hacking by choosing to take a longer time to traverse the intersection. 
This is marked by an increase in the episode length (from <math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS1.SSS1.p1.5.m5.1"><semantics id="S4.SS1.SSS1.p1.5.m5.1a"><mo id="S4.SS1.SSS1.p1.5.m5.1.1" xref="S4.SS1.SSS1.p1.5.m5.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS1.SSS1.p1.5.m5.1b"><csymbol cd="latexml" id="S4.SS1.SSS1.p1.5.m5.1.1.cmml" xref="S4.SS1.SSS1.p1.5.m5.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS1.SSS1.p1.5.m5.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS1.SSS1.p1.5.m5.1d">∼</annotation></semantics></math>600 to <math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS1.SSS1.p1.6.m6.1"><semantics id="S4.SS1.SSS1.p1.6.m6.1a"><mo id="S4.SS1.SSS1.p1.6.m6.1.1" xref="S4.SS1.SSS1.p1.6.m6.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS1.SSS1.p1.6.m6.1b"><csymbol cd="latexml" id="S4.SS1.SSS1.p1.6.m6.1.1.cmml" xref="S4.SS1.SSS1.p1.6.m6.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS1.SSS1.p1.6.m6.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS1.SSS1.p1.6.m6.1d">∼</annotation></semantics></math>700 steps) between 700k and 750k steps. We attribute this to the last reward term, which continuously rewarded the agents in inverse proportion to their distance from the goal. However, this phase was quickly overcome, since the probability of collisions or lane boundary violations increased while the resulting reward was comparatively insignificant. By this point, the policy entropy had started to settle but was still fluctuating slightly. 
Towards the end of 1M steps, the policy converged at a stable cumulative reward (<math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS1.SSS1.p1.7.m7.1"><semantics id="S4.SS1.SSS1.p1.7.m7.1a"><mo id="S4.SS1.SSS1.p1.7.m7.1.1" xref="S4.SS1.SSS1.p1.7.m7.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS1.SSS1.p1.7.m7.1b"><csymbol cd="latexml" id="S4.SS1.SSS1.p1.7.m7.1.1.cmml" xref="S4.SS1.SSS1.p1.7.m7.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS1.SSS1.p1.7.m7.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS1.SSS1.p1.7.m7.1d">∼</annotation></semantics></math>8) and episode length (<math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS1.SSS1.p1.8.m8.1"><semantics id="S4.SS1.SSS1.p1.8.m8.1a"><mo id="S4.SS1.SSS1.p1.8.m8.1.1" xref="S4.SS1.SSS1.p1.8.m8.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS1.SSS1.p1.8.m8.1b"><csymbol cd="latexml" id="S4.SS1.SSS1.p1.8.m8.1.1.cmml" xref="S4.SS1.SSS1.p1.8.m8.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS1.SSS1.p1.8.m8.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS1.SSS1.p1.8.m8.1d">∼</annotation></semantics></math>600 steps), while settling at a policy entropy of <math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS1.SSS1.p1.9.m9.1"><semantics id="S4.SS1.SSS1.p1.9.m9.1a"><mo id="S4.SS1.SSS1.p1.9.m9.1.1" xref="S4.SS1.SSS1.p1.9.m9.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS1.SSS1.p1.9.m9.1b"><csymbol cd="latexml" id="S4.SS1.SSS1.p1.9.m9.1.1.cmml" xref="S4.SS1.SSS1.p1.9.m9.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS1.SSS1.p1.9.m9.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS1.SSS1.p1.9.m9.1d">∼</annotation></semantics></math>1.2. 
For low and high domain randomization (i.e., LDR and HDR), the KPIs followed a similar trend but with progressively larger fluctuations (especially in policy entropy), owing to the randomized parameters.</p> </div> <div class="ltx_para" id="S4.SS1.SSS1.p2"> <p class="ltx_p" id="S4.SS1.SSS1.p2.1">From a computing perspective, we analyzed the effect of parallelizing the intersection-traversal environment from a single instance (4 agents) up to 25 instances (100 agents). As depicted in Fig. <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S1.F2" title="Figure 2 ‣ I Introduction ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_ref ltx_nolink"><span class="ltx_text ltx_ref_tag">2</span></span>(a)-(b)</a>, the reduction in training time (up to 76.3%) was markedly non-linear, saturating beyond 15-20 parallel environments.</p> </div> <figure class="ltx_figure" id="S4.F3"> <div class="ltx_flex_figure"> <div class="ltx_flex_cell ltx_flex_size_3"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S4.F3.sf1"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="464" id="S4.F3.sf1.g1" src="extracted/6294916/fig3a.jpg" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F3.sf1.2.1.1" style="font-size:90%;">(a)</span> </span></figcaption> </figure> </div> <div class="ltx_flex_cell ltx_flex_size_3"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S4.F3.sf2"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="464" id="S4.F3.sf2.g1" src="extracted/6294916/fig3b.jpg" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F3.sf2.2.1.1" style="font-size:90%;">(b)</span> </span></figcaption> 
</figure> </div> <div class="ltx_flex_cell ltx_flex_size_3"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S4.F3.sf3"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="464" id="S4.F3.sf3.g1" src="extracted/6294916/fig3c.jpg" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F3.sf3.2.1.1" style="font-size:90%;">(c)</span> </span></figcaption> </figure> </div> </div> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F3.2.1.1" style="font-size:90%;">Figure 3</span>: </span><span class="ltx_text" id="S4.F3.3.2" style="font-size:90%;">Training results for cooperative MARL: (a) cumulative reward, (b) episode length, and (c) policy entropy w.r.t. training steps.</span></figcaption> </figure> <figure class="ltx_figure" id="S4.F4"> <div class="ltx_flex_figure"> <div class="ltx_flex_cell ltx_flex_size_3"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S4.F4.sf1"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="337" id="S4.F4.sf1.g1" src="extracted/6294916/fig4a.png" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F4.sf1.2.1.1" style="font-size:90%;">(a)</span> </span></figcaption> </figure> </div> <div class="ltx_flex_cell ltx_flex_size_3"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S4.F4.sf2"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="337" id="S4.F4.sf2.g1" src="extracted/6294916/fig4b.png" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F4.sf2.2.1.1" style="font-size:90%;">(b)</span> </span></figcaption> </figure> </div> <div class="ltx_flex_cell ltx_flex_size_3"> <figure class="ltx_figure 
ltx_figure_panel ltx_align_center" id="S4.F4.sf3"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="337" id="S4.F4.sf3.g1" src="extracted/6294916/fig4c.png" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F4.sf3.2.1.1" style="font-size:90%;">(c)</span> </span></figcaption> </figure> </div> </div> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F4.2.1.1" style="font-size:90%;">Figure 4</span>: </span><span class="ltx_text" id="S4.F4.3.2" style="font-size:90%;">Deployment results for cooperative MARL: (a) A1 and A4 avoid collision, (b) A1 finds a gap between A2 and A3 to reach its goal, and (c) A2 and A3 avoid collision, A4 approaches its goal, and A1 is re-spawned.</span></figcaption> </figure> <figure class="ltx_figure" id="S4.F5"><img alt="Refer to caption" class="ltx_graphics ltx_img_landscape" height="365" id="S4.F5.g1" src="extracted/6294916/fig5.jpg" width="598"/> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F5.2.1.1" style="font-size:90%;">Figure 5</span>: </span><span class="ltx_text" id="S4.F5.3.2" style="font-size:90%;">Deployment and benchmarking of intersection traversal policies with 4 cooperative agents (A1-A4).</span></figcaption> </figure> <figure class="ltx_figure" id="S4.F6"> <div class="ltx_flex_figure"> <div class="ltx_flex_cell ltx_flex_size_many"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S4.F6.sf1"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="464" id="S4.F6.sf1.g1" src="extracted/6294916/fig6a.jpg" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F6.sf1.2.1.1" style="font-size:90%;">(a)</span> </span></figcaption> </figure> </div> <div class="ltx_flex_cell 
ltx_flex_size_many"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S4.F6.sf2"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="464" id="S4.F6.sf2.g1" src="extracted/6294916/fig6b.jpg" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F6.sf2.2.1.1" style="font-size:90%;">(b)</span> </span></figcaption> </figure> </div> <div class="ltx_flex_cell ltx_flex_size_many"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S4.F6.sf3"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="464" id="S4.F6.sf3.g1" src="extracted/6294916/fig6c.jpg" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F6.sf3.2.1.1" style="font-size:90%;">(c)</span> </span></figcaption> </figure> </div> <div class="ltx_flex_cell ltx_flex_size_many"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S4.F6.sf4"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="464" id="S4.F6.sf4.g1" src="extracted/6294916/fig6d.jpg" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F6.sf4.2.1.1" style="font-size:90%;">(d)</span> </span></figcaption> </figure> </div> <div class="ltx_flex_cell ltx_flex_size_many"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S4.F6.sf5"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="464" id="S4.F6.sf5.g1" src="extracted/6294916/fig6e.jpg" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F6.sf5.2.1.1" style="font-size:90%;">(e)</span> </span></figcaption> </figure> </div> <div class="ltx_flex_cell ltx_flex_size_many"> <figure class="ltx_figure ltx_figure_panel 
ltx_align_center" id="S4.F6.sf6"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="464" id="S4.F6.sf6.g1" src="extracted/6294916/fig6f.jpg" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F6.sf6.2.1.1" style="font-size:90%;">(f)</span> </span></figcaption> </figure> </div> </div> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F6.2.1.1" style="font-size:90%;">Figure 6</span>: </span><span class="ltx_text" id="S4.F6.3.2" style="font-size:90%;">Training results for competitive MARL: (a) BC loss, (b) GAIL reward, (c) curiosity reward, (d) extrinsic reward, (e) episode length, and (f) policy entropy w.r.t. training steps.</span></figcaption> </figure> <figure class="ltx_figure" id="S4.F7"> <div class="ltx_flex_figure"> <div class="ltx_flex_cell ltx_flex_size_many"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S4.F7.sf1"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="337" id="S4.F7.sf1.g1" src="extracted/6294916/fig7a.png" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F7.sf1.2.1.1" style="font-size:90%;">(a)</span> </span></figcaption> </figure> </div> <div class="ltx_flex_cell ltx_flex_size_many"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S4.F7.sf2"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="337" id="S4.F7.sf2.g1" src="extracted/6294916/fig7b.png" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F7.sf2.2.1.1" style="font-size:90%;">(b)</span> </span></figcaption> </figure> </div> <div class="ltx_flex_cell ltx_flex_size_many"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S4.F7.sf3"><img 
alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="337" id="S4.F7.sf3.g1" src="extracted/6294916/fig7c.png" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F7.sf3.2.1.1" style="font-size:90%;">(c)</span> </span></figcaption> </figure> </div> <div class="ltx_flex_cell ltx_flex_size_many"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S4.F7.sf4"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="337" id="S4.F7.sf4.g1" src="extracted/6294916/fig7d.png" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F7.sf4.2.1.1" style="font-size:90%;">(d)</span> </span></figcaption> </figure> </div> <div class="ltx_flex_cell ltx_flex_size_many"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S4.F7.sf5"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="337" id="S4.F7.sf5.g1" src="extracted/6294916/fig7e.png" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F7.sf5.2.1.1" style="font-size:90%;">(e)</span> </span></figcaption> </figure> </div> <div class="ltx_flex_cell ltx_flex_size_many"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S4.F7.sf6"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="337" id="S4.F7.sf6.g1" src="extracted/6294916/fig7f.png" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F7.sf6.2.1.1" style="font-size:90%;">(f)</span> </span></figcaption> </figure> </div> </div> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F7.2.1.1" style="font-size:90%;">Figure 7</span>: </span><span class="ltx_text" 
id="S4.F7.3.2" style="font-size:90%;">Deployment results for competitive MARL: (a)-(c) denote three frozen snapshots of a block-block-overtake sequence, and (d)-(f) denote three frozen snapshots of a let-pass-and-overtake sequence.</span></figcaption> </figure> <figure class="ltx_figure" id="S4.F8"><img alt="Refer to caption" class="ltx_graphics ltx_img_landscape" height="365" id="S4.F8.g1" src="extracted/6294916/fig8.jpg" width="598"/> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F8.2.1.1" style="font-size:90%;">Figure 8</span>: </span><span class="ltx_text" id="S4.F8.3.2" style="font-size:90%;">Deployment and benchmarking of autonomous racing policies with 2 adversarial agents (A1 and A2).</span></figcaption> </figure> </section> <section class="ltx_subsubsection" id="S4.SS1.SSS2"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection"><span class="ltx_text" id="S4.SS1.SSS2.5.1.1">IV-A</span>2 </span>Deployment and Sim2Real Transfer</h4> <div class="ltx_para" id="S4.SS1.SSS2.p1"> <p class="ltx_p" id="S4.SS1.SSS2.p1.1">The trained policies were first deployed and verified in simulation, where we observed interesting emergent behaviors among the agents. The agents strategically slowed down or steered away from each other to avoid collisions (refer to Fig. 
<a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S4.F4" title="Figure 4 ‣ IV-A1 Training and Simulation Parallelization ‣ IV-A Cooperative Multi-Agent Scenario ‣ IV Results ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_tag">4</span></a>).</p> </div> <div class="ltx_para" id="S4.SS1.SSS2.p2"> <p class="ltx_p" id="S4.SS1.SSS2.p2.1">Next, we quantitatively analyzed the policies trained with different levels of domain randomization (i.e., NDR, LDR, and HDR) and benchmarked them against FGM. The experiments comprised 16 simulation runs and 16 real-world deployments, with performance assessed across 3 KPIs, viz. success rate, cumulative reward, and episode duration, aggregated across all the agents (since this was a cooperative scenario), as depicted in Fig. <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S4.F5" title="Figure 5 ‣ IV-A1 Training and Simulation Parallelization ‣ IV-A Cooperative Multi-Agent Scenario ‣ IV Results ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_tag">5</span></a>. 
For real-world deployments, our experiments cycled across all the agents, such that each agent was physically deployed in the loop with the simulated environment comprising its virtual peers (refer Section <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S3.SS4" title="III-D Hybrid Sim2Real Transfer ‣ III Methodology ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_tag"><span class="ltx_text">III-D</span></span></a> for implementation details).</p> </div> <div class="ltx_para" id="S4.SS1.SSS2.p3"> <p class="ltx_p" id="S4.SS1.SSS2.p3.3">It was observed that FGM was least successful (<math alttext="&lt;" class="ltx_Math" display="inline" id="S4.SS1.SSS2.p3.1.m1.1"><semantics id="S4.SS1.SSS2.p3.1.m1.1a"><mo id="S4.SS1.SSS2.p3.1.m1.1.1" xref="S4.SS1.SSS2.p3.1.m1.1.1.cmml">&lt;</mo><annotation-xml encoding="MathML-Content" id="S4.SS1.SSS2.p3.1.m1.1b"><lt id="S4.SS1.SSS2.p3.1.m1.1.1.cmml" xref="S4.SS1.SSS2.p3.1.m1.1.1"></lt></annotation-xml><annotation encoding="application/x-tex" id="S4.SS1.SSS2.p3.1.m1.1c">&lt;</annotation><annotation encoding="application/x-llamapun" id="S4.SS1.SSS2.p3.1.m1.1d">&lt;</annotation></semantics></math>30%) across simulation as well as real-world experiments. 
The same fact was reflected across the reward (<math alttext="&lt;" class="ltx_Math" display="inline" id="S4.SS1.SSS2.p3.2.m2.1"><semantics id="S4.SS1.SSS2.p3.2.m2.1a"><mo id="S4.SS1.SSS2.p3.2.m2.1.1" xref="S4.SS1.SSS2.p3.2.m2.1.1.cmml">&lt;</mo><annotation-xml encoding="MathML-Content" id="S4.SS1.SSS2.p3.2.m2.1b"><lt id="S4.SS1.SSS2.p3.2.m2.1.1.cmml" xref="S4.SS1.SSS2.p3.2.m2.1.1"></lt></annotation-xml><annotation encoding="application/x-tex" id="S4.SS1.SSS2.p3.2.m2.1c">&lt;</annotation><annotation encoding="application/x-llamapun" id="S4.SS1.SSS2.p3.2.m2.1d">&lt;</annotation></semantics></math>5 points) as well as duration (<math alttext="&lt;" class="ltx_Math" display="inline" id="S4.SS1.SSS2.p3.3.m3.1"><semantics id="S4.SS1.SSS2.p3.3.m3.1a"><mo id="S4.SS1.SSS2.p3.3.m3.1.1" xref="S4.SS1.SSS2.p3.3.m3.1.1.cmml">&lt;</mo><annotation-xml encoding="MathML-Content" id="S4.SS1.SSS2.p3.3.m3.1b"><lt id="S4.SS1.SSS2.p3.3.m3.1.1.cmml" xref="S4.SS1.SSS2.p3.3.m3.1.1"></lt></annotation-xml><annotation encoding="application/x-tex" id="S4.SS1.SSS2.p3.3.m3.1c">&lt;</annotation><annotation encoding="application/x-llamapun" id="S4.SS1.SSS2.p3.3.m3.1d">&lt;</annotation></semantics></math>500 steps) metrics. NDR-MARL performed better with mean success rates above 45%, which allowed it to cash in over 6 reward points on average. Here, the episode duration was between 400 and 650 steps in most cases, with outliers depicting early collisions. LDR-MARL was the most consistent with success rates reaching as high as 80%, rewards reaching as high as 9 points, and episode durations ranging between 500 and 600 steps in most of the cases. 
Lastly, HDR-MARL performed worse than the other MARL configurations, achieving a success rate of at most 40%.</p> </div> <div class="ltx_para" id="S4.SS1.SSS2.p4"> <p class="ltx_p" id="S4.SS1.SSS2.p4.1">Finally, the closeness between the simulation and real-world metrics provides an estimate of the sim2real gap, which was smallest for LDR-MARL (4.12%), followed by NDR-MARL (9.38%), FGM (16.01%), and HDR-MARL (33.85%).</p> </div> </section> </section> <section class="ltx_subsection" id="S4.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S4.SS2.5.1.1">IV-B</span> </span><span class="ltx_text ltx_font_italic" id="S4.SS2.6.2">Competitive Multi-Agent Scenario</span> </h3> <section class="ltx_subsubsection" id="S4.SS2.SSS1"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection"><span class="ltx_text" id="S4.SS2.SSS1.5.1.1">IV-B</span>1 </span>Training and Simulation Parallelization</h4> <div class="ltx_para" id="S4.SS2.SSS1.p1"> <p class="ltx_p" id="S4.SS2.SSS1.p1.20">Fig. <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S4.F6" title="Figure 6 ‣ IV-A1 Training and Simulation Parallelization ‣ IV-A Cooperative Multi-Agent Scenario ‣ IV Results ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_tag">6</span></a> depicts the KPIs used to analyze the competitive MARL training without any domain randomization (i.e., NDR). 
It was observed that the agents initially (until <math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p1.1.m1.1"><semantics id="S4.SS2.SSS1.p1.1.m1.1a"><mo id="S4.SS2.SSS1.p1.1.m1.1.1" xref="S4.SS2.SSS1.p1.1.m1.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.1.m1.1b"><csymbol cd="latexml" id="S4.SS2.SSS1.p1.1.m1.1.1.cmml" xref="S4.SS2.SSS1.p1.1.m1.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.1.m1.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.1.m1.1d">∼</annotation></semantics></math>200k steps) tried aggressive maneuvers, which mostly resulted in collisions. This is marked by the low extrinsic rewards (<math alttext="&lt;" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p1.2.m2.1"><semantics id="S4.SS2.SSS1.p1.2.m2.1a"><mo id="S4.SS2.SSS1.p1.2.m2.1.1" xref="S4.SS2.SSS1.p1.2.m2.1.1.cmml">&lt;</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.2.m2.1b"><lt id="S4.SS2.SSS1.p1.2.m2.1.1.cmml" xref="S4.SS2.SSS1.p1.2.m2.1.1"></lt></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.2.m2.1c">&lt;</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.2.m2.1d">&lt;</annotation></semantics></math>50) and episode lengths (<math alttext="&lt;" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p1.3.m3.1"><semantics id="S4.SS2.SSS1.p1.3.m3.1a"><mo id="S4.SS2.SSS1.p1.3.m3.1.1" xref="S4.SS2.SSS1.p1.3.m3.1.1.cmml">&lt;</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.3.m3.1b"><lt id="S4.SS2.SSS1.p1.3.m3.1.1.cmml" xref="S4.SS2.SSS1.p1.3.m3.1.1"></lt></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.3.m3.1c">&lt;</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.3.m3.1d">&lt;</annotation></semantics></math>1400 steps) in this phase. 
This can also be attributed to the higher BC loss (<math alttext="&gt;" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p1.4.m4.1"><semantics id="S4.SS2.SSS1.p1.4.m4.1a"><mo id="S4.SS2.SSS1.p1.4.m4.1.1" xref="S4.SS2.SSS1.p1.4.m4.1.1.cmml">&gt;</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.4.m4.1b"><gt id="S4.SS2.SSS1.p1.4.m4.1.1.cmml" xref="S4.SS2.SSS1.p1.4.m4.1.1"></gt></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.4.m4.1c">&gt;</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.4.m4.1d">&gt;</annotation></semantics></math>0.1) as well as lower curiosity (<math alttext="&lt;" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p1.5.m5.1"><semantics id="S4.SS2.SSS1.p1.5.m5.1a"><mo id="S4.SS2.SSS1.p1.5.m5.1.1" xref="S4.SS2.SSS1.p1.5.m5.1.1.cmml">&lt;</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.5.m5.1b"><lt id="S4.SS2.SSS1.p1.5.m5.1.1.cmml" xref="S4.SS2.SSS1.p1.5.m5.1.1"></lt></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.5.m5.1c">&lt;</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.5.m5.1d">&lt;</annotation></semantics></math>0.8) and GAIL (<math alttext="&lt;" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p1.6.m6.1"><semantics id="S4.SS2.SSS1.p1.6.m6.1a"><mo id="S4.SS2.SSS1.p1.6.m6.1.1" xref="S4.SS2.SSS1.p1.6.m6.1.1.cmml">&lt;</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.6.m6.1b"><lt id="S4.SS2.SSS1.p1.6.m6.1.1.cmml" xref="S4.SS2.SSS1.p1.6.m6.1.1"></lt></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.6.m6.1c">&lt;</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.6.m6.1d">&lt;</annotation></semantics></math>9) rewards, indicating that the agents had not even started imitating the demonstrations correctly. 
However, the pre-recorded demonstrations soon (between <math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p1.7.m7.1"><semantics id="S4.SS2.SSS1.p1.7.m7.1a"><mo id="S4.SS2.SSS1.p1.7.m7.1.1" xref="S4.SS2.SSS1.p1.7.m7.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.7.m7.1b"><csymbol cd="latexml" id="S4.SS2.SSS1.p1.7.m7.1.1.cmml" xref="S4.SS2.SSS1.p1.7.m7.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.7.m7.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.7.m7.1d">∼</annotation></semantics></math>200k and <math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p1.8.m8.1"><semantics id="S4.SS2.SSS1.p1.8.m8.1a"><mo id="S4.SS2.SSS1.p1.8.m8.1.1" xref="S4.SS2.SSS1.p1.8.m8.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.8.m8.1b"><csymbol cd="latexml" id="S4.SS2.SSS1.p1.8.m8.1.1.cmml" xref="S4.SS2.SSS1.p1.8.m8.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.8.m8.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.8.m8.1d">∼</annotation></semantics></math>800k steps) guided the agents toward completing multiple laps around the race track. 
This is marked by an exponential reduction in the BC loss (from <math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p1.9.m9.1"><semantics id="S4.SS2.SSS1.p1.9.m9.1a"><mo id="S4.SS2.SSS1.p1.9.m9.1.1" xref="S4.SS2.SSS1.p1.9.m9.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.9.m9.1b"><csymbol cd="latexml" id="S4.SS2.SSS1.p1.9.m9.1.1.cmml" xref="S4.SS2.SSS1.p1.9.m9.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.9.m9.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.9.m9.1d">∼</annotation></semantics></math>0.1 to <math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p1.10.m10.1"><semantics id="S4.SS2.SSS1.p1.10.m10.1a"><mo id="S4.SS2.SSS1.p1.10.m10.1.1" xref="S4.SS2.SSS1.p1.10.m10.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.10.m10.1b"><csymbol cd="latexml" id="S4.SS2.SSS1.p1.10.m10.1.1.cmml" xref="S4.SS2.SSS1.p1.10.m10.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.10.m10.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.10.m10.1d">∼</annotation></semantics></math>0.025) and a progressive increase in the extrinsic (from <math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p1.11.m11.1"><semantics id="S4.SS2.SSS1.p1.11.m11.1a"><mo id="S4.SS2.SSS1.p1.11.m11.1.1" xref="S4.SS2.SSS1.p1.11.m11.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.11.m11.1b"><csymbol cd="latexml" id="S4.SS2.SSS1.p1.11.m11.1.1.cmml" xref="S4.SS2.SSS1.p1.11.m11.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.11.m11.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.11.m11.1d">∼</annotation></semantics></math>50 to <math alttext="\sim" class="ltx_Math" display="inline" 
id="S4.SS2.SSS1.p1.12.m12.1"><semantics id="S4.SS2.SSS1.p1.12.m12.1a"><mo id="S4.SS2.SSS1.p1.12.m12.1.1" xref="S4.SS2.SSS1.p1.12.m12.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.12.m12.1b"><csymbol cd="latexml" id="S4.SS2.SSS1.p1.12.m12.1.1.cmml" xref="S4.SS2.SSS1.p1.12.m12.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.12.m12.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.12.m12.1d">∼</annotation></semantics></math>57), curiosity (from <math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p1.13.m13.1"><semantics id="S4.SS2.SSS1.p1.13.m13.1a"><mo id="S4.SS2.SSS1.p1.13.m13.1.1" xref="S4.SS2.SSS1.p1.13.m13.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.13.m13.1b"><csymbol cd="latexml" id="S4.SS2.SSS1.p1.13.m13.1.1.cmml" xref="S4.SS2.SSS1.p1.13.m13.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.13.m13.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.13.m13.1d">∼</annotation></semantics></math>0.8 to <math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p1.14.m14.1"><semantics id="S4.SS2.SSS1.p1.14.m14.1a"><mo id="S4.SS2.SSS1.p1.14.m14.1.1" xref="S4.SS2.SSS1.p1.14.m14.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.14.m14.1b"><csymbol cd="latexml" id="S4.SS2.SSS1.p1.14.m14.1.1.cmml" xref="S4.SS2.SSS1.p1.14.m14.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.14.m14.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.14.m14.1d">∼</annotation></semantics></math>1.1), and GAIL (from <math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p1.15.m15.1"><semantics id="S4.SS2.SSS1.p1.15.m15.1a"><mo id="S4.SS2.SSS1.p1.15.m15.1.1" 
xref="S4.SS2.SSS1.p1.15.m15.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.15.m15.1b"><csymbol cd="latexml" id="S4.SS2.SSS1.p1.15.m15.1.1.cmml" xref="S4.SS2.SSS1.p1.15.m15.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.15.m15.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.15.m15.1d">∼</annotation></semantics></math>9 to <math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p1.16.m16.1"><semantics id="S4.SS2.SSS1.p1.16.m16.1a"><mo id="S4.SS2.SSS1.p1.16.m16.1.1" xref="S4.SS2.SSS1.p1.16.m16.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.16.m16.1b"><csymbol cd="latexml" id="S4.SS2.SSS1.p1.16.m16.1.1.cmml" xref="S4.SS2.SSS1.p1.16.m16.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.16.m16.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.16.m16.1d">∼</annotation></semantics></math>11) rewards as well as the episode length (from <math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p1.17.m17.1"><semantics id="S4.SS2.SSS1.p1.17.m17.1a"><mo id="S4.SS2.SSS1.p1.17.m17.1.1" xref="S4.SS2.SSS1.p1.17.m17.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.17.m17.1b"><csymbol cd="latexml" id="S4.SS2.SSS1.p1.17.m17.1.1.cmml" xref="S4.SS2.SSS1.p1.17.m17.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.17.m17.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.17.m17.1d">∼</annotation></semantics></math>1400 to <math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p1.18.m18.1"><semantics id="S4.SS2.SSS1.p1.18.m18.1a"><mo id="S4.SS2.SSS1.p1.18.m18.1.1" xref="S4.SS2.SSS1.p1.18.m18.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.18.m18.1b"><csymbol 
cd="latexml" id="S4.SS2.SSS1.p1.18.m18.1.1.cmml" xref="S4.SS2.SSS1.p1.18.m18.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.18.m18.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.18.m18.1d">∼</annotation></semantics></math>1600 steps). Interestingly, the red agent (Agent 1) dominated the blue one (Agent 2) until about 500 steps, after which the latter learned the <span class="ltx_text ltx_font_italic" id="S4.SS2.SSS1.p1.20.1">“competitive spirit”</span> and bridged the performance gap. Towards the end of 1M steps, both policies converged to stable reward values and episode lengths, while the policy entropy gradually reduced from <math alttext="&gt;" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p1.19.m19.1"><semantics id="S4.SS2.SSS1.p1.19.m19.1a"><mo id="S4.SS2.SSS1.p1.19.m19.1.1" xref="S4.SS2.SSS1.p1.19.m19.1.1.cmml">&gt;</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.19.m19.1b"><gt id="S4.SS2.SSS1.p1.19.m19.1.1.cmml" xref="S4.SS2.SSS1.p1.19.m19.1.1"></gt></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.19.m19.1c">&gt;</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.19.m19.1d">&gt;</annotation></semantics></math>0.3 to <math alttext="&lt;" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p1.20.m20.1"><semantics id="S4.SS2.SSS1.p1.20.m20.1a"><mo id="S4.SS2.SSS1.p1.20.m20.1.1" xref="S4.SS2.SSS1.p1.20.m20.1.1.cmml">&lt;</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p1.20.m20.1b"><lt id="S4.SS2.SSS1.p1.20.m20.1.1.cmml" xref="S4.SS2.SSS1.p1.20.m20.1.1"></lt></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p1.20.m20.1c">&lt;</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p1.20.m20.1d">&lt;</annotation></semantics></math>0.05. 
Here, the non-zero offset in the BC loss indicates that the agents did not overfit the demonstrations; rather, they explored the state space quite well to maximize the extrinsic reward by adopting aggressive <span class="ltx_text ltx_font_italic" id="S4.SS2.SSS1.p1.20.2">“racing”</span> behaviors. The KPIs followed a similar trend for the LDR and HDR variations, but with higher fluctuations (especially in policy entropy) owing to the randomized parameters.</p> </div> <div class="ltx_para" id="S4.SS2.SSS1.p2"> <p class="ltx_p" id="S4.SS2.SSS1.p2.1">From a computing perspective, we analyzed the effect of parallelizing the 2-agent adversarial racing family from a single instance (2 agents) up to 10 such families (20 agents) training in parallel within the same environment. As observed from Fig. <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S1.F2" title="Figure 2 ‣ I Introduction ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_ref ltx_nolink"><span class="ltx_text ltx_ref_tag">2</span></span>(c)-(d)</a>, the reduction in training time (up to 49%) was less dramatic in this case, with the saturation point approaching after 10<math alttext="\times" class="ltx_Math" display="inline" id="S4.SS2.SSS1.p2.1.m1.1"><semantics id="S4.SS2.SSS1.p2.1.m1.1a"><mo id="S4.SS2.SSS1.p2.1.m1.1.1" xref="S4.SS2.SSS1.p2.1.m1.1.1.cmml">×</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS1.p2.1.m1.1b"><times id="S4.SS2.SSS1.p2.1.m1.1.1.cmml" xref="S4.SS2.SSS1.p2.1.m1.1.1"></times></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS1.p2.1.m1.1c">\times</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS1.p2.1.m1.1d">×</annotation></semantics></math>2 parallel agents.</p> </div> </section> <section class="ltx_subsubsection" id="S4.SS2.SSS2"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag 
ltx_tag_subsubsection"><span class="ltx_text" id="S4.SS2.SSS2.5.1.1">IV-B</span>2 </span>Deployment and Sim2Real Transfer</h4> <div class="ltx_para" id="S4.SS2.SSS2.p1"> <p class="ltx_p" id="S4.SS2.SSS2.p1.1">The trained policies were first deployed and verified in simulation, where we observed interesting adversarial behaviors (e.g., blocking, baiting, overtaking, etc.). These behaviors conveyed that the agents were explicitly competing, while implicitly coordinating to avoid collisions (refer Fig. <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S4.F7" title="Figure 7 ‣ IV-A1 Training and Simulation Parallelization ‣ IV-A Cooperative Multi-Agent Scenario ‣ IV Results ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_tag">7</span></a>).</p> </div> <div class="ltx_para" id="S4.SS2.SSS2.p2"> <p class="ltx_p" id="S4.SS2.SSS2.p2.1">Next, we quantitatively analyzed the policies trained with different grades of domain randomization (i.e., NDR, LDR, and HDR) and benchmarked them against FGM. The design of experiments followed 16 simulation runs and 16 real-world deployments, where the performance was assessed across 3 KPIs, viz. win rate, cumulative reward, and episode duration separately for each agent (since this was a competitive scenario) as depicted in Fig. <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S4.F8" title="Figure 8 ‣ IV-A1 Training and Simulation Parallelization ‣ IV-A Cooperative Multi-Agent Scenario ‣ IV Results ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_tag">8</span></a>. 
For real-world deployments, our experiments cycled across all the agents, such that each agent was physically deployed in the loop with the simulated environment comprising its virtual peers (refer Section <a class="ltx_ref" href="https://arxiv.org/html/2403.10996v5#S3.SS4" title="III-D Hybrid Sim2Real Transfer ‣ III Methodology ‣ Mixed-Reality Digital Twins: Leveraging the Physical and Virtual Worlds for Hybrid Sim2Real Transition of Multi-Agent Reinforcement Learning Policies"><span class="ltx_text ltx_ref_tag"><span class="ltx_text">III-D</span></span></a> for implementation details).</p> </div> <div class="ltx_para" id="S4.SS2.SSS2.p3"> <p class="ltx_p" id="S4.SS2.SSS2.p3.7">It was observed that both agents had the lowest average winning rate (<math alttext="&lt;" class="ltx_Math" display="inline" id="S4.SS2.SSS2.p3.1.m1.1"><semantics id="S4.SS2.SSS2.p3.1.m1.1a"><mo id="S4.SS2.SSS2.p3.1.m1.1.1" xref="S4.SS2.SSS2.p3.1.m1.1.1.cmml">&lt;</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS2.p3.1.m1.1b"><lt id="S4.SS2.SSS2.p3.1.m1.1.1.cmml" xref="S4.SS2.SSS2.p3.1.m1.1.1"></lt></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS2.p3.1.m1.1c">&lt;</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS2.p3.1.m1.1d">&lt;</annotation></semantics></math>25%) with HDR-MARL, although they secured the least mean reward (<math alttext="&lt;" class="ltx_Math" display="inline" id="S4.SS2.SSS2.p3.2.m2.1"><semantics id="S4.SS2.SSS2.p3.2.m2.1a"><mo id="S4.SS2.SSS2.p3.2.m2.1.1" xref="S4.SS2.SSS2.p3.2.m2.1.1.cmml">&lt;</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS2.p3.2.m2.1b"><lt id="S4.SS2.SSS2.p3.2.m2.1.1.cmml" xref="S4.SS2.SSS2.p3.2.m2.1.1"></lt></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS2.p3.2.m2.1c">&lt;</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS2.p3.2.m2.1d">&lt;</annotation></semantics></math>25 points) with FGM. 
This highlighted the difference between losing fairly and losing due to collision, which was also corroborated by the outliers in the duration metric. NDR-MARL provided higher (<math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS2.SSS2.p3.3.m3.1"><semantics id="S4.SS2.SSS2.p3.3.m3.1a"><mo id="S4.SS2.SSS2.p3.3.m3.1.1" xref="S4.SS2.SSS2.p3.3.m3.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS2.p3.3.m3.1b"><csymbol cd="latexml" id="S4.SS2.SSS2.p3.3.m3.1.1.cmml" xref="S4.SS2.SSS2.p3.3.m3.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS2.p3.3.m3.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS2.p3.3.m3.1d">∼</annotation></semantics></math>30-45%) winning consistency for either agent, but could not surpass that of LDR-MARL (<math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS2.SSS2.p3.4.m4.1"><semantics id="S4.SS2.SSS2.p3.4.m4.1a"><mo id="S4.SS2.SSS2.p3.4.m4.1.1" xref="S4.SS2.SSS2.p3.4.m4.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS2.p3.4.m4.1b"><csymbol cd="latexml" id="S4.SS2.SSS2.p3.4.m4.1.1.cmml" xref="S4.SS2.SSS2.p3.4.m4.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS2.p3.4.m4.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS2.p3.4.m4.1d">∼</annotation></semantics></math>35-47%). 
The same was reflected by the reward metric, wherein LDR-MARL cashed in slightly more reward (<math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS2.SSS2.p3.5.m5.1"><semantics id="S4.SS2.SSS2.p3.5.m5.1a"><mo id="S4.SS2.SSS2.p3.5.m5.1.1" xref="S4.SS2.SSS2.p3.5.m5.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS2.p3.5.m5.1b"><csymbol cd="latexml" id="S4.SS2.SSS2.p3.5.m5.1.1.cmml" xref="S4.SS2.SSS2.p3.5.m5.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS2.p3.5.m5.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS2.p3.5.m5.1d">∼</annotation></semantics></math>60-75 points) than NDR-MARL (<math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS2.SSS2.p3.6.m6.1"><semantics id="S4.SS2.SSS2.p3.6.m6.1a"><mo id="S4.SS2.SSS2.p3.6.m6.1.1" xref="S4.SS2.SSS2.p3.6.m6.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS2.p3.6.m6.1b"><csymbol cd="latexml" id="S4.SS2.SSS2.p3.6.m6.1.1.cmml" xref="S4.SS2.SSS2.p3.6.m6.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS2.p3.6.m6.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS2.p3.6.m6.1d">∼</annotation></semantics></math>55-70 points). 
Both performed equally well on the duration metric (<math alttext="\sim" class="ltx_Math" display="inline" id="S4.SS2.SSS2.p3.7.m7.1"><semantics id="S4.SS2.SSS2.p3.7.m7.1a"><mo id="S4.SS2.SSS2.p3.7.m7.1.1" xref="S4.SS2.SSS2.p3.7.m7.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S4.SS2.SSS2.p3.7.m7.1b"><csymbol cd="latexml" id="S4.SS2.SSS2.p3.7.m7.1.1.cmml" xref="S4.SS2.SSS2.p3.7.m7.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S4.SS2.SSS2.p3.7.m7.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S4.SS2.SSS2.p3.7.m7.1d">∼</annotation></semantics></math>1400-1700 steps); however, LDR-MARL was slightly more consistent, with lower variance.</p> </div> <div class="ltx_para" id="S4.SS2.SSS2.p4"> <p class="ltx_p" id="S4.SS2.SSS2.p4.1">Finally, the sim2real gap was smallest for LDR-MARL (2.88%), followed by NDR-MARL (6.88%), HDR-MARL (8.98%), and FGM (13.48%).</p> </div> </section> </section> </section> <section class="ltx_section" id="S5"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">V </span><span class="ltx_text ltx_font_smallcaps" id="S5.1.1">Conclusion</span> </h2> <div class="ltx_para" id="S5.p1"> <p class="ltx_p" id="S5.p1.1">This work identified two pain points in training and deploying MARL systems, and attempted to address them by proposing a scalable and parallelizable digital twin framework. Two representative case studies were formulated to support the claims: a 4-agent collaborative intersection traversal problem and a 2-agent adversarial head-to-head racing problem. The two problems were deliberately formulated with distinct observation spaces and reward functions and, more importantly, distinct learning architectures (vanilla MARL vs. demonstration-guided MARL). 
We analyzed the training metrics in each case and also noted the non-linear effect of agent/environment parallelization on the training time, with a hardware/software-specific point of diminishing returns. Finally, we presented a mixed-reality sim2real transfer of the trained policies using a single physical vehicle, which was immersed within the proposed digital twin framework to interact with its virtual peers in a virtual environment.</p> </div> <div class="ltx_para" id="S5.p2"> <p class="ltx_p" id="S5.p2.1">Future avenues of research include analyzing the effect of different communication frameworks and protocols on digital twinning, formulating physics-guided MARL problems, and scaling the deployments in terms of the number and size of the agents.</p> </div> </section> <section class="ltx_bibliography" id="bib"> <h2 class="ltx_title ltx_title_bibliography">References</h2> <ul class="ltx_biblist"> <li class="ltx_bibitem" id="bib.bib1"> <span class="ltx_tag ltx_tag_bibitem">[1]</span> <span class="ltx_bibblock"> S. H. Semnani, H. Liu, M. Everett, A. de Ruiter, and J. P. How, “Multi-agent Motion Planning for Dense and Dynamic Environments via Deep Reinforcement Learning,” <em class="ltx_emph ltx_font_italic" id="bib.bib1.1.1">IEEE Robotics and Automation Letters</em>, vol. 5, no. 2, pp. 3221–3226, 2020. </span> </li> <li class="ltx_bibitem" id="bib.bib2"> <span class="ltx_tag ltx_tag_bibitem">[2]</span> <span class="ltx_bibblock"> P. Long, T. Fan, X. Liao, W. Liu, H. Zhang, and J. Pan, “Towards Optimally Decentralized Multi-Robot Collision Avoidance via Deep Reinforcement Learning,” in <em class="ltx_emph ltx_font_italic" id="bib.bib2.1.1">2018 IEEE International Conference on Robotics and Automation (ICRA)</em>, 2018, pp. 6252–6259. </span> </li> <li class="ltx_bibitem" id="bib.bib3"> <span class="ltx_tag ltx_tag_bibitem">[3]</span> <span class="ltx_bibblock"> K. Sivanathan, B. K. Vinayagam, T. Samak, and C. 
Samak, “Decentralized Motion Planning for Multi-Robot Navigation using Deep Reinforcement Learning,” in <em class="ltx_emph ltx_font_italic" id="bib.bib3.1.1">2020 3rd International Conference on Intelligent Sustainable Systems (ICISS)</em>, 2020, pp. 709–716. </span> </li> <li class="ltx_bibitem" id="bib.bib4"> <span class="ltx_tag ltx_tag_bibitem">[4]</span> <span class="ltx_bibblock"> E. Candela, L. Parada, L. Marques, T.-A. Georgescu, Y. Demiris, and P. Angeloudis, “Transferring Multi-Agent Reinforcement Learning Policies for Autonomous Driving using Sim-to-Real,” in <em class="ltx_emph ltx_font_italic" id="bib.bib4.1.1">2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)</em>, 2022, pp. 8814–8820. </span> </li> <li class="ltx_bibitem" id="bib.bib5"> <span class="ltx_tag ltx_tag_bibitem">[5]</span> <span class="ltx_bibblock"> J. Betz, H. Zheng, A. Liniger, U. Rosolia, P. Karle, M. Behl, V. Krovi, and R. Mangharam, “Autonomous Vehicles on the Edge: A Survey on Autonomous Vehicle Racing,” <em class="ltx_emph ltx_font_italic" id="bib.bib5.1.1">IEEE Open Journal of Intelligent Transportation Systems</em>, vol. 3, pp. 458–488, 2022. </span> </li> <li class="ltx_bibitem" id="bib.bib6"> <span class="ltx_tag ltx_tag_bibitem">[6]</span> <span class="ltx_bibblock"> P. Werner, T. Seyde, P. Drews, T. M. Balch, I. Gilitschenski, W. Schwarting, G. Rosman, S. Karaman, and D. Rus, “Dynamic Multi-Team Racing: Competitive Driving on 1/10-th Scale Vehicles via Learning in Simulation,” in <em class="ltx_emph ltx_font_italic" id="bib.bib6.1.1">Proceedings of The 7th Conference on Robot Learning</em>, ser. Proceedings of Machine Learning Research, J. Tan, M. Toussaint, and K. Darvish, Eds., vol. 229.   PMLR, 06–09 Nov 2023, pp. 1667–1685. </span> </li> <li class="ltx_bibitem" id="bib.bib7"> <span class="ltx_tag ltx_tag_bibitem">[7]</span> <span class="ltx_bibblock"> F. Fuchs, Y. Song, E. Kaufmann, D. Scaramuzza, and P. 
Dürr, “Super-Human Performance in Gran Turismo Sport Using Deep Reinforcement Learning,” <em class="ltx_emph ltx_font_italic" id="bib.bib7.1.1">IEEE Robotics and Automation Letters</em>, vol. 6, no. 3, pp. 4257–4264, 2021. </span> </li> <li class="ltx_bibitem" id="bib.bib8"> <span class="ltx_tag ltx_tag_bibitem">[8]</span> <span class="ltx_bibblock"> Y. Song, H. Lin, E. Kaufmann, P. Dürr, and D. Scaramuzza, “Autonomous Overtaking in Gran Turismo Sport Using Curriculum Reinforcement Learning,” in <em class="ltx_emph ltx_font_italic" id="bib.bib8.1.1">2021 IEEE International Conference on Robotics and Automation (ICRA)</em>, 2021, pp. 9403–9409. </span> </li> <li class="ltx_bibitem" id="bib.bib9"> <span class="ltx_tag ltx_tag_bibitem">[9]</span> <span class="ltx_bibblock"> N. Rudin, D. Hoeller, P. Reist, and M. Hutter, “Learning to Walk in Minutes Using Massively Parallel Deep Reinforcement Learning,” in <em class="ltx_emph ltx_font_italic" id="bib.bib9.1.1">Proceedings of the 5th Conference on Robot Learning</em>, ser. Proceedings of Machine Learning Research, A. Faust, D. Hsu, and G. Neumann, Eds., vol. 164.   PMLR, 08–11 Nov 2022, pp. 91–100. </span> </li> <li class="ltx_bibitem" id="bib.bib10"> <span class="ltx_tag ltx_tag_bibitem">[10]</span> <span class="ltx_bibblock"> J. Truong, S. Chernova, and D. Batra, “Bi-Directional Domain Adaptation for Sim2Real Transfer of Embodied Navigation Agents,” <em class="ltx_emph ltx_font_italic" id="bib.bib10.1.1">IEEE Robotics and Automation Letters</em>, vol. 6, no. 2, pp. 2634–2641, 2021. </span> </li> <li class="ltx_bibitem" id="bib.bib11"> <span class="ltx_tag ltx_tag_bibitem">[11]</span> <span class="ltx_bibblock"> J. Hwangbo, J. Lee, A. Dosovitskiy, D. Bellicoso, V. Tsounis, V. Koltun, and M. Hutter, “Learning agile and dynamic motor skills for legged robots,” <em class="ltx_emph ltx_font_italic" id="bib.bib11.1.1">Science Robotics</em>, vol. 4, no. 26, p. eaau5872, 2019. 
</span> </li> <li class="ltx_bibitem" id="bib.bib12"> <span class="ltx_tag ltx_tag_bibitem">[12]</span> <span class="ltx_bibblock"> J. Tobin, R. Fong, A. Ray, J. Schneider, W. Zaremba, and P. Abbeel, “Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World,” in <em class="ltx_emph ltx_font_italic" id="bib.bib12.1.1">2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)</em>, 2017, pp. 23–30. </span> </li> <li class="ltx_bibitem" id="bib.bib13"> <span class="ltx_tag ltx_tag_bibitem">[13]</span> <span class="ltx_bibblock"> R. Mangharam, V. Krovi, J. Betz, A. Amine, C. Samak, and T. Samak, “24th RoboRacer Autonomous Racing Competition,” 2025. [Online]. Available: <a class="ltx_ref ltx_url" href="https://icra2025-race.roboracer.ai" title="">https://icra2025-race.roboracer.ai</a> </span> </li> <li class="ltx_bibitem" id="bib.bib14"> <span class="ltx_tag ltx_tag_bibitem">[14]</span> <span class="ltx_bibblock"> M. Bain and C. Sammut, “A Framework for Behavioural Cloning,” in <em class="ltx_emph ltx_font_italic" id="bib.bib14.1.1">Machine Intelligence 15</em>, 1995. </span> </li> <li class="ltx_bibitem" id="bib.bib15"> <span class="ltx_tag ltx_tag_bibitem">[15]</span> <span class="ltx_bibblock"> J. Ho and S. Ermon, “Generative Adversarial Imitation Learning,” in <em class="ltx_emph ltx_font_italic" id="bib.bib15.1.1">Proceedings of the 30th International Conference on Neural Information Processing Systems</em>, ser. NIPS’16.   Red Hook, NY, USA: Curran Associates Inc., 2016, pp. 4572–4580. </span> </li> <li class="ltx_bibitem" id="bib.bib16"> <span class="ltx_tag ltx_tag_bibitem">[16]</span> <span class="ltx_bibblock"> D. Pathak, P. Agrawal, A. A. Efros, and T. Darrell, “Curiosity-Driven Exploration by Self-Supervised Prediction,” in <em class="ltx_emph ltx_font_italic" id="bib.bib16.1.1">Proceedings of the 34th International Conference on Machine Learning - Volume 70</em>, ser. ICML’17.   JMLR.org, 2017, pp. 
2778–2787. </span> </li> <li class="ltx_bibitem" id="bib.bib17"> <span class="ltx_tag ltx_tag_bibitem">[17]</span> <span class="ltx_bibblock"> T. Samak, C. Samak, S. Kandhasamy, V. Krovi, and M. Xie, “AutoDRIVE: A Comprehensive, Flexible and Integrated Digital Twin Ecosystem for Autonomous Driving Research &amp; Education,” <em class="ltx_emph ltx_font_italic" id="bib.bib17.1.1">Robotics</em>, vol. 12, no. 3, p. 77, May 2023. </span> </li> <li class="ltx_bibitem" id="bib.bib18"> <span class="ltx_tag ltx_tag_bibitem">[18]</span> <span class="ltx_bibblock"> C. V. Samak, T. V. Samak, J. M. Velni, and V. N. Krovi, “Nigel—Mechatronic Design and Robust Sim2Real Control of an Overactuated Autonomous Vehicle,” <em class="ltx_emph ltx_font_italic" id="bib.bib18.1.1">IEEE/ASME Transactions on Mechatronics</em>, vol. 29, no. 4, pp. 2785–2793, 2024. </span> </li> <li class="ltx_bibitem" id="bib.bib19"> <span class="ltx_tag ltx_tag_bibitem">[19]</span> <span class="ltx_bibblock"> M. O’Kelly, V. Sukhil, H. Abbas, J. Harkins, C. Kao, Y. V. Pant, R. Mangharam, D. Agarwal, M. Behl, P. Burgio, and M. Bertogna. (2019) F1/10: An Open-Source Autonomous Cyber-Physical Platform. </span> </li> <li class="ltx_bibitem" id="bib.bib20"> <span class="ltx_tag ltx_tag_bibitem">[20]</span> <span class="ltx_bibblock"> J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, “Proximal Policy Optimization Algorithms,” 2017. </span> </li> <li class="ltx_bibitem" id="bib.bib21"> <span class="ltx_tag ltx_tag_bibitem">[21]</span> <span class="ltx_bibblock"> C. Yu, A. Velu, E. Vinitsky, J. Gao, Y. Wang, A. Bayen, and Y. Wu, “The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games,” in <em class="ltx_emph ltx_font_italic" id="bib.bib21.1.1">Advances in Neural Information Processing Systems</em>, S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, Eds., vol. 35.   Curran Associates, Inc., 2022, pp. 24611–24624. 
</span> </li> <li class="ltx_bibitem" id="bib.bib22"> <span class="ltx_tag ltx_tag_bibitem">[22]</span> <span class="ltx_bibblock"> P. Ramachandran, B. Zoph, and Q. V. Le, “Searching for Activation Functions,” 2017. </span> </li> <li class="ltx_bibitem" id="bib.bib23"> <span class="ltx_tag ltx_tag_bibitem">[23]</span> <span class="ltx_bibblock"> V. Sezer and M. Gokasan, “A Novel Obstacle Avoidance Algorithm: “Follow the Gap Method”,” <em class="ltx_emph ltx_font_italic" id="bib.bib23.1.1">Robotics and Autonomous Systems</em>, vol. 60, no. 9, pp. 1123–1134, 2012. </span> </li> </ul> </section> <div class="ltx_pagination ltx_role_newpage"></div> </article> </div> <footer class="ltx_page_footer"> <div class="ltx_page_logo">Generated on Thu Mar 20 01:09:40 2025 by <a class="ltx_LaTeXML_logo" href="http://dlmf.nist.gov/LaTeXML/"><span style="letter-spacing:-0.2em; margin-right:0.1em;">L<span class="ltx_font_smallcaps" style="position:relative; bottom:2.2pt;">a</span>T<span class="ltx_font_smallcaps" style="font-size:120%;position:relative; bottom:-0.2ex;">e</span></span><span style="font-size:90%; position:relative; bottom:-0.2ex;">XML</span></a>
</div></footer> </div> </body> </html>
