<!DOCTYPE html> <html lang="en"> <head> <meta content="text/html; charset=utf-8" http-equiv="content-type"/> <title>LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions</title> <!--Generated on Tue Mar 18 01:49:46 2025 by LaTeXML (version 0.8.8) http://dlmf.nist.gov/LaTeXML/.--> <meta content="width=device-width, initial-scale=1, shrink-to-fit=no" name="viewport"/> <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/ar5iv.0.7.9.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/ar5iv-fonts.0.7.9.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/latexml_styles.css" rel="stylesheet" type="text/css"/> <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/js/bootstrap.bundle.min.js"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/html2canvas/1.3.3/html2canvas.min.js"></script> <script src="/static/browse/0.3.4/js/addons_new.js"></script> <script src="/static/browse/0.3.4/js/feedbackOverlay.js"></script> <meta content=" Internet of Things, large language model, split federated learning. 
" lang="en" name="keywords"/> <base href="/html/2503.13819v1/"/></head> <body> <nav class="ltx_page_navbar"> <nav class="ltx_TOC"> <ol class="ltx_toclist"> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S1" title="In LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">I </span><span class="ltx_text ltx_font_smallcaps">Introduction</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S2" title="In LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">II </span><span class="ltx_text ltx_font_smallcaps">LLM-Empowered IoT for 6G Networks</span></span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S2.SS1" title="In II LLM-Empowered IoT for 6G Networks ‣ LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">II-A</span> </span><span class="ltx_text ltx_font_italic">6G IoT</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S2.SS2" title="In II LLM-Empowered IoT for 6G Networks ‣ LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">II-B</span> </span><span class="ltx_text ltx_font_italic">Large Language Model</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S2.SS3" title="In II LLM-Empowered IoT for 6G Networks ‣ LLM-Empowered IoT for 6G Networks: 
Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">II-C</span> </span><span class="ltx_text ltx_font_italic">LLM-Empowered IoT Architecture</span></span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S3" title="In LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">III </span><span class="ltx_text ltx_font_smallcaps">LLM for 6G IoT</span></span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S3.SS1" title="In III LLM for 6G IoT ‣ LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">III-A</span> </span><span class="ltx_text ltx_font_italic">LLMs Empower IoT Applications</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S3.SS2" title="In III LLM for 6G IoT ‣ LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">III-B</span> </span><span class="ltx_text ltx_font_italic">LLMs Enhance IoT Management</span></span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S4" title="In LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">IV </span><span class="ltx_text ltx_font_smallcaps">LLM on 6G IoT</span></span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"> <a class="ltx_ref" 
href="https://arxiv.org/html/2503.13819v1#S4.SS1" title="In IV LLM on 6G IoT ‣ LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">IV-A</span> </span><span class="ltx_text ltx_font_italic">Edge Fine-Tuning</span></span></a> <ol class="ltx_toclist ltx_toclist_subsection"> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S4.SS1.SSS1" title="In IV-A Edge Fine-Tuning ‣ IV LLM on 6G IoT ‣ LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">IV-A</span>1 </span>Parameter-Efficient Fine-Tuning</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S4.SS1.SSS2" title="In IV-A Edge Fine-Tuning ‣ IV LLM on 6G IoT ‣ LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">IV-A</span>2 </span>Distributed Learning Framework</span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_subsection"> <a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S4.SS2" title="In IV LLM on 6G IoT ‣ LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">IV-B</span> </span><span class="ltx_text ltx_font_italic">Edge Inference</span></span></a> <ol class="ltx_toclist ltx_toclist_subsection"> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S4.SS2.SSS1" title="In IV-B Edge Inference ‣ IV LLM on 6G IoT ‣ LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span 
class="ltx_tag ltx_tag_ref"><span class="ltx_text">IV-B</span>1 </span>On-Device Inference</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsubsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S4.SS2.SSS2" title="In IV-B Edge Inference ‣ IV LLM on 6G IoT ‣ LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">IV-B</span>2 </span>Co-Inference</span></a></li> </ol> </li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S5" title="In LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">V </span><span class="ltx_text ltx_font_smallcaps">Memory-Efficient Split Federated Learning for LLM Fine-Tuning</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S6" title="In LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">VI </span><span class="ltx_text ltx_font_smallcaps">Case Study</span></span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S6.SS1" title="In VI Case Study ‣ LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">VI-A</span> </span><span class="ltx_text ltx_font_italic">Considered Scenario</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S6.SS2" title="In VI Case Study ‣ LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text 
ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">VI-B</span> </span><span class="ltx_text ltx_font_italic">Simulation Results</span></span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S7" title="In LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">VII </span><span class="ltx_text ltx_font_smallcaps">Open Issues</span></span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S7.SS1" title="In VII Open Issues ‣ LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">VII-A</span> </span><span class="ltx_text ltx_font_italic">Limited and Heterogeneous Resources in IoT Devices</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S7.SS2" title="In VII Open Issues ‣ LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">VII-B</span> </span><span class="ltx_text ltx_font_italic">On-Demand Deployment of LLMs</span></span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S7.SS3" title="In VII Open Issues ‣ LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref"><span class="ltx_text">VII-C</span> </span><span class="ltx_text ltx_font_italic">Privacy and Data Security Risks</span></span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" 
href="https://arxiv.org/html/2503.13819v1#S8" title="In LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">VIII </span><span class="ltx_text ltx_font_smallcaps">Conclusion</span></span></a></li> </ol></nav> </nav> <div class="ltx_page_main"> <div class="ltx_page_content"> <article class="ltx_document ltx_authors_1line"> <h1 class="ltx_title ltx_title_document"> LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions</h1> <div class="ltx_authors"> <span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Xiaopei Chen, Wen Wu, Zuguang Li, Liang Li, Fei Ji </span><span class="ltx_author_notes">Xiaopei Chen is with the School of Future Technology, South China University of Technology, Guangzhou 511442, China, and also with the Frontier Research Center, Peng Cheng Laboratory, Shenzhen 518000, China (e-mail: ftchenxp@mail.scut.edu.cn). Wen Wu, Zuguang Li, and Liang Li are with the Frontier Research Center, Peng Cheng Laboratory, Shenzhen 518000, China (e-mail: {wuw02, lizg01, lil03}@pcl.ac.cn). Fei Ji is with the School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510640, China (e-mail: eefeiji@scut.edu.cn). </span></span> </div> <div class="ltx_abstract"> <h6 class="ltx_title ltx_title_abstract">Abstract</h6> <p class="ltx_p" id="id1.id1">The Internet of Things (IoT) in the sixth generation (6G) era is envisioned to evolve towards intelligence, ubiquity, and self-optimization. Large language models (LLMs) have demonstrated remarkable generalization capabilities across diverse domains, including natural language processing (NLP), computer vision (CV), and beyond. In this article, we propose an LLM-empowered IoT architecture for 6G networks to achieve intelligent autonomy while supporting advanced IoT applications. 
LLMs are pushed to the edge of the 6G network to support the synergy of LLMs and IoT. On the one hand, LLM solutions are tailored to both IoT application requirements and IoT management needs, i.e., LLM for IoT. On the other hand, edge inference and edge fine-tuning are discussed to support the deployment of LLMs, i.e., LLM on IoT. Furthermore, we propose a memory-efficient split federated learning (SFL) framework for LLM fine-tuning on heterogeneous IoT devices that alleviates memory pressure on both IoT devices and the edge server while achieving comparable performance and convergence time. Finally, a case study is presented, followed by a discussion of open issues in LLM-empowered IoT for 6G networks.</p> </div> <div class="ltx_keywords"> <h6 class="ltx_title ltx_title_keywords">Index Terms: </h6> Internet of Things, large language model, split federated learning. </div> <section class="ltx_section" id="S1"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">I </span><span class="ltx_text ltx_font_smallcaps" id="S1.1.1">Introduction</span> </h2> <div class="ltx_para" id="S1.p1"> <p class="ltx_p" id="S1.p1.1">Compared with fifth-generation (5G) networks, sixth-generation (6G) networks are expected to bring revolutionary breakthroughs in ultra-low latency, ultra-high bandwidth, intelligence, and enhanced autonomy <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#bib.bib1" title=""><span class="ltx_text" style="font-size:80%;">1</span></a>]</cite>. 
6G networks will rely on cutting-edge technologies such as terahertz communication, intelligent reflective surfaces, space-air-ground integrated networks, and edge intelligence to realize the new era of the interconnected intelligence of all things <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#bib.bib2" title=""><span class="ltx_text" style="font-size:80%;">2</span></a>, <a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#bib.bib3" title=""><span class="ltx_text" style="font-size:80%;">3</span></a>]</cite>. These features will not only promote the upgrading of human-computer interaction and data-driven applications but also profoundly change the computing mode and intelligence level of the Internet of Things (IoT). As a core application of the 6G network, IoT will usher in a profound transformation in three aspects: massive device access, intelligent data processing, and automated decision-making.</p> </div> <div class="ltx_para" id="S1.p2"> <p class="ltx_p" id="S1.p2.1">Riding the wave of advancements in Artificial Intelligence (AI), we are witnessing the eruption of Transformer-based large language models (LLMs) such as GPT and LLaMA, which have demonstrated remarkable generalization capabilities across diverse domains, including natural language processing (NLP), computer vision (CV), and beyond. With the advent of the 6G era, LLMs are expected to enhance the intelligence of IoT.</p> </div> <div class="ltx_para" id="S1.p3"> <p class="ltx_p" id="S1.p3.1">The success of LLMs is fundamentally driven by their scalability. Unlike earlier approaches, LLMs exhibit continuous improvements in accuracy and generalization as data volume or parameter count increases, all while maintaining the same underlying simple algorithms and architectures. 
The scaling law <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#bib.bib4" title=""><span class="ltx_text" style="font-size:80%;">4</span></a>]</cite> formalizes this phenomenon, demonstrating how the performance of transformer-based models improves in a predictable manner as model size and dataset scale expand. The IoT can provide massive data to motivate the development of modern LLMs. Unfortunately, most existing LLM products are heavily dependent on cloud computing, which comes with significant drawbacks, including excessive latency, high bandwidth costs, and serious privacy concerns. On the other hand, the enormous memory requirements and computational demands of LLMs make direct deployment on resource-constrained IoT devices infeasible.</p> </div> <div class="ltx_para" id="S1.p4"> <p class="ltx_p" id="S1.p4.1">In this article, we propose an LLM-empowered IoT architecture for 6G networks to achieve intelligent autonomy while supporting advanced IoT applications. LLMs are pushed to the edge of the 6G network to achieve the synergy between LLMs and IoT. The synergy of LLM and IoT in the proposed architecture is two-fold: On one hand, LLMs are applied to empower IoT applications and enhance IoT network management, namely LLM for 6G IoT. On the other hand, LLMs are expected to be deployed to support IoT applications in IoT environments through edge fine-tuning methods and edge inference techniques, namely LLM on 6G IoT. Specifically, parameter-efficient fine-tuning reduces the computational resource requirements for LLM fine-tuning, and distributed learning and collaborative inference can be adapted to the limited memory and computational power of IoT devices. 
Furthermore, we propose a split federated learning (SFL) framework for memory-efficient fine-tuning over heterogeneous IoT devices, in which each IoT device performs low-rank adaptation (LoRA) fine-tuning on only a subset of the lower layers of the pre-trained LLM, tailored to its individual capacity. The server maintains a full LLM and selectively fine-tunes the corresponding LoRA modules sequentially for each IoT device. The case study demonstrates that the proposed scheme can reduce memory footprint and training time while achieving comparable performance.</p> </div> <figure class="ltx_figure" id="S1.F1"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="269" id="S1.F1.g1" src="extracted/6288571/fig/1.jpg" width="479"/> <br class="ltx_break ltx_centering"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S1.F1.2.1.1" style="font-size:90%;">Figure 1</span>: </span><span class="ltx_text" id="S1.F1.3.2" style="font-size:90%;">LLM-empowered IoT architecture.</span></figcaption> </figure> <div class="ltx_para" id="S1.p5"> <p class="ltx_p" id="S1.p5.1">The remainder of this article is organized as follows. In Section II, some expected features of 6G IoT are discussed, and then the LLM-empowered IoT architecture is proposed. The basic ideas of LLM for 6G IoT and LLM on 6G IoT are presented in Section III and Section IV, respectively. Section V presents the memory-efficient SFL framework. Section VI provides a case study. 
Open issues are discussed in Section VII, and the conclusions are summarized in Section VIII.</p> </div> </section> <section class="ltx_section" id="S2"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">II </span><span class="ltx_text ltx_font_smallcaps" id="S2.1.1">LLM-Empowered IoT for 6G Networks</span> </h2> <section class="ltx_subsection" id="S2.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S2.SS1.4.1.1">II-A</span> </span><span class="ltx_text ltx_font_italic" id="S2.SS1.5.2">6G IoT</span> </h3> <div class="ltx_para" id="S2.SS1.p1"> <p class="ltx_p" id="S2.SS1.p1.1">The IoT has become a core driver of digital connectivity, playing an important role in health care, smart cities, industrial automation, and many other areas. In the vision of 6G <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#bib.bib1" title=""><span class="ltx_text" style="font-size:80%;">1</span></a>]</cite>, the IoT will have the following core features:</p> <ul class="ltx_itemize" id="S2.I1"> <li class="ltx_item" id="S2.I1.i1" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S2.I1.i1.p1"> <p class="ltx_p" id="S2.I1.i1.p1.1"><span class="ltx_text ltx_font_bold" id="S2.I1.i1.p1.1.1">Immersive Communication</span>: 6G provides terabit-per-second (Tbps) data rates, enabling efficient transmission of ultra-high-definition 8K/16K videos and holographic data streams.</p> </div> </li> <li class="ltx_item" id="S2.I1.i2" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S2.I1.i2.p1"> <p class="ltx_p" id="S2.I1.i2.p1.1"><span class="ltx_text ltx_font_bold" id="S2.I1.i2.p1.1.1">Massive communication</span>: 6G IoT will support the connections of <math alttext="10^{6}-10^{8}" class="ltx_Math" display="inline" id="S2.I1.i2.p1.1.m1.1"><semantics 
id="S2.I1.i2.p1.1.m1.1a"><mrow id="S2.I1.i2.p1.1.m1.1.1" xref="S2.I1.i2.p1.1.m1.1.1.cmml"><msup id="S2.I1.i2.p1.1.m1.1.1.2" xref="S2.I1.i2.p1.1.m1.1.1.2.cmml"><mn id="S2.I1.i2.p1.1.m1.1.1.2.2" xref="S2.I1.i2.p1.1.m1.1.1.2.2.cmml">10</mn><mn id="S2.I1.i2.p1.1.m1.1.1.2.3" xref="S2.I1.i2.p1.1.m1.1.1.2.3.cmml">6</mn></msup><mo id="S2.I1.i2.p1.1.m1.1.1.1" xref="S2.I1.i2.p1.1.m1.1.1.1.cmml">−</mo><msup id="S2.I1.i2.p1.1.m1.1.1.3" xref="S2.I1.i2.p1.1.m1.1.1.3.cmml"><mn id="S2.I1.i2.p1.1.m1.1.1.3.2" xref="S2.I1.i2.p1.1.m1.1.1.3.2.cmml">10</mn><mn id="S2.I1.i2.p1.1.m1.1.1.3.3" xref="S2.I1.i2.p1.1.m1.1.1.3.3.cmml">8</mn></msup></mrow><annotation-xml encoding="MathML-Content" id="S2.I1.i2.p1.1.m1.1b"><apply id="S2.I1.i2.p1.1.m1.1.1.cmml" xref="S2.I1.i2.p1.1.m1.1.1"><minus id="S2.I1.i2.p1.1.m1.1.1.1.cmml" xref="S2.I1.i2.p1.1.m1.1.1.1"></minus><apply id="S2.I1.i2.p1.1.m1.1.1.2.cmml" xref="S2.I1.i2.p1.1.m1.1.1.2"><csymbol cd="ambiguous" id="S2.I1.i2.p1.1.m1.1.1.2.1.cmml" xref="S2.I1.i2.p1.1.m1.1.1.2">superscript</csymbol><cn id="S2.I1.i2.p1.1.m1.1.1.2.2.cmml" type="integer" xref="S2.I1.i2.p1.1.m1.1.1.2.2">10</cn><cn id="S2.I1.i2.p1.1.m1.1.1.2.3.cmml" type="integer" xref="S2.I1.i2.p1.1.m1.1.1.2.3">6</cn></apply><apply id="S2.I1.i2.p1.1.m1.1.1.3.cmml" xref="S2.I1.i2.p1.1.m1.1.1.3"><csymbol cd="ambiguous" id="S2.I1.i2.p1.1.m1.1.1.3.1.cmml" xref="S2.I1.i2.p1.1.m1.1.1.3">superscript</csymbol><cn id="S2.I1.i2.p1.1.m1.1.1.3.2.cmml" type="integer" xref="S2.I1.i2.p1.1.m1.1.1.3.2">10</cn><cn id="S2.I1.i2.p1.1.m1.1.1.3.3.cmml" type="integer" xref="S2.I1.i2.p1.1.m1.1.1.3.3">8</cn></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.I1.i2.p1.1.m1.1c">10^{6}-10^{8}</annotation><annotation encoding="application/x-llamapun" id="S2.I1.i2.p1.1.m1.1d">10 start_POSTSUPERSCRIPT 6 end_POSTSUPERSCRIPT - 10 start_POSTSUPERSCRIPT 8 end_POSTSUPERSCRIPT</annotation></semantics></math> devices per square kilometer, far exceeding the capacity of 5G.</p> </div> </li> <li 
class="ltx_item" id="S2.I1.i3" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S2.I1.i3.p1"> <p class="ltx_p" id="S2.I1.i3.p1.1"><span class="ltx_text ltx_font_bold" id="S2.I1.i3.p1.1.1">Ubiquitous connectivity</span>: Leveraging the space-air-ground integrated networks, massive MIMO, intelligent reflective surfaces, etc., the 6G network will go beyond the capabilities of 5G to realize ubiquitous connectivity to support the IoT on a global scale.</p> </div> </li> <li class="ltx_item" id="S2.I1.i4" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S2.I1.i4.p1"> <p class="ltx_p" id="S2.I1.i4.p1.2"><span class="ltx_text ltx_font_bold" id="S2.I1.i4.p1.2.1">Hyper reliable and low-latency communication</span>: 6G can achieve <math alttext="0.1-1" class="ltx_Math" display="inline" id="S2.I1.i4.p1.1.m1.1"><semantics id="S2.I1.i4.p1.1.m1.1a"><mrow id="S2.I1.i4.p1.1.m1.1.1" xref="S2.I1.i4.p1.1.m1.1.1.cmml"><mn id="S2.I1.i4.p1.1.m1.1.1.2" xref="S2.I1.i4.p1.1.m1.1.1.2.cmml">0.1</mn><mo id="S2.I1.i4.p1.1.m1.1.1.1" xref="S2.I1.i4.p1.1.m1.1.1.1.cmml">−</mo><mn id="S2.I1.i4.p1.1.m1.1.1.3" xref="S2.I1.i4.p1.1.m1.1.1.3.cmml">1</mn></mrow><annotation-xml encoding="MathML-Content" id="S2.I1.i4.p1.1.m1.1b"><apply id="S2.I1.i4.p1.1.m1.1.1.cmml" xref="S2.I1.i4.p1.1.m1.1.1"><minus id="S2.I1.i4.p1.1.m1.1.1.1.cmml" xref="S2.I1.i4.p1.1.m1.1.1.1"></minus><cn id="S2.I1.i4.p1.1.m1.1.1.2.cmml" type="float" xref="S2.I1.i4.p1.1.m1.1.1.2">0.1</cn><cn id="S2.I1.i4.p1.1.m1.1.1.3.cmml" type="integer" xref="S2.I1.i4.p1.1.m1.1.1.3">1</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.I1.i4.p1.1.m1.1c">0.1-1</annotation><annotation encoding="application/x-llamapun" id="S2.I1.i4.p1.1.m1.1d">0.1 - 1</annotation></semantics></math> ms latency and <math alttext="99.9999\%" class="ltx_Math" display="inline" id="S2.I1.i4.p1.2.m2.1"><semantics id="S2.I1.i4.p1.2.m2.1a"><mrow 
id="S2.I1.i4.p1.2.m2.1.1" xref="S2.I1.i4.p1.2.m2.1.1.cmml"><mn id="S2.I1.i4.p1.2.m2.1.1.2" xref="S2.I1.i4.p1.2.m2.1.1.2.cmml">99.9999</mn><mo id="S2.I1.i4.p1.2.m2.1.1.1" xref="S2.I1.i4.p1.2.m2.1.1.1.cmml">%</mo></mrow><annotation-xml encoding="MathML-Content" id="S2.I1.i4.p1.2.m2.1b"><apply id="S2.I1.i4.p1.2.m2.1.1.cmml" xref="S2.I1.i4.p1.2.m2.1.1"><csymbol cd="latexml" id="S2.I1.i4.p1.2.m2.1.1.1.cmml" xref="S2.I1.i4.p1.2.m2.1.1.1">percent</csymbol><cn id="S2.I1.i4.p1.2.m2.1.1.2.cmml" type="float" xref="S2.I1.i4.p1.2.m2.1.1.2">99.9999</cn></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.I1.i4.p1.2.m2.1c">99.9999\%</annotation><annotation encoding="application/x-llamapun" id="S2.I1.i4.p1.2.m2.1d">99.9999 %</annotation></semantics></math> reliability, ensuring real-time response and improving the communication stability of IoT devices in scenarios such as telemedicine and industrial automation.</p> </div> </li> <li class="ltx_item" id="S2.I1.i5" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S2.I1.i5.p1"> <p class="ltx_p" id="S2.I1.i5.p1.1"><span class="ltx_text ltx_font_bold" id="S2.I1.i5.p1.1.1">Integrated sensing and communication</span>: 6G IoT devices will be equipped with wireless positioning, radar sensing, target detection, and other functions. They will provide centimeter-level positioning to support high-precision navigation applications.</p> </div> </li> <li class="ltx_item" id="S2.I1.i6" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S2.I1.i6.p1"> <p class="ltx_p" id="S2.I1.i6.p1.1"><span class="ltx_text ltx_font_bold" id="S2.I1.i6.p1.1.1">AI and communication</span>: 6G IoT will deeply integrate AI to automatically optimize resource allocation, signal transmission, and device management, thereby improving network efficiency. 
In addition, AI services will be pushed down to edge devices, enabling edge intelligence services and reducing dependence on the cloud.</p> </div> </li> </ul> </div> <figure class="ltx_figure" id="S2.F2"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="338" id="S2.F2.g1" src="extracted/6288571/fig/2.jpg" width="538"/> <br class="ltx_break ltx_centering"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S2.F2.2.1.1" style="font-size:90%;">Figure 2</span>: </span><span class="ltx_text" id="S2.F2.3.2" style="font-size:90%;">LLM for 6G IoT.</span></figcaption> </figure> </section> <section class="ltx_subsection" id="S2.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S2.SS2.4.1.1">II-B</span> </span><span class="ltx_text ltx_font_italic" id="S2.SS2.5.2">Large Language Model</span> </h3> <div class="ltx_para" id="S2.SS2.p1"> <p class="ltx_p" id="S2.SS2.p1.1">As one of the most prominent achievements of contemporary AI, LLMs have demonstrated strong capabilities across a variety of vertical domains. An LLM usually consists of an embedding layer, multiple transformer layers, and an output layer <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#bib.bib5" title=""><span class="ltx_text" style="font-size:80%;">5</span></a>]</cite>. The impressive adaptability of LLMs stems from the pre-training and fine-tuning paradigm. The pre-training phase uses massive unlabeled, publicly available data to learn general linguistic knowledge through self-supervision. This phase requires significant resources, including not only computing processors such as GPUs and TPUs but also memory, energy, and network bandwidth. 
For example, pre-training LLaMA-2-70B takes 2 trillion data tokens and 1.7 million GPU hours, and consumes <math alttext="2.5\times 10^{12}" class="ltx_Math" display="inline" id="S2.SS2.p1.1.m1.1"><semantics id="S2.SS2.p1.1.m1.1a"><mrow id="S2.SS2.p1.1.m1.1.1" xref="S2.SS2.p1.1.m1.1.1.cmml"><mn id="S2.SS2.p1.1.m1.1.1.2" xref="S2.SS2.p1.1.m1.1.1.2.cmml">2.5</mn><mo id="S2.SS2.p1.1.m1.1.1.1" lspace="0.222em" rspace="0.222em" xref="S2.SS2.p1.1.m1.1.1.1.cmml">×</mo><msup id="S2.SS2.p1.1.m1.1.1.3" xref="S2.SS2.p1.1.m1.1.1.3.cmml"><mn id="S2.SS2.p1.1.m1.1.1.3.2" xref="S2.SS2.p1.1.m1.1.1.3.2.cmml">10</mn><mn id="S2.SS2.p1.1.m1.1.1.3.3" xref="S2.SS2.p1.1.m1.1.1.3.3.cmml">12</mn></msup></mrow><annotation-xml encoding="MathML-Content" id="S2.SS2.p1.1.m1.1b"><apply id="S2.SS2.p1.1.m1.1.1.cmml" xref="S2.SS2.p1.1.m1.1.1"><times id="S2.SS2.p1.1.m1.1.1.1.cmml" xref="S2.SS2.p1.1.m1.1.1.1"></times><cn id="S2.SS2.p1.1.m1.1.1.2.cmml" type="float" xref="S2.SS2.p1.1.m1.1.1.2">2.5</cn><apply id="S2.SS2.p1.1.m1.1.1.3.cmml" xref="S2.SS2.p1.1.m1.1.1.3"><csymbol cd="ambiguous" id="S2.SS2.p1.1.m1.1.1.3.1.cmml" xref="S2.SS2.p1.1.m1.1.1.3">superscript</csymbol><cn id="S2.SS2.p1.1.m1.1.1.3.2.cmml" type="integer" xref="S2.SS2.p1.1.m1.1.1.3.2">10</cn><cn id="S2.SS2.p1.1.m1.1.1.3.3.cmml" type="integer" xref="S2.SS2.p1.1.m1.1.1.3.3">12</cn></apply></apply></annotation-xml><annotation encoding="application/x-tex" id="S2.SS2.p1.1.m1.1c">2.5\times 10^{12}</annotation><annotation encoding="application/x-llamapun" id="S2.SS2.p1.1.m1.1d">2.5 × 10 start_POSTSUPERSCRIPT 12 end_POSTSUPERSCRIPT</annotation></semantics></math> J of energy. Subsequently, the pre-trained models are fine-tuned to acquire domain-specific knowledge, thus enhancing their generalization capabilities and applicability to specific tasks. 
Compared with pre-training, fine-tuning uses only a small amount of task-specific data (license required) and consumes fewer resources.</p> </div> <div class="ltx_para" id="S2.SS2.p2"> <p class="ltx_p" id="S2.SS2.p2.1">A pressing challenge has arisen alongside the rapid advancement of LLMs: projections indicate that high-quality public datasets may be exhausted by 2026 <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#bib.bib6" title=""><span class="ltx_text" style="font-size:80%;">6</span></a>]</cite>. The growing reliance on combining existing datasets or leveraging model-generated data, rather than curating new datasets, underscores the increasing scarcity of publicly available data. Given that established scaling laws suggest larger datasets generally yield superior performance, this shortage could soon pose a significant bottleneck to the continued development of LLMs.</p> </div> </section> <section class="ltx_subsection" id="S2.SS3"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S2.SS3.4.1.1">II-C</span> </span><span class="ltx_text ltx_font_italic" id="S2.SS3.5.2">LLM-Empowered IoT Architecture</span> </h3> <div class="ltx_para" id="S2.SS3.p1"> <p class="ltx_p" id="S2.SS3.p1.1">IoT devices continuously generate real-time, private, multimodal data (e.g., images, speech, and sensor data), yet they require powerful AI capabilities to manage resources efficiently and deliver intelligent services to users. On the other hand, as mentioned previously, LLMs possess exceptional generalization and reasoning capabilities but face limitations due to data scarcity. 
By bridging this gap, a synergistic integration between IoT and LLMs can unlock new possibilities for intelligent applications.</p> </div> <div class="ltx_para" id="S2.SS3.p2"> <p class="ltx_p" id="S2.SS3.p2.1">In alignment with the IoT vision for the 6G era, we propose an LLM-empowered IoT architecture to overcome the aforementioned challenges. As shown in Fig. 1, the architecture integrates LLMs within IoT ecosystems across multiple network layers, including space, air, and ground networks. The architecture enables efficient AI processing by distributing LLMs across different computing infrastructures, from cloud servers to edge servers and IoT devices. The cloud server serves as a centralized intelligence hub, performing deep learning tasks and managing LLM updates. The edge servers reduce reliance on cloud computing by handling real-time inference and adaptation. The IoT devices interact with LLMs for real-time decision-making, automation, and data analytics.</p> </div> <div class="ltx_para" id="S2.SS3.p3"> <p class="ltx_p" id="S2.SS3.p3.1">The synergy of LLM and IoT in the proposed architecture is two-fold: LLM for 6G IoT and LLM on 6G IoT. In the following, we will illustrate the basic ideas of LLM for 6G IoT in Section III and LLM on 6G IoT in Section IV, respectively.</p> </div> </section> </section> <section class="ltx_section" id="S3"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">III </span><span class="ltx_text ltx_font_smallcaps" id="S3.1.1">LLM for 6G IoT</span> </h2> <div class="ltx_para" id="S3.p1"> <p class="ltx_p" id="S3.p1.1">As shown in Fig. 
2, LLMs for 6G IoT can be categorized into two key aspects: LLMs empower IoT applications and enhance IoT management.</p> </div> <section class="ltx_subsection" id="S3.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S3.SS1.4.1.1">III-A</span> </span><span class="ltx_text ltx_font_italic" id="S3.SS1.5.2">LLMs Empower IoT Applications</span> </h3> <div class="ltx_para" id="S3.SS1.p1"> <p class="ltx_p" id="S3.SS1.p1.1">With the advancement of 6G networks, edge computing, and AI algorithms, the application of LLMs in the IoT field will be more extensive, providing strong support for the future smart society. As shown in Fig. 2, three representative applications are introduced as follows:</p> <ul class="ltx_itemize" id="S3.I1"> <li class="ltx_item" id="S3.I1.i1" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S3.I1.i1.p1"> <p class="ltx_p" id="S3.I1.i1.p1.1">Smart healthcare: IoT is enhancing healthcare by improving real-time monitoring. LLMs can parse medical records, medical papers, and clinical data to assist doctors in making diagnoses and recommending the best treatment options to improve the accuracy of medical decisions. Combined with data from wearable devices, LLM can analyze a patient’s health status in real time, predict disease risk, and provide personalized health advice for early disease warning.</p> </div> </li> <li class="ltx_item" id="S3.I1.i2" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S3.I1.i2.p1"> <p class="ltx_p" id="S3.I1.i2.p1.1">Smart home: LLMs can enhance the interactivity and autonomy of smart homes, making home devices smarter and more personalized. 
With the powerful NLP capabilities, LLM can be used in smart speakers and voice assistants (e.g., Alexa, Google Assistant) to achieve smoother, context-aware natural language understanding, enabling users to control home appliances, lights, security systems, etc., through voice or text commands. By learning from long-term user data, LLM can predict user habits to enhance the user experience, such as automatically adjusting the temperature and lighting. Combining computer vision and sensor data, LLM can analyze data from home cameras and door lock sensors to identify abnormal behavior and improve home security.</p> </div> </li> <li class="ltx_item" id="S3.I1.i3" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S3.I1.i3.p1"> <p class="ltx_p" id="S3.I1.i3.p1.1">Smart city: IoT plays a crucial role in the development of smart cities by enabling intelligent infrastructure, real-time monitoring, and efficient resource management. LLMs can improve the efficiency of urban management and promote the development of smart cities by understanding and analyzing urban big data. For example, based on traffic sensors, cameras, and GPS data from IoT, LLM can analyze urban traffic flow in time, optimize traffic light scheduling, reduce congestion, and improve commuting efficiency.</p> </div> </li> </ul> </div> </section> <section class="ltx_subsection" id="S3.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S3.SS2.4.1.1">III-B</span> </span><span class="ltx_text ltx_font_italic" id="S3.SS2.5.2">LLMs Enhance IoT Management</span> </h3> <div class="ltx_para" id="S3.SS2.p1"> <p class="ltx_p" id="S3.SS2.p1.1">IoT empowered by LLMs facilitates self-adaptive optimization through dynamic power adjustment, transmission strategy optimization, and intelligent task scheduling, thereby enhancing system robustness and energy efficiency. The main processes are shown in Fig. 2. 
First, IoT devices (e.g., smart sensors, cameras, communication terminals) continuously generate multimodal information, such as logs, location information, signal data, and sensor data, and send it to edge computing nodes for processing. Edge servers pre-process data from IoT devices and extract key information to reduce the overhead and latency of data transmission. For example, the edge server estimates the channel from the IoT device to the edge server based on the information provided by a local LLM, such as location, logs, and signal strength. Because the network is dynamic, the resource status information received at the cloud server is not real-time and therefore needs to be predicted at the edge based on the available data. After the cloud server receives all the global information and the provided prompts (e.g., optimization goals), it uses the reasoning capability of LLMs to generate resource optimization strategies. The strategies are cascaded down, and each edge server and IoT device adjusts its resource occupancy according to the strategies.</p> </div> <div class="ltx_para" id="S3.SS2.p2"> <p class="ltx_p" id="S3.SS2.p2.1">In contrast to most traditional approaches based on optimization theory techniques, which only work with appropriate mathematical models, LLMs can be adapted to different scenarios by learning directly from data without explicit mathematical modeling.</p> </div> </section> </section> <section class="ltx_section" id="S4"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">IV </span><span class="ltx_text ltx_font_smallcaps" id="S4.1.1">LLM on 6G IoT</span> </h2> <div class="ltx_para" id="S4.p1"> <p class="ltx_p" id="S4.p1.1">This section explores the key technologies that enable LLMs to be deployed on IoT.</p> </div> <section class="ltx_subsection" id="S4.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span 
class="ltx_text" id="S4.SS1.4.1.1">IV-A</span> </span><span class="ltx_text ltx_font_italic" id="S4.SS1.5.2">Edge Fine-Tuning</span> </h3> <div class="ltx_para" id="S4.SS1.p1"> <p class="ltx_p" id="S4.SS1.p1.1">LLMs pre-trained in the cloud are deployed within IoT environments for fine-tuning, enabling them to learn domain-specific knowledge from IoT-generated data and adapt to real-world applications more effectively.</p> </div> <section class="ltx_subsubsection" id="S4.SS1.SSS1"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection"><span class="ltx_text" id="S4.SS1.SSS1.4.1.1">IV-A</span>1 </span>Parameter-Efficient Fine-Tuning</h4> <div class="ltx_para" id="S4.SS1.SSS1.p1"> <p class="ltx_p" id="S4.SS1.SSS1.p1.1">Given the limited resources of IoT devices and edge servers, full fine-tuning (i.e., training all parameters) of LLMs is constrained by insufficient computing resources. Moreover, in federated learning, full fine-tuning incurs high communication costs because the full model parameters must be transmitted. To this end, parameter-efficient fine-tuning (PEFT) can be applied. PEFT reduces computational requirements by updating only a small portion of the parameters and freezing most of the parameters of the pre-trained model. As shown in Fig. 3, PEFT can be categorized into three types: additive PEFT, selective PEFT, and reparameterization PEFT. Specifically, additive PEFT inserts a small number of trainable parameters at strategically chosen positions within the model architecture. Selective PEFT selects a subset of the original parameters for training. Reparameterization PEFT reformulates a model’s architecture by transforming its parameters. 
For example, low-rank adaptation (LoRA) <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#bib.bib7" title=""><span class="ltx_text" style="font-size:80%;">7</span></a>]</cite> freezes the pre-trained weights and learns each weight update as the product of two low-rank matrices.</p> </div> <figure class="ltx_figure" id="S4.F3"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_square" height="245" id="S4.F3.g1" src="extracted/6288571/fig/FT.jpg" width="240"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F3.2.1.1" style="font-size:90%;">Figure 3</span>: </span><span class="ltx_text" id="S4.F3.3.2" style="font-size:90%;">Parameter-efficient fine-tuning</span></figcaption> </figure> <div class="ltx_para" id="S4.SS1.SSS1.p2"> <p class="ltx_p" id="S4.SS1.SSS1.p2.1">Although PEFT methods can reduce computational overhead, they still store intermediate activations to compute the gradients of the trainable parameters, which requires substantial memory (over 70% of that of full fine-tuning). For most IoT devices, on-device LLM fine-tuning is therefore still resource-intensive. On the other hand, collecting the raw data from IoT devices at an edge server raises privacy concerns.</p> </div> </section> <section class="ltx_subsubsection" id="S4.SS1.SSS2"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection"><span class="ltx_text" id="S4.SS1.SSS2.4.1.1">IV-A</span>2 </span>Distributed Learning Framework</h4> <div class="ltx_para" id="S4.SS1.SSS2.p1"> <p class="ltx_p" id="S4.SS1.SSS2.p1.1">Therefore, distributed learning with multi-device collaboration or server-device collaboration is a promising solution. Fig. 
4 illustrates the distributed learning frameworks for edge fine-tuning.</p> <ul class="ltx_itemize" id="S4.I1"> <li class="ltx_item" id="S4.I1.i1" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S4.I1.i1.p1"> <p class="ltx_p" id="S4.I1.i1.p1.1">Multi-device split learning: The pre-trained model is partitioned into multiple submodels deployed on a set of devices within the trusted domain. Data from all devices is sent to the device that holds the input layer of the model, and the model is trained through a well-designed pipeline mechanism.</p> </div> </li> <li class="ltx_item" id="S4.I1.i2" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S4.I1.i2.p1"> <p class="ltx_p" id="S4.I1.i2.p1.1">Server-device split learning: The pre-trained model is split into a client-side submodel and a server-side submodel. After a client finishes training, it uploads its client-side submodel to the server to update the overall model. 
After that, the server splits the model and sends it to the next client to continue training.</p> </div> </li> <li class="ltx_item" id="S4.I1.i3" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S4.I1.i3.p1"> <p class="ltx_p" id="S4.I1.i3.p1.1">Split federated learning: Split federated learning (SFL) combines the principles of federated learning (FL) and split learning (SL) by splitting the model between the client and server, allowing clients to train part of the model locally while offloading the remaining computation to the server, thus enhancing privacy and reducing client-side memory usage.</p> </div> </li> <li class="ltx_item" id="S4.I1.i4" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S4.I1.i4.p1"> <p class="ltx_p" id="S4.I1.i4.p1.1">Transfer learning: IoT devices are deployed with lightweight LLMs or other small models, and edge servers are deployed with LLMs. Knowledge transfer between LLMs and small models is accomplished through knowledge distillation.</p> </div> </li> </ul> </div> <figure class="ltx_figure" id="S4.F4"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_square" height="292" id="S4.F4.g1" src="extracted/6288571/fig/DL.jpg" width="293"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F4.2.1.1" style="font-size:90%;">Figure 4</span>: </span><span class="ltx_text" id="S4.F4.3.2" style="font-size:90%;">Distributed learning frameworks for edge fine-tuning.</span></figcaption> </figure> <div class="ltx_para" id="S4.SS1.SSS2.p2"> <p class="ltx_p" id="S4.SS1.SSS2.p2.1">A shared objective of the aforementioned distributed learning approaches is to alleviate the memory burden and computational demands on devices by partitioning the model. 
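</p> </div> <div class="ltx_para"> <p class="ltx_p">The layer-wise partitioning shared by these frameworks can be illustrated with a minimal Python sketch. It is not the implementation of any framework above: the toy layers, the <code>split_model</code> helper, and the cut index are hypothetical names chosen for illustration, and a real system would exchange activation tensors over a network rather than call functions locally.</p> <pre class="ltx_verbatim">
```python
# Toy "model": an ordered list of layers, each a plain function on a list of floats.
# All layers here are hypothetical stand-ins for transformer layers.
def scale(values):
    return [2.0 * v for v in values]


def shift(values):
    return [v + 1.0 for v in values]


def square(values):
    return [v * v for v in values]


LAYERS = [scale, shift, square, shift]  # hypothetical 4-layer model


def split_model(layers, cut):
    """Split the layer list at index `cut` into client-side and server-side parts."""
    return layers[:cut], layers[cut:]


def forward(layers, values):
    """Run the given layers in order and return the final output."""
    for layer in layers:
        values = layer(values)
    return values


# The device holds only the first layer; the server holds the remaining layers.
client_part, server_part = split_model(LAYERS, cut=1)

raw_input = [1.0, 2.0, 3.0]
activations = forward(client_part, raw_input)   # computed on the device
prediction = forward(server_part, activations)  # computed on the server

print(activations)
print(prediction)
```
</pre> <p class="ltx_p">In this sketch only the activations at the cut layer leave the device, and the split forward pass gives the same output as running the full model in one place. Moving the cut index trades device-side memory and computation against server-side load.</p> </div> <div class="ltx_para"> <p class="ltx_p">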
Distributed learning methods combined with PEFT can effectively reduce the memory footprint, computational requirements, and communication overhead of model aggregation.</p> </div> </section> </section> <section class="ltx_subsection" id="S4.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S4.SS2.4.1.1">IV-B</span> </span><span class="ltx_text ltx_font_italic" id="S4.SS2.5.2">Edge Inference</span> </h3> <div class="ltx_para" id="S4.SS2.p1"> <p class="ltx_p" id="S4.SS2.p1.1">Existing cloud-based LLM inference risks leaking raw data. Moreover, it introduces significant latency, which conflicts with the low-latency requirements of IoT services.</p> </div> <section class="ltx_subsubsection" id="S4.SS2.SSS1"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection"><span class="ltx_text" id="S4.SS2.SSS1.4.1.1">IV-B</span>1 </span>On-Device Inference</h4> <div class="ltx_para" id="S4.SS2.SSS1.p1"> <p class="ltx_p" id="S4.SS2.SSS1.p1.1">A feasible solution for fast LLM inference is on-device inference. The large model is compressed through quantization and pruning, and the knowledge of the original model is then transferred to the compressed model through knowledge distillation. With a lightweight LLM, IoT devices perform inference locally to obtain results. However, model compression is accompanied by a decrease in accuracy, which is intolerable for applications that require high precision.</p> </div> </section> <section class="ltx_subsubsection" id="S4.SS2.SSS2"> <h4 class="ltx_title ltx_title_subsubsection"> <span class="ltx_tag ltx_tag_subsubsection"><span class="ltx_text" id="S4.SS2.SSS2.4.1.1">IV-B</span>2 </span>Co-Inference</h4> <div class="ltx_para" id="S4.SS2.SSS2.p1"> <p class="ltx_p" id="S4.SS2.SSS2.p1.1">Collaborative inference is incorporated into our proposed architecture to maintain accuracy under the resource constraints of IoT devices. 
Collaborative inference alleviates the computational burden on IoT devices by offloading workloads to a server through layer-wise model partitioning. Beyond ensuring strong privacy preservation, it can also reduce communication overhead when the size of the intermediate features at the partitioned layer is smaller than that of the raw data, making it a more efficient approach for resource-constrained IoT environments. Furthermore, similar to multi-device split learning, a multi-hop collaborative inference architecture can be designed based on the specific network topology and device resources.</p> </div> </section> </section> </section> <section class="ltx_section" id="S5"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">V </span><span class="ltx_text ltx_font_smallcaps" id="S5.1.1">Memory-Efficient Split Federated Learning for LLM Fine-Tuning</span> </h2> <div class="ltx_para" id="S5.p1"> <p class="ltx_p" id="S5.p1.1">In this section, a novel SFL framework is proposed for LLM fine-tuning, aiming to reduce memory usage.</p> </div> <div class="ltx_para" id="S5.p2"> <p class="ltx_p" id="S5.p2.1">We consider a typical two-tier IoT network, which consists of an edge server and a set of IoT devices (clients). The edge server, which has more powerful computational and memory capabilities, is primarily responsible for the fine-tuning task and manages model aggregation and splitting. The IoT devices are heterogeneous, with different computing capabilities and memory sizes. Each client has a local dataset for fine-tuning, and the local datasets of the clients are non-independent and identically distributed (non-IID). 
The edge server and the IoT devices collaboratively fine-tune a transformer-based LLM by the LoRA method <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#bib.bib7" title=""><span class="ltx_text" style="font-size:80%;">7</span></a>]</cite> for a specific downstream task.</p> </div> <figure class="ltx_figure" id="S5.F5"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="269" id="S5.F5.g1" src="extracted/6288571/fig/MESFL.jpg" width="479"/> <br class="ltx_break ltx_centering"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S5.F5.2.1.1" style="font-size:90%;">Figure 5</span>: </span><span class="ltx_text" id="S5.F5.3.2" style="font-size:90%;">An illustration of the memory-efficient SFL framework.</span></figcaption> </figure> <div class="ltx_para" id="S5.p3"> <p class="ltx_p" id="S5.p3.1">To fit the limited computing and memory resources, each client is allocated a reasonable client-side pre-trained model, as well as the corresponding client-side LoRA adapters. The server is responsible for the forward propagation and backward propagation computation of the server-side pre-trained models. Besides, the server takes charge of synchronizing the LoRA adapters, periodically aggregating the client-side LoRA adapters and the server-side LoRA adapters. The goal is to learn an optimal LoRA adapter model that minimizes the global loss function across all the clients.</p> </div> <div class="ltx_para" id="S5.p4"> <p class="ltx_p" id="S5.p4.1">As shown in Fig. 
<a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#S5.F5" title="Figure 5 ‣ V Memory-Efficient Split Federated Learning for LLM Fine-Tuning ‣ LLM-Empowered IoT for 6G Networks: Architecture, Challenges, and Solutions"><span class="ltx_text ltx_ref_tag">5</span></a>, the server maintains an entire pre-trained model and multiple server-side LoRA adapters corresponding to the client-side LoRA adapters. The training procedure consists of two phases, parallel-sequential fine-tuning and LoRA adapter aggregation:</p> <ul class="ltx_itemize" id="S5.I1"> <li class="ltx_item" id="S5.I1.i1" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S5.I1.i1.p1"> <p class="ltx_p" id="S5.I1.i1.p1.1">Parallel-sequential fine-tuning: Each client performs the forward propagation of its client-side pre-trained model locally. After the client completes the forward propagation, the activations, the corresponding labels, and the index of the split layer are uploaded to the server via wireless communication. Based on the received client information, the server loads the corresponding server-side LoRA adapters and passes the received activations into the corresponding cut layer to continue the forward propagation through the remaining model. After updating the server-side LoRA adapters of the current client, the server transmits the gradients of the activations to that client. Meanwhile, the server switches out the current LoRA adapters and loads the LoRA adapters of the next client to be trained. 
The client updates its LoRA adapters based on the received activation gradients.</p> </div> </li> <li class="ltx_item" id="S5.I1.i2" style="list-style-type:none;"> <span class="ltx_tag ltx_tag_item">•</span> <div class="ltx_para" id="S5.I1.i2.p1"> <p class="ltx_p" id="S5.I1.i2.p1.1">LoRA adapter aggregation: LoRA adapter aggregation is executed every <math alttext="I" class="ltx_Math" display="inline" id="S5.I1.i2.p1.1.m1.1"><semantics id="S5.I1.i2.p1.1.m1.1a"><mi id="S5.I1.i2.p1.1.m1.1.1" xref="S5.I1.i2.p1.1.m1.1.1.cmml">I</mi><annotation-xml encoding="MathML-Content" id="S5.I1.i2.p1.1.m1.1b"><ci id="S5.I1.i2.p1.1.m1.1.1.cmml" xref="S5.I1.i2.p1.1.m1.1.1">𝐼</ci></annotation-xml><annotation encoding="application/x-tex" id="S5.I1.i2.p1.1.m1.1c">I</annotation><annotation encoding="application/x-llamapun" id="S5.I1.i2.p1.1.m1.1d">italic_I</annotation></semantics></math> training rounds. Each client sends its client-side LoRA adapters to the server via wireless communication. The client-side LoRA adapters and the corresponding server-side LoRA adapters are combined to form a full set of LoRA adapters for aggregation. The server aggregates the full LoRA adapters of all clients into aggregated LoRA adapters using the FedAvg method <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#bib.bib8" title=""><span class="ltx_text" style="font-size:80%;">8</span></a>]</cite>. The server then splits the aggregated LoRA adapters to fit the client-side pre-trained models and sends the aggregated client-side LoRA adapters to each client.</p> </div> </li> </ul> </div> <div class="ltx_para" id="S5.p5"> <p class="ltx_p" id="S5.p5.1">In the proposed framework, client-side submodels are trained in parallel, while server-side submodels are trained sequentially. The order of sequential computation determines when each client can start its backward propagation. The client order chosen for server-side training therefore influences the overall training time. 
Therefore, to minimize the overall completion time, the key is to hide the communication time and the client computation time under the server computation time as much as possible <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#bib.bib9" title=""><span class="ltx_text" style="font-size:80%;">9</span></a>]</cite>. Since the gradient size of each layer is the same and the gradient transmission time is much smaller than the backward propagation time, we use a greedy algorithm in which the server prioritizes the clients with longer client-side backward propagation times.</p> </div> </section> <section class="ltx_section" id="S6"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">VI </span><span class="ltx_text ltx_font_smallcaps" id="S6.1.1">Case Study</span> </h2> <section class="ltx_subsection" id="S6.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S6.SS1.4.1.1">VI-A</span> </span><span class="ltx_text ltx_font_italic" id="S6.SS1.5.2">Considered Scenario</span> </h3> <div class="ltx_para" id="S6.SS1.p1"> <p class="ltx_p" id="S6.SS1.p1.1">We use an RTX 4080s server with a computational capability of 52.2 TFLOPS and consider six heterogeneous clients: a Jetson Nano (0.472 TFLOPS) with the first transformer layer, a Jetson TX2 (1.33 TFLOPS) with the first transformer layer, a Snapdragon 8s Gen 3 (1.689 TFLOPS) with the first two transformer layers, a Snapdragon 8 Gen 3 (2.774 TFLOPS) with the first two transformer layers, an A17 Pro (2.147 TFLOPS) with the first three transformer layers, and an M3 (3.533 TFLOPS) with the first three transformer layers. The data rate between the server and each client is set to 100 Mbps. 
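</p> </div> <div class="ltx_para"> <p class="ltx_p">As a rough illustration of the greedy ordering rule from Section V applied to these six clients, consider the Python sketch below. It is a toy model under stated assumptions, not the scheduler used in the case study: the client backward time is assumed to scale with the number of client-side layers divided by the device TFLOPS, the server-side time per client is assumed identical (<code>SERVER_TIME</code>), all times are in arbitrary units, and communication time is ignored. The function names are hypothetical.</p> <pre class="ltx_verbatim">
```python
# (name, TFLOPS, number of client-side transformer layers), from the scenario above.
CLIENTS = [
    ("Jetson Nano", 0.472, 1),
    ("Jetson TX2", 1.33, 1),
    ("Snapdragon 8s Gen 3", 1.689, 2),
    ("Snapdragon 8 Gen 3", 2.774, 2),
    ("A17 Pro", 2.147, 3),
    ("M3", 3.533, 3),
]

SERVER_TIME = 1.0  # ASSUMPTION: same server-side time per client, arbitrary units


def client_backward_time(tflops, layers):
    """ASSUMPTION: client backward time is proportional to layers / TFLOPS."""
    return layers / tflops


def greedy_order(clients):
    """Serve the clients with the longest client-side backward time first."""
    return sorted(
        clients,
        key=lambda c: client_backward_time(c[1], c[2]),
        reverse=True,
    )


def round_completion_time(order):
    """The server handles clients one by one; each client then runs its own
    backward pass in parallel with the server's work on the next client."""
    finish_times = []
    server_clock = 0.0
    for _, tflops, layers in order:
        server_clock += SERVER_TIME
        finish_times.append(server_clock + client_backward_time(tflops, layers))
    return max(finish_times)


greedy = greedy_order(CLIENTS)
greedy_time = round_completion_time(greedy)
reversed_time = round_completion_time(list(reversed(greedy)))

print([name for name, _, _ in greedy])
print(round(greedy_time, 3), round(reversed_time, 3))
```
</pre> <p class="ltx_p">Under these assumptions, serving the slowest clients first lets their backward passes overlap with the server-side work for the remaining clients, so the round finishes earlier than with the reversed order.</p> </div> <div class="ltx_para"> <p class="ltx_p">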
We leverage BERT-base <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#bib.bib10" title=""><span class="ltx_text" style="font-size:80%;">10</span></a>]</cite> as the pre-trained model for text analysis tasks using the CARER dataset <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#bib.bib11" title=""><span class="ltx_text" style="font-size:80%;">11</span></a>]</cite>. We set the LoRA rank, batch size, learning rate, maximum sequence length, and target accuracy to 16, 16, 0.00001, 128, and 0.89, respectively. We compare the proposed scheme with the following baselines: 1) SL <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#bib.bib12" title=""><span class="ltx_text" style="font-size:80%;">12</span></a>]</cite>; 2) SFL <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#bib.bib13" title=""><span class="ltx_text" style="font-size:80%;">13</span></a>]</cite>.</p> </div> <figure class="ltx_figure" id="S6.F6"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="280" id="S6.F6.g1" src="x1.png" width="373"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S6.F6.2.1.1" style="font-size:90%;">Figure 6</span>: </span><span class="ltx_text" id="S6.F6.3.2" style="font-size:90%;">Comparison of memory usage and convergence time across different schemes.</span></figcaption> </figure> </section> <section class="ltx_subsection" id="S6.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S6.SS2.4.1.1">VI-B</span> </span><span class="ltx_text ltx_font_italic" id="S6.SS2.5.2">Simulation Results</span> </h3> <div class="ltx_para" id="S6.SS2.p1"> <p class="ltx_p" id="S6.SS2.p1.1">Figure 6 shows the memory usage and the convergence time 
of the three frameworks. Although SL has the smallest memory usage, it results in the longest convergence time. SFL consumes a very large amount of memory because it needs to maintain multiple large models. The proposed scheme achieves the fastest convergence. Compared with SL, the proposed scheme reduces the training time by 40% at the cost of 10% more memory. Compared with SFL, the proposed scheme reduces memory usage by 79% and training time by 6%. The proposed scheme reduces the memory footprint by reusing a single full LLM for sequential training and reduces the overall training time through its training scheduling scheme.</p> </div> </section> </section> <section class="ltx_section" id="S7"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">VII </span><span class="ltx_text ltx_font_smallcaps" id="S7.1.1">Open Issues</span> </h2> <section class="ltx_subsection" id="S7.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S7.SS1.4.1.1">VII-A</span> </span><span class="ltx_text ltx_font_italic" id="S7.SS1.5.2">Limited and Heterogeneous Resources in IoT Devices</span> </h3> <div class="ltx_para" id="S7.SS1.p1"> <p class="ltx_p" id="S7.SS1.p1.1">IoT devices have limited memory, computational power, and processing capabilities and typically operate in energy-constrained environments, which significantly impacts the feasibility of running LLMs. The memory and computational requirements of LLMs exceed the capabilities of many IoT devices, preventing them from loading full models locally. Energy limitations can also lead to frequent interruptions or degraded performance, disrupting real-time processing and reducing system reliability. Given the sheer number of connected IoT devices, bandwidth constraints further restrict efficient communication and data exchange. 
Moreover, IoT networks consist of devices with vastly different levels of computational power, memory availability, and processing capabilities, leading to imbalanced workloads during model inference or training. This mismatch in device capabilities makes synchronizing and optimizing significantly more complex.</p> </div> <div class="ltx_para" id="S7.SS1.p2"> <p class="ltx_p" id="S7.SS1.p2.1">To address the limited and heterogeneous resources of IoT devices, a combination of model optimization techniques, distributed computing strategies, and efficient workload management is essential. Model compression methods such as quantization, pruning, and knowledge distillation can significantly reduce the size and computational requirements of LLMs. Additionally, split computing allows IoT devices to offload heavy computations to edge servers, enabling a balance between local processing and remote inference. Furthermore, adaptive workload distribution ensures that high-performance edge nodes handle complex tasks while lightweight devices process simpler workloads, optimizing overall system efficiency.</p> </div> </section> <section class="ltx_subsection" id="S7.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S7.SS2.4.1.1">VII-B</span> </span><span class="ltx_text ltx_font_italic" id="S7.SS2.5.2">On-Demand Deployment of LLMs</span> </h3> <div class="ltx_para" id="S7.SS2.p1"> <p class="ltx_p" id="S7.SS2.p1.1">Deploying LLMs in IoT environments presents significant challenges due to limited resources and dynamic workload requirements. Traditional cloud-based LLM deployment relies on centralized processing, which often results in high latency and excessive bandwidth consumption. Many IoT applications, such as real-time monitoring and smart healthcare, require low-latency responses, making static cloud-based deployment impractical. 
Moreover, in IoT applications, service requests of LLMs are diverse and random, and the computational demands depend on the complexity of the task, making it inefficient to continuously allocate fixed resources for model execution. Furthermore, the heterogeneous nature of IoT environments adds complexity to efficiently deploying and managing LLMs. Some devices may only require lightweight inference, while others might need full-scale LLM execution for advanced decision-making.</p> </div> <div class="ltx_para" id="S7.SS2.p2"> <p class="ltx_p" id="S7.SS2.p2.1">To enable the on-demand deployment of LLMs, a combination of edge caching and model compression is essential. Edge caching enables frequently used model components or inference results to be stored closer to IoT devices, reducing redundant computations and minimizing communication overhead. Meanwhile, a trade-off between model compression and performance must be carefully managed <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#bib.bib14" title=""><span class="ltx_text" style="font-size:80%;">14</span></a>]</cite>, as highly compressed models may introduce accuracy degradation, while full-scale models demand significant computational resources.</p> </div> </section> <section class="ltx_subsection" id="S7.SS3"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection"><span class="ltx_text" id="S7.SS3.4.1.1">VII-C</span> </span><span class="ltx_text ltx_font_italic" id="S7.SS3.5.2">Privacy and Data Security Risks</span> </h3> <div class="ltx_para" id="S7.SS3.p1"> <p class="ltx_p" id="S7.SS3.p1.1">IoT devices handle sensitive data, including personal, healthcare, and location information, raising critical privacy and security concerns. Many lack secure storage and processing, making them vulnerable to breaches and cyberattacks. 
IoT applications require continuous data collection, increasing the risk of data leakage, especially when relying on cloud-based processing. Additionally, IoT ecosystems are highly heterogeneous and have inconsistent security standards, which exposes devices to encryption weaknesses and data poisoning risks. Decentralized AI models face threats like model inversion attacks, complicating data integrity and regulatory compliance in large-scale deployments <cite class="ltx_cite ltx_citemacro_cite">[<a class="ltx_ref" href="https://arxiv.org/html/2503.13819v1#bib.bib15" title=""><span class="ltx_text" style="font-size:80%;">15</span></a>]</cite>.</p> </div> <div class="ltx_para" id="S7.SS3.p2"> <p class="ltx_p" id="S7.SS3.p2.1">To mitigate privacy and data security risks, a combination of secure AI frameworks, encryption techniques, and decentralized learning approaches is crucial. End-to-end encryption and secure multiparty computation can protect sensitive data during transmission and processing, ensuring that only authorized entities can access it. Federated learning enables on-device training without transferring raw data to centralized servers, reducing privacy risks while benefiting from collaborative model improvements. Homomorphic encryption and differential privacy techniques can further safeguard user data by allowing computations on encrypted data and preventing the extraction of sensitive information from AI models.</p> </div> </section> </section> <section class="ltx_section" id="S8"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">VIII </span><span class="ltx_text ltx_font_smallcaps" id="S8.1.1">Conclusion</span> </h2> <div class="ltx_para" id="S8.p1"> <p class="ltx_p" id="S8.p1.1">In this article, we have proposed an LLM-empowered IoT architecture to support IoT applications and facilitate intelligent network management. The architecture aims to seamlessly integrate LLMs and IoT. 
LLM for 6G IoT aims to enhance device intelligence and efficient resource management in dynamic IoT environments. LLM on 6G IoT focuses on optimizing infrastructure to support LLM deployment, leveraging edge computing and efficient networking to accommodate the high computational and communication demands of LLMs. We have proposed a memory-efficient SFL framework for LLM fine-tuning that integrates FL and SL to reduce memory usage and training time. A case study has validated the framework’s feasibility and effectiveness. Finally, we have discussed open issues to accelerate the development of the LLM-empowered IoT architecture.</p> </div> </section> <section class="ltx_bibliography" id="bib"> <h2 class="ltx_title ltx_title_bibliography" style="font-size:80%;">References</h2> <ul class="ltx_biblist"> <li class="ltx_bibitem" id="bib.bib1"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib1.2.2.1" style="font-size:80%;">[1]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib1.4.1" style="font-size:80%;"> ITU, “Framework and overall objectives of the future development of IMT for 2030 and beyond,” ITU, Tech. Rep. ITU-R M.2160-0, 2023. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib2"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib2.2.2.1" style="font-size:80%;">[2]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib2.4.1" style="font-size:80%;"> X. Shen, J. Gao, W. Wu, M. Li, C. Zhou, and W. Zhuang, “Holistic network virtualization and pervasive network intelligence for 6G,” </span><em class="ltx_emph ltx_font_italic" id="bib.bib2.5.2" style="font-size:80%;">IEEE Commun. Surveys Tuts.</em><span class="ltx_text" id="bib.bib2.6.3" style="font-size:80%;">, vol. 24, no. 1, pp. 1–30, 2022. 
</span> </span> </li> <li class="ltx_bibitem" id="bib.bib3"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib3.2.2.1" style="font-size:80%;">[3]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib3.4.1" style="font-size:80%;"> W. Wu, C. Zhou, M. Li, H. Wu, H. Zhou, N. Zhang, X. S. Shen, and W. Zhuang, “AI-native network slicing for 6G networks,” </span><em class="ltx_emph ltx_font_italic" id="bib.bib3.5.2" style="font-size:80%;">IEEE Wireless Commun.</em><span class="ltx_text" id="bib.bib3.6.3" style="font-size:80%;">, vol. 29, no. 1, pp. 96–103, 2022. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib4"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib4.2.2.1" style="font-size:80%;">[4]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib4.4.1" style="font-size:80%;"> J. Kaplan, S. McCandlish, T. Henighan, T. B. Brown, B. Chess, R. Child, S. Gray, A. Radford, J. Wu, and D. Amodei, “Scaling laws for neural language models,” </span><em class="ltx_emph ltx_font_italic" id="bib.bib4.5.2" style="font-size:80%;">arXiv:2001.08361</em><span class="ltx_text" id="bib.bib4.6.3" style="font-size:80%;">, 2020. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib5"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib5.2.2.1" style="font-size:80%;">[5]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib5.4.1" style="font-size:80%;"> Y. Chang, X. Wang, J. Wang, Y. Wu, L. Yang, K. Zhu, H. Chen, X. Yi, C. Wang, Y. Wang </span><em class="ltx_emph ltx_font_italic" id="bib.bib5.5.2" style="font-size:80%;">et al.</em><span class="ltx_text" id="bib.bib5.6.3" style="font-size:80%;">, “A survey on evaluation of large language models,” </span><em class="ltx_emph ltx_font_italic" id="bib.bib5.7.4" style="font-size:80%;">ACM Trans. Intell. Syst. 
Technol.</em><span class="ltx_text" id="bib.bib5.8.5" style="font-size:80%;">, vol. 15, no. 3, pp. 1–45, 2024. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib6"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib6.2.2.1" style="font-size:80%;">[6]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib6.4.1" style="font-size:80%;"> P. Villalobos, J. Sevilla, L. Heim, T. Besiroglu, M. Hobbhahn, and A. Ho, “Will we run out of data? an analysis of the limits of scaling datasets in machine learning,” </span><em class="ltx_emph ltx_font_italic" id="bib.bib6.5.2" style="font-size:80%;">arXiv:2211.04325</em><span class="ltx_text" id="bib.bib6.6.3" style="font-size:80%;">, vol. 1, 2022. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib7"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib7.2.2.1" style="font-size:80%;">[7]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib7.4.1" style="font-size:80%;"> E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, and W. Chen, “LoRA: Low-rank adaptation of large language models,” </span><em class="ltx_emph ltx_font_italic" id="bib.bib7.5.2" style="font-size:80%;">arXiv:2106.09685</em><span class="ltx_text" id="bib.bib7.6.3" style="font-size:80%;">, 2021. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib8"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib8.2.2.1" style="font-size:80%;">[8]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib8.4.1" style="font-size:80%;"> B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, “Communication-efficient learning of deep networks from decentralized data,” in </span><em class="ltx_emph ltx_font_italic" id="bib.bib8.5.2" style="font-size:80%;">Proc. Int. Conf. Artif. Intell. 
Statist.</em><span class="ltx_text" id="bib.bib8.6.3" style="font-size:80%;">, 2017, pp. 1273–1282. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib9"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib9.2.2.1" style="font-size:80%;">[9]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib9.4.1" style="font-size:80%;"> X. Gong, “Delay-optimal distributed edge computing in wireless edge networks,” in </span><em class="ltx_emph ltx_font_italic" id="bib.bib9.5.2" style="font-size:80%;">Proc. IEEE INFOCOM</em><span class="ltx_text" id="bib.bib9.6.3" style="font-size:80%;">, 2020, pp. 2629–2638. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib10"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib10.2.2.1" style="font-size:80%;">[10]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib10.4.1" style="font-size:80%;"> J. Devlin, “BERT: Pre-training of deep bidirectional transformers for language understanding,” </span><em class="ltx_emph ltx_font_italic" id="bib.bib10.5.2" style="font-size:80%;">arXiv:1810.04805</em><span class="ltx_text" id="bib.bib10.6.3" style="font-size:80%;">, 2018. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib11"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib11.2.2.1" style="font-size:80%;">[11]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib11.4.1" style="font-size:80%;"> E. Saravia, H.-C. T. Liu, Y.-H. Huang, J. Wu, and Y.-S. Chen, “CARER: Contextualized affect representations for emotion recognition,” in </span><em class="ltx_emph ltx_font_italic" id="bib.bib11.5.2" style="font-size:80%;">Proc. EMNLP</em><span class="ltx_text" id="bib.bib11.6.3" style="font-size:80%;">, 2018, pp. 3687–3697. 
</span> </span> </li> <li class="ltx_bibitem" id="bib.bib12"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib12.2.2.1" style="font-size:80%;">[12]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib12.4.1" style="font-size:80%;"> W. Wu, M. Li, K. Qu, C. Zhou, X. Shen, W. Zhuang, X. Li, and W. Shi, “Split learning over wireless networks: Parallel design and resource management,” </span><em class="ltx_emph ltx_font_italic" id="bib.bib12.5.2" style="font-size:80%;">IEEE J. Sel. Areas Commun.</em><span class="ltx_text" id="bib.bib12.6.3" style="font-size:80%;">, vol. 41, no. 4, pp. 1051–1066, 2023. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib13"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib13.2.2.1" style="font-size:80%;">[13]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib13.4.1" style="font-size:80%;"> Y. Tian, Y. Wan, L. Lyu, D. Yao, H. Jin, and L. Sun, “FedBERT: When federated learning meets pre-training,” </span><em class="ltx_emph ltx_font_italic" id="bib.bib13.5.2" style="font-size:80%;">ACM Trans. Intell. Syst. Technol.</em><span class="ltx_text" id="bib.bib13.6.3" style="font-size:80%;">, vol. 13, no. 4, 2022. </span> </span> </li> <li class="ltx_bibitem" id="bib.bib14"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib14.2.2.1" style="font-size:80%;">[14]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib14.4.1" style="font-size:80%;"> Y. Yang, S. Dang, and Z. Zhang, “An adaptive compression and communication framework for wireless federated learning,” </span><em class="ltx_emph ltx_font_italic" id="bib.bib14.5.2" style="font-size:80%;">IEEE Trans. Mobile Comput.</em><span class="ltx_text" id="bib.bib14.6.3" style="font-size:80%;">, 2024. 
</span> </span> </li> <li class="ltx_bibitem" id="bib.bib15"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem"><span class="ltx_text" id="bib.bib15.2.2.1" style="font-size:80%;">[15]</span></span> <span class="ltx_bibblock"><span class="ltx_text" id="bib.bib15.4.1" style="font-size:80%;"> G. Chen, Z. Qin, M. Yang, Y. Zhou, T. Fan, T. Du, and Z. Xu, “Unveiling the vulnerability of private fine-tuning in split-based frameworks for large language models: A bidirectionally enhanced attack,” in </span><em class="ltx_emph ltx_font_italic" id="bib.bib15.5.2" style="font-size:80%;">Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security</em><span class="ltx_text" id="bib.bib15.6.3" style="font-size:80%;">, 2024, pp. 2904–2918. </span> </span> </li> </ul> </section> </article> </div> </div> </body> </html>