<!DOCTYPE html> <html lang="en"> <head> <meta content="text/html; charset=utf-8" http-equiv="content-type"/> <title>A 28 nm AI microcontroller with tightly coupled zero-standby power weight memory featuring standard logic compatible 4 Mb 4-bits/cell embedded flash technology</title> <!--Generated on Wed Feb 12 06:29:08 2025 by LaTeXML (version 0.8.8) http://dlmf.nist.gov/LaTeXML/.--> <meta content="width=device-width, initial-scale=1, shrink-to-fit=no" name="viewport"/> <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/ar5iv.0.7.9.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/ar5iv-fonts.0.7.9.min.css" rel="stylesheet" type="text/css"/> <link href="/static/browse/0.3.4/css/latexml_styles.css" rel="stylesheet" type="text/css"/> <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/js/bootstrap.bundle.min.js"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/html2canvas/1.3.3/html2canvas.min.js"></script> <script src="/static/browse/0.3.4/js/addons_new.js"></script> <script src="/static/browse/0.3.4/js/feedbackOverlay.js"></script> <meta content="Non-volatile memory, Near-Memory Compute, Microcontroller" lang="en" name="keywords"/> <base href="/html/2503.11660v1/"/></head> <body> <nav class="ltx_page_navbar"> <nav class="ltx_TOC"> <ol class="ltx_toclist"> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2503.11660v1#S1" title="In A 28 nm AI microcontroller with tightly coupled zero-standby power weight memory featuring standard logic compatible 4 Mb 4-bits/cell embedded flash technology"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">1 </span>Introduction</span></a></li> <li class="ltx_tocentry ltx_tocentry_section"> <a class="ltx_ref" href="https://arxiv.org/html/2503.11660v1#S2" title="In A 28 nm AI microcontroller with tightly coupled 
zero-standby power weight memory featuring standard logic compatible 4 Mb 4-bits/cell embedded flash technology"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">2 </span>The Proposed Architecture</span></a> <ol class="ltx_toclist ltx_toclist_section"> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.11660v1#S2.SS1" title="In 2. The Proposed Architecture ‣ A 28 nm AI microcontroller with tightly coupled zero-standby power weight memory featuring standard logic compatible 4 Mb 4-bits/cell embedded flash technology"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">2.1 </span>Overall Structure</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.11660v1#S2.SS2" title="In 2. The Proposed Architecture ‣ A 28 nm AI microcontroller with tightly coupled zero-standby power weight memory featuring standard logic compatible 4 Mb 4-bits/cell embedded flash technology"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">2.2 </span>Near-Memory Computing Unit (NMCU)</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.11660v1#S2.SS3" title="In 2. The Proposed Architecture ‣ A 28 nm AI microcontroller with tightly coupled zero-standby power weight memory featuring standard logic compatible 4 Mb 4-bits/cell embedded flash technology"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">2.3 </span>High Voltage Generator</span></a></li> <li class="ltx_tocentry ltx_tocentry_subsection"><a class="ltx_ref" href="https://arxiv.org/html/2503.11660v1#S2.SS4" title="In 2. 
The Proposed Architecture ‣ A 28 nm AI microcontroller with tightly coupled zero-standby power weight memory featuring standard logic compatible 4 Mb 4-bits/cell embedded flash technology"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">2.4 </span>Overstress-free WL driver</span></a></li> </ol> </li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2503.11660v1#S3" title="In A 28 nm AI microcontroller with tightly coupled zero-standby power weight memory featuring standard logic compatible 4 Mb 4-bits/cell embedded flash technology"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">3 </span>Experiment Results</span></a></li> <li class="ltx_tocentry ltx_tocentry_section"><a class="ltx_ref" href="https://arxiv.org/html/2503.11660v1#S4" title="In A 28 nm AI microcontroller with tightly coupled zero-standby power weight memory featuring standard logic compatible 4 Mb 4-bits/cell embedded flash technology"><span class="ltx_text ltx_ref_title"><span class="ltx_tag ltx_tag_ref">4 </span>Conclusion</span></a></li> </ol></nav> </nav> <div class="ltx_page_main"> <div class="ltx_page_content"> <article class="ltx_document ltx_authors_1line ltx_leqno"> <h1 class="ltx_title ltx_title_document">A 28 nm AI microcontroller with tightly coupled zero-standby power weight memory featuring standard logic compatible 4 Mb 4-bits/cell embedded flash technology</h1> <div class="ltx_authors"> <span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Daewung Kim </span><span class="ltx_author_notes"> <span class="ltx_contact ltx_role_affiliation"><span class="ltx_text ltx_affiliation_institution" id="id2.1.id1">ANAFLASH Inc.</span><span class="ltx_text ltx_affiliation_streetaddress" id="id3.2.id2">169 Yeoksam-ro, Gangnam-gu</span><span class="ltx_text ltx_affiliation_city" id="id4.3.id3">Seoul</span><span class="ltx_text ltx_affiliation_country" id="id5.4.id4">Republic of Korea</span><span 
class="ltx_text ltx_affiliation_postcode" id="id6.5.id5">06247</span> </span> <span class="ltx_contact ltx_role_email"><a href="mailto:david@anaflash.com">david@anaflash.com</a> </span></span></span> <span class="ltx_author_before">, </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Seong Hwan Jeon </span><span class="ltx_author_notes"> <span class="ltx_contact ltx_role_affiliation"><span class="ltx_text ltx_affiliation_institution" id="id7.1.id1">ANAFLASH Inc.</span><span class="ltx_text ltx_affiliation_streetaddress" id="id8.2.id2">169 Yeoksam-ro, Gangnam-gu</span><span class="ltx_text ltx_affiliation_city" id="id9.3.id3">Seoul</span><span class="ltx_text ltx_affiliation_country" id="id10.4.id4">Republic of Korea</span><span class="ltx_text ltx_affiliation_postcode" id="id11.5.id5">06247</span> </span> <span class="ltx_contact ltx_role_email"><a href="mailto:john@anaflash.com">john@anaflash.com</a> </span></span></span> <span class="ltx_author_before">, </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Young Hee Jeon </span><span class="ltx_author_notes"> <span class="ltx_contact ltx_role_affiliation"><span class="ltx_text ltx_affiliation_institution" id="id12.1.id1">ANAFLASH Inc.</span><span class="ltx_text ltx_affiliation_streetaddress" id="id13.2.id2">169 Yeoksam-ro, Gangnam-gu</span><span class="ltx_text ltx_affiliation_city" id="id14.3.id3">Seoul</span><span class="ltx_text ltx_affiliation_country" id="id15.4.id4">Republic of Korea</span><span class="ltx_text ltx_affiliation_postcode" id="id16.5.id5">06247</span> </span> <span class="ltx_contact ltx_role_email"><a href="mailto:nina@anaflash.com">nina@anaflash.com</a> </span></span></span> <span class="ltx_author_before">, </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Kyung-Bae Kwon </span><span class="ltx_author_notes"> <span class="ltx_contact ltx_role_affiliation"><span class="ltx_text ltx_affiliation_institution" 
id="id17.1.id1">ANAFLASH Inc.</span><span class="ltx_text ltx_affiliation_streetaddress" id="id18.2.id2">169 Yeoksam-ro, Gangnam-gu</span><span class="ltx_text ltx_affiliation_city" id="id19.3.id3">Seoul</span><span class="ltx_text ltx_affiliation_country" id="id20.4.id4">Republic of Korea</span><span class="ltx_text ltx_affiliation_postcode" id="id21.5.id5">06247</span> </span> <span class="ltx_contact ltx_role_email"><a href="mailto:luke@anaflash.com">luke@anaflash.com</a> </span></span></span> <span class="ltx_author_before">, </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Jigon Kim </span><span class="ltx_author_notes"> <span class="ltx_contact ltx_role_affiliation"><span class="ltx_text ltx_affiliation_institution" id="id22.1.id1">ANAFLASH Inc.</span><span class="ltx_text ltx_affiliation_streetaddress" id="id23.2.id2">169 Yeoksam-ro, Gangnam-gu</span><span class="ltx_text ltx_affiliation_city" id="id24.3.id3">Seoul</span><span class="ltx_text ltx_affiliation_country" id="id25.4.id4">Republic of Korea</span><span class="ltx_text ltx_affiliation_postcode" id="id26.5.id5">06247</span> </span> <span class="ltx_contact ltx_role_email"><a href="mailto:jerry@anaflash.com">jerry@anaflash.com</a> </span></span></span> <span class="ltx_author_before">, </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Yeounghun Choi </span><span class="ltx_author_notes"> <span class="ltx_contact ltx_role_affiliation"><span class="ltx_text ltx_affiliation_institution" id="id27.1.id1">ANAFLASH Inc.</span><span class="ltx_text ltx_affiliation_streetaddress" id="id28.2.id2">169 Yeoksam-ro, Gangnam-gu</span><span class="ltx_text ltx_affiliation_city" id="id29.3.id3">Seoul</span><span class="ltx_text ltx_affiliation_country" id="id30.4.id4">Republic of Korea</span><span class="ltx_text ltx_affiliation_postcode" id="id31.5.id5">06247</span> </span> <span class="ltx_contact ltx_role_email"><a 
href="mailto:eun@anaflash.com">eun@anaflash.com</a> </span></span></span> <span class="ltx_author_before">, </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Hyunseung Cha </span><span class="ltx_author_notes"> <span class="ltx_contact ltx_role_affiliation"><span class="ltx_text ltx_affiliation_institution" id="id32.1.id1">ANAFLASH Inc.</span><span class="ltx_text ltx_affiliation_streetaddress" id="id33.2.id2">Bundangnaegok-ro, Bundang-gu</span><span class="ltx_text ltx_affiliation_city" id="id34.3.id3">Seongnam-si</span><span class="ltx_text ltx_affiliation_country" id="id35.4.id4">Republic of Korea</span><span class="ltx_text ltx_affiliation_postcode" id="id36.5.id5">13529</span> </span> <span class="ltx_contact ltx_role_email"><a href="mailto:tony@anaflash.com">tony@anaflash.com</a> </span></span></span> <span class="ltx_author_before">, </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Kitae Kwon </span><span class="ltx_author_notes"> <span class="ltx_contact ltx_role_affiliation"><span class="ltx_text ltx_affiliation_institution" id="id37.1.id1">ANAFLASH Inc.</span><span class="ltx_text ltx_affiliation_streetaddress" id="id38.2.id2">440 N WOLFE RD</span><span class="ltx_text ltx_affiliation_city" id="id39.3.id3">Sunnyvale</span><span class="ltx_text ltx_affiliation_state" id="id40.4.id4">CA</span><span class="ltx_text ltx_affiliation_country" id="id41.5.id5">USA</span><span class="ltx_text ltx_affiliation_postcode" id="id42.6.id6">94085-3869</span> </span> <span class="ltx_contact ltx_role_email"><a href="mailto:kkwon@anaflash.com">kkwon@anaflash.com</a> </span></span></span> <span class="ltx_author_before">, </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Daesik Park </span><span class="ltx_author_notes"> <span class="ltx_contact ltx_role_affiliation"><span class="ltx_text ltx_affiliation_institution" id="id43.1.id1">ANAFLASH Inc.</span><span class="ltx_text 
ltx_affiliation_streetaddress" id="id44.2.id2">440 N WOLFE RD</span><span class="ltx_text ltx_affiliation_city" id="id45.3.id3">Sunnyvale</span><span class="ltx_text ltx_affiliation_state" id="id46.4.id4">CA</span><span class="ltx_text ltx_affiliation_country" id="id47.5.id5">USA</span><span class="ltx_text ltx_affiliation_postcode" id="id48.6.id6">94085-3869</span> </span> <span class="ltx_contact ltx_role_email"><a href="mailto:daniel@anaflash.com">daniel@anaflash.com</a> </span></span></span> <span class="ltx_author_before">, </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Jongseuk Lee </span><span class="ltx_author_notes"> <span class="ltx_contact ltx_role_affiliation"><span class="ltx_text ltx_affiliation_institution" id="id49.1.id1">ANAFLASH Inc.</span><span class="ltx_text ltx_affiliation_streetaddress" id="id50.2.id2">440 N WOLFE RD</span><span class="ltx_text ltx_affiliation_city" id="id51.3.id3">Sunnyvale</span><span class="ltx_text ltx_affiliation_state" id="id52.4.id4">CA</span><span class="ltx_text ltx_affiliation_country" id="id53.5.id5">USA</span><span class="ltx_text ltx_affiliation_postcode" id="id54.6.id6">94085-3869</span> </span> <span class="ltx_contact ltx_role_email"><a href="mailto:jimmy@anaflash.com">jimmy@anaflash.com</a> </span></span></span> <span class="ltx_author_before">, </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Sihwan Kim </span><span class="ltx_author_notes"> <span class="ltx_contact ltx_role_affiliation"><span class="ltx_text ltx_affiliation_institution" id="id55.1.id1">ANAFLASH Inc.</span><span class="ltx_text ltx_affiliation_streetaddress" id="id56.2.id2">440 N WOLFE RD</span><span class="ltx_text ltx_affiliation_city" id="id57.3.id3">Sunnyvale</span><span class="ltx_text ltx_affiliation_state" id="id58.4.id4">CA</span><span class="ltx_text ltx_affiliation_country" id="id59.5.id5">USA</span><span class="ltx_text ltx_affiliation_postcode" 
id="id60.6.id6">94085-3869</span> </span> <span class="ltx_contact ltx_role_email"><a href="mailto:skim@anaflash.com">skim@anaflash.com</a> </span></span></span> <span class="ltx_author_before"> and </span><span class="ltx_creator ltx_role_author"> <span class="ltx_personname">Seung-Hwan Song </span><span class="ltx_author_notes"> <span class="ltx_contact ltx_role_affiliation"><span class="ltx_text ltx_affiliation_institution" id="id61.1.id1">ANAFLASH Inc.</span><span class="ltx_text ltx_affiliation_streetaddress" id="id62.2.id2">Bundangnaegok-ro, Bundang-gu</span><span class="ltx_text ltx_affiliation_city" id="id63.3.id3">Seongnam-si</span><span class="ltx_text ltx_affiliation_country" id="id64.4.id4">Republic of Korea</span><span class="ltx_text ltx_affiliation_postcode" id="id65.5.id5">13529</span> </span> <span class="ltx_contact ltx_role_affiliation"><span class="ltx_text ltx_affiliation_streetaddress" id="id66.6.id1">440 N WOLFE RD</span><span class="ltx_text ltx_affiliation_city" id="id67.7.id2">Sunnyvale</span><span class="ltx_text ltx_affiliation_state" id="id68.8.id3">CA</span><span class="ltx_text ltx_affiliation_country" id="id69.9.id4">USA</span><span class="ltx_text ltx_affiliation_postcode" id="id70.10.id5">94085-3869</span> </span> <span class="ltx_contact ltx_role_email"><a href="mailto:peter@anaflash.com">peter@anaflash.com</a> </span></span></span> </div> <div class="ltx_dates">(2025)</div> <div class="ltx_abstract"> <h6 class="ltx_title ltx_title_abstract">Abstract.</h6> <p class="ltx_p" id="id71.id1">This study introduces a novel AI microcontroller optimized for cost-effective, battery-powered edge AI applications. Unlike traditional single bit/cell memory configurations, the proposed microcontroller integrates zero-standby power weight memory featuring standard logic compatible 4-bits/cell embedded flash technology tightly coupled to a Near-Memory Computing Unit. This architecture enables efficient and low-power AI acceleration. 
Advanced state mapping and an overstress-free word line (WL) driver circuit extend the verify read levels, ensuring robust margins among the 16 cell states. A ping-pong buffer reduces internal data movement while supporting simultaneous multi-bit processing. The fabricated microcontroller demonstrated high reliability, maintaining accuracy after 160 hours of unpowered baking at 125℃.</p> </div> <div class="ltx_keywords">Non-volatile memory, Near-Memory Compute, Microcontroller </div> <span class="ltx_note ltx_note_frontmatter ltx_role_copyright" id="id1"><sup class="ltx_note_mark">†</sup><span class="ltx_note_outer"><span class="ltx_note_content"><sup class="ltx_note_mark">†</sup><span class="ltx_note_type">copyright: </span>rightsretained</span></span></span><span class="ltx_note ltx_note_frontmatter ltx_role_journalyear" id="id2"><sup class="ltx_note_mark">†</sup><span class="ltx_note_outer"><span class="ltx_note_content"><sup class="ltx_note_mark">†</sup><span class="ltx_note_type">journalyear: </span>2025</span></span></span> <section class="ltx_section" id="S1"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">1. </span>Introduction</h2> <div class="ltx_para" id="S1.p1"> <p class="ltx_p" id="S1.p1.1">Microcontrollers designed for battery-powered smart edge devices are often required to run inference tasks where the sensor data are generated, for real-time response. They use a locally stored AI model that is trained in the cloud. Power gating is often deployed to reduce idle-mode power consumption in such low-power applications. The AI model can be stored and updated in an embedded Non-Volatile Memory (eNVM) during the device’s lifetime without consuming standby power during the idle mode. 
Typically, multiple-time programmable eNVM technology requires additional fabrication steps beyond a standard logic process and is configured to store only a single bit of information per unit memory cell, which limits the efficiency of AI computation <cite class="ltx_cite ltx_citemacro_citep">(Deaville et al<span class="ltx_text">.</span>, <a class="ltx_ref" href="https://arxiv.org/html/2503.11660v1#bib.bib2" title="">2022</a>)</cite>. In this work, we introduce an AI microcontroller with zero-standby power weight memory featuring standard logic compatible 4-bits/cell Embedded FLASH (EFLASH) technology, tightly coupled with a Near-Memory Computing Unit (NMCU) for cost-effective and low-power edge AI computing applications.</p> </div> </section> <section class="ltx_section" id="S2"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">2. </span>The Proposed Architecture</h2> <section class="ltx_subsection" id="S2.SS1"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">2.1. </span>Overall Structure</h3> <div class="ltx_para" id="S2.SS1.p1"> <p class="ltx_p" id="S2.SS1.p1.1">Fig. 1 shows a block diagram of the proposed AI microcontroller, which consists of i) a 32-bit RISC-V CPU core, ii) SRAM for instruction and data memory, iii) a DMA controller, iv) peripheral subsystems including GPIO, SPI, and UART, v) 128 Kb EFLASH for initial setting parameters and code storage, vi) 4 Mb 4-bits/cell EFLASH tightly coupled with the NMCU, and vii) on-chip standard logic compatible High Voltage (HV) and reference voltage generator circuits. 
The EFLASH macro is based on a 5T-cell single-poly EFLASH cell array <cite class="ltx_cite ltx_citemacro_citep">(Song et al<span class="ltx_text">.</span>, <a class="ltx_ref" href="https://arxiv.org/html/2503.11660v1#bib.bib8" title="">2013</a>)</cite> and is integrated with peripheral circuits such as the WL driver and Sense Amplifier (SA) circuits.</p> </div> <figure class="ltx_figure" id="S2.F1"> <div class="ltx_flex_figure"> <div class="ltx_flex_cell ltx_flex_size_2"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_figure_panel ltx_img_landscape" height="410" id="S2.F1.g1" src="extracted/6179635/figure_1.png" width="598"/></div> </div> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S2.F1.2.1.1" style="font-size:90%;">Figure 1</span>. </span><span class="ltx_text" id="S2.F1.3.2" style="font-size:90%;">AI microcontroller featuring 4-bits/cell EFLASH technology tightly coupled to a near-memory computing unit</span></figcaption> </figure> </section> <section class="ltx_subsection" id="S2.SS2"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">2.2. 
</span>Near-Memory Computing Unit (NMCU)</h3> <figure class="ltx_figure" id="S2.F2"> <div class="ltx_flex_figure"> <div class="ltx_flex_cell ltx_flex_size_2"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_figure_panel ltx_img_landscape" height="207" id="S2.F2.g1" src="extracted/6179635/figure_2.png" width="479"/></div> </div> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S2.F2.2.1.1" style="font-size:90%;">Figure 2</span>. </span><span class="ltx_text" id="S2.F2.3.2" style="font-size:90%;">Near-Memory Computing Unit for efficient AI acceleration</span></figcaption> </figure> <div class="ltx_para" id="S2.SS2.p1"> <p class="ltx_p" id="S2.SS2.p1.1">Fig. 2 shows a Near-Memory Computing Unit with 4-bits/cell EFLASH based weight memory. The flash memory is tightly coupled to the computation unit with a large bandwidth for efficient AI acceleration. Each 4-bits/cell EFLASH bank can load 256 4-bit weights in a single read operation. To maximize throughput, two processing elements (PEs) are allocated per 4-bits/cell EFLASH macro. Therefore, one PE can process MAC operations of up to 128 elements per EFLASH read. Larger matrix-vector multiplication (MVM) operations are possible by performing multiple EFLASH reads in succession. The NMCU’s flow control logic automatically adjusts the address of the weight parameters as required for the MVM operation with a single RISC-V instruction, which reduces communication overhead between the host CPU and the NMCU. 
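The dataflow above can be pictured with a minimal sketch. It assumes only the sizes stated in the text (256 4-bit weights per EFLASH read, two PEs per macro, 128-element MACs, 8-bit inputs); the function and variable names are illustrative, not the actual control logic.

```python
# Toy model of an NMCU matrix-vector multiply built from successive
# EFLASH reads. Sizes come from the text; everything else is assumed.
READ_WIDTH = 256                      # 4-bit weights per EFLASH bank read
PES_PER_MACRO = 2                     # processing elements sharing one read
LANES = READ_WIDTH // PES_PER_MACRO   # 128 MAC lanes per PE

def mvm(weight_rows, x):
    """weight_rows: list of 256-weight read results (4-bit ints);
    x: input vector of 128 8-bit ints. Returns one accumulator per PE
    per read, i.e. one output element per 128-element MAC."""
    assert len(x) == LANES
    acc = []
    for row in weight_rows:               # one EFLASH read per iteration
        for pe in range(PES_PER_MACRO):   # both PEs consume the same read
            w = row[pe * LANES:(pe + 1) * LANES]
            acc.append(sum(wi * xi for wi, xi in zip(w, x)))
    return acc
```

Larger output vectors simply take more reads, which matches the text's point that the flow control logic sequences the weight addresses without per-read CPU involvement.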
The NMCU includes a ping-pong buffer so that the calculation results of the previous layer can be reused as the input for the next layer’s calculation. The input fetcher logic supplies the PE with an input vector of 128 8-bit elements by selecting either the input buffer or the ping-pong buffer. After the MVM operation is completed, the result is quantized to 8 bits and written back to the ping-pong buffer. Notably, no additional data movement is required beyond the first input vector for TinyML models like FC-Autoencoder <cite class="ltx_cite ltx_citemacro_citep">(et al., <a class="ltx_ref" href="https://arxiv.org/html/2503.11660v1#bib.bib4" title="">2021</a>)</cite>. The NMCU also employs the element-wise int8 quantization scheme from TFLite-micro <cite class="ltx_cite ltx_citemacro_citep">(et al., <a class="ltx_ref" href="https://arxiv.org/html/2503.11660v1#bib.bib3" title="">2018</a>)</cite>.</p> </div> </section> <section class="ltx_subsection" id="S2.SS3"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">2.3. </span>High Voltage Generator</h3> <div class="ltx_para" id="S2.SS3.p1"> <p class="ltx_p" id="S2.SS3.p1.1">Fig. 3 shows the schematic diagram of the designed HV generator circuit, which pumps the I/O supply voltage (i.e., VDDH = 2.5 V) to the program and erase voltage level (i.e., VPP4 = <math alttext="\sim" class="ltx_Math" display="inline" id="S2.SS3.p1.1.m1.1"><semantics id="S2.SS3.p1.1.m1.1a"><mo id="S2.SS3.p1.1.m1.1.1" xref="S2.SS3.p1.1.m1.1.1.cmml">∼</mo><annotation-xml encoding="MathML-Content" id="S2.SS3.p1.1.m1.1b"><csymbol cd="latexml" id="S2.SS3.p1.1.m1.1.1.cmml" xref="S2.SS3.p1.1.m1.1.1">similar-to</csymbol></annotation-xml><annotation encoding="application/x-tex" id="S2.SS3.p1.1.m1.1c">\sim</annotation><annotation encoding="application/x-llamapun" id="S2.SS3.p1.1.m1.1d">∼</annotation></semantics></math>10 V) during the program and erase operations. 
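The pumping action can be pictured with a toy, lossless charge-pump model. The stage count, supply, and regulated target come from the text; the ideal one-VDDH-per-stage gain and the clamp-style regulation are simplifying assumptions, not the measured circuit.

```python
# Idealized ladder of boosted node levels for a cascaded voltage doubler.
# Real stages lose voltage to device drops and load current; regulation
# here is modeled as a simple clamp at the VPP4 target.
VDDH = 2.5           # I/O supply voltage (V), from the text
STAGES = 6           # six-stage voltage doubler, from the text
VPP4_TARGET = 10.0   # regulated program/erase level (V), from the text

def pump_levels(vddh=VDDH, stages=STAGES, target=VPP4_TARGET):
    # Stage n can ideally lift its output to (n + 1) * vddh; the
    # regulation loop keeps the final level from exceeding the target.
    return [min(vddh * (n + 1), target) for n in range(1, stages + 1)]
```

The point of the extra stages in the real design is headroom: with losses included, several ideal-doubler stages are needed before the regulated ~10 V level is reliably reached while each I/O device still sees only its nominal voltage.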
The HV generator is designed using standard I/O logic devices without any additional HV process steps and is composed of a six-stage voltage doubler, which keeps the individual I/O devices within their nominal operating voltage level while providing a sufficiently high regulated VPP4 level for the given program and erase times. Here, we deploy an adaptive body biasing scheme for both the NMOS and PMOS transistors to avoid forward-bias current in the voltage doubler circuit. When the VPP1 level is boosted higher than the reference level SREF, the cascaded PMOS switches connect the boosted nodes VPP1-4 to the program/erase voltage supply nodes (i.e., VPS1-4) without overstressing the PMOS switches during the program/erase operation of the logic compatible EFLASH macro. Conversely, when the VPP1 level is discharged below the SREF reference level by disabling the clock generator to save power in the HV generator circuit, the cascaded PMOS switches connect the VDDH level to the program/erase voltage supply nodes VPS1-4.</p> </div> <figure class="ltx_figure" id="S2.F3"> <div class="ltx_flex_figure"> <div class="ltx_flex_cell ltx_flex_size_2"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_figure_panel ltx_img_landscape" height="376" id="S2.F3.g1" src="extracted/6179635/figure_3.png" width="598"/></div> </div> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S2.F3.2.1.1" style="font-size:90%;">Figure 3</span>. 
</span><span class="ltx_text" id="S2.F3.3.2" style="font-size:90%;">Standard logic compatible high voltage generator for embedded flash program/erase operations</span></figcaption> </figure> </section> <section class="ltx_subsection" id="S2.SS4"> <h3 class="ltx_title ltx_title_subsection"> <span class="ltx_tag ltx_tag_subsection">2.4. </span>Overstress-free WL driver</h3> <div class="ltx_para" id="S2.SS4.p1"> <p class="ltx_p" id="S2.SS4.p1.1">The conventional WL driver circuit in <cite class="ltx_cite ltx_citemacro_citep">(Song et al<span class="ltx_text">.</span>, <a class="ltx_ref" href="https://arxiv.org/html/2503.11660v1#bib.bib8" title="">2013</a>)</cite> supplies the read reference level (i.e., VRD) through the source of the NMOS device string to the selected WL. Due to the threshold voltage drop of the NMOS, exacerbated by its elevated source voltage, the available VRD for the EFLASH read operation was much lower than the VPPH level. In this work, we propose an overstress-free WL driver circuit with a VRD PMOS charging path supplied from the PMOS charging circuit, as shown in Fig. 4. This driver extends the VRD level up to VDDH (the nominal operating voltage of an individual device) for a wider-range program-verify read operation, which is critical for 4-bits/cell program-verify operations. For the program operation, the SWR1 and SWR2 signals are toggled. Then, the WL can be driven to the program voltage (i.e., VPGM = 10 V) through the VPGM charging PMOS path shown in Fig. 4a. 
Since the stacked devices in the VPGM discharging path split the voltage stresses, the driver circuit operates without overstressing any individual device. For the program-verify operation, the read selection signal (SRD) is switched from low to high. Then, as illustrated in Fig. 4b, the WL starts charging from GND to the VRD level, through the VRD NMOS path when VRD is low enough and through the VRD PMOS path when VRD is high enough. When the SRD is switched from high to low, the WL is connected to the ground level through the NMOS discharging path. Thus, with the proposed circuit, the program-verify read voltage VRD can be extended to VDDH without a VTH drop. For the read operation, the high voltage generator circuit is turned off. The VPS1-4 nodes are then switched to VDDH, whereas the VPP1-4 nodes are switched to GND by the circuits shown in Fig. 3. Then, as illustrated in Fig. 4c, the WL begins charging from GND to the VRD level through the VRD NMOS and/or PMOS path, depending on the VRD level. 
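The contrast between the two drivers can be summarized with a small behavioral sketch. The VTH value is an assumed placeholder, and the functions model only the delivered WL level, not the transistor-level dynamics.

```python
# Behavioral comparison: conventional NMOS-string WL driver vs. the
# proposed driver with an added PMOS charging path. Numbers are assumed
# for illustration (VDDH from the text, VTH a placeholder).
VDDH = 2.5   # nominal operating voltage of an individual device (V)
VTH = 0.5    # assumed effective NMOS threshold incl. body effect (V)

def wl_conventional(vrd):
    # An NMOS source follower cannot pull its source above roughly
    # (gate voltage - VTH), so the delivered WL level saturates early.
    return min(vrd, VDDH - VTH)

def wl_proposed(vrd):
    # The NMOS path serves low VRD and the PMOS path takes over for
    # high VRD, so the full 0 V .. VDDH range reaches the selected WL.
    return min(vrd, VDDH)
```

This is exactly the property the text claims: the proposed path selection removes the VTH-drop ceiling, which is what makes the closely spaced 4-bits/cell verify levels reachable.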
Consequently, the proposed circuit extends the read voltage of VRD to VDDH without a VTH drop, enabling reliable 4-bits/cell read operation.</p> </div> <figure class="ltx_figure" id="S2.F4"> <div class="ltx_flex_figure"> <div class="ltx_flex_cell ltx_flex_size_2"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S2.F4.sf1"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="365" id="S2.F4.sf1.g1" src="extracted/6179635/figure_4a.png" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S2.F4.sf1.2.1.1" style="font-size:90%;">(a)</span> </span><span class="ltx_text" id="S2.F4.sf1.3.2" style="font-size:90%;">Program operation</span></figcaption> </figure> </div> <div class="ltx_flex_cell ltx_flex_size_2"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S2.F4.sf2"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="379" id="S2.F4.sf2.g1" src="extracted/6179635/figure_4b.png" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S2.F4.sf2.2.1.1" style="font-size:90%;">(b)</span> </span><span class="ltx_text" id="S2.F4.sf2.3.2" style="font-size:90%;">Program-verify operation</span></figcaption> </figure> </div> <div class="ltx_flex_break"></div> <div class="ltx_flex_cell ltx_flex_size_2"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S2.F4.sf3"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="381" id="S2.F4.sf3.g1" src="extracted/6179635/figure_4c.png" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S2.F4.sf3.2.1.1" style="font-size:90%;">(c)</span> </span><span class="ltx_text" id="S2.F4.sf3.3.2" style="font-size:90%;">Read operation</span></figcaption> </figure> </div> </div> <figcaption 
class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S2.F4.2.1.1" style="font-size:90%;">Figure 4</span>. </span><span class="ltx_text" id="S2.F4.3.2" style="font-size:90%;">Overstress-free WL driver circuit of 4-bits/cell EFLASH with PMOS charging path: (a) for program operation, (b) for a program-verify read operation, and (c) for read operation.</span></figcaption> </figure> </section> </section> <section class="ltx_section" id="S3"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">3. </span>Experiment Results</h2> <div class="ltx_para" id="S3.p1"> <p class="ltx_p" id="S3.p1.1">The proposed standard logic compatible non-volatile AI microcontroller featuring 4-bits/cell EFLASH tightly coupled to the NMCU has been fabricated in a 1 V core supply 28 nm low power standard logic technology. Since 4-bits/cell EFLASH cells are far more likely to transition to adjacent states than to distant states during the cell lifetime, we mapped the 4-bits/cell EFLASH memory states to the 4-bit quantized weight values such that adjacent states differ by one decimal value, as shown in Fig. 5 (a). 
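One way to picture this state-to-weight assignment is the sketch below. The concrete mapping of Fig. 5 (a) may differ; this ruler-style mapping is only an illustration of the stated property that adjacent states encode weights one step apart.

```python
# Hypothetical adjacent-state weight mapping: the 16 threshold-ordered
# cell states are mapped monotonically onto the signed 4-bit weight
# range, so neighboring states encode weights that differ by exactly 1.
STATES = list(range(16))                 # states ordered by threshold level
WEIGHT_OF = {s: s - 8 for s in STATES}   # maps onto int4 weights -8 .. 7

def weight_error_on_drift(state, drift=1):
    # A disturb that shifts a cell to an adjacent state changes the
    # stored weight by at most one quantization step.
    neighbor = max(0, min(15, state + drift))
    return abs(WEIGHT_OF[neighbor] - WEIGHT_OF[state])
```

Under any mapping with this property, the most likely retention failure mode (a one-state shift) perturbs the weight by only the least-significant quantization step, which is why the mapping choice matters for inference accuracy.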
This resulted in a non-uniform distribution of the programmed 4-bits/cell EFLASH memory states, since trained weights are generally concentrated near zero <cite class="ltx_cite ltx_citemacro_citep">(Zhong et al<span class="ltx_text">.</span>, <a class="ltx_ref" href="https://arxiv.org/html/2503.11660v1#bib.bib9" title="">2022</a>)</cite>. Considering this non-uniform distribution, we carefully determined 15 verify read reference levels for the 15 programmed states. By sequentially verifying each programmed state as shown in Fig. 5 (b), 16 distinct states can be programmed with a margin between states. The designed logic compatible HV generator circuits were measured to boost the program voltage (i.e., the VPP4 level) to approximately 10 V, as shown in Fig. 5 (c). The designed WL driver circuits were measured to supply verify reference levels from 0 V to 2.5 V (=VDDH), which are used to verify the 15 programmed states over the full 2.5 V range.</p> </div> <figure class="ltx_figure" id="S3.F5"> <div class="ltx_flex_figure"> <div class="ltx_flex_cell ltx_flex_size_2"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S3.F5.sf1"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="151" id="S3.F5.sf1.g1" src="extracted/6179635/figure_5a.png" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S3.F5.sf1.2.1.1" style="font-size:90%;">(a)</span> </span></figcaption> </figure> </div> <div class="ltx_flex_cell ltx_flex_size_2"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S3.F5.sf2"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_square" height="597" id="S3.F5.sf2.g1" src="extracted/6179635/figure_5b.png" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S3.F5.sf2.2.1.1" style="font-size:90%;">(b)</span> </span></figcaption> 
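The state mapping and sequential program-verify described above can be sketched in a few lines of Python. This is an illustrative sketch only: the state-to-weight table, the 15 verify reference levels, and the program pulse step are assumed values for demonstration, not the chip's actual mapping or measured levels (those are given in Fig. 5). The point it shows is that ordering the 16 states so adjacent states decode to adjacent weights confines a single-level drift to a one-step weight error.

```python
# Illustrative sketch only: the mapping table, verify levels, and pulse step
# below are assumptions for demonstration, not the chip's measured values.

# Map the 16 threshold-ordered cell states S0..S15 to signed 4-bit weights so
# that adjacent states decode to weights differing by exactly one.
STATE_TO_WEIGHT = list(range(-8, 8))      # S0 -> -8, S1 -> -7, ..., S15 -> +7

# 15 hypothetical verify read reference levels spanning 0 V to 2.5 V (VDDH).
VERIFY_REFS = [2.5 * k / 16 for k in range(1, 16)]

def drift_error(state: int) -> int:
    """Weight error when a cell drifts to the next adjacent state."""
    neighbor = min(state + 1, 15)
    return abs(STATE_TO_WEIGHT[neighbor] - STATE_TO_WEIGHT[state])

def program_to_state(target: int, vth: float = 0.0, pulse_dv: float = 0.05) -> float:
    """Incremental program-verify: pulse until the cell passes its verify level."""
    if target == 0:
        return vth                         # erased state needs no program pulses
    while vth < VERIFY_REFS[target - 1]:   # verify read against reference level
        vth += pulse_dv                    # one program pulse raises Vth slightly
    return vth

# A one-level drift perturbs the decoded weight by at most one decimal value.
assert all(drift_error(s) <= 1 for s in range(16))
```

Under such an ordering, a read error between adjacent threshold states costs only one quantization step of the weight, which is what keeps moderate overlap between adjacent states tolerable for inference accuracy.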
</figure> </div> <div class="ltx_flex_break"></div> <div class="ltx_flex_cell ltx_flex_size_3"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S3.F5.sf3"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="340" id="S3.F5.sf3.g1" src="extracted/6179635/figure_5c.png" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S3.F5.sf3.2.1.1" style="font-size:90%;">(c)</span> </span></figcaption> </figure> </div> <div class="ltx_flex_cell ltx_flex_size_3"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S3.F5.sf4"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_landscape" height="238" id="S3.F5.sf4.g1" src="extracted/6179635/figure_5d.png" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S3.F5.sf4.2.1.1" style="font-size:90%;">(d)</span> </span></figcaption> </figure> </div> </div> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S3.F5.2.1.1" style="font-size:90%;">Figure 5</span>. 
</span><span class="ltx_text" id="S3.F5.3.2" style="font-size:90%;">(a) 4-bits/cell EFLASH state mapping table, (b) 16-state program-verify sequence, (c) measured VPP1-4 levels from the logic compatible charge pump, and (d) WL driver output signals (PWL/WWL) for verify operations of 4-bits/cell EFLASH cells</span></figcaption> </figure> <div class="ltx_para" id="S3.p2"> <p class="ltx_p" id="S3.p2.1">To demonstrate actual neural networks on our chip, we evaluated an MLP model trained on the MNIST dataset <cite class="ltx_cite ltx_citemacro_citep">(LeCun et al<span class="ltx_text">.</span>, <a class="ltx_ref" href="https://arxiv.org/html/2503.11660v1#bib.bib6" title="">1998</a>)</cite> and the standard FC-Autoencoder benchmark from MLPerf-Tiny <cite class="ltx_cite ltx_citemacro_citep">(et al., <a class="ltx_ref" href="https://arxiv.org/html/2503.11660v1#bib.bib4" title="">2021</a>)</cite>, before and after baking the fabricated microcontroller chip at 125℃ for 340 and 160 hours, respectively. To fit the weight precision to the 4-bits/cell EFLASH, we performed 4-bit integer quantization-aware training on the MNIST and ToyADMOS datasets. Fig. 6 shows the measured weight distribution of the 4-bits/cell EFLASH cells, and Table <a class="ltx_ref" href="https://arxiv.org/html/2503.11660v1#S3.T1" title="Table 1 ‣ 3. 
Experiment Results ‣ A 28 nm AI microcontroller with tightly coupled zero-standby power weight memory featuring standard logic compatible 4 Mb 4-bits/cell embedded flash technology"><span class="ltx_text ltx_ref_tag">1</span></a> shows the AI inference test results. Although some overlap was observed between adjacent cell states after baking, AI inference accuracy remained robust: 95.58% for MNIST and 0.878 AUC for the FC-Autoencoder. As a result, the inference accuracy degradation relative to the software baseline was limited to 0.04% for MNIST and was not observed for the FC-Autoencoder, for which the 9th layer of the model was implemented on-chip while the other layers were processed off-chip, as described in Fig. 7.</p> </div> <figure class="ltx_figure" id="S3.F6"> <div class="ltx_flex_figure"> <div class="ltx_flex_cell ltx_flex_size_1"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S3.F6.sf1"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_portrait" height="742" id="S3.F6.sf1.g1" src="extracted/6179635/figure_6a.png" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S3.F6.sf1.2.1.1" style="font-size:90%;">(a)</span> </span><span class="ltx_text" id="S3.F6.sf1.3.2" style="font-size:90%;">Weight distribution for MNIST (34K cells)</span></figcaption> </figure> </div> <div class="ltx_flex_break"></div> <div class="ltx_flex_cell ltx_flex_size_2"> <figure class="ltx_figure ltx_figure_panel ltx_align_center" id="S3.F6.sf2"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_img_portrait" height="747" id="S3.F6.sf2.g1" src="extracted/6179635/figure_6b.png" width="598"/> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S3.F6.sf2.2.1.1" style="font-size:90%;">(b)</span> </span><span class="ltx_text" id="S3.F6.sf2.3.2" style="font-size:90%;">Weight distribution for 
Autoencoder (16K cells)</span></figcaption> </figure> </div> </div> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S3.F6.2.1.1" style="font-size:90%;">Figure 6</span>. </span><span class="ltx_text" id="S3.F6.3.2" style="font-size:90%;">Measured weight distribution of 4-bits/cell EFLASH cells</span></figcaption> </figure> <figure class="ltx_table" id="S3.T1"> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_table"><span class="ltx_text" id="S3.T1.2.1.1" style="font-size:90%;">Table 1</span>. </span><span class="ltx_text" id="S3.T1.3.2" style="font-size:90%;">Measured results of AI inference tasks</span></figcaption> <table class="ltx_tabular ltx_align_middle" id="S3.T1.4"> <tr class="ltx_tr" id="S3.T1.4.1"> <td class="ltx_td ltx_align_center ltx_border_tt" id="S3.T1.4.1.1">Inference Accuracy</td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S3.T1.4.1.2">MNIST</td> <td class="ltx_td ltx_align_left ltx_border_tt" id="S3.T1.4.1.3">AutoEncoder</td> </tr> <tr class="ltx_tr" id="S3.T1.4.2"> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.2.1">Before Bake</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S3.T1.4.2.2">95.67%</td> <td class="ltx_td ltx_align_left ltx_border_t" id="S3.T1.4.2.3">0.878 AUC</td> </tr> <tr class="ltx_tr" id="S3.T1.4.3"> <td class="ltx_td ltx_align_center" id="S3.T1.4.3.1">After Bake</td> <td class="ltx_td ltx_align_center" id="S3.T1.4.3.2">95.58%</td> <td class="ltx_td ltx_align_left" id="S3.T1.4.3.3">0.878 AUC</td> </tr> <tr class="ltx_tr" 
id="S3.T1.4.4"> <td class="ltx_td ltx_align_center ltx_border_bb" id="S3.T1.4.4.1">SW. Baseline</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S3.T1.4.4.2">95.62%</td> <td class="ltx_td ltx_align_left ltx_border_bb" id="S3.T1.4.4.3">0.878 AUC</td> </tr> </table> </figure> <figure class="ltx_figure" id="S3.F7"> <div class="ltx_flex_figure"> <div class="ltx_flex_cell ltx_flex_size_2"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_figure_panel ltx_img_landscape" height="324" id="S3.F7.g1" src="extracted/6179635/figure_8.png" width="598"/></div> </div> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S3.F7.2.1.1" style="font-size:90%;">Figure 7</span>. </span><span class="ltx_text" id="S3.F7.3.2" style="font-size:90%;">AI inference model</span></figcaption> </figure> </section> <section class="ltx_section" id="S4"> <h2 class="ltx_title ltx_title_section"> <span class="ltx_tag ltx_tag_section">4. </span>Conclusion</h2> <div class="ltx_para" id="S4.p1"> <p class="ltx_p" id="S4.p1.1">As summarized in Table <a class="ltx_ref" href="https://arxiv.org/html/2503.11660v1#S4.T2" title="Table 2 ‣ 4. Conclusion ‣ A 28 nm AI microcontroller with tightly coupled zero-standby power weight memory featuring standard logic compatible 4 Mb 4-bits/cell embedded flash technology"><span class="ltx_text ltx_ref_tag">2</span></a>, this work presents a unique standard logic compatible non-volatile microcontroller designed for cost-effective battery-powered edge AI device applications. 
A die photograph of the fabricated AI microcontroller is shown in Fig. 8. While alternative AI acceleration solutions employing tightly coupled memory have largely been restricted to single-bit/cell configurations, the proposed AI microcontroller, with its tightly coupled zero-standby-power weight memory, incorporates standard logic compatible 4-bits/cell embedded flash technology for efficient low power edge AI acceleration. The carefully designed state mapping and overstress-free WL driver circuits provide a wide range of verify levels, enabling a sufficient cell margin for 16 distinct cell states. The tightly coupled NMCU processes multi-bit information simultaneously and minimizes internal data movement through a carefully designed ping-pong buffer. The fabricated non-volatile AI microcontroller maintained good accuracy after being baked at 125℃ for more than 160 hours while unpowered.</p> </div> <figure class="ltx_figure" id="S4.F8"> <div class="ltx_flex_figure"> <div class="ltx_flex_cell ltx_flex_size_2"><img alt="Refer to caption" class="ltx_graphics ltx_centering ltx_figure_panel ltx_img_square" height="412" id="S4.F8.g1" src="extracted/6179635/figure_7.jpg" width="419"/></div> </div> <figcaption class="ltx_caption ltx_centering"><span class="ltx_tag ltx_tag_figure"><span class="ltx_text" id="S4.F8.2.1.1" style="font-size:90%;">Figure 8</span>. 
</span><span class="ltx_text" id="S4.F8.3.2" style="font-size:90%;">Die photograph of the fabricated AI microcontroller</span></figcaption> </figure> <figure class="ltx_table" id="S4.T2"> <figcaption class="ltx_caption"><span class="ltx_tag ltx_tag_table"><span class="ltx_text" id="S4.T2.2.1.1" style="font-size:90%;">Table 2</span>. </span><span class="ltx_text" id="S4.T2.3.2" style="font-size:90%;">Comparison table</span></figcaption> <table class="ltx_tabular ltx_align_middle" id="S4.T2.4"> <tr class="ltx_tr" id="S4.T2.4.1"> <td class="ltx_td ltx_border_tt" id="S4.T2.4.1.1"></td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S4.T2.4.1.2"><cite class="ltx_cite ltx_citemacro_citep">(Deaville et al<span class="ltx_text">.</span>, <a class="ltx_ref" href="https://arxiv.org/html/2503.11660v1#bib.bib2" title="">2022</a>)</cite></td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S4.T2.4.1.3"><cite class="ltx_cite ltx_citemacro_citep">(et al., <a class="ltx_ref" href="https://arxiv.org/html/2503.11660v1#bib.bib5" title="">2023</a>)</cite></td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S4.T2.4.1.4"><cite class="ltx_cite ltx_citemacro_citep">(Lin et al<span class="ltx_text">.</span>, <a class="ltx_ref" href="https://arxiv.org/html/2503.11660v1#bib.bib7" title="">2023</a>)</cite></td> <td class="ltx_td ltx_align_center ltx_border_tt" id="S4.T2.4.1.5">This Work</td> </tr> <tr class="ltx_tr" id="S4.T2.4.2"> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.4.2.1">Process</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.4.2.2">22 nm</td> <td 
class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.4.2.3">18 nm</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.4.2.4">28 nm</td> <td class="ltx_td ltx_align_center ltx_border_t" id="S4.T2.4.2.5">28 nm</td> </tr> <tr class="ltx_tr" id="S4.T2.4.3"> <td class="ltx_td ltx_align_center" id="S4.T2.4.3.1">Process Overhead</td> <td class="ltx_td ltx_align_center" id="S4.T2.4.3.2">Yes</td> <td class="ltx_td ltx_align_center" id="S4.T2.4.3.3">No</td> <td class="ltx_td ltx_align_center" id="S4.T2.4.3.4">No</td> <td class="ltx_td ltx_align_center" id="S4.T2.4.3.5">No</td> </tr> <tr class="ltx_tr" id="S4.T2.4.4"> <td class="ltx_td ltx_align_center" id="S4.T2.4.4.1">Memory Config</td> <td class="ltx_td ltx_align_center" id="S4.T2.4.4.2"> <span class="ltx_text" id="S4.T2.4.4.2.1"></span> <span class="ltx_text" id="S4.T2.4.4.2.2"> <span class="ltx_tabular ltx_align_middle" id="S4.T2.4.4.2.2.1"> <span class="ltx_tr" id="S4.T2.4.4.2.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S4.T2.4.4.2.2.1.1.1">1 bit/cell</span></span> <span class="ltx_tr" id="S4.T2.4.4.2.2.1.2"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S4.T2.4.4.2.2.1.2.1">MRAM</span></span> </span></span><span class="ltx_text" id="S4.T2.4.4.2.3"></span></td> <td class="ltx_td ltx_align_center" id="S4.T2.4.4.3"> <span class="ltx_text" id="S4.T2.4.4.3.1"></span> <span class="ltx_text" id="S4.T2.4.4.3.2"> <span class="ltx_tabular ltx_align_middle" id="S4.T2.4.4.3.2.1"> <span class="ltx_tr" id="S4.T2.4.4.3.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S4.T2.4.4.3.2.1.1.1">1 bit/cell</span></span> <span class="ltx_tr" id="S4.T2.4.4.3.2.1.2"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S4.T2.4.4.3.2.1.2.1">SRAM</span></span> </span></span><span class="ltx_text" id="S4.T2.4.4.3.3"></span></td> <td class="ltx_td ltx_align_center" id="S4.T2.4.4.4"> <span class="ltx_text" id="S4.T2.4.4.4.1"></span> <span class="ltx_text" id="S4.T2.4.4.4.2"> <span 
class="ltx_tabular ltx_align_middle" id="S4.T2.4.4.4.2.1"> <span class="ltx_tr" id="S4.T2.4.4.4.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S4.T2.4.4.4.2.1.1.1">1 bit/cell</span></span> <span class="ltx_tr" id="S4.T2.4.4.4.2.1.2"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S4.T2.4.4.4.2.1.2.1">SRAM</span></span> </span></span><span class="ltx_text" id="S4.T2.4.4.4.3"></span></td> <td class="ltx_td ltx_align_center" id="S4.T2.4.4.5"> <span class="ltx_text" id="S4.T2.4.4.5.1"></span> <span class="ltx_text" id="S4.T2.4.4.5.2"> <span class="ltx_tabular ltx_align_middle" id="S4.T2.4.4.5.2.1"> <span class="ltx_tr" id="S4.T2.4.4.5.2.1.1"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S4.T2.4.4.5.2.1.1.1">4 bits/cell</span></span> <span class="ltx_tr" id="S4.T2.4.4.5.2.1.2"> <span class="ltx_td ltx_nopad_r ltx_align_center" id="S4.T2.4.4.5.2.1.2.1">EFLASH</span></span> </span></span><span class="ltx_text" id="S4.T2.4.4.5.3"></span></td> </tr> <tr class="ltx_tr" id="S4.T2.4.5"> <td class="ltx_td ltx_align_center" id="S4.T2.4.5.1">Non-Volatile</td> <td class="ltx_td ltx_align_center" id="S4.T2.4.5.2">Yes</td> <td class="ltx_td ltx_align_center" id="S4.T2.4.5.3">No</td> <td class="ltx_td ltx_align_center" id="S4.T2.4.5.4">No</td> <td class="ltx_td ltx_align_center" id="S4.T2.4.5.5">Yes</td> </tr> <tr class="ltx_tr" id="S4.T2.4.6"> <td class="ltx_td ltx_align_center" id="S4.T2.4.6.1">Activation Precision</td> <td class="ltx_td ltx_align_center" id="S4.T2.4.6.2">1b</td> <td class="ltx_td ltx_align_center" id="S4.T2.4.6.3">1-4b</td> <td class="ltx_td ltx_align_center" id="S4.T2.4.6.4">8b</td> <td class="ltx_td ltx_align_center" id="S4.T2.4.6.5">8b</td> </tr> <tr class="ltx_tr" id="S4.T2.4.7"> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T2.4.7.1">Weight Precision</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T2.4.7.2">4b</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T2.4.7.3">1-4b</td> <td 
class="ltx_td ltx_align_center ltx_border_bb" id="S4.T2.4.7.4">8b</td> <td class="ltx_td ltx_align_center ltx_border_bb" id="S4.T2.4.7.5">4b</td> </tr> </table> </figure> <div class="ltx_acknowledgements"> <h6 class="ltx_title ltx_title_acknowledgements">Acknowledgements.</h6> This work was partly supported by the DIPS 1000+ Fabless Challenge award and TIPS grants funded by the Ministry of SMEs and Startups (S3318962) and the Institute of Information & Communications Technology Planning & Evaluation (IITP) grants funded by the Korea government (MSIT) (RS-2023-00216370 and RS-2023-00229849). </div> </section> <section class="ltx_bibliography" id="bib"> <h2 class="ltx_title ltx_title_bibliography">References</h2> <ul class="ltx_biblist"> <li class="ltx_bibitem" id="bib.bib2"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem">Deaville et al<span class="ltx_text" id="bib.bib2.2.2.1">.</span> (2022)</span> <span class="ltx_bibblock"> Peter Deaville, Bonan Zhang, and Naveen Verma. 2022. </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib2.3.1">A 22nm 128-kb MRAM Row/Column-Parallel In-Memory Computing Macro with Memory-Resistance Boosting and Multi-Column ADC Readout</em>. </span> <span class="ltx_bibblock">Symposium on VLSI Technology & Circuits. </span> <span class="ltx_bibblock"> </span> </li> <li class="ltx_bibitem" id="bib.bib3"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem">et al. (2018)</span> <span class="ltx_bibblock"> B. Jacob et al. 2018. </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib3.1.1">Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference</em>. </span> <span class="ltx_bibblock">IEEE CVPR. 
</span> <span class="ltx_bibblock"> </span> </li> <li class="ltx_bibitem" id="bib.bib4"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem">et al. (2021)</span> <span class="ltx_bibblock"> C. Banbury et al. 2021. </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib4.1.1">MLPerf Tiny Benchmark</em>. </span> <span class="ltx_bibblock">Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1). </span> <span class="ltx_bibblock"> </span> <span class="ltx_bibblock"><a class="ltx_ref ltx_url ltx_font_typewriter" href="https://openreview.net/forum?id=8RxxwAut1BI" title="">https://openreview.net/forum?id=8RxxwAut1BI</a>. </span> </li> <li class="ltx_bibitem" id="bib.bib5"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem">et al. (2023)</span> <span class="ltx_bibblock"> Desoli et al. 2023. </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib5.1.1">16.7 A 40-310TOPS/W SRAM-Based All-Digital Up to 4b In-Memory Computing Multi-Tiled NN Accelerator in FD-SOI 18nm for Deep-Learning Edge Applications</em>. </span> <span class="ltx_bibblock">ISSCC. </span> <span class="ltx_bibblock"> </span> </li> <li class="ltx_bibitem" id="bib.bib6"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem">LeCun et al<span class="ltx_text" id="bib.bib6.2.2.1">.</span> (1998)</span> <span class="ltx_bibblock"> Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. 1998. </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib6.3.1">Gradient-based learning applied to document recognition</em>. </span> <span class="ltx_bibblock">Proceedings of the IEEE. 
</span> <span class="ltx_bibblock"> </span> </li> <li class="ltx_bibitem" id="bib.bib7"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem">Lin et al<span class="ltx_text" id="bib.bib7.3.2.1">.</span> (2023)</span> <span class="ltx_bibblock"> Chuan-Tung Lin, Paul Xuanyuanliang Huang, Jonghyun Oh, Dewei Wang, and Mingoo Seok. 2023. </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib7.1.1">iMCU: A 102-<math alttext="\mu" class="ltx_Math" display="inline" id="bib.bib7.1.1.m1.1"><semantics id="bib.bib7.1.1.m1.1a"><mi id="bib.bib7.1.1.m1.1.1" xref="bib.bib7.1.1.m1.1.1.cmml">μ</mi><annotation-xml encoding="MathML-Content" id="bib.bib7.1.1.m1.1b"><ci id="bib.bib7.1.1.m1.1.1.cmml" xref="bib.bib7.1.1.m1.1.1">𝜇</ci></annotation-xml><annotation encoding="application/x-tex" id="bib.bib7.1.1.m1.1c">\mu</annotation><annotation encoding="application/x-llamapun" id="bib.bib7.1.1.m1.1d">italic_μ</annotation></semantics></math>J, 61-ms Digital In-Memory Computing-based Microcontroller Unit for Edge TinyML</em>. </span> <span class="ltx_bibblock">CICC. </span> <span class="ltx_bibblock"> </span> </li> <li class="ltx_bibitem" id="bib.bib8"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem">Song et al<span class="ltx_text" id="bib.bib8.2.2.1">.</span> (2013)</span> <span class="ltx_bibblock"> Seung-Hwan Song, Ki Chul Chun, and Chris H. Kim. 2013. </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib8.3.1">A Logic-Compatible Embedded Flash Memory for Zero-Standby Power System-on-Chips Featuring a Multi-Story High Voltage Switch and a Selective Refresh Scheme</em>. </span> <span class="ltx_bibblock">IEEE JSSC. </span> <span class="ltx_bibblock"> </span> </li> <li class="ltx_bibitem" id="bib.bib9"> <span class="ltx_tag ltx_role_refnum ltx_tag_bibitem">Zhong et al<span class="ltx_text" id="bib.bib9.2.2.1">.</span> (2022)</span> <span class="ltx_bibblock"> Weishun Zhong, Ben Sorscher, Daniel D Lee, and Haim Sompolinsky. 
2022. </span> <span class="ltx_bibblock"><em class="ltx_emph ltx_font_italic" id="bib.bib9.3.1">A theory of learning with constrained weight-distribution</em>. </span> <span class="ltx_bibblock">36th Conference on Neural Information Processing Systems. </span> <span class="ltx_bibblock"> </span> </li> </ul> </section> </article> </div> </div> </body> </html>