Accelerating Neural Networks on Mobile and Web with Sparse Inference

<!DOCTYPE html> <html lang="en"> <head> <meta charset="utf-8" /> <meta name="description" content="Posted by Artsiom Ablavatski and Marat Dukhan, Software Engineers, Google Research On-device inference of neural networks enables a variety of real..."><meta name="keywords" content="Deep Learning,TensorFlow,Machine Learning"><link rel="canonical" href="https://research.google/blog/accelerating-neural-networks-on-mobile-and-web-with-sparse-inference/" /><meta property="og:title" content="Accelerating Neural Networks on Mobile and Web with Sparse Inference"><meta property="og:url" content="https://research.google/blog/accelerating-neural-networks-on-mobile-and-web-with-sparse-inference/"><meta property="og:description" content="Posted by Artsiom Ablavatski and Marat Dukhan, Software Engineers, Google Research On-device inference of neural networks enables a variety of real..."><meta property="og:image" content="https://storage.googleapis.com/gweb-research2023-media/images/8baef8dd4b7870b70240b7ebd016f7c1-i.width-800.format-jpeg.jpg"><meta property="og:image:secure_url" content="https://storage.googleapis.com/gweb-research2023-media/images/8baef8dd4b7870b70240b7ebd016f7c1-i.width-800.format-jpeg.jpg"><meta property="og:type" content="Website"> <title>Accelerating Neural Networks on Mobile and Web with Sparse Inference</title>
<meta name="viewport" content="width=device-width, initial-scale=1, viewport-fit=cover"/> <link rel="icon" type="image/png" href="/gr/static/assets/favicon.ico"> <link rel="preconnect" href="https://fonts.googleapis.com"> <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin> <link rel="preload" href="https://fonts.googleapis.com/css2?family=Product+Sans&family=Google+Sans+Display:ital@0;1&family=Google+Sans:ital,wght@0,400;0,500;0,700;1,400;1,500;1,700&family=Google+Sans+Text:ital,wght@0,400;0,500;0,700;1,400;1,500;1,700&display=swap" as="style"> <link rel="stylesheet" href="https://fonts.googleapis.com/css2?family=Product+Sans&family=Google+Sans+Display:ital@0;1&family=Google+Sans:ital,wght@0,400;0,500;0,700;1,400;1,500;1,700&family=Google+Sans+Text:ital,wght@0,400;0,500;0,700;1,400;1,500;1,700&display=swap"> <link href="https://fonts.googleapis.com/css2?family=Roboto+Mono:wght@400;700&display=swap" rel="stylesheet"> <link href="https://www.gstatic.com/glue/cookienotificationbar/cookienotificationbar.min.css" rel="stylesheet" /> <link href="https://www.gstatic.com/glue/v27_1/glue-material.min.css" rel="stylesheet"> <link rel="stylesheet" type="text/css" href="/gr/static/css/googleresearch.css?id=0c26ea1fed8bdd0324f9f4fad1f6a470"> <script> window.dataLayer = window.dataLayer || []; dataLayer.push({ publishDate: '20210309', wordCount: '1599' }); </script> <!-- Google Tag Manager --> <script>(function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start': new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0], j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src= 'https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f); })(window,document,'script','dataLayer','GTM-K8QBZ7Q'); </script> <!-- End Google Tag Manager --> </head> <body class=" js-google-tag-wrapper" data-gt-page-path="https://research.google/blog/accelerating-neural-networks-on-mobile-and-web-with-sparse-inference/" 
data-env="production"> <!-- Google Tag Manager (noscript) --> <noscript><iframe src="https://www.googletagmanager.com/ns.html?id=GTM-K8QBZ7Q" height="0" width="0" style="display:none;visibility:hidden"></iframe></noscript> <!-- End Google Tag Manager (noscript) --> <main id="page-content"> <div class="blog-detail-page --legacy " > <section class="basic-hero bhoig --theme-dark --large-image" data-gt-id="basic_hero" data-gt-component-name=""> <div class="glue-page"> <div class="glue-grid"> <div class="bhoig__image-wrapper glue-grid__col--span-4 glue-grid__col--span-5-md glue-grid__col--span-4-lg"> <div class="bhoig__image-bg" style=""> <picture> <img src="https://storage.googleapis.com/gweb-research2023-media/original_images/8baef8dd4b7870b70240b7ebd016f7c1-image3.gif" alt="" class=""/> </picture> </div> </div> <div class="bhoig__breadcrumb-wrapper glue-grid__col--span-10 glue-grid__col--span-9-md glue-grid__col--span-10-lg"> </div> <h1 class="headline-1 bhoig__headline glue-grid__col--span-8 glue-grid__col--span-7-md glue-grid__col--span-8-lg">Accelerating Neural Networks on Mobile and Web with Sparse Inference</h1> <div class="basic-hero__description bhoig__description glue-grid__col--span-8 glue-grid__col--span-7-md glue-grid__col--span-8-lg"> <div class="basic-hero--blog-detail__description"><p>March 9, 2021</p><span class="dot-separator"></span><p>Posted by Artsiom Ablavatski and Marat Dukhan, Software Engineers, Google 
Research</p></div> </div> <div class="bhoig__cta glue-grid__col--span-8 glue-grid__col--span-7-md glue-grid__col--span-8-lg"> </div> </div> </div> </section> <div class="glue-page"> <div class="glue-grid blog-detail-page__grid"> <div class="glue-grid__col glue-grid__col--span-4-sm glue-grid__col--span-12-md glue-grid__col--span-9-lg"> <div class="quicklinks-wrapper--mobile"> <div class="block-quick_links"> <section class="quicklinks"> <h2 class="eyebrow">Quick links</h2> <ul class="quicklinks__list"> <li class="quicklinks__item quicklinks__item--share js-quicklinks-share"> <button class="quicklinks__share-button js-quicklinks-share__button" aria-expanded="false" aria-controls="js-quicklinks-share__list"> <span class="icon icon--share"></span> <span class="quicklinks__item__text">Share</span> </button> <section class="glue-social glue-social--monochrome quicklinks__share-list js-quicklinks-share__list glue-elevation-level-1 js-gt-share-wrapper"> <div class="glue-social__group"> <ul class="glue-social__list" role="list"> <li class="glue-social__item"> <a class="glue-social__link" href="https://twitter.com/intent/tweet?text=https%3A//research.google/blog/accelerating-neural-networks-on-mobile-and-web-with-sparse-inference/" title="Share on Twitter" target="_blank" rel="noopener" data-gt-method="x"> <svg role="presentation" aria-hidden="true" class="glue-icon glue-icon--social glue-icon--24px"> <use href="/gr/static/assets/icons/twitter-x.svg#twitter-x"></use> </svg> </a> </li> <li class="glue-social__item"> <a class="glue-social__link" href="https://www.facebook.com/sharer/sharer.php?u=https%3A//research.google/blog/accelerating-neural-networks-on-mobile-and-web-with-sparse-inference/" title="Share on Facebook" target="_blank" rel="noopener" data-gt-method="facebook"> <svg role="presentation" aria-hidden="true" class="glue-icon glue-icon--social glue-icon--color-facebook glue-icon--24px"> <use href="/gr/static/assets/icons/facebook.svg#facebook"></use> </svg> </a> 
</li> <li class="glue-social__item"> <a class="glue-social__link" href="https://www.linkedin.com/shareArticle?url=https%3A//research.google/blog/accelerating-neural-networks-on-mobile-and-web-with-sparse-inference/&amp;mini=true" title="Share on LinkedIn" target="_blank" rel="noopener" data-gt-method="linkedin"> <svg role="presentation" aria-hidden="true" class="glue-icon glue-icon--social glue-icon--color-linkedin glue-icon--24px"> <use href="/gr/static/assets/icons/glue-icons.svg#post-linkedin"></use> </svg> </a> </li> <li class="glue-social__item"> <a class="glue-social__link" href="mailto:name@example.com?subject=Check%20out%20this%20site&body=Check%20out%20https%3A//research.google/blog/accelerating-neural-networks-on-mobile-and-web-with-sparse-inference/" title="Send via Email" data-gt-method="email"> <svg role="presentation" aria-hidden="true" class="glue-icon glue-icon--social glue-icon--color-sharemail glue-icon--24px"> <use href="/gr/static/assets/icons/glue-icons.svg#email"></use> </svg> </a> </li> <li class="glue-social__item"> <div class="glue-social__popover"> <div class="glue-social__icon-trigger" aria-label="Get shareable link" title="Get shareable link" id="share-static-popover-trigger"> <svg role="presentation" aria-hidden="true" class="glue-icon glue-icon--social glue-icon--color-sharelink glue-icon--24px"> <use href="/gr/static/assets/icons/glue-icons.svg#link"></use> </svg> </div> <div class="glue-social__dialog" id="share-popover-dialog"> <svg role="presentation" aria-hidden="true" class="glue-icon glue-icon--social glue-icon--color-sharelink glue-icon--24px"> <use href="/public/icons/glue-icons.svg#link"></use> </svg> <div class="glue-social__copy" glue-copy-success="Copied to clipboard" glue-copy-fail="Press Ctrl+C or ⌘+C to copy"> <input class="glue-social__copy-input" readonly="" type="text" value="https://research.google/blog/accelerating-neural-networks-on-mobile-and-web-with-sparse-inference/" aria-label="URL"> <button 
class="glue-social__copy-btn" id="share-copy-btn" data-gt-method="link-copied">Copy link</button> </div> <div aria-label="Close" class="glue-social__close-btn"> × </div> </div> </div> </li> </ul> </div> </section> </li> </ul> </section> </div> </div> <div class="blog-detail-wrapper js-gt-blog-detail-wrapper" data-gt-publish-date="20210309"> <div class="rich-text --theme- --mode-" data-gt-id="rich_text" data-gt-component-name=""> <img src="https://1.bp.blogspot.com/-Oek0gYcQBJk/YEe0hcHAeOI/AAAAAAAAHQ0/c_j2LjrFaI4Pj0ahdLsT5H59y-c-sp3oQCLcBGAsYHQ/w640-h186/image3.gif" style="display: none;" /> <p> On-device inference of neural networks enables a variety of real-time applications, like <a href="https://ai.googleblog.com/2020/08/on-device-real-time-body-pose-tracking.html" target="_blank" rel="noopener noreferrer">pose estimation</a> and <a href="https://ai.googleblog.com/2020/10/background-features-in-google-meet.html" target="_blank" rel="noopener noreferrer">background blur</a>, in a low-latency and privacy-conscious way. Using ML inference frameworks like <a href="https://www.tensorflow.org/lite" target="_blank" rel="noopener noreferrer">TensorFlow Lite</a> with the <a href="https://blog.tensorflow.org/2020/07/accelerating-tensorflow-lite-xnnpack-integration.html" target="_blank" rel="noopener noreferrer">XNNPACK</a> ML acceleration library, engineers optimize their models to run on a variety of devices by finding a sweet spot between model size, inference speed and the quality of the predictions. </p> <a name='more'></a> <p> One way to optimize a model is through the use of sparse neural networks [<a href="https://arxiv.org/pdf/2102.00554.pdf" target="_blank" rel="noopener noreferrer">1</a>, <a href="https://arxiv.org/pdf/1902.09574.pdf" target="_blank" rel="noopener noreferrer">2</a>, <a href="https://arxiv.org/pdf/1911.09723.pdf" target="_blank" rel="noopener noreferrer">3</a>], which have a significant fraction of their weights set to zero. 
In general, this is a desirable quality as it not only reduces the model size via compression, but also makes it possible to skip a significant fraction of multiply-add operations, thereby speeding up inference. Further, it is possible to increase the number of parameters in a model and then sparsify it to match the quality of the original model, while still benefiting from the accelerated inference. However, the use of this technique remains limited in production, largely due to a lack of tools to sparsify popular convolutional architectures as well as insufficient support for running these operations on-device. </p> <p> Today we announce the release of a set of new features for the <a href="https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/delegates/xnnpack/README.md#sparse-inference" target="_blank" rel="noopener noreferrer">XNNPACK acceleration library</a> and TensorFlow Lite that enable efficient inference of sparse networks, along with guidelines on how to sparsify neural networks, with the goal of helping researchers develop their own sparse on-device models. Developed in collaboration with <a href="https://deepmind.com/" target="_blank" rel="noopener noreferrer">DeepMind</a>, these tools power a new generation of live perception experiences, including <a href="https://ai.googleblog.com/2019/08/on-device-real-time-hand-tracking-with.html" target="_blank" rel="noopener noreferrer">hand tracking</a> in <a href="https://mediapipe.dev" target="_blank" rel="noopener noreferrer">MediaPipe</a> and <a href="https://ai.googleblog.com/2020/10/background-features-in-google-meet.html" target="_blank" rel="noopener noreferrer">background features</a> in Google Meet, speeding up inference by a factor of 1.2 to 2.4 while halving the model size. 
In this post, we provide a technical overview of sparse neural networks — from inducing sparsity during training to on-device deployment — and offer some ideas on how researchers might create their own sparse models.</p> <table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-xuf3V4bqAm0/YEfW-ga64JI/AAAAAAAAHSE/JURKIQO5Gf0NfXOIOi5sJOPdvkpli3v1gCLcBGAsYHQ/s642/image5.gif" target="_blank" rel="noopener noreferrer"><img border="0" data-original-height="178" data-original-width="642" height="178" src="https://1.bp.blogspot.com/-xuf3V4bqAm0/YEfW-ga64JI/AAAAAAAAHSE/JURKIQO5Gf0NfXOIOi5sJOPdvkpli3v1gCLcBGAsYHQ/w640-h178/image5.gif" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Comparison of the processing time for the dense (<b>left</b>) and sparse (<b>right</b>) models of the same quality for Google Meet <a href="https://ai.googleblog.com/2020/10/background-features-in-google-meet.html" target="_blank" rel="noopener noreferrer">background features</a>. 
For readability, the processing time shown is the moving average across 100 frames.</td></tr></tbody></table> <br> <div style="line-height:40%;"> <br> </div> <h2>Sparsifying a Neural Network</h2><p> Many modern deep learning architectures, like <a href="https://arxiv.org/abs/1704.04861" target="_blank" rel="noopener noreferrer">MobileNet</a> and <a href="https://blog.tensorflow.org/2020/03/higher-accuracy-on-vision-models-with-efficientnet-lite.html" target="_blank" rel="noopener noreferrer">EfficientNetLite</a>, are primarily composed of <a href="https://arxiv.org/abs/1610.02357" target="_blank" rel="noopener noreferrer">depthwise convolutions</a> with a small spatial kernel and <a href="https://arxiv.org/pdf/1409.4842.pdf" target="_blank" rel="noopener noreferrer">1x1 convolutions</a> that linearly combine features from the input image. While such architectures have a number of potential targets for sparsification, including the full <a href="https://en.wikipedia.org/wiki/Convolutional_neural_network#Convolutional_layer" target="_blank" rel="noopener noreferrer">2D convolutions</a> that frequently occur at the beginning of many networks or the <a href="https://www.tensorflow.org/api_docs/python/tf/keras/layers/DepthwiseConv2D" target="_blank" rel="noopener noreferrer">depthwise convolutions</a>, it is the 1x1 convolutions that are the most expensive operators as measured by inference time. Because they account for over 65% of the total compute, they are an optimal target for sparsification. 
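</p> <p> As a rough illustration of why 1x1 convolutions dominate, the sketch below counts multiply-add operations for a single depthwise-separable block. The layer shape (a 56x56 feature map with 128 input and 128 output channels and a 3x3 depthwise kernel) and the helper names are hypothetical, chosen only for illustration, and operation counts are only a proxy for measured inference time. </p>
<pre>
```python
def depthwise_macs(height, width, channels, kernel=3):
    """Multiply-adds for a depthwise convolution (stride 1, same padding)."""
    return height * width * channels * kernel * kernel


def pointwise_macs(height, width, in_channels, out_channels):
    """Multiply-adds for a 1x1 convolution: one dot product over the
    input channels per pixel and per output channel."""
    return height * width * in_channels * out_channels


# Hypothetical layer shape, chosen only for illustration.
H, W, C_IN, C_OUT = 56, 56, 128, 128

dw = depthwise_macs(H, W, C_IN)
pw = pointwise_macs(H, W, C_IN, C_OUT)
pointwise_share = pw / (dw + pw)

print(f"depthwise 3x3:   {dw} multiply-adds")
print(f"1x1 convolution: {pw} multiply-adds")
print(f"1x1 share of the block: {pointwise_share:.1%}")
```
</pre>
<p> For this assumed shape the 1x1 convolution performs roughly 93% of the block's multiply-adds, so zeroing its weights removes far more work than pruning the depthwise kernel would.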
 </p> <br> <table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container"> <tbody><tr> <td style="text-align: left;"><strong>Architecture</strong></td> <td style="text-align: center;"><strong>Inference time spent in 1x1 convolutions</strong></td> </tr> <tr> <td style="text-align: left;"><a href="https://arxiv.org/abs/1704.04861" target="_blank" rel="noopener noreferrer">MobileNet</a></td> <td style="text-align: center;">85%</td> </tr> <tr> <td style="text-align: left;"><a href="https://arxiv.org/abs/1801.04381" target="_blank" rel="noopener noreferrer">MobileNetV2</a></td> <td style="text-align: center;">71%</td> </tr> <tr> <td style="text-align: left;"><a href="https://arxiv.org/abs/1905.02244" target="_blank" rel="noopener noreferrer">MobileNetV3</a></td> <td style="text-align: center;">71%</td> </tr> <tr> <td style="text-align: left;"><a href="https://blog.tensorflow.org/2020/03/higher-accuracy-on-vision-models-with-efficientnet-lite.html" target="_blank" rel="noopener noreferrer">EfficientNet-Lite</a></td> <td style="text-align: center;">66%</td> </tr> </tbody></table> <table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"> <tbody><tr><td class="tr-caption" style="text-align: center;">Share of inference time spent in 1x1 convolutions for modern mobile architectures.</td></tr> </tbody></table> <br> <p> In modern on-device inference engines, like XNNPACK, the implementations of 1x1 convolutions and of other operations in deep learning models <a href="https://arxiv.org/pdf/1907.01989.pdf" target="_blank" rel="noopener noreferrer">rely on the HWC tensor layout</a>, in which the tensor dimensions correspond to the height, width, and channel (e.g., red, green or blue) of the input image. This tensor configuration allows the inference engine to process the channels corresponding to each spatial location (i.e., each pixel of an image) in parallel. 
However, this ordering of the tensor is not a good fit for sparse inference because it sets the channel as the innermost dimension of the tensor and makes it more computationally expensive to access. </p> <p> Our updates to XNNPACK enable it to detect if a model is sparse. If so, it switches from its standard dense inference mode to <a href="https://arxiv.org/pdf/1911.09723.pdf" target="_blank" rel="noopener noreferrer">sparse inference</a> mode, in which it employs a CHW (channel, height, width) tensor layout. This reordering of the tensor allows for an accelerated implementation of the sparse 1x1 convolution kernel for two reasons: 1) entire spatial slices of the tensor can be skipped when the corresponding channel weight is zero following a single condition check, instead of a per-pixel test; and 2) when the channel weight is non-zero, the computation can be made more efficient by loading neighbouring pixels into the same memory unit. This enables us to process multiple pixels simultaneously, while also performing each operation in parallel across several threads. Together these changes result in a speed-up of 1.8x to 2.3x when at least 80% of the weights are zero. </p> <p> In order to avoid converting back and forth between the CHW tensor layout that is optimal for sparse inference and the standard HWC tensor layout after each operation, XNNPACK provides efficient implementations of <a href="https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/delegates/xnnpack/README.md#sparse-inference" target="_blank" rel="noopener noreferrer">several CNN operators</a> in CHW layout. </p> <div style="line-height:40%;"> <br> </div> <h2>Guidelines for Training Sparse Neural Networks</h2><p> To create a sparse neural network, the guidelines included in this release suggest one start with a dense version and then gradually set a fraction of its weights to zero during training. This process is called pruning. 
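</p> <p> A minimal sketch of the idea, written in plain Python, is shown below. It performs one-shot magnitude pruning on a small weight matrix by zeroing the weights with the smallest absolute values. The function name <code>magnitude_prune</code> and the example weights are hypothetical, and this is not the implementation used by any particular library; practical tools raise the sparsity gradually over many training steps rather than in a single pass. </p>
<pre>
```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` (a list of rows) in which the given
    fraction of entries with the smallest absolute value is set to zero."""
    rows, cols = len(weights), len(weights[0])
    # Rank every position by the magnitude of its weight, smallest first.
    positions = sorted(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: abs(weights[rc[0]][rc[1]]),
    )
    num_to_prune = int(rows * cols * sparsity)
    pruned = [list(row) for row in weights]
    for r, c in positions[:num_to_prune]:
        pruned[r][c] = 0.0
    return pruned


# Hypothetical 2x4 weight matrix, pruned to 50% sparsity.
weights = [
    [0.9, -0.05, 0.3, 0.01],
    [-0.02, 0.7, -0.1, 0.04],
]
pruned = magnitude_prune(weights, sparsity=0.5)
print(pruned)
```
</pre>
<p> Here the four smallest-magnitude weights (0.01, -0.02, 0.04 and -0.05) are zeroed and the four largest are kept unchanged. </p> <p>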
Of the many available techniques for pruning, we recommend using <a href="https://blog.tensorflow.org/2019/05/tf-model-optimization-toolkit-pruning-API.html" target="_blank" rel="noopener noreferrer">magnitude pruning</a> (available in the <a href="https://www.tensorflow.org/model_optimization" target="_blank" rel="noopener noreferrer">TF Model Optimization Toolkit</a>) or the recently introduced <a href="https://ai.googleblog.com/2020/09/improving-sparse-training-with-rigl.html" target="_blank" rel="noopener noreferrer">RigL</a> method. With a modest increase in training time, both of these can successfully sparsify deep learning models without degrading their quality. The resulting sparse models can be <a href="https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/python/lite.py#L108" target="_blank" rel="noopener noreferrer">stored efficiently in a compressed format</a> that reduces the size by a factor of two compared to their dense equivalent. </p> <p> The quality of sparse networks is influenced by several hyperparameters, including training time, learning rate and schedules for pruning. The <a href="https://www.tensorflow.org/model_optimization/guide/pruning/comprehensive_guide?hl=en" target="_blank" rel="noopener noreferrer">TF Pruning API</a> provides an excellent example of how to select these, as well as some tips for training such models. We recommend running hyperparameter searches to find the sweet spot for your application. </p> <div style="line-height:40%;"> <br> </div> <h2>Applications</h2><p> We demonstrate that it is possible to sparsify models for classification, dense segmentation (e.g., <a href="https://ai.googleblog.com/2020/10/background-features-in-google-meet.html" target="_blank" rel="noopener noreferrer">Meet background blur</a>) and regression (<a href="https://google.github.io/mediapipe/solutions/hands.html" target="_blank" rel="noopener noreferrer">MediaPipe Hands</a>), which provides tangible benefits to users. 
For example, in the case of Google Meet, sparsification lowered the inference time of the model by 30%, which provided access to higher quality models for more users. </p> <table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-tkr7VEYdqYQ/YEfPhZugVOI/AAAAAAAAHR8/Jq_7oUQCNAI5qogP4tUGa0zhHGcG0Nf-gCLcBGAsYHQ/s1374/use%2Bthis%2Bscreenshot%2BXeno%2BSparse.png" target="_blank" rel="noopener noreferrer"><img border="0" data-original-height="683" data-original-width="1374" height="318" src="https://1.bp.blogspot.com/-tkr7VEYdqYQ/YEfPhZugVOI/AAAAAAAAHR8/Jq_7oUQCNAI5qogP4tUGa0zhHGcG0Nf-gCLcBGAsYHQ/w640-h318/use%2Bthis%2Bscreenshot%2BXeno%2BSparse.png" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Model size comparisons for the dense and sparse models in Mb. The models have been stored in 16- and 32-bit floating-point formats.</td></tr></tbody></table> <br> <p> The approach to sparsity described here works best with architectures based on <a href="https://openaccess.thecvf.com/content_cvpr_2018/papers/Sandler_MobileNetV2_Inverted_Residuals_CVPR_2018_paper.pdf" target="_blank" rel="noopener noreferrer">inverted residual blocks</a>, such as <a href="https://arxiv.org/abs/1801.04381" target="_blank" rel="noopener noreferrer">MobileNetV2</a>, <a href="https://arxiv.org/abs/1905.02244" target="_blank" rel="noopener noreferrer">MobileNetV3</a> and <a href="https://blog.tensorflow.org/2020/03/higher-accuracy-on-vision-models-with-efficientnet-lite.html" target="_blank" rel="noopener noreferrer">EfficientNetLite</a>. The degree of sparsity in a network influences both inference speed and quality. Starting from a dense network of a fixed capacity, we found modest performance gains even at 30% sparsity. 
With increased sparsity, the quality of the model remains relatively close to the dense baseline until reaching 70% sparsity, beyond which there is a more pronounced drop in accuracy. However, one can compensate for the reduced accuracy at 70% sparsity by increasing the size of the base network by 20%, which results in faster inference times without degrading the quality of the model. No further changes are required to run the sparsified models, because XNNPACK can recognize and automatically enable sparse inference. </p> <table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-wEjLztSohMw/YEkVpmDYpxI/AAAAAAAAHSM/BKv8xyeqnwIOBAI1GuUdM80P0IWWXQaSwCLcBGAsYHQ/s1049/Sparsity.png" target="_blank" rel="noopener noreferrer"><img border="0" data-original-height="1049" data-original-width="971" height="640" src="https://1.bp.blogspot.com/-wEjLztSohMw/YEkVpmDYpxI/AAAAAAAAHSM/BKv8xyeqnwIOBAI1GuUdM80P0IWWXQaSwCLcBGAsYHQ/w592-h640/Sparsity.png" width="592" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Ablation studies of different sparsity levels with respect to inference time (the smaller the better) and the quality measured by the <a href="https://en.wikipedia.org/wiki/Jaccard_index" target="_blank" rel="noopener noreferrer">Intersection over Union</a> (IoU) for predicted segmentation mask.</td></tr></tbody></table>
 <br> <div style="line-height:40%;"> <br> </div> <h2>Sparsity as Automatic Alternative to Distillation</h2><p> Background blur in Google Meet uses a segmentation model based on a modified MobileNetV3 backbone with <a href="https://arxiv.org/pdf/1709.01507.pdf" target="_blank" rel="noopener noreferrer">attention blocks</a>. We were able to speed up the model by 30% by applying a 70% sparsification, while preserving the quality of the foreground mask. We examined the predictions of the sparse and dense models on images from 17 geographic subregions, finding no significant difference, and released the details in the associated <a href="https://mediapipe.page.link/meet-segmentation-sparse-mc" target="_blank" rel="noopener noreferrer">model card</a>. 
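</p> <p> One way to sanity-check an end-to-end figure like this is Amdahl's law: only the 1x1 convolutions get faster, so the overall gain is bounded by their share of the run time. The sketch below is illustrative only. It assumes the 71% share listed earlier for MobileNetV3 and a sparse-kernel speed-up of 1.8x (the lower end of the range quoted earlier for at least 80% sparsity); neither value is a published measurement for the Meet model, and the function name is hypothetical. </p>
<pre>
```python
def overall_speedup(accelerated_share, kernel_speedup):
    """Amdahl's law: end-to-end speed-up when only a fraction of the
    run time (accelerated_share) is sped up by kernel_speedup."""
    return 1.0 / ((1.0 - accelerated_share) + accelerated_share / kernel_speedup)


# Assumed inputs, not measurements for the Meet model.
share_1x1 = 0.71       # share of inference time in 1x1 convolutions (MobileNetV3 row)
kernel_speedup = 1.8   # assumed speed-up of the sparse 1x1 convolution kernel

speedup = overall_speedup(share_1x1, kernel_speedup)
time_saved = 1.0 - 1.0 / speedup

print(f"end-to-end speed-up: {speedup:.2f}x")
print(f"inference time saved: {time_saved:.0%}")
```
</pre>
<p> Under these assumptions the model runs about 1.46x faster, which corresponds to roughly 32% less inference time and is of the same order as the 30% reported above.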
 </p> <p> Similarly, <a href="https://ai.googleblog.com/2019/08/on-device-real-time-hand-tracking-with.html" target="_blank" rel="noopener noreferrer">MediaPipe Hands</a> predicts hand landmarks in real time on mobile and the web using a model based on the EfficientNetLite backbone. This backbone model was <a href="https://en.wikipedia.org/wiki/Knowledge_distillation" target="_blank" rel="noopener noreferrer">manually distilled</a> from the large dense model, which is a computationally expensive, iterative process. Using the sparse version of the dense model instead of the distilled one, we were able to maintain the same inference speed without the labor-intensive process of distilling from a dense model. Compared with the dense model, the sparse model is twice as fast at inference while achieving the same landmark quality as the distilled model. In a sense, sparsification can be thought of as an automatic approach to unstructured model distillation, which can improve model performance without extensive manual effort. We evaluated the sparse model on the geodiverse dataset and made the model card publicly <a href="https://mediapipe.page.link/handmc-sparse" target="_blank" rel="noopener noreferrer">available</a>. 
 </p> <table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto;"><tbody><tr><td style="text-align: center;"><a href="https://1.bp.blogspot.com/-OEeusK9Q5ZQ/YEfIRE1t78I/AAAAAAAAHRs/BptXu3kHCR8bvgo_uFmCW2dUXMCXZLbiACLcBGAsYHQ/s962/image5.gif" target="_blank" rel="noopener noreferrer"><img border="0" data-original-height="238" data-original-width="962" height="158" src="https://1.bp.blogspot.com/-OEeusK9Q5ZQ/YEfIRE1t78I/AAAAAAAAHRs/BptXu3kHCR8bvgo_uFmCW2dUXMCXZLbiACLcBGAsYHQ/w640-h158/image5.gif" width="640" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Comparison of execution time for the dense (<b>left</b>), distilled (<b>middle</b>) and sparse (<b>right</b>) models of the same quality. The processing time of the dense model is twice that of the sparse or distilled models. The distilled model is taken from the official <a href="https://github.com/google/mediapipe/blob/master/docs/solutions/hands.md" target="_blank" rel="noopener noreferrer">MediaPipe solution</a>. The <a href="https://mediapipe.dev/demo/hands_cpu/hands_demo_raw.html" target="_blank" rel="noopener noreferrer">dense</a> and <a href="https://mediapipe.dev/demo/hands_cpu_sparse/hands_demo_raw.html" target="_blank" rel="noopener noreferrer">sparse</a> web demos are publicly available.</td></tr></tbody></table> <br> <div style="line-height:40%;"> <br> </div> <h2>Future work</h2><p> We find sparsification to be a simple yet powerful technique for improving CPU inference of neural networks. Sparse inference allows engineers to run larger models without incurring a significant performance or size overhead and offers a promising new direction for research. We are continuing to extend XNNPACK with wider support for operations in CHW layout and are exploring how it might be combined with other optimization techniques like quantization. 
We are excited to see what you might build with this technology!</p> <div style="line-height:40%;"> <br> </div> <h2>Acknowledgments</h2><p> <em>Special thanks to all who worked on this project: Karthik Raveendran, Erich Elsen, Tingbo Hou, Trevor Gale, Siargey Pisarchyk, Yury Kartynnik, Yunlu Li, Utku Evci, Matsvei Zhdanovich, Sebastian Jansson, Stéphane Hulaud, Michael Hays, Juhyun Lee, Fan Zhang, Chuo-Ling Chang, Gregory Karpiak, Tyler Mullen, Jiuqiang Tang, Ming Guang Yong, Igor Kibalchich, and Matthias Grundmann. </em> </p> </div> </div> <section aria-label="List of footnotes" data-gt-id="footnotes" data-gt-component-name="Footnotes"> <ol class="js-footnotes footnotes"> </ol> </section> <section class="blog-labels" data-gt-id="blog_labels" data-gt-component-name="Blog Labels"> <ul class="blog-labels__list"> <span class="caption">Labels:</span> <li class="caption"> <a class="caption" href="/blog/label/machine-intelligence">Machine Intelligence</a> <div class="blog-labels__spacer"></div> </li> <li class="caption"> <a class="caption" href="/blog/label/product">Product</a> </li> </ul> </section> </div> <div class="glue-grid__col glue-grid__col--span-4-sm glue-grid__col--span-12-md glue-grid__col--span-3-lg"> </div> </div> </div> </div> </main> <footer class="glue-footer"> <div class="glue-page"> <section class="glue-social"> <div class="glue-social__group glue-social--monochrome"> <p class="glue-social__title 
glue-social__title--inline"> Follow us </p> <nav class="js-gt-follow-us-wrapper" aria-label="Social media links"> <ul class="glue-social__list" role="list"> <li class="glue-social__item"> <a class="glue-social__link" href="https://twitter.com/GoogleAI" title="Follow us on x" target="_blank" rel="noopener" data-gt-method="x" > <svg role="presentation" aria-hidden="true" class="glue-icon glue-icon--social glue-icon--24px"> <use href="/gr/static/assets/icons/twitter-x.svg#twitter-x"></use> </svg> </a> </li> <li class="glue-social__item"> <a class="glue-social__link" href="https://www.linkedin.com/showcase/googleresearch/" title="Follow us on linkedin" target="_blank" rel="noopener" data-gt-method="linkedin" > <svg role="presentation" aria-hidden="true" class="glue-icon glue-icon--social glue-icon--24px"> <use href="/gr/static/assets/icons/glue-icons.svg#post-linkedin"></use> </svg> </a> </li> <li class="glue-social__item"> <a class="glue-social__link" href="https://www.youtube.com/c/GoogleResearch" title="Follow us on youtube" target="_blank" rel="noopener" data-gt-method="youtube" > <svg role="presentation" aria-hidden="true" class="glue-icon glue-icon--social glue-icon--24px"> <use href="/gr/static/assets/icons/glue-icons.svg#video-youtube"></use> </svg> </a> </li> <li class="glue-social__item"> <a class="glue-social__link" href="https://github.com/google-research" title="Follow us on github" target="_blank" rel="noopener" data-gt-method="github" > <svg role="presentation" aria-hidden="true" class="glue-icon glue-icon--social glue-icon--24px"> <use href="/gr/static/assets/icons/github.svg#github"></use> </svg> </a> </li> </ul> </nav> </div> </section> </div> <div class="glue-fullbleed"></div> <section class="glue-page"> <nav class="glue-footer__global" aria-label="Footer resource links"> <div class="glue-footer__logo"> <a href="https://www.google.com" title="Google" class="glue-footer__link"> <svg role="presentation" aria-hidden="true" class="glue-icon 
glue-footer__logo-img"> <use href="/gr/static/assets/icons/glue-icons.svg#google-solid-logo"></use> </svg> </a> </div> <ul class="glue-footer__global-links glue-no-bullet js-gt-global-nav-wrapper" role="list"> <li class="glue-footer__global-links-list-item" data-gt-primary="About Google"> <a class="glue-footer__link" href="https://about.google/" target="_blank" rel="noopener"> About Google </a> </li> <li class="glue-footer__global-links-list-item" data-gt-primary="Google Products"> <a class="glue-footer__link" href="https://about.google/intl/en/products/" target="_blank" rel="noopener"> Google Products </a> </li> <li class="glue-footer__global-links-list-item" data-gt-primary="Privacy"> <a class="glue-footer__link" href="https://policies.google.com/privacy" target="_blank" rel="noopener"> Privacy </a> </li> <li class="glue-footer__global-links-list-item" data-gt-primary="Terms"> <a class="glue-footer__link" href="https://policies.google.com/terms" target="_blank" rel="noopener"> Terms </a> </li> </ul> <ul class="glue-footer__global-links glue-footer__global-links--extra glue-no-bullet" role="list"> <li class="glue-footer__global-links-list-item glue-footer__global-links-list-item--extra"> <a class="glue-footer__link" href="https://support.google.com/?hl=en"> <svg role="presentation" aria-hidden="true" class="glue-icon glue-icon--24px glue-icon--footer-help"> <use href="/gr/static/assets/icons/glue-icons.svg#help"></use> </svg> Help </a> </li> <li class="glue-footer__global-links-list-item glue-footer__global-links-list-item--extra"> <button class="glue-footer__link google-feedback js-feedback-button" href="" data-product-id="5137383" > Submit feedback </button> </li> </ul> </nav> </section> </footer> <script src="https://www.gstatic.com/glue/v27_1/material-components-web.min.js"></script> <script src="https://www.youtube.com/player_api"></script> <script type="text/javascript" 
src="/gr/static/js/googleresearch.js?id=b70549917812130af912601ad763f13e"></script> <script type="text/javascript" src="https://support.google.com/inapp/api.js"></script> <script src="https://www.gstatic.com/glue/cookienotificationbar/cookienotificationbar.min.js" data-glue-cookie-notification-bar-category="2B"> </script> </body> </html>
