<!doctype html> <html lang="en-US"> <head data-template-path="https://hacks.mozilla.org/wp-content/themes/Hax"> <meta charset="utf-8"> <meta name="viewport" content="width=device-width, initial-scale=1"> <meta name="google-site-verification" content="w2ocEMd5yV9IsGCjhq-7ls67r4VH-Ob6oWdiZpqjN8U"> <meta name="title" content="Llamafile v0.8.14: a new UI, performance gains, and more – Mozilla Hacks - the Web developer blog"> <meta property="og:site_name" content="Mozilla Hacks &#8211; the Web developer blog"> <meta property="og:url" content="https://hacks.mozilla.org/2024/10/llamafile-v0-8-14-a-new-ui-performance-gains-and-more"> <meta property="og:title" content="Llamafile v0.8.14: a new UI, performance gains, and more – Mozilla Hacks - the Web developer blog"> <meta property="og:description" content="Introducing Llamafile 0.8.14, Mozilla's open-source AI tool with a new chat interface, faster performance and support for powerful models."> <meta property="og:image" content="https://hacks.mozilla.org/wp-content/uploads/2024/10/llamafile_8.1.14_release_image.png"> <meta property="og:image:width" content="1007"> <meta property="og:image:height" content="790"> <meta property="twitter:title" content="Llamafile v0.8.14: a new UI, performance gains, and more – Mozilla Hacks - the Web developer blog"> <meta property="twitter:description" content="Introducing Llamafile 0.8.14, Mozilla's open-source AI tool with a new chat interface, faster performance and support for powerful models."> <meta name="twitter:card" content="summary_large_image"> <meta property="twitter:image" content="https://hacks.mozilla.org/wp-content/uploads/2024/10/llamafile_8.1.14_release_image.png"> <meta name="twitter:site" content="@mozhacks"> <link href='//fonts.googleapis.com/css?family=Open+Sans:400,400italic,700,700italic' rel='stylesheet' type='text/css'> <link rel="stylesheet" href="https://hacks.mozilla.org/wp-content/themes/Hax/css/font-awesome.min.css"> <link rel="stylesheet" 
href="https://hacks.mozilla.org/wp-content/themes/Hax/style.css"> <link rel="stylesheet" href="//cdn.jsdelivr.net/highlight.js/8.6.0/styles/solarized_light.min.css"> <script type="text/javascript"> window.hacks = {}; // http://cfsimplicity.com/61/removing-analytics-clutter-from-campaign-urls var removeUtms = function(){ var l = window.location; if( l.hash.indexOf( "utm" ) != -1 ){ var anchor = l.hash.match(/#(?!utm)[^&]+/); anchor = anchor? anchor[0]: ''; if(!anchor && window.history.replaceState){ history.replaceState({},'', l.pathname + l.search); } else { l.hash = anchor; } }; }; var _gaq = _gaq || []; _gaq.push(['_setAccount', 'UA-35433268-8'], ['_setAllowAnchor', true]); _gaq.push (['_gat._anonymizeIp']); _gaq.push(['_trackPageview']); _gaq.push( removeUtms ); (function(d, k) { var ga = d.createElement(k); ga.type = 'text/javascript'; ga.async = true; ga.src = 'https://ssl.google-analytics.com/ga.js'; var s = d.getElementsByTagName(k)[0]; s.parentNode.insertBefore(ga, s); })(document, 'script'); </script> <script async src="https://www.googletagmanager.com/gtag/js?id=G-5WVW12ST9K"></script> <script> window.dataLayer = window.dataLayer || []; function gtag(){dataLayer.push(arguments);} gtag('js', new Date()); gtag('config', 'G-5WVW12ST9K'); </script> <meta name='robots' content='index, follow, max-image-preview:large, max-snippet:-1, max-video-preview:-1' /> <!-- This site is optimized with the Yoast SEO plugin v22.6 - https://yoast.com/wordpress/plugins/seo/ --> <title>Llamafile v0.8.14: a new UI, performance gains, and more - Mozilla Hacks - the Web developer blog</title> <meta name="description" content="Introducing Llamafile 0.8.14, Mozilla&#039;s open-source AI tool with a new chat interface, faster performance and support for powerful models." 
/> <link rel="canonical" href="https://hacks.mozilla.org/2024/10/llamafile-v0-8-14-a-new-ui-performance-gains-and-more/" /> <meta name="twitter:label1" content="Written by" /> <meta name="twitter:data1" content="Stephen Hood" /> <meta name="twitter:label2" content="Est. reading time" /> <meta name="twitter:data2" content="4 minutes" /> <script type="application/ld+json" class="yoast-schema-graph">{"@context":"https://schema.org","@graph":[{"@type":"WebPage","@id":"https://hacks.mozilla.org/2024/10/llamafile-v0-8-14-a-new-ui-performance-gains-and-more/","url":"https://hacks.mozilla.org/2024/10/llamafile-v0-8-14-a-new-ui-performance-gains-and-more/","name":"Llamafile v0.8.14: a new UI, performance gains, and more - Mozilla Hacks - the Web developer blog","isPartOf":{"@id":"https://hacks.mozilla.org/#website"},"primaryImageOfPage":{"@id":"https://hacks.mozilla.org/2024/10/llamafile-v0-8-14-a-new-ui-performance-gains-and-more/#primaryimage"},"image":{"@id":"https://hacks.mozilla.org/2024/10/llamafile-v0-8-14-a-new-ui-performance-gains-and-more/#primaryimage"},"thumbnailUrl":"https://hacks.mozilla.org/wp-content/uploads/2024/10/llamafile_8.1.14_release_image.png","datePublished":"2024-10-16T13:32:30+00:00","dateModified":"2024-10-16T13:32:30+00:00","author":{"@id":"https://hacks.mozilla.org/#/schema/person/1fef9bfbb5a1a8d19c438c98f6481861"},"description":"Introducing Llamafile 0.8.14, Mozilla's open-source AI tool with a new chat interface, faster performance and support for powerful 
models.","breadcrumb":{"@id":"https://hacks.mozilla.org/2024/10/llamafile-v0-8-14-a-new-ui-performance-gains-and-more/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https://hacks.mozilla.org/2024/10/llamafile-v0-8-14-a-new-ui-performance-gains-and-more/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https://hacks.mozilla.org/2024/10/llamafile-v0-8-14-a-new-ui-performance-gains-and-more/#primaryimage","url":"https://hacks.mozilla.org/wp-content/uploads/2024/10/llamafile_8.1.14_release_image.png","contentUrl":"https://hacks.mozilla.org/wp-content/uploads/2024/10/llamafile_8.1.14_release_image.png","width":1007,"height":790,"caption":"llamafile"},{"@type":"BreadcrumbList","@id":"https://hacks.mozilla.org/2024/10/llamafile-v0-8-14-a-new-ui-performance-gains-and-more/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https://hacks.mozilla.org/"},{"@type":"ListItem","position":2,"name":"Articles","item":"https://hacks.mozilla.org/articles/"},{"@type":"ListItem","position":3,"name":"Llamafile v0.8.14: a new UI, performance gains, and more"}]},{"@type":"WebSite","@id":"https://hacks.mozilla.org/#website","url":"https://hacks.mozilla.org/","name":"Mozilla Hacks - the Web developer blog","description":"hacks.mozilla.org","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https://hacks.mozilla.org/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Person","@id":"https://hacks.mozilla.org/#/schema/person/1fef9bfbb5a1a8d19c438c98f6481861","name":"Stephen 
Hood","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https://hacks.mozilla.org/#/schema/person/image/1768fb8e4896e8fbb83cffdf36d42f80","url":"https://hacks.mozilla.org/wp-content/uploads/2024/10/cropped-stephen-hood-headshot-96x96.png","contentUrl":"https://hacks.mozilla.org/wp-content/uploads/2024/10/cropped-stephen-hood-headshot-96x96.png","caption":"Stephen Hood"},"description":"Stephen leads open source AI projects (including llamafile) in Mozilla Builders. He previously managed social bookmarking pioneer del.icio.us; co-founded Storium, Blockboard, and FairSpin; and worked on Yahoo Search and BEA WebLogic.","sameAs":["https://stephenhood.com"],"url":"https://hacks.mozilla.org/author/slangtonhoodmozilla-com/"}]}</script> <!-- / Yoast SEO plugin. --> <link rel="alternate" type="application/rss+xml" title="Mozilla Hacks - the Web developer blog &raquo; Feed" href="https://hacks.mozilla.org/feed/" /> <link rel="alternate" type="application/rss+xml" title="Mozilla Hacks - the Web developer blog &raquo; Comments Feed" href="https://hacks.mozilla.org/comments/feed/" /> <link rel='stylesheet' id='wp-block-library-css' href='https://hacks.mozilla.org/wp-includes/css/dist/block-library/style.min.css?ver=6.6.1' type='text/css' media='all' /> <style id='co-authors-plus-coauthors-style-inline-css' type='text/css'> .wp-block-co-authors-plus-coauthors.is-layout-flow [class*=wp-block-co-authors-plus]{display:inline} </style> <style id='co-authors-plus-avatar-style-inline-css' type='text/css'> .wp-block-co-authors-plus-avatar :where(img){height:auto;max-width:100%;vertical-align:bottom}.wp-block-co-authors-plus-coauthors.is-layout-flow .wp-block-co-authors-plus-avatar :where(img){vertical-align:middle}.wp-block-co-authors-plus-avatar:is(.alignleft,.alignright){display:table}.wp-block-co-authors-plus-avatar.aligncenter{display:table;margin-inline:auto} </style> <style id='co-authors-plus-image-style-inline-css' type='text/css'> 
.wp-block-co-authors-plus-image{margin-bottom:0}.wp-block-co-authors-plus-image :where(img){height:auto;max-width:100%;vertical-align:bottom}.wp-block-co-authors-plus-coauthors.is-layout-flow .wp-block-co-authors-plus-image :where(img){vertical-align:middle}.wp-block-co-authors-plus-image:is(.alignfull,.alignwide) :where(img){width:100%}.wp-block-co-authors-plus-image:is(.alignleft,.alignright){display:table}.wp-block-co-authors-plus-image.aligncenter{display:table;margin-inline:auto} </style> <link rel='stylesheet' id='prismatic-blocks-css' href='https://hacks.mozilla.org/wp-content/plugins/prismatic/css/styles-blocks.css?ver=6.6.1' type='text/css' media='all' /> <style id='safe-svg-svg-icon-style-inline-css' type='text/css'> .safe-svg-cover{text-align:center}.safe-svg-cover .safe-svg-inside{display:inline-block;max-width:100%}.safe-svg-cover svg{height:100%;max-height:100%;max-width:100%;width:100%} </style> <style id='classic-theme-styles-inline-css' type='text/css'> /*! This file is auto-generated */ .wp-block-button__link{color:#fff;background-color:#32373c;border-radius:9999px;box-shadow:none;text-decoration:none;padding:calc(.667em + 2px) calc(1.333em + 2px);font-size:1.125em}.wp-block-file__button{background:#32373c;color:#fff;text-decoration:none} </style> <style id='global-styles-inline-css' type='text/css'> :root{--wp--preset--aspect-ratio--square: 1;--wp--preset--aspect-ratio--4-3: 4/3;--wp--preset--aspect-ratio--3-4: 3/4;--wp--preset--aspect-ratio--3-2: 3/2;--wp--preset--aspect-ratio--2-3: 2/3;--wp--preset--aspect-ratio--16-9: 16/9;--wp--preset--aspect-ratio--9-16: 9/16;--wp--preset--color--black: #000000;--wp--preset--color--cyan-bluish-gray: #abb8c3;--wp--preset--color--white: #ffffff;--wp--preset--color--pale-pink: #f78da7;--wp--preset--color--vivid-red: #cf2e2e;--wp--preset--color--luminous-vivid-orange: #ff6900;--wp--preset--color--luminous-vivid-amber: #fcb900;--wp--preset--color--light-green-cyan: #7bdcb5;--wp--preset--color--vivid-green-cyan: 
#00d084;--wp--preset--color--pale-cyan-blue: #8ed1fc;--wp--preset--color--vivid-cyan-blue: #0693e3;--wp--preset--color--vivid-purple: #9b51e0;--wp--preset--gradient--vivid-cyan-blue-to-vivid-purple: linear-gradient(135deg,rgba(6,147,227,1) 0%,rgb(155,81,224) 100%);--wp--preset--gradient--light-green-cyan-to-vivid-green-cyan: linear-gradient(135deg,rgb(122,220,180) 0%,rgb(0,208,130) 100%);--wp--preset--gradient--luminous-vivid-amber-to-luminous-vivid-orange: linear-gradient(135deg,rgba(252,185,0,1) 0%,rgba(255,105,0,1) 100%);--wp--preset--gradient--luminous-vivid-orange-to-vivid-red: linear-gradient(135deg,rgba(255,105,0,1) 0%,rgb(207,46,46) 100%);--wp--preset--gradient--very-light-gray-to-cyan-bluish-gray: linear-gradient(135deg,rgb(238,238,238) 0%,rgb(169,184,195) 100%);--wp--preset--gradient--cool-to-warm-spectrum: linear-gradient(135deg,rgb(74,234,220) 0%,rgb(151,120,209) 20%,rgb(207,42,186) 40%,rgb(238,44,130) 60%,rgb(251,105,98) 80%,rgb(254,248,76) 100%);--wp--preset--gradient--blush-light-purple: linear-gradient(135deg,rgb(255,206,236) 0%,rgb(152,150,240) 100%);--wp--preset--gradient--blush-bordeaux: linear-gradient(135deg,rgb(254,205,165) 0%,rgb(254,45,45) 50%,rgb(107,0,62) 100%);--wp--preset--gradient--luminous-dusk: linear-gradient(135deg,rgb(255,203,112) 0%,rgb(199,81,192) 50%,rgb(65,88,208) 100%);--wp--preset--gradient--pale-ocean: linear-gradient(135deg,rgb(255,245,203) 0%,rgb(182,227,212) 50%,rgb(51,167,181) 100%);--wp--preset--gradient--electric-grass: linear-gradient(135deg,rgb(202,248,128) 0%,rgb(113,206,126) 100%);--wp--preset--gradient--midnight: linear-gradient(135deg,rgb(2,3,129) 0%,rgb(40,116,252) 100%);--wp--preset--font-size--small: 13px;--wp--preset--font-size--medium: 20px;--wp--preset--font-size--large: 36px;--wp--preset--font-size--x-large: 42px;--wp--preset--spacing--20: 0.44rem;--wp--preset--spacing--30: 0.67rem;--wp--preset--spacing--40: 1rem;--wp--preset--spacing--50: 1.5rem;--wp--preset--spacing--60: 
2.25rem;--wp--preset--spacing--70: 3.38rem;--wp--preset--spacing--80: 5.06rem;--wp--preset--shadow--natural: 6px 6px 9px rgba(0, 0, 0, 0.2);--wp--preset--shadow--deep: 12px 12px 50px rgba(0, 0, 0, 0.4);--wp--preset--shadow--sharp: 6px 6px 0px rgba(0, 0, 0, 0.2);--wp--preset--shadow--outlined: 6px 6px 0px -3px rgba(255, 255, 255, 1), 6px 6px rgba(0, 0, 0, 1);--wp--preset--shadow--crisp: 6px 6px 0px rgba(0, 0, 0, 1);}:where(.is-layout-flex){gap: 0.5em;}:where(.is-layout-grid){gap: 0.5em;}body .is-layout-flex{display: flex;}.is-layout-flex{flex-wrap: wrap;align-items: center;}.is-layout-flex > :is(*, div){margin: 0;}body .is-layout-grid{display: grid;}.is-layout-grid > :is(*, div){margin: 0;}:where(.wp-block-columns.is-layout-flex){gap: 2em;}:where(.wp-block-columns.is-layout-grid){gap: 2em;}:where(.wp-block-post-template.is-layout-flex){gap: 1.25em;}:where(.wp-block-post-template.is-layout-grid){gap: 1.25em;}.has-black-color{color: var(--wp--preset--color--black) !important;}.has-cyan-bluish-gray-color{color: var(--wp--preset--color--cyan-bluish-gray) !important;}.has-white-color{color: var(--wp--preset--color--white) !important;}.has-pale-pink-color{color: var(--wp--preset--color--pale-pink) !important;}.has-vivid-red-color{color: var(--wp--preset--color--vivid-red) !important;}.has-luminous-vivid-orange-color{color: var(--wp--preset--color--luminous-vivid-orange) !important;}.has-luminous-vivid-amber-color{color: var(--wp--preset--color--luminous-vivid-amber) !important;}.has-light-green-cyan-color{color: var(--wp--preset--color--light-green-cyan) !important;}.has-vivid-green-cyan-color{color: var(--wp--preset--color--vivid-green-cyan) !important;}.has-pale-cyan-blue-color{color: var(--wp--preset--color--pale-cyan-blue) !important;}.has-vivid-cyan-blue-color{color: var(--wp--preset--color--vivid-cyan-blue) !important;}.has-vivid-purple-color{color: var(--wp--preset--color--vivid-purple) !important;}.has-black-background-color{background-color: 
var(--wp--preset--color--black) !important;}.has-cyan-bluish-gray-background-color{background-color: var(--wp--preset--color--cyan-bluish-gray) !important;}.has-white-background-color{background-color: var(--wp--preset--color--white) !important;}.has-pale-pink-background-color{background-color: var(--wp--preset--color--pale-pink) !important;}.has-vivid-red-background-color{background-color: var(--wp--preset--color--vivid-red) !important;}.has-luminous-vivid-orange-background-color{background-color: var(--wp--preset--color--luminous-vivid-orange) !important;}.has-luminous-vivid-amber-background-color{background-color: var(--wp--preset--color--luminous-vivid-amber) !important;}.has-light-green-cyan-background-color{background-color: var(--wp--preset--color--light-green-cyan) !important;}.has-vivid-green-cyan-background-color{background-color: var(--wp--preset--color--vivid-green-cyan) !important;}.has-pale-cyan-blue-background-color{background-color: var(--wp--preset--color--pale-cyan-blue) !important;}.has-vivid-cyan-blue-background-color{background-color: var(--wp--preset--color--vivid-cyan-blue) !important;}.has-vivid-purple-background-color{background-color: var(--wp--preset--color--vivid-purple) !important;}.has-black-border-color{border-color: var(--wp--preset--color--black) !important;}.has-cyan-bluish-gray-border-color{border-color: var(--wp--preset--color--cyan-bluish-gray) !important;}.has-white-border-color{border-color: var(--wp--preset--color--white) !important;}.has-pale-pink-border-color{border-color: var(--wp--preset--color--pale-pink) !important;}.has-vivid-red-border-color{border-color: var(--wp--preset--color--vivid-red) !important;}.has-luminous-vivid-orange-border-color{border-color: var(--wp--preset--color--luminous-vivid-orange) !important;}.has-luminous-vivid-amber-border-color{border-color: var(--wp--preset--color--luminous-vivid-amber) !important;}.has-light-green-cyan-border-color{border-color: var(--wp--preset--color--light-green-cyan) 
!important;}.has-vivid-green-cyan-border-color{border-color: var(--wp--preset--color--vivid-green-cyan) !important;}.has-pale-cyan-blue-border-color{border-color: var(--wp--preset--color--pale-cyan-blue) !important;}.has-vivid-cyan-blue-border-color{border-color: var(--wp--preset--color--vivid-cyan-blue) !important;}.has-vivid-purple-border-color{border-color: var(--wp--preset--color--vivid-purple) !important;}.has-vivid-cyan-blue-to-vivid-purple-gradient-background{background: var(--wp--preset--gradient--vivid-cyan-blue-to-vivid-purple) !important;}.has-light-green-cyan-to-vivid-green-cyan-gradient-background{background: var(--wp--preset--gradient--light-green-cyan-to-vivid-green-cyan) !important;}.has-luminous-vivid-amber-to-luminous-vivid-orange-gradient-background{background: var(--wp--preset--gradient--luminous-vivid-amber-to-luminous-vivid-orange) !important;}.has-luminous-vivid-orange-to-vivid-red-gradient-background{background: var(--wp--preset--gradient--luminous-vivid-orange-to-vivid-red) !important;}.has-very-light-gray-to-cyan-bluish-gray-gradient-background{background: var(--wp--preset--gradient--very-light-gray-to-cyan-bluish-gray) !important;}.has-cool-to-warm-spectrum-gradient-background{background: var(--wp--preset--gradient--cool-to-warm-spectrum) !important;}.has-blush-light-purple-gradient-background{background: var(--wp--preset--gradient--blush-light-purple) !important;}.has-blush-bordeaux-gradient-background{background: var(--wp--preset--gradient--blush-bordeaux) !important;}.has-luminous-dusk-gradient-background{background: var(--wp--preset--gradient--luminous-dusk) !important;}.has-pale-ocean-gradient-background{background: var(--wp--preset--gradient--pale-ocean) !important;}.has-electric-grass-gradient-background{background: var(--wp--preset--gradient--electric-grass) !important;}.has-midnight-gradient-background{background: var(--wp--preset--gradient--midnight) !important;}.has-small-font-size{font-size: var(--wp--preset--font-size--small) 
!important;}.has-medium-font-size{font-size: var(--wp--preset--font-size--medium) !important;}.has-large-font-size{font-size: var(--wp--preset--font-size--large) !important;}.has-x-large-font-size{font-size: var(--wp--preset--font-size--x-large) !important;} :where(.wp-block-post-template.is-layout-flex){gap: 1.25em;}:where(.wp-block-post-template.is-layout-grid){gap: 1.25em;} :where(.wp-block-columns.is-layout-flex){gap: 2em;}:where(.wp-block-columns.is-layout-grid){gap: 2em;} :root :where(.wp-block-pullquote){font-size: 1.5em;line-height: 1.6;} </style> <script type="text/javascript" src="https://hacks.mozilla.org/wp-includes/js/jquery/jquery.min.js?ver=3.7.1" id="jquery-core-js"></script> <script type="text/javascript" src="https://hacks.mozilla.org/wp-includes/js/jquery/jquery-migrate.min.js?ver=3.4.1" id="jquery-migrate-js"></script> <script type="text/javascript" src="https://hacks.mozilla.org/wp-content/themes/Hax/js/analytics.js?ver=6.6.1" id="analytics-js"></script> <link rel="https://api.w.org/" href="https://hacks.mozilla.org/wp-json/" /><link rel="alternate" title="JSON" type="application/json" href="https://hacks.mozilla.org/wp-json/wp/v2/posts/48203" /><link rel="EditURI" type="application/rsd+xml" title="RSD" href="https://hacks.mozilla.org/xmlrpc.php?rsd" /> <link rel='shortlink' href='https://hacks.mozilla.org/?p=48203' /> <link rel="alternate" title="oEmbed (JSON)" type="application/json+oembed" href="https://hacks.mozilla.org/wp-json/oembed/1.0/embed?url=https%3A%2F%2Fhacks.mozilla.org%2F2024%2F10%2Fllamafile-v0-8-14-a-new-ui-performance-gains-and-more%2F" /> <link rel="alternate" title="oEmbed (XML)" type="text/xml+oembed" href="https://hacks.mozilla.org/wp-json/oembed/1.0/embed?url=https%3A%2F%2Fhacks.mozilla.org%2F2024%2F10%2Fllamafile-v0-8-14-a-new-ui-performance-gains-and-more%2F&#038;format=xml" /> </head> <body> <div class="outer-wrapper"> <header class="section section--fullwidth header"> <div class="masthead row"> <div class="branding 
block block--3"> <h1> <a href="https://hacks.mozilla.org"> <img class="branding__logo" src="https://hacks.mozilla.org/wp-content/themes/Hax/img/mdn-logo-mono.svg"> <img class="branding__wordmark" src="https://hacks.mozilla.org/wp-content/themes/Hax/img/wordmark.svg" alt="Mozilla"> <span class="branding__title">Hac<span class="logo-askew">k</span>s</span> </a> </h1> </div> <div class="search block block--2"> <form class="search__form" method="get" action="https://hacks.mozilla.org/"> <input type="search" name="s" class="search__input" placeholder="Search Mozilla Hacks" value=""> <i class="fa fa-search search__badge"></i> </form> </div> <nav class="social"> <a class="social__link youtube" href="http://www.youtube.com/user/mozhacks" title="YouTube"><i class="fa fa-youtube" aria-hidden="true"></i><span>Hacks on YouTube</span></a> <a class="social__link twitter" href="https://twitter.com/mozhacks" title="Twitter"><i class="fa fa-twitter" aria-hidden="true"></i><span>@mozhacks on Twitter</span></a> <a class="social__link rss" href="https://hacks.mozilla.org/feed/" title="RSS Feed"><i class="fa fa-rss" aria-hidden="true"></i><span>Hacks RSS Feed</span></a> <a class="fx-button" href="https://www.mozilla.org/firefox/download/thanks/?utm_source=hacks.mozilla.org&utm_medium=referral&utm_campaign=header-download-button&utm_content=header-download-button">Download Firefox</a> </nav> </div> </header> <div id="content-head" class="section"> <h1 class="post__title">Llamafile v0.8.14: a new UI, performance gains, and more</h1> <div class="byline"> <h3 class="post__author"> <img alt='Avatar photo' src='https://hacks.mozilla.org/wp-content/uploads/2024/10/cropped-stephen-hood-headshot-64x64.png' srcset='https://hacks.mozilla.org/wp-content/uploads/2024/10/cropped-stephen-hood-headshot-128x128.png 2x' class='avatar avatar-64 photo' height='64' width='64' decoding='async'/> By <a class="url" href="https://stephenhood.com" rel="external me">Stephen Hood</a> </h3> <div 
class="post__meta"> Posted on <abbr class="published" title="2024-10-16T06:32:30-07:00"> October 16, 2024 </abbr> <span class="entry-cat">in <a href="https://hacks.mozilla.org/category/featured/" rel="category tag" title="View all posts in Featured Article" >Featured Article</a> </span> <div class="socialshare" data-type="bubbles"></div> </div> </div> </div> <main id="content-main" class="section article"> <article class="post" role="article"> <p><span style="font-weight: 400;">We’ve just released</span> <a href="https://github.com/Mozilla-Ocho/llamafile/releases/tag/0.8.14"><b>Llamafile 0.8.14</b></a><span style="font-weight: 400;">, the latest version of our popular open source AI tool. A </span><a href="https://future.mozilla.org/builders/"><span style="font-weight: 400;">Mozilla Builders project</span></a><span style="font-weight: 400;">, Llamafile turns model weights into fast, convenient executables that run on most computers, making it easy for anyone to get the most out of open LLMs using the hardware they already have.</span></p> <h2><span style="font-weight: 400;">New chat interface</span></h2> <p><span style="font-weight: 400;">The key feature of this new release is </span><b>our colorful new command-line chat interface</b><span style="font-weight: 400;">. When you launch a Llamafile, we now automatically open this new chat UI for you, right there in the terminal. This new interface is fast, easy to use, and an all-around simpler experience than the Web-based interface we previously launched by default. (That interface, which our project inherits from the upstream llama.cpp project, is still available and supports a range of features, including image uploads. 
</span><span style="font-weight: 400;">Simply point your browser at port 8080 on localhost.)</span></p> <p><img fetchpriority="high" decoding="async" class="aligncenter size-full wp-image-48204" src="https://hacks.mozilla.org/wp-content/uploads/2024/10/llamafile_8.1.14_release_image.png" alt="llamafile" width="1007" height="790" srcset="https://hacks.mozilla.org/wp-content/uploads/2024/10/llamafile_8.1.14_release_image.png 1007w, https://hacks.mozilla.org/wp-content/uploads/2024/10/llamafile_8.1.14_release_image-250x196.png 250w, https://hacks.mozilla.org/wp-content/uploads/2024/10/llamafile_8.1.14_release_image-500x392.png 500w, https://hacks.mozilla.org/wp-content/uploads/2024/10/llamafile_8.1.14_release_image-768x603.png 768w" sizes="(max-width: 1007px) 100vw, 1007px" /></p> <h2><span style="font-weight: 400;">Other recent improvements</span></h2> <p><span style="font-weight: 400;">This new chat UI is just the tip of the iceberg. In the months since our last blog post here, lead developer </span><a href="https://justine.lol/"><span style="font-weight: 400;">Justine Tunney</span></a><span style="font-weight: 400;"> has been busy shipping a slew of new releases, each of which has moved the project forward in important ways. Here are just a few of the highlights:</span></p> <p><b>Llamafiler</b><span style="font-weight: 400;">: We’re building our own clean-sheet OpenAI-compatible API server, called </span><i><span style="font-weight: 400;">Llamafiler</span></i><span style="font-weight: 400;">. This new server will be more reliable, stable, and most of all </span><i><span style="font-weight: 400;">faster</span></i><span style="font-weight: 400;"> than the one it replaces. We’ve already shipped the embeddings endpoint, which runs </span><i><span style="font-weight: 400;">three times as fast</span></i><span style="font-weight: 400;"> as the one in llama.cpp. 
Justine is currently working on the completions endpoint, at which point Llamafiler will become the default API server for Llamafile.</span></p> <p><b>Performance improvements</b><span style="font-weight: 400;">: With the help of open source contributors like k-quant inventor </span><a href="https://github.com/Kawrakow"><span style="font-weight: 400;">@Kawrakow</span></a><span style="font-weight: 400;">, Llamafile has enjoyed a series of dramatic speed boosts over the last few months. In particular, pre-fill (prompt evaluation) speed has improved dramatically on a variety of architectures:</span></p> <ul> <li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Intel Core i9 went from 100 tokens/second to 400 (4x).</span></li> <li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">AMD Threadripper went from 300 tokens/second to 2,400 (8x).</span></li> <li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Even the modest Raspberry Pi 5 jumped from 8 tokens/second to 80 (10x!).</span></li> </ul> <p><span style="font-weight: 400;">When combined with the new high-speed embedding server described above, Llamafile has become one of the fastest ways to run complex local AI applications that use methods like retrieval-augmented generation (RAG).</span></p> <p><b>Support for powerful new models</b><span style="font-weight: 400;">: Llamafile continues to keep pace with progress in open LLMs, adding support for dozens of new models and architectures, ranging in size from 405 billion parameters all the way down to 1 billion. 
Here are just a few of the new Llamafiles </span><a href="https://huggingface.co/Mozilla"><span style="font-weight: 400;">available for download on Hugging Face</span></a><span style="font-weight: 400;">:</span></p> <ul> <li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Llama 3.2 </span><a href="https://huggingface.co/Mozilla/Llama-3.2-1B-Instruct-llamafile"><span style="font-weight: 400;">1B</span></a><span style="font-weight: 400;"> and </span><a href="https://huggingface.co/Mozilla/Llama-3.2-3B-Instruct-llamafile"><span style="font-weight: 400;">3B</span></a><span style="font-weight: 400;">: offering extremely impressive performance and quality for their small size. (Here’s </span><a href="https://www.youtube.com/watch?v=Lqh7egmfy4o"><span style="font-weight: 400;">a video</span></a><span style="font-weight: 400;"> from our own Mike Heavers showing it in action.)</span></li> <li style="font-weight: 400;" aria-level="1"><span style="font-weight: 400;">Llama 3.1 </span><a href="https://huggingface.co/Mozilla/Meta-Llama-3.1-405B-llamafile"><span style="font-weight: 400;">405B</span></a><span style="font-weight: 400;">: a true “frontier model” that’s possible to run </span><i><span style="font-weight: 400;">at home</span></i><span style="font-weight: 400;"> with sufficient system RAM.</span></li> <li style="font-weight: 400;" aria-level="1"><a href="https://huggingface.co/Mozilla/OLMo-7B-0424-llamafile"><span style="font-weight: 400;">OLMo 7B</span></a><span style="font-weight: 400;">: from our friends at the </span><a href="https://alleninstitute.org/"><span style="font-weight: 400;">Allen Institute</span></a><span style="font-weight: 400;">, OLMo is one of the first truly open and transparent models available.</span></li> <li style="font-weight: 400;" aria-level="1"><a href="https://huggingface.co/Mozilla/TriLM-llamafile"><span style="font-weight: 400;">TriLM</span></a><span style="font-weight: 400;">: a new “1.58 bit” tiny model that is 
optimized for CPU inference and points to a near future where matrix multiplication might no longer rule the day.</span></li> </ul> <p><b>Whisperfile, speech-to-text in a single file</b><span style="font-weight: 400;">: Thanks to contributions from community member </span><a href="https://github.com/cjpais"><span style="font-weight: 400;">@cjpais</span></a><span style="font-weight: 400;">, we’ve created </span><a href="https://huggingface.co/Mozilla/whisperfile"><span style="font-weight: 400;">Whisperfile</span></a><span style="font-weight: 400;">, which does for whisper.cpp what Llamafile did for llama.cpp: that is, turns it into a multi-platform executable that runs nearly everywhere. Whisperfile thus makes it easy to use OpenAI’s Whisper technology to efficiently convert speech into text, no matter which kind of hardware you have.</span></p> <h2><span style="font-weight: 400;">Get involved</span></h2> <p><span style="font-weight: 400;">Our goal is for Llamafile to become a rock-solid foundation for building sophisticated locally-running AI applications. Justine’s work on the new Llamafiler server is a big part of that equation, but so is the ongoing work of supporting new models and optimizing inference performance for as many users as possible. 
We’re proud and grateful that some of the project’s biggest breakthroughs in these areas, and others, have come from the community, with contributors like </span><a href="https://github.com/Kawrakow"><span style="font-weight: 400;">@Kawrakow</span></a><span style="font-weight: 400;">, </span><a href="https://github.com/cjpais"><span style="font-weight: 400;">@cjpais</span></a><span style="font-weight: 400;">, </span><a href="https://github.com/mofosyne"><span style="font-weight: 400;">@mofosyne</span></a><span style="font-weight: 400;">, and </span><a href="https://github.com/djip007"><span style="font-weight: 400;">@Djip007</span></a><span style="font-weight: 400;"> routinely leaving their mark.</span></p> <p><span style="font-weight: 400;">We invite you to join them, and us. We welcome issues and PRs in </span><a href="https://github.com/Mozilla-Ocho/llamafile"><span style="font-weight: 400;">our GitHub repo</span></a><span style="font-weight: 400;">. And we welcome you to become a member of Mozilla’s AI Discord server, which has </span><a href="https://discord.gg/gbR6vJH9gu"><span style="font-weight: 400;">a dedicated channel just for Llamafile</span></a><span style="font-weight: 400;"> where you can get direct access to the project team. Hope to see you there!</span></p> <p>&nbsp;</p> <section class="about"> <h2 class="about__header">About <a class="url" href="https://stephenhood.com" rel="external me"> Stephen Hood </a> </h2> <p>Stephen leads open source AI projects (including llamafile) in Mozilla Builders. 
He previously managed social bookmarking pioneer del.icio.us; co-founded Storium, Blockboard, and FairSpin; and worked on Yahoo Search and BEA WebLogic.</p> <ul class="author-meta fa-ul"><li><i class="fa-li fa fa-globe"></i><a href="https://stephenhood.com" class="website" rel="me">https://stephenhood.com</a></li></ul> <p><a class="url" href="https://hacks.mozilla.org/author/slangtonhoodmozilla-com/">More articles by Stephen Hood&hellip;</a></p> </section> </article> <section class="promo"> <form id="newsletterForm" name="newsletter-form" class="newsletter block block--1 block--polite" action="https://www.mozilla.org/en-US/newsletter/" method="post"> <h2 class="heading">Discover great resources for web development</h2> <p class="newsletter__description">Sign up for the Mozilla Developer Newsletter:</p> <input id="fmt" name="fmt" value="H" type="hidden"> <input id="newsletterNewslettersInput" name="newsletters" value="app-dev" type="hidden"> <div id="newsletterErrors" class="newsletter__errors"></div> <div id="newsletterEmail" class="form__row"> <label for="newsletterEmailInput" class="offscreen">E-mail</label> <input id="newsletterEmailInput" name="email" class="newsletter__input" required="" placeholder="you@example.com" size="30" type="email"> </div> <div id="newsletterPrivacy" class="form__row form__fineprint"> <input id="newsletterPrivacyInput" name="privacy" required="" type="checkbox"> <label for="newsletterPrivacyInput"> I'm okay with Mozilla handling my info as explained in this <a href="https://www.mozilla.org/privacy/">Privacy Policy</a>. </label> </div> <button id="newsletter-submit" type="submit" class="button positive">Sign up now</button> </form> <div id="newsletterThanks" class="newsletter newsletter--thanks block block--1 block--polite hidden"> <h2 class="heading">Thanks! Please check your inbox to confirm your subscription.</h2> <p>If you haven’t previously confirmed a subscription to a Mozilla-related newsletter you may have to do so. 
Please check your inbox or your spam filter for an email from us. </p> </div> </section> </main><!-- /#content-main --> <footer class="footer section section--fullwidth"> <div class="row"> <p class="block block--1"> Except where otherwise noted, content on this site is licensed under the <a href="https://creativecommons.org/licenses/by-sa/3.0/" rel="license external">Creative Commons Attribution Share-Alike License v3.0</a> or any later version. </p> <img class="footer__logo" alt="the Mozilla dino logo" src="https://hacks.mozilla.org/wp-content/themes/Hax/img/dino.svg"> </div> </footer> </div> <script> // External links should open in a new tab. (function () { var postLinks = document.querySelectorAll('#content-main a'); var origin = location.origin; for (var i = 0; i < postLinks.length; i++) { var link = postLinks[i]; if (link.origin !== origin && !link.getAttribute('target')) { link.setAttribute('target', '_blank'); } } })(); window.addEventListener('load', function () { if (document.querySelector('#newsletterForm')) { var script = document.createElement('script'); var path = document.head.getAttribute('data-template-path'); script.setAttribute('src', path + '/js/newsletter.js'); document.head.appendChild(script); } }); </script> </body> </html>
