CINXE.COM
Star History Monthly Nov 2023 | Open-source Text To Speech Engines
<!DOCTYPE html><html lang="en"> <head><meta charSet="utf-8"/><meta name="viewport" content="width=device-width"/><link rel="icon" href="/assets/favicon.ico"/><script defer="" data-domain="star-history.com" src="https://plausible.io/js/script.js"></script><meta name="next-head-count" content="4"/><link data-next-font="size-adjust" rel="preconnect" href="/" crossorigin="anonymous"/><link rel="preload" href="/_next/static/css/f94657194d4c857a.css" as="style" crossorigin=""/><link rel="stylesheet" href="/_next/static/css/f94657194d4c857a.css" crossorigin="" data-n-g=""/><noscript data-n-css=""></noscript><script defer="" crossorigin="" nomodule="" src="/_next/static/chunks/polyfills-c67a75d1b6f99dc8.js"></script><script src="/_next/static/chunks/webpack-38cee4c0e358b1a3.js" defer="" crossorigin=""></script><script src="/_next/static/chunks/framework-fda0a023b274c574.js" defer="" crossorigin=""></script><script src="/_next/static/chunks/main-001c9e19b1894c7d.js" defer="" crossorigin=""></script><script src="/_next/static/chunks/pages/_app-915effad870aa62e.js" defer="" crossorigin=""></script><script src="/_next/static/chunks/6c86d9ce-d8b7531786dd65a5.js" defer="" crossorigin=""></script><script src="/_next/static/chunks/472-8057db644de3d496.js" defer="" crossorigin=""></script><script src="/_next/static/chunks/590-d0a3c67c09cc0662.js" defer="" crossorigin=""></script><script src="/_next/static/chunks/pages/blog/%5Bslug%5D-7b378153203b51eb.js" defer="" crossorigin=""></script><script src="/_next/static/xKX4ZiOi_N7h3OBOEsSZu/_buildManifest.js" defer="" crossorigin=""></script><script src="/_next/static/xKX4ZiOi_N7h3OBOEsSZu/_ssgManifest.js" defer="" crossorigin=""></script></head><body><div id="__next"><div class="relative w-full h-auto min-h-screen overflow-auto flex flex-col"><title>Star History Monthly Nov 2023 | Open-source Text To Speech Engines</title><meta name="description" content="Text To Speech, or TTS, is a part of human-computer dialogue that enables machines to speak, here are a few open-source TTS options to look at."/><meta property="og:type" content="website"/><meta property="og:url" content="https://star-history.com/blog/tts"/><meta property="og:title" content="Star History Monthly Nov 2023 | Open-source Text To Speech Engines"/><meta property="og:description" content="Text To Speech, or TTS, is a part of human-computer dialogue that enables machines to speak, here are a few open-source TTS options to look at."/><meta property="og:image" content="https://star-history.com/assets/blog/tts/banner.webp"/><meta name="twitter:card" content="summary_large_image"/><meta name="twitter:url" content="https://star-history.com/blog/tts"/><meta name="twitter:title" content="Star History Monthly Nov 2023 | Open-source Text To Speech Engines"/><meta name="twitter:description" content="Text To Speech, or TTS, is a part of human-computer dialogue that enables machines to speak, here are a few open-source TTS options to look at."/><meta name="twitter:image" content="https://star-history.com/assets/blog/tts/banner.webp"/><nav><div class="flex justify-center items-center gap-x-6 bg-green-600 px-6 py-1 sm:px-3.5 "><p class="text-sm leading-6 text-white"><a href="/blog/list-your-open-source-project">Want to promote your open source project? Be on our ⭐️Starlet List⭐️ for FREE →</a></p></div></nav><header class="w-full h-14 shrink-0 flex flex-row justify-center items-center bg-[#363636] text-light"><div class="w-full md:max-w-5xl lg:max-w-7xl h-full flex flex-row justify-between items-center px-0 sm:px-4"><div class="h-full bg-dark flex flex-row justify-start items-center"><a class="h-full flex flex-row justify-center items-center px-3 hover:bg-zinc-800" href="/"><img class="w-7 h-auto" src="/assets/icon.png" alt="Logo"/></a><a class="h-full flex flex-row justify-center items-center text-base px-3 hover:bg-zinc-800" href="/blog"><span class="text-white font-semibold -2">Blog</span></a><span class="h-full flex flex-row justify-center items-center cursor-pointer text-white text-base px-3 font-semibold mr-2 hover:bg-zinc-800">Add Access Token</span></div><div class="hidden h-full md:flex flex-row justify-start items-center"><a target="_blank" rel="noopener noreferrer" class="h-full flex text-white text-base flex-row justify-center items-center px-4 hover:bg-zinc-800" href="https://www.bytebase.com/?source=star-history"><img class="h-6 mt-1 mr-2" src="/assets/craft-by-bytebase.webp" alt=""/></a></div><div class="h-full hidden md:flex flex-row justify-end items-center space-x-2"><a class="h-full flex flex-row justify-center items-center px-2 hover:bg-zinc-800" href="https://twitter.com/StarHistoryHQ" target="_blank" rel="noopener noreferrer"><i class="fab fa-twitter text-2xl text-blue-300"></i></a></div><div class="h-full flex md:hidden flex-row justify-end items-center"><span class="relative h-full w-10 px-3 flex flex-row justify-center items-center cursor-pointer font-semibold text-light hover:bg-zinc-800"><span class="w-4 transition-all h-px bg-light absolute top-1/2 -mt-1"></span><span class="w-4 transition-all h-px bg-light absolute top-1/2 "></span><span class="w-4 transition-all h-px bg-light absolute top-1/2 mt-1"></span></span></div></div></header><div class="w-full h-auto py-2 flex md:hidden flex-col justify-start items-start shadow-lg border-b hidden"><a class="h-12 text-base px-3 w-full flex flex-row justify-start items-center cursor-pointer font-semibold text-dark mr-2 hover:bg-gray-100 hover:text-blue-500" href="/blog/how-to-use-github-star-history">📕 How to use this site</a><span class="h-12 px-3 text-base w-full flex flex-row justify-start items-center cursor-pointer font-semibold text-dark mr-2 hover:bg-gray-100 hover:text-blue-500">Add Access Token</span><span class="h-12 text-base px-3 w-full flex flex-row justify-start items-center"><a class="github-button -mt-1" href="https://github.com/star-history/star-history" data-show-count="true" aria-label="Star star-history/star-history on GitHub" target="_blank" rel="noopener noreferrer">Star</a></span></div><div class="w-full h-auto grow lg:grid lg:grid-cols-[256px_1fr_256px]"><div class="w-full hidden lg:block"><div class="flex flex-col justify-start items-start w-full mt-2 p-4 pl-8"><a class="hover:opacity-75" href="/blog/list-your-open-source-project"><img class="w-auto max-w-full" src="/assets/starlet-icon.webp"/></a><div><div class="w-full flex flex-row justify-between items-center my-2"><h3 class="text-sm font-medium text-gray-400 leading-6">Playbook</h3></div><ul class="list-disc list-inside"><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/how-to-use-github-star-history"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">📕 How to Use this Site</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/playbook-for-more-github-stars"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">⭐️ How to Get More Stars</span></a></li></ul></div><div><div class="w-full flex flex-row justify-between items-center my-2"><h3 class="text-sm font-medium text-gray-400 leading-6">Monthly Pick</h3></div><ul class="list-disc list-inside"><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/ai-devtools"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2024 Nov (AI DevTools)</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/homelab"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2024 Oct (Homelab)</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/ai-agents"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2024 Sep (AI Agents)</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/rag-frameworks"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2024 Aug (RAG frameworks)</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/ai-generators"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2024 Jul (AI Generators)</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/ai-search"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2024 Jun (AI Searches)</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/ai-web-scraper"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2024 May (AI Web Scraper)</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/prompt-engineering"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2024 Apr (AI Prompt)</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/non-ai"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2024 Mar (Non-AI)</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/most-underrated"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2024 Feb (Most Underrated)</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/text2sql"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2024 Jan (Text2SQL)</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/gpt-wrappers"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2023 Dec (GPT Wrappers)</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/tts"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2023 Nov (TTS)</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/ai-for-postgres"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2023 Oct (AI for Postgres)</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/coding-ai"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2023 Sept (Coding AI)</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/cli-tool-for-llm"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2023 Aug (CLI tool for LLMs)</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/llama2"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2023 July (Llama 2 Edition)</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/star-history-monthly-pick-202306"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2023 June</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/star-history-monthly-pick-202305"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2023 May</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/star-history-monthly-pick-202304"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2023 Apr</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/star-history-monthly-pick-202303"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2023 Mar (ChatGPT Edition)</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/star-history-monthly-pick-202302"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2023 Feb</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/star-history-monthly-pick-202301"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2023 Jan</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/star-history-monthly-pick-202212"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2022 Dec</span></a></li></ul></div><div><div class="w-full flex flex-row justify-between items-center my-2"><h3 class="text-sm font-medium text-gray-400 leading-6">Yearly Pick</h3></div><ul class="list-disc list-inside"><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/best-of-2023"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2023</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/star-history-yearly-pick-2022-data-infra-devtools"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2022 Data, Infra & DevTools</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/star-history-open-source-2022-platform-engineering"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2022 Platform Engineering</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/star-history-open-source-2022-open-source-alternatives"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2022 OSS Alternatives</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/star-history-yearly-pick-2022-frontend"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">2022 Front-end</span></a></li></ul></div><div><div class="w-full flex flex-row justify-between items-center my-2"><h3 class="text-sm font-medium text-gray-400 leading-6">Starlet List</h3></div><ul class="list-disc list-inside"><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/list-your-open-source-project"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">🎁 Prompt yours for FREE</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/trench"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #28 - Trench</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/langfuse"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #27 - langfuse</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/thepipe"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #26 - thepi.pe</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/taipy"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #25 - Taipy</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/superlinked"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #24 - Superlinked</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/tea-tasting"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #23 - tea-tasting</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/giskard"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #22 - Giskard</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/khoj"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #21 - Khoj</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/paradedb"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #20 - ParadeDB</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/skyvern"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #19 - Skyvern</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/prisma"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #18 - Prisma</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/spicedb"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #17 - SpiceDB</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/answer"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #16 - Apache Answer</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/infinity"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #15 - Infinity</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/proton"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #14 - Proton</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/earthly"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #13 - Earthly</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/wasp"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #12 - Wasp</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/libsql"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #11 - libSQL</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/postgresml"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #10 - PostgresML</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/electricsql"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #9 - ElectricSQL</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/prompt-flow"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #8 - Prompt flow</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/clipboard"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #7 - Clipboard</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/hoppscotch"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #6 - Hoppscotch</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/metisfl"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #5 - MetisFL</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/chatgpt-js"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #4 - chatgpt.js</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/mockoon"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #3 - Mockoon</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/dlta-ai"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #2 - DLTA-AI</span></a></li><li class="mb-2 leading-3"><a class="cursor-pointer" rel="noopener noreferrer" href="/blog/sniffnet"><span class="inline -ml-2 text-sm text-blue-700 hover:underline">Issue #1 - Sniffnet</span></a></li></ul></div></div></div><div class="w-full flex flex-col justify-start items-center"><div class="w-full p-4 md:p-0 mt-6 md:w-5/6 lg:max-w-6xl h-full flex flex-col justify-start items-center self-center"><img class="hidden md:block w-auto max-w-full object-scale-down" src="/assets/blog/tts/banner.webp" alt=""/><div class="w-auto max-w-6xl mt-4 md:mt-12 prose prose-indigo prose-xl md:prose-2xl flex flex-col justify-center items-center"><h1 class="leading-16">Star History Monthly Nov 2023 | Open-source Text To Speech Engines</h1></div><div class="w-full mt-8 mb-2 max-w-6xl px-2 flex flex-row items-center justify-center text-sm text-gray-900 font-semibold trackingwide uppercase"><div class="flex space-x-1 text-gray-500"><span class="text-gray-900">Mila</span><span aria-hidden="true"> · </span><time dateTime="2023-12-22T09:50:06.000Z">Dec 22, 2023</time><span aria-hidden="true"> · </span><span> <!-- -->4<!-- --> min read </span></div></div><div class="mt-8 w-full max-w-5xl prose prose-indigo prose-xl md:prose-2xl"><p>Happy (almost) holidays! Can't believe we are already here: the last Star History Monthly of 2023.</p> <p>A colleague of mine was making a tutorial series a while back and was looking at TTS apps to save her the hassle of recording and editing, which gave me the idea for this topic.</p> <p>Text To Speech, or TTS, which means "from text to speech". It is a part of human-computer dialogue that enables machines to speak.</p> <p>Actually, OpenAI <a href="https://openai.com/blog/chatgpt-can-now-see-hear-and-speak">started</a> to roll out voice and image capabilities in ChatGPT this September, and you can basically speak to ChatGPT and have it talk back, or chat about images. But if you feel like having more freedom and control over your TTS program, here are a few open-source options to look at.</p> <h2>EmotiVoice</h2> <p><img src="/assets/blog/tts/emotivoice.webp" alt="emotivoice"></p> <p><a href="https://github.com/netease-youdao/EmotiVoice">EmotiVoice</a> is a multi-voice and prompt-controlled TTS engine, by the NetEase Youdao team. The project gained 4.2k stars within the first week of its launch.</p> <p>EmotiVoice currently supports English and Chinese, with multiple voice and prompt controls. It offers over 2,000 different voices. The most prominent feature, as its name suggests, is emotional synthesis, which can create voices with various emotions such as happiness, excitement, sadness, anger, etc.</p> <h2>Coqui TTS 🐸</h2> <p><img src="/assets/blog/tts/coqui-tts.webp" alt="coqui-tts"></p> <p><a href="https://github.com/coqui-ai/TTS">Coqui TTS</a> is a TTS model that can clone voices in different languages in just 3 seconds. It enables cross-language voice cloning and multilingual speech generation so that you can quickly and easily create, cast, and direct AI voice actors. It's got:</p> <ul> <li>🚀 Pretrained models in +1100 languages.</li> <li>🛠️ Tools for training new models and fine-tuning existing models in any language.</li> <li>📚 Utilities for dataset analysis and curation.</li> </ul> <p>Coqui TTS is essentially a continuation of Mozilla's TTS: some time ago Mozilla stopped working on their TTS project, and <a href="http://Coqui.ai">Coqui.ai</a> was founded by the founders of Mozilla’s machine learning group.</p> <p>Note that it is licensed under MPL-2.0, which allows commercial use as well as modification of the code, but the copyright of the modified code belongs to the initiator of the software.</p> <h2>TorToiSe 🐢</h2> <p><img src="/assets/blog/tts/tortoise.webp" alt="tortoise"></p> <p><a href="https://github.com/neonbjb/tortoise-tts">TorToiSe</a> is a TTS program released in April 2022 (and got big in 2023). It has:</p> <ol> <li>Strong multi-voice capabilities.</li> <li>Highly realistic prosody and intonation.</li> </ol> <p>The author of Tortoise was actually inspired by OpenAI's DALL-E model, released back in 2021. A while later, in 2022, the author <a href="https://nonint.com/2022/10/28/ive-joined-openai/">joined</a> OpenAI.</p> <p>Tortoise is an advanced TTS system known for its exceptional voice cloning capabilities and expressiveness. Its architecture is similar to GPT (hard to imagine why), which transforms input text into discritized acoustic tokens. One notable drawback of Tortoise is it is slower, compared to other TTS models, due to the decoders it uses, which got the project <a href="https://github.com/neonbjb/tortoise-tts#whats-in-a-name">its name</a>.</p> <h2>Real-Time Voice Cloning</h2> <p><img src="/assets/blog/tts/rvc.webp" alt="rvc"></p> <p><a href="https://github.com/CorentinJ/Real-Time-Voice-Cloning">Real-Time Voice Cloning</a> can generate arbitrary speech in real-time. It is an implementation of the paper "<a href="https://arxiv.org/pdf/1806.04558.pdf">Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis</a>" (SV2TTS, a deep learning framework) with a vocoder that works in real-time.</p> <p>The project started way back in 2019, and the author (who currently works for an AI Voice Generator company), <a href="https://github.com/CorentinJ/Real-Time-Voice-Cloning#heads-up">wrote</a> that "just like everything else in Deep Learning, the Real-Time Voice Cloning repo is quickly getting old. Many other open-source repositories or SaaS apps (often paying) will give you a better audio quality than this repository will." Even so, the repo has not lost its momentum, and has been steadily gaining stargazers over the years.</p> <h2>VALL-E-X</h2> <p><img src="/assets/blog/tts/vall-e-x.webp" alt="vall-e-x"></p> <p><a href="https://github.com/Plachtaa/VALL-E-X">VALL-E-X</a> is Microsoft's cross-lingual neural codec language model, which is an extension of its original <a href="https://www.microsoft.com/en-us/research/project/vall-e-x/vall-e/">VALL-E</a>. Microsoft initially published the research paper, and a team at Nanyang Technological University reproduced the results and trained their own model.</p> <p>VALL-E X can synthesize personalized speech in another language for a monolingual speaker. Thanks to its powerful in-context learning capabilities, VALL-E X does not require cross-lingual speech data of the same speakers for training and can perform various zero-shot cross-lingual speech generation tasks, such as cross-lingual text-to-speech synthesis and speech-to-speech translation.</p> <h2>Lastly</h2> <p>If you want more AI content, check out earlier editions of the star history open-source monthly:</p> <ul> <li><a href="/blog/ai-for-postgres">AI Extensions for Postgres</a></li> <li><a href="/blog/coding-ai">GitHub Copilot alternatives</a></li> <li><a href="/blog/cli-tool-for-llm">CLI Tools for Working with LLMs</a></li> <li><a href="/blog/llama2">Llama 2 and Ecosystem</a></li> <li><a href="/blog/star-history-monthly-pick-202303">ChatGPT Special</a></li> </ul> </div></div><div class="mt-12"><iframe src="https://embeds.beehiiv.com/2803dbaa-d8dd-4486-8880-4b843f3a7da6?slim=true" data-test-id="beehiiv-embed" height="52" frameBorder="0" scrolling="no" style="margin:0;border-radius:0px !important;background-color:transparent"></iframe></div></div><div class="w-full hidden lg:block"></div></div><footer class="relative w-full shrink-0 h-auto mt-6 flex flex-col justify-end items-center"><div class="w-full py-2 px-3 md:w-5/6 lg:max-w-7xl flex flex-row flex-wrap justify-between items-center text-neutral-700 border-t"><div class="text-sm leading-8 flex flex-row flex-wrap justify-start items-center"><div class="h-full text-gray-600">The missing GitHub star history graph</div><a class="h-full flex flex-row justify-center items-center ml-3 text-lg hover:opacity-80" href="https://twitter.com/StarHistoryHQ" target="_blank" rel="noopener noreferrer"><svg stroke="currentColor" fill="currentColor" stroke-width="0" viewBox="0 0 512 512" height="1em" width="1em" xmlns="http://www.w3.org/2000/svg"><path d="M459.37 151.716c.325 4.548.325 9.097.325 13.645 0 138.72-105.583 298.558-298.558 298.558-59.452 0-114.68-17.219-161.137-47.106 8.447.974 16.568 1.299 25.34 1.299 49.055 0 94.213-16.568 130.274-44.832-46.132-.975-84.792-31.188-98.112-72.772 6.498.974 12.995 1.624 19.818 1.624 9.421 0 18.843-1.3 27.614-3.573-48.081-9.747-84.143-51.98-84.143-102.985v-1.299c13.969 7.797 30.214 12.67 47.431 13.319-28.264-18.843-46.781-51.005-46.781-87.391 0-19.492 5.197-37.36 14.294-52.954 51.655 63.675 129.3 105.258 216.365 109.807-1.624-7.797-2.599-15.918-2.599-24.04 0-57.828 46.782-104.934 104.934-104.934 30.213 0 57.502 12.67 76.67 33.137 23.715-4.548 46.456-13.32 66.599-25.34-7.798 24.366-24.366 44.833-46.132 57.827 21.117-2.273 41.584-8.122 60.426-16.243-14.292 20.791-32.161 39.308-52.628 54.253z"></path></svg></a><a class="h-full flex flex-row justify-center items-center mx-3 text-lg hover:opacity-80" href="mailto:star@bytebase.com" target="_blank" rel="noopener noreferrer"><svg stroke="currentColor" fill="currentColor" stroke-width="0" viewBox="0 0 512 512" height="1em" width="1em" xmlns="http://www.w3.org/2000/svg"><path d="M502.3 190.8c3.9-3.1 9.7-.2 9.7 4.7V400c0 26.5-21.5 48-48 48H48c-26.5 0-48-21.5-48-48V195.6c0-5 5.7-7.8 9.7-4.7 22.4 17.4 52.1 39.5 154.1 113.6 21.1 15.4 56.7 47.8 92.2 47.6 35.7.3 72-32.8 92.3-47.6 102-74.1 131.6-96.3 154-113.7zM256 320c23.2.4 56.6-29.2 73.4-41.4 132.7-96.3 142.8-104.7 173.4-128.7 5.8-4.5 9.2-11.5 9.2-18.9v-19c0-26.5-21.5-48-48-48H48C21.5 64 0 85.5 0 112v19c0 7.4 3.4 14.3 9.2 18.9 30.6 23.9 40.7 32.4 173.4 128.7 16.8 12.2 50.2 41.8 73.4 41.4z"></path></svg></a><a class="h-full flex flex-row justify-center items-center mr-3 text-lg hover:opacity-80" href="https://github.com/star-history/star-history" target="_blank" rel="noopener noreferrer"><svg stroke="currentColor" fill="currentColor" stroke-width="0" viewBox="0 0 496 512" height="1em" width="1em" xmlns="http://www.w3.org/2000/svg"><path d="M165.9 397.4c0 2-2.3 3.6-5.2 3.6-3.3.3-5.6-1.3-5.6-3.6 0-2 2.3-3.6 5.2-3.6 3-.3 5.6 1.3 5.6 3.6zm-31.1-4.5c-.7 2 1.3 4.3 4.3 4.9 2.6 1 5.6 0 6.2-2s-1.3-4.3-4.3-5.2c-2.6-.7-5.5.3-6.2 2.3zm44.2-1.7c-2.9.7-4.9 2.6-4.6 4.9.3 2 2.9 3.3 5.9 2.6 2.9-.7 4.9-2.6 4.6-4.6-.3-1.9-3-3.2-5.9-2.9zM244.8 8C106.1 8 0 113.3 0 252c0 110.9 69.8 205.8 169.5 239.2 12.8 2.3 17.3-5.6 17.3-12.1 0-6.2-.3-40.4-.3-61.4 0 0-70 15-84.7-29.8 0 0-11.4-29.1-27.8-36.6 0 0-22.9-15.7 1.6-15.4 0 0 24.9 2 38.6 25.8 21.9 38.6 58.6 27.5 72.9 20.9 2.3-16 8.8-27.1 16-33.7-55.9-6.2-112.3-14.3-112.3-110.5 0-27.5 7.6-41.3 23.6-58.9-2.6-6.5-11.1-33.3 2.6-67.9 20.9-6.5 69 27 69 27 20-5.6 41.5-8.5 62.8-8.5s42.8 2.9 62.8 8.5c0 0 48.1-33.6 69-27 13.7 34.7 5.2 61.4 2.6 67.9 16 17.7 25.8 31.5 25.8 58.9 0 96.5-58.9 104.2-114.8 110.5 9.2 7.9 17 22.9 17 46.4 0 33.7-.3 75.4-.3 83.6 0 6.5 4.6 14.4 17.3 12.1C428.2 457.8 496 362.9 496 252 496 113.3 383.5 8 244.8 8zM97.2 352.9c-1.3 1-1 3.3.7 5.2 1.6 1.6 3.9 2.3 5.2 1 1.3-1 1-3.3-.7-5.2-1.6-1.6-3.9-2.3-5.2-1zm-10.8-8.1c-.7 1.3.3 2.9 2.3 3.9 1.6 1 3.6.7 4.3-.7.7-1.3-.3-2.9-2.3-3.9-2-.6-3.6-.3-4.3.7zm32.4 35.6c-1.6 1.3-1 4.3 1.3 6.2 2.3 2.3 5.2 2.6 6.5 1 1.3-1.3.7-4.3-1.3-6.2-2.2-2.3-5.2-2.6-6.5-1zm-11.4-14.7c-1.6 1-1.6 3.6 0 5.9 1.6 2.3 4.3 3.3 5.6 2.3 1.6-1.3 1.6-3.9 0-6.2-1.4-2.3-4-3.3-5.6-2z"></path></svg></a></div><div class="flex flex-row flex-wrap items-center space-x-4"><div class="flex flex-row text-sm leading-8 underline text-blue-700 hover:opacity-80"><img class="h-6 mt-1 mr-2" src="/assets/sqlchat.webp" alt="SQL Chat"/><a href="https://sqlchat.ai" target="_blank" rel="noopener noreferrer"> <!-- -->SQL Chat<!-- --> </a></div><div class="flex flex-row text-sm leading-8 underline text-blue-700 hover:opacity-80"><img class="h-6 mt-1 mr-2" src="/assets/dbcost.webp" alt="DB Cost"/><a href="https://dbcost.com" target="_blank" rel="noopener noreferrer">DB Cost</a></div></div><div class="text-xs leading-8 flex flex-row flex-nowrap justify-end items-center"><span class="text-gray-600">Maintained by<!-- --> <a class="text-blue-500 font-bold hover:opacity-80" href="https://bytebase.com" target="_blank" rel="noopener noreferrer">Bytebase</a>, originally built by<!-- --> <a class="bg-blue-400 text-white p-1 pl-2 pr-2 rounded-l-2xl rounded-r-2xl hover:opacity-80" href="https://twitter.com/tim_qian" target="_blank" rel="noopener noreferrer">@tim_qian</a></span></div></div></footer><div class="fixed right-0 top-32 hidden lg:flex flex-col justify-start items-start transition-all bg-white w-48 xl:w-56 p-2 z-10 "><div class="w-full flex justify-between items-center mb-2"><p class="text-xs text-gray-400">Sponsors (random order)</p><svg stroke="currentColor" fill="currentColor" stroke-width="0" viewBox="0 0 352 512" class="fas fa-times text-xs text-gray-400 cursor-pointer hover:text-gray-500" height="1em" width="1em" xmlns="http://www.w3.org/2000/svg"><path d="M242.72 256l100.07-100.07c12.28-12.28 12.28-32.19 0-44.48l-22.24-22.24c-12.28-12.28-32.19-12.28-44.48 0L176 189.28 75.93 89.21c-12.28-12.28-32.19-12.28-44.48 0L9.21 111.45c-12.28 12.28-12.28 32.19 0 44.48L109.28 256 9.21 356.07c-12.28 12.28-12.28 32.19 0 44.48l22.24 22.24c12.28 12.28 32.2 12.28 44.48 0L176 322.72l100.07 100.07c12.28 12.28 32.2 12.28 44.48 0l22.24-22.24c12.28-12.28 12.28-32.19 0-44.48L242.72 256z"></path></svg></div><a href="https://dify.ai/?utm_source=star-history" class="bg-gray-50 p-2 rounded w-full flex flex-col justify-center items-center mb-2 text-zinc-600 hover:opacity-80 hover:text-blue-600 hover:underline" target="_blank"><img class="w-auto max-w-full" src="/assets/sponsors/dify/logo.webp" alt="Dify"/><span class="text-xs mt-2">Dify: Open-source platform for building LLM apps, from agents to AI workflows.</span></a><a href="https://bytebase.com?utm_source=star-history" class="bg-gray-50 p-2 rounded w-full flex flex-col justify-center items-center mb-2 text-zinc-600 hover:opacity-80 hover:text-blue-600 hover:underline" target="_blank"><img class="w-auto max-w-full" src="/assets/sponsors/bytebase/logo.webp" alt="Bytebase"/><span class="text-xs mt-2">Bytebase: Database DevOps and CI/CD for MySQL, PG, Oracle, SQL Server, Snowflake, ClickHouse, Mongo, Redis</span></a><a href="mailto:star@bytebase.com?subject=I'm interested in sponsoring star-history.com" target="_blank" class="w-full p-2 text-center bg-gray-50 text-xs leading-6 text-gray-400 rounded hover:underline hover:text-blue-600">Your logo</a></div></div></div><script id="__NEXT_DATA__" type="application/json" crossorigin="">{"props":{"pageProps":{"blog":{"title":"Star History Monthly Nov 2023 | Open-source Text To Speech Engines","slug":"tts","author":"Mila","featured":true,"featureImage":"/assets/blog/tts/banner.webp","publishedDate":"2023-12-22T09:50:06.000Z","excerpt":"Text To Speech, or TTS, is a part of human-computer dialogue that enables machines to speak, here are a few open-source TTS options to look at.","readingTime":4},"parsedBlogHTML":"\u003cp\u003eHappy (almost) holidays! Can\u0026#39;t believe we are already here: the last Star History Monthly of 2023.\u003c/p\u003e\n\u003cp\u003eA colleague of mine was making a tutorial series a while back and was looking at TTS apps to save her the hassle of recording and editing, which gave me the idea for this topic.\u003c/p\u003e\n\u003cp\u003eText To Speech, or TTS, which means \u0026quot;from text to speech\u0026quot;. It is a part of human-computer dialogue that enables machines to speak.\u003c/p\u003e\n\u003cp\u003eActually, OpenAI \u003ca href=\"https://openai.com/blog/chatgpt-can-now-see-hear-and-speak\"\u003estarted\u003c/a\u003e to roll out voice and image capabilities in ChatGPT this September, and you can basically speak to ChatGPT and have it talk back, or chat about images. But if you feel like having more freedom and control over your TTS program, here are a few open-source options to look at.\u003c/p\u003e\n\u003ch2\u003eEmotiVoice\u003c/h2\u003e\n\u003cp\u003e\u003cimg src=\"/assets/blog/tts/emotivoice.webp\" alt=\"emotivoice\"\u003e\u003c/p\u003e\n\u003cp\u003e\u003ca href=\"https://github.com/netease-youdao/EmotiVoice\"\u003eEmotiVoice\u003c/a\u003e is a multi-voice and prompt-controlled TTS engine, by the NetEase Youdao team. The project gained 4.2k stars within the first week of its launch.\u003c/p\u003e\n\u003cp\u003eEmotiVoice currently supports English and Chinese, with multiple voice and prompt controls. It offers over 2,000 different voices. The most prominent feature, as its name suggests, is emotional synthesis, which can create voices with various emotions such as happiness, excitement, sadness, anger, etc.\u003c/p\u003e\n\u003ch2\u003eCoqui TTS 🐸\u003c/h2\u003e\n\u003cp\u003e\u003cimg src=\"/assets/blog/tts/coqui-tts.webp\" alt=\"coqui-tts\"\u003e\u003c/p\u003e\n\u003cp\u003e\u003ca href=\"https://github.com/coqui-ai/TTS\"\u003eCoqui TTS\u003c/a\u003e is a TTS model that can clone voices in different languages in just 3 seconds. It enables cross-language voice cloning and multilingual speech generation so that you can quickly and easily create, cast, and direct AI voice actors. It\u0026#39;s got:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e🚀 Pretrained models in +1100 languages.\u003c/li\u003e\n\u003cli\u003e🛠️ Tools for training new models and fine-tuning existing models in any language.\u003c/li\u003e\n\u003cli\u003e📚 Utilities for dataset analysis and curation.\u003c/li\u003e\n\u003c/ul\u003e\n\u003cp\u003eCoqui TTS is essentially a continuation of Mozilla\u0026#39;s TTS: some time ago Mozilla stopped working on their TTS project, and \u003ca href=\"http://Coqui.ai\"\u003eCoqui.ai\u003c/a\u003e was founded by the founders of Mozilla’s machine learning group.\u003c/p\u003e\n\u003cp\u003eNote that it is licensed under MPL-2.0, which allows commercial use as well as modification of the code, but the copyright of the modified code belongs to the initiator of the software.\u003c/p\u003e\n\u003ch2\u003eTorToiSe 🐢\u003c/h2\u003e\n\u003cp\u003e\u003cimg src=\"/assets/blog/tts/tortoise.webp\" alt=\"tortoise\"\u003e\u003c/p\u003e\n\u003cp\u003e\u003ca href=\"https://github.com/neonbjb/tortoise-tts\"\u003eTorToiSe\u003c/a\u003e is a TTS program released in April 2022 (and got big in 2023). It has:\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003eStrong multi-voice capabilities.\u003c/li\u003e\n\u003cli\u003eHighly realistic prosody and intonation.\u003c/li\u003e\n\u003c/ol\u003e\n\u003cp\u003eThe author of Tortoise was actually inspired by OpenAI\u0026#39;s DALL-E model, released back in 2021. A while later, in 2022, the author \u003ca href=\"https://nonint.com/2022/10/28/ive-joined-openai/\"\u003ejoined\u003c/a\u003e OpenAI.\u003c/p\u003e\n\u003cp\u003eTortoise is an advanced TTS system known for its exceptional voice cloning capabilities and expressiveness. Its architecture is similar to GPT (hard to imagine why), which transforms input text into discritized acoustic tokens. One notable drawback of Tortoise is it is slower, compared to other TTS models, due to the decoders it uses, which got the project \u003ca href=\"https://github.com/neonbjb/tortoise-tts#whats-in-a-name\"\u003eits name\u003c/a\u003e.\u003c/p\u003e\n\u003ch2\u003eReal-Time Voice Cloning\u003c/h2\u003e\n\u003cp\u003e\u003cimg src=\"/assets/blog/tts/rvc.webp\" alt=\"rvc\"\u003e\u003c/p\u003e\n\u003cp\u003e\u003ca href=\"https://github.com/CorentinJ/Real-Time-Voice-Cloning\"\u003eReal-Time Voice Cloning\u003c/a\u003e can generate arbitrary speech in real-time. It is an implementation of the paper \u0026quot;\u003ca href=\"https://arxiv.org/pdf/1806.04558.pdf\"\u003eTransfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis\u003c/a\u003e\u0026quot; (SV2TTS, a deep learning framework) with a vocoder that works in real-time.\u003c/p\u003e\n\u003cp\u003eThe project started way back in 2019, and the author (who currently works for an AI Voice Generator company), \u003ca href=\"https://github.com/CorentinJ/Real-Time-Voice-Cloning#heads-up\"\u003ewrote\u003c/a\u003e that \u0026quot;just like everything else in Deep Learning, the Real-Time Voice Cloning repo is quickly getting old. Many other open-source repositories or SaaS apps (often paying) will give you a better audio quality than this repository will.\u0026quot; Even so, the repo has not lost its momentum, and has been steadily gaining stargazers over the years.\u003c/p\u003e\n\u003ch2\u003eVALL-E-X\u003c/h2\u003e\n\u003cp\u003e\u003cimg src=\"/assets/blog/tts/vall-e-x.webp\" alt=\"vall-e-x\"\u003e\u003c/p\u003e\n\u003cp\u003e\u003ca href=\"https://github.com/Plachtaa/VALL-E-X\"\u003eVALL-E-X\u003c/a\u003e is Microsoft\u0026#39;s cross-lingual neural codec language model, which is an extension of its original \u003ca href=\"https://www.microsoft.com/en-us/research/project/vall-e-x/vall-e/\"\u003eVALL-E\u003c/a\u003e. Microsoft initially published the research paper, and a team at Nanyang Technological University reproduced the results and trained their own model.\u003c/p\u003e\n\u003cp\u003eVALL-E X can synthesize personalized speech in another language for a monolingual speaker. Thanks to its powerful in-context learning capabilities, VALL-E X does not require cross-lingual speech data of the same speakers for training and can perform various zero-shot cross-lingual speech generation tasks, such as cross-lingual text-to-speech synthesis and speech-to-speech translation.\u003c/p\u003e\n\u003ch2\u003eLastly\u003c/h2\u003e\n\u003cp\u003eIf you want more AI content, check out earlier editions of the star history open-source monthly:\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e\u003ca href=\"/blog/ai-for-postgres\"\u003eAI Extensions for Postgres\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"/blog/coding-ai\"\u003eGitHub Copilot alternatives\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"/blog/cli-tool-for-llm\"\u003eCLI Tools for Working with LLMs\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"/blog/llama2\"\u003eLlama 2 and Ecosystem\u003c/a\u003e\u003c/li\u003e\n\u003cli\u003e\u003ca href=\"/blog/star-history-monthly-pick-202303\"\u003eChatGPT Special\u003c/a\u003e\u003c/li\u003e\n\u003c/ul\u003e\n"},"__N_SSG":true},"page":"/blog/[slug]","query":{"slug":"tts"},"buildId":"xKX4ZiOi_N7h3OBOEsSZu","isFallback":false,"gsp":true,"scriptLoader":[]}</script></body></html>