
Inworld TTS - Voice AI that’s 5% of the cost. 100% of the quality.
Inworld TTS makes state-of-the-art Voice AI more accessible, with radically affordable pricing ~20x lower than comparable models. It's real-time, multilingual and offers free voice cloning. We're also open sourcing our training and modeling code.
Replies
Inworld
Hi Product Hunt! I'm Kylan, co-founder of @Inworld, and I'm stoked to share Inworld TTS with you.
We've spent the past four years working alongside thousands of builders, and this launch represents a lot of what we've learned since our first Product Hunt debut in 2022. At Inworld, we build AI products that help consumer applications grow and evolve with their users. Inworld TTS is our first step towards removing a critical barrier that is keeping builders from their next million users.
New users get 2M free characters. Accessible via API or in our Playground. Try it now.
Inworld TTS delivers state-of-the-art quality and latency at the most affordable price on the market.
$5 per million characters, with comparable models around $100. Here’s a quick example:
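As a rough illustration of the pricing above, the sketch below compares the two quoted per-million-character prices for a given volume of text. The function name `tts_cost_usd` and the 50M-character monthly workload are hypothetical, chosen only for illustration; the $5 and $100 figures are the ones quoted in this post, and the free character allowance is not modeled.

```python
def tts_cost_usd(num_characters: int, price_per_million_usd: float) -> float:
    """Cost in USD of synthesizing num_characters at a per-million-character price."""
    return num_characters / 1_000_000 * price_per_million_usd


# Prices quoted in this post (USD per million characters).
INWORLD_PRICE = 5.0
COMPARABLE_PRICE = 100.0  # the "around $100" figure for comparable models

# Hypothetical workload, for illustration only.
monthly_characters = 50_000_000

inworld_cost = tts_cost_usd(monthly_characters, INWORLD_PRICE)
comparable_cost = tts_cost_usd(monthly_characters, COMPARABLE_PRICE)

print(f"Inworld TTS:      ${inworld_cost:,.2f}")
print(f"Comparable model: ${comparable_cost:,.2f}")
print(f"Ratio:            {comparable_cost / inworld_cost:.0f}x")
```

At these list prices the ratio is the same at any volume, which is where the "5% of the cost" and "~20x lower" figures come from.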
What do you get?
✅ Industry-leading quality (Word Error Rate & Speaker Similarity)
✅ Real-time latency (median latency of 200ms)
✅ Free zero-shot voice cloning
✅ Multilingual and crosslingual support
✅ Audio markups for emotion, style and nonverbals
✅ SOC2 Type II + on-premise deployments
✅ Open-sourced training and modeling code
How is this possible?
We're focused on removing the most pressing infrastructure barriers that keep great AI applications from scaling. Voice is one of the biggest cost and complexity hurdles facing today's builders, so we decided to tackle it head-on.
We repurposed large language models for speech synthesis rather than using traditional TTS architectures. This innovative approach, combined with streamlined serving infrastructure, enabled us to deliver state-of-the-art quality and real-time performance at a fraction of the cost. You can read (or listen to) the specifics here.
Where can you try it?
Inworld TTS is available today via API and can be experienced in our TTS Playground, where you can test pre-built voices or clone your own voice. Find more technical details in our blog post or try it now.
Inworld
@kylan_gibbs Exciting to be launching on Product Hunt again: our biggest product launch to date, with much more to come. Scale is getting solved. Evolution is next.
Inworld
@kylan_gibbs looking forward to the next steps :)
Inworld
@kylan_gibbs It's truly remarkable to see what the team has accomplished with this product. The effort and brilliance that's gone into this labor of love cannot be overstated and is clearly evident in the results. So pumped to see what's to come!
@kylan_gibbs This is a game changer for developers - Excited to see users interact with AI via voice seamlessly!
@kylan_gibbs Can't wait to see the great things that developers are going to build using this!
This looks impressive! How does Inworld TTS handle different languages and accents?
Inworld
@evgenii_zaitsev1 Thanks! We currently support 11 languages, not including accents. If a voice cloning prompt represents all the particularities of an accent, the model will reproduce most of them. If you have more data, we can perform professional cloning (self-service will be available later). We have customers to whom we delivered multiple voices with a New York accent, a Kiwi accent, etc.
The Max model is better for multilingual support and accent preservation. It is currently available in the UI only; API access is coming soon.
The next model iteration will have further improvements in quality and support for more languages. If you are interested in specific languages, please let us know. For now, if you have an app, you can set up routing that sends traffic to Inworld for the languages we support.
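The routing idea mentioned above could be sketched roughly as follows. This is a minimal illustration, not an Inworld API: the function `pick_tts_provider`, the provider labels `"inworld"` and `"fallback"`, and the use of two-letter ISO 639-1 codes are all assumptions. The language set mirrors the 11 languages named elsewhere in this thread (English, Chinese, Korean, Dutch, French, Spanish, Japanese, German, Italian, Polish, Portuguese).

```python
# Assumed ISO 639-1 codes for the 11 languages listed in this thread.
INWORLD_LANGUAGES = frozenset(
    {"en", "zh", "ko", "nl", "fr", "es", "ja", "de", "it", "pl", "pt"}
)


def pick_tts_provider(language_tag: str, fallback: str = "fallback") -> str:
    """Return a provider label for a language tag such as 'en', 'en-US' or 'pt_BR'.

    Only the primary language subtag is considered; region and script
    subtags are ignored. Both labels are hypothetical placeholders.
    """
    primary = language_tag.strip().lower().replace("_", "-").split("-")[0]
    return "inworld" if primary in INWORLD_LANGUAGES else fallback


if __name__ == "__main__":
    for tag in ("en-US", "pt_BR", "ja", "hi-IN"):
        print(tag, "->", pick_tts_provider(tag))
```

In a real app the returned label would select whichever client your code already uses for each provider, and the language set would be updated as more languages are added.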
No Cap
@No Cap will be trying this out!
Inworld
@ednevsky let us know what you think! Always looking forward to feedback :)
I really see a fit here. We're cloning spokesperson voices for our clients and own companies at the moment... Is that something you're looking to launch as well? Clone voices, that is.
Inworld
@manuel_lemholt_berger Yes. Currently, you can clone any voice from a 5-15 second sample via the UI. We also offer professional cloning if you have longer or more voice samples. Soon, all of this functionality will be available via API.
The pricing here is awesome — to the point where you can imagine adopting voice capabilities into contexts where it previously would have been economically infeasible.
What are some interesting use cases that people might be willing to try now that before they wouldn't have bothered with?
@chrismessina Yes, exactly! Things like personalized language learning apps, voice-narration-driven AI games, real-time translation, and voice-enabled customer support agents at scale. But that's just scratching the surface. Really excited to see what people build!
Inworld
@chrismessina that's the objective :). Democratizing the technology so that all consumer applications, whether from startups or enterprises, can have access.
Inworld
And by the way, both Inworld TTS-1 and TTS-1-Max models are multilingual!
They support English, Chinese, Korean, Dutch, French, Spanish, Japanese, German, Italian, Polish, and Portuguese.
This looks amazing! We’ll be rolling this out for our team!
Inworld
@kim_schreiber love it! Let us know if you have any questions during implementation!
This rocks!!! Voice clone is insanely good
Inworld
Fantastic work on Inworld TTS! The decision to open-source your training and modeling code is a significant step, fostering transparency and community collaboration. Beyond the technical aspects, what kind of innovative use cases are you most excited to see emerge from this open-source initiative?
Inworld
Exitfund
Interesting approach. How do you ensure the AI evolves in sync with user behavior and not just usage data?
Inworld
@startupsharma For TTS in particular, this problem is yet to be solved. But there is an idea for how: end users listening to the model's generations could like or dislike them, and the model could then be updated to behave to their liking. We're working on solutions to make this part of the offering.
Tribalist
Fantastic voice quality, translation support, and features comparable to their competitors', all at an unbeatable price! Looking forward to putting it to use. Congrats to @kylan_gibbs and the @Inworld team!
Inworld
Thanks @vlasso ! Appreciate your support!
We are excited for you to put it to use! Keep us posted on any feedback you may have.
Really impressive team and consistently high quality product output! The fact that you can clone a voice with only 5-15 seconds of audio is wild!
Inworld
Really great voice quality! Nice work from your researchers and looking forward to trying it out in some of my projects.
Inworld
We are having a lot of fun experimenting with this model in the office! Great work, team.
Inworld
So many use cases for TTS!! This is awesome, will definitely try!!
The examples and features sound good! I'm going to try some of these voices out to power Mary Maker.
Inworld
Cloned my voice and I was impressed with how well it picked up the nuance in my Nepali accent. Good job Inworld!
Inworld
Such a wonderful TTS model. Its voice cloning function surprised me.
Inworld
@aalisabb great, thanks!
Congrats!! Woohoo!
Absolutely love this feature! It looks really impressive.