Real-time voice, video, and AI for developers

RTVI-AI Open Standard - Make an AI voice chat app in 21 lines of JavaScript

Rajiv Ayyangar

Product Hunt

RTVI-AI is a new open standard for Real-time Voice and Video Inference. Open source reference JavaScript and React SDKs are available today, with iOS, Android, and other platform SDKs coming soon.

Rajiv Ayyangar

Product Hunt

Hunter

I'm a fan of anything that enables builders to build better, faster, and more expressively. This seems promising in that regard. I know @kwindla and the Daily.co team have many decades of combined experience with the WebRTC community and other open source projects. It's exciting to see them setting a new standard for real-time AI inference.

From @kwindla:

---

Today we’re announcing an open standard for Real-time Voice and Video Inference: RTVI-AI.

The RTVI abstractions and data structures define how client applications communicate with inference services. These are the “real-time APIs” for use cases like:

- Voice chat with LLMs
- Enterprise voice workflows such as healthcare patient intake
- Video avatars and immersive experiences
- Voice-driven user interfaces
- Voice conversational apps for education, customer support, and games
- High-framerate image generation and streaming generative video

We’re shipping open source reference JavaScript and React SDKs today, with iOS, Android, and other platform SDKs coming soon.

This first release has been several months in the making, and incorporates work and insights from Groq, Deepgram, fal, Cartesia, Cerebrium, Vapi, and Daily.

With RTVI, a “hello world” voice-to-voice AI chat app in JavaScript is 21 lines of code.

If you want to build real-time AI applications, implement infrastructure for real-time inference, or implement your own SDKs that leverage the RTVI standard, you are more than welcome to join this project. We welcome all contributions and ideas!

1yr ago

Kwindla Kramer

Daily.co

Maker

Thanks, @rajiv_ayyangar! Really fun to see this on Product Hunt. We've been building a lot of real-time voice and video AI apps, and there's so much potential to do useful, interesting new things.

There's a live demo here: https://demo.rtvi.ai/

And lots of good discussion on the Discord here: https://discord.com/invite/pipecat

Our goal with RTVI is to make it easy to build AI voice-to-voice and real-time video applications.

* Application developers should be able to write code that can use any inference service.
* Inference services should be able to leverage open source for the complicated, client-side developer tooling needed for real-time multimedia.
* Any developer should be able to trivially stand up real-time AI infrastructure for small-scale use, testing, or prototyping.

1yr ago

Kyrylo Silin

Telebugs

Hey Rajiv, how does RTVI-AI handle scalability for large-scale applications? Are there any performance benchmarks available? Congrats on the launch!

1yr ago

varun

Daily.co

Maker

@kyrylosilin The goal of RTVI is to let you write client-side code without worrying about the underlying infrastructure. In theory, the infrastructure should be swappable. The current RTVI implementation uses Pipecat bots, which use WebRTC and the @dailyco infrastructure. The daily.co infrastructure can manage tens of millions of simultaneous calls, and we have a global footprint of 15 geo locations around the world, including us-east, us-west, canada, london, frankfurt, middle-east, mumbai, singapore, seoul, sydney, capetown, and saopaulo. That being said, since RTVI is open source, it’s possible to add other types of transports or services.
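
To make the "swappable infrastructure" idea concrete, here is a minimal sketch of client code that depends only on a small transport interface. It is illustrative only: `EchoTransport`, `ShoutTransport`, `VoiceSession`, and `runDemo` are hypothetical names invented for this example, not part of the RTVI SDKs or Pipecat, and the transports are in-memory stand-ins rather than real WebRTC connections.

```javascript
// Illustrative only: every name here is hypothetical, not an RTVI or Pipecat API.

// Transport A: echoes each message back unchanged.
class EchoTransport {
  constructor() {
    this.connected = false;
    this.handler = null;
  }
  connect() {
    this.connected = true;
  }
  onMessage(handler) {
    this.handler = handler;
  }
  send(message) {
    if (!this.connected) throw new Error("EchoTransport: not connected");
    if (this.handler) this.handler({ via: "echo", text: message.text });
  }
}

// Transport B: same interface, different behavior (upper-cases the reply).
class ShoutTransport {
  constructor() {
    this.connected = false;
    this.handler = null;
  }
  connect() {
    this.connected = true;
  }
  onMessage(handler) {
    this.handler = handler;
  }
  send(message) {
    if (!this.connected) throw new Error("ShoutTransport: not connected");
    if (this.handler) {
      this.handler({ via: "shout", text: message.text.toUpperCase() });
    }
  }
}

// Client code relies only on connect(), onMessage(), and send().
class VoiceSession {
  constructor(transport) {
    this.transport = transport;
    this.received = [];
    transport.onMessage((reply) => this.received.push(reply));
  }
  start() {
    this.transport.connect();
  }
  say(text) {
    this.transport.send({ type: "user-text", text });
  }
}

// The same client logic runs against whichever transport is passed in.
function runDemo(transport) {
  const session = new VoiceSession(transport);
  session.start();
  session.say("hello");
  return session.received;
}

const viaEcho = runDemo(new EchoTransport());
const viaShout = runDemo(new ShoutTransport());
console.log(viaEcho[0].text); // hello
console.log(viaShout[0].text); // HELLO
```

The point of the sketch is that `VoiceSession` never changes when the transport does, which is the property the reply describes for client code written against a shared standard.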

1yr ago

Andriy Semenets

DepsHub

Congratulations on the launch! How does the video part work here? Does it use the same WebRTC standard? Thanks!

1yr ago

varun

Daily.co

Maker

@semanser Yes, in the example above, it is sending both audio and video, but only receiving audio. It is possible to manipulate the video within Pipecat (the server side) and send it back. We will have demo code for this on GitHub shortly! And yes, Pipecat supports and defaults to Daily's WebRTC transport, so you get all the benefits of WebRTC's low latency and Daily's Global Mesh-SFU infrastructure.

1yr ago

Pavel Bocharov

Nebbl

Wow this is so cool! Congrats on the launch! I already see a couple of ideas to implement with this, upvoted!

1yr ago

Hassaan Raza

Tavus

Another amazing launch from the team at Daily. Appreciate all the great work y'all do @kwindla !

1yr ago

Rudi Skogman

Looks super powerful! Good job!

1yr ago

blank

Wow, this is super exciting stuff! 🚀 Kudos to @kwindla and the Daily.co team for pushing the boundaries of real-time AI inference! I love how RTVI-AI opens up so many possibilities for builders to create innovative solutions. The use cases listed are mind-blowing, especially voice chat with LLMs and immersive video experiences! Can't wait to see the iOS and Android SDKs roll out too. It’s great to know that it’s open-source, making it so accessible for developers! Definitely looking forward to trying out that 21-line "hello world" app. This feels like just the beginning. Let’s build some awesome stuff together!

1yr ago

Toshit Garg

Congratulations on the launch on PH...

1yr ago

Jayesh Gohel

What is RTVI-AI? Is it a new way to use AI for real-time voice and video? Can developers use it easily with JavaScript and React now? Will there be tools for other platforms like phones soon?

1yr ago

varun

Daily.co

Maker

@jpgohil93 Yes, we launched today with the React and Web/JS SDKs. We are working on iOS and Android SDKs, which will be announced shortly. The way to think about this is that RTVI is the client-side implementation, which is open source and can connect to essentially any server-side RTVI implementation. Today, the server-side implementation is pipecat.ai, which coordinates the configured speech-to-text, LLM, and text-to-speech services.
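
As a rough picture of what "coordinates the configured speech-to-text, LLM, and text-to-speech" means, the sketch below chains three stub services for a single conversational turn. Everything in it is an assumption made for illustration: `services`, `runTurn`, and the `transcriptHint` field are hypothetical, the stubs are plain functions rather than network calls, and none of it reflects the actual Pipecat or RTVI APIs.

```javascript
// Illustrative only: hypothetical stand-ins, not Pipecat or RTVI code.

// Each configured service is a plain function here; real ones would be
// streaming network calls to a speech or LLM provider.
const services = {
  // Pretend the "audio" object already carries its transcript.
  stt: (audio) => audio.transcriptHint,
  llm: (text) => `You said: ${text}`,
  tts: (text) => ({ format: "fake-audio", spokenText: text }),
};

// One turn: user audio in, transcript, LLM reply, synthesized audio out.
function runTurn(audioIn, { stt, llm, tts }) {
  const transcript = stt(audioIn);
  const reply = llm(transcript);
  const audioOut = tts(reply);
  return { transcript, reply, audioOut };
}

const userAudio = { transcriptHint: "what is RTVI?" };

// Default configuration.
const turn = runTurn(userAudio, services);
console.log(turn.reply); // You said: what is RTVI?

// Swapping one configured service leaves the rest of the turn untouched.
const terseTurn = runTurn(userAudio, { ...services, llm: () => "A standard." });
console.log(terseTurn.audioOut.spokenText); // A standard.
```

A real server would stream audio frames through these stages rather than passing whole objects, but the ordering of the three services is the same.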

1yr ago

jonathan ander

This sounds like a valuable tool for developers working with real-time voice and video. The open source approach and upcoming platform SDKs are impressive.

1yr ago