AI voice chat infrastructure that uses WebSockets. It can achieve voice-to-voice latency as low as 300 ms (comparable to GPT-4o) without a unified voice codec. Everything runs on a single high-end consumer GPU.
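For a rough idea of the transport side, here's a minimal sketch of streaming audio chunks over a WebSocket and sending synthesized audio back on the same connection. This is illustrative only, not the actual voicechat2 code: the handler names are mine, the recognition/synthesis stages are stubbed out, and it assumes a recent version of the `websockets` package.

```python
# Minimal WebSocket audio relay sketch (hypothetical, not voicechat2's implementation).
import asyncio
import websockets


async def transcribe(audio_chunk: bytes) -> str:
    # Stub: a real server would run speech-to-text here.
    return "hello"


async def synthesize(text: str) -> bytes:
    # Stub: a real server would run text-to-speech here.
    return text.encode()


async def handle_client(websocket):
    # Treat each binary frame as one audio chunk from the client mic.
    async for message in websocket:
        if isinstance(message, bytes):
            text = await transcribe(message)
            reply_audio = await synthesize(text)
            # Stream the reply back over the same socket to keep round-trip latency low.
            await websocket.send(reply_audio)


async def main():
    async with websockets.serve(handle_client, "localhost", 8765):
        await asyncio.Future()  # run forever


if __name__ == "__main__":
    asyncio.run(main())
```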
Hey,
I'm curious about the scalability. How many concurrent users can it support on a single GPU?
Do you have any plans to develop a hosted version for those who might not want to set it up themselves?
Congrats on the launch!
@kyrylosilin thanks for the well wishes. if you do scalability testing, feel free to post that info, I'm sure others might be curious about the answer as well. I have no plans for hosting this myself, but as this code is permissively licensed, anyone can adapt or deploy it however they want! (there are so many existing inferencing endpoints w/ cracked teams and deep pockets, it's hard to recommend that as a smart business idea though)