Hi everyone,
One year ago, we launched on this platform with a vision to transform voice AI. Today, Vapi has grown to 100,000+ developers and recently raised our $20M Series A.
Voice AI has reached its tipping point. We're seeing hundreds of startups, agencies, and developers building innovative voice solutions for enterprises and SMBs on Vapi's platform.
AMA about building better voice products, promising use cases, voice models, and what's coming in 2025. I'll be around at 1PM PT to answer your questions!
@guilledhorno Awesome! Biggest thing we hear from customers is the need for better agent reliability. So we're investing a ton in Workflows, a way to design step-by-step conversation flows that stay on the rails :)
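For anyone curious what "staying on the rails" looks like mechanically, here's a rough Python sketch of the general idea. To be clear, this is a generic state-machine illustration, not Vapi's actual Workflows API; FLOW, run_step, and classify_intent are all made up for the example:

```python
# Hypothetical sketch of a step-by-step conversation flow (not Vapi's actual
# Workflows API): each step fixes what the agent says and which steps it may
# transition to, so the conversation can't wander off-script.

FLOW = {
    "greet":   {"say": "Hi! Are you calling to book or to cancel?", "next": ["book", "cancel"]},
    "book":    {"say": "What day works for you?",                   "next": ["confirm"]},
    "cancel":  {"say": "What's the name on the appointment?",       "next": ["confirm"]},
    "confirm": {"say": "Got it, you're all set. Anything else?",    "next": []},
}

def run_step(step_name, classify_intent):
    """Speak this step's prompt, then route only to an allowed next step."""
    step = FLOW[step_name]
    print(f"AGENT: {step['say']}")
    if not step["next"]:
        return None                          # terminal step: flow is done
    choice = classify_intent(step["next"])   # e.g. an LLM constrained to these labels
    return choice if choice in step["next"] else step_name  # off-script? re-ask
```

The reliability win is in that last line: the model only ever picks from an explicit set of next steps instead of free-running the whole conversation.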
Why is Siri so bad and what's involved in building a better alternative? What's hard about that problem?
@rajiv_ayyangar Thanks for the question! The current Siri stack is pre-generative.
To handle two-way, multi-turn interaction, you need to hold context across multiple turns, which means more complex and indeterminate inputs. Generative models are needed to handle that added complexity.
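To make the context-holding point concrete, here's a minimal sketch assuming an OpenAI-style chat messages format; call_llm is a placeholder for any chat-completion call, not a real SDK function:

```python
# Minimal sketch of multi-turn context: the full history is replayed to the
# generative model on every turn, so references like "it" or "that one"
# resolve against earlier turns. `call_llm` is a placeholder, not a real API.

history = [{"role": "system", "content": "You are a voice assistant."}]

def handle_turn(user_utterance, call_llm):
    history.append({"role": "user", "content": user_utterance})
    reply = call_llm(history)        # the model sees every prior turn
    history.append({"role": "assistant", "content": reply})
    return reply

# A pre-generative stack handles each utterance in isolation: with no history
# to condition on, "turn that one off too" has nothing to bind to.
```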
We'd expect Apple to come out with something in the next year, but there's a lot of risk that comes with non-deterministic models, so they're taking their time.
@rajiv_ayyangar Oh man. Siri. I sometimes have to use the "Bene Gesserit" voice from Dune to get Siri to turn out the lights. 😂
I've followed you and @nikhilro since you built @Superpowered , which I really admired for its craft and attention to UX. What prompted the pivot to AI voice infra?
@rajiv_ayyangar Haha the old days. To be honest, we burnt out. It was 3-4 years on Superpowered.
We grew it to profitability and it was a sustainable business, but we weren't growing fast. In general it's hard to build a unicorn as a B2C productivity tool; you have to go to enterprise. We could have either:
Picked a vertical and gone deep with note-taking (ex. healthcare scribes), then gone hard on enterprise.
Become another all-in-one team productivity platform, then gone hard on enterprise.
We didn't have the will to go down either of those paths, so we decided to pivot.
You've got lots of developers building on Vapi. What are a few of the products built on Vapi you're most excited about, and what do you think makes them special? (I'm guessing you work really closely with these teams so you get a privileged view of their iteration speed and style).
@rajiv_ayyangar We see everything from customer service to AI girlfriends haha. Generally I'm most excited by the ones that unlock voice for someone who couldn't access voice technology before.
Ex. we have customers serving tradespeople, helping them accept more inbound appointments after hours; others helping patients get their test results faster; etc.
Without these last-mile builders, this tech couldn't get out into the world.
Kinda niche UX question: As I'm using agents and apps in general, I find myself going to audio input / dictation more and more. I really wish there were more audio-in-text-out style interfaces, because it seems like the most efficient, highest-bandwidth interface for semantic info. Have you seen this anywhere?
For example, in ChatGPT advanced voice, it frustrates me to no end that I have to wait for a voice reply, rather than just reading the output.
@rajiv_ayyangar I do agree voice-in text-out will be the highest-bandwidth interface, and we are starting to see this with Apple Intelligence, but it's not quite real time yet.
I have seen some applications in drive-thru voice AI, like https://www.of.one/ where the user says what they want and the order form changes visually in real time. The order itself is a lot of context to hold in a person's memory (and it's annoying to have it all confirmed back to the user), so it makes sense here.
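To illustrate the voice-in, visual-out pattern, here's a purely hypothetical sketch (nothing to do with of.one's actual implementation; extract_items stands in for whatever NLU/LLM call parses the transcript):

```python
# Hypothetical voice-in, visual-out loop: each spoken utterance is folded into
# a running order dict, which is re-rendered on screen instead of being read
# back aloud. `extract_items` stands in for an NLU/LLM parse of the speech.

order = {}

def apply_utterance(utterance, extract_items):
    # extract_items returns (item, quantity_delta) pairs,
    # e.g. [("fries", 1)] for "add fries", [("fries", -1)] for "no fries"
    for item, delta in extract_items(utterance):
        order[item] = order.get(item, 0) + delta
        if order[item] <= 0:
            del order[item]
    render(order)

def render(current):
    print("--- Your order ---")
    for item, qty in current.items():
        print(f"{qty} x {item}")
```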
Other than that, I haven't seen enough innovation here. But I'd expect developments in Apple Intelligence to change this norm and drive the wave.
P.S. Wispr Flow!! https://wisprflow.ai/ Less of an interface, but a great dictation experience.
Do you have the same "hallucination" issues in AI voice as with text? Does that make it hard for large customers to adopt AI voice for deterministic use cases? If so, what 'breakthrough' is needed to overcome this?
Should we be concerned about "voice-likeness" and how models are storing our voice data for training? I'm thinking about voice actors and how they can secure (or enhance) their livelihood as models evolve and become more indistinguishable from real humans.
When using an agent with Twilio, the quality of the results often gets worse. How can we improve this? For example, in Hungarian the performance degrades significantly.
I was playing around with 11labs voice models yesterday, and as with OpenAI's voice models, I find myself having to record multiple takes of different text segments, since every take is slightly different and some takes have clear "voice flukes." To get a clean output I have to cherry-pick which iterations work best, then stitch them together in one session. I mainly use this for product presentation videos, so having a clean session without the "AI artifact" telltales is key, so people don't notice it's AI. I have to do the same with OpenAI's voice models; they also have artifacts in the output, though I actually prefer OpenAI's voice models over 11labs, at least for product videos. My question is: is there going to be a "cherry picker algorithm" out soon? It's kinda laborious to make nice clean outputs. How are you folks at Vapi thinking about this?