
Announcing: Voice Agents course and online community ...
@swyx and I are hosting a month-long technical deep dive into Voice AI and Voice Agents, starting in May.
Our goals are to:
cover all the lessons we've learned over the last two years building realtime, conversational AI,
host fun sessions with all our favorite people who are doing related things,
and build a long-term online community.
Sign-up link: https://lnkd.in/gnPuHyD4
We'll start announcing free credits for students on Monday. Sign up this weekend with promo code PHUNT for a super-secret Product Hunt community discount.
Last year I signed up for the LLM fine-tuning course taught by Hamel Husain and Dan Becker.
The experience was fantastic in every way. The material was great. The course expanded to cover way more than fine-tuning. It seemed like all of Twitter signed up. I met people in the course Discord who have become online and offline friends. Someone eventually dubbed the course "AI Woodstock." (I think credit for that goes to Swyx.)
We think this is the moment to try to create a similar thing for voice AI.
Voice interfaces are going to be a huge part of the near future of computing. Voice agents are being deployed at scale today for a wide range of use cases:
collecting patient data prior to healthcare appointments,
following up on inbound sales leads,
handling an increasing variety of call center tasks,
coordinating scheduling and logistics between companies, and
answering the phone for nearly every kind of small business.
I'm personally excited about voice interactions for games, realtime video, and voice-enabled programming environments.
Here's my hacked-together, messy, voice-based dev environment:
Voice-driven loop with screen-shotting so the LLM in the loop can see what's in my terminal and editor. The prompt varies depending on what I'm trying to drive with this loop.
A few tool definitions that give read access to files and URLs.
A tool that takes a block of LLM output and turns it into keyboard events, so the LLM can drive any editor or terminal.
A separate process that watches a directory and constantly makes LLM-driven git commits (git autosave).
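To make the last item concrete, here is a minimal sketch of what a "git autosave" watcher could look like. It is not the actual setup described above: it polls file modification times instead of using a filesystem-event library, and `commit_message` is a hypothetical placeholder for the LLM call that would write the real message. The function names and the 5-second interval are all assumptions.

```python
import subprocess
import time
from pathlib import Path


def snapshot(root: Path) -> dict[str, float]:
    """Map each file under root (skipping .git) to its last-modified time."""
    return {
        str(p.relative_to(root)): p.stat().st_mtime
        for p in root.rglob("*")
        if p.is_file() and ".git" not in p.relative_to(root).parts
    }


def changed_paths(old: dict[str, float], new: dict[str, float]) -> list[str]:
    """Return paths that were added, removed, or modified between snapshots."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))


def commit_message(paths: list[str]) -> str:
    """Hypothetical stand-in for the LLM call: just list the changed files."""
    return "autosave: " + ", ".join(paths)


def autosave_loop(root: Path, interval_s: float = 5.0) -> None:
    """Poll root forever, committing whenever the snapshot changes."""
    previous = snapshot(root)
    while True:
        time.sleep(interval_s)
        current = snapshot(root)
        paths = changed_paths(previous, current)
        if paths:
            subprocess.run(["git", "add", "-A"], cwd=root, check=True)
            # check=False: `git commit` exits non-zero when nothing is staged.
            subprocess.run(
                ["git", "commit", "-m", commit_message(paths)],
                cwd=root,
                check=False,
            )
        previous = current
```

Running `autosave_loop(Path("."))` in a separate terminal would keep committing in the background. Swapping `commit_message` for a function that sends the output of `git diff --staged` to a model is where the "LLM-driven" part would come in.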
I have some pieces of this running most of the time. But I'm lazy, and doing other stuff, and I also try to use a variety of editors and tools, to see what's good lately. Which means no stability, so my hacked-together stuff is always broken.
I don't want to replace @Windsurf / @Cursor / Claude Code. A seriously good agent and expert-system dev toolkit is a lot of work.
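For a sense of how small the read-access tools from the list above can be, here is a rough sketch using only the Python standard library. The schema shape follows the JSON-Schema style that most function-calling APIs accept, but the exact wrapper differs by provider, so treat `TOOLS`, `dispatch`, and the `MAX_CHARS` cap as illustrative assumptions rather than any particular SDK's format.

```python
import json
import urllib.request
from pathlib import Path

MAX_CHARS = 20_000  # hypothetical cap so one tool result cannot flood the context

# Tool descriptions to hand to the model (generic JSON-Schema style).
TOOLS = [
    {
        "name": "read_file",
        "description": "Return the text contents of a local file.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
    {
        "name": "read_url",
        "description": "Fetch a URL and return the response body as text.",
        "parameters": {
            "type": "object",
            "properties": {"url": {"type": "string"}},
            "required": ["url"],
        },
    },
]


def read_file(path: str) -> str:
    """Read a local file as text, truncated to MAX_CHARS."""
    return Path(path).read_text(encoding="utf-8", errors="replace")[:MAX_CHARS]


def read_url(url: str) -> str:
    """Fetch a URL and decode the body as text, truncated to MAX_CHARS."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")[:MAX_CHARS]


HANDLERS = {"read_file": read_file, "read_url": read_url}


def dispatch(name: str, arguments_json: str) -> str:
    """Run the tool the model asked for; return errors as text, never raise."""
    if name not in HANDLERS:
        return f"error: unknown tool {name!r}"
    try:
        return HANDLERS[name](**json.loads(arguments_json))
    except Exception as exc:
        return f"error: {exc}"
```

Returning errors as plain strings is a deliberate choice in this sketch: the model sees the failure and can retry with a different path or URL instead of the whole loop crashing.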