
AI Development for Dummies (I’m the dummy) - plus resources from the real experts
It seems like every product on here these days is AI.
And every AI team is outstanding (I’ve even spotted a few in the same category as Meet-Ting - which I actually love; nothing better than building to save people time and bring humans together).
But let’s say you’re in the 0.1% not already building with AI - or you’re on the non-technical side of the founder house, like me.
What’s it really like to move from prototype to closed beta with an AI product?
Some of this might be basic for a lot of esteemed brains here, but if you’re a “dummy” like me, hopefully this guide helps:
Your devs need to know how to work with AI. Experience really matters - it’s not traditional software development.
AI isn’t binary. Outputs vary from run to run, so you need an eval framework to measure progress across a range of possible outcomes (rough sketch of the idea after this list).
It’s all about constant testing - and you need real human data to do it.
Our biggest challenge: every new addition to our “golden set” (a curated set of test examples with known-good answers) comes from a customer having a rough experience. If anyone has ideas on how to expand real-world testing without burning user trust, my DMs are open!
You can’t get to “delightful” without an eval framework - and you also need to define clearly what delightful means (ongoing discussion with Ting CTO...).
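For anyone who wants the gist in code: here’s a minimal sketch of a golden-set eval harness. To be clear, this isn’t Ting’s actual stack - `run_assistant`, the JSON file name, and the keyword-overlap scorer are all hypothetical stand-ins (real evals usually use rubric-based or LLM-as-judge scoring):

```python
# Minimal golden-set eval sketch. run_assistant and the scorer are
# hypothetical placeholders, not Ting's real setup.
import json
from dataclasses import dataclass

@dataclass
class GoldenExample:
    prompt: str    # the real interaction that went wrong
    expected: str  # what "good" would have looked like
    tags: list[str]  # e.g. ["scheduling", "timezone"]

def run_assistant(prompt: str) -> str:
    """Placeholder for the model/agent version under test."""
    raise NotImplementedError

def score(output: str, example: GoldenExample) -> float:
    """Toy scorer: 1.0 on exact match, partial credit on word overlap."""
    if output.strip() == example.expected.strip():
        return 1.0
    expected_words = set(example.expected.lower().split())
    output_words = set(output.lower().split())
    return len(expected_words & output_words) / max(len(expected_words), 1)

def run_evals(golden_set: list[GoldenExample]) -> None:
    if not golden_set:
        return
    results = [(ex, score(run_assistant(ex.prompt), ex)) for ex in golden_set]
    avg = sum(s for _, s in results) / len(results)
    print(f"average score: {avg:.2f} over {len(results)} examples")
    # Surface the worst cases so a human can look at them each week.
    for ex, s in sorted(results, key=lambda r: r[1])[:5]:
        print(f"  worst: {s:.2f} {ex.tags} {ex.prompt[:60]!r}")

if __name__ == "__main__":
    # Hypothetical file: each row has prompt / expected / tags fields.
    with open("golden_set.json") as f:
        golden_set = [GoldenExample(**row) for row in json.load(f)]
    run_evals(golden_set)
```

The point isn’t the scoring function (ours is way more involved) - it’s that every release gets measured against the same fixed set, so “progress” is a number instead of a vibe.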
Wrote more about this (from the POV of a first-time founder, aka the dummy, me) on Substack here: https://chiefting.substack.com/p/ai-development-for-dummies-im-the
Replies
As a non-technical founder myself, I can relate - AI development is a whole different beast. Love that you’re documenting the learning curve. How do you turn those rough early experiences into reliable training data?
Meet-Ting
@musa_molla Thanks! Right now our approach is pretty simple: every “rough” experience becomes part of what would typically be called a golden set. We save the interaction, define what “good” would have looked like, and then run new versions of Ting against that set every week. It’s painful (because it means someone had a bad time - messing up a meeting for someone is *messy*!), but it’s also the most reliable way we’ve found to measure progress. Still figuring out how to scale this without burning user trust though - any tips on how you’ve tackled it on your side?
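In code terms, the loop above looks roughly like this - again a hedged sketch, with hypothetical names standing in for “last week’s Ting” and “this week’s Ting”:

```python
# Hedged sketch of the capture-and-compare loop described above.
# The two callables and the dict shape are illustrative, not Ting's API.
from typing import Callable

def add_to_golden_set(golden_set: list[dict], interaction: dict, expected: str) -> None:
    """A rough experience becomes a test case: keep the prompt,
    write down what good would have looked like."""
    golden_set.append({"prompt": interaction["prompt"], "expected": expected})

def compare_versions(golden_set: list[dict],
                     old: Callable[[str], str],
                     new: Callable[[str], str]) -> None:
    """Run both versions over the set and flag regressions."""
    for ex in golden_set:
        old_ok = old(ex["prompt"]).strip() == ex["expected"].strip()
        new_ok = new(ex["prompt"]).strip() == ex["expected"].strip()
        if old_ok and not new_ok:
            print(f"REGRESSION: {ex['prompt'][:60]!r}")
        elif new_ok and not old_ok:
            print(f"fixed: {ex['prompt'][:60]!r}")
```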