Adarsh Punj

Flapico - Prompt versioning, testing, and evaluation

Flapico lets you version, test & evaluate your prompts, and makes your LLM apps reliable in production.

🔓 Decouple your prompts from your codebase
📊 Quantitative tests, instead of guesswork
💻 Have your team collaborate on writing & testing prompts

Replies
Adarsh Punj
Maker
📌

Hi everyone,

I'm excited to launch Flapico. Over the last couple of years, I've worked with different companies to build & ship LLM apps, and I saw the same problems everywhere, irrespective of company size.

- People are unsure where to keep prompts (some track them in git, which gets messy).

- Functional teammates share prompts over Teams & Slack so developers can integrate them.

- The choice of LLM (and prompt) is often gut feeling or guesswork, instead of being driven by numbers.

Flapico addresses these issues, and more. As a developer, I know I never want to change my development paradigm. That's why Flapico doesn't aim to replace your existing tools or methods. Instead, it solves just the right problems.

It gives me joy to see how functional team members (like domain experts) are collaborating with developers on Flapico.

I'd love to hear your thoughts, and feedback.
Feel free to reach out at adarsh[at]flapico[dot]com

Masum Parvej

@adarsh_punj Would be cool to version prompt performance over time. Helpful when teams tweak things but want to track regressions.

Adarsh Punj

@masump Thanks so much for the suggestion. In the next phase, we're planning to build features around anomaly detection, so we can see how a prompt has been performing and alert users beforehand.

Parth Ahir

@adarsh_punj 

Totally get this — I’ve seen the same mess with prompts floating around Slack and no real system to track what works. Love that you’re not trying to reinvent the dev workflow but just solve the annoying parts. Curious to see how teams use it in the wild.

Supa Liu

Flapico offers a clever way to take control of your LLM prompts—versioning, testing, and evaluating them for reliability in production. I love how it replaces guesswork with real data and supports team collaboration, making prompt management way more professional and efficient.

Adarsh Punj

@supa_l Thanks!

Jason Chernofsky

this is awesome!

Evan

Is it possible to take it even further? Like, not just managing prompts, but also handling the LLM APIs and API keys across all my platforms.

Adarsh Punj

@yige Yes, you can bring your own models & API keys (which are encrypted with Fernet).
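
For readers unfamiliar with Fernet: it's symmetric, authenticated encryption from the Python `cryptography` package. A minimal sketch of how stored API keys could be round-tripped with it — names and values here are illustrative assumptions, not Flapico's actual implementation:

```python
# Sketch: encrypting a user-supplied API key at rest with Fernet.
# Everything below is a hypothetical example, not Flapico's code.
from cryptography.fernet import Fernet

# In a real service, this master key would live in a secrets manager,
# never hard-coded or generated per request.
master_key = Fernet.generate_key()
fernet = Fernet(master_key)

api_key = b"EXAMPLE-API-KEY"        # placeholder for a user's API key
token = fernet.encrypt(api_key)     # ciphertext safe to store in a DB

# Decrypt when the key is needed to call the provider.
assert fernet.decrypt(token) == api_key
```

Fernet tokens are authenticated, so tampered ciphertext raises an `InvalidToken` error on decrypt rather than returning garbage.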