Pi is a toolkit of 30+ AI techniques designed to boost the quality of your AI apps. Pi first builds a scoring system that captures your application requirements, then compiles 30+ optimizers against it: automated prompt optimization, search ranking, RL, and more.
👋 Hi Product Hunt makers! I'm Dhruv, founding team member at Pi Labs. We've heard from all our clients that the hardest problem in evals is not the tooling or the workflow; it's knowing what good looks like and how to measure it. We're excited to announce our second product launch, the Pi Copilot, to solve exactly this, and we're keen to get feedback from the maker community.
• The copilot builds your first set of eval metrics within seconds, so you and your team don't have to spend time brainstorming metrics. Instead of endlessly iterating on prompts to make your LLM-as-a-judge "work", watch the copilot write qualitative checks for you, plus Python code for more objective metrics.
• These evals use our proprietary Pi Scorer language models, small and fast encoder models trained specifically for scoring, which let you assess 20+ quality dimensions in under 100 ms and provide faster, more consistent scoring than LLM-as-a-judge.
• Pi's scoring models can be calibrated with human feedback. Whether you use manual calibration, labeled data, or preference pairs, your scoring system adjusts to your and your users' preferences, creating robust feedback loops for your application.
• Because our scorers are so fast and lightweight (under 100 ms), they can be used beyond just evaluation, from reward modeling for RL to online control flow with agents.
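To make the "Python code for more objective metrics" and "online control flow" ideas above a bit more concrete, here is a minimal sketch of what such checks could look like. This is not Pi's API: every function name, the check list, and the 0.75 threshold are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict

# Hypothetical objective checks. None of these names come from Pi's API;
# they only illustrate the kind of code-based metric described above.
def within_word_limit(response: str, max_words: int = 50) -> float:
    """Return 1.0 if the response has at most max_words words, else 0.0."""
    return 1.0 if len(response.split()) <= max_words else 0.0


def has_no_placeholder(response: str) -> float:
    """Return 0.0 if the response still contains template debris, else 1.0."""
    return 0.0 if "{{" in response or "TODO" in response else 1.0


# Assumed registry of checks; a real scoring system would likely mix
# code-based checks like these with model-scored qualitative dimensions.
CHECKS: Dict[str, Callable[[str], float]] = {
    "within_word_limit": within_word_limit,
    "has_no_placeholder": has_no_placeholder,
}


def score(response: str) -> float:
    """Unweighted average of all checks, in the range [0, 1]."""
    results = [check(response) for check in CHECKS.values()]
    return sum(results) / len(results)


def accept(response: str, threshold: float = 0.75) -> bool:
    """Simple online gate: accept the response only if it scores high enough."""
    return score(response) >= threshold
```

In an agent loop, a gate like `accept` could decide whether to return a draft or trigger a retry; the same `score` value could also serve as a reward signal, assuming the checks reflect what you actually care about.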
Try Pi out, no sign-in required, at https://withpi.ai. Have Pi build your first scoring system in seconds and start optimizing your AI right away. You can also visit https://code.withpi.ai for our API reference and links to end-to-end tutorials and notebooks that show you how to use these techniques in real-world examples.
this is really smart!