Simulation testing for AI agents

Janus - Simulation testing for AI agents

by Product Hunt

Janus battle-tests your AI agents to surface hallucinations, rule violations, and tool-call/performance failures. We run thousands of AI simulations against your chat/voice agents and offer custom evals for further model improvement.

Replies

Janus

Maker

📌

Hi, we're Jet and Shivum, and today we're launching Janus!

AI agents are breaking in production - not because companies aren't testing, but because traditional testing doesn't match real-world complexity. Static datasets and generic benchmarks miss the edge cases, policy violations, and tool failures that actual users expose.

We built Janus because we believe the only way to truly test AI agents is with realistic human simulation at scale - AI users stress-testing AI agents.

What makes Janus different?

Unlike other platforms, we don't give you canned prompts or off-the-shelf evals. Instead, we generate thousands of synthetic AI users that:

1. Think, talk, and behave like your actual customers
2. Run thousands of realistic multi-turn conversations
3. Evaluate agents with tailored, rule-aware test cases
4. Judge fuzzy qualities like realism and response quality—not just guardrail pass/fail
5. Track regressions and improvements over time
6. Provide actionable insights from advanced judge models

This is simulation-driven testing designed for your domain - not generic playgrounds.
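As a rough illustration of the loop described above (this is not Janus's implementation or API, which the post does not describe), simulation-driven testing can be sketched as synthetic personas holding multi-turn conversations with an agent while rule checks run on every reply. Everything named here is a hypothetical stand-in: `Persona`, `simulate`, `run_suite`, `toy_agent`, and the `no_guarantees` rule. A real setup would presumably drive the personas with a language model rather than canned message pools.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types: an agent maps the conversation history to a reply,
# and a rule returns True when a reply is compliant.
Agent = Callable[[List[str]], str]
Rule = Callable[[str], bool]


@dataclass
class Persona:
    """A scripted synthetic user (stand-in for an LLM-driven one)."""
    name: str
    openers: List[str]
    follow_ups: List[str]

    def next_message(self, turn: int, rng: random.Random) -> str:
        # First turn uses an opener; later turns use a follow-up.
        pool = self.openers if turn == 0 else self.follow_ups
        return rng.choice(pool)


def simulate(agent: Agent, persona: Persona, rules: Dict[str, Rule],
             turns: int, seed: int) -> List[str]:
    """Run one multi-turn conversation and return the rule violations found."""
    rng = random.Random(seed)  # seeded so a failing run can be replayed
    history: List[str] = []
    violations: List[str] = []
    for turn in range(turns):
        history.append(persona.next_message(turn, rng))
        reply = agent(history)
        history.append(reply)
        for rule_name, is_compliant in rules.items():
            if not is_compliant(reply):
                violations.append(f"{persona.name}/turn {turn}: {rule_name}")
    return violations


def run_suite(agent: Agent, personas: List[Persona], rules: Dict[str, Rule],
              runs_per_persona: int = 3, turns: int = 2) -> Dict[str, List[str]]:
    """Run several seeded conversations per persona and group violations."""
    report: Dict[str, List[str]] = {}
    for persona in personas:
        found: List[str] = []
        for seed in range(runs_per_persona):
            found.extend(simulate(agent, persona, rules, turns, seed))
        report[persona.name] = found
    return report


def toy_agent(history: List[str]) -> str:
    """Deliberately flawed agent: it promises refunds it should not promise."""
    if "refund" in history[-1].lower():
        return "Sure, I guarantee a full refund."
    return "I can help with that. Could you share your order number?"


# Assumed example policy: the agent must never guarantee an outcome.
RULES: Dict[str, Rule] = {
    "no_guarantees": lambda reply: "guarantee" not in reply.lower(),
}

PERSONAS = [
    Persona("impatient_customer",
            ["I want a refund now."],
            ["Why is this taking so long? Refund me."]),
    Persona("curious_customer",
            ["Where is my order?"],
            ["Can you check the shipping status?"]),
]

if __name__ == "__main__":
    for name, violations in run_suite(toy_agent, PERSONAS, RULES).items():
        print(name, len(violations), "violation(s)")
```

Seeding each run keeps a failing conversation replayable, which is one simple way to track regressions across agent versions.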

🧠 Our Vision
We believe human simulation will become the standard for AI agent evaluation. As agents become more sophisticated, only realistic human behavior can truly stress-test their capabilities and surface edge cases before users do.

🚀 Try Janus Today
Book a demo today and see Janus generate custom AI users for your specific business!
We rethought AI agent testing from the ground up with human simulation - let's make reliable AI agents the norm, not the exception.

Get started at withjanus.com

3mo ago

Prit

A lot of AI companies have built powerful AI models, but even their developers couldn't trust the results because of hallucinations, policy violations, etc.

I hope this lets them sleep without worry :) Congratulations!

3mo ago

Janus

Maker

@pritraveler Thank you so much! That's exactly why we built this ourselves as well - it's so easy to ship an AI agent. But why haven't evals and testing gotten easier as well? Janus is our passion project to help fix this!

3mo ago

Janus provides exactly the kind of rigorous testing AI agents need before going live. The large-scale simulations and customizable evaluations make it a powerful ally for building more reliable systems.

3mo ago

Janus

Maker

@supa_l We love to hear that! You put it perfectly - right now evals need to be fundamentally rethought as conversational AI becomes more and more important in our everyday lives.

3mo ago

CoLaunchly

Congratulations on the launch, Jet and Shivum! Janus sounds like a game-changer for AI testing. The focus on realistic human simulation to stress-test AI agents is so crucial in addressing real-world complexities. Excited to see how this advances reliable AI development. Best of luck!

3mo ago

Janus

Maker

@alex_cloudstar Thank you so much! We're thrilled to hear that, and we completely agree - capturing real-world nuance is essential for building robust AI. Janus is just the beginning, and we’re excited to push the boundaries of what’s possible in AI testing. Appreciate the support! 🙌

3mo ago

Hyring

This looks interesting, @jw_12! We're currently using Coval and would like to understand how Janus is priced, as well as some of its key differentiators.

3mo ago

Janus

Maker

@adithyan_rk Would love to chat! Feel free to book a demo!

3mo ago

Geocities.live

@jw_12 We definitely need to introduce Janus in @Job for Agent 🔥

3mo ago

Janus

Maker

@kamilstanuch Thanks Kamil!

3mo ago

All the best for the launch @jw_12 & team!

3mo ago

Janus

Maker

@parekh_tanmay Thanks Tanmay, really appreciate the support!

3mo ago

Jazzberry

How do you get the thousands of synthetic AI users to behave differently, so that you cover all user paths?

3mo ago

Janus

Maker

@marco_dewey Great question, Marco! We use a mix of data-driven techniques to make the magic happen - but there's definitely still a long way to go in refining and improving our product!

3mo ago

We're having precisely this problem at our company right now. I'll reach out for a demo!

3mo ago

Janus

Maker

@manuelflara Looking forward to chatting Manuel, would love to find a way to help!

3mo ago

Janus is a powerful tool for testing and improving AI agents! By running thousands of simulations, it surfaces hallucinations, rule violations, and tool-call/performance failures, ensuring your agents are battle-tested and reliable. I’m excited to see how its custom evaluations help fine-tune models for better performance and accuracy!

3mo ago

Pokecut

Congrats on the launch! 🚀 Janus sounds like an essential tool for anyone building with AI agents. Love how you’re tackling the tough problems like hallucinations and rule violations—those can be real blockers for scaling AI solutions. The ability to run thousands of simulations and get custom evaluations is a game-changer for improving model reliability and trustworthiness. Definitely excited to try it out and see how it can help take our AI products to the next level! 💡🤖👏

3mo ago

Grok Button

super neat but...pricing doesn't seem to be simple/transparent?

https://www.withjanus.com/pricing

3mo ago

Janus

Maker

@niyogi Appreciate you checking it out! Totally fair point on pricing - we’re working closely with early partners to tailor things based on usage patterns, but we hear the call for more transparency. Thanks for the nudge!

3mo ago

Congrats on the launch! Cool demo and great team! :)

3mo ago

Janus

Maker

@xinding thanks Xin - really appreciate the vote of confidence!

3mo ago

Such a game changer! Congrats on the launch!

3mo ago

Janus

Maker

@jennifer_song3 thanks Jennifer!

3mo ago

Den

run my ai agents through this pls

3mo ago

Janus

Maker

@justin_lee27 Haha might have to!

3mo ago

I think this is much needed for AI agents.

3mo ago

The "Jenkins for AI agents" is born 🛠️. Must-have for:

- Deterministic scenario replay 🔄

- Multi-agent collision testing 💥

- Ethical boundary stress tests ⚖️

3mo ago

This is so needed.

3mo ago

huge

3mo ago