Spending countless hours manually testing your chatbot after each change? Automate the full testing process with bottest.ai, the no-code platform to build quality, reliability, and safety into your AI-based chatbot. Get started now: https://bottest.ai
Hi! I'm Noah the Founder and CEO of bottest.ai
Our mission is to help AI creators who are building chatbots with the testing process. Oftentimes, developers or product managers spend hours every week manually having the same conversations with their chatbot to make sure quality hasn't slipped after each internal change.
Traditional testing paradigms don't solve this issue for 3 main reasons:
1. Manual testing quickly becomes overwhelming as your chatbot evolves.
2. Developer-created evaluations eat up precious time and rarely cover all scenarios.
3. Rigid testing scripts can't capture the fluid nature of language in conversations.
bottest.ai is a no-code platform to fully automate the testing of your chatbot. We use an AI-powered evaluation engine to determine whether the quality of your chatbot is degrading with each change.
We are currently running our beta program, which gives all users full access, completely free, for the next 6 months. Get started testing now! https://bottest.ai
@noah_moscovici Hey Noah, cool idea. I'm currently building a chatbot product. We have some evals already set up. Is there an easy way for us to import these into your tool? Check out our launch and lmk if we're actually a good fit for what you're building
I signed up :)
@shashank_sanjay Thank you! We just launched our beta program, which gives you full access completely free for the next 6 months.
I'm curious to learn more about your current setup. We don't have a self-serve way to import evals yet, but depending on what type of data you have and how you're storing it, there may be some easy ways we can import it for you!
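For reference, evals stored as simple prompt/expected-response pairs tend to be the easiest to bring over. Here's a quick Python sketch of that shape; the field names and file name are just illustrative, not a required bottest.ai format:

```python
import json

# Hypothetical example of eval data as prompt/expected-response pairs.
# These field names are illustrative, not bottest.ai's actual import format.
existing_evals = [
    {
        "prompt": "What is your refund policy?",
        "expected_response": "Refunds are available within 30 days of purchase.",
    },
    {
        "prompt": "How do I reset my password?",
        "expected_response": "Use the 'Forgot password' link on the login page.",
    },
]

with open("evals_export.json", "w") as f:
    json.dump(existing_evals, f, indent=2)
```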
@dash4u Good question! It actually touches on two separate points, so I'll answer both:
1. How do we evaluate whether something is "correct"?
We have an AI-powered evaluation engine that breaks down the conversation between you and the chatbot and compares it to the baseline from when you originally recorded the test. The engine can accurately pick out the key differences in the test, along with any inconsistencies or deviations from the baseline, and then determine whether those differences are significant enough to fail the test, or pass it if they're just minor changes with no semantic difference.
2. What if my definition of "correct" is unique to my product?
You can fully customize what constitutes a "pass" in bottest.ai. This is called "Success Criteria" and can be customized at the Test level or across an entire Suite. You define exactly which types of differences should pass and which should fail. So if you don't care about tone or intent differences and only about factual information, you can specify that in your configuration!
Or maybe you expect the chatbot to respond with a lot of variance and just want to make sure the general tone/intent stays the same; you can specify that as well. There's unlimited freedom when it comes to customizing this aspect of the bottest.ai evaluation process.
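To make the idea concrete, here's a rough Python sketch of the general pattern: an LLM judge compares a new response to a recorded baseline and only fails on the difference types you care about. This is purely illustrative (bottest.ai is no-code, and this is not our engine or API); the model choice and prompt wording are assumptions:

```python
from openai import OpenAI  # assumes the openai package and an API key

client = OpenAI()

# Hypothetical success criteria: fail on factual drift, ignore style.
SUCCESS_CRITERIA = (
    "Fail only if factual information differs from the baseline. "
    "Ignore differences in tone, phrasing, or intent."
)

def judge(baseline: str, new_response: str) -> str:
    """Compare a new chatbot response against its recorded baseline."""
    prompt = (
        f"Baseline answer:\n{baseline}\n\n"
        f"New answer:\n{new_response}\n\n"
        f"Success criteria: {SUCCESS_CRITERIA}\n"
        "Reply with PASS or FAIL followed by a one-line reason."
    )
    result = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return result.choices[0].message.content
```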
Super excited about bottest.ai, Noah! 🙌 Automating the testing process for chatbots is a game changer. Manual testing can be such a time sink, and your no-code approach is going to help a lot of developers focus on what really matters. Can't wait to see the traction you gain during this beta phase! Upvoting this for sure!
This sounds like a game changer, Noah! 🛠️ Automating the testing process for chatbots seems like a huge relief for developers who are constantly updating their bot. I'm curious, though—what specific types of evaluations does your AI engine perform that traditional methods might miss? Also, how do you ensure the platform adapts to different languages or dialects? Since chatbots are increasingly used globally, I wonder how your tool can handle that fluidity in conversation across diverse linguistic contexts. Looking forward to seeing how bottest.ai evolves during the beta phase!
@star_boat Thank you for your thoughts! You asked a couple of good questions, so hopefully I can answer both:
1. What types of evaluations does your AI engine perform that traditional methods miss?
Traditional testing methods fail when testing chatbots in 3 main ways:
a) Language-based responses are subjective and non-deterministic. Unlike traditional software testing, where an input has a fixed expected output, responses from a chatbot require a nuanced evaluation based on semantic meaning.
b) Upgrades or improvements in the AI can cause unexpected issues elsewhere. Each change to the underlying LLM or AI model may improve the quality of answers on some questions but cause quality degradation on other prompts. Developing a high-quality AI chatbot without extensive regression testing is practically impossible.
c) The subject matter experts often aren't the ones maintaining the code. The people in your team who can best determine how the chatbot should perform are rarely the same engineers working on maintaining the tests. This makes it very difficult to build comprehensive automated test coverage for a chatbot.
Our AI evaluation engine solves the first point, evaluating language responses in a multi-step process that picks apart the key differences you care about and skips noise like rephrasing or synonyms. However, using an AI to test the quality of another AI isn't anything new. The true power of bottest.ai comes from the other two points: automated regression tests across all questions whenever your chatbot changes, and the ability for product owners to directly manage the tests and quality (no more passing prompts and expected responses to developers, who then have to judge whether a response is "good" when they aren't experts in your chatbot's niche!).
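The regression half of that is easy to picture: after any model or prompt change, re-run the whole recorded suite, not just the question you touched. A minimal sketch, where `call_chatbot` and `judge` are hypothetical stand-ins for your bot and an evaluator (e.g., the judge sketched earlier), not bottest.ai internals:

```python
def call_chatbot(prompt: str) -> str:
    # Stand-in: wire this to the chatbot under test.
    raise NotImplementedError

def judge(baseline: str, new_response: str) -> str:
    # Stand-in: a semantic comparison, e.g. an LLM judge as sketched above.
    raise NotImplementedError

def run_regression(suite: list[dict]) -> list[dict]:
    """Return every recorded test whose new response fails its baseline."""
    failures = []
    for test in suite:
        new_response = call_chatbot(test["prompt"])
        verdict = judge(test["baseline"], new_response)
        if verdict.startswith("FAIL"):
            failures.append({"prompt": test["prompt"], "verdict": verdict})
    # An improvement on one prompt can still break others, which is why
    # the whole suite runs on every change.
    return failures
```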
---
2. How do you support different languages?
Our AI-powered evaluation engine is built to be language-agnostic. Failure reasons, details, and other information about your tests will be in English, but our engine can handle whatever language the conversation is happening in.
For example, if your chatbot serves an international customer base, you can record and run tests in that language, and all of the information you need (such as why tests failed) stays in English.
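As a tiny illustration using the hypothetical judge from earlier (again, a sketch, not our implementation): the comparison happens on meaning, so the responses can be in any language while the verdict stays in English:

```python
# Assumes the judge's prompt also includes: "The conversation may be in
# any language; always report in English."
verdict = judge(
    baseline="Les remboursements sont possibles sous 30 jours.",
    new_response="Vous pouvez obtenir un remboursement dans les 30 jours.",
)
print(verdict)  # e.g. "PASS - same refund facts, different phrasing"
```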
---
Hopefully that answers your questions! Thank you for your support, and let me know if there's anything else I can answer :)