Openlayer

Test, fix, and improve your ML models

5.0 • 9 reviews

780 followers

Openlayer is a powerful testing and observability platform for ML. It lets you collaborate with others on finding issues in models and data, debugging them, and committing new versions.
This is the 2nd launch from Openlayer.

Slack or email alerts for when your AI fails
Openlayer was ranked #5 of the day for December 7th, 2023
Openlayer provides observability, evaluation, and versioning tools for LLMs and machine learning products.
Free Options


Vikas Nair
Maker
📌
@mwseibel, thank you for the hunt! Hi Product Hunt 👋👋👋

We are Vikas, Gabe, and Rish from Openlayer. We're building a next-gen monitoring and eval platform that lets developers keep track of how their AI is performing beyond token cost and latency.

Why? There is no comprehensive toolkit for AI reliability that gives developers insight into how and why their AI fails. Traditional software has a myriad of standardized tests and monitors, but ML development has lagged behind, often hinging on intuition and post-release user feedback.

What does Openlayer do?

🧠 Intelligent testing: Set up all sorts of tests in Openlayer, ranging from data quality and drift checks to model performance expectations, using rule-based metrics and GPT evaluation.

🔍 Log and debug: Log your production requests and run your tests on them to make sure they're performing for your users as expected.

🚨 Alerts: Get proactive notifications in Slack and email when something's amiss, plus detailed reports that let you drill down into exactly what went wrong.

💼 For pre-release and production: Whether your model is in development or shipped, our platform stands watch, offering continuous analysis.

🚀 Seamless integration: Deploy our monitors with just a few lines of code and watch Openlayer become the guardrail for your AI models. It's developer-first, hassle-free, and provides real-time insights.

🌐 Beyond LLMs: While we have specialized tools for language models, Openlayer's scope includes a wide range of AI tasks. Tabular data, classification, regression models – you name it, we can test it.

📊 Subpopulation analysis: Understand how your model interacts with different data demographics. Pinpoint where your AI excels and where improvement is needed.

🔒 Security-conscious: We're SOC 2 Type 2 compliant, with on-premise hosting options for those with stringent security requirements.
Using this toolset, you can set up all sorts of guardrails on your AI. For example, you might be building an application that relies on an LLM and want to guarantee the output doesn't contain PII or profanity. You can define a test for this in Openlayer. Or you might be developing a classification model that predicts fraud on your platform and want to guard against data drift over time. You can do this with Openlayer, too.

Product Hunt offer: for those willing to give feedback on their experience using the product, we're giving out free (limited edition 😉) swag. Send an email to founders@openlayer.com after trying out the product to schedule a call!

We're eager for you to explore Openlayer and shape its evolution. Join for free and start testing your models and data!

◾ Sign up for free: https://app.openlayer.com/
◾ Join our community: https://discord.gg/t6wS2g6MMB
◾ Curious about diving deeper into Openlayer? Reach out to us at founders@openlayer.com to request an upgrade to your workspace.
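As a rough sketch of what the PII/profanity guardrail above boils down to (plain Python, not Openlayer's actual API; the regexes and blocklist are illustrative placeholders):

```python
import re

# Hypothetical guardrail: fail if an LLM output contains an email address,
# a US-style phone number, or a word from a small blocklist.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b")
BLOCKLIST = {"darn", "heck"}  # placeholder profanity list

def output_is_safe(text: str) -> bool:
    """Return True if the text contains no obvious PII or blocked words."""
    if EMAIL.search(text) or PHONE.search(text):
        return False
    words = {w.strip(".,!?").lower() for w in text.split()}
    return not (words & BLOCKLIST)

assert output_is_safe("Thanks for reaching out, we'll follow up soon.")
assert not output_is_safe("Contact me at jane.doe@example.com")
assert not output_is_safe("Call 555-123-4567 for details")
```

A production guardrail would use far more robust PII detection (named-entity recognition, locale-aware phone formats), but the shape is the same: a boolean check over each output, wired to an alert when it fails.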
Eddie Forson
Congrats on the launch, Vikas and team! I haven't looked much into LLM monitoring tools, but it's something on my radar for the new year. I hear the term "eval" a lot these days, but how do we concretely do this if we're not an ML engineer or researcher? It would be great if you had some tutorials on how to do evals for the non-initiated.
Vikas Nair
@ed_forson Hey, thanks! We do offer a ton of really useful tutorials that anybody can follow, even if you're not an ML engineer. You can check them out at https://docs.openlayer.com/docum...

The gist is that evals are tests that you run on your models and data. It's important to stress test your AI to make sure the data it's trained on is clean, high-quality, and up-to-date. It's also important to set guardrails on the model's behavior, to make sure it performs up to standard both while you iterate during development and after you ship to users.

We make it super easy for anybody to set up tests, whether you're an ML engineer or serve a less technical role in a team or company that leverages AI. You can use our simple UI to create all sorts of powerful tests that make sure your AI is highly performant and that help you understand where it fails. After you create these tests in the UI, you can also connect Slack or email so your team gets notifications whenever they fail. As your developers iterate on the problems, they can track versions and see how they improve on their tests, all through our platform.

Check out the docs for more, and feel free to join our Discord and reach out to us or the community for more info on setting up quality evals in your LLMOps stack! https://discord.gg/t6wS2g6MMB
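To make "evals are tests you run on your models and data" concrete, here is a minimal, framework-free sketch. Everything below (`run_eval`, `toy_model`, the cases, the 0.8 threshold) is invented for illustration and is not Openlayer's API:

```python
def run_eval(model, cases, threshold=0.8):
    """Run labeled test cases through the model; pass if accuracy >= threshold."""
    correct = sum(model(x) == expected for x, expected in cases)
    accuracy = correct / len(cases)
    return {"accuracy": accuracy, "passed": accuracy >= threshold}

def toy_model(text):
    # Stand-in classifier: a crude keyword rule, purely for illustration.
    return "positive" if "great" in text else "negative"

cases = [
    ("great service", "positive"),
    ("terrible support", "negative"),
    ("great docs!", "positive"),
    ("okay, I guess", "negative"),
    ("awful, but great price", "negative"),  # the keyword rule gets this wrong
]

result = run_eval(toy_model, cases)
assert result == {"accuracy": 0.8, "passed": True}
```

A real eval would call your LLM or classifier instead of `toy_model` and might score with richer metrics (or an LLM judge) rather than exact match, but the pass/fail-against-a-threshold structure is what gets wired to alerts.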
Eddie Forson
@vikasnair Thanks for sharing all these useful resources! Will check them out 👍🏿
Millie Crystal
Openlayer's focus on observability and evaluation for LLM and machine learning products is timely and vital. Best wishes for contributing to the advancement of AI technologies! How does Openlayer ensure the accuracy and reliability of its evaluation tools?
Vikas Nair
@millie_crystal Hey, great question. There are two ways to ensure the accuracy and reliability of evaluations:

1. Constructing the right set of evaluations for a specific use case. Here, we suggest evaluations that are worth running for most use cases of a certain type, and for the rest we provide a number of different test types a user can choose from and customize to fit their use case.

2. Making sure the evaluations and metrics we calculate are actually accurate. Here, we have a rigorous set of internal unit tests to make sure our evaluations work correctly. We also rely on well-established community packages like sklearn where possible.
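Point 2, verifying that a metric implementation is correct, can be illustrated with a small hand-checked unit test (generic Python, not Openlayer's internal test suite; in practice one would also compare against a reference library such as scikit-learn):

```python
def precision_recall(y_true, y_pred, positive=1):
    """Compute precision and recall for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hand-computed case: tp=2, fp=1, fn=1, so precision = recall = 2/3.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]
p, r = precision_recall(y_true, y_pred)
assert abs(p - 2 / 3) < 1e-9
assert abs(r - 2 / 3) < 1e-9
```

Pinning the implementation to small cases worked out by hand (and to a trusted library's output) is what keeps a metric trustworthy as the code around it evolves.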