p/deepchecks-monitoring

Open Source Monitoring for AI & ML

Deepchecks LLM Evaluation - Validate, monitor, and safeguard LLM-based apps

by

Kevin William David

Continuously validate LLM-based applications including LLM hallucinations, performance metrics, and potential pitfalls throughout the entire lifecycle from pre-deployment and internal experimentation to production.🚀

1yr ago

Replies

Deepchecks Monitoring

Maker

📌

Thanks @kevin for hunting our LLM Evaluation solution 😊

👋 Hey, ProductHunt community! I am Shir, co-founder and CTO of Deepchecks. At Deepchecks, we’ve built a pretty special solution for LLM Evaluation and are thrilled to launch it today on ProductHunt!

When we launched our open-source testing package last year, we quickly received an overwhelming response, with over 3K stars 🌟 and more than 900K downloads! After the launch of our NLP package in June, we noticed that an incredible number of the feedback calls we were having about the NLP package included requests for help with evaluating LLM-based apps. 🤯 After creating an initial POC and getting feedback from various companies, we gained the confidence we needed to dive deeply into the LLM Evaluation space. And yes, turns out it’s a pretty big deal. 🚀

As we began working on the LLM Evaluation module, we learned that teams struggle with these questions and tasks while deploying their LLM apps:
- Is it good? 👍 (accuracy, relevance, usefulness, grounded in context, etc.)
- Is it not problematic? 👎 (bias, toxicity, PII leakage, straying from company policy, etc.)
- Evaluating and comparing versions (that differ in their prompts, base models, or any other change in the pipeline)
- Efficiently building a process for automatically estimating the quality of the LLM interactions and annotating them
- Deployment lifecycle management, from experimentation/development through staging/beta testing to production

The Deepchecks LLM Evaluation solution helps you:
✅ Simply and clearly assess "How good is your LLM application?"
🔀 Track and compare different combinations of prompts, models, and code.
🔍 Gain direct visibility into the functioning of your LLM-based application.
⚠️ Reduce the risk during the deployment of LLM-based applications.
🏛 Simplify compliance with AI-related policies and regulations.

We're also hosting a launch event today at 8:30 AM PST. Feel free to sign up to interact with the Deepchecks team and see a live demo: https://www.linkedin.com/events/...
Apply for Deepchecks LLM Evaluation access: https://deepchecks.com/solutions...

😊 Would appreciate any questions, and hope to see you there!

1yr ago

Congrats on your launch 🚀 great stuff!

1yr ago

Garen D Orchyan

Sugar Free: Food Scanner

@shirch Congratulations on the launch, team! Best of luck today ♥️🦄

1yr ago

so promising...

1yr ago

Deepchecks Monitoring

Maker

@hay_day3 thanks my friend!

1yr ago

Divyansh Chaurasia👨🏻‍💻

Deepchecks Monitoring

Maker

📌

Excited for the launch! 🎉

1yr ago

Deepchecks Monitoring

Maker

@asdivyansh Such a pleasure to have you with us on this journey

1yr ago

Deepchecks Monitoring

Maker

@asdivyansh yup it’s a big deal ❤️

1yr ago

Wilco

I've been loving every release from this team. Can't wait to try this one out.

1yr ago

Deepchecks Monitoring

Maker

@on thanks so much! Looking forward to hearing your thoughts :)

1yr ago

Deepchecks Monitoring

Maker

@on can’t wait for the feedback!

1yr ago

Wow. You guys can ship. I've seen at least two other really useful tools you've published in the past few months.

1yr ago

Deepchecks Monitoring

Maker

@mtngs_io haha, thanks! Crazy advancements in the world, we're doing our best to keep up :) and of course, this is a great opportunity to say once again a huge kudos to the team.

1yr ago

Deepchecks Monitoring

Maker

@mtngs_io you rock! Thanks my friend

1yr ago

Antoni Kozelski

Evryface

🚀 Congratulations on launching the Deepchecks LLM assessment! You have mastered analyzing applications using artificial intelligence! Your team did a complete job! 💯

1yr ago

Deepchecks Monitoring

Maker

Thanks @antonikozelski!

1yr ago

Deepchecks Monitoring

Maker

@antonikozelski not sure if we mastered it, but it is a big step 😇

1yr ago

Congratulations! 🚀 Your journey from the overwhelming success of your open-source testing package to this latest venture is truly inspiring. It's evident that Deepchecks has tapped into a vital need in the rapidly evolving field of language model applications.

1yr ago

Deepchecks Monitoring

Maker

@qiufeng Appreciate you following our journey!

1yr ago

Deepchecks Monitoring

Maker

@qiufeng thanks my friend!

1yr ago

Andrey Cheptsov

dstack

Congratulations on the launch! It's an amazing and much-needed product.

1yr ago

Deepchecks Monitoring

Maker

@andrey_cheptsov happy to hear your thoughts :)

1yr ago

Akanksha Bhasin

Devtron

Congratulations Deepchecks team on the launch! 🚀 It is truly an impressive solution in the world of LLMs!

1yr ago

Deepchecks Monitoring

Maker

@akankshabhasin thanks for the kind words and support

1yr ago

Deepchecks Monitoring

Maker

@akankshabhasin thank you so much!

1yr ago

Laurentiu Stefan

Cool product. Congrats on the launch! 💪

1yr ago

Deepchecks Monitoring

Maker

@laurentiu_stefan thanks so much my friend!

1yr ago

Nice! How does it support RAG, if at all?

1yr ago

Deepchecks Monitoring

Maker

@matan_mishan Thanks for your question. Indeed, this is one of our most popular use cases :-) Question answering, customer support, etc. We enable logging the various steps in the interaction (e.g. the input, the information retrieval part, the output, etc.), and common issues we find are things like: the output wasn't based on the retrieved information (i.e. an indication of hallucination), the retrieved info isn't relevant to the question asked, etc.
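
For readers who want to picture the two RAG issues mentioned above, here is a minimal sketch in plain Python. It is not the Deepchecks API: `RagInteraction`, `token_overlap`, `flag_issues`, and both thresholds are hypothetical names and values chosen for illustration, and a naive token-overlap ratio stands in for whatever grounding and relevance checks a real evaluation tool would use.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class RagInteraction:
    """Hypothetical record of one logged RAG interaction (not a Deepchecks type)."""
    user_input: str
    retrieved_context: str
    output: str


def _tokens(text: str) -> set[str]:
    # Lowercase alphanumeric tokens; deliberately crude.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def token_overlap(text: str, reference: str) -> float:
    """Fraction of distinct tokens in `text` that also appear in `reference`."""
    text_tokens = _tokens(text)
    if not text_tokens:
        return 0.0
    return len(text_tokens & _tokens(reference)) / len(text_tokens)


def flag_issues(
    interaction: RagInteraction,
    grounding_threshold: float = 0.5,   # assumed value, tune per application
    relevance_threshold: float = 0.3,   # assumed value, tune per application
) -> list[str]:
    """Return heuristic flags for the two issues described in the reply above."""
    issues = []
    # Output shares few tokens with the retrieved text: possibly not grounded in it.
    if token_overlap(interaction.output, interaction.retrieved_context) < grounding_threshold:
        issues.append("possible_hallucination")
    # Question shares few tokens with the retrieved text: retrieval may be off-topic.
    if token_overlap(interaction.user_input, interaction.retrieved_context) < relevance_threshold:
        issues.append("irrelevant_retrieval")
    return issues
```

Token overlap is only a stand-in here; it misses paraphrases and rewards copied wording, so treat the flags as a way to see the shape of the checks rather than as a usable hallucination detector.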

1yr ago

Manmohit Grewal

Crustdata

Congrats team Deepchecks LLM Evaluation on the launch!

1yr ago

Deepchecks Monitoring

Maker

@manmohit Appreciate it!

1yr ago

Deepchecks Monitoring

Maker

@manmohit thanks my friend!

1yr ago

Congratulations on launching the Deepchecks LLM assessment! This is an incredible achievement and a testament to your team's dedication to the field. I can see how this will be a game-changer for many projects. Keep up the great work!

1yr ago

Deepchecks Monitoring

Maker

@shai_yanovski Thanks so much. Appreciate your support throughout our journey! And looking forward to our next random meeting on bikes in the park 😅

1yr ago

Avaturn: Real 3D Avatars from Photo

Great stuff! We are using Deepchecks for our internal LLM evaluation; it takes just a couple of minutes to get big insights!

1yr ago

Deepchecks Monitoring

Maker

@sergei2020 thanks a million my friend!

1yr ago

@ptannor, outstanding work! This Deepchecks LLM Evaluation looks absolutely amazing. I'm sure it will help validate, monitor, and safeguard LLM-based apps with ease. Bravo!

1yr ago

Deepchecks Monitoring

Maker

@rep_eat amen!!

1yr ago

Reals by Hour One

Another quality product from Deepchecks. You've been kicking ass this year!

1yr ago

Deepchecks Monitoring

Maker

@lstmeow thank you so much my friend!

1yr ago

PROCESIO

An innovative approach to evaluating language models. The detailed insights it provides are invaluable for improving model performance. Congrats on the launch! 👏

1yr ago

Deepchecks Monitoring

Maker

@alex_gavril1 thanks, you rock!

1yr ago

A to Z of Pricing and Monetisation

Looks cool

1yr ago

Carrus

Congrats on the launch, team!

1yr ago

Deepchecks Monitoring

Maker

@nilay1101 thanks my friend!

1yr ago

Impressive concept—congrats!

1yr ago

Yael Barsheshet

Congratulations!!

1yr ago

Deepchecks Monitoring

Maker

@yael_barsheshet1 thanks for your support!

1yr ago