Zebrium Autonomous Monitoring

Let machine learning catch software problems

We live in a world where everything is being automated. But catching and understanding software problems still takes a lot of manual work. There's a better way! Let machine learning catch software problems and tell you what happened.

sakthivel chandrasekaran
Gavin and Zebrium team, congrats on the product launch! Can you help us understand how Zebrium is different from the likes of Datadog or AppDynamics? I understand from the previous comments (from Ajay) that Zebrium eliminates the need for dedicated resources to analyze data and build rules - you automate it based on ML. Can you share examples to illustrate the magnitude of this automation?
Gavin Cohen
Hi @sakthi_chandra - thanks for the comments. It's a very exciting time for us! You're correct - the biggest difference is that we are ML driven and take away the human work involved with monitoring. A couple of examples: Soon after installing, one of our customers saw a Zebrium alert that pointed at a bug in their code. To cut a long story short, we uncovered an issue related to certificate handling that would have caused a major outage if it had gone unnoticed. Another example was a customer where we caught a malicious login attempt. In other cases we have detected infrastructure issues, database issues, networking issues, etc. The list is endless. The key thing is that no-one built rules to catch any of these problems - it was all done 100% by our machine learning with zero human supervision.
Matthew W
@sakthi_chandra Having used traditional cloud logging tools in the past, this is a huge breath of fresh air. No more having to manually create alert rules, graphs, and other configurations that require frequent updating. A tool that will automatically tell me something is wrong based on what it learns from my logs is tremendously helpful. On top of all that - providing a root cause helps us avoid searching in the dark.
Larry Lancaster
Hey, Product Hunters! I'm excited to tell you about Zebrium, an AUTONOMOUS monitoring platform. Zebrium is built to identify "Unknown Unknowns" using machine learning - problems you don't have alert rules built for - and show you root-cause, even the FIRST time you hit them.

Are you tired of "Unknown Unknowns" biting you in production? Tired of the fire drills, the slogging through logs and metrics? Tired of having to figure out what just broke, with hundreds of customers waiting on you? We need a new weapon to defeat complexity and slash resolution time. I believe autonomous monitoring is that new weapon.

*** How To Get Started ***

1.) Go to our website and follow the instructions. We'll email you a URL with login details.
2.) If you're using k8s, it's a one-line chart install. Once you log in and set your password, you can cut and paste the install command which includes your API key.

That's it! You're signed up and set up for our free service, with autonomous monitoring on your side. Here's what you WON'T have to do:

3.) Training, connectors, parsers, configuration, waiting, alert rules, etc. Works immediately on any app or stack!

*** How This All Began ***

I founded Zebrium because I was tired of writing data parsers and alert rules and then maintaining them, just to keep an eye on deployed software. The most frustrating part was the long tail of always-new failure modes. There was always the next fire drill.

Zebrium means "elemental pattern", and the idea was straightforward: use ML to structure telemetry from deployed software, to extract regular and anomalous features from this data, and to cross-correlate these features to detect incidents with root-cause. We suspected there were fundamental ways software behaves when it breaks - elemental patterns - and that we could exploit these patterns for incident detection. Experience with hundreds of real-world incidents and dozens of applications has proven that we were right!
It turns out that, most of the time, we can detect important incidents automatically. We can usually surface root-cause too, if the logs and metrics reflect it. Zebrium works so well that I would want to use it, if I weren't me; so, I imagine you'll want to use it, too. :)
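
To make the "structure, extract, cross-correlate" idea above a little more concrete, here is a toy sketch in Python. It is not Zebrium's implementation and uses no machine learning: it masks variable tokens with regular expressions to turn raw log lines into templates, treats templates seen at most once as anomalous, and reports a time window as an incident when anomalies from at least two sources land in it. The function names, thresholds, and sample log lines are all hypothetical.

```python
import re
from collections import Counter, defaultdict


def to_template(line: str) -> str:
    """Structure a raw log line by masking variable tokens (hex ids, numbers)."""
    line = re.sub(r"0x[0-9a-fA-F]+", "<HEX>", line)
    line = re.sub(r"\d+", "<NUM>", line)
    return line.strip()


def detect_incidents(events, window=60, rare_max=1, min_sources=2):
    """Flag time windows where rare log templates show up in several sources.

    events: iterable of (timestamp_seconds, source, raw_line).
    Returns a list of (window_start, [(source, template), ...]).
    All thresholds are illustrative assumptions, not tuned values.
    """
    structured = [(ts, src, to_template(line)) for ts, src, line in events]

    # "Regular" vs. "anomalous": a template is rare if it occurs <= rare_max times.
    counts = Counter(tpl for _, _, tpl in structured)

    # Group rare events into fixed-size time windows.
    buckets = defaultdict(list)
    for ts, src, tpl in structured:
        if counts[tpl] <= rare_max:
            buckets[(ts // window) * window].append((src, tpl))

    # Cross-correlate: keep windows where distinct sources misbehave together.
    incidents = []
    for start in sorted(buckets):
        hits = buckets[start]
        if len({src for src, _ in hits}) >= min_sources:
            incidents.append((start, hits))
    return incidents


# Hypothetical sample data: routine lines plus two rare lines close in time.
EXAMPLE_EVENTS = [
    (0, "web", "GET /health 200 in 3 ms"),
    (10, "web", "GET /health 200 in 4 ms"),
    (20, "db", "checkpoint complete in 12 ms"),
    (70, "db", "checkpoint complete in 15 ms"),
    (125, "web", "TLS handshake failed: certificate expired"),
    (130, "db", "connection reset by peer 10.0.0.7"),
    (190, "web", "GET /health 200 in 3 ms"),
]

if __name__ == "__main__":
    for start, hits in detect_incidents(EXAMPLE_EVENTS):
        print(f"possible incident in window starting at t={start}s:")
        for src, tpl in hits:
            print(f"  [{src}] {tpl}")
```

On the sample data, only the window containing the certificate failure and the connection reset is reported. The earliest rare line in a flagged window is a natural first place to look for root cause, though a real system would need far more robust structuring and statistics than this.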
Rod Bagg
I've been on the "short end of the stick" building rules, maintaining them, and managing dozens of engineers building and maintaining monitoring solutions and pipelines... for 20+ years. More than once I've been heads-down at Zebrium when I hear the familiar Slack ding, only to see that our own software has just alerted us to the root cause of a problem in our own software!!! It's frikin brilliant! I'll tell you, it makes me giddy every time I use the UI or see one of those Slack alerts in action.