
Used as an LLM proxy, it provides caching and load balancing across multiple AI services (Groq, OpenRouter, etc.) and even local models via Ollama. It exposes an OpenAI-compatible API, so any app or service that lets you set the base URL can use it. I run it alongside Langfuse, which gives me performance monitoring for each prompt and session.
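Because the proxy speaks the OpenAI API, pointing an existing app at it is usually just a matter of changing the base URL. Here's a minimal sketch using the official OpenAI Python client; the URL, API key, and model name are placeholders for your own deployment:

```python
from openai import OpenAI

# Only the base_url changes: the proxy handles routing, caching,
# and load balancing behind the scenes.
client = OpenAI(
    base_url="http://localhost:4000",  # hypothetical proxy address
    api_key="sk-my-proxy-key",         # hypothetical key configured on the proxy
)

response = client.chat.completions.create(
    model="llama3",  # routed by the proxy to e.g. Groq, OpenRouter, or Ollama
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```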
I find myself recommending this library to serious developers of LLM-powered apps who want to standardize their codebase by unifying all the APIs they use. Love it!