SelfHostLLM

Calculate the GPU memory you need for LLM inference

100 followers

Calculate GPU memory requirements and max concurrent requests for self-hosted LLM inference. Supports Llama, Qwen, DeepSeek, Mistral, and more. Plan your AI infrastructure efficiently.
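
For a rough sense of the kind of math such a calculator performs, here is a minimal sketch in Python. The formula, parameter names, and the fixed 2 GB overhead are illustrative assumptions, not SelfHostLLM's exact methodology:

```python
# Back-of-envelope VRAM estimate for self-hosted LLM inference.
# NOTE: illustrative assumptions only; SelfHostLLM's actual formulas may differ.

def estimate_vram_gb(params_billion: float, weight_bits: int,
                     num_layers: int, hidden_size: int,
                     context_len: int, kv_bytes_per_elem: int = 2) -> tuple[float, float]:
    """Return (weights_gb, kv_cache_per_request_gb)."""
    # Model weights: parameter count times bytes per parameter
    # (16-bit = 2 bytes, 4-bit quantization = 0.5 bytes).
    weights_gb = params_billion * weight_bits / 8
    # KV cache per request: 2 tensors (K and V) * layers * hidden dim * context length.
    # Models using grouped-query attention need less than this upper bound.
    kv_gb = 2 * num_layers * hidden_size * context_len * kv_bytes_per_elem / 1e9
    return weights_gb, kv_gb

def max_concurrent_requests(total_vram_gb: float, weights_gb: float,
                            kv_per_request_gb: float, overhead_gb: float = 2.0) -> int:
    """Requests that fit after weights and an assumed fixed runtime overhead."""
    free_gb = total_vram_gb - weights_gb - overhead_gb
    return max(0, int(free_gb / kv_per_request_gb))

# Example: a hypothetical 8B model at 16-bit on a 24 GB GPU with an 8k context.
weights, kv = estimate_vram_gb(8, 16, num_layers=32, hidden_size=4096, context_len=8192)
print(f"weights ~ {weights:.1f} GB, KV cache ~ {kv:.1f} GB/request")
print("max concurrent requests:", max_concurrent_requests(24, weights, kv))
```

Under these assumptions, a single 24 GB card fits the 16 GB of weights plus only one full-context request's KV cache, which is exactly the kind of capacity question the calculator is meant to answer before you buy hardware.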

Chris Messina
Hunter

Built to simplify planning for self-hosted AI deployments.

Unlike other AI infrastructure tools, SelfHostLLM lets you precisely estimate GPU requirements and concurrency for Llama, Qwen, DeepSeek, Mistral, and more using a custom configuration.

But now I want to see Apple silicon added to the mix!

Update: Now there's a Mac version too!

Cruise Chen

Love how SelfHostLLM lets you actually estimate GPU needs for different LLMs, no more guessing and overbuying. Super smart idea, really impressed!

Mcval Osborne

Very cool calculator, looking forward to checking this out.