Groq provides ultra-fast LLM inference powered by their custom LPU (Language Processing Unit) hardware, delivering some of the fastest token generation speeds available.
Models: LLaMA, Mixtral, Gemma (via LPU inference)
Check the live status above. We perform HTTP health checks against the Groq API every 60 seconds.
LLMStatus sends an HTTP request to api.groq.com every 60 seconds. The status reflects whether Groq's inference endpoint is reachable and responding.
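A check like this can be sketched in a few lines. This is a minimal illustration, not LLMStatus's actual probe: the endpoint URL, timeout, and status labels are assumptions, and an unauthenticated request to the Groq API may return an auth error rather than 200, which still proves the endpoint is up.

```python
import urllib.request
import urllib.error

# Assumed probe target; LLMStatus's real health-check URL is not published.
GROQ_HEALTH_URL = "https://api.groq.com/openai/v1/models"

def classify(status_code):
    """Map an HTTP status code (or None for no response) to a health label."""
    if status_code is None:
        return "down"          # unreachable or timed out
    if 200 <= status_code < 300:
        return "operational"   # endpoint answered normally
    return "degraded"          # endpoint answered, but with an error status

def check_once(url=GROQ_HEALTH_URL, timeout=10.0):
    """Perform a single HTTP health check; a scheduler would run this every 60s."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify(resp.status)
    except urllib.error.HTTPError as e:
        return classify(e.code)  # server responded with 4xx/5xx
    except (urllib.error.URLError, TimeoutError):
        return classify(None)    # DNS failure, connection refused, or timeout
```

In a real monitor the `check_once` result would be timestamped and stored so the status page can show uptime history, not just the latest probe.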
Groq typically has very low latency due to their LPU hardware. If latency is elevated, it may indicate high demand or infrastructure issues. Check the latency chart for trends.
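Spotting "elevated" latency in a chart comes down to comparing recent samples against a baseline. The sketch below shows one simple way to do that; the baseline value, multiplier, and use of the median are illustrative choices, not how LLMStatus computes its chart.

```python
import statistics

def is_elevated(recent_samples_ms, baseline_ms, factor=2.0):
    """Flag latency as elevated if the median of recent samples
    exceeds the baseline by the given factor (both illustrative)."""
    if not recent_samples_ms:
        return False  # no data yet; nothing to flag
    return statistics.median(recent_samples_ms) > baseline_ms * factor
```

Using the median rather than the mean keeps a single slow outlier probe from flagging an otherwise healthy endpoint.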
Groq runs open-source models including LLaMA, Mixtral, and Gemma on their custom LPU hardware for ultra-fast inference. Check their API docs for the current model list.