Multiple Load Balancing Strategies
Benchmark and compare Round Robin, Consistent Hashing, Least KV Cache and Least Queue routing strategies to find what works best for your LLM workload.
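To make the difference between strategies concrete, here is a minimal sketch of two of them: Round Robin, which ignores load, and Least Queue, which picks the backend with the fewest waiting requests. The `Backend` type, its `QueueDepth` field, and the `Pick`/`LeastQueue` names are assumptions made for illustration, not the router's actual types.

```go
package main

import "sync/atomic"

// Backend is a hypothetical view of one LLM serving instance.
type Backend struct {
	Name       string
	QueueDepth int // requests currently waiting (assumed to be reported by the backend)
}

// RoundRobin cycles through backends in order, ignoring their load.
type RoundRobin struct {
	next atomic.Uint64
}

// Pick returns the next backend in rotation; it is safe for concurrent use.
func (r *RoundRobin) Pick(backends []Backend) (Backend, bool) {
	if len(backends) == 0 {
		return Backend{}, false
	}
	n := r.next.Add(1) - 1
	return backends[n%uint64(len(backends))], true
}

// LeastQueue returns the backend with the fewest queued requests.
// Ties go to the backend that appears first in the slice.
func LeastQueue(backends []Backend) (Backend, bool) {
	if len(backends) == 0 {
		return Backend{}, false
	}
	best := backends[0]
	for _, b := range backends[1:] {
		if b.QueueDepth < best.QueueDepth {
			best = b
		}
	}
	return best, true
}
```

With uneven queue depths the two strategies diverge: Round Robin keeps sending traffic to a backed-up instance, while Least Queue steers around it. That difference is what a benchmark run should surface.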
Real-Time Metrics with Prometheus
Built-in Prometheus integration tracks request counts and latency per backend, giving you the data to evaluate routing performance.
Pluggable & Extensible
The router is built around the loadbalancer.Router interface in Go, making it straightforward to add new routing strategies and run them against the same benchmark harness.