Distributed AI Gateway
Active / Prod
A high-performance observability and routing gateway for large language models. Features semantic caching via Redis, token cost tracking, rate limiting, and fallback routing across GPT-4, Claude 3, and local models.
Latency: -32% (P95)
Load: 4k RPS tested
Cost reduction: 38% via cache
Next.js/Node | Redis | PostgreSQL | AWS EC2, S3 | Docker
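
The card lists rate limiting as a feature but does not say how it is implemented. As a rough illustration only, a per-client token bucket could look like the sketch below. The names `TokenBucket` and `allowRequest`, and the limits (a burst of 5, refilled at 1 request per second), are assumptions made for this example and are not taken from the project. The state is kept in process memory here; a gateway running on several instances would more likely keep it in a shared store such as Redis.

```typescript
// Hypothetical token-bucket rate limiter; all names and limits are illustrative.
class TokenBucket {
  private readonly capacity: number;     // maximum burst size
  private readonly refillPerSec: number; // sustained requests per second
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSec: number, nowMs: number = Date.now()) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.tokens = capacity; // start with a full bucket
    this.lastRefillMs = nowMs;
  }

  // Returns true if the request may proceed, false if it should be
  // rejected (for example with HTTP 429).
  tryAcquire(nowMs: number = Date.now()): boolean {
    // Refill in proportion to the time elapsed since the last call, capped at capacity.
    const elapsedSec = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefillMs = nowMs;

    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// One bucket per client key (for example an API key), created lazily.
const buckets = new Map<string, TokenBucket>();

function allowRequest(clientKey: string, nowMs: number = Date.now()): boolean {
  let bucket = buckets.get(clientKey);
  if (!bucket) {
    bucket = new TokenBucket(5, 1, nowMs); // assumed limits: burst of 5, 1 request/second
    buckets.set(clientKey, bucket);
  }
  return bucket.tryAcquire(nowMs);
}
```

Passing the clock in as `nowMs` keeps the limiter deterministic under test; in a request handler the default `Date.now()` would be used, and a `false` result would short-circuit before any model call is made.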