Aryan Patil | Infrastructure Engineer

Distributed AI Gateway

Active / Prod

A high-performance observability and routing gateway for large language models. Features semantic caching via Redis, token cost tracking, rate limiting, and fallback routing across GPT-4, Claude 3, and local models.

Latency: -32% (P95)

Load: 4k RPS tested

Cost reduction: 38% via cache

Next.js/Node | Redis | PostgreSQL | AWS EC2, S3 | Docker
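
The rate limiting mentioned in the gateway description could take several forms; the project summary does not say which algorithm is used. As one possibility, here is a minimal token-bucket sketch in TypeScript. The class name `TokenBucket`, its parameters, and the in-memory state are all assumptions made for illustration; a multi-instance gateway would more likely keep this state in a shared store such as Redis.

```typescript
// Illustrative token-bucket limiter. All names are hypothetical and not
// taken from the gateway's actual code.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;

  constructor(
    private readonly capacity: number,     // maximum burst size
    private readonly refillPerSec: number, // sustained requests per second
    nowMs: number = Date.now(),
  ) {
    this.tokens = capacity; // start with a full bucket
    this.lastRefillMs = nowMs;
  }

  // Returns true and consumes one token if the request is allowed.
  // The clock is injectable so the behavior can be tested deterministically.
  tryAcquire(nowMs: number = Date.now()): boolean {
    // Refill in proportion to elapsed time, capped at capacity.
    const elapsedSec = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSec * this.refillPerSec,
    );
    this.lastRefillMs = Math.max(this.lastRefillMs, nowMs);

    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Usage: a burst of 2 requests, refilled at 1 request per second.
const bucket = new TokenBucket(2, 1, 0);
const decisions = [
  bucket.tryAcquire(0),    // allowed (burst)
  bucket.tryAcquire(0),    // allowed (burst)
  bucket.tryAcquire(0),    // rejected (bucket empty)
  bucket.tryAcquire(1000), // allowed (one token refilled after 1 s)
];
```

A token bucket permits short bursts up to `capacity` while holding the long-run rate to `refillPerSec`, which suits LLM traffic that tends to arrive in spikes.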

Event-Driven Logging Engine

Phase 2

Kafka-based centralized logging and telemetry collection simulating high-throughput distributed tracing.