AI Gateway

Enterprise model infrastructure.

14 providers

Unified middleware between every agent and every LLM provider. Budget enforcement, failover, caching, priority queuing, and cost attribution — at the infrastructure layer.

Request Pipeline

Agent Request → Auth → Budget Check → Priority Queue → Cache Lookup → Model Alias → Provider Select → Circuit Breaker → Dispatch → Cost Ledger
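
As a rough illustration of how such a pipeline could be wired, the sketch below chains stage functions in the order listed above and stops at the first rejection. The stage names mirror the list; the `run_pipeline` helper, the `rejected` field, and the dict-based request are assumptions made for this example, not the gateway's actual interface.

```python
from typing import Callable, Dict, List

Request = Dict[str, object]
Stage = Callable[[Request], Request]


def auth(request: Request) -> Request:
    # Hypothetical check: reject any request that carries no API key.
    if not request.get("api_key"):
        request["rejected"] = "unauthenticated"
    return request


def traced(name: str) -> Stage:
    # Placeholder stage that only records that it ran.
    def stage(request: Request) -> Request:
        request.setdefault("trace", []).append(name)
        return request
    return stage


PIPELINE: List[Stage] = [auth] + [
    traced(name)
    for name in (
        "budget_check", "priority_queue", "cache_lookup", "model_alias",
        "provider_select", "circuit_breaker", "dispatch", "cost_ledger",
    )
]


def run_pipeline(request: Request, stages: List[Stage] = PIPELINE) -> Request:
    # Run stages in order; a rejection short-circuits everything after it.
    for stage in stages:
        request = stage(request)
        if "rejected" in request:
            break
    return request
```

Keeping each stage as a plain function makes it easy to reorder or replace a step without touching the others.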

Budget Engine

Per-agent monthly and daily spend caps. Hard stop on excess. CFO approval flow for budget raises. Forecasting with anomaly detection.
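
A minimal sketch of the hard-stop behavior, assuming spend is tracked as integer micro-USD per agent. The `AgentBudget` class, its field names, and the `BudgetExceeded` exception are hypothetical; the approval flow and forecasting are not modeled here.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when a charge would push an agent past one of its caps."""


@dataclass
class AgentBudget:
    daily_cap_micro_usd: int
    monthly_cap_micro_usd: int
    spent_today: int = 0
    spent_this_month: int = 0

    def charge(self, cost_micro_usd: int) -> None:
        # Check both caps before recording anything, so a rejected
        # request leaves the counters untouched (the "hard stop").
        if self.spent_today + cost_micro_usd > self.daily_cap_micro_usd:
            raise BudgetExceeded("daily cap reached")
        if self.spent_this_month + cost_micro_usd > self.monthly_cap_micro_usd:
            raise BudgetExceeded("monthly cap reached")
        self.spent_today += cost_micro_usd
        self.spent_this_month += cost_micro_usd
```

For example, with a 1,000,000 micro-USD daily cap, a 600,000 charge succeeds and a following 500,000 charge raises `BudgetExceeded` without changing the recorded spend.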

Circuit Breaker

Per-provider-key failure tracking. Five consecutive failures open the circuit and trigger automatic failover to the next key. Provider outages are absorbed with zero downtime.
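
The counting rule can be sketched as follows, assuming one breaker object per provider key. The `KeyCircuit` and `pick_key` names are hypothetical, and recovery (how an open circuit closes again) is not described on this page, so it is left out.

```python
from typing import Dict, Optional


class KeyCircuit:
    def __init__(self, threshold: int = 5) -> None:
        self.threshold = threshold
        self.consecutive_failures = 0

    @property
    def is_open(self) -> bool:
        # Open means "stop sending traffic to this key".
        return self.consecutive_failures >= self.threshold

    def record_success(self) -> None:
        # Any success resets the streak; only consecutive failures count.
        self.consecutive_failures = 0

    def record_failure(self) -> None:
        self.consecutive_failures += 1


def pick_key(circuits: Dict[str, KeyCircuit]) -> Optional[str]:
    # Fail over to the first key whose circuit is still closed.
    for key_id, circuit in circuits.items():
        if not circuit.is_open:
            return key_id
    return None
```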

Priority Queue

Three-tier concurrency: Interactive (user-facing), Background (agent tasks), Batch (bulk processing). Interactive never blocked by lower tiers.
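
One simple way to get this ordering is a heap keyed on tier, with a sequence number to keep first-in, first-out order inside a tier. This is a sketch under that assumption; the `Tier` and `TieredQueue` names are invented here, and a real gateway would likely also reserve concurrency slots per tier.

```python
import heapq
import itertools
from enum import IntEnum


class Tier(IntEnum):
    INTERACTIVE = 0  # user-facing
    BACKGROUND = 1   # agent tasks
    BATCH = 2        # bulk processing


class TieredQueue:
    def __init__(self) -> None:
        self._heap = []
        self._seq = itertools.count()

    def put(self, tier: Tier, request: object) -> None:
        # The sequence number breaks ties, so requests are never compared.
        heapq.heappush(self._heap, (int(tier), next(self._seq), request))

    def get(self) -> object:
        # Always returns the oldest request from the most urgent tier.
        _, _, request = heapq.heappop(self._heap)
        return request

    def __len__(self) -> int:
        return len(self._heap)
```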

Semantic Cache

Tier 1: exact SHA-256 match (sub-ms). Tier 2: pgvector cosine similarity (>0.95 threshold). Eliminates redundant LLM calls across your entire workforce.
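
The two-tier lookup can be illustrated in memory. In this sketch a plain list stands in for the pgvector index, embeddings are passed in by the caller, and the `TwoTierCache` class is a hypothetical name; only the SHA-256 key and the 0.95 cosine threshold come from the description above.

```python
import hashlib
import math
from typing import Dict, List, Optional, Sequence, Tuple


def prompt_key(prompt: str) -> str:
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TwoTierCache:
    def __init__(self, threshold: float = 0.95) -> None:
        self.threshold = threshold
        self.exact: Dict[str, str] = {}
        self.vectors: List[Tuple[Sequence[float], str]] = []

    def store(self, prompt: str, embedding: Sequence[float], response: str) -> None:
        self.exact[prompt_key(prompt)] = response
        self.vectors.append((embedding, response))

    def lookup(self, prompt: str, embedding: Sequence[float]) -> Optional[str]:
        # Tier 1: exact match on the prompt hash.
        hit = self.exact.get(prompt_key(prompt))
        if hit is not None:
            return hit
        # Tier 2: nearest stored embedding, accepted only above the threshold.
        best = max(self.vectors, key=lambda e: cosine(e[0], embedding), default=None)
        if best is not None and cosine(best[0], embedding) > self.threshold:
            return best[1]
        return None
```

The linear scan is only for illustration; a vector index avoids comparing against every stored entry.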

Model Aliases

"fast", "smart", "cheap", "vision", "code", and "reason" resolve to the cheapest available model in that capability tier. Auto-adapts to pricing changes.
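
Alias resolution can be pictured as a filter followed by a minimum over price. The catalog entries below use placeholder model names and made-up prices, and `ModelEntry` and `resolve_alias` are hypothetical names for this sketch.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class ModelEntry:
    name: str
    tiers: FrozenSet[str]
    price_micro_usd_per_1k_tokens: int
    available: bool = True


def resolve_alias(alias: str, catalog: Iterable[ModelEntry]) -> str:
    # Keep only models that are in the requested tier and currently available.
    candidates = [m for m in catalog if alias in m.tiers and m.available]
    if not candidates:
        raise LookupError(f"no available model for alias {alias!r}")
    # Because the price is read at resolution time, a pricing update
    # changes the winner without any alias configuration change.
    return min(candidates, key=lambda m: m.price_micro_usd_per_1k_tokens).name


# Placeholder catalog: names and prices are illustrative only.
CATALOG = [
    ModelEntry("model-a", frozenset({"fast", "cheap"}), 500),
    ModelEntry("model-b", frozenset({"fast"}), 200, available=False),
    ModelEntry("model-c", frozenset({"fast", "code"}), 300),
]
```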

Cost Ledger

1,000+ model pricing definitions. Per-request cost calculation in micro-USD. Raw audit trail + daily aggregates. Real-time dashboards via SSE.
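
Working in integer micro-USD avoids floating-point drift when many small charges are summed. The sketch below assumes prices are stored as micro-USD per million tokens and rounds each request up to the next whole micro-USD; the `Pricing` and `Ledger` names, the rounding rule, and the example prices are assumptions for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import DefaultDict, List, Tuple


@dataclass(frozen=True)
class Pricing:
    input_micro_usd_per_mtok: int
    output_micro_usd_per_mtok: int


def request_cost_micro_usd(pricing: Pricing, input_tokens: int, output_tokens: int) -> int:
    total = (input_tokens * pricing.input_micro_usd_per_mtok
             + output_tokens * pricing.output_micro_usd_per_mtok)
    # Ceiling division by one million tokens, in pure integer arithmetic.
    return -(-total // 1_000_000)


class Ledger:
    def __init__(self) -> None:
        self.entries: List[Tuple[str, str, int]] = []  # raw audit trail
        self.daily: DefaultDict[Tuple[str, str], int] = defaultdict(int)

    def record(self, agent: str, day: str, cost_micro_usd: int) -> None:
        self.entries.append((agent, day, cost_micro_usd))
        self.daily[(agent, day)] += cost_micro_usd
```

With illustrative prices of 3 USD per million input tokens and 15 USD per million output tokens, a request with 1,000 input and 500 output tokens costs 10,500 micro-USD, or 0.0105 USD.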

OAuth Provider Flow

Native OAuth 2.0 for Claude Code, GitHub Copilot, and Google Vertex AI. PKCE support. Automatic token refresh.
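
For readers unfamiliar with PKCE, the client-side half looks roughly like this: generate a random code verifier, then send its SHA-256 digest, base64url-encoded without padding, as the code challenge (the S256 method from RFC 7636). The function names are chosen for this sketch; how the gateway itself implements the flow is not shown on this page.

```python
import base64
import hashlib
import secrets


def new_code_verifier() -> str:
    # 64 random bytes encode to 86 URL-safe characters,
    # inside the 43 to 128 character range that RFC 7636 allows.
    return secrets.token_urlsafe(64)


def code_challenge_s256(verifier: str) -> str:
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # base64url without the trailing "=" padding, as the spec requires.
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The challenge goes in the authorization request and the verifier in the token request, so an intercepted authorization code cannot be redeemed without the verifier.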

Key Pool Rotation

Multiple keys per provider with automatic rotation on failure. AES-256-GCM encrypted vault. Prefix validation on entry.
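
Prefix validation and rotation can be sketched with a deque per provider. The prefix table is illustrative and would need to be checked against each provider's current key format, `KeyPool` is a hypothetical name, and encryption at rest is left out because it requires a cryptography library outside the standard library.

```python
from collections import deque
from typing import Deque

# Illustrative prefixes only; verify against each provider's documentation.
KEY_PREFIXES = {"anthropic": "sk-ant-", "openai": "sk-", "groq": "gsk_"}


class KeyPool:
    def __init__(self, provider: str) -> None:
        self.provider = provider
        self._keys: Deque[str] = deque()

    def add(self, key: str) -> None:
        # Reject obviously mis-pasted keys at entry time.
        prefix = KEY_PREFIXES.get(self.provider)
        if prefix and not key.startswith(prefix):
            raise ValueError(f"{self.provider} keys must start with {prefix!r}")
        self._keys.append(key)

    @property
    def current(self) -> str:
        return self._keys[0]

    def rotate_on_failure(self) -> str:
        # Move the failing key to the back and return the next one.
        self._keys.rotate(-1)
        return self._keys[0]
```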

14 Supported Providers

Anthropic, OpenAI, Google Gemini, AWS Bedrock, Azure OpenAI, AWS SageMaker, Groq, Ollama, LlamaSwap, OpenRouter, Together, Fireworks, DeepSeek, DashScope