AI that works in production — not just in demos.
Drop-in middleware for LLM routing, retry logic, cost control, and observability. Ship agents that actually work. Open-source.

REQUESTS      ████████████████████░░░░  847/1000
SUCCESS RATE  ██████████████████████████  99.4%
AVG LATENCY   ███████░░░░░░░░░░░░░░░░░░  312ms
────────────────────────────────────────────────
LIVE TRACE
▸ req_7f3a  "Summarize Q3 earnings report"
  ├─ openai/gpt-4o ·········· timeout (2.1s)
  ├─ anthropic/sonnet ······· ✓ 847ms  $0.012
  └─ result cached
▸ req_8b2c  "Extract key metrics from PDF"
  └─ openai/gpt-4o-mini ····· ✓ 203ms  $0.002
────────────────────────────────────────────────
COST TODAY  $47.23  ▼ 52% vs baseline
ROUTED CHEAPER  38% of requests
FALLBACKS  12 catches today
53% cost reduction via smart routing
99.4% effective uptime with auto-fallback
52% faster p99 through parallelization
Full visibility: every request tracked
The Problems
You didn't sign up to build AI infrastructure. You signed up to build product.
Costs Spiral
You're using GPT-4 for tasks GPT-4o-mini handles fine. No visibility until the bill arrives.
Hidden costs add 20-40% to direct API usage
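Smart routing is the fix: match each task to the cheapest model that handles it. A minimal sketch of the idea (the `estimateTier`/`routeModel` helpers and their thresholds are illustrative assumptions, not Opti's actual routing logic):

```typescript
// Pick a model tier from a rough complexity heuristic.
// Thresholds and keywords here are illustrative only.
type Tier = "cheap" | "premium";

function estimateTier(prompt: string): Tier {
  const longInput = prompt.length > 2000;
  const needsReasoning = /analyze|prove|plan|multi-step/i.test(prompt);
  return longInput || needsReasoning ? "premium" : "cheap";
}

function routeModel(prompt: string): string {
  // Extraction/summary-style tasks default to the cheap tier.
  return estimateTier(prompt) === "cheap" ? "gpt-4o-mini" : "gpt-4o";
}

console.log(routeModel("Extract key metrics from this PDF text"));
// → "gpt-4o-mini"
```

A production router would score on more signals (token count, tool use, past quality), but even a crude tier split recovers most of the spend on tasks the small model handles fine.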
Agents Fail
1% error per step = 63% failure on 100-step workflows. Loops, timeouts, silent failures.
Tool calling fails 3-15% in production
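The 63% figure is just compounded per-step error: a workflow with per-step success probability p succeeds overall with probability p^n. A quick check:

```typescript
// Probability that an n-step workflow fails at least once,
// given an independent per-step error rate.
function workflowFailureRate(perStepError: number, steps: number): number {
  return 1 - Math.pow(1 - perStepError, steps);
}

const failure = workflowFailureRate(0.01, 100);
console.log((failure * 100).toFixed(1) + "%"); // → "63.4%"
```

This is why retries and fallbacks matter: reliability has to be recovered per step, not bolted on at the end.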
Latency Kills UX
Sequential calls stack. 3 calls × 1.5s = 4.5s. Your users leave.
P95 latencies of 2-3s make apps unusable
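Independent calls don't have to stack: fan them out and pay only the slowest one. A minimal sketch with `Promise.all` (the `llmCall` stub stands in for a real provider call):

```typescript
// Stub for a provider call taking ~1.5s; swap in a real client.
const llmCall = (prompt: string): Promise<string> =>
  new Promise((resolve) => setTimeout(() => resolve(`result:${prompt}`), 1500));

async function sequential(prompts: string[]): Promise<string[]> {
  const out: string[] = [];
  for (const p of prompts) out.push(await llmCall(p)); // 3 × 1.5s ≈ 4.5s
  return out;
}

async function parallel(prompts: string[]): Promise<string[]> {
  return Promise.all(prompts.map(llmCall)); // max(1.5s) ≈ 1.5s total
}
```

Parallelization only applies when the calls don't depend on each other's outputs; a dependency-aware scheduler finds those opportunities automatically.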
Context Lies
128K tokens advertised. Quality degrades after 50K. You're paying for context you can't use.
Performance drops sharply at 50% capacity
Features
Performance
Real benchmarks, real workloads
Measured across 10K production requests. Results vary by workload composition and provider mix.
OPTI PERFORMANCE REPORT · 10K requests

LATENCY DISTRIBUTION
              WITHOUT OPTI    WITH OPTI
  p50         890ms           420ms
  p90         1.65s           780ms
  p99         2.34s           1.12s    (52% faster)

COST ANALYSIS (per 10K requests)
  Baseline    $142.00
  With Opti   $67.00    (53% savings)
  Routing breakdown: 38% routed to cheaper models · 27% cache hits · 35% premium models

RELIABILITY
  Success rate
    Baseline    94.7%
    With Opti   99.4%    (+4.7 pts uplift)
  Fallback events: 847 catches · Retries: 1,203 · Circuit breaks: 12

Summary: 53% cost reduction · 52% faster p99 · 99.4% success rate
Run benchmarks on your specific workload to see your own numbers.
Works With Your Stack
Not a framework. Not a platform lock-in. Just middleware that works.
Providers
OpenAI
Anthropic
Google AI
Azure OpenAI
AWS Bedrock
Ollama (local)
Frameworks
LangChain
LlamaIndex
Vercel AI SDK
Custom
Deployment
npm package
Self-hosted
Docker
Cloud (coming)
Get Started
Up and running in 2 minutes.
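Opti's actual quickstart isn't reproduced here, but the core pattern behind the live trace above (timeout on one provider, seamless fallback to the next) can be sketched in a few lines. The `withFallback` helper and provider shape below are illustrative assumptions, not Opti's API:

```typescript
type Provider = { name: string; call: (prompt: string) => Promise<string> };

// Try providers in order; a timeout or error falls through to the next.
// The 2s default mirrors the gpt-4o timeout shown in the trace.
async function withFallback(
  providers: Provider[],
  prompt: string,
  timeoutMs = 2000,
): Promise<string> {
  let lastError: unknown;
  for (const { name, call } of providers) {
    try {
      return await Promise.race([
        call(prompt),
        new Promise<never>((_, reject) =>
          setTimeout(() => reject(new Error(`${name} timed out`)), timeoutMs),
        ),
      ]);
    } catch (err) {
      lastError = err; // record, then try the next provider
    }
  }
  throw lastError;
}
```

In the trace above, `openai/gpt-4o` times out at 2.1s and `anthropic/sonnet` answers in 847ms; a chain like this is what turns individual provider failures into a 99.4% effective success rate.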
Early Access
Get early access to Opti
We're onboarding teams shipping AI to production.