LunaRoute

High-performance, secure local proxy for AI coding assistants

Website github.com
Overview

What it is

LunaRoute is a high-performance, secure local proxy for AI coding assistants such as Claude Code, Codex, and OpenCode. It gives complete visibility into every LLM interaction through passthrough with sub-millisecond overhead, session recording, and debugging tools.

Intent

I need it when

Track token usage and estimate costs across multiple AI coding sessions

LunaRoute automatically logs input, output, and thinking tokens broken down by session and model, storing data in SQLite for analytics queries. Developers can run SQL queries to analyze token consumption patterns, calculate estimated costs per session, and identify which models or tools consume the most tokens.
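As a rough illustration of the kind of analytics query described above, here is a minimal Python sketch. The `token_usage` table, its column names, the model names, and the prices are all hypothetical; LunaRoute's actual SQLite schema may differ, so inspect the real database before adapting the query. Billing thinking tokens at the output rate is also an assumption.

```python
import sqlite3

# Made-up prices in USD per million tokens: (input, output). Not real rates.
PRICES = {"claude-example": (3.0, 15.0), "gpt-example": (2.0, 8.0)}


def make_demo_db():
    """Build an in-memory database with the assumed (hypothetical) token_usage table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE token_usage ("
        "session_id TEXT, model TEXT, "
        "input_tokens INTEGER, output_tokens INTEGER, thinking_tokens INTEGER)"
    )
    conn.executemany(
        "INSERT INTO token_usage VALUES (?, ?, ?, ?, ?)",
        [
            ("s1", "claude-example", 1000, 200, 300),
            ("s1", "claude-example", 2000, 400, 100),
            ("s2", "gpt-example", 500, 500, 0),
        ],
    )
    return conn


def estimate_costs(conn):
    """Return (session_id, model, total_tokens, estimated_usd) per session and model."""
    rows = conn.execute(
        "SELECT session_id, model, SUM(input_tokens), "
        "SUM(output_tokens + thinking_tokens) "
        "FROM token_usage GROUP BY session_id, model "
        "ORDER BY session_id, model"
    ).fetchall()
    report = []
    for session_id, model, tokens_in, tokens_out in rows:
        price_in, price_out = PRICES[model]
        # Assumption: thinking tokens are billed at the output rate.
        usd = (tokens_in * price_in + tokens_out * price_out) / 1_000_000
        report.append((session_id, model, tokens_in + tokens_out, round(usd, 6)))
    return report
```

Calling `estimate_costs(make_demo_db())` yields one row per session and model; pointing `sqlite3.connect` at the real database file would require matching the query to the actual table layout.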

Support both OpenAI and Anthropic API formats without code changes

LunaRoute runs in dual-dialect passthrough mode, accepting both OpenAI /v1/chat/completions and Anthropic /v1/messages formats simultaneously on the same port. Requests are routed based on model prefix (gpt-* to OpenAI, claude-* to Anthropic) with sub-millisecond overhead and 100% API fidelity.
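The prefix-based routing rule can be pictured with a few lines of Python. This is only a sketch of the idea, not LunaRoute's implementation: the function name, the provider labels, and the choice to raise an error for unknown prefixes are assumptions made for illustration.

```python
# Prefix-to-provider table taken from the description above;
# the provider labels are illustrative strings, not real config values.
PREFIX_ROUTES = (
    ("gpt-", "openai"),
    ("claude-", "anthropic"),
)


def pick_provider(model: str) -> str:
    """Return the provider label whose prefix matches the model name."""
    for prefix, provider in PREFIX_ROUTES:
        if model.startswith(prefix):
            return provider
    # Assumption: unknown models are rejected. A real proxy might
    # instead fall back to a default provider.
    raise ValueError(f"no route for model {model!r}")
```

For example, `pick_provider("gpt-4o")` selects the OpenAI route and `pick_provider("claude-3-5-sonnet")` selects the Anthropic route.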

Debug and inspect AI coding assistant interactions in real-time

LunaRoute acts as a local proxy that records every LLM request and response with zero configuration, enabling developers to see exactly what Claude Code, OpenAI Codex, or other AI assistants send and receive. The built-in web UI at localhost:8082 provides session browsing, timeline views, and raw JSON inspection for complete visibility into AI behavior.

Protect sensitive data when recording AI interactions for debugging

LunaRoute includes automatic PII detection and redaction (emails, SSN, credit cards, phone numbers) before data hits disk. Developers can enable tokenization or masking modes to redact sensitive information while preserving session records for analysis and debugging.
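To make the difference between masking and tokenization concrete, here is a simplified Python sketch. The regular expressions are deliberately naive (US-style SSN and phone formats, 16-digit cards), and the placeholder format and hash-based tokens are assumptions; LunaRoute's actual detectors and output format are not shown here.

```python
import hashlib
import re

# Simplified patterns for illustration only. Order matters: the card
# pattern runs before SSN and phone so long digit runs are consumed first.
PATTERNS = [
    ("EMAIL", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("CARD", re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b")),
    ("SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("PHONE", re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")),
]


def redact(text: str, mode: str = "mask") -> str:
    """Replace detected PII before the text is written to disk.

    mode="mask" swaps each match for a fixed label such as [EMAIL].
    mode="tokenize" swaps it for a stable hash-derived token, so the same
    value always maps to the same placeholder across a session.
    """
    for label, pattern in PATTERNS:

        def replace(match, label=label):
            if mode == "tokenize":
                digest = hashlib.sha256(match.group().encode("utf-8")).hexdigest()
                return f"[{label}:{digest[:8]}]"
            return f"[{label}]"

        text = pattern.sub(replace, text)
    return text
```

Masking discards the value entirely, while tokenization keeps repeated occurrences of the same value linkable without storing the value itself.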

Analyze tool call performance and identify bottlenecks in AI-assisted workflows

LunaRoute records tool call statistics including call counts, latency, and success rates (e.g., Read: 12 calls avg 45ms, Write: 8 calls avg 120ms). Session statistics show proxy overhead vs. provider latency, helping developers separate LLM response time from the time added by the proxy itself.
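Per-tool figures like the ones quoted above can be derived from raw call records with a simple aggregation. The sketch below assumes each record is a hypothetical `(tool_name, latency_ms, succeeded)` tuple; how LunaRoute stores tool calls internally is not specified here.

```python
from collections import defaultdict


def summarize_tool_calls(calls):
    """Aggregate (tool_name, latency_ms, succeeded) records per tool.

    Returns a dict mapping each tool to its call count, average latency
    in milliseconds, and success rate between 0.0 and 1.0.
    """
    # Per tool: [call count, total latency in ms, successful calls]
    totals = defaultdict(lambda: [0, 0.0, 0])
    for tool, latency_ms, succeeded in calls:
        entry = totals[tool]
        entry[0] += 1
        entry[1] += latency_ms
        entry[2] += 1 if succeeded else 0
    return {
        tool: {"calls": n, "avg_ms": total_ms / n, "success_rate": ok / n}
        for tool, (n, total_ms, ok) in totals.items()
    }
```

Sorting the result by `avg_ms` or by `calls * avg_ms` is one way to surface the tools that dominate a session's wall-clock time.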

Drop

Not a fit when

  • User needs a managed SaaS proxy service with vendor support and uptime guarantees
  • User requires enterprise-grade compliance certifications (SOC 2, HIPAA) for production use
  • User needs multi-tenant isolation or role-based access controls for team collaboration
  • User operates in an air-gapped or offline environment without internet connectivity
  • User requires commercial indemnification or liability coverage for LLM interactions
Commercials

Pricing

Free and open source