Kuzco

Open-source Swift package to run LLMs locally on iOS & macOS

Overview

What it is

Kuzco runs LLMs, vision AI, and image generation locally on Apple devices, including iPhone, iPad, and Mac. Your users get unlimited AI while you pay one flat monthly fee (no more pesky per-token pricing eating into your margins 😀). It works completely offline, and user data never leaves the device. Ships in 3 lines of Swift. Build chat assistants, image generators, smart document tools, and more. All the AI power, none of the API anxiety.

Intent

I need it when

Integrate AI capabilities into iOS apps quickly without complex backend setup

Kuzco provides a Swift SDK with drop-in SwiftUI components and simple 3-line code examples, enabling developers to add text generation, image generation, and vision AI in minutes

Support offline AI functionality across all Apple devices including iPhone, iPad, Mac, and Vision Pro

Kuzco SDK works across all Apple platforms and operates in airplane mode, enabling consistent AI experiences without internet connectivity requirements

Build privacy-first iOS apps with AI features without sending user data to cloud servers

Kuzco runs LLMs, vision models, and image generation entirely on-device with no data transmission, enabling developers to ship AI features while maintaining complete user privacy and offline functionality

Reduce AI infrastructure costs by eliminating per-token API fees for iOS applications

Kuzco offers a flat monthly subscription model instead of per-token pricing, allowing unlimited AI access for users at predictable cost regardless of usage volume
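To make the flat-fee versus per-token trade-off concrete, here is a minimal Swift sketch of the break-even arithmetic. The prices are hypothetical placeholders chosen for easy arithmetic, not Kuzco's or any provider's actual rates, and the function names are illustrative only.

```swift
/// Cost of a month of usage under per-token pricing.
func perTokenCost(tokens: Double, pricePerMillionTokens: Double) -> Double {
    return tokens / 1_000_000 * pricePerMillionTokens
}

/// Monthly token volume at which a flat fee and per-token pricing cost the same.
func breakEvenTokens(flatMonthlyFee: Double, pricePerMillionTokens: Double) -> Double {
    precondition(pricePerMillionTokens > 0, "per-token price must be positive")
    return flatMonthlyFee / pricePerMillionTokens * 1_000_000
}

// Hypothetical placeholder prices (assumptions, not real rates).
let flatMonthlyFee = 50.0          // USD per month
let pricePerMillionTokens = 2.0    // USD per million tokens

let breakEven = breakEvenTokens(flatMonthlyFee: flatMonthlyFee,
                                pricePerMillionTokens: pricePerMillionTokens)
let heavyMonthCost = perTokenCost(tokens: 100_000_000,
                                  pricePerMillionTokens: pricePerMillionTokens)

print("Break-even volume:", breakEven, "tokens per month")
print("Per-token cost at 100M tokens:", heavyMonthCost, "USD vs flat", flatMonthlyFee, "USD")
```

Below the break-even volume, per-token pricing is cheaper. Above it, the flat fee wins and the bill stops growing with usage.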

Deliver faster AI inference performance to iOS users compared to cloud-dependent solutions

Kuzco's on-device models generate text ~50% faster than Apple Intelligence (22 tok/s vs 15 tok/s) and work offline, eliminating network latency and dependency on server availability
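The throughput comparison can be checked with a few lines of Swift. This sketch uses only the two figures quoted above. The reply length is a hypothetical value picked so the division comes out even, and a steady generation rate is assumed.

```swift
// Throughput figures quoted in the listing, in tokens per second.
let kuzcoTokensPerSecond = 22.0
let appleIntelligenceTokensPerSecond = 15.0

/// Relative speedup of `fast` over `slow`, as a fraction (0.5 means 50% faster).
func relativeSpeedup(fast: Double, slow: Double) -> Double {
    precondition(slow > 0, "baseline throughput must be positive")
    return fast / slow - 1.0
}

/// Seconds needed to generate `tokens` tokens at a steady rate.
func generationTime(tokens: Int, tokensPerSecond: Double) -> Double {
    precondition(tokensPerSecond > 0, "throughput must be positive")
    return Double(tokens) / tokensPerSecond
}

// 22 / 15 - 1 is about 0.467, which the listing rounds to ~50%.
let speedup = relativeSpeedup(fast: kuzcoTokensPerSecond,
                              slow: appleIntelligenceTokensPerSecond)

// Hypothetical reply length in tokens (an assumption for illustration).
let replyLength = 330
let kuzcoSeconds = generationTime(tokens: replyLength,
                                  tokensPerSecond: kuzcoTokensPerSecond)
let appleSeconds = generationTime(tokens: replyLength,
                                  tokensPerSecond: appleIntelligenceTokensPerSecond)

print("Speedup:", speedup)
print("Seconds for", replyLength, "tokens:", kuzcoSeconds, "vs", appleSeconds)
```

For a 330-token reply this works out to 15 seconds at 22 tok/s against 22 seconds at 15 tok/s, before any network latency is added to a cloud-backed alternative.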

Drop

Not a fit when

  • User requires cloud-based API with per-token pricing model for cost-predictable scaling
  • Application targets non-Apple platforms (Android, web, Windows); Kuzco is exclusive to Swift on Apple platforms
  • User needs real-time model updates or access to latest frontier models requiring cloud inference
  • Project requires offline-first AI but device storage is severely constrained (models range 1.1GB–5GB)
  • User prioritizes maximum model accuracy over on-device privacy and latency trade-offs
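
The storage constraint above can be tested before a model is downloaded. The following is a minimal sketch, not part of the Kuzco SDK: the function name and the 1 GB reserve are assumptions, and the available-space figure is passed in as a parameter. On Apple platforms that figure can be read with Foundation's `URL.resourceValues(forKeys:)` using the `.volumeAvailableCapacityForImportantUsageKey` resource key.

```swift
/// Returns true when a model of `modelBytes` can be stored while still
/// leaving `reserveBytes` free on the device afterwards.
/// The 1 GB default reserve is an assumed safety margin, not a Kuzco requirement.
func modelFits(modelBytes: Int64,
               availableBytes: Int64,
               reserveBytes: Int64 = 1_000_000_000) -> Bool {
    return availableBytes - modelBytes >= reserveBytes
}

let gigabyte: Int64 = 1_000_000_000

// Model size range quoted in the listing: 1.1 GB to 5 GB.
let smallestModel = 11 * gigabyte / 10
let largestModel = 5 * gigabyte

// Hypothetical device with 4 GB of free space (an assumption for illustration).
let freeSpace = 4 * gigabyte

let smallFits = modelFits(modelBytes: smallestModel, availableBytes: freeSpace)
let largeFits = modelFits(modelBytes: largestModel, availableBytes: freeSpace)

print("1.1 GB model fits:", smallFits)
print("5 GB model fits:", largeFits)
```

Keeping the check as a pure function makes it easy to unit test, and the app can fall back to a smaller model or prompt the user to free space when it returns false.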

Commercials

Pricing

Flat monthly subscription for unlimited on-device AI access; free tier available to start