Parallax by Gradient

Host LLMs across devices sharing GPU to make your AI go brrr

Website github.com

What it is

Your local AI just leveled up to multiplayer. Parallax is the easiest way to build your own AI cluster to run the best large language models across devices, no matter their specs or location.

Intent

I need it when

Build a cost-effective, self-hosted AI inference cluster using existing hardware across multiple locations

Parallax deploys LLMs across distributed nodes with varying configurations and physical locations. It supports cross-platform deployment (GPU, Mac, CPU) and combines pipeline-parallel model sharding with continuous batching to use resources efficiently, without vendor lock-in.
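To make pipeline-parallel sharding across unequal devices concrete, here is a minimal sketch that splits a model's consecutive layers across nodes in proportion to each node's memory. It is an illustration only, not Parallax's actual allocator: the function name `assign_layers`, the node names, and the memory-proportional rule are all assumptions.

```python
def assign_layers(num_layers: int, node_memory_gb: dict[str, float]) -> dict[str, range]:
    """Give each node a contiguous slice of layers sized by its share of total memory.

    Hypothetical helper for illustration; a real scheduler would also weigh
    compute speed and network latency between nodes.
    """
    total = sum(node_memory_gb.values())
    plan: dict[str, range] = {}
    start = 0
    cumulative = 0.0
    for name, memory in node_memory_gb.items():
        cumulative += memory
        # The cumulative boundary keeps slices contiguous and covers every layer once.
        end = round(num_layers * cumulative / total)
        plan[name] = range(start, end)
        start = end
    return plan


# Hypothetical three-node cluster hosting a 48-layer model.
plan = assign_layers(48, {"gpu-box": 24, "macbook": 16, "mini-pc": 8})
for node, layers in plan.items():
    print(node, f"layers {layers.start}..{layers.stop - 1}")
```

In a pipeline-parallel setup, each node would run only its own slice and pass activations to the next node, which is why the slices are contiguous.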

Run large language models locally on personal devices without relying on cloud APIs or third-party inference services

Parallax hosts local LLMs on personal devices through its Mac backend (powered by MLX LM) and its GPU backends (SGLang, vLLM), enabling private, offline inference with full control over the model.

Optimize inference performance across heterogeneous hardware by leveraging dynamic request scheduling and advanced memory management

Parallax provides paged KV cache management, continuous batching, and dynamic request scheduling to maximize throughput and minimize latency across nodes with different hardware capabilities.
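The idea behind a paged KV cache can be sketched in a few lines: memory is carved into fixed-size blocks, and each sequence holds a table of block IDs that grows only as it generates tokens. The class below is a simplified sketch that tracks block bookkeeping only (no tensors). The name `PagedKVCache`, its methods, and the default block size are hypothetical and do not reflect Parallax's internals.

```python
class PagedKVCache:
    """Toy block allocator: tracks which cache blocks each sequence owns."""

    def __init__(self, num_blocks: int, block_size: int = 16):
        self.block_size = block_size                    # tokens per block
        self.free_blocks = list(range(num_blocks))      # IDs of unused blocks
        self.block_tables: dict[str, list[int]] = {}    # sequence -> owned block IDs
        self.seq_lens: dict[str, int] = {}              # sequence -> tokens stored

    def append_token(self, seq_id: str) -> bool:
        """Reserve room for one more token; return False if the cache is full."""
        table = self.block_tables.setdefault(seq_id, [])
        length = self.seq_lens.get(seq_id, 0)
        if length == len(table) * self.block_size:      # current blocks are full
            if not self.free_blocks:
                return False                            # caller must wait or evict
            table.append(self.free_blocks.pop())
        self.seq_lens[seq_id] = length + 1
        return True

    def release(self, seq_id: str) -> None:
        """Return a finished sequence's blocks to the free pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)


cache = PagedKVCache(num_blocks=2, block_size=4)
for _ in range(5):
    cache.append_token("request-a")        # 5 tokens need 2 blocks of 4
print(cache.block_tables["request-a"])
```

Because blocks are released as soon as a request finishes, a scheduler that batches continuously can admit a waiting request into the freed space instead of waiting for the whole batch to complete.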

Deploy and manage multiple open-source LLMs (DeepSeek, Qwen, Llama, GLM, etc.) in a unified distributed inference framework

Parallax supports a wide range of open-source models from major providers and integrates with OpenClaw, so users can host and route requests across multiple model instances in a single cluster.
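As an illustration of routing requests across several model instances, the sketch below picks the least-busy instance that serves the requested model. It is a generic least-loaded router written for this listing, not Parallax's or OpenClaw's API: `ModelRouter`, its methods, and the model and node names are hypothetical.

```python
from collections import defaultdict


class ModelRouter:
    """Toy router: sends each request to the least-loaded instance of a model."""

    def __init__(self) -> None:
        # model name -> {instance name: number of in-flight requests}
        self.instances: dict[str, dict[str, int]] = defaultdict(dict)

    def register(self, model: str, instance: str) -> None:
        self.instances[model][instance] = 0

    def acquire(self, model: str) -> str:
        """Pick an instance for a new request and count it as in flight."""
        pool = self.instances.get(model)
        if not pool:
            raise KeyError(f"no instance serves model {model!r}")
        instance = min(pool, key=pool.get)   # ties go to the first registered
        pool[instance] += 1
        return instance

    def release(self, model: str, instance: str) -> None:
        """Mark a request as finished."""
        self.instances[model][instance] -= 1


router = ModelRouter()
router.register("qwen", "node-a")
router.register("qwen", "node-b")
router.register("llama", "node-c")
print(router.acquire("qwen"), router.acquire("qwen"), router.acquire("llama"))
```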

Drop

Not a fit when

User requires commercial support or SLA guarantees; Parallax is an open-source, community-driven project with no official commercial support tier.
User needs a managed inference service with no infrastructure setup; Parallax requires users to build and manage their own distributed cluster.
User lacks technical expertise in distributed systems, GPU configuration, or Python development; Parallax demands significant DevOps and ML engineering knowledge.
User wants proprietary model hosting from a single vendor; Parallax is decentralized and designed for self-hosted, heterogeneous node environments.
User operates in a regulated environment that requires vendor compliance certifications; Parallax is community-maintained and has no formal compliance documentation.

Commercials

Pricing

Open source (Apache-2.0 license)