Tinker - Inward App

Tinker

Control every aspect of model training and fine-tuning

Website: thinkingmachines.ai

What it is

Tinker is a flexible API for efficiently fine-tuning open-source models with LoRA. It is designed for researchers and developers who want full control over their data and algorithms without managing infrastructure.

Intent

I need it when

Access a diverse range of open-source models for training from a single API

Tinker supports 30+ open-source models, including Qwen, Llama, DeepSeek, and others, ranging from 1B to 397B parameters, so users can pick a model size that suits their research without switching platforms.

Fine-tune open-source language models efficiently without managing GPU infrastructure

Tinker provides a training API that abstracts away infrastructure complexity, allowing researchers to focus on datasets and algorithms. Users control training via four core functions (forward_backward, optim_step, sample, save_state) while Tinker handles distributed GPU orchestration and resource management.
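The split of responsibilities described above can be sketched as a user-written loop. This is a minimal sketch, not Tinker's actual client: `ToyTrainingClient` is a hypothetical in-memory stand-in that fits a single weight, and only the four function names come from the description above. The argument lists, return values, and the `train` helper are assumptions made for illustration.

```python
"""Toy stand-in for the four-function training loop (not the real Tinker client)."""


class ToyTrainingClient:
    """Hypothetical in-memory stand-in. Only the four method names mirror
    the functions named above; every signature here is an assumption."""

    def __init__(self, weight=0.0):
        self.weight = weight  # the single trainable parameter
        self.grad = 0.0       # gradient accumulated by forward_backward

    def forward_backward(self, batch):
        """Compute mean squared error on (x, y) pairs and accumulate its gradient."""
        loss = 0.0
        grad = 0.0
        for x, y in batch:
            err = self.weight * x - y
            loss += err * err
            grad += 2.0 * err * x
        self.grad += grad / len(batch)
        return loss / len(batch)

    def optim_step(self, learning_rate):
        """Apply one plain SGD update, then clear the accumulated gradient."""
        self.weight -= learning_rate * self.grad
        self.grad = 0.0

    def sample(self, x):
        """Produce an output from the current weight (stands in for generation)."""
        return self.weight * x

    def save_state(self):
        """Return a checkpoint from which training could be resumed."""
        return {"weight": self.weight}


def train(client, batch, steps, learning_rate):
    """The user-written loop: the caller chooses the data and the step count."""
    losses = []
    for _ in range(steps):
        losses.append(client.forward_backward(batch))
        client.optim_step(learning_rate)
    return losses


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # toy data following y = 2x
client = ToyTrainingClient()
losses = train(client, batch, steps=50, learning_rate=0.05)

# Checkpoint, then resume in a fresh client from the saved state.
checkpoint = client.save_state()
resumed = ToyTrainingClient(weight=checkpoint["weight"])
```

The point of the sketch is the shape of the loop: the caller owns the data, the update schedule, and the checkpointing decisions, while everything behind the four calls is someone else's concern.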

Quickly iterate on model training without hardware procurement or configuration overhead

Tinker handles scheduling, tuning, and resource management automatically. Users can save checkpoints and resume training via save_state, enabling rapid experimentation cycles without infrastructure delays.

Reduce compute costs during model fine-tuning while maintaining training quality

Tinker uses LoRA (Low-Rank Adaptation), which trains only a small adapter instead of all model weights, matching full fine-tuning performance with significantly lower compute requirements and cost per token.
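To make "a small adapter" concrete, the arithmetic below compares trainable parameters for one weight matrix. It relies on the standard LoRA formulation, where a frozen d_out x d_in matrix receives a low-rank update B·A with B of shape d_out x r and A of shape r x d_in. The layer size and rank are illustrative assumptions, not values taken from Tinker or from any specific model.

```python
def full_param_count(d_in, d_out):
    """Trainable parameters when the whole weight matrix is updated."""
    return d_in * d_out


def lora_param_count(d_in, d_out, rank):
    """Trainable parameters for a rank-r adapter: B (d_out x r) plus A (r x d_in)."""
    return rank * (d_in + d_out)


# Illustrative values only (assumptions, not Tinker defaults).
d_in = d_out = 4096
rank = 16

full = full_param_count(d_in, d_out)
lora = lora_param_count(d_in, d_out, rank)
fraction = lora / full

print(f"full: {full:,}  lora: {lora:,}  fraction: {fraction:.4%}")
```

Under these assumed sizes the adapter holds under 1% of the parameters of the matrix it adapts, which is where the reduction in trainable state comes from.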

Experiment with reinforcement learning workflows on large language models

Tinker's sample function generates tokens, which can serve as actions in RL and as outputs for evaluation. The API supports models up to 397B parameters, providing enough scale for complex RL experiments while Tinker continues to manage the infrastructure.
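One way sampled tokens can feed an RL update is sketched below: draw several completions for a prompt, score each with a reward function, and subtract the group mean so that better-than-average completions get positive weight. This is a generic baseline technique, not a method the source attributes to Tinker. `toy_sample` is a hypothetical stand-in for a sampling call, and the vocabulary and reward are invented for illustration.

```python
import random


def toy_sample(prompt, num_samples, rng):
    """Hypothetical stand-in for a sampling call: returns lists of tokens.
    The prompt is ignored here; a real model would condition on it."""
    vocab = ["yes", "no", "maybe"]
    return [[rng.choice(vocab) for _ in range(3)] for _ in range(num_samples)]


def reward(tokens):
    """Illustrative reward: fraction of tokens equal to "yes"."""
    return tokens.count("yes") / len(tokens)


def centered_advantages(rewards):
    """Subtract the group mean so the advantages sum to zero."""
    mean = sum(rewards) / len(rewards)
    return [r - mean for r in rewards]


rng = random.Random(0)  # fixed seed keeps the sketch reproducible
completions = toy_sample("Is the sky blue?", num_samples=4, rng=rng)
rewards = [reward(c) for c in completions]
advantages = centered_advantages(rewards)

for tokens, adv in zip(completions, advantages):
    print(" ".join(tokens), f"advantage={adv:+.3f}")
```

In a full loop, these advantages would weight the loss passed to the training step; that part is omitted here because it depends on the loss interface, which the source does not describe.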

Drop

Not a fit when

User needs to fine-tune proprietary or closed-source models not in Tinker's supported model list
User requires full model weight modification beyond LoRA adapter training
User needs on-premise or self-hosted infrastructure without cloud dependency
User operates in a non-USD currency and cannot absorb exchange rate fluctuations
User requires real-time inference serving rather than training and batch sampling
User has minimal compute needs and cannot justify a per-token pricing model

Commercials

Pricing

Pay-per-use, billed by compute tokens. Prefill, sample, and training operations are charged separately per million tokens. Storage costs $0.10/GB-month. Prices vary by model size and context length.
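The billing structure can be turned into a rough estimator. In this minimal sketch, only the $0.10/GB-month storage rate comes from the pricing summary; the per-million-token rates are placeholders chosen for easy arithmetic, not Tinker's prices, and the operation names used as dictionary keys are assumptions. Real rates vary by model size and context length.

```python
STORAGE_USD_PER_GB_MONTH = 0.10  # storage rate stated in the pricing summary


def estimate_cost(tokens, rates_per_million, storage_gb=0.0, months=0.0):
    """Sum per-operation token charges plus storage.

    tokens and rates_per_million are dicts keyed by operation name; each
    operation is billed separately at its own USD rate per million tokens.
    """
    compute = sum(
        tokens[op] / 1_000_000 * rates_per_million[op] for op in tokens
    )
    storage = storage_gb * months * STORAGE_USD_PER_GB_MONTH
    return compute + storage


# Placeholder rates in USD per million tokens (assumptions, NOT real prices).
placeholder_rates = {"prefill": 1.00, "sample": 2.00, "train": 3.00}

# Hypothetical usage for one experiment.
usage = {"prefill": 5_000_000, "sample": 1_000_000, "train": 10_000_000}

total = estimate_cost(usage, placeholder_rates, storage_gb=20, months=1)
print(f"estimated cost: ${total:.2f}")
```

Swapping in the published rates for a chosen model gives a quick check on whether a planned run fits a budget before any tokens are spent.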