Back to products
SmolDocling

SmolDocling

256M VLM for end-to-end document AI

Overview

What it is

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

Intent

I need it when

Build document understanding into machine learning applications

SmolDocling integrates with Hugging Face's comprehensive ML stack (Transformers, Diffusers, PEFT) allowing developers to combine document parsing with fine-tuning, inference optimization, and deployment on Inference Endpoints or Spaces.

Collaborate on document processing models with a team

SmolDocling can be hosted and versioned on the Hugging Face Hub, enabling teams to collaborate on model improvements, share datasets, and manage access controls through Team or Enterprise plans with SSO, audit logs, and resource groups.

Extract structured data from documents programmatically

SmolDocling, as part of the Hugging Face ecosystem, enables developers to parse and extract document content using state-of-the-art models. Users can leverage the Hub's 2M+ models and integrate document parsing into ML pipelines via Python libraries like Transformers and the Hub client library.

Deploy document parsing at scale with managed infrastructure

SmolDocling models can be deployed via Hugging Face Inference Endpoints (starting at $0.60/hour for GPU) or integrated into Spaces applications, providing managed, scalable document processing without infrastructure management.

Drop

Not a fit when

  • User needs a standalone document processing tool without integration into the Hugging Face ecosystem
  • Organization requires proprietary, closed-source document parsing solutions
  • User lacks familiarity with Python or machine learning frameworks and needs a no-code GUI solution
  • Project demands real-time document processing with sub-second latency on edge devices
  • User requires commercial support contracts with guaranteed SLAs for production document workflows
Commercials

Pricing

Pricing not specified