SmolDocling: Lightweight All-in-One Document OCR Model#
Current mainstream OCR systems typically require large models with 1B+ parameters for computation. Recently, I discovered a lightweight all-in-one document OCR model tool with only 256M parameters.
Features of SmolDocling OCR Model#
-
Lightweight and Fast
- 256M small parameters, can run on CPU/low-end GPU without high-end computing resources.
- Fast OCR speed, taking only 0.35 seconds per page, suitable for batch processing.
-
Core Capabilities
- Full Document OCR Parsing
- Intelligent recognition of titles, body text, lists, tables, charts, code, formulas, and more.
- Suitable for various document types including academic papers, business documents, patents, reports, handwritten documents, etc.
- Diverse Element Recognition
- Layout recognition, code recognition, formula recognition, chart and table recognition, graphic classification, etc.
- Flexible Output Formats
- Supports export to various formats including Markdown, HTML, JSON, etc.
- Batch Processing Support
- Can process multiple documents at once, suitable for large-scale data conversion.
- Full Document OCR Parsing
Quick Start#
To use the latest SmolDocling, there are two methods:
- Online Demo: The official demo of SmolDocling-256M-preview is deployed on HuggingFace, allowing you to directly experience its powerful features.
SmolDocling is a lightweight, ultra-fast, and fully document-parsing multimodal OCR model that is more accurate and efficient than traditional OCR, suitable for tasks such as paper parsing, contract analysis, data extraction, and knowledge base construction. It not only supports complete document OCR, including tables, code, formulas, and charts, but also processes quickly, taking only 0.35 seconds per page, and can export in various formats, making it suitable for many different user needs.
If you are looking for a fast and efficient OCR tool, SmolDocling is definitely worth a try!
- Model Link: SmolDocling-256M-preview