Docling
Docling parses documents and exports them to the desired format with ease and speed.
Features
- 🗂️ Reads popular document formats (PDF, DOCX, PPTX, XLSX, Images, HTML, AsciiDoc & Markdown) and exports to HTML, Markdown and JSON (with embedded and referenced images)
- 📑 Advanced PDF document understanding incl. page layout, reading order & table structures
- 🧩 Unified, expressive DoclingDocument representation format
- 🤖 Plug-and-play integrations incl. LangChain, LlamaIndex, Crew AI & Haystack for agentic AI
- 🔍 OCR support for scanned PDFs
- 💻 Simple and convenient CLI
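As a rough illustration of the conversion features listed above, the following minimal sketch converts a single document to Markdown through Docling's Python API. It assumes the `docling` package is installed; the helper name `convert_to_markdown` is hypothetical and introduced only for this example.

```python
def convert_to_markdown(source: str) -> str:
    """Convert a document (local path or URL) to a Markdown string.

    Hypothetical helper for illustration; not part of Docling itself.
    """
    # Imported inside the function so this snippet still loads in
    # environments where docling is not installed (assumption: you have
    # run `pip install docling` before calling the function).
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    # convert() parses the input and returns a result whose `.document`
    # attribute holds the unified DoclingDocument representation.
    result = converter.convert(source)
    return result.document.export_to_markdown()
```

Calling `convert_to_markdown("report.pdf")`, where `report.pdf` stands in for any supported input file, should return the document body as Markdown text. Conversion options such as OCR are left at their defaults here.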
Coming soon
- ♾️ Equation & code extraction
- 📝 Metadata extraction, including title, authors, references & language
Get started
- Concepts: Learn Docling fundamentals
- Examples: Try out recipes for various use cases, including conversion, RAG, and more
- Integrations: Check out integrations with popular frameworks and tools
- Reference: See more API details
IBM ❤️ Open Source AI
Docling is brought to you by IBM.