Data Prep Kit
Get started
Docling is used by Data Prep Kit [↗], an open-source toolkit for preparing unstructured data for LLM application development at scales ranging from a single laptop to a datacenter.
The Data Prep Kit modules powered by Docling are listed below.
PDF ingestion to Parquet
- 💻 GitHub [↗]
- 📖 API docs [↗]
Document chunking
- 💻 GitHub [↗]
- 📖 API docs [↗]