Document intelligence for AI engineering workflows.
Extract structured, machine-readable content from
any document and feed it directly into AI agents,
pipelines, and applications.
One API
97 file formats
Demo upload limits:
• 1 page per document
• File size under 1 MB
• 10 demo requests per IP
Built for AI engineering workflows
Speed That Unblocks Your Team
Process documents in milliseconds instead of seconds! Your RAG pipeline moves at the speed of API calls, not extraction bottlenecks. Index millions of documents without waiting weeks for processing to complete.
Batch Processing at Scale
Process large numbers of documents in bulk. Kreuzberg is built for batch processing, and our cloud infrastructure is designed to scale.
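On the client side, bulk processing usually means running many extractions concurrently while capping how many are in flight. The sketch below shows that pattern with `asyncio`. `extract_one` is a hypothetical stand-in for whatever extraction call you use; it is not a Kreuzberg function, and the names and the default limit are assumptions.

```python
import asyncio


async def extract_one(path: str) -> dict:
    """Hypothetical stand-in for a real extraction call (not a Kreuzberg API)."""
    await asyncio.sleep(0)  # yield control, as a real I/O call would
    return {"path": path, "content": f"text of {path}"}


async def extract_batch(paths: list[str], limit: int = 8) -> list[dict]:
    """Extract many documents concurrently, with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(path: str) -> dict:
        async with semaphore:
            return await extract_one(path)

    # gather returns results in the same order as the input paths
    return await asyncio.gather(*(bounded(p) for p in paths))


results = asyncio.run(extract_batch(["a.pdf", "b.docx", "c.png"], limit=2))
```

Swapping the stub for a real call keeps the concurrency cap intact, which helps stay within rate limits when indexing large collections.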
LLM-Powered Intelligence
Go beyond extraction. Use vision language models as an OCR backend, extract structured JSON from documents using a schema, and generate embeddings - all via 146 LLM providers, including local models with zero API key configuration.
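Schema-driven extraction is easiest to trust when the consumer also checks the returned JSON. The sketch below validates a payload against a simplified, hypothetical schema (field name mapped to expected Python type). It is not JSON Schema and not Kreuzberg's schema format; the invoice fields are invented for illustration.

```python
import json

# Hypothetical, simplified schema: field name -> expected Python type.
INVOICE_SCHEMA = {"invoice_number": str, "total": float, "line_items": list}


def validate(record: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the record matches."""
    problems = []
    for field, expected in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(f"{field}: expected {expected.__name__}")
    return problems


# Example payload standing in for a model-produced extraction result.
raw = '{"invoice_number": "INV-17", "total": 129.5, "line_items": []}'
errors = validate(json.loads(raw), INVOICE_SCHEMA)
```

An empty `errors` list means the payload is safe to pass downstream; anything else can be logged or retried.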
Built for AI Teams
Kreuzberg is a full toolbox: text extraction, metadata extraction, NER, embedding, and chunking, all in a CPU-optimized binary.
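For readers new to chunking, here is a minimal sketch of one common strategy: fixed-size character windows with overlap. It illustrates the idea only; the function name and defaults are assumptions, and Kreuzberg's own chunker may work differently.

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of `size` characters that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return chunks


chunks = chunk_text("abcdefghij", size=4, overlap=2)
```

The overlap keeps sentences that straddle a boundary visible in both neighbouring chunks, at the cost of embedding some text twice.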
Code Intelligence
Extract functions, classes, imports, and symbols from code files across 305 programming languages. Structured output, ready for semantic chunking and RAG pipelines.
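To make "structured output" concrete, the sketch below pulls imports, classes, and functions out of a Python snippet using the standard-library `ast` module. It covers one language only and is an illustration of the kind of result to expect, not Kreuzberg's implementation or output format.

```python
import ast

SOURCE = '''
import os
from pathlib import Path

class Loader:
    def load(self, path):
        return Path(path).read_text()

def main():
    print(os.getcwd())
'''


def extract_symbols(source: str) -> dict:
    """Collect imported modules, class names, and function names."""
    symbols = {"imports": [], "classes": [], "functions": []}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            symbols["imports"] += [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            symbols["imports"].append(node.module)
        elif isinstance(node, ast.ClassDef):
            symbols["classes"].append(node.name)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            symbols["functions"].append(node.name)
    return symbols


symbols = extract_symbols(SOURCE)
```

A symbol table like this gives a chunker natural boundaries (one chunk per function or class) instead of arbitrary character offsets.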
Polyglot and multiplatform
Get native performance in the language of your choice. Kreuzberg is written in Rust and ships for 11 other programming languages. It runs on Linux, macOS, and Windows.
Three steps. One API
01
Send the file
Upload via API, SDK, CLI, or Docker. Supports PDFs, images, scanned docs, DOCX, PPTX, XLSX, HTML, and 90+ more formats.
02
We process it
Layout detection, OCR when needed, table extraction, optional VLM, and schema validation - all in a single call.
03
Pipe it anywhere
JSON response with full document structure. Webhook delivery for async workflows. Plug directly into your embeddings pipeline or RAG framework.
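The last step can be sketched as flattening a JSON response into records ready for an embeddings pipeline. The response shape below (`metadata`, `pages`, `number`, `text`) is a hypothetical example, not the documented API schema; adjust the field names to match the real response.

```python
import json

# Hypothetical response body; the real field names may differ.
RESPONSE = json.dumps({
    "metadata": {"title": "Q3 report", "page_count": 2},
    "pages": [
        {"number": 1, "text": "Revenue grew 12%."},
        {"number": 2, "text": ""},
    ],
})


def to_records(body: str, source: str) -> list[dict]:
    """Flatten a response into one record per non-empty page."""
    doc = json.loads(body)
    title = doc.get("metadata", {}).get("title")
    return [
        {"id": f"{source}#p{page['number']}", "title": title, "text": page["text"]}
        for page in doc.get("pages", [])
        if page.get("text", "").strip()
    ]


records = to_records(RESPONSE, "q3.pdf")
```

Stable per-page IDs make it straightforward to re-index a single document later without duplicating its vectors.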
Pay only for what you extract - no seats, no minimums
Cloud · Pay-as-you-go
Production-ready extraction, managed by us.
$0.008/page
First 10,000 pages free
92 file formats, 305 code formats
Images and scanned PDFs supported
OCR, layout detection, table extraction
No monthly minimum
Get started instantly, no card required
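As a quick worked example of the pay-as-you-go arithmetic, the sketch below estimates a bill from a page count. It assumes the 10,000 free pages are deducted once from the total; whether the allowance is one-time or recurring is not stated here, so treat that as an assumption.

```python
from decimal import Decimal

PRICE_PER_PAGE = Decimal("0.008")  # USD, from the listed per-page rate
FREE_PAGES = 10_000                # from the listed free allowance


def estimate_cost(pages: int) -> Decimal:
    """Estimated cost in USD, assuming the free allowance is deducted once."""
    billable = max(0, pages - FREE_PAGES)
    return billable * PRICE_PER_PAGE


cost = estimate_cost(50_000)  # 40,000 billable pages
```

`Decimal` avoids the rounding noise that binary floats introduce when multiplying by 0.008.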
Try it for free!
High volume
100K+ pages a month? Let's talk pricing.
Custom/page
Everything in the Pay-as-you-go plan
Discounted per-page rate on the cloud
Start Building Today
Join thousands of developers already building document intelligence pipelines using Kreuzberg - in their language of choice!