User Guide
This guide provides comprehensive documentation for the Kreuzberg document intelligence framework, covering core concepts, configuration options, and integration patterns.
Contents
- Basic Usage - Essential usage patterns and concepts (API)
- Extraction Configuration - Configure the extraction process (API)
- Metadata Extraction - Document metadata extraction (API)
- Content Chunking - Split documents into manageable chunks
- OCR Configuration - Configure OCR settings (API)
- OCR Backends - Choose and configure different OCR engines
- Supported Formats - All supported document formats
- MCP Server - Model Context Protocol server for AI integration
- API Server - REST API for document extraction
- Docker - Using Kreuzberg with Docker
Best Practices
- Use the async API for better performance in web applications and concurrent extraction
- Configure OCR language settings to match your document languages for better accuracy
- For large documents, consider file streaming methods to reduce memory usage
- When processing many similar documents, reuse configuration objects for consistency
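The first and last practices can be combined in one pattern: build a single configuration object, then run extractions concurrently with a cap on how many are in flight. The following is a minimal sketch of that pattern, not Kreuzberg's actual API. `SketchConfig` and `extract_stub` are hypothetical stand-ins for a real configuration object and a real async extraction call, and `extract_many` and `max_concurrency` are illustrative names. Check the API reference for the actual names and signatures before adapting it.

```python
import asyncio
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class SketchConfig:
    """Hypothetical stand-in for a Kreuzberg configuration object."""

    ocr_language: str = "eng"


async def extract_stub(path: Path, config: SketchConfig) -> str:
    """Placeholder for an async extraction call; swap in the real API here."""
    await asyncio.sleep(0)  # yields to the event loop, as I/O-bound work would
    return f"{path.name}:{config.ocr_language}"


async def extract_many(paths, config, max_concurrency=4):
    """Extract every path concurrently, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(path: Path) -> str:
        async with semaphore:
            return await extract_stub(path, config)

    # asyncio.gather returns results in the same order as its inputs.
    return await asyncio.gather(*(bounded(Path(p)) for p in paths))
```

A caller would create one config and pass it to every extraction, for example `asyncio.run(extract_many(["a.pdf", "b.docx"], SketchConfig(ocr_language="deu")))`. Making the config immutable (here via `frozen=True`) is one way to ensure that every document in a batch is processed with identical settings.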
Common Use Cases
Document Analysis: