Installation¶
Kreuzberg is available in multiple formats optimized for different runtimes: native bindings for server-side languages and WebAssembly for JavaScript environments. Choose the package that matches your runtime environment.
Which Package Should I Install?¶
| Runtime/Environment | Package | Performance | Best For |
|---|---|---|---|
| Node.js | @kreuzberg/node | ⚡ Fastest (native) | Server-side Node applications, native performance |
| Bun | @kreuzberg/node | ⚡ Fastest (native) | Bun runtime, native performance |
| Browser | @kreuzberg/wasm | ✓ Good (WASM) | Client-side apps, no native dependencies |
| Deno | @kreuzberg/wasm | ✓ Good (WASM) | Deno runtime, pure WASM execution |
| Cloudflare Workers | @kreuzberg/wasm | ✓ Good (WASM) | Serverless functions, edge computing |
| Python | kreuzberg | ⚡ Fastest (native) | Server-side Python, native performance |
| Ruby | kreuzberg | ⚡ Fastest (native) | Ruby applications, native performance |
| Rust | kreuzberg crate | ⚡ Fastest (native) | Rust projects, full control |
| CLI/Docker | kreuzberg-cli | ⚡ Fastest (native) | Command-line usage, batch processing |
Performance Notes¶
- Native bindings (@kreuzberg/node, kreuzberg Python/Ruby): ~100% performance, compiled C/C++ speed, full feature access
- WASM: ~60-80% performance relative to native, pure JavaScript, zero native dependencies, works anywhere JavaScript runs
Choose native bindings for server-side applications requiring maximum performance. Choose WASM for browser/edge environments or when avoiding native dependencies is essential.
System Dependencies¶
System dependencies vary by package:
Native Bindings Only (Python, Ruby, Node.js)¶
- Rust toolchain (
rustup) for building the core and bindings. - C/C++ build tools (Xcode Command Line Tools on macOS, MSVC Build Tools on Windows,
build-essentialon Linux). - Tesseract OCR (optional but recommended). Install via Homebrew (
brew install tesseract), apt (sudo apt install tesseract-ocr), or Windows installers. - Pdfium binaries are fetched automatically during builds; no manual steps required.
WASM (@kreuzberg/wasm)¶
No system dependencies required. WASM binaries are prebuilt and included in the npm package.
Python¶
Optional extras:
Next steps: Python Quick Start • Python API Reference
TypeScript (Node.js / Bun) - Native¶
Use @kreuzberg/node for server-side TypeScript/Node.js applications requiring maximum performance.
The package ships with prebuilt N-API binaries for Linux, macOS (Intel/Apple Silicon), and Windows. If you need to build from source, ensure Rust is available on your PATH and rerun the install command.
Performance: Native bindings provide ~100% performance through NAPI-RS compiled bindings.
Next steps: TypeScript Quick Start • TypeScript API Reference
TypeScript/JavaScript (Browser / Edge) - WASM¶
Use @kreuzberg/wasm for client-side JavaScript applications, serverless environments, and runtimes where native binaries are unavailable or undesirable.
When to Use WASM¶
- Client-side browser applications
- Cloudflare Workers or other edge computing platforms
- Deno or other JavaScript runtimes
- Environments where native dependencies cannot be installed
- Scenarios where package size reduction matters
When to Use Native (@kreuzberg/node)¶
- Server-side Node.js applications (10-40% faster)
- Bun runtime
- Maximum performance requirements
- Full feature access with no limitations
Installation¶
Performance: WASM bindings provide ~60-80% of native performance with zero native dependencies.
Browser Usage¶
<!DOCTYPE html>
<html>
<head>
<script type="module">
import { initWasm, extractFromFile } from '@kreuzberg/wasm';
window.initKreuzberg = async () => {
await initWasm();
console.log('Kreuzberg initialized');
};
window.extractFile = async (file) => {
const result = await extractFromFile(file);
console.log(result.content);
};
</script>
</head>
<body>
<input type="file" id="file" />
</body>
</html>
Deno¶
import { initWasm, extractFile } from 'npm:@kreuzberg/wasm';
await initWasm();
const result = await extractFile('./document.pdf');
console.log(result.content);
Cloudflare Workers¶
import { initWasm, extractBytes } from '@kreuzberg/wasm';
export default {
async fetch(request: Request): Promise<Response> {
await initWasm();
const file = await request.arrayBuffer();
const bytes = new Uint8Array(file);
const result = await extractBytes(bytes, 'application/pdf');
return new Response(JSON.stringify({ content: result.content }));
},
};
Optional Features¶
OCR support requires browser Web Workers and additional memory. Enable it selectively:
import { initWasm, enableOcr, extractFromFile } from '@kreuzberg/wasm';
await initWasm();
const fileInput = document.getElementById('file');
fileInput.addEventListener('change', async (e) => {
const file = e.target.files[0];
if (file.type.startsWith('image/')) {
// Enable OCR only for images
await enableOcr();
}
const result = await extractFromFile(file);
console.log(result.content);
});
WASM bindings work in: - Modern browsers (Chrome 74+, Firefox 79+, Safari 14+, Edge 79+) - Node.js 22+ - Deno 1.35+ - Bun 0.6+ - Cloudflare Workers - Other JavaScript runtimes with WebAssembly support
Next steps: WASM Quick Start • WASM API Reference
Ruby¶
Bundler projects can add it to the Gemfile:
Native extension builds require Ruby 3.3+ plus MSYS2 on Windows. Set RBENV_VERSION/chruby accordingly and ensure bundle config set build.kreuzberg --with-cflags="-std=c++17" if your compiler defaults are older.
Next steps: Ruby Quick Start • Ruby API Reference
Rust¶
Or edit Cargo.toml manually:
Enable optional features as needed:
Next steps: Rust API Reference
CLI¶
Homebrew tap (macOS / Linux):
Cargo install:
Docker image:
docker pull goldziher/kreuzberg:latest # Core image with essential features
docker pull goldziher/kreuzberg:latest-all # Full image with all extensions
Next steps: CLI Usage • API Server Guide
Development Environment¶
To work on the repository itself:
task setup # Install all dependencies (Python, Node.js, Ruby, Rust)
task lint # Run linters across all languages
task dev:test # Execute full test suite (Rust, Python, Ruby, TypeScript)
See Contributing for branch naming, coding conventions, and test expectations.