# Getting started with Kreuzberg Cloud

A minimal end-to-end walkthrough: sign up, get an API key, extract one document.

## 1. Create an account and project

1. Go to <https://kreuzberg.dev> and sign up.
2. The dashboard creates a default project; you can also create new projects from `/dashboard`.
3. In project settings, generate an API key. Keys look like `kbg_live_…`. Treat them as secrets.

The first 10,000 pages on each project are free. After that it's $0.008/page.
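As a rough illustration of the pricing above, the sketch below estimates a project's bill from its cumulative page count. It assumes the 10,000-page allowance is a one-time, per-project allowance and ignores taxes and rounding rules, none of which are stated here. The function name `estimated_cost` is made up for this example.

```python
from decimal import Decimal

FREE_PAGES = 10_000                  # free pages per project (from the text above)
PRICE_PER_PAGE = Decimal("0.008")    # USD per page after the free allowance


def estimated_cost(total_pages: int) -> Decimal:
    """Estimate the USD cost for a project's cumulative page count.

    Assumption: the free allowance applies once per project, not per month.
    """
    if total_pages < 0:
        raise ValueError("total_pages must be non-negative")
    billable = max(0, total_pages - FREE_PAGES)
    return billable * PRICE_PER_PAGE
```

Under these assumptions, a project that has processed 25,000 pages has 15,000 billable pages, which comes to $120.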

## 2. Extract a document — curl

The extraction API base URL is `https://api.kreuzberg.dev`. The simplest path is an inline `POST /v1/extract`, which accepts at most 10 documents per request and at most 1 MB per document.

```bash
# Encode to a single line. GNU base64 wraps its output by default; piping
# through tr strips the newlines, so this works on both Linux and macOS.
B64=$(base64 < invoice.pdf | tr -d '\n')

curl -X POST https://api.kreuzberg.dev/v1/extract \
  -H "Authorization: Bearer $KREUZBERG_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": [{
      "filename": "invoice.pdf",
      "mime_type": "application/pdf",
      "data": "'"$B64"'"
    }],
    "options": {
      "extraction_config": {
        "output_format": "markdown",
        "ocr": { "backend": "tesseract", "language": "eng" }
      }
    },
    "webhook": {
      "url": "https://your-app.invalid/hooks/kreuzberg",
      "secret": "your-shared-secret"
    }
  }'
```

Response (HTTP 202):

```json
{ "job_ids": ["550e8400-e29b-41d4-a716-446655440000"], "status": "pending" }
```

Poll the job (or wait for the webhook):

```bash
curl -H "Authorization: Bearer $KREUZBERG_API_KEY" \
  https://api.kreuzberg.dev/v1/jobs/550e8400-e29b-41d4-a716-446655440000
```

Recommended polling cadence: every 1 second for the first ~10 seconds, then every 5 seconds. Treat any `*ing` status (`pending`, `processing`, `chunking`, `aggregating`) as in-flight.
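The cadence above can be sketched as a small polling helper. This is a minimal sketch, not an official client: `wait_for_job`, `next_delay`, and `is_in_flight` are hypothetical names, and `fetch_status` is a caller-supplied function that is assumed to call `GET /v1/jobs/{id}` and return the job's status string (the exact shape of the job response is not shown in this guide). The timeout value is also an assumption.

```python
import time
from typing import Callable

# In-flight statuses named in this guide.
IN_FLIGHT = {"pending", "processing", "chunking", "aggregating"}


def is_in_flight(status: str) -> bool:
    """True while the job is still running (any "*ing" status)."""
    return status in IN_FLIGHT or status.endswith("ing")


def next_delay(elapsed: float) -> float:
    """1 second between polls for the first 10 seconds, then 5 seconds."""
    return 1.0 if elapsed < 10.0 else 5.0


def wait_for_job(
    fetch_status: Callable[[], str],
    timeout: float = 300.0,  # assumed upper bound; tune for your documents
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the status is no longer in-flight and return it."""
    start = clock()
    while True:
        status = fetch_status()
        if not is_in_flight(status):
            return status
        elapsed = clock() - start
        if elapsed >= timeout:
            raise TimeoutError(f"job still {status!r} after {elapsed:.0f}s")
        sleep(next_delay(elapsed))
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting; in production the defaults are enough.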

## 3. Python

```python
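import base64
import os

import httpx

# The API key comes from the environment, as in the curl example.
api_key = os.environ["KREUZBERG_API_KEY"]
# Placeholder webhook settings; replace with your own endpoint and secret.
webhook_url = "https://your-app.invalid/hooks/kreuzberg"
webhook_secret = "your-shared-secret"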

with open("invoice.pdf", "rb") as f:
    data = base64.b64encode(f.read()).decode()

response = httpx.post(
    "https://api.kreuzberg.dev/v1/extract",
    headers={"Authorization": f"Bearer {api_key}"},
    json={
        "documents": [{
            "filename": "invoice.pdf",
            "mime_type": "application/pdf",
            "data": data,
        }],
        "options": {
            "extraction_config": {
                "output_format": "markdown",
                "ocr": {"backend": "tesseract", "language": "eng"},
            },
        },
        "webhook": {"url": webhook_url, "secret": webhook_secret},
    },
)
response.raise_for_status()
job_ids = response.json()["job_ids"]
```
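The request limits from section 2 (10 documents per request, 1 MB per document) can be checked client-side before anything is uploaded. The sketch below is one way to do that. It assumes that "1 MB" means 1,000,000 bytes of raw file content measured before base64 encoding, which this guide does not specify; `encode_document` and `batch_documents` are hypothetical helper names.

```python
import base64

MAX_DOCUMENTS_PER_REQUEST = 10
# Assumption: the limit is 1,000,000 bytes of raw (pre-base64) content.
MAX_DOCUMENT_BYTES = 1_000_000


def encode_document(filename: str, raw: bytes, mime_type: str = "application/pdf") -> dict:
    """Build one entry for the "documents" array, rejecting oversized files."""
    if len(raw) > MAX_DOCUMENT_BYTES:
        raise ValueError(f"{filename} is {len(raw)} bytes; limit is {MAX_DOCUMENT_BYTES}")
    return {
        "filename": filename,
        "mime_type": mime_type,
        "data": base64.b64encode(raw).decode("ascii"),
    }


def batch_documents(files: list) -> list:
    """Split (filename, raw_bytes) pairs into request-sized lists of documents."""
    encoded = [encode_document(name, raw) for name, raw in files]
    return [
        encoded[i : i + MAX_DOCUMENTS_PER_REQUEST]
        for i in range(0, len(encoded), MAX_DOCUMENTS_PER_REQUEST)
    ]
```

Each list returned by `batch_documents` can be sent as the `documents` field of a separate `POST /v1/extract` request.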

## 4. TypeScript

```ts
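import { readFile } from "node:fs/promises";

// Assumes Node.js 18+ (global fetch) and an ES module (top-level await).
const apiKey = process.env.KREUZBERG_API_KEY;
// Placeholder webhook settings; replace with your own endpoint and secret.
const webhookUrl = "https://your-app.invalid/hooks/kreuzberg";
const webhookSecret = "your-shared-secret";

// readFile already returns a Buffer, so it can be base64-encoded directly.
const data = (await readFile("invoice.pdf")).toString("base64");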

const response = await fetch("https://api.kreuzberg.dev/v1/extract", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    documents: [{ filename: "invoice.pdf", mime_type: "application/pdf", data }],
    options: {
      extraction_config: {
        output_format: "markdown",
        ocr: { backend: "tesseract", language: "eng" },
      },
    },
    webhook: { url: webhookUrl, secret: webhookSecret },
  }),
});
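// fetch does not reject on HTTP error statuses, so check explicitly.
if (!response.ok) {
  throw new Error(`Extraction request failed: HTTP ${response.status}`);
}
const { job_ids } = await response.json();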
```

## 5. Verifying webhook signatures

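Webhooks include an `X-Webhook-Signature: sha256=<hex>` header when a `webhook.secret` is supplied. To verify it, compute the HMAC-SHA256 of the raw request body (the unparsed bytes) with the secret as the key, hex-encode the digest, and compare the result with the header value in constant time.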

```python
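import hmac, hashlib
signature = request.headers.get("X-Webhook-Signature", "")  # "" if the header is missing
expected = "sha256=" + hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
# Reject explicitly: assert statements are stripped under `python -O`.
if not hmac.compare_digest(expected.encode(), signature.encode()):
    raise PermissionError("invalid webhook signature")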
```
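To exercise a handler before real webhooks arrive, a test request can be signed locally with the same scheme. The sketch below wraps signing and verification in two functions that do not depend on a web framework. `sign_body` and `verify_signature` are hypothetical names, and the sample payload in the usage note is made up, since the webhook body format is not shown in this guide.

```python
import hashlib
import hmac
from typing import Optional


def sign_body(secret: str, raw_body: bytes) -> str:
    """Return the header value "sha256=<hex>" for the given raw body."""
    digest = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify_signature(secret: str, raw_body: bytes, header_value: Optional[str]) -> bool:
    """Check an X-Webhook-Signature header value against the raw body."""
    if not header_value:
        return False  # header absent: treat as unverified
    expected = sign_body(secret, raw_body)
    # Compare as bytes in constant time; non-ASCII input cannot raise here.
    return hmac.compare_digest(expected.encode(), header_value.encode())
```

For a local test, sign a made-up payload such as `b'{"example": true}'` with `sign_body`, send it to the handler with the result in `X-Webhook-Signature`, and confirm that changing a single byte of the body makes `verify_signature` return `False`.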

## 6. Open-source alternative

If you'd rather self-host, the same extraction core is on GitHub: <https://github.com/kreuzberg-dev/kreuzberg>. SDKs ship for Rust, Python, TypeScript, JavaScript, Node.js, PHP, Ruby, Elixir, Go, C#, R, and WebAssembly (12 total). License is Elastic License v2 (free for personal, internal, and commercial use; not for managed-service resale). VLM OCR backends and arbitrary embedding providers are available in the self-hosted library.

## See also

- [Capabilities](https://kreuzberg.dev/llms/capabilities.md)
- [Extraction API](https://kreuzberg.dev/llms/api.md)
- [Pricing](https://kreuzberg.dev/llms/pricing.md)
- [Full documentation](https://docs.kreuzberg.dev/)
