
Introduction

AeorDB is a content-addressed file database that treats your data as a filesystem, not as tables and rows. Store any file at any path, query structured fields with sub-millisecond lookups, and version everything with Git-like snapshots and forks – all from a single binary with zero external dependencies.

What Makes AeorDB Different

Content-addressed storage with BLAKE3. Every piece of data is identified by its cryptographic hash. This gives you built-in deduplication, integrity verification, and a Merkle tree that makes versioning essentially free.

A filesystem, not a schema. Data lives at paths like /users/alice.json and /docs/reports/q1.pdf, organized into directories. No schemas to define, no migrations to run. Store JSON, images, PDFs, or raw bytes – the engine handles them all the same way.

Built-in versioning. Create named snapshots, fork your database into isolated branches, diff between any two versions, and export/import self-contained .aeordb files. The content-addressed Merkle tree means historical reads resolve the exact data at the time of the snapshot, not the latest overwrite.

WASM plugin system. Extend the database with WebAssembly plugins for two purposes: parser plugins that extract structured fields from non-JSON files (PDFs, images, XML) for indexing, and query plugins that run custom logic directly at the data layer. Plugins execute in a sandboxed WASM runtime with configurable memory limits.

Native HTTP API. AeorDB exposes its full API over HTTP – no separate proxy, no client library required. Store files with PUT, read them with GET, query with POST /query, and manage versions with the /version/* endpoints. Any HTTP client works.

Embeddable. A single aeordb binary with no external dependencies. Point it at a .aeordb file and you have a running database. Like SQLite, but for files with versioning and a built-in HTTP server.

Lock-free concurrent reads. The engine uses snapshot double-buffering via ArcSwap so readers never block writers and never see partial state. Queries routinely complete in under a millisecond.

Key Features

  • Storage: Append-only WAL file, content-addressed BLAKE3 hashing, automatic zstd compression, 256KB chunking for dedup
  • Indexing: Scalar bucketing (NVT) with u64, i64, f64, string, timestamp, trigram, phonetic/soundex/dmetaphone index types
  • Querying: JSON query API with boolean logic (and, or, not), comparison operators, sorting, pagination, projections, and aggregations
  • Versioning: Snapshots, forks, diff/patch, export/import as self-contained .aeordb files
  • Plugins: WASM parser plugins for any file format, WASM query plugins for custom data-layer logic
  • Operations: Background task system, cron scheduler, garbage collection, automatic reindexing
  • Auth: Self-contained JWT auth, API keys, user/group management, path-level permissions, or --auth false for local use
  • Observability: Prometheus metrics at /admin/metrics, SSE event stream at /events/stream, structured logging

Next Steps

  • Installation – build the binary from source

Installation

AeorDB is built from source using the Rust toolchain. There are no external dependencies – the binary is fully self-contained.

Prerequisites

  • Rust (stable toolchain, 1.75+)

Install Rust if you don’t have it:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Build from Source

Clone the repository and build in release mode:

git clone https://github.com/AeorDB/aeordb.git
cd aeordb
cargo build --release

The binary is located at:

target/release/aeordb

Verify the Build

./target/release/aeordb --help

You should see:

AeorDB command-line interface

Usage: aeordb <COMMAND>

Commands:
  start            Start the database server
  stress           Run stress tests against a running instance
  emergency-reset  Emergency reset: revoke the current root API key and generate a new one
  export           Export a version as a self-contained .aeordb file
  diff             Create a patch .aeordb containing only the changeset between two versions
  import           Import an export or patch .aeordb file into a target database
  promote          Promote a version hash to HEAD
  gc               Run garbage collection to reclaim unreachable entries
  help             Print this message or the help of the given subcommand(s)

Optional: Add to PATH

# Copy to a location in your PATH
sudo cp target/release/aeordb /usr/local/bin/aeordb

# Or symlink
sudo ln -s "$(pwd)/target/release/aeordb" /usr/local/bin/aeordb

No External Dependencies

AeorDB does not require:

  • A separate database process (it IS the process)
  • Runtime libraries or shared objects
  • Configuration files (sensible defaults for everything)
  • Docker, containers, or orchestration

The single binary is all you need. Point it at a file path and it creates the database on first run.

Next Steps

  • Quick Start – start the server and store your first file

Quick Start

This tutorial walks you through the core operations: storing files, querying, creating snapshots, and cleaning up. All examples use curl and assume auth is disabled for simplicity.

1. Start the Server

aeordb start --database mydb.aeordb --port 3000 --auth false

You should see log output indicating the server is listening on port 3000. The mydb.aeordb file is created automatically if it does not exist.

2. Store a File

Store a JSON file at the path /users/alice.json:

curl -X PUT http://localhost:3000/engine/users/alice.json \
  -H "Content-Type: application/json" \
  -d '{"name":"Alice","age":30,"city":"Portland"}'

Expected response:

{
  "status": "created",
  "path": "/users/alice.json",
  "hash": "a1b2c3d4..."
}

Store a few more files to have data to query:

curl -X PUT http://localhost:3000/engine/users/bob.json \
  -H "Content-Type: application/json" \
  -d '{"name":"Bob","age":25,"city":"Seattle"}'

curl -X PUT http://localhost:3000/engine/users/carol.json \
  -H "Content-Type: application/json" \
  -d '{"name":"Carol","age":35,"city":"Portland"}'

3. Read a File

curl http://localhost:3000/engine/users/alice.json

Expected response:

{"name":"Alice","age":30,"city":"Portland"}

4. List a Directory

Append a trailing slash to list directory contents:

curl http://localhost:3000/engine/users/

Expected response:

{
  "path": "/users/",
  "entries": [
    {"name": "alice.json", "type": "file"},
    {"name": "bob.json", "type": "file"},
    {"name": "carol.json", "type": "file"}
  ]
}

5. Add an Index

To query fields, you need to tell AeorDB which fields to index. Store an index configuration at .config/indexes.json inside the directory:

curl -X PUT http://localhost:3000/engine/users/.config/indexes.json \
  -H "Content-Type: application/json" \
  -d '{
    "indexes": [
      {"name": "age", "type": "u64"},
      {"name": "city", "type": "string"},
      {"name": "name", "type": ["string", "trigram"]}
    ]
  }'

This tells the engine to index the age field as a 64-bit unsigned integer, city as an exact string, and name as both an exact string and a trigram (for fuzzy matching). Existing files in the directory are automatically reindexed in the background.

6. Query

Query for users older than 28 in Portland:

curl -X POST http://localhost:3000/query \
  -H "Content-Type: application/json" \
  -d '{
    "path": "/users/",
    "where": {
      "and": [
        {"field": "age", "op": "gt", "value": 28},
        {"field": "city", "op": "eq", "value": "Portland"}
      ]
    }
  }'

Expected response:

{
  "results": [
    {"name": "Alice", "age": 30, "city": "Portland"},
    {"name": "Carol", "age": 35, "city": "Portland"}
  ],
  "total_count": 2
}

Query Operators

| Operator | Description | Example |
|---|---|---|
| eq | Equals | {"field": "city", "op": "eq", "value": "Portland"} |
| gt | Greater than | {"field": "age", "op": "gt", "value": 25} |
| gte | Greater than or equal | {"field": "age", "op": "gte", "value": 25} |
| lt | Less than | {"field": "age", "op": "lt", "value": 30} |
| lte | Less than or equal | {"field": "age", "op": "lte", "value": 30} |
| between | Range (inclusive) | {"field": "age", "op": "between", "value": [25, 35]} |
| fuzzy | Trigram fuzzy match | {"field": "name", "op": "fuzzy", "value": "Alce"} |
| phonetic | Phonetic match | {"field": "name", "op": "phonetic", "value": "Karrol"} |

7. Create a Snapshot

Save the current state as a named snapshot:

curl -X POST http://localhost:3000/version/snapshot \
  -H "Content-Type: application/json" \
  -d '{"name": "v1"}'

Expected response:

{
  "status": "created",
  "name": "v1",
  "root_hash": "e5f6a7b8..."
}

You can list all snapshots:

curl http://localhost:3000/version/snapshots

8. Delete a File

curl -X DELETE http://localhost:3000/engine/users/alice.json

Expected response:

{
  "status": "deleted",
  "path": "/users/alice.json"
}

The file is removed from the current state (HEAD), but the snapshot v1 still contains it. You can restore the snapshot to get it back.

9. Run Garbage Collection

Over time, deleted and overwritten data accumulates in the database file. Run GC to reclaim unreachable entries:

curl -X POST http://localhost:3000/admin/gc

Expected response:

{
  "versions_scanned": 2,
  "live_entries": 15,
  "garbage_entries": 3,
  "reclaimed_bytes": 1024,
  "duration_ms": 12,
  "dry_run": false
}

To preview what would be collected without actually deleting:

curl -X POST "http://localhost:3000/admin/gc?dry_run=true"

Next Steps

  • Configuration – CLI flags, auth modes, CORS, index config
  • Architecture – understand the storage engine internals
  • Versioning – snapshots, forks, diff/patch, export/import
  • Indexing – index types, multi-strategy fields, WASM parsers

Configuration

AeorDB is configured through CLI flags at startup and through configuration files stored inside the database itself.

CLI Flags

aeordb start [OPTIONS]

| Flag | Default | Description |
|---|---|---|
| --port, -p | 3000 | HTTP listen port |
| --database, -D | data.aeordb | Path to the database file (created if it does not exist) |
| --auth | self-contained | Auth provider URI (see Auth Modes) |
| --hot-dir | database parent dir | Directory for write-ahead hot files (crash recovery journal) |
| --cors | disabled | CORS allowed origins (see CORS) |
| --log-format | pretty | Log output format: pretty or json |

Examples

# Minimal: local development with no auth
aeordb start --database dev.aeordb --port 8080 --auth false

# Production: custom port, explicit hot directory, CORS for your frontend
aeordb start \
  --database /var/lib/aeordb/prod.aeordb \
  --port 443 \
  --hot-dir /var/lib/aeordb/hot \
  --cors "https://myapp.com,https://admin.myapp.com" \
  --log-format json

Auth Modes

The --auth flag controls how authentication works:

| Value | Behavior |
|---|---|
| false (or null, no, 0) | Auth disabled – all requests are allowed without tokens. Use for local development only. |
| (omitted) | Self-contained mode (default). AeorDB manages its own users, API keys, and JWT tokens. A root API key is printed on first startup. |
| file:///path/to/identity.json | External identity file. AeorDB loads cryptographic keys from the specified file. A bootstrap API key is generated on first use. |

Self-Contained Auth (Default)

On first startup, AeorDB creates an internal identity store and prints a root API key:

Root API key: aeor_ak_7f3b2a1c...

Use this key to create additional users and API keys via the admin API. If you lose the root key, use aeordb emergency-reset to generate a new one.

Obtaining a JWT Token

# Exchange API key for a JWT token
curl -X POST http://localhost:3000/auth/token \
  -H "Content-Type: application/json" \
  -d '{"api_key": "aeor_ak_7f3b2a1c..."}'

# Use the token for subsequent requests
curl http://localhost:3000/engine/users/ \
  -H "Authorization: Bearer eyJhbG..."

CORS

Global CORS via CLI

The --cors flag sets allowed origins for all routes:

# Allow all origins
aeordb start --cors "*"

# Allow specific origins (comma-separated)
aeordb start --cors "https://myapp.com,https://admin.myapp.com"

Without --cors, no CORS headers are sent and cross-origin browser requests will fail.

Per-Path CORS

For fine-grained control, store a /.config/cors.json file in the database:

curl -X PUT http://localhost:3000/engine/.config/cors.json \
  -H "Content-Type: application/json" \
  -d '{
    "rules": [
      {
        "path": "/public/",
        "origins": ["*"],
        "methods": ["GET", "HEAD"],
        "headers": ["Content-Type"]
      },
      {
        "path": "/api/",
        "origins": ["https://myapp.com"],
        "methods": ["GET", "POST", "PUT", "DELETE", "HEAD", "OPTIONS"],
        "headers": ["Content-Type", "Authorization"]
      }
    ]
  }'

Per-path rules are checked first. If no rule matches the request path, the global --cors setting applies.
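
As a mental model, that precedence can be sketched in Python. This is a hypothetical matcher, not the engine's code; in particular, breaking ties by longest prefix is an assumption the documentation does not state:

```python
# Hypothetical CORS rule resolution: per-path rules win over the global
# --cors setting; among matching rules, prefer the most specific prefix
# (longest-prefix tie-break is an assumption, not documented behavior).
def allowed_origins(request_path: str, rules: list, global_cors):
    matches = [r for r in rules if request_path.startswith(r["path"])]
    if matches:
        return max(matches, key=lambda r: len(r["path"]))["origins"]
    return global_cors  # may be None: no CORS headers are sent

rules = [
    {"path": "/public/", "origins": ["*"]},
    {"path": "/api/", "origins": ["https://myapp.com"]},
]
assert allowed_origins("/public/logo.png", rules, None) == ["*"]
assert allowed_origins("/api/v1/items", rules, None) == ["https://myapp.com"]
assert allowed_origins("/private/x", rules, None) is None  # falls through to global
```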

Index Configuration

Indexes are configured per-directory by storing a .config/indexes.json file under the directory path. When this file changes, the engine automatically triggers a background reindex of all files in that directory.

{
  "indexes": [
    {"name": "title", "type": "string"},
    {"name": "age", "type": "u64"},
    {"name": "email", "type": ["string", "trigram"]},
    {"name": "created", "type": "timestamp"}
  ]
}

Index Types

| Type | Description | Use Case |
|---|---|---|
| u64 | Unsigned 64-bit integer | Counts, IDs, sizes |
| i64 | Signed 64-bit integer | Temperatures, offsets, balances |
| f64 | 64-bit floating point | Coordinates, measurements, scores |
| string | Exact string match | Categories, statuses, enum values |
| timestamp | UTC millisecond timestamp | Date ranges, temporal queries |
| trigram | Trigram-based fuzzy text | Typo-tolerant search, substring matching |
| phonetic | Phonetic matching (Soundex) | Name search (“Smith” matches “Smyth”) |
| soundex | Soundex encoding | Alternative phonetic matching |
| dmetaphone | Double Metaphone | Multi-cultural phonetic matching |

Multi-Strategy Indexes

A single field can have multiple index types. Specify type as an array:

{"name": "title", "type": ["string", "trigram", "phonetic"]}

This creates three index files (title.string.idx, title.trigram.idx, title.phonetic.idx) from the same source field. Use the appropriate query operator to target each index type.

Source Resolution

By default, the index name is used as the JSON field name. For nested fields or parser output, use the source array:

{
  "parser": "pdf-extractor",
  "indexes": [
    {"name": "title", "source": ["metadata", "title"], "type": "string"},
    {"name": "author", "source": ["metadata", "author"], "type": ["string", "trigram"]},
    {"name": "page_count", "source": ["metadata", "pages"], "type": "u64"}
  ]
}

See Indexing & Queries for the full indexing reference.

Cron Configuration

Schedule recurring background tasks by storing /.config/cron.json:

curl -X PUT http://localhost:3000/engine/.config/cron.json \
  -H "Content-Type: application/json" \
  -d '{
    "schedules": [
      {
        "id": "weekly-gc",
        "task_type": "gc",
        "schedule": "0 3 * * 0",
        "args": {},
        "enabled": true
      },
      {
        "id": "nightly-reindex",
        "task_type": "reindex",
        "schedule": "0 2 * * *",
        "args": {"path": "/data/"},
        "enabled": true
      }
    ]
  }'

The schedule field uses standard 5-field cron syntax: minute hour day_of_month month day_of_week. Cron schedules can also be managed via the HTTP API at /admin/cron.

Compression

AeorDB compresses data with zstd automatically once compression is enabled for a directory. To enable it, add the compression field to the directory's index config:

{
  "compression": "zstd",
  "indexes": [
    {"name": "title", "type": "string"}
  ]
}

Auto-Detection

When compression is enabled, the engine applies heuristics to decide whether to actually compress each file:

  • Files smaller than 500 bytes are stored uncompressed (header overhead negates savings)
  • Already-compressed formats (JPEG, PNG, MP4, ZIP, etc.) are stored uncompressed
  • Text, JSON, XML, and other compressible types are compressed with zstd

Compression is transparent – reads automatically decompress. The content hash is always computed on the raw uncompressed data, so deduplication works regardless of compression settings.
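
The auto-detection heuristics can be sketched in a few lines of Python. The magic-byte table here is illustrative only, not the engine's actual detection list:

```python
# Hypothetical sketch of the compression heuristics described above.
# Magic-byte prefixes for formats that are already compressed:
ALREADY_COMPRESSED = [b"\xff\xd8\xff", b"\x89PNG", b"PK\x03\x04"]  # JPEG, PNG, ZIP

def should_compress(data: bytes) -> bool:
    if len(data) < 500:
        return False  # header overhead negates savings on tiny files
    if any(data.startswith(m) for m in ALREADY_COMPRESSED):
        return False  # recompressing compressed formats gains nothing
    return True       # text, JSON, XML, and similar compressible types

assert should_compress(b"{}") is False                       # too small
assert should_compress(b"\x89PNG" + bytes(5000)) is False    # already compressed
assert should_compress(b'{"k":"v"}' * 100) is True           # compressible JSON
```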

Architecture

AeorDB is a single-file database built on an append-only write-ahead log (WAL). The database file contains all data, indexes, and metadata in one place. Understanding the architecture helps you reason about performance, recovery, and versioning behavior.

High-Level Overview

                         aeordb start
                             |
                     +-------+-------+
                     |  HTTP Server  |
                     |  (axum)       |
                     +-------+-------+
                             |
              +--------------+--------------+
              |              |              |
        +-----+----+  +-----+----+  +------+------+
        | Query    |  | Plugin   |  | Version     |
        | Engine   |  | Manager  |  | Manager     |
        +-----+----+  +-----+----+  +------+------+
              |              |              |
              +--------------+--------------+
                             |
                    +--------+--------+
                    | Storage Engine  |
                    | (StorageEngine) |
                    +--------+--------+
                             |
              +--------------+--------------+
              |              |              |
        +-----+----+  +-----+----+  +------+------+
        | Append   |  | KV Store |  | NVT         |
        | Writer   |  | (.kv)    |  | (in-memory) |
        +----------+  +----------+  +-------------+
              |              |
              +--------------+
                    |
            [  mydb.aeordb  ]    <-- single file on disk
            [ mydb.aeordb.kv ]   <-- KV index file

The Database File (.aeordb)

The .aeordb file is an append-only WAL. Every write appends a new entry to the end of the file. Entries are never modified in place (except during garbage collection).

File Layout

[File Header - 256 bytes]
  Magic: "AEOR"
  Hash algorithm, timestamps, KV/NVT pointers, HEAD hash, entry count

[Entry 1] [Entry 2] [Entry 3] ... [Entry N]
  Chunks, FileRecords, DirectoryIndexes, Snapshots, DeletionRecords, Voids

The 256-byte file header contains pointers to the KV block, NVT, and the current HEAD hash. Every entry carries its own header with magic bytes, type tag, hash algorithm, compression flag, key, and value.

Entry Types

| Type | Purpose |
|---|---|
| Chunk | Raw file data (256KB blocks) |
| FileRecord | File metadata + ordered list of chunk hashes |
| DirectoryIndex | Directory contents (child entries with hashes) |
| Snapshot | Named point-in-time version reference |
| DeletionRecord | Marks a file as deleted (for version history completeness) |
| Void | Free space marker (reclaimable by future writes) |

The KV Index File (.aeordb.kv)

The KV store is a sorted array of (hash, offset) pairs stored in a separate file. It maps content hashes to byte offsets in the main .aeordb file, providing O(1) lookups when combined with the NVT.

Each entry is hash_length + 8 bytes (40 bytes for BLAKE3-256). The entries are sorted by hash, and the NVT tells you which bucket to look in, so lookups are a single seek + small scan.
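
That lookup amounts to a binary search over a sorted slice. A minimal Python sketch, modeling the KV file as an in-memory sorted list of (hash, offset) pairs:

```python
import bisect

# Hypothetical in-memory model of the .aeordb.kv file: a list of
# (hash, offset) pairs kept sorted by hash.
kv = sorted([
    (bytes([0x10] * 32), 4096),
    (bytes([0x80] * 32), 8192),
    (bytes([0xF0] * 32), 12288),
])

def kv_lookup(target_hash: bytes, bucket_start: int, bucket_end: int):
    """Binary-search the bucket's slice of the sorted KV array."""
    hashes = [h for h, _ in kv[bucket_start:bucket_end]]
    i = bisect.bisect_left(hashes, target_hash)
    if i < len(hashes) and hashes[i] == target_hash:
        return kv[bucket_start + i][1]  # byte offset into the .aeordb file
    return None

assert kv_lookup(bytes([0x80] * 32), 0, len(kv)) == 8192
assert kv_lookup(bytes([0x00] * 32), 0, len(kv)) is None
```

The NVT (described next) supplies the bucket_start/bucket_end range, so in practice the search touches only a handful of entries.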

KV Resize

When the KV store needs to grow, the engine enters a brief resize mode:

  1. A temporary buffer KV store is created
  2. New writes go to the buffer (no blocking)
  3. The primary KV store is expanded
  4. Buffer contents are merged into the primary
  5. Buffer is discarded

Writes never block during resize.

NVT (Normalized Vector Table)

The NVT is an in-memory structure that provides fast hash-to-bucket lookups for the KV store.

How It Works

  1. Normalize the hash to a scalar: first_8_bytes_as_u64 / u64::MAX produces a value in [0.0, 1.0]
  2. Map the scalar to a bucket: bucket_index = floor(scalar * num_buckets)
  3. The bucket points to a range in the KV store – scan that range for the exact hash

BLAKE3 hashes are uniformly distributed, so buckets stay balanced without manual tuning. The NVT starts at 1,024 buckets and doubles when the average scan length exceeds a threshold.
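
The normalization steps above can be sketched in Python (treating the first 8 bytes as a big-endian u64 is an assumption; the byte order is not specified here):

```python
# Normalize a hash to [0.0, 1.0] and map it to an NVT bucket.
def nvt_bucket(hash_bytes: bytes, num_buckets: int) -> int:
    scalar = int.from_bytes(hash_bytes[:8], "big") / 0xFFFFFFFFFFFFFFFF
    # Guard against scalar == 1.0 landing one past the last bucket:
    return min(int(scalar * num_buckets), num_buckets - 1)

assert nvt_bucket(bytes(32), 1024) == 0                # all-zero hash -> first bucket
assert nvt_bucket(bytes([0xFF] * 32), 1024) == 1023    # max hash -> last bucket
assert nvt_bucket(bytes([0x80] + [0] * 31), 1024) == 512  # midpoint hash -> middle
```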

Scaling

| Entries | NVT Buckets | NVT Memory | Avg Scan |
|---|---|---|---|
| 10,000 | 1,024 | 16 KB | ~10 |
| 1,000,000 | 65,536 | 1 MB | ~15 |
| 100,000,000 | 1,048,576 | 16 MB | ~95 |

Hot File WAL (Crash Recovery)

The --hot-dir flag specifies a directory for write-ahead hot files. During a write:

  1. The entry is written to a hot file first (fsync’d)
  2. The entry is then written to the main .aeordb file
  3. On success, the hot file entry is cleared

If the process crashes between steps 1 and 2, the hot file is replayed on the next startup to recover uncommitted writes. If --hot-dir is not specified, the hot directory defaults to the same directory as the database file.
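
The journal-then-commit sequence can be sketched in Python. This assumes a hypothetical one-journal-file layout; the real hot-file format is not shown here, and a real engine would also detect duplicate replays (e.g. via content hashes):

```python
import os
import tempfile

def durable_write(main_path: str, hot_dir: str, entry: bytes) -> None:
    hot_path = os.path.join(hot_dir, "pending.hot")
    with open(hot_path, "wb") as hot:      # 1. journal first, fsync'd
        hot.write(entry)
        hot.flush()
        os.fsync(hot.fileno())
    with open(main_path, "ab") as main:    # 2. then append to the main WAL
        main.write(entry)
        main.flush()
        os.fsync(main.fileno())
    os.remove(hot_path)                    # 3. clear the journal on success

def recover(main_path: str, hot_dir: str) -> None:
    """On startup, replay any journal left over from a crash."""
    hot_path = os.path.join(hot_dir, "pending.hot")
    if os.path.exists(hot_path):
        with open(hot_path, "rb") as hot, open(main_path, "ab") as main:
            main.write(hot.read())
        os.remove(hot_path)

with tempfile.TemporaryDirectory() as d:
    main = os.path.join(d, "mydb.aeordb")
    durable_write(main, d, b"entry-1")
    with open(main, "rb") as f:
        assert f.read() == b"entry-1"
    assert not os.path.exists(os.path.join(d, "pending.hot"))  # journal cleared
```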

Snapshot Double-Buffering

AeorDB uses ArcSwap for lock-free concurrent reads. The in-memory directory state is wrapped in an Arc that readers clone cheaply. When a write completes:

  1. The writer builds a new directory state
  2. The new state is swapped in atomically via ArcSwap::store
  3. Readers holding the old Arc continue using it until they finish
  4. The old state is dropped when the last reader releases it

This means:

  • Readers never block writers
  • Writers never block readers
  • Every read sees a consistent point-in-time snapshot
  • No read locks, no write locks on the read path
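
The pattern can be approximated in Python, where atomically replacing a reference to an immutable snapshot plays the role of ArcSwap::store:

```python
import threading

# Sketch of snapshot double-buffering: readers grab the current
# reference without locking; the writer builds a whole new snapshot
# and publishes it in a single reference swap.
class SnapshotCell:
    def __init__(self, state):
        self._state = state            # reference assignment is atomic in CPython
        self._lock = threading.Lock()  # serializes writers only

    def load(self):
        return self._state             # readers: no lock, no copy

    def store(self, new_state):
        with self._lock:
            self._state = new_state    # old snapshot lives until its readers finish

cell = SnapshotCell({"/users/alice.json": "v1"})
old = cell.load()                               # a reader takes a snapshot
cell.store({**old, "/users/alice.json": "v2"})  # writer builds + swaps new state
assert old["/users/alice.json"] == "v1"         # reader still sees consistent v1
assert cell.load()["/users/alice.json"] == "v2" # new readers see v2
```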

B-Tree Directories

Small directories (under 256 entries) are stored as flat lists of child entries. When a directory exceeds 256 entries, the engine automatically converts it to a B-tree structure. This keeps directory lookups O(log n) even for directories with millions of files.

B-tree nodes are themselves stored as content-addressed entries, so they participate in versioning and structural sharing just like any other data.

Directory Propagation

When a file changes, the engine propagates the update up the directory tree:

Write /users/alice.json
  -> update /users/ directory (new child hash for alice.json)
    -> update / root directory (new child hash for users/)
      -> update HEAD (new root hash)

Each directory gets a new content hash because its contents changed. This is how the Merkle tree works – a change at any leaf creates new hashes all the way to the root. The root hash (HEAD) uniquely identifies the complete state of the database.

Next Steps

  • Storage Engine – entry format, hashing, chunking, crash recovery

Storage Engine

The storage engine is an append-only WAL (write-ahead log) where the log IS the database. Every write appends a new entry. The file only grows (until garbage collection reclaims unreachable entries). This design gives you crash recovery, versioning, and integrity verification as structural properties rather than bolted-on features.

Entry Format

Every entry on disk shares the same header format:

[Entry Header - 31 bytes fixed + hash_length variable]
  magic:            u32    (0x0AE012DB - marks the start of a valid entry)
  entry_version:    u8     (format version, starting at 1)
  entry_type:       u8     (Chunk, FileRecord, DirectoryIndex, etc.)
  flags:            u8     (operational flags)
  hash_algo:        u16    (BLAKE3_256 = 0x0001, SHA256 = 0x0002, etc.)
  compression_algo: u8     (None = 0x00, Zstd = 0x01)
  encryption_algo:  u8     (None = 0x00, reserved for future use)
  key_length:       u32    (length of the key field)
  value_length:     u32    (length of the value field)
  timestamp:        i64    (UTC milliseconds since epoch)
  total_length:     u32    (total bytes including header, for jump-scanning)
  hash:             [u8; N] (integrity hash, N determined by hash_algo)

[Key - key_length bytes]
[Value - value_length bytes]

Key properties:

  • magic (0x0AE012DB) enables recovery scanning – find entry boundaries even in a corrupted file by scanning for magic bytes
  • total_length enables jump-scanning – skip to the next entry without reading the full key/value
  • hash covers entry_type + key + value – re-hash and compare to detect corruption
  • entry_version enables format evolution – the engine selects the correct parser based on this byte

For BLAKE3-256 (the default), the hash is 32 bytes, making the full header 63 bytes.
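
The fixed portion of the layout can be checked with a quick struct sketch (little-endian byte order and the sample field values are assumptions for illustration):

```python
import struct

# Pack the 31 fixed header bytes described above:
# u32, u8, u8, u8, u16, u8, u8, u32, u32, i64, u32
HEADER_FMT = "<IBBBHBBIIqI"
fixed = struct.pack(
    HEADER_FMT,
    0x0AE012DB,     # magic
    1,              # entry_version
    1,              # entry_type (e.g. Chunk)
    0,              # flags
    0x0001,         # hash_algo: BLAKE3_256
    0x00,           # compression_algo: None
    0x00,           # encryption_algo: None
    5,              # key_length
    12,             # value_length
    1775968398000,  # timestamp (UTC ms)
    63 + 5 + 12,    # total_length: full header + key + value
)
assert len(fixed) == 31        # the fixed part of the header
assert len(fixed) + 32 == 63   # plus a 32-byte BLAKE3-256 hash = 63 bytes
```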

Content-Addressed Hashing

Every piece of data is identified by its BLAKE3 hash. Hash inputs are prefixed by type (domain separation) to prevent collisions between different entry types:

| Entry Type | Hash Input | Example |
|---|---|---|
| Chunk | chunk: + raw bytes | BLAKE3("chunk:" + file_bytes) |
| FileRecord (path key) | file: + path | BLAKE3("file:/users/alice.json") |
| FileRecord (content key) | filec: + serialized record | BLAKE3("filec:" + record_bytes) |
| DirectoryIndex (path key) | dir: + path | BLAKE3("dir:/users/") |
| DirectoryIndex (content key) | dirc: + serialized data | BLAKE3("dirc:" + dir_bytes) |

The domain prefix ensures that a chunk’s raw data can never produce the same hash as a file path, even if the bytes are identical.
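
A short Python sketch of domain separation, with blake2b standing in for BLAKE3 (which is not in the Python standard library):

```python
from hashlib import blake2b

# Prefix the input with its domain before hashing, as in the table above.
def domain_hash(prefix: str, data: bytes) -> str:
    return blake2b(prefix.encode() + data, digest_size=32).hexdigest()

payload = b"/users/alice.json"
# Identical bytes, different domains -> different hashes:
assert domain_hash("chunk:", payload) != domain_hash("file:", payload)
# Same domain and bytes -> deterministic:
assert domain_hash("file:", payload) == domain_hash("file:", payload)
```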

Chunking

Files are split into 256KB chunks for storage. Each chunk is content-addressed independently:

Original file (700KB):
  [Chunk 1: 256KB] -> hash_a
  [Chunk 2: 256KB] -> hash_b
  [Chunk 3: 188KB] -> hash_c

FileRecord:
  path: "/docs/report.pdf"
  chunk_hashes: [hash_a, hash_b, hash_c]
  total_size: 700KB

Chunking provides:

  • Deduplication: Two files sharing identical 256KB blocks store those blocks only once
  • Efficient updates: Modifying 3 bytes of a 10GB file creates one new chunk, not a new copy of the entire file
  • Streaming reads: Read a file by iterating its chunk hashes and fetching each chunk
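
The chunk-and-hash step can be sketched in Python (blake2b stands in for BLAKE3 again):

```python
from hashlib import blake2b

CHUNK_SIZE = 256 * 1024  # 256KB, as above

def chunk_hashes(data: bytes) -> list:
    """Split into fixed 256KB chunks and content-address each one."""
    return [
        blake2b(b"chunk:" + data[i : i + CHUNK_SIZE], digest_size=32).digest()
        for i in range(0, len(data), CHUNK_SIZE)
    ]

a = bytes(700 * 1024)  # a 700KB file -> chunks of 256KB + 256KB + 188KB
b = bytes(700 * 1024)  # a second file with identical content
assert len(chunk_hashes(a)) == 3
assert chunk_hashes(a) == chunk_hashes(b)  # identical blocks dedupe to one copy
```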

Dual-Key FileRecords

FileRecords are stored at two keys to support both current reads and historical versioning:

  1. Path key (file:/path) – mutable, always points to the latest version. Used for reads, metadata, indexing, and deletion. O(1) lookup.

  2. Content key (filec: + serialized record) – immutable, content-addressed. The directory tree’s ChildEntry.hash points to this key.

When the version manager walks a snapshot’s directory tree, it follows ChildEntry.hash to the content key, which resolves to the FileRecord as it existed at snapshot time – not the current version. This is what makes historical reads correct.

Directories use the same pattern: dir:/path (mutable) and dirc: + data (immutable content key).

FileRecord Format

[FileRecord Value]
  path_length:      u16
  path:             [u8; path_length]     (full file path)
  content_type_len: u16
  content_type:     [u8; content_type_len] (MIME type)
  total_size:       u64                    (file size in bytes)
  created_at:       i64                    (UTC milliseconds)
  updated_at:       i64                    (UTC milliseconds)
  metadata_length:  u32
  metadata:         [u8; metadata_length]  (arbitrary JSON metadata)
  chunk_count:      u32
  chunk_hashes:     [u8; chunk_count * 32] (ordered BLAKE3 hashes)

Metadata fields come first so you can read file metadata without skipping past the chunk list. Chunk hashes are the tail of the record for streaming reads.

Directory Propagation

When a file is stored or deleted, the change propagates up the directory tree:

Store /users/alice.json:

1. Store chunks -> [hash_a, hash_b]
2. Store FileRecord at path key + content key
3. Update /users/ DirectoryIndex (new ChildEntry for alice.json)
4. Update / root DirectoryIndex (new ChildEntry for users/)
5. Update HEAD in file header (new root hash)

Each directory gets a new content hash because one of its children changed. This chain of updates from leaf to root is what maintains the Merkle tree and makes versioning work.
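
The leaf-to-root rehashing can be shown as a toy Merkle computation (blake2b stands in for BLAKE3, and the serialization is illustrative, not the real DirectoryIndex format):

```python
from hashlib import blake2b

def h(data: bytes) -> bytes:
    return blake2b(data, digest_size=32).digest()

def dir_hash(children: dict) -> bytes:
    """Hash sorted (name, child_hash) pairs, like a DirectoryIndex."""
    payload = b"".join(n.encode() + c for n, c in sorted(children.items()))
    return h(b"dirc:" + payload)

# Build the tree bottom-up for the write above:
alice_v1 = h(b"filec:" + b'{"name":"Alice","age":30}')
users_v1 = dir_hash({"alice.json": alice_v1})
root_v1 = dir_hash({"users/": users_v1})

# Changing one leaf changes every hash up to the root:
alice_v2 = h(b"filec:" + b'{"name":"Alice","age":31}')
root_v2 = dir_hash({"users/": dir_hash({"alice.json": alice_v2})})
assert root_v1 != root_v2   # HEAD moves because the root hash changed
```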

Void Management

When garbage collection reclaims an entry, the space becomes a Void – a marker for reclaimable space. Voids are tracked by size using deterministic hash keys:

Key:   BLAKE3("::aeordb:void:262144")
Value: [list of file offsets where 262144-byte voids exist]

When a new entry needs to be written, the engine checks for a void of sufficient size before appending to the end of the file. If a void is larger than needed, it is split: the entry occupies the front, and a smaller void is created for the remainder (if the remainder is at least 63 bytes – the minimum entry header size).
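
The split rule can be sketched in Python:

```python
MIN_ENTRY = 63  # minimum entry header size, per the rule above

def split_void(void_offset: int, void_size: int, needed: int):
    """Place an entry at the front of a void; keep the remainder as a
    smaller void only if it can still hold a minimal entry."""
    assert needed <= void_size
    remainder = void_size - needed
    if remainder >= MIN_ENTRY:
        return (void_offset, needed), (void_offset + needed, remainder)
    return (void_offset, void_size), None  # too small to split: consume it all

entry, leftover = split_void(4096, 1024, 200)
assert leftover == (4296, 824)            # a smaller void survives
entry, leftover = split_void(4096, 250, 200)
assert leftover is None                    # remainder of 50 bytes < 63: no split
```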

Compression

Compression is a post-hash transform:

Write: raw data -> hash -> compress -> store
Read:  load -> decompress -> verify hash -> return

The hash is always computed on the raw uncompressed data. This preserves deduplication (same content = same hash regardless of compression) and integrity verification.

Each entry carries its own compression_algo byte, so compressed and uncompressed entries coexist in the same file. Currently, zstd is the only supported compression algorithm.
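
The hash-before-compress ordering can be sketched in Python (zlib and blake2b stand in for zstd and BLAKE3, neither of which is in the standard library):

```python
import zlib
from hashlib import blake2b

def store(raw: bytes):
    content_hash = blake2b(raw, digest_size=32).digest()  # hash BEFORE compressing
    return content_hash, zlib.compress(raw)

def load(content_hash: bytes, stored: bytes) -> bytes:
    raw = zlib.decompress(stored)
    # Verify integrity against the hash of the raw, uncompressed data:
    assert blake2b(raw, digest_size=32).digest() == content_hash
    return raw

data = b'{"name":"Alice"}' * 100
h, stored = store(data)
assert load(h, stored) == data
assert len(stored) < len(data)  # repetitive JSON compresses well
```

Because the hash is taken over the raw bytes, identical content always deduplicates to the same hash even if one copy was stored compressed and another was not.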

fsync Strategy

Not all entries are equally important for durability:

| Data | fsync | Rationale |
|---|---|---|
| Chunks, FileRecords, DeletionRecords | Immediate | The truth – not rebuildable from other data |
| KV store, NVT, DirectoryIndex, Snapshots | Deferred | Derived data – can be rebuilt from a full entry scan |

This gives durability where it matters and performance where it doesn’t.

Crash Recovery

The recovery hierarchy, from least to most damage:

| What’s Lost | Recovery Method |
|---|---|
| Nothing | Read HEAD from KV store, load directory index, ready |
| KV store only | Entry-by-entry scan, rebuild KV store, load latest directory index |
| Directory index only | Scan FileRecords + DeletionRecords, reconstruct from paths + timestamps |
| KV store + directory | Full entry scan, rebuild KV, reconstruct directory |
| Only chunks + FileRecords survive | Full data recovery, version history reconstructed via DeletionRecords |

The magic bytes at the start of every entry enable boundary detection even in partially corrupted files. The total_length field in each header enables efficient forward scanning.

Next Steps

  • Versioning – snapshots, forks, tree walking, export/import

Versioning

AeorDB’s content-addressed Merkle tree makes versioning a structural property of the storage engine rather than an add-on feature. Every write creates new hashes up the directory tree, so every committed state is already a snapshot by definition – you just need to save a pointer to the root hash.

How Versioning Works

The database state at any point in time is fully described by its root hash (HEAD). HEAD is the hash of the root DirectoryIndex, which contains hashes of its children, which contain hashes of their children, all the way down to the chunks of individual files.

HEAD -> root DirectoryIndex
         |
         +-- /users/ (DirectoryIndex)
         |     +-- alice.json (FileRecord -> [chunk_a, chunk_b])
         |     +-- bob.json   (FileRecord -> [chunk_c])
         |
         +-- /docs/ (DirectoryIndex)
               +-- readme.md  (FileRecord -> [chunk_d])

When you write /users/alice.json, the engine creates:

  1. New chunks (if the content changed)
  2. A new FileRecord with new chunk hashes
  3. A new /users/ DirectoryIndex with the updated child entry
  4. A new root DirectoryIndex with the updated /users/ child entry
  5. HEAD now points to the new root hash

The old root hash still exists and still points to the old directory tree with the old file content. Nothing was overwritten – new entries were appended.

Snapshots

A snapshot is a named reference to a root hash. Creating a snapshot saves the current HEAD so you can return to it later.

Create a Snapshot

curl -X POST http://localhost:3000/version/snapshot \
  -H "Content-Type: application/json" \
  -d '{"name": "v1.0"}'

List Snapshots

curl http://localhost:3000/version/snapshots

Response:

{
  "snapshots": [
    {"name": "v1.0", "root_hash": "a1b2c3...", "created_at": 1775968398000},
    {"name": "v2.0", "root_hash": "d4e5f6...", "created_at": 1775968500000}
  ]
}

Restore a Snapshot

Restoring a snapshot sets HEAD back to the snapshot’s root hash. The current state is not lost – you can snapshot it before restoring if you want to preserve it.

curl -X POST http://localhost:3000/version/restore \
  -H "Content-Type: application/json" \
  -d '{"name": "v1.0"}'

Delete a Snapshot

curl -X DELETE http://localhost:3000/version/snapshot/v1.0

After deleting a snapshot, entries that were only reachable through that snapshot become eligible for garbage collection.

Forks

Forks are isolated branches of the database. Writes to a fork do not affect HEAD or other forks. This is useful for testing changes, running experiments, or staging updates before promoting them.

Create a Fork

curl -X POST http://localhost:3000/version/fork \
  -H "Content-Type: application/json" \
  -d '{"name": "experiment"}'

List Forks

curl http://localhost:3000/version/forks

Promote a Fork

When you’re satisfied with the changes in a fork, promote it to HEAD:

curl -X POST http://localhost:3000/version/fork/experiment/promote

Abandon a Fork

curl -X DELETE http://localhost:3000/version/fork/experiment

Tree Walking

The content-addressed Merkle tree enables historical reads. When you walk a snapshot’s directory tree:

  1. Start from the snapshot’s root hash
  2. Load the root DirectoryIndex – each child entry has a hash
  3. Follow child hashes to subdirectories or files
  4. Each FileRecord’s ChildEntry.hash points to a content-addressed (immutable) key

Because file content keys are immutable, walking a snapshot’s tree always resolves to the data as it existed when the snapshot was taken, even if the files have been overwritten or deleted since then.

Snapshot "v1.0" root_hash: aaa111...
  -> /users/ dir_hash: bbb222...
     -> alice.json content_key: ccc333...  (resolves to Alice's v1.0 data)

Current HEAD root_hash: ddd444...
  -> /users/ dir_hash: eee555...
     -> alice.json content_key: fff666...  (resolves to Alice's current data)

Both trees can coexist because they share unchanged chunks and directories (structural sharing). Only the parts that differ consume additional storage.
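
As a back-of-the-envelope illustration, structural sharing is just set overlap between the content keys reachable from each root (keys invented):

```python
# Content keys reachable from each root (illustrative hashes).
v1_keys = {"c1", "c2", "c3"}        # reachable from snapshot root aaa111...
v2_keys = {"c1", "c2", "c4"}        # reachable from HEAD root ddd444...

shared = v1_keys & v2_keys          # stored once, referenced by both trees
extra = v2_keys - v1_keys           # the only new storage the second tree adds
print(sorted(shared), sorted(extra))  # ['c1', 'c2'] ['c4']
```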

Export and Import

AeorDB can export a version as a self-contained .aeordb file and import it into another database.

Export

Export creates a clean, compacted database from a single version – no history, no voids, no deleted entries:

# Export current HEAD
aeordb export --database data.aeordb --output backup.aeordb

# Export a specific snapshot
aeordb export --database data.aeordb --snapshot v1.0 --output v1.aeordb

# Export via HTTP
curl -X POST http://localhost:3000/admin/export --output backup.aeordb
curl -X POST "http://localhost:3000/admin/export?snapshot=v1.0" --output v1.aeordb

The exported file is a fully functional database that can be opened with aeordb start.

Import

Import applies an export or patch file to a target database:

# Import without promoting HEAD (inspect first)
aeordb import --database target.aeordb --file backup.aeordb

# Import and promote in one step
aeordb import --database target.aeordb --file backup.aeordb --promote

# Import via HTTP
curl -X POST http://localhost:3000/admin/import \
  -H "Content-Type: application/octet-stream" \
  --data-binary @backup.aeordb

Import does NOT automatically change HEAD. The imported version exists in the database and can be promoted explicitly when ready.

Diff and Patch

Diff extracts only the changeset between two versions. The output is a patch file – not a standalone database.

# Diff between two snapshots
aeordb diff --database data.aeordb --from v1.0 --to v2.0 --output patch.aeordb

# Diff from a snapshot to current HEAD
aeordb diff --database data.aeordb --from v1.0 --output patch.aeordb

# Diff via HTTP
curl -X POST "http://localhost:3000/admin/diff?from=v1.0&to=v2.0" --output patch.aeordb

A patch file contains only new/changed chunks, updated FileRecords, and deletion markers. Chunks shared between the two versions are not included, making patches much smaller than full exports.

Applying a Patch

# Apply patch (strict base version check)
aeordb import --database target.aeordb --file patch.aeordb

# Skip base version check
aeordb import --database target.aeordb --file patch.aeordb --force

# Apply and promote
aeordb import --database target.aeordb --file patch.aeordb --promote

If the target database’s HEAD does not match the patch’s base version, the import fails unless --force is used.

Promote

Promote sets HEAD to a specific version hash. This is separate from import so you can inspect the imported data before committing to it:

# CLI
aeordb promote --database data.aeordb --hash f6e5d4c3...

# HTTP
curl -X POST "http://localhost:3000/admin/promote" \
  -H "Content-Type: application/json" \
  -d '{"hash": "f6e5d4c3..."}'

Next Steps

  • Storage Engine – how the Merkle tree and content addressing work at the byte level
  • Architecture – system overview and crash recovery

Indexing & Queries

AeorDB indexes are opt-in and configured per-directory. Nothing is indexed by default – you control exactly which fields are indexed, with which strategies, and for which file types. This keeps the engine lean and predictable.

Index Configuration

Create a .config/indexes.json file in any directory to define indexes for files in that directory:

curl -X PUT http://localhost:3000/engine/users/.config/indexes.json \
  -H "Content-Type: application/json" \
  -d '{
    "indexes": [
      {"name": "name", "type": ["string", "trigram"]},
      {"name": "age", "type": "u64"},
      {"name": "city", "type": "string"},
      {"name": "email", "type": "trigram"},
      {"name": "created_at", "type": "timestamp"}
    ]
  }'

When this file is created or updated, the engine automatically triggers a background reindex of all existing files in the directory.

Index Types

Type        Order-Preserving  Description
u64         Yes               Unsigned 64-bit integer. Range-tracking with observed min/max.
i64         Yes               Signed 64-bit integer. Shifted to [0.0, 1.0] for NVT storage.
f64         Yes               64-bit floating point. Clamping for NaN/Inf handling.
string      Partially         Exact string matching. Multi-stage scalar: first byte weighted + length.
timestamp   Yes               UTC millisecond timestamps. Range-tracking.
trigram     No                Trigram-based fuzzy text matching. Tolerates typos, supports substring search.
phonetic    No                General phonetic matching (Soundex algorithm).
soundex     No                Soundex encoding for English names.
dmetaphone  No                Double Metaphone for multi-cultural phonetic matching.

Multi-Strategy Indexes

A single field can be indexed with multiple strategies by passing type as an array:

{"name": "title", "type": ["string", "trigram", "phonetic"]}

This creates three separate index files for the same field:

  • title.string.idx – exact match queries
  • title.trigram.idx – fuzzy/substring queries
  • title.phonetic.idx – phonetic queries

Use the appropriate query operator to target the desired index.
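
For intuition on the trigram strategy, here is a sketch of trigram extraction and set overlap (the exact normalization and scoring used by the engine are assumptions here):

```python
def trigrams(s):
    # Pad so prefixes and suffixes also produce trigrams (assumed normalization).
    s = f"  {s.lower()} "
    return {s[i:i + 3] for i in range(len(s) - 2)}

q, doc = trigrams("alice"), trigrams("alcie")      # typo: transposed i/c
print(round(len(q & doc) / len(q | doc), 2))       # Jaccard overlap: 0.2
```

Even with the typo, the two strings share trigrams, which is what lets fuzzy queries tolerate misspellings.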

How Indexes Work

AeorDB uses a Normalized Vector Table (NVT) for index lookups. Each indexed field gets its own NVT.

The NVT Approach

  1. A ScalarConverter maps each field value to a scalar in [0.0, 1.0]
  2. The scalar maps to a bucket in the NVT
  3. The bucket points to the matching entries

For numeric types (u64, i64, f64, timestamp), the converter tracks the observed min/max and distributes values uniformly across the [0.0, 1.0] range. This means range queries (gt, lt, between) are efficient – they resolve to a contiguous range of buckets.

For a query like WHERE age > 30:

  1. converter.to_scalar(30) computes where 30 falls in the bucket range
  2. All buckets after that point are candidates
  3. Only those buckets are scanned

This is O(1) for the bucket lookup, with a small linear scan within the bucket.
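
A minimal sketch of this lookup path, with an assumed min/max-normalizing converter and an invented bucket count:

```python
# Sketch of an NVT bucket lookup for `age > 30` (values and sizes invented).
class ScalarConverter:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def to_scalar(self, v):
        # Map the observed [lo, hi] range uniformly onto [0.0, 1.0].
        return (v - self.lo) / (self.hi - self.lo)

N_BUCKETS = 16
buckets = [[] for _ in range(N_BUCKETS)]
conv = ScalarConverter(18, 80)   # observed min/max of the `age` field

def bucket_of(value):
    s = conv.to_scalar(value)
    return min(int(s * N_BUCKETS), N_BUCKETS - 1)

for path, age in [("/users/a.json", 25), ("/users/b.json", 42), ("/users/c.json", 67)]:
    buckets[bucket_of(age)].append((path, age))

# age > 30: every bucket at or after bucket_of(30) is a candidate;
# a small scan inside the boundary bucket filters exact matches.
candidates = [e for b in buckets[bucket_of(30):] for e in b if e[1] > 30]
print(candidates)  # [('/users/b.json', 42), ('/users/c.json', 67)]
```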

Two-Tier Execution

Simple queries (single field, direct comparison) use direct scalar lookups – no bitmaps, no compositing. Most queries fall into this tier.

Complex queries (OR, NOT, multi-field boolean logic) build NVT bitmaps and composite them:

  • Each field condition produces a bitmask over the NVT buckets
  • AND = bitwise AND of masks
  • OR = bitwise OR of masks
  • NOT = bitwise NOT of a mask
  • The final mask identifies which buckets contain results

Memory usage is bounded: a bitmask for 1M buckets is only 128KB, regardless of how many entries exist.
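
The compositing itself is plain bitwise arithmetic; a toy 8-bucket illustration (mask values invented):

```python
# Composite per-field bucket masks with bitwise ops, using Python ints
# as fixed-width bitmasks over 8 buckets.
N = 8
age_gt_30   = 0b00111100   # buckets that may hold age > 30
city_is_pdx = 0b00010110   # buckets that may hold city == "Portland"
banned      = 0b00000010   # buckets that may hold role == "banned"

full = (1 << N) - 1
mask = age_gt_30 & city_is_pdx & (full & ~banned)   # AND / AND / NOT
print(f"{mask:08b}")  # 00010100 -> only these buckets need scanning
```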

Source Resolution

By default, the name field in an index definition is used as the JSON key to extract the value:

{"name": "age", "type": "u64"}

This extracts the age field from {"name": "Alice", "age": 30}.

Nested Fields

For nested JSON or parser output, use the source array to specify the path:

{"name": "author", "source": ["metadata", "author"], "type": "string"}

This extracts metadata.author from a JSON structure like:

{"metadata": {"author": "Jane Smith", "title": "Report"}}

The source array supports:

  • String segments for object key lookup: ["metadata", "author"]
  • Integer segments for array index access: ["items", 0, "name"]
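
A sketch of the assumed walk semantics:

```python
# Minimal `source` path resolution: string segments index objects,
# integer segments index arrays.
def resolve(doc, source):
    cur = doc
    for seg in source:
        if isinstance(seg, int):
            cur = cur[seg]        # array index access
        else:
            cur = cur.get(seg)    # object key lookup
        if cur is None:
            return None           # missing path -> field is not indexed
    return cur

doc = {"metadata": {"author": "Jane Smith"}, "items": [{"name": "first"}]}
print(resolve(doc, ["metadata", "author"]))  # Jane Smith
print(resolve(doc, ["items", 0, "name"]))    # first
```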

Plugin Mapper

For complex extraction logic, delegate to a WASM plugin:

{
  "name": "summary",
  "source": {"plugin": "my-mapper", "args": {"mode": "summary", "max_length": 500}},
  "type": "trigram"
}

The plugin receives the parsed JSON and the args object, and returns the extracted field value.

WASM Parser Integration

For non-JSON files (PDFs, images, XML, etc.), a parser plugin converts raw bytes into a JSON object that the indexing pipeline can work with.

Configuration

{
  "parser": "pdf-extractor",
  "parser_memory_limit": "256mb",
  "indexes": [
    {"name": "title", "source": ["metadata", "title"], "type": ["string", "trigram"]},
    {"name": "author", "source": ["metadata", "author"], "type": "phonetic"},
    {"name": "content", "source": ["text"], "type": "trigram"},
    {"name": "page_count", "source": ["metadata", "page_count"], "type": "u64"}
  ]
}

The parser receives a JSON envelope with the file data (base64-encoded) and metadata:

{
  "data": "<base64-encoded file bytes>",
  "meta": {
    "filename": "report.pdf",
    "path": "/docs/reports/report.pdf",
    "content_type": "application/pdf",
    "size": 1048576
  }
}

The parser returns a JSON object (like {"text": "...", "metadata": {"title": "...", ...}}), and the source paths in each index definition walk this JSON to extract field values.
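
A hypothetical plain-text parser makes the contract concrete (a real plugin is compiled to WASM and would parse actual PDF bytes):

```python
import base64
import json

def parse(envelope_json):
    # Receive the envelope, decode the file bytes, return a JSON object
    # for the indexing pipeline to walk with `source` paths.
    env = json.loads(envelope_json)
    raw = base64.b64decode(env["data"])
    return {
        "text": raw.decode("utf-8", errors="replace"),
        "metadata": {"filename": env["meta"]["filename"],
                     "size": env["meta"]["size"]},
    }

envelope = json.dumps({
    "data": base64.b64encode(b"Quarterly report body").decode(),
    "meta": {"filename": "report.pdf", "path": "/docs/report.pdf",
             "content_type": "application/pdf", "size": 21},
})
out = parse(envelope)
print(out["text"])  # Quarterly report body
```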

Global Parser Registry

You can also register parsers globally by content type at /.config/parsers.json:

{
  "application/pdf": "pdf-extractor",
  "image/jpeg": "image-metadata",
  "image/png": "image-metadata"
}

When a file is stored and no parser is configured in the directory’s index config, the engine checks this registry using the file’s content type.
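
The assumed lookup order, with a made-up pick_parser helper:

```python
# Global content-type registry (mirrors /.config/parsers.json above).
REGISTRY = {"application/pdf": "pdf-extractor", "image/jpeg": "image-metadata"}

def pick_parser(dir_config, content_type):
    # Directory index config wins; the global registry is the fallback.
    if dir_config.get("parser"):
        return dir_config["parser"]
    return REGISTRY.get(content_type)

print(pick_parser({}, "application/pdf"))                      # pdf-extractor
print(pick_parser({"parser": "my-parser"}, "application/pdf")) # my-parser
```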

Failure Handling

Parser and indexing failures never prevent file storage. The file is always stored regardless of parse/index errors. If logging is enabled in the index config ("logging": true), errors are written to .logs/ under the directory.

Automatic Reindexing

When you store or update a .config/indexes.json file, the engine automatically enqueues a background reindex task for that directory. The task:

  1. Reads the current index config
  2. Lists all files in the directory
  3. Re-runs the indexing pipeline for each file (in batches of 50, yielding between batches)
  4. Reports progress via GET /admin/tasks

During reindexing, queries still work but may return incomplete results. The query response includes a meta.reindexing field with the current progress:

{
  "results": [...],
  "meta": {
    "reindexing": 0.67,
    "reindexing_eta": 1775968398803,
    "reindexing_indexed": 6700,
    "reindexing_total": 10000
  }
}

Query API

Queries are submitted as POST /query with a JSON body:

{
  "path": "/users/",
  "where": {
    "and": [
      {"field": "age", "op": "gt", "value": 30},
      {"field": "city", "op": "eq", "value": "Portland"},
      {"not": {"field": "role", "op": "eq", "value": "banned"}}
    ]
  },
  "sort": {"field": "age", "order": "desc"},
  "limit": 50,
  "offset": 0
}

Boolean Logic

The where clause supports full boolean logic:

{
  "where": {
    "or": [
      {"field": "city", "op": "eq", "value": "Portland"},
      {
        "and": [
          {"field": "age", "op": "gt", "value": 25},
          {"field": "city", "op": "eq", "value": "Seattle"}
        ]
      }
    ]
  }
}

For backward compatibility, a flat array in where is treated as an implicit and:

{
  "where": [
    {"field": "age", "op": "gt", "value": 30},
    {"field": "city", "op": "eq", "value": "Portland"}
  ]
}

Query Operators

Operator  Description            Value Type
eq        Equals                 any
gt        Greater than           numeric, timestamp
gte       Greater than or equal  numeric, timestamp
lt        Less than              numeric, timestamp
lte       Less than or equal     numeric, timestamp
between   Inclusive range        [min, max]
fuzzy     Trigram fuzzy match    string (requires trigram index)
phonetic  Phonetic match         string (requires phonetic/soundex/dmetaphone index)

Files & Directories

AeorDB exposes a content-addressable filesystem through its engine routes. Every path under /engine/ represents either a file or a directory.

Endpoint Summary

Method  Path                    Description                       Auth  Status Codes
PUT     /engine/{path}          Store a file                      Yes   201, 400, 404, 409, 500
GET     /engine/{path}          Read a file or list a directory   Yes   200, 404, 500
DELETE  /engine/{path}          Delete a file                     Yes   200, 404, 500
HEAD    /engine/{path}          Check existence and get metadata  Yes   200, 404, 500
POST    /engine-symlink/{path}  Create or update a symlink        Yes   201, 400, 500

PUT /engine/

Store a file at the given path. Parent directories are created automatically. If a file already exists at the path, it is overwritten (creating a new version).

Body limit: 10 GB

Request

  • Headers:
    • Authorization: Bearer <token> (required)
    • Content-Type (optional) – auto-detected from magic bytes if omitted
  • Body: raw file bytes

Response

Status: 201 Created

{
  "path": "/data/report.pdf",
  "content_type": "application/pdf",
  "total_size": 245678,
  "created_at": 1775968398000,
  "updated_at": 1775968398000
}

Side Effects

  • If the path matches /.config/indexes.json (or a nested variant like /data/.config/indexes.json), a reindex task is automatically enqueued for the parent directory. Any existing pending or running reindex for that path is cancelled first.
  • Triggers entries_created events on the event bus.
  • Runs any deployed store-phase plugins.

Example

curl -X PUT http://localhost:3000/engine/data/report.pdf \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/pdf" \
  --data-binary @report.pdf

Error Responses

Status  Condition
400     Invalid input (e.g., empty path)
404     Parent path references a non-existent entity
409     Path conflict (e.g., file exists where directory expected)
500     Internal storage failure

GET /engine/

Read a file or list a directory. The server determines the type automatically:

  • If the path resolves to a file, the file content is streamed with appropriate headers.
  • If the path resolves to a directory, a JSON array of children is returned.

Request

  • Headers:
    • Authorization: Bearer <token> (required)

Query Parameters

Param     Type     Default  Description
snapshot  string   -        Read the file as it was at this named snapshot
version   string   -        Read the file at this version hash (hex)
nofollow  boolean  false    If the path is a symlink, return metadata instead of following
depth     integer  0        Directory listing depth: 0 = immediate children, -1 = unlimited recursion
glob      string   -        Filter directory listing by file name glob pattern (*, ?, [abc])

File Response

Status: 200 OK

Headers:

Header        Description
X-Path        Canonical path of the file
X-Total-Size  File size in bytes
X-Created-At  Unix timestamp (milliseconds)
X-Updated-At  Unix timestamp (milliseconds)
Content-Type  MIME type (if known)

Body: raw file bytes (streamed)

Directory Response

Status: 200 OK

Each entry includes path, hash, and numeric entry_type fields. Symlink entries also include a target field.

Entry types: 2 = file, 3 = directory, 8 = symlink.

[
  {
    "path": "/data/report.pdf",
    "name": "report.pdf",
    "entry_type": 2,
    "hash": "a3f8c1...",
    "total_size": 245678,
    "created_at": 1775968398000,
    "updated_at": 1775968398000,
    "content_type": "application/pdf"
  },
  {
    "path": "/data/images",
    "name": "images",
    "entry_type": 3,
    "hash": "b2c4d5...",
    "total_size": 0,
    "created_at": 1775968000000,
    "updated_at": 1775968000000,
    "content_type": null
  },
  {
    "path": "/data/latest",
    "name": "latest",
    "entry_type": 8,
    "hash": "c3d5e6...",
    "target": "/data/report.pdf",
    "total_size": 0,
    "created_at": 1775968500000,
    "updated_at": 1775968500000,
    "content_type": null
  }
]

Examples

Read a file:

curl http://localhost:3000/engine/data/report.pdf \
  -H "Authorization: Bearer $TOKEN" \
  -o report.pdf

List a directory:

curl http://localhost:3000/engine/data/ \
  -H "Authorization: Bearer $TOKEN"

Recursive Directory Listing

Use the depth and glob query parameters to list files recursively:

# List all files recursively
curl http://localhost:3000/engine/data/?depth=-1 \
  -H "Authorization: Bearer $TOKEN"

# List only .psd files anywhere under /assets/
curl "http://localhost:3000/engine/assets/?depth=-1&glob=*.psd" \
  -H "Authorization: Bearer $TOKEN"

# List one level deep
curl http://localhost:3000/engine/data/?depth=1 \
  -H "Authorization: Bearer $TOKEN"

When depth > 0 or depth = -1, the response contains files only in a flat list. Directory entries are traversed but not included in the output.

Versioned Reads

Read a file as it was at a specific snapshot or version:

# Read file at a named snapshot
curl "http://localhost:3000/engine/data/report.pdf?snapshot=v1.0" \
  -H "Authorization: Bearer $TOKEN"

# Read file at a specific version hash
curl "http://localhost:3000/engine/data/report.pdf?version=a1b2c3..." \
  -H "Authorization: Bearer $TOKEN"

If both snapshot and version are provided, snapshot takes precedence. Returns 404 if the file did not exist at that version.

Error Responses

Status  Condition
404     Path does not exist as file or directory
500     Internal read failure

DELETE /engine/

Delete a file at the given path. Creates a DeletionRecord and removes the file from its parent directory listing. Directories cannot be deleted directly – delete all files within first.

Request

  • Headers:
    • Authorization: Bearer <token> (required)

Response

Status: 200 OK

{
  "deleted": true,
  "path": "/data/report.pdf"
}

Side Effects

  • Triggers entries_deleted events on the event bus.
  • Updates index entries for the deleted file.

Example

curl -X DELETE http://localhost:3000/engine/data/report.pdf \
  -H "Authorization: Bearer $TOKEN"

Error Responses

Status  Condition
404     File not found
500     Internal deletion failure

Symlinks

AeorDB supports soft symlinks — entries that point to another path. Symlinks are transparent by default: reading a symlink path returns the target’s content.

POST /engine-symlink/

Create or update a symlink.

Request Body:

{
  "target": "/assets/logo.psd"
}

Response: 201 Created

{
  "path": "/latest-logo",
  "target": "/assets/logo.psd",
  "entry_type": 8,
  "created_at": 1775968398000,
  "updated_at": 1775968398000
}

The target path does not need to exist at creation time (dangling symlinks are allowed).

By default, GET /engine/{path} follows symlinks transparently:

# Returns the content of /assets/logo.psd
curl http://localhost:3000/engine/latest-logo \
  -H "Authorization: Bearer $TOKEN"

To inspect the symlink itself without following it, use ?nofollow=true:

curl "http://localhost:3000/engine/latest-logo?nofollow=true" \
  -H "Authorization: Bearer $TOKEN"

Returns the symlink metadata as JSON instead of the target’s content.

Symlinks can point to other symlinks — chains are followed recursively. AeorDB detects cycles and enforces a maximum resolution depth of 32 hops.

Scenario                   Result
Symlink → file             Returns file content
Symlink → directory        Returns directory listing
Symlink → symlink → file   Follows chain, returns file content
Symlink → nonexistent      404 (dangling symlink)
Symlink cycle (A → B → A)  400 with cycle detection message
Chain exceeds 32 hops      400 with depth exceeded message
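
The rules above reduce to a short resolution loop; a sketch (entry_type 8 marks a symlink, per the listing format; the helper itself is invented):

```python
MAX_HOPS = 32

def resolve(entries, path):
    # Follow symlink chains, detecting cycles and capping depth at 32 hops.
    seen, hops = set(), 0
    while path in entries and entries[path].get("entry_type") == 8:
        if path in seen:
            raise ValueError("400: symlink cycle detected")
        if hops >= MAX_HOPS:
            raise ValueError("400: symlink depth exceeded")
        seen.add(path)
        path = entries[path]["target"]
        hops += 1
    if path not in entries:
        raise KeyError("404: dangling symlink or missing path")
    return path

entries = {
    "/latest": {"entry_type": 8, "target": "/v2"},
    "/v2": {"entry_type": 8, "target": "/assets/logo.psd"},
    "/assets/logo.psd": {"entry_type": 2},
}
print(resolve(entries, "/latest"))  # /assets/logo.psd
```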

HEAD /engine/{path} returns symlink metadata as headers:

X-Entry-Type: symlink
X-Symlink-Target: /assets/logo.psd
X-Path: /latest-logo
X-Created-At: 1775968398000
X-Updated-At: 1775968398000

DELETE /engine/{path} on a symlink deletes the symlink itself, not the target:

curl -X DELETE http://localhost:3000/engine/latest-logo \
  -H "Authorization: Bearer $TOKEN"
{
  "deleted": true,
  "path": "/latest-logo",
  "type": "symlink"
}

Symlinks appear in directory listings with entry_type: 8 and a target field:

{
  "path": "/data/latest",
  "name": "latest",
  "entry_type": 8,
  "hash": "c3d5e6...",
  "target": "/data/report.pdf",
  "total_size": 0,
  "created_at": 1775968500000,
  "updated_at": 1775968500000,
  "content_type": null
}

Symlinks are versioned like files. Snapshots capture the symlink’s target path at that point in time. Restoring a snapshot restores the link, not the resolved content.


HEAD /engine/

Check whether a path exists and retrieve its metadata as response headers, without downloading the body. Works for both files and directories.

Request

  • Headers:
    • Authorization: Bearer <token> (required)

Response

Status: 200 OK (empty body)

Headers:

Header            Value
X-Entry-Type      file, directory, or symlink
X-Path            Canonical path
X-Total-Size      File size in bytes (files only)
X-Created-At      Unix timestamp in milliseconds (files only)
X-Updated-At      Unix timestamp in milliseconds (files only)
Content-Type      MIME type (files only, if known)
X-Symlink-Target  Target path (symlinks only)

Example

curl -I http://localhost:3000/engine/data/report.pdf \
  -H "Authorization: Bearer $TOKEN"
HTTP/1.1 200 OK
X-Entry-Type: file
X-Path: /data/report.pdf
X-Total-Size: 245678
X-Created-At: 1775968398000
X-Updated-At: 1775968398000
Content-Type: application/pdf

Error Responses

Status  Condition
404     Path does not exist
500     Internal metadata lookup failure

Query API

The query engine supports indexed field queries with boolean combinators, pagination, sorting, aggregations, projections, and an explain mode.

Endpoint Summary

Method  Path    Description      Auth  Status Codes
POST    /query  Execute a query  Yes   200, 400, 404, 500

POST /query

Execute a query against indexed fields within a directory path.

Request Body

{
  "path": "/users",
  "where": {
    "field": "age",
    "op": "gt",
    "value": 21
  },
  "limit": 20,
  "offset": 0,
  "order_by": [{"field": "name", "direction": "asc"}],
  "after": null,
  "before": null,
  "include_total": true,
  "select": ["@path", "@score", "name"],
  "explain": false
}

Field          Type            Required  Description
path           string          Yes       Directory path to query within
where          object/array    Yes       Query filter (see below)
limit          integer         No        Max results to return (server default applies if omitted)
offset         integer         No        Skip this many results
order_by       array           No        Sort fields with direction
after          string          No        Cursor for forward pagination
before         string          No        Cursor for backward pagination
include_total  boolean         No        Include total_count in response (default: false)
select         array           No        Project specific fields in results
aggregate      object          No        Run aggregations instead of returning results
explain        string/boolean  No        "plan", "analyze", or true for query plan

Query Operators

Each field query is an object with field, op, and value:

{"field": "age", "op": "gt", "value": 21}

Comparison Operators

Operator  Description               Value Type     Example
eq        Exact match               any            {"field": "status", "op": "eq", "value": "active"}
gt        Greater than              number/string  {"field": "age", "op": "gt", "value": 21}
lt        Less than                 number/string  {"field": "age", "op": "lt", "value": 65}
between   Inclusive range           number/string  {"field": "age", "op": "between", "value": 21, "value2": 65}
in        Match any value in a set  array          {"field": "status", "op": "in", "value": ["active", "pending"]}

Text Search Operators

These operators require the appropriate index type to be configured.

Operator  Description                         Index Required      Example
contains  Substring match                     trigram             {"field": "name", "op": "contains", "value": "alice"}
similar   Fuzzy trigram match with threshold  trigram             {"field": "name", "op": "similar", "value": "alice", "threshold": 0.3}
phonetic  Sounds-like match                   phonetic            {"field": "name", "op": "phonetic", "value": "smith"}
fuzzy     Configurable fuzzy match            trigram             See below
match     Multi-strategy combined match       trigram + phonetic  {"field": "name", "op": "match", "value": "alice"}

Fuzzy Operator Options

The fuzzy operator supports additional parameters:

{
  "field": "name",
  "op": "fuzzy",
  "value": "alice",
  "fuzziness": "auto",
  "algorithm": "damerau_levenshtein"
}

Parameter  Values                                 Default
fuzziness  "auto" or integer (edit distance)      "auto"
algorithm  "damerau_levenshtein", "jaro_winkler"  "damerau_levenshtein"
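
For intuition, fuzzy matching is an edit-distance check. Below is a sketch of the OSA variant of Damerau-Levenshtein, plus an "auto" fuzziness rule that is an assumption modeled on common search engines, not necessarily AeorDB's exact thresholds:

```python
def osa_distance(a, b):
    # Optimal string alignment: insertions, deletions, substitutions,
    # and adjacent transpositions each cost 1.
    d = [[max(i, j) if min(i, j) == 0 else 0 for j in range(len(b) + 1)]
         for i in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]

def auto_fuzziness(term):
    # Assumed "auto" rule: short terms match exactly, longer terms
    # tolerate 1-2 edits.
    return 0 if len(term) <= 2 else 1 if len(term) <= 5 else 2

print(osa_distance("alice", "alcie"))  # 1 (one transposition)
print(auto_fuzziness("alice"))         # 1
```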

Similar Operator Options

{
  "field": "name",
  "op": "similar",
  "value": "alice",
  "threshold": 0.3
}

Parameter  Type   Default  Description
threshold  float  0.3      Minimum similarity score (0.0 to 1.0)

Boolean Combinators

Combine multiple conditions using and, or, and not:

AND

All conditions must match:

{
  "where": {
    "and": [
      {"field": "age", "op": "gt", "value": 21},
      {"field": "status", "op": "eq", "value": "active"}
    ]
  }
}

OR

At least one condition must match:

{
  "where": {
    "or": [
      {"field": "status", "op": "eq", "value": "active"},
      {"field": "status", "op": "eq", "value": "pending"}
    ]
  }
}

NOT

Invert a condition:

{
  "where": {
    "not": {"field": "status", "op": "eq", "value": "deleted"}
  }
}

Nested Boolean Logic

Combinators can be nested arbitrarily:

{
  "where": {
    "and": [
      {"field": "age", "op": "gt", "value": 21},
      {
        "or": [
          {"field": "role", "op": "eq", "value": "admin"},
          {"field": "role", "op": "eq", "value": "moderator"}
        ]
      }
    ]
  }
}

Legacy Array Format

An array at the top level is sugar for AND:

{
  "where": [
    {"field": "age", "op": "gt", "value": 21},
    {"field": "status", "op": "eq", "value": "active"}
  ]
}

Response Format

Standard Query Response

{
  "results": [
    {
      "path": "/users/alice.json",
      "total_size": 256,
      "content_type": "application/json",
      "created_at": 1775968398000,
      "updated_at": 1775968398000,
      "score": 1.0,
      "matched_by": ["age"]
    }
  ],
  "has_more": true,
  "total_count": 150,
  "next_cursor": "eyJwYXRoIjoiL3VzZXJzL2JvYi5qc29uIn0=",
  "prev_cursor": "eyJwYXRoIjoiL3VzZXJzL2Fhcm9uLmpzb24ifQ==",
  "meta": {
    "reindexing": 0.67,
    "reindexing_eta": 1775968398803,
    "reindexing_indexed": 670,
    "reindexing_total": 1000,
    "reindexing_stale_since": 1775968300000
  }
}

Field              Type     Description
results            array    Matching file metadata with scores
has_more           boolean  Whether more results exist beyond the current page
total_count        integer  Total matching results (only if include_total: true)
next_cursor        string   Cursor for the next page (if has_more is true)
prev_cursor        string   Cursor for the previous page
default_limit_hit  boolean  Present and true when the server’s default limit was applied
default_limit      integer  The server’s default limit value (present with default_limit_hit)
meta               object   Reindex progress metadata (present only during active reindex)

Result Fields

Each result object contains:

Field         Type     Description
path          string   Full path to the matched file
total_size    integer  File size in bytes
content_type  string   MIME type (nullable)
created_at    integer  Creation timestamp (ms)
updated_at    integer  Last update timestamp (ms)
score         float    Relevance score (1.0 = exact match)
matched_by    array    List of field names that matched

Sorting

Sort results by one or more fields:

{
  "order_by": [
    {"field": "name", "direction": "asc"},
    {"field": "created_at", "direction": "desc"}
  ]
}

Direction  Description
asc        Ascending (default)
desc       Descending

Pagination

Offset-Based

{
  "limit": 20,
  "offset": 40
}

Cursor-Based

Use after or before with cursor values from a previous response:

{
  "limit": 20,
  "after": "eyJwYXRoIjoiL3VzZXJzL2JvYi5qc29uIn0="
}
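
A complete pagination loop repeats until has_more is false; here fetch_page is a stub standing in for POST /query:

```python
import base64
import json

DATA = [f"/users/u{i}.json" for i in range(5)]   # pretend query results

def fetch_page(after=None, limit=2):
    # Stub for POST /query: decode the cursor, return one page plus
    # a next_cursor when more results remain.
    start = 0 if after is None else json.loads(base64.b64decode(after))["i"]
    page = DATA[start:start + limit]
    nxt = None
    if start + limit < len(DATA):
        nxt = base64.b64encode(json.dumps({"i": start + limit}).encode()).decode()
    return {"results": page, "has_more": nxt is not None, "next_cursor": nxt}

cursor, seen = None, []
while True:
    resp = fetch_page(after=cursor)
    seen += resp["results"]
    if not resp["has_more"]:
        break
    cursor = resp["next_cursor"]     # feed next_cursor back as `after`
print(len(seen))  # 5
```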

Projection (select)

Return only specific fields in each result. Use @-prefixed names for built-in metadata fields:

{
  "select": ["@path", "@score", "name", "email"]
}

Virtual Field  Maps To
@path          path
@score         score
@size          total_size
@content_type  content_type
@created_at    created_at
@updated_at    updated_at
@matched_by    matched_by

Envelope fields (has_more, next_cursor, total_count, meta) are never stripped by projection.


Aggregations

Run aggregate computations instead of returning individual results.

Request

{
  "path": "/orders",
  "where": {"field": "status", "op": "eq", "value": "complete"},
  "aggregate": {
    "count": true,
    "sum": ["total", "tax"],
    "avg": ["total"],
    "min": ["total"],
    "max": ["total"],
    "group_by": ["status"]
  }
}

Field     Type     Description
count     boolean  Include a count of matching records
sum       array    Fields to sum
avg       array    Fields to average
min       array    Fields to find minimum
max       array    Fields to find maximum
group_by  array    Fields to group results by

Response

The response shape depends on whether group_by is used. Aggregation results are returned as a JSON object.
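
The group_by/sum/avg semantics can be illustrated directly (the exact response keys are engine-defined and not shown here):

```python
from collections import defaultdict

# Records matched by the `where` clause (invented data).
rows = [{"status": "complete", "total": 40.0},
        {"status": "complete", "total": 60.0},
        {"status": "refunded", "total": 10.0}]

groups = defaultdict(list)
for r in rows:
    groups[r["status"]].append(r["total"])   # group_by: ["status"]

result = {k: {"count": len(v), "sum": sum(v), "avg": sum(v) / len(v)}
          for k, v in groups.items()}
print(result["complete"])  # {'count': 2, 'sum': 100.0, 'avg': 50.0}
```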


Explain Mode

Inspect the query execution plan without running the full query. Useful for debugging index usage and performance.

{
  "path": "/users",
  "where": {"field": "age", "op": "gt", "value": 21},
  "explain": "plan"
}

Value           Description
true or "plan"  Show the query plan
"analyze"       Execute the query and include timing information

Examples

Simple equality query

curl -X POST http://localhost:3000/query \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "path": "/users",
    "where": {"field": "status", "op": "eq", "value": "active"},
    "limit": 10
  }'

Fuzzy name search with pagination

curl -X POST http://localhost:3000/query \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "path": "/users",
    "where": {"field": "name", "op": "similar", "value": "alice", "threshold": 0.4},
    "limit": 20,
    "order_by": [{"field": "name", "direction": "asc"}],
    "include_total": true
  }'

Complex boolean query

curl -X POST http://localhost:3000/query \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "path": "/products",
    "where": {
      "and": [
        {"field": "price", "op": "between", "value": 10, "value2": 100},
        {
          "or": [
            {"field": "category", "op": "eq", "value": "electronics"},
            {"field": "category", "op": "eq", "value": "books"}
          ]
        },
        {"not": {"field": "status", "op": "eq", "value": "discontinued"}}
      ]
    },
    "order_by": [{"field": "price", "direction": "asc"}],
    "limit": 50
  }'

Aggregation with grouping

curl -X POST http://localhost:3000/query \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "path": "/orders",
    "where": {"field": "year", "op": "eq", "value": 2026},
    "aggregate": {
      "count": true,
      "sum": ["total"],
      "avg": ["total"],
      "group_by": ["status"]
    }
  }'

Error Responses

Status  Condition
400     Invalid query structure, missing field/op, unsupported operation, range query on non-range converter
404     Query path or index not found
500     Internal query execution failure

Version API

AeorDB provides Git-like version control through snapshots (named points in time) and forks (divergent branches of the data).

Endpoint Summary

Method  Path                            Description                           Auth  Root Required
POST    /version/snapshot               Create a snapshot                     Yes   No
GET     /version/snapshots              List all snapshots                    Yes   No
POST    /version/restore                Restore a snapshot                    Yes   Yes
DELETE  /version/snapshot/{name}        Delete a snapshot                     Yes   Yes
POST    /version/fork                   Create a fork                         Yes   No
GET     /version/forks                  List all forks                        Yes   No
POST    /version/fork/{name}/promote    Promote fork to HEAD                  Yes   Yes
DELETE  /version/fork/{name}            Abandon a fork                        Yes   Yes
GET     /engine/{path}?snapshot={name}  Read file at a snapshot               Yes   No
GET     /engine/{path}?version={hash}   Read file at a version hash           Yes   No
GET     /version/file-history/{path}    File change history across snapshots  Yes   No
POST    /version/file-restore/{path}    Restore file from a version           Yes   Yes

Snapshots

POST /version/snapshot

Create a named snapshot of the current HEAD.

Request Body:

{
  "name": "v1.0",
  "metadata": {
    "description": "First stable release",
    "author": "alice"
  }
}

Field     Type    Required  Description
name      string  Yes       Unique snapshot name
metadata  object  No        Arbitrary key-value metadata (defaults to empty)

Response: 201 Created

{
  "name": "v1.0",
  "root_hash": "a1b2c3d4e5f6...",
  "created_at": 1775968398000,
  "metadata": {
    "description": "First stable release",
    "author": "alice"
  }
}

Example:

curl -X POST http://localhost:3000/version/snapshot \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name": "v1.0", "metadata": {"description": "First stable release"}}'

Error Responses:

| Status | Condition |
|---|---|
| 409 | Snapshot with this name already exists |
| 500 | Internal failure |

GET /version/snapshots

List all snapshots.

Response: 200 OK

[
  {
    "name": "v1.0",
    "root_hash": "a1b2c3d4e5f6...",
    "created_at": 1775968398000,
    "metadata": {"description": "First stable release"}
  },
  {
    "name": "v2.0",
    "root_hash": "f6e5d4c3b2a1...",
    "created_at": 1775969000000,
    "metadata": {}
  }
]

Example:

curl http://localhost:3000/version/snapshots \
  -H "Authorization: Bearer $TOKEN"

POST /version/restore

Restore a named snapshot, making it the current HEAD. Requires root.

Request Body:

{
  "name": "v1.0"
}

Response: 200 OK

{
  "restored": true,
  "name": "v1.0"
}

Example:

curl -X POST http://localhost:3000/version/restore \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name": "v1.0"}'

Error Responses:

| Status | Condition |
|---|---|
| 403 | Non-root user |
| 404 | Snapshot not found |
| 500 | Internal failure |

DELETE /version/snapshot/{name}

Delete a named snapshot. Requires root.

Response: 200 OK

{
  "deleted": true,
  "name": "v1.0"
}

Example:

curl -X DELETE http://localhost:3000/version/snapshot/v1.0 \
  -H "Authorization: Bearer $TOKEN"

Error Responses:

| Status | Condition |
|---|---|
| 403 | Non-root user |
| 404 | Snapshot not found |
| 500 | Internal failure |

Forks

Forks create a divergent branch of the data, optionally based on a named snapshot.

POST /version/fork

Create a new fork.

Request Body:

{
  "name": "experiment",
  "base": "v1.0"
}

| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Unique fork name |
| base | string | No | Snapshot name to fork from (defaults to current HEAD) |

Response: 201 Created

{
  "name": "experiment",
  "root_hash": "a1b2c3d4e5f6...",
  "created_at": 1775968398000
}

Example:

curl -X POST http://localhost:3000/version/fork \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name": "experiment", "base": "v1.0"}'

Error Responses:

| Status | Condition |
|---|---|
| 409 | Fork with this name already exists |
| 500 | Internal failure |

GET /version/forks

List all active forks.

Response: 200 OK

[
  {
    "name": "experiment",
    "root_hash": "a1b2c3d4e5f6...",
    "created_at": 1775968398000
  }
]

Example:

curl http://localhost:3000/version/forks \
  -H "Authorization: Bearer $TOKEN"

POST /version/fork/{name}/promote

Promote a fork’s state to HEAD, making it the active version. Requires root.

Response: 200 OK

{
  "promoted": true,
  "name": "experiment"
}

Example:

curl -X POST http://localhost:3000/version/fork/experiment/promote \
  -H "Authorization: Bearer $TOKEN"

Error Responses:

| Status | Condition |
|---|---|
| 403 | Non-root user |
| 404 | Fork not found |
| 500 | Internal failure |

DELETE /version/fork/{name}

Abandon a fork (soft delete). Requires root.

Response: 200 OK

{
  "abandoned": true,
  "name": "experiment"
}

Example:

curl -X DELETE http://localhost:3000/version/fork/experiment \
  -H "Authorization: Bearer $TOKEN"

Error Responses:

| Status | Condition |
|---|---|
| 403 | Non-root user |
| 404 | Fork not found |
| 500 | Internal failure |

File-Level Version Access

Read, restore, and view history for individual files at specific historical versions.

Reading Files at a Version

Use query parameters on the standard file read endpoint:

# Read a file as it was at a named snapshot
curl "http://localhost:3000/engine/assets/logo.psd?snapshot=v1.0" \
  -H "Authorization: Bearer $TOKEN"

# Read a file at a specific version hash
curl "http://localhost:3000/engine/assets/logo.psd?version=a1b2c3d4..." \
  -H "Authorization: Bearer $TOKEN"

Returns the file content exactly as it was at that version, with the same headers as a normal file read. If both snapshot and version are provided, snapshot takes precedence.

Error Responses:

| Status | Condition |
|---|---|
| 404 | File did not exist at that version |
| 404 | Snapshot or version not found |
| 400 | Invalid version hash (not valid hex) |

GET /version/file-history/{path}

View the change history of a single file across all snapshots. Returns entries ordered newest-first, each with a change_type indicating what happened to the file at that snapshot.

Response: 200 OK

{
  "path": "assets/logo.psd",
  "history": [
    {
      "snapshot": "v2.0",
      "timestamp": 1775969000000,
      "change_type": "modified",
      "size": 512000,
      "content_type": "image/vnd.adobe.photoshop",
      "content_hash": "f6e5d4c3..."
    },
    {
      "snapshot": "v1.0",
      "timestamp": 1775968398000,
      "change_type": "added",
      "size": 256000,
      "content_type": "image/vnd.adobe.photoshop",
      "content_hash": "a1b2c3d4..."
    }
  ]
}

Change types:

| Type | Meaning |
|---|---|
| added | File exists in this snapshot but not the previous one |
| modified | File exists in both but content changed |
| unchanged | File exists in both with identical content |
| deleted | File existed in the previous snapshot but not this one |

If the file has never existed in any snapshot, returns 200 with an empty history array.
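The change_type classification above can be sketched as a comparison of a file's content hash between two consecutive snapshots. This is illustrative Python, not AeorDB internals; the snapshot maps and function name are assumptions for the example:

```python
# Classify what happened to `path` between two consecutive snapshots,
# each modeled as a {path: content_hash} map (hypothetical structure).

def file_change_type(path, prev_files, curr_files):
    before = prev_files.get(path) if prev_files else None
    after = curr_files.get(path)
    if before is None and after is not None:
        return "added"       # present now, absent before
    if before is not None and after is None:
        return "deleted"     # present before, absent now
    if before is not None and after is not None:
        return "modified" if before != after else "unchanged"
    return None              # file absent from both snapshots

v1 = {"assets/logo.psd": "a1b2c3d4"}
v2 = {"assets/logo.psd": "f6e5d4c3"}
print(file_change_type("assets/logo.psd", None, v1))  # added
print(file_change_type("assets/logo.psd", v1, v2))    # modified
```

History entries are then emitted newest-first by walking the snapshot list in reverse.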

Example:

curl http://localhost:3000/version/file-history/assets/logo.psd \
  -H "Authorization: Bearer $TOKEN"

POST /version/file-restore/{path}

Restore a single file from a historical version to the current HEAD. Requires root.

Before restoring, an automatic safety snapshot is created (named pre-restore-{timestamp}) to preserve the current state. If the safety snapshot cannot be created, the restore is rejected.
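The ordering described above can be sketched as follows. This is a hedged illustration of the flow, not AeorDB's implementation; `create_snapshot` and `write_file` are hypothetical callables standing in for the engine:

```python
# Sketch: the safety snapshot is created FIRST, and the restore is
# rejected outright if that snapshot cannot be written.
import time

def file_restore(path, content, create_snapshot, write_file):
    auto_name = f"pre-restore-{int(time.time())}"
    if not create_snapshot(auto_name):      # safety snapshot first
        raise RuntimeError("safety snapshot failed; restore rejected")
    write_file(path, content)               # then overwrite the HEAD copy
    return {"restored": True, "path": path, "auto_snapshot": auto_name}

result = file_restore("assets/logo.psd", b"historical-bytes",
                      create_snapshot=lambda name: True,
                      write_file=lambda p, c: None)
assert result["restored"] is True
```

The point of the ordering is that a failed restore can never leave you without a way back to the pre-restore state.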

Request Body:

{
  "snapshot": "v1.0"
}

Or using a version hash:

{
  "version": "a1b2c3d4..."
}

| Field | Type | Required | Description |
|---|---|---|---|
| snapshot | string | One required | Snapshot name to restore from |
| version | string | One required | Version hash (hex) to restore from |

If both are provided, snapshot takes precedence.

Response: 200 OK

{
  "restored": true,
  "path": "assets/logo.psd",
  "from_snapshot": "v1.0",
  "auto_snapshot": "pre-restore-2026-04-14T05-01-01Z",
  "size": 256000
}

The auto_snapshot field contains the name of the safety snapshot created before the restore. You can use this snapshot to recover the pre-restore state if needed.

Example:

curl -X POST http://localhost:3000/version/file-restore/assets/logo.psd \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"snapshot": "v1.0"}'

Error Responses:

| Status | Condition |
|---|---|
| 400 | Neither snapshot nor version provided |
| 403 | Non-root user (requires both write and snapshot permissions) |
| 404 | File not found at the specified version |
| 404 | Snapshot or version not found |
| 500 | Failed to create safety snapshot or write restored file |

Pre-Hashed Upload Protocol

AeorDB provides a 4-phase upload protocol for efficient, deduplicated file transfers. Clients split files into chunks, hash them locally, and only upload chunks the server does not already have.

Protocol Overview

  1. Negotiate – GET /upload/config to learn the hash algorithm and chunk size.
  2. Dedup check – POST /upload/check with a list of chunk hashes to find which are already stored.
  3. Upload – PUT /upload/chunks/{hash} for each needed chunk.
  4. Commit – POST /upload/commit to atomically assemble chunks into files.
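Step 2 of the protocol splits the file locally into fixed-size pieces using the negotiated chunk_size. A minimal sketch of that client-side chunking (hashing is covered separately under "How to Compute Chunk Hashes"):

```python
# Split a byte string into chunks of at most `chunk_size` bytes.
# The last chunk may be shorter.

def split_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

chunks = split_chunks(b"x" * 600_000, 262_144)
# 600,000 bytes at 256 KiB per chunk -> two full chunks plus one partial
assert [len(c) for c in chunks] == [262_144, 262_144, 75_712]
```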

Endpoint Summary

| Method | Path | Description | Auth | Body Limit |
|---|---|---|---|---|
| GET | /upload/config | Negotiate hash algorithm and chunk size | No | – |
| POST | /upload/check | Check which chunks the server already has | Yes | 1 MB |
| PUT | /upload/chunks/{hash} | Upload a single chunk | Yes | 10 GB |
| POST | /upload/commit | Atomic multi-file commit from chunks | Yes | 1 MB |

Phase 1: GET /upload/config

Retrieve the server’s hash algorithm, chunk size, and hash prefix. This endpoint is public (no authentication required).

Response

Status: 200 OK

{
  "hash_algorithm": "blake3",
  "chunk_size": 262144,
  "chunk_hash_prefix": "chunk:"
}

| Field | Type | Description |
|---|---|---|
| hash_algorithm | string | Hash algorithm used by the server (e.g., "blake3") |
| chunk_size | integer | Maximum chunk size in bytes (262,144 = 256 KB) |
| chunk_hash_prefix | string | Prefix prepended to chunk data before hashing |

How to Compute Chunk Hashes

The server computes chunk hashes as:

hash = blake3("chunk:" + chunk_bytes)

Clients must use the same formula. The prefix ("chunk:") is prepended to the raw bytes before hashing, not to the hex-encoded hash.
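The formula can be sketched in Python as follows. Note this is illustrative only: Python's standard library has no BLAKE3, so blake2b stands in here, and a real client must use BLAKE3 to produce hashes the server will accept:

```python
import hashlib

def chunk_hash(chunk: bytes, prefix: bytes = b"chunk:") -> str:
    # Prefix the RAW BYTES, then hash -- never prefix the hex digest.
    # blake2b is a stand-in; interoperating clients need BLAKE3.
    return hashlib.blake2b(prefix + chunk).hexdigest()

h = chunk_hash(b"hello world")
assert h == chunk_hash(b"hello world")                   # deterministic
assert h != hashlib.blake2b(b"hello world").hexdigest()  # prefix matters
```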

Example

curl http://localhost:3000/upload/config

Phase 2: POST /upload/check

Send a list of chunk hashes to determine which ones the server already has (deduplication). Only upload the ones in the needed list.

Request Body

{
  "hashes": [
    "a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2",
    "f6e5d4c3b2a1f6e5d4c3b2a1f6e5d4c3b2a1f6e5d4c3b2a1f6e5d4c3b2a1f6e5"
  ]
}

| Field | Type | Required | Description |
|---|---|---|---|
| hashes | array of strings | Yes | Hex-encoded chunk hashes |

Response

Status: 200 OK

{
  "have": [
    "a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2c3d4e5f6a1b2"
  ],
  "needed": [
    "f6e5d4c3b2a1f6e5d4c3b2a1f6e5d4c3b2a1f6e5d4c3b2a1f6e5d4c3b2a1f6e5"
  ]
}

| Field | Type | Description |
|---|---|---|
| have | array | Hashes the server already has – skip these |
| needed | array | Hashes the server needs – upload these |
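The server-side partition is conceptually simple. A hedged sketch, where `stored` stands in for the server's chunk store (not an AeorDB API):

```python
# Partition submitted hashes into those already stored ("have")
# and those the client must upload ("needed"), preserving order.

def dedup_check(hashes: list[str], stored: set[str]) -> dict:
    return {
        "have": [h for h in hashes if h in stored],
        "needed": [h for h in hashes if h not in stored],
    }

result = dedup_check(["a1b2", "f6e5"], stored={"a1b2"})
assert result == {"have": ["a1b2"], "needed": ["f6e5"]}
```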

Example

curl -X POST http://localhost:3000/upload/check \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"hashes": ["a1b2c3...", "f6e5d4..."]}'

Error Responses

| Status | Condition |
|---|---|
| 400 | Invalid hex hash in the list |

Phase 3: PUT /upload/chunks/{hash}

Upload a single chunk. The server verifies the hash matches the content before storing.

Request

  • URL parameter: {hash} – hex-encoded blake3 hash of "chunk:" + chunk_bytes
  • Headers:
    • Authorization: Bearer <token> (required)
  • Body: raw chunk bytes

Hash Verification

The server recomputes the hash from the uploaded bytes:

computed = blake3("chunk:" + body_bytes)

If the computed hash does not match the URL parameter, the upload is rejected.

Response

Status: 201 Created (new chunk stored)

{
  "status": "created",
  "hash": "f6e5d4c3b2a1..."
}

Status: 200 OK (chunk already exists – dedup)

{
  "status": "exists",
  "hash": "f6e5d4c3b2a1..."
}

Compression

The server automatically applies Zstd compression to chunks when beneficial (based on size heuristics). This is transparent to the client.

Example

curl -X PUT http://localhost:3000/upload/chunks/f6e5d4c3b2a1... \
  -H "Authorization: Bearer $TOKEN" \
  --data-binary @chunk_001.bin

Error Responses

| Status | Condition |
|---|---|
| 400 | Chunk exceeds maximum size (262,144 bytes) |
| 400 | Invalid hex hash in URL |
| 400 | Hash mismatch between URL and computed hash |
| 500 | Storage failure |

Phase 4: POST /upload/commit

Atomically commit multiple files from previously uploaded chunks. Each file specifies its path, content type, and the ordered list of chunk hashes that compose it.

Request Body

{
  "files": [
    {
      "path": "/data/report.pdf",
      "content_type": "application/pdf",
      "chunk_hashes": [
        "a1b2c3d4e5f6...",
        "f6e5d4c3b2a1..."
      ]
    },
    {
      "path": "/data/image.png",
      "content_type": "image/png",
      "chunk_hashes": [
        "1234abcd5678..."
      ]
    }
  ]
}

| Field | Type | Required | Description |
|---|---|---|---|
| files | array | Yes | List of files to commit |
| files[].path | string | Yes | Destination path for the file |
| files[].content_type | string | No | MIME type |
| files[].chunk_hashes | array | Yes | Ordered list of hex-encoded chunk hashes |
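Conceptually, committing a file means resolving each hash in order and concatenating the chunk bytes; any missing chunk invalidates the commit. A hedged sketch (the `chunks` dict stands in for the server's chunk store):

```python
# Reassemble one file from its ordered chunk-hash list.

def assemble_file(chunk_hashes: list[str], chunks: dict[str, bytes]) -> bytes:
    parts = []
    for h in chunk_hashes:
        if h not in chunks:
            # Any missing chunk makes the whole commit invalid.
            raise ValueError(f"missing chunk: {h}")
        parts.append(chunks[h])
    return b"".join(parts)

store = {"h1": b"hello ", "h2": b"world"}
assert assemble_file(["h1", "h2"], store) == b"hello world"
```

Order matters: the same two chunks in reverse order produce a different file.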

Response

Status: 200 OK

The response contains a summary of the commit operation.

Example

curl -X POST http://localhost:3000/upload/commit \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "files": [
      {
        "path": "/data/report.pdf",
        "content_type": "application/pdf",
        "chunk_hashes": ["a1b2c3d4...", "f6e5d4c3..."]
      }
    ]
  }'

Error Responses

| Status | Condition |
|---|---|
| 400 | Invalid input (missing path, bad hash, etc.) |
| 500 | Commit task failure or panic |

Full Upload Workflow

Here is a complete workflow for uploading a file:

# 1. Get server configuration
CONFIG=$(curl -s http://localhost:3000/upload/config)
CHUNK_SIZE=$(echo "$CONFIG" | jq -r '.chunk_size')

# 2. Split file into chunks and hash them
# (pseudo-code: split report.pdf into 256KB chunks, hash each with blake3)
# chunk_hashes=["hash1", "hash2", ...]

# 3. Check which chunks are needed
DEDUP=$(curl -s -X POST http://localhost:3000/upload/check \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"hashes": ["hash1", "hash2"]}')

# 4. Upload only the needed chunks
for hash in $(echo "$DEDUP" | jq -r '.needed[]'); do
  curl -X PUT "http://localhost:3000/upload/chunks/$hash" \
    -H "Authorization: Bearer $TOKEN" \
    --data-binary @"chunk_$hash.bin"
done

# 5. Commit the file
curl -X POST http://localhost:3000/upload/commit \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "files": [{
      "path": "/data/report.pdf",
      "content_type": "application/pdf",
      "chunk_hashes": ["hash1", "hash2"]
    }]
  }'

Admin Operations

Administrative endpoints for garbage collection, background tasks, cron scheduling, metrics, health checks, backup/restore, and user/group management. Most admin endpoints require root access.

Endpoint Summary

Garbage Collection

| Method | Path | Description | Root Required |
|---|---|---|---|
| POST | /admin/gc | Run synchronous garbage collection | Yes |

Background Tasks

| Method | Path | Description | Root Required |
|---|---|---|---|
| POST | /admin/tasks/reindex | Trigger a reindex task | Yes |
| POST | /admin/tasks/gc | Trigger a background GC task | Yes |
| GET | /admin/tasks | List all tasks with progress | Yes |
| GET | /admin/tasks/{id} | Get a single task | Yes |
| DELETE | /admin/tasks/{id} | Cancel a task | Yes |

Cron Scheduling

| Method | Path | Description | Root Required |
|---|---|---|---|
| GET | /admin/cron | List cron schedules | Yes |
| POST | /admin/cron | Create a cron schedule | Yes |
| PATCH | /admin/cron/{id} | Update a cron schedule | Yes |
| DELETE | /admin/cron/{id} | Delete a cron schedule | Yes |

Backup & Restore

| Method | Path | Description | Root Required |
|---|---|---|---|
| POST | /admin/export | Export database as .aeordb | Yes |
| POST | /admin/diff | Create patch between versions | Yes |
| POST | /admin/import | Import a backup or patch | Yes |
| POST | /admin/promote | Promote a version hash to HEAD | Yes |

Monitoring

| Method | Path | Description | Root Required |
|---|---|---|---|
| GET | /admin/metrics | Prometheus metrics | Yes (auth required) |
| GET | /admin/health | Health check | No (public) |

API Key Management

| Method | Path | Description | Root Required |
|---|---|---|---|
| POST | /admin/api-keys | Create an API key | Yes |
| GET | /admin/api-keys | List all API keys | Yes |
| DELETE | /admin/api-keys/{key_id} | Revoke an API key | Yes |

User Management

| Method | Path | Description | Root Required |
|---|---|---|---|
| POST | /admin/users | Create a user | Yes |
| GET | /admin/users | List all users | Yes |
| GET | /admin/users/{user_id} | Get a user | Yes |
| PATCH | /admin/users/{user_id} | Update a user | Yes |
| DELETE | /admin/users/{user_id} | Deactivate a user (soft delete) | Yes |

Group Management

| Method | Path | Description | Root Required |
|---|---|---|---|
| POST | /admin/groups | Create a group | Yes |
| GET | /admin/groups | List all groups | Yes |
| GET | /admin/groups/{name} | Get a group | Yes |
| PATCH | /admin/groups/{name} | Update a group | Yes |
| DELETE | /admin/groups/{name} | Delete a group | Yes |

Garbage Collection

POST /admin/gc

Run garbage collection synchronously. Identifies and removes orphaned entries not reachable from the current HEAD.

Query Parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| dry_run | boolean | false | If true, report what would be collected without deleting |

Response: 200 OK

The response contains GC statistics (entries scanned, reclaimed bytes, etc.).

Example:

# Dry run
curl -X POST "http://localhost:3000/admin/gc?dry_run=true" \
  -H "Authorization: Bearer $TOKEN"

# Actual GC
curl -X POST http://localhost:3000/admin/gc \
  -H "Authorization: Bearer $TOKEN"

Error Responses:

| Status | Condition |
|---|---|
| 403 | Non-root user |
| 500 | GC failure |

Background Tasks

POST /admin/tasks/reindex

Enqueue a reindex task for a directory path. Re-scans all files and rebuilds index entries.

Request Body:

{
  "path": "/data/"
}

Response: 200 OK

{
  "id": "task-uuid-here",
  "task_type": "reindex",
  "status": "pending"
}

Example:

curl -X POST http://localhost:3000/admin/tasks/reindex \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"path": "/data/"}'

POST /admin/tasks/gc

Enqueue a background GC task (non-blocking).

Request Body:

{
  "dry_run": false
}

Response: 200 OK

{
  "id": "task-uuid-here",
  "task_type": "gc",
  "status": "pending"
}

Example:

curl -X POST http://localhost:3000/admin/tasks/gc \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"dry_run": false}'

GET /admin/tasks

List all tasks with their current progress.

Response: 200 OK

[
  {
    "id": "task-uuid-here",
    "task_type": "reindex",
    "status": "running",
    "args": {"path": "/data/"},
    "progress": 0.45,
    "eta_ms": 1775968500000
  }
]

Each task includes progress (0.0-1.0) and eta_ms (estimated completion timestamp) if available.

Example:

curl http://localhost:3000/admin/tasks \
  -H "Authorization: Bearer $TOKEN"

GET /admin/tasks/{id}

Get a single task by ID.

Response: 200 OK

{
  "id": "task-uuid-here",
  "task_type": "reindex",
  "status": "running",
  "args": {"path": "/data/"},
  "progress": 0.45,
  "eta_ms": 1775968500000
}

Error Responses:

| Status | Condition |
|---|---|
| 404 | Task not found |

DELETE /admin/tasks/{id}

Cancel a task.

Response: 200 OK

{
  "id": "task-uuid-here",
  "status": "cancelled"
}

Example:

curl -X DELETE http://localhost:3000/admin/tasks/task-uuid-here \
  -H "Authorization: Bearer $TOKEN"

Cron Scheduling

GET /admin/cron

List all cron schedules.

Response: 200 OK

[
  {
    "id": "nightly-gc",
    "schedule": "0 2 * * *",
    "task_type": "gc",
    "args": {"dry_run": false},
    "enabled": true
  }
]

POST /admin/cron

Create a new cron schedule.

Request Body:

{
  "id": "nightly-gc",
  "schedule": "0 2 * * *",
  "task_type": "gc",
  "args": {"dry_run": false},
  "enabled": true
}

| Field | Type | Required | Description |
|---|---|---|---|
| id | string | Yes | Unique schedule identifier |
| schedule | string | Yes | Cron expression |
| task_type | string | Yes | Task type to enqueue ("gc", "reindex") |
| args | object | Yes | Arguments passed to the task |
| enabled | boolean | Yes | Whether the schedule is active |

Response: 201 Created

Example:

curl -X POST http://localhost:3000/admin/cron \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "id": "nightly-gc",
    "schedule": "0 2 * * *",
    "task_type": "gc",
    "args": {"dry_run": false},
    "enabled": true
  }'

Error Responses:

| Status | Condition |
|---|---|
| 400 | Invalid cron expression |
| 409 | Schedule with this ID already exists |

PATCH /admin/cron/{id}

Update a cron schedule. All fields are optional – only provided fields are changed.

Request Body:

{
  "enabled": false,
  "schedule": "0 3 * * *"
}

| Field | Type | Description |
|---|---|---|
| enabled | boolean | Enable or disable the schedule |
| schedule | string | New cron expression |
| task_type | string | New task type |
| args | object | New task arguments |

Response: 200 OK

Returns the updated schedule.

Error Responses:

| Status | Condition |
|---|---|
| 400 | Invalid cron expression |
| 404 | Schedule not found |

DELETE /admin/cron/{id}

Delete a cron schedule.

Response: 200 OK

{
  "id": "nightly-gc",
  "deleted": true
}

Error Responses:

| Status | Condition |
|---|---|
| 404 | Schedule not found |

Backup & Restore

POST /admin/export

Export the database (or a specific version) as an .aeordb archive file.

Query Parameters:

| Parameter | Type | Description |
|---|---|---|
| snapshot | string | Export a named snapshot (default: HEAD) |
| hash | string | Export a specific version by hex hash |

Response: 200 OK

  • Content-Type: application/octet-stream
  • Content-Disposition: attachment; filename="export-{hash_prefix}.aeordb"
  • Body: binary archive data

Example:

# Export HEAD
curl -X POST http://localhost:3000/admin/export \
  -H "Authorization: Bearer $TOKEN" \
  -o backup.aeordb

# Export a specific snapshot
curl -X POST "http://localhost:3000/admin/export?snapshot=v1.0" \
  -H "Authorization: Bearer $TOKEN" \
  -o backup-v1.aeordb

# Export by hash
curl -X POST "http://localhost:3000/admin/export?hash=a1b2c3d4..." \
  -H "Authorization: Bearer $TOKEN" \
  -o backup.aeordb

POST /admin/diff

Create a patch file representing the difference between two versions.

Query Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| from | string | Yes | Source snapshot name or hex hash |
| to | string | No | Target snapshot name or hex hash (default: HEAD) |

Response: 200 OK

  • Content-Type: application/octet-stream
  • Content-Disposition: attachment; filename="patch-{hash_prefix}.aeordb"
  • Body: binary patch data

Example:

curl -X POST "http://localhost:3000/admin/diff?from=v1.0&to=v2.0" \
  -H "Authorization: Bearer $TOKEN" \
  -o patch-v1-v2.aeordb

POST /admin/import

Import a backup or patch file. Body limit: 10 MB.

Query Parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| force | boolean | false | Force import even if conflicts exist |
| promote | boolean | false | Promote the imported version to HEAD |

Request:

  • Headers:
    • Authorization: Bearer <token> (required)
  • Body: raw .aeordb file bytes

Response: 200 OK

{
  "status": "success",
  "backup_type": "export",
  "entries_imported": 1500,
  "chunks_imported": 3200,
  "files_imported": 450,
  "directories_imported": 30,
  "deletions_applied": 5,
  "version_hash": "a1b2c3d4e5f6...",
  "head_promoted": true
}

Example:

curl -X POST "http://localhost:3000/admin/import?promote=true" \
  -H "Authorization: Bearer $TOKEN" \
  --data-binary @backup.aeordb

Error Responses:

| Status | Condition |
|---|---|
| 400 | Invalid or corrupt backup file |
| 403 | Non-root user |

POST /admin/promote

Promote an arbitrary version hash to HEAD.

Query Parameters:

| Parameter | Type | Required | Description |
|---|---|---|---|
| hash | string | Yes | Hex-encoded version hash to promote |

Response: 200 OK

{
  "status": "success",
  "head": "a1b2c3d4e5f6..."
}

Example:

curl -X POST "http://localhost:3000/admin/promote?hash=a1b2c3d4e5f6..." \
  -H "Authorization: Bearer $TOKEN"

Error Responses:

| Status | Condition |
|---|---|
| 400 | Invalid hash format |
| 404 | Version hash not found in storage |

Monitoring

GET /admin/health

Public health check endpoint. No authentication required.

Response: 200 OK

{
  "status": "ok"
}

Example:

curl http://localhost:3000/admin/health

GET /admin/metrics

Prometheus-format metrics endpoint. Requires authentication.

Response: 200 OK

  • Content-Type: text/plain; version=0.0.4; charset=utf-8
  • Body: Prometheus text exposition format

Example:

curl http://localhost:3000/admin/metrics \
  -H "Authorization: Bearer $TOKEN"

API Key Management

POST /admin/api-keys

Create a new API key. The plaintext key is returned only once – store it securely. Requires root.

Request Body:

{
  "user_id": "550e8400-e29b-41d4-a716-446655440000"
}

| Field | Type | Required | Description |
|---|---|---|---|
| user_id | string (UUID) | No | User to create the key for (defaults to the calling user) |

Response: 201 Created

{
  "key_id": "660e8400-e29b-41d4-a716-446655440001",
  "api_key": "aeor_660e8400_a1b2c3d4e5f6...",
  "user_id": "550e8400-e29b-41d4-a716-446655440000",
  "created_at": "2026-04-13T10:00:00Z"
}

Example:

curl -X POST http://localhost:3000/admin/api-keys \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"user_id": "550e8400-e29b-41d4-a716-446655440000"}'

GET /admin/api-keys

List all API keys (metadata only – no secrets). Requires root.

Response: 200 OK

[
  {
    "key_id": "660e8400-e29b-41d4-a716-446655440001",
    "user_id": "550e8400-e29b-41d4-a716-446655440000",
    "created_at": "2026-04-13T10:00:00Z",
    "is_revoked": false
  }
]

DELETE /admin/api-keys/{key_id}

Revoke an API key. Revoked keys cannot be used to obtain tokens. Requires root.

Response: 200 OK

{
  "revoked": true,
  "key_id": "660e8400-e29b-41d4-a716-446655440001"
}

Error Responses:

| Status | Condition |
|---|---|
| 400 | Invalid key ID format |
| 404 | API key not found |

User Management

POST /admin/users

Create a new user. Requires root.

Request Body:

{
  "username": "alice",
  "email": "[email protected]"
}

| Field | Type | Required | Description |
|---|---|---|---|
| username | string | Yes | Unique username |
| email | string | No | User email address |

Response: 201 Created

{
  "user_id": "550e8400-e29b-41d4-a716-446655440000",
  "username": "alice",
  "email": "[email protected]",
  "is_active": true,
  "created_at": 1775968398000,
  "updated_at": 1775968398000
}

GET /admin/users

List all users. Requires root.

Response: 200 OK

[
  {
    "user_id": "550e8400-e29b-41d4-a716-446655440000",
    "username": "alice",
    "email": "[email protected]",
    "is_active": true,
    "created_at": 1775968398000,
    "updated_at": 1775968398000
  }
]

GET /admin/users/{user_id}

Get a single user. Requires root.

Response: 200 OK (same shape as the user object above)

Error Responses:

| Status | Condition |
|---|---|
| 400 | Invalid UUID |
| 404 | User not found |

PATCH /admin/users/{user_id}

Update a user. All fields are optional. Requires root.

Request Body:

{
  "username": "alice_updated",
  "email": "[email protected]",
  "is_active": true
}

Response: 200 OK (returns the updated user)


DELETE /admin/users/{user_id}

Deactivate a user (soft delete – sets is_active to false). Requires root.

Response: 200 OK

{
  "deactivated": true,
  "user_id": "550e8400-e29b-41d4-a716-446655440000"
}

Group Management

Groups define path-level access control rules using query-based membership.

POST /admin/groups

Create a new group. Requires root.

Request Body:

{
  "name": "editors",
  "default_allow": "/content/*",
  "default_deny": "/admin/*",
  "query_field": "role",
  "query_operator": "eq",
  "query_value": "editor"
}

| Field | Type | Required | Description |
|---|---|---|---|
| name | string | Yes | Unique group name |
| default_allow | string | Yes | Path pattern for allowed access |
| default_deny | string | Yes | Path pattern for denied access |
| query_field | string | Yes | User field to query for membership (must be a safe field) |
| query_operator | string | Yes | Comparison operator |
| query_value | string | Yes | Value to match against |

Response: 201 Created

{
  "name": "editors",
  "default_allow": "/content/*",
  "default_deny": "/admin/*",
  "query_field": "role",
  "query_operator": "eq",
  "query_value": "editor",
  "created_at": 1775968398000,
  "updated_at": 1775968398000
}

GET /admin/groups

List all groups. Requires root.

Response: 200 OK (array of group objects)


GET /admin/groups/{name}

Get a single group. Requires root.

Error Responses:

| Status | Condition |
|---|---|
| 404 | Group not found |

PATCH /admin/groups/{name}

Update a group. All fields are optional. Requires root.

Request Body:

{
  "default_allow": "/content/*",
  "query_value": "senior-editor"
}

The query_field value is validated against a whitelist of safe fields. Attempting to use an unsafe field returns a 400 error.

Error Responses:

| Status | Condition |
|---|---|
| 400 | Unsafe query field |
| 404 | Group not found |

DELETE /admin/groups/{name}

Delete a group. Requires root.

Response: 200 OK

{
  "deleted": true,
  "name": "editors"
}

Error Responses:

| Status | Condition |
|---|---|
| 404 | Group not found |

Plugin Endpoints

AeorDB supports deploying WebAssembly (WASM) plugins that extend the database with custom logic. Plugins are scoped to a {database}/{schema}/{table} namespace.

Endpoint Summary

| Method | Path | Description | Auth |
|---|---|---|---|
| PUT | /{db}/{schema}/{table}/_deploy | Deploy a WASM plugin | Yes |
| POST | /{db}/{schema}/{table}/{function}/_invoke | Invoke a plugin function | Yes |
| GET | /{db}/_plugins | List all deployed plugins | Yes |
| DELETE | /{db}/{schema}/{table}/{function}/_remove | Remove a deployed plugin | Yes |

PUT /{db}/{schema}/{table}/_deploy

Deploy a WASM plugin to the given namespace. If a plugin already exists at this path, it is replaced.

Request

  • URL parameters:
    • {db} – database name
    • {schema} – schema name
    • {table} – table name
  • Query parameters:
    • name (optional) – plugin name (defaults to the {table} segment)
    • plugin_type (optional) – plugin type string (defaults to "wasm")
  • Headers:
    • Authorization: Bearer <token> (required)
  • Body: raw WASM binary bytes

Response

Status: 200 OK

Returns the plugin metadata:

{
  "name": "my-plugin",
  "path": "mydb/public/users",
  "plugin_type": "wasm",
  "deployed_at": "2026-04-13T10:00:00Z"
}

Example

curl -X PUT "http://localhost:3000/mydb/public/users/_deploy?name=my-plugin" \
  -H "Authorization: Bearer $TOKEN" \
  --data-binary @plugin.wasm

Error Responses

| Status | Condition |
|---|---|
| 400 | Empty body or invalid plugin type |
| 400 | Invalid WASM module |
| 500 | Deployment failure |

POST /{db}/{schema}/{table}/{function}/_invoke

Invoke a deployed plugin’s function. The request body is wrapped in a PluginRequest envelope with metadata, then passed to the WASM runtime.

Request

  • URL parameters:
    • {db} – database name
    • {schema} – schema name
    • {table} – table name
    • {function} – function name to invoke
  • Headers:
    • Authorization: Bearer <token> (required)
    • Content-Type – depends on what the plugin expects
  • Body: raw request payload (passed to the plugin as arguments)

Plugin Request Envelope

The server wraps the raw body into:

{
  "arguments": "<raw body bytes>",
  "metadata": {
    "function_name": "compute",
    "path": "/mydb/public/users/compute",
    "plugin_path": "mydb/public/users"
  }
}

Plugin Response

Plugins return a PluginResponse envelope:

{
  "status_code": 200,
  "content_type": "application/json",
  "headers": {
    "x-custom-header": "value"
  },
  "body": "<response bytes>"
}

The server maps these fields to the HTTP response. For security, only safe headers are forwarded:

  • Headers starting with x-
  • cache-control, etag, last-modified, content-disposition, content-language, content-encoding, vary

If the plugin returns data that is not a valid PluginResponse, it is sent as raw application/octet-stream bytes (backward compatibility).
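The header allow-list described above amounts to a simple filter: forward a header only if its name starts with x- or appears in the explicit safe set. A hedged sketch of that rule (not AeorDB's actual code):

```python
# Forward only "safe" plugin response headers, per the allow-list.

SAFE_HEADERS = {
    "cache-control", "etag", "last-modified", "content-disposition",
    "content-language", "content-encoding", "vary",
}

def filter_headers(headers: dict[str, str]) -> dict[str, str]:
    return {
        name: value
        for name, value in headers.items()
        if name.lower().startswith("x-") or name.lower() in SAFE_HEADERS
    }

out = filter_headers({"X-Custom-Header": "value",
                      "ETag": "abc",
                      "Set-Cookie": "session=..."})
# Set-Cookie is dropped; X-Custom-Header and ETag pass through.
```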

WASM Host Functions

Plugins have access to the following host functions for interacting with the database:

  • CRUD: read, write, and delete files
  • Query: execute queries and aggregations against the engine
  • Context: access request metadata

Example

curl -X POST http://localhost:3000/mydb/public/users/compute/_invoke \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"input": "hello"}'

Error Responses

| Status | Condition |
|---|---|
| 404 | Plugin not found at the given path |
| 500 | Plugin invocation failure (runtime error, panic, etc.) |

GET /{db}/_plugins

List all deployed plugins.

Request

  • URL parameters:
    • {db} – database name
  • Headers:
    • Authorization: Bearer <token> (required)

Response

Status: 200 OK

[
  {
    "name": "my-plugin",
    "path": "mydb/public/users",
    "plugin_type": "wasm"
  }
]

Example

curl http://localhost:3000/mydb/_plugins \
  -H "Authorization: Bearer $TOKEN"

DELETE /{db}/{schema}/{table}/{function}/_remove

Remove a deployed plugin.

Request

  • URL parameters:
    • {db} – database name
    • {schema} – schema name
    • {table} – table name
    • {function} – function name
  • Headers:
    • Authorization: Bearer <token> (required)

Response

Status: 200 OK

{
  "removed": true,
  "path": "mydb/public/users"
}

Example

curl -X DELETE http://localhost:3000/mydb/public/users/compute/_remove \
  -H "Authorization: Bearer $TOKEN"

Error Responses

| Status | Condition |
|---|---|
| 404 | Plugin not found |
| 500 | Removal failure |
500Removal failure

Events & Webhooks

AeorDB publishes real-time events via Server-Sent Events (SSE). Clients can subscribe to a filtered stream of engine events for live updates.

Endpoint Summary

| Method | Path | Description | Auth |
|---|---|---|---|
| GET | /events/stream | SSE event stream | Yes |

GET /events/stream

Open a persistent Server-Sent Events connection. The server pushes events as they occur and sends periodic keepalive pings.

Query Parameters

| Parameter | Type | Description |
|---|---|---|
| events | string | Comma-separated list of event types to receive (default: all) |
| path_prefix | string | Only receive events whose payload contains a path starting with this prefix |

Request

curl -N http://localhost:3000/events/stream \
  -H "Authorization: Bearer $TOKEN"

Filtered Stream

Subscribe to only specific event types:

curl -N "http://localhost:3000/events/stream?events=entries_created,entries_deleted" \
  -H "Authorization: Bearer $TOKEN"

Filter by path prefix:

curl -N "http://localhost:3000/events/stream?path_prefix=/data/users" \
  -H "Authorization: Bearer $TOKEN"

Combine both:

curl -N "http://localhost:3000/events/stream?events=entries_created&path_prefix=/data/" \
  -H "Authorization: Bearer $TOKEN"

Response Format

The response is an SSE stream. Each event has the standard SSE fields:

id: evt-uuid-here
event: entries_created
data: {"event_id":"evt-uuid-here","event_type":"entries_created","timestamp":1775968398000,"payload":{"entries":[{"path":"/data/report.pdf"}]}}

Event Envelope

Each event is a JSON object with:

  • event_id (string) – Unique event identifier
  • event_type (string) – Type of event (see below)
  • timestamp (integer) – Unix timestamp (milliseconds)
  • payload (object) – Event-specific data

Event Types

  • entries_created – Files were created or updated. Payload: {"entries": [{"path": "..."}]}
  • entries_deleted – Files were deleted. Payload: {"entries": [{"path": "..."}]}
  • versions_created – A new version (snapshot/fork) was created. Payload: version metadata
  • permissions_changed – Permissions were updated for a path. Payload: {"path": "..."}
  • indexes_changed – Index configuration was updated. Payload: {"path": "..."}

Keepalive

The server sends a keepalive ping every 30 seconds to prevent connection timeouts:

: ping

Path Prefix Matching

The path prefix filter checks two locations in the event payload:

  1. Batch events: payload.entries[].path – matches if any entry’s path starts with the prefix
  2. Single-path events: payload.path – matches if the path starts with the prefix
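The two rules above can be sketched with a dependency-free helper (hypothetical; the real filter walks the JSON payload rather than taking paths directly):

```rust
/// An event passes the filter when any `payload.entries[].path`, or the
/// single `payload.path`, starts with the requested prefix.
fn matches_prefix(single_path: Option<&str>, entry_paths: &[&str], prefix: &str) -> bool {
    entry_paths.iter().any(|p| p.starts_with(prefix))
        || single_path.map_or(false, |p| p.starts_with(prefix))
}

fn main() {
    // Batch event with one matching entry
    assert!(matches_prefix(None, &["/data/users/alice.json"], "/data/users"));
    // Single-path event outside the prefix
    assert!(!matches_prefix(Some("/config/cors.json"), &[], "/data/"));
    println!("ok");
}
```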

Connection Behavior

  • The connection stays open indefinitely until the client disconnects.
  • If the client falls behind (lagged), missed events are silently dropped.
  • Reconnecting clients should use the last received id for gap detection (standard SSE Last-Event-ID header).

JavaScript Example

// Note: the browser's native EventSource API cannot set custom request
// headers, so this example assumes an EventSource polyfill that accepts
// a `headers` option. A plain browser client can instead use a
// fetch()-based SSE reader to send the Authorization header.
const evtSource = new EventSource(
  'http://localhost:3000/events/stream?events=entries_created',
  { headers: { 'Authorization': 'Bearer ' + token } }
);

evtSource.addEventListener('entries_created', (event) => {
  const data = JSON.parse(event.data);
  console.log('Files created:', data.payload.entries);
});

evtSource.onerror = (err) => {
  console.error('SSE error:', err);
};

Webhook Configuration

Webhooks can be configured in-database by storing webhook configuration files. The event bus internally broadcasts all events, and webhook delivery can be wired to the SSE stream.

Webhook configuration is stored at a well-known path within the engine and follows the same event type filtering as the SSE endpoint.

Authentication

AeorDB supports multiple authentication modes. Protected endpoints require a JWT Bearer token, which clients obtain by exchanging an API key (or by verifying a magic link).

Auth Modes

AeorDB can run in one of three authentication modes, selected at startup:

  • disabled (--auth disabled) – No authentication. All requests are allowed.
  • self-contained (--auth self-contained) – Keys and users stored inside the database (default).
  • file (--auth file://<path>) – Identity loaded from an external file. Returns a bootstrap API key on first run.

Disabled Mode

All middleware is bypassed. Every request is treated as authenticated. Useful for local development.

Self-Contained Mode

The default. Users, API keys, and tokens are all stored within the AeorDB engine itself. The JWT signing key is generated automatically.

File Mode

Identity is loaded from an external file at the specified path. On first startup, a bootstrap API key is printed to stdout so you can authenticate and set up additional users.


Endpoint Summary

  • POST /auth/token – Exchange API key for JWT (no auth)
  • POST /auth/magic-link – Request a magic link (no auth)
  • GET /auth/magic-link/verify – Verify a magic link code (no auth)
  • POST /auth/refresh – Refresh an expired JWT (no auth)
  • POST /admin/api-keys – Create an API key (root only)
  • GET /admin/api-keys – List API keys (root only)
  • DELETE /admin/api-keys/{key_id} – Revoke an API key (root only)

JWT Tokens

All protected endpoints accept a JWT Bearer token in the Authorization header:

Authorization: Bearer eyJhbGciOiJIUzI1NiIs...

Token Claims

  • sub (string) – User ID (UUID) or email
  • iss (string) – Always "aeordb"
  • iat (integer) – Issued-at timestamp (Unix seconds)
  • exp (integer) – Expiration timestamp (Unix seconds)
  • scope (string) – Optional scope restriction
  • permissions (object) – Optional fine-grained permissions

POST /auth/token

Exchange an API key for a JWT and refresh token. This is the primary authentication flow.

Request Body

{
  "api_key": "aeor_660e8400_a1b2c3d4e5f6..."
}

Response

Status: 200 OK

{
  "token": "eyJhbGciOiJIUzI1NiIs...",
  "expires_in": 3600,
  "refresh_token": "rt_a1b2c3d4e5f6..."
}

  • token (string) – JWT access token
  • expires_in (integer) – Token lifetime in seconds
  • refresh_token (string) – Refresh token for obtaining new JWTs

API Key Format

API keys follow the format aeor_{key_id_prefix}_{secret}. The key_id_prefix is extracted for O(1) lookup – the server does not iterate all stored keys.
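That split can be sketched with a dependency-free helper (hypothetical, assuming only the literal aeor_ prefix and _ separator described above):

```rust
/// Split an API key of the form `aeor_{key_id_prefix}_{secret}` into
/// its lookup prefix and secret parts.
fn split_api_key(key: &str) -> Option<(&str, &str)> {
    let rest = key.strip_prefix("aeor_")?;
    rest.split_once('_')
}

fn main() {
    assert_eq!(
        split_api_key("aeor_660e8400_a1b2c3d4e5f6"),
        Some(("660e8400", "a1b2c3d4e5f6"))
    );
    assert_eq!(split_api_key("not-a-key"), None);
    println!("ok");
}
```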

Example

curl -X POST http://localhost:3000/auth/token \
  -H "Content-Type: application/json" \
  -d '{"api_key": "aeor_660e8400_a1b2c3d4e5f6..."}'

Error Responses

  • 401 – Invalid, revoked, or malformed API key
  • 500 – Token creation failure

POST /auth/magic-link

Request a magic link for passwordless authentication. The server always returns 200 OK regardless of whether the email exists, to prevent email enumeration.

In development mode, the magic link URL is logged via tracing (no email is actually sent).

Rate Limiting

This endpoint is rate-limited per email address. Exceeding the limit returns 429 Too Many Requests.

Request Body

{
  "email": "[email protected]"
}

Response

Status: 200 OK

{
  "message": "If an account exists, a login link has been sent."
}

Example

curl -X POST http://localhost:3000/auth/magic-link \
  -H "Content-Type: application/json" \
  -d '{"email": "[email protected]"}'

Error Responses

  • 429 – Rate limit exceeded

GET /auth/magic-link/verify

Verify a magic link code and receive a JWT. Each code can only be used once.

Query Parameters

  • code (string, required) – The magic link code

Response

Status: 200 OK

{
  "token": "eyJhbGciOiJIUzI1NiIs...",
  "expires_in": 3600
}

Example

curl "http://localhost:3000/auth/magic-link/verify?code=abc123..."

Error Responses

  • 401 – Invalid, expired, or already-used code
  • 500 – Token creation failure

POST /auth/refresh

Exchange a refresh token for a new JWT and a new refresh token. Implements token rotation – the old refresh token is revoked and cannot be reused.
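The rotation semantics can be sketched with an in-memory store (hypothetical; the real store lives inside the engine):

```rust
use std::collections::HashMap;

/// Redeeming a refresh token removes it and records its replacement,
/// so the old token can never be used twice.
struct RefreshStore {
    tokens: HashMap<String, String>, // refresh token -> user id
}

impl RefreshStore {
    fn rotate(&mut self, old: &str, new: &str) -> Option<String> {
        let user = self.tokens.remove(old)?; // revoke the old token
        self.tokens.insert(new.to_string(), user.clone());
        Some(user)
    }
}

fn main() {
    let mut store = RefreshStore { tokens: HashMap::new() };
    store.tokens.insert("rt_old".into(), "user-1".into());

    assert_eq!(store.rotate("rt_old", "rt_new"), Some("user-1".to_string()));
    // Replaying the old token fails: it was revoked on first use.
    assert_eq!(store.rotate("rt_old", "rt_new2"), None);
    println!("ok");
}
```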

Request Body

{
  "refresh_token": "rt_a1b2c3d4e5f6..."
}

Response

Status: 200 OK

{
  "token": "eyJhbGciOiJIUzI1NiIs...",
  "expires_in": 3600,
  "refresh_token": "rt_new_token_here..."
}

Example

curl -X POST http://localhost:3000/auth/refresh \
  -H "Content-Type: application/json" \
  -d '{"refresh_token": "rt_a1b2c3d4e5f6..."}'

Error Responses

  • 401 – Invalid, revoked, or expired refresh token
  • 500 – Token creation failure

Root User

The root user has the nil UUID (00000000-0000-0000-0000-000000000000). Only the root user can:

  • Create and manage API keys
  • Create and manage users
  • Create and manage groups
  • Restore snapshots and manage forks
  • Run garbage collection
  • Manage tasks and cron schedules
  • Export, import, and promote versions

First-Run Bootstrap

When using file:// auth mode, a bootstrap API key is printed to stdout on first run. Use this key to authenticate as root and create additional users and keys:

# Start the server (prints bootstrap key)
aeordb --auth file:///path/to/identity.json

# Exchange the bootstrap key for a token
curl -X POST http://localhost:3000/auth/token \
  -H "Content-Type: application/json" \
  -d '{"api_key": "<bootstrap-key>"}'

Authentication Flow Summary

                 API Key                    Magic Link
                   |                            |
          POST /auth/token             POST /auth/magic-link
                   |                            |
                   v                            v
              JWT + Refresh               Email with code
                   |                            |
                   |                   GET /auth/magic-link/verify
                   |                            |
                   v                            v
              Use JWT in                    JWT Token
           Authorization header                 |
                   |                            |
                   v                            v
           Protected endpoints          Protected endpoints
                   |
          Token expires
                   |
         POST /auth/refresh
                   |
                   v
           New JWT + New Refresh
           (old refresh revoked)

CORS Configuration

AeorDB supports Cross-Origin Resource Sharing (CORS) through a CLI flag and per-path rules stored in the database.

Configuration Methods

CLI Flag

Enable CORS at startup with the --cors flag:

# Allow all origins
aeordb --cors "*"

# Allow specific origins
aeordb --cors "https://app.example.com,https://admin.example.com"

The CLI flag sets the default CORS policy for all routes. When no --cors flag is provided, no CORS headers are sent.

Per-Path Rules (Config File)

For fine-grained control, store per-path CORS rules at /.config/cors.json inside the database:

{
  "rules": [
    {
      "path": "/engine/*",
      "origins": ["https://app.example.com"],
      "methods": ["GET", "POST", "PUT", "DELETE", "HEAD", "OPTIONS"],
      "allow_headers": ["Content-Type", "Authorization"],
      "max_age": 3600,
      "allow_credentials": false
    },
    {
      "path": "/query",
      "origins": ["*"],
      "methods": ["POST"],
      "allow_headers": ["Content-Type", "Authorization"],
      "max_age": 600,
      "allow_credentials": false
    },
    {
      "path": "/events/stream",
      "origins": ["https://app.example.com"],
      "allow_credentials": true
    }
  ]
}

Upload the config file using the engine API:

curl -X PUT http://localhost:3000/engine/.config/cors.json \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d @cors.json

Rule Schema

Each rule in the rules array supports:

  • path (string, required) – URL path to match. Supports trailing * for prefix matching.
  • origins (array of strings, required) – Allowed origins. Use ["*"] for any origin.
  • methods (array of strings, default ["GET","POST","PUT","DELETE","HEAD","OPTIONS"]) – Allowed HTTP methods.
  • allow_headers (array of strings, default ["Content-Type","Authorization"]) – Allowed request headers.
  • max_age (integer, default 3600) – Preflight cache duration in seconds.
  • allow_credentials (boolean, default false) – Whether to include Access-Control-Allow-Credentials: true.

Path Matching

Per-path rules are checked in order (first match wins):

  • Exact match: "/query" matches only /query
  • Prefix match: "/engine/*" matches /engine/data/file.json, /engine/images/photo.png, etc.

If no per-path rule matches, the CLI default (if any) is used.
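The matching rule is small enough to sketch directly (hypothetical helper mirroring the description above):

```rust
/// A rule path ending in `*` is a prefix match; anything else is exact.
fn rule_matches(rule_path: &str, request_path: &str) -> bool {
    match rule_path.strip_suffix('*') {
        Some(prefix) => request_path.starts_with(prefix),
        None => rule_path == request_path,
    }
}

fn main() {
    assert!(rule_matches("/engine/*", "/engine/data/file.json"));
    assert!(rule_matches("/query", "/query"));
    assert!(!rule_matches("/query", "/query/extra"));
    println!("ok");
}
```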


Precedence

  1. Per-path rules from /.config/cors.json (first match wins)
  2. CLI --cors flag defaults
  3. No CORS headers (if neither is configured)

CORS Middleware Behavior

The CORS middleware runs as the outermost layer in the middleware stack, ensuring that OPTIONS preflight requests are handled before authentication middleware rejects them for missing tokens.

Preflight Requests (OPTIONS)

When a preflight request arrives:

  1. The middleware checks if the Origin header matches an allowed origin.
  2. If allowed: returns 204 No Content with CORS headers:
    • Access-Control-Allow-Origin
    • Access-Control-Allow-Methods
    • Access-Control-Allow-Headers
    • Access-Control-Max-Age
    • Access-Control-Allow-Credentials (if configured)
  3. If not allowed: returns 403 Forbidden.
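The status decision above can be sketched as (hypothetical; header assembly omitted):

```rust
/// Preflight outcome: allowed origins get 204 No Content (plus CORS
/// headers, omitted here); everything else gets 403 Forbidden.
fn preflight_status(origin: &str, allowed_origins: &[&str]) -> u16 {
    if allowed_origins.contains(&"*") || allowed_origins.contains(&origin) {
        204
    } else {
        403
    }
}

fn main() {
    assert_eq!(preflight_status("https://app.example.com", &["https://app.example.com"]), 204);
    assert_eq!(preflight_status("https://evil.example", &["https://app.example.com"]), 403);
    assert_eq!(preflight_status("https://anything.example", &["*"]), 204);
    println!("ok");
}
```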

Normal Requests

For non-preflight requests with an allowed origin:

  1. The request passes through to the handler normally.
  2. CORS headers are appended to the response:
    • Access-Control-Allow-Origin
    • Access-Control-Allow-Credentials (if configured)

For requests from non-allowed origins, no CORS headers are added (the browser will block the response).

Wildcard Origin

When origins include "*", the Access-Control-Allow-Origin header is set to *. Note: when using allow_credentials: true, browsers require a specific origin rather than *.


Default CORS Headers (CLI Flag)

When only the --cors flag is used (no per-path rules), the defaults are:

  • Access-Control-Allow-Methods: GET, POST, PUT, DELETE, HEAD, OPTIONS
  • Access-Control-Allow-Headers: Content-Type, Authorization
  • Access-Control-Max-Age: 3600
  • Access-Control-Allow-Credentials: not set

Examples

Development: Allow Everything

aeordb --cors "*"

Production: Specific Origins

aeordb --cors "https://app.example.com,https://admin.example.com"

Per-Path with Credentials

Store in /.config/cors.json:

{
  "rules": [
    {
      "path": "/events/stream",
      "origins": ["https://app.example.com"],
      "allow_credentials": true,
      "max_age": 86400
    },
    {
      "path": "/engine/*",
      "origins": ["https://app.example.com", "https://admin.example.com"],
      "methods": ["GET", "PUT", "DELETE", "HEAD", "OPTIONS"],
      "allow_headers": ["Content-Type", "Authorization", "X-Request-ID"]
    }
  ]
}

Parser Plugins

Parser plugins let you transform non-JSON files (plaintext, CSV, PDF, images, etc.) into structured, queryable JSON when they are stored in AeorDB. Parsers are compiled to WebAssembly and deployed per-table, so each data collection can have its own parsing logic.

How It Works

When a file is written to a table that has a parser configured, AeorDB automatically routes the raw bytes through the parser’s WASM module. The parser receives the file data plus metadata and returns a JSON value. That JSON is then indexed by AeorDB’s query engine, making the original non-JSON file fully searchable.

Writing a Parser: Step by Step

1. Create a Rust Crate

cargo new my-parser --lib
cd my-parser

Edit Cargo.toml:

[package]
name = "my-parser"
version = "0.1.0"
edition = "2021"

[lib]
crate-type = ["cdylib"]

[dependencies]
aeordb-plugin-sdk = { path = "../aeordb-plugin-sdk" }
serde_json = "1"

The crate-type = ["cdylib"] setting is required – it tells the compiler to produce a dynamic library suitable for the WASM target.

2. Implement the Parse Function

Use the aeordb_parser! macro to generate the WASM export boilerplate. Your job is to write a function that takes a ParserInput and returns Result<serde_json::Value, String>.

use aeordb_plugin_sdk::aeordb_parser;
use aeordb_plugin_sdk::parser::*;

aeordb_parser!(parse);

fn parse(input: ParserInput) -> Result<serde_json::Value, String> {
    let text = std::str::from_utf8(&input.data)
        .map_err(|e| e.to_string())?;
    Ok(serde_json::json!({
        "text": text,
        "metadata": {
            "line_count": text.lines().count(),
            "word_count": text.split_whitespace().count(),
        }
    }))
}

The aeordb_parser! macro generates:

  • A global allocator for the WASM target
  • A handle(ptr, len) -> i64 export that deserializes the parser envelope, calls your function, and returns the serialized response as a packed pointer+length

You never interact with the raw WASM ABI directly.
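The packed pointer+length return value is an ABI detail the macro hides. A common convention (assumed here for illustration, not taken from the SDK source) puts the pointer in the high 32 bits and the byte length in the low 32:

```rust
/// Pack a guest pointer and byte length into one i64 return value.
/// (Hypothetical layout: high 32 bits = pointer, low 32 bits = length.)
fn pack(ptr: u32, len: u32) -> i64 {
    (((ptr as u64) << 32) | len as u64) as i64
}

/// Recover (ptr, len) on the host side.
fn unpack(packed: i64) -> (u32, u32) {
    let v = packed as u64;
    ((v >> 32) as u32, (v & 0xFFFF_FFFF) as u32)
}

fn main() {
    let packed = pack(0x0001_0000, 42);
    assert_eq!(unpack(packed), (0x0001_0000, 42));
    println!("ok");
}
```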

3. Build for WASM

cargo build --target wasm32-unknown-unknown --release

The compiled module lands at:

target/wasm32-unknown-unknown/release/my_parser.wasm

4. Deploy the Parser

Upload the WASM binary to a table’s plugin deployment endpoint:

curl -X PUT \
  http://localhost:3000/mydb/myschema/mytable/_deploy \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/wasm" \
  --data-binary @target/wasm32-unknown-unknown/release/my_parser.wasm

5. Configure Content-Type Routing

Create or update /.config/parsers.json to route specific content types to your parser:

{
  "parsers": {
    "text/plain": "my-parser",
    "text/csv": "csv-parser",
    "application/pdf": "pdf-parser"
  }
}

When a file with a matching Content-Type is stored, AeorDB automatically invokes the corresponding parser.

6. Configure Indexing

Define indexes over the parsed output in indexes.json so its fields become queryable:

{
  "indexes": [
    {
      "field": "text",
      "type": "fulltext"
    },
    {
      "field": "metadata.word_count",
      "type": "numeric"
    }
  ]
}

The ParserInput Struct

Your parse function receives a ParserInput with two fields:

  • data (Vec<u8>) – Raw file bytes (already base64-decoded from the wire envelope)
  • meta (FileMeta) – Metadata about the file being parsed

FileMeta Fields

  • filename (String) – File name only (e.g., "report.pdf")
  • path (String) – Full storage path (e.g., "/docs/reports/report.pdf")
  • content_type (String) – MIME type (e.g., "text/plain")
  • size (u64) – Raw file size in bytes
  • hash (String) – Hex-encoded content hash (may be empty)
  • hash_algorithm (String) – Hash algorithm used (e.g., "blake3_256")
  • created_at (i64) – Creation timestamp (ms since epoch, default 0)
  • updated_at (i64) – Last update timestamp (ms since epoch, default 0)

Real-World Example: Plaintext Parser

The built-in plaintext parser (aeordb-parsers/plaintext) demonstrates a production parser:

use aeordb_plugin_sdk::aeordb_parser;
use aeordb_plugin_sdk::parser::*;

aeordb_parser!(parse);

fn parse(input: ParserInput) -> Result<serde_json::Value, String> {
    let text = std::str::from_utf8(&input.data)
        .map_err(|e| format!("not valid UTF-8: {}", e))?;

    let line_count = text.lines().count();
    let word_count = text.split_whitespace().count();
    let char_count = text.chars().count();
    let byte_count = input.data.len();

    // Extract first line as a "title" (common convention for text files)
    let title = text.lines().next().unwrap_or("").trim().to_string();

    // Detect if it looks like source code
    let has_braces = text.contains('{') && text.contains('}');
    let has_imports = text.contains("import ")
        || text.contains("use ")
        || text.contains("#include");
    let looks_like_code = has_braces || has_imports;

    Ok(serde_json::json!({
        "text": text,
        "metadata": {
            "filename": input.meta.filename,
            "content_type": input.meta.content_type,
            "size": byte_count,
            "line_count": line_count,
            "word_count": word_count,
            "char_count": char_count,
        },
        "title": title,
        "looks_like_code": looks_like_code,
    }))
}

This parser:

  • Validates UTF-8 encoding (returns an error for binary data)
  • Extracts text statistics (lines, words, characters)
  • Pulls the first line as a title
  • Heuristically detects source code

Error Handling

Return Err(String) from your parse function to signal a failure. AeorDB will store the error in the parser response and the file will not be indexed. The original file is still stored – only parsing/indexing is skipped.

fn parse(input: ParserInput) -> Result<serde_json::Value, String> {
    if input.data.is_empty() {
        return Err("empty file".to_string());
    }
    // ...
}

See Also

  • Query Plugins – plugins that query the database and return custom responses
  • SDK Reference – complete type reference for the plugin SDK

Query Plugins

Query plugins are WASM modules that can read, write, delete, query, and aggregate data inside AeorDB, then return custom HTTP responses. They are the extension mechanism for building custom API endpoints, computed views, data transformations, or any logic that needs to run server-side.

How It Works

A query plugin receives a PluginRequest (containing the HTTP body and metadata) and a PluginContext that provides host functions for interacting with the database. The plugin performs whatever logic it needs – querying data, writing files, aggregating results – and returns a PluginResponse with a status code, body, and content type.

Writing a Query Plugin: Step by Step

1. Create a Rust Crate

cargo new my-plugin --lib
cd my-plugin

Edit Cargo.toml:

[package]
name = "my-plugin"
version = "0.1.0"
edition = "2021"

[lib]
crate-type = ["cdylib"]

[dependencies]
aeordb-plugin-sdk = { path = "../aeordb-plugin-sdk" }
serde_json = "1"

2. Implement the Handler

Use the aeordb_query_plugin! macro and write a function that takes (PluginContext, PluginRequest) and returns Result<PluginResponse, PluginError>.

use aeordb_plugin_sdk::prelude::*;
use aeordb_plugin_sdk::aeordb_query_plugin;

aeordb_query_plugin!(handle);

fn handle(ctx: PluginContext, request: PluginRequest) -> Result<PluginResponse, PluginError> {
    let results = ctx.query("/users")
        .field("name").contains("Alice")
        .field("age").gt_u64(21)
        .limit(10)
        .execute()?;

    PluginResponse::json(200, &serde_json::json!({
        "users": results,
        "count": results.len()
    })).map_err(|e| PluginError::SerializationFailed(e.to_string()))
}

The aeordb_query_plugin! macro generates:

  • A global allocator for the WASM target
  • An alloc(size) -> ptr export for host-to-guest memory allocation
  • A handle(ptr, len) -> i64 export that deserializes the request, creates a PluginContext, calls your function, and returns the serialized response

3. Build and Deploy

cargo build --target wasm32-unknown-unknown --release

curl -X PUT \
  http://localhost:3000/mydb/myschema/mytable/_deploy \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/wasm" \
  --data-binary @target/wasm32-unknown-unknown/release/my_plugin.wasm

4. Invoke the Plugin

curl -X POST \
  http://localhost:3000/mydb/myschema/mytable/_invoke/my-plugin \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "Alice"}'

The PluginRequest Struct

  • arguments (Vec<u8>) – Raw argument bytes (the HTTP request body forwarded to the plugin)
  • metadata (HashMap<String, String>) – Key-value metadata about the invocation context

The metadata map typically contains:

  • function_name – The function name from the invoke URL (e.g., "echo", "read")

You can use function_name to multiplex a single plugin into multiple operations:

fn handle(ctx: PluginContext, request: PluginRequest) -> Result<PluginResponse, PluginError> {
    let function = request.metadata
        .get("function_name")
        .map(|s| s.as_str())
        .unwrap_or("default");

    match function {
        "search" => handle_search(ctx, &request),
        "stats" => handle_stats(ctx, &request),
        _ => Ok(PluginResponse::error(404, &format!("Unknown function: {}", function))),
    }
}

The PluginContext

PluginContext is your handle for calling AeorDB host functions from inside the WASM sandbox. It is created automatically by the macro and passed to your handler.

File Operations

// Read a file -- returns FileData { data, content_type, size }
let file = ctx.read_file("/mydb/users/alice.json")?;

// Write a file (create or overwrite)
ctx.write_file("/mydb/output/result.json", b"{\"ok\":true}", "application/json")?;

// Delete a file
ctx.delete_file("/mydb/temp/scratch.json")?;

// Get file metadata -- returns FileMetadata { path, size, content_type, created_at, updated_at }
let meta = ctx.file_metadata("/mydb/users/alice.json")?;

// List directory entries -- returns Vec<DirEntry { name, entry_type, size }>
let entries = ctx.list_directory("/mydb/users/")?;

Query

Use ctx.query(path) to get a QueryBuilder with a fluent API:

let results = ctx.query("/users")
    .field("name").contains("Alice")
    .field("age").gt_u64(21)
    .sort("name", SortDirection::Asc)
    .limit(10)
    .offset(0)
    .execute()?;

// results: Vec<QueryResult { path, score, matched_by }>

Aggregate

Use ctx.aggregate(path) to get an AggregateBuilder:

let stats = ctx.aggregate("/orders")
    .count()
    .sum("total")
    .avg("total")
    .min_val("total")
    .max_val("total")
    .group_by("status")
    .limit(100)
    .execute()?;

// stats: AggregateResult { groups, total_count }

The QueryBuilder

The QueryBuilder provides a fluent API for composing queries. Multiple conditions on the top level are implicitly ANDed.

Field Operators

Start a field condition with .field("name"), then chain an operator:

Equality:

  • .eq(value: &[u8]) – exact match on raw bytes
  • .eq_u64(value) – exact match on u64
  • .eq_i64(value) – exact match on i64
  • .eq_f64(value) – exact match on f64
  • .eq_str(value) – exact match on string
  • .eq_bool(value) – exact match on boolean

Comparison:

  • .gt(value: &[u8]), .gt_u64(value), .gt_str(value), .gt_f64(value) – greater than
  • .lt(value: &[u8]), .lt_u64(value), .lt_str(value), .lt_f64(value) – less than

Range:

  • .between(min: &[u8], max: &[u8]) – inclusive range on raw bytes
  • .between_u64(min, max) – inclusive range on u64
  • .between_str(min, max) – inclusive range on strings

Set Membership:

  • .in_values(values: &[&[u8]]) – match any of the given byte values
  • .in_u64(values: &[u64]) – match any of the given u64 values
  • .in_str(values: &[&str]) – match any of the given strings

Text Search:

  • .contains(text) – substring / trigram contains
  • .similar(text, threshold) – trigram similarity (threshold 0.0–1.0)
  • .phonetic(text) – Soundex/Metaphone phonetic match
  • .fuzzy(text) – Levenshtein distance fuzzy match
  • .match_query(text) – full-text match

Boolean Combinators

// AND group
ctx.query("/users")
    .and(|q| q.field("name").contains("Alice").field("active").eq_bool(true))
    .limit(10)
    .execute()?;

// OR group
ctx.query("/users")
    .or(|q| q.field("role").eq_str("admin").field("role").eq_str("superadmin"))
    .execute()?;

// NOT
ctx.query("/users")
    .not(|q| q.field("status").eq_str("banned"))
    .execute()?;

Sorting and Pagination

ctx.query("/users")
    .field("active").eq_bool(true)
    .sort("created_at", SortDirection::Desc)
    .sort("name", SortDirection::Asc)
    .limit(25)
    .offset(50)
    .execute()?;

The PluginResponse

Three builder methods for constructing responses:

PluginResponse::json(status_code, &body)

Serializes any Serialize type to JSON. Sets Content-Type: application/json.

PluginResponse::json(200, &serde_json::json!({"ok": true}))
    .map_err(|e| PluginError::SerializationFailed(e.to_string()))

PluginResponse::text(status_code, body)

Returns a plain text response. Sets Content-Type: text/plain.

Ok(PluginResponse::text(201, "Created by plugin"))

PluginResponse::error(status_code, message)

Returns a JSON error response in the form {"error": "<message>"}.

Ok(PluginResponse::error(404, "User not found"))

Real-World Example: Echo Plugin

The built-in echo plugin (aeordb-plugins/echo-plugin) demonstrates multiplexing a single plugin across multiple operations:

use aeordb_plugin_sdk::prelude::*;
use aeordb_plugin_sdk::aeordb_query_plugin;

aeordb_query_plugin!(echo_handle);

fn echo_handle(ctx: PluginContext, request: PluginRequest) -> Result<PluginResponse, PluginError> {
    let function = request.metadata
        .get("function_name")
        .map(|s| s.as_str())
        .unwrap_or("echo");

    match function {
        "echo" => {
            PluginResponse::json(200, &serde_json::json!({
                "echo": true,
                "metadata": request.metadata,
                "body_len": request.arguments.len(),
            }))
            .map_err(|e| PluginError::SerializationFailed(e.to_string()))
        }
        "read" => {
            let path = std::str::from_utf8(&request.arguments)
                .map_err(|e| PluginError::ExecutionFailed(e.to_string()))?;
            match ctx.read_file(path) {
                Ok(file) => PluginResponse::json(200, &serde_json::json!({
                    "size": file.size,
                    "content_type": file.content_type,
                    "data_len": file.data.len(),
                }))
                .map_err(|e| PluginError::SerializationFailed(e.to_string())),
                Err(e) => Ok(PluginResponse::error(404, &e.to_string())),
            }
        }
        "write" => {
            match ctx.write_file("/plugin-output/result.json", b"{\"written\":true}", "application/json") {
                Ok(()) => PluginResponse::json(201, &serde_json::json!({"ok": true}))
                    .map_err(|e| PluginError::SerializationFailed(e.to_string())),
                Err(e) => Ok(PluginResponse::error(500, &e.to_string())),
            }
        }
        "delete" => {
            let path = std::str::from_utf8(&request.arguments)
                .map_err(|e| PluginError::ExecutionFailed(e.to_string()))?;
            match ctx.delete_file(path) {
                Ok(()) => PluginResponse::json(200, &serde_json::json!({"deleted": true}))
                    .map_err(|e| PluginError::SerializationFailed(e.to_string())),
                Err(e) => Ok(PluginResponse::error(500, &e.to_string())),
            }
        }
        "metadata" => {
            let path = std::str::from_utf8(&request.arguments)
                .map_err(|e| PluginError::ExecutionFailed(e.to_string()))?;
            match ctx.file_metadata(path) {
                Ok(meta) => PluginResponse::json(200, &serde_json::json!(meta))
                    .map_err(|e| PluginError::SerializationFailed(e.to_string())),
                Err(e) => Ok(PluginResponse::error(404, &e.to_string())),
            }
        }
        "list" => {
            let path = std::str::from_utf8(&request.arguments)
                .map_err(|e| PluginError::ExecutionFailed(e.to_string()))?;
            match ctx.list_directory(path) {
                Ok(entries) => PluginResponse::json(200, &serde_json::json!({"entries": entries}))
                    .map_err(|e| PluginError::SerializationFailed(e.to_string())),
                Err(e) => Ok(PluginResponse::error(500, &e.to_string())),
            }
        }
        "status" => Ok(PluginResponse::text(201, "Created by plugin")),
        _ => Ok(PluginResponse::error(404, &format!("Unknown function: {}", function))),
    }
}

See Also

  • Parser Plugins – plugins that transform non-JSON files into queryable data
  • SDK Reference – complete type reference for the plugin SDK

Plugin SDK Reference

Complete type reference for the aeordb-plugin-sdk crate. This covers every public struct, enum, trait, and method available to plugin authors.

Macros

aeordb_parser!(fn_name)

Generates WASM exports for a parser plugin. Your function must have the signature:

fn fn_name(input: ParserInput) -> Result<serde_json::Value, String>

Generated exports:

  • handle(ptr: i32, len: i32) -> i64 – deserializes the parser envelope, calls your function, returns packed pointer+length to the serialized response

aeordb_query_plugin!(fn_name)

Generates WASM exports for a query plugin. Your function must have the signature:

fn fn_name(ctx: PluginContext, request: PluginRequest) -> Result<PluginResponse, PluginError>

Generated exports:

  • alloc(size: i32) -> i32 – allocates guest memory for the host to write request data
  • handle(ptr: i32, len: i32) -> i64 – deserializes the request, creates a PluginContext, calls your function, returns packed pointer+length to the serialized response

Prelude

Import everything you need with:

use aeordb_plugin_sdk::prelude::*;

This re-exports: PluginError, PluginRequest, PluginResponse, ParserInput, FileMeta, PluginContext, FileData, DirEntry, FileMetadata, QueryResult, AggregateResult, SortDirection.


PluginRequest

Request passed to a query plugin when it is invoked.

  • arguments (Vec<u8>) – Raw argument bytes (e.g., the HTTP request body forwarded to the plugin)
  • metadata (HashMap<String, String>) – Key-value metadata about the invocation context

Common metadata keys:

  • function_name – The function name from the invoke URL

PluginResponse

Response returned by a plugin after handling a request.

  • status_code (u16) – HTTP-style status code
  • body (Vec<u8>) – Raw response body bytes
  • content_type (Option<String>) – MIME content type of the body
  • headers (HashMap<String, String>) – Additional response headers

Builder Methods

PluginResponse::json(status_code: u16, body: &T) -> Result<Self, serde_json::Error>

Serializes body (any Serialize type) to JSON. Sets content_type to "application/json".

PluginResponse::json(200, &serde_json::json!({"ok": true}))

PluginResponse::text(status_code: u16, body: impl Into<String>) -> Self

Creates a plain text response. Sets content_type to "text/plain".

PluginResponse::text(200, "Hello, world!")

PluginResponse::error(status_code: u16, message: impl Into<String>) -> Self

Creates a JSON error response: {"error": "<message>"}. Sets content_type to "application/json".

PluginResponse::error(404, "not found")

PluginError

Error enum for the plugin system.

  • NotFound(String) – The plugin could not be found
  • ExecutionFailed(String) – The plugin failed during execution
  • SerializationFailed(String) – The request or response could not be serialized/deserialized
  • ResourceLimitExceeded(String) – The plugin exceeded memory, fuel, or other resource limits
  • InvalidModule(String) – An invalid or corrupt WASM module was provided
  • Internal(String) – A generic internal error

All variants carry a String message. PluginError implements Display, Debug, and Error.


PluginContext

Guest-side handle for calling AeorDB host functions from WASM. Created automatically by aeordb_query_plugin! and passed to the handler.

On non-WASM targets (native compilation), all methods return PluginError::ExecutionFailed – this allows IDE support and unit testing of plugin logic without a WASM runtime.

File Operations

read_file(&self, path: &str) -> Result<FileData, PluginError>

Read a file at the given path. Returns the decoded file bytes, content type, and size.

write_file(&self, path: &str, data: &[u8], content_type: &str) -> Result<(), PluginError>

Write (create or overwrite) a file. Data is base64-encoded on the wire automatically.

delete_file(&self, path: &str) -> Result<(), PluginError>

Delete a file at the given path.

file_metadata(&self, path: &str) -> Result<FileMetadata, PluginError>

Retrieve metadata for a file without reading its contents.

list_directory(&self, path: &str) -> Result<Vec<DirEntry>, PluginError>

List directory entries at the given path.

Query and Aggregation

query(&self, path: &str) -> QueryBuilder

Start building a query against files at the given path. See QueryBuilder.

aggregate(&self, path: &str) -> AggregateBuilder

Start building an aggregation against files at the given path. See AggregateBuilder.


FileData

Raw file data returned by read_file.

  • data: Vec<u8> – Decoded file bytes
  • content_type: String – MIME content type
  • size: u64 – File size in bytes

DirEntry

A single directory entry returned by list_directory.

  • name: String – Entry name (file or directory name, not the full path)
  • entry_type: String – "file" or "directory"
  • size: u64 – Size in bytes (0 for directories; defaults to 0 if absent)

FileMetadata

Metadata about a stored file.

  • path: String – Full storage path
  • size: u64 – File size in bytes
  • content_type: Option<String> – MIME content type (if known)
  • created_at: i64 – Creation timestamp (ms since epoch)
  • updated_at: i64 – Last update timestamp (ms since epoch)

ParserInput

Input to a parser function.

  • data: Vec<u8> – Raw file bytes (base64-decoded from the wire envelope)
  • meta: FileMeta – File metadata

FileMeta

Metadata about the file being parsed (available inside parser plugins).

  • filename: String – File name only (e.g., "report.pdf")
  • path: String – Full storage path (e.g., "/docs/reports/report.pdf")
  • content_type: String – MIME type
  • size: u64 – Raw file size in bytes
  • hash: String – Hex-encoded content hash (may be empty)
  • hash_algorithm: String – Hash algorithm (e.g., "blake3_256"; may be empty)
  • created_at: i64 – Creation timestamp (ms since epoch, default 0)
  • updated_at: i64 – Last update timestamp (ms since epoch, default 0)

QueryBuilder

Fluent builder for constructing AeorDB queries. Obtained via PluginContext::query(path) or QueryBuilder::new(path).

Field Conditions

Start with .field("name") to get a FieldQueryBuilder, then chain one operator:

Equality

  • eq(value: &[u8]) -> QueryBuilder – Exact match on raw bytes
  • eq_u64(value: u64) -> QueryBuilder – Exact match on u64
  • eq_i64(value: i64) -> QueryBuilder – Exact match on i64
  • eq_f64(value: f64) -> QueryBuilder – Exact match on f64
  • eq_str(value: &str) -> QueryBuilder – Exact match on string
  • eq_bool(value: bool) -> QueryBuilder – Exact match on boolean

Greater Than

  • gt(value: &[u8]) -> QueryBuilder – Greater than on raw bytes
  • gt_u64(value: u64) -> QueryBuilder – Greater than on u64
  • gt_str(value: &str) -> QueryBuilder – Greater than on string
  • gt_f64(value: f64) -> QueryBuilder – Greater than on f64

Less Than

  • lt(value: &[u8]) -> QueryBuilder – Less than on raw bytes
  • lt_u64(value: u64) -> QueryBuilder – Less than on u64
  • lt_str(value: &str) -> QueryBuilder – Less than on string
  • lt_f64(value: f64) -> QueryBuilder – Less than on f64

Range

  • between(min: &[u8], max: &[u8]) -> QueryBuilder – Inclusive range on raw bytes
  • between_u64(min: u64, max: u64) -> QueryBuilder – Inclusive range on u64
  • between_str(min: &str, max: &str) -> QueryBuilder – Inclusive range on strings

Set Membership

  • in_values(values: &[&[u8]]) -> QueryBuilder – Match any of the given byte values
  • in_u64(values: &[u64]) -> QueryBuilder – Match any of the given u64 values
  • in_str(values: &[&str]) -> QueryBuilder – Match any of the given strings

Text Search

  • contains(text: &str) -> QueryBuilder – Substring / trigram contains search
  • similar(text: &str, threshold: f64) -> QueryBuilder – Trigram similarity search (threshold 0.0–1.0)
  • phonetic(text: &str) -> QueryBuilder – Soundex/Metaphone phonetic search
  • fuzzy(text: &str) -> QueryBuilder – Levenshtein-distance fuzzy search
  • match_query(text: &str) -> QueryBuilder – Full-text match query

Boolean Combinators

  • and(build_fn: FnOnce(QueryBuilder) -> QueryBuilder) -> Self – AND group built via closure
  • or(build_fn: FnOnce(QueryBuilder) -> QueryBuilder) -> Self – OR group built via closure
  • not(build_fn: FnOnce(QueryBuilder) -> QueryBuilder) -> Self – Negate a condition built via closure
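The closure-based combinator pattern can be illustrated with a self-contained sketch. Cond and QB below are illustrative stand-ins for the SDK's builder types, not the real API:

```rust
// Illustrative stand-ins for the SDK's builder types -- not the real API.
#[derive(Debug, PartialEq)]
enum Cond {
    EqStr(String, String),
    GtU64(String, u64),
    And(Vec<Cond>),
    Or(Vec<Cond>),
}

#[derive(Default)]
struct QB {
    conds: Vec<Cond>,
}

impl QB {
    fn eq_str(mut self, field: &str, v: &str) -> Self {
        self.conds.push(Cond::EqStr(field.into(), v.into()));
        self
    }
    fn gt_u64(mut self, field: &str, v: u64) -> Self {
        self.conds.push(Cond::GtU64(field.into(), v));
        self
    }
    // Mirrors `and(build_fn: FnOnce(QueryBuilder) -> QueryBuilder)`:
    // the closure fills a fresh builder whose conditions become one group.
    fn and(mut self, f: impl FnOnce(QB) -> QB) -> Self {
        self.conds.push(Cond::And(f(QB::default()).conds));
        self
    }
    fn or(mut self, f: impl FnOnce(QB) -> QB) -> Self {
        self.conds.push(Cond::Or(f(QB::default()).conds));
        self
    }
}
```

A call like `QB::default().eq_str("status", "active").or(|q| q.gt_u64("age", 30).eq_str("role", "admin"))` yields a top-level condition list containing a nested OR group — the shape the real builder serializes via to_json.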

Sorting and Pagination

  • sort(field: impl Into<String>, direction: SortDirection) -> Self – Add a sort field
  • limit(count: usize) -> Self – Limit result count
  • offset(count: usize) -> Self – Skip the first N results

Execution

  • execute(self) -> Result<Vec<QueryResult>, PluginError> – Execute the query via host FFI
  • to_json(&self) -> serde_json::Value – Serialize builder state to JSON (for inspection/debugging)

QueryResult

A single query result returned by the host.

  • path: String – Path of the matching file
  • score: f64 – Relevance score (higher is better, default 0.0)
  • matched_by: Vec<String> – Names of the indexes/operations that matched (default empty)

SortDirection

Sort direction for query results.

  • Asc – Ascending order
  • Desc – Descending order

AggregateBuilder

Fluent builder for constructing AeorDB aggregation queries. Obtained via PluginContext::aggregate(path) or AggregateBuilder::new(path).

Aggregation Operations

  • count(self) -> Self – Request a count aggregation
  • sum(field: impl Into<String>) -> Self – Request a sum on a field
  • avg(field: impl Into<String>) -> Self – Request an average on a field
  • min_val(field: impl Into<String>) -> Self – Request a minimum value on a field
  • max_val(field: impl Into<String>) -> Self – Request a maximum value on a field

Grouping and Filtering

  • group_by(field: impl Into<String>) -> Self – Group results by a field
  • filter(build_fn: FnOnce(QueryBuilder) -> QueryBuilder) -> Self – Add a where condition via closure
  • limit(count: usize) -> Self – Limit the number of groups returned

Execution

  • execute(self) -> Result<AggregateResult, PluginError> – Execute the aggregation via host FFI
  • to_json(&self) -> serde_json::Value – Serialize builder state to JSON

AggregateResult

Aggregation result returned by the host.

  • groups: Vec<serde_json::Value> – Per-group aggregation results (default empty)
  • total_count: Option<u64> – Total count if count was requested without group_by

Garbage Collection

AeorDB is an append-only database: writes, overwrites, and deletes all append new entries without modifying existing ones. Over time, this leaves orphaned entries – old file versions, deleted content, stale directory indexes – consuming disk space. Garbage collection (GC) reclaims that space.

What GC Does

GC uses a mark-and-sweep algorithm:

  1. Mark phase: Starting from HEAD, all snapshots, and all forks, GC walks every reachable directory tree. It marks every entry that is still live: directory indexes, file records, content chunks, system tables, task queue records, and deletion records.

  2. Sweep phase: GC iterates all entries in the key-value store. Any entry whose hash is not in the live set is garbage. Non-live entries are overwritten in-place with DeletionRecord or Void entries, then removed from the KV index.
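The two phases above can be sketched on a toy store that maps an entry hash to the hashes it references. Everything here (the u64 hashes, store, and gc function) is an illustrative stand-in for AeorDB's internals, not the actual implementation:

```rust
use std::collections::{HashMap, HashSet};

// Toy mark-and-sweep: `store` maps an entry hash to the hashes it
// references; `roots` stand in for HEAD, snapshots, and forks.
fn gc(store: &mut HashMap<u64, Vec<u64>>, roots: &[u64]) -> usize {
    // Mark: depth-first walk of everything reachable from the roots.
    let mut live = HashSet::new();
    let mut stack: Vec<u64> = roots.to_vec();
    while let Some(h) = stack.pop() {
        if live.insert(h) {
            if let Some(children) = store.get(&h) {
                stack.extend(children.iter().copied());
            }
        }
    }
    // Sweep: any entry whose hash was never marked is garbage.
    let garbage: Vec<u64> = store
        .keys()
        .filter(|&&h| !live.contains(&h))
        .copied()
        .collect();
    for h in &garbage {
        store.remove(h);
    }
    garbage.len()
}
```

The real sweep overwrites entries in-place rather than removing map keys, but the reachability logic is the same.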

What Gets Collected

  • Old file versions that have been overwritten
  • Content chunks no longer referenced by any file record
  • Directory indexes from previous tree states
  • Entries orphaned by deletes

What Does NOT Get Collected

  • HEAD and all reachable entries from HEAD
  • All snapshot root trees and their descendants
  • All fork root trees and their descendants
  • System table entries (/.system, /.config)
  • Task queue records (registry + individual task entries)
  • DeletionRecord entries (needed for KV rebuild from .aeordb scan)

Running GC

CLI

# Run GC
aeordb gc --database data.aeordb

# Dry run -- report what would be collected without deleting
aeordb gc --database data.aeordb --dry-run

Example output:

AeorDB Garbage Collection
Database: data.aeordb

Versions scanned: 3
Live entries:     1247
Garbage entries:  89
Reclaimed:        1.2 MB
Duration:         0.3s

Dry run output:

AeorDB Garbage Collection [DRY RUN]
Database: data.aeordb

[DRY RUN] Would collect 89 garbage entries (1.2 MB)

HTTP API

Synchronous GC (blocks until complete):

# Run GC
curl -X POST http://localhost:3000/admin/gc \
  -H "Authorization: Bearer $API_KEY"

# Dry run
curl -X POST http://localhost:3000/admin/gc?dry_run=true \
  -H "Authorization: Bearer $API_KEY"

Response:

{
  "versions_scanned": 3,
  "live_entries": 1247,
  "garbage_entries": 89,
  "reclaimed_bytes": 1258291,
  "duration_ms": 312,
  "dry_run": false
}

Background GC (returns immediately, runs as a task):

curl -X POST http://localhost:3000/admin/tasks/gc \
  -H "Authorization: Bearer $API_KEY"

This enqueues a GC task that the background task worker will pick up. Track its progress via the task system.

When to Run GC

  • After bulk deletes: If you delete a large number of files, their content chunks become garbage.
  • After bulk overwrites: Updating many files leaves old versions behind.
  • After version cleanup: If you delete old snapshots, the entries they exclusively referenced become garbage.
  • Periodically: Set up a cron schedule for automatic GC.

Example cron configuration (/.config/cron.json):

{
  "schedules": [
    {
      "id": "nightly-gc",
      "task_type": "gc",
      "schedule": "0 3 * * *",
      "args": {},
      "enabled": true
    }
  ]
}

Concurrency and Safety

GC should not be run concurrently with writes. The sweep phase re-verifies each candidate against the current KV state before overwriting to mitigate races, but for full safety, callers should ensure exclusive access during GC.

Crash safety: If the process crashes mid-sweep, the .aeordb file may contain partially overwritten entries. On restart, the .kv index file will be stale and must be deleted to trigger a full rebuild from the .aeordb file scan. The rebuild replays deletion records and reconstructs the index, so no committed data is lost. Garbage entries that were not yet swept will persist until the next GC run.

Performance

At scale, expect approximately 10K entries/sec sweep throughput. The mark phase is faster since it only walks reachable trees. The sweep phase writes are batched with a single sync at the end for performance.

Reindexing

When you change a table’s index configuration (indexes.json), existing files need to be re-processed through the indexing pipeline. AeorDB handles this through background reindex tasks.

Why Reindex

  • You added a new index field (e.g., adding a fulltext index on a field that was previously unindexed)
  • You changed the index type for a field (e.g., switching from exact to fulltext)
  • You added or changed a parser plugin, and existing files need to be re-parsed
  • You modified index settings (e.g., changing similarity thresholds)

Automatic Reindexing

Changing indexes.json via the API automatically triggers a background reindex task for the affected directory. You do not need to manually trigger reindexing in most cases.

Manual Reindexing

HTTP API

curl -X POST http://localhost:3000/admin/tasks/reindex \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"path": "/data/"}'

The path argument specifies which directory to reindex. The task worker will:

  1. Read the indexes.json configuration for that path
  2. List all file entries in the directory
  3. Re-read each file and run it through the indexing pipeline
  4. Track progress and update checkpoints

Progress Tracking

During an active reindex, query responses include a meta.reindexing field indicating that results may be incomplete:

{
  "results": [...],
  "meta": {
    "reindexing": true
  }
}

You can also check progress through the task system:

curl http://localhost:3000/admin/tasks \
  -H "Authorization: Bearer $API_KEY"

The response includes progress details:

{
  "tasks": [
    {
      "id": "abc123",
      "task_type": "reindex",
      "status": "running",
      "progress": {
        "task_id": "abc123",
        "task_type": "reindex",
        "progress": 0.45,
        "eta_ms": 12000,
        "indexed_count": 450,
        "total_count": 1000,
        "message": "indexed 450/1000 files"
      }
    }
  ]
}

Circuit Breaker

If 10 consecutive files fail to index, the reindex task trips a circuit breaker and fails with an error:

circuit breaker: 10 consecutive indexing failures

This prevents runaway error loops when the index configuration or parser is fundamentally broken. Fix the underlying issue and trigger a new reindex.
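The breaker amounts to a consecutive-failure counter that resets on any success. A minimal sketch — Breaker is a hypothetical name, not an AeorDB type:

```rust
// Hypothetical breaker type; the threshold of 10 matches the behavior
// described above.
struct Breaker {
    consecutive: u32,
    threshold: u32,
}

impl Breaker {
    fn new(threshold: u32) -> Self {
        Self { consecutive: 0, threshold }
    }

    // A success resets the counter; the Nth straight failure trips.
    fn record(&mut self, ok: bool) -> Result<(), String> {
        if ok {
            self.consecutive = 0;
            return Ok(());
        }
        self.consecutive += 1;
        if self.consecutive >= self.threshold {
            Err(format!(
                "circuit breaker: {} consecutive indexing failures",
                self.threshold
            ))
        } else {
            Ok(())
        }
    }
}
```

Because a single successful file resets the counter, only a genuinely broken configuration or parser can trip the breaker.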

Checkpoint and Resume

Reindex tasks save a checkpoint after each batch (50 files). If the server crashes or the task is cancelled and restarted, it resumes from the last checkpoint rather than starting over.

The checkpoint is the name of the last successfully processed file (files are processed in alphabetical order for deterministic ordering).
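Checkpoint-based resume over a sorted file list can be sketched as follows; resume_batches is illustrative, not the actual implementation:

```rust
// `checkpoint` is the name of the last file already processed; files are
// sorted so the resume point is deterministic.
fn resume_batches(
    mut files: Vec<String>,
    checkpoint: Option<&str>,
    batch: usize,
) -> Vec<Vec<String>> {
    files.sort();
    let start = match checkpoint {
        // Resume at the first file that sorts after the checkpoint.
        Some(cp) => files
            .iter()
            .position(|f| f.as_str() > cp)
            .unwrap_or(files.len()),
        None => 0,
    };
    files[start..].chunks(batch).map(|c| c.to_vec()).collect()
}
```

With checkpoint "a" over files a, b, c and a batch size of 2, only b and c are reprocessed.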

Cancellation

Cancel a running reindex task:

curl -X POST http://localhost:3000/admin/tasks/{task_id}/cancel \
  -H "Authorization: Bearer $API_KEY"

The task checks for cancellation after each batch, so it will stop within one batch cycle.

Batch Processing

Files are processed in batches of 50. After each batch, the task:

  1. Updates the checkpoint to the last file in the batch
  2. Computes progress percentage and ETA (using a rolling average of the last 10 batch times)
  3. Checks for cancellation

Task System & Cron

AeorDB runs long-running operations (reindexing, garbage collection) as background tasks. Tasks are managed by a task queue, executed by a dedicated worker, and can be triggered manually or on a cron schedule.

Built-in Task Types

  • reindex – Re-run the indexing pipeline on all files under a directory
  • gc – Run garbage collection (mark-and-sweep)

Task Lifecycle

pending  -->  running  -->  completed
                       -->  failed
                       -->  cancelled
  1. Pending: Task is enqueued and waiting for the worker to pick it up.
  2. Running: Worker has dequeued the task and is executing it.
  3. Completed: Task finished successfully.
  4. Failed: Task encountered an error (e.g., circuit breaker tripped, GC failed).
  5. Cancelled: Task was cancelled by the user between batch iterations.

On server startup, any tasks left in Running state (from a previous crash) are reset to Pending so they can be re-executed.

API

List Tasks

curl http://localhost:3000/admin/tasks \
  -H "Authorization: Bearer $API_KEY"

Response:

{
  "tasks": [
    {
      "id": "abc123",
      "task_type": "reindex",
      "status": "running",
      "args": {"path": "/data/"},
      "created_at": 1700000000000,
      "progress": {
        "task_id": "abc123",
        "task_type": "reindex",
        "progress": 0.65,
        "eta_ms": 8000,
        "indexed_count": 650,
        "total_count": 1000,
        "message": "indexed 650/1000 files"
      }
    },
    {
      "id": "def456",
      "task_type": "gc",
      "status": "completed",
      "args": {},
      "created_at": 1699999000000
    }
  ]
}

Trigger a Task

Reindex:

curl -X POST http://localhost:3000/admin/tasks/reindex \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"path": "/data/"}'

Garbage Collection:

curl -X POST http://localhost:3000/admin/tasks/gc \
  -H "Authorization: Bearer $API_KEY"

Cancel a Task

curl -X POST http://localhost:3000/admin/tasks/{task_id}/cancel \
  -H "Authorization: Bearer $API_KEY"

Cancellation is cooperative: the task checks for cancellation between batch iterations. It will not interrupt a batch in progress.

Progress Tracking

Running tasks expose in-memory progress information:

  • task_id: String – Task identifier
  • task_type: String – Task type (e.g., "reindex")
  • progress: f64 – Completion fraction (0.0 to 1.0)
  • eta_ms: Option<i64> – Estimated time remaining in milliseconds
  • indexed_count: usize – Number of items processed so far
  • total_count: usize – Total items to process
  • message: Option<String> – Human-readable progress message

Progress is computed using a rolling average of the last 10 batch execution times for ETA calculation.
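The ETA computation can be sketched as below; the caller is assumed to keep only the 10 most recent batch durations in the deque, and eta_ms is a hypothetical helper rather than the actual implementation:

```rust
use std::collections::VecDeque;

// The caller maintains a rolling window: push_back each batch duration
// and pop_front once the deque grows past 10 samples.
fn eta_ms(
    recent_batch_ms: &VecDeque<u64>,
    remaining_items: usize,
    batch_size: usize,
) -> Option<i64> {
    if recent_batch_ms.is_empty() {
        return None; // no samples yet, no estimate
    }
    let avg = recent_batch_ms.iter().sum::<u64>() as f64 / recent_batch_ms.len() as f64;
    let batches_left = (remaining_items + batch_size - 1) / batch_size; // ceil division
    Some((avg * batches_left as f64) as i64)
}
```

With two 100 ms samples, 100 items left, and a batch size of 50, the estimate is 200 ms.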

During an active reindex, query responses include meta.reindexing: true so clients know results may be incomplete.

Cron Scheduling

AeorDB includes a built-in cron scheduler that checks /.config/cron.json every 60 seconds and enqueues matching tasks.

Configuration

Store the cron configuration at /.config/cron.json:

{
  "schedules": [
    {
      "id": "nightly-gc",
      "task_type": "gc",
      "schedule": "0 3 * * *",
      "args": {},
      "enabled": true
    },
    {
      "id": "hourly-reindex",
      "task_type": "reindex",
      "schedule": "0 * * * *",
      "args": {"path": "/data/"},
      "enabled": true
    }
  ]
}

Cron Expression Format

Standard 5-field Unix cron expressions:

minute  hour  day-of-month  month  day-of-week
  *       *        *          *        *

Examples:

  • 0 3 * * * – every day at 3:00 AM
  • */15 * * * * – every 15 minutes
  • 0 0 * * 0 – every Sunday at midnight
  • 30 2 1 * * – 2:30 AM on the 1st of every month
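A minimal matcher covering just the syntax used in these examples — *, */n, and plain numbers — can be sketched as below. Full cron also supports ranges, lists, and OR semantics between day-of-month and day-of-week, all of which this sketch omits:

```rust
// Supports `*`, `*/n`, and plain numbers only; illustrative, not the
// scheduler's actual parser.
fn field_matches(spec: &str, value: u32) -> bool {
    if spec == "*" {
        return true;
    }
    if let Some(step) = spec.strip_prefix("*/") {
        return step
            .parse::<u32>()
            .map(|n| n != 0 && value % n == 0)
            .unwrap_or(false);
    }
    spec.parse::<u32>().map(|n| n == value).unwrap_or(false)
}

fn cron_matches(expr: &str, minute: u32, hour: u32, dom: u32, month: u32, dow: u32) -> bool {
    let fields: Vec<&str> = expr.split_whitespace().collect();
    let values = [minute, hour, dom, month, dow];
    fields.len() == 5
        && values
            .iter()
            .zip(fields.iter().copied())
            .all(|(v, spec)| field_matches(spec, *v))
}
```

For example, "0 3 * * *" matches minute 0, hour 3 on any date, while "*/15 * * * *" matches any minute divisible by 15.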

Cron API

Create/update the schedule:

curl -X PUT http://localhost:3000/.config/cron.json \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "schedules": [
      {
        "id": "nightly-gc",
        "task_type": "gc",
        "schedule": "0 3 * * *",
        "args": {},
        "enabled": true
      }
    ]
  }'

Read the schedule:

curl http://localhost:3000/.config/cron.json \
  -H "Authorization: Bearer $API_KEY"

To disable a schedule, fetch /.config/cron.json, set "enabled": false on the entry, and re-upload it with PUT.

Deduplication

The cron scheduler checks whether a task with the same type and arguments is already pending or running before enqueuing. This prevents duplicate tasks from stacking up if a previous run hasn’t finished.

CronSchedule Fields

  • id: String – Unique identifier for this schedule
  • task_type: String – Task type to enqueue (e.g., "gc", "reindex")
  • schedule: String – 5-field Unix cron expression
  • args: serde_json::Value – Arguments passed to the task
  • enabled: bool – Whether this schedule is active (default true)

Task Retention

Completed tasks are automatically pruned:

  • Tasks older than 24 hours are removed
  • At most 100 completed tasks are retained

Pruning runs after each task completes.
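The retention policy is a two-step prune: an age cutoff, then a count cap. The (task_id, created_at_ms) tuples and the prune function below are illustrative:

```rust
// `completed` holds (task_id, created_at_ms) pairs; both the tuple shape
// and this function are illustrative stand-ins.
fn prune(completed: &mut Vec<(String, i64)>, now_ms: i64) {
    const DAY_MS: i64 = 24 * 60 * 60 * 1000;
    // Step 1: drop tasks older than 24 hours.
    completed.retain(|(_, created)| now_ms - created <= DAY_MS);
    // Step 2: keep only the 100 most recent.
    completed.sort_by_key(|(_, created)| std::cmp::Reverse(*created));
    completed.truncate(100);
}
```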

Events

The task system emits events on the event bus:

  • task.started – A task has begun execution
  • task.completed – A task finished successfully
  • task.failed – A task encountered an error
  • gc.completed – GC-specific completion event with statistics

See Also

  • Garbage Collection – details on the GC mark-and-sweep algorithm
  • Reindexing – details on the reindex process, circuit breaker, and checkpoints

Backup & Restore

AeorDB supports exporting database versions as self-contained .aeordb files, creating incremental patches between versions, importing backups, and promoting version hashes.

Concepts

  • Full export: A clean .aeordb file containing only the live entries at a specific version. No voids, no deletion records, no stale overwrites, no history.
  • Patch (diff): A .aeordb file containing only the changeset between two versions – new/changed chunks, updated file records, updated directory indexes, and deletion records for removed files.
  • Import: Applying an export or patch into a target database.
  • Promote: Setting a version hash as the current HEAD.

Full Export

Export HEAD, a named snapshot, or a specific version hash as a self-contained backup.

CLI

# Export HEAD
aeordb export --database data.aeordb --output backup.aeordb

# Export a named snapshot
aeordb export --database data.aeordb --output backup.aeordb --snapshot v1

# Export a specific version hash
aeordb export --database data.aeordb --output backup.aeordb --hash abc123def456...

The output file must not already exist – the command will refuse to overwrite.

HTTP API

curl -X POST http://localhost:3000/admin/export \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"output": "backup.aeordb"}'

With a snapshot:

curl -X POST http://localhost:3000/admin/export \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"output": "backup.aeordb", "snapshot": "v1"}'

Output

Export complete.
  Files: 142
  Chunks: 89
  Directories: 23
  Version: abc123def456...

Diff / Patch

Create an incremental patch containing only the changes between two versions. This is significantly smaller than a full export when only a few files have changed.

CLI

# Diff between two snapshots
aeordb diff --database data.aeordb --output patch.aeordb --from v1 --to v2

# Diff from a snapshot to HEAD
aeordb diff --database data.aeordb --output patch.aeordb --from v1

# Diff using raw hashes
aeordb diff --database data.aeordb --output patch.aeordb --from abc123... --to def456...

The --from and --to arguments accept either snapshot names or hex-encoded version hashes. If --to is omitted, HEAD is used.

HTTP API

curl -X POST http://localhost:3000/admin/diff \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"output": "patch.aeordb", "from": "v1", "to": "v2"}'

Output

Patch created.
  Files added: 5
  Files modified: 12
  Files deleted: 3
  Chunks: 8
  Directories: 7
  From: abc123...
  To:   def456...

What a Patch Contains

  • New chunks: Content chunks that exist in the target version but not the base version
  • Added file records: Files present in the target but not the base
  • Modified file records: Files that changed between the two versions
  • Deletion records: Files present in the base but removed in the target
  • Changed directory indexes: Directory entries that differ between versions
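Conceptually, the file-level changeset is a three-way split over two path-to-content-hash maps. A sketch (diff here is an illustrative function, not the CLI command):

```rust
use std::collections::BTreeMap;

// Each map is path -> content hash at one version; the three outputs
// mirror the added / modified / deleted records a patch carries.
fn diff(
    base: &BTreeMap<String, u64>,
    target: &BTreeMap<String, u64>,
) -> (Vec<String>, Vec<String>, Vec<String>) {
    let (mut added, mut modified, mut deleted) = (vec![], vec![], vec![]);
    for (path, hash) in target {
        match base.get(path) {
            None => added.push(path.clone()),
            Some(old) if old != hash => modified.push(path.clone()),
            _ => {} // unchanged: content hash is identical, nothing to ship
        }
    }
    for path in base.keys() {
        if !target.contains_key(path) {
            deleted.push(path.clone());
        }
    }
    (added, modified, deleted)
}
```

Content addressing makes the "unchanged" check a single hash comparison, which is why patches stay small when few files change.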

Import

Apply a full export or incremental patch to a target database.

CLI

# Import a full export
aeordb import --database data.aeordb --file backup.aeordb

# Import and immediately promote HEAD
aeordb import --database data.aeordb --file backup.aeordb --promote

# Force import a patch even if base version doesn't match
aeordb import --database data.aeordb --file patch.aeordb --force

Flags:

  • --promote: Automatically set HEAD to the imported version
  • --force: Skip base version verification for patches

HTTP API

curl -X POST http://localhost:3000/admin/import \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"file": "backup.aeordb", "promote": true}'

Patch Base Version Check

When importing a patch, AeorDB verifies that the target database’s current HEAD matches the patch’s base version. If they don’t match, the import fails:

Target database HEAD (aaa111...) does not match patch base version (bbb222...).
Use --force to apply anyway.

Use --force to skip this check if you know what you’re doing.

Output

Full export imported.
  Entries: 254
  Chunks: 89
  Files: 142
  Directories: 23
  Deletions: 0
  Version: abc123...

  HEAD has been promoted.

If --promote was not used:

  HEAD has NOT been changed.
  To promote: aeordb promote --hash abc123...

Promote

Set a specific version hash as the current HEAD.

CLI

aeordb promote --database data.aeordb --hash abc123def456...

The command verifies that the hash exists in the database before promoting.

HTTP API

curl -X POST http://localhost:3000/admin/promote \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"hash": "abc123def456..."}'

Typical Workflows

Regular Backups

# Create a snapshot first
curl -X POST http://localhost:3000/admin/snapshots \
  -H "Authorization: Bearer $API_KEY" \
  -d '{"name": "daily-2024-01-15"}'

# Export it
aeordb export --database data.aeordb \
  --output backups/daily-2024-01-15.aeordb \
  --snapshot daily-2024-01-15

Incremental Backups

# First backup: full export
aeordb export --database data.aeordb --output backups/full.aeordb --snapshot v1

# Subsequent backups: just the diff
aeordb diff --database data.aeordb --output backups/patch-v1-v2.aeordb --from v1 --to v2

Restore from Backup

# Import the full backup
aeordb import --database restored.aeordb --file backups/full.aeordb --promote

# Apply incremental patches in order
aeordb import --database restored.aeordb --file backups/patch-v1-v2.aeordb --promote

Migrate Between Servers

# On source server
aeordb export --database data.aeordb --output transfer.aeordb

# Copy to target server
scp transfer.aeordb target-server:/data/

# On target server
aeordb import --database data.aeordb --file transfer.aeordb --promote

CLI Commands

Complete reference for the aeordb command-line interface.

aeordb start

Start the AeorDB server.

aeordb start [OPTIONS]

Flags

  • --port, -p (default: 3000) – TCP port to listen on
  • --database, -D (default: data.aeordb) – Path to the .aeordb database file
  • --log-format (default: pretty) – Log output format: pretty or json
  • --auth (default: none) – Auth provider URI (see below)
  • --hot-dir (default: database parent dir) – Directory for write-ahead hot files
  • --cors (default: disabled) – CORS allowed origins

Auth Modes

The --auth flag accepts several formats:

  • (not set) – Disabled: no authentication required (dev mode)
  • false, null, no, 0 – Disabled: explicitly disable authentication
  • self – Self-contained: AeorDB manages API keys internally
  • file:///path/to/identity – File-based: load identity from a file

When using self mode, the root API key is printed once on first startup. Save it – it cannot be retrieved again (but can be reset with emergency-reset).

CORS

  • (not set) – CORS disabled
  • * – Allow all origins
  • https://a.com,https://b.com – Allow specific comma-separated origins

Examples

# Development mode (no auth, default port)
aeordb start

# Production with auth on port 8080
aeordb start --port 8080 --database /var/lib/aeordb/prod.aeordb --auth self --log-format json

# Custom hot directory and CORS
aeordb start --database data.aeordb --hot-dir /fast-ssd/hot --cors "*"

What Happens on Start

  1. Opens (or creates) the database file
  2. Bootstraps root API key (if --auth self and no key exists yet)
  3. Resets any tasks left in Running state from a previous crash to Pending
  4. Starts background workers:
    • Heartbeat: emits DatabaseStats every 15 seconds
    • Cron scheduler: checks /.config/cron.json every 60 seconds
    • Task worker: dequeues and executes background tasks
    • Webhook dispatcher: delivers events to registered webhook URLs
  5. Binds to the TCP port and begins serving requests
  6. Shuts down gracefully on CTRL+C

aeordb gc

Run garbage collection to reclaim space from unreachable entries.

aeordb gc [OPTIONS]

Flags

  • --database, -D (default: data.aeordb) – Path to the .aeordb database file
  • --dry-run (default: false) – Report what would be collected without actually deleting

Examples

# Run GC
aeordb gc --database data.aeordb

# Preview what would be collected
aeordb gc --database data.aeordb --dry-run

Output

AeorDB Garbage Collection
Database: data.aeordb

Versions scanned: 3
Live entries:     1247
Garbage entries:  89
Reclaimed:        1.2 MB
Duration:         0.3s

See Garbage Collection for details on the mark-and-sweep algorithm.


aeordb export

Export a version as a self-contained .aeordb file.

aeordb export [OPTIONS]

Flags

  • --database, -D (default: data.aeordb) – Source database file
  • --output, -o (required) – Output .aeordb file path
  • --snapshot, -s (default: none) – Named snapshot to export
  • --hash (default: none) – Specific version hash to export (hex-encoded)

If neither --snapshot nor --hash is provided, HEAD is exported.

Examples

# Export HEAD
aeordb export --database data.aeordb --output backup.aeordb

# Export a named snapshot
aeordb export --database data.aeordb --output backup-v1.aeordb --snapshot v1

# Export a specific hash
aeordb export --database data.aeordb --output backup.aeordb --hash abc123def456...

The output file must not already exist.

See Backup & Restore for full backup workflows.


aeordb diff

Create a patch .aeordb containing only the changeset between two versions.

aeordb diff [OPTIONS]

Flags

  • --database, -D (default: data.aeordb) – Source database file
  • --output, -o (required) – Output patch file path
  • --from (required) – Base version (snapshot name or hex hash)
  • --to (default: HEAD) – Target version (snapshot name or hex hash)

Examples

# Diff between two snapshots
aeordb diff --database data.aeordb --output patch.aeordb --from v1 --to v2

# Diff from a snapshot to HEAD
aeordb diff --database data.aeordb --output patch.aeordb --from v1

# Diff between raw hashes
aeordb diff --database data.aeordb --output patch.aeordb --from abc123... --to def456...

The --from and --to arguments first try snapshot name lookup, then fall back to interpreting the value as a hex-encoded hash.

See Backup & Restore for incremental backup workflows.


aeordb import

Import an export or patch .aeordb file into a target database.

aeordb import [OPTIONS]

Flags

  • --database, -D (default: data.aeordb) – Target database file
  • --file, -f (required) – Backup or patch file to import
  • --force (default: false) – Skip base version verification for patches
  • --promote (default: false) – Automatically set HEAD to the imported version

Examples

# Import a full backup
aeordb import --database data.aeordb --file backup.aeordb

# Import and promote HEAD
aeordb import --database data.aeordb --file backup.aeordb --promote

# Force-import a patch even if base doesn't match
aeordb import --database data.aeordb --file patch.aeordb --force --promote

Patch Base Verification

When importing a patch (backup_type=2), AeorDB verifies that the target database’s HEAD matches the patch’s base version. Use --force to bypass this check.

See Backup & Restore for restore workflows.


aeordb promote

Promote a version hash to HEAD.

aeordb promote [OPTIONS]

Flags

  • --database, -D (default: data.aeordb) – Database file
  • --hash (required) – Hex-encoded version hash to promote

Examples

aeordb promote --database data.aeordb --hash abc123def456...

The command verifies the hash exists in the database before promoting.
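
The check-then-promote behavior can be sketched like this (an assumed illustration, not AeorDB internals):

```python
# Look the hash up before moving HEAD, so a typo can never leave HEAD
# pointing at a version that does not exist in the database.
def promote(versions, new_head):
    """versions maps known version hashes to their metadata."""
    if new_head not in versions:
        raise KeyError(f"version {new_head} not found in database")
    return new_head  # the new HEAD

versions = {"abc123": "...", "def456": "..."}
assert promote(versions, "def456") == "def456"
```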


aeordb stress

Run stress tests against a running AeorDB instance.

aeordb stress [OPTIONS]

Flags

| Flag | Short | Default | Description |
|------|-------|---------|-------------|
| --target | -t | http://localhost:3000 | Target server URL |
| --api-key | -a | (required) | API key for authentication |
| --concurrency | -c | 10 | Number of concurrent workers |
| --duration | -d | 10s | Test duration (e.g., 30s, 5m) |
| --operation | -o | mixed | Operation type: write, read, or mixed |
| --file-size | -s | 1kb | File size for writes (e.g., 512b, 1kb, 1mb) |
| --path-prefix | -p | /stress-test | Path prefix for stress test files |

Examples

# Quick mixed read/write test
aeordb stress --api-key $API_KEY

# Heavy write test for 5 minutes
aeordb stress --api-key $API_KEY --operation write --concurrency 50 --duration 5m --file-size 10kb

# Read-only test against production
aeordb stress --target https://prod.example.com --api-key $API_KEY --operation read --concurrency 100 --duration 30s

aeordb emergency-reset

Revoke the current root API key and generate a new one. Use this if the root key is lost or compromised.

aeordb emergency-reset [OPTIONS]

Flags

| Flag | Short | Default | Description |
|------|-------|---------|-------------|
| --database | -D | (required) | Database file |
| --force | | false | Skip confirmation prompt |

Examples

# Interactive (prompts for confirmation)
aeordb emergency-reset --database data.aeordb

# Non-interactive
aeordb emergency-reset --database data.aeordb --force

What Happens

  1. Finds all API keys linked to the root user (nil UUID)
  2. Revokes each one
  3. Generates a new root API key
  4. Prints the new key (shown once, save it immediately)

Example output:

WARNING: This will invalidate the current root API key.
A new root API key will be generated.
Proceed? [y/N]: y
Revoked 1 existing root API key(s).

==========================================================
  NEW ROOT API KEY (shown once, save it now!):
  aeordb_abc123def456...
==========================================================

This command requires direct file access to the database – it cannot be run over HTTP. It is intended for recovery scenarios where you have lost the root API key.
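
The revoke-and-regenerate flow above can be sketched in a few lines. This is an illustrative sketch under assumed data structures; the field names, key length, and helper are not AeorDB internals, though the nil-UUID root user and aeordb_ key prefix come from the documentation above.

```python
import secrets
import uuid

ROOT_USER = uuid.UUID(int=0)  # the nil UUID identifies the root user

def emergency_reset(api_keys):
    """api_keys: list of dicts with 'user' (UUID) and 'revoked' (bool).

    Revokes every active root key, then mints one fresh root key.
    Returns (number_of_keys_revoked, new_key).
    """
    revoked = 0
    for key in api_keys:
        if key["user"] == ROOT_USER and not key["revoked"]:
            key["revoked"] = True
            revoked += 1
    new_key = "aeordb_" + secrets.token_hex(24)
    return revoked, new_key
```

Because every existing root key is revoked before the new one is minted, a compromised key stops working the moment the command completes.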


See Also