CapDB

Chapter 1 — CapDB: Truth at the Storage Engine

"Every system in this series trusts its truth store. This is the chapter about what it takes to be the thing that is trusted."

Laws most engaged: I (Truth Has One Home), II (Constraint Management).

The engine under everything

Throughout this series, the Truth Plane has been a given. Ampriot trusts MariaDB; AISDR trusts its immutable registry; every architecture rests on some store that is authoritative, durable, and correct. CapDB is the chapter where we stop taking that store for granted and ask what it costs, at the level of the engine itself, to be the thing everything else trusts.

CapDB is a hard fork of SQLite that adds connection pooling, TLS-backed networked access, and optional physical replication: SQLite's simplicity and embeddability, with production infrastructure built in. It is interesting to this series not because it is a database but because it is truth at its source. It honors Law I (truth has one home) at the level the other proof points sit on top of. Examining CapDB is examining what Law I demands when there is no lower layer to delegate to.

Engine stability first: the defining constraint

CapDB's central architectural decision is stated as a principle: engine stability first. It does not replace SQLite's B-tree, pager, or query planner. It layers new capability around a proven core that it commits to leaving alone.

Volume I, Chapter 2 used this exact decision as its illustration of Law II — constraint as sustained commitment — and it is worth seeing why the constraint is the architecture. The parts of a database most likely to harbor subtle, data-corrupting bugs are precisely the parts CapDB refuses to touch: the storage kernel, the pager, the query planner, battle-tested over decades. By constraining itself to add only around that kernel, CapDB inherits its correctness rather than risking it. The constraint is expensive — it means tracking upstream SQLite and merging its patches indefinitely, work CapDB could avoid by forking freely — and CapDB pays it deliberately, because the alternative would forfeit the one property a truth store must have: that its core is correct. To be trusted with truth, CapDB constrains itself so that its users never have to wonder whether the pager is sound.

This is Law II in its purest form: the architecture is the constraint, held over time, against the temptation to relax it.

Replication that copies truth without forking it

CapDB's most direct engagement with Law I is its replication, and the design choice that matters is physical replication rather than logical. The primary streams write-ahead-log frames — byte-for-byte — to replicas, rather than shipping logical change events (insert/update/delete) for replicas to re-apply.

```mermaid
flowchart LR
    C["Clients"]
    P[("Primary<br/>authoritative truth")]
    RA[("Replica A<br/>read-only")]
    RB[("Replica B<br/>read-only")]
    F["Generation fencing:<br/>reject segments whose generation < local<br/>(a deposed primary cannot resume writing)"]
    C -->|"writes"| P
    P -->|"WAL frames (physical, byte-for-byte)"| RA
    P -->|"WAL frames (physical, byte-for-byte)"| RB
    RA -->|"reads"| C
    RB -->|"reads"| C
    RA -. enforces .-> F
    RB -. enforces .-> F
```

The reason is Law I. Logical replication invites divergence: a replica re-applying logical operations can drift from the primary through subtle differences in execution (a nondeterministic function, or a version or configuration mismatch that makes the same operation produce a different result), producing two stores that disagree about truth. That is exactly the multiple-sources-of-reality problem Law I forbids, now between a primary and its own replica. Physical replication forecloses it: a replica receiving byte-for-byte WAL frames sees exactly what the primary persisted, with no room for logical divergence. The replica is not a second truth that might disagree; it is a faithful copy of the one truth. CapDB chooses the replication strategy that makes "one home for truth" survivable across machines.
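The divergence risk can be made concrete with stock SQLite through Python's standard `sqlite3` module. This is a minimal sketch, not CapDB's replication code, and it shows only one form of logical replication: re-executing a statement that calls the nondeterministic `random()` function. `Connection.backup()` stands in for physical replication because it copies database pages rather than re-running SQL. CapDB itself is described as streaming WAL frames, which this sketch does not do. The table and helper names are hypothetical.

```python
import sqlite3

STATEMENT = "INSERT INTO events (token) VALUES (random())"


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with the (hypothetical) demo schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, token INTEGER)")
    conn.commit()
    return conn


def rows(conn: sqlite3.Connection) -> list:
    return conn.execute("SELECT id, token FROM events ORDER BY id").fetchall()


# The primary executes the write once; random() is evaluated here.
primary = make_db()
primary.execute(STATEMENT)
primary.commit()

# Logical (statement-based) replica: re-executes the same SQL text,
# so random() is evaluated a second time and yields a different value.
logical_replica = make_db()
logical_replica.execute(STATEMENT)
logical_replica.commit()

# Physical stand-in: copy the primary's pages without interpreting them.
physical_replica = sqlite3.connect(":memory:")
primary.backup(physical_replica)

logical_diverged = rows(logical_replica) != rows(primary)
physical_matches = rows(physical_replica) == rows(primary)
print("logical replica diverged:", logical_diverged)
print("physical copy matches:", physical_matches)
```

The logical replica holds a row with the same `id` but a different `token`, so the two stores now disagree. The page-level copy cannot disagree, because nothing was re-executed.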

The same concern drives two more decisions. Replicas are read-only — they reject writes (EXEC, PREPARE, STEP) outright — so a replica cannot become a second writable truth. And generation fencing prevents split-brain: WAL segments carry a generation, and a replica rejects any segment from an older generation, so a deposed primary cannot resume writing to the cluster after a failover. Read-only replicas keep truth singular under normal operation; generation fencing keeps it singular through failure. Both are Law I defended at the engine level, where "one home" means "and not two, even for a moment, even during a failover."
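Both rules fit in a few lines. The following is a sketch under stated assumptions, not CapDB's implementation: `WalSegment`, `Replica`, and the two exception types are hypothetical names, and adopting a newer generation the first time it is seen is an assumption the text does not spell out. The rejected request types and the "generation < local" test are taken from the description above.

```python
from dataclasses import dataclass, field

# Request types the text says a replica rejects outright.
REJECTED_ON_REPLICA = frozenset({"EXEC", "PREPARE", "STEP"})


class StaleGenerationError(Exception):
    """A segment came from a generation older than the replica's."""


class ReadOnlyReplicaError(Exception):
    """A client sent a rejected request type to a replica."""


@dataclass(frozen=True)
class WalSegment:  # hypothetical shape: generation tag plus opaque frame bytes
    generation: int
    frames: bytes


@dataclass
class Replica:  # hypothetical; not CapDB's real type
    local_generation: int = 0
    applied: bytearray = field(default_factory=bytearray)

    def apply_segment(self, segment: WalSegment) -> None:
        # Fencing rule: reject when the segment's generation < local.
        if segment.generation < self.local_generation:
            raise StaleGenerationError(
                f"segment generation {segment.generation} "
                f"< local generation {self.local_generation}"
            )
        # Assumption: a newer generation is adopted on first sight.
        self.local_generation = segment.generation
        self.applied.extend(segment.frames)  # copied, never interpreted

    def check_request(self, op: str) -> None:
        if op in REJECTED_ON_REPLICA:
            raise ReadOnlyReplicaError(f"{op} is not accepted on a replica")


replica = Replica()
replica.apply_segment(WalSegment(generation=1, frames=b"old-primary;"))
replica.apply_segment(WalSegment(generation=2, frames=b"new-primary;"))  # failover

try:
    # The deposed primary (generation 1) tries to keep writing.
    replica.apply_segment(WalSegment(generation=1, frames=b"deposed;"))
    fenced = False
except StaleGenerationError:
    fenced = True
```

Once the replica has seen generation 2, nothing tagged with generation 1 can reach its log. The deposed primary's late write is refused rather than merged.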

Constraints managed as a coherent set

CapDB also illustrates Law II's subtler demand: that constraints be managed as a coherent set, not accreted independently. Its capabilities are gated behind feature flags — pool, network, store, replication, full-text search — with implication rules that encode the dependencies between them: network implies pool, replication implies store, store implies pool.

Volume I, Chapter 2 called these implication rules "constraint management made literal," and the phrase is exact. They make incoherent configurations unbuildable — you cannot compile networking without a pool to serve it, because the build system enforces the implication. This is the antidote to constraint creep: rather than discovering at runtime that someone assembled a nonsensical combination, CapDB prevents the combination from existing. The constraints understand each other, and the build refuses any assembly that violates their relationships.
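The three implication rules are small enough to model directly. The sketch below is illustrative only: it says nothing about how CapDB's build system actually encodes the rules, and the function names are hypothetical. It shows two readings of "implies": computing the full set of features a request pulls in, and reporting which implied features an explicit configuration left out.

```python
# Implication rules as stated in the text.
IMPLIES = {
    "network": {"pool"},
    "replication": {"store"},
    "store": {"pool"},
}


def closure(requested: set) -> set:
    """Return the requested features plus everything they transitively imply."""
    enabled = set(requested)
    pending = list(requested)
    while pending:
        feature = pending.pop()
        for implied in IMPLIES.get(feature, ()):
            if implied not in enabled:
                enabled.add(implied)
                pending.append(implied)
    return enabled


def missing(explicit: set) -> set:
    """Return implied features absent from an explicit configuration."""
    return closure(explicit) - set(explicit)


print(sorted(closure({"replication"})))     # replication pulls in store, then pool
print(sorted(missing({"network"})))         # incoherent: pool is absent
print(sorted(missing({"network", "pool"}))) # coherent: nothing missing
```

A build that refuses any configuration for which `missing` is non-empty, or that silently applies `closure`, makes the incoherent combination impossible to assemble. Either way the check runs before anything ships.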

Growth by adding constraints

A final observation ties CapDB to the spirit of the whole series. Where most systems grow by adding capability (and thus degrees of freedom), CapDB grows by adding constraints. Its connection pool refuses shared-cache mode, giving each connection a private page cache — trading a memory optimization for the elimination of a class of cross-connection locking bugs. Its replicas are constrained to read-only. Its generations fence out stale writers. Each new feature is, on inspection, a new boundary drawn, and the boundary is what makes the feature safe.
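As an illustration of a feature expressed as a refusal, the sketch below rejects any SQLite URI that asks for shared-cache mode before opening it. It is an assumption-laden stand-in rather than CapDB's pool: `open_pooled_connection` is a hypothetical helper, and it inspects only the `cache` URI parameter that stock SQLite documents.

```python
import sqlite3
from urllib.parse import parse_qs, urlsplit


def open_pooled_connection(uri: str) -> sqlite3.Connection:
    """Open a connection for a (hypothetical) pool, refusing shared-cache mode."""
    params = parse_qs(urlsplit(uri).query)
    if "shared" in params.get("cache", []):
        # The boundary: each pooled connection keeps a private page cache.
        raise ValueError("shared-cache mode is not allowed in the pool")
    return sqlite3.connect(uri, uri=True)


conn = open_pooled_connection("file:capdb_demo?mode=memory&cache=private")
print(conn.execute("SELECT 1").fetchone())
conn.close()

try:
    open_pooled_connection("file:capdb_demo?mode=memory&cache=shared")
    refused = False
except ValueError:
    refused = True
print("shared cache refused:", refused)
```

The check costs one line, and the class of cross-connection locking behavior it rules out never needs to be reasoned about downstream.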

This is what it looks like to honor Law II while extending a system: not "what can we now allow?" but "what must we now forbid so that this addition stays safe?" CapDB stays trustworthy as it grows because it grows by constraining, which is the discipline a truth store, of all things, can least afford to abandon.

What CapDB demonstrates about truth

CapDB is Law I and Law II seen all the way down, at the engine that other systems merely trust. It honors "truth has one home" not as a schema convention but as a physical-replication strategy, read-only replicas, and generation fencing that keep truth singular across machines and through failover (Law I). And it honors "architecture is constraint management" as a sustained commitment to engine stability, coherent feature implications, and growth-by-constraint (Law II).

But a truth store must do more than be correct and singular; it must be safe to operate over a network and available when parts of it fail — without ever sacrificing the truth it holds. Those are Laws III and VII, and the next chapter shows CapDB weaving them into the engine.