| title | modified | tags |
|---|---|---|
| Hub + Spoke Data Evolution | 2026-06-05 | |

Hub + Spoke Data Evolution
How the data model evolves safely when nodes run different versions across the hub/spoke deployment (indri is the hub; see set-up-sync-hub and host-heph-pwa). The short version: sync is op-based, not schema-based, so most new features need no coordinated migration. The exception is a new SQLite column that the apply path must write.
Two independent layers
heph keeps two layers that evolve on different clocks:
- The op-log (synced). Every change is an operation (`node.create`, `node.set`, `task.set`, `link.add`, `link.remove`, …) carrying an HLC, an origin device, and a JSON payload. Spokes push/pull ops to/from the hub; both sides run the same merge logic from `heph-core` (`sqlite/apply.rs`). This is the only thing that crosses the wire.
- The SQLite schema (local, per node). Each node materializes ops into local tables. The schema version is tracked by SQLite's `PRAGMA user_version` and advanced by the ordered, append-only migration list in `heph-core/src/sqlite/migrations.rs`. No schema or migration state is ever synced. A spoke can sit on an older schema than the hub indefinitely.
Because the wire format is ops — not rows — a node only has to understand the ops its peers emit, not their table layout.
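The schema layer can be sketched as follows. This is a minimal illustration in Python with the standard-library `sqlite3` module, not heph's Rust code: the `MIGRATIONS` entries and the `migrate` helper are hypothetical stand-ins for the real list in `migrations.rs`. It shows how an ordered, append-only list combined with `PRAGMA user_version` lets two nodes sit on different schema versions without any shared migration state.

```python
import sqlite3

# Hypothetical migrations for illustration only; append new entries, never edit old ones.
MIGRATIONS = [
    "CREATE TABLE nodes (id TEXT PRIMARY KEY, title TEXT)",
    "CREATE TABLE links (src TEXT, dst TEXT, type TEXT)",
]


def migrate(conn, migrations):
    """Apply every migration past the stored user_version, in list order."""
    (version,) = conn.execute("PRAGMA user_version").fetchone()
    for index in range(version, len(migrations)):
        conn.execute(migrations[index])
        # PRAGMA does not accept bound parameters, so the integer is formatted in.
        conn.execute(f"PRAGMA user_version = {index + 1:d}")
    conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]


# An older spoke build only knows the first migration; the hub knows both.
old_spoke = sqlite3.connect(":memory:")
hub = sqlite3.connect(":memory:")
spoke_version = migrate(old_spoke, MIGRATIONS[:1])
hub_version = migrate(hub, MIGRATIONS)
```

Each database records only its own version, so upgrading the spoke later simply runs the migrations it has not yet seen.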
What forward/backward compatibility already buys you
The merge engine is deliberately lenient:
- Unknown op types are stored but not applied (`apply.rs`): a spoke that receives a newer op type keeps it in the log (so a later upgrade can replay it) but doesn't choke on it.
- Unknown payload fields are ignored. Field extraction is by name (`str_field`/`i64_field`), so a payload with extra keys an older node doesn't recognize just drops the extras.
- Links are schema-free. A link's `type` is a string column. A brand-new link kind (a new `LinkType`) needs no migration: every version reads it as text and applies OR-set add/remove identically.
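The three leniency rules above can be sketched in a few lines. This is an illustrative Python model, not the logic in `apply.rs`: the `Replica` class, the `make_op` helper, the op field names (`type`, `hlc`, `origin`, `payload`), and the `widget.pin` op type are all assumptions, and link handling is reduced to a plain set rather than a real OR-set.

```python
import json


def str_field(payload, name):
    """Return payload[name] if it is a string, otherwise None."""
    value = payload.get(name)
    return value if isinstance(value, str) else None


class Replica:
    """A toy older node: it knows node.set and link.add, nothing newer."""

    def __init__(self):
        self.log = []       # every op received, understood or not
        self.titles = {}    # node id -> title
        self.links = set()  # (src, dst, type); type is just a string

    def apply(self, op):
        self.log.append(op)  # always stored, so a later upgrade could replay it
        payload = json.loads(op["payload"])
        if op["type"] == "node.set":
            node_id = str_field(payload, "id")
            title = str_field(payload, "title")
            if node_id is not None and title is not None:
                self.titles[node_id] = title  # extra payload keys are never read
        elif op["type"] == "link.add":
            src, dst, kind = (str_field(payload, k) for k in ("src", "dst", "type"))
            if None not in (src, dst, kind):
                self.links.add((src, dst, kind))
        # Any other op type falls through: kept in the log, not applied.


def make_op(op_type, **payload):
    """Build a hypothetical op record with a placeholder HLC and origin."""
    return {"type": op_type, "hlc": "0001", "origin": "spoke-a",
            "payload": json.dumps(payload)}


old_spoke = Replica()
old_spoke.apply(make_op("node.set", id="n1", title="Inbox", color="teal"))  # unknown field
old_spoke.apply(make_op("widget.pin", id="n1"))                             # unknown op type
old_spoke.apply(make_op("link.add", src="n1", dst="n2", type="blocks"))     # new link kind
```

The older replica ends up with all three ops in its log, the title applied, the `color` key ignored, and the `blocks` link stored as ordinary text.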
The rule of thumb
| Change | Needs coordinated migration? |
|---|---|
| New `LinkType` (e.g. a new relationship between nodes) | No. Just emit `link.add` with the new type string |
| New optional/nullable scalar carried in an op payload | No, if every node's apply reads it defensively and tolerates its absence |
| New read-side feature over existing data (counts, hierarchy from existing `parent` links) | No. Pure local queries, no op or schema change |
| New required SQLite column that apply must write on every relevant op | Yes. Old spokes lack the column and the `UPDATE` fails |
| Renaming/removing a column other nodes' apply paths reference | Yes |
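The one "Yes" row that bites in practice is easy to reproduce. The sketch below uses Python's standard-library `sqlite3`; the `tasks` table layout, the `priority` column, and the `apply_task_set` function are hypothetical and chosen only to show the failure mode, not copied from heph.

```python
import sqlite3


def apply_task_set(conn, task_id, priority):
    """A newer apply path that unconditionally writes a new column."""
    conn.execute("UPDATE tasks SET priority = ? WHERE id = ?", (priority, task_id))


# Old spoke: schema predates the column.
old_spoke = sqlite3.connect(":memory:")
old_spoke.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, title TEXT)")
old_spoke.execute("INSERT INTO tasks VALUES ('t1', 'write docs')")

# Upgraded hub: same table plus the new column.
hub = sqlite3.connect(":memory:")
hub.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, title TEXT, priority INTEGER)")
hub.execute("INSERT INTO tasks (id, title) VALUES ('t1', 'write docs')")

apply_task_set(hub, "t1", 2)
hub_priority = hub.execute("SELECT priority FROM tasks WHERE id = 't1'").fetchone()[0]

try:
    apply_task_set(old_spoke, "t1", 2)
    spoke_result = "applied"
except sqlite3.OperationalError as exc:
    # SQLite rejects the statement because the column does not exist.
    spoke_result = str(exc)
```

The same statement succeeds on the hub and raises on the spoke, which is why this kind of change has to be gated on the migration reaching every node.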
When a migration is required, do it hub-first
If a change genuinely needs a new column that the apply path writes:
- Ship the migration to every node (hub and all spokes) before any node emits an op that depends on the new column. The migration list is append-only and ordered, so rolling the new `hephd` out everywhere is the gate.
- Keep new columns nullable / defaulted so an op that predates the column still applies, and so a node that hasn't yet upgraded degrades to "field absent" rather than erroring.
- Prefer encoding the new fact as a link or an op-payload field over a new column whenever you can. That keeps the change in the no-migration rows of the table above.
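A nullable column plus a defensive read might look like the sketch below. It is written in Python with the standard-library `sqlite3` under stated assumptions: the `estimate` column, the `tasks` layout, and this `apply_task_set` are hypothetical, and `i64_field` here only mirrors the name of heph's helper, not its implementation.

```python
import sqlite3


def i64_field(payload, name):
    """Return payload[name] if it is an integer, otherwise None."""
    value = payload.get(name)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    return None


def apply_task_set(conn, payload):
    """Apply a task.set payload, writing the new column only when present."""
    conn.execute("INSERT OR IGNORE INTO tasks (id) VALUES (?)", (payload["id"],))
    conn.execute("UPDATE tasks SET title = ? WHERE id = ?",
                 (payload.get("title"), payload["id"]))
    estimate = i64_field(payload, "estimate")
    if estimate is not None:
        conn.execute("UPDATE tasks SET estimate = ? WHERE id = ?",
                     (estimate, payload["id"]))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id TEXT PRIMARY KEY, title TEXT)")
# The migration: a nullable column with no NOT NULL constraint.
conn.execute("ALTER TABLE tasks ADD COLUMN estimate INTEGER")

# An op emitted before the column existed carries no estimate and still applies.
apply_task_set(conn, {"id": "t1", "title": "write docs"})
# An op from an upgraded node carries the new field.
apply_task_set(conn, {"id": "t2", "title": "ship release", "estimate": 3})

rows = conn.execute("SELECT id, estimate FROM tasks ORDER BY id").fetchall()
```

Rows written by older ops simply keep `NULL` in the new column, which is the "field absent" degradation the second bullet describes.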
Worked example: indented, counted projects
The sidebar's subproject indentation and per-project task counts (see install-heph and the agenda surface in design §8.1) are a pure read-side feature:
- Nesting is read from `parent` links that already exist (created by `heph project add <name> --parent <parent>`) via the existing `project_subtree` traversal.
- Counts are a read-only `SELECT … GROUP BY` over the `tasks`/`links` tables.
No new column, no new op type, no migration — it works against a hub and a spoke
on any schema version that already understands parent links. That is the case
the rule of thumb is meant to make obvious.
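As a rough illustration of what such a read-side query could look like, the sketch below computes depth and task counts in one statement using Python's standard-library `sqlite3`. The table layouts (`links(src, dst, type)` with the child as `src`, `tasks(id, project_id)`) are assumptions, and the recursive CTE is a stand-in for heph's `project_subtree` traversal, whose actual implementation is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links (src TEXT, dst TEXT, type TEXT);   -- assumed layout
    CREATE TABLE tasks (id TEXT PRIMARY KEY, project_id TEXT);
    INSERT INTO links VALUES ('garden', 'home', 'parent'), ('shed', 'garden', 'parent');
    INSERT INTO tasks VALUES ('t1', 'home'), ('t2', 'garden'), ('t3', 'garden');
""")

SIDEBAR_QUERY = """
WITH RECURSIVE tree(id, depth) AS (
    SELECT ?, 0                          -- the root project
    UNION ALL
    SELECT links.src, tree.depth + 1     -- children point at their parent
    FROM links JOIN tree ON links.dst = tree.id
    WHERE links.type = 'parent'
)
SELECT tree.id, tree.depth, COUNT(tasks.id)
FROM tree LEFT JOIN tasks ON tasks.project_id = tree.id
GROUP BY tree.id, tree.depth
ORDER BY tree.depth, tree.id
"""


def sidebar_rows(conn, root):
    """Return (project, indent depth, task count) for root and its subprojects."""
    return conn.execute(SIDEBAR_QUERY, (root,)).fetchall()


rows = sidebar_rows(conn, "home")
```

The query only reads existing tables, so it runs unchanged on any node whose schema already has them; nothing is written and no op is emitted.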