---
title: Hub + Spoke Data Evolution
modified: 2026-06-05
tags:
  - explanation
  - sync
---

# Hub + Spoke Data Evolution

How the data model evolves safely when nodes run different versions across the hub/spoke deployment (indri is the hub; see [[set-up-sync-hub]] and [[host-heph-pwa]]). The short version: **sync is op-based, not schema-based**, so most new features need no coordinated migration, but adding a SQLite *column* does.

## Two independent layers

heph keeps two layers that evolve on different clocks:

1. **The op-log (synced).** Every change is an operation (`node.create`, `node.set`, `task.set`, `link.add`, `link.remove`, …) carrying an HLC, an origin device, and a JSON payload. Spokes push/pull ops to/from the hub; both sides run the **same** merge logic from `heph-core` (`sqlite/apply.rs`). This is the only thing that crosses the wire.
2. **The SQLite schema (local, per node).** Each node materializes ops into local tables. The schema version is tracked by SQLite's `PRAGMA user_version` and advanced by the ordered, append-only migration list in `heph-core/src/sqlite/migrations.rs`. **No schema or migration state is ever synced.** A spoke can sit on an older schema than the hub indefinitely.

Because the wire format is ops, not rows, a node only has to understand the *ops* its peers emit, not their table layout.

## What forward/backward compatibility already buys you

The merge engine is deliberately lenient:

- **Unknown op types are stored but not applied** (`apply.rs`): a spoke that receives a newer op type keeps it in the log (so a later upgrade can replay it) but doesn't choke on it.
- **Unknown payload fields are ignored.** Field extraction is by name (`str_field` / `i64_field`), so a payload with extra keys an older node doesn't recognize just drops the extras.
- **Links are schema-free.** A link's `type` is a string column.
A brand-new link kind (a new `LinkType`) needs no migration: every version reads it as text and applies OR-set add/remove identically.

## The rule of thumb

| Change | Needs coordinated migration? |
|--------|------------------------------|
| New `LinkType` (e.g. a new relationship between nodes) | **No**: just emit `link.add` with the new `type` string |
| New optional/nullable scalar carried in an op payload | **No, if** every node's `apply` reads it defensively and tolerates its absence |
| New *read-side* feature over existing data (counts, hierarchy from existing `parent` links) | **No**: pure local queries, no op or schema change |
| New **required** SQLite column that `apply` must write on every relevant op | **Yes**: old spokes lack the column and the `UPDATE` fails |
| Renaming/removing a column other nodes' `apply` paths reference | **Yes** |

## When a migration *is* required, do it hub-first

If a change genuinely needs a new column that the apply path writes:

1. Ship the migration to **every** node (hub and all spokes) **before** any node emits an op that depends on the new column. The migration list is append-only and ordered, so rolling the new `hephd` out everywhere is the gate.
2. Keep new columns **nullable / defaulted** so an op that predates the column still applies, and so a node that hasn't yet upgraded degrades to "field absent" rather than erroring.
3. Prefer encoding the new fact as a **link or an op-payload field** over a new column whenever you can; that keeps the change in the **No** rows of the table above.

## Worked example: indented, counted projects

The sidebar's subproject indentation and per-project task counts (see [[install-heph]] and the agenda surface in [[design]] §8.1) are a pure read-side feature:

- **Nesting** is read from `parent` links that already exist (created by `heph project add --parent `) via the existing `project_subtree` traversal.
- **Counts** are a read-only `SELECT … GROUP BY` over the `tasks`/`links` tables.

No new column, no new op type, no migration: it works against a hub and a spoke on any schema version that already understands `parent` links. That is the case the rule of thumb is meant to make obvious.
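
As a rough illustration of why this stays on the read side, the sketch below derives indentation depth and per-project task counts from nothing but existing `parent` links and task rows. It is an in-memory stand-in, not heph's code: `Link`, `Task`, `parent_map`, `depth_of`, `task_counts`, `sample`, and the `blocks` link type are all hypothetical names, and the direction of the link (child as `src`, parent as `dst`) is an assumption. The real feature runs SQL over the `tasks`/`links` tables and uses `project_subtree`.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for a row of the `links` table.
struct Link {
    src: &'static str,  // assumed: the child project
    kind: &'static str, // link type, stored as plain text
    dst: &'static str,  // assumed: the parent project
}

/// Hypothetical stand-in for a task row; only the field the count needs.
struct Task {
    project: &'static str,
}

/// Map each child project to its parent, reading only `parent` links.
/// Links of any other type are skipped, not rejected.
fn parent_map(links: &[Link]) -> HashMap<&'static str, &'static str> {
    links
        .iter()
        .filter(|l| l.kind == "parent")
        .map(|l| (l.src, l.dst))
        .collect()
}

/// Indentation depth: the number of parent hops up to a root project.
/// The hop count is capped so a malformed cycle cannot loop forever.
fn depth_of(project: &str, parents: &HashMap<&'static str, &'static str>) -> usize {
    let mut depth = 0;
    let mut current = project;
    while let Some(&parent) = parents.get(current) {
        depth += 1;
        if depth > parents.len() {
            break; // more hops than links exist: must be a cycle
        }
        current = parent;
    }
    depth
}

/// Per-project task counts: the in-memory analogue of a `GROUP BY`.
fn task_counts(tasks: &[Task]) -> HashMap<&'static str, usize> {
    let mut counts = HashMap::new();
    for t in tasks {
        *counts.entry(t.project).or_insert(0) += 1;
    }
    counts
}

/// Hypothetical sample data: heph > backend > sync, plus one unrelated link.
fn sample() -> (Vec<Link>, Vec<Task>) {
    let links = vec![
        Link { src: "backend", kind: "parent", dst: "heph" },
        Link { src: "sync", kind: "parent", dst: "backend" },
        // A link type this reader has no handling for: it is just text.
        Link { src: "sync", kind: "blocks", dst: "heph" },
    ];
    let tasks = vec![
        Task { project: "heph" },
        Task { project: "sync" },
        Task { project: "sync" },
    ];
    (links, tasks)
}
```

On the sample data, `depth_of("sync", &parent_map(&links))` walks `sync`, `backend`, `heph` and `task_counts(&tasks)` groups the three tasks by project. Both functions only read data that older nodes already materialize, so nothing here emits an op or touches the schema.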