hephaestus/docs/reference/tech-spec.md
Erich Blume cbf859b2d7
Set up hephaestus from template and add design + tech spec
Customize the generated repo (rename Dagger module to hephaestus_ci /
HephaestusCi, set docs baseUrl, add All-Rights-Reserved LICENSE, update
README/AGENTS), and add the project's foundational design documentation:

- docs/explanation/design.md — rationale + decision-history record
- docs/reference/tech-spec.md — implementation-ready technical spec

These define hephaestus as a self-hosted, client/server + offline-first
system unifying a markdown knowledge base with task management: typed node
graph, the lived priority discipline ("what is next?"), recurrence with
fresh-per-occurrence checklists, op-log/CRDT sync with conflict resolution,
OIDC/Authentik auth, the heph.nvim surface, and a TDD strategy.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 09:37:28 -07:00


| title | modified | tags |
|---|---|---|
| Technical Specification | 2026-05-31 | reference, design |

Hephaestus — Technical Specification

Clean, implementation-facing spec for the v1 prototype. For the why behind every choice (history, prior art, decision trail), see design. Where this spec and the design doc disagree, the design doc's latest decision wins — file an update here.

1. Overview

Hephaestus (heph) is a self-hosted personal context-management system that unifies a markdown knowledge base with task management in one database. v1 is a distributed client/server system: each device runs a local replica that works fully offline, and syncs to a central hub when reachable, with automatic merge + conflict resolution for concurrent offline edits. Access is authenticated via OIDC (Authentik) with per-user data isolation. The web UI and the actual k3s deployment are later phases; the hub ships in v1 as a runnable/deployable binary. Components:

  • heph-core — Rust library: data model, SQLite store, query engine, markdown parsing/extraction, recurrence, and the sync engine (op-log, HLC, CRDT merge, conflict detection).
  • hephd — Rust daemon with two modes:
    • client mode (per device): owns the local SQLite replica; serves a JSON-RPC API over a unix socket to local surfaces; runs background sync to the hub.
    • server/hub mode: owns the central SQLite; serves the authenticated sync endpoint (and, later, the web UI) over the network.
  • heph — Rust CLI: utility/admin surface (export, scripting, smoke tests, heph conflicts).
  • heph.nvim — Lua Neovim plugin: the primary user surface ("org-mode"-style); a thin client of the local hephd.

2. Development approach

Development is test-driven (TDD). Write the failing test first; implement to green; refactor. No feature is "done" without tests at the appropriate layer(s) in §9. Core logic must be deterministic and clock-injected (no ambient wall-clock reads in heph-core; the current time is always passed in) so ranking and recurrence are testable.
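
A minimal sketch of what clock injection could look like, assuming a hypothetical `Clock` trait and `is_actionable` helper (neither name comes from this spec). Passing `now` as a plain `i64` parameter would satisfy the same rule; the trait only makes the production/test split explicit.

```rust
/// Source of "now" in epoch milliseconds. Core logic receives one of these
/// (or a plain timestamp) and never reads the wall clock itself.
pub trait Clock {
    fn now_ms(&self) -> i64;
}

/// Real clock; in this sketch it would live at the daemon edge, not in core logic.
pub struct SystemClock;

impl Clock for SystemClock {
    fn now_ms(&self) -> i64 {
        use std::time::{SystemTime, UNIX_EPOCH};
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_millis() as i64)
            .unwrap_or(0)
    }
}

/// Test clock pinned to a fixed instant, so assertions are repeatable.
pub struct FixedClock(pub i64);

impl Clock for FixedClock {
    fn now_ms(&self) -> i64 {
        self.0
    }
}

/// A task is actionable when it has no do-date or its do-date has arrived.
pub fn is_actionable(do_date: Option<i64>, clock: &dyn Clock) -> bool {
    match do_date {
        None => true,
        Some(d) => d <= clock.now_ms(),
    }
}
```

A test would construct `FixedClock(1_000)` and assert on `is_actionable` at, before, and after that instant without any dependence on the machine's time.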

3. Architecture

  • Surfaces (heph.nvim, heph CLI) are thin clients; they never touch SQLite directly. All state operations go through the local hephd over the unix socket. Socket path default: $XDG_RUNTIME_DIR/heph/hephd.sock.
  • Each device's hephd (client mode) is the single writer/owner of that device's local SQLite replica (WAL mode), and works fully offline. It records every local mutation as an op in an append-only op-log and runs background sync to the hub.
  • The hub hephd (server mode) is the hub-and-spoke rendezvous: devices push/pull ops to it over the network (authenticated, §13). It is not required to be online — devices keep working and reconcile when it returns.
  • Merge is automatic where possible; the unresolvable remainder goes to a conflict queue surfaced to the user (§12). Sync never blocks local work.
  • heph-core is synchronous and side-effect-light (incl. deterministic merge logic); hephd wraps it with async I/O, transport, and auth (tokio). DB calls run on a blocking pool.
  • See §12 (Sync & Conflict Resolution) and §13 (Authentication) for the detailed models.
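
As an illustration of the socket-path default above, the resolution could be written as a pure function. The name `default_socket_path` is hypothetical, and returning `None` when `XDG_RUNTIME_DIR` is unset or empty is an assumption; the spec does not say what the fallback should be.

```rust
use std::path::PathBuf;

/// Build `$XDG_RUNTIME_DIR/heph/hephd.sock` from the value of XDG_RUNTIME_DIR.
/// The value is passed in (not read from the environment here) to keep this testable.
/// Assumption: no fallback is defined, so an unset or empty value yields None.
pub fn default_socket_path(xdg_runtime_dir: Option<&str>) -> Option<PathBuf> {
    let base = xdg_runtime_dir.filter(|s| !s.is_empty())?;
    Some(PathBuf::from(base).join("heph").join("hephd.sock"))
}
```

A caller would typically pass `std::env::var("XDG_RUNTIME_DIR").ok().as_deref()`.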

4. Data model

All first-class entities are nodes; relationships are links. Markdown bodies are stored in SQLite; files are an export artifact, not the source of truth.

4.1 Node kinds

| kind | meaning | body |
|---|---|---|
| doc | rich context document (knowledge base, work-logs, journals) | markdown |
| task | thin task or ephemeral context item (see §4.3) | none (context via links) |
| project | grouping/context for tasks | optional |
| tag | label | optional |
| journal | daily note, titled by ISO date | markdown |

4.2 Link types

wiki (materialized from [[links]] in a body), canonical-context (task → its auto-created context doc), context-of, log-of (task → its append-only log), blocks, parent, tagged, in-project.

4.3 Task semantics

  • Attention-state (required on committed tasks): white (do once do-date arrives), orange (top of mind), red (top of mind + a consequence exists if late — consequence, not severity), blue (on-deck/backlog).
  • do-date = earliest actionable date ("do date"), not a deadline. Optional late-on marks when lateness becomes a problem.
  • Commitment axis: committed = 1 tasks participate in scheduling/"what is next"; committed = 0 are ephemeral context items scoped to a container (container_id), with only outstanding/done states, never surfaced globally. Context items may be promoted to committed tasks.
  • States: outstanding, done, dropped (done and dropped are both "not outstanding"; the distinction is retained).
  • No hard deletes: rows are marked tombstoned instead; physical deletion happens only in an explicit cleanup mode.
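
The task semantics above could be modelled roughly as follows. This is a sketch under assumptions: the type and method names are illustrative, `promote` takes a required attention value although the RPC in §6 makes it optional, and whether a promoted item keeps its `container_id` is not specified (it is kept here).

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Attention { White, Orange, Red, Blue }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State { Outstanding, Done, Dropped }

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Task {
    pub id: String,
    pub attention: Option<Attention>, // required once the task is committed
    pub state: State,
    pub committed: bool,              // true = committed task, false = context item
    pub container_id: Option<String>, // set for context items
    pub tombstoned: bool,
}

impl Task {
    /// Invariants as read from the bullets above: committed tasks carry an
    /// attention-state; context items have a container and are never "dropped".
    pub fn is_valid(&self) -> bool {
        if self.committed {
            self.attention.is_some()
        } else {
            self.container_id.is_some() && self.state != State::Dropped
        }
    }

    /// Promote a context item to a committed task.
    pub fn promote(&mut self, attention: Attention) -> Result<(), &'static str> {
        if self.committed {
            return Err("already a committed task");
        }
        self.committed = true;
        self.attention = Some(attention);
        Ok(())
    }

    /// "Delete" only ever sets the tombstone flag; the row stays in place.
    pub fn tombstone(&mut self) {
        self.tombstoned = true;
    }
}
```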

4.4 Recurrence (§3.3 of design)

A task with a non-null recurrence (RFC-5545 RRULE) is a recurring definition. Sub-items flagged is_template = 1 form its checklist template. Each occurrence produces a fresh checklist instance (new outstanding context items copied from the template); completion never carries forward across occurrences — this is a hard requirement.

Two candidate implementations (pick at kickoff; (a) is the current leaning):

  • (a) Occurrence instances: definition spawns a task_occurrences row per occurrence, each with its own do-date and fresh checklist items. Full history.
  • (b) Roll-forward in place: single node; on completion, log the occurrence, reset the checklist to outstanding, advance the do-date.
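
To make the fresh-checklist requirement concrete, here is a small sketch in the spirit of option (a). All names are hypothetical, and the next do-date is assumed to come from RRULE expansion elsewhere. The key point is that a new occurrence copies template titles only, never completion state.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ChecklistItem {
    pub title: String,
    pub done: bool,
}

/// Recurring definition; `template` holds the titles of its is_template = 1 sub-items.
#[derive(Debug, Clone)]
pub struct Definition {
    pub template: Vec<String>,
}

#[derive(Debug, Clone)]
pub struct Occurrence {
    pub do_date: i64, // epoch ms
    pub done: bool,
    pub checklist: Vec<ChecklistItem>,
}

/// Spawn an occurrence with a fresh, all-outstanding checklist.
pub fn spawn_occurrence(def: &Definition, do_date: i64) -> Occurrence {
    Occurrence {
        do_date,
        done: false,
        checklist: def
            .template
            .iter()
            .map(|t| ChecklistItem { title: t.clone(), done: false })
            .collect(),
    }
}

/// Complete the current occurrence and spawn the next one.
/// The finished occurrence keeps its own history; nothing is carried forward.
pub fn complete_and_advance(
    def: &Definition,
    current: &mut Occurrence,
    next_do_date: i64,
) -> Occurrence {
    current.done = true;
    spawn_occurrence(def, next_do_date)
}
```

Because the new checklist is built from the definition rather than from the previous occurrence, there is no code path through which a checked item could leak across occurrences.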

4.5 SQLite schema (starting point)

CREATE TABLE nodes(
  id           TEXT PRIMARY KEY,     -- ULID
  owner_id     TEXT NOT NULL REFERENCES users(id),  -- per-user isolation
  kind         TEXT NOT NULL,        -- doc|task|project|tag|journal
  title        TEXT NOT NULL,
  body         TEXT,                 -- markdown (nullable); materialized view of body_crdt
  body_crdt    BLOB,                 -- text-CRDT state for the body (merge), nullable
  created_at   INTEGER NOT NULL,     -- epoch ms
  modified_at  INTEGER NOT NULL,     -- epoch ms
  hlc          TEXT NOT NULL,        -- hybrid logical clock of last write (sync ordering)
  tombstoned   INTEGER NOT NULL DEFAULT 0
);

CREATE TABLE tasks(
  node_id      TEXT PRIMARY KEY REFERENCES nodes(id),
  attention    TEXT,                 -- white|orange|red|blue (committed tasks)
  do_date      INTEGER,              -- epoch ms, nullable
  late_on      INTEGER,              -- epoch ms, nullable
  state        TEXT NOT NULL,        -- outstanding|done|dropped
  committed    INTEGER NOT NULL,     -- 1 committed task, 0 context item
  container_id TEXT REFERENCES nodes(id),  -- context item → container
  recurrence   TEXT,                 -- RRULE; present = recurring definition
  is_template  INTEGER NOT NULL DEFAULT 0  -- checklist-template item
);

-- recurrence model (a) only:
CREATE TABLE task_occurrences(
  id              TEXT PRIMARY KEY,  -- ULID
  def_id          TEXT NOT NULL REFERENCES nodes(id),
  occurrence_date INTEGER NOT NULL,
  state           TEXT NOT NULL,     -- outstanding|done|dropped|skipped
  created_at      INTEGER NOT NULL,
  tombstoned      INTEGER NOT NULL DEFAULT 0
);

CREATE TABLE links(
  id          TEXT PRIMARY KEY,      -- ULID
  src_id      TEXT NOT NULL REFERENCES nodes(id),
  dst_id      TEXT NOT NULL REFERENCES nodes(id),
  type        TEXT NOT NULL,
  created_at  INTEGER NOT NULL,
  tombstoned  INTEGER NOT NULL DEFAULT 0
);

CREATE TABLE aliases(node_id TEXT REFERENCES nodes(id), alias TEXT);  -- wiki-link name resolution
CREATE VIRTUAL TABLE nodes_fts USING fts5(title, body);              -- FTS5 over title, body

-- identity & sync --
CREATE TABLE users(
  id          TEXT PRIMARY KEY,      -- ULID
  oidc_sub    TEXT UNIQUE NOT NULL,  -- OIDC subject (Authentik)
  name        TEXT,
  created_at  INTEGER NOT NULL
);

CREATE TABLE oplog(                  -- append-only operation log (the sync unit)
  id          TEXT PRIMARY KEY,      -- ULID
  owner_id    TEXT NOT NULL REFERENCES users(id),
  hlc         TEXT NOT NULL,         -- hybrid logical clock (causal order)
  origin      TEXT NOT NULL,         -- originating device id
  op_type     TEXT NOT NULL,         -- node.create|node.body_delta|task.set_field|link.add|link.remove|...
  target_id   TEXT NOT NULL,
  payload     TEXT NOT NULL,         -- JSON (e.g. CRDT delta, field+value, OR-Set add/remove)
  applied     INTEGER NOT NULL DEFAULT 0
);

CREATE TABLE sync_state(             -- per-peer cursor (device ↔ hub)
  peer        TEXT PRIMARY KEY,      -- 'hub' on a client; device id on the hub
  last_pushed_hlc TEXT,
  last_pulled_hlc TEXT,
  updated_at  INTEGER NOT NULL
);

CREATE TABLE conflicts(              -- ambiguous merges surfaced to the user
  id          TEXT PRIMARY KEY,
  owner_id    TEXT NOT NULL REFERENCES users(id),
  node_id     TEXT NOT NULL REFERENCES nodes(id),
  field       TEXT NOT NULL,         -- which field / 'body-region'
  local_val   TEXT,
  remote_val  TEXT,
  local_hlc   TEXT,
  remote_hlc  TEXT,
  status      TEXT NOT NULL,         -- open|resolved
  created_at  INTEGER NOT NULL
);

Projects/tags are nodes; membership is links (in-project, tagged). All tasks/task_occurrences/links rows inherit ownership via their node(s).

5. Markdown handling

  • Bodies are stored verbatim. On write (node create/update), heph-core extracts:
    • [[wiki-links]] → wiki links (resolved via aliases/title; unresolved links are allowed and recorded).
    • GFM task-list items (- [ ] / - [x]) → context-item state under the node (Option A editing model — see design §6.3; the alternative is command-driven items, decided in-prototype).
  • Extraction is idempotent and diff-based: re-writing an unchanged body is a no-op; reworded checklist lines tombstone-old + add-new (context items are cheap).
  • export materializes all non-tombstoned nodes to a directory tree of .md files (frontmatter + body), reproducing the corpus portably.
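
The extraction and diff rules above can be sketched with plain string scanning. This is only an illustration: the ratified stack names pulldown-cmark for real parsing, the function names are hypothetical, and the diff here keys context items by their text, which is an assumption about how "reworded" is detected.

```rust
/// Extract the targets of [[wiki-links]] in order of appearance.
pub fn extract_wiki_links(body: &str) -> Vec<String> {
    let mut out = Vec::new();
    let mut rest = body;
    while let Some(start) = rest.find("[[") {
        let after = &rest[start + 2..];
        match after.find("]]") {
            Some(end) => {
                let target = after[..end].trim();
                // Skip empty or multi-line matches; they are not treated as links here.
                if !target.is_empty() && !target.contains('\n') {
                    out.push(target.to_string());
                }
                rest = &after[end + 2..];
            }
            None => break, // unterminated "[[": stop scanning
        }
    }
    out
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CheckItem {
    pub text: String,
    pub done: bool,
}

/// Extract GFM task-list items ("- [ ] text" / "- [x] text"), one per line.
pub fn extract_task_items(body: &str) -> Vec<CheckItem> {
    body.lines()
        .filter_map(|line| {
            let rest = line.trim_start().strip_prefix("- ")?;
            let (done, text) = if let Some(t) = rest.strip_prefix("[ ] ") {
                (false, t)
            } else if let Some(t) = rest.strip_prefix("[x] ").or_else(|| rest.strip_prefix("[X] ")) {
                (true, t)
            } else {
                return None;
            };
            Some(CheckItem { text: text.trim().to_string(), done })
        })
        .collect()
}

/// Diff by item text: returns (texts to tombstone, texts to add).
/// An unchanged body yields two empty lists, i.e. a no-op.
pub fn diff_items(old: &[CheckItem], new: &[CheckItem]) -> (Vec<String>, Vec<String>) {
    let to_tombstone: Vec<String> = old
        .iter()
        .filter(|o| !new.iter().any(|n| n.text == o.text))
        .map(|o| o.text.clone())
        .collect();
    let to_add: Vec<String> = new
        .iter()
        .filter(|n| !old.iter().any(|o| o.text == n.text))
        .map(|n| n.text.clone())
        .collect();
    (to_tombstone, to_add)
}
```

In this sketch, toggling a checkbox changes only `done`, so it shows up as a state update on an existing item rather than as a tombstone plus add.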

6. Daemon RPC API (JSON-RPC over unix socket)

Methods (request → response; errors are JSON-RPC errors). Signatures are indicative, not final:

  • node.get(id) → Node
  • node.create({kind, title, body?}) → Node
  • node.update({id, title?, body?}) → Node (body update re-runs extraction)
  • node.tombstone(id) → ok
  • task.create({title, project?, attention?, do_date?, late_on?, recurrence?, committed?}) → Task (auto-creates the canonical context doc + canonical-context link)
  • task.set_state({id, state}) → Task (recurring: advances per §4.4)
  • task.set_attention({id, attention}) → Task
  • task.promote({context_item_id, attention?, project?}) → Task
  • next({scope?, limit?}) → [RankedTask] (the "what is next?" query, §7)
  • search({query, filters?}) → [Node] (FTS)
  • links.outgoing(id) → [Link] / links.backlinks(id) → [Link]
  • journal.open_or_create(date) → Node
  • log.append({task_id, text}) → ok (append to the task's log-of node)
  • export({path}) → {count}
  • health() → {orange_count, active_count, on_deck_count, conflict_count, sync_status, ...} (working-set + sync indicators)
  • auth: auth.login() → {device_code_flow...} / auth.status() → {user, logged_in} / auth.logout() (§13)
  • sync: sync.now() → {pushed, pulled, conflicts} (force a sync cycle; background sync runs automatically) / sync.status() → {last_pushed, last_pulled, pending_ops, online}
  • conflicts: conflicts.list() → [Conflict] / conflicts.resolve({id, choice}) → ok

Every local mutation method records an op in the op-log for sync. The local daemon supports server-push notifications (e.g. a node changed by an incoming sync) so an open buffer can reconcile in real time.
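
For orientation, a client-side sketch of the line-delimited framing. It assumes JSON-RPC 2.0 envelopes (the spec says only "JSON-RPC") and uses hypothetical helper names; it builds the JSON by hand to stay dependency-free, where a real client would more likely use a JSON library.

```rust
use std::io::{self, BufRead, Write};

/// Write one request as a single newline-terminated JSON line.
/// `params_json` must already be valid JSON; `method` is one of the fixed
/// method names listed above, so it is not escaped in this sketch.
pub fn send_request<W: Write>(
    out: &mut W,
    id: u64,
    method: &str,
    params_json: &str,
) -> io::Result<()> {
    let line = format!(
        "{{\"jsonrpc\":\"2.0\",\"id\":{},\"method\":\"{}\",\"params\":{}}}\n",
        id, method, params_json
    );
    out.write_all(line.as_bytes())
}

/// Read one newline-terminated message (a response or a pushed notification).
/// Returns an empty string at end of stream.
pub fn read_message<R: BufRead>(reader: &mut R) -> io::Result<String> {
    let mut line = String::new();
    reader.read_line(&mut line)?;
    Ok(line.trim_end().to_string())
}
```

A surface would open the socket with `std::os::unix::net::UnixStream::connect(path)`, write requests to the stream, and wrap a clone of it in one long-lived `BufReader` so that buffered notifications are not lost between reads.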

6.1 Hub (server-mode) sync endpoint

Separate from the local unix-socket RPC, the hub exposes an authenticated network endpoint (HTTP/JSON or gRPC — pick at kickoff) for op exchange: clients present an OIDC bearer token (§13); the hub validates, scopes by owner_id, accepts pushed ops, and returns ops the client hasn't seen (by HLC cursor). The hub applies the same heph-core merge logic.
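
The pull half of that exchange might look like the following on the hub, once the bearer token has been validated and mapped to an owner. This is a sketch with assumptions: `Op` is trimmed to the fields needed here, the HLC text encoding is assumed to sort lexicographically in causal order, and skipping ops that originated on the requesting device is a guess, not something the spec states.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Op {
    pub id: String,
    pub owner_id: String,
    pub hlc: String,    // assumed to compare lexicographically in causal order
    pub origin: String, // originating device id
}

/// Ops to return to a pulling device: same owner only, strictly after the
/// device's HLC cursor, excluding ops the device itself produced, in HLC order.
pub fn ops_for_pull<'a>(
    log: &'a [Op],
    owner_id: &str,
    device_id: &str,
    cursor: Option<&str>,
) -> Vec<&'a Op> {
    let mut out: Vec<&Op> = log
        .iter()
        .filter(|op| op.owner_id == owner_id) // per-user isolation
        .filter(|op| op.origin != device_id)
        .filter(|op| cursor.map_or(true, |c| op.hlc.as_str() > c))
        .collect();
    out.sort_by(|a, b| a.hlc.cmp(&b.hlc));
    out
}
```

Filtering by `owner_id` before anything else keeps the isolation rule from §13 in one obvious place.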

7. "What is next?" ranking

Given an optional scope (project/context) and limit (default 5):

  1. Candidates: committed tasks where state = outstanding, not tombstoned, attention ≠ blue, and actionable now: do_date IS NULL OR do_date ≤ now. For recurring definitions, evaluate the current active occurrence's do-date. Apply scope if given.
  2. Order:
    1. attention: red > orange > white;
    2. urgency: tasks past late_on first, then most-overdue (smallest do_date) first;
    3. tie-break: earlier do_date, then created_at.
  3. Output: concise rows — title, project, attention, do/late, link to canonical context. red items always appear regardless of limit.

blue (on-deck) is hidden from next by design; surfaced only by an explicit on-deck view. health() exposes the working-set tensions (orange ≤ 6, active ≤ ~30, on-deck count) honestly — never masking overload nor manufacturing calm.
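
The ranking rules can be expressed as a filter plus a single sort key. This is a sketch under assumptions: the struct and function names are illustrative, the input is assumed to contain committed tasks only, scope filtering is omitted, and tasks without a do-date are placed after dated ones within the same group (the spec does not say where they go).

```rust
/// Declaration order doubles as rank order: Red sorts first.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Attention { Red, Orange, White, Blue }

#[derive(Debug, Clone)]
pub struct Candidate {
    pub title: String,
    pub attention: Attention,
    pub do_date: Option<i64>, // epoch ms
    pub late_on: Option<i64>, // epoch ms
    pub created_at: i64,
    pub outstanding: bool,
    pub tombstoned: bool,
}

/// "What is next?" with the current time passed in (no wall-clock reads).
pub fn next(tasks: &[Candidate], now: i64, limit: usize) -> Vec<Candidate> {
    // 1. Candidates: outstanding, not tombstoned, not blue, actionable now.
    let mut ranked: Vec<Candidate> = tasks
        .iter()
        .filter(|t| t.outstanding && !t.tombstoned && t.attention != Attention::Blue)
        .filter(|t| t.do_date.map_or(true, |d| d <= now))
        .cloned()
        .collect();

    // 2. Order: attention, then past-late_on first, then smallest do_date, then created_at.
    ranked.sort_by_key(|t| {
        let past_late = t.late_on.map_or(false, |l| l < now);
        // `false` sorts before `true`, so negate to put late tasks first.
        (t.attention, !past_late, t.do_date.unwrap_or(i64::MAX), t.created_at)
    });

    // 3. Red items always appear: never truncate below the number of reds.
    let reds = ranked.iter().filter(|t| t.attention == Attention::Red).count();
    ranked.truncate(limit.max(reds));
    ranked
}
```

Because red sorts first, keeping at least `reds` rows is enough to guarantee every red item survives the limit.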

8. heph.nvim surface (v1)

Replaces obsidian.nvim. Telescope-backed. Core commands/gestures:

  • Follow [[wiki-link]] under cursor on <Enter>.
  • Search / quick-switch / tags / backlinks / outgoing links (pickers).
  • Daily journal picker (create/open dated journal nodes).
  • Task capture; show "what is next" (:Heph next); set attention; mark done/dropped.
  • Open a task's canonical context doc; edit context-item checkboxes (Option A) in the buffer (extracted on :w).
  • Per-task log quick-append without leaving the current buffer.

9. Testing strategy (TDD, layered)

All layers are required; CI runs them on every push/PR (extend .forgejo/scripts/build to run cargo test and the nvim e2e suite; prek already runs in build.yaml).

  • Unit (heph-core): model invariants; markdown extraction (wiki-links, checkboxes); RRULE expansion and the fresh-checklist-per-occurrence rule (assert completion never carries forward); the "what is next?" ranking (table-driven cases); migration up/down.
  • Property tests (proptest): ranking yields a total order; extraction is idempotent; recurrence never leaks completion state across occurrences; tombstones are never resurrected.
  • Integration (hephd, real sockets): start a daemon against a temp SQLite file, connect over a real unix socket, and exercise the RPC API end-to-end, asserting resulting DB state. Include multi-client concurrency tests on the socket and clock-injection for deterministic time.
  • Sync & offline (multi-replica): spin up two client hephd replicas + a hub hephd, all over real network sockets against temp DBs, and assert convergence:
    • online round-trip: edit on A → appears on B via the hub;
    • offline → reconcile: partition A and B from the hub, make divergent edits on each, reconnect, assert both converge;
    • conflict path: concurrent conflicting scalar edits (e.g. both set a different do-date) land in the conflict queue and conflicts.resolve settles them deterministically;
    • body CRDT merge: concurrent edits to the same doc body auto-merge without a hard conflict;
    • HLC ordering and op-log idempotency (replaying ops is a no-op).
  • Auth: OIDC token validation on the hub endpoint (reject missing/invalid/expired); per-user isolation (user A cannot read/sync user B's nodes); device-code flow happy path against a mock IdP.
  • End-to-end (headless nvim): drive heph.nvim in nvim --headless against a real hephd + temp DB, running scripted example workflows and asserting outcomes (via RPC/DB state and buffer contents). Minimum workflows:
    • capture a task → appears in :Heph next → open canonical context → add a checklist item → check it → mark task done;
    • create today's journal via the picker;
    • follow a [[link]] on <Enter> to the target doc;
    • a recurring task with a checklist: complete it, then assert the next occurrence presents a fresh, all-unchecked checklist;
    • a sync-driven update arrives while a buffer is open and the buffer reconciles.
    • Harness: plenary.nvim/busted, or drive nvim via its msgpack-RPC from the test runner. Keep example workflows as reusable fixtures.
  • CLI tests: invoke heph subcommands against a temp DB; snapshot output; assert export round-trips the corpus; heph conflicts lists/resolves.

10. Technology stack (ratified)

rusqlite (bundled) + migration runner · tokio + line-delimited JSON-RPC over unix socket · ulid · rrule · pulldown-cmark · clap · anyhow/thiserror · tracing. Neovim plugin in Lua, depending on telescope.nvim. Cargo workspace: crates/heph-core, crates/hephd, crates/heph, plus heph.nvim/.

Added for v1 client/server + auth (some to confirm at kickoff):

  • Text CRDT (body merge): yrs (Rust Yjs) — leaning; alternative automerge. Used for doc/journal/log bodies. Structured fields use a bespoke op-log + HLC (no library needed).
  • HLC: small bespoke hybrid-logical-clock (or a crate) — deterministic, clock-injected.
  • Hub network transport: axum (HTTP/JSON) for the sync endpoint — leaning (reuses the eventual web-UI server); reqwest on the client side.
  • OIDC: openidconnect crate for the Authentik device-code flow; tokens cached in the OS keychain (keyring) / 1Password.

11. v1 scope

In:

  • The full data model, markdown handling, "what is next?" ranking, and recurrence + recurring checklists (§4 to §8).
  • Client/server architecture: per-device client hephd + a central hub hephd (runnable/deployable binary).
  • Offline-first operation with op-log + CRDT sync and automatic merge + a conflict queue (§12).
  • OIDC/Authentik authentication with per-user data isolation (§13).
  • heph.nvim + heph CLI surfaces (incl. heph conflicts).

Out (later phases, scaffolded so as not to block):

  • Web UI (the hub serves sync only in v1; reserve axum for it later).
  • Actual k3s deployment to blumeops (Dagger→Zot image, ArgoCD app + Kustomize manifests, external-secrets) — fast-follow once the architecture is proven; the hub binary is built to be deployable.
  • Calendar integration (read-mostly CalDAV; never explode recurrence into stored events), iOS/Watch capture, inferred/semantic context, P2P-over-tailnet sync fallback.

See design §5 to §7 for the constraints later phases impose on present choices (keep tasks vs. calendar events separate; expand RRULEs lazily).

12. Sync & conflict resolution

Topology: hub-and-spoke. Each device holds a full local replica + op-log; the hub is the rendezvous. Devices push/pull ops by HLC cursor; the hub never needs to be online for local work.

Merge semantics (the unit of sync is the op):

  • doc/journal/log bodies: text CRDT (yrs) → concurrent edits always merge, no hard conflict.
  • Scalar task fields (attention, do_date, late_on, state, …): last-writer-wins by HLC. The losing value, if meaningful, is recorded in conflicts (surfaced, not silently dropped).
  • Links / set membership (tags, project, parent): OR-Set add/remove semantics → no conflicts.
  • Tombstones, never hard deletes → deletion/merge is monotonic and CRDT-friendly.
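
Two of these rules, sketched in isolation. Assumptions: all names are illustrative; HLC strings are assumed unique and lexicographically comparable; the caller is assumed to invoke `merge_lww` only for writes it already considers concurrent; and the OR-Set is the textbook tagged-add variant, which may differ from what the project adopts.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FieldWrite {
    pub value: String,
    pub hlc: String,
}

/// Mirrors the value/HLC columns of the `conflicts` table.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Conflict {
    pub field: String,
    pub local_val: String,
    pub remote_val: String,
    pub local_hlc: String,
    pub remote_hlc: String,
}

/// Last-writer-wins by HLC. When the field is "meaningful" and the values
/// differ, the discarded value is surfaced as a conflict instead of vanishing.
pub fn merge_lww(
    field: &str,
    local: &FieldWrite,
    remote: &FieldWrite,
    meaningful: bool,
) -> (FieldWrite, Option<Conflict>) {
    let winner = if remote.hlc > local.hlc { remote } else { local };
    let conflict = if meaningful && local.value != remote.value {
        Some(Conflict {
            field: field.to_string(),
            local_val: local.value.clone(),
            remote_val: remote.value.clone(),
            local_hlc: local.hlc.clone(),
            remote_hlc: remote.hlc.clone(),
        })
    } else {
        None
    };
    (winner.clone(), conflict)
}

/// Observed-remove set: each add carries a unique tag (e.g. the op's ULID);
/// a remove only cancels the tags this replica has already seen.
#[derive(Debug, Default, Clone)]
pub struct OrSet {
    adds: HashSet<(String, String)>, // (element, add tag)
    removed_tags: HashSet<String>,
}

impl OrSet {
    pub fn add(&mut self, elem: &str, tag: &str) {
        self.adds.insert((elem.to_string(), tag.to_string()));
    }

    pub fn remove(&mut self, elem: &str) {
        let seen: Vec<String> = self
            .adds
            .iter()
            .filter(|(e, _)| e == elem)
            .map(|(_, t)| t.clone())
            .collect();
        self.removed_tags.extend(seen);
    }

    pub fn contains(&self, elem: &str) -> bool {
        self.adds
            .iter()
            .any(|(e, t)| e == elem && !self.removed_tags.contains(t))
    }

    /// Merge is a union of both sides, so it is commutative and idempotent.
    pub fn merge(&mut self, other: &OrSet) {
        self.adds.extend(other.adds.iter().cloned());
        self.removed_tags.extend(other.removed_tags.iter().cloned());
    }
}
```

With this OR-Set, a tag removed on one device and concurrently re-added on another ends up present on both after sync, which is why set membership never needs the conflict queue.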

Conflict queue: the unresolvable/ambiguous remainder (a discarded LWW value on a meaningful field; flagged overlapping body regions) becomes an open row in conflicts. Surfaced via health().conflict_count, conflicts.list, heph conflicts, and a heph.nvim view: "you have N conflicts." conflicts.resolve({id, choice}) settles each. Sync never blocks on conflicts.

Determinism: HLCs are clock-injected; op application is idempotent and order-independent given HLC. These are the core invariants the sync tests assert.
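
A compact sketch of a clock-injected hybrid logical clock plus an idempotent apply guard, following the standard HLC update rules. The type names, the `to_text` encoding (fixed-width hex so that string order matches timestamp order), and the `apply_once` helper are all assumptions for illustration.

```rust
use std::collections::HashSet;

/// Ordering is derived field by field: wall time first, then counter.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Timestamp {
    pub wall_ms: u64,
    pub counter: u32,
}

impl Timestamp {
    /// Hypothetical TEXT encoding; the device id suffix makes stamps unique across devices.
    pub fn to_text(&self, device_id: &str) -> String {
        format!("{:016x}-{:08x}-{}", self.wall_ms, self.counter, device_id)
    }
}

pub struct Hlc {
    last: Timestamp,
}

impl Hlc {
    pub fn new() -> Self {
        Hlc { last: Timestamp { wall_ms: 0, counter: 0 } }
    }

    /// Stamp a local event. `now_ms` is injected; a clock that stalls or
    /// runs backwards only bumps the counter, so stamps stay monotonic.
    pub fn tick(&mut self, now_ms: u64) -> Timestamp {
        self.last = if now_ms > self.last.wall_ms {
            Timestamp { wall_ms: now_ms, counter: 0 }
        } else {
            Timestamp { wall_ms: self.last.wall_ms, counter: self.last.counter + 1 }
        };
        self.last
    }

    /// Fold in a timestamp received from another replica.
    pub fn observe(&mut self, remote: Timestamp, now_ms: u64) -> Timestamp {
        let wall = now_ms.max(self.last.wall_ms).max(remote.wall_ms);
        let counter = if wall == self.last.wall_ms && wall == remote.wall_ms {
            self.last.counter.max(remote.counter) + 1
        } else if wall == self.last.wall_ms {
            self.last.counter + 1
        } else if wall == remote.wall_ms {
            remote.counter + 1
        } else {
            0
        };
        self.last = Timestamp { wall_ms: wall, counter };
        self.last
    }
}

/// Idempotent apply guard: returns true only the first time an op id is seen,
/// so replaying the same op is a no-op for the caller.
pub fn apply_once(seen: &mut HashSet<String>, op_id: &str) -> bool {
    seen.insert(op_id.to_string())
}
```

Since both `tick` and `observe` take `now_ms` as an argument, the sync tests can drive replicas through skewed or frozen clocks deterministically.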

Open at kickoff: CRDT lib confirmation (yrs vs automerge); hub transport (axum HTTP/JSON vs gRPC); propagation cadence (push vs. periodic pull); exactly which fields are "meaningful" enough to enqueue vs. silently LWW.

13. Authentication

  • OIDC against Authentik. Clients authenticate via the OAuth 2.0 device-code flow (auth.login); the resulting tokens are cached in the OS keychain (keyring) / 1Password and refreshed automatically. Offline devices operate on cached credentials.
  • Hub enforcement: the sync endpoint requires a valid OIDC bearer token; the hub maps the token's sub to a users row and scopes every op by owner_id. No cross-user reads/writes.
  • Per-user isolation: all nodes (and their dependent rows) carry owner_id; queries and sync are always user-scoped. In practice a single user (eblume), but the isolation is real from day one.
  • Local trust: the local unix-socket RPC trusts the OS user (file-permission-scoped socket); app-level auth is for the network boundary (device ↔ hub).
  • At-rest: plain SQLite in v1 (no encryption) — security boundary is auth + (eventually) network restriction. May revisit (see design).

Related: design, the full design document with rationale and decision history.