docs: Phase 1 progress tracker (design roadmap + tech-spec §14)

Record what's built and the resume order for the next context session:
heph-core + hephd local mode + CLI/export + local query surface + the
sync engine (HLC, op-log, converging merge/conflict-queue) are done;
resume at yrs body-CRDT → network sync → OIDC auth → heph.nvim.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-01 05:23:43 -07:00

Hephaestus — Design Document

Status: Living design record. This is the rationale + decision-history document. For the clean, implementation-facing spec the next session builds from, see tech-spec. Sections marked ❓ OPEN are unresolved; 🔒 DECIDED are settled.

1. Purpose & Vision

Hephaestus is a personal context management system that fuses two systems the owner already relies on into a single, purpose-built, self-hosted application:

The Zettelkasten (~/code/personal/zk) — an Obsidian vault of ~600 markdown files acting as a personal wiki, journal, knowledge base, and "task-context" store for long-running projects. Notes use [[wiki-links]], YAML frontmatter (id, aliases, tags), timestamp-based note IDs (1722897441-MWFE), and daily notes (YYYY-MM-DD).
Todoist — the owner's primary task manager, accessed today via a bespoke Python mise-task in blumeops (mise-tasks/blumeops-tasks) that polls the Todoist v1 API (projects → tasks; fields: content, description, priority, due) using a token from 1Password.

Today these are loosely coupled by fragile cross-links (todoist://<task-id> in notes, [[note]] text in Todoist comments). Hephaestus replaces the fragile coupling with one unified data model and database where tasks and knowledge are first-class, linkable entities.

Guiding principles

Context-aware: A task like "Fix the roof leak" is coupled to the relevant knowledge — the home-repair log, the log of contractor calls, etc. Surfacing that context is a core feature, not an afterthought.
Workflow-optimized: Highly optimized, concise answers to recurring questions — above all "What is next?" — are the primary UX. Speed and concision beat feature breadth.
Owner-built, owner-hosted: A standalone codebase replacing a loose stack of tools, hosted out of blumeops.

2. Goals & Non-Goals

Goals (v1 / prototype)

🔒 Unified data model linking notes and tasks as first-class, cross-linkable entities.
🔒 SQLite backend.
🔒 Rust implementation.
🔒 Distributed dual-mode operation: useful fully offline, auto-syncs to a central self-hosted instance in blumeops (k3s container) when reachable. The central instance may be unavailable for long periods.
🔒 Surfaces over a shared core (via the hephd daemon): CLI (everyday driver), nvim plugin (primary editing UX), web UI (hub-hosted).
🔒 Optimized workflow queries, especially "What is next?", with context surfacing.
🔒 Auth from the ground up (OIDC/Authentik), with isolated multi-user accounts — sensitive data.

Non-Goals (v1) / Later

⏳ iOS app + Apple Watch app for quick voice-dispatched task capture — explicitly a follow-up goal, not v1.
⏳ Sharing/collaboration between users — accounts are isolated for v1 (multi-user isolation is in scope; multi-user collaboration is not).
❌ Obsidian feature parity (canvas, graph view, plugins) — clean break.
❌ Migration / import / Todoist sync in v1 — the prototype is a clean break: heph is the system of record from day one. No ZK import, no Todoist bridge. (Migration may return as a later, optional phase.)
❌ Encryption at rest / E2E in v1 — security model is access restriction (OIDC auth) only; plain SQLite. May revisit.
❌ Inferred/semantic context surfacing in v1 — explicit links + search only.

3. Data Model

🔒 DECIDED (shape). SQLite is the canonical store; a node's body is plain markdown text; an export mode materializes the whole store as a directory of .md files. Clean break from Obsidian-the-app, but markdown stays the portable payload. Tasks are thin; rich context lives in documents; everything is connected by typed links.

Entities

Node — the base entity. Every first-class thing is a node with a stable, sync-safe ID (see §3.1). A node has a kind discriminator. Kinds (initial set):
- doc — a rich markdown context document (the bulk of the knowledge base; replaces ZK articles, journals, logs). Body = markdown text. Has aliases, timestamps.
- task — thin: ID + state, plus a few scalar attributes (attention-state, do-date, late-on, recurrence) and links to a project/tags/context. No prose body of its own — context is supplied by linked doc nodes (or a doc is the work-log for the task). This is the key unification: ZK-articles-with-todos and Todoist-tasks-with-context both reduce to typed links between thin tasks and rich docs. The full task-attribute model is in §6.2 (it encodes the owner's actual lived prioritization system).
- project — a grouping node (maps to Todoist projects / ZK project dirs like payrix/).
- tag — a label node (so tags are linkable/queryable, not just strings).
- journal — a daily-note node keyed by date (YYYY-MM-DD).
Link — a typed, directional edge between two nodes. Types (initial set): wiki (a [[link]] found in a doc body), context-of (doc ↔ task), blocks, parent/child, tagged, in-project. Wiki-links in markdown bodies are parsed and materialized as wiki links so the graph and the prose stay consistent.
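
The entity shapes above can be sketched in a few lines of Rust. This is a minimal illustration, not the tech-spec schema: the type names, the plain-string ids, and the `[[target|label]]` alias syntax are assumptions, and the scanner does not skip code fences or resolve names through the slug/alias layer.

```rust
/// Node kinds from the initial set in this section.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NodeKind { Doc, Task, Project, Tag, Journal }

/// Link types from the initial set in this section.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum LinkType { Wiki, ContextOf, Blocks, ParentChild, Tagged, InProject }

/// A typed, directional edge. Ids are plain strings here (assumption);
/// §3.1 specifies ULIDs or deterministic ids.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Link { pub from: String, pub to: String, pub ty: LinkType }

/// Collect the targets of `[[wiki-links]]` in a markdown body, in order.
/// For `[[target|label]]` only the target is kept (assumed alias syntax).
pub fn extract_wiki_targets(body: &str) -> Vec<String> {
    let mut targets = Vec::new();
    let mut rest = body;
    while let Some(open) = rest.find("[[") {
        let after = &rest[open + 2..];
        match after.find("]]") {
            Some(close) => {
                let inner = &after[..close];
                let target = inner.split('|').next().unwrap_or("").trim();
                if !target.is_empty() {
                    targets.push(target.to_string());
                }
                rest = &after[close + 2..];
            }
            None => break, // unterminated link: stop scanning
        }
    }
    targets
}

/// Materialize one `wiki` link per extracted target. Targets stand in for
/// resolved node ids (hypothetical: real resolution would use aliases).
pub fn materialize_wiki_links(doc_id: &str, body: &str) -> Vec<Link> {
    extract_wiki_targets(body)
        .into_iter()
        .map(|to| Link { from: doc_id.to_string(), to, ty: LinkType::Wiki })
        .collect()
}
```

Re-running `materialize_wiki_links` on every save and replacing the doc's previous `wiki` links is one simple way to keep the graph and the prose consistent.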

3.1 Identity

🔒 DECIDED — ULID for content nodes (doc/task/project), sortable and sync-safe, with a human-facing slug/alias layer so [[wiki-links]] can be written by name. The ZK's timestamp-IDs (1722897441-MWFE) are ULID-like already; a future migration can map them.
🔒 DECIDED — deterministic ids for key-unique kinds. Random ULIDs diverge when two offline replicas independently create the same logical singleton (today's journal, a tag by name) — duplicates that never merge. So journal ids derive deterministically from (owner, ISO-date) and tag ids from (owner, normalized-name); independent creation then yields the same id and the CRDT/op-log coalesce them (two partial daily notes even merge). project stays a random ULID (deliberately created, renameable, tiny stable set). Tag rename = retag (no in-place key change); the normalization function is a fixed, versioned constant. See tech-spec §4.5.
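
To make the coalescing property concrete, here is a hedged sketch of deterministic id derivation. The actual derivation and normalization functions live in tech-spec §4.5 and are not reproduced here; the FNV-1a hash, the id format, and the trim/lowercase/collapse normalization below are all assumptions chosen only because they are stable and dependency-free. A real implementation would likely want a wider hash.

```rust
/// Normalize a tag name so independently created tags coalesce.
/// Assumption: collapse whitespace and lowercase. The real function is a
/// fixed, versioned constant defined in the tech-spec.
pub fn normalize_tag(name: &str) -> String {
    name.split_whitespace().collect::<Vec<_>>().join(" ").to_lowercase()
}

/// 64-bit FNV-1a, used as a stand-in hash because its output is stable
/// across platforms and Rust versions (unlike `DefaultHasher`).
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for b in bytes {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Hypothetical id format: "<kind>-<16 hex digits>".
fn derive_id(kind: &str, owner: &str, key: &str) -> String {
    // NUL separators keep ("ab", "c") distinct from ("a", "bc").
    let material = format!("{kind}\0{owner}\0{key}");
    format!("{kind}-{:016x}", fnv1a64(material.as_bytes()))
}

/// Journal ids derive from (owner, ISO-date).
pub fn journal_id(owner: &str, iso_date: &str) -> String {
    derive_id("journal", owner, iso_date)
}

/// Tag ids derive from (owner, normalized-name).
pub fn tag_id(owner: &str, name: &str) -> String {
    derive_id("tag", owner, &normalize_tag(name))
}
```

Two offline replicas that each call `journal_id("eblume", "2026-06-01")` get the same id, so their ops target one node instead of creating duplicates.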

3.2 Scalar task attributes vs. links

Tags and projects are nodes (linkable, queryable). Attention-state, do-date, late-on, recurrence are scalar columns on the task. See §6.2 for the semantics — these columns are not arbitrary; they encode a specific, battle-tested model. Due dates are not first-class nodes (a task has one project/context; "this Friday" doesn't need its own context).

3.3 Recurrence model (recurring tasks & checklists) — IN SCOPE for v1

A recurring task carries an RFC-5545 RRULE and acts as a recurring definition.

❌ The anti-pattern we must avoid (Todoist's mistake): modeling a recurring task's checklist sub-items as standalone, non-recurring tasks that carry their completion state forward when the parent recurs. This is the corner; designing around it now is why recurring checklists are in v1 rather than deferred.

heph model — 🔒 DECIDED (roll-forward in place):

The recurring definition's checklist template is just the - [ ] lines in its canonical context doc body (under Fork A, §6.3, the body is the template — no separate template entity).
A recurring task is a single node. On completion: (1) append the finished occurrence to the per-task log (§6.4), (2) reset the body's checkboxes to unchecked (a body-CRDT edit), (3) advance the do-date to the next RRULE instance strictly after now, skipping missed occurrences. So a fresh, all-outstanding checklist each occurrence; completion never carries forward; a missed daily routine is one gently-overdue item, not a pile.
History is narrative (the log), matching "narrative > list" (§6.1). RRULE expansion stays lazy — only "next instance after now" is ever computed, never the series (§6.6).
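
The three completion steps can be sketched as follows. This is a simplification under stated assumptions: dates are integer day numbers, the rule is "every `interval_days` days" rather than a parsed RFC 5545 RRULE, the body edit is a plain string rewrite rather than a body-CRDT edit, and all names are hypothetical.

```rust
/// Simplified recurring task (a single node, rolled forward in place).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RecurringTask {
    pub body: String,       // canonical context doc body; holds the checklist
    pub do_date: i64,       // day number (assumption)
    pub interval_days: i64, // stand-in for an RRULE; must be > 0
    pub log: Vec<String>,   // per-task log (§6.4)
}

/// Uncheck every `- [x]` / `- [X]` checklist line; other text is untouched.
pub fn reset_checkboxes(body: &str) -> String {
    let mut out = body
        .lines()
        .map(|line| {
            let indent_len = line.len() - line.trim_start().len();
            let (indent, rest) = line.split_at(indent_len);
            match rest.strip_prefix("- [x]").or_else(|| rest.strip_prefix("- [X]")) {
                Some(tail) => format!("{indent}- [ ]{tail}"),
                None => line.to_string(),
            }
        })
        .collect::<Vec<_>>()
        .join("\n");
    if body.ends_with('\n') {
        out.push('\n'); // `lines()` drops the final newline; restore it
    }
    out
}

/// Next occurrence strictly after `now`, skipping missed ones. Only this one
/// instance is computed, never the series. Assumption: the date always moves
/// forward by at least one interval, even when completed early.
pub fn next_after(do_date: i64, interval_days: i64, now: i64) -> i64 {
    assert!(interval_days > 0, "interval must be positive");
    let steps = if now >= do_date { (now - do_date) / interval_days + 1 } else { 1 };
    do_date + steps * interval_days
}

/// Complete the current occurrence: log it, reset the checklist, advance.
pub fn complete(task: &mut RecurringTask, now: i64, note: &str) {
    task.log.push(format!("day {now}: {note}"));
    task.body = reset_checkboxes(&task.body);
    task.do_date = next_after(task.do_date, task.interval_days, now);
}
```

A daily routine last due on day 10 and completed on day 14 lands on day 15: the four missed days collapse into one overdue item rather than a pile.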

Why not occurrence-instances? The rejected alternative (a fresh node per occurrence) was attractive for queryable per-occurrence history, but under Fork A each occurrence would need its own body — hence its own node — turning every daily recurrence into a new node-pair: exactly the explosion §6.6 forbids. No v1 query needs occurrence rows (the ranking reads the single node's do-date), and adherence/streak stats can be reconstructed from the log later. See tech-spec §4.4.

4. Architecture

🔒 DECIDED (shape).

Layers, top to bottom:

Surfaces (thin clients): the nvim plugin (heph.nvim) is the primary surface for v1 — a full "org-mode"-style experience (markdown buffers backed by doc nodes; agenda / "what is next" / capture / linking / journaling as plugin commands). It is the explicit successor to obsidian.nvim, which the owner currently uses to drive the ZK (telescope picker, <Leader>o* verbs, <Enter> to follow [[wiki-links]], dailies, multi-state checkboxes), rarely opening Obsidian proper alongside it. heph.nvim must reach that feature parity (see §6.5) and replace it. The CLI (heph) is a secondary/utility surface (export, scripting, admin) — explicitly not a focus, though it shares the same command surface. The web UI is the occasional hub-served surface. A later iOS/Watch client talks to the hub directly.
- 🔒 Single binary, three modes. One Rust binary runs as local / server / client (tech-spec §3.1); the heph CLI shares the same command surface. Mode is two orthogonal axes (backend + inbound listener) plus an optional hub_url that makes any local instance a syncing spoke — the everyday device is local + hub_url, the hub is server, client is the online-only convenience.
Per-device daemon (hephd): owns the local SQLite handle, the CRDT/op-log state, and background sync. All local surfaces connect to it over a local socket. This is what makes multi-surface access concurrent and safe on one SQLite file, and gives one place to run background sync.
Core crate (Rust lib): data model, query engine, markdown parsing + wiki-link extraction, and sync logic. Linked by both hephd and the hub server.
Hub: hephd running in "server" mode on blumeops k3s — central SQLite, the sync rendezvous point, and the web UI host. May be offline for long periods; devices keep working and reconcile when it returns.

Open points:

❓ nvim ↔ daemon transport: JSON-RPC over a unix socket (nvim has a native RPC channel; allows the daemon to push updates when sync brings in remote changes) vs. the plugin shelling out to the heph CLI. Leaning socket RPC.
❓ Web front-end style: server-rendered (Axum + HTMX/Askama) vs. WASM SPA (Leptos/Dioxus). Leaning SSR + HTMX since it's the lowest-traffic surface — favor simplicity.

5. Sync & Conflict Resolution

Requirements (gathered from the owner)

🔒 ~99% of interaction today is via the CLI on Gilbert (dev laptop). Future split: ~90% Gilbert, ~9% phone/watch, ~1% web.
🔒 Must work seamlessly across tailnet workstations (Gilbert, ringtail, others): dispatch a change on one, walk to another, see it arrive "in short order."
🔒 Genuine offline conflicts will occur — e.g. Gilbert edits offline, then ringtail edits, then Gilbert reconnects. Must resolve gracefully: auto-merge where possible; for the unresolvable remainder, a conflict queue + alert ("you have N merge conflicts, run heph conflicts").
🔒 Auth from the ground up — this is extremely sensitive data. Use Authentik (OIDC) as IdP, support multiple isolated user accounts (in practice just eblume).
🔒 A web UI is part of the hosted (hub) model.

Proposed model — for confirmation

A hub-and-spoke, op-log + CRDT design:

Topology: each device runs hephd with a full local SQLite replica; the blumeops hub is the rendezvous. Devices push/pull an append-only operation log to the hub when reachable. Hub-and-spoke (not P2P mesh) — simpler, matches "central self-hosted app," and the hub is normally up on the tailnet so propagation is prompt. When the hub is down, devices keep working and the op-log drains on reconnect.
Merge semantics (so "graceful" is precise):
- doc bodies (markdown): a text CRDT (e.g. Yrs/Automerge) → concurrent edits always merge with no hard conflict. This covers the common case.
- Scalar task fields (completion, due, priority): last-writer-wins via Hybrid Logical Clocks. Deterministic and cheap.
- Links / set membership (tags, projects): add/remove as a CRDT set (OR-Set) → no conflicts.
- Conflict queue: because CRDTs auto-merge, true blocking conflicts are rare. We surface (not block) the cases that are semantically ambiguous: a discarded LWW value on a meaningful field (e.g. completion flipped both ways, divergent due dates), or concurrent edits to overlapping text regions. These go to heph conflicts with an alert; sync never stalls.
Identity & ordering: ULIDs for nodes (§3.1); HLC timestamps stamp every op for causal ordering across offline devices.
Auth: the hub's sync + web endpoints require an OIDC bearer token from Authentik. Devices obtain tokens via the OAuth device-code flow; refresh token stored in the OS keychain / 1Password. Every node/op is owned by a user_id; the hub enforces per-user isolation. Offline devices operate on cached credentials.
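
As an illustration of the scalar-field rule in the merge semantics above, here is a minimal sketch of a Hybrid Logical Clock and a last-writer-wins register. The struct layout and function names are assumptions, not the heph-core API; the clock update follows the standard HLC rules. The merge also returns the discarded value so a caller could route it to the conflict queue when the field is a meaningful one.

```rust
/// Hybrid Logical Clock timestamp. Field order matters: the derived `Ord`
/// compares wall time, then counter, then node id as the final tie-break.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Hlc { pub wall_ms: u64, pub counter: u32, pub node: u32 }

impl Hlc {
    /// Timestamp for a local event, given the physical clock reading.
    pub fn tick(self, physical_ms: u64) -> Hlc {
        if physical_ms > self.wall_ms {
            Hlc { wall_ms: physical_ms, counter: 0, node: self.node }
        } else {
            // Physical clock stalled or went backwards: bump the counter.
            Hlc { counter: self.counter + 1, ..self }
        }
    }

    /// Timestamp after receiving a remote op stamped `remote`.
    pub fn observe(self, remote: Hlc, physical_ms: u64) -> Hlc {
        let wall = physical_ms.max(self.wall_ms).max(remote.wall_ms);
        let counter = if wall == self.wall_ms && wall == remote.wall_ms {
            self.counter.max(remote.counter) + 1
        } else if wall == self.wall_ms {
            self.counter + 1
        } else if wall == remote.wall_ms {
            remote.counter + 1
        } else {
            0
        };
        Hlc { wall_ms: wall, counter, node: self.node }
    }
}

/// A scalar field value together with the stamp of the op that wrote it.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Lww<T> { pub value: T, pub stamp: Hlc }

/// Merge two replicas' registers. Returns the winner plus the discarded
/// value when the two differed, so the caller can surface it for review.
pub fn merge_lww<T: Clone + PartialEq>(a: &Lww<T>, b: &Lww<T>) -> (Lww<T>, Option<T>) {
    let (win, lose) = if a.stamp >= b.stamp { (a, b) } else { (b, a) };
    let discarded = if win.value != lose.value { Some(lose.value.clone()) } else { None };
    (win.clone(), discarded)
}
```

Because stamps are totally ordered, `merge_lww(a, b)` and `merge_lww(b, a)` pick the same winner, which is what lets replicas converge regardless of the order in which ops arrive.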

Open points

❓ Hub-only vs. P2P fallback: if the hub is down but Gilbert and ringtail are both on the tailnet, do they sync directly, or wait for the hub? Hub-only is simpler; direct P2P is more available. (Leaning hub-only for v1.)
❓ Encryption at rest: given "extremely sensitive," should each device's SQLite (and the hub's) be encrypted at rest (e.g. SQLCipher), and/or should op-log payloads be end-to-end encrypted so the hub stores ciphertext it can't read? (E2E complicates server-side search/web UI.)
❓ Todoist's role during/after migration — two-way bridge during transition, import-once, or retire on cutover?
❓ How "short order" is short — is a few seconds of propagation (push-based) needed, or is periodic pull (e.g. every 30–60s) fine?

6. Workflow Features

Driven by "highly optimized specific workflows." heph is organized around a small set of workflows, each a first-class surface in heph.nvim:

Task / "What is next?" (§6.1–§6.3) — the Tactical / Strategic / Organizational task engine. The flagship.

Log (§6.4) — daily journal + optional append-only per-task work-logs ("narrative > list").

Wiki / KB (§6.5) — the whole context corpus as a browsable/queryable personal wiki; the direct obsidian.nvim replacement.

Calendar (§6.6) — entering via the calendar, tied to Strategic mode. Deferred, but scaffolded carefully.

"What is next?" — the flagship

❓ ACTIVE DEEP-DIVE. This is the defining workflow and has ~10 years of the owner's prior art and prior tool attempts behind it. Being designed from that prior art rather than from scratch — see §6.1. The blumeops-tasks ranking (overdue-first, then priority, p3 = backlog) is a known reference point, not the target.

Context surfacing — v1 scope

🔒 DECIDED for v1. Explicit links only; no inferred/semantic context yet.

Every task automatically gets one canonical doc (context) node, created and linked on task creation. That canonical doc is the jumping-off point and can wiki-link onward to other docs/cards.
nvim commands provide telescope/fzf/rg-style search over context docs, optionally with preloaded filters (e.g. scoped to a project).
Inferred context (shared tags, full-text/semantic similarity) is deferred — large surface area, not the v1 focus.

6.1 "What is next?" — prior art & synthesis

Distilled from ~10 years of the owner's thinking — predating the ZK by ~5 years — chiefly the owner's direct account (this design conversation), plus zk/mole/20240202113646.todo.md (the most developed written framework), daily logs (zk/home/2023-09-01.log.md, 2023-10-12.log.md), and zk/payrix/friction_log.md. Prior tool code may exist at github.com/eblume/{mole,hermes} (check detached-head branches — several restarts); the owner expects little reusable there beyond what's captured here.

History: Hermes → Mole → heph (and the lessons)

Hermes — the "time accountant" (≈ earliest). Goal: a timeline-based system fed "blueprints" of what a day should look like, with rules like "check email twice a day, unless a high-priority email comes in — but even then no more than once an hour," continuously re-solving the calendar in real time. Built a working prototype using a SAT solver (OR-tools). Killer feature deduced: a quick, meaningful answer to "what is next" — here, the next thing on the calendar. Why it failed: the calendar became a battle zone, swinging wildly every few minutes as the solver re-resolved the dependency graph → impossible to plan around. Lesson + salvage: there may be fruit here as a future, opt-in heph planning mode, but real-time auto-rescheduling of a visible calendar is anti-useful.
Mole — "whack-a-mole" rules engine. A rules engine that constantly re-evaluated rules and materialized Todoist tasks (canonical first test: "is there an active task 'whack the mole'? if not, make one"). "What is next" = "what Todoist says is available." Worked reasonably. Why it stalled: Todoist accumulated too many tasks, so choosing the actual next thing became the whole problem again. This pushed the owner to design the Todoist project/priority/filter discipline in §6.2. The deeper lesson: Mole's automation wasn't adding much value over the project/priority discipline itself — so the owner turned Mole off and just kept using Todoist. The manual discipline was the real engine.
Today. No automation beyond blumeops' read-only blumeops-tasks mise poller. Just Todoist + the §6.2 discipline — but as the direct descendant of all the above. Implication for heph: the value is in embodying and strengthening the discipline (and finally fixing the weak note↔task link), not in clever auto-scheduling or a rules engine. Earn any automation only after the discipline is faithfully supported.

"What's next?" is three questions, not one

The owner's framework splits the question by mode:

Tactical — blank-slate "what do I do right now?" Stay in the editor/terminal; capture interruptions and resume in milliseconds ("save state and walk away in milliseconds"). This is the mode a single ranked list serves.
Strategic — "I want to do X — what's the next step, or where was I?" Exploring a goal, generating the sub-tasks to reach it.
Organizational — "I want to do X — help me plan the steps." Project-level enumeration of what needs doing, not how.

🔒 REAFFIRMED. The owner still strongly identifies with this split (the earlier "maybe" was conflating it with the unrelated CLI-first process note on the same page). The clean operational test, in the owner's words: "looking at a project's list of todos = Organizational; deciding which to work on = Strategic; working a todo = Tactical." Escalation flows upward (project-infra work can become Strategic, then Tactical).

How heph embodies it — as named modes/views in the nvim plugin, not an inference engine that guesses your mode:

Tactical (execution view): you are "tactically engaging" a specific committed task. Loads its canonical context + sub-items into a goal stack; millisecond interrupt-capture pushes new context items onto the stack; "smallest amount of contextual guidance and next steps to stay productive." Includes Chore Mode (synthesize a goal stack from a query — e.g. the Chores project — and chug through one by one).
Strategic (planning view): editor present but the goal is to explore/decompose, not edit. "To do X I must also do Y and Z." Produces a fleshed-out task with ready sub-items so Tactical can grab and go. When Strategic completes, no work has been done — only new/changed tasks, docs, links.
Organizational (survey view): project-level enumeration — what needs doing, not how.
Shared primitive — the goal stack (see §6.3) spans Tactical and Strategic; transitions between modes are first-class (Strategic produces → Tactical consumes).

The real cost is resumption, not ranking

The recurring pain in the logs is reorienting after a context switch, not deciding raw priority. Design rules that follow:

Resumption hint invariant: when work pauses on a task/todo, it should leave at least one concrete next-step so future-you can re-enter quickly. heph should make capturing a breadcrumb frictionless and surface it first on return.
Narrative > list: daily logs proved more motivating and contextual than flat task lists ("a log is the way"). Journaling/log nodes are first-class, and "what is next" should be able to lean on recent log narrative, not just task metadata.

The "weak link" is heph's reason to exist

The #1 unsolved problem: the fragile, clunky coupling between Todoist and notes (manually typing nb o 45 into task descriptions). heph's thin-task ↔ auto-created canonical-context-doc model is the direct fix: the link is structural, not a hand-typed string.

Mapping prior taxonomy → heph entities

| Prior (`nb`/Todoist) | Meaning | heph |
| --- | --- | --- |
| nb-project | grouping; completing all members completes it | `project` node |
| nb-todo | planning complex multi-step work; has "a single unified context log" | `task` + its canonical context `doc` |
| nb-task | immediate, short-lived "might eventually be done"; checking off = "no longer might be done" (not necessarily completed) | thin `task` node |
| Todoist | "deciding what is next and knowing what needs to be done now" | heph's "what is next?" engine |
| Calendar | scheduling / future | due dates (+ later calendar integration) |

🔒 Preserved. The nb-task semantic — checking off means "no longer outstanding," not "accomplished" — carries into heph. Context items are outstanding / not-outstanding (see §6.3). For committed tasks, heph distinguishes end-states: done (accomplished) vs. dropped/dismissed (let go, e.g. during a Blue review). Both are "not outstanding"; the distinction is kept for honesty and history.

Process steer (now historical)

The owner's old rule — "avoid ncurses and interactive UIs; write atomic code and call it manually with subcommands… until a strong pattern emerges" — was a focus device for hand-building tools. It no longer constrains heph: we go straight to the nvim org-mode plugin backed by the Rust server (the patterns are now well understood; see §6.2/§6.3). The CLI remains as a secondary/utility surface.

Distilled requirements for heph's "what is next?" — to confirm/refine

Mode-aware (Tactical / Strategic / Organizational), not a single global sort.
Millisecond-scale capture and fast state-save (Tactical interrupt handling).
Resumption-first: on returning to a task, surface the last breadcrumb/next-step before anything else.
Context-coupled: every task one keystroke away from its canonical context doc and onward links.
Log-aware: can factor in / surface recent daily-log narrative.
A sane default ranking for the Tactical blank-slate case — the concrete model is §6.2, not the blumeops-tasks placeholder.

6.2 The lived priority discipline (the working basis for heph)

🔒 This is the real model. The owner has used this Todoist discipline daily for years; Mole was eventually turned off because this discipline outvalued the automation. heph should embody and strengthen it. Captured verbatim-in-spirit from the owner's account.

Projects = contexts (a task has exactly one). Examples: Life (catch-all), Work, Chores, Maintenance, Outside, Allison (wife), Errands, Cooking, Health. A project contextualizes a task; it is not a plan.

Attention-state (thought of by color, not number):

| State | Meaning |
| --- | --- |
| White (default) | Able to be done once its do-date arrives. (See do-date semantics below.) |
| Orange | Top of mind. Keep ≤ 6, reassessed every day. Goal: you know your top-of-minds without looking at the tool. |
| Red | Top of mind + there will be a consequence if not done ASAP. Red means the existence of a consequence, NOT its severity/importance. Flagging by importance is a trap → anxiety (everything due feels important). |
| Blue | On Deck (backlog). Doable, but deliberately cooling off. The pressure-relief valve that keeps the working set light. |

Do-date, not due-date (key insight): "due date" means DO date — don't do it before this date. Optionally a separate "Late On" marker for when it actually becomes a problem. Due dates are a terrible planning tool — calendar-style deadlines make everything "yell for attention" constantly. heph must treat the do-date as earliest actionable, not a deadline alarm.

Working-set discipline (the healthy tensions — a feature, not a bug):

Keep White+Orange+Red ≤ ~30 total, or the system collapses into noise. Keep Orange ≤ 6.
Push freely to Blue to stay light; periodically review Blue: keep vs. let go (this is where "drop/dismiss" happens). Target On-Deck < 100 (owner is at ~149 and hates it).
These tensions map the owner's emotional state to the task system and should be surfaced honestly. heph must avoid two failure modes: (1) "fake happy" — telling you everything's fine when it isn't; (2) overwhelm — yelling that you're buried until you freeze and make no progress. Reflect true state; help maintain the tensions.
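
One way to surface these tensions honestly is a plain count-and-compare report. The sketch below is only an illustration: the limits come from the text above (the "~30" is treated as a hard 30 for simplicity), while the type names, constants, and message wording are assumptions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Attention { White, Orange, Red, Blue }

// Limits taken from the working-set discipline above.
const MAX_ACTIVE: usize = 30; // White + Orange + Red
const MAX_ORANGE: usize = 6;
const ON_DECK_TARGET: usize = 100; // target is strictly below this

#[derive(Debug, PartialEq, Eq)]
pub struct WorkingSetReport {
    pub active: usize,
    pub orange: usize,
    pub on_deck: usize,
    pub warnings: Vec<String>,
}

/// Count tasks per state and state plainly which limits are exceeded.
/// No severity scoring: the goal is neither "fake happy" nor overwhelm.
pub fn working_set_report(states: &[Attention]) -> WorkingSetReport {
    let count = |wanted: Attention| states.iter().filter(|s| **s == wanted).count();
    let orange = count(Attention::Orange);
    let active = orange + count(Attention::White) + count(Attention::Red);
    let on_deck = count(Attention::Blue);

    let mut warnings = Vec::new();
    if active > MAX_ACTIVE {
        warnings.push(format!("working set is {} (limit {})", active, MAX_ACTIVE));
    }
    if orange > MAX_ORANGE {
        warnings.push(format!("Orange is {} (limit {})", orange, MAX_ORANGE));
    }
    if on_deck >= ON_DECK_TARGET {
        warnings.push(format!("On Deck is {} (target below {})", on_deck, ON_DECK_TARGET));
    }
    WorkingSetReport { active, orange, on_deck, warnings }
}
```

Returning the raw counts alongside the warnings lets a surface show true state even when nothing is over a limit.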

Filters = saved views (queries over the above):

Top of Mind — all Red + Orange.
Tasks / Work Tasks — Red/Orange/White that are actionable now (do-date arrived), scoped by project.
Chores — like Tasks but from Chores, slightly tuned down so the (near-universally recurring) chores don't become a nuisance.
On Deck — all Blue.
Schedule — recurring checklists from a few projects (e.g. Morning Routine → brush teeth, feed birds, coffee, medicine). The owner uses this particular filter sparingly, but the underlying recurring-checklist mechanism is IN SCOPE for v1 (see §3.3): the only way to be sure we avoid Todoist's mistake is to model it up front. Each recurrence presents a fresh, fully unchecked checklist (the single node's body is reset in place, per §3.3); completion never carries forward.

heph's default "what is next?" (Tactical blank-slate) therefore ≈ Top of Mind (Red first — by consequence, then Orange) + White items whose do-date has arrived, scoped to the current project/context, with Blue hidden. Concise, honest, light.

🔒 DECIDED ranking mechanics (tech-spec §7): do_date is a boolean candidacy filter only (null ⇒ "now"), never an urgency input; late_on is the sole urgency signal (a global "now a problem" top tier); within a band, FIFO by created_at — age never becomes urgency. The order is expressed as a reorderable named-dimension list so it can be retuned without touching the engine.
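
The ranking mechanics above can be sketched as a filter followed by a comparator driven by an ordered list of dimensions. This is an illustration only, not the tech-spec §7 engine: the type and dimension names are hypothetical, dates are integer day numbers, and applying the project scope to late items as well is an assumption.

```rust
use std::cmp::Ordering;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Attention { White, Orange, Red, Blue }

#[derive(Debug, Clone)]
pub struct Task {
    pub title: String,
    pub attention: Attention,
    pub project: String,
    pub do_date: Option<i64>, // None => actionable now
    pub late_on: Option<i64>, // the sole urgency signal
    pub created_at: i64,
}

/// Named ranking dimensions; their order in a slice is the ranking order.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Dimension { LateFirst, AttentionBand, Fifo }

fn is_late(t: &Task, now: i64) -> bool {
    t.late_on.map_or(false, |l| l <= now)
}

fn band(a: Attention) -> u8 {
    match a {
        Attention::Red => 0,
        Attention::Orange => 1,
        Attention::White => 2,
        Attention::Blue => 3,
    }
}

fn compare(a: &Task, b: &Task, order: &[Dimension], now: i64) -> Ordering {
    for dim in order {
        let ord = match dim {
            // `true` must sort first, so compare b against a.
            Dimension::LateFirst => is_late(b, now).cmp(&is_late(a, now)),
            Dimension::AttentionBand => band(a.attention).cmp(&band(b.attention)),
            Dimension::Fifo => a.created_at.cmp(&b.created_at),
        };
        if ord != Ordering::Equal {
            return ord;
        }
    }
    Ordering::Equal
}

/// Tactical blank-slate view: Blue hidden, do_date used only as a boolean
/// candidacy filter, then ranked by the given dimension order.
pub fn what_is_next<'a>(tasks: &'a [Task], project: &str, now: i64, order: &[Dimension]) -> Vec<&'a Task> {
    let mut out: Vec<&'a Task> = tasks
        .iter()
        .filter(|t| t.project == project && t.attention != Attention::Blue)
        .filter(|t| t.do_date.map_or(true, |d| d <= now))
        .collect();
    out.sort_by(|a, b| compare(a, b, order, now));
    out
}
```

Passing `[LateFirst, AttentionBand, Fifo]` reproduces the default described above; reordering the slice retunes the ranking without changing `compare`.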

6.3 Two kinds of task: commitments vs. context items

🔒 DECIDED (shape). A commitment axis orthogonal to the §6.2 attention-states.

Committed task — long-lived, strongly tied to context (e.g. "Complete PLAT-1234", "Build a nursery"). Participates in §6.2: has an attention-state, a project, a do-date; is a candidate for "what is next"; may carry sub-structure and a web of context.
Context item — ephemeral, created mid-flow ("six things before this is done", "oh crap, another" pushed on the stack). Does not enter the §6.2 scheduling system — it is part of its containing task's context, with only outstanding / not-outstanding state. Never appears in global "what is next."
Promotion is first-class: a context item can be promoted to a committed task when you decide to actually schedule it (the old "nb-todo → Todoist-task on activation").
Ephemeral ≠ deleted: deletes effectively never happen — tombstones + a dedicated cleanup mode only. "Ephemeral" is lifecycle/visibility, not storage. (This also keeps the CRDT sync simple.)
The goal stack is a session-scoped, synthesized construct: seeded from a committed task (Tactical) or from a query over contexts (Chore Mode — "grind on some chores"). It holds committed tasks and/or context items for the duration of a work session.

Editing surface for context items — deliberately deferred to the prototype

🔒 DECIDED — Fork A "index" backend (the affordance is still trialed both ways). The stored source of truth is the doc body markdown: context items are - [ ] lines in it, and a context item's checked state is the [ ]/[x]. They are not synced nodes — each replica derives a local index from its converged CRDT body, so they converge for free (no divergent ULIDs across offline edits — the original sync trap). Identity is pinned only at promotion, which mints a real committed task node and rewrites the source line into a link to it. This collapses the hardest sync problem into "the body CRDT already converges" and matches this section's "ephemeral, low-stakes identity" intent. What's still trialed both ways is purely the nvim affordance (Option A checkboxes-in-body vs. Option B command-driven capture) — the backend is shared, so ergonomics can be judged in the prototype. See tech-spec §4.3, §5.

Option A — checkboxes in the body, derived on save. Edit the task's canonical context doc as plain markdown; - [ ] lines are materialized into context-item state on :w (BufWritePost), reusing the wiki-link extraction machinery. No real-time scanning needed (debounced live updates are optional polish). Remote CRDT edits are pushed from the daemon to reconcile an open buffer. Item identity is low-stakes (reword = tombstone-old + add-new) since items are ephemeral; identity is pinned only at promotion. Most "personal org-mode"-native; millisecond capture = "type a line."
Option B — command-driven capture, rendered. Capture via a leader chord + vim.ui.input (no raw checkbox typing); the new item is still written as a - [ ] line into the body-backed store (Fork A), then rendered inline via virtual text/extmarks or a transient floating window (avoids the disliked persistent cutaway). Heavier capture; more UI to build; promotion identical.
Lean: A for capture flow, but build the shared body-backed backend (Fork A) first and trial both affordances in the plugin.
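
A minimal sketch of the Fork A index, assuming context items are exactly the lines whose first non-blank characters are `- [ ]` or `- [x]`. The function names are hypothetical, and the promoted-line format (`- [[slug]] text`) is an assumption: the document only says the source line is rewritten into a link to the new committed task.

```rust
/// A context item derived from the doc body. It has no stored id of its
/// own; the line number is only a handle into the current body.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ContextItem { pub line: usize, pub text: String, pub outstanding: bool }

/// Derive the local context-item index from a (converged) doc body.
pub fn derive_index(body: &str) -> Vec<ContextItem> {
    body.lines()
        .enumerate()
        .filter_map(|(line, raw)| {
            let rest = raw.trim_start();
            let (outstanding, text) = if let Some(t) = rest.strip_prefix("- [ ]") {
                (true, t)
            } else if let Some(t) = rest.strip_prefix("- [x]").or_else(|| rest.strip_prefix("- [X]")) {
                (false, t)
            } else {
                return None; // not a checklist line
            };
            Some(ContextItem { line, text: text.trim().to_string(), outstanding })
        })
        .collect()
}

/// Promote the item on `line`: rewrite it into a wiki-link to the newly
/// minted committed task. Returns None if `line` is not a context item.
pub fn promote(body: &str, line: usize, task_slug: &str) -> Option<String> {
    let item = derive_index(body).into_iter().find(|i| i.line == line)?;
    let rewritten: Vec<String> = body
        .lines()
        .enumerate()
        .map(|(n, raw)| {
            if n == line {
                let indent = &raw[..raw.len() - raw.trim_start().len()];
                format!("{indent}- [[{task_slug}]] {}", item.text)
            } else {
                raw.to_string()
            }
        })
        .collect();
    Some(rewritten.join("\n"))
}
```

Because the index is a pure function of the body, two replicas whose bodies have converged derive the same items without exchanging any item ids.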

Prior-art note (multi-state checkboxes): the owner's obsidian.nvim config uses checkbox states " ", "x", ">", "~", "!" (todo / done / forwarded / ~ / important). heph's task/context-item state model should accommodate richer-than-binary states (at least the done/dropped/outstanding set in §6.1; possibly forwarded/deferred).

6.4 The Log workflow

🔒 In scope for v1 (daily journal certainly; per-task logs as the model allows). Logs embody the "narrative > list" insight and the resumption-hint invariant from §6.1.

Two related shapes:

Daily journal. Today driven by the zkd abbrev → Obsidian dailies: a telescope picker to create/open dated journal buffers. heph keeps this as a first-class feature: journal nodes keyed by date, a picker to jump to today / recent / create. The daily journal is the catch-all narrative stream.
Per-task log. Sometimes a task wants an append-only "here's what I was working on" side-commentary. It half-belongs to the task's context and half doesn't — so model it as a separate buffer/node from the task's canonical context doc, optionally present only for tasks that need longer-duration commentary. Properties:
- Append-only (entries accrete; you don't rewrite history).
- Dispatchable from an isolated context: you must be able to fire a log entry without leaving / losing your place in the context buffer you're navigating — i.e. append to the log from elsewhere (a quick-capture into the task's log), while the context buffer stays where it is. (This is the ergonomic crux; likely a quick-capture command targeting "the current task's log".)
- The per-task log is a strong candidate to be the resumption breadcrumb store (§6.1).
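
The append-only and breadcrumb properties above could look like the following sketch. It assumes the dedicated log shape discussed in the open point below, which is not yet decided; the names and the integer timestamp are hypothetical.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LogEntry { pub at: i64, pub text: String }

/// Append-only per-task log. The entries field is private, so callers
/// outside this module can only append or read, never rewrite history.
#[derive(Debug, Default)]
pub struct TaskLog { entries: Vec<LogEntry> }

impl TaskLog {
    /// Dispatch one entry. This takes only the log, so it can be called
    /// from anywhere without touching the context buffer.
    pub fn append(&mut self, at: i64, text: &str) {
        self.entries.push(LogEntry { at, text: text.to_string() });
    }

    /// Read-only view of the accumulated narrative.
    pub fn entries(&self) -> &[LogEntry] {
        &self.entries
    }

    /// The most recent entry, usable as the resumption breadcrumb.
    pub fn breadcrumb(&self) -> Option<&LogEntry> {
        self.entries.last()
    }
}
```

Keeping `append` as the only mutating operation is what would make "dispatch an entry" a cheap, well-defined op for sync as well.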

❓ Open: is the per-task log a distinct node kind (log, append-only), or just a doc flagged append-only and linked log-of → task? Leaning a dedicated append-only log kind to make "dispatch an entry" a cheap, well-defined op.

6.5 The Wiki / KB workflow

🔒 First-class in v1. This is the "view all my contexts as one big personal wiki" surface — the direct replacement for obsidian.nvim.

The owner estimates ~90% of the value comes from a few well-worn primitives, so heph.nvim should reach obsidian.nvim feature parity on day one:

Follow [[wiki-links]] on <Enter> to jump between context docs (the core navigation gesture).
Telescope-based search (rg), quick-switch between docs, tag browse, backlinks, outgoing links — mirroring the owner's <Leader>o* verbs (to be rebound under heph's own prefix).
Logical layout + tagging of context docs so the corpus is browsable/queryable as a whole.
New-doc-from-query and insert-link from the picker (obsidian.nvim's <C-x>/<C-l>).

heph.nvim replaces obsidian.nvim outright — the owner already rarely opens Obsidian itself. This workflow is the knowledge-base analogue of the Organizational task view: surveying and traversing the doc graph.
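
The follow-link and backlink primitives both rest on pulling link targets out of a body. The sketch below is a deliberately naive hand-rolled scanner, not the project's extraction code (which is built on a markdown parser); the function name is hypothetical, and the `[[target|alias]]` alias form is assumed from Obsidian's convention.

```rust
/// Collect wiki-link targets from markdown text.
/// Naive: it does not skip code spans or fenced blocks.
fn wiki_link_targets(text: &str) -> Vec<&str> {
    let mut out = Vec::new();
    let mut rest = text;
    while let Some(open) = rest.find("[[") {
        let after = &rest[open + 2..];
        // An unterminated "[[" ends the scan.
        let Some(close) = after.find("]]") else { break };
        let inner = &after[..close];
        // "[[target|alias]]": keep only the part before the pipe.
        let target = inner.split('|').next().unwrap_or(inner).trim();
        if !target.is_empty() {
            out.push(target);
        }
        rest = &after[close + 2..];
    }
    out
}
```

The targets returned for a buffer are its outgoing links; inverting that relation across the corpus gives backlinks.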

6.6 The Calendar workflow

⏳ Deferred (later version) — but scaffold now, don't paint ourselves into a corner. Ties closely to Strategic mode (planning against time).

The owner uses Calendar.app bound to a Gmail calendar (almost certainly CalDAV; Gmail also exposes ICS/WebCal feeds — to confirm when implemented).
Eventual goal: integration between calendar views and task views — enter the system via the calendar, see tasks/do-dates in temporal context.
⚠️ Hard-won gotcha: enabling Todoist's gcal integration flooded the calendar with a million events because of recurring rules. heph must never explode recurrence into individual calendar events. This echoes the Hermes "battle zone" failure (§6.1) — calendar integration must be read-mostly, careful, de-duplicated, and opt-in, not a firehose.
Scaffolding guidance (decisions to keep clean now): keep a crisp separation between tasks/do-dates and calendar events; model recurrence as RRULEs that are expanded lazily for display, never materialized as stored events; design sync so a future read-only CalDAV ingestion (calendar → heph) is easy, and treat any heph → calendar export as a later, explicitly bounded, non-exploding feature.
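
"Expanded lazily for display, never materialized" can be sketched as an iterator bounded by the visible window. The type below is a tiny stand-in for an RRULE (a fixed interval in days, with days counted as plain integers); it is an illustration of the shape only and does not use the `rrule` crate.

```rust
/// Stand-in for a recurrence rule: every `interval_days` days
/// starting at `start_day`. Hypothetical type for illustration.
struct EveryNDays {
    start_day: i64,
    interval_days: i64,
}

impl EveryNDays {
    /// Lazily yield the occurrences that fall inside [from, to].
    /// Nothing is stored; the caller asks again for another window.
    fn occurrences_between(&self, from: i64, to: i64) -> impl Iterator<Item = i64> {
        let step = self.interval_days.max(1);
        let start = self.start_day;
        // Index of the first occurrence at or after `from`.
        let k0 = if from <= start { 0 } else { (from - start + step - 1) / step };
        (k0..)
            .map(move |k| start + k * step)
            .take_while(move |day| *day <= to)
    }
}
```

Because the window bounds the iteration, an unbounded rule costs only as many items as the view displays.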

7. Hosting & Deployment

Reuse the established blumeops patterns (🔒 confirmed by repo conventions):

Containerized service deployed to k3s (ringtail cluster) via ArgoCD app-of-apps + Kustomize raw manifests (no Helm/Flux).
Secrets via 1Password + external-secrets operator (ClusterSecretStore onepassword-blumeops).
Image built via Dagger + Forgejo Actions, pushed to Zot registry (registry.ops.eblu.me).
❓ Rust build approach for the container: Nix (like kingfisher) vs. Dagger-native cargo multi-stage build.
❓ Persistence for the central instance's SQLite — NFS PVC (sifaka) vs. local PV. (SQLite over NFS has known locking caveats.)
❓ How do clients reach the central instance — Tailscale tailnet only, or public via Caddy/Fly proxy?

8. Technology Choices (to decide)

❓ nvim plugin stack: Lua + JSON-RPC over a socket to hephd (with daemon→plugin push) vs. CLI shell-out. Telescope as the picker dependency.
❓ Web framework (Axum + HTMX/Askama? Leptos? Dioxus?).
❓ SQLite access layer (sqlx, rusqlite, SeaORM, Diesel) and migrations.
❓ CRDT / sync library (Automerge-rs, Yrs, bespoke op-log).
❓ Markdown parser/renderer (pulldown-cmark, comrak) and wiki-link / checkbox extraction.

9. Open Questions Summary

Decided

✅ Source of truth: SQLite-primary, markdown as the body payload, with a directory export mode. Clean break from Obsidian-the-app.
✅ Note/task model: typed node graph — thin task nodes, rich doc nodes, connected by typed links.
✅ Front-end: hephd daemon with thin clients over a local socket; nvim plugin (heph.nvim) as the primary surface, CLI secondary; web UI on the hub (later).
✅ Sync (shape): hub-and-spoke, op-log + CRDT, HLC ordering, OIDC/Authentik auth, conflict queue for the ambiguous remainder, over a targetable backend (tech-spec §3.1).

Decided since (B/C/D)

✅ Context surfacing v1: explicit links only + auto-created canonical context doc per task + nvim fuzzy search. Inferred/semantic deferred.
✅ v1 supports local-only and distributed via a targetable backend (tech-spec §3.1): Store trait + three modes (local/server/client) + exclusive-lock handoff over a shared SQLite file. Offline-first sync with automatic merge + conflict queue + full OIDC/Authentik auth with per-user isolation are all IN v1 (tech-spec §12/§13). Local-only is a first-class config. Web UI and k3s deployment are later (hub binary is built deployable).
✅ Security v1: OIDC/Authentik auth + per-user isolation is the boundary; plain SQLite at rest (no encryption). May revisit.
✅ Migration v1: clean break, no ZK import / no Todoist bridge. heph is system of record day one.

Decided (workflows)

✅ heph organized around first-class workflows (§6): Task (what-is-next), Log, Wiki/KB, Calendar.
✅ Log (§6.4) in scope: daily journal (picker over dated journal nodes, à la zkd/Obsidian dailies) + optional append-only per-task work-logs (separate buffer from the canonical context doc; dispatchable without leaving the context buffer).
✅ Wiki/KB (§6.5) first-class: heph.nvim reaches obsidian.nvim feature parity (follow [[links]] on <Enter>, telescope search/quick-switch/tags/backlinks/links) and replaces it.
✅ Calendar (§6.6) deferred but scaffolded carefully: never explode recurrence into stored events (Todoist-gcal-flood / Hermes lesson); keep tasks vs. events separate; lazy RRULE expansion; design for a future read-only CalDAV ingest.

Decided (workflow model)

✅ "What is next?" rests on the lived priority discipline (§6.2): projects-as-contexts; attention-states White/Orange/Red/Blue (Red = consequence, not severity); do-date not due-date; working-set tensions surfaced honestly (avoid "fake happy" and overwhelm).
✅ Tactical / Strategic / Organizational reaffirmed as named modes/views in the nvim plugin, sharing a goal-stack primitive (§6.1).
✅ Commitment axis (§6.3): committed tasks (in §6.2 system) vs. ephemeral context items (outstanding/not-outstanding, scoped to a container); promotion is first-class; no hard deletes (tombstones + cleanup mode).
✅ Task end-states: done vs. dropped/dismissed (both "not outstanding").
✅ Prototype goes straight to the nvim plugin + Rust server; CLI is secondary.
✅ Rust stack RATIFIED (§11.2): rusqlite + tokio/JSON-RPC + pulldown-cmark + ulid + rrule + clap.
✅ Recurrence + recurring checklists IN SCOPE for v1 (§3.3): RRULE-driven; each occurrence gets a fresh checklist instance, completion never carries forward (avoids Todoist's corner).
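
A minimal sketch of "completion never carries forward", assuming the roll-forward-in-place resolution recorded further down in this section. The types and the fixed-interval rule are hypothetical simplifications of a real RRULE; advancing by one step when a task is completed early is also an assumption.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
struct ChecklistItem {
    text: String,
    done: bool,
}

/// A recurring task with a fixed interval in days (stand-in for an RRULE).
#[derive(Debug)]
struct RecurringTask {
    do_day: i64,
    interval_days: i64,
    checklist: Vec<ChecklistItem>,
}

impl RecurringTask {
    /// Move to the first occurrence strictly after `today`, skipping any
    /// missed occurrences, and hand the next occurrence a clean checklist.
    fn roll_forward(&mut self, today: i64) {
        let step = self.interval_days.max(1);
        if self.do_day <= today {
            let steps = (today - self.do_day) / step + 1;
            self.do_day += steps * step;
        } else {
            // Completed early (assumption): advance a single step.
            self.do_day += step;
        }
        for item in &mut self.checklist {
            item.done = false;
        }
    }
}
```

The task keeps one identity across occurrences; only its do-date and checklist state change.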

Still open

Prototype-blocking (resolve at kickoff — see §11 / tech-spec):

CRDT library for body merge: yrs (leaning) vs automerge; bespoke op-log/HLC for scalar fields either way.
Hub network transport (tech-spec §6.1): axum HTTP/JSON (leaning) vs gRPC; sync propagation cadence (push vs periodic pull); device-id provisioning (how a spoke mints its origin id and registers with the hub).

Resolved in the second-pass review (2026-05-31), now folded into tech-spec:

✅ Recurrence → roll-forward in place (§3.3) — occurrence-instances rejected (would explode into a node-per-occurrence under Fork A); advance to the next RRULE instance after now, skipping misses.
✅ Context items → Fork A "index" backend (§6.3) — body markdown is the source of truth; context items are a local derived index, not synced nodes; identity at promotion. The nvim affordance (A vs B) is still trialed in-prototype.
✅ Identity → deterministic ids for journal/tag (§3.1); random ULID for content nodes/project.
✅ Mode/sync orthogonality (§4): backend + inbound-listener axes plus optional hub_url; everyday device = local + hub_url.
✅ Auth/owner (tech-spec §13): nullable oidc_sub, local user for local-only, hub-authoritative identity, one-time pre-first-sync adoption rewrite.
✅ Ranking (tech-spec §7): do_date boolean candidacy filter only; late_on sole urgency signal (global tier); FIFO tiebreak; order as a reorderable named-dimension list.
✅ Modes (§6.1) are plugin-side compositions of mode-agnostic daemon primitives; added list (Organizational) and log.tail (resumption breadcrumb).
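
The ranking resolution above can be read as a filter followed by a two-key sort. The sketch below is one such reading and is not taken from tech-spec §7: the field names are hypothetical, a task with no do-date is assumed to be a candidate, "late" is assumed to mean the `late_on` day has been reached, and the reorderable named-dimension list is left out.

```rust
/// Hypothetical task shape for illustration.
struct Task {
    id: u32,
    created_seq: u64,      // FIFO tiebreak key
    do_day: Option<i64>,   // candidacy only: not before this day
    late_on: Option<i64>,  // sole urgency signal
}

/// Return candidate task ids in "what is next?" order.
fn what_is_next(tasks: &[Task], today: i64) -> Vec<u32> {
    // do_date acts as a boolean filter; it does not affect the order.
    let mut candidates: Vec<&Task> = tasks
        .iter()
        .filter(|t| t.do_day.map_or(true, |d| d <= today))
        .collect();
    // Late tasks form the top tier; within a tier, oldest first.
    candidates.sort_by_key(|t| {
        let late = t.late_on.map_or(false, |d| d <= today);
        (!late, t.created_seq)
    });
    candidates.iter().map(|t| t.id).collect()
}
```

Keeping do-date out of the sort key is what separates "may I work on this yet?" from "how urgent is it?".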

Not prototype-blocking (later phases):

Per-task log representation: dedicated append-only log node kind vs. doc flagged append-only (§6.4).
Calendar protocol confirmation (CalDAV vs ICS/WebCal) and the careful, non-exploding integration (§6.6).
Web front-end style (SSR + HTMX vs. WASM SPA).
k3s deployment specifics: Rust container build (Nix vs Dagger-native cargo); central SQLite persistence (local PV vs NFS); client reachability (tailnet-only vs public).
Encryption at rest / E2E (currently out; may revisit).

10. Roadmap (provisional)

Phase 0 — Design (this document + tech-spec): done enough to build.
Phase 1 — v1 prototype (a single C1 effort, deliberately — a one-shot test of delivering a high-complexity prototype from the spec; built in TDD slices): heph-core (model, schema, extraction, recurrence, "what is next", Store trait, op-log/HLC/CRDT merge) → hephd local mode → server + client modes (+ lock handoff) → offline sync + conflict queue → OIDC/Authentik auth + per-user isolation → heph.nvim + heph CLI. Local-only works standalone; runnable client/server + offline sync on the tailnet. The build order doubles as the cross-session resume tracker (next un-green slice = where to resume). C2/Mikado is not used: it sequences prerequisites against existing code under test, and this is greenfield delivery from a complete spec; follow-up C1s or a C2 refactor come later as needed.

Phase 1 progress (branch feature/v1-prototype, PR #1; 102 tests green as of 2026-06-01 — see tech-spec §14 for the per-area tracker):
- heph-core library — schema/Store/LocalStore, extraction, tasks/links/canonical-context + wiki-link materialization, "what is next?" ranking, recurrence roll-forward + per-task logs.
- hephd local mode — file lock + JSON-RPC over a unix socket (tokio blocking pool); heph CLI + export.
- Local query surface — list, health, journal, search (FTS5).
- Sync engine (no network yet) — HLC + device origin; op-log recording; apply_op merge (LWW + conflict queue, OR-set links, monotonic tombstones, idempotent); two-replica convergence proven; adopt_owner (basic §13).
- Resume here → yrs text-CRDT for bodies (upgrade body merge from LWW); then server/client modes + network push/pull (transport over the existing engine); then OIDC/Authentik auth; then heph.nvim.
Phase 2 — k3s deployment: Dagger→Zot image, ArgoCD app + Kustomize manifests, external-secrets; hub on blumeops.
Phase 3 — Web UI on the hub.
Phase 4 (later, optional) — calendar integration (careful CalDAV); migration from ZK / Todoist; iOS / Apple Watch voice capture; Hermes-style planning mode.
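
For readers new to the sync-engine slice in the Phase 1 progress list, the sketch below shows the two smallest pieces: a hybrid logical clock and a last-writer-wins field merged by HLC order. It is a generic textbook-style illustration under assumed names and integer widths, not heph-core's `apply_op`; the conflict queue, OR-set links, and tombstones are omitted.

```rust
/// Hybrid logical clock stamp. Derived ordering compares wall time,
/// then counter, then origin, which gives a total order across devices.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Hlc {
    wall_ms: u64,
    counter: u32,
    origin: u32,
}

struct Clock {
    last: Hlc,
}

impl Clock {
    fn new(origin: u32) -> Self {
        Clock { last: Hlc { wall_ms: 0, counter: 0, origin } }
    }

    /// Stamp a local event.
    fn tick(&mut self, now_ms: u64) -> Hlc {
        if now_ms > self.last.wall_ms {
            self.last.wall_ms = now_ms;
            self.last.counter = 0;
        } else {
            self.last.counter += 1;
        }
        self.last
    }

    /// Merge a remote stamp so later local events sort after it.
    fn observe(&mut self, remote: Hlc, now_ms: u64) -> Hlc {
        let wall = now_ms.max(self.last.wall_ms).max(remote.wall_ms);
        let counter = if wall == self.last.wall_ms && wall == remote.wall_ms {
            self.last.counter.max(remote.counter) + 1
        } else if wall == self.last.wall_ms {
            self.last.counter + 1
        } else if wall == remote.wall_ms {
            remote.counter + 1
        } else {
            0
        };
        self.last = Hlc { wall_ms: wall, counter, origin: self.last.origin };
        self.last
    }
}

/// Last-writer-wins scalar field.
#[derive(Debug, Clone, PartialEq)]
struct LwwField<T> {
    value: T,
    stamp: Hlc,
}

impl<T> LwwField<T> {
    /// Apply an op; returns true if it won. Re-applying the same op is a no-op.
    fn apply(&mut self, value: T, stamp: Hlc) -> bool {
        if stamp > self.stamp {
            self.value = value;
            self.stamp = stamp;
            true
        } else {
            false
        }
    }
}
```

Because `apply` keeps the greater stamp regardless of arrival order, two replicas that see the same set of ops end with the same value.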

11. v1 Prototype — Scope, Stack, and Kickoff

The handoff target for the next context session. Items here are proposals to ratify; once ratified, a fresh session can build directly from this.

The canonical, implementation-ready scope/stack/schema/API now live in tech-spec. This section keeps the intent and the kickoff process; defer to the spec for details (and update the spec when decisions change).

11.1 Scope — local-to-distributed via a targetable backend (ratified)

v1 supports the full spectrum — pure local-only through full distributed — chosen by configuration, not a rebuild. A Store abstraction backs three runtime modes (local / server / client) with an exclusive-lock handoff over a shared SQLite file (tech-spec §3.1).

In: the full data model (§3, §6) incl. recurrence + recurring checklists (§3.3); the "what is next?" engine; the targetable backend + all three modes with the same-file local↔server lock handoff; offline-first op-log + CRDT sync with automatic merge and a conflict queue for local-backed replicas (tech-spec §12); OIDC/Authentik auth with per-user isolation on the server endpoint (tech-spec §13); the heph.nvim + heph CLI surfaces.
Out (later, scaffolded): the web UI (hub serves sync only in v1); actual k3s deployment to blumeops (hub binary is built deployable; ArgoCD/Kustomize/Dagger is a fast-follow); calendar; iOS/Watch; inferred context; P2P-over-tailnet fallback; persistent goal stack (an in-plugin session stack suffices).

Why this v1: the backend abstraction, offline/merge behavior, and auth model are architecturally load-bearing — bolting them on later means reworking the core. The owner wants them designed in from the start, while keeping local-only a first-class, friction-free configuration. (Recurrence-with-checklists is in for the same reason — see §3.3.) Natural build order: local mode first (simplest), then server/client, then sync, then auth.
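
The "chosen by configuration, not a rebuild" property comes from call sites depending on a trait rather than on a concrete backend. The sketch below illustrates that shape only: the trait methods, `MemStore`, and `append_line` are hypothetical and much smaller than the real Store abstraction described in tech-spec §3.1.

```rust
use std::collections::HashMap;

/// Hypothetical slice of a Store abstraction: just node bodies.
trait Store {
    fn put_body(&mut self, node_id: &str, body: &str);
    fn get_body(&self, node_id: &str) -> Option<String>;
}

/// In-memory stand-in used here in place of a SQLite-backed store.
#[derive(Default)]
struct MemStore {
    bodies: HashMap<String, String>,
}

impl Store for MemStore {
    fn put_body(&mut self, node_id: &str, body: &str) {
        self.bodies.insert(node_id.to_string(), body.to_string());
    }

    fn get_body(&self, node_id: &str) -> Option<String> {
        self.bodies.get(node_id).cloned()
    }
}

/// Caller code is written once against the trait; which backend sits
/// behind it (local file, server, or remote client) is a config choice.
fn append_line<S: Store>(store: &mut S, node_id: &str, line: &str) {
    let mut body = store.get_body(node_id).unwrap_or_default();
    if !body.is_empty() {
        body.push('\n');
    }
    body.push_str(line);
    store.put_body(node_id, &body);
}
```

Swapping the in-memory store for a local or remote implementation would leave `append_line` unchanged, which is the reason to design the abstraction in from the start.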

11.2 Stack — ✅ RATIFIED

Core: rusqlite (bundled) + migrations · tokio + JSON-RPC/unix-socket (local) · ulid · rrule · pulldown-cmark · clap · anyhow/thiserror · tracing. Distributed/auth additions (some to confirm at kickoff): text CRDT yrs (vs automerge) for bodies + bespoke op-log/HLC for fields · hub transport axum/reqwest · OIDC via openidconnect + keyring. Full detail and the SQLite schema: tech-spec §4.5, §10, §12, §13.

11.3 First-session kickoff checklist

mise run ai-docs; read tech-spec (the build spec) and this doc's §3/§6 for rationale.
Classify as a single C1 (deliberate — a one-shot test of delivering this high-complexity prototype from the spec; C2/Mikado is for sequencing prerequisites against existing code, not greenfield delivery). Single long-lived feature branch + early draft PR, docs-first, push as you go (agent-change-process). The §11.1 build order is the resume tracker; the hairy slices (sync/CRDT) may spin off follow-up C1s rather than blocking the prototype.
Scaffold the cargo workspace + heph.nvim skeleton; fill the AGENTS.md Project Structure section (last template TODO).
Build outward in testable slices (TDD, tech-spec §2/§9): heph-core (schema, model, extraction, recurrence, "what is next", Store trait, op-log/HLC/CRDT merge) → hephd local mode (LocalStore + lock + local RPC) → server + client modes (network endpoint, RemoteStore, lock handoff) → sync + conflict queue → OIDC auth → heph.nvim + heph CLI.
Confirm the kickoff-open picks: CRDT lib (yrs vs automerge), hub transport (axum vs gRPC) + propagation cadence + device-id provisioning. (Recurrence = roll-forward and the context-item backend = Fork A are now decided; only the nvim affordance A vs B is trialed in-prototype.)

tech-spec — clean implementation-facing technical specification distilled from this document
ai-assistance-guide — conventions for AI agents in this repo
agent-change-process — C0/C1/C2 change methodology
