Erich Blume 6514296b87 docs: reframe tech-spec as historical; heph self-hosts its roadmap

v1 reached Todoist feature-parity, so remaining/future work is now tracked
in heph itself — tasks in the Hephaestus project (heph view ondeck) — not in
a doc. Renamed docs/reference/tech-spec.md -> v1-prototype-tech-spec.md and
rewrote all 27 [[tech-spec]] wiki-links + README/changelog path refs (docs
checks green). Retitled + bannered the spec as a historical v1 build record
and froze its §14 tracker. AGENTS.md gains a "Planning future work" section
(capture via `heph task --project Hephaestus`, triage in heph-tui On Deck);
README status reflects parity + the three daily-driver surfaces. The design
doc remains the living rationale.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-03 20:19:35 -07:00

55 KiB

Raw Permalink Blame History

title

modified

Hephaestus — Design Document

Status: Living design record. This is the rationale + decision-history document. For the clean, implementation-facing spec the next session builds from, see v1-prototype-tech-spec. Sections marked ❓ OPEN are unresolved; 🔒 DECIDED are settled.

1. Purpose & Vision

Hephaestus is a personal context management system that fuses two systems the owner already relies on into a single, purpose-built, self-hosted application:

The Zettelkasten (~/code/personal/zk) — an Obsidian vault of ~600 markdown files acting as a personal wiki, journal, knowledge base, and "task-context" store for long-running projects. Notes use [[wiki-links]], YAML frontmatter (id, aliases, tags), timestamp-based note IDs (1722897441-MWFE), and daily notes (YYYY-MM-DD).
Todoist — the owner's primary task manager, accessed today via a bespoke Python mise-task in blumeops (mise-tasks/blumeops-tasks) that polls the Todoist v1 API (projects → tasks; fields: content, description, priority, due) using a token from 1Password.

Today these are loosely coupled by fragile cross-links (todoist://<task-id> in notes, [[note]] text in Todoist comments). Hephaestus replaces the fragile coupling with one unified data model and database where tasks and knowledge are first-class, linkable entities.

Guiding principles

Context-aware: A task like "Fix the roof leak" is coupled to the relevant knowledge — the home-repair log, the log of contractor calls, etc. Surfacing that context is a core feature, not an afterthought.
Workflow-optimized: Highly optimized, concise answers to recurring questions — above all "What is next?" — are the primary UX. Speed and concision beat feature breadth.
Owner-built, owner-hosted: A standalone codebase replacing a loose stack of tools, hosted out of blumeops.

2. Goals & Non-Goals

Goals (v1 / prototype)

🔒 Unified data model linking notes and tasks as first-class, cross-linkable entities.
🔒 SQLite backend.
🔒 Rust implementation.
🔒 Distributed dual-mode operation: useful fully offline, auto-syncs to a central self-hosted instance in blumeops (k3s container) when reachable. The central instance may be unavailable for long periods.
🔒 Surfaces over a shared core (via the hephd daemon): CLI (everyday driver), nvim plugin (primary editing UX), web UI (hub-hosted).
🔒 Optimized workflow queries, especially "What is next?", with context surfacing.
🔒 Auth from the ground up (OIDC/Authentik), with isolated multi-user accounts — sensitive data.

Non-Goals (v1) / Later

⏳ iOS app + Apple Watch app for quick voice-dispatched task capture — explicitly a follow-up goal, not v1.
⏳ Sharing/collaboration between users — accounts are isolated for v1 (multi-user isolation is in scope; multi-user collaboration is not).
❌ Obsidian feature parity (canvas, graph view, plugins) — clean break.
❌ Migration / import / Todoist sync in v1 — the prototype is a clean break: heph is the system of record from day one. No ZK import, no Todoist bridge. (Migration may return as a later, optional phase.)
❌ Encryption at rest / E2E in v1 — security model is access restriction (OIDC auth) only; plain SQLite. May revisit.
❌ Inferred/semantic context surfacing in v1 — explicit links + search only.

3. Data Model

🔒 DECIDED (shape). SQLite is the canonical store; a node's body is plain markdown text; an export mode materializes the whole store as a directory of .md files. Clean break from Obsidian-the-app, but markdown stays the portable payload. Tasks are thin; rich context lives in documents; everything is connected by typed links.

Entities

Node — the base entity. Every first-class thing is a node with a stable, sync-safe ID (see §3.1). A node has a kind discriminator. Kinds (initial set):
- doc — a rich markdown context document (the bulk of the knowledge base; replaces ZK articles, journals, logs). Body = markdown text. Has aliases, timestamps.
- task — thin: ID + state, plus a few scalar attributes (attention-state, do-date, late-on, recurrence) and links to a project/tags/context. No prose body of its own — context is supplied by linked doc nodes (or a doc is the work-log for the task). This is the key unification: ZK-articles-with-todos and Todoist-tasks-with-context both reduce to typed links between thin tasks and rich docs. The full task-attribute model is in §6.2 (it encodes the owner's actual lived prioritization system).
- project — a grouping node (maps to Todoist projects / ZK project dirs like payrix/).
- tag — a label node (so tags are linkable/queryable, not just strings).
- journal — a daily-note node keyed by date (YYYY-MM-DD).
Link — a typed, directional edge between two nodes. Types (initial set): wiki (a [[link]] found in a doc body), context-of (doc ↔ task), blocks, parent/child, tagged, in-project. Wiki-links in markdown bodies are parsed and materialized as wiki links so the graph and the prose stay consistent.

3.1 Identity

🔒 DECIDED — ULID for content nodes (doc/task/project), sortable and sync-safe, with a human-facing slug/alias layer so [[wiki-links]] can be written by name. The ZK's timestamp-IDs (1722897441-MWFE) are ULID-like already; a future migration can map them.
🔒 DECIDED — deterministic ids for key-unique kinds. Random ULIDs diverge when two offline replicas independently create the same logical singleton (today's journal, a tag by name) — duplicates that never merge. So journal ids derive deterministically from (owner, ISO-date) and tag ids from (owner, normalized-name); independent creation then yields the same id and the CRDT/op-log coalesce them (two partial daily notes even merge). project stays a random ULID (deliberately created, renameable, tiny stable set). Tag rename = retag (no in-place key change); the normalization function is a fixed, versioned constant. See v1-prototype-tech-spec §4.5.

3.2 Scalar task attributes vs. links

Tags and projects are nodes (linkable, queryable). Attention-state, do-date, late-on, recurrence are scalar columns on the task. See §6.2 for the semantics — these columns are not arbitrary; they encode a specific, battle-tested model. Due dates are not first-class nodes (a task has one project/context; "this Friday" doesn't need its own context).

3.3 Recurrence model (recurring tasks & checklists) — IN SCOPE for v1

A recurring task carries an RFC-5545 RRULE and acts as a recurring definition.

❌ The anti-pattern we must avoid (Todoist's mistake): modeling a recurring task's checklist sub-items as standalone, non-recurring tasks that carry their completion state forward when the parent recurs. This is the corner; designing around it now is why recurring checklists are in v1 rather than deferred.

heph model — 🔒 DECIDED (roll-forward in place):

The recurring definition's checklist template is just the - [ ] lines in its canonical context doc body (under Fork A, §6.3, the body is the template — no separate template entity).
A recurring task is a single node. On completion: (1) append the finished occurrence to the per-task log (§6.4), (2) reset the body's checkboxes to unchecked (a body-CRDT edit), (3) advance the do-date to the next RRULE instance strictly after now, skipping missed occurrences. So a fresh, all-outstanding checklist each occurrence; completion never carries forward; a missed daily routine is one gently-overdue item, not a pile.
History is narrative (the log), matching "narrative > list" (§6.1). RRULE expansion stays lazy — only "next instance after now" is ever computed, never the series (§6.6).

Why not occurrence-instances? The rejected alternative (a fresh node per occurrence) was attractive for queryable per-occurrence history, but under Fork A each occurrence would need its own body — hence its own node — turning every daily recurrence into a new node-pair: exactly the explosion §6.6 forbids. No v1 query needs occurrence rows (the ranking reads the single node's do-date), and adherence/streak stats can be reconstructed from the log later. See v1-prototype-tech-spec §4.4.

4. Architecture

🔒 DECIDED (shape).

Layers, top to bottom:

Surfaces (thin clients) — a three-surface model (revised 2026-06 after the §6.2.1 Todoist study; supersedes the earlier "nvim is the primary surface" framing). Tasks and knowledge pull in different interaction directions, so each surface plays to its strength rather than one trying to do everything:
- heph.nvim — the primary context / knowledge-base surface. The full "org-mode"-style experience (markdown buffers backed by doc nodes; wiki-links, journaling, the canonical-context doc, per-task log, checklists). The explicit successor to obsidian.nvim (telescope picker, follow [[wiki-links]] on <Enter>, dailies, multi-state checkboxes); must reach that parity (§6.5). It surfaces tasks for navigation/reading and context, not as the primary place to edit structured task fields.
- heph CLI — capture/scripting + the complete daemon API. Every structured task field is a flag (-a red --do tomorrow --late fri --recur weekly --project Maintenance), which is exactly what command-line flags are good at and dissolves the "how do I edit N structured fields" problem. The CLI implements the entire API (admin, sync, conflicts, export) so it is also the scripting/automation surface.
- heph-tui — the primary task agenda / triage surface (planned, v1-prototype-tech-spec §8.1). The §6.2.1 study shows the dominant task activity is interactive triage of a large set (daily orange reconfirm, blue keep/drop review, browse-by-project) — work that is awkward as either CLI flags or nvim buffers. A terminal UI owns that, and launches into nvim for a task's context (and nvim launches back). Not yet built.
- The web UI is the occasional hub-served surface. A later iOS/Watch client talks to the hub directly.
- 🔒 Single binary, three modes. One Rust binary runs as local / server / client (v1-prototype-tech-spec §3.1); the heph CLI shares the same command surface. Mode is two orthogonal axes (backend + inbound listener) plus an optional hub_url that makes any local instance a syncing spoke — the everyday device is local + hub_url, the hub is server, client is the online-only convenience.
Per-device daemon (hephd): owns the local SQLite handle, the CRDT/op-log state, and background sync. All local surfaces connect to it over a local socket. This is what makes multi-surface access concurrent and safe on one SQLite file, and gives one place to run background sync.
- 🔒 Lifecycle = an explicit OS service; surfaces are connect-only (decided 2026-06). Since the daemon is shared across surfaces (CLI, TUI, nvim), no single surface owns it — so none auto-spawns it (an earlier nvim "managed daemon" that spawned + killed-on-exit was removed: a surface-owned daemon can't be shared, and "when do we stop it?" has no good answer). Instead it runs as a launchd agent (macOS) / systemd user service (Linux), managed by heph daemon start/stop/restart/status (run-the-daemon). Surfaces only connect, and tell you to run heph daemon start if nothing is serving the socket.
Core crate (Rust lib): data model, query engine, markdown parsing + wiki-link extraction, and sync logic. Linked by both hephd and the hub server.
Hub: hephd running in "server" mode on blumeops k3s — central SQLite, the sync rendezvous point, and the web UI host. May be offline for long periods; devices keep working and reconcile when it returns.

Open points:

❓ nvim ↔ daemon transport: JSON-RPC over a unix socket (nvim has a native RPC channel; allows the daemon to push updates when sync brings in remote changes) vs. the plugin shelling out to the heph CLI. Leaning socket RPC.
❓ Web front-end style: server-rendered (Axum + HTMX/Askama) vs. WASM SPA (Leptos/Dioxus). Leaning SSR + HTMX since it's the lowest-traffic surface — favor simplicity.

5. Sync & Conflict Resolution

Requirements (gathered from the owner)

🔒 ~99% of interaction today is via the CLI on Gilbert (dev laptop). Future split: ~90% Gilbert, ~9% phone/watch, ~1% web.
🔒 Must work seamlessly across tailnet workstations (Gilbert, ringtail, others): dispatch a change on one, walk to another, see it arrive "in short order."
🔒 Genuine offline conflicts will occur — e.g. Gilbert edits offline, then ringtail edits, then Gilbert reconnects. Must resolve gracefully: auto-merge where possible; for the unresolvable remainder, a conflict queue + alert ("you have N merge conflicts, run heph conflicts").
🔒 Auth from the ground up — this is extremely sensitive data. Use Authentik (OIDC) as IdP, support multiple isolated user accounts (in practice just eblume).
🔒 A web UI is part of the hosted (hub) model.

Proposed model — for confirmation

A hub-and-spoke, op-log + CRDT design:

Topology: each device runs hephd with a full local SQLite replica; the blumeops hub is the rendezvous. Devices push/pull an append-only operation log to the hub when reachable. Hub-and-spoke (not P2P mesh) — simpler, matches "central self-hosted app," and the hub is normally up on the tailnet so propagation is prompt. When the hub is down, devices keep working and the op-log drains on reconnect.
Merge semantics (so "graceful" is precise):
- doc bodies (markdown): a text CRDT (e.g. Yrs/Automerge) → concurrent edits always merge with no hard conflict. This covers the common case.
- Scalar task fields (completion, due, priority): last-writer-wins via Hybrid Logical Clocks. Deterministic and cheap.
- Links / set membership (tags, projects): add/remove as a CRDT set (OR-Set) → no conflicts.
- Conflict queue: because CRDTs auto-merge, true blocking conflicts are rare. We surface (not block) the cases that are semantically ambiguous: a discarded LWW value on a meaningful field (e.g. completion flipped both ways, divergent due dates), or concurrent edits to overlapping text regions. These go to heph conflicts with an alert; sync never stalls.
Identity & ordering: ULIDs for nodes (§3.1); HLC timestamps stamp every op for causal ordering across offline devices.
Auth: the hub's sync + web endpoints require an OIDC bearer token from Authentik. Devices obtain tokens via the OAuth device-code flow; refresh token stored in the OS keychain / 1Password. Every node/op is owned by a user_id; the hub enforces per-user isolation. Offline devices operate on cached credentials.

Open points

❓ Hub-only vs. P2P fallback: if the hub is down but Gilbert and ringtail are both on the tailnet, do they sync directly, or wait for the hub? Hub-only is simpler; direct P2P is more available. (Leaning hub-only for v1.)
❓ Encryption at rest: given "extremely sensitive," should each device's SQLite (and the hub's) be encrypted at rest (e.g. SQLCipher), and/or should op-log payloads be end-to-end encrypted so the hub stores ciphertext it can't read? (E2E complicates server-side search/web UI.)
❓ Todoist's role during/after migration — two-way bridge during transition, import-once, or retire on cutover?
❓ How "short order" is short — is a few seconds of propagation (push-based) needed, or is periodic pull (e.g. every 30–60s) fine?

6. Workflow Features

Driven by "highly optimized specific workflows." heph is organized around a small set of workflows, each a first-class surface in heph.nvim:

Task / "What is next?" (§6.1–§6.3) — the Tactical / Strategic / Organizational task engine. The flagship.

Log (§6.4) — daily journal + optional append-only per-task work-logs ("narrative > list").

Wiki / KB (§6.5) — the whole context corpus as a browsable/queryable personal wiki; the direct obsidian.nvim replacement.

Calendar (§6.6) — entering via the calendar, tied to Strategic mode. Deferred, but scaffolded carefully.

"What is next?" — the flagship

❓ ACTIVE DEEP-DIVE. This is the defining workflow and has ~10 years of the owner's prior art and prior tool attempts behind it. Being designed from that prior art rather than from scratch — see §6.1. The blumeops-tasks ranking (overdue-first, then priority, p3 = backlog) is a known reference point, not the target.

Context surfacing — v1 scope

🔒 DECIDED for v1. Explicit links only; no inferred/semantic context yet.

Every task automatically gets one canonical doc (context) node, created and linked on task creation. That canonical doc is the jumping-off point and can wiki-link onward to other docs/cards.
nvim commands provide telescope/fzf/rg-style search over context docs, optionally with preloaded filters (e.g. scoped to a project).
Inferred context (shared tags, full-text/semantic similarity) is deferred — large surface area, not the v1 focus.

6.1 "What is next?" — prior art & synthesis

Distilled from ~10 years of the owner's thinking — predating the ZK by ~5 years — chiefly the owner's direct account (this design conversation), plus zk/mole/20240202113646.todo.md (the most developed written framework), daily logs (zk/home/2023-09-01.log.md, 2023-10-12.log.md), and zk/payrix/friction_log.md. Prior tool code may exist at github.com/eblume/{mole,hermes} (check detached-head branches — several restarts); the owner expects little reusable there beyond what's captured here.

History: Hermes → Mole → heph (and the lessons)

Hermes — the "time accountant" (≈ earliest). Goal: a timeline-based system fed "blueprints" of what a day should look like, with rules like "check email twice a day, unless a high-priority email comes in — but even then no more than once an hour," continuously re-solving the calendar in real time. Built a working prototype using a SAT solver (OR-tools). Killer feature deduced: a quick, meaningful answer to "what is next" — here, the next thing on the calendar. Why it failed: the calendar became a battle zone, swinging wildly every few minutes as the solver re-resolved the dependency graph → impossible to plan around. Lesson + salvage: there may be fruit here as a future, opt-in heph planning mode, but real-time auto-rescheduling of a visible calendar is anti-useful.
Mole — "whack-a-mole" rules engine. A rules engine that constantly re-evaluated rules and materialized Todoist tasks (canonical first test: "is there an active task 'whack the mole'? if not, make one"). "What is next" = "what Todoist says is available." Worked reasonably. Why it stalled: Todoist accumulated too many tasks, so choosing the actual next thing became the whole problem again. This pushed the owner to design the Todoist project/priority/filter discipline in §6.2. The deeper lesson: Mole's automation wasn't adding much value over the project/priority discipline itself — so the owner turned Mole off and just kept using Todoist. The manual discipline was the real engine.
Today. No automation beyond blumeops' read-only blumeops-tasks mise poller. Just Todoist + the §6.2 discipline — but as the direct descendant of all the above. Implication for heph: the value is in embodying and strengthening the discipline (and finally fixing the weak note↔task link), not in clever auto-scheduling or a rules engine. Earn any automation only after the discipline is faithfully supported.

"What's next?" is three questions, not one

The owner's framework splits the question by mode:

Tactical — blank-slate "what do I do right now?" Stay in the editor/terminal; capture interruptions and resume in milliseconds ("save state and walk away in milliseconds"). This is the mode a single ranked list serves.
Strategic — "I want to do X — what's the next step, or where was I?" Exploring a goal, generating the sub-tasks to reach it.
Organizational — "I want to do X — help me plan the steps." Project-level enumeration of what needs doing, not how.

🔒 REAFFIRMED. The owner still strongly identifies with this split (the earlier "maybe" was conflating it with the unrelated CLI-first process note on the same page). The clean operational test, in the owner's words: "looking at a project's list of todos = Organizational; deciding which to work on = Strategic; working a todo = Tactical." Escalation flows upward (project-infra work can become Strategic, then Tactical).

How heph embodies it — as named modes/views in the nvim plugin, not an inference engine that guesses your mode:

Tactical (execution view): you are "tactically engaging" a specific committed task. Loads its canonical context + sub-items into a goal stack; millisecond interrupt-capture pushes new context items onto the stack; "smallest amount of contextual guidance and next steps to stay productive." Includes Chore Mode (synthesize a goal stack from a query — e.g. the Chores project — and chug through one by one).
Strategic (planning view): editor present but the goal is to explore/decompose, not edit. "To do X I must also do Y and Z." Produces a fleshed-out task with ready sub-items so Tactical can grab and go. When Strategic completes, no work has been done — only new/changed tasks, docs, links.
Organizational (survey view): project-level enumeration — what needs doing, not how.
Shared primitive — the goal stack (see §6.3) spans Tactical and Strategic; transitions between modes are first-class (Strategic produces → Tactical consumes).

The real cost is resumption, not ranking

The recurring pain in the logs is reorienting after a context switch, not deciding raw priority. Design rules that follow:

Resumption hint invariant: when work pauses on a task/todo, it should leave at least one concrete next-step so future-you can re-enter quickly. heph should make capturing/“leaving a breadcrumb” frictionless and surface it first on return.
Narrative > list: daily logs proved more motivating and contextual than flat task lists ("a log is the way"). Journaling/log nodes are first-class, and "what is next" should be able to lean on recent log narrative, not just task metadata.

The "weak link" is heph's reason to exist

The #1 unsolved problem: the fragile, clunky coupling between Todoist and notes (manually typing nb o 45 into task descriptions). heph's thin-task ↔ auto-created canonical-context-doc model is the direct fix: the link is structural, not a hand-typed string.

Mapping prior taxonomy → heph entities

Prior (`nb`/Todoist)	Meaning	heph
nb-project	grouping; completing all members completes it	`project` node
nb-todo	planning complex multi-step work; has "a single unified context log"	`task` + its canonical context `doc`
nb-task	immediate, short-lived "might eventually be done"; checking off = "no longer might be done" (not necessarily completed)	thin `task` node
Todoist	"deciding what is next and knowing what needs to be done now"	heph's "what is next?" engine
Calendar	scheduling / future	due dates (+ later calendar integration)

🔒 Preserved. The nb-task semantic — checking off means "no longer outstanding," not "accomplished" — carries into heph. Context items are outstanding / not-outstanding (see §6.3). For committed tasks, heph distinguishes end-states: done (accomplished) vs. dropped/dismissed (let go, e.g. during a Blue review). Both are "not outstanding"; the distinction is kept for honesty and history.

Process steer (now historical)

The owner's old rule — "avoid ncurses and interactive UIs; write atomic code and call it manually with subcommands… until a strong pattern emerges" — was a focus device for hand-building tools. It no longer constrains heph: we go straight to the nvim org-mode plugin backed by the Rust server (the patterns are now well understood; see §6.2/§6.3). The CLI remains as a secondary/utility surface.

Distilled requirements for heph's "what is next?" — to confirm/refine

Mode-aware (Tactical / Strategic / Organizational), not a single global sort.
Sub-millisecond capture and fast state-save (Tactical interrupt handling).
Resumption-first: on returning to a task, surface the last breadcrumb/next-step before anything else.
Context-coupled: every task one keystroke away from its canonical context doc and onward links.
Log-aware: can factor in / surface recent daily-log narrative.
A sane default ranking for the Tactical blank-slate case — the concrete model is §6.2, not the blumeops-tasks placeholder.

6.2 The lived priority discipline (the working basis for heph)

🔒 This is the real model. The owner has used this Todoist discipline daily for years; Mole was eventually turned off because this discipline outvalued the automation. heph should embody and strengthen it. Captured verbatim-in-spirit from the owner's account.

Projects = contexts (a task has exactly one). Examples: Life (catch-all), Work, Chores, Maintenance, Outside, Allison (wife), Errands, Cooking, Health. A project contextualizes a task; it is not a plan.

Attention-state (thought of by color, not number):

State	Meaning
White (default)	Able to be done once its do-date arrives. (See do-date semantics below.)
Orange	Top of mind. Keep ≤ 6, reassessed every day. Goal: you know your top-of-minds without looking at the tool.
Red	Top of mind + there will be a consequence if not done ASAP. Red means the existence of a consequence, NOT its severity/importance. Flagging by importance is a trap → anxiety (everything due feels important).
Blue	On Deck (backlog). Doable, but deliberately cooling off. The pressure-relief valve that keeps the working set light.

Do-date, not due-date (key insight): "due date" means DO date — don't do it before this date. Optionally a separate "Late On" marker for when it actually becomes a problem. Due dates are a terrible planning tool — calendar-style deadlines make everything "yell for attention" constantly. heph must treat the do-date as earliest actionable, not a deadline alarm.

Working-set discipline (the healthy tensions — a feature, not a bug):

Keep White+Orange+Red ≤ ~30 total, or the system collapses into noise. Keep Orange ≤ 6.
Push freely to Blue to stay light; periodically review Blue: keep vs. let go (this is where "drop/dismiss" happens). Target On-Deck < 100 (owner is at ~149 and hates it).
These tensions map the owner's emotional state to the task system and should be surfaced honestly. heph must avoid two failure modes: (1) "fake happy" — telling you everything's fine when it isn't; (2) overwhelm — yelling that you're buried until you freeze and make no progress. Reflect true state; help maintain the tensions.

Filters = saved views (queries over the above):

Top of Mind — all Red + Orange.
Tasks / Work Tasks — Red/Orange/White that are actionable now (do-date arrived), scoped by project.
Chores — like Tasks but from Chores, slightly tuned down so the (near-universally recurring) chores don't become a nuisance.
On Deck — all Blue.
Schedule — recurring checklists from a few projects (e.g. Morning Routine → brush teeth, feed birds, coffee, medicine). The owner uses this particular filter sparingly, but the underlying recurring-checklist mechanism is IN SCOPE for v1 (see §3.3): the only way to be sure we avoid Todoist's mistake is to model it up front. Each recurrence gets a fresh checklist instance; completion never carries forward.

heph's default "what is next?" (Tactical blank-slate) therefore ≈ Top of Mind (Red first — by consequence, then Orange) + White items whose do-date has arrived, scoped to the current project/context, with Blue hidden. Concise, honest, light.

🔒 DECIDED ranking mechanics (v1-prototype-tech-spec §7): do_date is a boolean candidacy filter only (null ⇒ "now"), never an urgency input; late_on is the sole urgency signal (a global "now a problem" top tier); within a band, FIFO by created_at — age never becomes urgency. The order is expressed as a reorderable named-dimension list so it can be retuned without touching the engine.

6.2.1 Todoist study (2026-06): empirical confirmation + new requirements

📊 Snapshot of the owner's live Todoist (read via the blumeops blumeops-tasks API pattern), to ground heph against how the discipline of §6.2 actually manifests at scale. 387 active tasks, 34 projects.

Confirms the model:

Projects = contexts, used heavily and hierarchically. 34 projects organized into top-level life-areas (Life, Work, Coding, Camano, Culture) → sub-projects (Child, Blumeops, Daily Routine, Maintenance, Movies…). The hierarchy is new information (§6.2 listed projects flat).
Huge backlog, tiny hot set. Priority split p1=2, p2=8, p4(default)=229, p3=148 — i.e. the genuinely "hot" set is ~10, the rest is white/blue backlog. The dominant activity is therefore triage of a large set, not editing single tasks — a direct argument for an interactive agenda surface (§4, v1-prototype-tech-spec §8.1).
Do-date, not due-date — validated to the hilt. Zero Todoist deadlines are used; only 107/387 carry a due (do) date. late_on will be genuinely rare.
Recurring checklists are real ("Prep For Day", every day @ 07:00, 6 children) — exactly the §3.3 reset-each-occurrence mechanism.
Daily rituals exist as tasks: "Excess Tasks to On Deck" (= the blue keep/drop review), "Coordinate with Allison", "Check Personal Email" (= orange reconfirm). The §6.2 working-set rituals are alive in the data; the TUI should make them first-class filters.

The strongest validation — descriptions are already heph. Real task descriptions contain see [[review-services]], See [[run-1password-backup]], docs/how-to/plans/migrate-forgejo-from-brew, plus URLs and phone numbers. The owner is hand-rolling task↔knowledge-base wiki-links inside Todoist descriptions that cannot resolve. heph's task → canonical-context-doc fusion (§6.3) is precisely the thing being worked around. (89/387 tasks carry a description; the task description ≡ heph's canonical context doc body.)

New requirements surfaced:

Project hierarchy — a project needs an optional parent project. (Model it with the existing parent link; deeper hierarchy-aware scope is a later refinement.)
Natural-language recurrence — 95/107 dated tasks recur, expressed as every 3 days, every 6 months, every workday, every April 15, every other wed. heph stores RFC-5545 RRULE; capture should accept the common NL forms and compile them (the easy subset; time-of-day like "at 08:00" deferred — heph's do_date is date-grained for ranking).
Tags are noise — 7 labels, 5 task-uses across 387. Confidently defer a tag surface; it is not load-bearing.

The saved filters, verbatim (the basis for heph's filter views, v1-prototype-tech-spec §8.2):

Filter	Todoist query
Top of Mind	`(p1\|p2) & (no date \| today \| overdue)`
Schedule	`(today \| overdue) & !no time`
Tasks	`!p3 & (no date \| today \| overdue) & (p1 \| (p2 & !#Work) \| !(#Daily Routine \| #Work Routine \| #Chores \| #Camano Chores \| #Work \| ##Culture \| #Camano Info)) & !subtask`
Work Tasks	`#Work & !p3 & (no date \| today \| overdue) & !subtask`
Chores	`(today \| overdue \| no date) & (#Chores \| #Camano Chores)`
On Deck	`p3 & (no date \| overdue \| today)`

These define both the agenda slices heph must offer (v1-prototype-tech-spec §8.2) and, by what "Tasks" excludes (##Culture = Movies/Books/Theater; #Camano Info), which contexts are not tasks at all.

Reference contexts → wiki, not tasks (decided 2026-06). The ##Culture subtree (Movies/Books/Theater) and #Camano Info are reference lists (films to watch, books to read, contractor phone numbers), deliberately excluded from every task filter. In heph they belong as doc pages, not committed tasks — otherwise they flood "what is next?". On the initial Todoist import these 5 projects (~59 items) were reclassified into one wiki list-doc each and the task/project nodes tombstoned. (A useful general principle: a "task" that never appears in any of your filters is probably a note.)

Future direction (noted 2026-06, not scheduled):

Chores as a first-class feature. Chores want different do-date / recurrence semantics from regular tasks (the every-N-days "do it again sometime after" rhythm, tuned-down urgency). Rather than model them as a #Chores project you scope to, make "chore" a first-class task property (kind/flag) with its own scheduling rules — which retires the #Chores / #Camano Chores projects (and the Camano split) entirely. The interim Chores filter view (v1-prototype-tech-spec §8.2) is project-scoped until then.
Drop the Schedule filter. Schedule was Todoist's timed-routine view (!no time); heph's do_date is date-grained, so it is omitted (not approximated). It's entangled with the chores rework (timed routines), so reconsider both together if/when time-of-day lands on tasks.

6.3 Two kinds of task: commitments vs. context items

🔒 DECIDED (shape). A commitment axis orthogonal to the §6.2 attention-states.

Committed task — long-lived, strongly tied to context (e.g. "Complete PLAT-1234", "Build a nursery"). Participates in §6.2: has an attention-state, a project, a do-date; is a candidate for "what is next"; may carry sub-structure and a web of context.
Context item — ephemeral, created mid-flow ("six things before this is done", "oh crap, another" pushed on the stack). Does not enter the §6.2 scheduling system — it is part of its containing task's context, with only outstanding / not-outstanding state. Never appears in global "what is next."
Promotion is first-class: a context item can be promoted to a committed task when you decide to actually schedule it (the old "nb-todo → Todoist-task on activation").
Ephemeral ≠ deleted: deletes effectively never happen — tombstones + a dedicated cleanup mode only. "Ephemeral" is lifecycle/visibility, not storage. (This also keeps the CRDT sync simple.)
The goal stack is a session-scoped, synthesized construct: seeded from a committed task (Tactical) or from a query over contexts (Chore Mode — "grind on some chores"). It holds committed tasks and/or context items for the duration of a work session.

Editing surface for context items — deliberately deferred to the prototype

🔑 DECIDED — Fork A "index" backend (the affordance is still trialed both ways). The stored source of truth is the doc body markdown: context items are - [ ] lines in it, and a context item's checked state is the [ ]/[x]. They are not synced nodes — each replica derives a local index from its converged CRDT body, so they converge for free (no divergent ULIDs across offline edits — the original sync trap). Identity is pinned only at promotion, which mints a real committed task node and rewrites the source line into a link to it. This collapses the hardest sync problem into "the body CRDT already converges" and matches this section's "ephemeral, low-stakes identity" intent. What's still trialed both ways is purely the nvim affordance (Option A checkboxes-in-body vs. Option B command-driven capture) — the backend is shared, so ergonomics can be judged in the prototype. See v1-prototype-tech-spec §4.3, §5.

Option A — checkboxes in the body, derived on save. Edit the task's canonical context doc as plain markdown; - [ ] lines are materialized into context-item state on :w (BufWritePost), reusing the wiki-link extraction machinery. No real-time scanning needed (debounced live updates are optional polish). Remote CRDT edits are pushed from the daemon to reconcile an open buffer. Item identity is low-stakes (reword = tombstone-old + add-new) since items are ephemeral; identity is pinned only at promotion. Most "personal org-mode"-native; millisecond capture = "type a line."
Option B — command-driven capture, rendered. Capture via a leader chord + vim.ui.input (no raw checkbox typing); the new item is still written as a - [ ] line into the body-backed store (Fork A), then rendered inline via virtual text/extmarks or a transient floating window (avoids the disliked persistent cutaway). Heavier capture; more UI to build; promotion identical.
Lean: A for capture flow, but build the shared node backend first and trial both affordances in the plugin.

Prior-art note (multi-state checkboxes): the owner's obsidian.nvim config uses checkbox states " ", "x", ">", "~", "!" (todo / done / forwarded / ~ / important). heph's task/context-item state model should accommodate richer-than-binary states (at least the done/dropped/outstanding set in §6.1; possibly forwarded/deferred).

6.4 The Log workflow

🔒 In scope for v1 (daily journal certainly; per-task logs as the model allows). Logs embody the "narrative > list" insight and the resumption-hint invariant from §6.1.

Two related shapes:

Daily journal. Today driven by the zkd abbrev → Obsidian dailies: a telescope picker to create/open dated journal buffers. heph keeps this as a first-class feature: journal nodes keyed by date, a picker to jump to today / recent / create. The daily journal is the catch-all narrative stream.
Per-task log. Sometimes a task wants an append-only "here's what I was working on" side-commentary. It half-belongs to the task's context and half doesn't — so model it as a separate buffer/node from the task's canonical context doc, optionally present only for tasks that need longer-duration commentary. Properties:
- Append-only (entries accrete; you don't rewrite history).
- Dispatchable from an isolated context: you must be able to fire a log entry without leaving / losing your place in the context buffer you're navigating — i.e. append to the log from elsewhere (a quick-capture into the task's log), while the context buffer stays where it is. (This is the ergonomic crux; likely a quick-capture command targeting "the current task's log".)
- The per-task log is a strong candidate to be the resumption breadcrumb store (§6.1).

❓ Open: is the per-task log a distinct node kind (log, append-only), or just a doc flagged append-only and linked log-of → task? Leaning a dedicated append-only log kind to make "dispatch an entry" a cheap, well-defined op.

6.5 The Wiki / KB workflow

🔒 First-class in v1. This is the "view all my contexts as one big personal wiki" surface — the direct replacement for obsidian.nvim.

The owner estimates ~90% of the value comes from a few well-worn primitives, so heph.nvim should reach obsidian.nvim feature parity on day one:

Follow [[wiki-links]] on <Enter> to jump between context docs (the core navigation gesture).
Telescope-based search (rg), quick-switch between docs, tag browse, backlinks, outgoing links — mirroring the owner's <Leader>o* verbs (to be rebound under heph's own prefix).
Logical layout + tagging of context docs so the corpus is browsable/queryable as a whole.
New-doc-from-query and insert-link from the picker (obsidian.nvim's <C-x>/<C-l>).

heph.nvim replaces obsidian.nvim outright — the owner already rarely opens Obsidian itself. This workflow is the knowledge-base analogue of the Organizational task view: surveying and traversing the doc graph.

6.6 The Calendar workflow

⏳ Deferred (later version) — but scaffold now, don't paint ourselves into a corner. Ties closely to Strategic mode (planning against time).

The owner uses Calendar.app bound to a Gmail calendar (almost certainly CalDAV; Gmail also exposes ICS/WebCal feeds — to confirm when implemented).
Eventual goal: integration between calendar views and task views — enter the system via the calendar, see tasks/do-dates in temporal context.
⚠️ Hard-won gotcha: enabling Todoist's gcal integration flooded the calendar with a million events because of recurring rules. heph must never explode recurrence into individual calendar events. This echoes the Hermes "battle zone" failure (§6.1) — calendar integration must be read-mostly, careful, de-duplicated, and opt-in, not a firehose.
Scaffolding guidance (decisions to keep clean now): keep a crisp separation between tasks/do-dates and calendar events; model recurrence as RRULEs that are expanded lazily for display, never materialized as stored events; design sync so a future read-only CalDAV ingestion (calendar → heph) is easy, and treat any heph → calendar export as a later, explicitly bounded, non-exploding feature.

7. Hosting & Deployment

Reuse the established blumeops patterns (🔒 confirmed by repo conventions):

Containerized service deployed to k3s (ringtail cluster) via ArgoCD app-of-apps + Kustomize raw manifests (no Helm/Flux).
Secrets via 1Password + external-secrets operator (ClusterSecretStore onepassword-blumeops).
Image built via Dagger + Forgejo Actions, pushed to Zot registry (registry.ops.eblu.me).
❓ Rust build approach for the container: Nix (like kingfisher) vs. Dagger-native cargo multi-stage build.
❓ Persistence for the central instance's SQLite — NFS PVC (sifaka) vs. local PV. (SQLite over NFS has known locking caveats.)
❓ How do clients reach the central instance — Tailscale tailnet only, or public via Caddy/Fly proxy?

8. Technology Choices (to decide)

❓ nvim plugin stack: Lua + JSON-RPC over a socket to hephd (with daemon→plugin push) vs. CLI shell-out. Telescope as the picker dependency.
❓ Web framework (Axum + HTMX/Askama? Leptos? Dioxus?).
❓ SQLite access layer (sqlx, rusqlite, SeaORM, Diesel) and migrations.
❓ CRDT / sync library (Automerge-rs, Yrs, bespoke op-log).
❓ Markdown parser/renderer (pulldown-cmark, comrak) and wiki-link / checkbox extraction.

9. Open Questions Summary

Decided

✅ Source of truth: SQLite-primary, markdown as the body payload, with a directory export mode. Clean break from Obsidian-the-app.
✅ Note/task model: typed node graph — thin task nodes, rich doc nodes, connected by typed links.
✅ Front-end: hephd daemon with thin clients over a local socket; nvim plugin (heph.nvim) as the primary surface, CLI secondary; web UI on the hub (later).
✅ Sync (shape): hub-and-spoke, op-log + CRDT, HLC ordering, OIDC/Authentik auth, conflict queue for the ambiguous remainder, over a targetable backend (tech-spec §3.1).

Decided since (B/C/D)

✅ Context surfacing v1: explicit links only + auto-created canonical context doc per task + nvim fuzzy search. Inferred/semantic deferred.
✅ v1 supports local-only and distributed via a targetable backend (tech-spec §3.1): Store trait + three modes (local/server/client) + exclusive-lock handoff over a shared SQLite file. Offline-first sync with automatic merge + conflict queue + full OIDC/Authentik auth with per-user isolation are all IN v1 (tech-spec §12/§13). Local-only is a first-class config. Web UI and k3s deployment are later (hub binary is built deployable).
✅ Security v1: OIDC/Authentik auth + per-user isolation is the boundary; plain SQLite at rest (no encryption). May revisit.
✅ Migration v1: clean break, no ZK import / no Todoist bridge. heph is system of record day one.

Decided (workflows)

✅ heph organized around first-class workflows (§6): Task (what-is-next), Log, Wiki/KB, Calendar.
✅ Log (§6.4) in scope: daily journal (picker over dated journal nodes, à la zkd/Obsidian dailies) + optional append-only per-task work-logs (separate buffer from the canonical context doc; dispatchable without leaving the context buffer).
✅ Wiki/KB (§6.5) first-class: heph.nvim reaches obsidian.nvim feature parity (follow [[links]] on <Enter>, telescope search/quick-switch/tags/backlinks/links) and replaces it.
✅ Calendar (§6.6) deferred but scaffolded carefully: never explode recurrence into stored events (Todoist-gcal-flood / Hermes lesson); keep tasks vs. events separate; lazy RRULE expansion; design for a future read-only CalDAV ingest.

Decided (workflow model)

✅ "What is next?" rests on the lived priority discipline (§6.2): projects-as-contexts; attention-states White/Orange/Red/Blue (Red = consequence, not severity); do-date not due-date; working-set tensions surfaced honestly (avoid "fake happy" and overwhelm).
✅ Tactical / Strategic / Organizational reaffirmed as named modes/views in the nvim plugin, sharing a goal-stack primitive (§6.1).
✅ Commitment axis (§6.3): committed tasks (in §6.2 system) vs. ephemeral context items (outstanding/not-outstanding, scoped to a container); promotion is first-class; no hard deletes (tombstones + cleanup mode).
✅ Task end-states: done vs. dropped/dismissed (both "not outstanding").
✅ Prototype goes straight to the nvim plugin + Rust server; CLI is secondary.
✅ Rust stack RATIFIED (§11.2): rusqlite + tokio/JSON-RPC + pulldown-cmark + ulid + rrule + clap.
✅ Recurrence + recurring checklists IN SCOPE for v1 (§3.3): RRULE-driven; each occurrence gets a fresh checklist instance, completion never carries forward (avoids Todoist's corner).

Still open

Prototype-blocking (resolve at kickoff — see §11 / tech-spec):

CRDT library for body merge: yrs (leaning) vs automerge; bespoke op-log/HLC for scalar fields either way.
Hub network transport (tech-spec §6.1): axum HTTP/JSON (leaning) vs gRPC; sync propagation cadence (push vs periodic pull); device-id provisioning (how a spoke mints its origin id and registers with the hub).

Resolved in the second-pass review (2026-05-31), now folded into v1-prototype-tech-spec:

✅ Recurrence → roll-forward in place (§3.3) — occurrence-instances rejected (would explode into a node-per-occurrence under Fork A); advance to the next RRULE instance after now, skipping misses.
✅ Context items → Fork A "index" backend (§6.3) — body markdown is the source of truth; context items are a local derived index, not synced nodes; identity at promotion. The nvim affordance (A vs B) is still trialed in-prototype.
✅ Identity → deterministic ids for journal/tag (§3.1); random ULID for content nodes/project.
✅ Mode/sync orthogonality (§4): backend + inbound-listener axes plus optional hub_url; everyday device = local + hub_url.
✅ Auth/owner (tech-spec §13): nullable oidc_sub, local user for local-only, hub-authoritative identity, one-time pre-first-sync adoption rewrite.
✅ Ranking (tech-spec §7): do_date boolean candidacy filter only; late_on sole urgency signal (global tier); FIFO tiebreak; order as a reorderable named-dimension list.
✅ Modes (§6.1) are plugin-side compositions of mode-agnostic daemon primitives; added list (Organizational) and log.tail (resumption breadcrumb).

Not prototype-blocking (later phases):

Per-task log representation: dedicated append-only log node kind vs. doc flagged append-only (§6.4).
Calendar protocol confirmation (CalDAV vs ICS/WebCal) and the careful, non-exploding integration (§6.6).
Web front-end style (SSR + HTMX vs. WASM SPA).
k3s deployment specifics: Rust container build (Nix vs Dagger-native cargo); central SQLite persistence (local PV vs NFS); client reachability (tailnet-only vs public).
Encryption at rest / E2E (currently out; may revisit).

10. Roadmap (provisional)

Phase 0 — Design (this document + v1-prototype-tech-spec): done enough to build.
Phase 1 — v1 prototype (a single C1 effort, deliberately — a one-shot test of delivering a high-complexity prototype from the spec; built in TDD slices): heph-core (model, schema, extraction, recurrence, "what is next", Store trait, op-log/HLC/CRDT merge) → hephd local mode → server + client modes (+ lock handoff) → offline sync + conflict queue → OIDC/Authentik auth + per-user isolation → heph.nvim + heph CLI. Local-only works standalone; runnable client/server + offline sync on the tailnet. The build order doubles as the cross-session resume tracker (next un-green slice = where to resume). C2/Mikado is not used: it sequences prerequisites against existing code under test, and this is greenfield delivery from a complete spec; follow-up C1s or a C2 refactor come later as needed.

Phase 1 progress (branch feature/v1-prototype, PR #1; 102 tests green as of 2026-06-01 — see v1-prototype-tech-spec §14 for the per-area tracker):
- heph-core library — schema/Store/LocalStore, extraction, tasks/links/canonical-context + wiki-link materialization, "what is next?" ranking, recurrence roll-forward + per-task logs.
- hephd local mode — file lock + JSON-RPC over a unix socket (tokio blocking pool); heph CLI + export.
- Local query surface — list, health, journal, search (FTS5).
- Sync engine (no network yet) — HLC + device origin; op-log recording; apply_op merge (LWW + conflict queue, OR-set links, monotonic tombstones, idempotent); two-replica convergence proven; adopt_owner (basic §13).
- Resume here → yrs text-CRDT for bodies (upgrade body merge from LWW); then server/client modes + network push/pull (transport over the existing engine); then OIDC/Authentik auth; then heph.nvim.
Phase 2 — k3s deployment: Dagger→Zot image, ArgoCD app + Kustomize manifests, external-secrets; hub on blumeops.
Phase 3 — Web UI on the hub.
Phase 4 (later, optional) — calendar integration (careful CalDAV); migration from ZK / Todoist; iOS / Apple Watch voice capture; Hermes-style planning mode.

11. v1 Prototype — Scope, Stack, and Kickoff

The handoff target for the next context session. Items here are proposals to ratify; once ratified, a fresh session can build directly from this.

The canonical, implementation-ready scope/stack/schema/API now live in v1-prototype-tech-spec. This section keeps the intent and the kickoff process; defer to the spec for details (and update the spec when decisions change).

11.1 Scope — local-to-distributed via a targetable backend (ratified)

v1 supports the full spectrum — pure local-only through full distributed — chosen by configuration, not a rebuild. A Store abstraction backs three runtime modes (local / server / client) with an exclusive-lock handoff over a shared SQLite file (tech-spec §3.1).

In: the full data model (§3, §6) incl. recurrence + recurring checklists (§3.3); the "what is next?" engine; the targetable backend + all three modes with the same-file local↔server lock handoff; offline-first op-log + CRDT sync with automatic merge and a conflict queue for local-backed replicas (tech-spec §12); OIDC/Authentik auth with per-user isolation on the server endpoint (tech-spec §13); the heph.nvim + heph CLI surfaces.
Out (later, scaffolded): the web UI (hub serves sync only in v1); actual k3s deployment to blumeops (hub binary is built deployable; ArgoCD/Kustomize/Dagger is a fast-follow); calendar; iOS/Watch; inferred context; P2P-over-tailnet fallback; persistent goal stack (an in-plugin session stack suffices).

Why this v1: the backend abstraction, offline/merge behavior, and auth model are architecturally load-bearing — bolting them on later means reworking the core. The owner wants them designed in from the start, while keeping local-only a first-class, friction-free configuration. (Recurrence-with-checklists is in for the same reason — see §3.3.) Natural build order: local mode first (simplest), then server/client, then sync, then auth.

11.2 Stack — ✅ RATIFIED

Core: rusqlite (bundled) + migrations · tokio + JSON-RPC/unix-socket (local) · ulid · rrule · pulldown-cmark · clap · anyhow/thiserror · tracing. Distributed/auth additions (some to confirm at kickoff): text CRDT yrs (vs automerge) for bodies + bespoke op-log/HLC for fields · hub transport axum/reqwest · OIDC via openidconnect + keyring. Full detail and the SQLite schema: tech-spec §4.5, §10, §12, §13.

11.3 First-session kickoff checklist

mise run ai-docs; read v1-prototype-tech-spec (the build spec) and this doc's §3/§6 for rationale.
Classify as a single C1 (deliberate — a one-shot test of delivering this high-complexity prototype from the spec; C2/Mikado is for sequencing prerequisites against existing code, not greenfield delivery). Single long-lived feature branch + early draft PR, docs-first, push as you go (agent-change-process). The §11.1 build order is the resume tracker; the hairy slices (sync/CRDT) may spin off follow-up C1s rather than blocking the prototype.
Scaffold the cargo workspace + heph.nvim skeleton; fill the AGENTS.md Project Structure section (last template TODO).
Build outward in testable slices (TDD, tech-spec §2/§9): heph-core (schema, model, extraction, recurrence, "what is next", Store trait, op-log/HLC/CRDT merge) → hephd local mode (LocalStore + lock + local RPC) → server + client modes (network endpoint, RemoteStore, lock handoff) → sync + conflict queue → OIDC auth → heph.nvim + heph CLI.
Confirm the kickoff-open picks: CRDT lib (yrs vs automerge), hub transport (axum vs gRPC) + propagation cadence + device-id provisioning. (Recurrence = roll-forward and the context-item backend = Fork A are now decided; only the nvim affordance A vs B is trialed in-prototype.)

v1-prototype-tech-spec — clean implementation-facing technical specification distilled from this document
ai-assistance-guide — conventions for AI agents in this repo
agent-change-process — C0/C1/C2 change methodology

55 KiB Raw Permalink Blame History Unescape Escape