Tier 2 fuzzing: a nightly cargo-fuzz crate at crates/heph-core/fuzz/ with three targets (crdt_merge, crdt_write, extract), reaching crate-private CRDT internals through heph-core's new 'fuzzing' feature. Driven ad-hoc via 'mise run fuzz'; not in CI (needs nightly + wall clock). crdt_merge immediately surfaced robustness gaps in yrs 0.27 on malformed sync deltas (a 4-byte input OOMs; other inputs abort/UB) — uncatchable, limited blast radius (authenticated /sync/push), documented as a known limitation. extract and crdt_write ran clean over ~1M cases. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
| title | modified | tags |
|---|---|---|
| Fuzz Testing | 2026-06-09 | |
Fuzz Testing
heph's parsing layer is pure and clock-injected, which makes it a natural fit for randomized testing. Fuzzing runs at two tiers:
Tier 1 — property tests (proptest, stable Rust, runs in CI)
Property-based tests live alongside the unit tests in each module and run as
part of the normal cargo test suite — no extra tooling, and CI picks them up
via the standard build hook.
Covered invariants:
| Module | Invariants |
|---|---|
| heph-core/src/extract.rs | extraction never panics, is idempotent; links are non-empty/trimmed/deduped; context_item_lines aligns 1:1 with context_items |
| heph-core/src/wikilink.rs | expand/collapse are idempotent; collapse(expand(x)) == collapse(x) |
| heph-core/src/crdt.rs | a write materializes exactly; concurrent edits converge regardless of merge order; merge is idempotent; merging arbitrary garbage bytes never panics |
| heph-core/src/frontmatter.rs | strip is idempotent and always returns a suffix of its input |
| heph-core/src/recurrence.rs | checkbox reset properties (pre-existing); next_occurrence is strictly after `after`; arbitrary RRULE strings never panic |
| heph-core/src/hlc.rs | HLC ordering properties (pre-existing); parse never panics |
| hephd/src/datespec.rs | parse_date never panics (including huge +N offsets); offsets and ISO dates round-trip |
| hephd/src/quickadd.rs | parse never panics; title words always come from the input |
Run them with the rest of the suite:
cargo test
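To make the crdt.rs invariants concrete, here is a minimal sketch of what "converge regardless of merge order" and "merge is idempotent" mean when checked over generated inputs. It is not heph's code: GSet is a toy grow-only set standing in for the yrs-backed body, and XorShift is a hand-rolled generator standing in for proptest so the example needs only the standard library. All names are hypothetical.

```rust
use std::collections::BTreeSet;

/// Toy grow-only set CRDT (hypothetical stand-in for the real document type).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct GSet(BTreeSet<u32>);

impl GSet {
    pub fn from_items(items: &[u32]) -> Self {
        GSet(items.iter().copied().collect())
    }

    /// Merge is set union, which is commutative and idempotent.
    pub fn merge(&self, other: &GSet) -> GSet {
        GSet(self.0.union(&other.0).copied().collect())
    }
}

/// Small deterministic xorshift generator; the seed must be non-zero.
pub struct XorShift(pub u64);

impl XorShift {
    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Generates a small random replica (0 to 7 items drawn from 0..16).
    pub fn gset(&mut self) -> GSet {
        let len = (self.next_u64() % 8) as usize;
        let items: Vec<u32> = (0..len).map(|_| (self.next_u64() % 16) as u32).collect();
        GSet::from_items(&items)
    }
}

/// Checks both invariants over `cases` random replica pairs.
/// Returns false on the first violation.
pub fn check_merge_invariants(seed: u64, cases: usize) -> bool {
    let mut rng = XorShift(seed);
    for _ in 0..cases {
        let (a, b) = (rng.gset(), rng.gset());
        let ab = a.merge(&b);
        // Convergence: the merge order must not matter.
        if ab != b.merge(&a) {
            return false;
        }
        // Idempotence: merging the same replica again changes nothing.
        if ab.merge(&b) != ab {
            return false;
        }
    }
    true
}
```

A real proptest suite would also shrink failing inputs to a minimal counterexample, which this loop does not attempt.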
Tier 2 — coverage-guided fuzzing (cargo-fuzz, nightly, run ad-hoc)
libFuzzer targets live in crates/heph-core/fuzz/. These are for the surfaces
where coverage guidance beats random generation — chiefly the CRDT layer, which
decodes attacker-controllable bytes arriving via sync (yrs update payloads in
op-log entries).
Targets:
- crdt_merge: feeds arbitrary (state, delta) byte pairs to merge_body; asserts no panic and merge idempotence. This is the remote-input surface.
- crdt_write: arbitrary (prev, new) string pairs through write_body; asserts the diff/CRDT round-trip materializes new exactly (UTF-8 boundary stress).
- extract: arbitrary markdown through extract + context_item_lines; asserts the 1:1 alignment invariant promotion depends on.
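libFuzzer hands a target a single byte buffer, so a target that wants a (state, delta) pair has to derive both from it. The sketch below shows one common way to do that, plus an idempotence check in the shape a target body might take. How the real crdt_merge target splits its input is not stated here; split_pair and merge_is_idempotent are hypothetical names, and the merge function is passed in so the example stays independent of heph-core.

```rust
/// Splits one fuzzer-provided buffer into a (state, delta) pair.
/// The first byte picks the split point, so the fuzzer can mutate
/// both halves as well as the boundary between them.
pub fn split_pair(data: &[u8]) -> (&[u8], &[u8]) {
    let Some((&cut, rest)) = data.split_first() else {
        // Empty input: both halves are empty.
        return (data, data);
    };
    // Clamp the split point so it never exceeds the remaining bytes.
    let at = (cut as usize).min(rest.len());
    rest.split_at(at)
}

/// Applies `delta` once, then applies it again to the result,
/// and reports whether the second application was a no-op.
pub fn merge_is_idempotent<F>(merge: F, data: &[u8]) -> bool
where
    F: Fn(&[u8], &[u8]) -> Vec<u8>,
{
    let (state, delta) = split_pair(data);
    let once = merge(state, delta);
    let twice = merge(&once, delta);
    once == twice
}
```

In an actual cargo-fuzz target the check would sit inside the harness entry point and assert instead of returning a bool, so that a violation is recorded as a crash artifact.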
Requirements: rustup toolchain install nightly and cargo install cargo-fuzz.
The fuzz targets reach crate-private CRDT internals through the fuzzing
cargo feature of heph-core, which exposes thin public wrappers — the feature
is never enabled in normal builds.
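The feature-gated wrapper pattern might look roughly like the following. This is a sketch under assumptions: the module layout, the wrapper name, and the stand-in function body are illustrative, and only the feature name fuzzing and the function name merge_body come from this page.

```rust
// Cargo.toml of the library crate would declare the feature, for example:
//   [features]
//   fuzzing = []

mod crdt {
    /// Crate-private internal. The body is a trivial stand-in, not the real merge.
    pub(crate) fn merge_body(state: &[u8], delta: &[u8]) -> Vec<u8> {
        [state, delta].concat()
    }
}

/// Compiled only when the crate is built with `--features fuzzing`,
/// so normal builds expose no extra public API.
#[cfg(feature = "fuzzing")]
pub mod fuzzing {
    /// Thin public wrapper: forwards straight to the private function.
    pub fn merge_body(state: &[u8], delta: &[u8]) -> Vec<u8> {
        crate::crdt::merge_body(state, delta)
    }
}
```

The fuzz crate would then depend on the library with the feature turned on, which keeps the internals private for every other consumer.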
Run all targets briefly (default 60s each), or one target for longer:
mise run fuzz # all targets, 60s each
mise run fuzz 300 # all targets, 5 min each
cargo +nightly fuzz run crdt_merge --fuzz-dir crates/heph-core/fuzz -- -max_total_time=3600
Crash artifacts land in crates/heph-core/fuzz/artifacts/<target>/; the corpus
accumulates in crates/heph-core/fuzz/corpus/<target>/ (both gitignored).
Reproduce a crash with
cargo +nightly fuzz run <target> --fuzz-dir crates/heph-core/fuzz <artifact-path>.
Tier 2 is deliberately not wired into CI: it needs nightly and meaningful wall
clock to earn its keep. Run it ad-hoc after touching crdt.rs, extract.rs,
or the sync payload path. If it ever moves to CI, a scheduled (not per-push)
workflow with a persistent corpus is the right shape.
Findings so far
The first runs paid for themselves. Tier 1 proptests found two reachable panics on user input, both fixed in the same change:
- datespec::parse_offset panicked on a large relative offset (e.g. +999999999999d) because chrono's + overflows; it now uses checked arithmetic and returns an out-of-range error.
- datespec::parse_month_day sliced a token on a non-char boundary for multibyte input (e.g. an every <Month> <day> phrase containing a multibyte character); it now takes the first three chars.
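Both fixes follow standard-library patterns that are easy to show in isolation. The sketch below is not the datespec code: add_days_checked works on plain day numbers instead of chrono dates, MAX_DAY is an arbitrary illustrative bound, and month_prefix is a hypothetical helper name.

```rust
/// Arbitrary illustrative upper bound on a day number (an assumption of this sketch).
pub const MAX_DAY: i64 = 1_000_000;

/// Adds a relative offset to a day number.
/// Overflow and out-of-range results become an error instead of a panic.
pub fn add_days_checked(day: i64, offset: i64) -> Result<i64, String> {
    day.checked_add(offset)
        .filter(|d| (0..=MAX_DAY).contains(d))
        .ok_or_else(|| format!("offset {offset:+} is out of range"))
}

/// Returns the first three characters of a token.
/// Iterating over chars is safe for multibyte input, whereas `&token[..3]`
/// panics when byte index 3 falls inside a character.
pub fn month_prefix(token: &str) -> String {
    token.chars().take(3).collect()
}
```

For example, the French month name "août" has a two-byte character straddling byte index 3, so a byte slice of length three would panic while month_prefix returns the first three characters.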
Tier 2 (crdt_merge) surfaced robustness gaps in yrs 0.27 on malformed
update bytes, reachable through the authenticated /sync/push path:
- a tiny delta [255, 255, 255, 126] triggers a huge allocation → OOM;
- some inputs trip a debug_assert! in the yrs block decoder (an unwinding panic, contained by the catch_unwind in merge_body);
- at least one class hits genuine UB (an invalid char) → SIGABRT under debug UB-checks, silent UB in release.
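The containment mentioned for the debug_assert! case relies on std::panic::catch_unwind. A minimal sketch of that pattern follows, with a stand-in decoder in place of yrs; decode_or_panic and decode_contained are hypothetical names, not the actual merge_body code.

```rust
use std::panic;

/// Stand-in decoder that panics on malformed input,
/// much like a failed assertion inside a dependency.
fn decode_or_panic(bytes: &[u8]) -> Vec<u8> {
    assert!(bytes.first() == Some(&1), "unsupported header byte");
    bytes[1..].to_vec()
}

/// Runs the decoder and converts an unwinding panic into an error.
/// This cannot intercept allocation failure, process aborts, or
/// undefined behavior, and it has no effect under `panic = "abort"`.
pub fn decode_contained(bytes: &[u8]) -> Result<Vec<u8>, String> {
    panic::catch_unwind(|| decode_or_panic(bytes))
        .map_err(|_| "decoder panicked on malformed input".to_string())
}
```

The default panic hook still prints the panic message to stderr before the error is returned, so contained panics remain visible in logs.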
These are not fully fixable in-tree: yrs exposes no pre-apply validator, and
the OOM/abort classes are uncatchable. The blast radius is limited (the sync
endpoint is authenticated), but a buggy or hostile authenticated peer can still
crash a daemon. The catch_unwind in merge_body is only a partial mitigation;
durable fixes need upstream yrs work or a bounded decoder. Until then this is
a known limitation, tracked here and reproduced by the crdt_merge target.
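As an illustration of what a bounded decoder could check, the sketch below validates a declared length against the bytes actually present before anything is allocated. It rests on assumptions this page does not confirm: that the payload opens with a LEB128-style unsigned varint, and that each declared element occupies at least one byte. Both function names are hypothetical.

```rust
/// Reads an unsigned LEB128-style varint.
/// Returns the value and the number of bytes consumed, or None if the
/// input is truncated or the varint runs past 10 bytes.
pub fn read_var_u64(input: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in input.iter().enumerate().take(10) {
        // Low 7 bits carry data; the high bit marks continuation.
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// Rejects a payload whose declared element count cannot fit in the
/// bytes that follow, assuming at least one byte per element.
pub fn check_declared_len(input: &[u8]) -> Result<usize, String> {
    let (declared, used) = read_var_u64(input).ok_or("truncated length prefix")?;
    let remaining = (input.len() - used) as u64;
    if declared > remaining {
        return Err(format!(
            "declared {declared} items but only {remaining} bytes remain"
        ));
    }
    Ok(declared as usize)
}
```

Read as such a varint, the four bytes [255, 255, 255, 126] decode to 266338303 with no bytes left over, so a check of this kind would reject them up front. Whether yrs interprets that input this way is not established here.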
Why these targets
The high-value surfaces, ranked when this was set up:
1. crdt::merge_body: decodes untrusted bytes from sync peers; a panic here is a remote-input daemon crash.
2. extract: custom scanning logic layered over pulldown-cmark; promotion rewrites body lines based on its output, so misalignment corrupts bodies.
3. wikilink rewriting: span arithmetic where off-by-ones hide.
4. datespec / quickadd: user-typed input parsed inside the daemon process.
Crashes found in dependencies (yrs, rrule, pulldown-cmark) are still
real hephd crashes: handle them by validating or catching before the call, and
report them upstream.
Related
- v1-prototype-tech-spec — testing strategy
- design — sync/CRDT rationale