diff --git a/CHANGELOG.md b/CHANGELOG.md
index 1900ad9..9799e3f 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -12,28 +12,6 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
-## [v1.4.1] - 2026-06-08
-
-### Bug Fixes
-
-- The `heph` CLI and `heph-tui` now survive a daemon restart. Previously the unix-socket client connected once and never reconnected, so an opt-in self-update or `heph daemon restart` left every subsequent call failing — `heph-tui` would sit on errors until relaunched. The client now reconnects on a dropped socket: a request that never went out is retried transparently, while a reply lost mid-request is surfaced (not silently retried) so a mutation is never double-applied. A long-running TUI self-heals on its next refresh tick.
-- Quick-add popover (⌘'): hand keyboard focus back to the previously active app when it hides, and stop the (now invisible) overlay from intercepting clicks where it used to sit.
-
-
-## [v1.4.0] - 2026-06-08
-
-### Features
-
-- Spoke auth failures now tell you how to recover. When a refresh token is rejected or the hub returns 401, `hephd` records the real cause plus the exact `heph auth login --hub-url … --issuer … --client-id …` command (keyed to this spoke's hub) in its sync health. A new `heph auth status` prints that health and the re-login command, `heph sync --status`'s `last_error` carries it, and `heph-tui`'s status line points at it with a `⚠ auth · heph auth status` chip.
-- `heph daemon start`/`restart` can now bake the daemon's full runtime config into the managed service — `--mode`, `--hub-url`, `--http-addr`, `--oidc-issuer`/`--oidc-audience`/`--oidc-client-id`, and `--self-update-interval-secs` (previously only the bare `--self-update` bool was wired). Regenerating preserves whatever is already baked into the on-disk plist/unit, so a bare `start`/`restart` no longer silently drops spoke/hub or self-update config.
-- heph-tui's sync indicator now shows the last-sync age in seconds under a minute (`⟳ 26s`) instead of a flat `just now`, so the chip reads as a live heartbeat and a missed sync (the loop runs every 30s) shows up as the age climbing.
-
-### Bug Fixes
-
-- hephd no longer reports a rejected OAuth refresh as "identity provider unreachable". A reachable IdP that returns an HTTP error (e.g. `400 invalid_grant` once a refresh token expires/rotates) is now surfaced as a *rejection* — `identity provider rejected the request: HTTP 400 (invalid_grant): …` — with the OAuth error body, distinct from a genuine transport failure. This stops the wording from misdirecting incident response toward the network when the real fix is re-authentication.
-- `heph daemon restart` on macOS no longer intermittently fails with `launchctl bootstrap failed: 5: Input/output error`. The old code bootstrapped immediately after `bootout`, racing launchd's asynchronous teardown; it now waits for the service to fully unload and retries the bootstrap. When the plist is unchanged (e.g. a plain binary upgrade) it uses `launchctl kickstart -k` to restart the loaded job atomically, sidestepping the bootout→bootstrap dance entirely.
-
-
 ## [v1.2.3] - 2026-06-06
 
 ### Features
diff --git a/Cargo.lock b/Cargo.lock
index cc9b3a6..be8f974 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -2237,8 +2237,6 @@ dependencies = [
  "heph-core",
  "hephd",
  "libc",
- "objc2 0.6.4",
- "objc2-app-kit 0.3.2",
  "serde_json",
  "winit",
 ]
diff --git a/crates/heph-quickadd/Cargo.toml b/crates/heph-quickadd/Cargo.toml
index 57bbb98..5b1889b 100644
--- a/crates/heph-quickadd/Cargo.toml
+++ b/crates/heph-quickadd/Cargo.toml
@@ -19,16 +19,7 @@ global-hotkey = "0.8"
 
 # macOS-only: winit for the accessory-mode activation policy (no Dock icon),
 # pinned to the same minor eframe carries so cargo unifies to one winit; libc
-# for getppid() (orphan detection — self-exit when the supervising daemon dies);
-# objc2 + objc2-app-kit to hand keyboard focus back to the previously active app
-# when the popover hides (NSApplication.hide:/unhide:). Pinned to the 0.6/0.3
-# line global-hotkey already pulls in, so cargo unifies to one copy.
+# for getppid() (orphan detection — self-exit when the supervising daemon dies).
 [target.'cfg(target_os = "macos")'.dependencies]
 winit = "0.30"
 libc = "0.2"
-objc2 = "0.6"
-objc2-app-kit = { version = "0.3", default-features = false, features = [
-    "std",
-    "NSApplication",
-    "NSResponder",
-] }
diff --git a/crates/heph-quickadd/src/app.rs b/crates/heph-quickadd/src/app.rs
index a334b22..b08bf03 100644
--- a/crates/heph-quickadd/src/app.rs
+++ b/crates/heph-quickadd/src/app.rs
@@ -226,9 +226,6 @@ impl QuickAdd {
     }
 
     fn show(&mut self, ctx: &egui::Context) {
-        // Undo the app-level hide from the previous `hide()` so we can take focus
-        // again (no-op the first time / off macOS).
-        app_take_focus();
         self.visible = true;
         self.focus_pending = true;
         self.current_hint = random_hint(self.current_hint);
@@ -259,13 +256,6 @@ impl QuickAdd {
             ctx.send_viewport_cmd(egui::ViewportCommand::InnerSize(egui::vec2(WIN_W, BASE_H)));
             self.win_h_applied = BASE_H;
         }
-        // Hand keyboard focus back to the app underneath us. winit's
-        // `Visible(false)` alone leaves *us* the active application, so focus
-        // never returns and the borderless always-on-top overlay can keep eating
-        // clicks where it used to sit. `NSApplication.hide:` orders our windows
-        // fully out and activates the next app in line — exactly the one the user
-        // was in (no-op off macOS).
-        app_yield_focus();
     }
 
     /// Optimistic submit: hide now, create in the background.
@@ -606,39 +596,6 @@ impl QuickAdd {
     }
 }
 
-/// Hide the popover at the *application* level so macOS hands keyboard focus
-/// back to the previously active app. `NSApplication.hide:` orders all our
-/// windows out and activates the next app in line — the one the user was in —
-/// which a plain winit `Visible(false)` does not do. No-op off macOS.
-#[cfg(target_os = "macos")]
-fn app_yield_focus() {
-    use objc2::MainThreadMarker;
-    use objc2_app_kit::NSApplication;
-    // eframe's `update` runs on the main thread, so this marker is always Some.
-    if let Some(mtm) = MainThreadMarker::new() {
-        NSApplication::sharedApplication(mtm).hide(None);
-    }
-}
-
-#[cfg(not(target_os = "macos"))]
-fn app_yield_focus() {}
-
-/// Undo [`app_yield_focus`]: clear the app-level hidden flag before re-showing,
-/// so the window the viewport `Focus` command then makes key actually appears.
-/// (`unhide:` also re-activates us; the per-window `Focus`/`Visible` viewport
-/// commands do the rest.) No-op off macOS.
-#[cfg(target_os = "macos")]
-fn app_take_focus() {
-    use objc2::MainThreadMarker;
-    use objc2_app_kit::NSApplication;
-    if let Some(mtm) = MainThreadMarker::new() {
-        NSApplication::sharedApplication(mtm).unhide(None);
-    }
-}
-
-#[cfg(not(target_os = "macos"))]
-fn app_take_focus() {}
-
 /// The current parent process id, for orphan detection. `None` off macOS (where
 /// hephd does not supervise a helper — there is no Aqua session to inherit).
 fn current_parent_pid() -> Option {
diff --git a/crates/hephd/src/client.rs b/crates/hephd/src/client.rs
index 8a2bd5d..c3c008b 100644
--- a/crates/hephd/src/client.rs
+++ b/crates/hephd/src/client.rs
@@ -2,145 +2,59 @@
 //!
 //! Used by the `heph` CLI and by tests. Surfaces never touch SQLite directly
 //! (tech-spec §3) — they go through the daemon socket, which this wraps.
-//!
-//! The connection self-heals across daemon restarts (opt-in self-update, `heph
-//! daemon restart`): a [`call`](Client::call) that finds the socket dropped
-//! reconnects. It only auto-retries when the request provably never reached the
-//! daemon (a write-side failure); a reply lost *after* sending is surfaced
-//! rather than retried, so a mutation is never silently double-applied.
 
 use std::io::{BufRead, BufReader, Write};
 use std::os::unix::net::UnixStream;
-use std::path::{Path, PathBuf};
+use std::path::Path;
 
-use anyhow::{anyhow, Context, Result};
+use anyhow::{bail, Context, Result};
 use serde_json::{json, Value};
 
 use crate::rpc::Response;
 
 /// A connected client. One request/response per [`call`](Client::call).
 pub struct Client {
-    socket_path: PathBuf,
     reader: BufReader<UnixStream>,
     writer: UnixStream,
     next_id: u64,
 }
 
-/// How a single request/response exchange failed — drives the retry decision.
-enum ExchangeError {
-    /// The request could not be written (broken pipe, reset): it never reached
-    /// the daemon, so retrying on a fresh connection is safe.
-    Send(anyhow::Error),
-    /// The request was sent but no reply came back (the daemon closed mid-flight,
-    /// e.g. it restarted): it may or may not have applied — do not retry.
-    Recv(anyhow::Error),
-    /// A well-formed RPC-level error (or an unparseable reply): the connection is
-    /// fine; nothing to reconnect.
-    Rpc(anyhow::Error),
-}
-
-impl ExchangeError {
-    fn into_inner(self) -> anyhow::Error {
-        match self {
-            ExchangeError::Send(e) | ExchangeError::Recv(e) | ExchangeError::Rpc(e) => e,
-        }
-    }
-}
-
 impl Client {
     /// Connect to a daemon listening at `socket_path`.
     pub fn connect(socket_path: &Path) -> Result<Self> {
-        let (reader, writer) = Self::open(socket_path)?;
+        let stream = UnixStream::connect(socket_path)
+            .with_context(|| format!("connecting to hephd at {}", socket_path.display()))?;
+        let reader = BufReader::new(stream.try_clone()?);
         Ok(Client {
-            socket_path: socket_path.to_path_buf(),
             reader,
-            writer,
+            writer: stream,
             next_id: 1,
         })
     }
 
-    /// Open a fresh reader/writer pair on the socket.
-    fn open(socket_path: &Path) -> Result<(BufReader<UnixStream>, UnixStream)> {
-        let stream = UnixStream::connect(socket_path)
-            .with_context(|| format!("connecting to hephd at {}", socket_path.display()))?;
-        let reader = BufReader::new(stream.try_clone()?);
-        Ok((reader, stream))
-    }
-
-    /// Re-establish the connection (after the daemon restarted and dropped it).
-    fn reconnect(&mut self) -> Result<()> {
-        let (reader, writer) = Self::open(&self.socket_path)?;
-        self.reader = reader;
-        self.writer = writer;
-        Ok(())
-    }
-
     /// Call `method` with `params`, returning the `result` value (or an error
     /// carrying the RPC error's code and message).
-    ///
-    /// If the daemon has restarted and dropped the socket, this reconnects: it
-    /// retries transparently when the request never went out, and otherwise
-    /// reconnects for the next call while surfacing an error for this one (so a
-    /// mutation whose reply was lost is not silently re-applied).
     pub fn call(&mut self, method: &str, params: Value) -> Result<Value> {
         let id = self.next_id;
         self.next_id += 1;
+
         let mut line = serde_json::to_string(&json!({
             "id": id,
             "method": method,
             "params": params,
         }))?;
         line.push('\n');
-
-        match self.exchange(&line) {
-            Ok(v) => Ok(v),
-            Err(ExchangeError::Rpc(e)) => Err(e),
-            Err(ExchangeError::Send(_)) => {
-                // The request never reached the daemon — reconnect and retry once.
-                self.reconnect()
-                    .context("hephd connection lost and reconnect failed")?;
-                self.exchange(&line)
-                    .map_err(ExchangeError::into_inner)
-                    .with_context(|| format!("retrying `{method}` after reconnect"))
-            }
-            Err(ExchangeError::Recv(e)) => {
-                // Sent but no reply: the daemon likely restarted mid-request. Don't
-                // retry (a mutation may have applied); reconnect for next time and
-                // surface this one.
-                let _ = self.reconnect();
-                Err(e).context(
-                    "hephd closed the connection mid-request (it likely restarted); \
-                     reconnected — re-run the action if it didn't take effect",
-                )
-            }
-        }
-    }
-
-    /// One request/response over the current connection, classifying failures.
-    fn exchange(&mut self, line: &str) -> std::result::Result<Value, ExchangeError> {
-        self.writer
-            .write_all(line.as_bytes())
-            .map_err(|e| ExchangeError::Send(e.into()))?;
-        self.writer
-            .flush()
-            .map_err(|e| ExchangeError::Send(e.into()))?;
+        self.writer.write_all(line.as_bytes())?;
+        self.writer.flush()?;
 
         let mut response_line = String::new();
-        let read = self
-            .reader
-            .read_line(&mut response_line)
-            .map_err(|e| ExchangeError::Recv(e.into()))?;
+        let read = self.reader.read_line(&mut response_line)?;
         if read == 0 {
-            return Err(ExchangeError::Recv(anyhow!("hephd closed the connection")));
+            bail!("hephd closed the connection");
         }
-        let response: Response =
-            serde_json::from_str(&response_line).map_err(|e| ExchangeError::Rpc(e.into()))?;
+        let response: Response = serde_json::from_str(&response_line)?;
         if let Some(err) = response.error {
-            return Err(ExchangeError::Rpc(anyhow!(
-                "rpc error {}: {}",
-                err.code,
-                err.message
-            )));
+            bail!("rpc error {}: {}", err.code, err.message);
         }
         Ok(response.result.unwrap_or(Value::Null))
     }
diff --git a/crates/hephd/tests/client_reconnect.rs b/crates/hephd/tests/client_reconnect.rs
deleted file mode 100644
index a4d0074..0000000
--- a/crates/hephd/tests/client_reconnect.rs
+++ /dev/null
@@ -1,96 +0,0 @@
-//! [`Client`] survives the daemon dropping the socket (opt-in self-update, `heph
-//! daemon restart`). A mock daemon serves exactly one request per connection
-//! then closes it, forcing the client to reconnect — without auto-reconnect,
-//! every call after the first would fail forever.
-
-use std::io::{BufRead, BufReader, Write};
-use std::os::unix::net::UnixListener;
-use std::path::PathBuf;
-use std::sync::atomic::{AtomicUsize, Ordering};
-use std::sync::Arc;
-use std::thread;
-use std::time::Duration;
-
-use hephd::Client;
-use serde_json::{json, Value};
-
-/// A mock daemon that handles ONE request per connection then closes it, looping
-/// to accept the next connection. `served` counts total requests answered.
-fn spawn_one_shot_daemon(socket: PathBuf, served: Arc<AtomicUsize>) {
-    thread::spawn(move || {
-        let listener = UnixListener::bind(&socket).unwrap();
-        for conn in listener.incoming() {
-            let Ok(mut stream) = conn else { continue };
-            let mut reader = BufReader::new(stream.try_clone().unwrap());
-            let mut line = String::new();
-            if reader.read_line(&mut line).unwrap_or(0) == 0 {
-                continue; // client opened then went away; wait for the next one
-            }
-            let req: Value = serde_json::from_str(&line).unwrap();
-            let n = served.fetch_add(1, Ordering::SeqCst) + 1;
-            let mut out = serde_json::to_string(&json!({
-                "id": req["id"],
-                "result": { "served": n },
-            }))
-            .unwrap();
-            out.push('\n');
-            let _ = stream.write_all(out.as_bytes());
-            let _ = stream.flush();
-            // `stream` drops here → the connection closes after one request.
-        }
-    });
-}
-
-fn wait_for(socket: &std::path::Path) {
-    for _ in 0..400 {
-        if socket.exists() {
-            return;
-        }
-        thread::sleep(Duration::from_millis(5));
-    }
-    panic!("mock daemon socket never appeared");
-}
-
-#[test]
-fn client_reconnects_after_the_daemon_drops_the_socket() {
-    let dir = tempfile::tempdir().unwrap();
-    let socket = dir.path().join("d.sock");
-    let served = Arc::new(AtomicUsize::new(0));
-    spawn_one_shot_daemon(socket.clone(), served.clone());
-    wait_for(&socket);
-
-    let mut c = Client::connect(&socket).unwrap();
-
-    // First call works on the initial connection.
-    let r1 = c.call("ping", json!({})).unwrap();
-    assert_eq!(r1["served"], 1);
-
-    // The daemon has now closed that connection. With reconnect, the client
-    // recovers within a call or two (depending on whether the dead socket fails
-    // on write or on read); without it, every further call would fail forever.
-    let mut recovered = None;
-    for _ in 0..2 {
-        if let Ok(v) = c.call("ping", json!({})) {
-            recovered = Some(v);
-            break;
-        }
-    }
-    let r = recovered.expect("client should reconnect after the socket was dropped");
-    // The recovered call was served exactly once on the new connection — no
-    // double-serve from a spurious retry.
-    assert_eq!(r["served"], 2);
-    assert_eq!(served.load(Ordering::SeqCst), 2);
-
-    // And it keeps working across subsequent drops.
-    let r3 = {
-        let mut got = None;
-        for _ in 0..2 {
-            if let Ok(v) = c.call("ping", json!({})) {
-                got = Some(v);
-                break;
-            }
-        }
-        got.expect("client should keep reconnecting")
-    };
-    assert_eq!(r3["served"], 3);
-}
diff --git a/docs/changelog.d/+sync-age-seconds.feature.md b/docs/changelog.d/+sync-age-seconds.feature.md
new file mode 100644
index 0000000..cf453c2
--- /dev/null
+++ b/docs/changelog.d/+sync-age-seconds.feature.md
@@ -0,0 +1 @@
+heph-tui's sync indicator now shows the last-sync age in seconds under a minute (`⟳ 26s`) instead of a flat `just now`, so the chip reads as a live heartbeat and a missed sync (the loop runs every 30s) shows up as the age climbing.
diff --git a/docs/changelog.d/auth-error-clarity.bugfix.md b/docs/changelog.d/auth-error-clarity.bugfix.md
new file mode 100644
index 0000000..83ba854
--- /dev/null
+++ b/docs/changelog.d/auth-error-clarity.bugfix.md
@@ -0,0 +1 @@
+hephd no longer reports a rejected OAuth refresh as "identity provider unreachable". A reachable IdP that returns an HTTP error (e.g. `400 invalid_grant` once a refresh token expires/rotates) is now surfaced as a *rejection* — `identity provider rejected the request: HTTP 400 (invalid_grant): …` — with the OAuth error body, distinct from a genuine transport failure. This stops the wording from misdirecting incident response toward the network when the real fix is re-authentication.
diff --git a/docs/changelog.d/auth-error-clarity.feature.md b/docs/changelog.d/auth-error-clarity.feature.md
new file mode 100644
index 0000000..ab67867
--- /dev/null
+++ b/docs/changelog.d/auth-error-clarity.feature.md
@@ -0,0 +1 @@
+Spoke auth failures now tell you how to recover. When a refresh token is rejected or the hub returns 401, `hephd` records the real cause plus the exact `heph auth login --hub-url … --issuer … --client-id …` command (keyed to this spoke's hub) in its sync health. A new `heph auth status` prints that health and the re-login command, `heph sync --status`'s `last_error` carries it, and `heph-tui`'s status line points at it with a `⚠ auth · heph auth status` chip.
diff --git a/docs/changelog.d/daemon-restart-race.bugfix.md b/docs/changelog.d/daemon-restart-race.bugfix.md
new file mode 100644
index 0000000..c13a257
--- /dev/null
+++ b/docs/changelog.d/daemon-restart-race.bugfix.md
@@ -0,0 +1 @@
+`heph daemon restart` on macOS no longer intermittently fails with `launchctl bootstrap failed: 5: Input/output error`. The old code bootstrapped immediately after `bootout`, racing launchd's asynchronous teardown; it now waits for the service to fully unload and retries the bootstrap. When the plist is unchanged (e.g. a plain binary upgrade) it uses `launchctl kickstart -k` to restart the loaded job atomically, sidestepping the bootout→bootstrap dance entirely.
diff --git a/docs/changelog.d/daemon-self-update-interval.feature.md b/docs/changelog.d/daemon-self-update-interval.feature.md
new file mode 100644
index 0000000..b5ec9b8
--- /dev/null
+++ b/docs/changelog.d/daemon-self-update-interval.feature.md
@@ -0,0 +1 @@
+`heph daemon start`/`restart` can now bake the daemon's full runtime config into the managed service — `--mode`, `--hub-url`, `--http-addr`, `--oidc-issuer`/`--oidc-audience`/`--oidc-client-id`, and `--self-update-interval-secs` (previously only the bare `--self-update` bool was wired). Regenerating preserves whatever is already baked into the on-disk plist/unit, so a bare `start`/`restart` no longer silently drops spoke/hub or self-update config.
diff --git a/docs/how-to/run-the-daemon.md b/docs/how-to/run-the-daemon.md
index 545b3be..cb9e56d 100644
--- a/docs/how-to/run-the-daemon.md
+++ b/docs/how-to/run-the-daemon.md
@@ -86,14 +86,6 @@ still the old binary until you restart it:
 heph daemon restart
 ```
 
-A restart (or an opt-in self-update) drops the daemon's unix socket out from
-under any connected surface. The CLI and `heph-tui` **reconnect automatically**:
-a read transparently retries on a fresh connection, and a long-running TUI
-self-heals on its next tick — so a daemon restart no longer leaves the agenda
-view stuck on errors. (A mutating action whose reply is lost mid-restart reports
-"reconnected — re-run the action if it didn't take effect" rather than risk
-applying twice.)
-
 ## Self-update (opt-in)
 
 `hephd` can keep itself current: `heph daemon start --self-update` generates a