diff --git a/CHANGELOG.md b/CHANGELOG.md
index 9799e3f..1900ad9 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -12,6 +12,28 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
 
 <!-- towncrier release notes start -->
 
+## [v1.4.1] - 2026-06-08
+
+### Bug Fixes
+
+- The `heph` CLI and `heph-tui` now survive a daemon restart. Previously the unix-socket client connected once and never reconnected, so an opt-in self-update or `heph daemon restart` left every subsequent call failing — `heph-tui` would sit on errors until relaunched. The client now reconnects on a dropped socket: a request that never went out is retried transparently, while a reply lost mid-request is surfaced (not silently retried) so a mutation is never double-applied. A long-running TUI self-heals on its next refresh tick.
+- Quick-add popover (⌘'): hand keyboard focus back to the previously active app when it hides, and stop the (now invisible) overlay from intercepting clicks where it used to sit.
+
+
+## [v1.4.0] - 2026-06-08
+
+### Features
+
+- Spoke auth failures now tell you how to recover. When a refresh token is rejected or the hub returns 401, `hephd` records the real cause plus the exact `heph auth login --hub-url … --issuer … --client-id …` command (keyed to this spoke's hub) in its sync health. A new `heph auth status` prints that health and the re-login command, `heph sync --status`'s `last_error` carries it, and `heph-tui`'s status line points at it with a `⚠ auth · heph auth status` chip.
+- `heph daemon start`/`restart` can now bake the daemon's full runtime config into the managed service — `--mode`, `--hub-url`, `--http-addr`, `--oidc-issuer`/`--oidc-audience`/`--oidc-client-id`, and `--self-update-interval-secs` (previously only the bare `--self-update` bool was wired). Regenerating preserves whatever is already baked into the on-disk plist/unit, so a bare `start`/`restart` no longer silently drops spoke/hub or self-update config.
+- `heph-tui`'s sync indicator now shows the last-sync age in seconds under a minute (`⟳ 26s`) instead of a flat `just now`, so the chip reads as a live heartbeat and a missed sync (the loop runs every 30s) shows up as the age climbing.
+
+### Bug Fixes
+
+- `hephd` no longer reports a rejected OAuth refresh as "identity provider unreachable". A reachable IdP that returns an HTTP error (e.g. `400 invalid_grant` once a refresh token expires or is rotated) is now surfaced as a *rejection* (`identity provider rejected the request: HTTP 400 (invalid_grant): …`) with the OAuth error body, distinct from a genuine transport failure. This stops the wording from misdirecting incident response toward the network when the real fix is re-authentication.
+- `heph daemon restart` on macOS no longer intermittently fails with `launchctl bootstrap failed: 5: Input/output error`. The old code bootstrapped immediately after `bootout`, racing launchd's asynchronous teardown; it now waits for the service to fully unload and retries the bootstrap. When the plist is unchanged (e.g. a plain binary upgrade) it uses `launchctl kickstart -k` to restart the loaded job atomically, sidestepping the bootout→bootstrap dance entirely.
+
+
 ## [v1.2.3] - 2026-06-06
 
 ### Features
diff --git a/Cargo.lock b/Cargo.lock
index be8f974..cc9b3a6 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -2237,6 +2237,8 @@ dependencies = [
  "heph-core",
  "hephd",
  "libc",
+ "objc2 0.6.4",
+ "objc2-app-kit 0.3.2",
  "serde_json",
  "winit",
 ]
diff --git a/crates/heph-quickadd/Cargo.toml b/crates/heph-quickadd/Cargo.toml
index 5b1889b..57bbb98 100644
--- a/crates/heph-quickadd/Cargo.toml
+++ b/crates/heph-quickadd/Cargo.toml
@@ -19,7 +19,16 @@ global-hotkey = "0.8"
 
 # macOS-only: winit for the accessory-mode activation policy (no Dock icon),
 # pinned to the same minor eframe carries so cargo unifies to one winit; libc
-# for getppid() (orphan detection — self-exit when the supervising daemon dies).
+# for getppid() (orphan detection — self-exit when the supervising daemon dies);
+# objc2 + objc2-app-kit to hand keyboard focus back to the previously active app
+# when the popover hides (NSApplication.hide:/unhide:). Pinned to the 0.6/0.3
+# line global-hotkey already pulls in, so cargo unifies to one copy.
 [target.'cfg(target_os = "macos")'.dependencies]
 winit = "0.30"
 libc = "0.2"
+objc2 = "0.6"
+objc2-app-kit = { version = "0.3", default-features = false, features = [
+  "std",
+  "NSApplication",
+  "NSResponder",
+] }
diff --git a/crates/heph-quickadd/src/app.rs b/crates/heph-quickadd/src/app.rs
index b08bf03..a334b22 100644
--- a/crates/heph-quickadd/src/app.rs
+++ b/crates/heph-quickadd/src/app.rs
@@ -226,6 +226,9 @@ impl QuickAdd {
     }
 
     fn show(&mut self, ctx: &egui::Context) {
+        // Undo the app-level hide from the previous `hide()` so we can take focus
+        // again (no-op the first time / off macOS).
+        app_take_focus();
         self.visible = true;
         self.focus_pending = true;
         self.current_hint = random_hint(self.current_hint);
@@ -256,6 +259,13 @@ impl QuickAdd {
             ctx.send_viewport_cmd(egui::ViewportCommand::InnerSize(egui::vec2(WIN_W, BASE_H)));
             self.win_h_applied = BASE_H;
         }
+        // Hand keyboard focus back to the app underneath us. winit's
+        // `Visible(false)` alone leaves *us* the active application, so focus
+        // never returns and the borderless always-on-top overlay can keep eating
+        // clicks where it used to sit. `NSApplication.hide:` orders our windows
+        // fully out and activates the next app in line — exactly the one the user
+        // was in (no-op off macOS).
+        app_yield_focus();
     }
 
     /// Optimistic submit: hide now, create in the background.
@@ -596,6 +606,39 @@ impl QuickAdd {
     }
 }
 
+/// Hide the popover at the *application* level so macOS hands keyboard focus
+/// back to the previously active app. `NSApplication.hide:` orders all our
+/// windows out and activates the next app in line — the one the user was in —
+/// which a plain winit `Visible(false)` does not do. No-op off macOS.
+#[cfg(target_os = "macos")]
+fn app_yield_focus() {
+    use objc2::MainThreadMarker;
+    use objc2_app_kit::NSApplication;
+    // eframe's `update` runs on the main thread, so this marker is always Some.
+    if let Some(mtm) = MainThreadMarker::new() {
+        NSApplication::sharedApplication(mtm).hide(None);
+    }
+}
+
+#[cfg(not(target_os = "macos"))]
+fn app_yield_focus() {}
+
+/// Undo [`app_yield_focus`]: clear the app-level hidden flag before re-showing,
+/// so the window the viewport `Focus` command then makes key actually appears.
+/// (`unhide:` also re-activates us; the per-window `Focus`/`Visible` viewport
+/// commands do the rest.) No-op off macOS.
+#[cfg(target_os = "macos")]
+fn app_take_focus() {
+    use objc2::MainThreadMarker;
+    use objc2_app_kit::NSApplication;
+    if let Some(mtm) = MainThreadMarker::new() {
+        NSApplication::sharedApplication(mtm).unhide(None);
+    }
+}
+
+#[cfg(not(target_os = "macos"))]
+fn app_take_focus() {}
+
 /// The current parent process id, for orphan detection. `None` off macOS (where
 /// hephd does not supervise a helper — there is no Aqua session to inherit).
 fn current_parent_pid() -> Option<i32> {
diff --git a/crates/heph-tui/src/ui.rs b/crates/heph-tui/src/ui.rs
index 6e15453..f6d2f37 100644
--- a/crates/heph-tui/src/ui.rs
+++ b/crates/heph-tui/src/ui.rs
@@ -570,7 +570,9 @@ fn sync_indicator(sync: &SyncStatus, now: i64) -> Vec<Span<'static>> {
 
     let health = sync.health.clone().unwrap_or_default();
     let mut spans = vec![if health.auth_failure {
-        Span::styled("⚠ auth", red)
+        // Point at the recovery command — `heph auth status` prints the exact
+        // `heph auth login …` to run (the full command is too long for the bar).
+        Span::styled("⚠ auth · heph auth status", red)
     } else if let Some(ts) = health.last_success_ms {
         Span::styled(format!("⟳ {}", fmt_age(now, ts)), dim)
     } else if health.last_error.is_some() {
@@ -639,7 +641,7 @@ mod tests {
             },
             0,
         );
-        assert_eq!(render(&auth, NOW), "⚠ auth");
+        assert_eq!(render(&auth, NOW), "⚠ auth · heph auth status");
 
         // Errored with no prior success → offline.
         let offline = spoke(
diff --git a/crates/heph/src/main.rs b/crates/heph/src/main.rs
index c327f1d..28d3b5e 100644
--- a/crates/heph/src/main.rs
+++ b/crates/heph/src/main.rs
@@ -344,7 +344,7 @@ enum ConflictAction {
     },
 }
 
-#[derive(Subcommand, Debug)]
+#[derive(Subcommand, Debug, Clone)]
 enum AuthAction {
     /// Log in via the device-code flow; caches the bearer token for hub sync.
     Login {
@@ -367,6 +367,9 @@ enum AuthAction {
         #[arg(long)]
         hub_url: String,
     },
+    /// Show this spoke's auth health and, if re-auth is needed, the exact
+    /// `heph auth login` command to run. Queries the daemon.
+    Status,
 }
 
 /// Run the device-code flow (or clear a token) — no daemon needed.
@@ -396,10 +399,63 @@ fn run_auth(action: AuthAction) -> Result<()> {
             KeyringTokenStore::new(hub_url.as_str()).clear()?;
             println!("Logged out of {hub_url}.");
         }
+        AuthAction::Status => unreachable!("auth status is handled via the daemon"),
     }
     Ok(())
 }
 
+/// Render `heph auth status` from a `sync.status` RPC response: hub/issuer/client
+/// id, whether auth is healthy or needs re-login, and — when it does — the exact
+/// command to run (built daemon-side, keyed under the right hub URL).
+fn print_auth_status(status: &Value) {
+    let Some(hub) = status.get("hub_url").and_then(Value::as_str) else {
+        println!("This instance is standalone (no hub configured); auth does not apply.");
+        return;
+    };
+    let auth = status.get("auth");
+    let issuer = auth.and_then(|a| a.get("issuer")).and_then(Value::as_str);
+    let client_id = auth
+        .and_then(|a| a.get("client_id"))
+        .and_then(Value::as_str);
+    let health = status.get("health");
+    let auth_failure = health
+        .and_then(|h| h.get("auth_failure"))
+        .and_then(Value::as_bool)
+        .unwrap_or(false);
+    let last_error = health
+        .and_then(|h| h.get("last_error"))
+        .and_then(Value::as_str);
+    let last_success = health
+        .and_then(|h| h.get("last_success_ms"))
+        .and_then(Value::as_i64);
+
+    println!("hub        : {hub}");
+    if let Some(iss) = issuer {
+        println!("issuer     : {iss}");
+    }
+    if let Some(cid) = client_id {
+        println!("client id  : {cid}");
+    }
+    println!(
+        "auth       : {}",
+        if auth_failure {
+            "FAILED — re-authentication required"
+        } else if last_success.is_some() {
+            "ok"
+        } else {
+            "unknown (no successful sync yet)"
+        }
+    );
+    if let Some(err) = last_error {
+        println!("last error : {err}");
+    }
+    if auth_failure {
+        if let Some(cmd) = status.get("reauth_command").and_then(Value::as_str) {
+            println!("\nTo re-authenticate, run:\n  {cmd}");
+        }
+    }
+}
+
 fn main() -> Result<()> {
     let cli = Cli::parse();
 
@@ -407,9 +463,13 @@ fn main() -> Result<()> {
     if let Command::Daemon { action } = &cli.command {
         return service::run(action);
     }
-    // `auth` runs locally (device-code flow + keyring); it needs no daemon.
-    if let Command::Auth { action } = cli.command {
-        return run_auth(action);
+    // `auth login`/`logout` run locally (device-code flow + keyring); they need
+    // no daemon. `auth status` reads live sync health, so it falls through to the
+    // connected path below.
+    if let Command::Auth { action } = &cli.command {
+        if !matches!(action, AuthAction::Status) {
+            return run_auth(action.clone());
+        }
     }
 
     let socket = cli.socket.unwrap_or_else(default_socket_path);
@@ -790,7 +850,13 @@ fn main() -> Result<()> {
             let n = result.as_u64().unwrap_or(0);
             println!("Rewrote legacy [[Name]] links to [[id]] in {n} node(s).");
         }
-        Command::Auth { .. } => unreachable!("auth is handled before connecting"),
+        Command::Auth {
+            action: AuthAction::Status,
+        } => {
+            let result = client.call("sync.status", json!({}))?;
+            print_auth_status(&result);
+        }
+        Command::Auth { .. } => unreachable!("auth login/logout handled before connecting"),
         Command::Daemon { .. } => unreachable!("daemon is handled before connecting"),
     }
     Ok(())
diff --git a/crates/heph/src/service.rs b/crates/heph/src/service.rs
index 1c90924..0b8928b 100644
--- a/crates/heph/src/service.rs
+++ b/crates/heph/src/service.rs
@@ -13,6 +13,7 @@
 
 use std::path::{Path, PathBuf};
 use std::process::Command;
+use std::time::{Duration, Instant};
 
 use anyhow::{bail, Context, Result};
 use clap::{Args, Subcommand};
@@ -494,6 +495,51 @@ fn launchd_loaded(domain_target: &str) -> bool {
         .unwrap_or(false)
 }
 
+/// Block until `target` is no longer loaded, up to `timeout`. `launchctl bootout`
+/// is asynchronous in effect — it requests teardown and returns, but launchd may
+/// still be killing/reaping the job and removing its label from the domain.
+/// Bootstrapping while the label lingers fails with a generic `5: Input/output
+/// error`, so we wait for the label to actually disappear before re-bootstrapping.
+fn wait_until_unloaded(target: &str, timeout: Duration) {
+    let start = Instant::now();
+    while launchd_loaded(target) {
+        if start.elapsed() >= timeout {
+            break; // fall through; bootstrap's own retry covers the residual window
+        }
+        std::thread::sleep(Duration::from_millis(100));
+    }
+}
+
+/// Bootstrap the service, retrying briefly. Even once the old instance is gone,
+/// launchd can momentarily return EIO while the domain settles, so up to five
+/// attempts 200 ms apart make `start`/`restart` reliable rather than flaky.
+fn launchd_bootstrap(domain: &str, plist: &str) -> Result<()> {
+    let mut last = String::new();
+    for attempt in 0..5 {
+        if attempt > 0 {
+            std::thread::sleep(Duration::from_millis(200));
+        }
+        let (ok, err) = run_cmd("launchctl", &["bootstrap", domain, plist])?;
+        if ok {
+            return Ok(());
+        }
+        last = err;
+    }
+    bail!("launchctl bootstrap failed: {}", last.trim());
+}
+
+/// Restart an already-loaded job in place: `kickstart` starts the service and
+/// `-k` kills the running instance first. This restarts the *loaded* job
+/// definition, so it does not pick up an edited plist; callers use it only when
+/// the on-disk plist is unchanged, where it avoids the bootout→bootstrap race.
+fn launchd_kickstart(target: &str) -> Result<()> {
+    let (ok, err) = run_cmd("launchctl", &["kickstart", "-k", target])?;
+    if !ok {
+        bail!("launchctl kickstart failed: {}", err.trim());
+    }
+    Ok(())
+}
+
 fn launchd(action: &DaemonAction, p: &Paths) -> Result<()> {
     let plist = launchd_plist_path()?;
     let uid = uid()?;
@@ -512,10 +558,7 @@ fn launchd(action: &DaemonAction, p: &Paths) -> Result<()> {
             if launchd_loaded(&target) {
                 println!("heph daemon already running ({LABEL}).");
             } else {
-                let (ok, err) = run_cmd("launchctl", &["bootstrap", &domain, &plist_str(&plist)?])?;
-                if !ok {
-                    bail!("launchctl bootstrap failed: {}", err.trim());
-                }
+                launchd_bootstrap(&domain, &plist_str(&plist)?)?;
                 println!("heph daemon started ({LABEL}).");
             }
         }
@@ -527,14 +570,24 @@ fn launchd(action: &DaemonAction, p: &Paths) -> Result<()> {
             let cfg = args
                 .to_config()
                 .fill_from(existing_config(&plist, &Manager::Launchd));
-            write_if_changed(
+            let changed = write_if_changed(
                 &plist,
                 &launchd_plist(&p.hephd, &p.db, &p.socket, &p.log, &cfg),
             )?;
-            let _ = run_cmd("launchctl", &["bootout", &target])?;
-            let (ok, err) = run_cmd("launchctl", &["bootstrap", &domain, &plist_str(&plist)?])?;
-            if !ok {
-                bail!("launchctl bootstrap failed: {}", err.trim());
+            if !launchd_loaded(&target) {
+                // Not currently loaded — nothing to tear down, just bring it up.
+                launchd_bootstrap(&domain, &plist_str(&plist)?)?;
+            } else if changed {
+                // The plist changed, so launchd must re-read it: a full reload is
+                // required. bootout is async, so wait for the label to clear
+                // before bootstrapping (and bootstrap retries the residual EIO).
+                let _ = run_cmd("launchctl", &["bootout", &target])?;
+                wait_until_unloaded(&target, Duration::from_secs(5));
+                launchd_bootstrap(&domain, &plist_str(&plist)?)?;
+            } else {
+                // Same definition (e.g. binary upgraded in place) — restart the
+                // loaded job atomically, sidestepping the bootout→bootstrap race.
+                launchd_kickstart(&target)?;
             }
             println!("heph daemon restarted ({LABEL}).");
         }
diff --git a/crates/hephd/src/auth.rs b/crates/hephd/src/auth.rs
index c601d90..6b80e95 100644
--- a/crates/hephd/src/auth.rs
+++ b/crates/hephd/src/auth.rs
@@ -38,9 +38,45 @@ pub enum AuthError {
     /// The token was present but failed validation.
     #[error("invalid token: {0}")]
     Invalid(String),
-    /// The identity provider could not be reached to fetch keys.
+    /// The identity provider could not be reached at all (DNS, TLS, connection
+    /// refused, timeout) — a transport failure, distinct from a rejection.
     #[error("identity provider unreachable: {0}")]
-    Provider(String),
+    Unreachable(String),
+    /// The identity provider *was* reached but returned an HTTP error response —
+    /// e.g. `400 invalid_grant` on a refresh, meaning the token was rejected
+    /// (expired/rotated/session-invalidated), not that the IdP was down. The
+    /// distinction matters: "unreachable" sends debugging toward the network;
+    /// this points at the token/authorization.
+    #[error("identity provider rejected the request: {0}")]
+    Rejected(String),
+    /// Some other failure in the auth path that is neither a transport failure
+    /// nor an HTTP rejection — a malformed/unparseable IdP response, or a local
+    /// credential-store (keyring) error. Kept distinct so neither is mislabeled
+    /// as "unreachable".
+    #[error("auth error: {0}")]
+    Other(String),
+}
+
+impl AuthError {
+    /// Build an [`AuthError::Rejected`] from an HTTP status and the OAuth error
+    /// body (RFC 6749 §5.2), e.g. `HTTP 400 (invalid_grant): Token is expired`.
+    pub fn rejected(status: u16, error: Option<&str>, description: Option<&str>) -> AuthError {
+        let mut msg = format!("HTTP {status}");
+        if let Some(e) = error.filter(|e| !e.is_empty()) {
+            msg.push_str(&format!(" ({e})"));
+        }
+        if let Some(d) = description.filter(|d| !d.is_empty()) {
+            msg.push_str(&format!(": {d}"));
+        }
+        AuthError::Rejected(msg)
+    }
+
+    /// Whether this is an authorization-level rejection (the IdP refused the
+    /// grant) rather than a transport failure — i.e. re-authentication is the
+    /// likely fix, not network troubleshooting.
+    pub fn is_rejection(&self) -> bool {
+        matches!(self, AuthError::Rejected(_))
+    }
 }
 
 /// Verifies a bearer token and returns its [`Claims`]. A trait so the hub can be
@@ -92,16 +128,13 @@ impl OidcVerifier {
             .http
             .get(url)
             .call()
-            .map_err(|e| AuthError::Provider(e.to_string()))?;
+            .map_err(|e| AuthError::Unreachable(e.to_string()))?;
         if !resp.status().is_success() {
-            return Err(AuthError::Provider(format!(
-                "{url} returned {}",
-                resp.status()
-            )));
+            return Err(AuthError::rejected(resp.status().as_u16(), None, None));
         }
         resp.body_mut()
             .read_json()
-            .map_err(|e| AuthError::Provider(e.to_string()))
+            .map_err(|e| AuthError::Other(e.to_string()))
     }
 
     /// Resolve the JWKS URI from the provider's discovery document.
@@ -169,3 +202,38 @@ impl TokenVerifier for OidcVerifier {
         Some((&self.issuer, &self.audience))
     }
 }
+
+#[cfg(test)]
+mod tests {
+    use super::AuthError;
+
+    #[test]
+    fn rejected_formats_status_error_and_description() {
+        let e = AuthError::rejected(400, Some("invalid_grant"), Some("Token is not active"));
+        assert!(e.is_rejection());
+        assert_eq!(
+            e.to_string(),
+            "identity provider rejected the request: HTTP 400 (invalid_grant): Token is not active"
+        );
+    }
+
+    #[test]
+    fn rejected_omits_absent_or_empty_oauth_fields() {
+        // No OAuth body (e.g. a bare 503) → just the status.
+        assert_eq!(
+            AuthError::rejected(503, None, None).to_string(),
+            "identity provider rejected the request: HTTP 503"
+        );
+        // Empty strings are treated as absent, not rendered as "()" / ": ".
+        assert_eq!(
+            AuthError::rejected(400, Some(""), Some("")).to_string(),
+            "identity provider rejected the request: HTTP 400"
+        );
+    }
+
+    #[test]
+    fn unreachable_and_other_are_not_rejections() {
+        assert!(!AuthError::Unreachable("connection refused".into()).is_rejection());
+        assert!(!AuthError::Other("keyring locked".into()).is_rejection());
+    }
+}
diff --git a/crates/hephd/src/client.rs b/crates/hephd/src/client.rs
index c3c008b..8a2bd5d 100644
--- a/crates/hephd/src/client.rs
+++ b/crates/hephd/src/client.rs
@@ -2,59 +2,145 @@
 //!
 //! Used by the `heph` CLI and by tests. Surfaces never touch SQLite directly
 //! (tech-spec §3) — they go through the daemon socket, which this wraps.
+//!
+//! The connection self-heals across daemon restarts (opt-in self-update, `heph
+//! daemon restart`): a [`call`](Client::call) that finds the socket dropped
+//! reconnects. It only auto-retries when the request provably never reached the
+//! daemon (a write-side failure); a reply lost *after* sending is surfaced
+//! rather than retried, so a mutation is never silently double-applied.
 
 use std::io::{BufRead, BufReader, Write};
 use std::os::unix::net::UnixStream;
-use std::path::Path;
+use std::path::{Path, PathBuf};
 
-use anyhow::{bail, Context, Result};
+use anyhow::{anyhow, Context, Result};
 use serde_json::{json, Value};
 
 use crate::rpc::Response;
 
 /// A connected client. One request/response per [`call`](Client::call).
 pub struct Client {
+    socket_path: PathBuf,
     reader: BufReader<UnixStream>,
     writer: UnixStream,
     next_id: u64,
 }
 
+/// How a single request/response exchange failed — drives the retry decision.
+enum ExchangeError {
+    /// The request could not be written (broken pipe, reset): it never reached
+    /// the daemon, so retrying on a fresh connection is safe.
+    Send(anyhow::Error),
+    /// The request was sent but no reply came back (the daemon closed mid-flight,
+    /// e.g. it restarted): it may or may not have applied — do not retry.
+    Recv(anyhow::Error),
+    /// A well-formed RPC-level error (or an unparseable reply): the connection is
+    /// fine; nothing to reconnect.
+    Rpc(anyhow::Error),
+}
+
+impl ExchangeError {
+    fn into_inner(self) -> anyhow::Error {
+        match self {
+            ExchangeError::Send(e) | ExchangeError::Recv(e) | ExchangeError::Rpc(e) => e,
+        }
+    }
+}
+
 impl Client {
     /// Connect to a daemon listening at `socket_path`.
     pub fn connect(socket_path: &Path) -> Result<Client> {
-        let stream = UnixStream::connect(socket_path)
-            .with_context(|| format!("connecting to hephd at {}", socket_path.display()))?;
-        let reader = BufReader::new(stream.try_clone()?);
+        let (reader, writer) = Self::open(socket_path)?;
         Ok(Client {
+            socket_path: socket_path.to_path_buf(),
             reader,
-            writer: stream,
+            writer,
             next_id: 1,
         })
     }
 
+    /// Open a fresh reader/writer pair on the socket.
+    fn open(socket_path: &Path) -> Result<(BufReader<UnixStream>, UnixStream)> {
+        let stream = UnixStream::connect(socket_path)
+            .with_context(|| format!("connecting to hephd at {}", socket_path.display()))?;
+        let reader = BufReader::new(stream.try_clone()?);
+        Ok((reader, stream))
+    }
+
+    /// Re-establish the connection (after the daemon restarted and dropped it).
+    fn reconnect(&mut self) -> Result<()> {
+        let (reader, writer) = Self::open(&self.socket_path)?;
+        self.reader = reader;
+        self.writer = writer;
+        Ok(())
+    }
+
     /// Call `method` with `params`, returning the `result` value (or an error
     /// carrying the RPC error's code and message).
+    ///
+    /// If the daemon has restarted and dropped the socket, this reconnects: it
+    /// retries transparently when the request never went out, and otherwise
+    /// reconnects for the next call while surfacing an error for this one (so a
+    /// mutation whose reply was lost is not silently re-applied).
     pub fn call(&mut self, method: &str, params: Value) -> Result<Value> {
         let id = self.next_id;
         self.next_id += 1;
-
         let mut line = serde_json::to_string(&json!({
             "id": id,
             "method": method,
             "params": params,
         }))?;
         line.push('\n');
-        self.writer.write_all(line.as_bytes())?;
-        self.writer.flush()?;
+
+        match self.exchange(&line) {
+            Ok(v) => Ok(v),
+            Err(ExchangeError::Rpc(e)) => Err(e),
+            Err(ExchangeError::Send(_)) => {
+                // The request never reached the daemon — reconnect and retry once.
+                self.reconnect()
+                    .context("hephd connection lost and reconnect failed")?;
+                self.exchange(&line)
+                    .map_err(ExchangeError::into_inner)
+                    .with_context(|| format!("retrying `{method}` after reconnect"))
+            }
+            Err(ExchangeError::Recv(e)) => {
+                // Sent but no reply: the daemon likely restarted mid-request. Don't
+                // retry (a mutation may have applied); reconnect for next time and
+                // surface this one.
+                let _ = self.reconnect();
+                Err(e).context(
+                    "hephd closed the connection mid-request (it likely restarted); \
+                     reconnected — re-run the action if it didn't take effect",
+                )
+            }
+        }
+    }
+
+    /// One request/response over the current connection, classifying failures.
+    fn exchange(&mut self, line: &str) -> std::result::Result<Value, ExchangeError> {
+        self.writer
+            .write_all(line.as_bytes())
+            .map_err(|e| ExchangeError::Send(e.into()))?;
+        self.writer
+            .flush()
+            .map_err(|e| ExchangeError::Send(e.into()))?;
 
         let mut response_line = String::new();
-        let read = self.reader.read_line(&mut response_line)?;
+        let read = self
+            .reader
+            .read_line(&mut response_line)
+            .map_err(|e| ExchangeError::Recv(e.into()))?;
         if read == 0 {
-            bail!("hephd closed the connection");
+            return Err(ExchangeError::Recv(anyhow!("hephd closed the connection")));
         }
-        let response: Response = serde_json::from_str(&response_line)?;
+        let response: Response =
+            serde_json::from_str(&response_line).map_err(|e| ExchangeError::Rpc(e.into()))?;
         if let Some(err) = response.error {
-            bail!("rpc error {}: {}", err.code, err.message);
+            return Err(ExchangeError::Rpc(anyhow!(
+                "rpc error {}: {}",
+                err.code,
+                err.message
+            )));
         }
         Ok(response.result.unwrap_or(Value::Null))
     }
diff --git a/crates/hephd/src/oauth.rs b/crates/hephd/src/oauth.rs
index 53ee5f0..4af704f 100644
--- a/crates/hephd/src/oauth.rs
+++ b/crates/hephd/src/oauth.rs
@@ -109,7 +109,7 @@ impl KeyringTokenStore {
             }
         });
         keyring_core::Entry::new(&self.service, &self.account)
-            .map_err(|e| AuthError::Provider(e.to_string()))
+            .map_err(|e| AuthError::Other(e.to_string()))
     }
 }
 
@@ -119,16 +119,16 @@ impl TokenStore for KeyringTokenStore {
         serde_json::from_str(&secret).ok()
     }
     fn save(&self, token: &StoredToken) -> Result<(), AuthError> {
-        let json = serde_json::to_string(token).map_err(|e| AuthError::Provider(e.to_string()))?;
+        let json = serde_json::to_string(token).map_err(|e| AuthError::Other(e.to_string()))?;
         self.entry()?
             .set_password(&json)
-            .map_err(|e| AuthError::Provider(e.to_string()))
+            .map_err(|e| AuthError::Other(e.to_string()))
     }
     fn clear(&self) -> Result<(), AuthError> {
         match self.entry()?.delete_credential() {
             Ok(()) => Ok(()),
             Err(keyring_core::Error::NoEntry) => Ok(()),
-            Err(e) => Err(AuthError::Provider(e.to_string())),
+            Err(e) => Err(AuthError::Other(e.to_string())),
         }
     }
 }
@@ -187,6 +187,9 @@ impl TokenResponse {
 #[derive(Debug, Deserialize)]
 struct TokenErrorBody {
     error: String,
+    /// Human-readable detail the provider may include (RFC 6749 §5.2).
+    #[serde(default)]
+    error_description: Option<String>,
 }
 
 /// Drives the OAuth 2.0 device-code flow against one provider.
@@ -208,17 +211,14 @@ impl DeviceFlow {
         let mut resp = http
             .get(&url)
             .call()
-            .map_err(|e| AuthError::Provider(e.to_string()))?;
+            .map_err(|e| AuthError::Unreachable(e.to_string()))?;
         if !resp.status().is_success() {
-            return Err(AuthError::Provider(format!(
-                "discovery returned {}",
-                resp.status()
-            )));
+            return Err(AuthError::rejected(resp.status().as_u16(), None, None));
         }
         let doc: DiscoveryDoc = resp
             .body_mut()
             .read_json()
-            .map_err(|e| AuthError::Provider(e.to_string()))?;
+            .map_err(|e| AuthError::Other(e.to_string()))?;
         Ok(DeviceFlow {
             client_id: client_id.to_string(),
             http,
@@ -233,16 +233,13 @@ impl DeviceFlow {
             .http
             .post(&self.device_authorization_endpoint)
             .send_form([("client_id", self.client_id.as_str()), ("scope", scope)])
-            .map_err(|e| AuthError::Provider(e.to_string()))?;
+            .map_err(|e| AuthError::Unreachable(e.to_string()))?;
         if !resp.status().is_success() {
-            return Err(AuthError::Provider(format!(
-                "device authorization returned {}",
-                resp.status()
-            )));
+            return Err(AuthError::rejected(resp.status().as_u16(), None, None));
         }
         resp.body_mut()
             .read_json()
-            .map_err(|e| AuthError::Provider(e.to_string()))
+            .map_err(|e| AuthError::Other(e.to_string()))
     }
 
     /// Poll the token endpoint until the user authorizes, the code expires, or
@@ -267,13 +264,13 @@ impl DeviceFlow {
                     ("device_code", auth.device_code.as_str()),
                     ("client_id", self.client_id.as_str()),
                 ])
-                .map_err(|e| AuthError::Provider(e.to_string()))?;
+                .map_err(|e| AuthError::Unreachable(e.to_string()))?;
 
             if response.status().is_success() {
                 let token: TokenResponse = response
                     .body_mut()
                     .read_json()
-                    .map_err(|e| AuthError::Provider(e.to_string()))?;
+                    .map_err(|e| AuthError::Other(e.to_string()))?;
                 return Ok(token.into_stored());
             }
 
@@ -281,7 +278,7 @@ impl DeviceFlow {
             let body: TokenErrorBody = response
                 .body_mut()
                 .read_json()
-                .map_err(|e| AuthError::Provider(e.to_string()))?;
+                .map_err(|e| AuthError::Other(e.to_string()))?;
             match body.error.as_str() {
                 "authorization_pending" => {}
                 "slow_down" => interval += 5,
@@ -301,17 +298,24 @@ impl DeviceFlow {
                 ("refresh_token", refresh_token),
                 ("client_id", self.client_id.as_str()),
             ])
-            .map_err(|e| AuthError::Provider(e.to_string()))?;
+            .map_err(|e| AuthError::Unreachable(e.to_string()))?;
         if !response.status().is_success() {
-            return Err(AuthError::Provider(format!(
-                "token refresh returned {}",
-                response.status()
-            )));
+            // The IdP was reached and refused the grant (typically a `400
+            // invalid_grant` once the refresh token is expired/rotated). Report
+            // it as a *rejection* with the OAuth error body — not "unreachable",
+            // which would misdirect debugging toward the network.
+            let status = response.status().as_u16();
+            let body = response.body_mut().read_json::<TokenErrorBody>().ok();
+            return Err(AuthError::rejected(
+                status,
+                body.as_ref().map(|b| b.error.as_str()),
+                body.as_ref().and_then(|b| b.error_description.as_deref()),
+            ));
         }
         let mut token: StoredToken = response
             .body_mut()
             .read_json::<TokenResponse>()
-            .map_err(|e| AuthError::Provider(e.to_string()))?
+            .map_err(|e| AuthError::Other(e.to_string()))?
             .into_stored();
         // Providers may omit the refresh token on refresh — keep the old one.
         if token.refresh_token.is_none() {
diff --git a/crates/hephd/src/server.rs b/crates/hephd/src/server.rs
index 30c5d5a..89dee78 100644
--- a/crates/hephd/src/server.rs
+++ b/crates/hephd/src/server.rs
@@ -20,6 +20,7 @@ use tokio::net::{UnixListener, UnixStream};
 
 use heph_core::Store;
 
+use crate::auth::AuthError;
 use crate::oauth::{self, TokenStore};
 use crate::rpc::{self, Request, Response, RpcError, INTERNAL_ERROR, PARSE_ERROR};
 use crate::selfupdate::{self, SelfUpdateConfig};
@@ -80,10 +81,25 @@ fn is_auth_error(e: &anyhow::Error) -> bool {
         .is_some_and(|s| s == reqwest::StatusCode::UNAUTHORIZED)
 }
 
-/// Fold one exchange outcome into the shared [`SyncHealth`].
-fn record_sync_outcome(health: &Arc<Mutex<SyncHealth>>, result: &Result<sync::SyncReport>) {
+/// The exact `heph auth login …` command that re-authenticates this spoke, built
+/// from the hub URL + issuer + client id the daemon is configured with — so the
+/// surfaced error tells the user *what to run*, not just that auth failed.
+/// `None` for an unauthenticated / standalone instance. The hub-URL string must
+/// match what the credential store is keyed under, which is exactly `hub_url`.
+fn reauth_command(hub_url: Option<&str>, auth: Option<&SpokeAuth>) -> Option<String> {
+    let (hub, auth) = (hub_url?, auth?);
+    Some(format!(
+        "heph auth login --hub-url {hub} --issuer {} --client-id {}",
+        auth.issuer, auth.client_id
+    ))
+}
+
+/// Fold one exchange outcome into the shared [`SyncHealth`]. On an auth failure
+/// (a 401 from the hub) the recorded error carries the actionable re-login
+/// command, so `heph sync --status` / `heph auth status` / the TUI show the fix.
+fn record_sync_outcome(ctx: &Ctx, result: &Result<sync::SyncReport>) {
     let now = now_ms();
-    let mut h = health.lock().expect("sync_health mutex poisoned");
+    let mut h = ctx.sync_health.lock().expect("sync_health mutex poisoned");
     h.last_attempt_ms = Some(now);
     match result {
         Ok(_) => {
@@ -92,28 +108,67 @@ fn record_sync_outcome(health: &Arc<Mutex<SyncHealth>>, result: &Result<sync::Sy
             h.auth_failure = false;
         }
         Err(e) => {
-            h.auth_failure = is_auth_error(e);
-            h.last_error = Some(e.to_string());
+            let auth_failure = is_auth_error(e);
+            h.auth_failure = auth_failure;
+            h.last_error = Some(annotate_reauth(
+                e.to_string(),
+                auth_failure,
+                ctx.hub_url.as_deref(),
+                ctx.auth.as_ref(),
+            ));
         }
     }
 }
 
+/// Record a failure to obtain a bearer token (the refresh step, before any hub
+/// request). A *rejection* (the IdP refused the refresh) is an auth failure and
+/// gets the re-login hint; a transport failure stays a transient error. Surfacing
+/// this here means `last_error` reflects the real cause (e.g. `invalid_grant`)
+/// instead of only the downstream 401 on `/sync/pull`.
+fn record_bearer_failure(ctx: &Ctx, err: &AuthError) {
+    let now = now_ms();
+    let auth_failure = err.is_rejection();
+    let mut h = ctx.sync_health.lock().expect("sync_health mutex poisoned");
+    h.last_attempt_ms = Some(now);
+    h.auth_failure = auth_failure;
+    h.last_error = Some(annotate_reauth(
+        format!("could not obtain bearer token: {err}"),
+        auth_failure,
+        ctx.hub_url.as_deref(),
+        ctx.auth.as_ref(),
+    ));
+}
+
+/// Append the actionable re-login command to `msg` when this is an auth failure
+/// and the spoke has auth configured.
+fn annotate_reauth(
+    msg: String,
+    auth_failure: bool,
+    hub_url: Option<&str>,
+    auth: Option<&SpokeAuth>,
+) -> String {
+    match reauth_command(hub_url, auth) {
+        Some(cmd) if auth_failure => format!("{msg} — re-authenticate: {cmd}"),
+        _ => msg,
+    }
+}
+
 impl Ctx {
-    /// The current bearer token for hub sync (refreshing if expired), or `None`
-    /// if this spoke has no auth configured / no usable token.
-    async fn bearer(&self) -> Option<String> {
-        let auth = self.auth.clone()?;
-        let result = tokio::task::spawn_blocking(move || {
+    /// The current bearer token for hub sync (refreshing if expired). `Ok(None)`
+    /// means this spoke has no auth configured / no token stored (it syncs
+    /// unauthenticated); `Err` means token acquisition genuinely failed (the
+    /// caller records it and skips the attempt rather than 401ing the hub).
+    async fn bearer(&self) -> Result<Option<String>, AuthError> {
+        let Some(auth) = self.auth.clone() else {
+            return Ok(None);
+        };
+        match tokio::task::spawn_blocking(move || {
             oauth::current_bearer(auth.store.as_ref(), &auth.issuer, &auth.client_id)
         })
-        .await;
-        match result {
-            Ok(Ok(token)) => token,
-            Ok(Err(e)) => {
-                tracing::warn!("could not obtain bearer token: {e}");
-                None
-            }
-            Err(_) => None,
+        .await
+        {
+            Ok(res) => res,
+            Err(_join) => Ok(None), // the blocking task panicked or was cancelled; treat as no token
         }
     }
 }
@@ -223,10 +278,20 @@ impl Daemon {
             let mut tick = tokio::time::interval(interval);
             loop {
                 tick.tick().await;
-                let bearer = ctx.bearer().await;
+                let bearer = match ctx.bearer().await {
+                    Ok(b) => b,
+                    Err(e) => {
+                        // Couldn't get a token — record the real cause (e.g. a
+                        // rejected refresh) and skip; sending an unauthenticated
+                        // request would only 401 and mask it.
+                        record_bearer_failure(&ctx, &e);
+                        tracing::warn!("background sync: could not obtain bearer token: {e}");
+                        continue;
+                    }
+                };
                 let result =
                     sync::sync_once(ctx.store.clone(), &hub, &ctx.http, bearer.as_deref()).await;
-                record_sync_outcome(&ctx.sync_health, &result);
+                record_sync_outcome(&ctx, &result);
                 match result {
                     Ok(report) => tracing::debug!(?report, "background sync"),
                     Err(e) => tracing::warn!("background sync failed: {e}"),
@@ -321,9 +386,25 @@ async fn sync_now(ctx: &Ctx) -> Result<Value, RpcError> {
             message: "no hub_url configured; this instance is standalone".into(),
         });
     };
-    let bearer = ctx.bearer().await;
+    let bearer = match ctx.bearer().await {
+        Ok(b) => b,
+        Err(e) => {
+            // Token acquisition failed — record the real cause (with a re-login
+            // hint when it's a rejection) and surface it instead of a downstream 401.
+            record_bearer_failure(ctx, &e);
+            return Err(RpcError {
+                code: INTERNAL_ERROR,
+                message: annotate_reauth(
+                    format!("sync failed: could not obtain bearer token: {e}"),
+                    e.is_rejection(),
+                    ctx.hub_url.as_deref(),
+                    ctx.auth.as_ref(),
+                ),
+            });
+        }
+    };
     let result = sync::sync_once(ctx.store.clone(), &hub_url, &ctx.http, bearer.as_deref()).await;
-    record_sync_outcome(&ctx.sync_health, &result);
+    record_sync_outcome(ctx, &result);
     match result {
         Ok(report) => Ok(json!(report)),
         Err(e) => Err(RpcError {
@@ -374,10 +455,22 @@ async fn sync_status(ctx: &Ctx) -> Result<Value, RpcError> {
         .expect("sync_health mutex poisoned")
         .clone();
 
+    // Non-secret OIDC params (issuer/client-id) + the exact re-login command, so
+    // `heph auth status` can show the fix without reconstructing it client-side
+    // (and keyed under the right hub URL — see the per-URL token-keying gotcha).
+    let auth = ctx.auth.as_ref().map(|a| {
+        json!({
+            "issuer": a.issuer,
+            "client_id": a.client_id,
+        })
+    });
+
     Ok(json!({
         "hub_url": hub_url,
         "cursors": cursors,
         "conflicts": conflicts,
         "health": health,
+        "auth": auth,
+        "reauth_command": reauth_command(Some(&hub_url), ctx.auth.as_ref()),
     }))
 }
diff --git a/crates/hephd/src/sync.rs b/crates/hephd/src/sync.rs
index bfaa323..9beac05 100644
--- a/crates/hephd/src/sync.rs
+++ b/crates/hephd/src/sync.rs
@@ -261,8 +261,14 @@ async fn require_auth(
         .await
         .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?
         .map_err(|e| match e {
-            AuthError::Provider(_) => StatusCode::SERVICE_UNAVAILABLE,
-            _ => StatusCode::UNAUTHORIZED,
+            // The token itself is missing/bad → tell the client it's unauthorized.
+            AuthError::Missing | AuthError::Invalid(_) => StatusCode::UNAUTHORIZED,
+            // The hub couldn't fetch or process the IdP's verification keys
+            // (unreachable, refused, or malformed response): a hub-side problem,
+            // not the client's token. Ask them to retry, not call it invalid.
+            AuthError::Unreachable(_) | AuthError::Rejected(_) | AuthError::Other(_) => {
+                StatusCode::SERVICE_UNAVAILABLE
+            }
         })?;
 
     // Multi-tenancy seam: resolve the token's identity to the owner it may act
diff --git a/crates/hephd/tests/client_reconnect.rs b/crates/hephd/tests/client_reconnect.rs
new file mode 100644
index 0000000..a4d0074
--- /dev/null
+++ b/crates/hephd/tests/client_reconnect.rs
@@ -0,0 +1,96 @@
+//! [`Client`] survives the daemon dropping the socket (opt-in self-update, `heph
+//! daemon restart`). A mock daemon serves exactly one request per connection
+//! then closes it, forcing the client to reconnect — without auto-reconnect,
+//! every call after the first would fail forever.
+
+use std::io::{BufRead, BufReader, Write};
+use std::os::unix::net::UnixListener;
+use std::path::PathBuf;
+use std::sync::atomic::{AtomicUsize, Ordering};
+use std::sync::Arc;
+use std::thread;
+use std::time::Duration;
+
+use hephd::Client;
+use serde_json::{json, Value};
+
+/// A mock daemon that handles ONE request per connection then closes it, looping
+/// to accept the next connection. `served` counts total requests answered.
+fn spawn_one_shot_daemon(socket: PathBuf, served: Arc<AtomicUsize>) {
+    thread::spawn(move || {
+        let listener = UnixListener::bind(&socket).unwrap();
+        for conn in listener.incoming() {
+            let Ok(mut stream) = conn else { continue };
+            let mut reader = BufReader::new(stream.try_clone().unwrap());
+            let mut line = String::new();
+            if reader.read_line(&mut line).unwrap_or(0) == 0 {
+                continue; // client opened then went away; wait for the next one
+            }
+            let req: Value = serde_json::from_str(&line).unwrap();
+            let n = served.fetch_add(1, Ordering::SeqCst) + 1;
+            let mut out = serde_json::to_string(&json!({
+                "id": req["id"],
+                "result": { "served": n },
+            }))
+            .unwrap();
+            out.push('\n');
+            let _ = stream.write_all(out.as_bytes());
+            let _ = stream.flush();
+            // `stream` drops here → the connection closes after one request.
+        }
+    });
+}
+
+fn wait_for(socket: &std::path::Path) {
+    for _ in 0..400 {
+        if socket.exists() {
+            return;
+        }
+        thread::sleep(Duration::from_millis(5));
+    }
+    panic!("mock daemon socket never appeared");
+}
+
+#[test]
+fn client_reconnects_after_the_daemon_drops_the_socket() {
+    let dir = tempfile::tempdir().unwrap();
+    let socket = dir.path().join("d.sock");
+    let served = Arc::new(AtomicUsize::new(0));
+    spawn_one_shot_daemon(socket.clone(), served.clone());
+    wait_for(&socket);
+
+    let mut c = Client::connect(&socket).unwrap();
+
+    // First call works on the initial connection.
+    let r1 = c.call("ping", json!({})).unwrap();
+    assert_eq!(r1["served"], 1);
+
+    // The daemon has now closed that connection. With reconnect, the client
+    // recovers within a call or two (depending on whether the dead socket fails
+    // on write or on read); without it, every further call would fail forever.
+    let mut recovered = None;
+    for _ in 0..2 {
+        if let Ok(v) = c.call("ping", json!({})) {
+            recovered = Some(v);
+            break;
+        }
+    }
+    let r = recovered.expect("client should reconnect after the socket was dropped");
+    // The recovered call was served exactly once on the new connection — no
+    // double-serve from a spurious retry.
+    assert_eq!(r["served"], 2);
+    assert_eq!(served.load(Ordering::SeqCst), 2);
+
+    // And it keeps working across subsequent drops.
+    let r3 = {
+        let mut got = None;
+        for _ in 0..2 {
+            if let Ok(v) = c.call("ping", json!({})) {
+                got = Some(v);
+                break;
+            }
+        }
+        got.expect("client should keep reconnecting")
+    };
+    assert_eq!(r3["served"], 3);
+}
diff --git a/crates/hephd/tests/oauth.rs b/crates/hephd/tests/oauth.rs
index f61c872..0a1c709 100644
--- a/crates/hephd/tests/oauth.rs
+++ b/crates/hephd/tests/oauth.rs
@@ -90,11 +90,25 @@ async fn token(State(s): State<IdpState>, Form(form): Form<HashMap<String, Strin
             }))
             .into_response()
         }
-        Some("refresh_token") => Json(json!({
-            "access_token": "access-2",
-            "expires_in": 3600,
-        }))
-        .into_response(),
+        Some("refresh_token") => {
+            // A rotated/expired refresh token is refused with `400 invalid_grant`
+            // (RFC 6749 §5.2) — the case that used to be mislabeled "unreachable".
+            if form.get("refresh_token").map(String::as_str) == Some("refresh-expired") {
+                return (
+                    StatusCode::BAD_REQUEST,
+                    Json(json!({
+                        "error": "invalid_grant",
+                        "error_description": "Token is not active",
+                    })),
+                )
+                    .into_response();
+            }
+            Json(json!({
+                "access_token": "access-2",
+                "expires_in": 3600,
+            }))
+            .into_response()
+        }
         _ => (
             StatusCode::BAD_REQUEST,
             Json(json!({ "error": "unsupported_grant_type" })),
@@ -129,6 +143,48 @@ fn refresh_keeps_the_old_refresh_token_when_omitted() {
     assert_eq!(refreshed.refresh_token.as_deref(), Some("refresh-1"));
 }
 
+#[test]
+fn refresh_rejected_by_idp_is_a_rejection_not_unreachable() {
+    let issuer = start_idp();
+    let flow = DeviceFlow::discover(&issuer, "heph-cli").unwrap();
+    let err = flow.refresh("refresh-expired").unwrap_err();
+    // The whole point of the fix: a reachable IdP that returns 400 is a
+    // *rejection*, carrying the OAuth error body — not "unreachable".
+    assert!(err.is_rejection(), "expected a rejection, got: {err}");
+    let msg = err.to_string();
+    assert!(
+        msg.contains("rejected"),
+        "message should say rejected: {msg}"
+    );
+    assert!(
+        msg.contains("invalid_grant"),
+        "should include the OAuth error: {msg}"
+    );
+    assert!(
+        msg.contains("Token is not active"),
+        "should include error_description: {msg}"
+    );
+    assert!(
+        !msg.contains("unreachable"),
+        "must NOT claim the IdP was unreachable: {msg}"
+    );
+}
+
+#[test]
+fn discovery_against_a_dead_idp_is_unreachable_not_a_rejection() {
+    use hephd::AuthError;
+    // Port 1 refuses the connection → a genuine transport failure.
+    let err = match DeviceFlow::discover("http://127.0.0.1:1/application/o/heph/", "heph-cli") {
+        Ok(_) => panic!("discovery should fail against a dead IdP"),
+        Err(e) => e,
+    };
+    assert!(
+        matches!(err, AuthError::Unreachable(_)),
+        "a connection failure must be Unreachable, got: {err}"
+    );
+    assert!(!err.is_rejection());
+}
+
 #[test]
 fn memory_token_store_round_trips_and_reports_expiry() {
     let store = MemoryTokenStore::default();
diff --git a/docs/changelog.d/+sync-age-seconds.feature.md b/docs/changelog.d/+sync-age-seconds.feature.md
deleted file mode 100644
index cf453c2..0000000
--- a/docs/changelog.d/+sync-age-seconds.feature.md
+++ /dev/null
@@ -1 +0,0 @@
-heph-tui's sync indicator now shows the last-sync age in seconds under a minute (`⟳ 26s`) instead of a flat `just now`, so the chip reads as a live heartbeat and a missed sync (the loop runs every 30s) shows up as the age climbing.
diff --git a/docs/changelog.d/daemon-self-update-interval.feature.md b/docs/changelog.d/daemon-self-update-interval.feature.md
deleted file mode 100644
index b5ec9b8..0000000
--- a/docs/changelog.d/daemon-self-update-interval.feature.md
+++ /dev/null
@@ -1 +0,0 @@
-`heph daemon start`/`restart` can now bake the daemon's full runtime config into the managed service — `--mode`, `--hub-url`, `--http-addr`, `--oidc-issuer`/`--oidc-audience`/`--oidc-client-id`, and `--self-update-interval-secs` (previously only the bare `--self-update` bool was wired). Regenerating preserves whatever is already baked into the on-disk plist/unit, so a bare `start`/`restart` no longer silently drops spoke/hub or self-update config.
diff --git a/docs/how-to/run-the-daemon.md b/docs/how-to/run-the-daemon.md
index cb9e56d..545b3be 100644
--- a/docs/how-to/run-the-daemon.md
+++ b/docs/how-to/run-the-daemon.md
@@ -86,6 +86,14 @@ still the old binary until you restart it:
 heph daemon restart
 ```
 
+A restart (or an opt-in self-update) drops the daemon's unix socket out from
+under any connected surface. The CLI and `heph-tui` **reconnect automatically**:
+a read transparently retries on a fresh connection, and a long-running TUI
+self-heals on its next tick — so a daemon restart no longer leaves the agenda
+view stuck on errors. (A mutating action whose reply is lost mid-restart reports
+"reconnected — re-run the action if it didn't take effect" rather than risk
+applying twice.)
+
 ## Self-update (opt-in)
 
 `hephd` can keep itself current: `heph daemon start --self-update` generates a
diff --git a/docs/how-to/set-up-sync-hub.md b/docs/how-to/set-up-sync-hub.md
index a5b56ea..4d654a9 100644
--- a/docs/how-to/set-up-sync-hub.md
+++ b/docs/how-to/set-up-sync-hub.md
@@ -130,19 +130,41 @@ spoke is visible at a glance rather than buried in the daemon log.
 
 Make a change on `gilbert`, force a sync, and confirm it appears via the hub.
 
+### When sync stops authenticating
+
+A spoke's refresh token can expire or be rotated (e.g. the IdP session lapses).
+The spoke then can't refresh on its own and needs a re-login — but this is
+**visible, not silent**:
+
+- `heph-tui` shows a red `⚠ auth · heph auth status` chip in the status line.
+- `heph auth status` prints the auth health and the **exact** re-login command,
+  pre-filled with this spoke's hub URL / issuer / client id:
+
+  ```bash
+  heph auth status
+  ```
+
+- `heph sync --status`'s `last_error` names the real cause — a refresh
+  *rejection* (e.g. `HTTP 400 (invalid_grant)`), not a misleading "identity
+  provider unreachable" — and carries the same `heph auth login …` hint.
+
+Run the printed `heph auth login …` command to restore sync.
+
 ## Current gaps (finalized by the blumeops deployment)
 
-The flag-level flow above works today; two enablers make it a clean, managed
+The flag-level flow above works today; one enabler makes it a clean, managed
 deployment rather than a hand-run process — tracked in the `Hephaestus` project:
 
-- **`heph daemon` only generates a `--mode local` service** (no `--hub-url` /
-  `--oidc-*`). So for now the hub and the spoke config are expressed as `hephd`
-  flags (run directly, or via the blumeops-managed systemd unit), not via
-  `heph daemon start`.
 - **Path A seeding is manual** (copy the store + reset the device origin). A
   small enabler — seed a hub from a snapshot with a fresh origin, or
   `hephd --owner-id` — would make this one step.
 
+> `heph daemon start`/`restart` can now bake the spoke/hub config (`--hub-url`,
+> `--mode server`, `--http-addr`, `--oidc-*`) into the generated service (see
+> [[run-the-daemon]]). The canonical hub on `indri` is still provisioned via the
+> blumeops-managed systemd unit by deployment choice, not because `heph daemon`
+> can't express it.
+
 ## Related
 
 - [[run-the-daemon]] — manage the local daemon as an OS service