Document Dex OIDC and add services-check integration (#223)

## Summary
- Create Dex reference card (`docs/reference/services/dex.md`) with quick reference, architecture, identity source, storage, OIDC clients, secrets, and endpoints
- Write federated login explanation article (`docs/explanation/federated-login.md`) covering the Dex + Forgejo two-layer auth model, login flow, and break-glass access
- Add Dex to `services-check` (HTTP health endpoint + k3s pod check)
- Update Grafana docs with new Authentication section documenting SSO via Dex
- Update Forgejo docs with OAuth2 Provider section documenting its role as upstream identity source
- Add Dex to ringtail workloads table and reference service index
- Move `adopt-oidc-provider` plan to `completed/` with final design reflecting actual implementation

## Test plan
- [ ] `mise run services-check` passes (includes new Dex checks)
- [ ] `docs-check-links` passes (all wiki-links resolve)
- [ ] `docs-check-index` passes (new docs are indexed)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/223
Erich Blume 2026-02-19 20:44:23 -08:00
commit d21798b1f3
13 changed files with 306 additions and 209 deletions

@@ -1,208 +0,0 @@
---
title: "Plan: Adopt OIDC Identity Provider"
modified: 2026-02-11
tags:
- how-to
- plans
- security
- oidc
---
# Plan: Adopt OIDC Identity Provider
> **Status:** Planning (design sketch — not yet ready to execute)
## Background
BlumeOps services currently handle authentication independently — ArgoCD has its own admin password, Grafana has its own login, Forgejo has local accounts, and zot has no auth at all. There is no single sign-on, no centralized user management, and no way to issue scoped API keys or service tokens from a shared identity.
Adding an OpenID Connect (OIDC) identity provider gives BlumeOps a central authentication layer. Services delegate login to the IdP, and the IdP issues tokens that carry identity and group claims. This unlocks:
- **SSO across services** — one login for Grafana, ArgoCD, Forgejo, zot, and future services
- **API keys derived from identity** — zot's API key feature requires OIDC; CI service accounts get scoped, expirable tokens tied to a real identity
- **Group-based authorization** — services can make access decisions based on IdP group claims rather than per-service user lists
- **Audit trail** — authentication events flow through one system
### Goals
- Deploy a lightweight OIDC provider on the BlumeOps infrastructure
- Configure at least one service (zot) as a relying party to validate the setup
- Establish patterns for adding future OIDC clients (Grafana, ArgoCD, Forgejo)
- Keep complexity appropriate for a single-user homelab
## Provider Comparison
| Provider | Language | Resources | UI | OIDC Maturity | Zot Integration | Notes |
|----------|----------|-----------|-----|---------------|-----------------|-------|
| **Dex** | Go | ~20-50MB RAM | None (config-driven) | Mature, purpose-built | Explicitly documented in zot examples | CNCF Sandbox; `staticPasswords` connector for single-user |
| **Authentik** | Python | ~200-300MB RAM, needs PostgreSQL + Redis | Full web UI, visual flow builder | Mature | [Proven community guide](https://integrations.goauthentik.io/infrastructure/zot/) | Best for small teams; heavier than needed for one user |
| **Authelia** | Go | ~30MB RAM | None (YAML config) | Maturing (OIDC provider still on roadmap) | [Unresolved integration issues](https://github.com/authelia/authelia/discussions/7615) | Primarily a forward-auth proxy; OIDC is secondary |
| **Keycloak** | Java | ~500MB+ RAM | Enterprise admin console | Battle-tested | Works via generic OIDC | Massive overkill for homelab |
### Recommendation: Investigate Dex First
Dex is the strongest candidate for BlumeOps:
- **Lightest footprint** — single Go binary, no database dependencies (in-memory or SQLite storage)
- **Designed for exactly this** — Dex is an OIDC provider that federates identity; it's not a full IAM suite bolted onto other things
- **Zot uses Dex in its own examples** — lowest integration risk
- **`staticPasswords` connector** — define the single `eblume` user directly in YAML config, no external user store needed
- **Future flexibility** — if SSO via GitHub or Google is ever wanted, add a connector without changing the architecture
- **CNCF project** — actively maintained, well-documented
The main trade-off is no web UI for user management — but for a single-user setup, that's a non-issue. Config changes go through the normal PR workflow.
If Dex proves insufficient during execution (e.g., missing features for a specific service integration), Authentik is the fallback — heavier but more capable.
## Architecture
```
                 Caddy (TLS termination)
                           |
            +--------------+--------------+
            |              |              |
       Browser SSO      CLI / CI     k8s services
            |              |              |
            v              v              v
     Dex (OIDC IdP)     API Keys     OIDC tokens
     issuer:            (generated   (validated by
     dex.ops.eblu.me    after OIDC   each service)
            |           login)
            v
     staticPasswords
     connector (eblume)
```
### Deployment Options
Dex can run as:
1. **k8s pod** (via ArgoCD) — follows the pattern of other BlumeOps services, gets automatic restarts, lives alongside its consumers
2. **Native on indri** (via Ansible/LaunchAgent) — follows the zot/Forgejo pattern, simpler networking
The k8s option is preferred since most OIDC consumers (Grafana, ArgoCD) are already in k8s. Evaluate during execution.
### Endpoints
| Endpoint | URL | Purpose |
|----------|-----|---------|
| Issuer | `https://dex.ops.eblu.me` | OIDC discovery (`/.well-known/openid-configuration`) |
| Auth | `https://dex.ops.eblu.me/auth` | Browser login redirect |
| Token | `https://dex.ops.eblu.me/token` | Token exchange |
| Callback | Per-client (e.g., `https://registry.ops.eblu.me/zot/auth/callback/oidc`) | OAuth2 redirect URI |
## Dex Configuration Sketch
```yaml
issuer: https://dex.ops.eblu.me
storage:
  type: sqlite3
  config:
    file: /var/dex/dex.db
web:
  http: 0.0.0.0:5556
# staticPasswords are served by Dex's built-in password DB, which is
# enabled with this flag rather than declared under `connectors`.
enablePasswordDB: true
staticPasswords:
  - email: eblume@eblume.net
    hash: "<bcrypt hash>"  # generated at deploy time
    username: eblume
    userID: "<uuid>"
staticClients:
  - id: zot-registry
    name: Zot Registry
    secret: "<from 1Password>"
    redirectURIs:
      - https://registry.ops.eblu.me/zot/auth/callback/oidc
  # Future clients:
  # - id: grafana
  #   ...
  # - id: argocd
  #   ...
  # - id: forgejo
  #   ...
```
Secrets (static password hash, client secrets) are stored in 1Password and injected at deploy time — never committed to the repo.
## Planned OIDC Clients
Initial rollout targets zot only. Future services to integrate:
| Service | OIDC Support | Priority | Notes |
|---------|-------------|----------|-------|
| **Zot** | Native (`openid.providers.oidc`) | First (validates IdP) | See [[harden-zot-registry]] |
| **Grafana** | Native (`auth.generic_oauth`) | High | Currently uses default admin password |
| **ArgoCD** | Native (`oidc.config` in `argocd-cm`) | High | Currently uses local admin password |
| **Forgejo** | Native (OAuth2 provider in admin settings) | Medium | Currently uses local accounts |
## Execution Steps
1. **Choose deployment method** (k8s vs native) and set up the service
- If k8s: create `argocd/manifests/dex/` with Deployment, Service, ConfigMap
- If native: create `ansible/roles/dex/` following the zot pattern
- Add Caddy reverse proxy entry for `dex.ops.eblu.me`
2. **Configure Dex**
- Generate static password hash and client secrets
- Store all secrets in 1Password
- Deploy initial config with `staticPasswords` connector and zot as the first client
3. **Verify OIDC discovery**
- `curl https://dex.ops.eblu.me/.well-known/openid-configuration` returns valid JSON
- Issuer URL matches config
4. **Integrate first client (zot)**
- This is covered by [[harden-zot-registry]] — configure zot's `openid.providers.oidc` to point at Dex
- Test browser login → API key generation → CLI push flow
5. **Documentation**
- Create `docs/reference/services/dex.md` reference card
- Update service indexes
- Add changelog fragment
## Verification Checklist
- [ ] Dex is running and healthy
- [ ] OIDC discovery endpoint returns valid configuration
- [ ] Browser login flow works (redirect → Dex login → redirect back)
- [ ] At least one client (zot) successfully authenticates via Dex
- [ ] Caddy proxies `dex.ops.eblu.me` correctly
- [ ] `mise run services-check` passes (if health check is added)
## Open Questions
- **Service dependency and recovery:** If Dex runs in k8s and k8s goes down, services that depend on Dex for authentication may become inaccessible — potentially including tools needed to bring k8s back up. This circular dependency **must be resolved** before execution. Options include: running Dex natively on indri (outside k8s), ensuring all critical recovery paths have break-glass credentials that bypass OIDC, or designing the system so that OIDC is additive (services fall back to local auth when the IdP is unreachable). This needs its own design pass during implementation planning.
- **Dex vs Authentik:** Dex is the starting recommendation, but evaluate during execution. If multiple services need dynamic user management or a web UI for client registration, Authentik may be worth the extra weight.
- **Storage backend:** SQLite is simplest for single-node. If Dex runs in k8s, it needs a PersistentVolume or could use the k8s CRD storage backend instead.
- **Tailscale ACL interaction:** Should the Dex endpoint be tailnet-only, or accessible from the public internet (for potential external SSO)? Start with tailnet-only.
- **Token lifetime and refresh:** Dex defaults are reasonable, but may need tuning for long-running CI jobs.
## Future Considerations
- **Additional connectors** — add GitHub or Google as upstream identity sources for SSO convenience
- **Group claims** — define groups in Dex config (e.g., `admin`, `ci`) and use them for authorization across services
- **Mutual TLS** — Dex supports mTLS for service-to-service token exchange, which could harden the CI credential path
## Reference Pattern Files
| File | Purpose |
|------|---------|
| `argocd/manifests/grafana-config/` | Example k8s service with ConfigMap-based config |
| `ansible/roles/zot/` | Example native service deployment pattern |
| `pulumi/tailscale/` | Example of secrets injection from 1Password |
## Related
- [[harden-zot-registry]] — first OIDC client (execute after this plan)
- [[zot]] — container registry reference
- [[cluster]] — k8s cluster (potential Dex host)
- [[indri]] — native service host (alternative Dex host)

@@ -0,0 +1,105 @@
---
title: "Plan: Adopt OIDC Identity Provider"
modified: 2026-02-19
tags:
- how-to
- plans
- security
- oidc
---
# Plan: Adopt OIDC Identity Provider
> **Status:** Completed (2026-02-19) — Phase 1 (Dex + Grafana)
> **PR:** #222
## Background
BlumeOps services currently handle authentication independently — ArgoCD has its own admin password, Grafana has its own login, Forgejo has local accounts, and zot has no auth at all. There is no single sign-on, no centralized user management, and no way to issue scoped API keys or service tokens from a shared identity.
Adding an OpenID Connect (OIDC) identity provider gives BlumeOps a central authentication layer. Services delegate login to the IdP, and the IdP issues tokens that carry identity claims.
## Final Design
### Provider: Dex
Dex was chosen for its lightweight footprint (single Go binary, ~50MB RAM), config-driven operation (no web UI needed), and native Gitea/Forgejo connector support.
### Architecture
```
User Browser
    |
    v
Grafana (indri/minikube) --OIDC--> Dex (ringtail/k3s) --OAuth2--> Forgejo (indri/native)
    ^                                                                   |
    |                                                                   |
    +---------------------- redirect back with token -------------------+
```
Key design decisions:
- **Dex runs on ringtail's k3s cluster** — isolates the IdP from indri's minikube. If minikube goes down, Dex stays up. Recovery path: SSH → indri → ArgoCD local admin → fix.
- **Forgejo is the upstream identity source** — not static passwords. Users authenticate with their Forgejo account. Adding a user to SSO = creating a Forgejo account.
- **SQLite3 storage with emptyDir** — avoids a Kubernetes CRD storage bug (Go URL parsing issue with in-cluster API address). Pod restart invalidates sessions (users re-login), acceptable for a homelab.
- **NixOS-built container**: `containers/dex/default.nix` using `pkgs.dex-oidc`, consistent with the ntfy pattern.
- **Full config templated via ExternalSecret** — the entire `config.yaml` lives in the ExternalSecret template with secrets injected from 1Password. Nothing sensitive in git.
- **Cross-cluster communication** — Grafana reaches Dex via `https://dex.ops.eblu.me` (Caddy → Tailscale → ringtail), not k8s-internal DNS.
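
The decisions above could translate into a Dex `config.yaml` roughly like the following. This is a minimal sketch, not the deployed file: the database path, the `grafana` client ID, and the connector `id` are assumptions, and every angle-bracket value stands in for a secret injected from 1Password. The connector uses Dex's `gitea` type, which is what Forgejo is compatible with.

```yaml
# Hypothetical sketch of the Dex config; names marked "assumed" are illustrative.
issuer: https://dex.ops.eblu.me
storage:
  type: sqlite3
  config:
    file: /var/dex/dex.db            # assumed path on the emptyDir mount
web:
  http: 0.0.0.0:5556
connectors:
  - type: gitea                      # Dex's Gitea connector, pointed at Forgejo
    id: forgejo                      # assumed connector ID
    name: Forgejo
    config:
      baseURL: https://forge.ops.eblu.me
      clientID: "<Forgejo OAuth2 client ID>"
      clientSecret: "<Forgejo OAuth2 client secret>"
      redirectURI: https://dex.ops.eblu.me/callback
staticClients:
  - id: grafana                      # assumed client ID
    name: Grafana
    secret: "<Grafana client secret>"
    redirectURIs:
      - https://grafana.ops.eblu.me/login/generic_oauth
```

The `redirectURI` on the connector must match the redirect URI registered on the Forgejo OAuth2 application, and the Grafana redirect URI uses Grafana's standard `/login/generic_oauth` callback path.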
### Resolved Open Questions
- **Service dependency and recovery:** Dex on ringtail is independent of minikube. All services keep local admin logins as break-glass. If Dex goes down, users log in locally.
- **Dex vs Authentik:** Dex confirmed as the right choice: config-driven, minimal resource usage, and a native Forgejo connector.
- **Storage backend:** SQLite3 (not Kubernetes CRDs). The CRD backend crashes due to a Go URL parsing bug with k3s's in-cluster API address. SQLite3 with emptyDir is simpler and avoids the issue.
- **User management scaling:** The Forgejo connector solves this: users are managed in Forgejo, not in Dex config files. Google or GitHub connectors can be added alongside Forgejo later.
- **Tailscale ACL interaction:** Dex is tailnet-only via Caddy. Public access is a future consideration tied to exposing Forgejo publicly.
## Execution (as completed)
1. Created `containers/dex/default.nix` and built `dex:v1.0.0-nix`
2. Created 1Password item "Dex (blumeops)" with Forgejo OAuth2 credentials and Grafana client secret
3. Created OAuth2 application in Forgejo (Site Administration → Applications, confidential client, redirect URI `https://dex.ops.eblu.me/callback`)
4. Created ArgoCD app (`argocd/apps/dex.yaml`) targeting ringtail
5. Created k8s manifests: ExternalSecret, Deployment, Service, Ingress (5 files in `argocd/manifests/dex/`)
6. Added `dex.ops.eblu.me` to Caddy reverse proxy config
7. Created `grafana-dex-oauth` ExternalSecret for Grafana's OIDC client secret
8. Added `auth.generic_oauth` to Grafana's `values.yaml` with Dex endpoints
9. Fixed Grafana `root_url` from `grafana.tail8d86e.ts.net` to `grafana.ops.eblu.me` (OAuth state cookie mismatch)
10. Deployed and verified end-to-end SSO flow
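
Steps 7 through 9 might look roughly like this in the Grafana Helm `values.yaml`. This is a sketch under stated assumptions rather than the committed file: the `grafana` client ID and the `client-secret` key name are hypothetical, and it assumes the client secret is passed through the chart's `envValueFrom` as an environment override instead of being written into `grafana.ini`.

```yaml
# Hypothetical sketch of the Grafana Helm values; not the actual values.yaml.
grafana.ini:
  server:
    # Must match the URL users actually browse to, or the OAuth state
    # cookie set before the redirect will not be sent back on return.
    root_url: https://grafana.ops.eblu.me
  auth.generic_oauth:
    enabled: true
    name: Dex                        # shown as "Sign in with Dex"
    client_id: grafana               # assumed; must match the Dex static client
    scopes: openid profile email
    auth_url: https://dex.ops.eblu.me/auth
    token_url: https://dex.ops.eblu.me/token
    api_url: https://dex.ops.eblu.me/userinfo
envValueFrom:
  # Grafana reads GF_<SECTION>_<KEY> environment variables as config overrides.
  GF_AUTH_GENERIC_OAUTH_CLIENT_SECRET:
    secretKeyRef:
      name: grafana-dex-oauth        # Secret produced by the ExternalSecret in step 7
      key: client-secret             # assumed key name
```

Leaving the local login form enabled alongside this block is what preserves the break-glass admin path.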
## Verification (completed)
- [x] Container image exists: `dex:v1.0.0-nix` in registry
- [x] OIDC discovery endpoint returns valid configuration
- [x] Health check passes (`/healthz`)
- [x] Grafana login page shows "Sign in with Dex" button
- [x] OIDC flow: click Dex → Forgejo login → redirect back → logged in as Admin
- [x] Break-glass: local admin login still works
- [x] `mise run services-check` passes
- [x] ArgoCD shows dex app healthy and synced
## Key Files
| File | Purpose |
|------|---------|
| `containers/dex/default.nix` | NixOS container build |
| `argocd/apps/dex.yaml` | ArgoCD app (ringtail target) |
| `argocd/manifests/dex/` | K8s manifests (ExternalSecret, Deployment, Service, Ingress) |
| `argocd/manifests/grafana-config/external-secret-dex-oauth.yaml` | Grafana OIDC client secret |
| `argocd/manifests/grafana/values.yaml` | Grafana OIDC config (`auth.generic_oauth`) |
| `ansible/roles/caddy/defaults/main.yml` | Caddy reverse proxy entry |
## Future Phases
- **Phase 2:** ArgoCD OIDC (keep local admin, RBAC: `g, blume.erich@gmail.com, role:admin`)
- **Phase 3:** Forgejo OAuth2 provider integration (keep local accounts)
- **Phase 4:** Miniflux, Immich, other services
- **Phase 5:** Zot OIDC + hardening (per [[harden-zot-registry]])
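
For Phase 2, the RBAC rule quoted above would normally live in ArgoCD's `argocd-rbac-cm` ConfigMap. The following is a sketch only, since this phase has not been implemented: the `argocd` namespace is an assumption, and `scopes` is set so that ArgoCD matches policy subjects against the `email` claim rather than its default of `groups`.

```yaml
# Hypothetical sketch for Phase 2; not yet part of the repository.
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-rbac-cm
  namespace: argocd                  # assumed namespace
data:
  # Bind the SSO identity (by email claim) to the built-in admin role.
  policy.csv: |
    g, blume.erich@gmail.com, role:admin
  # Claims ArgoCD evaluates when matching RBAC subjects.
  scopes: "[email]"
```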
## Related
- [[dex]] - Service reference card
- [[federated-login]] - How authentication works across BlumeOps
- [[harden-zot-registry]] - Future OIDC client
- [[forgejo]] - Upstream OAuth2 provider
- [[grafana]] - First OIDC client

@@ -15,3 +15,4 @@ Plans that have been fully implemented and verified. Kept for historical referen
| [[adopt-dagger-ci]] | 2026-02-11 | Adopt Dagger as CI/CD build engine (Phases 1-3) |
| [[segment-home-network]] | 2026-02-14 | Manual three-network segmentation for UniFi Express 7 |
| [[operationalize-reolink-camera]] | 2026-02-15 | Deploy Frigate NVR stack with Mosquitto, Ntfy, and frigate-notify |
| [[adopt-oidc-provider]] | 2026-02-19 | Deploy Dex OIDC identity provider with Forgejo backend and Grafana SSO |

@@ -17,7 +17,7 @@ Plans differ from regular how-to guides in that they describe work that has been
| [[migrate-forgejo-from-brew]] | Planned | Transition Forgejo from Homebrew to source-built binary with LaunchAgent |
| [[add-unifi-pulumi-stack]] | Abandoned | Add Pulumi IaC for UniFi Express 7 (provider bugs — see doc) |
| [[upstream-fork-strategy]] | Planned | Stacked-branch forking strategy for tracking upstream projects |
| [[adopt-oidc-provider]] | Planning | Deploy OIDC identity provider for SSO across services |
| [[adopt-oidc-provider]] | Completed | Deploy OIDC identity provider for SSO across services |
| [[harden-zot-registry]] | Planned | Add authentication and tag immutability to zot registry |
| [[forgejo-actions-dashboard]] | Planned | Grafana dashboard and custom Prometheus exporter for Forgejo Actions CI metrics |
| [[upgrade-grafana-helm-chart]] | Planned | Upgrade Grafana Helm chart from 8.8.2 to 11.x (3 phases) |