blumeops

Author	SHA1	Message	Date
Erich Blume	2ac353b7bf	Fix authentik container: create /tmp for unprivileged user buildLayeredImage doesn't create /tmp by default. The container runs as user 65534 (nobody) which can't mkdir /tmp at runtime. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-01 15:48:05 -08:00
Erich Blume	efa9806bfa	C2: Build authentik from source (Mikado chain) (#274 ) ## Mikado Chain: build-authentik-from-source Replace `pkgs.authentik` from nixpkgs with a custom Nix derivation built from source. This removes the dependency on the nixpkgs packaging timeline and gives full version control. Target version: 2025.12.4 (nixpkgs reference, upgrading from deployed 2025.10.1). ### Dependency Graph ``` build-authentik-from-source (goal) ├── authentik-go-server-derivation │ ├── authentik-api-client-generation ← IN PROGRESS │ └── authentik-python-backend-derivation ├── authentik-web-ui-derivation │ └── authentik-api-client-generation ← IN PROGRESS └── authentik-python-backend-derivation ``` ### Ready Leaves - `authentik-api-client-generation` — Go + TypeScript client generation from OpenAPI schema - `authentik-python-backend-derivation` — Django backend with 60+ deps, 4 in-tree packages ### Architecture Ported from [nixpkgs `pkgs/by-name/au/authentik/package.nix`](https://github.com/NixOS/nixpkgs/tree/master/pkgs/by-name/au/authentik): - `source.nix` — shared version/source fetch - `client-go.nix` — Go API client generation - `client-ts.nix` — TypeScript API client generation - `api-go-vendor-hook.nix` — Go vendor directory injection hook - (more components to follow as leaves are closed) ### Related Cards - [[build-authentik-from-source]] — Goal card - [[authentik-api-client-generation]] - [[authentik-python-backend-derivation]] - [[authentik-web-ui-derivation]] - [[authentik-go-server-derivation]] Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/274	2026-03-01 13:45:00 -08:00
Erich Blume	33b7f0f353	Switch prometheus, teslamate, miniflux to forge mirrors Created miniflux mirror at mirrors/miniflux. All three containers now clone from forge.ops.eblu.me/mirrors/ instead of GitHub directly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-24 21:01:08 -08:00
Erich Blume	cd578144f7	Migrate upstream mirrors to mirrors/ Forgejo org (#265 ) ## Summary - Created `mirrors` Forgejo organization for upstream mirror repos - Transferred 22 mirror repos from `eblume/` to `mirrors/` (mirror sync config preserved) - Deleted unused repos: hajimari, hister - Updated all container build URLs (homepage, navidrome, ntfy Dockerfiles + nix) - Updated documentation references (migrate-forgejo-from-brew, upstream-fork-strategy, fix-ntfy-nix-version) - `dotfiles` intentionally kept under `eblume/` per user request - `devpi` transferred to `mirrors/` Repos remaining under `eblume/`: blumeops, cv, mcquack, dotfiles ## Cleanup TODO - [ ] Delete temp Forgejo API token "claude-migration-temp" (Settings > Applications) ## Test Plan - [x] Verified mirror config (mirror=true, original_url) survived transfer on test repo (tesla_auth) - [x] All pre-commit hooks pass (including container-version-check, docs-check-links) - [ ] Verify a mirror repo sync runs successfully after transfer (check mirrors/authentik or similar) - [ ] Rebuild containers from branch to verify Dockerfile URLs resolve Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/265	2026-02-24 20:43:14 -08:00
Erich Blume	2ba5d8a8aa	Port Prometheus to local container build (#262 ) ## Summary - Add three-stage Dockerfile for Prometheus v3.9.1 (Node UI → Go binaries → Alpine runtime) - Produces `prometheus` and `promtool` binaries with embedded web UI assets - Follows navidrome/ntfy pattern for supply chain control via Zot registry ## Deployment and Testing - [ ] `dagger call build --src=. --container-name=prometheus` succeeds - [ ] Container reports correct version via `prometheus --version` - [ ] `promtool --version` works - [ ] Update statefulset image reference after successful build - [ ] Deploy from branch: `argocd app set prometheus --revision <branch> && argocd app sync prometheus` - [ ] Health probes pass (`/-/healthy`, `/-/ready`) - [ ] Web UI loads, scrape targets work, remote write functions Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/262	2026-02-24 09:15:57 -08:00
Erich Blume	d05d2fbaff	C2: Upgrade Grafana to 12.x with Nix container and Kustomize (#260 ) ## Summary Mikado chain to upgrade Grafana from 11.4.0 (Helm chart) to 12.x with: - Home-built Nix container image (`forge.ops.eblu.me/eblume/grafana`) - Kustomize manifests replacing the Helm chart - Single-source ArgoCD app ## Chain Goal: `upgrade-grafana` Leaves: `build-grafana-container`, `kustomize-grafana-deployment` Track with: `mise run docs-mikado upgrade-grafana` ## Test plan - [ ] Container builds successfully via Nix - [ ] Container pushed to registry - [ ] Kustomize manifests produce equivalent resources to current Helm - [ ] Pod runs, UI loads, OIDC works, datasources healthy - [ ] `mise run services-check` passes Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/260	2026-02-23 18:07:18 -08:00
Erich Blume	4c5e0f0d16	Rename containers/forgejo-runner to runner-job-image The forgejo-runner container is the CI job execution environment (Dagger, ArgoCD CLI, etc.), not the runner daemon itself. Rename to runner-job-image to fix the version-check false positive (Dagger 0.19.11 vs daemon 12.7.0) and clarify the distinction. RUNNER_LABELS still references the old image name — will update after building the image under the new name. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 17:44:51 -08:00
Erich Blume	0e2c10176d	Harden zot registry, pt 1 (#231 ) ## Summary - Enable OIDC + API key authentication on zot with anonymous pull preserved - Enforce tag immutability for version tags - Adopt commit-SHA-based container image tagging Details in the [[harden-zot-registry]] Mikado chain (`mise run docs-mikado harden-zot-registry`). ## Test plan - [ ] Anonymous pull still works - [ ] Unauthenticated push fails (401) - [ ] CI container builds pass with new auth and tagging - [ ] `mise run services-check` passes 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/231	2026-02-20 22:50:01 -08:00
Erich Blume	71cb256527	Deploy Authentik identity provider (C2 Mikado) (#227 ) ## Summary C2 Mikado chain for deploying Authentik as the SSO identity provider, replacing Dex. This PR will evolve over multiple sessions. Each iteration adds documentation (prerequisite cards) and eventually code as leaf nodes are resolved. ## Current Mikado State - Goal: `deploy-authentik` (active) - Leaf prerequisites: - `build-authentik-container` — Build Nix container image - `provision-authentik-database` — Create PostgreSQL database on CNPG cluster - `create-authentik-secrets` — Create 1Password item with credentials ## Process refinements - Updated agent-change-process with lessons from first attempt: reset code before committing cards, open PRs early ## Test plan - [ ] `mise run docs-mikado` shows correct dependency chain - [ ] Leaf nodes can be worked independently - [ ] Container builds on ringtail - [ ] Authentik starts and reaches healthy state - [ ] Forgejo OAuth2 connector works Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/227	2026-02-20 12:55:59 -08:00
Erich Blume	0cdc143227	Deploy Dex OIDC identity provider with Grafana SSO (#222 ) ## Summary - Deploys Dex OIDC identity provider on ringtail k3s cluster as central authentication service - Integrates Grafana as first SSO client via `auth.generic_oauth` - Uses Kubernetes CRD storage backend (no PVC needed) - All secrets (bcrypt hash, client secrets) injected via ExternalSecrets from 1Password item "Dex (blumeops)" - NixOS-built container image via `containers/dex/default.nix` ## Pre-requisites (manual, before deployment) 1. Create 1Password item "Dex (blumeops)" in `blumeops` vault with fields: - `password`: strong generated password for Dex login - `static-password-hash`: bcrypt hash of above (`htpasswd -BnC 10 eblume`, copy hash after `eblume:`) - `grafana-client-secret`: random 32-char hex (`openssl rand -hex 16`) 2. Build container: `mise run container-tag-and-release dex v1.0.0` ## Deployment sequence 1. Build container: `mise run container-tag-and-release dex v1.0.0` 2. Deploy Caddy: `mise run provision-indri -- --tags caddy` 3. Sync ArgoCD: `argocd app sync apps` → `argocd app sync dex` 4. Verify Dex: `curl https://dex.ops.eblu.me/.well-known/openid-configuration` 5. Sync Grafana: `argocd app sync grafana-config` → `argocd app sync grafana` 6. Test SSO: Visit `https://grafana.ops.eblu.me/login`, click "Sign in with Dex" ## Verification - [ ] Container image exists: `mise run container-list` shows `dex:v1.0.0-nix` - [ ] `curl https://dex.ops.eblu.me/.well-known/openid-configuration` returns valid OIDC discovery - [ ] `curl https://dex.ops.eblu.me/healthz` returns healthy - [ ] Grafana login shows "Sign in with Dex" button alongside local login - [ ] OIDC flow: click Dex → enter credentials → redirect back → logged in as Admin - [ ] Break-glass: local admin login still works - [ ] `mise run services-check` passes ## Files changed \| File \| Action \| Purpose \| \|------\|--------\|---------\| \| `containers/dex/default.nix` \| Create \| NixOS container build \| \| `argocd/apps/dex.yaml` \| Create \| ArgoCD app targeting ringtail \| \| `argocd/manifests/dex/*` (8 files) \| Create \| K8s manifests (RBAC, ExternalSecret, Deployment, Service, Ingress) \| \| `argocd/manifests/grafana-config/external-secret-dex-oauth.yaml` \| Create \| Grafana OIDC client secret \| \| `argocd/manifests/grafana-config/kustomization.yaml` \| Modify \| Add new ExternalSecret resource \| \| `argocd/manifests/grafana/values.yaml` \| Modify \| Add `auth.generic_oauth` config + envFromSecrets \| \| `ansible/roles/caddy/defaults/main.yml` \| Modify \| Add `dex.ops.eblu.me` reverse proxy entry \| \| `docs/changelog.d/feature-dex-oidc.feature.md` \| Create \| Changelog fragment \| Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/222	2026-02-19 20:24:24 -08:00
Erich Blume	b876e39981	Replace Homepage Helm chart with kustomize manifests and custom Dockerfile (#221 ) ## Summary - Replace third-party Helm chart (jameswynn/homepage v2.1.0, pinned at app v1.2.0) with plain kustomize manifests and a custom Dockerfile building from forge mirror at v1.10.1 - Adds Dockerfile (`containers/homepage/`) with multi-stage build (node:22-slim builder, node:22-alpine runtime) - Creates kustomize manifests: Deployment, Service, ConfigMap (6 config files), ServiceAccount, ClusterRole, ClusterRoleBinding - Keeps existing ingress-tailscale.yaml and all 6 ExternalSecret resources unchanged - Updates ArgoCD app definition from multi-source Helm to single directory source ## Prerequisite - Homepage source mirrored at forge.ops.eblu.me/eblume/homepage.git ✅ - Container must be built and pushed before syncing: `mise run container-release homepage v1.10.1` ## Deployment and Testing - [ ] Build and push container image: `mise run container-release homepage v1.10.1` - [ ] Branch-test via ArgoCD: `argocd app set homepage --revision feature/homepage-kustomize && argocd app sync homepage` - [ ] Verify dashboard loads at go.ops.eblu.me / go.tail8d86e.ts.net - [ ] Verify k8s autodiscovery works (services appear on dashboard) - [ ] Verify widgets load (weather, Forgejo, Jellyfin, etc.) - [ ] After merge: `argocd app set homepage --revision main && argocd app sync homepage` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/221	2026-02-19 18:29:19 -08:00
Erich Blume	16a4a9a616	Port Mosquitto and ntfy to ringtail k3s, retire Apple Silicon Detector (#216 ) ## Summary - Delete `ansible/roles/frigate_detector/` and remove from indri playbook — the Apple Silicon Detector is retired - Move Mosquitto (MQTT) ArgoCD app from indri minikube to ringtail k3s - Move ntfy ArgoCD app from indri minikube to ringtail k3s - Update Frigate docs to reflect detector removal and planned RTX 4080 migration - Manifests are reused as-is (same `argocd/manifests/mosquitto/` and `argocd/manifests/ntfy/`), just pointed at ringtail ## Deployment After merge: 1. Sync indri ArgoCD `apps` app with prune to remove old mosquitto/ntfy apps: ``` argocd app sync apps --prune ``` 2. Sync new ringtail apps: ``` argocd app sync mosquitto-ringtail argocd app sync ntfy-ringtail ``` 3. Manually clean up the detector LaunchAgent on indri: ``` ssh indri 'launchctl unload ~/Library/LaunchAgents/mcquack.eblume.frigate-detector.plist' ssh indri 'rm ~/Library/LaunchAgents/mcquack.eblume.frigate-detector.plist' ``` ## Notes - Frigate on indri will lose MQTT/ntfy connectivity — this is expected (user confirmed no downtime concerns) - ntfy Tailscale Ingress hostname `ntfy` will transfer from indri ProxyGroup to ringtail ProxyGroup - Caddy on indri proxies `ntfy.ops.eblu.me` → `ntfy.tail8d86e.ts.net`, so no Caddy changes needed - Frigate + frigate-notify will be ported to ringtail in a follow-up PR 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/216	2026-02-19 11:22:44 -08:00
Erich Blume	695089499e	Nix container build for nettest (#214 ) ## Summary - Add `containers/nettest/default.nix` using `dockerTools.buildLayeredImage` with curl, jq, dnsutils, cacert, and bash — equivalent to the existing Dockerfile - Update `container-tag-and-release` to require `--nix` or `--dockerfile` flag when both build types exist for a container - Update `container-list` to show `[dockerfile+nix]` label when both exist ## Deployment and Testing - [ ] SSH to ringtail, run `nix build -f containers/nettest/default.nix -o result` to verify the nix expression builds - [ ] Tag `nettest-nix-v1.0.0`, confirm `build-container-nix` workflow runs on `nix-container-builder` runner and pushes to registry - [ ] Smoke test on ringtail k3s: `kubectl run nettest --image=registry.ops.eblu.me/blumeops/nettest:v1.0.0 --restart=Never && kubectl logs nettest` - [ ] Verify `mise run container-list` shows `[dockerfile+nix]` for nettest - [ ] Verify `mise run container-tag-and-release nettest v1.1.0` prompts for build type Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/214	2026-02-19 08:42:58 -08:00
Erich Blume	5fbe70d1ba	Port ntfy to locally built container image (#202 ) ## Summary - Add `containers/ntfy/Dockerfile` — three-stage build (Node web UI, Go+CGO server, Alpine runtime) pinned to commit SHA `a03a37fe` (v2.17.0), sourced from forge mirror - Update ntfy deployment image from `binwiederhier/ntfy:v2.17.0` to `registry.ops.eblu.me/blumeops/ntfy:v1.0.0` - Note fish shell in CLAUDE.md ## Deployment After merge, release the container image: ```fish mise run container-tag-and-release ntfy v1.0.0 ``` Then sync: ```fish argocd app sync ntfy ``` ## Test plan - [x] `docker build` succeeds - [x] `dagger call build --src=. --container-name=ntfy` succeeds (exit 0, container ID printed) - [x] `ntfy --help` works in built container - [ ] Tag and release `ntfy-v1.0.0` after merge - [ ] Verify ntfy pod starts with new image - [ ] Verify health endpoint responds at `ntfy.ops.eblu.me/v1/health` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/202	2026-02-17 10:18:20 -08:00
Erich Blume	74294094e3	Fix navidrome custom container image v1.0.2 (#194 ) ## Summary - Switch navidrome deployment from upstream `deluan/navidrome:0.60.3` back to custom image `registry.ops.eblu.me/blumeops/navidrome:v1.0.2` - The v1.0.1 image was tagged before the `USER 65534` removal commit, so it still ran as a non-root user that couldn't write to the SQLite data directory - v1.0.2 is built from current main which includes both the `zlib-dev` build fix and the non-root user removal ## Deployment and Testing - [ ] Wait for CI to build `navidrome:v1.0.2` image - [ ] Sync via ArgoCD and verify pod starts without CrashLoopBackOff - [ ] Verify navidrome UI accessible at https://navidrome.ops.eblu.me Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/194	2026-02-16 08:24:33 -08:00
Erich Blume	ad3ffbbf87	Remove non-root user from navidrome container The SQLite data directory needs write access, matching upstream behavior. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 08:20:58 -08:00
Erich Blume	982ce3dff3	Add zlib-dev to navidrome build stage for taglib linking Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-15 08:17:20 -08:00
Erich Blume	996441876d	Document container build pattern and port navidrome (#192 ) ## Summary - Add how-to guide (`docs/how-to/build-container-image.md`) covering the full container build workflow: directory layout, Dagger local builds, mise release task, and common patterns with links to existing containers - Port navidrome from upstream `deluan/navidrome:0.60.3` to a custom three-stage build (`containers/navidrome/Dockerfile`) using Node + Go + Alpine - Update navidrome deployment to use `registry.ops.eblu.me/blumeops/navidrome:v1.0.0` ## Deployment and Testing - [x] `dagger call build --src=. --container-name=navidrome` builds successfully - [ ] After merge: `mise run container-tag-and-release navidrome v1.0.0` - [ ] After image published: `argocd app sync navidrome` and verify pod starts Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/192	2026-02-15 08:05:11 -08:00
Erich Blume	b3747f6c95	Tier 1 version bumps (#186 ) ## Summary Audit and upgrade of all deployed images, helm charts, and custom container Dockerfiles to latest stable versions. This PR covers Tier 1 (low-risk minor/patch bumps only). ### Upstream images \| Image \| Old \| New \| \|-------\|-----\|-----\| \| kube-state-metrics \| v2.13.0 \| v2.18.0 \| \| prometheus \| v3.2.1 \| v3.9.1 \| \| loki \| 3.3.2 \| 3.6.5 \| \| alloy \| v1.5.1 \| v1.13.1 \| \| tailscale (proxy + operator) \| v1.92.5 \| v1.94.1 \| \| navidrome \| :latest \| v0.60.3 (pinned) \| ### Helm charts \| Chart \| Old \| New \| \|-------\|-----\|-----\| \| CloudNativePG \| v0.27.0 \| v0.27.1 \| \| 1Password Connect \| 2.2.1 \| 2.3.0 \| ### Custom containers (Dockerfiles updated, images not yet tagged) \| Container \| Changes \| New tag \| \|-----------\|---------\|---------\| \| miniflux \| 2.2.16→2.2.17 (security), alpine 3.22 \| v1.1.0 \| \| kubectl \| v1.34.1→v1.34.4, alpine 3.22 \| v1.1.0 \| \| kiwix-serve \| alpine 3.22 \| v1.1.0 \| \| nettest \| alpine 3.22 \| v0.14.0 \| \| transmission \| alpine 3.22, pkg 4.0.6-r4 \| v1.1.0 \| All custom containers verified with local `dagger call build`. ### Deferred to Tier 2 (separate PRs) - Forgejo runner 6→12 (major version scheme change) - Docker DinD 27→29 - Grafana chart 8→11 (repo migration) - External Secrets 1→2 (breaking changes) - Python 3.12→3.13, Elixir 1.18→1.19, Node 22→24 - Transmission 4.0.6→4.1.0 (not in Alpine yet) ## Deployment After merge: 1. Tag custom containers: `mise run container-tag-and-release <name> <version>` for each 2. Wait for CI builds to complete 3. `argocd app sync apps` then sync individual apps, or let ArgoCD auto-detect Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/186	2026-02-13 17:16:37 -08:00
Erich Blume	e364bdd238	Upgrade Node.js from 20 to 22 LTS (#182 ) ## Summary - Upgrade Dagger docs build image from `node:20-slim` to `node:22-slim` - Upgrade forgejo-runner container from Node 20 to Node 22 - Fixes Quartz 4.5.2 `EBADENGINE` warning (requires Node >= 22) - Node 20 EOL is 2026-04-30 Both builds verified locally via Dagger. ## Deployment 1. Merge this PR 2. Tag and release forgejo-runner v3.2.0: `mise run container-tag-and-release forgejo-runner v3.2.0` 3. Update RUNNER_LABELS version in `argocd/manifests/forgejo-runner/deployment.yaml` from `v3.1.0` to `v3.2.0` 4. `argocd app sync forgejo-runner` The Dagger docs build change takes effect immediately on merge (no container release needed). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/182	2026-02-13 11:07:41 -08:00
Erich Blume	2fad8db639	Add yq to forgejo-runner and replace sed YAML edits (#180 ) ## Summary - Install yq in the forgejo-runner container image for structured YAML editing - Replace fragile `sed` regex patterns with `yq` in `build-blumeops.yaml` and `cv-deploy.yaml` workflows ## Deployment 1. Merge this PR 2. Tag and release forgejo-runner v3.1.0: `mise run container-tag-and-release forgejo-runner v3.1.0` 3. Update runner label in `argocd/manifests/forgejo-runner/external-secret.yaml` from `v3.0.2` to `v3.1.0` 4. Sync the forgejo-runner app: `argocd app sync forgejo-runner` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/180	2026-02-13 10:20:27 -08:00
Erich Blume	01e19023ee	Add CV/resume web app at cv.ops.eblu.me (#169 ) ## Summary - nginx container (`containers/cv/`) downloads and serves a content tarball at startup (same pattern as quartz) - ArgoCD app + k8s manifests (deployment, service, Tailscale ingress) - Caddy route for `cv.ops.eblu.me` - Deploy workflow: resolves "latest" or specific version from Forgejo packages, updates deployment, syncs ArgoCD - Content is built and released from the separate [cv repo](https://forge.ops.eblu.me/eblume/cv) ## Deployment steps (after merge) 1. `mise run container-tag-and-release cv v1.0.0` 2. Run "Release CV" workflow in cv repo (SPECIFIC_VERSION `v0.1.0`) 3. Run "Deploy CV" workflow in blumeops (default: latest) 4. `mise run provision-indri -- --tags caddy` 5. Verify at `https://cv.ops.eblu.me/` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/169	2026-02-12 11:09:41 -08:00
Erich Blume	24d2a55f9e	Restore Docker CLI to runner image for Dagger engine (#164 ) ## Summary - Dagger shells out to the `docker` binary to provision its BuildKit engine container - Phase 3 removed `docker-ce-cli`, breaking all `dagger call` invocations in CI - This restores `docker-ce-cli` (without buildx/skopeo — those aren't needed) ## Test plan - [ ] Build locally, release as v3.0.2, update manifest, sync - [ ] Trigger docs build workflow and verify Dagger engine starts 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/164	2026-02-11 17:49:33 -08:00
Erich Blume	23fb036b92	Restore Node.js to runner image for JavaScript Actions (#163 ) ## Summary - Restores Node.js 20 LTS to the Forgejo runner job image - `actions/checkout@v4` and other JavaScript Actions require `node` in the job container - The Phase 3 simplification (PR #162) accidentally removed it, breaking all CI runs ## Changes - `containers/forgejo-runner/Dockerfile`: Add `gnupg` (for nodesource GPG key) and Node.js 20 via nodesource - Changelog fragment ## Test plan - [ ] Merge, release as `forgejo-runner-v3.0.1` - [ ] Update runner manifest to v3.0.1, sync, restart pod - [ ] Trigger a workflow_dispatch and verify `actions/checkout` succeeds 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/163	2026-02-11 17:35:33 -08:00
Erich Blume	95364dcb48	Simplify runner image (Dagger Phase 3) (#162 ) ## Summary With Phases 1 and 2 complete, the runner image no longer needs most of its bundled tools. This PR strips it down and adds what was missing. Removed (now inside Dagger containers): - Node.js 24.x - Docker CLI + buildx plugin - skopeo - gnupg, lsb-release, xz-utils Added: - `tzdata` — fixes the TZ env var (#159, #160, #161) so `TZ=America/Los_Angeles` actually works - `flyctl` — was being installed from scratch every release Workflow changes: - Remove "Ensure Dagger CLI" bootstrap steps from both workflows (Dagger is in the image) - Remove "Install flyctl" step from build-blumeops (flyctl is in the image) - Remove job-level `TZ` from build-blumeops (moved to runner configmap `runner.envs`) - Set `TZ: America/Los_Angeles` in runner configmap so all job containers inherit it ## Deployment After merge: 1. Build and release the new runner image: `mise run container-release forgejo-runner v2.0.0` 2. Sync the runner: `argocd app sync forgejo-runner` 3. Verify: `kubectl -n forgejo-runner exec deploy/forgejo-runner -c runner -- date` (but the real test is running a docs release and checking the changelog date) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/162	2026-02-11 17:24:20 -08:00
Erich Blume	1bc2b421a8	Adopt Dagger CI for container builds (Phase 1) (#156 ) ## Summary - Add Dagger Python module (`.dagger/`) with `build` and `publish` functions for container images - Replace Docker buildx + skopeo composite action with `dagger call publish` in `build-container.yaml` - BuildKit's native push is compatible with Zot — skopeo workaround eliminated - Add Dagger CLI (v0.19.11) to forgejo-runner Dockerfile, bump runner to v2.6.0 - Bootstrap step in workflow curl-installs dagger if not in runner (for first build on v2.5.1 runner) - Delete old `.forgejo/actions/build-push-image/` composite action - Add GPLv3 LICENSE ## Verified locally - `dagger call build --src=. --container-name=nettest` — builds ✓ - `dagger call publish --src=. --container-name=nettest --version=dagger-test` — pushed to Zot ✓ - `dagger call build --src=. --container-name=forgejo-runner` — new runner image builds ✓ - Dagger CLI accessible inside built runner image ✓ ## Deployment sequence (after merge) 1. `mise run container-tag-and-release forgejo-runner v2.6.0` — old runner bootstraps dagger via curl, builds new runner 2. `argocd app sync forgejo-runner` — runner restarts with v2.6.0 (dagger baked in) 3. `mise run container-tag-and-release nettest v0.13.0` — end-to-end test of new pipeline 4. `mise run container-list` — verify tags ## Not included (future phases) - Phase 2: docs build + Forgejo packages migration - Phase 3: runner simplification (remove skopeo, Node.js, etc.) - Phase 4: future workflows Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/156	2026-02-11 15:38:31 -08:00
Erich Blume	2fc5aa82b1	Add docker-buildx-plugin to forgejo-runner (#147 ) ## Summary - Install `docker-buildx-plugin` alongside `docker-ce-cli` in the forgejo-runner image - Fixes `docker buildx build` failing with "unknown flag: --tag" from #146 ## Test plan - [ ] Merge and release `forgejo-runner-v2.5.1` - [ ] Update runner configmap/labels if needed to use new image - [ ] Re-tag `nettest-v0.11.1` (or `v0.12.0`) to verify build-container workflow succeeds Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/147	2026-02-10 21:14:29 -08:00
Erich Blume	cb36f1784f	Switch CI builds to docker buildx (#146 ) ## Summary - Replace deprecated `docker build` with `docker buildx build` in the build-push-image composite action - Remove redundant build/run comments from nettest Dockerfile ## Test plan - [ ] Merge and tag `nettest-v1.1.0` (or similar) to trigger the build-container workflow - [ ] Verify the build succeeds without the deprecation warning Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/146	2026-02-10 21:03:41 -08:00
Erich Blume	09317fecc1	Fix argocd CLI download in forgejo-runner image - Add -L flag to follow redirects - Add -f flag to fail on HTTP errors - Use dpkg --print-architecture as fallback for TARGETARCH - Verify binary works after download Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-03 17:11:36 -08:00
Erich Blume	1f73eb675d	Auto-deploy docs from build workflow (#93 ) ## Summary - Add `uv` and `argocd` CLI to forgejo-runner container image - Add `workflow-bot` ArgoCD account with sync permissions (declarative via kustomize patches) - Add `ARGOCD_AUTH_TOKEN` to forgejo-runner external secret for workflow auth - Update build workflow to auto-deploy docs after release: - Update configmap with new release URL - Commit changelog and configmap changes - Sync docs app via ArgoCD ## Deployment and Testing Manual steps required before this can work: 1. [ ] Build and push new forgejo-runner image (v2.4.0) 2. [ ] Sync argocd app to create workflow-bot account 3. [ ] Generate token: `argocd account generate-token --account workflow-bot` 4. [ ] Store token in 1Password under "Forgejo Secrets" with field `argocd_token` 5. [ ] Sync forgejo-runner app to pick up new external secret 6. [ ] Update forgejo-runner deployment to use new image version 7. [ ] Test by running workflow manually 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/93	2026-02-03 16:58:03 -08:00
Erich Blume	1c86134a62	Phase 1b: Deploy docs hosting with Quartz (#85) ## Summary - Add ArgoCD Application and manifests for `quartz` service - Add `docs.ops.eblu.me` to Caddy reverse proxy configuration - ConfigMap points to blumeops v1.0.0 release tarball - Tailscale ingress with homepage annotations for auto-discovery ## Deployment and Testing Pre-deployment (container build): - [ ] Build and tag quartz container: `mise run container-tag-and-release quartz v1.0.0` K8s deployment: - [ ] Sync apps: `argocd app sync apps` - [ ] Point quartz at feature branch: `argocd app set quartz --revision feature/docs-phase-1b-hosting` - [ ] Sync quartz: `argocd app sync quartz` - [ ] Verify pod is running: `kubectl --context=minikube-indri get pods -n quartz` - [ ] Verify Tailscale ingress: `kubectl --context=minikube-indri get ingress -n quartz` Caddy deployment: - [ ] Dry run: `mise run provision-indri -- --tags caddy --check --diff` - [ ] Apply: `mise run provision-indri -- --tags caddy` Verification: - [ ] Test https://docs.tail8d86e.ts.net - [ ] Test https://docs.ops.eblu.me - [ ] Verify homepage dashboard shows docs link Post-merge: - [ ] Reset to main: `argocd app set quartz --revision main && argocd app sync quartz` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/85	2026-02-03 10:52:20 -08:00
Erich Blume	d870694e10	Upgrade forgejo-runner to Node.js 24.x LTS Quartz now requires Node.js >= 22. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-03 09:19:58 -08:00
Erich Blume	b8104d75ad	Move zk cards to docs/zk/ for documentation restructuring (#84) ## Summary - Move all existing zettelkasten cards from `docs/` to `docs/zk/` as a temporary holding area - Update `zk-docs` mise task to look in the new location - Add `docs/README.md` explaining the Diataxis-based restructuring plan and target audiences ## Context This is phase 1 of a multi-phase documentation restructuring effort. The goal is to reorganize docs to follow the Diataxis framework while serving multiple audiences: 1. Erich (owner) - knowledge graph/zk 2. Claude/AI agents - memory and context enrichment 3. New external readers - high-level overview 4. Potential operators/contributors - onboarding 5. Replicators - people wanting to duplicate the approach ## Testing - [x] Verified `mise run zk-docs` still works with the new path - [x] Updated obsidian.nvim config (in ~/.config/nvim) to point to new path ## Note The obsidian.nvim config change is outside this repo but was made as part of this work. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/84	2026-02-03 09:13:50 -08:00
Erich Blume	dc974858b0	Add skopeo to forgejo-runner image Pre-install skopeo for pushing images to zot registry. Docker 27's manifest format has compatibility issues with zot, so we use skopeo for the push step. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-30 11:11:19 -08:00
Erich Blume	c8b655f177	Build local containers for k8s services (#61) ## Summary - Move devpi Dockerfile from argocd/manifests to containers/devpi/ - Add containers for: transmission, teslamate, miniflux, kiwix-serve, kubectl - Update all k8s deployments to use local images (registry.ops.eblu.me/blumeops/*) - All containers use v1.0.0 tag for initial release ## Containers Added

| Container | Source | Notes |
|-----------|--------|-------|
| devpi | python:3.12-slim | Existing, moved to containers/ |
| kubectl | alpine + download | For zim-watcher CronJob |
| miniflux | Go build from source | v2.2.16 |
| kiwix-serve | Download pre-built binary | v3.8.1 |
| transmission | alpine + apk install | Simpler than linuxserver image |
| teslamate | Elixir build from source | v2.2.0 |

## Deployment and Testing - [ ] Build and tag devpi-v1.0.0 - [ ] Build and tag kubectl-v1.0.0 - [ ] Build and tag miniflux-v1.0.0 - [ ] Build and tag kiwix-serve-v1.0.0 - [ ] Build and tag transmission-v1.0.0 - [ ] Build and tag teslamate-v1.0.0 - [ ] Sync ArgoCD apps and verify services 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/61	2026-01-25 21:35:57 -08:00
Erich Blume	ea42362b6f	Migrate Forgejo runner to Kubernetes with DinD (#60) ## Summary - Deploy Forgejo runner to k8s with Docker-in-Docker sidecar - Add job execution image with Node.js and Docker CLI - Retire host-mode runner on indri - All CI jobs now run containerized in k8s ## Components Added - `containers/forgejo-runner/Dockerfile` - Job execution image - `argocd/apps/forgejo-runner.yaml` - ArgoCD Application - `argocd/manifests/forgejo-runner/` - Kubernetes manifests ## Components Removed - `ansible/roles/forgejo_runner/` - No longer needed ## Changes to Existing Files - `.forgejo/workflows/build-container.yaml` - Use `k8s` runner with `DOCKER_HOST` env - `.github/actionlint.yaml` - Only `k8s` label now valid ## Deployment 1. Apply secret: `op inject -i argocd/manifests/forgejo-runner/secret.yaml.tpl | kubectl --context=minikube-indri apply -f -` 2. Sync ArgoCD: `argocd app sync forgejo-runner` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/60	2026-01-25 19:56:17 -08:00
Erich Blume	d6e6b48f6a	Migrate registry to Caddy (registry.ops.eblu.me) (#58) ## Summary - Update all references from `registry.tail8d86e.ts.net` to `registry.ops.eblu.me` - Remove `tailscale_serve` ansible role (no longer needed - all services migrated to Caddy) - Update minikube containerd config for new registry URL - Update devpi manifest, CI actions, and mise tasks ## Deployment and Testing - [ ] Run `mise run provision-indri -- --check --diff` (dry run) - [ ] Run `mise run provision-indri -- --tags minikube` to update containerd config - [ ] Sync devpi ArgoCD app: `argocd app sync devpi` - [ ] Manually remove old Tailscale serve entry: `ssh indri 'tailscale serve --service=svc:registry off'` - [ ] Test registry access: `curl https://registry.ops.eblu.me/v2/_catalog` - [ ] Run `mise run indri-services-check` to verify all services healthy 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/58	2026-01-25 12:06:15 -08:00
Erich Blume	1184b4de1d	Add Caddy layer4 for Forgejo SSH (#56) ## Summary - Add layer4 TCP proxy configuration to Caddyfile template for SSH services - Configure Forgejo SSH on port 2222 → localhost:2200 - Switch HTTPS from port 8443 (testing) to 443 (production) - Requires Caddy rebuilt with `github.com/mholt/caddy-l4` plugin ## What This Enables Git+SSH access via `forge.ops.eblu.me:2222` is now accessible from: - Tailnet clients (gilbert) - Docker containers on indri - Kubernetes pods in minikube This solves the DNS resolution issues where containers couldn't reach Tailscale MagicDNS names. ## Testing Done - [x] Caddy rebuilt with layer4 plugin - [x] Validated Caddyfile syntax - [x] Cleared `svc:forge` from tailscale serve - [x] Verified HTTPS works: `curl https://forge.ops.eblu.me` - [x] Verified SSH works: `ssh -p 2222 forgejo@forge.ops.eblu.me` - [x] Verified git clone works via new endpoint - [x] Verified minikube pods can reach both HTTPS and SSH endpoints ## Deployment Caddy is already running with the new config on indri. This PR captures the ansible changes. ## Next Steps - Update zk docs with new git remote format - Migrate registry and other services to Caddy - Retire tailscale_services ansible role 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/56	2026-01-25 11:37:23 -08:00
Erich Blume	31697b4d63	Add nettest container for CI/CD network debugging (#52) ## Summary - Add `containers/nettest/` with Alpine-based Dockerfile and connectivity test script - Add `.forgejo/workflows/build-nettest.yaml` workflow triggered by `nettest-v*` tags - Test script checks DNS resolution and HTTPS connectivity to forge and registry ## Deployment and Testing - [ ] Merge PR to main - [ ] Run `mise run container-release nettest v0.1.0` to trigger first build - [ ] Verify workflow runs successfully and container can reach tailnet services - [ ] Manually test from minikube: `kubectl run nettest --rm -it --image=registry.tail8d86e.ts.net/blumeops/nettest:v0.1.0` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/52	2026-01-24 16:54:35 -08:00

39 commits