blumeops

Author	SHA1	Message	Date
Erich Blume	fdd3f6483a	Update forgejo-runner image to v3.2.0 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 11:08:57 -08:00
Forgejo Actions	02b1397f1a	Update docs release to v1.8.2 - Built changelog from towncrier fragments [skip ci]	2026-02-13 10:36:04 -08:00
Erich Blume	0098ac37e0	Move non-secret runner env vars to deployment spec (#181 ) ## Summary - Move FORGEJO_URL, RUNNER_NAME, and RUNNER_LABELS from ExternalSecret template to deployment env vars - ExternalSecret now only contains the actual secret (RUNNER_TOKEN) - Image version changes in RUNNER_LABELS now trigger automatic pod rollouts ## Deployment 1. Merge this PR 2. `argocd app sync forgejo-runner` — the deployment spec change will auto-roll the pod No manual restart needed — that's the whole point :) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/181	2026-02-13 10:29:23 -08:00
Erich Blume	52bbf88aa6	Update forgejo-runner image to v3.1.0 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 10:21:43 -08:00
Erich Blume	4942dee182	Update homepage layout for new Content/Misc groups Replace old Apps/Observability/Infrastructure layout entries with Content and Misc to match the recategorized ingress annotations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 09:16:40 -08:00
Erich Blume	ca6a845604	Move ArgoCD to Misc homepage group and rename ingress file ArgoCD's tailscale ingress was missed in the recategorization (filed as service-tailscale.yaml instead of ingress-tailscale.yaml). Fix the group annotation and rename the file to match the convention used by all other services. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 09:13:32 -08:00
Erich Blume	48ce5b4120	Recategorize homepage into Content and Misc groups (#179 ) ## Summary - Replace the three homepage groups (Apps, Observability, Infrastructure) with two cleaner groups - Content: Immich, Kiwix, Miniflux, DJ, Grafana - Misc: CV, TeslaMate, Transmission, Docs, Prometheus, PyPI ## Deployment and Testing - [ ] Sync affected ingresses via ArgoCD (all 11 services) - [ ] Verify homepage shows the two new groups correctly Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/179	2026-02-13 09:09:22 -08:00
Forgejo Actions	e21277ae83	Update docs release to v1.8.0 - Built changelog from towncrier fragments [skip ci]	2026-02-12 19:20:27 -08:00
Erich Blume	9c789a1868	Fix cache hit rate on APM and Fly.io dashboards (#177 ) ## Summary - Remove `match_all = true` from `flyio_nginx_cache_requests_total` in Alloy so the metric only counts requests that go through the proxy cache (excludes health checks with empty `cache_status`) - Change dashboard queries from `rate(...[5m])` to `increase(...[$__range])` — aggregates over the full dashboard time window instead of a 5-minute sliding window, giving meaningful ratios for low-traffic static sites - Add null/NaN value mapping to show "No traffic" in neutral color instead of blank/red ## Root cause Health check requests from Fly.io hit the default nginx server block (no `proxy_cache`), producing entries with empty `upstream_cache_status`. With `match_all = true`, these were counted in the cache metric, diluting the Fly.io dashboard ratio. For APM dashboards, `rate()[5m]` on low-traffic sites with 24h cache validity almost always returns either all-HITs (100%) or no data (blank → red background). ## Deployment - Fly.io proxy redeploy needed for Alloy config change - ArgoCD sync for dashboard ConfigMap changes ## Test plan - [ ] Redeploy Fly.io proxy - [ ] Sync grafana-config in ArgoCD - [ ] Verify CV APM cache hit ratio shows a real percentage (not 100%) - [ ] Verify Docs APM shows "No traffic" in neutral color when idle, real ratio when visited - [ ] Verify Fly.io proxy dashboard cache ratio excludes health checks Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/177	2026-02-12 18:40:48 -08:00
Erich Blume	9717863f65	Update CV release to v1.0.3, add X-Clacks-Overhead header (#176 ) ## Summary - Update CV release URL from v1.0.2 to v1.0.3 - Add `X-Clacks-Overhead: GNU Terry Pratchett` header to both `docs.eblu.me` and `cv.eblu.me` server blocks in the Fly.io proxy nginx config ## Deployment and Testing - [ ] Sync CV app: `argocd app sync cv` - [ ] Verify CV is serving v1.0.3 content - [ ] Deploy fly proxy (workflow or `mise run fly-deploy`) - [ ] Verify header: `curl -sI https://docs.eblu.me | grep -i clacks` - [ ] Verify header: `curl -sI https://cv.eblu.me | grep -i clacks` Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/176	2026-02-12 17:08:22 -08:00
Erich Blume	ed5c9c9b48	Update CV release to v1.0.2 (#175 ) ## Summary - Update `CV_RELEASE_URL` in cv deployment from v1.0.1 to v1.0.2 ## Deployment and Testing - [ ] `argocd app sync cv` after merge - [ ] Verify cv.eblu.me serves updated content Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/175	2026-02-12 16:18:55 -08:00
Forgejo Actions	70d8881959	Update docs release to v1.7.1 - Built changelog from towncrier fragments [skip ci]	2026-02-12 14:13:12 -08:00
Erich Blume	7dc03c0af1	Add CV to services-check, update homepage link (#174 ) ## Summary - Add CV to services-check (tailnet endpoint + public cv.eblu.me) - Update CV homepage annotation to point to cv.eblu.me instead of cv.ops.eblu.me ## Deployment and Testing - [ ] `argocd app sync cv` (homepage link change) - [ ] `mise run services-check` passes Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/174	2026-02-12 14:10:03 -08:00
Erich Blume	df372fccb6	Expose CV publicly at cv.eblu.me (#173 ) ## Summary - Add nginx server block for `cv.eblu.me` (static site, same pattern as docs) - Add DNS CNAME record in Pulumi (`cv.eblu.me` → `blumeops-proxy.fly.dev`) - Add `cv.eblu.me` cert to `fly-setup` mise task - Tag CV Tailscale ingress with `tag:flyio-target` for ACL access - Remove `/_error` test endpoint from docs proxy ## Deployment and Testing - [ ] `argocd app set cv --revision cv/public-cv-eblu-me && argocd app sync cv` - [ ] `fly certs add cv.eblu.me -a blumeops-proxy` - [ ] `mise run fly-deploy` - [ ] Verify proxy: `curl -I -H "Host: cv.eblu.me" https://blumeops-proxy.fly.dev/` - [ ] `mise run dns-preview` then `mise run dns-up` - [ ] Verify live: `curl -I https://cv.eblu.me` - [ ] Merge, then `argocd app set cv --revision main && argocd app sync cv` Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/173	2026-02-12 14:05:00 -08:00
Erich Blume	a68542a602	Update CV release to v1.0.1 (#172 ) ## Summary - Update `CV_RELEASE_URL` from v0.1.0 to v1.0.1 in the CV deployment manifest ## Deployment and Testing - [ ] `argocd app sync cv` after merge - [ ] Verify cv.ops.eblu.me serves updated resume Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/172	2026-02-12 13:38:05 -08:00
Forgejo Actions	200be39492	Update docs release to v1.7.0 - Built changelog from towncrier fragments [skip ci]	2026-02-12 11:46:38 -08:00
Erich Blume	01e19023ee	Add CV/resume web app at cv.ops.eblu.me (#169 ) ## Summary - nginx container (`containers/cv/`) downloads and serves a content tarball at startup (same pattern as quartz) - ArgoCD app + k8s manifests (deployment, service, Tailscale ingress) - Caddy route for `cv.ops.eblu.me` - Deploy workflow: resolves "latest" or specific version from Forgejo packages, updates deployment, syncs ArgoCD - Content is built and released from the separate [cv repo](https://forge.ops.eblu.me/eblume/cv) ## Deployment steps (after merge) 1. `mise run container-tag-and-release cv v1.0.0` 2. Run "Release CV" workflow in cv repo (SPECIFIC_VERSION `v0.1.0`) 3. Run "Deploy CV" workflow in blumeops (default: latest) 4. `mise run provision-indri -- --tags caddy` 5. Verify at `https://cv.ops.eblu.me/` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/169	2026-02-12 11:09:41 -08:00
Forgejo Actions	a800bdc8b9	Update docs release to v1.6.9 - Built changelog from towncrier fragments [skip ci]	2026-02-11 21:28:40 -08:00
Forgejo Actions	b36b30ef7a	Update docs release to v1.6.8 - Built changelog from towncrier fragments [skip ci]	2026-02-11 21:12:50 -08:00
Forgejo Actions	0528a6f712	Update docs release to v1.6.7 - Built changelog from towncrier fragments [skip ci]	2026-02-11 18:07:11 -08:00
Forgejo Actions	9e0487b523	Update docs release to v1.6.6 - Built changelog from towncrier fragments [skip ci]	2026-02-11 17:57:59 -08:00
Erich Blume	20a25557d6	Bump runner image to v3.0.2 (restore Docker CLI) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-11 17:53:04 -08:00
Erich Blume	0006e6bf17	Bump runner image to v3.0.1 (restore Node.js) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-11 17:38:45 -08:00
Erich Blume	3d84483513	Update runner job image to forgejo-runner:v3.0.0 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-11 17:27:50 -08:00
Erich Blume	95364dcb48	Simplify runner image (Dagger Phase 3) (#162 ) ## Summary With Phases 1 and 2 complete, the runner image no longer needs most of its bundled tools. This PR strips it down and adds what was missing. Removed (now inside Dagger containers): - Node.js 24.x - Docker CLI + buildx plugin - skopeo - gnupg, lsb-release, xz-utils Added: - `tzdata` — fixes the TZ env var (#159, #160, #161) so `TZ=America/Los_Angeles` actually works - `flyctl` — was being installed from scratch every release Workflow changes: - Remove "Ensure Dagger CLI" bootstrap steps from both workflows (Dagger is in the image) - Remove "Install flyctl" step from build-blumeops (flyctl is in the image) - Remove job-level `TZ` from build-blumeops (moved to runner configmap `runner.envs`) - Set `TZ: America/Los_Angeles` in runner configmap so all job containers inherit it ## Deployment After merge: 1. Build and release the new runner image: `mise run container-release forgejo-runner v2.0.0` 2. Sync the runner: `argocd app sync forgejo-runner` 3. Verify: `kubectl -n forgejo-runner exec deploy/forgejo-runner -c runner -- date` (but the real test is running a docs release and checking the changelog date) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/162	2026-02-11 17:24:20 -08:00
Forgejo Actions	996afbcf6f	Update docs release to v1.6.5 [skip ci]	2026-02-11 17:10:29 -08:00
Forgejo Actions	6ce03df819	Update docs release to v1.6.4 - Built changelog from towncrier fragments [skip ci]	2026-02-12 01:01:23 +00:00
Erich Blume	2a04ab26b7	Mount host zoneinfo into runner for TZ support (#160 ) ## Summary The `TZ=America/Los_Angeles` env var from #159 has no effect because the `forgejo/runner` image doesn't ship tzdata. Mount the node's `/usr/share/zoneinfo` into the container so the timezone database is available. ## Deployment After merge, sync forgejo-runner and verify: ``` argocd app sync forgejo-runner kubectl -n forgejo-runner exec deploy/forgejo-runner -c runner -- date # Should show PST/PDT, not UTC ``` Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/160	2026-02-11 16:57:11 -08:00
Erich Blume	42ebc2b122	Fix Forgejo runner timezone (UTC -> America/Los_Angeles) (#159 ) ## Summary - Set `TZ=America/Los_Angeles` on the Forgejo runner container The runner pod defaults to UTC. When releases are cut in the evening PST, towncrier stamps changelog entries with tomorrow's date (e.g., v1.6.2 shows 2026-02-12 despite being released on the evening of Feb 11 PST). ## Deployment After merge, sync the forgejo-runner ArgoCD app: ``` argocd app sync forgejo-runner ``` The runner pod will restart with the new timezone. Note: the v1.6.2 changelog entry will remain dated 2026-02-12; future entries will use PST dates, so dates may appear non-sequential once. Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/159	2026-02-11 16:53:41 -08:00
Forgejo Actions	e5d1e795e0	Update docs release to v1.6.3 [skip ci]	2026-02-12 00:46:35 +00:00
Forgejo Actions	a75089d8ef	Update docs release to v1.6.2 - Built changelog from towncrier fragments [skip ci]	2026-02-12 00:35:02 +00:00
Erich Blume	1bc2b421a8	Adopt Dagger CI for container builds (Phase 1) (#156 ) ## Summary - Add Dagger Python module (`.dagger/`) with `build` and `publish` functions for container images - Replace Docker buildx + skopeo composite action with `dagger call publish` in `build-container.yaml` - BuildKit's native push is compatible with Zot — skopeo workaround eliminated - Add Dagger CLI (v0.19.11) to forgejo-runner Dockerfile, bump runner to v2.6.0 - Bootstrap step in workflow curl-installs dagger if not in runner (for first build on v2.5.1 runner) - Delete old `.forgejo/actions/build-push-image/` composite action - Add GPLv3 LICENSE ## Verified locally - `dagger call build --src=. --container-name=nettest` — builds ✓ - `dagger call publish --src=. --container-name=nettest --version=dagger-test` — pushed to Zot ✓ - `dagger call build --src=. --container-name=forgejo-runner` — new runner image builds ✓ - Dagger CLI accessible inside built runner image ✓ ## Deployment sequence (after merge) 1. `mise run container-tag-and-release forgejo-runner v2.6.0` — old runner bootstraps dagger via curl, builds new runner 2. `argocd app sync forgejo-runner` — runner restarts with v2.6.0 (dagger baked in) 3. `mise run container-tag-and-release nettest v0.13.0` — end-to-end test of new pipeline 4. `mise run container-list` — verify tags ## Not included (future phases) - Phase 2: docs build + Forgejo packages migration - Phase 3: runner simplification (remove skopeo, Node.js, etc.) - Phase 4: future workflows Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/156	2026-02-11 15:38:31 -08:00
Forgejo Actions	362ae22ab7	Update docs release to v1.6.1 - Built changelog from towncrier fragments [skip ci]	2026-02-11 21:37:34 +00:00
Forgejo Actions	eca01a9546	Update docs release to v1.6.0 - Built changelog from towncrier fragments [skip ci]	2026-02-11 21:33:57 +00:00
Forgejo Actions	ab6661f5dd	Update docs release to v1.5.4 - Built changelog from towncrier fragments [skip ci]	2026-02-11 20:17:12 +00:00
Forgejo Actions	a106f92c38	Update docs release to v1.5.3 - Built changelog from towncrier fragments [skip ci]	2026-02-11 15:53:49 +00:00
Erich Blume	f0ac04fb8a	Bootstrap buildx: revert to docker build, bump runner to v2.5.1 (#148 ) ## Summary - Temporarily revert composite action to `docker build` so we can build the runner image (chicken-and-egg: current runner v2.5.0 doesn't have buildx) - Bump runner label to `v2.5.1` so after sync the new runner image (with buildx) gets used ## Deployment plan 1. Merge this PR 2. Tag `forgejo-runner-v2.5.1` — builds with legacy `docker build` (one last time) 3. Sync forgejo-runner in ArgoCD to pick up the v2.5.1 label 4. Follow-up PR: switch action back to `docker buildx build` 5. Tag `nettest-v0.12.0` to verify buildx works end-to-end Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/148	2026-02-10 21:17:14 -08:00
Erich Blume	85e36cd807	Operations and observability for sifaka NAS (#135 ) ## Summary - Add `smartctl_exporter` Docker container to sifaka for SMART disk health monitoring - Formalize existing `node_exporter` container under Ansible management - Route both exporters through Caddy L4 TCP proxy (`nas.ops.eblu.me:9100`, `nas.ops.eblu.me:9633`), replacing the hardcoded LAN IP in Prometheus - Create "Sifaka Disk Health" Grafana dashboard (health status, temperature, wear indicators, lifetime) - Introduce `ansible/playbooks/sifaka.yml` and `mise run provision-sifaka` — first Ansible playbook for the NAS - Shared exporter port variables in `group_vars/all.yml` to avoid duplication between Caddy and sifaka roles ## Prerequisites before deploy - [ ] Enable SSH on sifaka (DSM Control Panel > Terminal & SNMP) - [ ] Verify `ssh eblume@sifaka 'docker ps'` works - [ ] Run `mise run provision-sifaka` to deploy containers - [ ] Run `mise run provision-indri -- --tags caddy` to add L4 routes - [ ] `argocd app sync prometheus` + `argocd app sync grafana-config` ## Test plan - [ ] Verify smartctl_exporter metrics: `curl http://nas.ops.eblu.me:9633/metrics` - [ ] Verify Prometheus targets page shows both sifaka jobs as UP - [ ] Verify Grafana "Sifaka Disk Health" dashboard loads with data 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/135	2026-02-09 17:44:05 -08:00
Erich Blume	3415cad38c	Log real client IPs via Fly-Client-IP header (#130 ) ## Summary - Add `client_ip` field to the Fly.io nginx JSON log format, sourced from `Fly-Client-IP` header - Extract `client_ip` in the Alloy pipeline so it's available as a parsed field in Loki - Keeps `remote_addr` (the internal proxy IP) for debugging Fixes: Grafana access logs for docs.eblu.me showing 172.16.11.178 for every request instead of real visitor IPs. ## Deployment and Testing - [ ] Deploy updated fly.io proxy: `fly deploy` from `fly/` directory - [ ] Verify in Grafana that new log lines include `client_ip` with real IPs - [ ] Confirm `remote_addr` still shows the proxy IP (preserved for debugging) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/130	2026-02-09 11:02:06 -08:00
Forgejo Actions	92a1081302	Update docs release to v1.5.2 - Built changelog from towncrier fragments [skip ci]	2026-02-09 15:30:21 +00:00
Erich Blume	a0b076172f	Fix Immich/Homepage Ingress host matching, add missing service checks (#127 ) ## Summary - Fix Immich Ingress `host: photos` causing 404 with ProxyGroup (same FQDN mismatch as Prometheus/Loki) - Migrate Homepage from old per-service Tailscale proxy to shared ProxyGroup (was the last holdout) - Add Immich and Navidrome to `services-check` HTTP endpoints ## Deployment Notes - Already tested on branch: Immich and Homepage both return 200 via Caddy - Homepage's old Helm-managed Ingress was deleted manually; ArgoCD may recreate it on sync — prune with `argocd app sync homepage --prune` after merge - Old per-service `ts-homepage-*` pod in tailscale namespace can be cleaned up after confirming ProxyGroup works ## Test Plan - [x] `curl https://photos.ops.eblu.me/` returns 200 - [x] `curl https://go.ops.eblu.me/` returns 200 - [ ] `mise run services-check` fully passes after merge Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/127	2026-02-08 22:12:50 -08:00
Erich Blume	e6cf7e47e0	Restrict flyio-proxy ACLs to dedicated tag:flyio-target endpoints (#126 ) ## Summary - Introduce `tag:flyio-target` so services must explicitly opt in to be reachable by the fly.io proxy - Replace broad `tag:k8s` and `tag:homelab` grants with the new tag in the ACL rule and test - Add `tailscale.com/tags: "tag:k8s,tag:flyio-target"` annotation to docs, loki, and prometheus Ingresses - Switch Alloy push endpoints from `.ops.eblu.me` (Caddy) to `.tail8d86e.ts.net` (Tailscale Ingress) - Update docs: flyio-proxy, caddy, tailscale, forgejo (future public access + security checklist), expose-service-publicly ## Manual step (not in PR) Update the k8s operator OAuth client in the Tailscale admin console to include `tag:flyio-target` in its scope. Without this, the operator cannot assign the new tag to Ingress proxy nodes. ## Deployment order 1. Pulumi ACLs — `mise run tailnet-preview && mise run tailnet-up` 2. OAuth client — Manual update in Tailscale admin console 3. K8s Ingresses — `argocd app sync apps && argocd app sync docs loki prometheus` 4. Fly.io proxy — `mise run fly-deploy` 5. Verify — `mise run services-check`, check Grafana dashboards ## Test plan - [ ] `mise run tailnet-preview` shows clean diff - [ ] `argocd app diff docs`, `argocd app diff loki`, `argocd app diff prometheus` show only annotation additions - [ ] After deploy: Grafana dashboards show continued log/metric flow - [ ] `curl -sf https://docs.eblu.me` returns 200 - [ ] `mise run services-check` passes 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/126	2026-02-08 21:54:18 -08:00
Forgejo Actions	c8d0af6644	Update docs release to v1.5.1 - Built changelog from towncrier fragments [skip ci]	2026-02-08 18:06:46 +00:00
Erich Blume	cc54b4f565	Add Fly.io proxy observability via embedded Alloy (#123 ) ## Summary - Embed Grafana Alloy in the Fly.io proxy container to collect nginx JSON access logs (→ Loki) and derive request rate, latency histogram, cache status, and bandwidth metrics (→ Prometheus) - Add nginx `stub_status` endpoint for connection-level metrics (active/reading/writing/waiting) - Create two Grafana dashboards: Docs APM (per-service view filtered by `host="docs.eblu.me"`) and Fly.io Proxy Health (aggregate proxy health across all upstream services) ## Changed Files | File | Change | |------|--------| | `fly/nginx.conf` | Add JSON `log_format` + `access_log`, add `stub_status` endpoint | | `fly/Dockerfile` | COPY Alloy binary from `grafana/alloy:v1.5.1`, COPY `alloy.river` config | | `fly/alloy.river` | New — Alloy config: log tailing, metric extraction, remote_write | | `fly/start.sh` | Start Alloy after Tailscale, before nginx | | `argocd/manifests/grafana-config/dashboards/configmap-docs-apm.yaml` | New — Docs APM dashboard | | `argocd/manifests/grafana-config/dashboards/configmap-flyio.yaml` | New — Fly.io Proxy Health dashboard | | `argocd/manifests/grafana-config/kustomization.yaml` | Register new dashboard configmaps | | `docs/reference/services/flyio-proxy.md` | Document observability setup | ## Deployment and Testing - [ ] `mise run fly-deploy` — rebuild container with Alloy - [ ] `curl https://docs.eblu.me/` — generate traffic - [ ] `fly logs -a blumeops-proxy` — verify Alloy startup - [ ] Query Prometheus: `flyio_nginx_http_requests_total{instance="flyio-proxy"}` - [ ] Query Loki: `{instance="flyio-proxy", job="flyio-nginx"}` - [ ] `argocd app sync grafana-config` — deploy dashboards - [ ] Verify dashboards show data in Grafana - [ ] `mise run services-check` — no regressions Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/123	2026-02-08 10:05:38 -08:00
Forgejo Actions	c46d55060d	Update docs release to v1.5.0 - Built changelog from towncrier fragments [skip ci]	2026-02-08 10:37:30 +00:00
Erich Blume	64a78422b1	Add Fly.io public reverse proxy for docs.eblu.me (#120 ) ## Summary - Adds a Fly.io reverse proxy (`blumeops-proxy`) that tunnels public traffic to homelab services over Tailscale - First service exposed: `docs.eblu.me` — the Quartz static docs site - Includes Pulumi IaC for Tailscale auth key/ACLs and Gandi DNS CNAME - Adds mise tasks (`fly-deploy`, `fly-setup`, `fly-shutoff`) and Forgejo CI workflow ## Key details - Fly.io Firecracker VMs support TUN devices natively — no userspace networking needed - Tailscale auth key is `preauthorized=True` to avoid device approval hangs on container restarts - nginx caches aggressively for the static site; health check is on the default_server block - ACLs restrict `tag:flyio-proxy` to `tag:k8s` on port 443 only - DNS CNAME deployed and verified: `docs.eblu.me` → `blumeops-proxy.fly.dev` ## Test plan - [x] `curl -sf https://blumeops-proxy.fly.dev/healthz` returns `ok` - [x] `curl -I -H "Host: docs.eblu.me" https://blumeops-proxy.fly.dev/` returns 200 with `X-Cache-Status` - [x] `curl -I https://docs.eblu.me/` returns 200 with valid Let's Encrypt cert - [x] `dig forge.ops.eblu.me` still resolves to 100.98.163.89 (private services unaffected) - [x] Set `FLY_DEPLOY_TOKEN` Forgejo Actions secret for CI auto-deploy 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/120	2026-02-08 02:36:19 -08:00
Forgejo Actions	11c76d4768	Update docs release to v1.4.2 - Built changelog from towncrier fragments [skip ci]	2026-02-08 05:45:40 +00:00
Forgejo Actions	ab7efd8c1c	Update docs release to v1.4.1 - Built changelog from towncrier fragments [skip ci]	2026-02-08 05:27:23 +00:00
Forgejo Actions	3f5017f732	Update docs release to v1.4.0 - Built changelog from towncrier fragments [skip ci]	2026-02-08 05:03:34 +00:00
Erich Blume	3b4ff91469	Fix homepage Admin bookmark icons (#110 ) ## Summary - Fix broken Pulumi icon: changed `pulumi` to `si-pulumi` (Simple Icons prefix required) - Fix broken ArgoCD icon: changed `argocd` to `argo-cd` (Dashboard Icons uses hyphenated name) ## Deployment and Testing - [ ] Sync homepage app in ArgoCD - [ ] Verify icons appear on go.ops.eblu.me Admin bookmarks section Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/110	2026-02-05 06:29:39 -08:00
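
The timezone problem described in commit 42ebc2b122 (#159) can be illustrated with a short sketch. It takes the UTC commit timestamp listed above for the v1.6.2 docs release and shows which calendar date a changelog stamp would get in UTC versus Pacific time. The helper `changelog_date` and the fixed UTC-8 offset are assumptions made for illustration; this is not how towncrier or the runner compute the date, and the commit timestamp is only a stand-in for the moment the changelog was built.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-8 offset stands in for America/Los_Angeles in
# February (PST). A real setup would rely on the tz database instead.
PST = timezone(timedelta(hours=-8), "PST")


def changelog_date(released_at: datetime, tz: timezone) -> str:
    """Hypothetical helper: the calendar date of a release as seen in `tz`."""
    return released_at.astimezone(tz).date().isoformat()


# Commit timestamp of "Update docs release to v1.6.2" as listed in the log.
release = datetime(2026, 2, 12, 0, 35, 2, tzinfo=timezone.utc)

print(changelog_date(release, timezone.utc))  # 2026-02-12
print(changelog_date(release, PST))           # 2026-02-11
```

With the runner defaulting to UTC, an evening release on Feb 11 Pacific time lands on the next calendar day, which matches the off-by-one date the commit message reports.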

339 commits