Commit graph

23 commits

Author SHA1 Message Date
5ec2411e20 Update navidrome, miniflux, forgejo-runner image tags to Alpine 3.23 builds [main]
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 15:37:30 -07:00
a18ec9d958 Update miniflux to main image tag, disable OTEL metrics in Dagger module
Point miniflux kustomization at the main-built v2.2.19-138e23d image
(replacing the branch tag). Disable the OTLP metrics exporter at module
import time to prevent ~11s retry delays in CI — the env var must be set
inside the module, not the runner shell, because the SDK runs inside the
Dagger engine container.
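
A minimal sketch of the import-time approach described above. It assumes the exporter is controlled by the standard OpenTelemetry `OTEL_METRICS_EXPORTER` variable; the commit does not name the variable, and the helper name `disable_otel_metrics` is hypothetical.

```python
import os


def disable_otel_metrics(env=os.environ):
    """Turn off the OTLP metrics exporter for this process.

    Assumption: the SDK honours the standard OpenTelemetry variable
    OTEL_METRICS_EXPORTER, where "none" disables metric export. The
    commit itself does not name the variable.
    """
    env["OTEL_METRICS_EXPORTER"] = "none"
    return env


# Run at module import time, before the SDK is imported, so the setting
# applies inside the Dagger engine container rather than the runner shell.
disable_otel_metrics()
# import dagger  # hypothetical: the SDK import would follow here
```

Setting the variable in the CI runner's shell would not help, because that environment is not the one the module code sees.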

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 08:59:32 -07:00
138e23d525 Miniflux 2.2.19 + container.py migration + ty typechecker (#331)
## Summary

- Upgrade miniflux from 2.2.17 to 2.2.19 (security hardening, performance improvements)
- Migrate miniflux from Dockerfile to native Dagger container.py build
- Refactor `alpine_runtime()` helper to support existing users (nobody/65534)
- Add `ty` (Astral) Python typechecker to prek hooks

## Test plan

- [ ] `dagger call build --src=. --container-name=miniflux` succeeds
- [ ] `dagger call container-version --container-name=miniflux` returns 2.2.19
- [ ] `mise run container-version-check` passes
- [ ] `ty check` passes cleanly
- [ ] `prek run --all-files` passes
- [ ] CI builds container successfully
- [ ] Miniflux healthcheck passes after deploy from branch

Reviewed-on: #331
2026-04-12 08:54:32 -07:00
07e9c810ca Add RuntimeDefault seccomp profiles to all managed workloads
Addresses 32 CIS Kubernetes Benchmark failures from Prowler scan
(core_seccomp_profile_docker_default). Applied pod-level seccomp
RuntimeDefault to 18 deployments/statefulsets and 2 cronjobs.
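
The pod-level change can be sketched as a small patch function over manifests loaded as dictionaries. The function names are hypothetical; the field paths follow the Kubernetes API, where a CronJob nests its pod spec one level deeper than a Deployment or StatefulSet.

```python
import copy

SECCOMP = {"type": "RuntimeDefault"}


def pod_spec(manifest):
    """Return the pod spec dict for a Deployment/StatefulSet or CronJob."""
    if manifest["kind"] == "CronJob":
        return manifest["spec"]["jobTemplate"]["spec"]["template"]["spec"]
    return manifest["spec"]["template"]["spec"]


def with_runtime_default_seccomp(manifest):
    """Return a copy of the manifest with pod-level seccomp set."""
    patched = copy.deepcopy(manifest)
    spec = pod_spec(patched)
    # Keep any existing securityContext fields; only add seccompProfile.
    spec.setdefault("securityContext", {})["seccompProfile"] = dict(SECCOMP)
    return patched


# Illustrative input, not a manifest from this repository.
deployment = {
    "kind": "Deployment",
    "spec": {"template": {"spec": {"containers": []}}},
}
patched = with_runtime_default_seccomp(deployment)
```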

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 16:19:40 -07:00
3d2a97aaf9 Update kustomization tags to OCI-labeled builds (613f05d)
Point all services at the 613f05d images which carry the new
consistent OCI labels. Skipped kiwix/transmission (old v4.0.6-r4
version, no matching build) and docs/quartz (no 613f05d build).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 06:34:12 -07:00
64afd40a29 Fix Grafana widget fields (lowercase) and hide Miniflux read count
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 06:28:41 -07:00
6e8d11c6bb Add :kustomized sentinel tag to manifest images, review devpi
Bare image references in manifests were ambiguous — unclear whether the
tag was intentionally omitted or managed by kustomize. Add :kustomized
sentinel to all 37 image refs overridden by kustomize images transformer.
Add sync notes for tailscale-operator proxyclass (CRD fields not processed
by kustomize). Mark devpi reviewed (6.19.1 is current).
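
As an illustration of the sentinel convention, the sketch below appends `:kustomized` to bare image references and leaves tagged or digest-pinned references alone. The function names are hypothetical, and this is not the script used for the commit.

```python
SENTINEL = "kustomized"


def has_tag_or_digest(image_ref):
    """True if the reference already carries a tag or a digest."""
    if "@" in image_ref:
        return True
    # Only the last path component can hold a tag; a colon earlier in the
    # string is a registry port (e.g. localhost:5000/app).
    last = image_ref.rsplit("/", 1)[-1]
    return ":" in last


def add_sentinel(image_ref):
    """Append the sentinel tag to a bare image reference."""
    if has_tag_or_digest(image_ref):
        return image_ref
    return f"{image_ref}:{SENTINEL}"
```

With this convention, a reader of the manifest can tell that the real tag is expected to come from the kustomize images transformer.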

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 08:15:06 -08:00
e0f9ebebdf Update homepage, navidrome, ntfy, miniflux image tags after mirror migration
Prometheus and teslamate builds still in progress — will update in a
follow-up commit once their 33b7f0f tags land.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 21:06:08 -08:00
9b44a8ec51 Add kustomize images: and configMapGenerator: across services (#264)
## Summary

- Move hardcoded image tags to kustomization.yaml `images:` transformer across **22 services** — image names in manifests become version-agnostic templates, with tags centralized in one place per service
- Replace hand-written ConfigMap manifests with `configMapGenerator:` in **12 services** — config data extracted to standalone files, generated ConfigMaps include content hashes that trigger automatic pod rollouts on changes
- Create new `kustomization.yaml` for **forgejo-runner** and **nvidia-device-plugin** (switches ArgoCD from directory mode to kustomize mode, rendered output identical)

### Services modified

**Images only (8):** cv, devpi, docs, kube-state-metrics, miniflux, navidrome, teslamate, torrent

**Images + configMapGenerator (10):** alloy-k8s, forgejo-runner, frigate, grafana, homepage, kiwix, loki, mosquitto, ntfy, prometheus

**Images only, no configMapGenerator (4):** authentik (skip blueprints — special YAML tags), tailscale-operator-base (Deployment only, CRD image fields left as-is)

**Skipped entirely (6):** argocd (remote upstream), databases (no image fields), external-secrets, grafana-config (cross-kustomization dashboards), immich (Helm-managed), 1password-connect/cloudnative-pg (no kustomization.yaml)

### What changes at deploy time

- **images:** — no functional diff, `kustomize build` produces identical output with tags
- **configMapGenerator:** — ConfigMap names gain hash suffixes (e.g., `prometheus-config` → `prometheus-config-6f42fhctcb`) and all Deployment/StatefulSet/DaemonSet references are updated automatically. Pods will restart once per service on first sync due to the name change
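
The rollout behaviour follows from the name being a function of the content. The sketch below illustrates the idea only: kustomize uses its own hashing and encoding, so these suffixes will not match real `kustomize build` output, and `hashed_name` is a hypothetical helper.

```python
import hashlib


def hashed_name(base_name, data):
    """Append a short content hash to a ConfigMap name.

    Illustration only: kustomize uses its own hashing and encoding, so
    the suffixes produced here will not match real kustomize output.
    """
    digest = hashlib.sha256()
    for key in sorted(data):
        # Separate keys and values so ("ab", "c") differs from ("a", "bc").
        digest.update(key.encode())
        digest.update(b"\0")
        digest.update(data[key].encode())
        digest.update(b"\0")
    return f"{base_name}-{digest.hexdigest()[:10]}"


# Hypothetical config contents, for illustration.
before = hashed_name("prometheus-config", {"prometheus.yml": "scrape_interval: 30s\n"})
after = hashed_name("prometheus-config", {"prometheus.yml": "scrape_interval: 15s\n"})
```

Because `before` and `after` differ, any workload that references the ConfigMap by name gets a changed pod template when the config changes, which is what triggers the rollout.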

## Test plan

- [x] `kubectl kustomize` builds all 30 service directories successfully
- [x] Image tags verified in rendered output for all modified services
- [x] ConfigMap hash suffixes verified in rendered output
- [x] ConfigMap references in Deployments/StatefulSets confirmed to use hashed names
- [x] All pre-commit hooks pass (yamllint, shellcheck, prettier, etc.)
- [ ] `argocd app diff` each service to confirm only expected ConfigMap name changes
- [ ] Deploy from branch starting with a low-risk service (e.g., mosquitto)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/264
2026-02-24 14:25:19 -08:00
e1c2892878 Fix container tags deleted during old-tag cleanup
Five container manifests were removed when deleting old-style tags
(shared digests). Rebuild on a72a0d8 and update references.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-21 16:26:29 -08:00
a72a0d8e8e Update all container images to new upstream-version tagging scheme (#238)
## Summary
- Updates all 15 container image references across 14 ArgoCD manifest files
- Migrates from old internal `vX.Y.Z` tags to new `v<upstream-version>-<sha>` format
- Covers: authentik, cv, devpi, forgejo-runner, homepage, kiwix-serve, kubectl, miniflux, navidrome, ntfy, quartz, teslamate, transmission
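
A small parser makes the `v<upstream-version>-<sha>` scheme concrete. This is a sketch under two assumptions not stated in the PR: the SHA is 7 to 40 lowercase hex characters, and it is always the last hyphen-separated field. `parse_tag` is a hypothetical name.

```python
import re

# Assumption: short or full commit SHA, lowercase hex.
SHA_RE = re.compile(r"[0-9a-f]{7,40}")


def parse_tag(tag):
    """Split 'v<upstream-version>-<sha>' into (upstream_version, sha).

    Returns None when the tag does not fit the scheme.
    """
    if not tag.startswith("v") or "-" not in tag:
        return None
    # Split on the last hyphen so upstream versions may contain hyphens.
    upstream, sha = tag[1:].rsplit("-", 1)
    if not upstream or not SHA_RE.fullmatch(sha):
        return None
    return upstream, sha
```

For example, `parse_tag("v2.2.19-138e23d")` yields the upstream version `2.2.19` and the SHA `138e23d`, while an old-style tag such as `v1.1.0` is rejected.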

## Deployment and Testing
- [ ] Sync all ArgoCD apps on branch revision
- [ ] Verify all services come up healthy
- [ ] Merge and re-sync on main
- [ ] Clean up old-style tags from zot registry

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/238
2026-02-21 15:58:11 -08:00
b3747f6c95 Tier 1 version bumps (#186)
## Summary

Audit and upgrade of all deployed images, helm charts, and custom container Dockerfiles to latest stable versions. This PR covers Tier 1 (low-risk minor/patch bumps only).
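
One way to express the minor/patch rule is a small classifier over version strings. This is a sketch with hypothetical function names, assuming dotted numeric versions with an optional leading `v`; it is not the process used for the audit.

```python
def version_parts(version):
    """Parse 'v2.13.0', '3.3.2', or '6' into a 3-tuple of ints."""
    numbers = [int(p) for p in version.lstrip("v").split(".")]
    numbers += [0] * (3 - len(numbers))
    return tuple(numbers[:3])


def classify_bump(old, new):
    """Name the most significant version component that changed."""
    o, n = version_parts(old), version_parts(new)
    if n[0] != o[0]:
        return "major"
    if n[1] != o[1]:
        return "minor"
    if n[2] != o[2]:
        return "patch"
    return "none"


def is_tier1(old, new):
    """Tier 1 here means minor or patch bumps only."""
    return classify_bump(old, new) in ("minor", "patch")
```

Under this rule, a change like `v2.13.0` to `v2.18.0` counts as Tier 1, while a major jump such as 6 to 12 does not.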

### Upstream images
| Image | Old | New |
|-------|-----|-----|
| kube-state-metrics | v2.13.0 | v2.18.0 |
| prometheus | v3.2.1 | v3.9.1 |
| loki | 3.3.2 | 3.6.5 |
| alloy | v1.5.1 | v1.13.1 |
| tailscale (proxy + operator) | v1.92.5 | v1.94.1 |
| navidrome | :latest | v0.60.3 (pinned) |

### Helm charts
| Chart | Old | New |
|-------|-----|-----|
| CloudNativePG | v0.27.0 | v0.27.1 |
| 1Password Connect | 2.2.1 | 2.3.0 |

### Custom containers (Dockerfiles updated, images not yet tagged)
| Container | Changes | New tag |
|-----------|---------|---------|
| miniflux | 2.2.16→2.2.17 (security), alpine 3.22 | v1.1.0 |
| kubectl | v1.34.1→v1.34.4, alpine 3.22 | v1.1.0 |
| kiwix-serve | alpine 3.22 | v1.1.0 |
| nettest | alpine 3.22 | v0.14.0 |
| transmission | alpine 3.22, pkg 4.0.6-r4 | v1.1.0 |

All custom containers verified with local `dagger call build`.

### Deferred to Tier 2 (separate PRs)
- Forgejo runner 6→12 (major version scheme change)
- Docker DinD 27→29
- Grafana chart 8→11 (repo migration)
- External Secrets 1→2 (breaking changes)
- Python 3.12→3.13, Elixir 1.18→1.19, Node 22→24
- Transmission 4.0.6→4.1.0 (not in Alpine yet)

## Deployment

After merge:
1. Tag custom containers: `mise run container-tag-and-release <name> <version>` for each
2. Wait for CI builds to complete
3. `argocd app sync apps` then sync individual apps, or let ArgoCD auto-detect

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/186
2026-02-13 17:16:37 -08:00
48ce5b4120 Recategorize homepage into Content and Misc groups (#179)
## Summary
- Replace the three homepage groups (Apps, Observability, Infrastructure) with two cleaner groups
- **Content**: Immich, Kiwix, Miniflux, DJ, Grafana
- **Misc**: CV, TeslaMate, Transmission, Docs, Prometheus, PyPI

## Deployment and Testing
- [ ] Sync affected ingresses via ArgoCD (all 11 services)
- [ ] Verify homepage shows the two new groups correctly

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/179
2026-02-13 09:09:22 -08:00
e6cf7e47e0 Restrict flyio-proxy ACLs to dedicated tag:flyio-target endpoints (#126)
## Summary
- Introduce `tag:flyio-target` so services must explicitly opt in to be reachable by the fly.io proxy
- Replace broad `tag:k8s` and `tag:homelab` grants with the new tag in the ACL rule and test
- Add `tailscale.com/tags: "tag:k8s,tag:flyio-target"` annotation to docs, loki, and prometheus Ingresses
- Switch Alloy push endpoints from `*.ops.eblu.me` (Caddy) to `*.tail8d86e.ts.net` (Tailscale Ingress)
- Update docs: flyio-proxy, caddy, tailscale, forgejo (future public access + security checklist), expose-service-publicly

## Manual step (not in PR)
Update the k8s operator OAuth client in the Tailscale admin console to include `tag:flyio-target` in its scope. Without this, the operator cannot assign the new tag to Ingress proxy nodes.

## Deployment order
1. **Pulumi ACLs** — `mise run tailnet-preview && mise run tailnet-up`
2. **OAuth client** — Manual update in Tailscale admin console
3. **K8s Ingresses** — `argocd app sync apps && argocd app sync docs loki prometheus`
4. **Fly.io proxy** — `mise run fly-deploy`
5. **Verify** — `mise run services-check`, check Grafana dashboards

## Test plan
- [ ] `mise run tailnet-preview` shows clean diff
- [ ] `argocd app diff docs`, `argocd app diff loki`, `argocd app diff prometheus` show only annotation additions
- [ ] After deploy: Grafana dashboards show continued log/metric flow
- [ ] `curl -sf https://docs.eblu.me` returns 200
- [ ] `mise run services-check` passes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/126
2026-02-08 21:54:18 -08:00
4d97ac4c26 Expand homepage widgets and info panels (#81)
## Summary
- Add greeting and datetime info widgets to homepage header
- Add Miniflux widget showing unread/read counts (via existing API key in 1Password)
- Add Grafana widget showing dashboards/datasources/alerts (via existing credentials in 1Password)
- Add ArgoCD to bookmarks section
- Add TODO comments for widgets needing additional setup (Forgejo, Caddy, UniFi, Glances, Navidrome, Transmission, Immich)

## Deployment and Testing
- [ ] Sync homepage app to deploy new ExternalSecrets
- [ ] Verify greeting and datetime appear in header
- [ ] Verify Miniflux widget shows unread/read counts
- [ ] Verify Grafana widget shows dashboard stats
- [ ] Check that services without credentials still display (just without widgets)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/81
2026-02-02 16:11:20 -08:00
38538ad5f0 Replace hajimari with gethomepage (#75)
## Summary
- Remove hajimari (unmaintained since Oct 2022, broken helm deps)
- Add gethomepage (28k stars, actively maintained, monthly releases)
- Migrate custom apps, bookmarks, and search config
- Enable k8s RBAC for service autodiscovery
- Configure Tailscale ingress at go.tail8d86e.ts.net

## Why the switch
Hajimari hasn't released since October 2022. The helm chart has a broken
dependency (bjw-s/common URL is 404), and unreleased code on main has bugs.
gethomepage has similar k8s autodiscovery via ingress annotations and is
very actively maintained.

## Deployment and Testing
- [ ] Delete hajimari app from ArgoCD
- [ ] Delete hajimari namespace
- [ ] Sync apps to pick up new homepage app
- [ ] Sync homepage app
- [ ] Verify go.ops.eblu.me loads

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/75
2026-01-30 13:21:12 -08:00
316a4c4e42 Shorten Hajimari info descriptions and hide URLs
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 16:34:46 -08:00
d1164c8aac Add Hajimari service dashboard (#73)
## Summary
- Add Hajimari as a service dashboard/start page at `go.ops.eblu.me`
- Auto-discovers k8s services from ingress annotations
- Custom apps for non-k8s services: Forgejo, Registry, Sifaka NAS
- Add `nas.ops.eblu.me` Caddy proxy to Synology dashboard

## Services Configured

**Auto-discovered (k8s ingresses with hajimari.io annotations):**
- Grafana, ArgoCD, Prometheus, Loki (Observability)
- Miniflux, Kiwix, Transmission, TeslaMate, Immich (Apps)
- PyPI/devpi (Infrastructure)

**Custom apps (non-k8s):**
- Forgejo (forge.ops.eblu.me)
- Registry (registry.ops.eblu.me)
- Sifaka NAS (nas.ops.eblu.me)

**Bookmarks:**
- Tailscale Admin, 1Password, Pulumi

## Deployment and Testing
- [ ] Sync `apps` application to pick up new Hajimari Application
- [ ] Sync `hajimari` application
- [ ] Run `mise run provision-indri -- --tags caddy` for go/nas proxy entries
- [ ] Re-sync all k8s apps with hajimari annotations (or wait for natural drift)
- [ ] Verify https://go.ops.eblu.me shows dashboard with all services

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/73
2026-01-29 15:51:42 -08:00
dd6cf20d51 Remove obsolete secret templates
- Delete 13 .yaml.tpl files replaced by ExternalSecrets
- Update immich/README.md with direct CNPG secret copy instructions
- Update miniflux/README.md with context flag and ESO note

Only 1password-connect/secret-credentials.yaml.tpl remains (bootstrap).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-28 20:26:37 -08:00
c8b655f177 Build local containers for k8s services (#61)
## Summary
- Move devpi Dockerfile from argocd/manifests to containers/devpi/
- Add containers for: transmission, teslamate, miniflux, kiwix-serve, kubectl
- Update all k8s deployments to use local images (registry.ops.eblu.me/blumeops/*)
- All containers use v1.0.0 tag for initial release

## Containers Added
| Container | Source | Notes |
|-----------|--------|-------|
| devpi | python:3.12-slim | Existing, moved to containers/ |
| kubectl | alpine + download | For zim-watcher CronJob |
| miniflux | Go build from source | v2.2.16 |
| kiwix-serve | Download pre-built binary | v3.8.1 |
| transmission | alpine + apk install | Simpler than linuxserver image |
| teslamate | Elixir build from source | v2.2.0 |

## Deployment and Testing
- [ ] Build and tag devpi-v1.0.0
- [ ] Build and tag kubectl-v1.0.0
- [ ] Build and tag miniflux-v1.0.0
- [ ] Build and tag kiwix-serve-v1.0.0
- [ ] Build and tag transmission-v1.0.0
- [ ] Build and tag teslamate-v1.0.0
- [ ] Sync ArgoCD apps and verify services

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/61
2026-01-25 21:35:57 -08:00
af39067e1f Pin ArgoCD to v3.2.6 (#44)
## Summary
- Pin ArgoCD kustomization to v3.2.6 tag instead of `stable` branch
- This gives intentional control over ArgoCD version upgrades

## Deployment and Testing
- [ ] Sync the `apps` application: `argocd app sync apps`
- [ ] Point argocd at feature branch: `argocd app set argocd --revision feature/pin-argocd-v3.2.6`
- [ ] Sync argocd: `argocd app sync argocd`
- [ ] Verify ArgoCD is running v3.2.6
- [ ] After merge, reset to main: `argocd app set argocd --revision main && argocd app sync argocd`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/44
2026-01-22 16:38:27 -08:00
21848a7919 P5.1: Migrate minikube from podman to QEMU2 driver (#38)
## Summary
- Migrate minikube from podman driver to qemu2 driver for proper NFS/SMB volume mount support
- Update ansible minikube role with qemu installation and containerd runtime
- Remove podman role dependency from indri.yml
- Add synology user creation steps and post-migration zot reconfiguration notes

## Why
Phase 6 (Kiwix/Transmission migration) was blocked because the podman driver lacks kernel capabilities for filesystem mounts. QEMU2 creates an actual VM with full mount support.

## Deployment and Testing
- [ ] Create k8s-storage user on Synology DSM
- [ ] Store credentials in 1Password (synology-k8s-storage)
- [ ] Export current k8s state
- [ ] Stop and delete podman-based minikube cluster
- [ ] Run ansible to create QEMU2 cluster
- [ ] Test NFS volume mount with test pod
- [ ] Redeploy ArgoCD and all apps
- [ ] Verify all services healthy
- [ ] Reconfigure zot registry mirrors for containerd (post-migration)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/38
2026-01-21 16:03:37 -08:00
735b643429 P4: Miniflux migration + PostgreSQL consolidation (#33)
## Summary
- Deploy miniflux in k8s via ArgoCD
- Expose via Tailscale Ingress at feed.tail8d86e.ts.net
- Retire brew PostgreSQL (no longer needed)
- Rename k8s-pg to pg (canonical hostname)
- Remove ansible miniflux and postgresql roles
- Update borgmatic to back up pg.tail8d86e.ts.net
- Update all zk documentation

## Deployment and Testing
- [x] Miniflux pod running in k8s
- [x] User login works at https://feed.tail8d86e.ts.net
- [x] Feeds and entries visible
- [x] brew miniflux and postgresql stopped
- [x] Tailscale services migrated (feed, pg)
- [x] zk documentation updated
- [x] Run ansible to apply role removals
- [ ] Verify borgmatic backup with new pg hostname

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/33
2026-01-20 09:04:47 -08:00