Commit graph

47 commits

Author SHA1 Message Date
fb32cc07c4 chore: repoint runner-job-image tag at CI-built v0.20.6-50f8c2a
Swaps the k8s runner label from the local bootstrap tag (v0.20.6-9b6be09)
to the equivalent image rebuilt by CI from main. Functionally identical;
closes the bootstrap loop.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 08:38:33 -07:00
50f8c2a33f Roll k8s runner to runner-job-image v0.20.6-9b6be09
Points the k8s Forgejo runner label at the locally-bootstrapped
runner-job-image built from the Alpine container.py on this branch.
Once merged, CI will rebuild the same image from the same SHA.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 08:28:18 -07:00
21177ff47f chore: update forgejo-runner image tag 2026-04-20 09:11:37 -07:00
1425bf1f5c Upgrade forgejo-runner to v12.8, adopt server.connections, and clean up docs (#338)
## Summary
- consolidate forgejo-runner how-to docs into current cards
- upgrade the k8s forgejo-runner deployment to the latest v12.8.x runner image
- switch the k8s runner from first-boot register flow to declarative server.connections config
- keep the runner image on the native Dagger build path and update the surrounding manifests/secrets

## Notes
- PR opened early for C1 review
- implementation and deployment verification will follow in subsequent commits

Reviewed-on: #338
2026-04-20 09:03:54 -07:00
5ec2411e20 Update navidrome, miniflux, forgejo-runner image tags to Alpine 3.23 builds [main]
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 15:37:30 -07:00
9d85c97b9b Update forgejo-runner kustomization tag to main-branch image
C0 follow-up: switch from branch-built tag to main-built v12.7.3-0e93cc0.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:10:36 -07:00
0e93cc08b4 Build forgejo-runner container locally (#334)
All checks were successful
Build Container / detect (push) Successful in 2s
Build Container / build-dagger (forgejo-runner) (push) Successful in 1m21s
## Summary
- Add native Dagger `container.py` for forgejo-runner (Go + Alpine runtime, static binary with CGO for SQLite)
- Update kustomization to point to local registry image (tag is placeholder until CI builds)
- Uses existing `clone_from_forge("forgejo-runner", ...)` mirror

## Test plan
- [x] `dagger call build --src=. --container-name=forgejo-runner` passes locally
- [ ] CI container build from branch succeeds
- [ ] Update kustomization tag to built image, deploy from branch via ArgoCD `--revision`
- [ ] Verify runner registers and picks up jobs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: #334
2026-04-14 11:06:36 -07:00
1e391f96bb Upgrade forgejo-runner 12.7.0 → 12.7.3, add service card
Patch upgrade picks up idempotent FetchTask API, offline registration
fix, cloudflare/circl security dep update, and custom gRPC user-agent.
No config defaults changed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 16:31:06 -07:00
924325ebd5 Fix DinD seccomp profile broken by RuntimeDefault rollout
The pod-level RuntimeDefault seccomp profile (07e9c81) overrides the
DinD sidecar's privileged flag in newer Kubernetes versions, blocking
Docker daemon syscalls. Set Unconfined explicitly on the DinD container
while keeping RuntimeDefault on the runner container.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 17:09:57 -07:00
07e9c810ca Add RuntimeDefault seccomp profiles to all managed workloads
Addresses 32 CIS Kubernetes Benchmark failures from Prowler scan
(core_seccomp_profile_docker_default). Applied pod-level seccomp
RuntimeDefault to 18 deployments/statefulsets and 2 cronjobs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 16:19:40 -07:00
b793299d6d Upgrade Dagger engine from v0.20.0 to v0.20.1
Phase 2 of Dagger upgrade: bump engine version, update runner
deployment to v0.20.1-24f7512, and fix docs reference card version.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 20:41:02 -08:00
6e8d11c6bb Add :kustomized sentinel tag to manifest images, review devpi
Bare image references in manifests were ambiguous — unclear whether the
tag was intentionally omitted or managed by kustomize. Add :kustomized
sentinel to all 37 image refs overridden by kustomize images transformer.
Add sync notes for tailscale-operator proxyclass (CRD fields not processed
by kustomize). Mark devpi reviewed (6.19.1 is current).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 08:15:06 -08:00
46cc3fbc2e Update forgejo-runner job image to v0.20.0-448689b
Built locally to break the chicken-and-egg: the old runner couldn't
build its own replacement because it needed Dagger 0.20.0.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 11:05:21 -08:00
82884436df Route runner polling through internal forge.ops.eblu.me
The k8s and ringtail runners were hitting forge.eblu.me (fly.io proxy)
for every FetchTask poll (~every 2s), round-tripping through the public
internet unnecessarily. Use forge.ops.eblu.me (Caddy on indri, tailnet)
for infrastructure workloads.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 10:33:40 -08:00
a87c997ee1 Expose Forgejo publicly at forge.eblu.me (#278)
All checks were successful
Deploy Fly.io Proxy / deploy (push) Successful in 1m28s
## Summary

Expose Forgejo publicly at `forge.eblu.me` via the Fly.io reverse proxy — the first dynamic, authenticated public-facing service.

- **Forgejo hardening:** Domain changed to forge.eblu.me, SSH stays on forge.ops.eblu.me, reverse proxy trust headers configured, local registration locked to external-only (Authentik SSO)
- **Tailscale Ingress:** ExternalName Service + Ingress in tailscale-operator creates forge.tail8d86e.ts.net endpoint
- **Fly.io proxy:** nginx server block with rate-limited auth endpoints (3r/s), fail2ban with custom nginx-deny action, security headers, /swagger blocked, WebSocket support, 512m body limit
- **Authentik:** OAuth callback updated to forge.eblu.me
- **DNS/TLS:** CNAME record in Pulumi, cert in fly-setup
- **Rename:** ~29 files updated from forge.ops.eblu.me to forge.eblu.me (HTTPS refs only; SSH, container builds, and Caddy table kept as-is)

## Deployment Order

1. `mise run provision-indri -- --tags forgejo` (config changes)
2. Verify forge.ops.eblu.me still works
3. `argocd app set tailscale-operator --revision feature/forge-public && argocd app sync tailscale-operator`
4. Verify `curl https://forge.tail8d86e.ts.net`
5. `cd fly && fly deploy`
6. Verify pre-DNS: `curl -H "Host: forge.eblu.me" https://blumeops-proxy.fly.dev/`
7. `fly certs add forge.eblu.me -a blumeops-proxy`
8. `argocd app set authentik --revision feature/forge-public && argocd app sync authentik`
9. `mise run dns-preview && mise run dns-up`
10. Full verification (see below)
11. Rehearse `mise run fly-shutoff`
12. After merge: reset ArgoCD revisions to main, re-sync

## Verification Checklist

- [ ] forge.eblu.me loads, shows public repos
- [ ] forge.ops.eblu.me still works from tailnet
- [ ] SSH clone via forge.ops.eblu.me:2222 works
- [ ] HTTPS clone via forge.eblu.me works
- [ ] UI shows forge.eblu.me for HTTPS clone, forge.ops.eblu.me for SSH
- [ ] /swagger returns 403
- [ ] Rapid login attempts trigger 429 rate limit
- [ ] fail2ban bans after 5 failed logins in 10 minutes
- [ ] ArgoCD can still sync (SSH unaffected)
- [ ] `mise run fly-shutoff` stops all public traffic
- [ ] `mise run services-check` passes

Reviewed-on: #278
2026-03-03 08:40:41 -08:00
9b44a8ec51 Add kustomize images: and configMapGenerator: across services (#264)
## Summary

- Move hardcoded image tags to kustomization.yaml `images:` transformer across **22 services** — image names in manifests become version-agnostic templates, with tags centralized in one place per service
- Replace hand-written ConfigMap manifests with `configMapGenerator:` in **12 services** — config data extracted to standalone files, generated ConfigMaps include content hashes that trigger automatic pod rollouts on changes
- Create new `kustomization.yaml` for **forgejo-runner** and **nvidia-device-plugin** (switches ArgoCD from directory mode to kustomize mode, rendered output identical)

### Services modified

**Images only (8):** cv, devpi, docs, kube-state-metrics, miniflux, navidrome, teslamate, torrent

**Images + configMapGenerator (10):** alloy-k8s, forgejo-runner, frigate, grafana, homepage, kiwix, loki, mosquitto, ntfy, prometheus

**Images only, no configMapGenerator (4):** authentik (skip blueprints — special YAML tags), tailscale-operator-base (Deployment only, CRD image fields left as-is)

**Skipped entirely (6):** argocd (remote upstream), databases (no image fields), external-secrets, grafana-config (cross-kustomization dashboards), immich (Helm-managed), 1password-connect/cloudnative-pg (no kustomization.yaml)

### What changes at deploy time

- **images:** — no functional diff, `kustomize build` produces identical output with tags
- **configMapGenerator:** — ConfigMap names gain hash suffixes (e.g., `prometheus-config` → `prometheus-config-6f42fhctcb`) and all Deployment/StatefulSet/DaemonSet references are updated automatically. Pods will restart once per service on first sync due to the name change

## Test plan

- [x] `kubectl kustomize` builds all 30 service directories successfully
- [x] Image tags verified in rendered output for all modified services
- [x] ConfigMap hash suffixes verified in rendered output
- [x] ConfigMap references in Deployments/StatefulSets confirmed to use hashed names
- [x] All pre-commit hooks pass (yamllint, shellcheck, prettier, etc.)
- [ ] `argocd app diff` each service to confirm only expected ConfigMap name changes
- [ ] Deploy from branch starting with a low-risk service (e.g., mosquitto)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/264
2026-02-24 14:25:19 -08:00
9b419abf24 Update RUNNER_LABELS to use runner-job-image:v0.19.11-4c5e0f0
Now that the image is built under the new name, point the forgejo
runner at it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 17:47:14 -08:00
e655f4556e Upgrade k8s forgejo-runner from v6.3.1 to v12.7.0 (#251)
## Summary

Completes the `upgrade-k8s-runner` mikado chain. Both prerequisites (workflow validation in Dagger, config review against v12 defaults) were resolved in #250.

- Bump runner image `code.forgejo.org/forgejo/runner:6.3.1` → `12.7.0`
- Update `service-versions.yaml` to track new version
- Mark goal card complete (remove `status: active`)

## Deployment and Testing

After merge:
1. `argocd app sync forgejo-runner`
2. Verify runner registers in Forgejo admin → runners
3. Trigger a test workflow (e.g. `branch-cleanup.yaml` manual dispatch)

Rollback: revert image tag to `6.3.1`, push, sync.
Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/251
2026-02-22 17:43:39 -08:00
0f6a1898f0 Prepare forgejo-runner v12 upgrade (leaf nodes) (#250)
## Summary
- Review runner config against v12.7.0 defaults — added `shutdown_timeout: 3h`, no breaking changes found
- Add `validate_workflows` Dagger function using `forgejo-runner validate --directory .` inside upstream container
- All 6 workflows pass v12.7.0 schema validation
- Wire `mise run validate-workflows` task and pre-commit hook on `.forgejo/workflows/` changes
- Mark both leaf Mikado cards (`review-runner-config-v12`, `validate-workflows-against-v12`) complete

## Mikado State
After merge, `upgrade-k8s-runner` goal card has no unmet dependencies — ready to execute the actual image bump in a follow-up PR.

## Test Plan
- [x] `dagger call validate-workflows --src=.` passes (all 6 workflows OK)
- [x] Pre-commit hooks pass
- [ ] Reviewer: confirm `shutdown_timeout: 3h` addition to ConfigMap looks reasonable

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/250
2026-02-22 17:38:32 -08:00
a72a0d8e8e Update all container images to new upstream-version tagging scheme (#238)
## Summary
- Updates all 15 container image references across 14 ArgoCD manifest files
- Migrates from old internal `vX.Y.Z` tags to new `v<upstream-version>-<sha>` format
- Covers: authentik, cv, devpi, forgejo-runner, homepage, kiwix-serve, kubectl, miniflux, navidrome, ntfy, quartz, teslamate, transmission

## Deployment and Testing
- [ ] Sync all ArgoCD apps on branch revision
- [ ] Verify all services come up healthy
- [ ] Merge and re-sync on main
- [ ] Clean up old-style tags from zot registry

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/238
2026-02-21 15:58:11 -08:00
d5c00192d5 Configure DinD to use Zot as pull-through registry mirror (#183)
## Summary
- Add `daemon.json` with `registry-mirrors` to the forgejo-runner ConfigMap, pointing DinD at `http://host.minikube.internal:5050`
- Mount `daemon.json` into the DinD sidecar at `/etc/docker/daemon.json` via `subPath`
- Docker Hub pulls during Dagger CI builds will now route through Zot's pull-through cache, reducing bandwidth and avoiding rate limits

## Deployment and Testing
- [ ] `argocd app sync forgejo-runner`
- [ ] Exec into DinD container: `docker info` should show the registry mirror
- [ ] Trigger a workflow build and check Zot logs for cache hits

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/183
2026-02-13 12:36:03 -08:00
ba9b251759 Update forgejo-runner image to v3.2.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 12:16:52 -08:00
d0c18043b7 Revert forgejo-runner image to v3.1.0
v3.2.0 build failed (GitHub download timeout), rolling back to
working image while it rebuilds.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 12:07:51 -08:00
fdd3f6483a Update forgejo-runner image to v3.2.0
All checks were successful
Build Container / build (push) Successful in 7m31s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 11:08:57 -08:00
0098ac37e0 Move non-secret runner env vars to deployment spec (#181)
## Summary
- Move FORGEJO_URL, RUNNER_NAME, and RUNNER_LABELS from ExternalSecret template to deployment env vars
- ExternalSecret now only contains the actual secret (RUNNER_TOKEN)
- Image version changes in RUNNER_LABELS now trigger automatic pod rollouts

## Deployment
1. Merge this PR
2. `argocd app sync forgejo-runner` — the deployment spec change will auto-roll the pod

No manual restart needed — that's the whole point :)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/181
2026-02-13 10:29:23 -08:00
52bbf88aa6 Update forgejo-runner image to v3.1.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-13 10:21:43 -08:00
20a25557d6 Bump runner image to v3.0.2 (restore Docker CLI)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 17:53:04 -08:00
0006e6bf17 Bump runner image to v3.0.1 (restore Node.js)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 17:38:45 -08:00
3d84483513 Update runner job image to forgejo-runner:v3.0.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 17:27:50 -08:00
95364dcb48 Simplify runner image (Dagger Phase 3) (#162)
All checks were successful
Build Container / build (push) Successful in 1m13s
## Summary

With Phases 1 and 2 complete, the runner image no longer needs most of its bundled tools. This PR strips it down and adds what was missing.

**Removed** (now inside Dagger containers):
- Node.js 24.x
- Docker CLI + buildx plugin
- skopeo
- gnupg, lsb-release, xz-utils

**Added:**
- `tzdata` — fixes the TZ env var (#159, #160, #161) so `TZ=America/Los_Angeles` actually works
- `flyctl` — was being installed from scratch every release

**Workflow changes:**
- Remove "Ensure Dagger CLI" bootstrap steps from both workflows (Dagger is in the image)
- Remove "Install flyctl" step from build-blumeops (flyctl is in the image)
- Remove job-level `TZ` from build-blumeops (moved to runner configmap `runner.envs`)
- Set `TZ: America/Los_Angeles` in runner configmap so all job containers inherit it

## Deployment

After merge:
1. Build and release the new runner image: `mise run container-release forgejo-runner v2.0.0`
2. Sync the runner: `argocd app sync forgejo-runner`
3. Verify: `kubectl -n forgejo-runner exec deploy/forgejo-runner -c runner -- date` (but the real test is running a docs release and checking the changelog date)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/162
2026-02-11 17:24:20 -08:00
2a04ab26b7 Mount host zoneinfo into runner for TZ support (#160)
## Summary

The `TZ=America/Los_Angeles` env var from #159 has no effect because the `forgejo/runner` image doesn't ship tzdata. Mount the node's `/usr/share/zoneinfo` into the container so the timezone database is available.

## Deployment

After merge, sync forgejo-runner and verify:
```
argocd app sync forgejo-runner
kubectl -n forgejo-runner exec deploy/forgejo-runner -c runner -- date
# Should show PST/PDT, not UTC
```

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/160
2026-02-11 16:57:11 -08:00
42ebc2b122 Fix Forgejo runner timezone (UTC -> America/Los_Angeles) (#159)
## Summary

- Set `TZ=America/Los_Angeles` on the Forgejo runner container

The runner pod defaults to UTC. When releases are cut in the evening PST, towncrier stamps changelog entries with tomorrow's date (e.g., v1.6.2 shows 2026-02-12 despite being released on the evening of Feb 11 PST).

## Deployment

After merge, sync the forgejo-runner ArgoCD app:
```
argocd app sync forgejo-runner
```

The runner pod will restart with the new timezone. Note: the v1.6.2 changelog entry will remain dated 2026-02-12; future entries will use PST dates, so dates may appear non-sequential once.

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/159
2026-02-11 16:53:41 -08:00
1bc2b421a8 Adopt Dagger CI for container builds (Phase 1) (#156)
All checks were successful
Build Container / build (push) Successful in 13s
## Summary

- Add Dagger Python module (`.dagger/`) with `build` and `publish` functions for container images
- Replace Docker buildx + skopeo composite action with `dagger call publish` in `build-container.yaml`
- BuildKit's native push is compatible with Zot — **skopeo workaround eliminated**
- Add Dagger CLI (v0.19.11) to forgejo-runner Dockerfile, bump runner to v2.6.0
- Bootstrap step in workflow curl-installs dagger if not in runner (for first build on v2.5.1 runner)
- Delete old `.forgejo/actions/build-push-image/` composite action
- Add GPLv3 LICENSE

## Verified locally

- `dagger call build --src=. --container-name=nettest` — builds ✓
- `dagger call publish --src=. --container-name=nettest --version=dagger-test` — pushed to Zot ✓
- `dagger call build --src=. --container-name=forgejo-runner` — new runner image builds ✓
- Dagger CLI accessible inside built runner image ✓

## Deployment sequence (after merge)

1. `mise run container-tag-and-release forgejo-runner v2.6.0` — old runner bootstraps dagger via curl, builds new runner
2. `argocd app sync forgejo-runner` — runner restarts with v2.6.0 (dagger baked in)
3. `mise run container-tag-and-release nettest v0.13.0` — end-to-end test of new pipeline
4. `mise run container-list` — verify tags

## Not included (future phases)

- Phase 2: docs build + Forgejo packages migration
- Phase 3: runner simplification (remove skopeo, Node.js, etc.)
- Phase 4: future workflows

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/156
2026-02-11 15:38:31 -08:00
f0ac04fb8a Bootstrap buildx: revert to docker build, bump runner to v2.5.1 (#148)
All checks were successful
Build Container / build (push) Successful in 1m56s
## Summary
- Temporarily revert composite action to `docker build` so we can build the runner image (chicken-and-egg: current runner v2.5.0 doesn't have buildx)
- Bump runner label to `v2.5.1` so after sync the new runner image (with buildx) gets used

## Deployment plan
1. Merge this PR
2. Tag `forgejo-runner-v2.5.1` — builds with legacy `docker build` (one last time)
3. Sync forgejo-runner in ArgoCD to pick up the v2.5.1 label
4. Follow-up PR: switch action back to `docker buildx build`
5. Tag `nettest-v0.12.0` to verify buildx works end-to-end

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/148
2026-02-10 21:17:14 -08:00
aaf5090509 Remove ARGOCD_AUTH_TOKEN from external secret
Workflow secrets come from Forgejo's secret store, not runner env.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 17:17:53 -08:00
3a26d7e49a Update forgejo-runner image to v2.5.0
Fixes argocd CLI download.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 17:13:37 -08:00
f08595a3c0 Update forgejo-runner image to v2.4.0
Includes uv and argocd CLI for auto-deploy workflow.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 17:05:09 -08:00
1f73eb675d Auto-deploy docs from build workflow (#93)
## Summary
- Add `uv` and `argocd` CLI to forgejo-runner container image
- Add `workflow-bot` ArgoCD account with sync permissions (declarative via kustomize patches)
- Add `ARGOCD_AUTH_TOKEN` to forgejo-runner external secret for workflow auth
- Update build workflow to auto-deploy docs after release:
  - Update configmap with new release URL
  - Commit changelog and configmap changes
  - Sync docs app via ArgoCD

## Deployment and Testing
Manual steps required before this can work:
1. [ ] Build and push new forgejo-runner image (v2.4.0)
2. [ ] Sync argocd app to create workflow-bot account
3. [ ] Generate token: `argocd account generate-token --account workflow-bot`
4. [ ] Store token in 1Password under "Forgejo Secrets" with field `argocd_token`
5. [ ] Sync forgejo-runner app to pick up new external secret
6. [ ] Update forgejo-runner deployment to use new image version
7. [ ] Test by running workflow manually

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/93
2026-02-03 16:58:03 -08:00
9719fc05f7 Update forgejo-runner to v2.3.0 (Node.js 24)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 09:29:50 -08:00
4c852751db Update forgejo-runner to v2.2.0 (adds skopeo)
All checks were successful
Build Container / build (push) Successful in 13s
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 11:13:54 -08:00
9114aac8f6 Switch all ExternalSecrets to creationPolicy: Owner
ESO now has full ownership of these secrets.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-28 20:27:16 -08:00
dd6cf20d51 Remove obsolete secret templates
- Delete 13 .yaml.tpl files replaced by ExternalSecrets
- Update immich/README.md with direct CNPG secret copy instructions
- Update miniflux/README.md with context flag and ESO note

Only 1password-connect/secret-credentials.yaml.tpl remains (bootstrap).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-28 20:26:37 -08:00
351528474c Add ExternalSecrets for remaining k8s secrets
Migrate 10 secret templates to ESO ExternalSecrets with 1Password Connect:
- databases: eblume, borgmatic, teslamate passwords
- tailscale-operator: OAuth client credentials
- grafana-config: admin password, teslamate datasource
- teslamate: db password, encryption key
- forgejo-runner: runner registration token
- argocd: forge SSH credentials

All use creationPolicy: Merge for safe migration from existing secrets.

Skipped:
- miniflux/secret-db: Uses CNPG secret, not 1Password directly
- immich/secret-db: Requires 1Password item creation first
- 1password-connect: Bootstrap secret, must stay as template

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-28 19:50:38 -08:00
ea42362b6f Migrate Forgejo runner to Kubernetes with DinD (#60)
## Summary
- Deploy Forgejo runner to k8s with Docker-in-Docker sidecar
- Add job execution image with Node.js and Docker CLI
- Retire host-mode runner on indri
- All CI jobs now run containerized in k8s

## Components Added
- `containers/forgejo-runner/Dockerfile` - Job execution image
- `argocd/apps/forgejo-runner.yaml` - ArgoCD Application
- `argocd/manifests/forgejo-runner/` - Kubernetes manifests

## Components Removed
- `ansible/roles/forgejo_runner/` - No longer needed

## Changes to Existing Files
- `.forgejo/workflows/build-container.yaml` - Use `k8s` runner with `DOCKER_HOST` env
- `.github/actionlint.yaml` - Only `k8s` label now valid

## Deployment
1. Apply secret: `op inject -i argocd/manifests/forgejo-runner/secret.yaml.tpl | kubectl --context=minikube-indri apply -f -`
2. Sync ArgoCD: `argocd app sync forgejo-runner`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/60
2026-01-25 19:56:17 -08:00
8ca8798121 Switch to Buildah for container builds (#51)
All checks were successful
Test CI / test (push) Successful in 4s
## Summary
- Replace Docker with Buildah for container image builds
- No Docker socket required - buildah is daemonless
- Cleaner security model (no privileged containers or socket mounting)
- Remove Docker-related security context from deployment

## Changes
- Update Dockerfile to install buildah/podman instead of docker-cli
- Configure buildah storage with overlay driver and fuse-overlayfs
- Update composite action to use `buildah bud` and `buildah push`
- Add `imagePullPolicy: Always` to ensure fresh image pulls
- Update test workflow to verify buildah/podman

## Testing
- [ ] Runner pod starts successfully
- [ ] Buildah is available in runner
- [ ] Test workflow verifies buildah/podman versions
- [ ] Container build workflow builds and pushes to zot

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/51
2026-01-24 13:30:26 -08:00
5fcd122494 Reorganize CI/CD bootstrap phases and add custom runner Dockerfile (#50)
All checks were successful
Test CI / test (push) Successful in 2s
## Summary
- Reorder CI/CD bootstrap phases to address chicken-and-egg problem
- P2 is now "Custom Runner Image" (stock runner lacks Node.js)
- Add P3 for "Mirror Forgejo & Build from Source"
- Rename P3 -> P4 (Self-Deploy), P4 -> P5 (Container Builds)
- Add Dockerfile for custom runner with Node.js, npm, docker, build tools
- Update overview with new phase structure, host mode notes, and cross-compilation challenge

## Key Changes

### Phase Reordering
| Old | New | Name |
|-----|-----|------|
| P1 | P1 | Enable Actions (complete) |
| P2 | P2 | **Custom Runner Image** (new focus) |
| - | P3 | **Mirror Forgejo & Build** (new) |
| P3 | P4 | Self-Deploy |
| P4 | P5 | Container Builds |

### Custom Runner Dockerfile
The stock `forgejo/runner:3.5.1` image lacks Node.js, so `actions/checkout@v4` doesn't work. The new Dockerfile adds:
- Node.js + npm (for GitHub Actions)
- Docker CLI (for container builds)
- Build tools (gcc, make, curl, jq)

### Bootstrap Strategy
1. Build custom runner image manually on gilbert (podman build)
2. Push to zot registry
3. Update deployment to use custom image
4. Then enable auto-build workflow for runner

## Deployment and Testing
- [x] Review plan changes
- [x] Build custom runner image manually and verify
- [x] Update runner deployment
- [x] Test `actions/checkout@v4` works

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/50
2026-01-23 18:50:27 -08:00
7893c41020 Enable Forgejo Actions (Phase 1) (#48)
All checks were successful
Test CI / test (push) Successful in 0s
## Summary
- Refactor Forgejo app.ini to be managed by ansible with secrets from 1Password
- Enable Forgejo Actions in config (`[actions] ENABLED = true`)
- Add `repo.actions` to DEFAULT_REPO_UNITS
- Clean up unused MySQL database fields (we use SQLite)

## Phase 1 Progress
This PR covers the first part of Phase 1 (ci-cd-bootstrap plan):
- [x] Refactor app.ini to ansible template
- [x] Store secrets in 1Password
- [x] Enable Actions in config
- [ ] Deploy config changes (pending review)
- [ ] Create runner registration token
- [ ] Deploy runner to k8s
- [ ] Test with simple workflow

## Deployment and Testing
- [ ] Run `mise run provision-indri -- --tags forgejo` to deploy
- [ ] Verify Forgejo restarts correctly
- [ ] Verify Actions tab appears in repo settings

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/48
2026-01-23 17:00:12 -08:00