Upgrade forgejo-runner to v12.8, adopt server.connections, and clean up docs (#338)

## Summary

- consolidate forgejo-runner how-to docs into current cards
- upgrade the k8s forgejo-runner deployment to the latest v12.8.x runner image
- switch the k8s runner from first-boot register flow to declarative server.connections config
- keep the runner image on the native Dagger build path and update the surrounding manifests/secrets

## Notes

- PR opened early for C1 review
- implementation and deployment verification will follow in subsequent commits

Reviewed-on: #338
2026-04-20 09:03:54 -07:00 · 2026-04-20 09:03:54 -07:00 · 1425bf1f5c
commit 1425bf1f5c
parent 353e2785c3
13 changed files with 142 additions and 140 deletions
--- a/argocd/manifests/forgejo-runner/config.yaml
+++ b/argocd/manifests/forgejo-runner/config.yaml
@@ -1,9 +1,8 @@
-# Reviewed against v12.7.3 defaults (2026-03-30)
+# Reviewed against v12.8.2 defaults (2026-04-20)
 log:
  level: info

 runner:
-  file: /data/.runner
  capacity: 2
  timeout: 3h
  shutdown_timeout: 3h
@@ -13,7 +12,15 @@ runner:
    TZ: America/Los_Angeles

 container:
-  # Job execution image is set via RUNNER_LABELS in deployment.yaml
  network: "host"
  # Connect to DinD sidecar via TCP (not socket)
  docker_host: tcp://127.0.0.1:2375
+
+server:
+  connections:
+    forgejo:
+      url: https://forge.ops.eblu.me/
+      uuid: ${FORGEJO_RUNNER_UUID}
+      token: ${FORGEJO_RUNNER_TOKEN}
+      labels:
+        - k8s:docker://registry.ops.eblu.me/blumeops/runner-job-image:v0.20.1-24f7512
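Note on the `labels` entry above: the single string carries the label that workflows target with `runs-on`, the execution backend, and the job image. The following is a minimal Python sketch of how such a label splits apart, assuming the `<name>:<scheme>://<image>` shape seen in this config. `RunnerLabel` and `parse_runner_label` are hypothetical names used for illustration only; they are not part of forgejo-runner, and other label shapes the runner may accept are not handled here.

```python
from typing import NamedTuple


class RunnerLabel(NamedTuple):
    """Parts of a runner label (hypothetical helper type, not a runner API)."""

    name: str    # what workflows reference in runs-on, e.g. "k8s"
    scheme: str  # execution backend, e.g. "docker"
    image: str   # job execution image, including its tag


def parse_runner_label(label: str) -> RunnerLabel:
    """Split '<name>:<scheme>://<image>' into its three parts.

    Only the shape used in this config is handled; anything else raises.
    """
    name, first_sep, rest = label.partition(":")
    scheme, second_sep, image = rest.partition("://")
    if not (name and first_sep and scheme and second_sep and image):
        raise ValueError(f"unexpected label shape: {label!r}")
    return RunnerLabel(name=name, scheme=scheme, image=image)


parsed = parse_runner_label(
    "k8s:docker://registry.ops.eblu.me/blumeops/runner-job-image:v0.20.1-24f7512"
)
print(parsed.name, parsed.scheme, parsed.image)
```

Splitting on the first `:` and then on `://` keeps the image tag (which also contains a `:`) intact.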
--- a/argocd/manifests/forgejo-runner/deployment.yaml
+++ b/argocd/manifests/forgejo-runner/deployment.yaml
@@ -25,14 +25,6 @@ spec:
          env:
            - name: TZ
              value: America/Los_Angeles
-            - name: DOCKER_HOST
-              value: tcp://localhost:2375
-            - name: FORGEJO_URL
-              value: "https://forge.ops.eblu.me"
-            - name: RUNNER_NAME
-              value: "k8s-runner"
-            - name: RUNNER_LABELS
-              value: "k8s:docker://registry.ops.eblu.me/blumeops/runner-job-image:v0.20.1-24f7512"
          command:
            - /bin/sh
            - -c
@@ -44,19 +36,11 @@ spec:
              done
              echo "Docker daemon ready"

-              # Register if not already registered
-              if [ ! -f /data/.runner ]; then
-                echo "Registering runner..."
-                forgejo-runner register \
-                  --instance "$FORGEJO_URL" \
-                  --token "$RUNNER_TOKEN" \
-                  --name "$RUNNER_NAME" \
-                  --labels "$RUNNER_LABELS" \
-                  --no-interactive
-              fi
+              # Render config with credentials from ExternalSecret.
+              envsubst < /config/config.yaml > /tmp/config.yaml

              # Start daemon
-              exec forgejo-runner daemon --config /config/config.yaml
+              exec forgejo-runner daemon --config /tmp/config.yaml
          envFrom:
            - secretRef:
                name: forgejo-runner-env
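Note on the `envsubst` step above: `envsubst` replaces an unset variable with an empty string, so a missing secret key would produce a config with a blank `uuid` or `token` rather than a startup error. The following is a minimal Python sketch of the same rendering with a strict check, assuming only `${NAME}` style placeholders as used in `config.yaml`. `render_config` is a hypothetical helper for illustration, not something this PR adds, and it does not reproduce every `envsubst` behavior (for example bare `$NAME` references).

```python
import re

# Matches ${NAME} placeholders only; bare $NAME is deliberately left alone.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def render_config(template_text: str, env: dict[str, str]) -> str:
    """Fill ${NAME} placeholders from env, failing on unset or empty values."""
    missing = sorted(
        {
            match.group(1)
            for match in PLACEHOLDER.finditer(template_text)
            if not env.get(match.group(1))
        }
    )
    if missing:
        raise KeyError(f"unset or empty variables: {', '.join(missing)}")
    return PLACEHOLDER.sub(lambda match: env[match.group(1)], template_text)


# Illustrative values only; the real ones come from the ExternalSecret.
sample_env = {
    "FORGEJO_RUNNER_UUID": "example-uuid",
    "FORGEJO_RUNNER_TOKEN": "example-token",
}
template = "uuid: ${FORGEJO_RUNNER_UUID}\ntoken: ${FORGEJO_RUNNER_TOKEN}\n"
print(render_config(template, sample_env))
```

Collecting every missing name before raising reports all absent credentials in one pass rather than one per restart.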
--- a/argocd/manifests/forgejo-runner/external-secret.yaml
+++ b/argocd/manifests/forgejo-runner/external-secret.yaml
@@ -1,11 +1,7 @@
-# ExternalSecret for Forgejo Runner token
+# ExternalSecret for Forgejo Runner credentials
 #
 # 1Password item: "Forgejo Secrets" in blumeops vault
-# Field: runner_reg (runner registration token)
-#
-# Non-secret env vars (FORGEJO_URL, RUNNER_NAME, RUNNER_LABELS) live in the
-# deployment spec so that changes (e.g. image version bumps) trigger a rollout
-# automatically.
+# Fields: runner_k8s_uuid, runner_k8s_token
 #
 apiVersion: external-secrets.io/v1
 kind: ExternalSecret
@@ -21,7 +17,11 @@ spec:
    name: forgejo-runner-env
    creationPolicy: Owner
  data:
-  - secretKey: RUNNER_TOKEN
+  - secretKey: FORGEJO_RUNNER_UUID
    remoteRef:
      key: Forgejo Secrets
-      property: runner_reg
+      property: runner_k8s_uuid
+  - secretKey: FORGEJO_RUNNER_TOKEN
+    remoteRef:
+      key: Forgejo Secrets
+      property: runner_k8s_token
--- a/argocd/manifests/forgejo-runner/kustomization.yaml
+++ b/argocd/manifests/forgejo-runner/kustomization.yaml
@@ -11,7 +11,7 @@ resources:
 images:
  - name: code.forgejo.org/forgejo/runner
    newName: registry.ops.eblu.me/blumeops/forgejo-runner
-    newTag: v12.7.3-352b95c
+    newTag: v12.8.2-bf16b8a
  - name: docker
    newTag: 27-dind

--- a/containers/forgejo-runner/container.py
+++ b/containers/forgejo-runner/container.py
@@ -13,7 +13,7 @@ from blumeops.containers import (
    oci_labels,
 )

-VERSION = "12.7.3"
+VERSION = "12.8.2"


 async def build(src: dagger.Directory) -> dagger.Container:
@@ -34,7 +34,7 @@ async def build(src: dagger.Directory) -> dagger.Container:

    # Stage 2: Runtime
    runtime = alpine_runtime(
-        extra_apk=["git", "bash", "ca-certificates"],
+        extra_apk=["git", "bash", "ca-certificates", "gettext-envsubst"],
        uid=1000,
        gid=1000,
        username="runner",
--- a/docs/changelog.d/forgejo-runner-v12-8-server-connections.infra.md
+++ b/docs/changelog.d/forgejo-runner-v12-8-server-connections.infra.md
@@ -0,0 +1 @@
+Upgraded the k8s Forgejo runner to the v12.8 line, switched it from first-boot registration to declarative `server.connections` credentials from 1Password, and consolidated the supporting runner how-to documentation.
--- a/docs/how-to/forgejo-runner/configure-k8s-runner.md
+++ b/docs/how-to/forgejo-runner/configure-k8s-runner.md
@@ -0,0 +1,100 @@
+---
+title: Configure K8s Forgejo Runner
+modified: 2026-04-20
+last-reviewed: 2026-04-20
+tags:
+  - how-to
+  - forgejo-runner
+  - ci
+---
+
+# Configure K8s Forgejo Runner
+
+Configure the Kubernetes Forgejo runner on [[indri]] using declarative `server.connections` config instead of first-boot `register`.
+
+## Why This Flow
+
+The older bootstrap pattern used `forgejo-runner register` on container start and persisted `/data/.runner` in an `emptyDir`. That works, but it depends on deprecated CLI flows and mutates runner identity at runtime.
+
+The preferred pattern is:
+
+- Create runner credentials once on the Forgejo host
+- Store the runner UUID and token in 1Password
+- Inject them into Kubernetes via [[external-secrets]]
+- Render `server.connections` in `argocd/manifests/forgejo-runner/config.yaml`
+
+This keeps runner identity under secret management and makes pod restarts idempotent.
+
+## Create Runner Credentials
+
+On [[indri]], use Forgejo's local CLI instead of the web UI:
+
+```bash
+ssh indri 'cd ~/code/3rd/forgejo && ./forgejo forgejo-cli actions register \
+  --name k8s-runner \
+  --scope instance \
+  --secret "$(openssl rand -hex 32)"'
+```
+
+This returns a runner UUID. The generated secret becomes the runner token. Store both in 1Password under the "Forgejo Secrets" item as:
+
+- `runner_k8s_uuid`
+- `runner_k8s_token`
+
+## Kubernetes Secret Wiring
+
+Expose those fields with `argocd/manifests/forgejo-runner/external-secret.yaml` and make them available to the runner container as environment variables.
+
+The deployment should not carry registration-only env vars like `FORGEJO_URL`, `RUNNER_NAME`, or `RUNNER_TOKEN`.
+
+## Runner Config
+
+Keep the runner configuration in `argocd/manifests/forgejo-runner/config.yaml`. The key change is adopting `server.connections`:
+
+```yaml
+server:
+  connections:
+    forgejo:
+      url: https://forge.ops.eblu.me
+      uuid: ${FORGEJO_RUNNER_UUID}
+      token: ${FORGEJO_RUNNER_TOKEN}
+      labels:
+        - k8s:docker://registry.ops.eblu.me/blumeops/runner-job-image:<tag>
+```
+
+Other settings that still matter for this deployment:
+
+- `runner.capacity: 2`
+- `runner.timeout: 3h`
+- `runner.shutdown_timeout: 3h`
+- `container.network: host`
+- `container.docker_host: tcp://127.0.0.1:2375`
+
+We do not currently use cache configuration, extra volume mounts, or multiple Forgejo connections.
+
+## Deployment Shape
+
+The pod still runs two containers:
+
+1. `runner` — Forgejo runner daemon
+2. `dind` — Docker-in-Docker sidecar
+
+The startup script only needs to wait for DinD and then launch the daemon. It should no longer call `forgejo-runner register` or depend on `/data/.runner`.
+
+## Upgrade Procedure
+
+When bumping the runner version:
+
+1. Update `VERSION` in `containers/forgejo-runner/container.py`
+2. Review release notes for runner breaking changes
+3. Confirm `config.yaml` is still compatible with the current runner defaults
+4. Build and release the updated `forgejo-runner` image
+5. Update `argocd/manifests/forgejo-runner/kustomization.yaml` to the new image tag
+6. Validate workflows with [[validate-forgejo-workflows]]
+7. Sync the `forgejo-runner` ArgoCD app and trigger a test workflow
+
+## Related
+
+- [[validate-forgejo-workflows]] — Validate workflow schema against the deployed runner line
+- [[forgejo-runner]] — Service reference
+- [[build-container-image]] — Build and release the runner image
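Note on steps 1 and 5 of the upgrade procedure above: the version in `container.py` and the image tag in `kustomization.yaml` are edited separately, so they can drift apart. The following is a minimal Python sketch of a consistency check, assuming the `v<version>-<short sha>` tag shape inferred from the two tags in this diff (`v12.7.3-352b95c`, `v12.8.2-bf16b8a`). `source_version` and `tag_matches` are hypothetical helpers, not existing blumeops tasks.

```python
import re

# Matches the module-level constant in containers/forgejo-runner/container.py.
VERSION_RE = re.compile(r'^VERSION = "([^"]+)"$', re.MULTILINE)
# Assumed tag shape: v<major>.<minor>.<patch>-<short commit sha>.
TAG_RE = re.compile(r"^v(?P<version>\d+\.\d+\.\d+)-(?P<sha>[0-9a-f]+)$")


def source_version(container_py_text: str) -> str:
    """Return the VERSION string declared in the container build script."""
    match = VERSION_RE.search(container_py_text)
    if match is None:
        raise ValueError("no VERSION assignment found")
    return match.group(1)


def tag_matches(version: str, image_tag: str) -> bool:
    """True when the image tag was built from the given runner version."""
    match = TAG_RE.match(image_tag)
    return match is not None and match.group("version") == version


version = source_version('VERSION = "12.8.2"\n')
print(version, tag_matches(version, "v12.8.2-bf16b8a"))
```

A check like this could run before the ArgoCD sync in step 7, so a stale tag is caught before the rollout.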
--- a/docs/how-to/forgejo-runner/review-runner-config-v12.md
+++ b/docs/how-to/forgejo-runner/review-runner-config-v12.md
@@ -1,39 +0,0 @@
----
-title: Review Runner Config for v12
-modified: 2026-02-27
-last-reviewed: 2026-02-27
-tags:
-  - how-to
-  - forgejo-runner
-  - ci
----
-
-# Review Runner Config for v12
-
-Compare the current runner ConfigMap against the v12.7.0 default config to identify new, changed, or deprecated keys.
-
-## Findings
-
-Compared `forgejo-runner generate-config` output from v6.3.1 and v12.7.0. Our config is minimal and remains valid for v12.
-
-### New sections in v12 (not adopted)
-
-- **`server.connections`** — multi-server polling. Not needed (single Forgejo instance).
-- **`cache.secret_url`** — load cache secret from file URL. Not needed.
-- **`runner.report_retry`** — retry config for log uploads. Defaults are fine.
-
-### Changed semantics
-
-- **`container.docker_host`** — v12 supports `unix://` and `ssh://` URLs. Our explicit `tcp://127.0.0.1:2375` still correct for DinD sidecar.
-- **`cache`** section restructured with proxy/server split and better docs. We don't configure cache, so defaults apply.
-
-### Config update applied
-
-Added `shutdown_timeout: 3h` to allow graceful job completion on pod termination (v12 default, was missing from our v6 config). Added review date comment.
-
-`container.valid_volumes` and `container.options` left empty — our jobs use host networking and don't mount volumes. Can harden later if needed.
-
-## Related
-
-- [[upgrade-k8s-runner]] — Parent goal
-- [[validate-workflows-against-v12]] — Sibling prerequisite
--- a/docs/how-to/forgejo-runner/upgrade-k8s-runner.md
+++ b/docs/how-to/forgejo-runner/upgrade-k8s-runner.md
@@ -1,52 +0,0 @@
----
-title: Upgrade K8s Forgejo Runner to v12
-modified: 2026-02-27
-last-reviewed: 2026-02-27
-tags:
-  - how-to
-  - forgejo-runner
-  - ci
----
-
-# Upgrade K8s Forgejo Runner to v12
-
-Upgrade the k8s forgejo-runner daemon from v6.3.1 to v12.7.0 (or latest v12.x at time of execution).
-
-## Background
-
-The k8s runner on indri (minikube) uses the upstream `code.forgejo.org/forgejo/runner` image, currently pinned to v6.3.1. The latest is v12.7.0. The runner is still in alpha and uses major version bumps for each breaking change, so v6→v12 crosses six major versions. The ringtail runner is already at ~v12.6.4 via nixpkgs and needs no work.
-
-Blast radius is low — if the upgrade breaks CI, revert the image tag in `argocd/manifests/forgejo-runner/deployment.yaml` and sync.
-
-## Breaking Changes Crossed
-
-| Version | Change | Impact |
-|---------|--------|--------|
-| v7.0 | CLI `--gitea-instance` → `--forgejo-instance`; `FORGEJO_*` env vars | Low — our registration doesn't use the old flag |
-| v8.0 | Workflow schema validation; default image → `node:22-bookworm` | Workflows must pass validation |
-| v9.0 | Stricter schema + actions validation; `forgejo-runner validate` added | Same — but now we have a tool |
-| v10.0 | Cache isolation; skip v10.0.0 (regression) | Low |
-| v11.0 | License MIT → GPLv3 | Non-technical |
-| v12.0 | Git binary required; git worktrees for remote actions | Low — OCI image includes git |
-
-## Execution Steps
-
-Once prerequisites are met:
-
-1. Update `argocd/manifests/forgejo-runner/deployment.yaml`:
-   - Change runner image from `code.forgejo.org/forgejo/runner:6.3.1` to `code.forgejo.org/forgejo/runner:12.7.0`
-2. Update `argocd/manifests/forgejo-runner/config.yaml` with any config changes from [[review-runner-config-v12]]
-3. Push, sync ArgoCD: `argocd app sync forgejo-runner`
-4. Verify runner registers and connects: check Forgejo admin → runners
-5. Trigger a test workflow (manual dispatch of `build-container.yaml` or `branch-cleanup.yaml`)
-6. Update `service-versions.yaml` to note the daemon version
-
-## Rollback
-
-Revert the image tag to `6.3.1` in `deployment.yaml`, push, and sync.
-
-## Related
-
-- [[forgejo]] — Forgejo service reference
-- [[validate-workflows-against-v12]] — Pre-upgrade workflow validation
-- [[review-runner-config-v12]] — Config format review
--- a/docs/how-to/forgejo-runner/validate-workflows-against-v12.md
+++ b/docs/how-to/forgejo-runner/validate-workflows-against-v12.md
@@ -1,20 +1,20 @@
 ---
-title: Validate Workflows Against v12
+title: Validate Forgejo Workflows
 modified: 2026-04-11
-last-reviewed: 2026-02-27
+last-reviewed: 2026-04-20
 tags:
  - how-to
  - forgejo-runner
  - ci
 ---

-# Validate Workflows Against v12
+# Validate Forgejo Workflows

-Run `forgejo-runner validate` (available from v9.0+) against all workflow files to catch schema issues before upgrading the k8s runner daemon.
+Run `forgejo-runner validate` against all workflow files to catch schema issues before upgrading the k8s runner daemon.

 ## Result

-All 6 workflows pass v12.7.0 schema validation with no changes needed:
+All current workflows pass the validation step with no changes needed:

 - `branch-cleanup.yaml` — OK
 - `build-blumeops.yaml` — OK
@@ -27,7 +27,7 @@ All 6 workflows pass v12.7.0 schema validation with no changes needed:

 1. `validate_workflows` function added to `src/blumeops/main.py` (formerly `.dagger/src/blumeops_ci/main.py`)
   - Uses `forgejo-runner validate --directory .` inside the upstream runner container
-   - `runner_version` parameter (default `12.7.0`) pins to deployed version
+   - `runner_version` parameter pins validation to the deployed runner line
 2. `mise run validate-workflows` task wired to `dagger call validate-workflows`
 3. Pre-commit hook triggers on `.forgejo/workflows/` changes

@@ -41,5 +41,4 @@ dagger call validate-workflows --src=.

 ## Related

-- [[upgrade-k8s-runner]] — Parent goal
-- [[review-runner-config-v12]] — Sibling prerequisite
+- [[configure-k8s-runner]] — Runner configuration and upgrade flow
--- a/docs/reference/services/forgejo-runner.md
+++ b/docs/reference/services/forgejo-runner.md
@@ -1,7 +1,7 @@
 ---
 title: Forgejo Runner
-modified: 2026-03-30
-last-reviewed: 2026-03-30
+modified: 2026-04-20
+last-reviewed: 2026-04-20
 tags:
  - service
  - ci-cd
@@ -22,21 +22,21 @@ Forgejo Actions runner daemon for CI/CD job execution. Runs as a Kubernetes pod
 | **Capacity** | 2 concurrent jobs |
 | **Timeout** | 3h |
 | **Forgejo Instance** | https://forge.ops.eblu.me |
-| **Image** | `code.forgejo.org/forgejo/runner` (see `argocd/manifests/forgejo-runner/kustomization.yaml` for current tag) |
+| **Image** | `registry.ops.eblu.me/blumeops/forgejo-runner` (see `argocd/manifests/forgejo-runner/kustomization.yaml` for current tag) |
 | **DinD Sidecar** | `docker:27-dind` |

 ## Architecture

 The pod runs two containers:

-1. **runner** - The Forgejo runner daemon. Registers with the forge on first start, then polls for jobs. Talks to DinD via `tcp://localhost:2375`.
+1. **runner** - The Forgejo runner daemon. Loads a rendered `server.connections` config at startup, then polls for jobs. Talks to DinD via `tcp://localhost:2375`.
 2. **dind** - Docker-in-Docker sidecar (privileged). Provides the Docker daemon for job container execution. Uses a registry mirror at `host.minikube.internal:5050` ([[zot]]).

-Runner state (`/data/.runner`) is stored in an `emptyDir` volume, so re-registration happens on pod restart. The registration token comes from 1Password via [[external-secrets]].
+The runner daemon image is built from `containers/forgejo-runner/container.py`, not pulled directly from upstream. Credentials come from 1Password via [[external-secrets]], and the startup script renders the final config before launching the daemon. The `/data` volume remains for the runner home directory and job scratch space, not for `.runner` registration state.

 ## Job Execution Image

-The actual container image used to run workflow steps is set via `RUNNER_LABELS` in the deployment, not in the runner config. This image is tracked separately as `runner-job-image` in `service-versions.yaml`. See [[build-container-image]] for how it's built.
+The actual container image used to run workflow steps is declared in `server.connections.labels` in the runner config. This image is tracked separately as `runner-job-image` in `service-versions.yaml`. See [[build-container-image]] for how it's built.

 ## Network

@@ -46,7 +46,8 @@ Jobs run with `network: "host"` to share the DinD network namespace. This gives

 | Secret | Source | Purpose |
 |--------|--------|---------|
-| `RUNNER_TOKEN` | 1Password ("Forgejo Secrets" → `runner_reg`) | Runner registration with forge |
+| `FORGEJO_RUNNER_UUID` | 1Password ("Forgejo Secrets" → `runner_k8s_uuid`) | Static runner identity for `server.connections` |
+| `FORGEJO_RUNNER_TOKEN` | 1Password ("Forgejo Secrets" → `runner_k8s_token`) | Static runner credential for `server.connections` |

 ## Related

--- a/docs/reference/services/forgejo.md
+++ b/docs/reference/services/forgejo.md
@@ -85,6 +85,7 @@ Both container workflows trigger on the same tag pattern (`*-v[0-9]*`). Each che
 Server configuration secrets managed via 1Password → Ansible:
 - `lfs-jwt-secret`, `internal-token`, `oauth2-jwt-secret` - Forgejo server tokens
 - `runner_reg` - Runner registration token (also in k8s via [[external-secrets]])
+- `runner_k8s_uuid`, `runner_k8s_token` - Static credentials for the k8s runner `server.connections` flow

 ## Forgejo Actions Secrets

--- a/service-versions.yaml
+++ b/service-versions.yaml
@@ -236,7 +236,7 @@ services:
  - name: forgejo-runner
    type: argocd
    last-reviewed: 2026-03-30
-    current-version: "12.7.3"
+    current-version: "12.8.2"
    upstream-source: https://code.forgejo.org/forgejo/runner/releases
    notes: >-
      Runner daemon version (code.forgejo.org/forgejo/runner). Job execution