Commit graph

8 commits

Author SHA1 Message Date
695089499e Nix container build for nettest (#214)
## Summary
- Add `containers/nettest/default.nix` using `dockerTools.buildLayeredImage` with curl, jq, dnsutils, cacert, and bash — equivalent to the existing Dockerfile
- Update `container-tag-and-release` to require `--nix` or `--dockerfile` flag when both build types exist for a container
- Update `container-list` to show `[dockerfile+nix]` label when both exist

## Deployment and Testing
- [ ] SSH to ringtail, run `nix build -f containers/nettest/default.nix -o result` to verify the nix expression builds
- [ ] Tag `nettest-nix-v1.0.0`, confirm `build-container-nix` workflow runs on `nix-container-builder` runner and pushes to registry
- [ ] Smoke test on ringtail k3s: `kubectl run nettest --image=registry.ops.eblu.me/blumeops/nettest:v1.0.0 --restart=Never && kubectl logs nettest`
- [ ] Verify `mise run container-list` shows `[dockerfile+nix]` for nettest
- [ ] Verify `mise run container-tag-and-release nettest v1.1.0` prompts for build type

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/214
2026-02-19 08:42:58 -08:00
918df9e642 Add k3s, 1Password Connect, and systemd nix-container-builder to ringtail (#209)
## Summary

  Extends ringtail from a desktop/gaming NixOS box into an infrastructure node with a k3s cluster, secrets management, and a Forgejo Actions
  runner for building containers with Nix.

  ### K3s cluster
  - Single-node k3s with Traefik/ServiceLB/metrics-server disabled (minimal footprint)
  - TLS SAN set to `ringtail.tail8d86e.ts.net` so ArgoCD on indri can manage it via Tailscale
  - Containerd registry mirrors pull through Zot on indri (`k3s-registries.yaml`)
  - Tailscale interface added to `trustedInterfaces` for cross-node ArgoCD access
  - `kubectl` added to system packages

  ### 1Password Connect + External Secrets Operator
  - Four new ArgoCD apps targeting `k3s-ringtail`: `1password-connect-ringtail`, `external-secrets-crds-ringtail`, `external-secrets-ringtail`,
  `external-secrets-config-ringtail`
  - Reuses the same Helm charts/values as indri, just pointed at ringtail's k3s API server
  - Bootstrap secrets (`op-credentials`, `onepassword-token`) provisioned by Ansible pre_tasks via `op read`, then applied to the `1password`
  namespace in post_tasks

  ### Systemd Forgejo Actions runner
  - Native `services.gitea-actions-runner` with `forgejo-runner` package — no DinD, no k8s pod, runs directly on the NixOS host
  - Label `nix-container-builder:host` — jobs execute on the host with `nix`, `skopeo`, `nodejs`, etc. in PATH
  - Registration token fetched from 1Password (`Forgejo Secrets/runner_reg`) by Ansible and written to `/etc/forgejo-runner/token.env`
  - Runner's dynamic user (`gitea-runner`) added to `nix.settings.trusted-users` for nix daemon access

  ### Nix container build workflow
  - New `.forgejo/workflows/build-container-nix.yaml` triggers on `*-nix-v[0-9]*` tags (e.g. `nettest-nix-v1.0.0`)
  - Builds with `nix build -f containers/<name>/default.nix`, pushes to Zot via `skopeo copy`
  - Existing Dockerfile workflow guarded with `if: !contains(github.ref_name, '-nix-v')` to avoid double-triggering

  ### Mise task updates
  - `container-tag-and-release` auto-detects `default.nix` vs `Dockerfile` and uses the appropriate tag format (`-nix-v` vs `-v`)
  - `container-list` shows build type indicator (`[nix]` / `[dockerfile]`)

  ## Post-merge

  1. `mise run provision-ringtail` — deploys k3s token, runner token, NixOS rebuild
  2. Register k3s cluster in ArgoCD (first time only):
     ```fish
     ssh ringtail 'sudo cat /etc/rancher/k3s/k3s.yaml' | \
       sed 's|127.0.0.1|ringtail.tail8d86e.ts.net|' > /tmp/k3s-ringtail.yaml
     set -x KUBECONFIG /tmp/k3s-ringtail.yaml
     argocd cluster add default --name k3s-ringtail
  3. Sync ArgoCD apps in order: 1password-connect-ringtail -> external-secrets-crds-ringtail -> external-secrets-ringtail ->
  external-secrets-config-ringtail
  4. Verify runner: ssh ringtail 'systemctl status gitea-runner-nix-container-builder'
  5. Check Forgejo admin panel for ringtail-nix-builder runner online
  6. Test: create containers/<name>/default.nix, tag with <name>-nix-v0.1.0

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/209
2026-02-18 21:15:30 -08:00
95364dcb48 Simplify runner image (Dagger Phase 3) (#162)
All checks were successful
Build Container / build (push) Successful in 1m13s
## Summary

With Phases 1 and 2 complete, the runner image no longer needs most of its bundled tools. This PR strips it down and adds what was missing.

**Removed** (now inside Dagger containers):
- Node.js 24.x
- Docker CLI + buildx plugin
- skopeo
- gnupg, lsb-release, xz-utils

**Added:**
- `tzdata` — fixes the TZ env var (#159, #160, #161) so `TZ=America/Los_Angeles` actually works
- `flyctl` — was being installed from scratch every release

**Workflow changes:**
- Remove "Ensure Dagger CLI" bootstrap steps from both workflows (Dagger is in the image)
- Remove "Install flyctl" step from build-blumeops (flyctl is in the image)
- Remove job-level `TZ` from build-blumeops (moved to runner configmap `runner.envs`)
- Set `TZ: America/Los_Angeles` in runner configmap so all job containers inherit it

## Deployment

After merge:
1. Build and release the new runner image: `mise run container-release forgejo-runner v2.0.0`
2. Sync the runner: `argocd app sync forgejo-runner`
3. Verify: `kubectl -n forgejo-runner exec deploy/forgejo-runner -c runner -- date` (but the real test is running a docs release and checking the changelog date)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/162
2026-02-11 17:24:20 -08:00
1bc2b421a8 Adopt Dagger CI for container builds (Phase 1) (#156)
All checks were successful
Build Container / build (push) Successful in 13s
## Summary

- Add Dagger Python module (`.dagger/`) with `build` and `publish` functions for container images
- Replace Docker buildx + skopeo composite action with `dagger call publish` in `build-container.yaml`
- BuildKit's native push is compatible with Zot — **skopeo workaround eliminated**
- Add Dagger CLI (v0.19.11) to forgejo-runner Dockerfile, bump runner to v2.6.0
- Bootstrap step in workflow curl-installs dagger if not in runner (for first build on v2.5.1 runner)
- Delete old `.forgejo/actions/build-push-image/` composite action
- Add GPLv3 LICENSE

## Verified locally

- `dagger call build --src=. --container-name=nettest` — builds ✓
- `dagger call publish --src=. --container-name=nettest --version=dagger-test` — pushed to Zot ✓
- `dagger call build --src=. --container-name=forgejo-runner` — new runner image builds ✓
- Dagger CLI accessible inside built runner image ✓

## Deployment sequence (after merge)

1. `mise run container-tag-and-release forgejo-runner v2.6.0` — old runner bootstraps dagger via curl, builds new runner
2. `argocd app sync forgejo-runner` — runner restarts with v2.6.0 (dagger baked in)
3. `mise run container-tag-and-release nettest v0.13.0` — end-to-end test of new pipeline
4. `mise run container-list` — verify tags

## Not included (future phases)

- Phase 2: docs build + Forgejo packages migration
- Phase 3: runner simplification (remove skopeo, Node.js, etc.)
- Phase 4: future workflows

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/156
2026-02-11 15:38:31 -08:00
fd29244854 Simplify CI: remove Tailscale sidecar, use skopeo for push (#74)
## Summary
- Remove Tailscale sidecar from build-push-image action - registry.ops.eblu.me is directly reachable from k8s pods via Caddy
- Use skopeo for pushing images instead of docker push - Docker 27's manifest format has compatibility issues with zot registry
- Remove tailscale_authkey secret requirement from workflows

## Deployment and Testing
- [x] Tested with nettest-v0.10.0 tag - build succeeded and image pushed to registry

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/74
2026-01-30 10:18:20 -08:00
ea42362b6f Migrate Forgejo runner to Kubernetes with DinD (#60)
## Summary
- Deploy Forgejo runner to k8s with Docker-in-Docker sidecar
- Add job execution image with Node.js and Docker CLI
- Retire host-mode runner on indri
- All CI jobs now run containerized in k8s

## Components Added
- `containers/forgejo-runner/Dockerfile` - Job execution image
- `argocd/apps/forgejo-runner.yaml` - ArgoCD Application
- `argocd/manifests/forgejo-runner/` - Kubernetes manifests

## Components Removed
- `ansible/roles/forgejo_runner/` - No longer needed

## Changes to Existing Files
- `.forgejo/workflows/build-container.yaml` - Use `k8s` runner with `DOCKER_HOST` env
- `.github/actionlint.yaml` - Only `k8s` label now valid

## Deployment
1. Apply secret: `op inject -i argocd/manifests/forgejo-runner/secret.yaml.tpl | kubectl --context=minikube-indri apply -f -`
2. Sync ArgoCD: `argocd app sync forgejo-runner`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/60
2026-01-25 19:56:17 -08:00
424647cd93 Use Tailscale sidecar for container registry push
Some checks failed
Build Container / build (push) Failing after 1m9s
Docker Desktop's VM can't resolve tailnet hostnames. Work around this by:
1. Starting a Tailscale container that joins the tailnet
2. Building the image with docker build
3. Saving to tarball with docker save
4. Pushing via skopeo inside the Tailscale container

Uses TS_CI_GATEWAY_AUTHKEY repository secret for authentication.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 19:29:01 -08:00
31697b4d63 Add nettest container for CI/CD network debugging (#52)
Some checks failed
Build Container / build (push) Failing after 18s
## Summary
- Add `containers/nettest/` with Alpine-based Dockerfile and connectivity test script
- Add `.forgejo/workflows/build-nettest.yaml` workflow triggered by `nettest-v*` tags
- Test script checks DNS resolution and HTTPS connectivity to forge and registry

## Deployment and Testing
- [ ] Merge PR to main
- [ ] Run `mise run container-release nettest v0.1.0` to trigger first build
- [ ] Verify workflow runs successfully and container can reach tailnet services
- [ ] Manually test from minikube: `kubectl run nettest --rm -it --image=registry.tail8d86e.ts.net/blumeops/nettest:v0.1.0`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/52
2026-01-24 16:54:35 -08:00