Switch to Buildah for container builds #51

Merged
eblume merged 30 commits from feature/p5-container-builds into main 2026-01-24 13:30:26 -08:00

Summary

  • Replace Docker with Buildah for container image builds
  • No Docker socket required - buildah is daemonless
  • Cleaner security model (no privileged containers or socket mounting)
  • Remove Docker-related security context from deployment

Changes

  • Update Dockerfile to install buildah/podman instead of docker-cli
  • Configure buildah storage with overlay driver and fuse-overlayfs
  • Update composite action to use buildah bud and buildah push
  • Add imagePullPolicy: Always to ensure fresh image pulls
  • Update test workflow to verify buildah/podman

Testing

  • [ ] Runner pod starts successfully
  • [ ] Buildah is available in runner
  • [ ] Test workflow verifies buildah/podman versions
  • [ ] Container build workflow builds and pushes to zot

🤖 Generated with Claude Code

- Create composite action: .forgejo/actions/build-push-image
- Add build-runner.yaml workflow (triggers on Dockerfile changes)
- Add build-devpi.yaml workflow (triggers on Dockerfile/start.sh changes)
- Mount Docker socket in runner deployment for container builds
- Run runner as root to access Docker socket

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Replace docker-cli with buildah/podman in runner image
- Configure buildah for overlay storage with fuse-overlayfs
- Add registry config for insecure local registry
- Remove Docker socket mount and root security context from deployment
- Update composite action to use buildah bud/push instead of docker

Buildah is daemonless - no Docker socket required, cleaner security model.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
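
The daemonless build-and-push flow this commit describes could look roughly like the sketch below. It is an illustration only, not the contents of the composite action: the `run` and `build_and_push` helpers, the `DRY_RUN` switch, and the image reference are hypothetical, and only the `buildah bud` and `buildah push` subcommands come from the commit message.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: build an image from <context>/Dockerfile, then push it.
# No Docker daemon or socket is involved; buildah does the work in-process.
build_and_push() {
  local image=$1 context=${2:-.}
  run buildah bud --tag "$image" --file "$context/Dockerfile" "$context"
  run buildah push "$image"
}

# The image reference below is a placeholder, not the project's registry.
DRY_RUN=1 build_and_push registry.example:5000/forgejo-runner:v1.0.1 .
```

With `DRY_RUN=1` the script only prints the two buildah commands, which makes it safe to try on a machine without buildah installed.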
Update test workflow to verify buildah/podman instead of docker
Some checks failed
Test CI / test (pull_request) Failing after 12s
6d8e6ea4c0
Fix SIGPIPE in test workflow by adding || true to piped commands
All checks were successful
Test CI / test (pull_request) Successful in 3s
a3a61146a3
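
The commit does not show the failing command, but a common way to hit this is a version check piped into `head` while `pipefail` is active: `head` exits after the first line, the producer is killed by SIGPIPE, and the step fails. A minimal sketch of the `|| true` workaround, using a hypothetical `first_line` helper and `yes` as a stand-in for a chatty command:

```shell
#!/usr/bin/env bash
set -eo pipefail

# Hypothetical helper: print only the first line of a command's output.
# Under pipefail the pipeline can report failure once head closes the pipe
# (typically exit status 141, i.e. SIGPIPE); "|| true" keeps that from
# aborting the script.
first_line() {
  "$@" | head -n 1 || true
}

# "yes" never stops writing on its own, so it is always cut off by head.
first_line yes "still running"
```

Without the `|| true`, the last line would stop the script under `set -e`.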
Add comment to test buildah workflow
All checks were successful
Test CI / test (pull_request) Successful in 3s
b2967817d6
Add tag-based container release workflow
All checks were successful
Test CI / test (pull_request) Successful in 3s
3702e7eec2
- Workflows trigger on git tags (e.g. runner-v1.0.0, devpi-v1.0.0)
- Composite action takes explicit version, tags image with version + SHA
- Add mise-tasks/container-list to enumerate containers and recent tags
- Add mise-tasks/container-release to create release tags
- Update CLAUDE.md with container release commands
- TODO: investigate zot tag immutability

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
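
The tag convention above can be sketched as a small parser. This is a hedged illustration rather than the workflow's actual logic: the function names, the registry host, the example SHA, and the choice to emit one version tag plus one short-SHA tag are all assumptions.

```shell
#!/usr/bin/env bash

# Split a release tag such as "runner-v1.0.0" into "<container> <version>".
# Returns non-zero for tags that do not follow the <name>-v<x.y.z> pattern.
parse_release_tag() {
  local re='^([a-z0-9-]+)-v([0-9]+\.[0-9]+\.[0-9]+)$'
  if [[ $1 =~ $re ]]; then
    printf '%s %s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
  else
    return 1
  fi
}

# Print the image references for a release: one by version, one by short SHA.
# (Assumed layout; the real action may combine version and SHA differently.)
image_refs() {
  local registry=$1 tag=$2 sha=$3 name version
  read -r name version < <(parse_release_tag "$tag") || return 1
  printf '%s/%s:v%s\n' "$registry" "$name" "$version"
  printf '%s/%s:%s\n' "$registry" "$name" "${sha:0:10}"
}

# Placeholder registry and SHA, for illustration only.
image_refs registry.example devpi-v1.0.0 0123456789abcdef
```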
Remove test comment from Dockerfile
Some checks failed
Test CI / test (pull_request) Successful in 2s
Build forgejo-runner / build (push) Failing after 2s
0c1a3bf0cf
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Build forgejo-runner from source with proper user setup
Some checks failed
Test CI / test (pull_request) Successful in 3s
Build forgejo-runner / build (push) Failing after 2s
4e0767b4d9
- Multi-stage build from mirrored forgejo-runner source
- Create proper runner user with passwd entry (fixes buildah)
- Use named user instead of numeric UID

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Use versioned runner image v1.0.1
Some checks failed
Test CI / test (pull_request) Successful in 3s
Build forgejo-runner / build (push) Failing after 1m14s
a979ddaf0c
- Remove imagePullPolicy: Always (rely on immutable tags)
- Use explicit version tag instead of :latest

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add subuid/subgid for rootless buildah
Some checks failed
Test CI / test (pull_request) Successful in 3s
Build forgejo-runner / build (push) Failing after 20s
8d2e180d5d
Buildah needs UID/GID remapping to extract images with files
owned by different users (root, shadow, etc). Configure
subordinate UID/GID ranges for the runner user.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
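
Subordinate ID ranges live in `/etc/subuid` and `/etc/subgid`, one `user:first_id:count` entry per line. The sketch below only formats such an entry; the `subid_entry` helper and the range values (start 100000, 65536 IDs) are assumptions, since the commit does not state which range was configured.

```shell
#!/usr/bin/env bash

# Format one /etc/subuid or /etc/subgid entry: <user>:<first id>:<count>.
subid_entry() {
  printf '%s:%d:%d\n' "$1" "$2" "$3"
}

# Assumed values: user "runner", range starting at 100000, 65536 IDs.
subid_entry runner 100000 65536

# During an image build (as root) the same line would be appended to both files:
#   subid_entry runner 100000 65536 >> /etc/subuid
#   subid_entry runner 100000 65536 >> /etc/subgid
```

With a range in place, rootless buildah can map image files owned by other users into the runner user's subordinate IDs.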
Add forgejo_runner Ansible role for indri
All checks were successful
Test CI / test (pull_request) Successful in 2s
676c1782d1
Run forgejo-runner directly on indri using Docker container mode
instead of trying to build containers inside k8s pods. This avoids
nested containerization complexity.

Features:
- Build from source using mise + Go
- Docker container mode for job isolation
- Can build containers via Docker socket
- Labels: docker-builder (distinct from k8s runner)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Fix 1Password field name for runner token
All checks were successful
Test CI / test (pull_request) Successful in 3s
7a637d2ebf
Use runner_reg field (matching existing k8s secret template)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Switch container builds to indri docker-builder runner
Some checks failed
Test CI / test (pull_request) Successful in 3s
Build forgejo-runner / build (push) Failing after 0s
2c284ed0cf
- Use Docker instead of buildah in composite action
- Build workflows now run on docker-builder label
- Add actionlint config for custom runner labels
- Avoids nested containerization complexity in k8s

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add README explaining .github vs .forgejo directories
All checks were successful
Test CI / test (pull_request) Successful in 2s
6b4e0961ed
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add ubuntu-latest labels to indri runner
Some checks failed
Test CI / test (pull_request) Failing after 1s
f4178fce7d
Now handles all workflows (test and build)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace k8s runner with ci-base image for local builds
Some checks failed
Test CI / test (pull_request) Failing after 1s
bcdee225e5
- Remove forgejo-runner k8s manifests and ArgoCD app (runner now on indri)
- Remove build-runner workflow (no longer needed)
- Add ci-base image with Ubuntu 22.04 + common CI tools
- Add build-ci-base workflow to build the image
- Update test workflow to check docker instead of buildah
- Document bootstrap vs production mode for runner labels
- Configure host.docker.internal:5050 for zot access from job containers

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add comment to test workflow to trigger CI run
Some checks failed
Test CI / test (pull_request) Failing after 0s
35136e361e
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update test workflow comment to trigger CI
Some checks failed
Test CI / test (pull_request) Failing after 1m15s
50b925791d
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Use host networking for job containers
Some checks failed
Test CI / test (pull_request) Failing after 36s
15e3ec98ea
Containers need to reach localhost:3001 (Forgejo) for git operations.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Use --add-host to map localhost to Docker host in job containers
Some checks failed
Test CI / test (pull_request) Failing after 40s
476b80e985
This allows containers to reach Forgejo at localhost:3001 for git operations.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add ci-gateway tag owner (admin and blumeops can assign)
- Grant ci-gateway access to forge:443 for git operations
- Grant ci-gateway access to registry:443 for container push/pull
- Add ACL test for ci-gateway access

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Containerize forgejo-runner with Tailscale gateway for tailnet access
Some checks failed
Test CI / test (pull_request) Failing after 48s
fdf5153130
Architecture:
- tailscale_ci_gateway role: Runs Tailscale container on tailnet-jobs network
- forgejo_runner role: Runs runner daemon in container on same network
- Job containers also use tailnet-jobs network

This allows the runner and jobs to reach forge.tail8d86e.ts.net via
the Tailscale gateway, avoiding hairpinning issues with localhost.

Changes:
- Add tailscale_ci_gateway role with launchd management
- Refactor forgejo_runner to use containerized daemon
- Runner registers with Tailscale URL instead of localhost
- Job containers run on tailnet-jobs network
- Update playbook role ordering (gateway before runner)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
launchd agents don't have /usr/local/bin in PATH by default.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
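
The commit does not say how the PATH gap was closed. One generic way to cope with launchd's minimal environment is a wrapper script that adds the directory before starting the real program. A minimal sketch, with a hypothetical `prepend_path` helper:

```shell
#!/usr/bin/env bash

# Prepend a directory to PATH unless it is already present.
prepend_path() {
  case ":$PATH:" in
    *":$1:"*) ;;              # already on PATH, nothing to do
    *) PATH="$1:$PATH" ;;
  esac
}

prepend_path /usr/local/bin
export PATH
```

Setting the variable in the agent's plist achieves the same result without a wrapper.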
Fix forgejo-runner networking for tailnet access
Some checks failed
Test CI / test (pull_request) Failing after 32s
c79dc94325
- Add --accept-routes to tailscale-ci-gateway for service routing
- Run forgejo-runner as root for docker socket access
- Mount actual docker socket path (not symlink)
- Use gateway network namespace for tailnet connectivity
- Registration uses gateway network for Forgejo access

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Switch forgejo-runner to host execution mode
All checks were successful
Test CI / test (pull_request) Successful in 4s
cfe5c0c0dd
Docker-based runner had networking issues reaching Forgejo from job
containers. Host execution mode runs the runner daemon directly on indri,
with jobs executing on the host. Actions that need Docker use host
networking to access localhost:3001.

- Runner binary compiled locally at ~/code/3rd/forgejo-runner
- Labels use :host suffix instead of :docker://image
- PATH set in launchd plist for mise-managed tools (node, etc.)
- Container network set to "host" for actions needing Docker

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
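
The label change can be illustrated with a small formatter. The `runner_label` helper and the image name are hypothetical; only the `:host` and `:docker://image` suffixes and the `ubuntu-latest` and `docker-builder` label names come from the commits in this PR.

```shell
#!/usr/bin/env bash

# Build a runner label string for either execution mode.
#   host   -> <name>:host              (job runs directly on the machine)
#   docker -> <name>:docker://<image>  (job runs in a container)
runner_label() {
  local name=$1 mode=$2 image=${3:-}
  case $mode in
    host)   printf '%s:host\n' "$name" ;;
    docker) [ -n "$image" ] || return 1
            printf '%s:docker://%s\n' "$name" "$image" ;;
    *)      return 1 ;;
  esac
}

runner_label ubuntu-latest host
# Placeholder image, shown only to contrast with the host form.
runner_label docker-builder docker example/ci-base:latest
```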
Remove tailscale_ci_gateway role and ACLs
All checks were successful
Test CI / test (pull_request) Successful in 4s
ad968eea46
The Docker-based runner with Tailscale sidecar approach was abandoned
in favor of host execution mode. Clean up the unused infrastructure:

- Remove tailscale_ci_gateway role and its reference in indri.yml
- Remove tag:ci-gateway ACL grants and tagOwners from pulumi policy
- Plist already removed from indri

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Remove placeholder workflows and ci-base manifest
All checks were successful
Test CI / test (pull_request) Successful in 4s
34211fa874
Keep only test.yaml workflow for now. Container build workflows
and ci-base Dockerfile will be added in a future PR.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
eblume merged commit 8ca8798121 into main 2026-01-24 13:30:26 -08:00
Reference
eblume/blumeops!51