All checks were successful
Test CI / test (push) Successful in 2s
## Summary - Reorder CI/CD bootstrap phases to address chicken-and-egg problem - P2 is now "Custom Runner Image" (stock runner lacks Node.js) - Add P3 for "Mirror Forgejo & Build from Source" - Rename P3 -> P4 (Self-Deploy), P4 -> P5 (Container Builds) - Add Dockerfile for custom runner with Node.js, npm, docker, build tools - Update overview with new phase structure, host mode notes, and cross-compilation challenge ## Key Changes ### Phase Reordering | Old | New | Name | |-----|-----|------| | P1 | P1 | Enable Actions (complete) | | P2 | P2 | **Custom Runner Image** (new focus) | | - | P3 | **Mirror Forgejo & Build** (new) | | P3 | P4 | Self-Deploy | | P4 | P5 | Container Builds | ### Custom Runner Dockerfile The stock `forgejo/runner:3.5.1` image lacks Node.js, so `actions/checkout@v4` doesn't work. The new Dockerfile adds: - Node.js + npm (for GitHub Actions) - Docker CLI (for container builds) - Build tools (gcc, make, curl, jq) ### Bootstrap Strategy 1. Build custom runner image manually on gilbert (podman build) 2. Push to zot registry 3. Update deployment to use custom image 4. Then enable auto-build workflow for runner ## Deployment and Testing - [x] Review plan changes - [x] Build custom runner image manually and verify - [x] Update runner deployment - [x] Test `actions/checkout@v4` works 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/50
505 lines
11 KiB
Markdown
505 lines
11 KiB
Markdown
# Phase 5: Container Image Builds
|
|
|
|
**Goal**: Set up CI workflows to build custom container images and push to zot registry
|
|
|
|
**Status**: Planning
|
|
|
|
**Prerequisites**: [Phase 4](P4_self_deploy.md) complete (Forgejo self-deploying, Actions working)
|
|
|
|
---
|
|
|
|
## Overview
|
|
|
|
With Forgejo Actions operational (including custom runner from P2), we can now build container images for:
|
|
- Custom devpi with pre-installed plugins
|
|
- Any other custom images needed for k8s services
|
|
- Release artifacts for Python packages
|
|
|
|
**Note**: The custom runner image build is covered in [Phase 2](P2_mirror_and_build.md). This phase focuses on application container builds.
|
|
|
|
---
|
|
|
|
## Use Case 1: devpi Custom Image
|
|
|
|
### Current State
|
|
|
|
devpi runs from `registry.tail8d86e.ts.net/blumeops/devpi:latest`, built manually:
|
|
- Base image: python
|
|
- Adds: devpi-server, devpi-web
|
|
- Startup script for auto-initialization
|
|
|
|
### Goal
|
|
|
|
Automate builds triggered by:
|
|
- Push to devpi repo on forge
|
|
- Manual workflow dispatch
|
|
- Optionally: upstream devpi release (via schedule check)
|
|
|
|
---
|
|
|
|
## Step 1: Create Workflow for devpi
|
|
|
|
### 1.1 Ensure devpi Repo Has Dockerfile
|
|
|
|
The Dockerfile already exists at `argocd/manifests/devpi/Dockerfile`. We'll create a workflow in the blumeops repo that builds it.
|
|
|
|
### 1.2 Create Build Workflow
|
|
|
|
Create `.forgejo/workflows/build-devpi.yml` in blumeops repo:
|
|
|
|
```yaml
|
|
name: Build devpi Image
|
|
|
|
on:
|
|
push:
|
|
paths:
|
|
- 'argocd/manifests/devpi/Dockerfile'
|
|
- 'argocd/manifests/devpi/start.sh'
|
|
- '.forgejo/workflows/build-devpi.yml'
|
|
workflow_dispatch:
|
|
inputs:
|
|
tag:
|
|
description: 'Image tag (default: latest)'
|
|
required: false
|
|
default: 'latest'
|
|
|
|
env:
|
|
REGISTRY: registry.tail8d86e.ts.net
|
|
IMAGE_NAME: blumeops/devpi
|
|
|
|
jobs:
|
|
build:
|
|
runs-on: ubuntu-latest
|
|
steps:
|
|
- name: Checkout
|
|
uses: actions/checkout@v4
|
|
|
|
- name: Set up Docker Buildx
|
|
uses: docker/setup-buildx-action@v3
|
|
|
|
- name: Determine tag
|
|
id: tag
|
|
run: |
|
|
if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
|
|
TAG="${{ github.event.inputs.tag }}"
|
|
else
|
|
TAG="latest"
|
|
fi
|
|
echo "tag=$TAG" >> "$GITHUB_OUTPUT"
|
|
|
|
- name: Build image
|
|
uses: docker/build-push-action@v5
|
|
with:
|
|
context: argocd/manifests/devpi
|
|
file: argocd/manifests/devpi/Dockerfile
|
|
platforms: linux/arm64
|
|
load: true
|
|
tags: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ steps.tag.outputs.tag }}
|
|
|
|
- name: Push to registry
|
|
run: |
|
|
# Zot has no auth, just push
|
|
docker push ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ steps.tag.outputs.tag }}
|
|
|
|
- name: Verify push
|
|
run: |
|
|
# Check image exists in registry
|
|
curl -sf "https://${{ env.REGISTRY }}/v2/${{ env.IMAGE_NAME }}/tags/list" | jq .
|
|
```
|
|
|
|
### 1.3 Runner Needs Registry Access
|
|
|
|
The runner needs to reach `registry.tail8d86e.ts.net`. This should work via Tailscale egress (same as Forgejo access).
|
|
|
|
If not, add egress for registry in `argocd/manifests/tailscale-operator/`:
|
|
```yaml
|
|
apiVersion: tailscale.com/v1alpha1
|
|
kind: Connector
|
|
metadata:
|
|
name: egress-registry
|
|
namespace: tailscale-operator
|
|
spec:
|
|
hostname: egress-registry
|
|
subnetRouter:
|
|
advertiseRoutes:
|
|
- registry.tail8d86e.ts.net/32
|
|
```
|
|
|
|
---
|
|
|
|
## Step 2: Test Build Workflow
|
|
|
|
### 2.1 Push and Trigger
|
|
|
|
```bash
|
|
# Make a small change to trigger
|
|
echo "# Build $(date)" >> argocd/manifests/devpi/Dockerfile
|
|
git add argocd/manifests/devpi/Dockerfile
|
|
git commit -m "Trigger devpi image rebuild"
|
|
git push
|
|
```
|
|
|
|
### 2.2 Monitor Build
|
|
|
|
1. Go to https://forge.tail8d86e.ts.net/eblume/blumeops/actions
|
|
2. Watch "Build devpi Image" workflow
|
|
3. Verify success
|
|
|
|
### 2.3 Verify Image in Registry
|
|
|
|
```bash
|
|
curl -s https://registry.tail8d86e.ts.net/v2/blumeops/devpi/tags/list | jq .
|
|
```
|
|
|
|
### 2.4 Restart devpi to Use New Image
|
|
|
|
```bash
|
|
kubectl --context=minikube-indri -n devpi rollout restart statefulset/devpi
|
|
```
|
|
|
|
---
|
|
|
|
## Step 3: Reusable Container Build Workflow
|
|
|
|
### 3.1 Create Reusable Workflow
|
|
|
|
Create `.forgejo/workflows/build-container.yml`:
|
|
|
|
```yaml
|
|
name: Build Container Image
|
|
|
|
on:
|
|
workflow_call:
|
|
inputs:
|
|
context:
|
|
description: 'Build context path'
|
|
required: true
|
|
type: string
|
|
dockerfile:
|
|
description: 'Dockerfile path (relative to context)'
|
|
required: false
|
|
type: string
|
|
default: 'Dockerfile'
|
|
image_name:
|
|
description: 'Image name (without registry)'
|
|
required: true
|
|
type: string
|
|
tag:
|
|
description: 'Image tag'
|
|
required: false
|
|
type: string
|
|
default: 'latest'
|
|
platforms:
|
|
description: 'Target platforms'
|
|
required: false
|
|
type: string
|
|
default: 'linux/arm64'
|
|
|
|
env:
|
|
REGISTRY: registry.tail8d86e.ts.net
|
|
|
|
jobs:
|
|
build:
|
|
runs-on: ubuntu-latest
|
|
steps:
|
|
- name: Checkout
|
|
uses: actions/checkout@v4
|
|
|
|
- name: Set up Docker Buildx
|
|
uses: docker/setup-buildx-action@v3
|
|
|
|
- name: Build and push
|
|
uses: docker/build-push-action@v5
|
|
with:
|
|
context: ${{ inputs.context }}
|
|
file: ${{ inputs.context }}/${{ inputs.dockerfile }}
|
|
platforms: ${{ inputs.platforms }}
|
|
push: true
|
|
tags: ${{ env.REGISTRY }}/${{ inputs.image_name }}:${{ inputs.tag }}
|
|
|
|
- name: Verify push
|
|
run: |
|
|
curl -sf "https://${{ env.REGISTRY }}/v2/${{ inputs.image_name }}/tags/list" | jq .
|
|
```
|
|
|
|
### 3.2 Use in devpi Workflow
|
|
|
|
Simplify `.forgejo/workflows/build-devpi.yml`:
|
|
|
|
```yaml
|
|
name: Build devpi Image
|
|
|
|
on:
|
|
push:
|
|
paths:
|
|
- 'argocd/manifests/devpi/**'
|
|
workflow_dispatch:
|
|
|
|
jobs:
|
|
build:
|
|
uses: ./.forgejo/workflows/build-container.yml
|
|
with:
|
|
context: argocd/manifests/devpi
|
|
image_name: blumeops/devpi
|
|
```
|
|
|
|
---
|
|
|
|
## Step 4: Python Package Builds (Optional)
|
|
|
|
### 4.1 Use Case
|
|
|
|
Build Python packages from forge repos and publish to devpi.
|
|
|
|
Example: `mcquack` package (LaunchAgent management library)
|
|
|
|
### 4.2 Create Python Build Workflow
|
|
|
|
Create `.forgejo/workflows/build-python.yml`:
|
|
|
|
```yaml
|
|
name: Build Python Package
|
|
|
|
on:
|
|
workflow_call:
|
|
inputs:
|
|
package_path:
|
|
description: 'Path to package (contains pyproject.toml)'
|
|
required: false
|
|
type: string
|
|
default: '.'
|
|
python_version:
|
|
description: 'Python version'
|
|
required: false
|
|
type: string
|
|
default: '3.12'
|
|
publish:
|
|
description: 'Publish to devpi'
|
|
required: false
|
|
type: boolean
|
|
default: false
|
|
secrets:
|
|
DEVPI_PASSWORD:
|
|
required: false
|
|
|
|
env:
|
|
DEVPI_URL: https://pypi.tail8d86e.ts.net
|
|
|
|
jobs:
|
|
build:
|
|
runs-on: ubuntu-latest
|
|
steps:
|
|
- name: Checkout
|
|
uses: actions/checkout@v4
|
|
|
|
- name: Setup Python
|
|
uses: actions/setup-python@v5
|
|
with:
|
|
python-version: ${{ inputs.python_version }}
|
|
|
|
- name: Install uv
|
|
run: pip install uv
|
|
|
|
- name: Build package
|
|
run: |
|
|
cd ${{ inputs.package_path }}
|
|
uv build
|
|
|
|
- name: Upload artifact
|
|
uses: actions/upload-artifact@v4
|
|
with:
|
|
name: dist
|
|
path: ${{ inputs.package_path }}/dist/
|
|
|
|
- name: Publish to devpi
|
|
if: inputs.publish
|
|
run: |
|
|
cd ${{ inputs.package_path }}
|
|
uv publish \
|
|
--publish-url ${{ env.DEVPI_URL }}/eblume/dev/ \
|
|
--username eblume \
|
|
--password "${{ secrets.DEVPI_PASSWORD }}"
|
|
```
|
|
|
|
---
|
|
|
|
## Step 5: Scheduled Builds (Cron)
|
|
|
|
### 5.1 Weekly Rebuild
|
|
|
|
Keep images fresh with weekly rebuilds:
|
|
|
|
```yaml
|
|
name: Weekly Image Rebuilds
|
|
|
|
on:
|
|
schedule:
|
|
# Every Sunday at 3 AM UTC
|
|
- cron: '0 3 * * 0'
|
|
workflow_dispatch:
|
|
|
|
jobs:
|
|
devpi:
|
|
uses: ./.forgejo/workflows/build-container.yml
|
|
with:
|
|
context: argocd/manifests/devpi
|
|
image_name: blumeops/devpi
|
|
```
|
|
|
|
---
|
|
|
|
## Future Improvements
|
|
|
|
### Multi-Arch Builds
|
|
|
|
For images that need both ARM64 and AMD64:
|
|
|
|
```yaml
|
|
platforms: linux/arm64,linux/amd64
|
|
```
|
|
|
|
Requires QEMU emulation setup in runner (already supported by buildx).
|
|
|
|
### Build Caching
|
|
|
|
Use GitHub/Forgejo cache actions:
|
|
|
|
```yaml
|
|
- name: Cache Docker layers
|
|
uses: actions/cache@v4
|
|
with:
|
|
path: /tmp/.buildx-cache
|
|
key: ${{ runner.os }}-buildx-${{ hashFiles('**/Dockerfile') }}
|
|
```
|
|
|
|
### Security Scanning
|
|
|
|
Add Trivy or similar:
|
|
|
|
```yaml
|
|
- name: Run Trivy vulnerability scanner
|
|
uses: aquasecurity/trivy-action@master
|
|
with:
|
|
image-ref: '${{ env.REGISTRY }}/${{ inputs.image_name }}:${{ inputs.tag }}'
|
|
```
|
|
|
|
---
|
|
|
|
## Step 6: Runner Observability (Logging & Metrics)
|
|
|
|
### 6.1 Problem
|
|
|
|
The forgejo-runner pod generates logs and metrics that should be collected for:
|
|
- Debugging failed workflow runs
|
|
- Monitoring runner health and capacity
|
|
- Alerting on runner failures
|
|
|
|
### 6.2 Log Collection via Alloy
|
|
|
|
The forgejo-runner namespace needs to be included in Alloy's k8s log collection. Alloy is already configured to scrape logs from k8s pods - verify the runner namespace is included.
|
|
|
|
Check current Alloy config:
|
|
```bash
|
|
ssh indri 'cat ~/.config/alloy/config.alloy | grep -A20 discovery.kubernetes'
|
|
```
|
|
|
|
If using namespace filtering, ensure `forgejo-runner` is included.
|
|
|
|
### 6.3 Metrics Collection
|
|
|
|
The forgejo-runner exposes Prometheus metrics. Add a ServiceMonitor or configure Alloy to scrape:
|
|
|
|
**Option A: ServiceMonitor (if using Prometheus Operator)**
|
|
|
|
Create `argocd/manifests/forgejo-runner/servicemonitor.yaml`:
|
|
```yaml
|
|
apiVersion: monitoring.coreos.com/v1
|
|
kind: ServiceMonitor
|
|
metadata:
|
|
name: forgejo-runner
|
|
namespace: forgejo-runner
|
|
spec:
|
|
selector:
|
|
matchLabels:
|
|
app: forgejo-runner
|
|
endpoints:
|
|
- port: metrics
|
|
interval: 30s
|
|
```
|
|
|
|
**Option B: Alloy scrape config**
|
|
|
|
Add to Alloy's k8s scrape config to discover the runner pod's metrics endpoint.
|
|
|
|
### 6.4 Create Runner Service for Metrics
|
|
|
|
Add `argocd/manifests/forgejo-runner/service.yaml`:
|
|
```yaml
|
|
apiVersion: v1
|
|
kind: Service
|
|
metadata:
|
|
name: forgejo-runner-metrics
|
|
namespace: forgejo-runner
|
|
labels:
|
|
app: forgejo-runner
|
|
spec:
|
|
selector:
|
|
app: forgejo-runner
|
|
ports:
|
|
- name: metrics
|
|
port: 8080
|
|
targetPort: 8080
|
|
```
|
|
|
|
Update kustomization.yaml to include the service.
|
|
|
|
### 6.5 Grafana Dashboard
|
|
|
|
Consider creating a dashboard for:
|
|
- Runner status (online/offline)
|
|
- Job queue depth
|
|
- Job execution time
|
|
- Success/failure rates
|
|
|
|
### 6.6 Verification
|
|
|
|
```bash
|
|
# Check runner logs are appearing in Loki
|
|
# Go to Grafana → Explore → Loki
|
|
# Query: {namespace="forgejo-runner"}
|
|
|
|
# Check metrics are being scraped
|
|
# Go to Grafana → Explore → Prometheus
|
|
# Query: forgejo_runner_*
|
|
```
|
|
|
|
---
|
|
|
|
## Verification Checklist
|
|
|
|
- [ ] devpi build workflow created
|
|
- [ ] devpi image builds successfully
|
|
- [ ] Image pushed to zot registry
|
|
- [ ] devpi pod uses new image
|
|
- [ ] Reusable container workflow created
|
|
- [ ] (Optional) Python build workflow created
|
|
- [ ] (Optional) Scheduled builds configured
|
|
- [ ] Runner logs visible in Loki
|
|
- [ ] Runner metrics scraped by Prometheus/Alloy
|
|
|
|
---
|
|
|
|
## Summary
|
|
|
|
With this phase complete, we have:
|
|
1. **Forgejo Actions** running with k8s runner
|
|
2. **Forgejo self-deploys** from CI on tagged releases
|
|
3. **Container images** built automatically on push
|
|
4. Infrastructure for Python package builds
|
|
5. **Runner observability** with logs in Loki and metrics in Prometheus
|
|
|
|
The CI/CD bootstrap is complete. Future work:
|
|
- Add more container builds as needed
|
|
- Add Python package publishing for internal tools
|
|
- Consider adding a macOS runner on indri for native builds
|
|
- Create Grafana dashboards for CI/CD monitoring
|