BlumeOps CI/CD currently runs on Forgejo Actions (GitHub Actions-compatible). While functional, the system has pain points that are inherent to the GitHub Actions (GHA) ecosystem:
- **Hard to debug** — logs are buried in a web UI, no way to SSH into a running job, no interactive debugging
- **No local iteration** — the only way to test a workflow change is to push and wait for CI
- **Supply chain risk** — community actions are opaque third-party code running in your infrastructure
- **Runner complexity** — the k8s-runner image must bundle every tool any workflow might need (Docker CLI, buildx, skopeo, Node.js, etc.)
- **YAML as programming language** — complex workflows become unreadable
### Why Dagger?
[Dagger](https://dagger.io/) is an open-source (Apache-2.0) build engine built on BuildKit. It addresses every pain point above:
| Pain point | Dagger solution |
|------------|-----------------|
| Can't debug builds | `--interactive` drops you into a shell at the failure point; `.terminal()` adds breakpoints |
| Can't run locally | `dagger call` runs identically on your laptop and in CI — same code path |
| Supply chain risk | Build logic is your own Python code, not third-party actions |
| Runner bloat | Runner only needs Docker + `dagger` CLI; all tools live inside Dagger containers |
| YAML complexity | Pipelines are real Python (classes, decorators, async/await) — not templated YAML |
### What Dagger is NOT
Dagger is a **build engine**, not a CI scheduler. It does not handle triggers, scheduling, or webhooks. We keep Forgejo Actions as a thin trigger layer — its YAML becomes trivially simple (install dagger, run `dagger call`). All actual build logic moves to Python.
### Alternatives Considered
| System | Verdict | Reason |
|--------|---------|--------|
| **Buildkite** | Rejected | No fully self-hosted option (cloud control plane required); no native Forgejo integration; adds external dependency for a homelab |
| **Concourse CI** | Rejected | Fully self-hosted and great debugging (`fly intercept`), but verbose YAML with no built-in templating; small community; 2-4 GB RAM overhead for the scheduler; doesn't solve local iteration as cleanly |
| **Earthly** | Not viable | Project discontinued April 2025, all cloud services shut down July 2025 |
Dagger was chosen because it delivers the best local iteration story, supports Python natively, and requires zero infrastructure beyond what we already have (Docker on the runner).
## Architecture
```
┌─────────────────────┐ ┌──────────────────┐
│ Forgejo Actions │ │ Your terminal │
│ (trigger layer) │ │ (local dev) │
│ │ │ │
│ on: push tags │ │ mise run ... │
│ → dagger call ... │ │ → dagger call .. │
└──────────┬───────────┘ └────────┬──────────┘
│ │
▼ ▼
┌──────────────────────────────────────┐
│ Dagger Engine (BuildKit) │
│ │
│ blumeops-ci Python module │
│ ├── build(container_name) │
│ ├── publish(container_name, version)│
│ ├── build_docs(version) │
│ ├── release_docs(version, tokens) │
│ └── validate() │
└──────────────┬───────────────────────┘
│
┌────────┼────────┐
▼ ▼ ▼
┌─────┐ ┌──────┐ ┌───────┐
│ Zot │ │Forgejo│ │ArgoCD │
│ │ │Pkgs │ │ │
└─────┘ └──────┘ └───────┘
```
**Key principle:** The same `dagger call` command runs on your Mac during development and in the Forgejo runner during CI. The Forgejo Actions YAML is a thin shim that parses the trigger event and calls Dagger.
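As a rough illustration of how thin the shim can be, the sketch below turns a pushed tag ref into a `dagger call` argument list. The tag convention `<container>-v<version>`, the `--src=.` argument, and the flag names `--container-name` and `--version` (derived from the `publish(container_name, version)` signature in the diagram) are assumptions for illustration; the real workflow may parse its trigger differently.

```python
import re

# Hypothetical tag convention: "<container>-v<version>", e.g. "forgejo-runner-v1.2.0".
TAG_PATTERN = re.compile(r"^refs/tags/(?P<name>.+)-v(?P<version>\d+\.\d+\.\d+)$")


def dagger_args_from_ref(ref: str) -> list[str]:
    """Translate a pushed git ref into a `dagger call publish` argv."""
    match = TAG_PATTERN.match(ref)
    if match is None:
        raise ValueError(f"not a container release tag: {ref!r}")
    return [
        "dagger", "call", "publish",
        "--src=.",
        f"--container-name={match['name']}",
        f"--version=v{match['version']}",
    ]
```

Because the function only builds an argument list, the same list can be passed to `subprocess.run` in CI or printed and pasted into a local terminal.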
## Dagger Module Structure
```
dagger/
├── dagger.json # Module metadata, SDK selection
├── pyproject.toml # Python deps (httpx, etc.)
├── uv.lock # Locked dependencies
└── src/blumeops_ci/
└── __init__.py # All build functions
```
## Secrets Handling
Dagger has a first-class `Secret` type — values are never logged, cached, or visible in traces.
**From CLI:**
```bash
dagger call release-docs \
--src=. --version=v1.6.0 \
--forgejo-token=env:FORGEJO_TOKEN \
--argocd-token=env:ARGOCD_TOKEN
```
The `env:VARIABLE` syntax reads from environment variables. In Forgejo Actions, secrets are injected as env vars. Locally, a mise task calls `op read` to populate them.
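A local wrapper around this command might look like the following sketch. It fails fast when a secret variable is unset and puts only the variable names on the command line. The flag and variable names come from the CLI example above; the fail-fast check and the function name `release_docs_command` are additions for illustration, not part of the existing tooling.

```python
import os

# Flag name -> environment variable that holds the secret value.
REQUIRED_SECRETS = {
    "forgejo-token": "FORGEJO_TOKEN",
    "argocd-token": "ARGOCD_TOKEN",
}


def release_docs_command(version: str, environ=os.environ) -> list[str]:
    """Build the `dagger call release-docs` argv, passing secrets by reference."""
    missing = [var for var in REQUIRED_SECRETS.values() if not environ.get(var)]
    if missing:
        raise RuntimeError(f"missing secret env vars: {', '.join(missing)}")
    cmd = ["dagger", "call", "release-docs", "--src=.", f"--version={version}"]
    # Only the variable *name* appears in argv; Dagger resolves the value itself.
    cmd += [f"--{flag}=env:{var}" for flag, var in REQUIRED_SECRETS.items()]
    return cmd
```

A mise task could then run `subprocess.run(release_docs_command("v1.6.0"), check=True)` after populating the variables.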
**In Python code:**
```python
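import dagger
from dagger import function, object_type


@object_type
class BlumeopsCi:  # class name assumed from the module name blumeops-ci
    @function
    async def release_docs(
        self,
        src: dagger.Directory,
        version: str,
        forgejo_token: dagger.Secret,
        argocd_token: dagger.Secret,
    ) -> str:
        # plaintext() reads the value into the module runtime for direct API
        # calls; Dagger keeps it out of logs and traces
        token = await forgejo_token.plaintext()
        ...  # remaining upload/deploy logic omitted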
```
**Rule of thumb:** Simple API calls (Forgejo package upload) use Python `httpx` directly in the module runtime. CLI tools without good Python libraries (ArgoCD) run in container steps with secrets mounted as env vars via `.with_secret_variable()`.
## Phase 1: Container Builds
Migrate `build-container.yaml` to use Dagger for the build/push logic.
The composite action (`.forgejo/actions/build-push-image/`), skopeo workaround, and docker save/load dance are all eliminated — that logic lives in the Dagger module.
### Zot Manifest Compatibility
The current workflow uses skopeo because Docker 27's manifest format has issues with zot. Dagger's `.publish()` pushes through BuildKit instead of the Docker daemon, so it may or may not hit the same problem. This **must be tested** during implementation. If BuildKit's push also has zot compatibility issues, the Dagger function can shell out to skopeo inside a container step as a fallback.
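If the fallback turns out to be needed, the container step would run a `skopeo copy` invocation along these lines. The sketch only builds the argument list; it assumes the image has been exported as an OCI archive, and the archive path and image reference passed in are placeholders.

```python
def skopeo_copy_args(archive_path: str, image_ref: str, tls_verify: bool = True) -> list[str]:
    """Build a `skopeo copy` argv that pushes an OCI archive to a registry."""
    args = ["skopeo", "copy"]
    if not tls_verify:
        # For registries served over plain HTTP or with self-signed certificates.
        args.append("--dest-tls-verify=false")
    args += [f"oci-archive:{archive_path}", f"docker://{image_ref}"]
    return args
```

A list of this shape is what a Dagger container step would take as its command, so the fallback stays inside the module and the runner image still does not need skopeo installed.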
### Release Flow (Unchanged)
```bash
mise run container-tag-and-release <container> <version>
```
This decouples the docs artifact from git releases while keeping the versioned URL pattern. Forgejo releases can still be created for changelog/announcement purposes without carrying the tarball.
This is particularly valuable for debugging Quartz build issues and for iterating on a personal quartz fork.
### Forgejo Actions Integration
The workflow remains manually triggered (`workflow_dispatch`) to preserve centralized version sequencing. Dagger handles the build/upload/deploy; the workflow handles version resolution and git commit:
- Docker (for the Dagger engine — already available via DinD sidecar)
- The `dagger` CLI binary
- Git (for checkout)
- Basic shell utilities
All other tools (Node.js, skopeo, argocd, Python, npm) live inside the Dagger containers defined by the module. Adding a new tool to a build never requires rebuilding the runner image.
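A small preflight check can confirm that a slimmed-down runner image still provides the few tools listed above. This is a sketch; the function name `missing_tools` is hypothetical, and the list covers only the named executables (Docker, `dagger`, Git), not the basic shell utilities.

```python
import shutil

# Executables the runner itself is expected to provide.
REQUIRED_TOOLS = ("docker", "dagger", "git")


def missing_tools(required=REQUIRED_TOOLS, which=shutil.which) -> list[str]:
    """Return the required executables that cannot be found on PATH."""
    return [tool for tool in required if which(tool) is None]
```

Running this as the first step of a job, and failing when the returned list is non-empty, gives a clearer error than a later "command not found".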
### Implementation
Update `containers/forgejo-runner/Dockerfile` to remove tool-specific dependencies. Install the `dagger` CLI instead. The DinD sidecar in the Forgejo runner pod (`argocd/manifests/forgejo-runner/`) stays unchanged — Dagger's engine runs inside Docker, which the sidecar provides.
## Phase 4: Future Workflows
These are natural extensions once the Dagger module is established:
### Forked Project Builds
Once the [[upstream-fork-strategy]] is in place, forked projects (e.g., a personal quartz fork) can use the same Dagger patterns for building. The docs build function could accept a quartz source directory parameter instead of cloning upstream, enabling builds against the fork.
### Python Package Builds
If private Python packages are built for [[devpi]], Dagger is a natural fit:
A `validate` function that runs linting, doc link checks, and other pre-merge checks:
```bash
dagger call validate --src=.
# → runs docs-check-links, docs-check-index, docs-check-filenames, etc.
```
Same checks run locally and in CI. Could be triggered by Forgejo Actions on PR creation.
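One way such a function could aggregate its checks is to run all of them and report every failure at once, instead of stopping at the first. The sketch below shows only that aggregation logic in plain Python; the check names mirror the ones mentioned above, while the callables and the `run_checks` helper are placeholders.

```python
from typing import Callable


def run_checks(checks: dict[str, Callable[[], bool]]) -> list[str]:
    """Run every check and return the names of those that failed."""
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False  # a crashing check counts as a failure
        if not ok:
            failed.append(name)
    return failed
```

An empty return value means the validation passed; otherwise the list names each check to rerun locally.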
## Caveats and Risks
### Dagger Is Pre-1.0
Current version is v0.19.x. API breakage between versions is possible. Mitigations:
- Pin the Dagger CLI version in the runner image and local install
- Test upgrades on a branch before adopting
- The module is small enough to update quickly if APIs change
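The version pin can be enforced with a small check like the sketch below, which compares the version reported by the CLI against a pinned value. The pinned value shown is a placeholder, and the parser assumes only that the `dagger version` output contains a version string such as `v0.19.0` somewhere in its text.

```python
import re

PINNED_DAGGER_VERSION = "0.19.0"  # placeholder; use the version the runner image installs


def parse_dagger_version(output: str) -> str:
    """Extract the first x.y.z version found in the CLI's version output."""
    match = re.search(r"v?(\d+\.\d+\.\d+)", output)
    if match is None:
        raise ValueError(f"no version found in {output!r}")
    return match.group(1)


def matches_pin(output: str, pinned: str = PINNED_DAGGER_VERSION) -> bool:
    """Return True when the reported version equals the pinned one."""
    return parse_dagger_version(output) == pinned
```

Running the same check locally and in CI catches drift between the laptop install and the runner image before it shows up as an API mismatch.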
### Privileged Container Requirement
The Dagger engine requires privileged container access. The current Forgejo runner already uses DinD (privileged), so this should work, but it must be verified during implementation.
### BuildKit Cache Persistence
BuildKit caches aggressively, making repeated builds fast. Since the Forgejo runner pod is persistent (not ephemeral), the cache persists between CI runs. Locally, the Dagger engine maintains its own cache. No special cache configuration should be needed.