---
title: Why GitOps
date-modified: 2026-02-07
tags:
- explanation
- philosophy
---
# Why GitOps?
> **Note:** This article was drafted by AI and reviewed by Erich. I plan to rewrite all explanatory content in my own words; these articles serve as placeholders to establish the documentation structure.
BlumeOps uses GitOps principles for managing personal infrastructure. This might seem like overkill for a homelab, but the problems GitOps solves show up at any scale.
## The Problem with Manual Infrastructure
Traditional server management involves SSHing into machines and running commands. This works, but creates problems:
- **Drift**: The actual state diverges from what you think it is
- **Amnesia**: You forget what you changed and why
- **Fragility**: One bad command can break things with no easy rollback
- **Bus factor**: Only you know how it works (even AI assistants struggle without context)
## Git as the Source of Truth
GitOps inverts the model: instead of pushing changes to servers, you commit desired state to Git, and automation pulls it into reality.
**Benefits:**
- Every change is tracked with commit history
- Pull requests enable review before deployment
- Rollback is a `git revert` followed by the next apply or sync
- The repo *is* the documentation
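
The pull model can be pictured as a diff between the state committed to Git and the state found on the machine. The following is a minimal sketch of that idea, not how ArgoCD, Ansible, or Pulumi are implemented; the `plan` function and the service names and versions are hypothetical and exist only for illustration.

```python
def plan(desired: dict[str, str], actual: dict[str, str]) -> list[str]:
    """Return the actions needed to make `actual` match `desired`."""
    actions = []
    for name, version in sorted(desired.items()):
        if name not in actual:
            actions.append(f"create {name}={version}")
        elif actual[name] != version:
            actions.append(f"update {name}: {actual[name]} -> {version}")
    # Anything running that Git does not describe is drift to be removed.
    for name in sorted(actual.keys() - desired.keys()):
        actions.append(f"delete {name}")
    return actions


# Desired state as it would be committed to Git (hypothetical values).
desired = {"caddy": "2.8", "grafana": "11.0"}
# Actual state after an untracked manual change over SSH (hypothetical values).
actual = {"caddy": "2.7", "scratch-db": "1.0"}

for action in plan(desired, actual):
    print(action)
```

Because the plan is derived from the repository rather than from memory, reverting a commit changes `desired`, and the next run produces the actions that undo the change.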
## Why This Matters for a Homelab
A personal homelab isn't a production environment, but it shares the same challenges:
1. **Memory is unreliable** - Six months from now, you won't remember why you configured Caddy that way
2. **Experimentation is constant** - You try things, break things, want to undo things
3. **AI assistance needs context** - Claude can help much more effectively when it can read your infrastructure as code
## The BlumeOps Approach
BlumeOps uses layered GitOps:
| Layer | Tool | What it manages |
|-------|------|-----------------|
| **Tailnet** | [[tailscale|Pulumi]] | ACLs, tags, DNS |
| **Host config** | [[roles|Ansible]] | Services on [[indri]] |
| **Kubernetes** | [[argocd|ArgoCD]] | Containerized workloads |
Each layer has its own reconciliation trigger:
- Pulumi applies on `mise run tailnet-up`
- Ansible applies on `mise run provision-indri`
- ArgoCD watches Git and syncs each application either automatically or when triggered manually
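
The difference between reporting drift and correcting it can be sketched with a single flag. This is a hypothetical illustration: `reconcile` and its parameters are invented for this example and are not any tool's API. The return values borrow ArgoCD's `Synced` and `OutOfSync` status names.

```python
def reconcile(desired: dict, actual: dict, auto_sync: bool) -> str:
    """Compare states and apply the desired state only if auto_sync is set."""
    if desired == actual:
        return "Synced"
    if not auto_sync:
        # Manual mode: report the drift and wait for an operator to sync.
        return "OutOfSync"
    # Automatic mode: overwrite the live state with what Git declares.
    actual.clear()
    actual.update(desired)
    return "Synced"


desired = {"replicas": 2}
live = {"replicas": 1}

print(reconcile(desired, live, auto_sync=False))  # drift is reported, nothing changes
print(reconcile(desired, live, auto_sync=True))   # drift is corrected
```

In these terms, the Pulumi and Ansible layers behave like the automatic branch run once per `mise run` invocation, while ArgoCD runs the comparison continuously and chooses a branch per application.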
## Trade-offs
GitOps isn't free:
- **Learning curve** - You need to understand Ansible, ArgoCD, Pulumi
- **Indirection** - Can't just `apt install` something; need to add it to config
- **Complexity** - More moving parts than a simple server
But for BlumeOps, the trade-off is worth it. The infrastructure is complex enough that managing it imperatively would be error-prone, and the GitOps approach enables effective AI-assisted operations.
## Related
- [[architecture]] - How the pieces fit together
- [[argocd]] - Kubernetes GitOps
- [[roles|Ansible roles]] - Host configuration