GitOps repository for personal infrastructure management

Nix 32.5%
Jinja 21.5%
Python 17.9%
Shell 11.8%
Go 8.1%
Other 8.2%

Erich Blume 2e46f99820	2026-03-22 19:31:22 -07:00
Some checks failed: Deploy Fly.io Proxy / deploy (push) failing after 7m0s

Upgrade Tailscale operator v1.94.2 → v1.96.3 (#304)

## Summary
- Bump Tailscale operator, proxy containers, and init containers from v1.94.2 to v1.96.3 across both clusters (indri + ringtail via shared base kustomization)
- Replace hand-rolled `until tailscale status` polling loop in `fly/start.sh` with `tailscale wait --timeout 60s` (new in v1.96.2)
- Stamp kube-state-metrics review date (already current at v2.18.0)

## Notable upstream changes (v1.94.2 → v1.96.3)
- Go upgraded from 1.25 to 1.26
- `tailscale wait` command: blocks until daemon is running + interface has IP
- AuthKey policy now applies only when users are not logged in (behavioral change)
- Peer Relay improvements (metrics, EC2 IMDS, UDP socket scaling)
- UPnP stability fixes

## Deploy plan
1. Merge PR
2. Sync tailscale-operator on indri: `argocd app sync tailscale-operator`
3. Sync tailscale-operator on ringtail: `argocd app sync tailscale-operator-ringtail --server ringtail...`
4. Verify proxy pods roll with new image: `kubectl --context=minikube-indri -n tailscale get pods`
5. Verify ingress connectivity (spot-check a few `*.tail8d86e.ts.net` services)
6. Rebuild + deploy Fly proxy container (separate step, picks up `tailscale wait` change)

## Test plan
- [ ] ArgoCD diff looks clean for both apps before sync
- [ ] Proxy pods on indri come up healthy with v1.96.3 images
- [ ] Proxy pods on ringtail come up healthy with v1.96.3 images
- [ ] Tailscale ingress services remain reachable (e.g., grafana, prometheus)
- [ ] Fly proxy rebuild deploys successfully with `tailscale wait`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: #304

.claude	Add Claude Code subagents for infrastructure workflows	2026-03-18 11:57:36 -07:00
.dagger	Upgrade Dagger from v0.19.11 to v0.20.0 (#285)	2026-03-05 09:32:13 -08:00
.forgejo/workflows	Expose Forgejo publicly at forge.eblu.me (#278)	2026-03-03 08:40:41 -08:00
.github
ansible	Fix borgmatic backup: use correct kubectl context on indri	2026-03-18 06:07:44 -07:00
argocd	Upgrade Tailscale operator v1.94.2 → v1.96.3 (#304)	2026-03-22 19:31:22 -07:00
containers	Update loki to 3.6.7 (#302)	2026-03-20 16:02:28 -07:00
docs	Upgrade Tailscale operator v1.94.2 → v1.96.3 (#304)	2026-03-22 19:31:22 -07:00
fly	Upgrade Tailscale operator v1.94.2 → v1.96.3 (#304)	2026-03-22 19:31:22 -07:00
mise-tasks	C2: Deploy infrastructure alerting pipeline (#303)	2026-03-22 14:52:56 -07:00
nixos/ringtail	Update ringtail flake inputs	2026-03-22 17:57:54 -07:00
pulumi	Expose Forgejo publicly at forge.eblu.me (#278)	2026-03-03 08:40:41 -08:00
.ansible-lint
.gitignore	agent memory ignore	2026-03-21 19:03:21 -07:00
.yamllint.yaml	Allow implicit octals in yamllint and normalize k8s mode values	2026-03-03 13:10:44 -08:00
Brewfile
CHANGELOG.md	Update docs release to v1.14.3	2026-03-22 18:20:41 -07:00
CLAUDE.md	Exclude docs from ai-sources, mention ai-sources in CLAUDE.md	2026-03-15 18:40:35 -07:00
dagger.json	Upgrade Dagger engine from v0.20.0 to v0.20.1	2026-03-06 20:41:02 -08:00
LICENSE
mise.toml	Upgrade Dagger engine from v0.20.0 to v0.20.1	2026-03-06 20:41:02 -08:00
prek.toml	Fix spider trap: disable SPA mode, remove index files, relax wiki-links (#290)	2026-03-09 11:59:43 -07:00
README.md	Remove suggestion to run prek manually from README	2026-03-05 08:15:25 -08:00
service-versions.yaml	Upgrade Tailscale operator v1.94.2 → v1.96.3 (#304)	2026-03-22 19:31:22 -07:00
towncrier.toml
update-loki-3.6.7.infra.md	Update loki to 3.6.7 (#302)	2026-03-20 16:02:28 -07:00

README.md

blumeops

aka "Blue Mops"

Tools and configuration for Erich Blume's personal infrastructure, orchestrated across a Tailscale tailnet.

This is a homelab, but it's also a testing ground for AI-assisted infrastructure development. Much of this codebase was co-authored with Claude Code, and the repo places heavy emphasis on documentation, process, and change classification to make that collaboration work well. I don't entirely know how I feel about LLMs in our current era (there are real concerns about how training data is sourced and about energy subsidy), but it felt important to learn how to work with these tools.

The full documentation is published at docs.eblu.me and lives in the docs/ directory, structured around the Diataxis framework and designed to be compatible with Obsidian/Obsidian.nvim.

What runs here

Services are a mix of Kubernetes pods (managed by ArgoCD), macOS LaunchAgent services (managed by Ansible), and NixOS systemd services (managed by Nix flakes), all connected via Tailscale:

Indri (Mac Mini M1) - primary server. Most services run in Minikube via ArgoCD; Forgejo, Caddy, and others run natively as LaunchAgent services via Ansible.
Ringtail (NixOS desktop, RTX 4080) - GPU workloads (Frigate NVR, Authentik SSO) on k3s, plus NixOS systemd services.
Sifaka (Synology NAS) - backup target and bulk storage.

Notable services include Grafana/Prometheus/Loki observability, Immich photos, Jellyfin media, Forgejo git forge, a Zot container registry, and more. Public access is routed through a Fly.io proxy; everything else is tailnet-only.
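The host and service-manager layout described above can be pictured as a small lookup table. The following is only an illustrative sketch: the `Host` type, the runtime labels, and the `managers_for` helper are hypothetical names made up for this example and do not correspond to any file in the repo.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Which tool manages each kind of service, per the description above.
# The runtime labels themselves are hypothetical shorthand.
RUNTIME_MANAGER: Dict[str, str] = {
    "kubernetes": "ArgoCD",
    "launchagent": "Ansible",
    "systemd": "Nix flakes",
}


@dataclass(frozen=True)
class Host:
    name: str
    hardware: str
    role: str
    runtimes: Tuple[str, ...]  # kinds of services that run on this host


HOSTS: Tuple[Host, ...] = (
    Host("indri", "Mac Mini M1", "primary server", ("kubernetes", "launchagent")),
    Host("ringtail", "NixOS desktop, RTX 4080", "GPU workloads", ("kubernetes", "systemd")),
    # The README lists no service runtimes for sifaka, so none are assumed here.
    Host("sifaka", "Synology NAS", "backup target and bulk storage", ()),
)


def managers_for(host_name: str) -> List[str]:
    """Return the tools that manage services on the named host."""
    for host in HOSTS:
        if host.name == host_name:
            return [RUNTIME_MANAGER[r] for r in host.runtimes]
    raise KeyError(f"unknown host: {host_name}")
```

For example, `managers_for("indri")` yields the ArgoCD and Ansible pairing described for the primary server.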

Project structure

ansible/            Ansible playbooks and roles (indri, sifaka)
argocd/apps/        ArgoCD Application definitions
argocd/manifests/   Kubernetes manifests per service
containers/         Custom container builds (Dockerfile + Nix)
docs/               Diataxis documentation (published at docs.eblu.me)
fly/                Fly.io public proxy configuration
mise-tasks/         Operational scripts run via mise
nixos/              NixOS configuration for ringtail
pulumi/             Pulumi IaC (Tailscale ACLs, Gandi DNS)
.dagger/            Dagger CI pipelines
.forgejo/           Forgejo Actions CI/CD workflows

Getting started

You'll need Homebrew and mise:

brew bundle                    # install CLI tools (argocd, tea, flyctl, etc.)
mise install                   # install managed toolchains (ansible, pulumi, dagger, etc.)
prek install                   # set up git hooks

Git hooks (via prek) enforce secret scanning (TruffleHog), linting, formatting, and custom checks like doc link validation and the Mikado branch invariant. They run automatically on git commit.

Operational tasks are driven through mise. Run mise tasks to see what's available. Key examples:

mise run provision-indri       # deploy to indri via Ansible
mise run services-check        # verify service health
mise run container-list        # list tracked container images
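If you want to script a sequence of these tasks (for instance, provision and then check health), a thin wrapper might look like the sketch below. The helper names `mise_run_argv` and `render_plan` are hypothetical; the only thing taken from the repo is the `mise run <task>` invocation and the task names listed above.

```python
import shlex
from typing import Iterable, List


def mise_run_argv(task: str) -> List[str]:
    """Build the argument vector for `mise run <task>`."""
    return ["mise", "run", task]


def render_plan(tasks: Iterable[str]) -> List[str]:
    """Render each task as the shell command line that would be run."""
    return [shlex.join(mise_run_argv(task)) for task in tasks]


# Dry run only: print the commands instead of executing them.
for line in render_plan(["provision-indri", "services-check"]):
    print(line)
```

To actually execute a task, each argument vector could be passed to `subprocess.run(argv, check=True)` so that a failing step stops the sequence.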

AI-assisted development

This repo is designed to be worked on by both humans and AI agents. The CLAUDE.md file provides instructions for Claude Code, and docs/tutorials/ai-assistance-guide.md explains the full workflow.

Changes are classified before starting work:

C0 - quick fixes, committed directly to main
C1 - feature branch + PR, documentation written before code
C2 - multi-phase work using the Mikado method for dependency tracking

See the agent change process for details.
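As a rough illustration of the three classes, the sketch below models them as an enum and reads a class label from a commit title. It assumes a `C2:`-style title prefix, which appears on one commit in this repo's history; whether that prefix is a required convention is an assumption, and `ChangeClass` and `classify_title` are hypothetical names.

```python
import re
from enum import Enum
from typing import Optional


class ChangeClass(Enum):
    C0 = "quick fixes, committed directly to main"
    C1 = "feature branch + PR, documentation written before code"
    C2 = "multi-phase work using the Mikado method for dependency tracking"


# Assumed convention: an optional "C0:", "C1:", or "C2:" prefix on the title.
_PREFIX = re.compile(r"^(C[0-2]):\s")


def classify_title(title: str) -> Optional[ChangeClass]:
    """Return the change class named in a title prefix, or None if absent."""
    match = _PREFIX.match(title)
    return ChangeClass[match.group(1)] if match else None
```

Titles without a prefix return `None` rather than a guessed class, since the classification is decided before work starts and cannot be inferred from the title alone.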

License

GPLv3