blumeops/docs/reference/storage/backups.md

97 lines
3.2 KiB
Markdown
Raw Normal View History

---
title: Backups
Add borgmatic backups for authentik and immich databases (#314) ## Summary - Add `authentik` database (blumeops-pg cluster) to borgmatic pg_dump backups - Add `immich` database (immich-pg cluster) to borgmatic pg_dump backups - For immich-pg: new borgmatic managed role with `pg_read_all_data`, ExternalSecret, Tailscale LoadBalancer service, and Caddy L4 TCP proxy on port 5433 - Update backup docs to reflect all four CNPG databases + mealie SQLite ## Deploy plan Deploy order matters — k8s resources must exist before ansible can route to them: 1. **ArgoCD (databases app):** sync to pick up immich-pg borgmatic role, ExternalSecret, and Tailscale service ``` argocd app set blumeops-pg --revision feature/borgmatic-all-pg-backups argocd app sync blumeops-pg ``` 2. **Wait** for `immich-pg-tailscale` service to get a Tailscale IP and `immich-pg.tail8d86e.ts.net` to resolve 3. **Ansible (caddy):** deploy Caddy L4 route for port 5433 ``` mise run provision-indri -- --tags caddy ``` 4. **Ansible (borgmatic):** deploy updated config and .pgpass ``` mise run provision-indri -- --tags borgmatic ``` 5. **Verify:** trigger a manual borgmatic run and check all four pg_dump streams succeed ``` borgmatic --verbosity 1 2>&1 | grep -E '(Dumping|ERROR)' ``` ## Test plan - [x] `kubectl kustomize` builds cleanly - [x] `ansible --check --diff` for borgmatic and caddy show expected changes - [ ] ArgoCD sync succeeds for databases app - [ ] `immich-pg.tail8d86e.ts.net` resolves - [ ] `pg.ops.eblu.me:5433` accepts connections - [ ] `borgmatic --verbosity 1` dumps all four databases without errors Reviewed-on: https://forge.eblu.me/eblume/blumeops/pulls/314
2026-03-27 16:59:58 -07:00
modified: 2026-03-27
tags:
- storage
- backup
---
# Backup Policy
Daily automated backups from [[indri]] to [[sifaka|Sifaka]] NAS.
## Schedule
| Time | Frequency | System |
|------|-----------|--------|
| 2:00 AM | Daily | [[borgmatic]] |
## What Gets Backed Up
### Directories
| Path | Description | Priority |
|------|-------------|----------|
| `~/code/personal/zk` | Zettelkasten notes | Critical |
| `/opt/homebrew/var/forgejo` | Git repositories | Critical |
| `~/.config/borgmatic` | Backup config | High |
| `~/Documents` | Personal documents (includes [[1password]] encrypted export) | High |
### Databases
Add borgmatic backups for authentik and immich databases (#314) ## Summary - Add `authentik` database (blumeops-pg cluster) to borgmatic pg_dump backups - Add `immich` database (immich-pg cluster) to borgmatic pg_dump backups - For immich-pg: new borgmatic managed role with `pg_read_all_data`, ExternalSecret, Tailscale LoadBalancer service, and Caddy L4 TCP proxy on port 5433 - Update backup docs to reflect all four CNPG databases + mealie SQLite ## Deploy plan Deploy order matters — k8s resources must exist before ansible can route to them: 1. **ArgoCD (databases app):** sync to pick up immich-pg borgmatic role, ExternalSecret, and Tailscale service ``` argocd app set blumeops-pg --revision feature/borgmatic-all-pg-backups argocd app sync blumeops-pg ``` 2. **Wait** for `immich-pg-tailscale` service to get a Tailscale IP and `immich-pg.tail8d86e.ts.net` to resolve 3. **Ansible (caddy):** deploy Caddy L4 route for port 5433 ``` mise run provision-indri -- --tags caddy ``` 4. **Ansible (borgmatic):** deploy updated config and .pgpass ``` mise run provision-indri -- --tags borgmatic ``` 5. **Verify:** trigger a manual borgmatic run and check all four pg_dump streams succeed ``` borgmatic --verbosity 1 2>&1 | grep -E '(Dumping|ERROR)' ``` ## Test plan - [x] `kubectl kustomize` builds cleanly - [x] `ansible --check --diff` for borgmatic and caddy show expected changes - [ ] ArgoCD sync succeeds for databases app - [ ] `immich-pg.tail8d86e.ts.net` resolves - [ ] `pg.ops.eblu.me:5433` accepts connections - [ ] `borgmatic --verbosity 1` dumps all four databases without errors Reviewed-on: https://forge.eblu.me/eblume/blumeops/pulls/314
2026-03-27 16:59:58 -07:00
| Database | Cluster | Host | Method |
|----------|---------|------|--------|
| miniflux | blumeops-pg | [[postgresql|pg.ops.eblu.me:5432]] | pg_dump stream |
| teslamate | blumeops-pg | [[postgresql|pg.ops.eblu.me:5432]] | pg_dump stream |
| authentik | blumeops-pg | [[postgresql|pg.ops.eblu.me:5432]] | pg_dump stream |
| immich | immich-pg | [[postgresql|pg.ops.eblu.me:5433]] | pg_dump stream |
| mealie | — (SQLite) | k8s pod | kubectl exec sqlite3 .backup |
Add offsite backup for immich photo library to BorgBase (#315) ## Summary - Adds a second borgmatic config (`photos.yaml`) that backs up `/Volumes/photos` (sifaka SMB mount, ~128 GB) to a dedicated BorgBase repo (`immich-photos`), running daily at 4 AM - Separate launchd agent (`mcquack.eblume.borgmatic-photos`) so photo backups run independently from the main backup - Refactors `borgmatic_metrics` script to support multiple repos with a `repo` Prometheus label - Updates Grafana "Borg Backups" dashboard with a `repo` template variable so you can filter/compare repos - Docs updated: `backups.md`, `borgmatic.md` ## Prerequisites (manual) - [x] Create `immich-photos` repo on BorgBase with same SSH key - [ ] Upgrade BorgBase plan to Small ($24/yr) if currently on free tier (128 GB exceeds 10 GB limit) - [ ] After deploy: `borg init` the new repo (borgmatic does this automatically on first run) ## Test plan - [ ] Dry run: `mise run provision-indri -- --check --diff --tags borgmatic,borgmatic_metrics` - [ ] Deploy borgmatic role and verify both configs deployed - [ ] Run `borgmatic --config ~/.config/borgmatic/photos.yaml create --verbosity 1` manually for first backup (will take hours) - [ ] Verify metrics script collects from both repos: `~/.local/bin/borgmatic-metrics && cat /opt/homebrew/var/node_exporter/textfile/borgmatic.prom` - [ ] Sync grafana-config in ArgoCD and verify dashboard repo selector works 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.eblu.me/eblume/blumeops/pulls/315
2026-03-27 19:43:05 -07:00
## Immich Photo Library (Offsite Only)
The [[immich]] photo library lives on [[sifaka]] at `/volume1/photos` (SMB-mounted on [[indri]] as `/Volumes/photos`). Since sifaka is already the local backup target, photos are backed up to BorgBase offsite only — not back to sifaka.
| Property | Value |
|----------|-------|
| **Config** | `~/.config/borgmatic/photos.yaml` |
| **Schedule** | Daily at 4:00 AM (offset from main backup) |
| **Source** | `/Volumes/photos` (sifaka SMB mount) |
| **Target** | BorgBase `borgbase-immich-photos` repo |
| **Size** | ~128 GB |
Uses the same encryption passphrase and SSH key as the main borgmatic config.
## Sifaka-Native Data
Add offsite backup for immich photo library to BorgBase (#315) ## Summary - Adds a second borgmatic config (`photos.yaml`) that backs up `/Volumes/photos` (sifaka SMB mount, ~128 GB) to a dedicated BorgBase repo (`immich-photos`), running daily at 4 AM - Separate launchd agent (`mcquack.eblume.borgmatic-photos`) so photo backups run independently from the main backup - Refactors `borgmatic_metrics` script to support multiple repos with a `repo` Prometheus label - Updates Grafana "Borg Backups" dashboard with a `repo` template variable so you can filter/compare repos - Docs updated: `backups.md`, `borgmatic.md` ## Prerequisites (manual) - [x] Create `immich-photos` repo on BorgBase with same SSH key - [ ] Upgrade BorgBase plan to Small ($24/yr) if currently on free tier (128 GB exceeds 10 GB limit) - [ ] After deploy: `borg init` the new repo (borgmatic does this automatically on first run) ## Test plan - [ ] Dry run: `mise run provision-indri -- --check --diff --tags borgmatic,borgmatic_metrics` - [ ] Deploy borgmatic role and verify both configs deployed - [ ] Run `borgmatic --config ~/.config/borgmatic/photos.yaml create --verbosity 1` manually for first backup (will take hours) - [ ] Verify metrics script collects from both repos: `~/.local/bin/borgmatic-metrics && cat /opt/homebrew/var/node_exporter/textfile/borgmatic.prom` - [ ] Sync grafana-config in ArgoCD and verify dashboard repo selector works 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.eblu.me/eblume/blumeops/pulls/315
2026-03-27 19:43:05 -07:00
Other data lives directly on [[sifaka]] (music via [[navidrome]], video via [[jellyfin]]). See [[sifaka]] for data protection details.
## What Is NOT Backed Up
| Data | Reason |
|------|--------|
| ZIM archives (`~/transmission/`) | Re-downloadable via torrent |
| Prometheus metrics | Ephemeral, in k8s PVC |
| Loki logs | Ephemeral, in k8s PVC |
Migrate devpi from minikube to indri (launchd) (#341) ## Summary Devpi was crash-looping under memory pressure on the minikube StatefulSet, breaking the Python toolchain across the repo (`mise run docs-mikado`, `prek`, every `uv pip install`). It moves to indri as a native LaunchAgent. ## What changed - **New ansible role** `ansible/roles/devpi/`: installs `devpi-server` + `devpi-web` into a uv-managed venv, initializes the server-dir on first run via 1Password root password, runs as a LaunchAgent (`mcquack.eblume.devpi`) bound to `127.0.0.1:3141`. Bootstraps from upstream PyPI (so devpi can install itself on a fresh box). - **Caddy**: `pypi.ops.eblu.me` now proxies to `http://localhost:3141`. - **Playbook**: `indri.yml` gains pre_tasks for the root password and the new role. - **service-versions.yaml**: devpi flipped from `type: argocd` to `type: ansible`. - **ArgoCD**: removed `apps/devpi.yaml` and `manifests/devpi/`. The in-cluster Application, namespace, and PVC have been deleted. - **Docs**: new how-to `docs/how-to/operations/devpi-on-indri.md`; `restart-indri.md` lists devpi in the LaunchAgent stop list. ## Already deployed (live on indri) - Service running: `launchctl list mcquack.eblume.devpi` → PID 53888 - `curl https://pypi.ops.eblu.me/+api` returns 200 ✅ - `mise run docs-mikado` works again ✅ - 1.0G of cached PyPI data was migrated from the PVC to `~erichblume/devpi/server-dir/` - Minikube namespace and PVC fully reclaimed ## Test plan - [ ] `mise run services-check` (after merge) - [ ] CI workflows that use devpi succeed - [ ] No regressions in tools that depend on `pypi.ops.eblu.me` (prek, uv-script tasks, dagger pipelines) ## Context This is the C1 prelude to a planned C2 chain (`mikado/retire-minikube-indri`) to retire minikube on indri entirely. Doing devpi as a standalone C1 was the right call because (a) it was urgent — it was breaking the toolchain — and (b) it shakes out the migration recipe before we commit to a multi-leaf chain. Reviewed-on: https://forge.eblu.me/eblume/blumeops/pulls/341
2026-04-29 13:38:36 -07:00
| devpi cache (`~/devpi/server-dir/` on indri) | Re-fetchable from PyPI on first request |
## Retention Policy
| Period | Retention |
|--------|-----------|
| Daily | 7 backups |
| Monthly | 12 backups |
| Yearly | 1000 backups |
## Backup Targets
Add offsite backup for immich photo library to BorgBase (#315) ## Summary - Adds a second borgmatic config (`photos.yaml`) that backs up `/Volumes/photos` (sifaka SMB mount, ~128 GB) to a dedicated BorgBase repo (`immich-photos`), running daily at 4 AM - Separate launchd agent (`mcquack.eblume.borgmatic-photos`) so photo backups run independently from the main backup - Refactors `borgmatic_metrics` script to support multiple repos with a `repo` Prometheus label - Updates Grafana "Borg Backups" dashboard with a `repo` template variable so you can filter/compare repos - Docs updated: `backups.md`, `borgmatic.md` ## Prerequisites (manual) - [x] Create `immich-photos` repo on BorgBase with same SSH key - [ ] Upgrade BorgBase plan to Small ($24/yr) if currently on free tier (128 GB exceeds 10 GB limit) - [ ] After deploy: `borg init` the new repo (borgmatic does this automatically on first run) ## Test plan - [ ] Dry run: `mise run provision-indri -- --check --diff --tags borgmatic,borgmatic_metrics` - [ ] Deploy borgmatic role and verify both configs deployed - [ ] Run `borgmatic --config ~/.config/borgmatic/photos.yaml create --verbosity 1` manually for first backup (will take hours) - [ ] Verify metrics script collects from both repos: `~/.local/bin/borgmatic-metrics && cat /opt/homebrew/var/node_exporter/textfile/borgmatic.prom` - [ ] Sync grafana-config in ArgoCD and verify dashboard repo selector works 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.eblu.me/eblume/blumeops/pulls/315
2026-03-27 19:43:05 -07:00
| Repository | Location | Label | Backs up |
|------------|----------|-------|----------|
| `/Volumes/backups/borg/` | [[sifaka]] (local NAS) | `sifaka-borg-backups` | indri data |
| `ssh://u3ugi1x1@...repo.borgbase.com/./repo` | BorgBase (offsite) | `borgbase-offsite` | indri data |
| `ssh://xcrtl5tg@...repo.borgbase.com/./repo` | BorgBase (offsite) | `borgbase-immich-photos` | immich photos |
## Monitoring
Metrics exposed to [[prometheus]]:
- `borgmatic_up` - Repository accessible
- `borgmatic_last_archive_timestamp` - Last backup time
- `borgmatic_repo_deduplicated_size_bytes` - Disk usage
Dashboard: "Borgmatic Backups" in [[grafana]]
## Related
- [[borgmatic]] - Backup system details
- [[sifaka|Sifaka]] - Backup storage
- [[postgresql]] - Database backups
- [[restore-1password-backup]] - Recover 1Password from backup