Commit graph

23 commits

Author SHA1 Message Date
3af346f1cd Move ollama LAN NodePort to port 80
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 09:37:50 -08:00
a87c997ee1 Expose Forgejo publicly at forge.eblu.me (#278)
All checks were successful
Deploy Fly.io Proxy / deploy (push) Successful in 1m28s
## Summary

Expose Forgejo publicly at `forge.eblu.me` via the Fly.io reverse proxy — the first dynamic, authenticated public-facing service.

- **Forgejo hardening:** Domain changed to forge.eblu.me, SSH stays on forge.ops.eblu.me, reverse proxy trust headers configured, local registration locked to external-only (Authentik SSO)
- **Tailscale Ingress:** ExternalName Service + Ingress in tailscale-operator creates forge.tail8d86e.ts.net endpoint
- **Fly.io proxy:** nginx server block with rate-limited auth endpoints (3r/s), fail2ban with custom nginx-deny action, security headers, /swagger blocked, WebSocket support, 512m body limit
- **Authentik:** OAuth callback updated to forge.eblu.me
- **DNS/TLS:** CNAME record in Pulumi, cert in fly-setup
- **Rename:** ~29 files updated from forge.ops.eblu.me to forge.eblu.me (HTTPS refs only; SSH, container builds, and Caddy table kept as-is)

## Deployment Order

1. `mise run provision-indri -- --tags forgejo` (config changes)
2. Verify forge.ops.eblu.me still works
3. `argocd app set tailscale-operator --revision feature/forge-public && argocd app sync tailscale-operator`
4. Verify `curl https://forge.tail8d86e.ts.net`
5. `cd fly && fly deploy`
6. Verify pre-DNS: `curl -H "Host: forge.eblu.me" https://blumeops-proxy.fly.dev/`
7. `fly certs add forge.eblu.me -a blumeops-proxy`
8. `argocd app set authentik --revision feature/forge-public && argocd app sync authentik`
9. `mise run dns-preview && mise run dns-up`
10. Full verification (see below)
11. Rehearse `mise run fly-shutoff`
12. After merge: reset ArgoCD revisions to main, re-sync

## Verification Checklist

- [ ] forge.eblu.me loads, shows public repos
- [ ] forge.ops.eblu.me still works from tailnet
- [ ] SSH clone via forge.ops.eblu.me:2222 works
- [ ] HTTPS clone via forge.eblu.me works
- [ ] UI shows forge.eblu.me for HTTPS clone, forge.ops.eblu.me for SSH
- [ ] /swagger returns 403
- [ ] Rapid login attempts trigger 429 rate limit
- [ ] fail2ban bans after 5 failed logins in 10 minutes
- [ ] ArgoCD can still sync (SSH unaffected)
- [ ] `mise run fly-shutoff` stops all public traffic
- [ ] `mise run services-check` passes

Reviewed-on: #278
2026-03-03 08:40:41 -08:00
2312e5fbf8 Update ringtail flake inputs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 15:22:29 -08:00
a5429d5a34 Update ringtail flake inputs, add flake-update pipeline (#240)
## Summary
- Update all ringtail NixOS flake inputs (nixpkgs, disko, home-manager) to latest
- Add `flake_update` Dagger function (`nix flake update`) alongside existing `flake_lock` (`nix flake lock`)
- Add how-to guide for managing the ringtail lockfile
- Update dagger and ringtail reference cards

## Deployment and Testing
- [x] `mise run provision-ringtail` — deployed successfully, `changed=2` (repo + rebuild)
- [x] `mise run services-check` — all services healthy
- [x] Doc link and index checks pass

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/240
2026-02-22 08:17:52 -08:00
b8bc0bffd8 Update ringtail flake.lock (remove dagger input)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 23:21:36 -08:00
02f2be5523 Use nix eval instead of dagger for nix runner version extraction
Dagger CLI needs a container runtime (Docker/containerd) to start its
engine, which the bare nix runner doesn't have. Use nix eval directly
instead — it's already available and more appropriate for a nix host.

Reverts the dagger flake input since it's not usable here.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 23:21:16 -08:00
de037ae51f Update ringtail flake.lock with dagger/nix input
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 23:15:19 -08:00
a2d2808270 Add dagger CLI to nix runner via github:dagger/nix flake
Dagger was removed from nixpkgs due to trademark concerns. Use the
official dagger/nix flake as a flake input instead, passing the package
through via specialArgs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 23:14:46 -08:00
e0d5f28147 Add dagger to nix-container-builder runner (#234)
## Summary
- Add `dagger` to `hostPackages` for the ringtail nix-container-builder runner
- Needed for `dagger call nix-version` fallback in the nix build workflow (authentik)
- `hostPackages` is scoped to the runner's systemd unit PATH, not system-wide
- Marks `install-dagger-on-nix-runner` Mikado card complete

## Deployment and Testing
- [ ] Merge, then `mise run provision-ringtail`
- [ ] `mise run container-build-and-release authentik` to verify nix build succeeds

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/234
2026-02-20 23:09:01 -08:00
d5d32fe91f Port Frigate NVR to ringtail k3s with GPU acceleration (#217)
## Summary

- Enable NVIDIA container toolkit on ringtail NixOS and configure k3s containerd with nvidia runtime
- Add NVIDIA device plugin ArgoCD app (RuntimeClass + DaemonSet) to expose `nvidia.com/gpu` resources
- Re-target Frigate from indri minikube (arm64, ZMQ detector) to ringtail k3s (x86_64, TensorRT/ONNX)
- Switch Frigate image to `-tensorrt` variant with GPU resource limits and increased shared memory

## Manual Prerequisites

1. **NFS access**: Verify ringtail can mount `sifaka:/volume1/frigate`
   ```fish
   ssh ringtail 'sudo mount -t nfs sifaka:/volume1/frigate /mnt/storage1 && ls /mnt/storage1 && sudo umount /mnt/storage1'
   ```
2. **YOLO model**: Verify `/volume1/frigate/models/yolov9m.onnx` exists on sifaka

## Deployment Steps

1. Provision ringtail: `mise run provision-ringtail`
2. Sync ArgoCD apps: `argocd app sync apps --prune`
3. Deploy NVIDIA device plugin: `argocd app sync nvidia-device-plugin`
4. Verify GPU: `kubectl --context=k3s-ringtail get nodes -o json | jq '.items[].status.capacity'`
5. Deploy Frigate: `argocd app sync frigate`

## Verification

- [ ] `nvidia.com/gpu: 1` visible in node capacity
- [ ] Frigate pod running with GPU allocated
- [ ] Frigate UI loads at `https://nvr.ops.eblu.me`
- [ ] Detector shows ONNX/TensorRT on System page
- [ ] Camera feed with bounding boxes in live view
- [ ] TensorRT engine build completes (watch logs on first start)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/217
2026-02-19 14:27:04 -08:00
695089499e Nix container build for nettest (#214)
## Summary
- Add `containers/nettest/default.nix` using `dockerTools.buildLayeredImage` with curl, jq, dnsutils, cacert, and bash — equivalent to the existing Dockerfile
- Update `container-tag-and-release` to require `--nix` or `--dockerfile` flag when both build types exist for a container
- Update `container-list` to show `[dockerfile+nix]` label when both exist

## Deployment and Testing
- [ ] SSH to ringtail, run `nix build -f containers/nettest/default.nix -o result` to verify the nix expression builds
- [ ] Tag `nettest-nix-v1.0.0`, confirm `build-container-nix` workflow runs on `nix-container-builder` runner and pushes to registry
- [ ] Smoke test on ringtail k3s: `kubectl run nettest --image=registry.ops.eblu.me/blumeops/nettest:v1.0.0 --restart=Never && kubectl logs nettest`
- [ ] Verify `mise run container-list` shows `[dockerfile+nix]` for nettest
- [ ] Verify `mise run container-tag-and-release nettest v1.1.0` prompts for build type

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/214
2026-02-19 08:42:58 -08:00
8f89239c78 Inhibit idle lock for fullscreen windows on ringtail (#212)
## Summary
- Adds `inhibit_idle fullscreen` window commands to sway config on ringtail
- Covers both Wayland-native (`app_id`) and XWayland (`class`) windows
- Prevents swayidle from locking the screen during gamepad-only gaming sessions where controller input isn't detected by the Wayland idle tracker

## Notes
This is a blanket fullscreen inhibit. A more targeted approach (daemon monitoring `/dev/input` gamepad events) may be desired later to allow idle lock during long-running fullscreen apps like Factorio.

## Deployment and Testing
- [ ] `mise run provision-ringtail` to deploy
- [ ] Run a fullscreen app and verify swayidle doesn't lock after 15 minutes
- [ ] Verify lock still activates when no fullscreen window is present

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/212
2026-02-19 07:20:05 -08:00
9829a6f971 Add screen lock and idle management to ringtail (#211)
## Summary
- Configure **swayidle** to lock screen (swaylock) after 15 minutes of inactivity
- Turn off display (DPMS) after 60 minutes, auto-restore on activity
- **swaylock** themed with Catppuccin Macchiato to match existing Sway config
- Add `Mod4+l` keybinding for manual screen lock
- Add PAM service for swaylock authentication
- Disable system suspend/hibernate entirely (workstation should never sleep)

## What changes
All changes in `nixos/ringtail/configuration.nix`:
- `security.pam.services.swaylock` — required for swaylock to authenticate on NixOS
- `systemd.sleep.extraConfig` — blocks all sleep/hibernate modes
- `programs.swaylock` (home-manager) — lock screen appearance config
- `services.swayidle` (home-manager) — idle timeout daemon with lock + DPMS events
- New keybinding `Mod4+l` for manual lock

## Deployment and Testing
- [ ] `mise run provision-ringtail`
- [ ] Verify swayidle is running: `systemctl --user status swayidle`
- [ ] Test manual lock with `Super+l`
- [ ] Verify display DPMS off after idle (can lower timeout temporarily to test)
- [ ] Confirm machine does not suspend: `systemctl status sleep.target`

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/211
2026-02-19 06:46:37 -08:00
918df9e642 Add k3s, 1Password Connect, and systemd nix-container-builder to ringtail (#209)
## Summary

  Extends ringtail from a desktop/gaming NixOS box into an infrastructure node with a k3s cluster, secrets management, and a Forgejo Actions
  runner for building containers with Nix.

  ### K3s cluster
  - Single-node k3s with Traefik/ServiceLB/metrics-server disabled (minimal footprint)
  - TLS SAN set to `ringtail.tail8d86e.ts.net` so ArgoCD on indri can manage it via Tailscale
  - Containerd registry mirrors pull through Zot on indri (`k3s-registries.yaml`)
  - Tailscale interface added to `trustedInterfaces` for cross-node ArgoCD access
  - `kubectl` added to system packages

  ### 1Password Connect + External Secrets Operator
  - Four new ArgoCD apps targeting `k3s-ringtail`: `1password-connect-ringtail`, `external-secrets-crds-ringtail`, `external-secrets-ringtail`,
  `external-secrets-config-ringtail`
  - Reuses the same Helm charts/values as indri, just pointed at ringtail's k3s API server
  - Bootstrap secrets (`op-credentials`, `onepassword-token`) provisioned by Ansible pre_tasks via `op read`, then applied to the `1password`
  namespace in post_tasks

  ### Systemd Forgejo Actions runner
  - Native `services.gitea-actions-runner` with `forgejo-runner` package — no DinD, no k8s pod, runs directly on the NixOS host
  - Label `nix-container-builder:host` — jobs execute on the host with `nix`, `skopeo`, `nodejs`, etc. in PATH
  - Registration token fetched from 1Password (`Forgejo Secrets/runner_reg`) by Ansible and written to `/etc/forgejo-runner/token.env`
  - Runner's dynamic user (`gitea-runner`) added to `nix.settings.trusted-users` for nix daemon access

  ### Nix container build workflow
  - New `.forgejo/workflows/build-container-nix.yaml` triggers on `*-nix-v[0-9]*` tags (e.g. `nettest-nix-v1.0.0`)
  - Builds with `nix build -f containers/<name>/default.nix`, pushes to Zot via `skopeo copy`
  - Existing Dockerfile workflow guarded with `if: !contains(github.ref_name, '-nix-v')` to avoid double-triggering

  ### Mise task updates
  - `container-tag-and-release` auto-detects `default.nix` vs `Dockerfile` and uses the appropriate tag format (`-nix-v` vs `-v`)
  - `container-list` shows build type indicator (`[nix]` / `[dockerfile]`)

  ## Post-merge

  1. `mise run provision-ringtail` — deploys k3s token, runner token, NixOS rebuild
  2. Register k3s cluster in ArgoCD (first time only):
     ```fish
     ssh ringtail 'sudo cat /etc/rancher/k3s/k3s.yaml' | \
       sed 's|127.0.0.1|ringtail.tail8d86e.ts.net|' > /tmp/k3s-ringtail.yaml
     set -x KUBECONFIG /tmp/k3s-ringtail.yaml
     argocd cluster add default --name k3s-ringtail
  3. Sync ArgoCD apps in order: 1password-connect-ringtail -> external-secrets-crds-ringtail -> external-secrets-ringtail ->
  external-secrets-config-ringtail
  4. Verify runner: ssh ringtail 'systemctl status gitea-runner-nix-container-builder'
  5. Check Forgejo admin panel for ringtail-nix-builder runner online
  6. Test: create containers/<name>/default.nix, tag with <name>-nix-v0.1.0

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/209
2026-02-18 21:15:30 -08:00
535f897054 Polish ringtail NixOS config and add documentation (#208)
## Summary
- Fix Super+Return keybinding to launch wezterm in sway
- Set fish as default login shell
- Remove `initialPassword` (real password already set)
- Add 1Password CLI + GUI, chezmoi, and dev tool packages (neovim, eza, fd, fzf, zoxide, starship, atuin, bat, ripgrep)
- Add ringtail reference card, update host inventory and reference index
- Changelog fragment

## Post-merge deployment
- `mise run provision-ringtail` to rebuild NixOS
- On ringtail: launch 1Password GUI, enable CLI integration (Settings > Developer > CLI integration)
- Chezmoi needs `.chezmoiignore` updates in the dotfiles repo (separate task)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/208
2026-02-18 17:53:47 -08:00
7bf46f4e28 Add flake.lock for ringtail NixOS config
Prevents 'Git tree is dirty' warnings during nixos-rebuild.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:31:21 -08:00
5a087c10df Fix deprecated greetd.tuigreet package reference
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:30:01 -08:00
4b7491c58f Add python3 to ringtail for Ansible compatibility
NixOS doesn't include Python by default. Ansible needs it on the
managed host for module execution.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:29:09 -08:00
b08ed98881 Enable passwordless sudo for wheel group on ringtail
Required for Ansible unattended provisioning via become: true.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:25:32 -08:00
8ee6c1271a Add --accept-routes and --ssh to tailscale config
Makes tailscale settings declarative so they persist across rebuilds.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:24:17 -08:00
aaf7e73c27 Fix sway on NVIDIA proprietary drivers
Sway/wlroots refuses to start on proprietary NVIDIA by default.
Add --unsupported-gpu flag and disable hardware cursors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 09:08:26 -08:00
104e49d337 Allow unfree packages for NVIDIA drivers and Steam
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 08:56:27 -08:00
b9d813cde1 Add NixOS configuration for ringtail workstation (#207)
## Summary
- NixOS flake for ringtail (gaming/compute workstation, RTX 4080) in `nixos/ringtail/`
- Declarative disk partitioning via disko (GPT, 512M EFI + ext4 root on NVMe)
- NVIDIA proprietary drivers, sway/Wayland desktop, greetd, PipeWire, Steam
- Tailscale integration for tailnet connectivity
- Ansible playbook + `mise run provision-ringtail` for ongoing management
- Pulumi auth key (`tag:homelab`, `tag:blumeops`) for tailnet bootstrap

## Deployment Order
1. **Merge PR**
2. `pulumi up` in tailscale stack → creates auth key
3. Retrieve auth key: `pulumi stack output ringtail_authkey --show-secrets`
4. On ringtail NixOS installer:
   - `nix run github:nix-community/disko -- --mode disko /tmp/disk-config.nix` (or from cloned repo)
   - `nixos-install --flake github:eblume/blumeops?dir=nixos/ringtail#ringtail`
5. Reboot, `tailscale up --auth-key=<key>`
6. Verify: `tailscale status`, SSH from gilbert

## Test plan
- [ ] Review NixOS configuration for completeness
- [ ] Verify disko partition layout matches ringtail hardware
- [ ] Run `pulumi preview` for tailscale stack
- [ ] Install NixOS on ringtail
- [ ] Confirm tailscale connectivity
- [ ] Confirm sway desktop works
- [ ] Test `mise run provision-ringtail` for ongoing management

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/207
2026-02-18 08:24:25 -08:00