blumeops/docs/zk/1768506761-GHUW.md
Erich Blume b8104d75ad Move zk cards to docs/zk/ for documentation restructuring (#84)
2026-02-03 09:13:50 -08:00

id: 1768506761-GHUW
aliases: alloy, grafana-alloy
tags: blumeops

Grafana Alloy Management Log

Grafana Alloy is a unified observability collector with two deployments:

  1. Indri (host) - System metrics and service logs from macOS host
  2. Kubernetes (DaemonSet) - Automatic pod log collection and service health probes

Service Details

  • Binary: ~/.local/bin/alloy (built from source with CGO_ENABLED=1)
  • Config: ~/.config/grafana-alloy/config.alloy
  • Data: ~/.local/share/grafana-alloy/
  • Logs: ~/Library/Logs/mcquack.alloy.{out,err}.log
  • Managed via: mcquack LaunchAgent (mcquack.eblume.alloy)

Why built from source? The Homebrew bottle is built with CGO_ENABLED=0, so it uses Go's pure-Go DNS resolver. That resolver reads /etc/resolv.conf directly and ignores the macOS /etc/resolver/* files, which breaks Tailscale MagicDNS hostname resolution. Building with CGO_ENABLED=1 makes Alloy use the macOS native resolver instead.

What Alloy Collects

Metrics

  • System metrics via prometheus.exporter.unix (same metrics as node_exporter)
  • Textfile collector reads from /opt/homebrew/var/node_exporter/textfile/
    • minikube.prom - Minikube cluster status
    • borgmatic.prom - Backup status metrics
    • zot.prom - Container registry metrics
    • jellyfin.prom - Jellyfin media server metrics
  • Zot registry metrics scraped from http://localhost:5050/metrics
  • Metrics pushed to Prometheus (k8s) via remote_write at https://prometheus.tail8d86e.ts.net/api/v1/write
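
The textfile collector reads `*.prom` files from that directory, so a job can publish a metric by writing one. The following is a minimal sketch of such a writer, not the code the existing roles use: the helper name `write_textfile_metric`, the metric name in the example, and the gauge type are all assumptions made for illustration. It writes to a temporary file and renames it, so the collector should never see a half-written file.

```shell
# Write a single gauge to <dir>/<name>.prom in Prometheus text format.
# Hypothetical helper: function and metric names are illustrative only.
write_textfile_metric() {
  dir=$1; name=$2; metric=$3; value=$4
  # Keep the temp file in the same directory so the final mv is a rename.
  # Its name does not end in .prom, so the collector ignores it.
  tmp=$(mktemp "$dir/.$name.XXXXXX") || return 1
  {
    printf '# TYPE %s gauge\n' "$metric"
    printf '%s %s\n' "$metric" "$value"
  } > "$tmp" || { rm -f "$tmp"; return 1; }
  # mktemp creates the file as 0600; widen it so the collector can read it.
  chmod 644 "$tmp"
  mv "$tmp" "$dir/$name.prom"
}

# Example (assumed metric name):
# write_textfile_metric /opt/homebrew/var/node_exporter/textfile borgmatic \
#   borgmatic_last_success_timestamp_seconds "$(date +%s)"
```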

Logs

Alloy collects logs from all services on Indri:

Brew services:

  • forgejo
  • tailscale

mcquack LaunchAgents:

  • alloy (stdout/stderr)
  • borgmatic (stdout/stderr)
  • zot (stdout/stderr)
  • jellyfin (stdout/stderr)

Logs pushed to Loki (k8s) at https://loki.tail8d86e.ts.net/loki/api/v1/push.
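
To check the push path by hand, a test line can be sent to the same endpoint. This is a sketch under assumptions: the helper name `loki_payload` and the `job` label value `smoke-test` are made up, and the helper does no JSON escaping, so it only suits simple strings without quotes or backslashes.

```shell
# Build a Loki push payload containing one stream with one log line.
# Hypothetical helper; assumes job and line need no JSON escaping.
loki_payload() {
  job=$1; line=$2; ts_ns=$3   # ts_ns: Unix timestamp in nanoseconds, as a string
  printf '{"streams":[{"stream":{"job":"%s"},"values":[["%s","%s"]]}]}\n' \
    "$job" "$ts_ns" "$line"
}

# Example: send one test line (the label value "smoke-test" is invented).
# loki_payload smoke-test "hello from indri" "$(date +%s)000000000" |
#   curl -sS -H 'Content-Type: application/json' --data-binary @- \
#     https://loki.tail8d86e.ts.net/loki/api/v1/push
```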

Useful Commands

# Check service status
ssh indri 'launchctl list | grep alloy'

# View alloy logs
ssh indri 'tail -f ~/Library/Logs/mcquack.alloy.err.log'

# Restart service
ssh indri 'launchctl unload ~/Library/LaunchAgents/mcquack.eblume.alloy.plist && launchctl load ~/Library/LaunchAgents/mcquack.eblume.alloy.plist'
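
The `launchctl list` output has three columns (PID, last exit status, label), and the PID column is `-` when the job is loaded but not running. A small filter can turn that into a readable status. This is a sketch; the helper name `agent_state` is an assumption, not an existing tool.

```shell
# Read `launchctl list` output on stdin and summarize the job named in $1.
# Hypothetical helper for illustration.
agent_state() {
  awk -v label="$1" '
    $3 == label {
      found = 1
      if ($1 == "-") print "not running (last exit status " $2 ")"
      else           print "running (pid " $1 ")"
    }
    END { if (!found) { print "not loaded"; exit 1 } }
  '
}

# Usage:
# ssh indri 'launchctl list' | agent_state mcquack.eblume.alloy
```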

Building from Source

Alloy must be built with CGO so that it uses the macOS native DNS resolver (required for Tailscale MagicDNS):

# On gilbert (dev workstation):
git clone ssh://forgejo@forge.tail8d86e.ts.net/eblume/alloy.git ~/code/3rd/alloy
cd ~/code/3rd/alloy && mise use go@1.25 node yarn
mise x -- make alloy
scp ~/code/3rd/alloy/build/alloy indri:~/.local/bin/alloy

Then run Ansible to deploy the config and LaunchAgent.
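
Because the whole point of the source build is the CGO setting, it may be worth confirming it on the built binary before copying it to indri. Go binaries embed their build settings, which `go version -m` prints. The filter below is a sketch; the helper name `cgo_setting` is an assumption.

```shell
# Read `go version -m <binary>` output on stdin and print the CGO_ENABLED value.
# Prints nothing if the setting is not recorded in the binary.
cgo_setting() {
  awk '$1 == "build" && $2 ~ /^CGO_ENABLED=/ {
    sub(/^CGO_ENABLED=/, "", $2)
    print $2
  }'
}

# Usage on gilbert, before the scp step (a CGO build should report 1):
# go version -m ~/code/3rd/alloy/build/alloy | cgo_setting
```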

Ansible Management (Indri)

Alloy on Indri is managed via Ansible in 1767747119-YCPO.

mise run provision-indri -- --tags alloy

Kubernetes Alloy (alloy-k8s)

A separate Alloy DaemonSet runs in k8s for:

  • Automatic pod log collection - discovers and collects logs from all pods
  • Service health probes - HTTP blackbox probes for k8s services

Service Details (k8s)

  • Namespace: alloy
  • Image: grafana/alloy:v1.8.2
  • ArgoCD app: alloy-k8s
  • Manifests: argocd/manifests/alloy-k8s/

What k8s Alloy Collects

Pod logs (automatic discovery):

  • All pods in all namespaces via loki.source.kubernetes
  • Labels: namespace, pod, container, node

Service health probes:

  • miniflux, kiwix, transmission, devpi, argocd
  • Metrics: probe_success, probe_duration_seconds
  • Labels: job="integrations/blackbox/<service>"
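
The job label format above is enough to query a single probe from the command line. The sketch below builds the PromQL selector; the helper name `probe_selector` is an assumption, and the example assumes the Prometheus query API is reachable on the same host that accepts remote_write.

```shell
# Print a PromQL selector for one service's blackbox probe result.
# Hypothetical helper; the job label format comes from the list above.
probe_selector() {
  printf 'probe_success{job="integrations/blackbox/%s"}\n' "$1"
}

# Example: ask Prometheus for the current miniflux probe result.
# Assumes /api/v1/query is served from the remote_write host.
# curl -sG https://prometheus.tail8d86e.ts.net/api/v1/query \
#   --data-urlencode "query=$(probe_selector miniflux)"
```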

Useful Commands (k8s Alloy)

# View alloy-k8s logs
kubectl --context=minikube-indri -n alloy logs -f daemonset/alloy

# Check running config
kubectl --context=minikube-indri -n alloy get configmap alloy-config -o yaml

# Sync from ArgoCD
argocd app sync alloy-k8s

Log

Thu Jan 22 2026 (later)

  • Added Alloy k8s DaemonSet for automatic pod log collection
  • Logs from all k8s pods now forwarded to Loki with automatic discovery
  • Added service health probes for miniflux, kiwix, transmission, devpi, argocd
  • New "Services Health" Grafana dashboard shows probe metrics
  • Deleted stale textfile metrics (devpi.prom, transmission.prom) from indri
  • Deleted stale data directories (/opt/homebrew/var/loki, /opt/homebrew/var/prometheus)

Thu Jan 22 2026

  • Rebuilt from source with CGO_ENABLED=1 - required for Tailscale MagicDNS resolution
  • Migrated from Homebrew to mcquack LaunchAgent management
  • Updated remote_write to push to k8s Prometheus at prometheus.tail8d86e.ts.net
  • Updated log push to k8s Loki at loki.tail8d86e.ts.net
  • Removed prometheus/loki log collection (now running in k8s)
  • Binary now at ~/.local/bin/alloy, config at ~/.config/grafana-alloy/
  • Added build instructions to ansible role defaults

Tue Jan 20 2026

  • Removed devpi log collection (devpi migrated to k8s)
  • Removed devpi.prom textfile collection (metrics role retired)
  • Removed grafana log collection (grafana migrated to k8s in P2)

Thu Jan 15 2026

  • Initial setup replacing node_exporter
  • Configured metrics push via remote_write to Prometheus
  • Configured log collection for all services, forwarding to Loki

Fri Jan 30 2026

  • Removed Plex log and metrics collection (replaced by Jellyfin)
  • Added Jellyfin log collection via mcquack LaunchAgent logs
  • Added jellyfin.prom textfile metrics

Thu Jan 15 2026 (later)

  • Added Plex Media Server log collection (removed 2026-01-30)
  • Added plex.prom metrics from plex_metrics role (removed 2026-01-30)