| id | tags |
|---|---|
| 1768506761-GHUW | |
Grafana Alloy Management Log
Grafana Alloy is a unified observability collector with two deployments:
- Indri (host) - System metrics and service logs from macOS host
- Kubernetes (DaemonSet) - Automatic pod log collection and service health probes
Service Details
- Binary: ~/.local/bin/alloy (built from source with CGO_ENABLED=1)
- Config: ~/.config/grafana-alloy/config.alloy
- Data: ~/.local/share/grafana-alloy/
- Logs: ~/Library/Logs/mcquack.alloy.{out,err}.log
- Managed via: mcquack LaunchAgent (mcquack.eblume.alloy)
Why built from source? The Homebrew bottle is built with CGO_ENABLED=0, which uses Go's pure DNS resolver. This resolver reads /etc/resolv.conf directly and ignores macOS /etc/resolver/* files, breaking Tailscale MagicDNS hostname resolution. Building with CGO_ENABLED=1 uses the macOS native resolver.
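One way to confirm which mode a given binary was built in is to read the build settings that the Go toolchain embeds in it. The sketch below assumes go is on the PATH and that the binary records its build settings (the default for recent Go releases). The helper names cgo_setting and built_with_cgo are hypothetical, not part of Alloy or mcquack.

```shell
# cgo_setting: read `go version -m` output on stdin and print the recorded
# CGO_ENABLED value (prints nothing if the setting is absent).
cgo_setting() {
  awk '$1 == "build" && $2 ~ /^CGO_ENABLED=/ { sub(/^CGO_ENABLED=/, "", $2); print $2 }'
}

# built_with_cgo BINARY: hypothetical helper that succeeds only when the
# binary's embedded build info reports CGO_ENABLED=1.
built_with_cgo() {
  [ "$(go version -m "$1" | cgo_setting)" = "1" ]
}
```

For example, running built_with_cgo ~/code/3rd/alloy/build/alloy on the build machine before copying the binary to Indri would catch an accidental CGO_ENABLED=0 build early.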
What Alloy Collects
Metrics
- System metrics via prometheus.exporter.unix (same metrics as node_exporter)
- Textfile collector reads from /opt/homebrew/var/node_exporter/textfile/
  - minikube.prom - Minikube cluster status
  - borgmatic.prom - Backup status metrics
  - zot.prom - Container registry metrics
  - jellyfin.prom - Jellyfin media server metrics
- Zot registry metrics scraped from http://localhost:5050/metrics
- Metrics pushed to Prometheus (k8s) via remote_write at https://prometheus.tail8d86e.ts.net/api/v1/write
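For the textfile collector, a producer script usually writes its .prom file through a temporary file and a rename, so that a scrape never sees a half-written file. The following is a minimal sketch of that pattern, not the script the existing roles use. The function name write_prom_metric and the metric name in the usage note are hypothetical.

```shell
# write_prom_metric DIR FILE METRIC VALUE
# Writes a single gauge sample to DIR/FILE. The temp file lives in the same
# directory (so the rename is atomic) and does not end in .prom, so the
# collector should ignore it until it is renamed into place.
write_prom_metric() {
  dir=$1 file=$2 metric=$3 value=$4
  tmp=$(mktemp "$dir/.$file.XXXXXX") || return 1
  {
    printf '# TYPE %s gauge\n' "$metric"
    printf '%s %s\n' "$metric" "$value"
  } > "$tmp" && chmod 644 "$tmp" && mv "$tmp" "$dir/$file"
}
```

A call such as write_prom_metric /opt/homebrew/var/node_exporter/textfile example.prom example_last_run_success 1 would publish one sample on the next scrape. The real metric names in the files listed above are not shown here and may differ.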
Logs
Collects logs from all services on Indri:
Brew services:
- forgejo
- tailscale
mcquack LaunchAgents:
- alloy (stdout/stderr)
- borgmatic (stdout/stderr)
- zot (stdout/stderr)
- jellyfin (stdout/stderr)
Logs pushed to Loki (k8s) at https://loki.tail8d86e.ts.net/loki/api/v1/push.
Useful Commands
# Check service status
ssh indri 'launchctl list | grep alloy'
# View alloy logs
ssh indri 'tail -f ~/Library/Logs/mcquack.alloy.err.log'
# Restart service
ssh indri 'launchctl unload ~/Library/LaunchAgents/mcquack.eblume.alloy.plist && launchctl load ~/Library/LaunchAgents/mcquack.eblume.alloy.plist'
Building from Source
Alloy must be built with CGO enabled to use the macOS native DNS resolver (required for Tailscale MagicDNS):
# On gilbert (dev workstation):
git clone ssh://forgejo@forge.tail8d86e.ts.net/eblume/alloy.git ~/code/3rd/alloy
cd ~/code/3rd/alloy && mise use go@1.25 node yarn
mise x -- make alloy
scp ~/code/3rd/alloy/build/alloy indri:~/.local/bin/alloy
Then run ansible to deploy the config and LaunchAgent.
Ansible Management (Indri)
Alloy on Indri is managed via ansible in 1767747119-YCPO.
mise run provision-indri -- --tags alloy
Kubernetes Alloy (alloy-k8s)
A separate Alloy DaemonSet runs in k8s for:
- Automatic pod log collection - discovers and collects logs from all pods
- Service health probes - HTTP blackbox probes for k8s services
Service Details (k8s)
- Namespace: alloy
- Image: grafana/alloy:v1.8.2
- ArgoCD app: alloy-k8s
- Manifests: argocd/manifests/alloy-k8s/
What k8s Alloy Collects
Pod logs (automatic discovery):
- All pods in all namespaces via loki.source.kubernetes
- Labels: namespace, pod, container, node
Service health probes:
- miniflux, kiwix, transmission, devpi, argocd
- Metrics: probe_success, probe_duration_seconds
- Labels: job="integrations/blackbox/<service>"
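The probe results can be spot-checked from a shell with the standard Prometheus HTTP query API. This sketch assumes the query endpoint is reachable at the same hostname used for remote_write, which the notes above do not confirm. The function names probe_query and probe_status are hypothetical.

```shell
# probe_query SERVICE: print the PromQL selector for one service's probe
# result, using the job label convention described above.
probe_query() {
  printf 'probe_success{job="integrations/blackbox/%s"}\n' "$1"
}

# probe_status SERVICE: run that query as an instant query and print the raw
# JSON response. Assumes /api/v1/query is exposed on this host.
probe_status() {
  curl -sG 'https://prometheus.tail8d86e.ts.net/api/v1/query' \
    --data-urlencode "query=$(probe_query "$1")"
}
```

For example, probe_status miniflux should return a JSON result whose sample value is 1 when the most recent probe succeeded and 0 when it failed.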
Useful Commands (k8s Alloy)
# View alloy-k8s logs
kubectl --context=minikube-indri -n alloy logs -f daemonset/alloy
# Check running config
kubectl --context=minikube-indri -n alloy get configmap alloy-config -o yaml
# Sync from ArgoCD
argocd app sync alloy-k8s
Log
Thu Jan 22 2026 (later)
- Added Alloy k8s DaemonSet for automatic pod log collection
- Logs from all k8s pods now forwarded to Loki with automatic discovery
- Added service health probes for miniflux, kiwix, transmission, devpi, argocd
- New "Services Health" Grafana dashboard shows probe metrics
- Deleted stale textfile metrics (devpi.prom, transmission.prom) from indri
- Deleted stale data directories (/opt/homebrew/var/loki, /opt/homebrew/var/prometheus)
Thu Jan 22 2026
- Rebuilt from source with CGO_ENABLED=1 - required for Tailscale MagicDNS resolution
- Migrated from Homebrew to mcquack LaunchAgent management
- Updated remote_write to push to k8s Prometheus at prometheus.tail8d86e.ts.net
- Updated log push to k8s Loki at loki.tail8d86e.ts.net
- Removed prometheus/loki log collection (now running in k8s)
- Binary now at ~/.local/bin/alloy, config at ~/.config/grafana-alloy/
- Added build instructions to ansible role defaults
Tue Jan 20 2026
- Removed devpi log collection (devpi migrated to k8s)
- Removed devpi.prom textfile collection (metrics role retired)
- Removed grafana log collection (grafana migrated to k8s in P2)
Thu Jan 15 2026
- Initial setup replacing node_exporter
- Configured metrics push via remote_write to Prometheus
- Configured log collection for all services, forwarding to Loki
Fri Jan 30 2026
- Removed Plex log and metrics collection (replaced by Jellyfin)
- Added Jellyfin log collection via mcquack LaunchAgent logs
- Added jellyfin.prom textfile metrics
Thu Jan 15 2026 (later)
- Added Plex Media Server log collection (removed 2026-01-30)
- Added plex.prom metrics from plex_metrics role (removed 2026-01-30)