K8s Migration Phase 0: Foundation Infrastructure (#26)

## Summary
- Step 0.1: Update Pulumi ACLs with tag:registry
- Step 0.3: Create Zot registry ansible role with mcquack LaunchAgent
- Step 0.4: Add Zot to Tailscale Serve configuration
- Step 0.5: Create Zot metrics role for Prometheus scraping
- Step 0.6: Add Zot log collection to Alloy
- Step 0.7: Update indri-services-check with zot checks
- Step 0.8: Add podman role for container runtime
- Step 0.9: Add minikube role for Kubernetes cluster
- Step 0.10: Configure remote kubectl access with 1Password credentials

## Remaining Steps
- [ ] Step 0.11: Add minikube to indri-services-check
- [ ] Step 0.12: Create zettelkasten documentation
- [ ] Step 0.13: Verify main playbook (already done - roles added)

## Deployment and Testing
- [x] Zot registry deployed and accessible at https://registry.tail8d86e.ts.net
- [x] Podman machine running on indri
- [x] Minikube cluster running on indri
- [x] kubectl access from gilbert working with 1Password credentials
- [ ] indri-services-check passes all checks
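
The checklist above could be spot-checked from a shell along these lines. This is a minimal sketch and not part of the commit: `check` and `run_phase0_checks` are hypothetical helper names, and it assumes `curl`, `podman`, `minikube`, and `kubectl` are on `PATH` and that it runs on indri. It is not a substitute for `indri-services-check`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command quietly and report PASS/FAIL by name.
check() {
  local name="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS ${name}"
  else
    echo "FAIL ${name}"
    return 1
  fi
}

# Hypothetical wrapper: one check per component deployed in this phase.
run_phase0_checks() {
  local failed=0
  # /v2/ is the base endpoint of the OCI distribution API.
  check "zot registry" curl -fsS https://registry.tail8d86e.ts.net/v2/ || failed=1
  # "podman info" needs a reachable podman machine on macOS.
  check "podman machine" podman info || failed=1
  check "minikube" minikube status || failed=1
  check "kubectl" kubectl get nodes || failed=1
  return "$failed"
}
```

Calling `run_phase0_checks` prints one PASS/FAIL line per component and returns non-zero if any check failed.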

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/26
Erich Blume 2026-01-18 12:06:28 -08:00
commit 19a82373d5
32 changed files with 1811 additions and 10 deletions

View file

@ -1,3 +1,4 @@
# CLI tools for blumeops management
brew "bat" # Syntax-highlighted file concatenation
brew "tea" # Gitea/Forgejo CLI for forge.tail8d86e.ts.net
brew "podman" # Container CLI (uses VM on macOS, for building/pushing images)

View file

@ -99,6 +99,16 @@
tags: devpi
- role: devpi_metrics
tags: devpi_metrics
- role: zot
tags: zot
- role: zot_metrics
tags: zot_metrics
- role: podman
tags: podman
- role: minikube
tags: minikube
- role: minikube_metrics
tags: minikube_metrics
- role: plex_metrics
tags: plex_metrics
- role: postgresql

View file

@ -66,6 +66,12 @@ alloy_mcquack_logs:
- path: /Users/erichblume/Library/Logs/mcquack.borgmatic.err.log
service: borgmatic
stream: stderr
- path: /Users/erichblume/Library/Logs/mcquack.zot.out.log
service: zot
stream: stdout
- path: /Users/erichblume/Library/Logs/mcquack.zot.err.log
service: zot
stream: stderr
alloy_plex_logs:
- path: /Users/erichblume/Library/Logs/Plex Media Server/Plex Media Server.log
@ -75,6 +81,10 @@ alloy_plex_logs:
# Enable log collection (requires Loki to be running)
alloy_collect_logs: true
# Zot registry metrics collection
alloy_collect_zot: true
alloy_zot_metrics_url: "http://localhost:5050/metrics"
# PostgreSQL metrics collection
alloy_collect_postgres: true
alloy_postgres_host: localhost

View file

@ -54,6 +54,18 @@ prometheus.scrape "postgresql" {
}
{% endif %}
{% if alloy_collect_zot | default(false) %}
// ============== ZOT REGISTRY METRICS ==============
// Scrape Zot's native metrics endpoint
prometheus.scrape "zot" {
targets = [{"__address__" = "localhost:5050"}]
metrics_path = "/metrics"
forward_to = [prometheus.relabel.instance.receiver]
scrape_interval = "{{ alloy_scrape_interval }}"
}
{% endif %}
{% if alloy_collect_logs %}
// ============== LOG COLLECTION ==============

View file

@ -0,0 +1,449 @@
{
"annotations": {
"list": []
},
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": null,
"links": [],
"panels": [
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"mappings": [
{
"options": {
"0": { "color": "red", "index": 0, "text": "DOWN" }
},
"type": "value"
},
{
"options": {
"1": { "color": "green", "index": 1, "text": "UP" }
},
"type": "value"
}
],
"thresholds": {
"mode": "absolute",
"steps": [
{ "color": "red", "value": null },
{ "color": "green", "value": 1 }
]
},
"unit": "short"
},
"overrides": []
},
"gridPos": { "h": 4, "w": 4, "x": 0, "y": 0 },
"id": 1,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": ["lastNotNull"],
"fields": "",
"values": false
},
"textMode": "auto"
},
"pluginVersion": "10.0.0",
"targets": [
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"expr": "minikube_up",
"refId": "A"
}
],
"title": "Minikube Status",
"type": "stat"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"mappings": [
{
"options": {
"0": { "color": "red", "index": 0, "text": "DOWN" }
},
"type": "value"
},
{
"options": {
"1": { "color": "green", "index": 1, "text": "UP" }
},
"type": "value"
}
],
"thresholds": {
"mode": "absolute",
"steps": [
{ "color": "red", "value": null },
{ "color": "green", "value": 1 }
]
},
"unit": "short"
},
"overrides": []
},
"gridPos": { "h": 4, "w": 4, "x": 4, "y": 0 },
"id": 2,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": ["lastNotNull"],
"fields": "",
"values": false
},
"textMode": "auto"
},
"pluginVersion": "10.0.0",
"targets": [
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"expr": "minikube_apiserver_up",
"refId": "A"
}
],
"title": "API Server",
"type": "stat"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{ "color": "red", "value": null },
{ "color": "green", "value": 1 }
]
},
"unit": "short"
},
"overrides": []
},
"gridPos": { "h": 4, "w": 4, "x": 8, "y": 0 },
"id": 3,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": ["lastNotNull"],
"fields": "",
"values": false
},
"textMode": "auto"
},
"pluginVersion": "10.0.0",
"targets": [
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"expr": "minikube_node_count",
"refId": "A"
}
],
"title": "Node Count",
"type": "stat"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [{ "color": "green", "value": null }]
},
"unit": "short"
},
"overrides": []
},
"gridPos": { "h": 4, "w": 4, "x": 12, "y": 0 },
"id": 4,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": ["lastNotNull"],
"fields": "",
"values": false
},
"textMode": "auto"
},
"pluginVersion": "10.0.0",
"targets": [
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"expr": "minikube_pod_count",
"refId": "A"
}
],
"title": "Pod Count",
"type": "stat"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [{ "color": "green", "value": null }]
},
"unit": "short"
},
"overrides": []
},
"gridPos": { "h": 4, "w": 4, "x": 16, "y": 0 },
"id": 5,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": ["lastNotNull"],
"fields": "",
"values": false
},
"textMode": "auto"
},
"pluginVersion": "10.0.0",
"targets": [
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"expr": "minikube_namespace_count",
"refId": "A"
}
],
"title": "Namespaces",
"type": "stat"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": { "legend": false, "tooltip": false, "viz": false },
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": { "type": "linear" },
"showPoints": "never",
"spanNulls": false,
"stacking": { "group": "A", "mode": "none" },
"thresholdsStyle": { "mode": "off" }
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [{ "color": "green", "value": null }]
},
"unit": "short"
},
"overrides": []
},
"gridPos": { "h": 8, "w": 12, "x": 0, "y": 4 },
"id": 6,
"options": {
"legend": {
"calcs": ["lastNotNull"],
"displayMode": "table",
"placement": "bottom",
"showLegend": true
},
"tooltip": { "mode": "multi", "sort": "desc" }
},
"pluginVersion": "10.0.0",
"targets": [
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"expr": "minikube_up",
"legendFormat": "Minikube",
"refId": "A"
},
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"expr": "minikube_apiserver_up",
"legendFormat": "API Server",
"refId": "B"
}
],
"title": "Cluster Health Over Time",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": { "legend": false, "tooltip": false, "viz": false },
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": { "type": "linear" },
"showPoints": "never",
"spanNulls": false,
"stacking": { "group": "A", "mode": "none" },
"thresholdsStyle": { "mode": "off" }
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [{ "color": "green", "value": null }]
},
"unit": "short"
},
"overrides": []
},
"gridPos": { "h": 8, "w": 12, "x": 12, "y": 4 },
"id": 7,
"options": {
"legend": {
"calcs": ["lastNotNull", "max"],
"displayMode": "table",
"placement": "bottom",
"showLegend": true
},
"tooltip": { "mode": "multi", "sort": "desc" }
},
"pluginVersion": "10.0.0",
"targets": [
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"expr": "minikube_pod_count",
"legendFormat": "Pods",
"refId": "A"
},
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"expr": "minikube_namespace_count",
"legendFormat": "Namespaces",
"refId": "B"
}
],
"title": "Resource Counts Over Time",
"type": "timeseries"
},
{
"datasource": {
"type": "loki",
"uid": "loki"
},
"gridPos": { "h": 10, "w": 24, "x": 0, "y": 12 },
"id": 8,
"options": {
"dedupStrategy": "none",
"enableLogDetails": true,
"prettifyLogMessage": false,
"showCommonLabels": false,
"showLabels": false,
"showTime": true,
"sortOrder": "Descending",
"wrapLogMessage": false
},
"pluginVersion": "10.0.0",
"targets": [
{
"datasource": { "type": "loki", "uid": "loki" },
"expr": "{host=\"indri\"} |~ \"minikube|kube\"",
"refId": "A"
}
],
"title": "Kubernetes Related Logs",
"type": "logs"
}
],
"refresh": "30s",
"schemaVersion": 38,
"tags": ["minikube", "kubernetes", "k8s"],
"templating": {
"list": []
},
"time": {
"from": "now-6h",
"to": "now"
},
"timepicker": {},
"timezone": "",
"title": "Minikube Kubernetes",
"uid": "minikube",
"version": 1,
"weekStart": ""
}

View file

@ -0,0 +1,488 @@
{
"annotations": {
"list": []
},
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": null,
"links": [],
"panels": [
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"mappings": [
{
"options": {
"0": { "color": "red", "index": 0, "text": "DOWN" }
},
"type": "value"
},
{
"options": {
"1": { "color": "green", "index": 1, "text": "UP" }
},
"type": "value"
}
],
"thresholds": {
"mode": "absolute",
"steps": [
{ "color": "red", "value": null },
{ "color": "green", "value": 1 }
]
},
"unit": "short"
},
"overrides": []
},
"gridPos": { "h": 4, "w": 4, "x": 0, "y": 0 },
"id": 1,
"options": {
"colorMode": "value",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": ["lastNotNull"],
"fields": "",
"values": false
},
"textMode": "auto"
},
"pluginVersion": "10.0.0",
"targets": [
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"expr": "zot_up",
"refId": "A"
}
],
"title": "Zot Status",
"type": "stat"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [{ "color": "green", "value": null }]
},
"unit": "short"
},
"overrides": []
},
"gridPos": { "h": 4, "w": 4, "x": 4, "y": 0 },
"id": 2,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": ["lastNotNull"],
"fields": "",
"values": false
},
"textMode": "auto"
},
"pluginVersion": "10.0.0",
"targets": [
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"expr": "go_goroutines{job=\"prometheus.scrape.zot\"}",
"refId": "A"
}
],
"title": "Goroutines",
"type": "stat"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{ "color": "green", "value": null },
{ "color": "yellow", "value": 536870912 },
{ "color": "red", "value": 1073741824 }
]
},
"unit": "bytes"
},
"overrides": []
},
"gridPos": { "h": 4, "w": 4, "x": 8, "y": 0 },
"id": 3,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": ["lastNotNull"],
"fields": "",
"values": false
},
"textMode": "auto"
},
"pluginVersion": "10.0.0",
"targets": [
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"expr": "sum(zot_repo_storage_bytes)",
"refId": "A"
}
],
"title": "Total Storage",
"type": "stat"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [{ "color": "green", "value": null }]
},
"unit": "reqps"
},
"overrides": []
},
"gridPos": { "h": 4, "w": 4, "x": 12, "y": 0 },
"id": 4,
"options": {
"colorMode": "value",
"graphMode": "area",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": {
"calcs": ["lastNotNull"],
"fields": "",
"values": false
},
"textMode": "auto"
},
"pluginVersion": "10.0.0",
"targets": [
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"expr": "sum(rate(zot_http_requests_total{job=\"prometheus.scrape.zot\"}[5m]))",
"refId": "A"
}
],
"title": "Request Rate",
"type": "stat"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": { "legend": false, "tooltip": false, "viz": false },
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": { "type": "linear" },
"showPoints": "never",
"spanNulls": false,
"stacking": { "group": "A", "mode": "none" },
"thresholdsStyle": { "mode": "off" }
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [{ "color": "green", "value": null }]
},
"unit": "reqps"
},
"overrides": []
},
"gridPos": { "h": 8, "w": 12, "x": 0, "y": 4 },
"id": 5,
"options": {
"legend": {
"calcs": ["mean", "max"],
"displayMode": "table",
"placement": "bottom",
"showLegend": true
},
"tooltip": { "mode": "multi", "sort": "desc" }
},
"pluginVersion": "10.0.0",
"targets": [
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"expr": "sum by (method) (rate(zot_http_requests_total{job=\"prometheus.scrape.zot\"}[5m]))",
"legendFormat": "{{method}}",
"refId": "A"
}
],
"title": "HTTP Requests by Method",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": { "legend": false, "tooltip": false, "viz": false },
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": { "type": "linear" },
"showPoints": "never",
"spanNulls": false,
"stacking": { "group": "A", "mode": "none" },
"thresholdsStyle": { "mode": "off" }
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [{ "color": "green", "value": null }]
},
"unit": "reqps"
},
"overrides": []
},
"gridPos": { "h": 8, "w": 12, "x": 12, "y": 4 },
"id": 6,
"options": {
"legend": {
"calcs": ["mean", "max"],
"displayMode": "table",
"placement": "bottom",
"showLegend": true
},
"tooltip": { "mode": "multi", "sort": "desc" }
},
"pluginVersion": "10.0.0",
"targets": [
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"expr": "sum by (code) (rate(zot_http_requests_total{job=\"prometheus.scrape.zot\"}[5m]))",
"legendFormat": "{{code}}",
"refId": "A"
}
],
"title": "HTTP Requests by Status Code",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": { "legend": false, "tooltip": false, "viz": false },
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": { "type": "linear" },
"showPoints": "never",
"spanNulls": false,
"stacking": { "group": "A", "mode": "none" },
"thresholdsStyle": { "mode": "off" }
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [{ "color": "green", "value": null }]
},
"unit": "s"
},
"overrides": []
},
"gridPos": { "h": 8, "w": 12, "x": 0, "y": 12 },
"id": 7,
"options": {
"legend": {
"calcs": ["mean", "p95"],
"displayMode": "table",
"placement": "bottom",
"showLegend": true
},
"tooltip": { "mode": "multi", "sort": "desc" }
},
"pluginVersion": "10.0.0",
"targets": [
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"expr": "histogram_quantile(0.50, sum(rate(zot_http_method_latency_seconds_bucket{job=\"prometheus.scrape.zot\"}[5m])) by (le))",
"legendFormat": "p50",
"refId": "A"
},
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"expr": "histogram_quantile(0.95, sum(rate(zot_http_method_latency_seconds_bucket{job=\"prometheus.scrape.zot\"}[5m])) by (le))",
"legendFormat": "p95",
"refId": "B"
},
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"expr": "histogram_quantile(0.99, sum(rate(zot_http_method_latency_seconds_bucket{job=\"prometheus.scrape.zot\"}[5m])) by (le))",
"legendFormat": "p99",
"refId": "C"
}
],
"title": "HTTP Request Latency",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "prometheus"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"drawStyle": "line",
"fillOpacity": 10,
"gradientMode": "none",
"hideFrom": { "legend": false, "tooltip": false, "viz": false },
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": { "type": "linear" },
"showPoints": "never",
"spanNulls": false,
"stacking": { "group": "A", "mode": "none" },
"thresholdsStyle": { "mode": "off" }
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [{ "color": "green", "value": null }]
},
"unit": "bytes"
},
"overrides": []
},
"gridPos": { "h": 8, "w": 12, "x": 12, "y": 12 },
"id": 8,
"options": {
"legend": {
"calcs": ["lastNotNull"],
"displayMode": "table",
"placement": "bottom",
"showLegend": true
},
"tooltip": { "mode": "multi", "sort": "desc" }
},
"pluginVersion": "10.0.0",
"targets": [
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"expr": "zot_repo_storage_bytes",
"legendFormat": "{{repo}}",
"refId": "A"
}
],
"title": "Storage by Repository",
"type": "timeseries"
}
],
"refresh": "30s",
"schemaVersion": 38,
"tags": ["zot", "registry", "oci"],
"templating": {
"list": []
},
"time": {
"from": "now-6h",
"to": "now"
},
"timepicker": {},
"timezone": "",
"title": "Zot Container Registry",
"uid": "zot",
"version": 1,
"weekStart": ""
}

View file

@ -0,0 +1,14 @@
---
# Minikube cluster configuration
minikube_cpus: 4
# Note: Must be less than podman machine memory (8192MB) to account for overhead
minikube_memory: 7800
minikube_disk_size: "200g"
minikube_driver: podman
minikube_container_runtime: cri-o
# Remote access configuration
# These allow kubectl from other machines (e.g., gilbert) to connect
minikube_apiserver_names:
- indri
minikube_listen_address: "0.0.0.0"
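
The memory note above amounts to a simple inequality: minikube's allocation has to stay below the podman machine's memory. A minimal sketch of that check follows, assuming a hypothetical helper named `minikube_memory_fits` and an illustrative reserve of 256 MB; the commit does not state how much overhead is actually needed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if minikube memory plus a reserve fits in the VM.
# Arguments: minikube MB, podman machine MB, optional reserve MB (assumed default 256).
minikube_memory_fits() {
  local minikube_mb="$1" machine_mb="$2" reserve_mb="${3:-256}"
  (( minikube_mb + reserve_mb <= machine_mb ))
}

# Values from this commit: minikube_memory=7800, podman machine --memory 8192.
if minikube_memory_fits 7800 8192; then
  echo "ok: 7800 MB fits inside an 8192 MB podman machine"
else
  echo "too large: lower minikube_memory or grow the podman machine"
fi
```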

View file

@ -0,0 +1,9 @@
---
# Minikube handlers
# Note: Restarting minikube is a heavy operation and may require manual intervention
- name: Restart minikube
ansible.builtin.shell: |
minikube stop 2>/dev/null || true
minikube start
changed_when: true

View file

@ -0,0 +1,56 @@
---
# Minikube installation and cluster setup for indri
# Requires podman machine to be running (see podman role)
#
# NOTE: Similar to podman, minikube start may have issues when run via SSH.
# If cluster fails to start, manually run on indri:
# minikube start --driver=podman --container-runtime=cri-o \
# --cpus=4 --memory=7800 --disk-size=200g \
# --apiserver-names=indri --listen-address=0.0.0.0
- name: Install minikube via homebrew
community.general.homebrew:
name: minikube
state: present
- name: Install kubectl via homebrew
community.general.homebrew:
name: kubectl
state: present
- name: Check if minikube cluster exists
ansible.builtin.command:
cmd: minikube status --format={% raw %}'{{.Host}}'{% endraw %}
register: minikube_status
changed_when: false
failed_when: false
- name: Start minikube cluster
ansible.builtin.command:
cmd: >
minikube start
--driver={{ minikube_driver }}
--container-runtime={{ minikube_container_runtime }}
--cpus={{ minikube_cpus }}
--memory={{ minikube_memory }}
--disk-size={{ minikube_disk_size }}
{% for name in minikube_apiserver_names %}
--apiserver-names={{ name }}
{% endfor %}
--listen-address={{ minikube_listen_address }}
register: minikube_start
changed_when: minikube_start.rc == 0
failed_when: false # Don't fail - may need manual intervention like podman
when: minikube_status.rc != 0 or 'Running' not in minikube_status.stdout
- name: Check minikube status after start attempt
ansible.builtin.command:
cmd: minikube status --format={% raw %}'{{.Host}}'{% endraw %}
register: minikube_final_status
changed_when: false
failed_when: false
- name: Warn if minikube failed to start
ansible.builtin.debug:
msg: "WARNING: minikube may not have started properly. Run 'minikube start' manually on indri if needed. Status: {{ minikube_final_status.stdout | default('unknown') }}"
when: minikube_final_status.rc != 0 or 'Running' not in minikube_final_status.stdout

View file

@ -0,0 +1,5 @@
---
minikube_metrics_dir: /opt/homebrew/var/node_exporter/textfile
minikube_metrics_script: /Users/erichblume/bin/minikube-metrics
minikube_metrics_interval: 60 # seconds between metric collection
minikube_metrics_log_dir: /opt/homebrew/var/log

View file

@ -0,0 +1,6 @@
---
- name: Reload minikube-metrics
ansible.builtin.shell: |
launchctl unload ~/Library/LaunchAgents/mcquack.eblume.minikube-metrics.plist 2>/dev/null || true
launchctl load ~/Library/LaunchAgents/mcquack.eblume.minikube-metrics.plist
changed_when: true

View file

@ -0,0 +1,43 @@
---
- name: Ensure metrics directory exists
ansible.builtin.file:
path: "{{ minikube_metrics_dir }}"
state: directory
mode: '0755'
- name: Ensure log directory exists
ansible.builtin.file:
path: "{{ minikube_metrics_log_dir }}"
state: directory
mode: '0755'
- name: Ensure bin directory exists
ansible.builtin.file:
path: "{{ minikube_metrics_script | dirname }}"
state: directory
mode: '0755'
- name: Deploy minikube-metrics script
ansible.builtin.template:
src: minikube-metrics.sh.j2
dest: "{{ minikube_metrics_script }}"
mode: '0755'
- name: Deploy minikube-metrics LaunchAgent plist
ansible.builtin.template:
src: minikube-metrics.plist.j2
dest: ~/Library/LaunchAgents/mcquack.eblume.minikube-metrics.plist
mode: '0644'
notify: Reload minikube-metrics
- name: Check if minikube-metrics LaunchAgent is loaded
ansible.builtin.command: launchctl list mcquack.eblume.minikube-metrics
register: minikube_metrics_launchctl_check
changed_when: false
failed_when: false
- name: Load minikube-metrics LaunchAgent if not loaded
ansible.builtin.command: launchctl load ~/Library/LaunchAgents/mcquack.eblume.minikube-metrics.plist
when: minikube_metrics_launchctl_check.rc != 0
changed_when: true
failed_when: false

View file

@ -0,0 +1,21 @@
<?xml version="1.0" encoding="UTF-8"?>
<!-- {{ ansible_managed }} -->
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>mcquack.eblume.minikube-metrics</string>
<key>ProgramArguments</key>
<array>
<string>{{ minikube_metrics_script }}</string>
</array>
<key>StartInterval</key>
<integer>{{ minikube_metrics_interval }}</integer>
<key>RunAtLoad</key>
<true/>
<key>StandardErrorPath</key>
<string>{{ minikube_metrics_log_dir }}/mcquack.minikube-metrics.err.log</string>
<key>StandardOutPath</key>
<string>{{ minikube_metrics_log_dir }}/mcquack.minikube-metrics.out.log</string>
</dict>
</plist>

View file

@ -0,0 +1,57 @@
#!/bin/bash
# {{ ansible_managed }}
# Collects minikube/kubernetes metrics for node_exporter textfile collector
set -euo pipefail
OUTPUT_FILE="{{ minikube_metrics_dir }}/minikube.prom"
TEMP_FILE="${OUTPUT_FILE}.tmp"
# Start output file
cat > "$TEMP_FILE" << 'HEADER'
# HELP minikube_up Minikube cluster is running
# TYPE minikube_up gauge
# HELP minikube_apiserver_up Kubernetes API server is responding
# TYPE minikube_apiserver_up gauge
# HELP minikube_node_count Number of nodes in the cluster
# TYPE minikube_node_count gauge
# HELP minikube_pod_count Number of pods in the cluster
# TYPE minikube_pod_count gauge
# HELP minikube_namespace_count Number of namespaces in the cluster
# TYPE minikube_namespace_count gauge
HEADER
# Check if minikube is running
if minikube status --format='{% raw %}{{.Host}}{% endraw %}' 2>/dev/null | grep -q "Running"; then
echo "minikube_up 1" >> "$TEMP_FILE"
else
echo "minikube_up 0" >> "$TEMP_FILE"
echo "minikube_apiserver_up 0" >> "$TEMP_FILE"
echo "minikube_node_count 0" >> "$TEMP_FILE"
echo "minikube_pod_count 0" >> "$TEMP_FILE"
echo "minikube_namespace_count 0" >> "$TEMP_FILE"
mv "$TEMP_FILE" "$OUTPUT_FILE"
exit 0
fi
# Check API server health
if kubectl get --raw /healthz >/dev/null 2>&1; then
echo "minikube_apiserver_up 1" >> "$TEMP_FILE"
else
echo "minikube_apiserver_up 0" >> "$TEMP_FILE"
fi
# Get node count
# "|| true" keeps a kubectl failure from aborting the script under "set -euo pipefail"
NODE_COUNT=$( (kubectl get nodes --no-headers 2>/dev/null || true) | wc -l | tr -d ' ')
echo "minikube_node_count ${NODE_COUNT:-0}" >> "$TEMP_FILE"
# Get pod count (all namespaces)
POD_COUNT=$( (kubectl get pods -A --no-headers 2>/dev/null || true) | wc -l | tr -d ' ')
echo "minikube_pod_count ${POD_COUNT:-0}" >> "$TEMP_FILE"
# Get namespace count
NS_COUNT=$( (kubectl get namespaces --no-headers 2>/dev/null || true) | wc -l | tr -d ' ')
echo "minikube_namespace_count ${NS_COUNT:-0}" >> "$TEMP_FILE"
# Atomic move
mv "$TEMP_FILE" "$OUTPUT_FILE"
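
The script above writes to a temporary file and renames it at the end, so node_exporter's textfile collector should never read a half-written file. A minimal sketch of that pattern in isolation, assuming a hypothetical helper named `write_prom_atomically`; the path in the usage line is illustrative only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write stdin to "$1" via a temp file in the same directory,
# then rename it into place. A rename within one filesystem is atomic.
write_prom_atomically() {
  local target="$1"
  local tmp="${target}.tmp"
  cat > "$tmp"
  mv "$tmp" "$target"
}
```

Usage would look like `printf 'minikube_up 1\n' | write_prom_atomically /tmp/minikube.prom`.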

View file

@ -0,0 +1,3 @@
---
# No handlers currently - podman machine start is unreliable via Ansible
# See known issue in tasks/main.yml

View file

@ -0,0 +1,55 @@
---
# Podman installation and machine setup for indri
# Used as container runtime for minikube
#
# KNOWN ISSUE: podman machine init/start has reliability issues when run via
# Ansible/SSH. The machine sometimes gets stuck in "Starting" state due to a
# race condition (see https://github.com/containers/podman/issues/16945).
# Additionally, Apple Hypervisor may require GUI session context.
#
# WORKAROUND: If the machine fails to start via Ansible, manually run on indri:
# podman machine rm -f podman-machine-default
# podman machine init --cpus 4 --memory 8192 --disk-size 220
# podman machine start
#
# TODO: Investigate proper LaunchAgent or other solution for reliable automation.
- name: Install podman via homebrew
community.general.homebrew:
name: podman
state: present
- name: Check if podman machine exists
ansible.builtin.command:
cmd: podman machine list --format json
register: podman_machine_list
changed_when: false
- name: Initialize podman machine (if not exists)
ansible.builtin.command:
cmd: podman machine init --cpus 4 --memory 8192 --disk-size 220
register: podman_init
changed_when: podman_init.rc == 0
failed_when: podman_init.rc not in [0, 125] # 125 = already exists
when: podman_machine_list.stdout == '[]'
- name: Check if podman machine is running
ansible.builtin.command:
cmd: podman machine list --format "{{ '{{' }}.Running{{ '}}' }}"
register: podman_running
changed_when: false
- name: Start podman machine (if stopped)
ansible.builtin.command:
cmd: podman machine start
register: podman_start
changed_when: "'started successfully' in podman_start.stdout"
failed_when: false # Don't fail - see known issue above
when: "'true' not in podman_running.stdout"
- name: Warn if podman machine failed to start
ansible.builtin.debug:
msg: "WARNING: podman machine may not have started. Run 'podman machine start' manually on indri if needed."
when:
- "'true' not in podman_running.stdout"
- podman_start.rc != 0 or 'started successfully' not in podman_start.stdout

View file

@ -35,3 +35,8 @@ tailscale_serve_services:
https:
port: 443
upstream: http://localhost:8080
- name: svc:registry
https:
port: 443
upstream: http://localhost:5050

View file

@ -0,0 +1,16 @@
---
zot_repo_dir: /Users/erichblume/code/3rd/zot
zot_binary: "{{ zot_repo_dir }}/bin/zot-darwin-arm64"
zot_data_dir: /Users/erichblume/zot
zot_config_dir: /Users/erichblume/.config/zot
zot_port: 5050
zot_log_dir: /Users/erichblume/Library/Logs
# Pull-through cache registries (on-demand sync)
zot_sync_registries:
- name: docker.io
url: https://registry-1.docker.io
- name: ghcr.io
url: https://ghcr.io
- name: quay.io
url: https://quay.io

View file

@ -0,0 +1,6 @@
---
- name: Restart zot
ansible.builtin.shell: |
launchctl unload ~/Library/LaunchAgents/mcquack.eblume.zot.plist 2>/dev/null || true
launchctl load ~/Library/LaunchAgents/mcquack.eblume.zot.plist
changed_when: true

View file

@ -0,0 +1,66 @@
---
# Note: Zot is built from source, not installed via homebrew.
#
# ONE-TIME SETUP (before running ansible):
#
# 1. Clone zot from forge mirror (use localhost:3001 - hairpinning doesn't work):
# ssh indri 'git clone http://localhost:3001/eblume/zot.git ~/code/3rd/zot'
#
# 2. Set up Go via mise:
# ssh indri 'cd ~/code/3rd/zot && mise use go@1.25'
#
# 3. Build (creates bin/zot-darwin-arm64):
# ssh indri 'cd ~/code/3rd/zot && mise x -- make binary'
#
# 4. Run ansible to deploy config and LaunchAgent
- name: Verify zot binary exists
  ansible.builtin.stat:
    path: "{{ zot_binary }}"
  register: zot_binary_stat
- name: Fail if zot binary not found
  ansible.builtin.fail:
    msg: |
      Zot binary not found at {{ zot_binary }}.
      Please build from source first:
        ssh indri 'cd ~/code/3rd/zot && mise x -- make binary'
  when: not zot_binary_stat.stat.exists
- name: Ensure zot data directory exists
  ansible.builtin.file:
    path: "{{ zot_data_dir }}"
    state: directory
    mode: '0755'
- name: Ensure zot config directory exists
  ansible.builtin.file:
    path: "{{ zot_config_dir }}"
    state: directory
    mode: '0755'
- name: Deploy zot config
  ansible.builtin.template:
    src: config.json.j2
    dest: "{{ zot_config_dir }}/config.json"
    mode: '0644'
  notify: Restart zot
- name: Deploy zot LaunchAgent plist
  ansible.builtin.template:
    src: zot.plist.j2
    dest: ~/Library/LaunchAgents/mcquack.eblume.zot.plist
    mode: '0644'
  notify: Restart zot
- name: Check if zot LaunchAgent is loaded
  ansible.builtin.command: launchctl list mcquack.eblume.zot
  register: zot_launchctl_check
  changed_when: false
  failed_when: false
- name: Load zot LaunchAgent if not loaded
  ansible.builtin.command: launchctl load ~/Library/LaunchAgents/mcquack.eblume.zot.plist
  when: zot_launchctl_check.rc != 0
  changed_when: true
  failed_when: false


@ -0,0 +1,47 @@
{
  "distSpecVersion": "1.1.0",
  "storage": {
    "rootDirectory": "{{ zot_data_dir }}",
    "gc": true,
    "gcDelay": "1h",
    "gcInterval": "24h"
  },
  "http": {
    "address": "0.0.0.0",
    "port": "{{ zot_port }}"
  },
  "log": {
    "level": "info"
  },
  "extensions": {
    "metrics": {
      "enable": true,
      "prometheus": {
        "path": "/metrics"
      }
    },
    "sync": {
      "enable": true,
      "registries": [
{% for registry in zot_sync_registries %}
        {
          "urls": ["{{ registry.url }}"],
          "content": [{"prefix": "**", "destination": "/{{ registry.name }}"}],
          "onDemand": true,
          "tlsVerify": true
        }{% if not loop.last %},{% endif %}
{% endfor %}
      ]
    },
    "search": {
      "enable": true,
      "cve": {
        "updateInterval": "24h"
      }
    },
    "ui": {
      "enable": true
    }
  }
}


@ -0,0 +1,24 @@
<?xml version="1.0" encoding="UTF-8"?>
<!-- {{ ansible_managed }} -->
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>mcquack.eblume.zot</string>
    <key>ProgramArguments</key>
    <array>
        <!-- ABSOLUTE PATH to built binary in ~/code/3rd/zot -->
        <string>{{ zot_binary }}</string>
        <string>serve</string>
        <string>{{ zot_config_dir }}/config.json</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>KeepAlive</key>
    <true/>
    <key>StandardOutPath</key>
    <string>{{ zot_log_dir }}/mcquack.zot.out.log</string>
    <key>StandardErrorPath</key>
    <string>{{ zot_log_dir }}/mcquack.zot.err.log</string>
</dict>
</plist>


@ -0,0 +1,6 @@
---
zot_metrics_url: http://localhost:5050/v2/_catalog
zot_metrics_dir: /opt/homebrew/var/node_exporter/textfile
zot_metrics_script: /Users/erichblume/bin/zot-metrics
zot_metrics_interval: 60 # seconds between metric collection
zot_metrics_log_dir: /opt/homebrew/var/log


@ -0,0 +1,6 @@
---
- name: Reload zot-metrics
  ansible.builtin.shell: |
    launchctl unload ~/Library/LaunchAgents/mcquack.eblume.zot-metrics.plist 2>/dev/null || true
    launchctl load ~/Library/LaunchAgents/mcquack.eblume.zot-metrics.plist
  changed_when: true


@ -0,0 +1,43 @@
---
- name: Ensure metrics directory exists
  ansible.builtin.file:
    path: "{{ zot_metrics_dir }}"
    state: directory
    mode: '0755'
- name: Ensure log directory exists
  ansible.builtin.file:
    path: "{{ zot_metrics_log_dir }}"
    state: directory
    mode: '0755'
- name: Ensure bin directory exists
  ansible.builtin.file:
    path: "{{ zot_metrics_script | dirname }}"
    state: directory
    mode: '0755'
- name: Deploy zot-metrics script
  ansible.builtin.template:
    src: zot-metrics.sh.j2
    dest: "{{ zot_metrics_script }}"
    mode: '0755'
- name: Deploy zot-metrics LaunchAgent plist
  ansible.builtin.template:
    src: zot-metrics.plist.j2
    dest: ~/Library/LaunchAgents/mcquack.eblume.zot-metrics.plist
    mode: '0644'
  notify: Reload zot-metrics
- name: Check if zot-metrics LaunchAgent is loaded
  ansible.builtin.command: launchctl list mcquack.eblume.zot-metrics
  register: zot_metrics_launchctl_check
  changed_when: false
  failed_when: false
- name: Load zot-metrics LaunchAgent if not loaded
  ansible.builtin.command: launchctl load ~/Library/LaunchAgents/mcquack.eblume.zot-metrics.plist
  when: zot_metrics_launchctl_check.rc != 0
  changed_when: true
  failed_when: false


@ -0,0 +1,21 @@
<?xml version="1.0" encoding="UTF-8"?>
<!-- {{ ansible_managed }} -->
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>mcquack.eblume.zot-metrics</string>
    <key>ProgramArguments</key>
    <array>
        <string>{{ zot_metrics_script }}</string>
    </array>
    <key>StartInterval</key>
    <integer>{{ zot_metrics_interval }}</integer>
    <key>RunAtLoad</key>
    <true/>
    <key>StandardErrorPath</key>
    <string>{{ zot_metrics_log_dir }}/mcquack.zot-metrics.err.log</string>
    <key>StandardOutPath</key>
    <string>{{ zot_metrics_log_dir }}/mcquack.zot-metrics.out.log</string>
</dict>
</plist>


@ -0,0 +1,25 @@
#!/bin/bash
# {{ ansible_managed }}
# Collects Zot registry metrics for node_exporter textfile collector
set -euo pipefail
METRICS_URL="{{ zot_metrics_url }}"
OUTPUT_FILE="{{ zot_metrics_dir }}/zot.prom"
TEMP_FILE="${OUTPUT_FILE}.tmp"
# Start output file with header
cat > "$TEMP_FILE" << 'HEADER'
# HELP zot_up Zot registry is up and responding
# TYPE zot_up gauge
HEADER
# Check if zot is up
if curl -sf "$METRICS_URL" > /dev/null 2>&1; then
    echo "zot_up 1" >> "$TEMP_FILE"
else
    echo "zot_up 0" >> "$TEMP_FILE"
fi
# Atomic move
mv "$TEMP_FILE" "$OUTPUT_FILE"


@ -0,0 +1,31 @@
#!/bin/bash
# kubectl exec credential plugin for 1Password
# Usage: kubectl-credential-1password <vault-id> <item-id> <cert-field> <key-field>
#
# Fetches client certificate and key from 1Password and outputs
# ExecCredential JSON for kubectl authentication.
set -euo pipefail
VAULT_ID="$1"
ITEM_ID="$2"
CERT_FIELD="$3"
KEY_FIELD="$4"
# Fetch credentials from 1Password (strips surrounding quotes from text fields)
CLIENT_CERT=$(op --vault "$VAULT_ID" item get "$ITEM_ID" --fields "$CERT_FIELD" | sed 's/^"//; s/"$//')
CLIENT_KEY=$(op --vault "$VAULT_ID" item get "$ITEM_ID" --fields "$KEY_FIELD" | sed 's/^"//; s/"$//')
# Output ExecCredential JSON
# Note: jq is used to properly escape the PEM data for JSON
jq -n \
    --arg cert "$CLIENT_CERT" \
    --arg key "$CLIENT_KEY" \
    '{
        "apiVersion": "client.authentication.k8s.io/v1beta1",
        "kind": "ExecCredential",
        "status": {
            "clientCertificateData": $cert,
            "clientKeyData": $key
        }
    }'


@ -53,15 +53,18 @@ check_service "forgejo" "ssh indri 'brew services list | grep forgejo | grep sta
check_service "devpi" "ssh indri 'launchctl list | grep devpi | grep -v \"^-\"'"
check_service "postgresql" "ssh indri 'brew services list | grep postgresql | grep started'"
check_service "miniflux" "ssh indri 'brew services list | grep miniflux | grep started'"
check_service "zot" "ssh indri 'launchctl list | grep mcquack.eblume.zot | grep -v \"^-\"'"
check_service "zot-metrics" "ssh indri 'launchctl list | grep zot-metrics | grep -v \"^-\"'"
check_service "minikube-metrics" "ssh indri 'launchctl list | grep minikube-metrics | grep -v \"^-\"'"
echo ""
echo "HTTP endpoints (via Tailscale):"
check_http "Loki" "http://indri:3100/ready"
check_http "Prometheus" "http://indri:9090/-/healthy"
-check_http "Grafana" "http://indri:3000/api/health"
+check_http "Grafana" "https://grafana.tail8d86e.ts.net/api/health"
-check_http "Kiwix" "http://indri:5501/"
+check_http "Kiwix" "https://kiwix.tail8d86e.ts.net/"
-check_http "Forgejo" "http://indri:3001/"
+check_http "Forgejo" "https://forge.tail8d86e.ts.net/"
-check_http "Devpi" "http://indri:3141/+api"
+check_http "Devpi" "https://pypi.tail8d86e.ts.net/+api"
check_http "Miniflux" "https://feed.tail8d86e.ts.net/healthcheck"
# Transmission RPC is localhost-only by design, check via SSH
check_service "Transmission RPC" "ssh indri 'curl -sf http://127.0.0.1:9091/transmission/rpc'"
@ -69,6 +72,16 @@ check_service "Transmission RPC" "ssh indri 'curl -sf http://127.0.0.1:9091/tran
check_service "Transmission metrics" "ssh indri 'test -f /opt/homebrew/var/node_exporter/textfile/transmission.prom'"
# PostgreSQL uses TCP not HTTP, check via pg_isready
check_service "PostgreSQL" "ssh indri '/opt/homebrew/opt/postgresql@18/bin/pg_isready -h localhost'"
# Zot registry (via Tailscale service)
check_http "Zot Registry" "https://registry.tail8d86e.ts.net/v2/_catalog"
check_service "Zot metrics file" "ssh indri 'test -f /opt/homebrew/var/node_exporter/textfile/zot.prom'"
check_service "Minikube metrics file" "ssh indri 'test -f /opt/homebrew/var/node_exporter/textfile/minikube.prom'"
echo ""
echo "Kubernetes cluster:"
check_service "minikube" "ssh indri 'minikube status --format={{.Host}} | grep -q Running'"
check_service "k8s-apiserver (indri)" "ssh indri 'kubectl get --raw /healthz'"
check_service "k8s-apiserver (remote)" "kubectl --kubeconfig=$HOME/.kube/minikube-indri/config.yml --context=minikube-indri get --raw /healthz"
echo ""
if [ $FAILED -eq 0 ]; then


@ -130,6 +130,9 @@ mise run tailnet-preview # Review changes - should show new tag
mise run tailnet-up  # Apply changes
```
**Implementation Details:**
- Also need to add `"tag:registry"` to indri's tags in `pulumi/__main__.py` (the `DeviceTags` resource), not just define it in `policy.hujson`. The policy file defines the tag ownership rules, but the device tags are managed separately in the Python code.
---
### Step 0.2: Create Tailscale Services in Admin Console (MANUAL)
@ -140,7 +143,9 @@ mise run tailnet-up # Apply changes
2. Create service `registry` with:
   - Port: 443 (HTTPS)
   - Host: indri
3. Apply tag `tag:registry` to indri if not already tagged
**Implementation Details:**
- Tag is applied to indri via Pulumi in Step 0.1, not manually in admin console.
**Verification:**
```bash
@ -319,6 +324,10 @@ ssh indri 'curl -s http://localhost:5000/v2/_catalog'
# Expected: {"repositories":["docker.io/library/alpine"]}
```
**Implementation Details:**
- Changed port from 5000 to 5050 because macOS ControlCenter (AirPlay Receiver) uses port 5000 by default.
- Fixed sync config: use `"content": [{"prefix": "**", "destination": "/{{ registry.name }}"}]` instead of `"prefix": "{{ registry.name }}/**"`. The destination rewrites the local path, while prefix `**` matches all upstream repos.
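
The path rewrite described above can be illustrated with a minimal sketch. The helper name `zot_cache_ref` is hypothetical and not part of the zot role; it only joins strings to show how an upstream image maps to its pull-through path when `destination` is set to the registry name.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the zot role): build the local
# pull-through reference for an upstream image, mirroring the sync rule
#   "content": [{"prefix": "**", "destination": "/<registry name>"}]
zot_cache_ref() {
    local cache_host="$1"   # e.g. localhost:5050
    local upstream="$2"     # e.g. docker.io
    local image="$3"        # e.g. library/alpine:latest
    printf '%s/%s/%s\n' "$cache_host" "$upstream" "$image"
}

zot_cache_ref localhost:5050 docker.io library/alpine:latest
```

On indri the result could be passed straight to a pull, for example `podman pull "$(zot_cache_ref localhost:5050 docker.io library/nginx:latest)"`.
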
---
### Step 0.4: Add Zot to Tailscale Serve
@ -352,6 +361,11 @@ curl -s https://registry.tail8d86e.ts.net/v2/_catalog
# Expected: {"repositories":["blumeops/test","docker.io/library/alpine"]}
```
**Implementation Details:**
- Changed upstream port from 5000 to 5050 (see Step 0.3 implementation details).
- After running `tailscale serve`, the service must be approved in Tailscale admin console at https://login.tailscale.com/admin/services before it becomes accessible.
- Podman needed on gilbert for testing - added to Brewfile. Requires `podman machine init && podman machine start` after install.
---
### Step 0.5: Create Zot Metrics Role
@ -461,6 +475,9 @@ mise run indri-services-check
# Zot metrics... OK
```
**Implementation Details:**
- Used Tailscale service URL (`https://registry.tail8d86e.ts.net/v2/_catalog`) instead of internal endpoint to verify full path works.
---
### Step 0.8: Install and Configure Podman on Indri
@ -504,6 +521,17 @@ ssh indri 'podman info'
ssh indri 'podman run --rm hello-world'
```
**Implementation Details:**
- **KNOWN ISSUE**: `podman machine init` and `podman machine start` have reliability issues when run via Ansible/SSH. The machine sometimes gets stuck in "Starting" state due to a race condition (see https://github.com/containers/podman/issues/16945). Apple Hypervisor may also require GUI session context.
- **WORKAROUND**: If the machine fails to start via Ansible, manually run on indri:
```bash
podman machine rm -f podman-machine-default
podman machine init --cpus 4 --memory 8192 --disk-size 220
podman machine start
```
- LaunchAgent approach was attempted but didn't resolve the issue reliably.
- TODO: Investigate proper automation solution for reliable podman machine management.
---
### Step 0.9: Install and Configure Minikube
@ -570,6 +598,10 @@ ssh indri 'kubectl get nodes'
# Expected: minikube Ready control-plane ...
```
**Implementation Details:**
- Changed `minikube_memory` from 8192 to 7800 because podman machine reports slightly less available memory (7908MB) due to VM overhead. Minikube rejects memory requests exceeding what podman reports.
- Deployed with Kubernetes v1.34.0 and CRI-O 1.24.6.
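
As a rough sketch of the memory adjustment above, the function below derives a `--memory` value that stays under what podman reports. The helper name, the 100 MB headroom, and the round-down-to-100 rule are all assumptions chosen to reproduce the 7800 figure; they are not how the value was actually picked.

```bash
#!/usr/bin/env bash
# Hypothetical helper: pick a minikube --memory value (MB) below the memory
# reported for the podman machine, subtracting headroom and rounding down
# to a multiple of 100.
safe_minikube_memory() {
    local reported_mb="$1"          # memory reported by podman, in MB
    local headroom_mb="${2:-100}"   # assumed safety margin, in MB
    echo $(( (reported_mb - headroom_mb) / 100 * 100 ))
}

safe_minikube_memory 7908
```

With the 7908 MB reported here this yields 7800, which matches the value set in `minikube_memory`.
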
---
### Step 0.10: Configure Kubeconfig on Gilbert
@ -597,6 +629,93 @@ k9s # Should show the minikube cluster
The exact approach will be determined during implementation based on what works best with the podman driver.
**Implementation Details:**
Chose **Option 3: Recreate cluster with `--apiserver-names`** after comparing three approaches:
1. **SSH tunneling** - Requires keeping a tunnel running or complex on-demand setup
2. **SOCKS5 proxy with kubeconfig `proxy-url`** - Kubeconfig supports `proxy-url: socks5://localhost:1080` per-context, but still requires managing the proxy
3. **`--apiserver-names` + `--listen-address`** - Native minikube support, cleanest solution
**Cluster Setup:** Recreated the minikube cluster with additional flags:
```bash
minikube delete
minikube start \
  --driver=podman \
  --container-runtime=cri-o \
  --cpus=4 --memory=7800 --disk-size=200g \
  --apiserver-names=indri \
  --listen-address=0.0.0.0
```
- `--apiserver-names=indri` adds "indri" to the API server certificate SAN
- `--listen-address=0.0.0.0` tells podman to expose the API port on all interfaces
- API server port is dynamic (check with `kubectl config view --minify -o jsonpath="{.clusters[0].cluster.server}"` on indri)
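
Because the port changes when the cluster is recreated, it can help to extract it from the server URL rather than hard-coding it. The following is a minimal sketch; `apiserver_port` is a hypothetical helper and it assumes the URL carries an explicit port.

```bash
#!/usr/bin/env bash
# Hypothetical helper: extract the port from an API server URL such as the
# one printed by the `kubectl config view --minify ...` command above.
apiserver_port() {
    local url="$1"
    local hostport="${url#*://}"   # drop the scheme
    hostport="${hostport%%/*}"     # drop any path
    echo "${hostport##*:}"         # keep what follows the last colon
}

apiserver_port "https://127.0.0.1:39535"
```

The result could then be substituted into `--server=https://indri:<port>` when configuring the cluster entry on gilbert.
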
**Credential Management with 1Password:**
Rather than copying private keys between machines, credentials are stored in 1Password and fetched on-demand using kubectl's exec credential plugin. This mirrors the 1Password SSH agent pattern for biometric-protected key access.
1. **Store credentials in 1Password** (vault: `vg6xf6vvfmoh5hqjjhlhbeoaie`, item: `3jo4f2hnzvwfmamudfsbbbec7e`):
- `client-cert` - Contents of `~/.minikube/profiles/minikube/client.crt` (text field)
- `client-key` - Contents of `~/.minikube/profiles/minikube/client.key` (text field)
- `ca-cert` - Contents of `~/.minikube/ca.crt` (text field, not secret but stored for convenience)
2. **Created credential helper script** at `bin/kubectl-credential-1password`:
```bash
#!/bin/bash
# Fetches client cert/key from 1Password, outputs ExecCredential JSON
# Usage: kubectl-credential-1password <vault-id> <item-id> <cert-field> <key-field>
```
Symlinked to `~/.local/bin/kubectl-credential-1password`
3. **Kubeconfig setup on gilbert:**
```bash
# Store CA cert locally (not secret - public key for server verification)
mkdir -p ~/.kube/minikube-indri
op --vault <vault> item get <item> --fields ca-cert | sed 's/^"//; s/"$//' > ~/.kube/minikube-indri/ca.crt
# Configure cluster
kubectl config set-cluster minikube-indri \
  --server=https://indri:<port> \
  --certificate-authority=/Users/eblume/.kube/minikube-indri/ca.crt
# Configure credentials with exec plugin
kubectl config set-credentials minikube-indri \
  --exec-api-version=client.authentication.k8s.io/v1beta1 \
  --exec-command=kubectl-credential-1password \
  --exec-arg=<vault-id> \
  --exec-arg=<item-id> \
  --exec-arg=client-cert \
  --exec-arg=client-key
# Create context
kubectl config set-context minikube-indri \
  --cluster=minikube-indri \
  --user=minikube-indri
```
4. **Usage:**
```bash
kubectl --context=minikube-indri get nodes
# or
kubectl config use-context minikube-indri
kubectl get nodes
```
**Security Notes:**
- Client private key never stored on disk - fetched from 1Password on each kubectl command
- CA cert stored on disk (not secret - it's a public key for server verification)
- 1Password biometric/password prompt required for credential access
- `op` returns text fields wrapped in double quotes; the helper strips them with `sed 's/^"//; s/"$//'`
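
To show what the quote stripping does to a multi-line PEM value, here is a small sketch using the same `sed` expression. The function name `strip_quotes` and the placeholder certificate body are illustrative only.

```bash
#!/usr/bin/env bash
# Strip one leading and one trailing double quote from each input line,
# using the same sed expression as the credential helper.
strip_quotes() {
    sed 's/^"//; s/"$//'
}

# Placeholder input: a quoted multi-line value, not a real certificate.
printf '%s\n' '"-----BEGIN CERTIFICATE-----' 'PLACEHOLDER' '-----END CERTIFICATE-----"' | strip_quotes
```

Since `sed` works line by line, the opening quote is removed from the first line and the closing quote from the last, and the lines in between pass through unchanged.
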
**References:**
- [minikube start options](https://minikube.sigs.k8s.io/docs/commands/start/)
- [Using kubectl via SSH Tunnel](https://blog.scottlowe.org/2020/06/16/using-kubectl-via-an-ssh-tunnel/)
- [SOCKS5 Proxy Access to K8s API](https://kubernetes.ltd/docs/tasks/extend-kubernetes/socks5-proxy-access-api/)
- [kubectl-tokensshtunnel](https://github.com/jordiprats/kubectl-tokensshtunnel)
- [Securing kubectl config with 1Password](https://blog.mikael.green/post/1password-kubeconfig/)
---
### Step 0.11: Add Minikube to indri-services-check
@ -623,6 +742,10 @@ mise run indri-services-check
# k8s-apiserver... OK
```
**Implementation Notes:**
- Added a third check `k8s-apiserver (remote)` that verifies kubectl access from gilbert, not just via SSH to indri. This ensures the 1Password credential flow and remote API server access are working.
- The remote check uses both `--kubeconfig` and `--context` flags explicitly since the script runs in bash (not fish) and doesn't inherit the KUBECONFIG environment variable from fish config.
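
For context, a check helper of this kind might look like the sketch below. The real `check_service` in `indri-services-check` is not shown in this change and may differ; the body here is an assumption that only mirrors the `name... OK` output and the `FAILED` counter visible in the script.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a check helper; the actual implementation in
# indri-services-check may differ.
FAILED=0

check_service() {
    local name="$1"
    local cmd="$2"
    printf '  %s... ' "$name"
    # Run the command string in a fresh bash process and discard its output.
    if bash -c "$cmd" > /dev/null 2>&1; then
        echo "OK"
    else
        echo "FAILED"
        FAILED=$((FAILED + 1))
    fi
}

check_service "always-ok" "true"
check_service "always-fails" "false"
echo "failed checks: $FAILED"
```

Because the command string runs in a plain bash process, anything set only in fish configuration is invisible to it, which is why the remote check passes `--kubeconfig` and `--context` explicitly.
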
---
### Step 0.12: Create Zettelkasten Documentation
@ -631,6 +754,45 @@ mise run indri-services-check
- `~/code/personal/zk/zot.md`
- `~/code/personal/zk/minikube.md`
**Files to update:**
- `~/code/personal/zk/1767747119-YCPO.md` (main blumeops card)
**Updates to main blumeops card:**
1. Add to **Device Tags** table:
| `tag:registry` | indri | Container registry access |
2. Add to **Services** table:
| **Registry** | https://registry.tail8d86e.ts.net | OCI container registry (Zot) | [[zot]] |
| **Kubernetes** | https://indri:<port> | Minikube cluster | [[minikube]] |
3. Add to **Port Map (Indri)** table:
| 5050 | Zot | HTTP | localhost | Container registry |
| <dynamic> | K8s API | HTTPS | 0.0.0.0 | Minikube API server |
4. Add new section **Remote Kubernetes Access**:
```markdown
## Remote Kubernetes Access (from Gilbert)
The minikube cluster on indri is accessible from gilbert via direct connection.
Cluster was created with `--apiserver-names=indri --listen-address=0.0.0.0`.
**Fish abbreviations** (in `~/.config/fish/config.fish`):
- `ki` → `kubectl --context=minikube-indri`
- `k9i` → `k9s --context=minikube-indri`
- `k9` → `k9s`
```bash
# Quick access via abbreviations
ki get nodes
k9i
# Or explicitly set context
kubectl config use-context minikube-indri
kubectl get nodes
```
```
**Template for zot.md:**
```markdown
---
@ -651,7 +813,7 @@ Zot is an OCI-native container registry running on Indri, providing:
## Service Details
- URL: https://registry.tail8d86e.ts.net
-- Local port: 5000
+- Local port: 5050
- Data directory: ~/zot
- Config: ~/.config/zot/config.json
- Managed via: mcquack LaunchAgent
@ -669,10 +831,10 @@ Zot is an OCI-native container registry running on Indri, providing:
\`\`\`bash
# List all images
-curl -s http://localhost:5000/v2/_catalog | jq
+curl -s http://localhost:5050/v2/_catalog | jq
# Pull via cache (from indri or k8s)
-podman pull localhost:5000/docker.io/library/nginx:latest
+podman pull localhost:5050/docker.io/library/nginx:latest
# Build and push private image (from gilbert)
podman build -t registry.tail8d86e.ts.net/blumeops/myapp:v1 .
@ -691,6 +853,91 @@ tail -f ~/Library/Logs/mcquack.zot.err.log
- Initial setup for k8s migration Phase 0
```
**Template for minikube.md:**
```markdown
---
id: minikube
aliases:
- minikube
- kubernetes
- k8s
tags:
- blumeops
---
# Minikube Management Log
Minikube provides a single-node Kubernetes cluster on Indri for running containerized services.
## Cluster Details
- Driver: podman (rootless)
- Container runtime: CRI-O
- Kubernetes version: v1.34.0
- Resources: 4 CPUs, 7800MB RAM, 200GB disk
- API server: https://indri:<port> (accessible from gilbert via Tailscale)
## Remote Access from Gilbert
Cluster was created with `--apiserver-names=indri --listen-address=0.0.0.0` to allow remote kubectl access.
\`\`\`bash
# Switch context
kubectl config use-context minikube-indri
# Verify
kubectl get nodes
kubectl get namespaces
# Use k9s
k9s --context minikube-indri
\`\`\`
## Useful Commands (on indri)
\`\`\`bash
# Cluster status
minikube status
# Start/stop cluster
minikube start
minikube stop
# Access dashboard
minikube dashboard
# SSH into node
minikube ssh
# View logs
minikube logs
\`\`\`
## Podman Machine (prerequisite)
Minikube uses podman as its driver (the container runtime inside the cluster is CRI-O). The podman machine must be running:
\`\`\`bash
# Check podman machine
podman machine list
# Start if needed
podman machine start
\`\`\`
## Log
### [DATE]
- Initial cluster setup for k8s migration Phase 0
- Configured for remote access with --apiserver-names=indri
```
**Implementation Notes:**
- Created zot.md and minikube.md in ~/code/personal/zk/
- Updated 1767747119-YCPO.md (main blumeops card) with all specified changes
- Added 1Password credential plugin reference to minikube docs
- K8s API port is 39535 (dynamically assigned by minikube, may change on cluster recreation)
---
### Step 0.13: Update Main Playbook
@ -711,6 +958,10 @@ tail -f ~/Library/Logs/mcquack.zot.err.log
tags: minikube
```
**Implementation Notes:**
- Roles were added incrementally during Steps 0.3, 0.5, 0.8, and 0.9
- All four roles (zot, zot_metrics, podman, minikube) confirmed present in indri.yml
---
### Phase 0 Verification Checklist


@ -52,6 +52,7 @@ indri_tags = tailscale.DeviceTags(
"tag:loki",
"tag:pg",
"tag:feed",
"tag:registry", # Zot container registry
],
)


@ -101,6 +101,7 @@
"tag:loki": ["autogroup:admin", "tag:blumeops"],
"tag:pg": ["autogroup:admin", "tag:blumeops"],
"tag:feed": ["autogroup:admin", "tag:blumeops"],
"tag:registry": ["autogroup:admin", "tag:blumeops"],
},
// ============== ACL Tests ==============
@ -108,13 +109,13 @@
// Erich can access everything
{
"src": "blume.erich@gmail.com",
-"accept": ["tag:grafana:443", "tag:kiwix:443", "tag:feed:443", "tag:loki:3100", "tag:pg:5432", "tag:homelab:22"],
+"accept": ["tag:grafana:443", "tag:kiwix:443", "tag:feed:443", "tag:loki:3100", "tag:pg:5432", "tag:homelab:22", "tag:registry:443"],
},
// Allison can access user services but NOT grafana, loki, or NAS
{
"src": "acmdavis@gmail.com",
"accept": ["tag:kiwix:443", "tag:forge:443", "tag:feed:443", "tag:pg:5432"],
-"deny": ["tag:grafana:443", "tag:loki:3100", "tag:nas:445"],
+"deny": ["tag:grafana:443", "tag:loki:3100", "tag:nas:445", "tag:registry:443"],
},
// Homelab can reach homelab and NAS
{