diff --git a/README.md b/README.md index 8ba6b8d..ee4c3c8 100644 --- a/README.md +++ b/README.md @@ -31,9 +31,21 @@ flakes), all connected via Tailscale: Authentik SSO) on k3s, plus NixOS systemd services. - **Sifaka** (Synology NAS) - backup target and bulk storage. -Notable services include Grafana/Prometheus/Loki observability, Immich photos, -Jellyfin media, Forgejo git forge, a Zot container registry, and more. Public -access is routed through a Fly.io proxy; everything else is tailnet-only. +Notable services include Immich photos, Jellyfin media, Forgejo git forge, a +Zot container registry, and more. Public access is routed through a Fly.io +proxy; everything else is tailnet-only. + +### Observability stack + +The four(+) pillars of observability — metrics, logs, traces, and profiles — +collected by Grafana Alloy and visualized in Grafana with cross-signal linking: + +| Pillar | Backend | How | +|--------|---------|-----| +| **Metrics** | Prometheus | Alloy scrape + remote_write | +| **Logs** | Loki | Alloy pod log collection | +| **Traces** | Tempo | Alloy Beyla eBPF auto-instrumentation | +| **Profiles** | Pyroscope | Alloy eBPF continuous profiling | ## Project structure diff --git a/argocd/apps/alloy-profiling-ringtail.yaml b/argocd/apps/alloy-profiling-ringtail.yaml new file mode 100644 index 0000000..7f65782 --- /dev/null +++ b/argocd/apps/alloy-profiling-ringtail.yaml @@ -0,0 +1,17 @@ +apiVersion: argoproj.io/v1alpha1 +kind: Application +metadata: + name: alloy-profiling-ringtail + namespace: argocd +spec: + project: default + source: + repoURL: ssh://forgejo@forge.ops.eblu.me:2222/eblume/blumeops.git + targetRevision: main + path: argocd/manifests/alloy-profiling-ringtail + destination: + server: https://ringtail.tail8d86e.ts.net:6443 + namespace: alloy + syncPolicy: + syncOptions: + - CreateNamespace=true diff --git a/argocd/apps/pyroscope.yaml b/argocd/apps/pyroscope.yaml new file mode 100644 index 0000000..0019105 --- /dev/null +++ 
b/argocd/apps/pyroscope.yaml @@ -0,0 +1,17 @@ +apiVersion: argoproj.io/v1alpha1 +kind: Application +metadata: + name: pyroscope + namespace: argocd +spec: + project: default + source: + repoURL: ssh://forgejo@forge.ops.eblu.me:2222/eblume/blumeops.git + targetRevision: main + path: argocd/manifests/pyroscope + destination: + server: https://ringtail.tail8d86e.ts.net:6443 + namespace: pyroscope + syncPolicy: + syncOptions: + - CreateNamespace=true diff --git a/argocd/manifests/alloy-profiling-ringtail/config.alloy b/argocd/manifests/alloy-profiling-ringtail/config.alloy new file mode 100644 index 0000000..b9395c2 --- /dev/null +++ b/argocd/manifests/alloy-profiling-ringtail/config.alloy @@ -0,0 +1,73 @@ +// Alloy profiling configuration for ringtail +// Uses pyroscope.ebpf to continuously profile workloads and export to Pyroscope + +// ============== KUBERNETES DISCOVERY ============== + +discovery.kubernetes "pods" { + role = "pod" +} + +discovery.relabel "pods" { + targets = discovery.kubernetes.pods.targets + + // Map container ID for pyroscope.ebpf (required label) + rule { + source_labels = ["__meta_kubernetes_pod_container_id"] + target_label = "__container_id__" + } + + // Build service_name from namespace/pod (required label) + rule { + source_labels = ["__meta_kubernetes_namespace", "__meta_kubernetes_pod_name"] + separator = "/" + target_label = "service_name" + } + + // Keep namespace label + rule { + source_labels = ["__meta_kubernetes_namespace"] + target_label = "namespace" + } + + // Keep pod name label + rule { + source_labels = ["__meta_kubernetes_pod_name"] + target_label = "pod" + } + + // Keep container name label + rule { + source_labels = ["__meta_kubernetes_pod_container_name"] + target_label = "container" + } + + // Drop infrastructure namespaces + rule { + source_labels = ["namespace"] + regex = "kube-system|tailscale" + action = "drop" + } + + // Drop alloy pods to avoid self-profiling noise + rule { + source_labels = 
["__meta_kubernetes_pod_label_app"] + regex = "alloy|alloy-tracing|alloy-profiling" + action = "drop" + } +} + +// ============== eBPF PROFILING ============== + +pyroscope.ebpf "instance" { + forward_to = [pyroscope.write.endpoint.receiver] + targets = discovery.relabel.pods.output + demangle = "none" +} + +// ============== PYROSCOPE WRITE ============== + +pyroscope.write "endpoint" { + endpoint { + url = "http://pyroscope.pyroscope.svc.cluster.local:4040" + } +} diff --git a/argocd/manifests/alloy-profiling-ringtail/daemonset.yaml b/argocd/manifests/alloy-profiling-ringtail/daemonset.yaml new file mode 100644 index 0000000..6c8b09a --- /dev/null +++ b/argocd/manifests/alloy-profiling-ringtail/daemonset.yaml @@ -0,0 +1,72 @@ +apiVersion: apps/v1 +kind: DaemonSet +metadata: + name: alloy-profiling + namespace: alloy + labels: + app: alloy-profiling +spec: + selector: + matchLabels: + app: alloy-profiling + template: + metadata: + labels: + app: alloy-profiling + spec: + serviceAccountName: alloy-profiling + hostPID: true + containers: + - name: alloy + image: registry.ops.eblu.me/blumeops/alloy:kustomized + args: + - run + - --server.http.listen-addr=0.0.0.0:12347 + - --storage.path=/var/lib/alloy/data + - /etc/alloy/config.alloy + ports: + - containerPort: 12347 + name: http + env: + - name: HOSTNAME + valueFrom: + fieldRef: + fieldPath: spec.nodeName + resources: + requests: + cpu: 100m + memory: 256Mi + limits: + cpu: "1" + memory: 1Gi + volumeMounts: + - name: config + mountPath: /etc/alloy + - name: data + mountPath: /var/lib/alloy/data + - name: tmp + mountPath: /tmp + - name: host-proc + mountPath: /host/proc + readOnly: true + - name: host-sys + mountPath: /host/sys + readOnly: true + securityContext: + privileged: true + tolerations: + - operator: Exists + volumes: + - name: config + configMap: + name: alloy-profiling-config + - name: data + emptyDir: {} + - name: tmp + emptyDir: {} + - name: host-proc + hostPath: + path: /proc + - name: host-sys + 
hostPath: + path: /sys diff --git a/argocd/manifests/alloy-profiling-ringtail/kustomization.yaml b/argocd/manifests/alloy-profiling-ringtail/kustomization.yaml new file mode 100644 index 0000000..76af63f --- /dev/null +++ b/argocd/manifests/alloy-profiling-ringtail/kustomization.yaml @@ -0,0 +1,17 @@ +apiVersion: kustomize.config.k8s.io/v1beta1 +kind: Kustomization + +namespace: alloy + +resources: + - rbac.yaml + - daemonset.yaml + +images: + - name: registry.ops.eblu.me/blumeops/alloy + newTag: v1.14.0-fd0bebb-nix + +configMapGenerator: + - name: alloy-profiling-config + files: + - config.alloy diff --git a/argocd/manifests/alloy-profiling-ringtail/rbac.yaml b/argocd/manifests/alloy-profiling-ringtail/rbac.yaml new file mode 100644 index 0000000..5b0bb04 --- /dev/null +++ b/argocd/manifests/alloy-profiling-ringtail/rbac.yaml @@ -0,0 +1,30 @@ +apiVersion: v1 +kind: ServiceAccount +metadata: + name: alloy-profiling + namespace: alloy +--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRole +metadata: + name: alloy-profiling +rules: + - apiGroups: [""] + resources: ["pods", "services", "endpoints", "nodes", "namespaces"] + verbs: ["get", "list", "watch"] + - apiGroups: ["apps"] + resources: ["deployments", "replicasets", "statefulsets", "daemonsets"] + verbs: ["get", "list", "watch"] +--- +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRoleBinding +metadata: + name: alloy-profiling +roleRef: + apiGroup: rbac.authorization.k8s.io + kind: ClusterRole + name: alloy-profiling +subjects: + - kind: ServiceAccount + name: alloy-profiling + namespace: alloy diff --git a/argocd/manifests/grafana/datasources.yaml b/argocd/manifests/grafana/datasources.yaml index 5a3d0f3..286bbf0 100644 --- a/argocd/manifests/grafana/datasources.yaml +++ b/argocd/manifests/grafana/datasources.yaml @@ -48,6 +48,19 @@ datasources: datasourceUid: prometheus nodeGraph: enabled: true + tracesToProfilesV2: + datasourceUid: pyroscope + customQuery: false +- access: proxy + editable: 
false + name: Pyroscope + orgId: 1 + type: grafana-pyroscope-datasource + uid: pyroscope + url: https://pyroscope.tail8d86e.ts.net + jsonData: + backendType: pyroscope + tlsSkipVerify: true - access: proxy database: teslamate editable: false diff --git a/argocd/manifests/pyroscope/config.yaml b/argocd/manifests/pyroscope/config.yaml new file mode 100644 index 0000000..cc1a136 --- /dev/null +++ b/argocd/manifests/pyroscope/config.yaml @@ -0,0 +1,13 @@ +storage: + backend: filesystem + filesystem: + dir: /data + +compactor: + compaction_interval: 30m + +limits: + max_query_lookback: 168h + +self_profiling: + disable_push: true diff --git a/argocd/manifests/pyroscope/ingress-tailscale.yaml b/argocd/manifests/pyroscope/ingress-tailscale.yaml new file mode 100644 index 0000000..4384def --- /dev/null +++ b/argocd/manifests/pyroscope/ingress-tailscale.yaml @@ -0,0 +1,26 @@ +# Tailscale Ingress for Pyroscope query API +apiVersion: networking.k8s.io/v1 +kind: Ingress +metadata: + name: pyroscope-tailscale + namespace: pyroscope + annotations: + tailscale.com/funnel: "false" + tailscale.com/proxy-group: "ingress" + tailscale.com/tags: "tag:k8s" + gethomepage.dev/enabled: "false" +spec: + ingressClassName: tailscale + rules: + - http: + paths: + - path: / + pathType: Prefix + backend: + service: + name: pyroscope + port: + number: 4040 + tls: + - hosts: + - pyroscope diff --git a/argocd/manifests/pyroscope/kustomization.yaml b/argocd/manifests/pyroscope/kustomization.yaml new file mode 100644 index 0000000..f8db2c6 --- /dev/null +++ b/argocd/manifests/pyroscope/kustomization.yaml @@ -0,0 +1,20 @@ +apiVersion: kustomize.config.k8s.io/v1beta1 +kind: Kustomization + +namespace: pyroscope + +resources: + - namespace.yaml + - statefulset.yaml + - service.yaml + - ingress-tailscale.yaml + +images: + - name: grafana/pyroscope + newName: registry.ops.eblu.me/blumeops/pyroscope + newTag: "v1.19.1-a269bd6-nix" + +configMapGenerator: + - name: pyroscope-config + files: + - config.yaml 
diff --git a/argocd/manifests/pyroscope/namespace.yaml b/argocd/manifests/pyroscope/namespace.yaml new file mode 100644 index 0000000..a96c46f --- /dev/null +++ b/argocd/manifests/pyroscope/namespace.yaml @@ -0,0 +1,4 @@ +apiVersion: v1 +kind: Namespace +metadata: + name: pyroscope diff --git a/argocd/manifests/pyroscope/service.yaml b/argocd/manifests/pyroscope/service.yaml new file mode 100644 index 0000000..ccb0dc2 --- /dev/null +++ b/argocd/manifests/pyroscope/service.yaml @@ -0,0 +1,13 @@ +apiVersion: v1 +kind: Service +metadata: + name: pyroscope + namespace: pyroscope +spec: + selector: + app: pyroscope + ports: + - name: http + port: 4040 + targetPort: 4040 + type: ClusterIP diff --git a/argocd/manifests/pyroscope/statefulset.yaml b/argocd/manifests/pyroscope/statefulset.yaml new file mode 100644 index 0000000..e31414b --- /dev/null +++ b/argocd/manifests/pyroscope/statefulset.yaml @@ -0,0 +1,66 @@ +apiVersion: apps/v1 +kind: StatefulSet +metadata: + name: pyroscope + namespace: pyroscope +spec: + serviceName: pyroscope + replicas: 1 + selector: + matchLabels: + app: pyroscope + template: + metadata: + labels: + app: pyroscope + spec: + securityContext: + fsGroup: 10001 + runAsNonRoot: true + runAsUser: 10001 + seccompProfile: + type: RuntimeDefault + containers: + - name: pyroscope + image: grafana/pyroscope:kustomized + args: + - -config.file=/etc/pyroscope/config.yaml + ports: + - name: http + containerPort: 4040 + volumeMounts: + - name: config + mountPath: /etc/pyroscope + - name: data + mountPath: /data + resources: + requests: + memory: "256Mi" + cpu: "100m" + limits: + memory: "2Gi" + cpu: "500m" + livenessProbe: + httpGet: + path: /ready + port: 4040 + initialDelaySeconds: 45 + periodSeconds: 10 + readinessProbe: + httpGet: + path: /ready + port: 4040 + initialDelaySeconds: 10 + periodSeconds: 5 + volumes: + - name: config + configMap: + name: pyroscope-config + volumeClaimTemplates: + - metadata: + name: data + spec: + accessModes: 
["ReadWriteOnce"] + resources: + requests: + storage: 10Gi diff --git a/containers/pyroscope/default.nix b/containers/pyroscope/default.nix new file mode 100644 index 0000000..950d0bb --- /dev/null +++ b/containers/pyroscope/default.nix @@ -0,0 +1,161 @@ +# Nix-built Grafana Pyroscope continuous profiling server +# Builds v1.19.1 from forge mirror +# Uses stdenv + make (not buildGoModule) due to multi-module go.work workspace +# with local replace directives (./api, ./lidia) +# Built with dockerTools.buildLayeredImage for efficient layer caching +{ pkgs ? import <nixpkgs> { } }: + +let + version = "1.19.1"; + + src = pkgs.fetchgit { + url = "https://forge.ops.eblu.me/mirrors/pyroscope.git"; + rev = "v${version}"; + hash = "sha256-UPxGimkzXLFACqmAM1hNQIoNjN6OquVibwVmNvP00+s="; + }; + + # Build frontend assets via yarn + webpack (upstream uses Docker for this) + # mkYarnPackage symlinks node_modules from the Nix store, but webpack's + # CopyPlugin can't follow symlinks for glob patterns. We dereference the + # @grafana/ui icons directory before running the build. + ui = pkgs.mkYarnPackage { + inherit version src; + pname = "pyroscope-ui"; + + buildPhase = '' + runHook preBuild + export HOME=$TMPDIR + cd deps/grafana-pyroscope + + # mkYarnPackage symlinks node_modules into the Nix store (read-only). + # Webpack CopyPlugin can't glob through these symlinks to find + # @grafana/ui icons. Pre-copy them to the output location and patch + # webpack to skip the CopyPlugin entry for icons. 
+ mkdir -p public/build/grafana/build/img + cp -rL ../../node_modules/@grafana/ui/dist/public/img/icons \ + public/build/grafana/build/img/ + + # Rewrite the CopyPlugin icons path to point at our pre-copied location + # instead of the symlinked node_modules path that webpack can't glob + sed -i "s|from: 'node_modules/@grafana/ui/dist/public/img/icons'|from: 'public/build/grafana/build/img/icons'|" \ + scripts/webpack/webpack.common.js + + yarn --offline build + runHook postBuild + ''; + + installPhase = '' + runHook preInstall + mkdir -p $out + cp -r public/build/* $out/ + runHook postInstall + ''; + + distPhase = "true"; + }; + + # Pre-fetch Go modules for all go.mod files in the workspace (fixed-output derivation) + goModules = pkgs.stdenv.mkDerivation { + pname = "pyroscope-go-modules"; + inherit src version; + + nativeBuildInputs = with pkgs; [ go git cacert ]; + + buildPhase = '' + export GOPATH=$TMPDIR/go + export GOFLAGS=-modcacherw + # Download modules for all workspace members + go mod download + cd api && go mod download && cd .. + cd lidia && go mod download && cd .. 
+ ''; + + installPhase = '' + cp -r $TMPDIR/go/pkg/mod $out + ''; + + # Disable fixup: patchelf and patchShebangs modify downloaded Go toolchain + # binaries, which makes the fixed-output derivation reference store paths + dontFixup = true; + + outputHashMode = "recursive"; + outputHash = "sha256-RCWuqz1XaDrS7+GqL/9v7LNA14M4/ohWEtPeTMDkJFc="; + outputHashAlgo = "sha256"; + }; + + pyroscope = pkgs.stdenv.mkDerivation { + inherit src version; + pname = "pyroscope"; + + nativeBuildInputs = with pkgs; [ + go + git + gnumake + cacert + ]; + + buildPhase = '' + runHook preBuild + + export HOME=$TMPDIR + export GOPATH=$TMPDIR/go + export GOFLAGS=-modcacherw + + # Populate module cache from pre-fetched modules + mkdir -p $GOPATH/pkg + cp -r ${goModules} $GOPATH/pkg/mod + chmod -R u+w $GOPATH/pkg/mod + + # Copy pre-built frontend assets + mkdir -p public/build + cp -r ${ui}/* public/build/ + + # Build Go binary with embedded frontend assets + # Skip the Makefile's frontend/build target (uses Docker) and + # invoke go/bin directly with EMBEDASSETS set + # CGO_ENABLED=0 for static binary (matches upstream) + CGO_ENABLED=0 \ + EMBEDASSETS=embedassets \ + IMAGE_TAG=v${version} \ + make go/bin + + runHook postBuild + ''; + + installPhase = '' + runHook preInstall + mkdir -p $out/bin + cp pyroscope $out/bin/pyroscope + runHook postInstall + ''; + + meta = with pkgs.lib; { + description = "Grafana Pyroscope continuous profiling platform"; + homepage = "https://grafana.com/docs/pyroscope/"; + license = licenses.agpl3Only; + mainProgram = "pyroscope"; + }; + }; +in + +pkgs.dockerTools.buildLayeredImage { + name = "blumeops/pyroscope"; + contents = [ + pyroscope + pkgs.cacert + pkgs.tzdata + ]; + + config = { + Entrypoint = [ "${pyroscope}/bin/pyroscope" ]; + Cmd = [ "-config.file=/etc/pyroscope/config.yaml" ]; + Env = [ + "SSL_CERT_FILE=${pkgs.cacert}/etc/ssl/certs/ca-bundle.crt" + "TZDIR=${pkgs.tzdata}/share/zoneinfo" + ]; + ExposedPorts = { + "4040/tcp" = { }; + }; + User = 
"65534"; + }; +} diff --git a/docs/changelog.d/feature-pyroscope-profiling.feature.md b/docs/changelog.d/feature-pyroscope-profiling.feature.md new file mode 100644 index 0000000..0d429fe --- /dev/null +++ b/docs/changelog.d/feature-pyroscope-profiling.feature.md @@ -0,0 +1 @@ +Deploy Grafana Pyroscope on ringtail for continuous eBPF profiling, with Alloy collection agent and Grafana cross-signal linking. diff --git a/docs/reference/kubernetes/apps.md b/docs/reference/kubernetes/apps.md index 02215fc..db9baca 100644 --- a/docs/reference/kubernetes/apps.md +++ b/docs/reference/kubernetes/apps.md @@ -1,6 +1,6 @@ --- title: Apps -modified: 2026-03-04 +modified: 2026-03-26 tags: - kubernetes - argocd @@ -30,6 +30,8 @@ Registry of all applications deployed via [[argocd]]. | `tempo` | monitoring | `argocd/manifests/tempo/` | [[tempo]] | | `alloy-k8s` | alloy | `argocd/manifests/alloy-k8s/` | [[alloy|Alloy]] | | `alloy-tracing-ringtail` | alloy | `argocd/manifests/alloy-tracing-ringtail/` | [[alloy|Alloy]] (eBPF tracing) | +| `alloy-profiling-ringtail` | alloy | `argocd/manifests/alloy-profiling-ringtail/` | [[alloy|Alloy]] (eBPF profiling) | +| `pyroscope` | pyroscope | `argocd/manifests/pyroscope/` | [[pyroscope]] | | `kube-state-metrics` | monitoring | `argocd/manifests/kube-state-metrics/` | K8s metrics | | `miniflux` | miniflux | `argocd/manifests/miniflux/` | [[miniflux]] | | `kiwix` | kiwix | `argocd/manifests/kiwix/` | [[kiwix]] | diff --git a/docs/reference/operations/observability.md b/docs/reference/operations/observability.md index 35136d5..e136fb6 100644 --- a/docs/reference/operations/observability.md +++ b/docs/reference/operations/observability.md @@ -1,21 +1,30 @@ --- title: Observability -modified: 2026-03-22 +modified: 2026-03-26 tags: - operations --- # Observability -Metrics, logs, traces, and dashboards for BlumeOps infrastructure. 
+The four(+) pillars of observability — metrics, logs, traces, and profiles — collected and visualized via the Grafana ecosystem. ## Components -- [[prometheus]] - Metrics storage and querying -- [[loki]] - Log aggregation -- [[tempo]] - Distributed tracing -- [[alloy|Alloy]] - Metrics, log, and trace collection -- [[grafana]] - Dashboards and visualization +| Pillar | Backend | Collector | Cluster | +|--------|---------|-----------|---------| +| **Metrics** | [[prometheus]] | [[alloy]] | indri | +| **Logs** | [[loki]] | [[alloy]] | indri | +| **Traces** | [[tempo]] | [[alloy]] (Beyla eBPF) | indri (backend), ringtail (collection) | +| **Profiles** | [[pyroscope]] | [[alloy]] (pyroscope.ebpf) | ringtail | + +All four are visualized in [[grafana]] with cross-signal linking (traces → logs, traces → profiles, traces → metrics). + +## Future: Frontend Monitoring (RUM) + +Grafana Faro is a Real User Monitoring SDK that captures page loads, web vitals, errors, and network timings from the browser, feeding into Loki (logs) and Tempo (traces) via Alloy's `faro.receiver` component. This would add an "outside-in" view of service health from the user's perspective. + +**Not currently deployed.** RUM captures browsing behavior from visitors to public services, creating a data retention liability. Would require careful sanitization before deploying. ## Alerting diff --git a/docs/reference/services/alloy.md b/docs/reference/services/alloy.md index d781f2f..d62b638 100644 --- a/docs/reference/services/alloy.md +++ b/docs/reference/services/alloy.md @@ -1,6 +1,6 @@ --- title: Alloy -modified: 2026-03-13 +modified: 2026-03-26 tags: - service - observability @@ -8,10 +8,12 @@ tags: # Grafana Alloy -Unified observability collector for metrics and logs with three deployments: +Unified observability collector for metrics, logs, traces, and profiles with five deployments: 1. **Indri (host)** - System metrics and service logs from macOS host 2. 
**Kubernetes (DaemonSet)** - Automatic pod log collection and service health probes 3. **Fly.io proxy (embedded)** - nginx access log metrics and log forwarding from [[flyio-proxy]] +4. **Ringtail tracing (DaemonSet)** - Beyla eBPF auto-instrumentation for HTTP traces +5. **Ringtail profiling (DaemonSet)** - `pyroscope.ebpf` continuous CPU profiling ## Quick Reference @@ -64,4 +66,7 @@ The Homebrew bottle uses `CGO_ENABLED=0`, which breaks Tailscale MagicDNS. Build - [[prometheus]] - Metrics storage - [[loki]] - Log storage +- [[tempo]] - Trace storage +- [[pyroscope]] - Profile storage - [[grafana]] - Visualization +- [[observability]] - Full stack overview diff --git a/docs/reference/services/grafana.md b/docs/reference/services/grafana.md index 3a9ae01..e5e038c 100644 --- a/docs/reference/services/grafana.md +++ b/docs/reference/services/grafana.md @@ -1,6 +1,6 @@ --- title: Grafana -modified: 2026-02-28 +modified: 2026-03-26 tags: - service - observability @@ -37,6 +37,7 @@ The OIDC client secret is injected via [[external-secrets]] (`grafana-authentik- | Prometheus | prometheus | `prometheus.monitoring.svc.cluster.local:9090` | | Loki | loki | `loki.monitoring.svc.cluster.local:3100` | | Tempo | tempo | `tempo.monitoring.svc.cluster.local:3200` | +| Pyroscope | grafana-pyroscope-datasource | `pyroscope.tail8d86e.ts.net` (ringtail, via Tailscale) | | TeslaMate | postgres | `blumeops-pg-rw.databases.svc.cluster.local:5432` | ## Dashboard Provisioning diff --git a/docs/reference/services/pyroscope.md b/docs/reference/services/pyroscope.md new file mode 100644 index 0000000..927d1b7 --- /dev/null +++ b/docs/reference/services/pyroscope.md @@ -0,0 +1,50 @@ +--- +title: Pyroscope +modified: 2026-03-26 +tags: + - service + - observability +--- + +# Grafana Pyroscope + +Continuous profiling backend for BlumeOps. Stores CPU profiles collected by Alloy's eBPF profiler on ringtail, providing function-level visibility into where compute time is spent. 
+ +## Quick Reference + +| Property | Value | +|----------|-------| +| **URL** | https://pyroscope.tail8d86e.ts.net | +| **Namespace** | `pyroscope` | +| **Cluster** | ringtail (k3s) | +| **Deployment** | StatefulSet (`argocd/manifests/pyroscope/`) | +| **Image** | `grafana/pyroscope` | +| **Port** | 4040 | +| **Storage** | 10Gi PVC at `/data` | +| **Query lookback** | 7 days (`max_query_lookback: 168h`) | + +## Architecture + +Pyroscope runs on ringtail because eBPF profiling requires Linux. Grafana on indri queries it via Tailscale Ingress. + +``` +Alloy (pyroscope.ebpf on ringtail) → Pyroscope (ringtail) → Grafana (indri, via Tailscale) +``` + +## Collection + +Profiles are collected by the `alloy-profiling-ringtail` DaemonSet, which runs the `pyroscope.ebpf` component in privileged mode with `hostPID: true`. It discovers Kubernetes pods automatically and excludes infrastructure namespaces (`kube-system`, `tailscale`) and Alloy pods. + +The eBPF profiler works without application instrumentation: it samples CPU stack traces from the kernel, covering native code (Go, C/C++), interpreted languages (Python, Ruby, Node.js), and JIT-compiled runtimes (.NET). 
+ +**Limitations:** +- GPU workloads (e.g., Frigate inference via CUDA) are invisible to CPU profiling +- Stripped binaries (no debug symbols) produce opaque stack frames +- Python frame quality varies depending on runtime version + +## Related + +- [[alloy]] - Collection agent +- [[observability]] - Full observability stack overview +- [[grafana]] - Visualization +- [[tempo]] - Distributed tracing (cross-linked via traces-to-profiles) diff --git a/service-versions.yaml b/service-versions.yaml index 909aa8c..8580525 100644 --- a/service-versions.yaml +++ b/service-versions.yaml @@ -285,6 +285,20 @@ services: upstream-source: https://github.com/prowler-cloud/prowler/releases notes: CIS Kubernetes Benchmark scanner; weekly CronJob on minikube-indri + - name: pyroscope + type: argocd + last-reviewed: 2026-03-26 + current-version: "v1.19.1" + upstream-source: https://github.com/grafana/pyroscope/releases + notes: Nix-built container on ringtail; continuous profiling backend + + - name: alloy-profiling-ringtail + type: argocd + last-reviewed: 2026-03-26 + current-version: "v1.14.0" + upstream-source: https://github.com/grafana/alloy/releases + notes: Privileged DaemonSet with pyroscope.ebpf for CPU profiling on ringtail + - name: forgejo type: ansible last-reviewed: 2026-02-22