2026-02-17 09:51:40 -08:00
---
title: Ntfy
Recurring review sweep: 4 doc cards + nvidia-device-plugin v0.19.2 (#366)
Knocks out the two daily recurring review tasks (doc review + service review) in one PR.
## Doc review (4 never-reviewed reference cards, `last-reviewed: 2026-06-04`)
- **cluster.md** — Kubernetes version v1.34.0 → **v1.35.0**; refreshed the stale ringtail workload list and noted the in-progress minikube→k3s migration (points to `[[ringtail]]` as the canonical list).
- **ntfy.md / tempo.md / alloy.md** — corrected image references: these are now **locally-built `registry.ops.eblu.me/blumeops/*` nix containers** (ntfy v2.19.2, tempo v2.10.3, alloy-k8s v1.16.0), not upstream Docker Hub. Fly.io alloy binary bumped to v1.16.1.
## Service review
- **nvidia-device-plugin** (ringtail GPU): v0.19.0 → **v0.19.2**. Upstream patch releases — CDI/Tegra fixes + dependency bumps, no breaking changes for our manifest-based CDI + RuntimeClass setup (the service-account change in the notes is helm-only).
## Not in this PR (need container rebuilds, deferred)
The other stale services are locally-built nix images, so upgrading them is a forge-runner rebuild rather than a clean tag bump — left untouched (not date-bumped, so they resurface): **prometheus** (v3.10.0→v3.12.0), **loki** (3.6.7→3.7.2), **kube-state-metrics**, **homepage**. Happy to do these as a follow-up rebuild PR.
## Deploy / verify
Not yet deployed — `nvidia-device-plugin` still points at `main`. After review:
```
argocd app set nvidia-device-plugin --revision reviews-jun4 && argocd app sync nvidia-device-plugin
# after merge:
argocd app set nvidia-device-plugin --revision main && argocd app sync nvidia-device-plugin
```
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: https://forge.eblu.me/eblume/blumeops/pulls/366
2026-06-04 13:37:02 -07:00
modified: 2026-06-04
last-reviewed: 2026-06-04
2026-02-17 09:51:40 -08:00
tags:
- service
- notifications
---
# Ntfy
Self-hosted push notification service. Ntfy receives HTTP POST messages and delivers them to subscribed clients (mobile apps, web UI, CLI).
## Quick Reference
| Property | Value |
|----------|-------|
| **URL ** | https://ntfy.ops.eblu.me |
| **Tailscale URL ** | https://ntfy.tail8d86e.ts.net |
| **Namespace ** | `ntfy` |
Recurring review sweep: 4 doc cards + nvidia-device-plugin v0.19.2 (#366)
Knocks out the two daily recurring review tasks (doc review + service review) in one PR.
## Doc review (4 never-reviewed reference cards, `last-reviewed: 2026-06-04`)
- **cluster.md** — Kubernetes version v1.34.0 → **v1.35.0**; refreshed the stale ringtail workload list and noted the in-progress minikube→k3s migration (points to `[[ringtail]]` as the canonical list).
- **ntfy.md / tempo.md / alloy.md** — corrected image references: these are now **locally-built `registry.ops.eblu.me/blumeops/*` nix containers** (ntfy v2.19.2, tempo v2.10.3, alloy-k8s v1.16.0), not upstream Docker Hub. Fly.io alloy binary bumped to v1.16.1.
## Service review
- **nvidia-device-plugin** (ringtail GPU): v0.19.0 → **v0.19.2**. Upstream patch releases — CDI/Tegra fixes + dependency bumps, no breaking changes for our manifest-based CDI + RuntimeClass setup (the service-account change in the notes is helm-only).
## Not in this PR (need container rebuilds, deferred)
The other stale services are locally-built nix images, so upgrading them is a forge-runner rebuild rather than a clean tag bump — left untouched (not date-bumped, so they resurface): **prometheus** (v3.10.0→v3.12.0), **loki** (3.6.7→3.7.2), **kube-state-metrics**, **homepage**. Happy to do these as a follow-up rebuild PR.
## Deploy / verify
Not yet deployed — `nvidia-device-plugin` still points at `main`. After review:
```
argocd app set nvidia-device-plugin --revision reviews-jun4 && argocd app sync nvidia-device-plugin
# after merge:
argocd app set nvidia-device-plugin --revision main && argocd app sync nvidia-device-plugin
```
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Reviewed-on: https://forge.eblu.me/eblume/blumeops/pulls/366
2026-06-04 13:37:02 -07:00
| **Image ** | `registry.ops.eblu.me/blumeops/ntfy:v2.19.2-fd0bebb-nix` (locally built) |
2026-02-17 09:51:40 -08:00
| **Upstream ** | https://github.com/binwiederhier/ntfy |
| **Manifests ** | `argocd/manifests/ntfy/` |
## Architecture
Ntfy runs as a single pod with no persistent storage — message cache and attachments use an `emptyDir` volume. This is intentional: ntfy is treated as an ephemeral delivery channel, not a message store. Messages lost on pod restart are acceptable.
The upstream relay (`ntfy.sh` ) is configured so mobile app clients can receive push notifications via Google FCM / Apple APNs without self-hosting those integrations.
## Producers
2026-03-11 18:37:31 -07:00
Currently the only producer is **frigate-notify ** , which polls Frigate's webapi for camera detection alerts (person, vehicle, animal) and forwards them to ntfy:
2026-02-17 09:51:40 -08:00
```
2026-03-11 18:37:31 -07:00
Frigate → frigate-notify (webapi polling) → ntfy → mobile clients
2026-02-17 09:51:40 -08:00
```
The frigate-notify config points to ntfy's cluster-internal address:
```
http://ntfy.ntfy.svc.cluster.local:80
```
Other services could publish to ntfy in the future — any HTTP client can POST to a topic.
## Configuration
Server config is in a ConfigMap (`ntfy-config` ):
| Setting | Value |
|---------|-------|
| `base-url` | `https://ntfy.ops.eblu.me` |
| `upstream-base-url` | `https://ntfy.sh` |
| `attachment-total-size-limit` | 1 GB |
| `attachment-file-size-limit` | 10 MB |
| `attachment-expiry-duration` | 24h |
No authentication is configured — access is restricted by Tailscale ACLs (only tailnet clients can reach the service).
## Related
- [[routing]] - How ntfy is exposed via Caddy
- [[observability]] - Monitoring and alerting infrastructure