Fix cache hit rate on APM and Fly.io dashboards #177
Summary
- Remove `match_all = true` from `flyio_nginx_cache_requests_total` in Alloy so the metric only counts requests that go through the proxy cache (excludes health checks with an empty `cache_status`).
- Change `rate(...[5m])` to `increase(...[$__range])`, which aggregates over the full dashboard time window instead of a 5-minute sliding window and gives meaningful ratios for low-traffic static sites.

Root cause
Health check requests from Fly.io hit the default nginx server block (no `proxy_cache`), producing entries with an empty `upstream_cache_status`. With `match_all = true`, these were counted in the cache metric, diluting the Fly.io dashboard ratio. For APM dashboards, `rate(...[5m])` on low-traffic sites with 24h cache validity almost always returns either all HITs (100%) or no data (blank → red background).

Deployment
Test plan
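
For reviewers, the root cause can be illustrated with a small simulation. This is a sketch, not the deployed Alloy or Grafana configuration: the `LogEntry` type, the function names, the traffic pattern, and the 30-second health check interval are all hypothetical, and it assumes the dashboards compute the hit rate as HIT requests divided by all counted requests. Windowing over raw entries only approximates what `rate()` and `increase()` do on a real counter.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LogEntry:
    timestamp: int      # seconds since the start of the simulated day
    cache_status: str   # "" when the request bypassed proxy_cache (e.g. a health check)


def hit_ratio(entries: List[LogEntry], count_empty_status: bool) -> Optional[float]:
    """HIT / counted requests; None when nothing was counted (a blank panel)."""
    counted = [e for e in entries if count_empty_status or e.cache_status]
    if not counted:
        return None
    hits = sum(1 for e in counted if e.cache_status == "HIT")
    return hits / len(counted)


def windowed_hit_ratio(entries: List[LogEntry], now: int, window_seconds: int) -> Optional[float]:
    """Hit ratio over (now - window_seconds, now], ignoring empty cache statuses."""
    in_window = [e for e in entries if now - window_seconds < e.timestamp <= now]
    return hit_ratio(in_window, count_empty_status=False)


# Hypothetical low-traffic day: one MISS that fills the cache, then three HITs.
cached = [
    LogEntry(100, "MISS"),
    LogEntry(3600, "HIT"),
    LogEntry(7200, "HIT"),
    LogEntry(40000, "HIT"),
]
# Hypothetical health checks every 30 seconds, all with an empty cache status.
health_checks = [LogEntry(t, "") for t in range(0, 86400, 30)]
entries = cached + health_checks

diluted = hit_ratio(entries, count_empty_status=True)      # health checks counted
filtered = hit_ratio(entries, count_empty_status=False)    # health checks excluded
short_window_idle = windowed_hit_ratio(entries, now=86400, window_seconds=300)
short_window_busy = windowed_hit_ratio(entries, now=3700, window_seconds=300)
full_range = windowed_hit_ratio(entries, now=86400, window_seconds=86400)
```

With this made-up traffic, counting the health checks gives a ratio of about 0.001, while excluding them gives 0.75. The 5-minute window yields either `None` (no cache traffic in the window) or 1.0 (a single HIT), whereas the full 24h window yields 0.75.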