blumeops/containers/quartz/default.conf
Erich Blume 4f0476a851
All checks were successful
Build Container / detect (push) Successful in 3s
Build Container (Nix) / detect (push) Successful in 1s
Build Container (Nix) / build (quartz) (push) Successful in 1s
Build Container / build (quartz) (push) Successful in 10s
Fix spider trap: disable SPA mode, remove index files, relax wiki-links (#290)
## Summary

Fixes the Facebook crawler spider trap that's been generating infinite recursive URLs like `/how-to/tutorials/tutorials/how-to/explanation/...` for several days.

**Root cause:** Quartz SPA mode + nginx `try_files` fallback to `index.html` meant any fabricated URL returned the root HTML shell with HTTP 200. Crawlers followed relative links from those fake URLs, creating infinite recursion.

**Fix:**
- Disable Quartz SPA mode (`enableSPA: false`) — all pages are now fully static HTML
- Replace nginx SPA fallback with `=404` + Quartz's static `404.html`
- Remove `robots.txt` exclusions (no longer needed)

**Docs cleanup (Obsidian.nvim compat no longer needed):**
- Delete hand-curated category index files (`tutorials.md`, `reference.md`, `how-to.md`, `explanation.md`) — Quartz auto-generates folder pages
- Delete `postgresql-storage.md` (redirect stub) and `migrate-forgejo-from-brew.md` (stale history)
- Drop `docs-check-index` and `docs-check-filenames` prek hooks
- Rewrite `docs-check-links` to allow path-based wiki-links (`[[path/to/file]]`) and only error on true ambiguity
- Add `ai-docs` doc tree listing to replace index files for AI context
- Add natural cross-links from reference cards to fix orphan docs

## Deployment and Testing

- [ ] Merge and let the build pipeline run
- [ ] Verify docs.eblu.me serves pages correctly with full page loads
- [ ] Verify non-existent URLs return 404
- [ ] Monitor crawler traffic — should drop to near zero for fabricated URLs

Reviewed-on: #290
2026-03-09 11:59:43 -07:00

34 lines
1,009 B
Text

server {
listen 80;
server_name _;
root /usr/share/nginx/html;
index index.html;
# Enable gzip compression
gzip on;
gzip_types text/plain text/css application/json application/javascript text/xml application/xml text/javascript;
# Cache static assets
location ~* \.(js|css|png|jpg|jpeg|gif|ico|svg|woff|woff2)$ {
expires 1y;
add_header Cache-Control "public, immutable";
}
# Static file serving — no SPA fallback.
# Quartz generates complete HTML for every page, so all valid URLs
# map to real files. Non-existent paths get 404.html (generated by
# Quartz's NotFoundPage plugin), preventing the spider-trap issue
# where crawlers would get index.html for fabricated URLs.
location / {
try_files $uri $uri/ $uri.html =404;
}
error_page 404 /404.html;
# Health check endpoint
location /healthz {
access_log off;
return 200 "ok\n";
add_header Content-Type text/plain;
}
}