Fix spider trap: disable SPA mode, remove index files, relax wiki-links (#290)

## Summary

Fixes the Facebook crawler spider trap that's been generating infinite recursive URLs like `/how-to/tutorials/tutorials/how-to/explanation/...` for several days.

**Root cause:** Quartz SPA mode + nginx `try_files` fallback to `index.html` meant any fabricated URL returned the root HTML shell with HTTP 200. Crawlers followed relative links from those fake URLs, creating infinite recursion.

**Fix:**
- Disable Quartz SPA mode (`enableSPA: false`) — all pages are now fully static HTML
- Replace nginx SPA fallback with `=404` + Quartz's static `404.html`
- Remove `robots.txt` exclusions (no longer needed)
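The root cause and fix can be illustrated with a toy model. This is a minimal sketch, not the deployed nginx config: the `serve` function, the candidate lookup order, and the `SITE` file set are all assumptions made up for illustration. It shows why a fabricated URL returned 200 under the SPA fallback, and how a relative link (here a hypothetical `explanation/`) resolved from that URL yields a new, deeper URL.

```python
from urllib.parse import urljoin


def serve(path: str, files: set[str], spa_fallback: bool) -> tuple[int, str]:
    """Toy model of a try_files-style lookup: return (status, file served)."""
    clean = path.strip("/")
    # Candidate order is an assumption for illustration, not the real nginx config.
    if clean:
        candidates = [clean, f"{clean}.html", f"{clean}/index.html"]
    else:
        candidates = ["index.html"]
    for candidate in candidates:
        if candidate in files:
            return 200, candidate
    if spa_fallback:
        # Old behaviour: any unknown path gets the root HTML shell with HTTP 200.
        return 200, "index.html"
    # New behaviour: unknown paths get the static 404 page.
    return 404, "404.html"


# Hypothetical site contents.
SITE = {"index.html", "404.html", "how-to/index.html", "tutorials/index.html"}
FAKE = "/how-to/tutorials/tutorials/how-to/"

before = serve(FAKE, SITE, spa_fallback=True)
after = serve(FAKE, SITE, spa_fallback=False)

# A relative link followed from the fabricated URL produces another fabricated URL.
next_hop = urljoin("https://docs.eblu.me" + FAKE, "explanation/")
```

With the fallback enabled, every `next_hop` is served successfully and the crawler keeps descending; with `=404` the chain stops at the first fabricated URL.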

**Docs cleanup (Obsidian.nvim compat no longer needed):**
- Delete hand-curated category index files (`tutorials.md`, `reference.md`, `how-to.md`, `explanation.md`) — Quartz auto-generates folder pages
- Delete `postgresql-storage.md` (redirect stub) and `migrate-forgejo-from-brew.md` (stale history)
- Drop `docs-check-index` and `docs-check-filenames` prek hooks
- Rewrite `docs-check-links` to allow path-based wiki-links (`[[path/to/file]]`) and only error on true ambiguity
- Add `ai-docs` doc tree listing to replace index files for AI context
- Add natural cross-links from reference cards to fix orphan docs
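The relaxed link rule can be sketched roughly as follows. This is a hedged illustration, not the actual `docs-check-links` hook: `resolve_wiki_link`, the two exception classes, and the example file paths are hypothetical names introduced here. It assumes path-based links are relative to the docs root and omit the `.md` extension.

```python
from pathlib import PurePosixPath


class BrokenLink(Exception):
    """The link target matches no doc."""


class AmbiguousLink(Exception):
    """A bare-name link matches more than one doc."""


def resolve_wiki_link(link: str, doc_paths: list[str]) -> str:
    """Resolve the inside of [[...]] to one doc path, or raise."""
    # Drop the optional display text: [[page|Display Text]].
    target = link.split("|", 1)[0].strip()
    if "/" in target:
        # Path-based link: must match a path from the docs root exactly.
        wanted = f"{target}.md"
        if wanted in doc_paths:
            return wanted
        raise BrokenLink(target)
    # Bare-name link: match on filename stem, which must be unique.
    matches = [p for p in doc_paths if PurePosixPath(p).stem == target]
    if not matches:
        raise BrokenLink(target)
    if len(matches) > 1:
        raise AmbiguousLink(f"{target}: {matches}")
    return matches[0]


# Hypothetical doc tree with one duplicated filename stem.
DOCS = [
    "tutorials/contributing.md",
    "reference/services/forgejo.md",
    "how-to/forgejo.md",
]
```

Under this sketch, `[[contributing]]` resolves on its own, `[[how-to/forgejo]]` resolves by path, and only the bare `[[forgejo]]` is reported as an error.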

## Deployment and Testing

- [ ] Merge and let the build pipeline run
- [ ] Verify docs.eblu.me serves pages correctly with full page loads
- [ ] Verify non-existent URLs return 404
- [ ] Monitor crawler traffic — should drop to near zero for fabricated URLs
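
The two verification steps could be scripted along these lines. This is a sketch only: `fetch_status` and `check_site` are hypothetical helpers, and the paths passed in are whatever known-good and fabricated URLs you choose to probe.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status code for a GET request."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the code is what we want to inspect.
        return err.code


def check_site(base_url: str, real_path: str, fake_path: str, fetch=fetch_status) -> list[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    real_status = fetch(base_url + real_path)
    if real_status != 200:
        problems.append(f"{real_path}: expected 200, got {real_status}")
    fake_status = fetch(base_url + fake_path)
    if fake_status != 404:
        problems.append(f"{fake_path}: expected 404, got {fake_status}")
    return problems
```

For example, `check_site("https://docs.eblu.me", "/", "/how-to/tutorials/tutorials/how-to/")` should return an empty list once the fix is deployed. The `fetch` parameter exists so the logic can be exercised without network access.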

Reviewed-on: #290
Erich Blume 2026-03-09 11:59:43 -07:00
commit 4f0476a851
24 changed files with 110 additions and 666 deletions


@@ -91,7 +91,7 @@ BlumeOps operations are driven by mise tasks. Run `mise tasks` to list all avail
| Task | When to Use |
|------|-------------|
| `ai-docs` | At session start - review infrastructure documentation |
| `ai-docs` | At session start - review infrastructure documentation (see [[mise-tasks]]) |
| `docs-mikado` | View active Mikado dependency chains for C2 changes |
| `docs-mikado --resume` | Resume a C2 chain: detect branch, show state and next steps |
| `provision-indri` | Deploy changes to [[indri]]-hosted services via Ansible |
@@ -104,9 +104,7 @@ BlumeOps operations are driven by mise tasks. Run `mise tasks` to list all avail
| `dns-up` | Apply DNS changes via Pulumi |
| `tailnet-preview` | Preview Tailscale ACL changes |
| `tailnet-up` | Apply Tailscale ACL changes via Pulumi |
| `docs-check-links` | Validate wiki-links in documentation (includes orphan detection) |
| `docs-check-index` | Check every doc is referenced in its category index |
| `docs-check-filenames` | Check for duplicate doc filenames |
| `docs-check-links` | Validate wiki-links resolve correctly (supports path-based links, orphan detection) |
| `docs-review-stale` | Report docs by last-modified date, highlight stale ones |
| `docs-review-tags` | Print frontmatter tag inventory across all docs |
| `docs-review` | Review the most stale doc by last-reviewed date |
@@ -120,7 +118,7 @@ For ArgoCD operations, use the `argocd` CLI directly:
For AI agents building context:
- [[reference|Reference]] - Entry point for technical details
- [Reference](/reference/) - Entry point for technical details
- [[hosts|Host Inventory]] - What hardware exists
- [[apps|ArgoCD Apps]] - What's deployed in Kubernetes
- [[routing|Routing]] - How services are exposed


@@ -18,18 +18,18 @@ The docs follow the [Diataxis](https://diataxis.fr/) framework:
| Section | Purpose | When to Use |
|---------|---------|-------------|
| **[[tutorials|Tutorials]]** | Learning-oriented | "I'm new and want to understand" |
| **[[reference|Reference]]** | Information-oriented | "I need specific technical details" |
| **[[how-to|How-to]]** | Task-oriented | "I need to do X" |
| **[[explanation|Explanation]]** | Understanding-oriented | "I want to understand why" |
| **[Tutorials](/tutorials/)** | Learning-oriented | "I'm new and want to understand" |
| **[Reference](/reference/)** | Information-oriented | "I need specific technical details" |
| **[How-to](/how-to/)** | Task-oriented | "I need to do X" |
| **[Explanation](/explanation/)** | Understanding-oriented | "I want to understand why" |
## Quick Paths by Audience
### For Erich (Owner)
You probably want quick access to operational details:
- [[how-to]] guides for common operations (deploy, troubleshoot, update ACLs)
- [[reference]] has service URLs, commands, and config locations
- [How-to](/how-to/) guides for common operations (deploy, troubleshoot, update ACLs)
- [Reference](/reference/) has service URLs, commands, and config locations
- [[ai-assistance-guide]] explains how to work effectively with Claude
- Run `mise run ai-docs` to prime AI context with key documentation
@@ -37,40 +37,41 @@ You probably want quick access to operational details:
Context for effective assistance:
- Read [[ai-assistance-guide]] for operational conventions
- [[reference]] has the technical specifics you'll need
- [Reference](/reference/) has the technical specifics you'll need
- The repo's `CLAUDE.md` has critical rules (especially the kubectl context requirement)
### For External Readers
Understanding what this is:
- [[explanation]] covers the "why" behind design decisions
- [[reference]] shows what's actually running
- [Explanation](/explanation/) covers the "why" behind design decisions
- [Reference](/reference/) shows what's actually running
- Browse service pages to see specific implementations
### For Contributors
Getting started with changes:
- [[contributing]] walks through the workflow
- [[how-to]] guides for specific tasks (deploy services, add roles)
- [[reference]] tells you where things live
- [How-to](/how-to/) guides for specific tasks (deploy services, add roles)
- [Reference](/reference/) tells you where things live
### For Replicators
Replicators are people who want to build their own similar homelab GitOps setup, using BlumeOps as inspiration.
- [[replicating-blumeops]] provides the overview, with linked tutorials that go deep on individual components
- [[explanation]] covers architecture and design rationale
- [Explanation](/explanation/) covers architecture and design rationale
- Reference pages show specific configuration choices
## Using Wiki Links
Documentation uses `[[wiki-links]]` for cross-references:
- `[[service-name]]` links to a reference page
- `[[service-name]]` links by filename stem (must be unambiguous)
- `[[path/to/file]]` links by path from docs root (for disambiguation)
- `[[page|Display Text]]` customizes the link text
When reading on the web (docs.eblu.me), these render as clickable links. The backlinks panel shows what references each page.
Prek hooks automatically validate that all wiki-links point to existing files and that link targets are unambiguous.
Prek hooks validate that all wiki-links resolve to existing files and flag ambiguous bare-name links.
## AI Context Priming
@@ -80,10 +81,9 @@ The `ai-docs` mise task concatenates key documentation files for AI context:
mise run ai-docs
```
This outputs the AI assistance guide, reference index, how-to index, architecture overview, and tutorials index in plain text with file headers - providing Claude with essential context for BlumeOps operations.
This outputs key documentation files and a full tree listing of all docs, providing Claude with essential context for BlumeOps operations.
## Related
- [[tutorials]] - Parent index of all tutorials
- [[update-documentation]] - How to publish doc changes
- [[review-documentation]] - Periodic doc review process


@@ -136,5 +136,5 @@ Begin with [[tailscale-setup]] - networking is the foundation everything else bu
## Related
- [[reference]] - See BlumeOps' specific configurations
- [Reference](/reference/) - See BlumeOps' specific configurations
- [[contributing]] - Help improve BlumeOps instead


@@ -1,49 +0,0 @@
---
title: Tutorials
modified: 2026-02-07
tags:
- tutorials
---
# Tutorials
Learning-oriented guides for understanding and working with BlumeOps.
## Audience Guide
Each tutorial indicates which audiences it serves:
| Icon | Audience | Description |
|------|----------|-------------|
| **Owner** | Erich | Quick recall and operational refreshers |
| **AI** | Claude/AI agents | Context for AI-assisted operations |
| **Reader** | External readers | Understanding what BlumeOps is |
| **Contributor** | Operators/contributors | Helping with BlumeOps development |
| **Replicator** | Replicators | Building your own similar setup |
## Getting Started
| Tutorial | Audiences | Description |
|----------|-----------|-------------|
| [[exploring-the-docs]] | All | How to navigate and use this documentation |
| [[ai-assistance-guide]] | AI, Owner | Context for effective AI-assisted operations |
## Contributing
| Tutorial | Audiences | Description |
|----------|-----------|-------------|
| [[contributing]] | Contributor | Your first contribution to BlumeOps |
| [[adding-a-service]] | Contributor, Replicator | Deploy a new service via ArgoCD |
## Replication
For those building their own homelab GitOps setup.
| Tutorial | Audiences | Description |
|----------|-----------|-------------|
| [[replicating-blumeops]] | Replicator | Overview: building a similar environment |
| [[tailscale-setup|Tailscale Setup]] | Replicator | Setting up Tailscale networking |
| [[core-services|Core Services]] | Replicator | Forgejo and container registry |
| [[kubernetes-bootstrap|Kubernetes Bootstrap]] | Replicator | Bootstrapping a Kubernetes cluster |
| [[argocd-config|ArgoCD Config]] | Replicator | Configuring GitOps with ArgoCD |
| [[observability-stack|Observability Stack]] | Replicator | Metrics, logs, and dashboards |