From 7ebac4aef6b89fe7d246daa8863693d183a344b8 Mon Sep 17 00:00:00 2001 From: Erich Blume Date: Tue, 3 Feb 2026 18:51:57 -0800 Subject: [PATCH] Add Phase 3 tutorials with audience targeting (#94) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ## Summary - Create tutorials directory structure with index page - Add 5 main tutorials targeting different audiences: - **what-is-blumeops** (Reader, AI) - High-level orientation - **exploring-the-docs** (All) - Navigation guide - **ai-assistance-guide** (AI, Owner) - Context for AI-assisted operations - **contributing** (Contributor) - First contribution workflow - **replicating-blumeops** (Replicator) - Overview for building similar setup - Add 4 replication sub-tutorials: - tailscale-setup, kubernetes-bootstrap, argocd-config, observability-stack - Update README.md to mark Phase 3 complete - Add changelog fragment Each tutorial explicitly identifies its target audiences and links to reference material rather than re-explaining concepts. 
## Deployment and Testing - [x] All pre-commit hooks pass (doc-links validates wiki links) - [ ] Build docs via workflow to verify rendering - [ ] Review content for accuracy 🤖 Generated with [Claude Code](https://claude.com/claude-code) Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/94 --- Brewfile | 1 + docs/README.md | 26 +- docs/changelog.d/phase3-tutorials.doc.md | 1 + docs/index.md | 9 +- docs/reference/ansible/roles.md | 47 ++++ docs/reference/index.md | 8 + docs/reference/kubernetes/apps.md | 2 +- .../kubernetes/tailscale-operator.md | 40 +++ docs/reference/services/docs.md | 49 ++++ docs/tutorials/adding-a-service.md | 255 ++++++++++++++++++ docs/tutorials/ai-assistance-guide.md | 121 +++++++++ docs/tutorials/contributing.md | 177 ++++++++++++ docs/tutorials/exploring-the-docs.md | 80 ++++++ docs/tutorials/index.md | 48 ++++ docs/tutorials/replicating-blumeops.md | 139 ++++++++++ docs/tutorials/replication/argocd-config.md | 221 +++++++++++++++ docs/tutorials/replication/core-services.md | 113 ++++++++ .../replication/kubernetes-bootstrap.md | 170 ++++++++++++ .../replication/observability-stack.md | 231 ++++++++++++++++ docs/tutorials/replication/tailscale-setup.md | 134 +++++++++ 20 files changed, 1863 insertions(+), 9 deletions(-) create mode 100644 docs/changelog.d/phase3-tutorials.doc.md create mode 100644 docs/reference/ansible/roles.md create mode 100644 docs/reference/kubernetes/tailscale-operator.md create mode 100644 docs/reference/services/docs.md create mode 100644 docs/tutorials/adding-a-service.md create mode 100644 docs/tutorials/ai-assistance-guide.md create mode 100644 docs/tutorials/contributing.md create mode 100644 docs/tutorials/exploring-the-docs.md create mode 100644 docs/tutorials/index.md create mode 100644 docs/tutorials/replicating-blumeops.md create mode 100644 docs/tutorials/replication/argocd-config.md create mode 100644 docs/tutorials/replication/core-services.md create mode 100644 
docs/tutorials/replication/kubernetes-bootstrap.md create mode 100644 docs/tutorials/replication/observability-stack.md create mode 100644 docs/tutorials/replication/tailscale-setup.md diff --git a/Brewfile b/Brewfile index 5aa1402..a87fe8c 100644 --- a/Brewfile +++ b/Brewfile @@ -2,5 +2,6 @@ brew "actionlint" # GitHub/Forgejo Actions workflow linter brew "argocd" # ArgoCD CLI for GitOps management brew "bat" # Syntax-highlighted file concatenation +brew "mise" # Task runner and toolchain manager brew "tea" # Gitea/Forgejo CLI for forge.ops.eblu.me brew "podman" # Container CLI (uses VM on macOS, for building/pushing images) diff --git a/docs/README.md b/docs/README.md index 86ba1c3..1416361 100644 --- a/docs/README.md +++ b/docs/README.md @@ -79,13 +79,24 @@ Information-oriented technical descriptions. Built first so other docs can link **Reference URL:** https://docs.ops.eblu.me/reference/ -### Phase 3: Tutorials -Learning-oriented content for getting started. +### Phase 3: Tutorials (Complete) +Learning-oriented content for getting started. Each tutorial explicitly identifies its target audiences. 
-- [ ] Create `tutorials/` directory -- [ ] "Getting Started with BlumeOps" - What this is and how to explore it -- [ ] "Setting Up a Similar Environment" - For replicators -- [ ] "Your First Contribution" - For potential contributors +- [x] Create `tutorials/` directory with index +- [x] "Exploring the Docs" - How to navigate documentation (All) +- [x] "AI Assistance Guide" - Context for AI-assisted operations (AI, Owner) +- [x] "Contributing" - Your first contribution (Contributor) +- [x] "Adding a Service" - Deploy a new ArgoCD service (Contributor, Replicator) +- [x] "Replicating BlumeOps" - Overview for building similar setup (Replicator) +- [x] Replication sub-tutorials: + - [x] Tailscale Setup + - [x] Core Services (Forgejo, Zot) + - [x] Kubernetes Bootstrap + - [x] ArgoCD Config + - [x] Observability Stack +- [x] New reference cards: docs service, tailscale-operator, ansible/roles + +**Tutorials URL:** https://docs.ops.eblu.me/tutorials/ ### Phase 4: How-to Guides Task-oriented instructions for specific operations. @@ -96,6 +107,7 @@ Task-oriented instructions for specific operations. - [ ] "How to add a new Ansible role" - [ ] "How to update Tailscale ACLs" - [ ] "How to troubleshoot common issues" +- [ ] Update `exploring-the-docs` with How-to section ### Phase 5: Explanation Understanding-oriented discussion of concepts and decisions. @@ -105,11 +117,13 @@ Understanding-oriented discussion of concepts and decisions. - [ ] "Architecture Overview" - How everything fits together - [ ] "Security Model" - Tailscale, secrets management, etc. 
- [ ] "Decision Log" - ADRs (Architecture Decision Records) +- [ ] Update `exploring-the-docs` with Explanation section ### Phase 6: Integration & Cleanup - [ ] Migrate remaining useful content from `docs/zk/` - [ ] Decide fate of zk cards (archive, delete, or keep as separate knowledge base) - [ ] Update CLAUDE.md to reference new doc structure +- [ ] Final review of `exploring-the-docs` for completeness - [ ] Mirror docs to GitHub Pages for public access (optional) ## Current Directory Layout diff --git a/docs/changelog.d/phase3-tutorials.doc.md b/docs/changelog.d/phase3-tutorials.doc.md new file mode 100644 index 0000000..a67bb45 --- /dev/null +++ b/docs/changelog.d/phase3-tutorials.doc.md @@ -0,0 +1 @@ +Add Phase 3 tutorials: "What is BlumeOps?", "Exploring the Docs", "AI Assistance Guide", "Contributing", and "Replicating BlumeOps" with sub-tutorials for Tailscale, Kubernetes, ArgoCD, and Observability. Each tutorial explicitly identifies its target audiences. diff --git a/docs/index.md b/docs/index.md index 0d12e53..ce01b1f 100644 --- a/docs/index.md +++ b/docs/index.md @@ -4,8 +4,13 @@ title: blumeops-documentation Welcome to the BlumeOps documentation. -[[README | Documentation Home]] - Temporary home while docs are being restructured (see [Diataxis](https://diataxis.fr/) restructuring plan) +**New here?** Start with [[exploring-the-docs]] to find your way around. 
## Sections -- [[reference/index | reference]] - Technical reference cards for services, infrastructure, and operations +- [[tutorials/index | Tutorials]] - Learning-oriented guides for getting started +- [[reference/index | Reference]] - Technical reference cards for services, infrastructure, and operations + +## About + +[[README | Documentation Home]] - Restructuring plan and changelog info diff --git a/docs/reference/ansible/roles.md b/docs/reference/ansible/roles.md new file mode 100644 index 0000000..d51847f --- /dev/null +++ b/docs/reference/ansible/roles.md @@ -0,0 +1,47 @@ +--- +title: ansible-roles +tags: + - ansible + - reference +--- + +# Ansible Roles + +Roles for provisioning services on [[indri]]. Run via `mise run provision-indri`. + +## Available Roles + +| Role | Purpose | Service | +|------|---------|---------| +| **alloy** | Observability collector | [[alloy]] | +| **borgmatic** | Backup automation | [[borgmatic]] | +| **borgmatic_metrics** | Backup metrics exporter | [[borgmatic]] | +| **caddy** | Reverse proxy & TLS | [[routing]] | +| **forgejo** | Git forge | [[forgejo]] | +| **jellyfin** | Media server | [[jellyfin]] | +| **jellyfin_metrics** | Media metrics exporter | [[jellyfin]] | +| **minikube** | Kubernetes cluster | [[cluster]] | +| **minikube_metrics** | Cluster metrics | [[cluster]] | +| **zot** | Container registry | [[zot]] | +| **zot_metrics** | Registry metrics | [[zot]] | + +## Role Structure + +Each role follows Ansible conventions: +``` +ansible/roles// +├── defaults/main.yml # Default variables +├── tasks/main.yml # Task definitions +├── handlers/main.yml # Handlers (restarts, etc.) +├── templates/ # Jinja2 templates +└── files/ # Static files +``` + +## Secrets + +Roles that need secrets use 1Password via the playbook's `pre_tasks`. Secrets are gathered at playbook start and passed to roles as variables. 
+ +## Related + +- [[indri]] - Target host +- [[observability]] - Metrics collection diff --git a/docs/reference/index.md b/docs/reference/index.md index fe55778..8900424 100644 --- a/docs/reference/index.md +++ b/docs/reference/index.md @@ -31,6 +31,7 @@ Individual service reference cards with URLs and configuration details. | [[teslamate]] | Tesla data logger | k8s | | [[transmission]] | BitTorrent daemon | k8s | | [[zot]] | Container registry | indri | +| [[docs]] | Documentation site (Quartz) | k8s | ## Infrastructure @@ -48,8 +49,15 @@ Cluster configuration and application registry. - [[cluster | Cluster]] - Minikube specs, storage, networking - [[apps | Apps]] - ArgoCD application registry +- [[tailscale-operator]] - Tailscale ingress for k8s services - [[external-secrets]] - Secrets management +## Ansible + +Configuration management for [[indri]]-hosted services. + +- [[reference/ansible/roles | Roles]] - Available ansible roles + ## Storage Network storage and backup configuration. diff --git a/docs/reference/kubernetes/apps.md b/docs/reference/kubernetes/apps.md index db5ab58..298aafc 100644 --- a/docs/reference/kubernetes/apps.md +++ b/docs/reference/kubernetes/apps.md @@ -15,7 +15,7 @@ Registry of all applications deployed via [[argocd]]. 
|-----|-----------|-------------|---------| | `apps` | argocd | `argocd/apps/` | App-of-apps root | | `argocd` | argocd | `argocd/manifests/argocd/` | [[argocd]] | -| `tailscale-operator` | tailscale | `argocd/manifests/tailscale-operator/` | Tailscale k8s operator | +| `tailscale-operator` | tailscale | `argocd/manifests/tailscale-operator/` | [[tailscale-operator]] | | `1password-connect` | 1password | `argocd/manifests/1password-connect/` | [[1password]] | | `external-secrets` | external-secrets | Helm chart | [[1password]] | | `external-secrets-config` | external-secrets | `argocd/manifests/external-secrets-config/` | [[1password]] | diff --git a/docs/reference/kubernetes/tailscale-operator.md b/docs/reference/kubernetes/tailscale-operator.md new file mode 100644 index 0000000..8df8df3 --- /dev/null +++ b/docs/reference/kubernetes/tailscale-operator.md @@ -0,0 +1,40 @@ +--- +title: tailscale-operator +tags: + - kubernetes + - tailscale +--- + +# Tailscale Kubernetes Operator + +The Tailscale operator enables Kubernetes services to be exposed directly on the Tailscale network via Ingress resources. + +## Quick Reference + +| Property | Value | +|----------|-------| +| **Namespace** | `tailscale` | +| **Helm Chart** | `tailscale/tailscale-operator` | +| **ArgoCD App** | `tailscale-operator` | + +## How It Works + +When you create an Ingress with `ingressClassName: tailscale`: + +1. Operator provisions a Tailscale node for the service +2. Service becomes accessible at `.tail8d86e.ts.net` +3. TLS is handled automatically via Tailscale + +## Limitations + +Services exposed via Tailscale Ingress are **not accessible** from: +- Other Kubernetes pods (they're not Tailscale clients) +- Docker containers on indri + +For pod-to-service communication, use [[routing | Caddy]] (`*.ops.eblu.me`) instead. 
+ +## Related + +- [[tailscale]] - Network configuration +- [[routing]] - Service routing options +- [[apps]] - Application registry diff --git a/docs/reference/services/docs.md b/docs/reference/services/docs.md new file mode 100644 index 0000000..9ab954c --- /dev/null +++ b/docs/reference/services/docs.md @@ -0,0 +1,49 @@ +--- +title: docs +tags: + - service + - documentation +--- + +# Docs (Quartz) + +Documentation site built with [Quartz](https://quartz.jzhao.xyz/) and served via nginx. + +## Quick Reference + +| Property | Value | +|----------|-------| +| **URL** | https://docs.ops.eblu.me | +| **Namespace** | `docs` | +| **Container** | `registry.ops.eblu.me/blumeops/quartz:v1.0.0` | +| **Source** | `docs/` directory in blumeops repo | +| **Build** | Forgejo workflow `build-blumeops.yaml` | + +## Architecture + +1. **Source**: Markdown files in `docs/` with Obsidian-compatible wiki-links +2. **Build**: Forgejo workflow builds Quartz static site on push to main +3. **Release**: Built assets published as Forgejo release attachments +4. **Deploy**: Container downloads release bundle on startup, serves via nginx + +## Release Process + +Documentation is automatically built and released when changes are pushed to main: + +1. Workflow detects changes in `docs/` directory +2. Quartz builds static HTML/CSS/JS +3. Assets uploaded as release attachment +4. ArgoCD deployment updated with new `DOCS_RELEASE_URL` +5. 
Pod restarts and downloads new bundle + +## Configuration + +- **Quartz config**: `quartz.config.ts` +- **Layout**: `quartz.layout.ts` +- **ArgoCD app**: `argocd/apps/docs.yaml` +- **Manifests**: `argocd/manifests/docs/` + +## Related + +- [[argocd]] - Deployment management +- [[forgejo]] - Build workflows diff --git a/docs/tutorials/adding-a-service.md b/docs/tutorials/adding-a-service.md new file mode 100644 index 0000000..b61c2dc --- /dev/null +++ b/docs/tutorials/adding-a-service.md @@ -0,0 +1,255 @@ +--- +title: adding-a-service +tags: + - tutorials + - argocd + - kubernetes +--- + +# Adding an ArgoCD-Managed Service + +> **Audiences:** Contributor, Replicator + +This tutorial walks through deploying a new service to BlumeOps via ArgoCD, including ingress configuration, homepage integration, and observability setup. + +## Prerequisites + +- Access to the [[tailscale | Tailscale]] network +- `kubectl` configured with `minikube-indri` context +- `argocd` CLI installed (via Brewfile: `brew bundle`) + +## Overview + +Adding a service involves: +1. Creating Kubernetes manifests +2. Creating an ArgoCD Application +3. Configuring Tailscale ingress +4. Adding Homepage dashboard entry +5. 
Setting up Grafana dashboards (optional) + +## Step 1: Create Manifests Directory + +Create a directory for your service's Kubernetes manifests: + +``` +argocd/manifests// +├── deployment.yaml +├── service.yaml +├── ingress-tailscale.yaml +└── configmap.yaml # if needed +``` + +### Example Deployment + +```yaml +# argocd/manifests/myservice/deployment.yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: myservice + namespace: myservice +spec: + replicas: 1 + selector: + matchLabels: + app: myservice + template: + metadata: + labels: + app: myservice + spec: + containers: + - name: myservice + image: registry.ops.eblu.me/myservice:v1.0.0 + ports: + - containerPort: 8080 +``` + +### Example Service + +```yaml +# argocd/manifests/myservice/service.yaml +apiVersion: v1 +kind: Service +metadata: + name: myservice + namespace: myservice +spec: + selector: + app: myservice + ports: + - port: 80 + targetPort: 8080 +``` + +## Step 2: Configure Tailscale Ingress + +Create an Ingress to expose the service via Tailscale. See [[tailscale-operator]] for details. + +```yaml +# argocd/manifests/myservice/ingress-tailscale.yaml +apiVersion: networking.k8s.io/v1 +kind: Ingress +metadata: + name: myservice + namespace: myservice +spec: + ingressClassName: tailscale + rules: + - host: myservice + http: + paths: + - path: / + pathType: Prefix + backend: + service: + name: myservice + port: + number: 80 +``` + +This exposes the service at `https://myservice.tail8d86e.ts.net`. 
+ +## Step 3: Add Homepage Annotations + +Add annotations to the Ingress for automatic Homepage dashboard discovery: + +```yaml +metadata: + annotations: + gethomepage.dev/enabled: "true" + gethomepage.dev/name: "My Service" + gethomepage.dev/group: "Apps" + gethomepage.dev/icon: "myservice.png" + gethomepage.dev/description: "Short description" + gethomepage.dev/href: "https://myservice.ops.eblu.me" + gethomepage.dev/pod-selector: "app=myservice" +``` + +Icons use [Dashboard Icons](https://github.com/walkxcode/dashboard-icons) format. + +## Step 4: Create ArgoCD Application + +Create an Application manifest to tell ArgoCD about your service: + +```yaml +# argocd/apps/myservice.yaml +apiVersion: argoproj.io/v1alpha1 +kind: Application +metadata: + name: myservice + namespace: argocd +spec: + project: default + source: + repoURL: ssh://forgejo@indri.tail8d86e.ts.net:2200/eblume/blumeops.git + targetRevision: main + path: argocd/manifests/myservice + destination: + server: https://kubernetes.default.svc + namespace: myservice + syncPolicy: + syncOptions: + - CreateNamespace=true +``` + +## Step 5: Add Caddy Route (Optional) + +If the service needs to be accessible from other pods or containers, add a Caddy route in `ansible/roles/caddy/defaults/main.yml`: + +```yaml +caddy_services: + # ... existing services ... + - name: myservice + upstream: "https://myservice.tail8d86e.ts.net" +``` + +Then run `mise run provision-indri -- --tags caddy` to apply. + +This enables access via `https://myservice.ops.eblu.me`. See [[routing]] for details on when this is needed. 
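The two hostnames from Steps 2 and 5 follow a simple pattern, which can be sketched as a pair of shell helpers. This is only an illustration: the function names `tailnet_url` and `caddy_url` are hypothetical and not part of the repo, and the domains are the ones quoted in this tutorial.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of the repo): build the two URLs a service
# can have, using the domains that appear in this tutorial.
set -euo pipefail

TAILNET_DOMAIN="tail8d86e.ts.net"   # tailnet suffix used in Step 2
CADDY_DOMAIN="ops.eblu.me"          # Caddy domain used in Step 5

# URL served by the Tailscale Ingress (reachable from tailnet clients).
tailnet_url() {
  printf 'https://%s.%s\n' "$1" "$TAILNET_DOMAIN"
}

# URL served through the Caddy route (needed for pod or container access).
caddy_url() {
  printf 'https://%s.%s\n' "$1" "$CADDY_DOMAIN"
}

tailnet_url myservice
caddy_url myservice
```

The first URL exists as soon as the Ingress is synced; the second only exists once the Caddy route has been provisioned.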
+ +## Step 6: Deploy + +### Testing on a Feature Branch + +For new services, point ArgoCD at your feature branch first: + +```bash +# Sync the apps application to pick up your new Application +argocd app sync apps + +# Point your app at the feature branch +argocd app set myservice --revision feature/your-branch +argocd app sync myservice +``` + +### Verify Deployment + +```bash +kubectl --context=minikube-indri -n myservice get pods +kubectl --context=minikube-indri -n myservice logs -f deployment/myservice +``` + +### After PR Merge + +Reset to main branch: +```bash +argocd app set myservice --revision main +argocd app sync myservice +``` + +## Step 7: Add Observability (Optional) + +### Prometheus Metrics + +If your service exposes Prometheus metrics, add scrape annotations: + +```yaml +# In deployment.yaml pod template +metadata: + annotations: + prometheus.io/scrape: "true" + prometheus.io/port: "8080" + prometheus.io/path: "/metrics" +``` + +### Grafana Dashboard + +Create a ConfigMap in `argocd/manifests/grafana-config/dashboards/`: + +```yaml +apiVersion: v1 +kind: ConfigMap +metadata: + name: myservice-dashboard + namespace: monitoring + labels: + grafana_dashboard: "1" + annotations: + grafana_folder: "Services" +data: + myservice.json: | + { ... dashboard JSON ... } +``` + +See [[grafana]] for dashboard provisioning details. 
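The feature-branch flow from Step 6 can be wrapped in a small script so the three `argocd` commands always run in the same order. This is a minimal sketch, not something the repo ships: `run` and `argocd_test_revision` are hypothetical names, and the script defaults to a dry run that only prints the commands.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): preview or run the Step 6 flow.
set -euo pipefail

# Run a command, or only print it when DRY_RUN=1 (the default here).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Pick up the new Application, point it at a revision, then sync it.
argocd_test_revision() {
  local app="$1" revision="$2"
  run argocd app sync apps
  run argocd app set "$app" --revision "$revision"
  run argocd app sync "$app"
}

argocd_test_revision myservice feature/your-branch
```

Setting `DRY_RUN=0` executes the commands instead of printing them, which assumes the `argocd` CLI is installed and logged in.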
+ +## Checklist + +- [ ] Manifests created in `argocd/manifests//` +- [ ] ArgoCD Application created in `argocd/apps/` +- [ ] Tailscale Ingress configured +- [ ] Homepage annotations added +- [ ] Caddy route added (if needed for pod access) +- [ ] Feature branch tested via ArgoCD +- [ ] Metrics/dashboard configured (if applicable) +- [ ] PR created and reviewed +- [ ] Reset to main after merge + +## Related + +- [[argocd]] - GitOps platform +- [[tailscale-operator]] - Kubernetes ingress +- [[routing]] - Service routing options +- [[grafana]] - Dashboard configuration +- [[apps]] - Application registry diff --git a/docs/tutorials/ai-assistance-guide.md b/docs/tutorials/ai-assistance-guide.md new file mode 100644 index 0000000..3c49b60 --- /dev/null +++ b/docs/tutorials/ai-assistance-guide.md @@ -0,0 +1,121 @@ +--- +title: ai-assistance-guide +tags: + - tutorials + - ai +--- + +# AI Assistance Guide + +> **Audiences:** AI, Owner + +This guide provides context for AI agents (like Claude Code) assisting with BlumeOps operations, and helps Erich understand how to work effectively with AI assistance. + +## Critical Rules + +These are non-negotiable for AI agents working in this repo: + +1. **Always use `--context=minikube-indri` with kubectl** - Work contexts exist that must never be touched +2. **Run `mise run zk-docs` at session start** - Review current infrastructure state +3. **Never commit secrets** - The repo is public at github.com/eblume/blumeops +4. **Wait for user review before deploying** - Create PRs, don't auto-deploy +5. **Never merge PRs without explicit request** - The user merges after review + +Full rules are in the repo's `CLAUDE.md`. + +## Workflow Conventions + +### Feature Branches + +All work happens on feature branches: +```bash +git checkout main && git pull +git checkout -b feature/descriptive-name +# ... make changes ... 
+git commit -m "Description" +``` + +### Pull Requests + +Use the forge's `tea` CLI: +```bash +tea pr create --title "Title" --description "$(cat <<'EOF' +## Summary +- Change 1 +- Change 2 + +## Deployment and Testing +- [ ] Test step +EOF +)" +``` + +### Changelog Fragments + +Add a fragment for user-visible changes: +```bash +echo "Description" > docs/changelog.d/branch-name.feature.md +``` + +Types: `feature`, `bugfix`, `infra`, `doc`, `misc` + +## Service Locations + +Understanding where services run helps target changes correctly: + +| Location | Services | Management | +|----------|----------|------------| +| [[indri]] (native) | Forgejo, Zot, Jellyfin, Caddy | Ansible | +| [[cluster | Kubernetes]] | Everything else | ArgoCD | + +## Mise Tasks + +BlumeOps operations are driven by mise tasks. Run `mise tasks` to list all available tasks. + +| Task | When to Use | +|------|-------------| +| `zk-docs` | At session start - review infrastructure documentation | +| `provision-indri` | Deploy changes to [[indri]]-hosted services via Ansible | +| `indri-services-check` | After deployments - verify all services are healthy | +| `pr-comments` | Check unresolved PR comments during review | +| `blumeops-tasks` | Find pending tasks from Todoist | +| `container-list` | View available container images and tags | +| `container-tag-and-release` | Release a new container image version | +| `dns-preview` | Preview DNS changes before applying | +| `dns-up` | Apply DNS changes via Pulumi | +| `tailnet-preview` | Preview Tailscale ACL changes | +| `tailnet-up` | Apply Tailscale ACL changes via Pulumi | +| `doc-links` | Validate wiki-links in documentation | +| `doc-titles` | Check for duplicate doc titles | +| `doc-filenames` | Check for duplicate doc filenames | +| `indri-runner-logs` | View Forgejo workflow logs from local runner | + +For ArgoCD operations, use the `argocd` CLI directly: +- `argocd app diff ` - Preview changes +- `argocd app sync ` - Deploy changes + +## 
Reference Navigation + +For AI agents building context: + +- [[reference/index|Reference Index]] - Entry point for technical details +- [[hosts|Host Inventory]] - What hardware exists +- [[apps|ArgoCD Apps]] - What's deployed in Kubernetes +- [[routing|Routing]] - How services are exposed + +## Credential Access + +Credentials live in 1Password. Never retrieve them directly - use existing patterns: +- Ansible `pre_tasks` gather secrets at playbook start +- [[external-secrets|External Secrets]] syncs to Kubernetes +- Scripts use `op` CLI with user biometric prompts + +## Common Pitfalls + +| Pitfall | Correct Approach | +|---------|------------------| +| Missing kubectl context | Always add `--context=minikube-indri` | +| Deploying without review | Create PR first, wait for user approval | +| Re-explaining reference material | Link to reference cards instead | +| Committing to main | Use feature branches | +| Guessing at credentials | Ask user or check 1Password patterns | diff --git a/docs/tutorials/contributing.md b/docs/tutorials/contributing.md new file mode 100644 index 0000000..d65ed23 --- /dev/null +++ b/docs/tutorials/contributing.md @@ -0,0 +1,177 @@ +--- +title: contributing +tags: + - tutorials + - contributing +--- + +# Your First Contribution + +> **Audiences:** Contributor + +This tutorial walks through making your first contribution to BlumeOps - from understanding the codebase to submitting a pull request. + +## Prerequisites + +Before contributing, you'll need: +- Access to the [[tailscale|Tailscale]] network (request from Erich) +- SSH key added to [[forgejo|Forgejo]] (https://forge.ops.eblu.me) +- Development tools installed (see below) + +## Tooling Setup + +The repo includes a `Brewfile` and `mise.toml` for easy setup, but these are optional - install the tools however you prefer. 
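However the tools get installed, a quick check that they are on `PATH` can prevent a confusing failure later in the workflow. The following is a minimal sketch under stated assumptions: `missing_tools` is a hypothetical helper, not part of the repo, and the tool names are the ones mentioned in the Brewfile and in this tutorial.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): report which CLI tools are
# missing from PATH before starting work.
set -euo pipefail

# Print the name of every argument that cannot be found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Tool names taken from the Brewfile comments and this tutorial.
missing="$(missing_tools tea argocd mise pre-commit)"
if [ -n "$missing" ]; then
  printf 'Missing tools:\n%s\n' "$missing" >&2
fi
```

The script only reports; it leaves the choice of installer (Homebrew, mise, or anything else) to the contributor.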
+ +### Required Tools + +- `tea` - Gitea/Forgejo CLI for creating PRs +- `argocd` - ArgoCD CLI for deployments +- `pre-commit` - Git hooks for validation + +### Using Brewfile (Optional) + +```bash +brew bundle # installs tea, argocd, mise, etc. +``` + +### Using Mise (Optional) + +Mise manages language toolchains and runs tasks: +```bash +mise install # installs Python, Node.js, etc. from mise.toml +``` + +### Pre-commit Hooks + +Pre-commit hooks validate changes on `git commit`: +```bash +pre-commit install +pre-commit run --all-files # verify setup +``` + +All hooks should pass on a fresh clone. + +## Understanding the Codebase + +BlumeOps manages infrastructure through three main systems: + +| System | Directory | What It Manages | +|--------|-----------|-----------------| +| **Ansible** | `ansible/` | Services running directly on [[indri]] | +| **ArgoCD** | `argocd/` | Kubernetes services in the [[cluster]] | +| **Pulumi** | `pulumi/` | [[tailscale|Tailscale]] ACLs and DNS | + +Most contributions involve either Ansible roles or ArgoCD manifests. + +## The Contribution Workflow + +### 1. Clone and Branch + +```bash +git clone ssh://git@forge.ops.eblu.me:2222/eblume/blumeops.git +cd blumeops +git checkout -b feature/your-change-name +``` + +### 2. Make Your Changes + +Depending on what you're changing: + +**For Kubernetes services:** +- Edit manifests in `argocd/manifests//` +- Or create new Application in `argocd/apps/` +- For new apps, set `targetRevision` to your feature branch for testing +- For existing apps, you'll need to temporarily change the revision via `argocd app set` + +**For Indri services:** +- Edit or create roles in `ansible/roles/` +- Update `ansible/playbooks/indri.yml` if adding a role + +**For documentation:** +- Edit files in `docs/` +- Add changelog fragment (see below) + +### 3. 
Add a Changelog Fragment + +For user-visible changes: +```bash +echo "Description of your change" > docs/changelog.d/your-branch.feature.md +``` + +Fragment types: +- `.feature.md` - New functionality +- `.bugfix.md` - Bug fixes +- `.infra.md` - Infrastructure changes +- `.doc.md` - Documentation +- `.misc.md` - Other + +### 4. Test Your Changes + +**Before pushing, always test:** + +For Kubernetes changes: +```bash +# Preview what will change +argocd app diff +``` + +For DNS changes: +```bash +mise run dns-preview +``` + +### 5. Commit and Push + +```bash +git add +git commit -m "Brief description of change" +git push -u origin feature/your-change-name +``` + +### 6. Create a Pull Request + +```bash +tea pr create --title "Your PR Title" --description "$(cat <<'EOF' +## Summary +- What you changed +- Why you changed it + +## Deployment and Testing +- [ ] Tested locally / dry run +- [ ] Ready for ArgoCD sync / Ansible apply + +EOF +)" +``` + +### 7. Wait for Review + +Erich will review your PR and may leave comments. Check for feedback: +```bash +mise run pr-comments +``` + +Address each comment, then Erich will: +1. Approve the changes +2. Deploy them (you don't need to do this) +3. Merge the PR + +## Example: Adding a Homepage Link + +A simple first contribution - adding a service to the Homepage dashboard (go.ops.eblu.me): + +1. Find the service's Ingress in `argocd/manifests//` +2. Add homepage annotations: +```yaml +annotations: + gethomepage.dev/enabled: "true" + gethomepage.dev/name: "Service Name" + gethomepage.dev/group: "Apps" + gethomepage.dev/icon: "service.png" +``` +3. 
Create PR and wait for sync + +## Related + +- [[adding-a-service]] - Full tutorial on deploying a new service +- [[replicating-blumeops]] - If you want to build your own instead diff --git a/docs/tutorials/exploring-the-docs.md b/docs/tutorials/exploring-the-docs.md new file mode 100644 index 0000000..37b4c3d --- /dev/null +++ b/docs/tutorials/exploring-the-docs.md @@ -0,0 +1,80 @@ +--- +title: exploring-the-docs +tags: + - tutorials + - getting-started +--- + +# Exploring the Documentation + +> **Audiences:** All (Owner, AI, Reader, Contributor, Replicator) + +This guide explains how the BlumeOps documentation is organized and how to find what you need. + +## Documentation Structure + +The docs follow the [Diataxis](https://diataxis.fr/) framework: + +| Section | Purpose | When to Use | +|---------|---------|-------------| +| **[[tutorials/index | Tutorials]]** | Learning-oriented | "I'm new and want to understand" | +| **[[reference/index | Reference]]** | Information-oriented | "I need specific technical details" | +| **How-to** (planned) | Task-oriented | "I need to do X" | +| **Explanation** (planned) | Understanding-oriented | "I want to understand why" | + +## Quick Paths by Audience + +### For Erich (Owner) + +You probably want quick access to operational details: +- [[reference/index|Reference]] has service URLs, commands, and config locations +- The `zk-docs` mise task still works for legacy zettelkasten access +- [[ai-assistance-guide]] explains how to work effectively with Claude + +### For Claude/AI Agents + +Context for effective assistance: +- Read [[ai-assistance-guide]] for operational conventions +- [[reference/index|Reference]] has the technical specifics you'll need +- The repo's `CLAUDE.md` has critical rules (especially the kubectl context requirement) + +### For External Readers + +Understanding what this is: +- [[reference/index|Reference]] shows what's actually running +- Browse service pages to see specific implementations +- The repo's 
README has project context + +### For Contributors + +Getting started with changes: +- [[contributing]] walks through the workflow +- [[reference/index|Reference]] tells you where things live + +### For Replicators + +Replicators are people who want to build their own similar homelab GitOps setup, using BlumeOps as inspiration. + +- [[replicating-blumeops]] provides the overview +- The `replication/` tutorials go deep on components +- Reference pages show specific configuration choices + +## Using Wiki Links + +Documentation uses `[[wiki-links]]` for cross-references: +- `[[service-name]]` links to a reference page +- `[[folder/page]]` links to nested pages +- `[[page | Display Text]]` customizes the link text + +When reading on the web (docs.ops.eblu.me), these render as clickable links. The backlinks panel shows what references each page. + +Pre-commit hooks automatically validate that all wiki-links point to existing files and that link targets are unambiguous. + +## Legacy Content + +The `docs/zk/` directory contains zettelkasten cards from before the restructuring. These are read-only reference - new content goes in the structured sections. The cards will eventually be migrated or archived. + +To view legacy cards: +```bash +mise run zk-docs +``` diff --git a/docs/tutorials/index.md b/docs/tutorials/index.md new file mode 100644 index 0000000..776e42b --- /dev/null +++ b/docs/tutorials/index.md @@ -0,0 +1,48 @@ +--- +title: tutorials +tags: + - tutorials +--- + +# Tutorials + +Learning-oriented guides for understanding and working with BlumeOps. 
+ +## Audience Guide + +Each tutorial indicates which audiences it serves: + +| Icon | Audience | Description | +|------|----------|-------------| +| **Owner** | Erich | Quick recall and operational refreshers | +| **AI** | Claude/AI agents | Context for AI-assisted operations | +| **Reader** | External readers | Understanding what BlumeOps is | +| **Contributor** | Operators/contributors | Helping with BlumeOps development | +| **Replicator** | Replicators | Building your own similar setup | + +## Getting Started + +| Tutorial | Audiences | Description | +|----------|-----------|-------------| +| [[exploring-the-docs]] | All | How to navigate and use this documentation | +| [[ai-assistance-guide]] | AI, Owner | Context for effective AI-assisted operations | + +## Contributing + +| Tutorial | Audiences | Description | +|----------|-----------|-------------| +| [[contributing]] | Contributor | Your first contribution to BlumeOps | +| [[adding-a-service]] | Contributor, Replicator | Deploy a new service via ArgoCD | + +## Replication + +For those building their own homelab GitOps setup. 
+ +| Tutorial | Audiences | Description | +|----------|-----------|-------------| +| [[replicating-blumeops]] | Replicator | Overview: building a similar environment | +| [[tutorials/replication/tailscale-setup | Tailscale Setup]] | Replicator | Setting up Tailscale networking | +| [[tutorials/replication/core-services | Core Services]] | Replicator | Forgejo and container registry | +| [[tutorials/replication/kubernetes-bootstrap | Kubernetes Bootstrap]] | Replicator | Bootstrapping a Kubernetes cluster | +| [[tutorials/replication/argocd-config | ArgoCD Config]] | Replicator | Configuring GitOps with ArgoCD | +| [[tutorials/replication/observability-stack | Observability Stack]] | Replicator | Metrics, logs, and dashboards | diff --git a/docs/tutorials/replicating-blumeops.md b/docs/tutorials/replicating-blumeops.md new file mode 100644 index 0000000..6ddb5c5 --- /dev/null +++ b/docs/tutorials/replicating-blumeops.md @@ -0,0 +1,139 @@ +--- +title: replicating-blumeops +tags: + - tutorials + - replication +--- + +# Replicating BlumeOps + +> **Audiences:** Replicator + +This tutorial provides a roadmap for building your own homelab GitOps environment inspired by BlumeOps. It links to detailed component tutorials for each major piece. + +## What You'll Build + +By following this guide, you'll have: +- A secure mesh network connecting your devices +- A Kubernetes cluster for running containerized services +- GitOps-driven deployments via ArgoCD +- Observability with metrics, logs, and dashboards +- Backup and disaster recovery capabilities + +## Hardware Requirements + +BlumeOps runs on modest hardware.
At minimum: + +| Component | BlumeOps Uses | Minimum Alternative | +|-----------|---------------|---------------------| +| **Server** | Mac Mini M1 | Any machine with sufficient RAM (16GB recommended) | +| **NAS** | Synology DS920+ | USB drive or second machine | +| **Workstation** | MacBook Air M4 | Whatever you use daily | + +You can start with a single machine and add storage later. + +## The Journey + +### Phase 1: Networking Foundation + +Before deploying services, establish secure connectivity. + +**[[tutorials/replication/tailscale-setup|Setting Up Tailscale]]** +- Create a tailnet and connect your devices +- Configure ACLs for service access +- Set up MagicDNS for convenient naming + +This replaces: traditional VPNs, port forwarding, dynamic DNS + +### Phase 2: Core Services + +Bootstrap the essential services that everything else depends on. + +**[[tutorials/replication/core-services | Core Services Setup]]** +- Set up [[forgejo]] for git hosting and CI/CD +- Optionally set up [[zot]] container registry +- Configure SSH access and deploy keys + +Forgejo is central to GitOps - it's where your infrastructure definitions live and where CI/CD workflows run. + +### Phase 3: Kubernetes Cluster + +A cluster for running containerized workloads. + +**[[tutorials/replication/kubernetes-bootstrap|Bootstrapping Kubernetes]]** +- Install minikube (or k3s, kind, etc.) +- Configure persistent storage +- Expose the API securely via Tailscale + +BlumeOps uses minikube for simplicity, but the patterns apply to any distribution. + +### Phase 4: GitOps with ArgoCD + +Declarative, git-driven deployments. + +**[[tutorials/replication/argocd-config|Configuring ArgoCD]]** +- Install ArgoCD in your cluster +- Connect to your git repository +- Deploy your first application +- Set up the app-of-apps pattern + +This is the heart of GitOps - changes in git automatically sync to your cluster. + +### Phase 5: Observability Stack + +Know what's happening in your infrastructure. 
+ +**[[tutorials/replication/observability-stack|Building the Observability Stack]]** +- Deploy Prometheus for metrics +- Deploy Loki for logs +- Deploy Grafana for dashboards +- Configure Alloy for collection + +Without observability, you're flying blind. + +### Phase 6: Your First Services + +With the foundation in place, deploy actual workloads. BlumeOps runs: +- [[miniflux]] - RSS reader +- [[jellyfin]] - Media server +- [[immich]] - Photo management +- [[navidrome]] - Music streaming +- [[docs]] - Documentation site (Quartz) + +Pick what matters to you. Each service follows similar patterns: +1. Create Kubernetes manifests +2. Create ArgoCD Application +3. Configure ingress routing +4. Sync and verify + +### Phase 7: Backups and Resilience + +Protect your data. + +- Set up [[borgmatic]] for backup automation +- Configure NAS as backup target +- Test restore procedures +- Document disaster recovery + +## Alternative Approaches + +BlumeOps makes specific choices that may not suit everyone: + +| BlumeOps Choice | Alternative | +|-----------------|-------------| +| macOS server | Linux server (more common) | +| Minikube | k3s, kind, or managed K8s | +| Tailscale | WireGuard, Nebula | +| ArgoCD | Flux, manual kubectl | +| Ansible | NixOS, Docker Compose | + +The principles (GitOps, IaC, observability) matter more than specific tools. + +## Getting Started + +Begin with [[tutorials/replication/tailscale-setup]] - networking is the foundation everything else builds on.
+ +## Related + +- [[reference/index]] - See BlumeOps' specific configurations +- [[contributing]] - Help improve BlumeOps instead diff --git a/docs/tutorials/replication/argocd-config.md b/docs/tutorials/replication/argocd-config.md new file mode 100644 index 0000000..2dcbe85 --- /dev/null +++ b/docs/tutorials/replication/argocd-config.md @@ -0,0 +1,221 @@ +--- +title: argocd-config +tags: + - tutorials + - replication + - argocd +--- + +# Configuring ArgoCD + +> **Audiences:** Replicator + +This tutorial walks through installing ArgoCD and establishing GitOps-driven deployments for your homelab. + +## What is GitOps? + +GitOps means your git repository is the source of truth for infrastructure: +- Infrastructure state is defined in git +- Changes happen through commits and pull requests +- A controller (ArgoCD) syncs git state to the cluster +- Drift is detected and can be corrected automatically + +For BlumeOps specifics, see [[argocd|ArgoCD Reference]]. + +## Step 1: Install ArgoCD + +```bash +kubectl create namespace argocd +kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml +``` + +Wait for pods to be ready: +```bash +kubectl -n argocd get pods -w +``` + +## Step 2: Access the UI + +### Get the Initial Password + +```bash +kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d +``` + +### Expose the Service + +For Tailscale access: +```bash +tailscale serve --bg --https 8443 https+insecure://localhost:$(kubectl -n argocd get svc argocd-server -o jsonpath='{.spec.ports[?(@.name=="https")].port}') +``` + +Or create a Tailscale Ingress in Kubernetes (see [[tailscale-operator]]). + +Access at `https://your-server.tailnet.ts.net:8443` + +### Install the CLI + +BlumeOps includes `argocd` in its Brewfile (`brew bundle`), or install it however you prefer. 
+ +Login: +```bash +argocd login your-server.tailnet.ts.net:8443 +``` + +## Step 3: Connect Your Git Repository + +Create a repository credential: + +```bash +# For SSH +argocd repo add git@github.com:you/your-repo.git \ + --ssh-private-key-path ~/.ssh/id_ed25519 + +# For HTTPS +argocd repo add https://github.com/you/your-repo.git \ + --username you \ + --password your-token +``` + +## Step 4: Create Your First Application + +Create a directory in your repo: +``` +your-repo/ +└── apps/ + └── hello-world/ + ├── deployment.yaml + └── service.yaml +``` + +With a simple deployment: +```yaml +# deployment.yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: hello-world +spec: + replicas: 1 + selector: + matchLabels: + app: hello-world + template: + metadata: + labels: + app: hello-world + spec: + containers: + - name: hello + image: nginx:alpine + ports: + - containerPort: 80 +``` + +Create the ArgoCD Application: +```bash +argocd app create hello-world \ + --repo git@github.com:you/your-repo.git \ + --path apps/hello-world \ + --dest-server https://kubernetes.default.svc \ + --dest-namespace default +``` + +## Step 5: Sync and Verify + +```bash +# See what will be deployed +argocd app diff hello-world + +# Deploy it +argocd app sync hello-world + +# Check status +argocd app get hello-world +``` + +The pods should now be running: +```bash +kubectl get pods -l app=hello-world +``` + +## Step 6: App of Apps Pattern + +For managing multiple applications, use the "app of apps" pattern: + +``` +your-repo/ +├── argocd/ +│ ├── apps/ # Application definitions +│ │ ├── hello-world.yaml +│ │ └── another-app.yaml +│ └── manifests/ # Actual Kubernetes manifests +│ ├── hello-world/ +│ └── another-app/ +``` + +Create a root Application that manages other Applications: +```yaml +# argocd/apps/apps.yaml +apiVersion: argoproj.io/v1alpha1 +kind: Application +metadata: + name: apps + namespace: argocd +spec: + project: default + source: + repoURL: 
git@github.com:you/your-repo.git + targetRevision: main + path: argocd/apps + destination: + server: https://kubernetes.default.svc + namespace: argocd + syncPolicy: + automated: + prune: true +``` + +Now adding a new application is just creating a YAML file. + +## Step 7: Configure Sync Policies + +| Policy | When to Use | +|--------|-------------| +| Manual sync | Production, explicit control | +| Auto sync | Development, or trusted workloads | +| Auto prune | Remove resources deleted from git | +| Self heal | Revert manual kubectl changes | + +BlumeOps uses manual sync for workloads, auto sync only for the `apps` Application itself. + +## What You Now Have + +- GitOps workflow for deployments +- UI for visualizing application state +- Automatic drift detection +- Declarative application management + +## Next Steps + +- [[tutorials/replication/observability-stack | Build observability]] - Monitor your deployments +- Add more applications to your repo +- Set up notifications for sync failures + +## BlumeOps Specifics + +BlumeOps' ArgoCD configuration includes: +- SSH connection to [[forgejo]] git server +- Manual sync policy for all workloads +- Separate manifests and apps directories + +See [[argocd|ArgoCD Reference]] and [[apps|Apps Reference]] for full details.
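
Step 6 notes that adding an application is just creating a file under `argocd/apps/`. The following is a minimal sketch of generating such a definition, assuming the repository layout from Step 6 and manual sync for workloads. The helper name `child_application` and its parameters are hypothetical, not part of ArgoCD or BlumeOps. The output is JSON, which is also valid YAML, so only the standard library is needed.

```python
import json


def child_application(name: str, repo_url: str, namespace: str = "default") -> dict:
    """Build an ArgoCD Application dict for one workload (hypothetical helper)."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": repo_url,
                "targetRevision": "main",
                # Assumes manifests live under argocd/manifests/<name>/
                "path": f"argocd/manifests/{name}",
            },
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": namespace,
            },
            # No syncPolicy.automated block: the workload stays on manual sync.
        },
    }


app = child_application("hello-world", "git@github.com:you/your-repo.git")
# JSON is valid YAML, so this text can be saved as argocd/apps/hello-world.yaml.
manifest = json.dumps(app, indent=2)
print(manifest)
```

Committing the generated file lets the root `apps` Application pick it up on its next sync; the workload itself still waits for a manual `argocd app sync`.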
+ +## Troubleshooting + +| Problem | Solution | +|---------|----------| +| Sync failed | Check `argocd app get APPNAME` for error details | +| Can't connect to repo | Verify credentials, check SSH key permissions | +| Resources not appearing | Ensure path in Application matches repo structure | +| Out of sync but no diff | Check for ignored differences in app config | diff --git a/docs/tutorials/replication/core-services.md b/docs/tutorials/replication/core-services.md new file mode 100644 index 0000000..3dc5847 --- /dev/null +++ b/docs/tutorials/replication/core-services.md @@ -0,0 +1,113 @@ +--- +title: core-services +tags: + - tutorials + - replication + - forgejo +--- + +# Core Services Setup + +> **Audiences:** Replicator + +This tutorial walks through setting up the foundational services that your GitOps infrastructure depends on: a git forge and optionally a container registry. + +## Why Core Services First? + +Before Kubernetes and ArgoCD, you need somewhere to store your infrastructure definitions. [[forgejo]] provides: +- Git hosting for your GitOps repository +- CI/CD workflows for building and deploying +- A web interface for code review and PRs + +The [[zot]] container registry is optional but useful for hosting your own container images. + +## Step 1: Install Forgejo + +Forgejo runs directly on your server (not in Kubernetes) because Kubernetes depends on it. + +### Using Ansible (BlumeOps Approach) + +BlumeOps manages Forgejo via an Ansible role. See [[reference/ansible/roles | Ansible Roles]]. + +### Manual Installation + +1. Download Forgejo from [forgejo.org](https://forgejo.org/download/) +2. Create a service user and directories +3. Configure with `app.ini` +4.
Set up as a system service + +Key configuration points: +- SSH on a non-standard port (e.g., 2222) to avoid conflicts +- Database (SQLite works fine for personal use) +- Domain and URL settings for your Tailscale hostname + +## Step 2: Configure SSH Access + +Set up SSH for git operations: + +```bash +# Add your SSH key to Forgejo via the web UI +# Then test access: +ssh -T git@your-server.tailnet.ts.net -p 2222 +``` + +## Step 3: Create Your GitOps Repository + +1. Create a new repository in Forgejo (e.g., `infrastructure` or `homelab`) +2. Initialize the standard directory structure: + +``` +your-repo/ +├── ansible/ # Host configuration +│ ├── playbooks/ +│ └── roles/ +├── argocd/ # Kubernetes GitOps +│ ├── apps/ # ArgoCD Applications +│ └── manifests/ # K8s manifests per service +├── pulumi/ # IaC for Tailscale, DNS +└── docs/ # Documentation +``` + +3. Push your initial commit + +## Step 4: Set Up CI/CD Runner (Optional) + +Forgejo Actions runs workflows defined in `.forgejo/workflows/`. To use it: + +1. Register a runner on your server +2. Configure runner to access your build tools +3. Create workflow files for builds and deployments + +BlumeOps runs a Forgejo runner in Kubernetes - see [[forgejo]] for details. + +## Step 5: Container Registry (Optional) + +If you'll build custom container images, set up [[zot]]: + +1. Install Zot on your server +2. Configure authentication +3. Set up TLS (via Caddy or similar) + +For getting started, you can skip this and use public registries. 
+ +## What You Now Have + +- Git hosting for infrastructure code +- SSH access for git operations +- Foundation for CI/CD workflows +- Optionally, a private container registry + +## Next Steps + +- [[tutorials/replication/kubernetes-bootstrap | Bootstrap Kubernetes]] - Now that you have a git repo, set up your cluster +- Configure Forgejo webhooks for ArgoCD (after ArgoCD is running) + +## BlumeOps Specifics + +BlumeOps' Forgejo setup includes: +- Ansible role for installation and updates +- SSH on port 2222, proxied via Caddy +- Integration with ArgoCD via deploy keys +- Forgejo runner in Kubernetes for CI/CD + +See [[forgejo]] and [[zot]] for full details. diff --git a/docs/tutorials/replication/kubernetes-bootstrap.md b/docs/tutorials/replication/kubernetes-bootstrap.md new file mode 100644 index 0000000..b92c025 --- /dev/null +++ b/docs/tutorials/replication/kubernetes-bootstrap.md @@ -0,0 +1,170 @@ +--- +title: kubernetes-bootstrap +tags: + - tutorials + - replication + - kubernetes +--- + +# Bootstrapping Kubernetes + +> **Audiences:** Replicator + +This tutorial walks through setting up a Kubernetes cluster for your homelab, making it accessible via Tailscale. + +## Choosing a Distribution + +For homelab use, lightweight distributions work well: + +| Distribution | Best For | BlumeOps Uses | +|--------------|----------|---------------| +| **Minikube** | Single-node, macOS | Yes | +| **k3s** | Single-node, Linux | - | +| **kind** | Local development | - | +| **kubeadm** | Multi-node clusters | - | + +This tutorial uses minikube, but principles apply broadly. + +For BlumeOps specifics, see [[cluster|Cluster Reference]]. 
+ +## Step 1: Install Minikube + +### macOS + +```bash +brew install minikube +``` + +### Linux + +```bash +curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64 +sudo install minikube-linux-amd64 /usr/local/bin/minikube +``` + +## Step 2: Create the Cluster + +```bash +minikube start \ + --driver=docker \ + --cpus=4 \ + --memory=8g \ + --disk-size=100g \ + --apiserver-names=k8s.your-tailnet.ts.net,$(hostname) \ + --listen-address=0.0.0.0 +``` + +Key flags: +- `--apiserver-names` - Include your Tailscale hostname for remote access +- `--listen-address=0.0.0.0` - Allow connections from other machines + +## Step 3: Verify the Cluster + +```bash +kubectl get nodes +# Should show your node as Ready + +kubectl get pods -A +# Should show system pods running +``` + +## Step 4: Expose via Tailscale + +To access the cluster from other Tailscale devices, expose the API server: + +### Option A: Tailscale Serve (Simple) + +```bash +tailscale serve --bg --tcp 6443 tcp://localhost:$(minikube ip --format '{{.Port}}') +``` + +### Option B: Tailscale Kubernetes Operator (Advanced) + +For production-like setup, install the Tailscale operator which manages ingress automatically. + +BlumeOps uses TCP passthrough via Caddy - see [[routing|Routing Reference]]. + +## Step 5: Configure Remote Access + +On your workstation, add a context for the remote cluster: + +```bash +# Copy the CA cert from the server +scp server:~/.minikube/ca.crt ~/.kube/minikube-ca.crt + +# Add the cluster +kubectl config set-cluster minikube-remote \ + --server=https://k8s.your-tailnet.ts.net:6443 \ + --certificate-authority=$HOME/.kube/minikube-ca.crt + +# Add credentials (copy from server's ~/.kube/config) +kubectl config set-credentials minikube-remote \ + --client-certificate=... \ + --client-key=... 
+ +# Add context +kubectl config set-context minikube-remote \ + --cluster=minikube-remote \ + --user=minikube-remote + +# Test +kubectl --context=minikube-remote get nodes +``` + +## Step 6: Storage Configuration + +For persistent workloads, configure storage: + +### Local Path Provisioner (Simple) + +```bash +kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml +kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}' +``` + +### NFS for Shared Storage + +If you have a NAS: +```yaml +apiVersion: v1 +kind: PersistentVolume +metadata: + name: nfs-share +spec: + capacity: + storage: 1Ti + accessModes: + - ReadWriteMany + nfs: + server: nas.your-tailnet.ts.net + path: /volume1/k8s +``` + +## What You Now Have + +- A Kubernetes cluster running on your server +- Remote access via Tailscale +- Storage for persistent workloads + +## Next Steps + +- [[tutorials/replication/argocd-config | Configure ArgoCD]] - GitOps deployments +- Install essential addons (ingress controller, cert-manager) + +## BlumeOps Specifics + +BlumeOps' cluster configuration includes: +- Tailscale operator for automatic ingress +- NFS mounts from [[sifaka]] for media storage +- CloudNativePG for PostgreSQL databases + +See [[cluster|Cluster Reference]] and [[apps|Apps Reference]] for full details.
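
Remote access as configured in Step 5 tends to fail in one of two ways: the API port is not reachable at all, or the connection succeeds and TLS verification fails. The following is a minimal sketch for telling the two apart, using only Python's standard library. The function name `can_connect` is hypothetical, and the hostname in the usage note is the placeholder from Step 2.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Calling `can_connect("k8s.your-tailnet.ts.net", 6443)` from the workstation gives a quick first signal. If it returns `False`, look at the listen address and the Tailscale exposure from Step 4. If it returns `True` but `kubectl` still fails, the certificate names or the copied CA file are the more likely cause.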
+ +## Troubleshooting + +| Problem | Solution | +|---------|----------| +| Can't connect remotely | Check `--apiserver-names` includes Tailscale hostname | +| Pods stuck pending | Check storage class is available | +| Connection refused | Verify `--listen-address=0.0.0.0` was set | +| Certificate errors | Ensure CA cert matches server's | diff --git a/docs/tutorials/replication/observability-stack.md b/docs/tutorials/replication/observability-stack.md new file mode 100644 index 0000000..4c3adb2 --- /dev/null +++ b/docs/tutorials/replication/observability-stack.md @@ -0,0 +1,231 @@ +--- +title: observability-stack +tags: + - tutorials + - replication + - observability +--- + +# Building the Observability Stack + +> **Audiences:** Replicator + +This tutorial walks through deploying metrics, logs, and dashboards for your homelab - because you can't fix what you can't see. + +## The Stack + +A complete observability solution has three pillars: + +| Component | Purpose | BlumeOps Uses | +|-----------|---------|---------------| +| **Metrics** | Numeric measurements over time | [[prometheus]] | +| **Logs** | Text output from applications | [[loki]] | +| **Dashboards** | Visualization and alerting | [[grafana]] | +| **Collection** | Gathering and forwarding data | [[alloy]] | + +For BlumeOps specifics, see [[observability|Observability Reference]]. + +## Step 1: Create Monitoring Namespace + +```bash +kubectl create namespace monitoring +``` + +## Step 2: Deploy Prometheus + +Prometheus collects and stores metrics. 
+ +### Using Helm + +```bash +helm repo add prometheus-community https://prometheus-community.github.io/helm-charts +helm install prometheus prometheus-community/prometheus \ + --namespace monitoring \ + --set server.persistentVolume.size=10Gi +``` + +### Or via ArgoCD + +Create an Application pointing to a values file in your repo: +```yaml +apiVersion: argoproj.io/v1alpha1 +kind: Application +metadata: + name: prometheus + namespace: argocd +spec: + project: default + source: + repoURL: https://prometheus-community.github.io/helm-charts + chart: prometheus + targetRevision: 25.0.0 + helm: + values: | + server: + persistentVolume: + size: 10Gi + destination: + server: https://kubernetes.default.svc + namespace: monitoring +``` + +### Verify + +```bash +kubectl -n monitoring get pods -l app.kubernetes.io/name=prometheus +``` + +## Step 3: Deploy Loki + +Loki aggregates logs (like Prometheus but for logs). + +```bash +helm repo add grafana https://grafana.github.io/helm-charts +helm install loki grafana/loki-stack \ + --namespace monitoring \ + --set loki.persistence.enabled=true \ + --set loki.persistence.size=10Gi +``` + +This also installs Promtail for log collection from pods. + +## Step 4: Deploy Grafana + +Grafana provides dashboards and visualization. + +```bash +helm install grafana grafana/grafana \ + --namespace monitoring \ + --set persistence.enabled=true \ + --set persistence.size=1Gi \ + --set adminPassword=admin # Change this! 
+``` + +### Configure Data Sources + +After installation, add data sources in the Grafana UI or via ConfigMap: + +```yaml +apiVersion: v1 +kind: ConfigMap +metadata: + name: grafana-datasources + namespace: monitoring + labels: + grafana_datasource: "1" +data: + datasources.yaml: | + apiVersion: 1 + datasources: + - name: Prometheus + type: prometheus + url: http://prometheus-server.monitoring.svc:80 + isDefault: true + - name: Loki + type: loki + url: http://loki.monitoring.svc:3100 +``` + +## Step 5: Access Grafana + +Expose via Tailscale: +```bash +kubectl -n monitoring port-forward svc/grafana 3000:80 & +tailscale serve --bg --https 3000 http://localhost:3000 +``` + +Or create an Ingress. + +Default credentials: `admin` / (password you set or retrieve from secret) + +## Step 6: Add Dashboards + +Import community dashboards from [grafana.com/grafana/dashboards](https://grafana.com/grafana/dashboards/): + +| Dashboard | ID | Shows | +|-----------|-----|-------| +| Node Exporter Full | 1860 | Host metrics | +| Kubernetes Cluster | 7249 | Cluster overview | +| Loki Logs | 13639 | Log exploration | + +In Grafana: Dashboards > Import > Enter ID + +## Step 7: Deploy Alloy (Optional) + +Grafana Alloy is a unified collector that replaces multiple agents (Promtail, node_exporter, etc.). + +```yaml +apiVersion: argoproj.io/v1alpha1 +kind: Application +metadata: + name: alloy + namespace: argocd +spec: + project: default + source: + repoURL: https://grafana.github.io/helm-charts + chart: alloy + targetRevision: 0.1.0 + helm: + values: | + alloy: + configMap: + content: | + // Alloy configuration here + destination: + server: https://kubernetes.default.svc + namespace: monitoring +``` + +BlumeOps uses Alloy on both [[indri]] (for host metrics, via [[reference/ansible/roles | Ansible role]]) and in the [[cluster]] (for pod logs and service probes).
+ +## What You Now Have + +- Metrics collection and storage (Prometheus) +- Log aggregation (Loki) +- Dashboards and visualization (Grafana) +- Foundation for alerting + +## Adding Alerts + +Configure alerting rules in Prometheus: + +```yaml +groups: +- name: example + rules: + - alert: HighMemoryUsage + expr: node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes < 0.1 + for: 5m + labels: + severity: warning + annotations: + summary: "High memory usage detected" +``` + +Then configure notification channels in Grafana (email, Slack, PagerDuty, etc.). + +## Next Steps + +- Create custom dashboards for your services +- Set up alerting for critical conditions +- Add service-specific metrics exporters + +## BlumeOps Specifics + +BlumeOps' observability setup includes: +- Prometheus scraping all services via annotations +- Loki collecting logs from all pods and [[indri]] services +- Custom dashboards for [[jellyfin]], [[teslamate]], and cluster health +- [[alloy]] running on both host and in-cluster + +See [[observability|Observability Reference]] for full details. + +## Troubleshooting + +| Problem | Solution | +|---------|----------| +| No metrics appearing | Check Prometheus targets (`/targets` endpoint) | +| No logs in Loki | Verify Promtail/Alloy is collecting (`/ready` endpoint) | +| Dashboard shows no data | Check data source configuration and time range | +| High storage usage | Adjust retention settings in Prometheus/Loki | diff --git a/docs/tutorials/replication/tailscale-setup.md b/docs/tutorials/replication/tailscale-setup.md new file mode 100644 index 0000000..92d0a40 --- /dev/null +++ b/docs/tutorials/replication/tailscale-setup.md @@ -0,0 +1,134 @@ +--- +title: tailscale-setup +tags: + - tutorials + - replication + - tailscale +--- + +# Setting Up Tailscale + +> **Audiences:** Replicator + +This tutorial walks through establishing a Tailscale mesh network as the foundation for your homelab infrastructure. + +## Why Tailscale?
+ +Tailscale solves several problems at once: +- **Secure connectivity** - WireGuard-encrypted traffic between all devices +- **No port forwarding** - Devices connect directly through NATs and firewalls +- **MagicDNS** - Human-readable names like `server.tailnet.ts.net` +- **ACLs** - Fine-grained access control between devices + +For BlumeOps context, see [[tailscale|Tailscale Reference]]. + +## Step 1: Create Your Tailnet + +1. Sign up at [tailscale.com](https://tailscale.com) +2. Choose your identity provider (Google, Microsoft, GitHub, etc.) +3. Note your tailnet name (e.g., `yourname.ts.net`) + +## Step 2: Install on Your Devices + +### macOS + +```bash +brew install tailscale +sudo tailscaled & +tailscale up +``` + +### Linux + +```bash +curl -fsSL https://tailscale.com/install.sh | sh +sudo tailscale up +``` + +### Other Platforms + +See [Tailscale Downloads](https://tailscale.com/download) for iOS, Android, Windows, etc. + +## Step 3: Verify Connectivity + +After installing on two devices: +```bash +tailscale status +# Shows all connected devices + +ping your-device.yourname.ts.net +# Should work immediately +``` + +## Step 4: Configure ACLs + +By default, Tailscale allows all-to-all connectivity. For a homelab, you'll want restrictions. + +Create `policy.hujson` (or use the web admin): +```json +{ + "groups": { + "group:admin": ["your-email@example.com"] + }, + "tagOwners": { + "tag:homelab": ["group:admin"] + }, + "acls": [ + // Admins can access everything + {"action": "accept", "src": ["group:admin"], "dst": ["*:*"]}, + // Homelab servers can reach NAS + {"action": "accept", "src": ["tag:homelab"], "dst": ["tag:nas:*"]} + ] +} +``` + +BlumeOps manages ACLs via Pulumi - see [[tailscale|Tailscale Reference]] for the actual configuration. + +## Step 5: Enable MagicDNS + +In the Tailscale admin console: +1. Go to DNS settings +2. Enable MagicDNS +3. Optionally add a search domain + +Now `ssh server` works instead of `ssh 100.x.y.z`.
+ +## Step 6: Tag Your Devices + +Tags enable role-based access control: +```bash +# On your server +sudo tailscale up --advertise-tags=tag:homelab +``` + +Tags must be defined under `tagOwners` in the ACL policy before use. + +## What You Now Have + +- Encrypted mesh network between all your devices +- DNS names for each device +- Foundation for exposing services securely + +## Next Steps + +With networking established: +- [[tutorials/replication/kubernetes-bootstrap | Bootstrap Kubernetes]] - Your cluster will join the tailnet +- Set up your server and storage devices + +## BlumeOps Specifics + +BlumeOps' Tailscale configuration includes: +- Multiple device tags (`homelab`, `nas`, `registry`, `k8s-api`) +- Group-based access for family members +- SSH access rules with authentication requirements + +See [[tailscale|Tailscale Reference]] for full details. + +## Troubleshooting + +| Problem | Solution | +|---------|----------| +| Device won't connect | Check firewall allows UDP 41641 | +| Can't reach other devices | Verify ACLs don't block traffic | +| DNS not resolving | Enable MagicDNS in admin console | +| Tags not applying | Ensure tags are defined in the ACL policy |
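
The "Tags not applying" case can be caught before a policy is uploaded. The following is a minimal sketch, assuming the policy has already been loaded into a Python dict with its HuJSON comments stripped. The function name `undefined_tags` is hypothetical and is not part of Tailscale's tooling. It lists tags that ACL rules reference but `tagOwners` does not define.

```python
def undefined_tags(policy: dict) -> set[str]:
    """Return tags referenced in ACL rules but missing from tagOwners."""
    defined = set(policy.get("tagOwners", {}))
    used = set()
    for rule in policy.get("acls", []):
        for entry in rule.get("src", []) + rule.get("dst", []):
            if entry.startswith("tag:"):
                # dst entries look like "tag:nas:*"; keep only "tag:<name>".
                used.add(":".join(entry.split(":")[:2]))
    return used - defined


# The example policy from Step 4, written as a plain dict.
policy = {
    "groups": {"group:admin": ["your-email@example.com"]},
    "tagOwners": {"tag:homelab": ["group:admin"]},
    "acls": [
        {"action": "accept", "src": ["group:admin"], "dst": ["*:*"]},
        {"action": "accept", "src": ["tag:homelab"], "dst": ["tag:nas:*"]},
    ],
}

missing = undefined_tags(policy)
print(missing)
```

On the Step 4 example this reports `tag:nas`, which is used as a destination but has no `tagOwners` entry, so a real policy based on that example would need one added.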