Commit graph

14 commits

Author SHA1 Message Date
1bc2b421a8 Adopt Dagger CI for container builds (Phase 1) (#156)
All checks were successful
Build Container / build (push) Successful in 13s
## Summary

- Add Dagger Python module (`.dagger/`) with `build` and `publish` functions for container images
- Replace Docker buildx + skopeo composite action with `dagger call publish` in `build-container.yaml`
- BuildKit's native push is compatible with Zot — **skopeo workaround eliminated**
- Add Dagger CLI (v0.19.11) to forgejo-runner Dockerfile, bump runner to v2.6.0
- Bootstrap step in workflow curl-installs dagger if not in runner (for first build on v2.5.1 runner)
- Delete old `.forgejo/actions/build-push-image/` composite action
- Add GPLv3 LICENSE

## Verified locally

- `dagger call build --src=. --container-name=nettest` — builds ✓
- `dagger call publish --src=. --container-name=nettest --version=dagger-test` — pushed to Zot ✓
- `dagger call build --src=. --container-name=forgejo-runner` — new runner image builds ✓
- Dagger CLI accessible inside built runner image ✓

## Deployment sequence (after merge)

1. `mise run container-tag-and-release forgejo-runner v2.6.0` — old runner bootstraps dagger via curl, builds new runner
2. `argocd app sync forgejo-runner` — runner restarts with v2.6.0 (dagger baked in)
3. `mise run container-tag-and-release nettest v0.13.0` — end-to-end test of new pipeline
4. `mise run container-list` — verify tags

## Not included (future phases)

- Phase 2: docs build + Forgejo packages migration
- Phase 3: runner simplification (remove skopeo, Node.js, etc.)
- Phase 4: future workflows

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/156
2026-02-11 15:38:31 -08:00
2fc5aa82b1 Add docker-buildx-plugin to forgejo-runner (#147)
Some checks failed
Build Container / build (push) Failing after 3s
## Summary
- Install `docker-buildx-plugin` alongside `docker-ce-cli` in the forgejo-runner image
- Fixes `docker buildx build` failing with "unknown flag: --tag" from #146

## Test plan
- [ ] Merge and release `forgejo-runner-v2.5.1`
- [ ] Update runner configmap/labels if needed to use new image
- [ ] Re-tag `nettest-v0.11.1` (or `v0.12.0`) to verify build-container workflow succeeds

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/147
2026-02-10 21:14:29 -08:00
cb36f1784f Switch CI builds to docker buildx (#146)
Some checks failed
Build Container / build (push) Failing after 4s
## Summary
- Replace deprecated `docker build` with `docker buildx build` in the build-push-image composite action
- Remove redundant build/run comments from nettest Dockerfile

## Test plan
- [ ] Merge and tag `nettest-v1.1.0` (or similar) to trigger the build-container workflow
- [ ] Verify the build succeeds without the deprecation warning

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/146
2026-02-10 21:03:41 -08:00
09317fecc1 Fix argocd CLI download in forgejo-runner image
All checks were successful
Build Container / build (push) Successful in 1m40s
- Add -L flag to follow redirects
- Add -f flag to fail on HTTP errors
- Use dpkg --print-architecture as fallback for TARGETARCH
- Verify binary works after download

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 17:11:36 -08:00
1f73eb675d Auto-deploy docs from build workflow (#93)
## Summary
- Add `uv` and `argocd` CLI to forgejo-runner container image
- Add `workflow-bot` ArgoCD account with sync permissions (declarative via kustomize patches)
- Add `ARGOCD_AUTH_TOKEN` to forgejo-runner external secret for workflow auth
- Update build workflow to auto-deploy docs after release:
  - Update configmap with new release URL
  - Commit changelog and configmap changes
  - Sync docs app via ArgoCD

## Deployment and Testing
Manual steps required before this can work:
1. [ ] Build and push new forgejo-runner image (v2.4.0)
2. [ ] Sync argocd app to create workflow-bot account
3. [ ] Generate token: `argocd account generate-token --account workflow-bot`
4. [ ] Store token in 1Password under "Forgejo Secrets" with field `argocd_token`
5. [ ] Sync forgejo-runner app to pick up new external secret
6. [ ] Update forgejo-runner deployment to use new image version
7. [ ] Test by running workflow manually

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/93
2026-02-03 16:58:03 -08:00
1c86134a62 Phase 1b: Deploy docs hosting with Quartz (#85)
## Summary
- Add ArgoCD Application and manifests for `quartz` service
- Add `docs.ops.eblu.me` to Caddy reverse proxy configuration
- ConfigMap points to blumeops v1.0.0 release tarball
- Tailscale ingress with homepage annotations for auto-discovery

## Deployment and Testing

**Pre-deployment (container build):**
- [ ] Build and tag quartz container: `mise run container-tag-and-release quartz v1.0.0`

**K8s deployment:**
- [ ] Sync apps: `argocd app sync apps`
- [ ] Point quartz at feature branch: `argocd app set quartz --revision feature/docs-phase-1b-hosting`
- [ ] Sync quartz: `argocd app sync quartz`
- [ ] Verify pod is running: `kubectl --context=minikube-indri get pods -n quartz`
- [ ] Verify Tailscale ingress: `kubectl --context=minikube-indri get ingress -n quartz`

**Caddy deployment:**
- [ ] Dry run: `mise run provision-indri -- --tags caddy --check --diff`
- [ ] Apply: `mise run provision-indri -- --tags caddy`

**Verification:**
- [ ] Test https://docs.tail8d86e.ts.net
- [ ] Test https://docs.ops.eblu.me
- [ ] Verify homepage dashboard shows docs link

**Post-merge:**
- [ ] Reset to main: `argocd app set quartz --revision main && argocd app sync quartz`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/85
2026-02-03 10:52:20 -08:00
d870694e10 Upgrade forgejo-runner to Node.js 24.x LTS
All checks were successful
Build Container / build (push) Successful in 54s
Quartz now requires Node.js >= 22.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 09:19:58 -08:00
b8104d75ad Move zk cards to docs/zk/ for documentation restructuring (#84)
## Summary
- Move all existing zettelkasten cards from `docs/` to `docs/zk/` as a temporary holding area
- Update `zk-docs` mise task to look in the new location
- Add `docs/README.md` explaining the Diataxis-based restructuring plan and target audiences

## Context
This is phase 1 of a multi-phase documentation restructuring effort. The goal is to reorganize docs to follow the Diataxis framework while serving multiple audiences:
1. Erich (owner) - knowledge graph/zk
2. Claude/AI agents - memory and context enrichment
3. New external readers - high-level overview
4. Potential operators/contributors - onboarding
5. Replicators - people wanting to duplicate the approach

## Testing
- [x] Verified `mise run zk-docs` still works with the new path
- [x] Updated obsidian.nvim config (in ~/.config/nvim) to point to new path

## Note
The obsidian.nvim config change is outside this repo but was made as part of this work.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/84
2026-02-03 09:13:50 -08:00
dc974858b0 Add skopeo to forgejo-runner image
All checks were successful
Build Container / build (push) Successful in 1m8s
Pre-install skopeo for pushing images to zot registry.
Docker 27's manifest format has compatibility issues with zot,
so we use skopeo for the push step.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-30 11:11:19 -08:00
c8b655f177 Build local containers for k8s services (#61)
## Summary
- Move devpi Dockerfile from argocd/manifests to containers/devpi/
- Add containers for: transmission, teslamate, miniflux, kiwix-serve, kubectl
- Update all k8s deployments to use local images (registry.ops.eblu.me/blumeops/*)
- All containers use v1.0.0 tag for initial release

## Containers Added
| Container | Source | Notes |
|-----------|--------|-------|
| devpi | python:3.12-slim | Existing, moved to containers/ |
| kubectl | alpine + download | For zim-watcher CronJob |
| miniflux | Go build from source | v2.2.16 |
| kiwix-serve | Download pre-built binary | v3.8.1 |
| transmission | alpine + apk install | Simpler than linuxserver image |
| teslamate | Elixir build from source | v2.2.0 |

## Deployment and Testing
- [ ] Build and tag devpi-v1.0.0
- [ ] Build and tag kubectl-v1.0.0
- [ ] Build and tag miniflux-v1.0.0
- [ ] Build and tag kiwix-serve-v1.0.0
- [ ] Build and tag transmission-v1.0.0
- [ ] Build and tag teslamate-v1.0.0
- [ ] Sync ArgoCD apps and verify services

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/61
2026-01-25 21:35:57 -08:00
ea42362b6f Migrate Forgejo runner to Kubernetes with DinD (#60)
## Summary
- Deploy Forgejo runner to k8s with Docker-in-Docker sidecar
- Add job execution image with Node.js and Docker CLI
- Retire host-mode runner on indri
- All CI jobs now run containerized in k8s

## Components Added
- `containers/forgejo-runner/Dockerfile` - Job execution image
- `argocd/apps/forgejo-runner.yaml` - ArgoCD Application
- `argocd/manifests/forgejo-runner/` - Kubernetes manifests

## Components Removed
- `ansible/roles/forgejo_runner/` - No longer needed

## Changes to Existing Files
- `.forgejo/workflows/build-container.yaml` - Use `k8s` runner with `DOCKER_HOST` env
- `.github/actionlint.yaml` - Only `k8s` label now valid

## Deployment
1. Apply secret: `op inject -i argocd/manifests/forgejo-runner/secret.yaml.tpl | kubectl --context=minikube-indri apply -f -`
2. Sync ArgoCD: `argocd app sync forgejo-runner`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/60
2026-01-25 19:56:17 -08:00
d6e6b48f6a Migrate registry to Caddy (registry.ops.eblu.me) (#58)
## Summary
- Update all references from `registry.tail8d86e.ts.net` to `registry.ops.eblu.me`
- Remove `tailscale_serve` ansible role (no longer needed - all services migrated to Caddy)
- Update minikube containerd config for new registry URL
- Update devpi manifest, CI actions, and mise tasks

## Deployment and Testing
- [ ] Run `mise run provision-indri -- --check --diff` (dry run)
- [ ] Run `mise run provision-indri -- --tags minikube` to update containerd config
- [ ] Sync devpi ArgoCD app: `argocd app sync devpi`
- [ ] Manually remove old Tailscale serve entry: `ssh indri 'tailscale serve --service=svc:registry off'`
- [ ] Test registry access: `curl https://registry.ops.eblu.me/v2/_catalog`
- [ ] Run `mise run indri-services-check` to verify all services healthy

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/58
2026-01-25 12:06:15 -08:00
1184b4de1d Add Caddy layer4 for Forgejo SSH (#56)
## Summary
- Add layer4 TCP proxy configuration to Caddyfile template for SSH services
- Configure Forgejo SSH on port 2222 → localhost:2200
- Switch HTTPS from port 8443 (testing) to 443 (production)
- Requires Caddy rebuilt with `github.com/mholt/caddy-l4` plugin

## What This Enables
Git+SSH access via `forge.ops.eblu.me:2222` is now accessible from:
- Tailnet clients (gilbert)
- Docker containers on indri
- Kubernetes pods in minikube

This solves the DNS resolution issues where containers couldn't reach Tailscale MagicDNS names.

## Testing Done
- [x] Caddy rebuilt with layer4 plugin
- [x] Validated Caddyfile syntax
- [x] Cleared `svc:forge` from tailscale serve
- [x] Verified HTTPS works: `curl https://forge.ops.eblu.me`
- [x] Verified SSH works: `ssh -p 2222 forgejo@forge.ops.eblu.me`
- [x] Verified git clone works via new endpoint
- [x] Verified minikube pods can reach both HTTPS and SSH endpoints

## Deployment
Caddy is already running with the new config on indri. This PR captures the ansible changes.

## Next Steps
- Update zk docs with new git remote format
- Migrate registry and other services to Caddy
- Retire tailscale_services ansible role

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/56
2026-01-25 11:37:23 -08:00
31697b4d63 Add nettest container for CI/CD network debugging (#52)
Some checks failed
Build Container / build (push) Failing after 18s
## Summary
- Add `containers/nettest/` with Alpine-based Dockerfile and connectivity test script
- Add `.forgejo/workflows/build-nettest.yaml` workflow triggered by `nettest-v*` tags
- Test script checks DNS resolution and HTTPS connectivity to forge and registry

## Deployment and Testing
- [ ] Merge PR to main
- [ ] Run `mise run container-release nettest v0.1.0` to trigger first build
- [ ] Verify workflow runs successfully and container can reach tailnet services
- [ ] Manually test from minikube: `kubectl run nettest --rm -it --image=registry.tail8d86e.ts.net/blumeops/nettest:v0.1.0`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Reviewed-on: https://forge.tail8d86e.ts.net/eblume/blumeops/pulls/52
2026-01-24 16:54:35 -08:00