C1: SHA-pin tooling dependencies (2026-04 cycle) (#344)
All checks were successful
Deploy Fly.io Proxy / deploy (push) Successful in 1m45s
## Summary

Monthly tooling dependency refresh, with a one-time conversion from version-tag pins (`rev = "vX.Y.Z"`, `image:tag`, `>=`) to SHA / digest pins everywhere.

## Changes

- **prek hooks**: all `rev = "vX.Y.Z"` → commit SHA + `# vX.Y.Z` comment. Bumped trufflehog (3.94.0→3.95.2), kingfisher (1.91.0→1.97.0), ruff (0.15.7→0.15.12), shfmt (3.13.0→3.13.1), prettier (3.8.1→3.8.3), actionlint (1.7.11→1.7.12).
- **fly/Dockerfile**: tag pins → `image@sha256:...` digest pins. Bumped nginx (1.29.6→1.30.0-alpine), tailscale (v1.94.1→v1.94.2 — still inside the safe pre-1.96.5 range), alloy (v1.14.1→v1.16.0).
- **mise-tasks**: PEP 723 inline deps converted from `>=` to `==` (PEP 508 doesn't support hashes inline). All scripts pinned to current latest: rich 15.0.0, typer 0.25.0, pyyaml 6.0.3, httpx 0.28.1.
- **prek `additional_dependencies`**: ansible-lint==26.4.0, ansible-core==2.20.5.
- **taplo-lint**: pass `--no-schema`. Upstream's `--default-schema-catalogs` returns a format taplo v0.9.3 can't parse — we don't validate against TOML schemas anyway, so this turns off the broken catalog fetch.
- **docs/update-tooling-dependencies**: documents the SHA-pin convention, `docker buildx imagetools inspect` for digest lookup, and `prek clean` before re-verifying (cache grows to several GiB).

Forgejo workflow `actions/checkout@v6.0.2` was already at the latest SHA — no change.

## Test plan

- [x] `prek run --all-files` passes after `prek clean`
- [x] `deploy-fly` workflow builds and deploys the new fly image on merge
- [x] `fly status -a blumeops-proxy` healthy after deploy
- [x] Spot-check a few mise tasks (`mise run blumeops-tasks`, `mise run docs-check-links`) to confirm pinned deps resolve cleanly

Reviewed-on: #344
parent 5096223b48
commit f6e392b80c
29 changed files with 174 additions and 47 deletions
108
docs/how-to/configuration/rotate-fly-deploy-token.md
Normal file
@@ -0,0 +1,108 @@
---
title: Rotate the Fly.io API Token
modified: 2026-04-30
last-reviewed: 2026-04-30
tags:
- how-to
- fly-io
- secrets
---

# Rotate the Fly.io API Token

How to rotate the Fly.io API token used to deploy [[flyio-proxy]]. The token lives in 1Password at `op://blumeops/fly.io admin/add more/deploy-token` and is consumed by [`mise run fly-deploy`](../../../mise-tasks/fly-deploy) and the `deploy-fly` Forgejo workflow (via the `FLY_DEPLOY_TOKEN` secret).

## When to rotate

- Every 75 days (Todoist recurring task)
- After any compromise / accidental disclosure
- If `fly deploy` starts returning auth errors

Fly.io tokens default to a 20-year expiry, but a short rotation cadence limits the blast radius of an undetected leak. Token expiry is set to **90 days** (longer than the rotation window), leaving a 15-day buffer if a rotation is delayed.

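The cadence and expiry figures above can be sanity-checked with a few lines of Python. This is only an illustrative sketch: the constant and function names are invented for the example and do not correspond to any script in the repository.

```python
from datetime import timedelta

# Figures taken from the text above; the names are illustrative only.
ROTATION_CADENCE = timedelta(days=75)
TOKEN_EXPIRY = timedelta(days=90)


def expiry_flag_hours(expiry: timedelta) -> str:
    """Format a timedelta as an hour-based duration string such as '2160h'."""
    hours = int(expiry.total_seconds() // 3600)
    return f"{hours}h"


# Slack between the scheduled rotation and the hard expiry of the token.
buffer = TOKEN_EXPIRY - ROTATION_CADENCE

print(expiry_flag_hours(TOKEN_EXPIRY))  # 2160h
print(buffer.days)                      # 15
```

If the cadence or expiry is ever changed, recomputing both values together keeps the buffer explicit rather than implied.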
## Scope

Use **`fly tokens create org`**, not `deploy`.

| Scope | What it grants | Practical blast radius (this org) |
|-------|---------------|-----------------------------------|
| `deploy` | Manage one app and its resources | Same single-app surface as `org` for current setup |
| `org` | Manage one org and its resources | Adds: ability to create new apps (billing abuse) and read org-level metadata |
| `readonly` | Read one org | Not enough to deploy |
| Personal access token | Full account | Excessive |

The personal Fly org currently contains a single app (`blumeops-proxy`), so the marginal blast radius of `org` over `deploy` is small. The benefit of `org` is that `fly status` works without a `Metrics token unavailable: ... context canceled` warning. That warning happens because `fly status` always tries to fetch org-level metrics-token info, and an app-scoped `deploy` token can't query the org. The warning is benign but persistent and could mask a real future failure.

If a second Fly app is ever added to this org, reconsider — at that point the marginal scope cost of `org` grows.

## Procedure

### 1. Authenticate flyctl with the current token

```fish
fly auth login
```

(Browser-based. Required to mint a new token, since the existing deploy token can't create tokens.)

### 2. Mint the new token

```fish
fly tokens create org \
  --org personal \
  --name "blumeops-proxy deploy $(date +%Y-%m-%d)" \
  --expiry 2160h
```

(`2160h` = 90 days, paired with the 75-day rotation cadence for a 15-day buffer. Capture the output — it's the only time the token is shown.)

### 3. Update 1Password

```fish
op item edit on5slfaygtdjrxmdwezyhfmqsq 'add more.deploy-token=<paste-new-token>' --vault vg6xf6vvfmoh5hqjjhlhbeoaie
```

### 4. Sync to Forgejo Actions

The `deploy-fly` workflow reads the same token from a Forgejo Actions secret named `FLY_DEPLOY_TOKEN`, populated by the `forgejo_actions_secrets` ansible role:

```fish
mise run provision-indri -- --tags forgejo_actions_secrets
```

### 5. Verify

```fish
mise run fly-deploy
```

A successful deploy confirms the new token works locally. Watch for the metrics-token warning — it should be **absent** with an `org`-scoped token. If still present, the rotation produced a `deploy`-scoped token by mistake.

Then trigger the CI workflow (push a no-op commit touching `fly/`, or dispatch manually) to confirm Forgejo Actions has the new secret.

### 6. Revoke the old token

```fish
fly tokens list
fly tokens revoke <old-token-id>
```

## Debugging

### `fly deploy` returns "unauthorized"

Token is invalid (expired, revoked, or wrong scope). Repeat the procedure.

### `Metrics token unavailable: ... context canceled` after rotation

The new token was created with `deploy` scope, not `org`. Either accept it (cosmetic) or re-mint with `fly tokens create org`.

### Forgejo Actions deploy fails but local works

The Forgejo secret wasn't synced. Re-run `mise run provision-indri -- --tags forgejo_actions_secrets` and confirm the secret value in Forgejo matches 1Password.

## Related

- [[flyio-proxy]] — Service reference card
- [[manage-flyio-proxy]] — Day-to-day operations and Tailscale auth-key rotation (separate 90-day rotation)
- [[expose-service-publicly]] — Full setup architecture
@@ -28,33 +28,45 @@ Out of scope: ArgoCD-deployed service images, Ansible role versions, NixOS flake

### 1. Check prek hook versions

For each repo in `prek.toml` with a `rev =` value, check the upstream GitHub releases page for a newer tag. Update each `rev` to the latest release tag. Also check `additional_dependencies` entries for PyPI version bumps.

Verify after updating:
For each repo in `prek.toml` with a `rev =` value, check the upstream GitHub releases page for a newer tag. Update each `rev` to the **commit SHA** of the latest release with a trailing `# vX.Y.Z` comment (matches the `additional_dependencies` and Forgejo workflow pinning style). Also check `additional_dependencies` entries for PyPI version bumps and pin them with `==`.

```fish
git ls-remote --tags https://github.com/<owner>/<repo>.git 'refs/tags/v*' | sort -t/ -k3 -V | tail -5
```

Clear the prek cache before verifying — it can grow to several GiB (one venv per hook per version) and old cached environments can mask resolution failures or stale catalogs:

```fish
prek clean
prek run --all-files
```

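One detail of the `git ls-remote --tags` output is worth illustrating. For an annotated tag, the plain `refs/tags/vX.Y.Z` line carries the SHA of the tag object, while the peeled `refs/tags/vX.Y.Z^{}` line carries the commit SHA. The sketch below is a hypothetical helper, not part of the repository, that picks the newest `vX.Y.Z` tag and prefers the peeled entry. The SHAs in `SAMPLE` are placeholders, and the exact `rev` line layout is assumed from the convention described above.

```python
import re


def latest_tag_commit(ls_remote_output: str) -> tuple[str, str]:
    """Return (tag, commit_sha) for the highest vX.Y.Z tag in ls-remote output."""
    commits: dict[str, str] = {}
    for line in ls_remote_output.splitlines():
        if not line.strip():
            continue
        sha, ref = line.split(None, 1)
        tag = ref.strip().removeprefix("refs/tags/")
        peeled = tag.endswith("^{}")
        tag = tag.removesuffix("^{}")
        if not re.fullmatch(r"v\d+\.\d+\.\d+", tag):
            continue  # skip pre-releases and non-release tags
        # A peeled entry (^{}) is the commit an annotated tag points at,
        # so it always wins over the tag-object SHA.
        if peeled or tag not in commits:
            commits[tag] = sha
    if not commits:
        raise ValueError("no vX.Y.Z tags found")
    newest = max(commits, key=lambda t: tuple(int(p) for p in t[1:].split(".")))
    return newest, commits[newest]


def rev_line(tag: str, sha: str) -> str:
    """Render a SHA pin with the human-readable version as a trailing comment."""
    return f'rev = "{sha}" # {tag}'


# Placeholder SHAs for illustration only.
SAMPLE = (
    "1111111111111111111111111111111111111111\trefs/tags/v1.9.0\n"
    "2222222222222222222222222222222222222222\trefs/tags/v1.10.0\n"
    "3333333333333333333333333333333333333333\trefs/tags/v1.10.0^{}\n"
)

tag, sha = latest_tag_commit(SAMPLE)
print(rev_line(tag, sha))
```

The numeric comparison matters for the same reason the shell pipeline uses `sort -V`: a plain string sort would rank `v1.9.0` above `v1.10.0`.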
### 2. Check Fly.io Dockerfile pins

Review `fly/Dockerfile` for pinned image tags:
Review `fly/Dockerfile` for pinned image digests. Each `FROM` and `COPY --from=` uses `image@sha256:...` digest pinning with a comment line above documenting the human-readable version.

- **nginx** — check [Docker Hub](https://hub.docker.com/_/nginx) for latest stable alpine tag
- **grafana/alloy** — check [GitHub releases](https://github.com/grafana/alloy/releases)
- **tailscale/tailscale** — uses `stable` rolling tag, no action needed
- **tailscale/tailscale** — pinned to a known-good version. Do not bump to v1.96.5 or later (MagicDNS regression breaks the proxy boot)

To resolve a tag to a digest:

```fish
docker buildx imagetools inspect docker.io/<image>:<tag>
# Use the top-level "Digest:" line (multi-arch index) — not the per-platform sub-digest
```

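The "top-level digest, not the per-platform one" rule can be sketched as a small parser. This is a hypothetical helper rather than anything in the repository. `SAMPLE_OUTPUT` is an abbreviated approximation of what `docker buildx imagetools inspect` prints, with placeholder digests, and the real layout may differ between buildx versions. The only property it relies on is that the index digest sits on an unindented `Digest:` line while per-platform entries are indented.

```python
import re

# Placeholder digests for illustration only.
INDEX_DIGEST = "sha256:" + "a" * 64
PLATFORM_DIGEST = "sha256:" + "b" * 64

# Assumed, abbreviated shape of the inspect output.
SAMPLE_OUTPUT = f"""Name:      docker.io/library/nginx:1.30.0-alpine
MediaType: application/vnd.oci.image.index.v1+json
Digest:    {INDEX_DIGEST}

Manifests:
  Name:      docker.io/library/nginx:1.30.0-alpine@{PLATFORM_DIGEST}
  MediaType: application/vnd.oci.image.manifest.v1+json
  Platform:  linux/amd64
"""


def top_level_digest(inspect_output: str) -> str:
    """Return the digest from the first unindented 'Digest:' line."""
    for line in inspect_output.splitlines():
        # re.match anchors at column 0, so indented per-platform lines are skipped.
        match = re.match(r"Digest:\s*(sha256:[0-9a-f]{64})\s*$", line)
        if match:
            return match.group(1)
    raise ValueError("no top-level Digest line found")


def pinned_ref(image: str, digest: str) -> str:
    """Build the image@sha256:... reference used for digest pinning."""
    return f"{image}@{digest}"


print(pinned_ref("docker.io/library/nginx", top_level_digest(SAMPLE_OUTPUT)))
```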
After updating, the deploy-fly workflow will build and deploy on merge to main. Verify with `fly status -a blumeops-proxy` after deploy.

### 3. Normalize mise task dependency bounds
### 3. Pin mise task dependencies

Mise tasks use `uv run --script` with inline PEP 723 dependency metadata. Check that lower bounds are consistent across all scripts:
Mise tasks use `uv run --script` with inline PEP 723 dependency metadata. All packages are pinned with `==` (PEP 508 doesn't support hashes inline). Check that pinned versions are consistent across all scripts:

```fish
grep -r 'dependencies' mise-tasks/ | grep '# dependencies'
```

Ensure all scripts using the same package agree on the minimum version. When a package has a new major or breaking minor release, bump the lower bound across all scripts at once.
For each package in use (`httpx`, `rich`, `typer`, `pyyaml`), pick the latest PyPI version and update every script in lockstep — divergence between scripts is the failure mode this catches. Bump everything together; don't leave one script behind.

### 4. Pin Forgejo workflow action versions

@@ -76,6 +76,10 @@ The auth key expires every 90 days. To rotate:
2. Re-run setup to stage the new secret: `mise run fly-setup`
3. Deploy to pick up the new secret: `mise run fly-deploy`

## Rotate Fly.io API Token

See [[rotate-fly-deploy-token]] for the full rotation procedure (75-day cadence, `org`-scoped).

## Troubleshooting

**502 Bad Gateway on fresh deploy**: MagicDNS may not be ready when nginx starts. The `start.sh` script polls `nslookup` before launching nginx, but if it still fails, check that `tailscale status` is healthy inside the container.