Document container tag provenance and enhance container-list (#263)

## Summary

Investigating the deployed container images confirmed that squash-merging PRs orphans the commit SHAs embedded in container image tags: two of our currently deployed images (prometheus, grafana) reference branch commits that are not on main.

This PR:

- Documents the squash-merge SHA orphan problem and the post-merge workflow in [[build-container-image]]
- Adds step 9 to the C1 process: after merging a PR that changes `containers/`, do a follow-up C0 to point manifests at the rebuilt `[main]` tag
- Rewrites `container-list` as a `uv run --script` (typer + rich + httpx)
- Adds optional container name filter (`mise run container-list prometheus` shows 10 tags instead of 4)
- Annotates every tag with `[main]` or `[branch]` based on git commit ancestry (or `[unknown]` when the SHA cannot be resolved in the local clone)

## Test plan

- [x] `mise run container-list` — all containers shown with `[main]`/`[branch]` hints
- [x] `mise run container-list prometheus` — filtered view, more tags, correctly shows `[main]` and `[branch]`
- [x] `mise run container-list nonexistent` — error message with exit code 1
- [x] Pre-commit hooks pass

Reviewed-on: https://forge.ops.eblu.me/eblume/blumeops/pulls/263
Erich Blume 2026-02-24 09:54:58 -08:00
commit 1b9f706a30
4 changed files with 170 additions and 62 deletions

@@ -0,0 +1 @@
Document squash-merge container tag provenance issue and post-merge workflow for updating manifests to main-SHA tags.

@@ -1,6 +1,6 @@
---
title: Agent Change Process
modified: 2026-02-23
modified: 2026-02-24
last-reviewed: 2026-02-23
tags:
- how-to
@@ -52,6 +52,7 @@ A change with enough complexity or risk that a human should review it, but not s
- **Workflows:** point workflow triggers at the branch if needed
7. After user review and successful deployment, the user merges the PR
8. **After merge:** reset ArgoCD revisions back to main, re-sync
9. **If the PR changed `containers/`:** the merge triggers a rebuild from main automatically. Once it completes, commit a C0 updating the manifest to the new `[main]`-tagged image (see [[build-container-image#Squash-merge and container tags]])
### Upgrading to C2
@@ -227,6 +228,7 @@ When starting a new session to continue C2 work:
Mikado resets apply to branch code, not build artifacts. Container images in the registry are independent of branch lifecycle:
- **Registry images** are build outputs cached in zot — tagged with commit SHAs, so each build is unique and traceable
- **Squash-merge orphans:** Images built during PR development reference branch SHAs that won't exist on main after merge. After merge, a rebuild triggers automatically; commit a C0 to update manifests to the new `[main]`-tagged image. Use `mise run container-list <name>` to find it
- **Automatic builds** trigger when container changes merge to main. Use `mise run container-build-and-release` for manual dispatch
- **If a build succeeds but deployment fails**, the image is fine; the problem is elsewhere. Document what you learned and try again
- **If a build fails in CI**, no image is pushed. Fix the nix/dockerfile and re-merge or re-dispatch

@@ -1,6 +1,6 @@
---
title: Build Container Image
modified: 2026-02-20
modified: 2026-02-24
last-reviewed: 2026-02-15
tags:
- how-to
@@ -86,6 +86,26 @@ image: registry.ops.eblu.me/blumeops/<name>:vX.Y.Z-abc1234
Then deploy per [[deploy-k8s-service]].
### Squash-merge and container tags
Container image tags include the git commit SHA they were built from (e.g. `v3.9.1-74029e1`). When a PR is squash-merged, the original branch commits are replaced by a single new commit on main — the SHA in the image tag no longer exists on main. After branch cleanup (30 days), the SHA becomes unreachable and the container loses source traceability.
**The rule:** Production manifests must reference images built from a commit on main. After merging a PR that changed `containers/<name>/`:
1. The merge to main automatically triggers a rebuild (the `build-container.yaml` / `build-container-nix.yaml` workflows fire on pushes to `main` that touch `containers/**`)
2. Wait for the workflow to complete — check at `https://forge.ops.eblu.me/eblume/blumeops/actions`
3. Find the new main-SHA tag:
```bash
mise run container-list <name>
```
Tags marked `[main]` were built from a commit on main; tags marked `[branch]` are from PR branches; tags marked `[unknown]` carry a SHA that could not be resolved in the local clone.
4. Commit a C0 follow-up updating the manifest to use the `[main]` tag:
```yaml
image: registry.ops.eblu.me/blumeops/<name>:vX.Y.Z-<main-sha>
```
This follow-up C0 is expected and routine — it's the cost of squash-merge + SHA-tagged containers.
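
To make the tag format concrete, here is a minimal sketch that splits a tag into its version, short SHA, and build flavour. It assumes tags follow the `vX.Y.Z-<sha>` and `vX.Y.Z-<sha>-nix` shapes described in this change; `parse_tag` and `TAG_RE` are hypothetical names for illustration and are not part of the repository.

```python
import re
from typing import Optional

# Assumed tag shape: v<version>-<hex sha>, optionally followed by "-nix".
TAG_RE = re.compile(
    r"^v(?P<version>\d+(?:\.\d+)*)-(?P<sha>[0-9a-f]{7,40})(?P<nix>-nix)?$"
)


def parse_tag(tag: str) -> Optional[dict]:
    """Split a tag into version, SHA and build flavour; None if it does not match."""
    m = TAG_RE.match(tag)
    if m is None:
        return None
    return {
        "version": m.group("version"),
        "sha": m.group("sha"),
        "nix": m.group("nix") is not None,
    }


print(parse_tag("v3.9.1-74029e1"))
print(parse_tag("v3.9.1-74029e1-nix"))
print(parse_tag("latest"))
```

Anchoring the SHA to the segment after the version, rather than searching for any seven hex characters, avoids picking up digits from the version part if a tag ever uses a long numeric version.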
## Common Patterns
Existing containers demonstrate several build approaches:

@@ -1,62 +1,147 @@
#!/usr/bin/env bash
#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.12"
# dependencies = ["httpx>=0.28.0", "rich>=13.0.0", "typer>=0.15.0"]
# ///
#MISE description="List available containers and their recent tags"
#USAGE arg "[name]" help="Optional container name to filter output"
"""List container images and their recent registry tags.
set -euo pipefail
Shows build type (dockerfile/nix), registry path, and recent tags from the
zot registry. Tags are annotated with [main] or [branch] to indicate whether
the build commit is an ancestor of origin/main.
REGISTRY="registry.ops.eblu.me"
CONTAINER_DIR="containers"
Usage:
    mise run container-list              # all containers
    mise run container-list prometheus   # single container (more tags shown)
"""
echo "Container Images"
echo "================"
echo ""
import re
import subprocess
from pathlib import Path
# Find all container directories with Dockerfiles or default.nix
for dir in "$CONTAINER_DIR"/*/; do
[[ -d "$dir" ]] || continue
import httpx
import typer
from rich.console import Console
from rich.table import Table
# Determine available build types
has_dockerfile=false
has_nix=false
[[ -f "$dir/Dockerfile" ]] && has_dockerfile=true
[[ -f "$dir/default.nix" ]] && has_nix=true
REGISTRY = "registry.ops.eblu.me"
CONTAINER_DIR = Path("containers")
# Skip directories with no build files
$has_dockerfile || $has_nix || continue
console = Console()
app = typer.Typer(add_completion=False)
# Build type label
types=()
$has_dockerfile && types+=("dockerfile")
$has_nix && types+=("nix")
label=$(IFS=+; echo "${types[*]}")
# Extract container name from directory
container=$(basename "$dir")
image="blumeops/$container"
def git(*args: str) -> str:
    result = subprocess.run(
        ["git", *args], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
echo "[$label] $container"
echo " Image: $REGISTRY/$image"
echo " Path: $dir"
# Query zot for recent tags
tags=$(curl -sf "https://$REGISTRY/v2/$image/tags/list" 2>/dev/null | jq -r '.tags // [] | .[]' | grep -E '^v[0-9]' | sort -V | tail -4 || true)
def sha_hint(tag: str) -> str:
    """Check if the 7-char hex SHA in a tag is on origin/main."""
    match = re.search(r"[0-9a-f]{7}", tag)
    if not match:
        return ""
    sha = match.group()
    try:
        full_sha = git("rev-parse", "--verify", sha)
    except subprocess.CalledProcessError:
        return "[dim]\\[unknown][/dim]"
    try:
        git("merge-base", "--is-ancestor", full_sha, "origin/main")
        return "[green]\\[main][/green]"
    except subprocess.CalledProcessError:
        return "[yellow]\\[branch][/yellow]"
if [[ -n "$tags" ]]; then
echo " Recent tags:"
echo "$tags" | while read -r tag; do
echo " - $tag"
done
else
echo " Recent tags: (none)"
fi
echo ""
done
echo "---"
echo "To trigger a build:"
echo " mise run container-build-and-release <container>"
echo ""
echo "Dispatches both Dockerfile and Nix workflows (each skips if build file absent)."
echo "Tags: vX.Y.Z-<sha> (Dockerfile), vX.Y.Z-<sha>-nix (Nix)"
echo ""
echo "Example:"
echo " mise run container-build-and-release nettest"
def get_tags(image: str) -> list[str]:
    """Query zot registry for version tags."""
    try:
        resp = httpx.get(
            f"https://{REGISTRY}/v2/{image}/tags/list", timeout=10
        )
        resp.raise_for_status()
        tags = resp.json().get("tags", [])
        return sorted(
            [t for t in tags if re.match(r"^v[0-9]", t)],
            key=lambda t: t,
        )
    except (httpx.HTTPError, ValueError):
        return []
def discover_containers() -> list[dict]:
    """Find container directories with build files."""
    containers = []
    for d in sorted(CONTAINER_DIR.iterdir()):
        if not d.is_dir():
            continue
        has_dockerfile = (d / "Dockerfile").exists()
        has_nix = (d / "default.nix").exists()
        if not has_dockerfile and not has_nix:
            continue
        types = []
        if has_dockerfile:
            types.append("dockerfile")
        if has_nix:
            types.append("nix")
        containers.append({
            "name": d.name,
            "types": types,
            "path": str(d),
        })
    return containers
@app.command()
def main(
    name: str = typer.Argument("", help="Container name to filter (optional)"),
) -> None:
    """List available containers and their recent tags."""
    containers = discover_containers()
    if name:
        containers = [c for c in containers if c["name"] == name]
        if not containers:
            console.print(f"[red]No container found matching '{name}'.[/red]")
            console.print("Run without arguments to see all containers.")
            raise typer.Exit(1)
    tag_count = 10 if name else 4
    for c in containers:
        image = f"blumeops/{c['name']}"
        label = "+".join(c["types"])
        tags = get_tags(image)
        recent = tags[-tag_count:] if tags else []
        console.print(f"[bold]\\[{label}] {c['name']}[/bold]")
        console.print(f" Image: {REGISTRY}/{image}")
        console.print(f" Path: {c['path']}")
        if recent:
            console.print(" Recent tags:")
            for tag in recent:
                hint = sha_hint(tag)
                console.print(f" - {tag} {hint}")
        else:
            console.print(" Recent tags: [dim](none)[/dim]")
        console.print()
    console.print("[dim]---[/dim]")
    console.print(
        "Tags marked [green]\\[main][/green] were built from a commit on main."
    )
    console.print(
        "Tags marked [yellow]\\[branch][/yellow] were built from a PR branch — "
        "use \\[main] tags in production manifests."
    )
    console.print()
    console.print("To trigger a build:")
    console.print(" mise run container-build-and-release <container>")
if __name__ == "__main__":
    app()
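
One detail may be worth a follow-up: the removed bash version ordered tags with `sort -V`, while `get_tags` sorts with `key=lambda t: t`, which is plain string ordering and places `v3.10.x` before `v3.9.x`. The sketch below shows one possible version-aware key, assuming tags look like `vX.Y.Z-<sha>`; `version_key` is a hypothetical helper and the sample tags (including the SHA `0badc0d`) are illustrative only.

```python
import re


def version_key(tag: str) -> tuple[list[int], str]:
    """Sort key: numeric version components first, full tag as tie-breaker."""
    version_part = tag.split("-", 1)[0]  # "v3.10.0-abc1234" -> "v3.10.0"
    numbers = [int(n) for n in re.findall(r"\d+", version_part)]
    return (numbers, tag)


tags = ["v3.10.0-abc1234", "v3.9.1-74029e1", "v3.9.1-0badc0d-nix"]
plain = sorted(tags)                      # string order: v3.10.0 sorts first
by_version = sorted(tags, key=version_key)  # numeric order: v3.10.0 sorts last
print(plain)
print(by_version)
```

With a key like this, the `tags[-tag_count:]` slice in `main` would pick the numerically newest versions instead of the lexicographically largest ones. Tags that share a version are still ordered by string, not by build time.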