Add Ollama reference card and update indexes
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
parent 5ddb47de1c
commit 6ca3c67705
4 changed files with 94 additions and 2 deletions
docs/changelog.d/+ollama-reference-card.doc.md (Normal file, 1 line changed)
@@ -0,0 +1 @@
Add reference card for the Ollama LLM inference service.
@@ -1,6 +1,6 @@
 ---
 title: Apps
-modified: 2026-02-25
+modified: 2026-03-04
 tags:
 - kubernetes
 - argocd
@@ -36,6 +36,7 @@ Registry of all applications deployed via [[argocd]].
 | `teslamate` | teslamate | `argocd/manifests/teslamate/` | [[teslamate]] |
 | `cv` | cv | `argocd/manifests/cv/` | [[cv]] |
 | `forgejo-runner` | forgejo-runner | `argocd/manifests/forgejo-runner/` | [[forgejo]] CI |
+| `ollama` | ollama | `argocd/manifests/ollama/` | [[ollama]] |
 
 ## Sync Policies
@@ -1,6 +1,6 @@
 ---
 title: Reference
-modified: 2026-02-24
+modified: 2026-03-04
 tags:
 - reference
 ---
@@ -40,6 +40,7 @@ Individual service reference cards with URLs and configuration details.
 | [[authentik]] | OIDC identity provider | k8s (ringtail) |
 | [[docs]] | Documentation site (Quartz) | k8s |
 | [[flyio-proxy]] | Public reverse proxy (Fly.io + Tailscale) | Fly.io |
+| [[ollama]] | LLM inference server | k8s (ringtail) |
 | [[automounter]] | SMB share automounter | indri |
 
 ## Infrastructure
docs/reference/services/ollama.md (Normal file, 89 lines changed)
@@ -0,0 +1,89 @@
---
title: Ollama
modified: 2026-03-04
tags:
- service
- ai
---

# Ollama

LLM inference server with GPU acceleration. Runs on [[ringtail]] with declarative model management via a sidecar.

## Quick Reference

| Property | Value |
|----------|-------|
| **URL** | https://ollama.ops.eblu.me |
| **Tailscale URL** | https://ollama.tail8d86e.ts.net |
| **Namespace** | `ollama` |
| **Cluster** | ringtail k3s |
| **Image** | `ollama/ollama:0.17.5` |
| **Upstream** | https://github.com/ollama/ollama |
| **Manifests** | `argocd/manifests/ollama/` |
| **API Port** | 11434 |

## Architecture

```
models.txt (ConfigMap, declarative)
       │
       ▼
model-sync sidecar ──ollama pull──► Ollama server (GPU)
       │                                    │
       │ reads /config/models.txt           │ serves /api/*
       │ polls every 30 min                 │ NVIDIA runtime (RTX 4080, time-sliced)
       │                                    │
       └────────────────────────────────────┘
                         │
            /models (200 Gi hostPath PV)
            /mnt/storage1/ollama on ringtail
```

## Models

Declared in `argocd/manifests/ollama/models.txt`. The model-sync sidecar pulls missing models on startup and every 30 minutes.

| Model | Parameters |
|-------|------------|
| `qwen2.5:14b` | 14B |
| `deepseek-r1:14b` | 14B |
| `phi4:14b` | 14B |
| `gemma3:12b` | 12B |

To add or remove models, edit `models.txt` and sync via ArgoCD.
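
The reconciliation the sidecar performs can be sketched roughly as follows. This is a minimal Python illustration, not the sidecar's actual code: the helper names `parse_models` and `missing_models` are hypothetical, and the one-name-per-line format with `#` comments is an assumption about `models.txt`.

```python
def parse_models(text: str) -> list[str]:
    """Return model names from a models.txt-style file, one per line."""
    names: list[str] = []
    for line in text.splitlines():
        # Assumption: '#' starts a comment; blank lines are ignored.
        name = line.split("#", 1)[0].strip()
        if name and name not in names:
            names.append(name)
    return names


def missing_models(declared: list[str], installed: set[str]) -> list[str]:
    """Models that are declared but not yet present on the server."""
    return [name for name in declared if name not in installed]


declared = parse_models("qwen2.5:14b\ndeepseek-r1:14b\nphi4:14b\ngemma3:12b\n")
to_pull = missing_models(declared, installed={"qwen2.5:14b", "phi4:14b"})
print(to_pull)  # ['deepseek-r1:14b', 'gemma3:12b']
```

A real sync loop would presumably fill `installed` from the server's model list, run `ollama pull` for each returned name, and repeat after the 30-minute interval.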

## GPU

Shares [[ringtail]]'s RTX 4080 with [[frigate]] via NVIDIA device plugin time-slicing (2 virtual slots). Constrained to one loaded model and one parallel request to avoid VRAM contention.

| Setting | Value |
|---------|-------|
| `OLLAMA_MAX_LOADED_MODELS` | 1 |
| `OLLAMA_NUM_PARALLEL` | 1 |
| GPU limit | `nvidia.com/gpu: "1"` (time-sliced) |

## Storage

| Mount | Backend | Size |
|-------|---------|------|
| `/models` | hostPath PV (`/mnt/storage1/ollama`) | 200 Gi |

PV reclaim policy is `Retain`, so models survive PV deletion.

## Networking

| Endpoint | Reachable from |
|----------|----------------|
| `https://ollama.ops.eblu.me` | Public internet (Fly.io → Caddy) |
| `https://ollama.tail8d86e.ts.net` | Tailnet clients |
| `http://ollama.ollama.svc.cluster.local:11434` | In-cluster (ringtail) |

Tailscale ingress uses ProxyGroup `ingress`, with no explicit `host:` field (see [[tailscale-operator]]).
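
As a quick way to exercise one of these endpoints, the sketch below builds a non-streaming request to Ollama's `/api/generate` route using only the Python standard library. The helper names are hypothetical, the in-cluster URL is taken from the table above, and whether the public or tailnet endpoints need additional authentication is not covered here.

```python
import json
import urllib.request

# In-cluster address from the table above; swap in another endpoint as needed.
BASE_URL = "http://ollama.ollama.svc.cluster.local:11434"


def build_generate_request(
    model: str, prompt: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build a non-streaming POST to Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, base_url: str = BASE_URL) -> str:
    """Send the request and return the generated text (needs network access)."""
    request = build_generate_request(model, prompt, base_url)
    with urllib.request.urlopen(request, timeout=120) as resp:
        return json.load(resp)["response"]


# Building the request does not touch the network; generate() does.
req = build_generate_request("qwen2.5:14b", "Say hello in one sentence.")
print(req.get_method(), req.full_url)
```

Calling `generate("qwen2.5:14b", "...")` from a pod on ringtail would send the request. Because only one model stays loaded at a time, the first call after switching models may be slow while the new model loads.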

## Related

- [[frigate]]: shares the GPU via time-slicing
- [[ringtail]]: host node
- [[apps]]: ArgoCD application registry
- [[tailscale-operator]]: Tailscale ingress