Gatus deployment alignment (matching example-ops/onedr0p-home-ops)
This repo historically deployed Gatus using:
- A `HelmRelease` based on `app-template` v3
- A sidecar (`k8s-sidecar`) to watch `gatus.io/enabled` ConfigMaps across namespaces
- Postgres (`cloudnative-pg`) + init container (`postgres-init`)
- Flux global substitution/SOPS injection (repo-wide default behavior)
The reference repo (example-ops/onedr0p-home-ops) has moved to a simpler pattern:
- SQLite on a local PVC
- `gatus-sidecar` to auto-discover endpoints from Kubernetes resources
- Per-app `OCIRepository` + `HelmRelease.spec.chartRef`
- Explicitly disabling Flux substitution for the generated configmap so `${VAR}` remains for Gatus runtime env var expansion
Why this matters for “substitution patterns”
Gatus config supports ${ENV_VAR} expansion at runtime.
In haynes-ops, Flux currently injects substitution defaults globally, which means `${VAR}` inside generated ConfigMaps can be resolved by Flux's post-build substitution before Gatus ever sees it. The reference repo avoids this by adding:

    generatorOptions:
      annotations:
        kustomize.toolkit.fluxcd.io/substitute: disabled

We now do the same for Gatus configmaps.
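
As a rough sketch of where that annotation sits, the fragment below shows a `kustomization.yaml` that generates the Gatus ConfigMap with substitution disabled. Only the `generatorOptions` annotation comes from the text above; the file paths, the ConfigMap name `gatus-configmap`, and the `disableNameSuffixHash` setting are assumptions for illustration and may not match either repo.

```yaml
# kustomization.yaml (illustrative; paths and names are assumptions)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ./helmrelease.yaml
configMapGenerator:
  - name: gatus-configmap                     # hypothetical name
    files:
      - config.yaml=./resources/config.yaml   # hypothetical path
generatorOptions:
  disableNameSuffixHash: true                 # assumption: keeps the ConfigMap name stable
  annotations:
    # Tells Flux to skip post-build substitution for the generated objects,
    # so ${VAR} placeholders reach Gatus untouched.
    kustomize.toolkit.fluxcd.io/substitute: disabled
```

With this in place, a placeholder such as `${WEB_PORT}` in the Gatus config (variable name hypothetical) is left for Gatus to expand from its container environment rather than being resolved by Flux.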
Target pattern (what we’re aligning to)
- Storage: SQLite (no Postgres dependency)
- Controller: StatefulSet + `volumeClaimTemplates` for `/config`
- Endpoint discovery: `gatus-sidecar` init container
- Chart sourcing: `HelmRelease.spec.chartRef` -> per-app `OCIRepository` (named `gatus`)
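
The target pattern above could look roughly like the following pair of manifests. This is a minimal sketch, not the actual manifests from either repo: the chart URL, chart and image tags, intervals, and PVC size are assumptions or placeholders, and the `gatus-sidecar` init container is left out because its configuration is not described here. The `observability` namespace matches the one used in the rollout commands.

```yaml
---
apiVersion: source.toolkit.fluxcd.io/v1   # may be v1beta2 on older Flux releases
kind: OCIRepository
metadata:
  name: gatus                             # per-app repository, named after the app
  namespace: observability
spec:
  interval: 15m                           # assumption
  layerSelector:
    mediaType: application/vnd.cncf.helm.chart.content.v1.tar+gzip
    operation: copy
  ref:
    tag: 4.0.0                            # placeholder; pin the app-template v4 release you use
  url: oci://ghcr.io/bjw-s-labs/helm/app-template   # assumed chart location
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: gatus
  namespace: observability
spec:
  interval: 1h                            # assumption
  chartRef:                               # replaces spec.chart; points at the OCIRepository above
    kind: OCIRepository
    name: gatus
  values:
    controllers:
      gatus:
        type: statefulset                 # StatefulSet instead of Deployment
        statefulset:
          volumeClaimTemplates:
            - name: config                # yields a PVC named config-gatus-0
              accessMode: ReadWriteOnce
              size: 1Gi                   # placeholder size
              globalMounts:
                - path: /config           # SQLite database lives here
        containers:
          app:
            image:
              repository: ghcr.io/twin/gatus
              tag: latest                 # placeholder; pin a real version
```

Because the PVC comes from the `config` template rather than a pre-created claim, no separate `PersistentVolumeClaim` manifest is needed for this layout.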
Operational implications / gotchas
- PVC naming changes: with `volumeClaimTemplates` the created PVC name is derived from the template + StatefulSet name (e.g. `config-gatus-0`), not a stable `gatus` PVC name.
  - This does not prevent VolSync, but it does make generic “`${APP}`-named claim” templates harder.
- Chart upgrade risk: switching from `app-template` v3 to v4 (via `chartRef` to an `OCIRepository`) can trigger immutable-field issues for Deployments.
  - For Gatus we change controller type to StatefulSet anyway; plan for a delete/recreate of old workloads if Helm can’t mutate cleanly.
- Old PVC cleanup: if a previous `gatus` PVC exists (from the “existingClaim” pattern), it may become unused after the switch and should be cleaned up deliberately once you’re satisfied with the new deployment.
Rollout steps (main cluster)
- Ensure the new manifests are committed (GitOps source of truth).
- Reconcile the app:
flux reconcile kustomization gatus -n flux-system --with-source
flux reconcile helmrelease gatus -n observability --with-source
- If Helm gets stuck and the old workload blocks the new controller type:
  - Suspend the HelmRelease, delete the old Deployment/StatefulSet, resume and reconcile (same pattern as the `app-template` v3 -> v4 remediation).
Follow-ups (optional)
- Decide whether we still want the legacy “labelled ConfigMap endpoint injection” (`gatus.io/enabled`) approach anywhere.
  - If the `gatus-sidecar` approach is sufficient, we can phase out `kubernetes/shared/templates/gatus/*` over time.