Cloudflare tunnel
Cloudflare Tunnel (cloudflared) plan — Traefik external ingress
Goal: eliminate intermittent Cloudflare 523 Origin Unreachable events by removing inbound WAN reachability from the equation. We’ll do this by running a Cloudflare Tunnel (cloudflared) inside Kubernetes and routing public traffic through it to traefik-external.
This document is written as a step-by-step plan so it can be copied into Cursor Plan Mode later.
Status (2026-02-27)
- Implemented: tunnel connector deployed in-cluster, stable DNS target created (
ingress-ext.haynesnetwork.com), and public app hostnames repointed toingress-extvia External-DNS. - Key learning: Cloudflare “wizard” style tunnels (
docker run ... --token ...) and “Published application” routes can make the tunnel remotely managed and override localconfig.yamlingress rules (including forcinghttp_status:404). The working pattern is a locally-managed token built from account tag + tunnel id + tunnel secret (the same pattern used in the example repos).
High-level decisions (what we’re optimizing for)
- Single URL per app: keep
https://<app>.haynesnetwork.comas the only URL (no “-internal” duplicates). - Phase 1 (simpler): no split DNS — even LAN clients will go out to Cloudflare and come back through the tunnel. This makes validation straightforward.
- Phase 2: add monitoring so we can safely introduce split DNS later.
- Phase 3: split DNS — LAN clients resolve the same public hostnames to
traefik-externaldirectly (bypassing Cloudflare), while WAN continues through the tunnel.
Background / current state (relevant to the 523 issue)
- External services use
IngressRouteobjects with: kubernetes.io/ingress.class: traefik-externalexternal-dns.alpha.kubernetes.io/target: haynesnetwork.comexternal-dns-cloudflareis configured withsources: ["crd", "ingress", "traefik-proxy"]and--cloudflare-proxied.- This causes records like
authentik.haynesnetwork.comto be published asCNAME haynesnetwork.com. - When Cloudflare returns
523, it means the selected Cloudflare POP couldn’t reach the origin (today: your WAN IP path).
With a tunnel, Cloudflare does not need to initiate a connection to your WAN IP. Cloudflare connects to a tunnel that your cloudflared pods keep open outbound.
Architecture overview
Without split DNS (Phase 1)
- Client (LAN or WAN) → Cloudflare → Cloudflare Tunnel →
cloudflared(in-cluster) →traefik-external→ app Service
With split DNS (Phase 3)
- WAN client → Cloudflare → tunnel →
traefik-external→ app Service - LAN client → LAN DNS override →
traefik-externaldirectly → app Service
Same hostname, different DNS answer depending on where you are.
References (read-up)
Cloudflare:
- Cloudflare Tunnel (concepts + setup): https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/
- cloudflared configuration (config.yaml, ingress rules): https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/configure-tunnels/local-management/configuration-file/
- Public hostname routing for tunnels: https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/routing-to-tunnel/
- Routes UI terminology (“Published application”, “Private hostname”, etc): https://developers.cloudflare.com/cloudflare-one/networks/routes/
Traefik:
- Forwarded headers / real client IP concepts (Traefik proxy behavior): https://doc.traefik.io/traefik/routing/entrypoints/#forwarded-headers
GitOps examples in this workspace:
- onedr0p-home-ops Cloudflare tunnel app:
- kubernetes/apps/network/cloudflare-tunnel/app/helmrelease.yaml
- kubernetes/apps/network/cloudflare-tunnel/app/externalsecret.yaml
- kubernetes/apps/network/cloudflare-tunnel/app/dnsendpoint.yaml
- bjw-s-labs-home-ops cloudflared app:
- kubernetes/apps/network/cloudflared/app/helmrelease.yaml
- kubernetes/apps/network/cloudflared/app/externalsecret.yaml
- kubernetes/apps/network/cloudflared/app/dnsEndpoint.yaml
Phase 0 — prerequisites (mostly manual / external systems)
0.1 Cloudflare prerequisites (manual)
You need:
- A Cloudflare Zone for haynesnetwork.com
- Cloudflare Zero Trust enabled (for Tunnel management)
- A Tunnel created (name it something like haynes-ops-main)
Outputs we will use in Git: - Tunnel ID (UUID) - Tunnel secret (base64-ish secret used to build the token) - Account tag (Cloudflare account identifier)
Notes:
- Prefer creating the tunnel via cloudflared tunnel create --credentials-file cloudflare-tunnel.json ... so you can reliably extract AccountTag, TunnelID, and TunnelSecret.
- Avoid relying on the “run with Docker token” wizard token for GitOps; it leads to remotely-managed tunnel config which can override cloudflared ingress rules from config.yaml.
0.2 Secret management prerequisites (manual)
This repo uses External Secrets (1Password Connect). Store tunnel values in 1Password under item cloudflared:
CLOUDFLARE_ACCOUNT_TAGCLOUDFLARE_TUNNEL_IDCLOUDFLARE_TUNNEL_SECRET
Optional (not used by the locally-managed pattern):
CLOUDFLARED_TUNNEL_TOKEN(the dashboard/docker run token)CLOUDFLARED_API_TOKEN(useful for read-only inspection; not required for the in-cluster connector)
An agent cannot create or read these secrets safely without your secret store access.
0.3 Certificate issuance prerequisites (important)
If you plan to remove inbound port-forwards (80/443) as part of the tunnel migration, certificate issuance/renewal must not depend on inbound HTTP reachability (ACME HTTP-01).
- This cluster already satisfies that requirement:
cert-manageruses ACME DNS-01 via Cloudflare for Let’s Encrypt. - Warning: if you later change
cert-managerIssuers to HTTP-01 (or otherwise remove/break Cloudflare DNS-01), it can silently become a dependency again and renewals may fail once ports are closed.
Phase 1 — deploy the tunnel in-cluster (no split DNS yet)
1.1 Add a cloudflared app to the network namespace (GitOps)
Pattern to follow: the onedr0p-home-ops cloudflare-tunnel app.
Create a new app under kubernetes/main/apps/network/ (proposed structure):
- kubernetes/main/apps/network/cloudflare-tunnel/ks.yaml
- kubernetes/main/apps/network/cloudflare-tunnel/app/helmrelease.yaml
- kubernetes/main/apps/network/cloudflare-tunnel/app/externalsecret.yaml
- kubernetes/main/apps/network/cloudflare-tunnel/app/dnsendpoint.yaml (Phase 1.3)
- optionally GrafanaDashboard + ServiceMonitor like the examples
Implementation details (recommended defaults):
- Replicas: 2 (avoid single-pod tunnel outages)
- Health checks: /ready on metrics port (TUNNEL_METRICS)
- Metrics: enable scrape via ServiceMonitor
- Ingress rules:
- Start with a wildcard for your external domain:
- hostname: "*.haynesnetwork.com"
- service: https://traefik-external.network.svc.cluster.local:443
- Default: http_status:404
Important gotcha (Traefik redirect loop):
- Routing to Traefik over
http://...:80causes redirect loops (ERR_TOO_MANY_REDIRECTS) becausetraefik-externalredirectsweb→websecure. - Prefer
https://...:443and configureoriginRequestappropriately: - simplest bring-up:
noTLSVerify: true - better end-state: set
originServerName: ingress-ext.haynesnetwork.comand removenoTLSVerifyonce validated
1.2 Reconcile and verify cloudflared is healthy (manual commands)
You can validate using kubectl locally; cluster state can also be inspected via Kubernetes MCP (pods/logs/events) when available.
- Pods ready:
kubectl -n network get pods -l app.kubernetes.io/name=cloudflare-tunnel- Logs show “connected” / no auth errors:
kubectl -n network logs -l app.kubernetes.io/name=cloudflare-tunnel --tail=200 -f- Metrics endpoint responds (optional):
kubectl -n network port-forward svc/cloudflare-tunnel 8080:8080- then browse
http://localhost:8080/ready
Success criteria: - Two pods Running/Ready - No repeated reconnect/auth failures
Rollback: - revert the Git commits that add the app, reconcile Flux
1.3 Create a stable DNS target name for external-dns (GitOps)
Today, many IngressRoutes publish CNAME haynesnetwork.com because of:
external-dns.alpha.kubernetes.io/target: haynesnetwork.com
With a tunnel, we want external routes to publish:
- CNAME ingress-ext.haynesnetwork.com (example name)
Then we create one record:
- ingress-ext.haynesnetwork.com → ${TUNNEL_ID}.cfargotunnel.com (CNAME)
This is exactly what the example repos do with a DNSEndpoint CR:
- onedr0p-home-ops: external.turbo.ac → ${CLOUDFLARE_TUNNEL_ID}.cfargotunnel.com
- bjw-s-labs-home-ops: ingress-ext.bjw-s.dev → <tunnel_id>.cfargotunnel.com
In this repo:
- Add a DNSEndpoint in network namespace (or wherever your external-dns-cloudflare watches CRDs) that creates:
- dnsName: ingress-ext.haynesnetwork.com
- recordType: CNAME
- targets: ["<tunnel_id>.cfargotunnel.com"]
Success criteria:
- Cloudflare DNS shows the CNAME for ingress-ext.haynesnetwork.com
Important gotcha (Error 1033):
- If you delete/recreate the tunnel, the Tunnel ID changes.
- If
ingress-ext.haynesnetwork.comstill points at the old<tunnel_id>.cfargotunnel.com, Cloudflare returns Error 1033 (“unable to resolve”). - Fix by updating the
DNSEndpointtarget to the new tunnel id and letting External-DNS reconcile.
1.4 Update external IngressRoute targets to point at ingress-ext
For each external IngressRoute currently using:
external-dns.alpha.kubernetes.io/target: haynesnetwork.com
Change to:
external-dns.alpha.kubernetes.io/target: ingress-ext.haynesnetwork.com
Known locations (non-exhaustive; expand as you implement):
- kubernetes/main/apps/network/authentik/app/ingressroute.yaml
- kubernetes/main/apps/photos/immich/server/ingressroute.yaml
- kubernetes/main/apps/office/paperless-ngx/app/ingressroute.yaml
- kubernetes/main/apps/network/traefik/config/ingress-routes/traefik-external/*
Success criteria:
- External-DNS records for apps become CNAME ingress-ext.haynesnetwork.com (not haynesnetwork.com)
Rollback: - revert the annotation changes (records will revert)
1.5 “No split DNS” validation (manual)
From LAN:
- confirm https://<app>.haynesnetwork.com works
- optionally verify Cloudflare headers present (means you went through Cloudflare)
From WAN/cellular: - confirm the same URLs work
Optional: temporarily remove/disable your router port forwards for 80/443 (after you are confident).
Phase 2 — Monitoring strategy for tunnel-first + split DNS later
We want to detect two classes of failure:
1) Public path down (Cloudflare/tunnel/external) 2) LAN path down while public is up (or vice-versa) once split DNS is introduced
2.1 Gatus checks: Public (Cloudflare/tunnel) path
Add (or keep) endpoints like:
- url: https://authentik.haynesnetwork.com/
These validate: - Cloudflare edge - tunnel connectivity - Traefik routing - app health (at least HTTP-level)
2.2 Gatus checks: “LAN direct” path (even before split DNS)
Because Gatus runs inside the cluster, it won’t automatically use your LAN DNS overrides later. So for “LAN path” checks, don’t rely on DNS — test the internal routing path directly.
Recommended approach:
- Send HTTP to the in-cluster Traefik service
- Set the Host header to the public hostname so Traefik routes the same as external
Example pattern (conceptual):
- url: http://traefik-external.network.svc.cluster.local/
- headers: { Host: authentik.haynesnetwork.com }
- condition: status code is 200/302/etc depending on the app
This validates: - Traefik is reachable - Traefik is routing that hostname correctly - Upstream service is reachable
It does not validate: - Cloudflare edge behavior - tunnel connectivity
2.3 DNS monitoring for split DNS (future)
Once you introduce split DNS via Unifi:
- Add Gatus DNS endpoints that query Unifi DNS and assert:
- authentik.haynesnetwork.com resolves to 192.168.40.206 (Traefik external VIP)
- Add Gatus DNS endpoints that query a public resolver (e.g. 1.1.1.1) and assert:
- authentik.haynesnetwork.com resolves to the tunnel target (CNAME chain to cfargotunnel.com)
This gives quick visibility: - “Public DNS correct?” - “LAN DNS override correct?”
Agent constraint:
- configuring Unifi DNS is outside cluster GitOps unless you manage it via external-dns-unifi (see Phase 3).
Phase 3 — Introduce split DNS (after tunnel is proven stable)
3.1 Decide how to manage Unifi DNS overrides
Options:
- Option A (manual): create Unifi DNS host overrides for selected external hostnames
- Option B (GitOps): extend/adjust external-dns-unifi so it can manage haynesnetwork.com LAN overrides
- currently it is scoped to haynesops.com and excludes haynesnetwork.com
- changing this requires care to avoid record ownership collisions
3.2 Implement split DNS (manual or GitOps)
For each external hostname you want “LAN direct”:
- set Unifi DNS A record to 192.168.40.206 (Traefik external VIP)
3.3 Verify split DNS with the Phase 2 monitoring in place
Success criteria: - Public Gatus checks remain green - LAN-direct (Traefik service + Host header) checks remain green - DNS checks show: - public: tunnel route - LAN: VIP route
Rollback: - remove Unifi overrides (LAN returns to public DNS/tunnel)
Implementation notes / gotchas (Traefik + Cloudflare specifics)
- Client IPs:
- With tunnel, Traefik will see the immediate client as
cloudflared. - Apps should rely on
CF-Connecting-IP/X-Forwarded-Forfor real client IP. -
You may want to configure Traefik forwarded header trust appropriately (don’t blindly trust all in-cluster sources).
-
TLS between cloudflared → Traefik:
- Prefer HTTPS to Traefik (
:443) to avoid redirect loops. - Bring-up can use
noTLSVerify: true. -
Harden by setting
originServerName: ingress-ext.haynesnetwork.com(SNI) once validated. -
external-dns record shape:
- The easiest migration is changing the
external-dns .../targetannotation fromhaynesnetwork.comtoingress-ext.haynesnetwork.comso every app record repoints cleanly.
Work breakdown (copy/paste into Plan Mode)
Phase 1: tunnel deployed, no split DNS
- Create
cloudflare-tunnelapp manifests (HelmRelease + ExternalSecret + config.yaml) - Add
DNSEndpointforingress-ext.haynesnetwork.com→<tunnel_id>.cfargotunnel.com - Update external
IngressRouteannotations to targetingress-ext.haynesnetwork.com - Reconcile Flux and validate from LAN + WAN
- Optionally close router 80/443 port-forwards after confidence
What we actually changed in haynes-ops (implemented)
- Added
kubernetes/main/apps/network/cloudflare-tunnel/(FluxKustomization+ app manifests) - Added
DNSEndpointforingress-ext.haynesnetwork.com - Repointed public app records by changing
external-dns.alpha.kubernetes.io/targetfromhaynesnetwork.com→ingress-ext.haynesnetwork.com(Plex, Authentik, Immich, Paperless, Open WebUI, and Traefik external ingressroutes) - Added
/cloudflare-tunnel.jsonto.gitignore(local credentials artifact)
Phase 2: monitoring foundation
- Add Gatus endpoints for public URLs (tunnel path)
- Add Gatus endpoints that hit Traefik service directly with
Host:header (LAN-direct path) - Add DNS assertions (public resolver vs Unifi resolver) once split DNS is introduced
Phase 3: split DNS
- Choose manual vs GitOps for Unifi DNS overrides
- Implement overrides for selected hostnames
- Validate with Gatus (public + lan-direct + DNS)
Remaining TODOs (validation + WAN cleanup)
Validate tunnel viability (before removing rollback options)
- Verify from WAN/cellular:
https://authentik.haynesnetwork.comhttps://ai.haynesnetwork.comhttps://immich.haynesnetwork.com- Confirm
cloudflare-tunnelpods stay connected (no flapping) and External-DNS remains stable. - Add/verify Gatus checks (public URL checks + “LAN direct” Traefik service checks with Host headers).
WAN / router cleanup (recommended: disable first, delete later)
- Disable (don’t delete yet) UniFi port forwards for inbound 80/443 (and any other “public ingress” forwards).
- After 24–48h of confidence, delete the forwards.
Cloudflare DNS cleanup (WAN public IP records)
- Decide what to do with existing A records that point to your WAN IP (e.g. apex
haynesnetwork.com,www): - keep temporarily for rollback
- or remove/disable once tunnel is proven
- or repoint to the tunnel pattern (Cloudflare supports proxied CNAME/flattening)
- Review
kubernetes/main/apps/network/cloudflare-ddns/: - If you remove WAN ingress,
cloudflare-ddnsmay no longer be needed for public app traffic. - Decide whether to keep it for other records, or disable it to prevent it from continuing to update WAN-IP A records unnecessarily.