Skip to content

Cloudflare tunnel

Cloudflare Tunnel (cloudflared) plan — Traefik external ingress

Goal: eliminate intermittent Cloudflare 523 Origin Unreachable events by removing inbound WAN reachability from the equation. We’ll do this by running a Cloudflare Tunnel (cloudflared) inside Kubernetes and routing public traffic through it to traefik-external.

This document is written as a step-by-step plan so it can be copied into Cursor Plan Mode later.


High-level decisions (what we’re optimizing for)

  • Single URL per app: keep https://<app>.haynesnetwork.com as the only URL (no “-internal” duplicates).
  • Phase 1 (simpler): no split DNS — even LAN clients will go out to Cloudflare and come back through the tunnel. This makes validation straightforward.
  • Phase 2: add monitoring so we can safely introduce split DNS later.
  • Phase 3: split DNS — LAN clients resolve the same public hostnames to traefik-external directly (bypassing Cloudflare), while WAN continues through the tunnel.

Background / current state (relevant to the 523 issue)

  • External services use IngressRoute objects with:
  • kubernetes.io/ingress.class: traefik-external
  • external-dns.alpha.kubernetes.io/target: haynesnetwork.com
  • external-dns-cloudflare is configured with sources: ["crd", "ingress", "traefik-proxy"] and --cloudflare-proxied.
  • This causes records like authentik.haynesnetwork.com to be published as CNAME haynesnetwork.com.
  • When Cloudflare returns 523, it means the selected Cloudflare POP couldn’t reach the origin (today: your WAN IP path).

With a tunnel, Cloudflare does not need to initiate a connection to your WAN IP. Cloudflare connects to a tunnel that your cloudflared pods keep open outbound.


Architecture overview

Without split DNS (Phase 1)

  • Client (LAN or WAN) → Cloudflare → Cloudflare Tunnel → cloudflared (in-cluster) → traefik-external → app Service

With split DNS (Phase 3)

  • WAN client → Cloudflare → tunnel → traefik-external → app Service
  • LAN client → LAN DNS override → traefik-external directly → app Service

Same hostname, different DNS answer depending on where you are.


References (read-up)

Cloudflare: - Cloudflare Tunnel (concepts + setup): https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/ - cloudflared configuration (config.yaml, ingress rules): https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/configure-tunnels/local-management/configuration-file/ - Public hostname routing for tunnels: https://developers.cloudflare.com/cloudflare-one/connections/connect-networks/routing-to-tunnel/

Traefik: - Forwarded headers / real client IP concepts (Traefik proxy behavior): https://doc.traefik.io/traefik/routing/entrypoints/#forwarded-headers

GitOps examples in this workspace: - onedr0p-home-ops Cloudflare tunnel app: - kubernetes/apps/network/cloudflare-tunnel/app/helmrelease.yaml - kubernetes/apps/network/cloudflare-tunnel/app/externalsecret.yaml - kubernetes/apps/network/cloudflare-tunnel/app/dnsendpoint.yaml - bjw-s-labs-home-ops cloudflared app: - kubernetes/apps/network/cloudflared/app/helmrelease.yaml - kubernetes/apps/network/cloudflared/app/externalsecret.yaml - kubernetes/apps/network/cloudflared/app/dnsEndpoint.yaml


Phase 0 — prerequisites (mostly manual / external systems)

0.1 Cloudflare prerequisites (manual)

You need: - A Cloudflare Zone for haynesnetwork.com - Cloudflare Zero Trust enabled (for Tunnel management) - A Tunnel created (name it something like haynes-ops-main)

Outputs we will use in Git: - Tunnel ID (UUID) - Tunnel secret (base64-ish secret used to build the token) - Account tag (Cloudflare account identifier)

Notes: - The Kubernetes deployment can authenticate either via a “tunnel token” or via credentials file. The example repos build the TUNNEL_TOKEN JSON and base64-encode it.

0.2 Secret management prerequisites (manual)

This repo uses External Secrets. We will store tunnel values in your secret backend (e.g., 1Password item): - cloudflare_tunnel_account_tag - cloudflare_tunnel_id - cloudflare_tunnel_secret

An agent cannot create or read these secrets safely without your secret store access.

0.3 Certificate issuance prerequisites (important)

If you plan to remove inbound port-forwards (80/443) as part of the tunnel migration, certificate issuance/renewal must not depend on inbound HTTP reachability (ACME HTTP-01).

  • This cluster already satisfies that requirement: cert-manager uses ACME DNS-01 via Cloudflare for Let’s Encrypt.
  • Warning: if you later change cert-manager Issuers to HTTP-01 (or otherwise remove/break Cloudflare DNS-01), it can silently become a dependency again and renewals may fail once ports are closed.

Phase 1 — deploy the tunnel in-cluster (no split DNS yet)

1.1 Add a cloudflared app to the network namespace (GitOps)

Pattern to follow: the onedr0p-home-ops cloudflare-tunnel app.

Create a new app under kubernetes/main/apps/network/ (proposed structure): - kubernetes/main/apps/network/cloudflare-tunnel/ks.yaml - kubernetes/main/apps/network/cloudflare-tunnel/app/helmrelease.yaml - kubernetes/main/apps/network/cloudflare-tunnel/app/externalsecret.yaml - kubernetes/main/apps/network/cloudflare-tunnel/app/dnsendpoint.yaml (Phase 1.3) - optionally GrafanaDashboard + ServiceMonitor like the examples

Implementation details (recommended defaults): - Replicas: 2 (avoid single-pod tunnel outages) - Health checks: /ready on metrics port (TUNNEL_METRICS) - Metrics: enable scrape via ServiceMonitor - Ingress rules: - Start with a wildcard for your external domain: - hostname: "*.haynesnetwork.com" - service: http://traefik-external.network.svc.cluster.local:80 - Default: http_status:404

Why HTTP to Traefik initially: - Avoids TLS/SNI complexity between cloudflared and Traefik during early validation. - Still end-to-end encrypted between client↔Cloudflare↔tunnel; the only plaintext hop is inside the cluster.

1.2 Reconcile and verify cloudflared is healthy (manual commands)

An agent can’t run kubectl from this environment; you’ll run these and paste results if needed:

  • Pods ready:
  • kubectl -n network get pods -l app.kubernetes.io/name=cloudflare-tunnel
  • Logs show “connected” / no auth errors:
  • kubectl -n network logs -l app.kubernetes.io/name=cloudflare-tunnel --tail=200 -f
  • Metrics endpoint responds (optional):
  • kubectl -n network port-forward svc/cloudflare-tunnel 8080:8080
  • then browse http://localhost:8080/ready

Success criteria: - Two pods Running/Ready - No repeated reconnect/auth failures

Rollback: - revert the Git commits that add the app, reconcile Flux

1.3 Create a stable DNS target name for external-dns (GitOps)

Today, many IngressRoutes publish CNAME haynesnetwork.com because of: external-dns.alpha.kubernetes.io/target: haynesnetwork.com

With a tunnel, we want external routes to publish: - CNAME ingress-ext.haynesnetwork.com (example name)

Then we create one record: - ingress-ext.haynesnetwork.com${TUNNEL_ID}.cfargotunnel.com (CNAME)

This is exactly what the example repos do with a DNSEndpoint CR: - onedr0p-home-ops: external.turbo.ac${CLOUDFLARE_TUNNEL_ID}.cfargotunnel.com - bjw-s-labs-home-ops: ingress-ext.bjw-s.dev<tunnel_id>.cfargotunnel.com

In this repo: - Add a DNSEndpoint in network namespace (or wherever your external-dns-cloudflare watches CRDs) that creates: - dnsName: ingress-ext.haynesnetwork.com - recordType: CNAME - targets: ["<tunnel_id>.cfargotunnel.com"]

Success criteria: - Cloudflare DNS shows the CNAME for ingress-ext.haynesnetwork.com

1.4 Update external IngressRoute targets to point at ingress-ext

For each external IngressRoute currently using: external-dns.alpha.kubernetes.io/target: haynesnetwork.com

Change to: external-dns.alpha.kubernetes.io/target: ingress-ext.haynesnetwork.com

Known locations (non-exhaustive; expand as you implement): - kubernetes/main/apps/network/authentik/app/ingressroute.yaml - kubernetes/main/apps/photos/immich/server/ingressroute.yaml - kubernetes/main/apps/office/paperless-ngx/app/ingressroute.yaml - kubernetes/main/apps/network/traefik/config/ingress-routes/traefik-external/*

Success criteria: - External-DNS records for apps become CNAME ingress-ext.haynesnetwork.com (not haynesnetwork.com)

Rollback: - revert the annotation changes (records will revert)

1.5 “No split DNS” validation (manual)

From LAN: - confirm https://<app>.haynesnetwork.com works - optionally verify Cloudflare headers present (means you went through Cloudflare)

From WAN/cellular: - confirm the same URLs work

Optional: temporarily remove/disable your router port forwards for 80/443 (after you are confident).


Phase 2 — Monitoring strategy for tunnel-first + split DNS later

We want to detect two classes of failure:

1) Public path down (Cloudflare/tunnel/external) 2) LAN path down while public is up (or vice-versa) once split DNS is introduced

2.1 Gatus checks: Public (Cloudflare/tunnel) path

Add (or keep) endpoints like: - url: https://authentik.haynesnetwork.com/

These validate: - Cloudflare edge - tunnel connectivity - Traefik routing - app health (at least HTTP-level)

2.2 Gatus checks: “LAN direct” path (even before split DNS)

Because Gatus runs inside the cluster, it won’t automatically use your LAN DNS overrides later. So for “LAN path” checks, don’t rely on DNS — test the internal routing path directly.

Recommended approach: - Send HTTP to the in-cluster Traefik service - Set the Host header to the public hostname so Traefik routes the same as external

Example pattern (conceptual): - url: http://traefik-external.network.svc.cluster.local/ - headers: { Host: authentik.haynesnetwork.com } - condition: status code is 200/302/etc depending on the app

This validates: - Traefik is reachable - Traefik is routing that hostname correctly - Upstream service is reachable

It does not validate: - Cloudflare edge behavior - tunnel connectivity

2.3 DNS monitoring for split DNS (future)

Once you introduce split DNS via Unifi: - Add Gatus DNS endpoints that query Unifi DNS and assert: - authentik.haynesnetwork.com resolves to 192.168.40.206 (Traefik external VIP) - Add Gatus DNS endpoints that query a public resolver (e.g. 1.1.1.1) and assert: - authentik.haynesnetwork.com resolves to the tunnel target (CNAME chain to cfargotunnel.com)

This gives quick visibility: - “Public DNS correct?” - “LAN DNS override correct?”

Agent constraint: - configuring Unifi DNS is outside cluster GitOps unless you manage it via external-dns-unifi (see Phase 3).


Phase 3 — Introduce split DNS (after tunnel is proven stable)

3.1 Decide how to manage Unifi DNS overrides

Options: - Option A (manual): create Unifi DNS host overrides for selected external hostnames - Option B (GitOps): extend/adjust external-dns-unifi so it can manage haynesnetwork.com LAN overrides - currently it is scoped to haynesops.com and excludes haynesnetwork.com - changing this requires care to avoid record ownership collisions

3.2 Implement split DNS (manual or GitOps)

For each external hostname you want “LAN direct”: - set Unifi DNS A record to 192.168.40.206 (Traefik external VIP)

3.3 Verify split DNS with the Phase 2 monitoring in place

Success criteria: - Public Gatus checks remain green - LAN-direct (Traefik service + Host header) checks remain green - DNS checks show: - public: tunnel route - LAN: VIP route

Rollback: - remove Unifi overrides (LAN returns to public DNS/tunnel)


Implementation notes / gotchas (Traefik + Cloudflare specifics)

  • Client IPs:
  • With tunnel, Traefik will see the immediate client as cloudflared.
  • Apps should rely on CF-Connecting-IP / X-Forwarded-For for real client IP.
  • You may want to configure Traefik forwarded header trust appropriately (don’t blindly trust all in-cluster sources).

  • TLS between cloudflared → Traefik:

  • Start with HTTP to Traefik to validate the tunnel quickly.
  • Later, you can tighten to HTTPS origin if desired (requires SNI / cert alignment).

  • external-dns record shape:

  • The easiest migration is changing the external-dns .../target annotation from haynesnetwork.com to ingress-ext.haynesnetwork.com so every app record repoints cleanly.

Work breakdown (copy/paste into Plan Mode)

Phase 1: tunnel deployed, no split DNS

  • Create cloudflare-tunnel app manifests (HelmRelease + ExternalSecret + config.yaml)
  • Add DNSEndpoint for ingress-ext.haynesnetwork.com<tunnel_id>.cfargotunnel.com
  • Update external IngressRoute annotations to target ingress-ext.haynesnetwork.com
  • Reconcile Flux and validate from LAN + WAN
  • Optionally close router 80/443 port-forwards after confidence

Phase 2: monitoring foundation

  • Add Gatus endpoints for public URLs (tunnel path)
  • Add Gatus endpoints that hit Traefik service directly with Host: header (LAN-direct path)
  • Add DNS assertions (public resolver vs Unifi resolver) once split DNS is introduced

Phase 3: split DNS

  • Choose manual vs GitOps for Unifi DNS overrides
  • Implement overrides for selected hostnames
  • Validate with Gatus (public + lan-direct + DNS)