CF Access runbook
System: Cloudflare Zero Trust Access (all Raxx-managed applications)
Owner: sre-agent / operator
Last incident: 2026-06-12 (incomplete bypass set blocked beta invite flow; see 2026-06-12-beta-invite-cf-access-blocks.md)
Last reviewed: 2026-06-13
Related runbooks
- docs/ops/runbooks/getraxx-launch-day-cf-access-removal.md: launch-day removal procedure for the getraxx.com beta gate
- docs/ops/runbooks/terraform-cf-access-state-imports.md: import commands for the terraform/cf-access/ stack (console, vault, freescout service tokens)
- docs/ops/runbooks/cf-access-service-token-provisioning.md: minting CF Access service tokens for machine-to-machine routes
How to tell CF Access is broken
- curl -I https://<gated-surface>/ returns HTTP/2 200 when it should return HTTP/2 302 (gate removed accidentally).
- curl -I https://<gated-surface>/ returns HTTP/2 302 after the gate was intentionally removed (gate not fully destroyed).
- terraform plan reports resources to add/destroy when you expect No changes (state drift).
- CF Zero Trust dashboard shows an application that Terraform thinks is gone, or vice versa.
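The first two symptoms can be checked mechanically. The sketch below is a minimal illustration, not part of the existing tooling: the function names are hypothetical, and it assumes that "gated" means a 302 whose Location header points at cloudflareaccess.com (a 302 to anywhere else is treated as the application's own redirect).

```python
from typing import Optional


def gate_state(status: int, location: Optional[str]) -> str:
    """Classify one response: 'gated' if CF Access redirected it, else 'open'."""
    if status == 302 and location and "cloudflareaccess.com" in location:
        return "gated"
    return "open"


def gate_problem(should_be_gated: bool, status: int,
                 location: Optional[str]) -> Optional[str]:
    """Return the matching symptom from the list above, or None if healthy."""
    state = gate_state(status, location)
    if should_be_gated and state == "open":
        return "gate removed accidentally"
    if not should_be_gated and state == "gated":
        return "gate not fully destroyed"
    return None
```

Feed it the status code and Location header from the curl -I output for each surface.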
How to diagnose (in order)
- Check live CF Access state via API:

  ```bash
  export CLOUDFLARE_API_TOKEN=$(infisical secrets get CF_ACCESS_MGMT \
    --path /MooseQuest/cloudflare/ --plain)
  # a.get('domain','') avoids a KeyError for apps that have no domain field
  curl -sS -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
    "https://api.cloudflare.com/client/v4/accounts/22b5c35090724fbf05db6d4f501ac821/access/apps" \
    | python3 -c "import sys,json; [print(a['id'], a['name'], a.get('domain','')) for a in json.load(sys.stdin).get('result',[])]"
  ```

  Compare output to what Terraform state believes exists.

- Check Terraform state:

  ```bash
  cd terraform/modules/cf-access-getraxx
  terraform state list
  ```

- Run a plan against live state:

  ```bash
  export CLOUDFLARE_API_TOKEN=$(infisical secrets get CF_ACCESS_MGMT \
    --path /MooseQuest/cloudflare/ --plain)
  export TF_VAR_cf_access_account_id=$(infisical secrets get CF_ACCESS_ACCOUNT_ID_MOOSEQUEST \
    --path /MooseQuest/cloudflare/ --plain)
  terraform plan
  ```

  A No changes plan means state is in sync. Anything else is drift.
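The comparison between live CF state and Terraform state can be reduced to a set difference. This is a rough sketch under assumptions: diff_apps is a hypothetical helper, the live IDs come from the API listing above, and the state IDs are assumed to have been collected separately (for example from terraform show -json). Apps that exist only in CF are not necessarily a fault, since the bypass apps in this runbook are provisioned via the CF API.

```python
from typing import Dict, Iterable, List


def diff_apps(live_ids: Iterable[str],
              state_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Split app IDs into state orphans and apps Terraform does not track."""
    live, state = set(live_ids), set(state_ids)
    return {
        # In Terraform state but gone from CF: candidates for Failure mode A.
        "orphaned_in_state": sorted(state - live),
        # In CF but not in this state: created by dashboard, API, or another stack.
        "unmanaged_in_cf": sorted(live - state),
    }
```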
Known failure modes
Failure mode A: Dashboard delete created Terraform state orphan (#2849)
Symptom: terraform plan shows resources to destroy that no longer exist in CF (e.g.,
cloudflare_zero_trust_access_application.getraxx_beta, cloudflare_zero_trust_access_policy.getraxx_beta_invitees,
cloudflare_ruleset.www_to_apex_redirect). The CF dashboard shows these apps are gone.
Cause: The operator deleted one or more CF Access resources directly via the CF Zero Trust
dashboard (or via direct API call), bypassing Terraform. Terraform state still references these
resources; terraform plan plans to destroy things that are already gone, which would either
no-op or error during apply.
Recommendation: terraform state rm (not terraform import) when the deletion was
intentional and the resource should not be re-created. Use terraform import when the
resource was accidentally deleted and must be re-created and re-imported.
Decision for #2849: The operator deleted the getraxx-beta CF Access app intentionally
on 2026-05-27 UTC as part of the v1 launch procedure (Path B in
docs/ops/runbooks/getraxx-launch-day-cf-access-removal.md). The module
terraform/modules/cf-access-getraxx/ should be kept for reference but its state entries
must be removed. Use the terraform state rm path below.
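The recommendation above, combined with the stop condition in Step 2 below, amounts to a three-way decision. A minimal sketch (choose_remediation is a hypothetical name, and the returned strings are labels rather than commands to execute):

```python
def choose_remediation(still_in_cf: bool, deletion_intentional: bool) -> str:
    """Pick the state-repair path for a resource that Terraform still tracks."""
    if still_in_cf:
        # The resource is live; removing it from state would orphan it.
        return "stop and investigate"
    if deletion_intentional:
        return "terraform state rm"
    # Accidental delete: re-create the resource, then bring it back into state.
    return "re-create, then terraform import"
```

For #2849 the inputs are still_in_cf=False and deletion_intentional=True, which yields the terraform state rm path.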
Fix — operator-action required (do NOT run autonomously):
Step 1: Source credentials
```bash
export CLOUDFLARE_API_TOKEN=$(infisical secrets get CF_ACCESS_MGMT \
  --path /MooseQuest/cloudflare/ --plain)
export TF_VAR_cf_access_account_id=$(infisical secrets get CF_ACCESS_ACCOUNT_ID_MOOSEQUEST \
  --path /MooseQuest/cloudflare/ --plain)
```
Step 2: Confirm the resources are gone in CF
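```bash
curl -sS -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  "https://api.cloudflare.com/client/v4/accounts/22b5c35090724fbf05db6d4f501ac821/access/apps" \
  | python3 -c "
import sys, json
data = json.load(sys.stdin)
# A failed API call must never be read as 'not found'.
if not data.get('success'):
    sys.exit('CF API call failed, do not proceed: ' + json.dumps(data.get('errors')))
apps = data.get('result') or []
# Match on name or domain so a renamed app is still caught.
getraxx = [a for a in apps if 'getraxx' in a.get('name','').lower() or 'getraxx.com' in a.get('domain','')]
if getraxx:
    print('FOUND — not safe to rm:')
    for a in getraxx:
        print(' ', a['id'], a.get('name',''))
else:
    print('Not found in CF — safe to terraform state rm')
"
```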
Expected output if the delete succeeded: Not found in CF — safe to terraform state rm
If the resources are still present in CF, STOP. Do NOT run terraform state rm — the resource
is live. Investigate why the dashboard delete did not propagate (CF rate limit, partial delete).
Step 3: Inspect current Terraform state
```bash
cd terraform/modules/cf-access-getraxx
terraform state list
```
Expected output (three orphaned resources):
cloudflare_ruleset.www_to_apex_redirect
cloudflare_zero_trust_access_application.getraxx_beta
cloudflare_zero_trust_access_policy.getraxx_beta_invitees
If the list is empty, state is already clean — skip to Step 5.
Step 4: Remove the orphaned resources from Terraform state
Run each terraform state rm command individually. There is no risk of live resource
deletion: state rm only removes the entry from Terraform's state (held in the S3 backend
for this module); it does not call the CF API.
```bash
terraform state rm cloudflare_zero_trust_access_policy.getraxx_beta_invitees
terraform state rm cloudflare_zero_trust_access_application.getraxx_beta
terraform state rm cloudflare_ruleset.www_to_apex_redirect
```
Order: remove the policy first (it depends on the application), then the application, then the ruleset (independent).
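The ordering rule, together with the escalation trigger about unexpected resources, can be sketched as a small planner that turns terraform state list output into the commands above. This is illustrative only: plan_state_rm and EXPECTED are hypothetical names, and the ranking assumes resource types ending in _access_policy and _access_application as in this module.

```python
from typing import Iterable, List

# The three orphans listed in Step 3.
EXPECTED = frozenset({
    "cloudflare_ruleset.www_to_apex_redirect",
    "cloudflare_zero_trust_access_application.getraxx_beta",
    "cloudflare_zero_trust_access_policy.getraxx_beta_invitees",
})


def _rank(address: str) -> int:
    """Policies first, then applications, then everything else."""
    resource_type = address.split(".", 1)[0]
    if resource_type.endswith("_access_policy"):
        return 0
    if resource_type.endswith("_access_application"):
        return 1
    return 2


def plan_state_rm(state_list_output: str,
                  expected: Iterable[str] = EXPECTED) -> List[str]:
    """Return ordered `terraform state rm` commands, or raise on surprises."""
    present = [ln.strip() for ln in state_list_output.splitlines() if ln.strip()]
    unexpected = sorted(set(present) - set(expected))
    if unexpected:
        # More than the expected resources: escalate instead of removing.
        raise ValueError("unexpected state entries: " + ", ".join(unexpected))
    ordered = sorted(present, key=lambda a: (_rank(a), a))
    return ["terraform state rm " + a for a in ordered]
```

The function only builds strings; the operator still runs each command by hand.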
Step 5: Verify clean state
terraform plan
Expected output: No changes. Your infrastructure matches the configuration.
If Terraform still plans to destroy resources, check that the correct S3 backend state
file is being used (terraform init may be needed if the backend was recently changed).
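If the check is scripted rather than read by eye, terraform plan accepts -detailed-exitcode, which exits 0 for no changes, 1 for an error, and 2 when changes are pending. A minimal sketch (both function names are hypothetical; run_plan assumes terraform is on PATH and credentials are already exported as in Step 1):

```python
import subprocess


def interpret_plan_exit(code: int) -> str:
    """Map a `terraform plan -detailed-exitcode` exit status to a verdict."""
    if code == 0:
        return "in sync"
    if code == 2:
        return "drift"
    return "error"


def run_plan(module_dir: str) -> str:
    """Run the plan in module_dir and return the verdict."""
    proc = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false"],
        cwd=module_dir,
    )
    return interpret_plan_exit(proc.returncode)
```

For this step, run_plan("terraform/modules/cf-access-getraxx") should return "in sync" once the state entries are removed.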
Step 6: Post-cleanup documentation
After successful terraform plan shows No changes:
- Add a "# ARCHIVED — resources removed 2026-05-27, state rm on YYYY-MM-DD" comment to the top of terraform/modules/cf-access-getraxx/main.tf.
- Update docs/security/web-surface-posture.md: set the getraxx.com row to WAF + rate limit (public).
- Update docs/security/auth-posture.md §9: set the getraxx.com row to Public.
Failure mode B: terraform plan fails with authentication error
Symptom: terraform plan exits with Authentication error (10000) or
invalid token when planning the terraform/modules/cf-access-getraxx/ module.
Cause: CF_ACCESS_MGMT token has expired, been revoked, or lost the Account:Zero Trust:Edit
scope. Common after a vault rotation or token recreation.
Fix:
```bash
# Verify token status
CLOUDFLARE_API_TOKEN=$(infisical secrets get CF_ACCESS_MGMT \
  --path /MooseQuest/cloudflare/ --plain)
curl -s -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  https://api.cloudflare.com/client/v4/user/tokens/verify | python3 -m json.tool
# Expect: "status": "active"
```
If inactive: follow docs/ops/runbooks/rotation/cloudflare-user-api-token.md to rotate
the token, then update Infisical at /MooseQuest/cloudflare/CF_ACCESS_MGMT.
Verification: Re-run terraform plan after rotating the token.
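To script the token check instead of reading the JSON by eye, the verify response can be parsed as below. This is a sketch, not existing tooling: token_is_active is a hypothetical helper, and it assumes the usual Cloudflare envelope with a top-level success flag and the status nested under result.

```python
import json


def token_is_active(verify_body: str) -> bool:
    """Return True only if the verify call succeeded and the token is active."""
    data = json.loads(verify_body)
    result = data.get("result") or {}  # result is null on failed calls
    return bool(data.get("success")) and result.get("status") == "active"
```

Pipe the curl output into it in place of python3 -m json.tool; a False result means the rotation runbook applies.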
Failure mode C: CF Access gate blocks legitimate service token traffic
Symptom: Machine-to-machine requests (e.g., Lambda → FreeScout) receive a 302 redirect
to cloudflareaccess.com instead of being passed through.
Cause: CF Access service token policy uses decision = "allow" instead of
decision = "non_identity". Per feedback_cf_access_service_token_needs_non_identity.md:
allow requires an IdP identity that service tokens don't carry.
Fix: In terraform/cf-access/ (the main stack, not the getraxx module), set
decision = "non_identity" on the service token policy resource, then apply.
See docs/ops/runbooks/terraform-cf-access-state-imports.md Fix 1 for the freescout
service token correction history.
Also check that the WAF skip rule is in place: CF Access service token headers do not exempt
a request from Bot Fight Mode, so it can still block machine traffic.
See feedback_cf_access_does_not_bypass_bot_fight_mode.md.
Verification:
```bash
curl -v -H "CF-Access-Client-Id: <service-token-id>" \
  -H "CF-Access-Client-Secret: <service-token-secret>" \
  https://<gated-surface>/health
# Expect: 200 (not 302)
```
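The misconfiguration behind this failure mode can also be spotted in policy JSON before any traffic fails. The sketch below is an assumption-laden illustration: the helper name is hypothetical, and it assumes policy objects shaped like the Cloudflare Access API output, where include rules for service tokens use the keys service_token or any_valid_service_token.

```python
from typing import Dict, List

# Include-rule keys assumed to mark a policy as targeting service tokens.
SERVICE_TOKEN_RULES = ("service_token", "any_valid_service_token")


def misconfigured_service_policies(policies: List[Dict]) -> List[str]:
    """Return names of service-token policies that use decision=allow."""
    flagged = []
    for policy in policies:
        targets_token = any(
            key in rule
            for rule in policy.get("include", [])
            for key in SERVICE_TOKEN_RULES
        )
        if targets_token and policy.get("decision") == "allow":
            flagged.append(policy.get("name", "<unnamed>"))
    return flagged
```

Any name it returns is a policy that needs decision = "non_identity" in terraform/cf-access/.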
Emergency stop — re-gate a surface that was accidentally made public
If a CF Access application is accidentally removed and the surface needs to be immediately re-gated:
For getraxx.com (manual CF dashboard path — fastest):
See "Re-adding the gate (rollback)" in docs/ops/runbooks/getraxx-launch-day-cf-access-removal.md.
For other surfaces (Terraform path):
```bash
cd terraform/cf-access   # or the relevant module directory
terraform apply          # re-creates from HCL definition
```
Escalation
Wake the operator when:
- terraform apply proposes destroying any active CF Access application or policy protecting a live surface.
- CF Access tokens cannot be retrieved from Infisical (vault access outage).
- A surface that should be gated is publicly accessible and the cause is unknown.
- terraform state rm would affect more than the expected resources listed above.
CF support: https://support.cloudflare.com/hc/en-us/requests/new
Bypass apps — complete inventory (raxx.app pre-launch gate)
The raxx.app root app (23bd313e) gates the entire domain. Every path that an
anonymous browser or SSR fetch must reach requires its own bypass app
(decision=bypass, include=everyone). This table is the canonical record — update
after every bypass change.
Vault secret for the API token: CLOUDFLARE_ACCESS_MGMT_TOKEN at /MooseQuest/cloudflare/.
Note: the diagnostic and fix commands earlier in this runbook reference CF_ACCESS_MGMT; the
live key is CLOUDFLARE_ACCESS_MGMT_TOKEN, so substitute it when sourcing credentials.
Phase 1 — Beta walkthrough / preview (created 2026-06-10 / 2026-06-12)
| Domain / path | Name | App UUID | Policy UUID | Created |
|---|---|---|---|---|
| raxx.app/_next/* | Next.js static assets bypass | 3ac0d097-fd72-491c-965c-ecc2a741739f | c9f5bbb2-5452-4157-a06c-6c01d1b327cd | 2026-06-12 |
| raxx.app/beta-preview/* | Beta preview panel images bypass | e56548b3-48f7-4f29-9580-be6dc2443392 | 44f30cd8-a2f7-45b1-be8a-9b828204c4c3 | 2026-06-12 |
| raxx.app/beta/preview/* | Beta tester marketing preview bypass | f5d00795-c607-4ca6-ad7b-8fa0822e87af | (verify via API) | 2026-06-10 |
| raxx.app/beta/walk/* | Beta tester walkthrough bypass | 2718e5ff-7989-4156-ae7b-e8493a50d20f | (verify via API) | 2026-06-10 |
| raxx.app/api/beta/preview/* | Beta preview API bypass (raxx.app) | 699cc376-cded-4a47-941b-3dfc94b63d93 | 0458f66f-c13f-44ed-afba-536d659f3b4d | 2026-06-12 |
| raxx.app/api/beta/walk/* | Beta walk API bypass (raxx.app) | d1a35c46-2d64-430e-b1d8-d73de0966b25 | 80d313a9-b9fa-4b3f-9ffd-2c4983cbd763 | 2026-06-12 |
| api.raxx.app/api/beta/preview/* | Beta preview API bypass (api.raxx.app) | cfe2caf4-e058-46a4-8d30-6abd88c0f614 | c1b61651-1b1c-4c9b-b665-04c33acd082f | 2026-06-12 |
| api.raxx.app/api/beta/walk/* | Beta walk API bypass (api.raxx.app) | 609d12e2-e7e8-48d0-943c-76fb34b602d5 | 7871e13f-275c-4b24-a37d-e5017d759fdc | 2026-06-12 |
Note: api.raxx.app root app (01b5423f) has a decision=bypass policy (71c61b7f) with
include=everyone as its first policy, which currently bypasses the entire api.raxx.app
domain for anonymous traffic. The per-path bypass apps above are belt-and-suspenders for when
that root policy is eventually tightened.
Phase 2 — Beta join flow (created 2026-06-13, SC-5 #3550)
| Domain / path | Name | App UUID | Policy UUID | Created |
|---|---|---|---|---|
| raxx.app/beta/join/* | Beta join page bypass (raxx.app) | 0f1aee49-102d-45ac-ad90-f774567efc23 | ed41d62e-a2ac-48b9-ae69-e6f0d2eeca24 | 2026-06-13 |
| raxx.app/api/beta/join/* | Beta join API bypass (raxx.app) | 06578360-f66b-427c-8857-2e8e0156611c | cb080c74-87ed-4bec-bac9-5921bc1ef842 | 2026-06-13 |
| api.raxx.app/api/beta/join/* | Beta join API bypass (api.raxx.app) | 45afaa60-2798-45e9-9294-1ae00ab20a23 | 4ca00138-cdb2-4c4e-91ab-c37bce6bde9c | 2026-06-13 |
Verification probes run 2026-06-13 UTC (routes not yet built — probes confirm CF is not intercepting):
| Probe | Expected | Result |
|---|---|---|
| raxx.app/beta/join/smoketoken | non-302 (Next.js 404) | 404 PASS |
| raxx.app/api/beta/join/smoketoken/state | non-302 (Raptor 401) | 401 PASS |
| api.raxx.app/api/beta/join/smoketoken/state | non-302 (Raptor 401) | 401 PASS |
| raxx.app/dashboard | 302 to cloudflareaccess.com | 302 PASS (root gate intact) |
| raxx.app/_next/static/css/bde7e58e6b10bfac.css | non-302 | 200 PASS |
When SC-6 deploys the /beta/join/<token> page, re-run probe 1 and confirm non-302 (200 or
expected Next.js response, not cloudflareaccess.com redirect).
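The probe set above can be re-run as a table-driven check. This is a minimal sketch under stated assumptions: the https:// scheme, the User-Agent string, and all function names are illustrative rather than taken from existing tooling. The fetcher does not follow redirects, so a CF Access 302 stays visible, and it sends no CF cookies or Access credentials.

```python
import urllib.error
import urllib.request
from typing import Callable, List, Optional, Tuple

Response = Tuple[int, Optional[str]]  # (status code, Location header)

# (URL, expect_gated) pairs mirroring the probe table above.
PROBES = [
    ("https://raxx.app/beta/join/smoketoken", False),
    ("https://raxx.app/api/beta/join/smoketoken/state", False),
    ("https://api.raxx.app/api/beta/join/smoketoken/state", False),
    ("https://raxx.app/dashboard", True),
    ("https://raxx.app/_next/static/css/bde7e58e6b10bfac.css", False),
]


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # do not follow; the 302 surfaces as an HTTPError


def fetch(url: str) -> Response:
    """Anonymous GET with a browser-like UA and no redirect following."""
    opener = urllib.request.build_opener(_NoRedirect)
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with opener.open(req, timeout=10) as resp:
            return resp.status, resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Location")


def evaluate(expect_gated: bool, response: Response) -> str:
    """PASS/FAIL using the same expectations as the probe table."""
    status, location = response
    if expect_gated:
        ok = status == 302 and "cloudflareaccess.com" in (location or "")
    else:
        ok = status != 302
    return "PASS" if ok else "FAIL"


def run_probes(probes=PROBES,
               fetcher: Callable[[str], Response] = fetch) -> List[Tuple[str, str]]:
    return [(url, evaluate(gated, fetcher(url))) for url, gated in probes]
```

Passing a stub fetcher keeps the evaluation logic checkable without touching the live hosts.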
Known gap — raxx.app/api/auth/register/* (operator action required before Phase 2 live)
Live probe 2026-06-13: raxx.app/api/auth/register/options returns 302 to cloudflareaccess.com
for anonymous requests. This path is used by the WebAuthn registration flow
(begin-with-token, verify-with-token) when the browser calls Raptor via raxx.app as the
proxy host.
The corresponding api.raxx.app/api/auth/register/* path is unaffected — covered by the
api.raxx.app root bypass policy 71c61b7f.
Impact now: zero — Phase 2 routes not yet live. SSR fetches on the join flow target
api.raxx.app, which is bypassed.
Required before Phase 2 goes live: a bypass app for raxx.app/api/auth/register/*
(decision=bypass, include=everyone). This is the same pattern as all other Phase 1/2
bypasses. Provision via CF API using CLOUDFLARE_ACCESS_MGMT_TOKEN from vault, following
the Phase 2 provisioning pattern in docs/incidents/2026-06-12-beta-invite-cf-access-blocks.md
Resolution section.
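As a rough illustration of the bypass pattern (decision=bypass, include=everyone), the request bodies could be built as below. Treat this as a sketch only: the helper names and the app name are hypothetical, and the field names (type, domain, decision, include) reflect my understanding of the Cloudflare Access API rather than the provisioning pattern in the incident doc, which remains the authoritative procedure.

```python
from typing import Dict


def bypass_app_payload(domain_path: str, name: str) -> Dict:
    """Body for creating a self-hosted Access app scoped to one path."""
    return {"name": name, "domain": domain_path, "type": "self_hosted"}


def bypass_policy_payload(name: str) -> Dict:
    """Body for the app's single policy: bypass for everyone."""
    return {"name": name, "decision": "bypass", "include": [{"everyone": {}}]}


# Hypothetical name; pick one consistent with the inventory tables.
app = bypass_app_payload("raxx.app/api/auth/register/*", "Auth register bypass (raxx.app)")
policy = bypass_policy_payload("Auth register bypass (raxx.app)")
```

After provisioning, record the resulting app and policy UUIDs in the inventory tables and re-run the verification probes.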
Checklist — CF Access bypass set for a public HMAC-signed flow
Lesson from docs/incidents/2026-06-12-beta-invite-cf-access-blocks.md: every anonymous
dependency path needs its own bypass app. Use this checklist when adding a new public
HMAC-signed flow.
For each new flow, ask:
- [ ] Page route bypassed? (raxx.app/<flow-path>/*)
- [ ] Next.js static assets bypassed? (raxx.app/_next/*) Shared; verify the existing bypass covers it.
- [ ] Public dir assets bypassed? (raxx.app/<asset-dir>/*) Needed if the flow serves images or other public-dir files.
- [ ] API routes on raxx.app bypassed? (raxx.app/api/<flow>/*)
- [ ] API routes on api.raxx.app bypassed? (api.raxx.app/api/<flow>/*) SSR fetches target this host.
- [ ] Auth/register paths bypassed if WebAuthn flow included? (raxx.app/api/auth/register/*) Currently gated; see Known gap above.
- [ ] Verification probes run from a clean session (no CF cookie, browser UA, no CF Access credentials)?
- [ ] Root gate sanity check: raxx.app/dashboard still returns 302 to cloudflareaccess.com?
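The path items in this checklist can be expanded into a concrete list and compared with the inventory. A minimal sketch, with hypothetical function names and the assumption that the page route and API route share the same flow path (as they do for beta/join):

```python
from typing import Iterable, List, Optional


def required_bypass_paths(flow: str, asset_dir: Optional[str] = None,
                          webauthn: bool = False) -> List[str]:
    """List the bypass domains/paths a new public flow needs."""
    paths = [
        "raxx.app/" + flow + "/*",          # page route
        "raxx.app/_next/*",                 # shared Next.js static assets
    ]
    if asset_dir:
        paths.append("raxx.app/" + asset_dir + "/*")  # public-dir assets
    paths += [
        "raxx.app/api/" + flow + "/*",      # API via the raxx.app proxy host
        "api.raxx.app/api/" + flow + "/*",  # API host targeted by SSR fetches
    ]
    if webauthn:
        paths.append("raxx.app/api/auth/register/*")
    return paths


def missing_bypasses(required: Iterable[str], existing: Iterable[str]) -> List[str]:
    """Return required paths that have no bypass app yet, in checklist order."""
    have = set(existing)
    return [p for p in required if p not in have]
```

Run against the current Phase 1 and Phase 2 inventory, a WebAuthn-enabled beta/join flow reports only the auth/register path as missing, matching the Known gap above.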
References
- Terraform module (getraxx): terraform/modules/cf-access-getraxx/
- Terraform stack (console/vault/freescout): terraform/cf-access/
- Launch-day removal runbook: docs/ops/runbooks/getraxx-launch-day-cf-access-removal.md
- State import runbook: docs/ops/runbooks/terraform-cf-access-state-imports.md
- Service token provisioning: docs/ops/runbooks/cf-access-service-token-provisioning.md
- Auth posture: docs/security/auth-posture.md
- Web surface posture: docs/security/web-surface-posture.md
- Incident #2849: getraxx.com dashboard delete
- Incident 2026-06-12: beta invite CF Access blocks (docs/incidents/2026-06-12-beta-invite-cf-access-blocks.md)
- Phase 2 design doc: docs/architecture/beta-phase2-join-flow.md §6
- CF Zero Trust dashboard: https://one.dash.cloudflare.com/
- CF API token management: https://dash.cloudflare.com/profile/api-tokens