Raxx · internal docs

internal · gated

CF Access runbook

System: Cloudflare Zero Trust Access (all Raxx-managed applications)
Owner: sre-agent / operator
Last incident: 2026-06-12 (incomplete bypass set blocked beta invite flow; see 2026-06-12-beta-invite-cf-access-blocks.md)
Last reviewed: 2026-06-13



How to tell CF Access is broken

How to diagnose (in order)

  1. Check live CF Access state via API:

export CLOUDFLARE_API_TOKEN=$(infisical secrets get CF_ACCESS_MGMT \
  --path /MooseQuest/cloudflare/ --plain)
curl -sS -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  "https://api.cloudflare.com/client/v4/accounts/22b5c35090724fbf05db6d4f501ac821/access/apps" \
  | python3 -c "import sys,json; [print(a['id'], a['name'], a['domain']) for a in json.load(sys.stdin).get('result',[])]"

     Compare the output to what Terraform state believes exists.

  2. Check Terraform state:

cd terraform/modules/cf-access-getraxx
terraform state list

  3. Run a plan against live state:

export CLOUDFLARE_API_TOKEN=$(infisical secrets get CF_ACCESS_MGMT \
  --path /MooseQuest/cloudflare/ --plain)
export TF_VAR_cf_access_account_id=$(infisical secrets get CF_ACCESS_ACCOUNT_ID_MOOSEQUEST \
  --path /MooseQuest/cloudflare/ --plain)
terraform plan

     A "No changes" plan means state is in sync. Anything else is drift.
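For scripted checks, the "No changes" versus drift distinction can be read from the plan's exit code instead of its text. The following is a minimal sketch, assuming terraform is on PATH and credentials are already exported as in step 3; the function names are hypothetical and not part of this repo.

```python
import subprocess

# Exit codes that `terraform plan -detailed-exitcode` documents:
# 0 = no changes, 1 = error, 2 = changes present (drift).
PLAN_OUTCOMES = {0: "in-sync", 1: "error", 2: "drift"}


def classify_plan_exit(code: int) -> str:
    """Map a `terraform plan -detailed-exitcode` exit code to a label."""
    return PLAN_OUTCOMES.get(code, "unknown")


def run_plan(module_dir: str) -> str:
    """Run a read-only plan in module_dir and classify the result."""
    proc = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=module_dir,
    )
    return classify_plan_exit(proc.returncode)
```

A call such as run_plan("terraform/modules/cf-access-getraxx") only plans; it never applies. Treat "error" as a prompt to check Failure mode B before concluding anything about drift.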


Known failure modes

Failure mode A: Dashboard delete created Terraform state orphan (#2849)

Symptom: terraform plan shows resources to destroy that no longer exist in CF (e.g., cloudflare_zero_trust_access_application.getraxx_beta, cloudflare_zero_trust_access_policy.getraxx_beta_invitees, cloudflare_ruleset.www_to_apex_redirect). The CF dashboard shows these apps are gone.

Cause: The operator deleted one or more CF Access resources directly via the CF Zero Trust dashboard (or via direct API call), bypassing Terraform. Terraform state still references these resources, so terraform plan proposes destroying things that are already gone; an apply would either no-op or error.

Recommendation: terraform state rm (not terraform import) when the deletion was intentional and the resource should not be re-created. Use terraform import when the resource was accidentally deleted and must be re-created and re-imported.

Decision for #2849: The operator deleted the getraxx-beta CF Access app intentionally on 2026-05-27 UTC as part of the v1 launch procedure (Path B in docs/ops/runbooks/getraxx-launch-day-cf-access-removal.md). The module terraform/modules/cf-access-getraxx/ should be kept for reference but its state entries must be removed. Use the terraform state rm path below.

Fix — operator-action required (do NOT run autonomously):

Step 1: Source credentials

export CLOUDFLARE_API_TOKEN=$(infisical secrets get CF_ACCESS_MGMT \
  --path /MooseQuest/cloudflare/ --plain)
export TF_VAR_cf_access_account_id=$(infisical secrets get CF_ACCESS_ACCOUNT_ID_MOOSEQUEST \
  --path /MooseQuest/cloudflare/ --plain)

Step 2: Confirm the resources are gone in CF

curl -sS -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  "https://api.cloudflare.com/client/v4/accounts/22b5c35090724fbf05db6d4f501ac821/access/apps" \
  | python3 -c "
import sys, json
apps = json.load(sys.stdin).get('result', [])
getraxx = [a for a in apps if 'getraxx' in a.get('name','').lower() or 'getraxx.com' in a.get('domain','')]
if getraxx:
    print('FOUND — not safe to rm:')
    for a in getraxx: print(' ', a['id'], a['name'])
else:
    print('Not found in CF — safe to terraform state rm')
"

Expected output if the delete succeeded: Not found in CF — safe to terraform state rm

If the resources are still present in CF, STOP. Do NOT run terraform state rm — the resource is live. Investigate why the dashboard delete did not propagate (CF rate limit, partial delete).

Step 3: Inspect current Terraform state

cd terraform/modules/cf-access-getraxx
terraform state list

Expected output (three orphaned resources):

cloudflare_ruleset.www_to_apex_redirect
cloudflare_zero_trust_access_application.getraxx_beta
cloudflare_zero_trust_access_policy.getraxx_beta_invitees

If the list is empty, state is already clean — skip to Step 5.

Step 4: Remove the orphaned resources from Terraform state

Run each terraform state rm command individually. There is no risk of live resource deletion: state rm only removes the entry from Terraform's state (stored in the configured backend); it does not call the CF API.

terraform state rm cloudflare_zero_trust_access_policy.getraxx_beta_invitees
terraform state rm cloudflare_zero_trust_access_application.getraxx_beta
terraform state rm cloudflare_ruleset.www_to_apex_redirect

Order: remove the policy first (it depends on the application), then the application, then the ruleset (independent).
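The removal order and the "only the expected resources" constraint can be checked before anything is run. This is a sketch only: it takes the text output of terraform state list and returns the commands to run, refusing if the state contains anything beyond the three orphans named in Step 3. The helper name plan_state_rm is hypothetical, and the helper does not execute anything, in keeping with the operator-action requirement.

```python
from typing import List

# Removal order from this runbook: policy, then application, then ruleset.
EXPECTED_ORPHANS = [
    "cloudflare_zero_trust_access_policy.getraxx_beta_invitees",
    "cloudflare_zero_trust_access_application.getraxx_beta",
    "cloudflare_ruleset.www_to_apex_redirect",
]


def plan_state_rm(state_list_output: str) -> List[str]:
    """Return the `terraform state rm` commands to run, in runbook order.

    Raises ValueError if state holds any address outside EXPECTED_ORPHANS,
    which is an escalation condition rather than something to remove.
    """
    present = {line.strip() for line in state_list_output.splitlines() if line.strip()}
    unexpected = sorted(present - set(EXPECTED_ORPHANS))
    if unexpected:
        raise ValueError(f"unexpected resources in state, escalate: {unexpected}")
    return [f"terraform state rm {addr}" for addr in EXPECTED_ORPHANS if addr in present]
```

An empty result means state is already clean, matching the "skip to Step 5" case.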

Step 5: Verify clean state

terraform plan

Expected output: No changes. Your infrastructure matches the configuration.

If Terraform still plans to destroy resources, check that the correct S3 backend state file is being used (terraform init may be needed if the backend was recently changed).

Step 6: Post-cleanup documentation

After successful terraform plan shows No changes:

  1. Add a # ARCHIVED — resources removed 2026-05-27, state rm on YYYY-MM-DD comment to the top of terraform/modules/cf-access-getraxx/main.tf.
  2. Update docs/security/web-surface-posture.md: set the getraxx.com row to WAF + rate limit (public).
  3. Update docs/security/auth-posture.md §9: set the getraxx.com row to Public.

Failure mode B: terraform plan fails with authentication error

Symptom: terraform plan exits with Authentication error (10000) or invalid token when planning the terraform/modules/cf-access-getraxx/ module.

Cause: CF_ACCESS_MGMT token has expired, been revoked, or lost the Account:Zero Trust:Edit scope. Common after a vault rotation or token recreation.

Fix:

# Verify token status
CLOUDFLARE_API_TOKEN=$(infisical secrets get CF_ACCESS_MGMT \
  --path /MooseQuest/cloudflare/ --plain)
curl -s -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  https://api.cloudflare.com/client/v4/user/tokens/verify | python3 -m json.tool
# Expect: "status": "active"

If inactive: follow docs/ops/runbooks/rotation/cloudflare-user-api-token.md to rotate the token, then update Infisical at /MooseQuest/cloudflare/CF_ACCESS_MGMT.

Verification: Re-run terraform plan after rotating the token.
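If the token check needs to run in a script rather than by eye, the verify response can be reduced to a boolean. A minimal sketch, assuming the response body has the usual Cloudflare envelope with a top-level success flag and a result.status field (the "status": "active" expectation above); token_is_active is a hypothetical helper name.

```python
import json


def token_is_active(verify_body: str) -> bool:
    """Return True only if the /user/tokens/verify body reports an active token."""
    try:
        payload = json.loads(verify_body)
    except json.JSONDecodeError:
        return False  # HTML error page, empty body, etc.
    if not isinstance(payload, dict):
        return False
    result = payload.get("result")
    if not isinstance(result, dict):
        return False  # error responses carry a null result
    return bool(payload.get("success")) and result.get("status") == "active"
```

Any False result leads to the same next step: the rotation runbook referenced above.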


Failure mode C: CF Access gate blocks legitimate service token traffic

Symptom: Machine-to-machine requests (e.g., Lambda → FreeScout) receive a 302 redirect to cloudflareaccess.com instead of being passed through.

Cause: CF Access service token policy uses decision = "allow" instead of decision = "non_identity". Per feedback_cf_access_service_token_needs_non_identity.md: allow requires an IdP identity that service tokens don't carry.

Fix: In terraform/cf-access/ (the main stack, not the getraxx module), set decision = "non_identity" on the service token policy resource, then apply. See docs/ops/runbooks/terraform-cf-access-state-imports.md Fix 1 for the freescout service token correction history.

Also check that the WAF skip rule is in place: valid CF Access service token headers do not exempt a request from Bot Fight Mode, so the request can still be challenged or blocked. See feedback_cf_access_does_not_bypass_bot_fight_mode.md.
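To find this misconfiguration across many policies, the policy list returned by the CF API can be scanned for service token rules paired with the wrong decision. The sketch below assumes each policy is a dict with a decision string and an include list of single-key rule objects, and that service token rules use the keys service_token or any_valid_service_token; verify those key names against the live API response before relying on it.

```python
# Rule keys assumed to identify service-token rules in an Access policy.
SERVICE_TOKEN_RULES = {"service_token", "any_valid_service_token"}


def misconfigured_service_token_policies(policies):
    """Return names of policies that include a service token rule
    but do not use decision = "non_identity"."""
    flagged = []
    for policy in policies:
        rule_kinds = {kind for rule in policy.get("include", []) for kind in rule}
        if rule_kinds & SERVICE_TOKEN_RULES and policy.get("decision") != "non_identity":
            flagged.append(policy.get("name", policy.get("id", "<unnamed>")))
    return flagged
```

A non-empty result points at the policies to correct in terraform/cf-access/; the correction itself still goes through Terraform, not the dashboard.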

Verification:

curl -v -H "CF-Access-Client-Id: <service-token-id>" \
     -H "CF-Access-Client-Secret: <service-token-secret>" \
     https://<gated-surface>/health
# Expect: 200 (not 302)

Emergency stop — re-gate a surface that was accidentally made public

If a CF Access application is accidentally removed and the surface needs to be immediately re-gated:

For getraxx.com (manual CF dashboard path — fastest): See "Re-adding the gate (rollback)" in docs/ops/runbooks/getraxx-launch-day-cf-access-removal.md.

For other surfaces (Terraform path):

cd terraform/cf-access   # or the relevant module directory
terraform apply           # re-creates from HCL definition

Escalation

Wake the operator when:

- terraform apply proposes destroying any active CF Access application or policy protecting a live surface.
- CF Access tokens cannot be retrieved from Infisical (vault access outage).
- A surface that should be gated is publicly accessible and the cause is unknown.
- terraform state rm would affect more than the expected resources listed above.

CF support: https://support.cloudflare.com/hc/en-us/requests/new



Bypass apps — complete inventory (raxx.app pre-launch gate)

The raxx.app root app (23bd313e) gates the entire domain. Every path that an anonymous browser or SSR fetch must reach requires its own bypass app (decision=bypass, include=everyone). This table is the canonical record — update after every bypass change.

Vault secret for the API token: CLOUDFLARE_ACCESS_MGMT_TOKEN at /MooseQuest/cloudflare/ (note: runbook diagnostic steps above reference CF_ACCESS_MGMT — the live key is CLOUDFLARE_ACCESS_MGMT_TOKEN).

Phase 1 — Beta walkthrough / preview (created 2026-06-10 / 2026-06-12)

| Domain / path | Name | App UUID | Policy UUID | Created |
|---|---|---|---|---|
| raxx.app/_next/* | Next.js static assets bypass | 3ac0d097-fd72-491c-965c-ecc2a741739f | c9f5bbb2-5452-4157-a06c-6c01d1b327cd | 2026-06-12 |
| raxx.app/beta-preview/* | Beta preview panel images bypass | e56548b3-48f7-4f29-9580-be6dc2443392 | 44f30cd8-a2f7-45b1-be8a-9b828204c4c3 | 2026-06-12 |
| raxx.app/beta/preview/* | Beta tester marketing preview bypass | f5d00795-c607-4ca6-ad7b-8fa0822e87af | (verify via API) | 2026-06-10 |
| raxx.app/beta/walk/* | Beta tester walkthrough bypass | 2718e5ff-7989-4156-ae7b-e8493a50d20f | (verify via API) | 2026-06-10 |
| raxx.app/api/beta/preview/* | Beta preview API bypass (raxx.app) | 699cc376-cded-4a47-941b-3dfc94b63d93 | 0458f66f-c13f-44ed-afba-536d659f3b4d | 2026-06-12 |
| raxx.app/api/beta/walk/* | Beta walk API bypass (raxx.app) | d1a35c46-2d64-430e-b1d8-d73de0966b25 | 80d313a9-b9fa-4b3f-9ffd-2c4983cbd763 | 2026-06-12 |
| api.raxx.app/api/beta/preview/* | Beta preview API bypass (api.raxx.app) | cfe2caf4-e058-46a4-8d30-6abd88c0f614 | c1b61651-1b1c-4c9b-b665-04c33acd082f | 2026-06-12 |
| api.raxx.app/api/beta/walk/* | Beta walk API bypass (api.raxx.app) | 609d12e2-e7e8-48d0-943c-76fb34b602d5 | 7871e13f-275c-4b24-a37d-e5017d759fdc | 2026-06-12 |

Note: api.raxx.app root app (01b5423f) has a decision=bypass policy (71c61b7f) with include=everyone as its first policy, which currently bypasses the entire api.raxx.app domain for anonymous traffic. The per-path bypass apps above are belt-and-suspenders for when that root policy is eventually tightened.

Phase 2 — Beta join flow (created 2026-06-13, SC-5 #3550)

| Domain / path | Name | App UUID | Policy UUID | Created |
|---|---|---|---|---|
| raxx.app/beta/join/* | Beta join page bypass (raxx.app) | 0f1aee49-102d-45ac-ad90-f774567efc23 | ed41d62e-a2ac-48b9-ae69-e6f0d2eeca24 | 2026-06-13 |
| raxx.app/api/beta/join/* | Beta join API bypass (raxx.app) | 06578360-f66b-427c-8857-2e8e0156611c | cb080c74-87ed-4bec-bac9-5921bc1ef842 | 2026-06-13 |
| api.raxx.app/api/beta/join/* | Beta join API bypass (api.raxx.app) | 45afaa60-2798-45e9-9294-1ae00ab20a23 | 4ca00138-cdb2-4c4e-91ab-c37bce6bde9c | 2026-06-13 |

Verification probes run 2026-06-13 UTC (routes not yet built — probes confirm CF is not intercepting):

| Probe | Expected | Result |
|---|---|---|
| raxx.app/beta/join/smoketoken | non-302 (Next.js 404) | 404 PASS |
| raxx.app/api/beta/join/smoketoken/state | non-302 (Raptor 401) | 401 PASS |
| api.raxx.app/api/beta/join/smoketoken/state | non-302 (Raptor 401) | 401 PASS |
| raxx.app/dashboard | 302 to cloudflareaccess.com | 302 PASS (root gate intact) |
| raxx.app/_next/static/css/bde7e58e6b10bfac.css | non-302 | 200 PASS |

When SC-6 deploys the /beta/join/<token> page, re-run probe 1 and confirm non-302 (200 or expected Next.js response, not cloudflareaccess.com redirect).
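The PASS/FAIL rule used for these probes can be written down so re-runs are judged consistently. This sketch classifies a single response from its status code and Location header, which the caller must collect without following redirects (for example from curl -sI). It treats only a redirect whose host is cloudflareaccess.com or a subdomain of it as interception, which is slightly narrower than "any 302"; the team subdomain in the usage note is made up for illustration.

```python
from urllib.parse import urlparse

REDIRECT_CODES = (301, 302, 303, 307, 308)


def intercepted_by_access(status, location=None):
    """True if the response is a redirect to a cloudflareaccess.com login host."""
    if status not in REDIRECT_CODES or not location:
        return False
    host = urlparse(location).hostname or ""
    return host == "cloudflareaccess.com" or host.endswith(".cloudflareaccess.com")


def probe_passes(expect_gated, status, location=None):
    """A probe passes when observed interception matches the expectation."""
    return intercepted_by_access(status, location) == expect_gated
```

For example, probe_passes(False, 404) corresponds to the first row of the table, and probe_passes(True, 302, "https://example-team.cloudflareaccess.com/cdn-cgi/access/login") to the raxx.app/dashboard row.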

Known gap — raxx.app/api/auth/register/* (operator action required before Phase 2 live)

Live probe 2026-06-13: raxx.app/api/auth/register/options returns 302 to cloudflareaccess.com for anonymous requests. This path is used by the WebAuthn registration flow (begin-with-token, verify-with-token) when the browser calls Raptor via raxx.app as the proxy host.

The corresponding api.raxx.app/api/auth/register/* path is unaffected — covered by the api.raxx.app root bypass policy 71c61b7f.

Impact now: zero — Phase 2 routes not yet live. SSR fetches on the join flow target api.raxx.app, which is bypassed.

Required before Phase 2 goes live: a bypass app for raxx.app/api/auth/register/* (decision=bypass, include=everyone). This is the same pattern as all other Phase 1/2 bypasses. Provision via CF API using CLOUDFLARE_ACCESS_MGMT_TOKEN from vault, following the Phase 2 provisioning pattern in docs/incidents/2026-06-12-beta-invite-cf-access-blocks.md Resolution section.
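As a rough illustration of what such a bypass app involves, the sketch below builds the two request bodies (one for the app, one for its policy) without sending anything. The field names (type, name, domain, decision, include with an everyone rule) and the endpoint paths in the comments reflect an assumed shape of the Cloudflare Access API; the Resolution section of the incident doc remains the authoritative pattern. The app and policy names are hypothetical.

```python
def bypass_app_payload(domain_path, name):
    """Body for creating a self-hosted Access app scoped to one path.

    Assumed endpoint: POST /client/v4/accounts/<account-id>/access/apps
    """
    return {"type": "self_hosted", "name": name, "domain": domain_path}


def bypass_policy_payload(name):
    """Body for a decision=bypass, include=everyone policy on that app.

    Assumed endpoint: POST /client/v4/accounts/<account-id>/access/apps/<app-id>/policies
    """
    return {"name": name, "decision": "bypass", "include": [{"everyone": {}}]}


# Hypothetical names; the domain is the gap described above.
app_body = bypass_app_payload("raxx.app/api/auth/register/*", "Auth register bypass (raxx.app)")
policy_body = bypass_policy_payload("Bypass everyone")
```

After provisioning, record the resulting app and policy UUIDs in the inventory tables in this runbook and re-probe raxx.app/api/auth/register/options for a non-302 response.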


Checklist — CF Access bypass set for a public HMAC-signed flow

Lesson from docs/incidents/2026-06-12-beta-invite-cf-access-blocks.md: every anonymous dependency path needs its own bypass app. Use this checklist when adding a new public HMAC-signed flow.
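One way to apply that lesson mechanically is to list every path an anonymous client must reach and check each against the bypass inventory. The sketch below is a simplification: it uses shell-style wildcard matching, where * matches any run of characters including slashes, as a stand-in for CF Access path matching, so treat a "covered" result as a hint and confirm with a live probe.

```python
from fnmatch import fnmatchcase


def uncovered_paths(required, bypass_patterns):
    """Return the required host/path strings that no bypass pattern matches."""
    return [
        path
        for path in required
        if not any(fnmatchcase(path, pattern) for pattern in bypass_patterns)
    ]


# Patterns taken from the inventory tables in this runbook.
patterns = [
    "raxx.app/_next/*",
    "raxx.app/beta/join/*",
    "raxx.app/api/beta/join/*",
]
needed = [
    "raxx.app/beta/join/smoketoken",
    "raxx.app/api/beta/join/smoketoken/state",
    "raxx.app/api/auth/register/options",
    "raxx.app/_next/static/css/bde7e58e6b10bfac.css",
]
gaps = uncovered_paths(needed, patterns)
```

With these inputs, gaps contains only raxx.app/api/auth/register/options, which is the known gap recorded in this runbook.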

For each new flow, ask:


References