Vault access runbook

System: vault.raxx.app, a self-hosted Infisical instance (Infisical CE)
Owner: operator (Kristerpher)
Last incident: #680 (2026-05-15 UTC): CF Access gate blocking Infisical CLI machine identity; root cause identified as a missing WAF skip rule, fixed in #2143
Last reviewed: 2026-05-19 UTC


Access modes

Caller                         | Auth method                                        | Path
Human operator (browser)       | CF Access email OTP or Google SSO                  | vault.raxx.app/*
Agent / CI (machine)           | CF Access service token + Infisical universal auth | vault.raxx.app/api/v1/auth/universal-auth/login
Infisical CLI (agent sessions) | CF Access service token (via env vars)             | vault.raxx.app/api/v1/auth/*
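
As an illustration of the Agent / CI row, the sketch below builds (but does not send) a machine login request that carries both credential layers. It assumes the standard Cloudflare Access service-token headers (CF-Access-Client-Id, CF-Access-Client-Secret) and a JSON body with clientId and clientSecret for Infisical universal auth; the function name and argument names are hypothetical and not taken from this repository.

```python
import json
import urllib.request

# Login path from the Access modes table above.
VAULT_LOGIN_URL = "https://vault.raxx.app/api/v1/auth/universal-auth/login"


def build_login_request(cf_client_id, cf_client_secret,
                        machine_client_id, machine_client_secret):
    """Build the universal-auth login request without sending it.

    The CF Access headers get the request through the Cloudflare gate;
    the JSON body authenticates the machine identity to Infisical.
    (Body field names are an assumption about the Infisical API.)
    """
    body = json.dumps({
        "clientId": machine_client_id,
        "clientSecret": machine_client_secret,
    }).encode("utf-8")
    return urllib.request.Request(
        VAULT_LOGIN_URL,
        data=body,
        method="POST",
        headers={
            "CF-Access-Client-Id": cf_client_id,
            "CF-Access-Client-Secret": cf_client_secret,
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen would perform the login. Read the four values from the session environment or the vault rather than hard-coding them.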

CF WAF skip rule

Why it exists

Cloudflare Bot Fight Mode (BFM) evaluates BEFORE Cloudflare Access authenticates a request. The Infisical CLI and any agent making direct REST calls to vault.raxx.app/api/v1/auth/* egress from AWS ASNs (AS14618/AS16509), which CF scores as bot-origin traffic. Without a skip rule, BFM returns CF error 1010 before the service-token headers are ever checked by CF Access — causing the CLI to fail silently even when machine-identity credentials are valid.

This was the root cause of issue #680 ("stale CF Access token" errors in agent sessions). The CLI was not stale; the WAF was blocking it before auth could complete.

See memory: feedback_cf_access_does_not_bypass_bot_fight_mode.md and feedback_cf_access_service_token_needs_non_identity.md.
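
The ordering problem described in this section can be illustrated with a toy model. This is a deliberately simplified sketch, not Cloudflare's actual pipeline: the three boolean inputs and the function name are hypothetical, and it only encodes the claim above that bot scoring runs before CF Access sees the service-token headers.

```python
def request_outcome(bot_scored: bool, waf_skipped: bool, token_valid: bool) -> str:
    """Toy model of the evaluation order: bot check first, CF Access second."""
    # Stage 1: Bot Fight Mode. A bot-scored request is rejected here unless
    # the skip rule matched, so the service token is never inspected.
    if bot_scored and not waf_skipped:
        return "CF error 1010"
    # Stage 2: CF Access validates the service token.
    if not token_valid:
        return "rejected by CF Access"
    # Stage 3: the request reaches the Infisical API.
    return "reaches Infisical"


# The issue #680 situation: valid credentials, AWS egress, no skip rule.
print(request_outcome(bot_scored=True, waf_skipped=False, token_valid=True))
```

The first branch is why valid credentials still produced failures: the token check is never reached.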

What the rule does

The skip rule is a Priority 0.5 rule in terraform/modules/cf-waf/main.tf (inside cloudflare_ruleset.custom_waf), enabled for the raxx.app zone via vault_infisical_auth_skip_enabled = true in terraform/waf/main.tf.

Expression:

(http.host eq "vault.raxx.app"
  and starts_with(http.request.uri.path, "/api/v1/auth/")
  and len(http.request.headers["cf-access-client-id"]) gt 0)

Action: skip, with ruleset = "current" (skips all remaining rules in the custom ruleset for matching requests).

Scope: Host-specific + path-specific + header-gated. Requests without CF-Access-Client-Id (e.g., unauthenticated browser requests) still hit BFM normally. CF Access still authenticates the service token on matched requests — the skip is a WAF bypass only.
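
To make the scope concrete, the following sketch mirrors the rule expression as a Python predicate. It is an approximation for reasoning about which requests match, not something Cloudflare runs: the function name is hypothetical, and headers are modelled as a mapping from header name to a list of values, on the assumption that this is how the Cloudflare expression language exposes them.

```python
def matches_skip_rule(host: str, path: str, headers: dict) -> bool:
    """Approximate the skip rule expression for a single request."""
    # HTTP header names are case-insensitive, so normalise keys first.
    lowered = {name.lower(): values for name, values in headers.items()}
    client_id_values = lowered.get("cf-access-client-id", [])
    return (
        host == "vault.raxx.app"                 # host-specific
        and path.startswith("/api/v1/auth/")     # path-specific
        and len(client_id_values) > 0            # header-gated
    )


login = "/api/v1/auth/universal-auth/login"
print(matches_skip_rule("vault.raxx.app", login, {"CF-Access-Client-Id": ["x.access"]}))
print(matches_skip_rule("vault.raxx.app", login, {}))
```

The first call matches; the second does not, because the header is absent, so that request is still evaluated by the rest of the ruleset.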

How to verify the skip rule is live

# Check the raxx.app zone custom WAF ruleset via CF API
# Token must have Zone:WAF:Read scope on raxx.app
curl -sS \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  "https://api.cloudflare.com/client/v4/zones/f12dbb5cac57d5591a5058874498a6d1/rulesets" \
  | python3 -c "
import sys, json
rulesets = json.load(sys.stdin)['result']
for r in rulesets:
    if r.get('phase') == 'http_request_firewall_custom':
        print('RULESET ID:', r['id'])
        print('NAME:', r['name'])
"

# Then fetch the specific ruleset and check for the vault skip rule:
RULESET_ID="<id from above>"
curl -sS \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  "https://api.cloudflare.com/client/v4/zones/f12dbb5cac57d5591a5058874498a6d1/rulesets/${RULESET_ID}" \
  | python3 -c "
import sys, json
rs = json.load(sys.stdin)['result']
for rule in rs.get('rules', []):
    if 'vault.raxx.app' in rule.get('expression', ''):
        print('FOUND vault skip rule:')
        print('  expression:', rule['expression'])
        print('  action:', rule['action'])
        print('  enabled:', rule['enabled'])
"

Expected: the rule is present, action = "skip", enabled = true.
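
For unattended checks, the expectation above can be turned into a pass/fail test. The sketch below assumes the ruleset JSON has the same shape the inline script reads (a rules list whose entries carry expression, action, and enabled); the function name is hypothetical.

```python
def check_vault_skip_rule(ruleset: dict) -> list:
    """Return a list of problems; an empty list means the rule looks healthy."""
    candidates = [
        rule for rule in ruleset.get("rules", [])
        if "vault.raxx.app" in rule.get("expression", "")
    ]
    if not candidates:
        return ["no rule mentions vault.raxx.app"]
    # One enabled skip rule is enough to satisfy the expectation.
    for rule in candidates:
        if rule.get("action") == "skip" and rule.get("enabled") is True:
            return []
    # Otherwise report what was found so the operator can see the mismatch.
    return [
        "rule {}: action={!r}, enabled={!r}".format(
            rule.get("id", "?"), rule.get("action"), rule.get("enabled"))
        for rule in candidates
    ]
```

Feed it the result object from the second curl call, for example json.load(sys.stdin)["result"], and treat a non-empty return value as a failed check.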

How to verify the Infisical CLI can reach the vault

From any agent session with valid machine identity credentials:

# AC verification command (prints only the first 8 characters of the secret via head -c 8, never the full value)
infisical secrets get SENTRY_INTERNAL_INTEGRATION_TOKEN \
  --env=prod \
  --path=/MooseQuest/sentry \
  --plain | head -c 8

Expected: returns the first 8 characters of the token (not an empty string, 401, or CF error 1010).

A clean run confirms:

  1. The WAF skip rule is active.
  2. The CF Access service-token policy has decision = non_identity (not allow).
  3. The machine identity universal-auth credentials in the session are valid.


Terraform management

The skip rule is managed in terraform/waf/ (not terraform/cf-access/). CF enforces one ruleset per phase per zone; the custom WAF ruleset for raxx.app lives in terraform/waf.

Apply the skip rule:

cd terraform/waf

# Token: Zone:WAF:Edit scope on raxx.app
export CLOUDFLARE_API_TOKEN=$(infisical secrets get CF_WAF_EDIT \
  --path /MooseQuest/cloudflare/ --plain)

# Zone ID vars — inject via environment, not terraform.tfvars
export TF_VAR_raxx_app_zone_id=$(infisical secrets get CF_ZONE_ID_RAXX_APP \
  --path /MooseQuest/cloudflare/ --plain)
export TF_VAR_getraxx_zone_id=$(infisical secrets get CF_ZONE_ID_GETRAXX \
  --path /MooseQuest/cloudflare/ --plain)

terraform init
terraform plan -out=tfplan
# Review: expect the vault skip rule to appear as an addition inside custom_waf
terraform apply tfplan

Note: if the cross-stack state migration (docs/ops/runbooks/waf.md §Cross-stack ruleset migration) has not yet been run, terraform plan will show the existing zone-default custom ruleset as a diff. Complete the state migration steps in waf.md before applying.


Emergency: disable the skip rule

If the skip rule is suspected to be too broad (e.g., unexpected traffic bypassing WAF on the vault auth path):

# In terraform/waf/main.tf, change:
#   vault_infisical_auth_skip_enabled = true
# to:
#   vault_infisical_auth_skip_enabled = false
# Then apply:
cd terraform/waf
export CLOUDFLARE_API_TOKEN=$(infisical secrets get CF_WAF_EDIT \
  --path /MooseQuest/cloudflare/ --plain)
export TF_VAR_raxx_app_zone_id=$(infisical secrets get CF_ZONE_ID_RAXX_APP \
  --path /MooseQuest/cloudflare/ --plain)
export TF_VAR_getraxx_zone_id=$(infisical secrets get CF_ZONE_ID_GETRAXX \
  --path /MooseQuest/cloudflare/ --plain)
terraform plan -out=tfplan
terraform apply tfplan

Disabling the skip rule will cause Infisical CLI machine-identity calls from AWS ASNs to fail with CF error 1010. This is an acceptable emergency trade-off if a security incident requires tightening the vault perimeter. Restore as soon as the incident is resolved.


How to tell vault access is broken

How to diagnose (in order)

  1. Check CF WAF Events on raxx.app zone — Security → WAF, filter vault.raxx.app, last 30 min. Look for blocks on /api/v1/auth/.
  2. Verify the Priority 0.5 skip rule is present in the custom WAF ruleset (see §How to verify the skip rule is live above).
  3. If the skip rule is absent: the cross-stack state migration has not been applied, or the WAF stack was not yet applied. Apply with terraform apply per §Terraform management above.
  4. If the skip rule is present but the Infisical CLI is still failing: verify that the agent session's credentials are still valid by re-authenticating (infisical login --method=universal-auth). Stale tokens return 401 after passing the WAF.
  5. For persistent 1010 errors with a valid service token: verify decision = non_identity on the CF Access policy for vault.raxx.app (not allow). See feedback_cf_access_service_token_needs_non_identity.md.
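
The decision order in steps 2 to 5 can be summarised as a small triage function. This is only a sketch of the list above: the function name, the inputs, and the string values used for observed_error are hypothetical conventions, not output from any tool.

```python
from typing import Optional


def next_step(skip_rule_present: bool, observed_error: Optional[str]) -> str:
    """Map observations to the next diagnostic action.

    observed_error is None when the CLI check succeeds, otherwise "401" or "1010".
    """
    if observed_error is None:
        return "vault access looks healthy"
    # Step 3: rule missing, so the WAF stack (or state migration) is not applied.
    if not skip_rule_present:
        return "apply terraform/waf (finish the state migration first if pending)"
    # Step 4: request passed the WAF but the credentials were rejected.
    if observed_error == "401":
        return "re-authenticate: infisical login --method=universal-auth"
    # Step 5: still blocked at the edge even though the rule exists.
    if observed_error == "1010":
        return "verify CF Access policy decision is non_identity, not allow"
    return "not covered by these steps; see Escalation"


print(next_step(skip_rule_present=True, observed_error="401"))
```

Step 1 (reviewing WAF Events in the dashboard) stays manual; its result feeds the observed_error input.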

Escalation

Wake the operator when:

  - The vault is unreachable from the CF dashboard (not a WAF issue; the Lightsail host may be down).
  - The CF Access email OTP flow is broken (human operator locked out of the vault UI).
  - Machine identity credentials are expired or revoked and cannot be refreshed automatically.
  - The skip rule is disabled and an active agent rotation or deploy depends on vault access.

Cross-references