Vault access runbook
System: vault.raxx.app — self-hosted Infisical instance (Infisical CE)
Owner: operator (Kristerpher)
Last incident: #680 (2026-05-15 UTC — CF Access gate blocking Infisical CLI machine identity; root cause identified as missing WAF skip rule, fixed in #2143)
Last reviewed: 2026-05-19 UTC
Access modes
| Caller | Auth method | Path |
|---|---|---|
| Human operator (browser) | CF Access email OTP or Google SSO | vault.raxx.app/* |
| Agent / CI (machine) | CF Access service token + Infisical universal auth | vault.raxx.app/api/v1/auth/universal-auth/login |
| Infisical CLI (agent sessions) | CF Access service token (via env vars) | vault.raxx.app/api/v1/auth/* |
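As a rough illustration of the machine row above, the sketch below builds (but does not send) a universal-auth login request carrying both layers of credentials. CF-Access-Client-Id and CF-Access-Client-Secret are the standard Cloudflare Access service-token headers. The function name, its arguments, and the clientId/clientSecret body fields are assumptions for illustration; confirm the body shape against the API docs for the Infisical version you run.

```python
import json
import urllib.request

LOGIN_URL = "https://vault.raxx.app/api/v1/auth/universal-auth/login"


def build_login_request(cf_client_id, cf_client_secret,
                        machine_client_id, machine_client_secret):
    """Build (without sending) a universal-auth login request."""
    # Body field names are an assumption; verify against your Infisical API docs.
    body = json.dumps({
        "clientId": machine_client_id,
        "clientSecret": machine_client_secret,
    }).encode("utf-8")
    headers = {
        # Cloudflare Access service-token headers, evaluated at the CF edge.
        "CF-Access-Client-Id": cf_client_id,
        "CF-Access-Client-Secret": cf_client_secret,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(LOGIN_URL, data=body, headers=headers,
                                  method="POST")
```

The returned object can be passed to urllib.request.urlopen when an actual login is wanted. Keeping construction separate from sending makes it easy to inspect the headers without touching the vault.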
CF WAF skip rule
Why it exists
Cloudflare Bot Fight Mode (BFM) evaluates BEFORE Cloudflare Access authenticates a request. The Infisical CLI and any agent making direct REST calls to vault.raxx.app/api/v1/auth/* egress from AWS ASNs (AS14618/AS16509), which CF scores as bot-origin traffic. Without a skip rule, BFM returns CF error 1010 before the service-token headers are ever checked by CF Access — causing the CLI to fail silently even when machine-identity credentials are valid.
This was the root cause of issue #680 ("stale CF Access token" errors in agent sessions). The CLI was not stale; the WAF was blocking it before auth could complete.
See memory: feedback_cf_access_does_not_bypass_bot_fight_mode.md and feedback_cf_access_service_token_needs_non_identity.md.
What the rule does
The skip rule is a Priority 0.5 rule in terraform/modules/cf-waf/main.tf (inside cloudflare_ruleset.custom_waf), enabled for the raxx.app zone via vault_infisical_auth_skip_enabled = true in terraform/waf/main.tf.
Expression:
(http.host eq "vault.raxx.app"
and starts_with(http.request.uri.path, "/api/v1/auth/")
and len(http.request.headers["cf-access-client-id"]) gt 0)
Action: skip, with ruleset = "current" (skips all remaining rules in the custom ruleset for matching requests).
Scope: Host-specific + path-specific + header-gated. Requests without CF-Access-Client-Id (e.g., unauthenticated browser requests) still hit BFM normally. CF Access still authenticates the service token on matched requests — the skip is a WAF bypass only.
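To make the three-part scope concrete, here is a simplified local model of the expression above. It is only a sketch for reasoning about which requests match. The function name is hypothetical, and it treats each header as a single string, so a header that is present but empty counts as absent here, which may differ from how Cloudflare evaluates len() on the header value.

```python
def matches_vault_skip_rule(host, path, headers):
    """Return True when a request would match the skip-rule expression (simplified)."""
    # The expression refers to the header by its lowercase name, so normalise
    # the caller's header names before looking it up.
    lowered = {name.lower(): value for name, value in headers.items()}
    return (
        host == "vault.raxx.app"
        and path.startswith("/api/v1/auth/")
        and len(lowered.get("cf-access-client-id", "")) > 0
    )
```

Under this model, a request to /api/v1/auth/universal-auth/login with a CF-Access-Client-Id header matches. The same request without the header, or a request to any path outside /api/v1/auth/, does not match and is still subject to the rest of the ruleset.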
How to verify the skip rule is live
# Check the raxx.app zone custom WAF ruleset via CF API
# Token must have Zone:WAF:Read scope on raxx.app
curl -sS \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  "https://api.cloudflare.com/client/v4/zones/f12dbb5cac57d5591a5058874498a6d1/rulesets" \
  | python3 -c "
import sys, json
# List the zone's rulesets and print the one in the custom WAF phase
rulesets = json.load(sys.stdin)['result']
for r in rulesets:
    if r.get('phase') == 'http_request_firewall_custom':
        print('RULESET ID:', r['id'])
        print('NAME:', r['name'])
"
# Then fetch the specific ruleset and check for the vault skip rule:
RULESET_ID="<id from above>"
curl -sS \
  -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  "https://api.cloudflare.com/client/v4/zones/f12dbb5cac57d5591a5058874498a6d1/rulesets/${RULESET_ID}" \
  | python3 -c "
import sys, json
# Print any rule whose expression targets the vault host
rs = json.load(sys.stdin)['result']
for rule in rs.get('rules', []):
    if 'vault.raxx.app' in rule.get('expression', ''):
        print('FOUND vault skip rule:')
        print('  expression:', rule['expression'])
        print('  action:', rule['action'])
        print('  enabled:', rule['enabled'])
"
Expected: the rule is present, action = "skip", enabled = true.
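If this check needs to run unattended (for example in a scheduled job), the expected-state test can be reduced to a boolean. The sketch below assumes the ruleset JSON has already been fetched and that the 'result' object has been extracted, as in the commands above. Both function names are hypothetical.

```python
def find_vault_skip_rule(ruleset):
    """Return the first rule whose expression mentions vault.raxx.app, or None."""
    for rule in ruleset.get("rules", []):
        if "vault.raxx.app" in rule.get("expression", ""):
            return rule
    return None


def skip_rule_is_live(ruleset):
    """True only when the vault rule exists, has action 'skip', and is enabled."""
    rule = find_vault_skip_rule(ruleset)
    if rule is None:
        return False
    return rule.get("action") == "skip" and rule.get("enabled") is True
```

A False result does not say which of the three conditions failed, so fall back to the manual commands above to see the rule's actual fields.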
How to verify the Infisical CLI can reach the vault
From any agent session with valid machine identity credentials:
# AC verification command (head -c 8 prints only the first 8 characters, never the full secret)
infisical secrets get SENTRY_INTERNAL_INTEGRATION_TOKEN \
  --env=prod \
  --path=/MooseQuest/sentry \
  --plain | head -c 8
Expected: returns the first 8 characters of the token (not an empty string, 401, or CF error 1010).
A clean run confirms:
1. The WAF skip rule is active.
2. The CF Access service-token policy has decision = non_identity (not allow).
3. The machine identity universal-auth credentials in the session are valid.
Terraform management
The skip rule is managed in terraform/waf/ (not terraform/cf-access/). CF allows only one entry-point ruleset per phase per zone, and the custom WAF ruleset for raxx.app lives in terraform/waf.
Apply the skip rule:
cd terraform/waf
# Token: Zone:WAF:Edit scope on raxx.app
export CLOUDFLARE_API_TOKEN=$(infisical secrets get CF_WAF_EDIT \
--path /MooseQuest/cloudflare/ --plain)
# Zone ID vars — inject via environment, not terraform.tfvars
export TF_VAR_raxx_app_zone_id=$(infisical secrets get CF_ZONE_ID_RAXX_APP \
--path /MooseQuest/cloudflare/ --plain)
export TF_VAR_getraxx_zone_id=$(infisical secrets get CF_ZONE_ID_GETRAXX \
--path /MooseQuest/cloudflare/ --plain)
terraform init
terraform plan -out=tfplan
# Review: expect the vault skip rule to appear as an addition inside custom_waf
terraform apply tfplan
Note: if the cross-stack state migration (docs/ops/runbooks/waf.md §Cross-stack ruleset migration) has not yet been run, terraform plan will show the existing zone-default custom ruleset as a diff. Complete the state migration steps in waf.md before applying.
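One pitfall with the export lines above: in bash, export VAR=$(command) returns success even when the command fails, so a failed vault lookup leaves the variable empty without stopping the script. A small preflight check before terraform plan can catch this. The sketch below is one possible shape, and the function name is hypothetical.

```python
import os

# Variables exported in the apply procedure above.
REQUIRED_VARS = (
    "CLOUDFLARE_API_TOKEN",
    "TF_VAR_raxx_app_zone_id",
    "TF_VAR_getraxx_zone_id",
)


def missing_terraform_env(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_terraform_env()
    if missing:
        raise SystemExit("Unset or empty: " + ", ".join(missing))
    print("Terraform environment looks complete.")
```

The check reports variable names only and never prints their values, so it is safe to run in a logged session.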
Emergency: disable the skip rule
If the skip rule is suspected to be too broad (e.g., unexpected traffic bypassing WAF on the vault auth path):
# In terraform/waf/main.tf, change:
# vault_infisical_auth_skip_enabled = true
# to:
# vault_infisical_auth_skip_enabled = false
# Then apply:
cd terraform/waf
export CLOUDFLARE_API_TOKEN=$(infisical secrets get CF_WAF_EDIT \
--path /MooseQuest/cloudflare/ --plain)
export TF_VAR_raxx_app_zone_id=$(infisical secrets get CF_ZONE_ID_RAXX_APP \
--path /MooseQuest/cloudflare/ --plain)
export TF_VAR_getraxx_zone_id=$(infisical secrets get CF_ZONE_ID_GETRAXX \
--path /MooseQuest/cloudflare/ --plain)
terraform plan -out=tfplan
terraform apply tfplan
Disabling the skip rule will cause Infisical CLI machine-identity calls from AWS ASNs to fail with CF error 1010. This is an acceptable emergency trade-off if a security incident requires tightening the vault perimeter. Restore as soon as the incident is resolved.
How to tell vault access is broken
- Symptom 1: Agent sessions fail with "infisical: error: CF error 1010" or empty secret values.
- Symptom 2: CI pipelines fail at vault read steps (typically scripts/agents/mint_github_token.py or any script calling infisical secrets get).
- Symptom 3: Velvet rotation worker fails at vault write steps with 403 or silent empty responses.
How to diagnose (in order)
- Check CF WAF Events on raxx.app zone: Security → WAF, filter vault.raxx.app, last 30 min. Look for blocks on /api/v1/auth/.
- Verify the Priority 0.5 skip rule is present in the custom WAF ruleset (see §How to verify the skip rule is live above).
- If the skip rule is absent: the cross-stack state migration has not been applied, or the WAF stack was not yet applied. Apply with terraform apply per §Terraform management above.
- If the skip rule is present but Infisical CLI is still failing: verify the CF Access service token for the agent session is valid (infisical login --method=universal-auth). Stale tokens return 401 after passing the WAF.
- For persistent 1010 errors with a valid service token: verify decision = non_identity on the CF Access policy for vault.raxx.app (not allow). See feedback_cf_access_service_token_needs_non_identity.md.
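The ordering of the checks above can be summarised as a small decision helper. This is only a sketch of the triage logic in this section, not tooling that exists in the repo. The function name, its parameters, and the returned labels are all hypothetical.

```python
def next_diagnostic_step(skip_rule_present, http_status=None, cf_error=None):
    """Map observed symptoms to the next action, following the order above."""
    if not skip_rule_present:
        # Skip rule missing: state migration or WAF stack not applied yet.
        return "apply-waf-stack"
    if http_status == 401:
        # The request passed the WAF but the credentials were rejected.
        return "refresh-credentials"
    if cf_error == 1010:
        # Rule present and token valid, yet still blocked at the edge.
        return "check-access-policy-decision"
    return "unclassified"
```

An "unclassified" result means none of the known causes apply. At that point the criteria in the Escalation section are the better guide.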
Escalation
Wake the operator when:
- The vault is unreachable from the CF dashboard (not a WAF issue; the Lightsail host may be down).
- The CF Access email OTP flow is broken (human operator locked out of vault UI).
- Machine identity credentials are expired or revoked and cannot be refreshed automatically.
- The skip rule is disabled and an active agent rotation or deploy depends on vault access.
Cross-references
- WAF runbook: docs/ops/runbooks/waf.md
- ADR-0042 (CF Access service token patterns)
- ADR-0077 (CF WAF layered defense)
- Issue #680 (root cause: BFM blocking Infisical CLI)
- Issue #2143 (vault skip rule fix)
- Memory: feedback_cf_access_does_not_bypass_bot_fight_mode.md
- Memory: feedback_cf_access_service_token_needs_non_identity.md