System: Cloudflare WAF — raxx.app zone (raxx.app, api.raxx.app, console.raxx.app, vault.raxx.app, tickets.raxx.app) + getraxx.com zone (getraxx.com, www.getraxx.com)
Owner: operator
Last incident: n/a (initial setup — SC-WAF-01 #1737)
Last reviewed: 2026-05-17
- FirewallMatchesActions on attack-pattern requests.
- FirewallMatchesActions contains block on /api/webhooks/postmark from Postmark IP ranges.
- cf-access-client-id skip rule is present in custom ruleset.
- terraform plan shows unexpected diffs to CF Access resources (the WAF stack should NOT touch terraform/cf-access/ state).
- console.raxx.app, vault.raxx.app, or tickets.raxx.app. WAF rate limit firing before CF Access (operator IP hit the 60/min ceiling). Check operator_surface_hostnames rate limit rule in the raxx.app zone rate limit ruleset.
- terraform apply fails with "ruleset already exists" on raxx.app http_request_firewall_custom phase. This is the cross-stack conflict; see §Cross-stack ruleset migration before proceeding.
- raxx.app or getraxx.com zone → Security → WAF. Filter by last 30 min. Expected: zero blocking actions in Phase 1 (log-only).
- FirewallMatchesActions field. A block action in Phase 1 indicates a rule error.
- FirewallMatchesRuleIDs with the ruleset IDs from terraform output. Identify which ruleset (managed vs custom vs rate limit) fired.
- GET /zones/{zone_id}/rulesets/{custom_waf_ruleset_id}: verify the Postmark IP ranges in rule Priority 2 match the current Postmark IP list.
- GET /zones/{zone_id}/rulesets/{custom_waf_ruleset_id}: verify Priority 1 skip rule expression is (len(http.request.headers["cf-access-client-id"]) gt 0) and is enabled.
- cd terraform/waf && terraform plan. Any non-zero diff against a known-good apply indicates dashboard drift (Failure Mode F11).

This stack requires a CF API token with Zone:WAF:Edit + Zone:Logs:Edit on both zones.
Verify your token has the correct scopes before applying:
curl -s -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
https://api.cloudflare.com/client/v4/user/tokens/verify | python3 -m json.tool
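The verify call above can be turned into a pass/fail check. The sketch below is a minimal example, assuming the response carries a `"status": "active"` field for a usable token; the function name `token_is_active` is hypothetical. As far as I know, this endpoint reports whether the token is valid rather than which permissions it holds, so scope coverage still has to be confirmed separately.

```sh
# Reads a /user/tokens/verify JSON response on stdin.
# Succeeds (exit 0) only if the response reports status "active".
token_is_active() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"active"'
}

# Usage (needs network access and a real token):
#   curl -s -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
#     https://api.cloudflare.com/client/v4/user/tokens/verify \
#     | token_is_active && echo "token active" || echo "token NOT active"
```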
The CLOUDFLARE_RAXX_AUTOMATION_API_TOKEN documented in terraform/README.md was
confirmed to NOT have WAF:Edit scope as of 2026-04-30 (see cloudflare-rate-limiting.md).
If that token has not been updated, mint a new WAF-scoped token:
POST /api/v3/secrets/raw/CF_WAF_EDIT at path /MooseQuest/cloudflare/
export CLOUDFLARE_API_TOKEN=$(infisical secrets get CF_WAF_EDIT --path /MooseQuest/cloudflare/ --plain)
Symptom: Customer reports 403 on a valid request. WAF Events log shows an OWASP or CF Managed rule firing on a legitimate path.
Cause: OWASP CRS triggering on valid JSON body or API field names containing SQL/XSS patterns. Most common on api.raxx.app with complex order payloads.
Fix:
# Identify the rule ID from WAF Events or Logpush
# Edit terraform/waf/terraform.tfvars: set owasp_action = "log" to revert to observation
# Or apply a per-rule override in terraform/modules/cf-waf/main.tf overrides block
cd terraform/waf
export CLOUDFLARE_API_TOKEN=$(infisical secrets get CF_WAF_EDIT --path /MooseQuest/cloudflare/ --plain)
terraform plan -out=tfplan
terraform apply tfplan
Verification: Customer can complete the previously blocked action. WAF Events shows "log" not "block" for the rule.
Phase impact: Rolling back to log is always safe.
Docs: waf-strategy.md §8 Phase 1, Failure Mode F1.
Symptom: Postmark webhook delivery failures. FreeScout inbound email stops. Logpush shows block on /api/webhooks/postmark.
Cause: Postmark rotated their delivery IP ranges without notice.
Fix:
# Get current Postmark IP ranges from:
# https://postmarkapp.com/support/article/800-ips-for-rate-limiting-or-firewall-rules
# Update terraform/waf/main.tf postmark_ip_ranges in both module calls
# Then:
cd terraform/waf
terraform plan -out=tfplan
terraform apply tfplan
Verification: curl -X POST https://api.raxx.app/api/webhooks/postmark from a Postmark IP returns 200 (not 403). Logpush shows no block on this path.
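Before applying an updated range list, it can help to confirm that a sender address seen in the block events actually falls inside the configured ranges. The following is a minimal IPv4-only sketch of a CIDR membership check; the function names are hypothetical, and the addresses in the usage comment are documentation-range placeholders, not Postmark IPs.

```sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The function body runs in a subshell so the IFS change does not leak.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Succeeds if address $1 is inside CIDR block $2 (a bare address is treated as /32).
ip_in_cidr() (
  net=${2%/*}
  bits=${2#*/}
  case $2 in */*) ;; *) bits=32 ;; esac
  if [ "$bits" -eq 0 ]; then
    mask=0
  else
    mask=$(( (4294967295 << (32 - $bits)) & 4294967295 ))
  fi
  [ $(( $(ip_to_int "$1") & $mask )) -eq $(( $(ip_to_int "$net") & $mask )) ]
)

# Usage:
#   ip_in_cidr 192.0.2.10 192.0.2.0/24 && echo "covered" || echo "NOT covered"
```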
Symptom: Velvet, CI, or Console machine calls to Queue or Raptor returning 403 or CAPTCHA challenge.
Cause: New service token not matching the skip rule, or BFM skip rule accidentally disabled.
Fix:
# Verify the skip rule in CF dashboard:
# raxx.app zone → Security → WAF → Custom rules → "Priority 1 — skip BFM..."
# Confirm: expression = (len(http.request.headers["cf-access-client-id"]) gt 0)
# Confirm: Action = Skip, Status = Enabled
# If the rule is present but not working, verify the service token is sending
# the CF-Access-Client-Id header. Trace with:
curl -v -H "CF-Access-Client-Id: <token-id>" -H "CF-Access-Client-Secret: <token-secret>" \
https://api.raxx.app/health
Verification: Machine caller returns expected response (not 403/challenge). Logpush shows no block on affected path.
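The "Logpush shows no block" check can be done against a downloaded log file. The sketch below assumes the Logpush output is compact newline-delimited JSON containing the `ClientRequestPath` and `FirewallMatchesActions` fields; the function name and the file name in the usage comment are hypothetical.

```sh
# Reads Logpush NDJSON on stdin and prints how many records for the given
# request path include "block" among their FirewallMatchesActions.
count_blocks_on_path() {
  grep -F "\"ClientRequestPath\":\"$1\"" \
    | grep -Ec '"FirewallMatchesActions":\[[^]]*"block"' || true
}

# Usage:
#   count_blocks_on_path /api/webhooks/postmark < waf-logs.ndjson
# A result of 0 means no block action was recorded for that path.
```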
Symptom: Stripe webhook delivery failures. Payment processing lag. Rate limit action fires on /api/v1/billing/webhook.
Cause: Rate limit threshold on global or order path too tight during a Stripe event replay burst.
Fix:
# Immediately revert rate_limit_action to "log" (observation mode):
# In terraform.tfvars: rate_limit_action = "log"
cd terraform/waf
terraform plan -out=tfplan
terraform apply tfplan
Verification: Stripe webhook delivery resumes. Check Stripe dashboard for webhook retry status.
Symptom: terraform plan shows diff for a resource that was not intentionally changed. Indicates a direct CF dashboard edit (not via Terraform).
Fix:
cd terraform/waf
# Review the diff carefully. If the dashboard state is correct:
# Import the changed resource into TF state and update main.tf to match.
# If TF state is correct:
terraform apply -target=<resource_address>
Prevention: All WAF changes must go through Terraform. No direct CF dashboard edits after first apply (ADR-0077 D2, ADR-0051).
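Drift detection (Failure Mode F11) can be scripted with `terraform plan -detailed-exitcode`, which exits 0 when the plan is empty, 2 when it contains changes, and 1 on error. A minimal sketch, with hypothetical function names and output labels:

```sh
# Map a `terraform plan -detailed-exitcode` exit status to a short label.
describe_plan_exit() {
  case "$1" in
    0) echo "no-diff" ;;
    2) echo "diff-present" ;;
    *) echo "plan-error" ;;
  esac
}

# Run a plan in the current directory and report the result.
# Needs a valid CLOUDFLARE_API_TOKEN and an initialised working directory.
check_drift() (
  rc=0
  terraform plan -detailed-exitcode -input=false >/dev/null || rc=$?
  describe_plan_exit "$rc"
)

# Usage:
#   cd terraform/waf && check_drift
```

A "diff-present" result on a stack nobody has touched since the last apply is the signal to review the plan by hand before deciding which side is correct.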
Phase transitions require explicit operator sign-off. Do not advance phases autonomously.
| Phase | tfvars change | Gate criteria |
|---|---|---|
| Phase 1 → Phase 2 | managed_ruleset_action = "managed_challenge", rate_limit_action = "managed_challenge" | 7-day log soak; false-positive rate <1% |
| Phase 2 → Phase 3 | managed_ruleset_action = "block", rate_limit_action = "block" | 72h; zero legitimate flows challenged |
| Phase 4 → Phase 5 | n/a (flag flip: FLAG_ENFORCE_CF_ORIGIN) | 7-day Phase 4 soak; SC-WAF-07 (#1741) |
Always run terraform plan and review before terraform apply on any phase change.
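The Phase 1 → Phase 2 gate includes a false-positive rate below 1%. The runbook does not define the denominator, so the sketch below assumes the rate is false-positive matches divided by total WAF matches over the soak window. The function name and the example counts are hypothetical, and the result is an input to operator sign-off, not a replacement for it.

```sh
# Succeeds if false_positives / total is strictly below 1%.
# Integer arithmetic only: fp/total < 1/100  <=>  fp * 100 < total.
fp_rate_below_one_percent() {
  [ "$2" -gt 0 ] && [ $(( $1 * 100 )) -lt "$2" ]
}

# Usage:
#   fp_rate_below_one_percent 12 2000 && echo "gate met" || echo "gate NOT met"
```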
Fastest rollback: set all actions to log/simulate and apply. ~30s CF propagation.
cd terraform/waf
# Edit terraform.tfvars:
# managed_ruleset_action = "log"
# owasp_action = "log"
# auth_challenge_action = "log"
# rate_limit_action = "simulate"
export CLOUDFLARE_API_TOKEN=$(infisical secrets get CF_WAF_EDIT --path /MooseQuest/cloudflare/ --plain)
terraform plan -out=tfplan
terraform apply tfplan
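The tfvars edits above can be scripted so a rollback does not depend on hand-editing under pressure. A minimal sketch, assuming each variable already exists in terraform.tfvars as a single `key = "value"` line; `set_tfvar` is a hypothetical helper, not part of the repo.

```sh
# Rewrite the value of an existing `key = "..."` line in a tfvars file.
# Lines for other keys are left untouched; a missing key is not added.
set_tfvar() (
  key=$1
  value=$2
  file=${3:-terraform.tfvars}
  tmp=$(mktemp) || exit 1
  sed "s|^\(${key}[[:space:]]*=[[:space:]]*\).*|\1\"${value}\"|" "$file" >"$tmp" \
    && cat "$tmp" >"$file"
  rc=$?
  rm -f "$tmp"
  exit "$rc"
)

# Usage (from terraform/waf), followed by the usual plan and apply:
#   set_tfvar managed_ruleset_action log
#   set_tfvar owasp_action log
#   set_tfvar auth_challenge_action log
#   set_tfvar rate_limit_action simulate
```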
Full removal (removes all WAF rulesets and rate limits; CF Access unaffected):
cd terraform/waf
terraform destroy
Note: terraform destroy removes WAF only. It does not touch terraform/cf-access/ (separate state file).
cd terraform/waf
# 1. Set the CF WAF-scoped API token
export CLOUDFLARE_API_TOKEN=$(infisical secrets get CF_WAF_EDIT \
--path /MooseQuest/cloudflare/ --plain)
# 2. Set zone IDs (populate terraform.tfvars REPLACE_WITH_* placeholders)
# raxx.app zone: f12dbb5cac57d5591a5058874498a6d1 (from cloudflare-rate-limiting.md)
# getraxx.com zone: retrieve from CF dashboard
# 3. If SC-WAF-00 is complete, set Logpush vars from SSM:
# export TF_VAR_logpush_destination_conf=$(aws ssm get-parameter \
# --name /raxx/waf/logpush_destination_conf --with-decryption \
# --query Parameter.Value --output text)
# export TF_VAR_logpush_ownership_challenge=$(aws ssm get-parameter \
# --name /raxx/waf/logpush_ownership_challenge --with-decryption \
# --query Parameter.Value --output text)
# 4. Init + plan + apply
terraform init
terraform plan -out=tfplan
# Review: all changes must be additive; no modifications to cf-access/ resources
terraform apply tfplan
# 5. Verify
terraform output
# Check CF dashboard: Security → WAF → Custom rules
# Expected: all rules show mode "log"; no blocking actions
The logpush_destination_conf and logpush_ownership_challenge variables are empty
by default. The Logpush job is not created until SC-WAF-00 (#1736) completes.
SC-WAF-00 operator actions:
1. Create S3 bucket for WAF logs (raxx-waf-logs-prod recommended).
2. Create IAM user with s3:PutObject on that bucket only.
3. Run Cloudflare ownership challenge for the destination.
4. Store destination conf and challenge token in SSM:
- /raxx/waf/logpush_destination_conf
- /raxx/waf/logpush_ownership_challenge
5. Re-apply this stack with the SSM values injected via TF_VAR_*.
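For step 4, the destination conf stored in SSM is a single string. The sketch below assumes the usual Cloudflare Logpush S3 form, `s3://<bucket>/<path>?region=<region>`; the helper name, the `waf` prefix, and the `us-east-1` region in the usage comment are placeholders, not values from this runbook.

```sh
# Build a Logpush S3 destination string from bucket, key prefix, and region.
logpush_s3_destination() {
  printf 's3://%s/%s?region=%s\n' "$1" "$2" "$3"
}

# Usage (prefix and region are placeholders):
#   logpush_s3_destination raxx-waf-logs-prod waf us-east-1
```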
Code migration status: COMPLETE as of 2026-05-17 (Issue #2183, PR pending).
The freescout_lambda_skip rule has been moved out of terraform/cf-access/freescout_service_token.tf and into terraform/modules/cf-waf/main.tf as a Priority 0 dynamic rule inside cloudflare_ruleset.custom_waf. The cloudflare_ruleset.freescout_lambda_skip resource declaration has been removed from terraform/cf-access.
What remains: two state operations. These require live CF credentials (#2328 token refresh) and cannot be executed until tokens are valid. Until then, terraform plan on terraform/cf-access will show the ruleset as a planned destroy — do not run terraform apply on cf-access until Step 2 below is complete.
State migration — run once after #2328 token refresh:
Step 1 — import the live zone-default ruleset into terraform/waf state:
cd terraform/waf
export CLOUDFLARE_API_TOKEN=$(infisical secrets get CF_ACCESS_MGMT \
--path /MooseQuest/cloudflare/ --plain)
export TF_VAR_raxx_app_zone_id="f12dbb5cac57d5591a5058874498a6d1"
export TF_VAR_getraxx_zone_id=$(infisical secrets get CF_ZONE_ID_GETRAXX \
--path /MooseQuest/cloudflare/ --plain)
terraform init
terraform import module.waf_raxx_app.cloudflare_ruleset.custom_waf \
zones/f12dbb5cac57d5591a5058874498a6d1/17dc768ccadf4d02ae279e133b7b5bfd
Step 2 — remove from terraform/cf-access state (does NOT destroy the CF resource):
cd terraform/cf-access
export CLOUDFLARE_API_TOKEN=$(infisical secrets get CF_ACCESS_MGMT \
--path /MooseQuest/cloudflare/ --plain)
terraform init
terraform state rm cloudflare_ruleset.freescout_lambda_skip
Step 3 — plan both stacks; both must show zero diff on the ruleset:
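# Run each plan in a subshell so both paths resolve from the repo root.
# Save the waf plan so Step 4 applies exactly what was reviewed here.
(cd terraform/waf && terraform plan -out=tfplan)
(cd terraform/cf-access && terraform plan)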
Expected terraform/waf plan: new resources for managed WAF, rate limits, zone settings, logpush — zero destroys. The custom_waf ruleset shows as an in-place update (Priority 0 rule added, all other rules preserved).
Expected terraform/cf-access plan: zero changes (ruleset resource gone from both config and state).
Step 4 — apply terraform/waf only:
cd terraform/waf
terraform apply tfplan
Step 5 — verify in CF dashboard:
raxx.app zone → Security → WAF → Custom rules: Priority 0 skip rule present for tickets.raxx.app/api with CF-Access-Client-Id condition.
Tracking: Required before #1735 can be closed.
- docs/architecture/waf-strategy.md
- docs/architecture/adr/0077-cloudflare-waf-layered-defense.md
- docs/security/waf-threat-model-2026-05-11.md
- docs/ops/runbooks/cloudflare-tokens.md
- docs/ops/runbooks/cloudflare-rate-limiting.md
- docs/architecture/raxx-app-track-b.md, SC-WAF-07 (#1741)

Wake the operator when:
- A WAF rule is blocking customers in Phase 1 (log mode should never block)
- CF WAF Events shows a block action that cannot be explained by the ruleset
- terraform destroy is being considered (impacts all WAF protection for both zones)
- Any incident that involves the WAF affecting payment or order submission flows