Raxx · internal docs


RCA — Synthetic gate 403 on authenticated routes due to raw-Heroku routing

Incident ID: 2026-05-26-synthetic-gate-routed-via-cf
Date: 2026-05-26
Severity: SEV-3
Duration: ~6 days (detection 2026-05-20 → remediation PR 2026-05-26)
Blast radius: Internal CI only, no user-visible impact. 42 auto-filed GH issues on #2630 + #2631 over 6 days.
Author: sre-agent

Summary

The synthetic-gate workflow targeted the raw Heroku origin URL (raxx-api-staging-1a19fb3873b9.herokuapp.com) rather than the Cloudflare-proxied API hostname (api-staging.raxx.app). Raptor's cloudflare_origin_guard middleware requires the CF-Connecting-IP header. Cloudflare injects that header only when it proxies a request, so direct-to-Heroku requests never carry it. The /health endpoint is allowlisted by the guard and passed, which masked the issue, while the historical_data and backtest probes returned 403 on every run. Over 6 days, 42 GitHub issues were auto-filed against #2630 and #2631, all of them noise with no signal value.
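The guard behaviour described above can be sketched as a small decision function. This is an illustrative model only, not Raptor's actual cloudflare_origin_guard code: the function name, the allowlist contents beyond /health, and the /historical_data path used in the usage note are assumptions made for the example.

```python
from typing import Mapping

# Assumption: /health is the only allowlisted path, as the summary describes.
ALLOWLISTED_PATHS = frozenset({"/health"})


def origin_guard_status(path: str, headers: Mapping[str, str]) -> int:
    """Return the HTTP status a guard of this kind would produce (hypothetical model)."""
    # Allowlisted paths bypass the header check entirely.
    if path in ALLOWLISTED_PATHS:
        return 200
    # HTTP header names are case-insensitive, so compare in lowercase.
    header_names = {name.lower() for name in headers}
    # Cloudflare adds CF-Connecting-IP only when it proxies the request;
    # a request sent straight to the Heroku origin arrives without it.
    if "cf-connecting-ip" not in header_names:
        return 403
    return 200
```

Under this model, a direct-to-origin probe of /health returns 200 while a direct probe of a guarded path (for example a hypothetical /historical_data) returns 403, which matches the mixed pass/fail pattern the gate reported.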

Timeline (all times UTC)

Impact

What went well

What didn't go well

Root cause analysis

Detection

Resolution

Action items

| # | Action | Owner | Due | Issue |
|---|--------|-------|-----|-------|
| 1 | Apply terraform/cf-access/ to create CF Access apps + service token | operator | 2026-05-27 | #2630 |
| 2 | Write service-token credentials to Infisical /raxx/synthetic-gate/ | operator | 2026-05-27 | #2630 |
| 3 | Set GH Actions secrets CF_ACCESS_CLIENT_ID_SYNTHETIC_GATE + CF_ACCESS_CLIENT_SECRET_SYNTHETIC_GATE | operator | 2026-05-27 | #2630 |
| 4 | Run workflow_dispatch, confirm all 4 checks PASS on staging + prod | operator | 2026-05-27 | #2630 |
| 5 | After 2 consecutive green scheduled runs, close #2630 + #2631 with resolution comments | sre-agent | 2026-05-29 | #2630, #2631 |
| 6 | Add "CF Access mode" check to the pre-deploy gate checklist (detect bootstrap-window regression) | sre-agent | 2026-06-02 | new |
| 7 | File calendar reminder 60 days before synthetic_gate_service_token_expires_at for token rotation | operator | after Terraform apply | new |
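As a rough illustration of what a probe looks like once action items 1 to 3 are in place, the sketch below builds a request against the Cloudflare-proxied hostname and attaches the service-token credentials. It is not the workflow's actual code. The helper name build_probe_request and the check that rejects herokuapp.com hosts are assumptions; CF-Access-Client-Id and CF-Access-Client-Secret are the header names Cloudflare Access documents for service tokens, and the environment variable names are taken from action item 3.

```python
import os
import urllib.request
from typing import Mapping
from urllib.parse import urlsplit


def build_probe_request(
    url: str, env: Mapping[str, str] = os.environ
) -> urllib.request.Request:
    """Build a probe request routed through Cloudflare (hypothetical helper)."""
    host = urlsplit(url).hostname or ""
    # Assumed safeguard: refuse raw Heroku origin URLs so a probe cannot
    # bypass Cloudflare the way the original workflow did.
    if host.endswith(".herokuapp.com"):
        raise ValueError(f"probe must target the Cloudflare-proxied hostname, got {host}")
    return urllib.request.Request(
        url,
        headers={
            # Service-token headers as documented by Cloudflare Access.
            "CF-Access-Client-Id": env["CF_ACCESS_CLIENT_ID_SYNTHETIC_GATE"],
            "CF-Access-Client-Secret": env["CF_ACCESS_CLIENT_SECRET_SYNTHETIC_GATE"],
        },
    )
```

For example, build_probe_request("https://api-staging.raxx.app/health") returns a request ready to pass to urllib.request.urlopen, while the same call with the raw herokuapp.com URL raises ValueError before any network traffic is sent.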

References