Raxx · internal docs

Incident RCA — getraxx.com unserved (2026-05-08 UTC)

Status: Fix in PR (see #1368 comment; pending merge + re-deploy)
Severity: SEV-2 (marketing front door unserved; no customer data at risk)
Reported: #1368
Fix PRs: #1369 (landing + workflow) · this PR (grep bug in workflow)


Timeline

Time (UTC) Event
2026-04-22 getraxx.com zone created on Cloudflare. DNS CNAME getraxx.com → getraxx.pages.dev added. No CF Pages project created. HTTP 403 responses begin.
2026-04-23 Commit a837db0 (feature/brand-tokens-getraxx-landing) builds landing page components inside frontend/trademaster_ui/src/pages/getraxx/. Branch never merged; no deploy workflow authored.
2026-05-08 Operator notices https://getraxx.com/ returns HTTP 403 with Cloudflare server header. Filed as #1368.
2026-05-08 SRE investigation confirms: CNAME points to getraxx.pages.dev, but no CF Pages project named getraxx exists on the account. Cloudflare returns 403 for unresolvable custom hostname.
2026-05-08 Feature-developer PR #1369 opens: standalone frontend/getraxx-landing/ project + deploy-getraxx.yml workflow.
2026-05-09 01:13 UTC PR #1369 merges to main (commit dfda2b6). deploy-getraxx.yml triggers.
2026-05-09 01:14 UTC deploy-getraxx.yml run 25587414670 fails at "Ensure CF Pages custom domain (getraxx.com)" step. CF Pages project getraxx is created successfully. Custom domain attach API call returns "success": true (pretty-printed JSON with space after colon). Shell grep -q '"success":true' (no space) does not match. success_flag stays "false". Script exits 1. Pages deploy never executes.
2026-05-10 SRE re-investigation. curl -I https://getraxx.com returns HTTP 522 (connection timeout) — changed from 403 because the CF Pages project now exists but has zero deployments. Root cause of current state: grep pattern bug in the workflow.
2026-05-10 Fix PR opened (this PR): grep -qE '"success":[[:space:]]*true' in both custom-domain steps.

Root cause

Three independent failures compounded — original two from the 2026-05-08 investigation plus a third introduced by the fix workflow:

  1. No CF Pages project created. The DNS CNAME for getraxx.com pointed to getraxx.pages.dev, but no CF Pages project with that name was ever provisioned. Cloudflare returns HTTP 403 for an unresolvable custom hostname.

  2. Landing source not standalone. The React landing page (built in commit a837db0) was embedded inside the Antlers CRA app with no standalone deployable.

  3. Grep pattern assumed compact JSON. deploy-getraxx.yml checked for '"success":true' (no whitespace), but the Cloudflare v4 API returned pretty-printed JSON: "success": true (space after the colon). The shell script therefore treated a successful API response as a failure and aborted before the Pages deploy step. The CF Pages project creation and the custom domain attach both completed successfully; only the guard check failed.
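The third failure can be reproduced in isolation. The following is a minimal sketch, not the workflow's actual step: the response body is a hypothetical stand-in shaped like the pretty-printed JSON described above, and the variable names are illustrative. Only the two grep patterns are taken from this RCA.

```shell
#!/usr/bin/env bash
# Hypothetical response body (assumption): pretty-printed, with a space
# after the colon, as described in the timeline.
response='{
  "success": true,
  "errors": []
}'

# Original guard: fixed compact pattern, no whitespace allowed after the colon.
if printf '%s\n' "$response" | grep -q '"success":true'; then
  compact_match="yes"
else
  compact_match="no"
fi

# Patched guard: allow any amount of whitespace between the colon and the value.
if printf '%s\n' "$response" | grep -qE '"success":[[:space:]]*true'; then
  tolerant_match="yes"
else
  tolerant_match="no"
fi

echo "compact=$compact_match tolerant=$tolerant_match"
# prints: compact=no tolerant=yes
```

The tolerant pattern still matches compact output, so it is safe whichever formatting the API returns. Parsing the body with a JSON-aware tool would be stricter still, but that is a design option rather than what this PR does.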


Current state (2026-05-10)

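As of the 2026-05-10 re-investigation, curl -I https://getraxx.com returns HTTP 522: the CF Pages project exists but has zero deployments. On merge of this fix PR, deploy-getraxx.yml will re-trigger. The idempotent steps (project create, domain attach) will no-op because both resources already exist. The deploy step will then upload the first artifact, after which the site should serve HTTP 200.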
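The status codes seen during this incident map to distinct states, so a small probe can report which state the site is in. This is a rough sketch under stated assumptions: the function names classify_status and probe_status are hypothetical, and the mapping only restates what this RCA observed (403, 522, 200), not a general Cloudflare contract.

```shell
#!/usr/bin/env bash
# Map an HTTP status code to the state this RCA associates with it.
classify_status() {
  case "$1" in
    200) echo "serving" ;;
    403) echo "no Pages project behind the CNAME" ;;
    522) echo "Pages project exists but has no deployment" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Fetch only the status code of a HEAD request.
# -s silences progress, -I sends HEAD, -o discards the headers,
# -w prints the numeric code (000 when no HTTP response was received).
probe_status() {
  curl -s -I -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}
```

After the deploy, running classify_status "$(probe_status https://getraxx.com/)" should print "serving" if the fix worked as expected.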


Fix

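This PR patches .github/workflows/deploy-getraxx.yml: both custom-domain steps now check the API response with grep -qE '"success":[[:space:]]*true' instead of grep -q '"success":true', so whitespace after the colon no longer defeats the guard.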


Operator prerequisite (still outstanding)

The CLOUDFLARE_EDIT_DNS token covers the raxx.app zone only. The DNS bootstrap steps in deploy-getraxx.yml require DNS:Edit scope on the getraxx.com zone (Zone ID 0bdcee38d1da2d021eb6166f0bd6204f). Because the DNS CNAME already exists, those steps will no-op on the CNAME check and will not attempt creation, so the missing scope does not block this deploy. The Pages deploy itself does not require the DNS token; the site will serve after merge regardless.

Action: Extend CLOUDFLARE_EDIT_DNS to cover the getraxx.com zone when convenient (low urgency; DNS CNAME is already correct).


Contributing factors


Action items

Action Owner Due Issue
Merge this fix PR Kristerpher 2026-05-10 #1368
Verify curl -I https://getraxx.com returns 200 after deploy SRE 2026-05-10 #1368
Extend CLOUDFLARE_EDIT_DNS to cover getraxx.com zone Operator 2026-05-17
Add probe alert: non-200 from getraxx.com fires Slack alert SRE 2026-05-17
Audit other workflows for '"success":true' compact-JSON grep anti-pattern SRE 2026-05-17
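For the audit item, a fixed-string recursive grep is one way to list candidate lines. This is a minimal sketch: the function name find_compact_success_grep is hypothetical, and the default directory .github/workflows is assumed from the path of deploy-getraxx.yml. It only finds the exact literal '"success":true', so other brittle patterns would need their own search.

```shell
#!/usr/bin/env bash
# List every line under a directory that contains the compact-JSON literal.
# -r recurses, -n prints line numbers, -F treats the pattern as a fixed string.
# Exit status follows grep: 0 when at least one hit is found, 1 when none.
find_compact_success_grep() {
  local dir="${1:-.github/workflows}"
  grep -rnF -- '"success":true' "$dir"
}
```

Run from the repository root, find_compact_success_grep prints file:line:content for each hit. The patched pattern '"success":[[:space:]]*true' does not contain the literal and is therefore not reported.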