Raxx · internal docs

CF Pages — Antlers Next.js runbook

System: Cloudflare Pages (Antlers Next.js: raxx-staging-next + raxx-prod-next)
Owner: sre-agent
Provisioned: 2026-05-27
Last reviewed: 2026-05-27
Related issues: #2883 (Phase 3 prod DNS cutover)


Project inventory

| Project name | CF Pages project ID | Default subdomain | Custom domain | Environment |
| --- | --- | --- | --- | --- |
| raxx-staging-next | 9932e284-b2bc-48a3-b6f0-4f2e601eccd1 | raxx-staging-next.pages.dev | staging-nextjs.raxx.app | staging |
| raxx-prod-next | 5347a333-6724-4e07-bc1f-83351b6cfe6a | raxx-prod-next.pages.dev | None (Phase 3 cutover) | production |

Project IDs are also stored in Infisical at /MooseQuest/cloudflare/ (env: prod):

Token references (values in vault — never inline):


Build configuration

| Setting | Value |
| --- | --- |
| Build command | npm run build:cf |
| Output directory | .vercel/output/static |
| Root directory | frontend/trademaster_ui |
| Production branch | main |

The npm run build:cf script is defined in frontend/trademaster_ui/package.json (feature-dev adds this as part of Phase 2 Wave A). It runs next build with the @cloudflare/next-on-pages adapter, which outputs the static bundle to .vercel/output/static.
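As a rough local pre-flight check, the sketch below confirms that the output directory exists and is not empty after a build. The helper name `check_build_output` is hypothetical and not part of the repository; the paths are the ones listed in the build configuration.

```shell
# Hypothetical helper: succeed only if the build output directory exists
# and contains at least one regular file.
check_build_output() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing: $dir" >&2
    return 1
  fi
  if [ -z "$(find "$dir" -type f | head -n 1)" ]; then
    echo "empty: $dir" >&2
    return 1
  fi
  echo "ok: $dir"
}

# Usage (not run here), from the repository root:
#   (cd frontend/trademaster_ui && npm run build:cf) \
#     && check_build_output frontend/trademaster_ui/.vercel/output/static
```

An empty or missing directory at this point suggests the adapter step did not run, which would also surface later as a failed deploy.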

Unlike the getraxx project, the Next.js build pipeline picks up environment variables set on the CF Pages project: the deploy workflow injects them, and the build reads them at build time. NEXT_PUBLIC_* values are inlined into the client bundle during next build, so a changed value takes effect only after a redeploy. See the GH Actions workflow for the injection pattern.


Environment variables

raxx-staging-next (production deployment)

NEXT_PUBLIC_API_URL=https://api-staging.raxx.app

raxx-prod-next (production deployment)

NEXT_PUBLIC_API_URL=https://api.raxx.app

Custom domains + DNS

staging-nextjs.raxx.app

raxx-prod-next — NO custom domain

Production project has no custom domain attached. The DNS cutover of raxx.app from the existing raxx-app CF Pages project to raxx-prod-next is Phase 3 work tracked in #2883. Do NOT add a custom domain or modify DNS for raxx.app apex until Phase 3.


GitHub Environments

| Environment name | Repo | Protection rules | Deploy trigger |
| --- | --- | --- | --- |
| staging-nextjs | raxx-app/TradeMasterAPI | None | Auto on push to main |
| production-nextjs | raxx-app/TradeMasterAPI | None (see note) | Workflow dispatch |

Note (reviewer gate): GitHub required-reviewer protection rules require a GitHub Team or Enterprise plan, so the production-nextjs environment was provisioned without a reviewer gate. Until the plan is upgraded, a manual review step must be enforced in the GH Actions workflow YAML itself: a separate approval job runs before the deploy job, and the deploy job declares environment: production-nextjs. See the operator-action card filed as part of this provisioning.

The existing production environment uses a branch policy rather than required reviewers — same pattern applies here until the plan gate is resolved.


Deploy approval flow

Staging (staging-nextjs environment)

Push to main
  └─> GH Actions workflow fires
       └─> deploy job uses environment: staging-nextjs
            └─> deploys to raxx-staging-next CF Pages project
                 └─> staging-nextjs.raxx.app serves the build

No approval required. Deploy fires automatically on every merge to main for paths matching frontend/trademaster_ui/**.
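To predict whether a given merge would trigger the staging deploy, a rough local approximation of the path filter can help. This is only a sketch: the authoritative filter lives in the workflow YAML, and `touches_frontend` is a hypothetical helper that treats the `frontend/trademaster_ui/**` glob as a simple path prefix.

```shell
# Hypothetical helper: read changed file paths on stdin (one per line) and
# succeed if at least one lies under frontend/trademaster_ui/.
touches_frontend() {
  grep -q '^frontend/trademaster_ui/'
}

# Usage (not run here), comparing the last commit on main with its parent:
#   git diff --name-only HEAD~1 HEAD | touches_frontend \
#     && echo "staging deploy would fire"
```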

Production (production-nextjs environment)

workflow_dispatch (manual trigger)
  └─> GH Actions workflow fires
       └─> approval-gate job (requires operator confirmation in workflow step)
            └─> deploy job uses environment: production-nextjs
                 └─> deploys to raxx-prod-next CF Pages project
                      └─> raxx-prod-next.pages.dev serves the build
                           └─> (no public traffic until Phase 3 DNS cutover)

Until the Phase 3 DNS cutover (#2883), prod deploys build the project and make it available on raxx-prod-next.pages.dev only. No end-user traffic flows to it.
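One way the operator confirmation in the approval-gate job could work is a typed-phrase check: the operator enters the target project name as a workflow_dispatch input, and the step fails unless it matches. The sketch below assumes that design. The function name, the `CONFIRM_INPUT` variable, and the choice of phrase are all hypothetical, and the real workflow may gate differently.

```shell
# Hypothetical gate: the operator must type the exact project name to proceed.
require_confirmation() {
  local typed="$1" expected="${2:-raxx-prod-next}"
  if [ "$typed" != "$expected" ]; then
    echo "confirmation mismatch: expected '$expected'" >&2
    return 1
  fi
  echo "confirmed: $expected"
}

# In a workflow step (CONFIRM_INPUT is an assumed env var mapped from the
# workflow_dispatch input):
#   require_confirmation "${CONFIRM_INPUT}" || exit 1
```

Requiring the full project name rather than a yes/no flag makes an accidental dispatch less likely to pass the gate.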


How to tell it's broken


How to diagnose (in order)

  1. Check CF Pages deploy token is valid:

TOKEN=$(infisical secrets get CF_PAGES_DEPLOY \
  --path /MooseQuest/cloudflare/ --env prod --plain)
curl -sS -H "Authorization: Bearer ${TOKEN}" \
  https://api.cloudflare.com/client/v4/user/tokens/verify
# Expect: {"success":true,"result":{"status":"active"}}

  2. Check project exists and has correct config:

ACCOUNT_ID=$(infisical secrets get CLOUDFLARE_ACCOUNT_ID \
  --path /MooseQuest/cloudflare/ --env prod --plain)
curl -sS \
  -H "Authorization: Bearer ${TOKEN}" \
  "https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/pages/projects/raxx-staging-next" \
  | python3 -m json.tool | grep -E '"name"|"subdomain"|"build_command"|"destination_dir"'

  3. Check custom domain verification status for staging-nextjs.raxx.app:

curl -sS \
  -H "Authorization: Bearer ${TOKEN}" \
  "https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/pages/projects/raxx-staging-next/domains" \
  | python3 -m json.tool
# Look for: "status": "active" (not "initializing" or "blocked")

  4. If domain status is blocked, check the DNS CNAME record is present and proxied:

DNS_TOKEN=$(infisical secrets get CLOUDFLARE_EDIT_DNS \
  --path /MooseQuest/cloudflare/ --env prod --plain)
curl -sS \
  -H "Authorization: Bearer ${DNS_TOKEN}" \
  "https://api.cloudflare.com/client/v4/zones/f12dbb5cac57d5591a5058874498a6d1/dns_records?name=staging-nextjs.raxx.app" \
  | python3 -m json.tool
# Expect: type=CNAME, content=raxx-staging-next.pages.dev, proxied=true

  5. If the deploy fails in GH Actions: check the Actions run logs for the exact error. Common causes: token scope, project name mismatch, output directory empty after build.
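The token check in the first diagnosis step can be turned into a pass/fail test instead of reading the JSON by eye. The sketch below is a minimal version that relies only on the response shape shown in that step; `token_is_active` is a hypothetical helper name.

```shell
# Hypothetical helper: read a /user/tokens/verify response on stdin and
# succeed only when success is true and the token status is "active".
token_is_active() {
  python3 -c '
import json, sys
try:
    body = json.load(sys.stdin)
except ValueError:
    sys.exit(2)  # not JSON (for example an HTML error page)
if not isinstance(body, dict):
    sys.exit(2)
result = body.get("result") or {}
ok = body.get("success") is True and result.get("status") == "active"
sys.exit(0 if ok else 1)
'
}

# Usage (not run here), with TOKEN read from the vault as in step 1:
#   curl -sS -H "Authorization: Bearer ${TOKEN}" \
#     https://api.cloudflare.com/client/v4/user/tokens/verify \
#     | token_is_active && echo "token ok"
```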

Known failure modes

Failure mode A: CF Pages deploy token expired

Symptom: Deploy step returns HTTP 403 or {"success":false,"errors":[{"code":9106,...}]}.

Cause: CF_PAGES_DEPLOY token hit its 90-day rotation cadence or was revoked.

Fix:

  1. Read CLOUDFLARE_ACCESS_MGMT_TOKEN from vault (/MooseQuest/cloudflare/) and roll CF_PAGES_DEPLOY per docs/ops/runbooks/cloudflare-tokens.md → Failure mode B.
  2. Write the new value back to vault at /MooseQuest/cloudflare/CF_PAGES_DEPLOY.
  3. Update the CF_PAGES_DEPLOY__EXPIRES_AT companion.
  4. Re-run the failed deploy workflow.

Verification: Deploy workflow completes with exit 0; project appears in the deployments list via CF API.
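Because the token rotates on a 90-day cadence, it can be useful to compute how many days remain before expiry. The sketch below assumes the CF_PAGES_DEPLOY__EXPIRES_AT companion holds a plain YYYY-MM-DD date; the stored format is not stated in this runbook, so adjust the parsing if it differs. `days_until_expiry` is a hypothetical helper.

```shell
# Hypothetical helper: print the number of days from TODAY until EXPIRES.
# Both arguments are YYYY-MM-DD; TODAY defaults to the current UTC date.
# The YYYY-MM-DD format of CF_PAGES_DEPLOY__EXPIRES_AT is an assumption.
days_until_expiry() {
  local expires="$1" today="${2:-$(date -u +%Y-%m-%d)}"
  python3 -c '
import sys
from datetime import date
expires, today = (date.fromisoformat(arg) for arg in sys.argv[1:3])
print((expires - today).days)
' "$expires" "$today"
}

# Usage (not run here; assumes the companion is stored at the same vault path):
#   EXPIRES=$(infisical secrets get CF_PAGES_DEPLOY__EXPIRES_AT \
#     --path /MooseQuest/cloudflare/ --env prod --plain)
#   days_until_expiry "${EXPIRES}"
```

A negative result means the recorded expiry date has already passed.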

Failure mode B: staging-nextjs.raxx.app domain stuck in initializing

Symptom: staging-nextjs.raxx.app serves a CF error instead of the Pages content >30 min after domain was attached.

Cause: Domain verification is pending — CF needs the CNAME to propagate and a TXT verification record to resolve.

Fix:

  1. Check domain status via API (step 3 in diagnosis).
  2. If status is initializing, wait up to 10 min for CF to verify the CNAME (record ID ca04428054c6a0de38f25e19eb663483 is already in place and proxied).
  3. If status is blocked, re-attach the domain:

```bash
curl -sS -X DELETE \
  -H "Authorization: Bearer ${TOKEN}" \
  "https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/pages/projects/raxx-staging-next/domains/staging-nextjs.raxx.app"

curl -sS -X POST \
  -H "Authorization: Bearer ${TOKEN}" \
  -H "Content-Type: application/json" \
  "https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/pages/projects/raxx-staging-next/domains" \
  -d '{"name":"staging-nextjs.raxx.app"}'
```

Verification: Domain status returns "status":"active" and curl -sI https://staging-nextjs.raxx.app returns HTTP 200 after first deploy.
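Waiting for verification can be scripted as a bounded polling loop rather than repeated manual checks. The sketch below is one possible shape: `wait_for_active` is a hypothetical helper that takes the name of a status-fetching function, and its defaults (20 polls, 30 seconds apart) are chosen to match the 10-minute wait in the fix steps.

```shell
# Hypothetical helper: call STATUS_CMD until it prints "active".
# Returns 0 on active, 2 on blocked, 1 if MAX_ATTEMPTS polls are exhausted.
wait_for_active() {
  local status_cmd="$1" max_attempts="${2:-20}" delay="${3:-30}"
  local attempt=1 status=""
  while [ "$attempt" -le "$max_attempts" ]; do
    status="$("$status_cmd")"
    case "$status" in
      active)
        echo "active after ${attempt} poll(s)"
        return 0
        ;;
      blocked)
        echo "blocked: re-attach the domain" >&2
        return 2
        ;;
    esac
    # Sleep between polls, but not after the final one.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  echo "still '${status}' after ${max_attempts} polls" >&2
  return 1
}

# Usage (not run here), with TOKEN and ACCOUNT_ID set as in the diagnosis
# steps. fetch_status is assumed to print the domain's "status" field:
#   fetch_status() {
#     curl -sS -H "Authorization: Bearer ${TOKEN}" \
#       "https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/pages/projects/raxx-staging-next/domains/staging-nextjs.raxx.app" \
#       | python3 -c 'import sys,json; print(json.load(sys.stdin)["result"]["status"])'
#   }
#   wait_for_active fetch_status 20 30
```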

Failure mode C: wrong NEXT_PUBLIC_API_URL in deployed bundle

Symptom: API calls from staging-nextjs.raxx.app hit the wrong backend (prod instead of staging, or vice versa).

Cause: Env var set on preview deployment config instead of production, or set in the GH Actions step env rather than the CF Pages project config.

Fix: Verify the env var is on the production deployment config:

curl -sS \
  -H "Authorization: Bearer ${TOKEN}" \
  "https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/pages/projects/raxx-staging-next" \
  | python3 -c "
import sys,json
p=json.load(sys.stdin)['result']
print(p.get('deployment_configs',{}).get('production',{}).get('env_vars',{}))
"

If missing or wrong, PATCH the project:

curl -sS -X PATCH \
  -H "Authorization: Bearer ${TOKEN}" \
  -H "Content-Type: application/json" \
  "https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/pages/projects/raxx-staging-next" \
  -d '{
    "deployment_configs": {
      "production": {
        "env_vars": {
          "NEXT_PUBLIC_API_URL": {"value": "https://api-staging.raxx.app"}
        }
      }
    }
  }'

Verification: Redeploy and confirm the NEXT_PUBLIC_API_URL value in the built bundle.
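One way to confirm the value in a built bundle is to search the output directory for the API origins and see which ones appear. The sketch below assumes the value is inlined as a literal string into files under the output directory; `bundle_api_urls` is a hypothetical helper, and the pattern only covers the two origins listed under Environment variables.

```shell
# Hypothetical helper: list the distinct raxx.app API origins embedded in
# files under DIR (one per line, sorted).
bundle_api_urls() {
  local dir="$1"
  grep -rhoE 'https://api(-staging)?\.raxx\.app' "$dir" | sort -u
}

# Usage (not run here), after a local build of the staging configuration:
#   bundle_api_urls frontend/trademaster_ui/.vercel/output/static
```

For a staging build, only https://api-staging.raxx.app should be listed; seeing https://api.raxx.app as well points at a misconfigured variable.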

Failure mode D: build command not found (npm run build:cf)

Symptom: Deploy fails with missing script: build:cf error.

Cause: The build:cf script hasn't been added to frontend/trademaster_ui/package.json yet (Phase 2 Wave A feature-dev task).

Fix: This is a feature-dev dependency, not an SRE fix. Check whether the feature-dev PR adding npm run build:cf has landed on main. If not, the workflow cannot deploy until it does. File an escalation to the feature-dev agent with the blocker.

Verification: the following command prints the build command, not MISSING:

cat frontend/trademaster_ui/package.json | python3 -c "import sys,json; s=json.load(sys.stdin); print(s.get('scripts',{}).get('build:cf','MISSING'))"


Phase 3 — production DNS cutover

The raxx-prod-next project is provisioned and receives no public traffic. The DNS cutover from the existing raxx-app CF Pages project (serving raxx.app) to raxx-prod-next is tracked in #2883.

Steps at cutover time (do NOT run now — Phase 3 only):

  1. Attach raxx.app as a custom domain to raxx-prod-next.
  2. Remove raxx.app custom domain from the existing raxx-app project.
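  3. Verify DNS propagation via dig raxx.app A. A zone apex cannot hold a literal CNAME; Cloudflare flattens it, so a CNAME query at the apex is expected to return no answer.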
  4. Run smoke tests against https://raxx.app.
  5. Update this runbook with the cutover date.

Emergency stop

To take staging-nextjs.raxx.app offline:

# Delete the custom domain from the CF Pages project
TOKEN=$(infisical secrets get CF_PAGES_DEPLOY \
  --path /MooseQuest/cloudflare/ --env prod --plain)
ACCOUNT_ID=$(infisical secrets get CLOUDFLARE_ACCOUNT_ID \
  --path /MooseQuest/cloudflare/ --env prod --plain)

curl -sS -X DELETE \
  -H "Authorization: Bearer ${TOKEN}" \
  "https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/pages/projects/raxx-staging-next/domains/staging-nextjs.raxx.app"

This removes the custom domain; raxx-staging-next.pages.dev remains accessible.
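To confirm the domain was actually detached, the domains listing from the diagnosis steps can be checked for the hostname. The sketch below assumes the listing returns a `result` array of objects with a `name` field; that response shape is an assumption, and `domain_attached` is a hypothetical helper.

```shell
# Hypothetical helper: read a Pages "list domains" response on stdin and
# succeed if NAME appears in result[].name. Response shape is an assumption.
domain_attached() {
  python3 -c '
import json, sys
body = json.load(sys.stdin)
names = [d.get("name") for d in (body.get("result") or [])]
sys.exit(0 if sys.argv[1] in names else 1)
' "$1"
}

# Usage (not run here), with TOKEN and ACCOUNT_ID set as above:
#   curl -sS -H "Authorization: Bearer ${TOKEN}" \
#     "https://api.cloudflare.com/client/v4/accounts/${ACCOUNT_ID}/pages/projects/raxx-staging-next/domains" \
#     | domain_attached staging-nextjs.raxx.app || echo "domain detached"
```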

Production: before the Phase 3 cutover, raxx-prod-next serves no public traffic, so there is nothing to stop.


Escalation

Escalate to operator when:

  - CF_PAGES_DEPLOY or CLOUDFLARE_ACCESS_MGMT_TOKEN needs re-minting (requires 2FA/dashboard access).
  - The raxx-prod-next project needs to be deleted and recreated.
  - Phase 3 DNS cutover is being executed (operator must be present).
  - A new CF Pages project needs to be provisioned for a surface not covered by this runbook.

