ADR 0106 — Antlers Next.js production cutover strategy

Status: Accepted
Date: 2026-05-27 UTC
Deciders: Kristerpher (operator); software-architect
Scope: raxx.app, CF Pages production alias cutover from CRA (raxx-app project) to Next.js (raxx-prod-next project)
Parent issue: #2883
Epic: #2872
Refs: ADR-0105, ADR-0105-addendum-phase0, docs/architecture/adr/0105-cf-pages-compat-audit-2026-05-27.md


TL;DR verdict

Recommendation: Strategy A — hard DNS swap (CNAME re-point), with CF Pages deployment-alias rollback path.

Rationale: for a single-operator, personal-use-first launch posture with CF Access on the domain and no real user traffic, the added complexity of Strategy B (path-based routing) and Strategy C (CF Worker canary) buys nothing. The WebAuthn RP ID invariant is safest under Strategy A because the RP ID (raxx.app) never migrates; the only thing that changes is which CF Pages project answers behind the DNS CNAME. Rollback is a one-line CF Pages alias update, not a DNS TTL wait: by rollback time the CNAME already points at raxx-prod-next.pages.dev, so rolling back means publishing a deployment in raxx-prod-next that serves the pinned CRA build, not changing DNS again.


Background

Current state

Target state

What is NOT changing


Invariants (restated)

  1. WebAuthn RP ID = raxx.app is inviolate. Any cutover that changes the effective origin seen by navigator.credentials.create() or .get() disqualifies enrolled passkeys. The RP ID must remain raxx.app before, during, and after cutover. Strategies are evaluated against this constraint first.
  2. No stored credentials: no action required, because the frontend migration does not touch credential storage.
  3. CF Access gate stays on during personal-use posture — cutover must preserve the CF Access application binding at all times.
  4. Audit trail — the cutover itself is a deployment action; the operator performs it with an explicit workflow_dispatch; the GitHub Actions log is the audit record.
  5. Paper-first gating — no live-trading code paths exist in Next.js Phase 3; this invariant is already satisfied by the application logic, not the DNS layer.

Strategy A — Hard DNS swap (CF Pages CNAME re-point)

Description

The production CNAME for raxx.app is updated to point at the raxx-prod-next CF Pages project. The CRA build in raxx-app is retained as a dormant project. Rollback means publishing the CRA build as a deployment under raxx-prod-next (using CF Pages' deployment history, not a DNS change), because by rollback time the CNAME already targets raxx-prod-next.

Cutover sequence

| Step | Action | Operator decision point? |
| --- | --- | --- |
| 0 | Phase 2 gate: all Playwright E2E tests green on staging (raxx-staging-next) | No (automated gate) |
| 1 | Confirm raxx-prod-next CF Pages project exists and has a successful build | No (CI gate) |
| 2 | Record the current CRA deployment ID from raxx-app CF Pages history (for rollback documentation) | No (automated in workflow) |
| 3 | Attach raxx.app custom domain to raxx-prod-next CF Pages project via CF Pages API | No (automated) |
| 4 | In Cloudflare DNS: update CNAME raxx.app → raxx-prod-next.pages.dev (proxied) | YES (operator runs workflow_dispatch on deploy-antlers-cutover.yml) |
| 5 | Verify CF Access application binding now points to raxx-prod-next custom domain | No (automated health check) |
| 6 | Run post-cutover smoke (#2884), 5-minute delay after Step 4 | No (automated workflow_run trigger) |
| 7 | Remove raxx.app custom domain from raxx-app CF Pages project | No (automated; avoids the CF Pages "duplicate custom domain" error) |
| 8 | Begin 72-hour Sentry soak window | No (monitoring alert configured in Phase 2) |
| 9 | At T+14d: decommission sub-card (#2885), retiring frontend/trademaster_ui/ | YES (operator files/approves retirement card) |
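Steps 3, 4, and 7 of the sequence can be sketched as plain request descriptors. This is a minimal sketch and not the contents of deploy-antlers-cutover.yml: the endpoint paths follow the Cloudflare v4 API as commonly documented and should be verified against the current API reference, and accountId, zoneId, and recordId are hypothetical placeholders. Nothing is sent over the network.

```typescript
// Hypothetical identifiers; none of these values come from the ADR.
interface CutoverIds {
  accountId: string;
  zoneId: string;
  recordId: string; // ID of the existing raxx.app CNAME record
}

interface ApiRequest {
  method: "POST" | "PATCH" | "DELETE";
  path: string; // relative to https://api.cloudflare.com/client/v4
  body?: { [key: string]: string | boolean };
}

interface CutoverPlan {
  attach: ApiRequest; // Step 3
  repoint: ApiRequest; // Step 4
  detach: ApiRequest; // Step 7
}

// Builds the three infrastructure calls in the order the table lists them.
function buildCutoverPlan(ids: CutoverIds): CutoverPlan {
  const pages = `/accounts/${ids.accountId}/pages/projects`;
  return {
    attach: {
      method: "POST",
      path: `${pages}/raxx-prod-next/domains`,
      body: { name: "raxx.app" },
    },
    repoint: {
      method: "PATCH",
      path: `/zones/${ids.zoneId}/dns_records/${ids.recordId}`,
      body: { type: "CNAME", content: "raxx-prod-next.pages.dev", proxied: true },
    },
    detach: {
      method: "DELETE",
      path: `${pages}/raxx-app/domains/raxx.app`,
    },
  };
}
```

Keeping the plan as data makes it easy to log the exact calls in the GitHub Actions run, which is the audit record named in the invariants.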

DNS state at each phase

| Phase | raxx.app CNAME target | Active CF Pages project | CRA state |
| --- | --- | --- | --- |
| Pre-cutover (current) | raxx-app.pages.dev | raxx-app (CRA build) | Active |
| During CF Pages domain attach (Step 3) | raxx-app.pages.dev | Both projects have raxx.app temporarily | CF warns; resolve by removing from raxx-app immediately after |
| Post-cutover (Steps 4–7) | raxx-prod-next.pages.dev | raxx-prod-next (Next.js) | Dormant project, raxx.app custom domain removed |
| Rollback (if needed) | raxx-prod-next.pages.dev | raxx-prod-next serving pinned CRA build | Restored via CF Pages deployment pin |
| T+14d (decommission) | raxx-prod-next.pages.dev | raxx-prod-next (Next.js, permanent) | raxx-app project archived |

Important: Because Cloudflare proxies the CNAME (proxied: true), the CNAME update propagates at Cloudflare's edge in under 60 seconds. The "TTL propagation delay" concern (listed as a Strategy A con in the brief), including any 24-hour propagation wait, does not apply to proxied CF DNS records.

Rollback path

  1. In CF Pages, navigate to raxx-prod-next → Deployments
  2. Find the deployment tagged cra-rollback-alias (pinned before cutover)
  3. Click "Set as active deployment"
  4. CF Pages begins serving the CRA build from raxx-prod-next immediately (alias update, not DNS change)
  5. Time to restore: under 2 minutes. No DNS change required.
  6. No WebAuthn impact: RP ID raxx.app was never changed; the same CNAME target is used.

What is lost on rollback: nothing, for Phase 3. User sessions created during the Next.js soak window remain valid (session cookie domain .raxx.app is shared). Any Next.js-specific data stores added during Phase 3 would have to be evaluated at rollback time, but Phase 3 only ports existing pages and adds no new data stores, so no data is lost on rollback.

WebAuthn invariant

Strategy A never changes the RP ID. The browser validates the RP ID against the effective domain of the page's origin: the RP ID must equal that domain or be a registrable suffix of it. The origin is https://raxx.app regardless of which CF Pages project is behind the CNAME, so pre-cutover passkeys enrolled against raxx.app continue to work post-cutover without re-enrollment. There is no RP ID migration, no cross-origin redirect during the credential ceremony, and no change to Raptor's WEBAUTHN_RP_ID env var.
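The origin check can be illustrated with a small function. This is a simplified sketch of the rule, not browser code: it does not consult the Public Suffix List (which browsers use to reject RP IDs that are bare public suffixes) and it ignores the localhost exemption. The function name isRpIdValidForOrigin is hypothetical.

```typescript
// Simplified model of the WebAuthn RP ID rule: the RP ID must equal the
// origin's host or be a dot-separated suffix of it, and the origin must be HTTPS.
// Assumption: Public Suffix List checks and the localhost exemption are omitted.
function isRpIdValidForOrigin(rpId: string, origin: string): boolean {
  const url = new URL(origin);
  if (url.protocol !== "https:") {
    return false;
  }
  const host = url.hostname.toLowerCase();
  const id = rpId.toLowerCase();
  return host === id || host.endsWith("." + id);
}
```

Under this model, https://raxx.app accepts the RP ID raxx.app no matter which Pages project serves it, while a page served directly from https://raxx-prod-next.pages.dev would not.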

CF Access policy state

The CF Access application for raxx.app binds to the hostname, not the CF Pages project. When the CNAME re-points to raxx-prod-next, the CF Access policy continues to evaluate all requests for raxx.app before they reach the origin. No policy change is required. Step 5 verifies that the policy is applied; if CF Access requires re-attachment to the new CF Pages project (it typically does not for hostname-based policies), the workflow handles it.

The session cookie is set by Raptor with Domain=.raxx.app; SameSite=None; Secure; HttpOnly. The Next.js middleware at raxx.app reads this cookie via request.cookies.get('session'). The CRA app also reads this cookie. Because both apps are served from raxx.app with the same session cookie domain, there is no cookie migration. Sessions created under CRA remain valid under Next.js and vice versa. No SameSite policy conflict exists because both apps serve from the same origin (https://raxx.app).
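A smoke check could assert that the cookie attributes described above are the ones Raptor actually emits. The following is a minimal sketch under that assumption; parseSetCookie and matchesDocumentedSessionCookie are hypothetical helpers, and only the cookie name (session) and the four attributes come from this ADR.

```typescript
interface CookieAttributes {
  name: string;
  domain?: string;
  sameSite?: string;
  secure: boolean;
  httpOnly: boolean;
}

// Parses the name and the attributes of interest from one Set-Cookie header value.
function parseSetCookie(header: string): CookieAttributes {
  const [pair = "", ...attrs] = header.split(";").map((part) => part.trim());
  const parsed: CookieAttributes = {
    name: pair.split("=")[0] ?? "",
    secure: false,
    httpOnly: false,
  };
  for (const attr of attrs) {
    const [rawKey = "", ...rest] = attr.split("=");
    const key = rawKey.toLowerCase();
    const value = rest.join("=").toLowerCase();
    if (key === "domain") parsed.domain = value;
    else if (key === "samesite") parsed.sameSite = value;
    else if (key === "secure") parsed.secure = true;
    else if (key === "httponly") parsed.httpOnly = true;
  }
  return parsed;
}

// True only when the header matches the shape this ADR documents:
// session cookie, Domain=.raxx.app, SameSite=None, Secure, HttpOnly.
function matchesDocumentedSessionCookie(header: string): boolean {
  const c = parseSetCookie(header);
  return (
    c.name === "session" &&
    c.domain === ".raxx.app" &&
    c.sameSite === "none" &&
    c.secure &&
    c.httpOnly
  );
}
```

Because the check depends only on the header string, the same assertion applies before and after cutover, which is the point of the no-cookie-migration claim.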

Risk score

| Dimension | Score (1=low, 5=high) | Notes |
| --- | --- | --- |
| Blast radius | 2 | Atomic: either the DNS CNAME points to Next.js or it doesn't. No partial state. |
| Rollback speed | 1 | CF Pages alias pin, under 2 minutes, no DNS change |
| Test surface | 2 | Single app in production; all tests run against raxx.app directly |
| Operational complexity | 2 | One workflow_dispatch, automated health check, one CF Pages alias operation |

Strategy B — Path-based routing (page-by-page migration)

Description

raxx.app keeps its current CNAME at raxx-app (CRA). Individual routes are moved to the Next.js app by routing specific paths to raxx-prod-next via a CF _redirects rule or a CF Worker. For example, /login → raxx-prod-next.pages.dev/login, with all remaining paths → CRA.

Cutover sequence

  1. For each page to migrate: update CF _redirects in raxx-app to proxy that path to raxx-prod-next
  2. Validate each page individually
  3. Migrate pages one-by-one over weeks until all paths are in Next.js
  4. Final step: re-point the CNAME entirely (same as Strategy A at completion)
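The per-path split in the steps above amounts to a prefix lookup. The sketch below assumes a Worker-style router owns the decision; the migratedPrefixes list and the upstreamForPath name are hypothetical, and only the /login example and the two pages.dev hostnames come from this ADR.

```typescript
type PagesHost = "raxx-app.pages.dev" | "raxx-prod-next.pages.dev";

// Hypothetical migration state: paths already ported to Next.js.
const migratedPrefixes: readonly string[] = ["/login"];

// Returns the Pages project that should serve a given path.
// A prefix matches the exact path or any path nested beneath it.
function upstreamForPath(
  pathname: string,
  migrated: readonly string[] = migratedPrefixes,
): PagesHost {
  const isMigrated = migrated.some(
    (prefix) => pathname === prefix || pathname.startsWith(prefix + "/"),
  );
  return isMigrated ? "raxx-prod-next.pages.dev" : "raxx-app.pages.dev";
}
```

Every entry added to the list is another route whose guard and session handling must be tested in both apps, which is where the test-surface score for this strategy comes from.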

WebAuthn invariant — DISQUALIFYING CONCERN

The WebAuthn ceremony initiates from https://raxx.app and requires the JS context to be served from raxx.app. If the /login page is proxied from raxx-prod-next while the overall app still serves from raxx-app, the window.location.origin seen by the passkey ceremony may differ depending on how the proxy is implemented:

The proxy variant avoids the hard disqualification, but introduces dual auth state: the CRA app and the Next.js app both have active route guards and session cookie consumers. A user on /login (Next.js) who then navigates to /dashboard (still CRA until migrated) hits two different route guard implementations. Session cookie sharing works (same domain), but:

Risk score

| Dimension | Score (1=low, 5=high) | Notes |
| --- | --- | --- |
| Blast radius | 3 | Gradual, but any page migration failure affects only that page |
| Rollback speed | 3 | Must roll back individual route proxies; state is spread across pages |
| Test surface | 5 | Two apps, two auth states, two Sentry contexts, split CE port in flight |
| Operational complexity | 5 | Weeks of dual-app maintenance; CE skin must stay in sync across both |

Verdict: Strategy B is not recommended. The dual-auth-state and dual-test-surface complexity is not justified for a solo operator with no real user traffic. The WebAuthn ceremony risk (redirect variant is outright disqualified; proxy variant is safe but adds test surface complexity with no benefit) makes B a worse choice than A in every dimension except "blast-radius per deployment" — a benefit that does not matter when there are no users to blast.


Strategy C — CF Worker canary router

Description

A CF Worker is deployed at raxx.app that inspects an opt-in cookie (raxx-next=1) or a canary cohort (1% → 10% → 50% → 100%) and proxies requests to either raxx-app.pages.dev (CRA) or raxx-prod-next.pages.dev (Next.js) based on the cohort bucket.

Cutover sequence

  1. Deploy CF Worker at raxx.app (replaces the CF Pages → Pages direct serving model)
  2. Configure canary: raxx-next=1 cookie → Next.js; default → CRA
  3. Set cohort to 1% via Worker KV or env var
  4. Monitor Sentry for Next.js error rate vs. CRA baseline
  5. Ramp cohort 1% → 10% → 50% → 100%
  6. At 100%, retire the Worker and re-point to CF Pages directly (same as Strategy A)
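The cohort decision in steps 2 through 5 can be sketched as a pure function that a Worker would call per request. This is an illustration under stated assumptions, not a proposed implementation: visitorId is a hypothetical stable identifier (the ADR does not say how cohorts are keyed), and the FNV-1a hash is simply one convenient way to get a deterministic bucket.

```typescript
type CanaryTarget = "raxx-app.pages.dev" | "raxx-prod-next.pages.dev";

// True when the request carries the opt-in cookie raxx-next=1.
function hasOptInCookie(cookieHeader: string | null): boolean {
  if (!cookieHeader) {
    return false;
  }
  return cookieHeader.split(";").some((c) => c.trim() === "raxx-next=1");
}

// Maps an identifier to a stable bucket in [0, 99] using 32-bit FNV-1a.
function bucketFor(visitorId: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < visitorId.length; i++) {
    hash ^= visitorId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash % 100;
}

// Opt-in cookie always wins; otherwise the bucket is compared to the cohort size.
function chooseCanaryTarget(
  cookieHeader: string | null,
  visitorId: string,
  cohortPercent: number,
): CanaryTarget {
  if (hasOptInCookie(cookieHeader)) {
    return "raxx-prod-next.pages.dev";
  }
  return bucketFor(visitorId) < cohortPercent
    ? "raxx-prod-next.pages.dev"
    : "raxx-app.pages.dev";
}
```

With a stable bucket, ramping the cohort upward only ever moves a visitor from CRA to Next.js, never back. Flips in the other direction still occur when the cohort is lowered or the identifier changes.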

WebAuthn invariant — DISQUALIFYING CONCERN

The CF Worker proxies both apps from behind raxx.app. The window.location.origin seen by the WebAuthn ceremony is https://raxx.app regardless of which backend app is serving the request. At first glance, this preserves the RP ID. However:

A user in the 1% Next.js cohort who initiates a passkey registration ceremony and then (due to a Worker canary flip, a cookie expiry, or a page reload) falls back to the CRA cohort mid-ceremony will encounter a broken ceremony state. The in-progress WebAuthn ceremony is stateful (a challenge is generated by Raptor, stored server-side for the in-flight ceremony). The ceremony completion must reach the same Raptor endpoint regardless of which frontend served the initiation. This is already true in the current design (Raptor owns the ceremony, not the frontend), so the risk is limited to UX inconsistency rather than credential breakage.

The more substantive concern: two React applications simultaneously serving raxx.app creates dual session state. The session cookie is shared (correct), but the Next.js middleware.ts route guard and the CRA RouteGuard.js are simultaneously evaluating that cookie for different users. If a user in the Next.js 10% cohort visits /dashboard, the Next.js middleware handles their session. If a Worker canary flip moves them to the CRA cohort on their next request, the CRA route guard handles their session. Both work, but:

Risk score

| Dimension | Score (1=low, 5=high) | Notes |
| --- | --- | --- |
| Blast radius | 1 | Canary at 1% means only 1% of traffic hits Next.js at a time |
| Rollback speed | 2 | Set cohort to 0% via Worker KV; Worker itself stays deployed |
| Test surface | 4 | Dual apps, dual session handling, Worker as new failure surface |
| Operational complexity | 4 | CF Worker implementation, KV canary state, cohort ramp automation |

Verdict: Strategy C is not recommended for this launch posture. The canary benefit (real production traffic exercise at low blast radius) is irrelevant when the user population is one operator. The Worker is a net addition to the operational surface. When real customer traffic exists, Strategy C becomes more attractive as a future migration pattern — but by that point, the CRA-to-Next.js migration will be complete.


Decision matrix

| Dimension | A: Hard DNS swap | B: Path-based routing | C: CF Worker canary |
| --- | --- | --- | --- |
| Blast radius (1=low) | 2 | 3 | 1 |
| Rollback speed (1=fast) | 1 | 3 | 2 |
| Test surface (1=small) | 2 | 5 | 4 |
| Operational complexity (1=simple) | 2 | 5 | 4 |
| WebAuthn RP ID safety | Full | Partial (proxy variant only) | Partial (dual-session risk) |
| CF Access continuity | Full (hostname policy, unchanged) | Partial (dual project complexity) | Requires Worker CF Access bypass |
| Cookie / session safety | Full (same domain, no migration) | Full (same domain) | Full (same domain) |
| Total (lower=better) | 7 | 16 | 11 |
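The totals are the sum of the four numeric dimensions only; the three qualitative rows are not scored. A short sketch that recomputes them from the matrix (the type and function names are illustrative):

```typescript
type StrategyId = "A" | "B" | "C";

interface RiskScores {
  blastRadius: number;
  rollbackSpeed: number;
  testSurface: number;
  operationalComplexity: number;
}

// Numeric rows copied from the decision matrix.
const matrix: Record<StrategyId, RiskScores> = {
  A: { blastRadius: 2, rollbackSpeed: 1, testSurface: 2, operationalComplexity: 2 },
  B: { blastRadius: 3, rollbackSpeed: 3, testSurface: 5, operationalComplexity: 5 },
  C: { blastRadius: 1, rollbackSpeed: 2, testSurface: 4, operationalComplexity: 4 },
};

function totalScore(s: RiskScores): number {
  return s.blastRadius + s.rollbackSpeed + s.testSurface + s.operationalComplexity;
}

// Strategies ordered from lowest (best) to highest total.
function rankStrategies(): StrategyId[] {
  const ids: StrategyId[] = ["A", "B", "C"];
  return ids.sort((x, y) => totalScore(matrix[x]) - totalScore(matrix[y]));
}
```

The sums are unweighted, so the ranking treats a point of blast radius as equal to a point of operational complexity.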

Recommendation

Strategy A. The CNAME re-point is atomic, reversible in under 2 minutes via CF Pages deployment alias, and fully preserves the WebAuthn RP ID invariant. The "no gradual rollout" con is immaterial: the personal-use launch posture has no user traffic to blast. The DNS TTL concern does not apply to Cloudflare proxied records (sub-60-second propagation).

Caveats

  1. Phase 2 gate is non-negotiable. Do not execute Strategy A until the Playwright E2E suite passes on raxx-staging-next. A failed cutover to a broken Next.js build is the only real risk in Strategy A, and Phase 2 eliminates it.

  2. CF Pages domain conflict window. Between Step 3 (attach raxx.app to raxx-prod-next) and Step 7 (detach from raxx-app), CF Pages may warn about a duplicate custom domain. The workflow should handle detach before attach, or handle the overlap with a retry. See the sub-card for deploy-antlers-cutover.yml implementation details.

  3. CF Access re-verification. CF Access hostname policies are typically transparent to CF Pages project changes. However, verify in Step 5 that the CF Access application for raxx.app still evaluates correctly against the new CF Pages project. If CF Access is bound to the Pages project (not just the hostname), it may require re-attachment.

  4. Sentry project separation. The Next.js app should use a distinct Sentry project (antlers-nextjs, not the CRA project). This ensures error rate baselines are not contaminated during the soak window. The post-cutover smoke card (#2884) gates on this.


Rollout milestones

| Milestone | Condition | Action |
| --- | --- | --- |
| T-0 (cutover) | Phase 2 Playwright E2E green on staging; operator runs workflow_dispatch on deploy-antlers-cutover.yml | CNAME re-pointed; CF Access verified; smoke test runs |
| T+0:05 | Post-cutover smoke passes | Begin Sentry soak window; alert threshold active (3x 7-day baseline → ops@raxx.app) |
| T+24h | Sentry error rate within 2x baseline | No action required; continue soak |
| T+72h | Sentry error rate within 2x baseline; operator confirms no regressions | Soak complete; CRA rollback alias retained but soak period over |
| T+14d | Operator approves retirement | File #2885 (retire frontend/trademaster_ui/); see sub-card |
| T+14d+ | raxx-app CF Pages project archived | raxx-app project marked inactive; deploy-antlers.yml updated to target raxx-prod-next only |
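The two thresholds in the milestones (soak passes within 2x baseline, alert fires at 3x the 7-day baseline) can be expressed as one classifier. This is a sketch with assumptions: the "elevated" label for the band between 2x and 3x, the inclusive treatment of exactly 2x and exactly 3x, and the handling of a zero baseline are not specified in this ADR.

```typescript
type SoakStatus = "within-baseline" | "elevated" | "alert";

// Classifies the current error rate against the 7-day baseline.
// Assumptions: exactly 2x still counts as within baseline, exactly 3x alerts,
// and with a zero baseline any nonzero error rate alerts.
function classifySoak(currentRate: number, baselineRate: number): SoakStatus {
  if (baselineRate <= 0) {
    return currentRate > 0 ? "alert" : "within-baseline";
  }
  const ratio = currentRate / baselineRate;
  if (ratio >= 3) {
    return "alert";
  }
  if (ratio > 2) {
    return "elevated";
  }
  return "within-baseline";
}
```

The band between 2x and 3x is worth a decision before cutover: as written, the T+24h and T+72h conditions would fail there without any alert having been sent.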

Migrations

No schema migrations. No database changes. No Raptor changes. The session cookie shape is unchanged.

One CF Pages infrastructure change: attach raxx.app custom domain to raxx-prod-next, detach from raxx-app. This is reversible at any point before T+14d.


Security considerations


Open questions

The following require explicit operator input before sub-cards #2883 and #2884 are dispatched:

  1. CF Access binding model: Does the current CF Access application for raxx.app bind to the hostname only, or is it also bound to the CF Pages project name? If project-bound, Step 5 (re-verification) needs an explicit re-attach step in the cutover workflow. The operator should check CF Zero Trust dashboard: Access > Applications > raxx.app → confirm "Application domain" is set to raxx.app (not raxx-app.pages.dev). If it is set to the .pages.dev URL, the policy must be updated before cutover.

  2. raxx-prod-next CF Pages project: Does this project already exist (from Phase 2 work on #2882), or does the cutover workflow need to create it? The cutover card (#2883) should not create a new CF Pages project on the fly during a production cutover — project creation is a Phase 2 sub-card.

  3. Sentry project for Next.js: Has antlers-nextjs been created in Sentry? The post-cutover smoke (#2884) references it. If not created, it should be done in Phase 2.

  4. Staging Next.js hostname: The cutover plan assumes raxx-staging-next CF Pages project serves a staging URL (e.g., staging-next.raxx.app or raxx-staging-next.pages.dev). Phase 2 (#2878–#2880) must confirm this before Phase 3 can proceed.


Language choice rationale

Not applicable. This ADR governs a deployment cutover strategy for a frontend surface. No new service is introduced.


Consequences

Positive

Negative / risks

Neutral


Alternatives considered

Strategy B — Path-based routing

Rejected because: dual auth state, dual Sentry context, and split CE port create test surface complexity that cannot be justified for a solo operator with zero customer traffic. The WebAuthn ceremony is further complicated by the proxy/redirect ambiguity. The operational drag over weeks of dual-app maintenance exceeds the blast-radius benefit.

Strategy C — CF Worker canary

Rejected because: the canary benefit is irrelevant at zero customer traffic; the CF Worker is a new operational surface with no offsetting benefit at this launch posture. Strategy C is revisitable when Raxx has real user traffic that justifies a gradual ramp.


Security / GDPR checklist


Revisit when