ADR 0106 — Antlers Next.js production cutover strategy

Status: Accepted
Date: 2026-05-27 UTC
Deciders: Kristerpher (operator); software-architect
Scope: raxx.app, CF Pages production alias cutover from CRA (raxx-app project) to Next.js (raxx-prod-next project)
Parent issue: #2883
Epic: #2872
Refs: ADR-0105, ADR-0105-addendum-phase0, docs/architecture/adr/0105-cf-pages-compat-audit-2026-05-27.md

TL;DR verdict

Recommendation: Strategy A — hard DNS swap (CNAME re-point), with CF Pages deployment-alias rollback path.

Rationale: for a single-operator, personal-use-first launch posture with CF Access on the domain and no real user traffic, the added complexity of Strategy B (path-based routing) and Strategy C (CF Worker canary) buys nothing. The WebAuthn RP ID invariant is safest under Strategy A because the RP ID (raxx.app) never migrates; the only thing that changes is which CF Pages project answers behind the DNS CNAME. Rollback is a CF Pages deployment-alias update, not a DNS TTL wait: after cutover the CNAME already points at raxx-prod-next.pages.dev, so rolling back means making the pinned CRA build the active deployment in raxx-prod-next, not changing DNS again.

Background

Current state

raxx.app DNS CNAME → raxx-app.pages.dev (proxied via Cloudflare, raxx.app zone)
CF Pages project raxx-app serves the CRA static build of frontend/trademaster_ui/
CF Access policy gates raxx.app (personal-use / operator-testing posture per project_launch_posture_personal_use)
WebAuthn RP ID is raxx.app — locked per project_webauthn_rp_id_raxx_app and ADR-0005; passkeys are already enrolled against this origin
Deploy workflow: deploy-antlers.yml pushes CRA builds to raxx-app CF Pages project

Target state

raxx.app DNS CNAME → raxx-prod-next.pages.dev (same zone, same CF Access policy)
CF Pages project raxx-prod-next serves the Next.js build of frontend/raxx-next/
CRA build pinned as a named alias under raxx-prod-next for 14 days (rollback path)
Deploy workflow updated: deploy-antlers.yml builds from frontend/raxx-next/ and deploys to raxx-prod-next
raxx-app CF Pages project retained but dormant (its CNAME removed, not the project)

What is NOT changing

raxx.app hostname — stays exactly as is
WebAuthn RP ID — raxx.app — unchanged throughout; no credential re-enrollment required
CF Access policy — same policy, same gate; moves to point at raxx-prod-next custom domain
api.raxx.app (Raptor backend) — unchanged; session cookie domain .raxx.app is already correct
Oracle Dyn DNS for moosequest.net — not involved; per feedback_dyndns_stays

Invariants (restated)

WebAuthn RP ID = raxx.app is inviolate. Any cutover that changes the effective origin seen by navigator.credentials.create() or .get() disqualifies enrolled passkeys. The RP ID must remain raxx.app before, during, and after cutover. Strategies are evaluated against this constraint first.
No stored credentials: no action required, because the frontend migration does not touch credential storage.
CF Access gate stays on during personal-use posture — cutover must preserve the CF Access application binding at all times.
Audit trail — the cutover itself is a deployment action; the operator performs it with an explicit workflow_dispatch; the GitHub Actions log is the audit record.
Paper-first gating — no live-trading code paths exist in Next.js Phase 3; this invariant is already satisfied by the application logic, not the DNS layer.

Strategy A — Hard DNS swap (CF Pages CNAME re-point)

Description

The production CNAME for raxx.app is updated to point at the raxx-prod-next CF Pages project. The CRA build in raxx-app is retained as a dormant project. Rollback means publishing the CRA build as a deployment under raxx-prod-next (using CF Pages' deployment history, not a DNS change), because by rollback time the CNAME already targets raxx-prod-next.

Cutover sequence

Step	Action	Operator decision point?
0	Phase 2 gate: all Playwright E2E tests green on staging (`raxx-staging-next`)	No — automated gate
1	Confirm `raxx-prod-next` CF Pages project exists and has a successful build	No — CI gate
2	Record the current CRA deployment ID from `raxx-app` CF Pages history (for rollback documentation)	No — automated in workflow
3	Attach `raxx.app` custom domain to `raxx-prod-next` CF Pages project via CF Pages API	No — automated
4	In Cloudflare DNS: update CNAME `raxx.app` → `raxx-prod-next.pages.dev` (proxied)	YES — operator runs `workflow_dispatch` on `deploy-antlers-cutover.yml`
5	Verify CF Access application binding now points to `raxx-prod-next` custom domain	No — automated health check
6	Run post-cutover smoke (#2884) — 5-minute delay after Step 4	No — automated `workflow_run` trigger
7	Remove `raxx.app` custom domain from `raxx-app` CF Pages project	No — automated (avoids CF Pages "duplicate custom domain" error)
8	Begin 72-hour Sentry soak window	No — monitoring alert configured in Phase 2
9	At T+14d: decommission sub-card (#2885) — retire `frontend/trademaster_ui/`	YES — operator files/approves retirement card

DNS state at each phase

Phase	`raxx.app` CNAME target	Active CF Pages project	CRA state
Pre-cutover (current)	`raxx-app.pages.dev`	`raxx-app` (CRA build)	Active
During CF Pages domain attach (Step 3)	`raxx-app.pages.dev`	Both projects have `raxx.app` temporarily	CF warns; resolve by removing from `raxx-app` immediately after
Post-cutover (Steps 4–7)	`raxx-prod-next.pages.dev`	`raxx-prod-next` (Next.js)	Dormant project, CNAME removed
Rollback (if needed)	`raxx-prod-next.pages.dev`	`raxx-prod-next` serving pinned CRA build	Restored via CF Pages deployment pin
T+14d (decommission)	`raxx-prod-next.pages.dev`	`raxx-prod-next` (Next.js, permanent)	`raxx-app` project archived

Important: Because Cloudflare proxies the CNAME (proxied: true), the CNAME update propagates at Cloudflare's edge in under 60 seconds. There is no 24-hour TTL propagation delay for proxied CF records. The "TTL propagation delay" concern (listed as a Strategy A con in the brief) does not apply to proxied CF DNS records.
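
As an illustration of Step 4, the sketch below builds (but does not send) the request that re-points the CNAME. It assumes the Cloudflare v4 DNS records endpoint; `ZONE_ID` and `RECORD_ID` are hypothetical placeholders, and `buildCnameRepoint` is an illustrative helper, not part of deploy-antlers-cutover.yml.

```typescript
// Sketch only: describes the Step 4 API call without performing it.
// zoneId and recordId are hypothetical placeholders supplied by the caller.
interface DnsPatchRequest {
  method: "PATCH";
  url: string;
  body: { type: "CNAME"; name: string; content: string; proxied: boolean };
}

function buildCnameRepoint(zoneId: string, recordId: string, cnameTarget: string): DnsPatchRequest {
  return {
    method: "PATCH",
    url: `https://api.cloudflare.com/client/v4/zones/${zoneId}/dns_records/${recordId}`,
    // proxied stays true so the change takes effect at Cloudflare's edge
    // rather than waiting on resolver TTLs.
    body: { type: "CNAME", name: "raxx.app", content: cnameTarget, proxied: true },
  };
}

const cutoverRequest = buildCnameRepoint("ZONE_ID", "RECORD_ID", "raxx-prod-next.pages.dev");
console.log(cutoverRequest.method, cutoverRequest.url);
console.log(JSON.stringify(cutoverRequest.body));
```

Keeping the request construction separate from the network call makes the payload easy to log in the GitHub Actions run, which doubles as the audit record.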

Rollback path

In CF Pages, navigate to raxx-prod-next → Deployments
Find the deployment tagged cra-rollback-alias (pinned before cutover)
Click "Set as active deployment"
CF Pages begins serving the CRA build from raxx-prod-next immediately (alias update, not DNS change)
Time to restore: under 2 minutes. No DNS change required.
No WebAuthn impact: RP ID raxx.app was never changed; the same CNAME target is used.

What is lost on rollback: nothing in Phase 3. User sessions created during the Next.js soak window remain valid, because the session cookie domain .raxx.app is shared by both builds. Any Next.js-specific data stores would have to be evaluated at rollback time, but Phase 3 adds none: it ports existing pages, so no data is lost on rollback.
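
The rollback steps could be scripted roughly as follows. This is a sketch under assumptions: the `Deployment` shape and its `label` field are hypothetical, and the rollback endpoint path is an assumption that should be confirmed against the current Cloudflare API reference before use.

```typescript
// Hypothetical, trimmed-down view of a CF Pages deployment record.
interface Deployment {
  id: string;
  label: string;     // assumed operator-assigned tag, e.g. "cra-rollback-alias"
  createdOn: string; // ISO 8601 timestamp in UTC
}

const ROLLBACK_LABEL = "cra-rollback-alias";

// Pick the most recent deployment carrying the rollback label, if any.
function findRollbackTarget(deployments: Deployment[]): Deployment | undefined {
  return deployments
    .filter((d) => d.label === ROLLBACK_LABEL)
    .sort((a, b) => b.createdOn.localeCompare(a.createdOn))[0];
}

// Assumed endpoint path; verify against the Cloudflare API reference.
function rollbackUrl(accountId: string, project: string, deploymentId: string): string {
  return `https://api.cloudflare.com/client/v4/accounts/${accountId}/pages/projects/${project}/deployments/${deploymentId}/rollback`;
}

const deploymentHistory: Deployment[] = [
  { id: "dep-next-2", label: "", createdOn: "2026-05-27T12:00:00Z" },
  { id: "dep-cra-1", label: ROLLBACK_LABEL, createdOn: "2026-05-26T09:00:00Z" },
];

const rollbackTarget = findRollbackTarget(deploymentHistory);
if (rollbackTarget === undefined) {
  throw new Error("no pinned CRA deployment found; rollback path is unavailable");
}
console.log(rollbackUrl("ACCOUNT_ID", "raxx-prod-next", rollbackTarget.id));
```

Failing loudly when no pinned deployment exists is deliberate: it turns a missing rollback path into a pre-cutover error rather than a surprise during an incident.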

WebAuthn invariant

Strategy A never changes the RP ID. The browser validates the RP ID against the effective domain of the page's origin, which is https://raxx.app regardless of which CF Pages project is behind the CNAME. Pre-cutover passkeys enrolled against raxx.app continue to work post-cutover without re-enrollment. There is no RP ID migration, no cross-origin redirect during the credential ceremony, and no change to Raptor's WEBAUTHN_RP_ID env var.
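
A simplified version of the RP ID scope check makes the invariant concrete. This is a sketch, not the browser's implementation: real user agents also reject public suffixes and non-secure origins, and `rpIdMatchesOrigin` is an illustrative name.

```typescript
// Simplified RP ID scope check: the RP ID must equal the origin's host or be
// a parent domain of it. Public-suffix and secure-context checks are omitted.
function rpIdMatchesOrigin(rpId: string, pageOrigin: string): boolean {
  const host = new URL(pageOrigin).hostname;
  return host === rpId || host.endsWith(`.${rpId}`);
}

const RP_ID = "raxx.app";

// Same hostname before and after cutover, so enrolled passkeys keep working.
console.log(rpIdMatchesOrigin(RP_ID, "https://raxx.app")); // true

// A page served directly from the pages.dev hostname would fail the check.
console.log(rpIdMatchesOrigin(RP_ID, "https://raxx-prod-next.pages.dev")); // false
```

The second case is why any strategy that exposes the pages.dev hostname to the browser during a credential ceremony is ruled out.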

CF Access policy state

The CF Access application for raxx.app binds to the hostname, not the CF Pages project. When the CNAME re-points to raxx-prod-next, the CF Access policy continues to evaluate all requests for raxx.app before they reach the origin, so no policy change is required. Step 5 verifies that the policy is still applied; if CF Access requires re-attachment to the new CF Pages project (it typically does not for hostname-based policies), the workflow handles it.
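
One way the Step 5 health check might decide that the gate is still in place is to inspect the response to an unauthenticated request. The sketch below assumes CF Access answers such requests with a 302 to a login URL under cloudflareaccess.com; the `ProbeResult` shape and the team name in the example are hypothetical.

```typescript
// Hypothetical probe result: status code and Location header from an
// unauthenticated GET to https://raxx.app with redirects disabled.
interface ProbeResult {
  statusCode: number;
  location: string | null;
}

// Treat the hostname as gated only when the request is redirected to an
// Access login host. Anything else (for example a 200) means the app is exposed.
function looksAccessGated(probe: ProbeResult): boolean {
  if (probe.statusCode !== 302 || probe.location === null) {
    return false;
  }
  const host = new URL(probe.location).hostname;
  return host.endsWith(".cloudflareaccess.com");
}

const gatedProbe: ProbeResult = {
  statusCode: 302,
  location: "https://example-team.cloudflareaccess.com/cdn-cgi/access/login/raxx.app",
};
const exposedProbe: ProbeResult = { statusCode: 200, location: null };

console.log(looksAccessGated(gatedProbe));   // true
console.log(looksAccessGated(exposedProbe)); // false
```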

Cookie + session state

The session cookie is set by Raptor with Domain=.raxx.app; SameSite=None; Secure; HttpOnly. The Next.js middleware at raxx.app reads this cookie via request.cookies.get('session'). The CRA app depends on the same cookie; because it is HttpOnly, client-side JavaScript cannot read it directly, and the browser attaches it to requests instead. Because both apps are served from raxx.app with the same session cookie domain, there is no cookie migration. Sessions created under CRA remain valid under Next.js and vice versa. No SameSite policy conflict exists because both apps serve from the same origin (https://raxx.app).
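
The reason no cookie migration is needed can be shown with a simplified form of cookie domain matching. This sketch follows the general RFC 6265 rule (leading dot ignored; exact host or subdomain matches) and is not Raptor's or the browser's code.

```typescript
// Simplified cookie domain matching: a leading dot on the Domain attribute is
// ignored, and a host matches if it equals the domain or is a subdomain of it.
function cookieDomainMatches(host: string, cookieDomain: string): boolean {
  const domain = cookieDomain.replace(/^\./, "").toLowerCase();
  const normalizedHost = host.toLowerCase();
  return normalizedHost === domain || normalizedHost.endsWith(`.${domain}`);
}

const SESSION_COOKIE_DOMAIN = ".raxx.app";

// The frontend hostname and the Raptor API both receive the session cookie,
// whichever CF Pages project is serving raxx.app.
console.log(cookieDomainMatches("raxx.app", SESSION_COOKIE_DOMAIN));     // true
console.log(cookieDomainMatches("api.raxx.app", SESSION_COOKIE_DOMAIN)); // true

// The raw pages.dev hostname would not receive it.
console.log(cookieDomainMatches("raxx-prod-next.pages.dev", SESSION_COOKIE_DOMAIN)); // false
```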

Risk score

Dimension	Score (1=low, 5=high)	Notes
Blast radius	2	Atomic — either the DNS CNAME points to Next.js or it doesn't. No partial state.
Rollback speed	1	CF Pages alias pin, under 2 minutes, no DNS change
Test surface	2	Single app in production; all tests run against `raxx.app` directly
Operational complexity	2	One `workflow_dispatch`, automated health check, one CF Pages alias operation

Strategy B — Path-based routing (page-by-page migration)

Description

raxx.app keeps its current CNAME at raxx-app (CRA). Individual routes are moved to the Next.js app by routing specific paths to raxx-prod-next via a CF _redirects or CF Worker rule. For example, /login → raxx-prod-next.pages.dev/login, rest → CRA.

Cutover sequence

For each page to migrate: update CF _redirects in raxx-app to proxy that path to raxx-prod-next
Validate each page individually
Migrate pages one-by-one over weeks until all paths are in Next.js
Final step: re-point the CNAME entirely (same as Strategy A at completion)

WebAuthn invariant — DISQUALIFYING CONCERN

The WebAuthn ceremony initiates from https://raxx.app and requires the JS context to be served from raxx.app. If the /login page is proxied from raxx-prod-next while the overall app still serves from raxx-app, the window.location.origin seen by the passkey ceremony may differ depending on how the proxy is implemented:

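CF _redirects (transparent proxy / "pass-through"): The browser address bar still shows https://raxx.app/login, window.location.origin is https://raxx.app, and the RP ID check passes. In this mode raxx-prod-next content is served at the raxx.app origin, which is effectively Strategy A on a per-route basis with the added complexity of maintaining dual auth state during the transition. Feasibility caveat: Cloudflare's Pages documentation describes _redirects proxying (status 200) only for relative paths within the same site, so proxying to a second Pages project would likely need a CF Worker or Pages Function; confirm this before relying on the variant.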
CF _redirects (302 redirect): The browser redirects to raxx-prod-next.pages.dev/login. RP ID would mismatch (raxx.app vs raxx-prod-next.pages.dev) — this disqualifies the redirect variant entirely for any route that runs the WebAuthn ceremony.

The proxy variant avoids the hard disqualification, but introduces dual auth state: the CRA app and the Next.js app both have active route guards and session cookie consumers. A user on /login (Next.js) who then navigates to /dashboard (still CRA until migrated) hits two different route guard implementations. Session cookie sharing works (same domain), but:

Analytics events come from two different Sentry projects
Two separate React context trees initialize independently on each navigation
Error boundaries and loading states are inconsistent between apps
The feature flag system is potentially inconsistent (flags read from process.env in Next.js middleware vs. localStorage in CRA)
The CE visual skin must be kept identical in both apps simultaneously during the transition

Risk score

Dimension	Score (1=low, 5=high)	Notes
Blast radius	3	Gradual, but any page migration failure affects only that page
Rollback speed	3	Must roll back individual route proxies; state is spread across pages
Test surface	5	Two apps, two auth states, two Sentry contexts, split CE port in flight
Operational complexity	5	Weeks of dual-app maintenance; CE skin must stay in sync across both

Verdict: Strategy B is not recommended. The dual-auth-state and dual-test-surface complexity is not justified for a solo operator with no real user traffic. On WebAuthn, the redirect variant is disqualified outright, and the proxy variant is safe but adds test surface with no benefit. B is therefore a worse choice than A in every dimension except per-deployment blast radius, a benefit that does not matter when there are no users to affect.

Strategy C — CF Worker canary router

Description

A CF Worker is deployed at raxx.app that inspects an opt-in cookie (raxx-next=1) or a canary cohort (1% → 10% → 50% → 100%) and proxies requests to either raxx-app.pages.dev (CRA) or raxx-prod-next.pages.dev (Next.js) based on the cohort bucket.

Cutover sequence

Deploy CF Worker at raxx.app (replaces the CF Pages → Pages direct serving model)
Configure canary: raxx-next=1 cookie → Next.js; default → CRA
Set cohort to 1% via Worker KV or env var
Monitor Sentry for Next.js error rate vs. CRA baseline
Ramp cohort 1% → 10% → 50% → 100%
At 100%, retire the Worker and re-point to CF Pages directly (same as Strategy A)
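
For comparison, the routing decision such a Worker would make can be sketched as a pure function. Assumptions: cohort assignment hashes a stable visitor identifier (the ADR does not specify how buckets are assigned), and `chooseBackend` and `visitorId` are hypothetical names.

```typescript
type Backend = "cra" | "next";

// FNV-1a 32-bit hash: deterministic, so a given visitor always lands in the
// same bucket as long as the identifier is stable.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// optInCookie is the value of the raxx-next cookie, if present.
// cohortPercent is the current ramp setting (0 to 100).
function chooseBackend(visitorId: string, optInCookie: string | undefined, cohortPercent: number): Backend {
  if (optInCookie === "1") {
    return "next"; // explicit opt-in always wins
  }
  const bucket = fnv1a(visitorId) % 100; // 0..99
  return bucket < cohortPercent ? "next" : "cra";
}

console.log(chooseBackend("visitor-a", "1", 0));         // opt-in cookie: "next"
console.log(chooseBackend("visitor-a", undefined, 0));   // 0% cohort: "cra"
console.log(chooseBackend("visitor-a", undefined, 100)); // 100% cohort: "next"
```

Even this small function is extra production logic that needs its own tests and eventual removal, which is the operational cost weighed against Strategy C.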

WebAuthn invariant — DISQUALIFYING CONCERN

The CF Worker proxies both apps from behind raxx.app. The window.location.origin seen by the WebAuthn ceremony is https://raxx.app regardless of which backend app is serving the request. At first glance, this preserves the RP ID. However:

A user in the Next.js cohort who starts a passkey registration ceremony and then falls back to the CRA cohort mid-ceremony (because of a Worker canary flip, a cookie expiry, or a page reload) will encounter a broken ceremony state. The in-progress WebAuthn ceremony is stateful: Raptor generates a challenge and stores it server-side for the in-flight ceremony, and the completion must reach the same Raptor endpoint regardless of which frontend served the initiation. That is already true in the current design (Raptor owns the ceremony, not the frontend), so the risk is limited to UX inconsistency rather than credential breakage.

The more substantive concern: two React applications simultaneously serving raxx.app creates dual session state. The session cookie is shared (correct), but the Next.js middleware.ts route guard and the CRA RouteGuard.js are simultaneously evaluating that cookie for different users. If a user in the Next.js 10% cohort visits /dashboard, the Next.js middleware handles their session. If a Worker canary flip moves them to the CRA cohort on their next request, the CRA route guard handles their session. Both work, but:

Analytics diverge between cohorts
The CE visual port must be complete before ramping cohort, otherwise some users see un-skinned CRA pages while others see CE-skinned Next.js pages
The Worker itself is a new operational surface that must be maintained, tested, and eventually removed
For a zero-customer launch (personal-use only), the Worker adds cost and complexity with no measurable blast-radius benefit over Strategy A + CF Pages alias rollback

Risk score

Dimension	Score (1=low, 5=high)	Notes
Blast radius	1	Canary at 1% means only 1% of traffic hits Next.js at a time
Rollback speed	2	Set cohort to 0% via Worker KV; Worker itself stays deployed
Test surface	4	Dual apps, dual session handling, Worker as new failure surface
Operational complexity	4	CF Worker implementation, KV canary state, cohort ramp automation

Verdict: Strategy C is not recommended for this launch posture. The canary benefit (real production traffic exercise at low blast radius) is irrelevant when the user population is one operator. The Worker is a net addition to the operational surface. When real customer traffic exists, Strategy C becomes more attractive as a future migration pattern — but by that point, the CRA-to-Next.js migration will be complete.

Decision matrix

Dimension	A: Hard DNS swap	B: Path-based routing	C: CF Worker canary
Blast radius (1=low)	2	3	1
Rollback speed (1=fast)	1	3	2
Test surface (1=small)	2	5	4
Operational complexity (1=simple)	2	5	4
WebAuthn RP ID safety	Full	Partial (proxy variant only)	Partial (dual-session risk)
CF Access continuity	Full (hostname policy, unchanged)	Partial (dual project complexity)	Requires Worker CF Access bypass
Cookie / session safety	Full (same domain, no migration)	Full (same domain)	Full (same domain)
Total (lower=better)	7	16	11
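
The totals row is the plain sum of the four numeric dimensions. A quick check of that arithmetic, with the scores copied from the matrix above:

```typescript
interface StrategyScores {
  strategy: string;
  // Order: blast radius, rollback speed, test surface, operational complexity.
  scores: [number, number, number, number];
}

const matrix: StrategyScores[] = [
  { strategy: "A: Hard DNS swap", scores: [2, 1, 2, 2] },
  { strategy: "B: Path-based routing", scores: [3, 3, 5, 5] },
  { strategy: "C: CF Worker canary", scores: [1, 2, 4, 4] },
];

function total(row: StrategyScores): number {
  return row.scores.reduce((sum, score) => sum + score, 0);
}

for (const row of matrix) {
  console.log(`${row.strategy}: ${total(row)}`);
}
// A: Hard DNS swap: 7
// B: Path-based routing: 16
// C: CF Worker canary: 11
```

The sum weights all four dimensions equally; the three qualitative rows (WebAuthn, CF Access, cookies) are not part of the total.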

Recommendation

Strategy A. The CNAME re-point is atomic, reversible in under 2 minutes via CF Pages deployment alias, and fully preserves the WebAuthn RP ID invariant. The "no gradual rollout" con is immaterial: the personal-use launch posture has no user traffic to blast. The DNS TTL concern does not apply to Cloudflare proxied records (sub-60-second propagation).

Caveats

Phase 2 gate is non-negotiable. Do not execute Strategy A until the Playwright E2E suite passes on raxx-staging-next. A failed cutover to a broken Next.js build is the only real risk in Strategy A, and the Phase 2 gate is the primary mitigation for it.
CF Pages domain conflict window. Between Step 3 (attach raxx.app to raxx-prod-next) and Step 7 (detach from raxx-app), CF Pages may warn about a duplicate custom domain. The workflow should either detach before attaching or tolerate the overlap with a retry. See the sub-card for deploy-antlers-cutover.yml implementation details.
CF Access re-verification. CF Access hostname policies are typically transparent to CF Pages project changes. However, verify in Step 5 that the CF Access application for raxx.app still evaluates correctly against the new CF Pages project. If CF Access is bound to the Pages project (not just the hostname), it may require re-attachment.
Sentry project separation. The Next.js app should use a distinct Sentry project (antlers-nextjs, not the CRA project). This ensures error rate baselines are not contaminated during the soak window. The post-cutover smoke card (#2884) gates on this.

Rollout milestones

Milestone	Condition	Action
T-0 (cutover)	Phase 2 Playwright E2E green on staging; operator runs `workflow_dispatch` on `deploy-antlers-cutover.yml`	CNAME re-pointed; CF Access verified; smoke test runs
T+0:05	Post-cutover smoke passes	Begin Sentry soak window; alert threshold active (3x 7-day baseline → `ops@raxx.app`)
T+24h	Sentry error rate within 2x baseline	No action required; continue soak
T+72h	Sentry error rate within 2x baseline; operator confirms no regressions	Soak complete; CRA rollback alias retained until T+14d
T+14d	Operator approves retirement	File #2885 (`retire frontend/trademaster_ui/`) — see sub-card
T+14d+	`raxx-app` CF Pages project archived	`raxx-app` project marked inactive; `deploy-antlers.yml` updated to target `raxx-prod-next` only
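
The soak thresholds in the table can be read as a simple ratio check. The sketch below is one interpretation, not the configured Sentry alert: the table does not say whether the bounds are inclusive, and the intermediate "investigate" state between 2x and 3x is an assumption added for illustration.

```typescript
type SoakVerdict = "ok" | "investigate" | "alert";

// currentRate and baselineRate are error rates in the same unit
// (for example, errors per hour); baselineRate is the 7-day baseline.
function evaluateSoak(currentRate: number, baselineRate: number): SoakVerdict {
  if (baselineRate <= 0) {
    throw new Error("baseline must be positive");
  }
  const ratio = currentRate / baselineRate;
  if (ratio >= 3) {
    return "alert"; // 3x baseline: notify ops@raxx.app
  }
  if (ratio > 2) {
    return "investigate"; // assumed state: above the 2x soak bound, below the alert
  }
  return "ok"; // within 2x baseline: continue soak
}

console.log(evaluateSoak(12, 10)); // "ok"
console.log(evaluateSoak(25, 10)); // "investigate"
console.log(evaluateSoak(30, 10)); // "alert"
```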

Migrations

No schema migrations. No database changes. No Raptor changes. The session cookie shape is unchanged.

One CF Pages infrastructure change: attach raxx.app custom domain to raxx-prod-next, detach from raxx-app. This is reversible at any point before T+14d.

Security considerations

WebAuthn RP ID invariant: preserved throughout; no action required.
Session cookies: domain .raxx.app; no change required. HttpOnly; Secure; SameSite=None settings stay in Raptor's Set-Cookie response — not in the frontend deploy.
CF Access gate: personal-use posture maintained. The gate is on the raxx.app hostname, not the CF Pages project name.
Sentry DSN: the Next.js Sentry project DSN is a NEXT_PUBLIC_SENTRY_DSN env var at CF Pages project level. It is a public, non-secret value (Sentry DSNs are designed to be client-visible). No rotation required.
Kill-switch: CF Pages deployment alias pin is the kill-switch. Rollback time < 2 minutes.
Breach notification path: unchanged. Raptor owns the audit trail. The frontend migration does not affect breach notification routing.

Open questions

The following require explicit operator input before sub-cards #2883 and #2884 are dispatched:

CF Access binding model: Does the current CF Access application for raxx.app bind to the hostname only, or is it also bound to the CF Pages project name? If project-bound, Step 5 (re-verification) needs an explicit re-attach step in the cutover workflow. The operator should check CF Zero Trust dashboard: Access > Applications > raxx.app → confirm "Application domain" is set to raxx.app (not raxx-app.pages.dev). If it is set to the .pages.dev URL, the policy must be updated before cutover.
raxx-prod-next CF Pages project: Does this project already exist (from Phase 2 work on #2882), or does the cutover workflow need to create it? The cutover card (#2883) should not create a new CF Pages project on the fly during a production cutover — project creation is a Phase 2 sub-card.
Sentry project for Next.js: Has antlers-nextjs been created in Sentry? The post-cutover smoke (#2884) references it. If not created, it should be done in Phase 2.
Staging Next.js hostname: The cutover plan assumes raxx-staging-next CF Pages project serves a staging URL (e.g., staging-next.raxx.app or raxx-staging-next.pages.dev). Phase 2 (#2878–#2880) must confirm this before Phase 3 can proceed.

Language choice rationale

Not applicable. This ADR governs a deployment cutover strategy for a frontend surface. No new service is introduced.

Consequences

Positive

Single app in production after cutover — no dual maintenance burden.
Route guards are server-side; the entire category of CRA race-condition bugs is eliminated.
WebAuthn passkeys continue to work without re-enrollment.
Rollback is faster than the current "re-run CRA deploy workflow" path (CF Pages alias pin vs. triggering a full build).

Negative / risks

The cutover is atomic — if Next.js has a bug not caught in Phase 2 E2E, all users (including the operator) see it immediately. Mitigation: Phase 2 gate is non-negotiable.
The 14-day rollback window assumes the pinned CRA build stays available in CF Pages deployment history and is not garbage-collected. CF Pages is understood to retain prior deployments without automatic pruning; verify this assumption before cutover.

Neutral

The raxx-app CF Pages project remains as a dormant artifact for the soak window. It can be archived (not deleted) at T+14d.
deploy-antlers.yml will be updated to target raxx-prod-next as part of Phase 2 (#2882). The cutover plan assumes this is done before T-0.

Alternatives considered

Strategy B — Path-based routing

Rejected because: dual auth state, dual Sentry context, and split CE port create test surface complexity that cannot be justified for a solo operator with zero customer traffic. The WebAuthn ceremony is further complicated by the proxy/redirect ambiguity. The operational drag over weeks of dual-app maintenance exceeds the blast-radius benefit.

Strategy C — CF Worker canary

Rejected because: the canary benefit is irrelevant at zero customer traffic; the CF Worker is a new operational surface with no offsetting benefit at this launch posture. Strategy C is revisitable when Raxx has real user traffic that justifies a gradual ramp.

Security / GDPR checklist

PII collected: No new PII. The frontend migration does not change Raptor's PII collection.
Retention period: No change. Raptor and Queue own all retention policies.
Deletion on DSR: No change. No PII stored in the frontend.
Audit trail: Cutover execution is recorded in GitHub Actions logs (operator-triggered workflow_dispatch). CF Pages deployment history is the infrastructure audit trail.
Stored credentials: None. The session cookie is opaque, server-issued, and never stored in JS state. The WebAuthn private key lives in the platform authenticator. This ADR does not introduce new credential storage.
Breach notification path: No change. Raptor/Queue owns breach notification. The frontend swap does not affect this path.
Secrets location + rotation: NEXT_PUBLIC_SENTRY_DSN is a CF Pages project env var (public, non-secret). No other secrets in the frontend. All secrets rotate without a redeploy (Raptor Heroku config-vars).
Kill-switch: CF Pages deployment alias pin. Rollback time < 2 minutes. No DNS change required for rollback.

Revisit when

When Raxx has real user traffic (>100 DAU): Strategy C (CF Worker canary) becomes worth the operational complexity. Reconsider at that traffic level.
When a Next.js major-version upgrade (v14 → v15+) becomes necessary: the CF Pages project swap model (A) remains the correct approach. Deploy a new build, verify on staging, alias-pin the current build for rollback, then cut over.
