ADR 0106 — Antlers Next.js production cutover strategy
Status: Accepted
Date: 2026-05-27 UTC
Deciders: Kristerpher (operator); software-architect
Scope: raxx.app — CF Pages production alias cutover from CRA (raxx-app project) to Next.js (raxx-prod-next project)
Parent issue: #2883
Epic: #2872
Refs: ADR-0105, ADR-0105-addendum-phase0, docs/architecture/adr/0105-cf-pages-compat-audit-2026-05-27.md
TL;DR verdict
Recommendation: Strategy A — hard DNS swap (CNAME re-point), with CF Pages deployment-alias rollback path.
Rationale: for a single-operator, personal-use-first launch posture with CF Access on the domain and no real user traffic, the incremental complexity of Strategy B (path-based routing) and Strategy C (CF Worker canary) buys nothing. The WebAuthn RP ID invariant is safest under Strategy A because the RP ID (raxx.app) never migrates — only which CF Pages project answers the DNS CNAME changes. Rollback is a one-line CF Pages alias update, not a DNS TTL wait, because the CNAME is already pointing at raxx-prod-next.pages.dev; rollback means publishing a new deployment in raxx-prod-next that serves the pinned CRA build, not changing DNS again.
Background
Current state
- raxx.app DNS CNAME → raxx-app.pages.dev (proxied via Cloudflare, raxx.app zone)
- CF Pages project raxx-app serves the CRA static build of frontend/trademaster_ui/
- CF Access policy gates raxx.app (personal-use / operator-testing posture per project_launch_posture_personal_use)
- WebAuthn RP ID is raxx.app, locked per project_webauthn_rp_id_raxx_app and ADR-0005; passkeys are already enrolled against this origin
- Deploy workflow: deploy-antlers.yml pushes CRA builds to the raxx-app CF Pages project
Target state
- raxx.app DNS CNAME → raxx-prod-next.pages.dev (same zone, same CF Access policy)
- CF Pages project raxx-prod-next serves the Next.js build of frontend/raxx-next/
- CRA build pinned as a named alias under raxx-prod-next for 14 days (rollback path)
- Deploy workflow updated: deploy-antlers.yml builds from frontend/raxx-next/ and deploys to raxx-prod-next
- raxx-app CF Pages project retained but dormant (its CNAME removed, not the project)
What is NOT changing
- raxx.app hostname: stays exactly as is
- WebAuthn RP ID (raxx.app): unchanged throughout; no credential re-enrollment required
- CF Access policy: same policy, same gate; moves to point at the raxx-prod-next custom domain
- api.raxx.app (Raptor backend): unchanged; session cookie domain .raxx.app is already correct
- Oracle Dyn DNS for moosequest.net: not involved; per feedback_dyndns_stays
Invariants (restated)
- WebAuthn RP ID = raxx.app is inviolate. Any cutover that changes the effective origin seen by navigator.credentials.create() or .get() disqualifies enrolled passkeys. The RP ID must remain raxx.app before, during, and after cutover. Strategies are evaluated against this constraint first.
- No stored credentials: no action required; the frontend migration does not touch credential storage.
- CF Access gate stays on during personal-use posture: cutover must preserve the CF Access application binding at all times.
- Audit trail: the cutover itself is a deployment action; the operator performs it with an explicit workflow_dispatch; the GitHub Actions log is the audit record.
- Paper-first gating: no live-trading code paths exist in Next.js Phase 3; this invariant is already satisfied by the application logic, not the DNS layer.
Strategy A — Hard DNS swap (CF Pages CNAME re-point)
Description
The production CNAME for raxx.app is updated to point at the raxx-prod-next CF Pages project. The CRA build in raxx-app is retained as a dormant project. Rollback means publishing the CRA build as a deployment under raxx-prod-next (using CF Pages' deployment history, not a DNS change), because by rollback time the CNAME already targets raxx-prod-next.
Cutover sequence
| Step | Action | Operator decision point? |
|---|---|---|
| 0 | Phase 2 gate: all Playwright E2E tests green on staging (raxx-staging-next) | No (automated gate) |
| 1 | Confirm raxx-prod-next CF Pages project exists and has a successful build | No (CI gate) |
| 2 | Record the current CRA deployment ID from raxx-app CF Pages history (for rollback documentation) | No (automated in workflow) |
| 3 | Attach raxx.app custom domain to raxx-prod-next CF Pages project via CF Pages API | No (automated) |
| 4 | In Cloudflare DNS: update CNAME raxx.app → raxx-prod-next.pages.dev (proxied) | YES (operator runs workflow_dispatch on deploy-antlers-cutover.yml) |
| 5 | Verify CF Access application binding now points to raxx-prod-next custom domain | No (automated health check) |
| 6 | Run post-cutover smoke (#2884), 5-minute delay after Step 4 | No (automated workflow_run trigger) |
| 7 | Remove raxx.app custom domain from raxx-app CF Pages project | No (automated; avoids CF Pages "duplicate custom domain" error) |
| 8 | Begin 72-hour Sentry soak window | No (monitoring alert configured in Phase 2) |
| 9 | At T+14d: decommission sub-card (#2885), retire frontend/trademaster_ui/ | YES (operator files/approves retirement card) |
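As a rough illustration of Step 4, the CNAME re-point could be expressed as a single PATCH against the Cloudflare DNS records endpoint. This is a minimal sketch, not the contents of deploy-antlers-cutover.yml: the helper name buildCnameRepoint and the ZONE_ID / RECORD_ID placeholders are assumptions, and the sketch only builds the request description without sending it.

```typescript
// Hypothetical helper: describes the Cloudflare API call behind Step 4.
// zoneId and recordId are placeholders the workflow would look up beforehand.
type DnsPatchRequest = {
  method: "PATCH";
  url: string;
  body: { type: "CNAME"; name: string; content: string; proxied: boolean };
};

function buildCnameRepoint(
  zoneId: string,
  recordId: string,
  target: string,
): DnsPatchRequest {
  return {
    method: "PATCH",
    url: `https://api.cloudflare.com/client/v4/zones/${zoneId}/dns_records/${recordId}`,
    // proxied stays true so the change takes effect at Cloudflare's edge
    // rather than waiting on resolver TTLs.
    body: { type: "CNAME", name: "raxx.app", content: target, proxied: true },
  };
}

// The cutover re-point; rollback does not touch DNS at all.
const cutoverRequest = buildCnameRepoint(
  "ZONE_ID",
  "RECORD_ID",
  "raxx-prod-next.pages.dev",
);
```

Keeping the request as plain data makes it easy to log the exact payload in the GitHub Actions run, which is the audit record for the cutover.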
DNS state at each phase
| Phase | raxx.app CNAME target | Active CF Pages project | CRA state |
|---|---|---|---|
| Pre-cutover (current) | raxx-app.pages.dev | raxx-app (CRA build) | Active |
| During CF Pages domain attach (Step 3) | raxx-app.pages.dev | Both projects have raxx.app temporarily | CF warns; resolve by removing from raxx-app immediately after |
| Post-cutover (Steps 4–7) | raxx-prod-next.pages.dev | raxx-prod-next (Next.js) | Dormant project, CNAME removed |
| Rollback (if needed) | raxx-prod-next.pages.dev | raxx-prod-next serving pinned CRA build | Restored via CF Pages deployment pin |
| T+14d (decommission) | raxx-prod-next.pages.dev | raxx-prod-next (Next.js, permanent) | raxx-app project archived |
Important: Because Cloudflare proxies the CNAME (proxied: true), the CNAME update propagates at Cloudflare's edge in under 60 seconds. There is no 24-hour TTL propagation delay for proxied CF records. The "TTL propagation delay" concern (listed as a Strategy A con in the brief) does not apply to proxied CF DNS records.
Rollback path
- In CF Pages, navigate to raxx-prod-next → Deployments
- Find the deployment tagged cra-rollback-alias (pinned before cutover)
- Click "Set as active deployment"
- CF Pages begins serving the CRA build from raxx-prod-next immediately (alias update, not DNS change)
- Time to restore: under 2 minutes. No DNS change required.
- No WebAuthn impact: RP ID raxx.app was never changed; the same CNAME target is used.
What is lost on rollback: nothing in Phase 3. User sessions created during the Next.js soak window remain valid after rollback (session cookie domain .raxx.app is shared). Phase 3 ports existing pages and adds no new data stores, so no data is lost. If a Next.js-specific data store is added later, it must be evaluated at rollback time.
WebAuthn invariant
Strategy A never changes the RP ID. The browser validates the RP ID against the origin of the page that runs the ceremony: the RP ID must equal the origin's host or be a registrable suffix of it. That origin (window.location.origin) is https://raxx.app regardless of which CF Pages project is behind the CNAME. Pre-cutover passkeys enrolled against raxx.app continue to work post-cutover without re-enrollment. There is no RP ID migration, no cross-origin redirect during the credential ceremony, and no change to Raptor's WEBAUTHN_RP_ID env var.
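The origin check can be approximated in a few lines. The sketch below is a simplification of what browsers do (it ignores the public-suffix rules that stop an RP ID such as "dev" from matching), and the function name rpIdMatchesOrigin is an illustrative assumption, not code from Raptor or the frontend.

```typescript
// Simplified RP ID check: the RP ID must equal the page's host or be a
// dot-delimited suffix of it. Real browsers also reject public suffixes.
function rpIdMatchesOrigin(rpId: string, origin: string): boolean {
  const host = new URL(origin).hostname;
  return host === rpId || host.endsWith("." + rpId);
}

// Same hostname before and after cutover, so the check passes either way.
const passesOnProductionHost = rpIdMatchesOrigin("raxx.app", "https://raxx.app");

// A ceremony that ran on the .pages.dev hostname would fail the check.
const passesOnPagesDevHost = rpIdMatchesOrigin(
  "raxx.app",
  "https://raxx-prod-next.pages.dev",
);
```

This is why the CF Pages project behind the CNAME is irrelevant to enrolled passkeys: only the hostname the browser sees takes part in the comparison.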
CF Access policy state
The CF Access application for raxx.app binds to the hostname, not the CF Pages project. When the CNAME re-points to raxx-prod-next, the CF Access policy continues to evaluate all requests for raxx.app before they reach the origin. No policy change is required. Step 5 confirms the policy is applied; if CF Access turns out to require re-attachment to the new CF Pages project (it typically does not for hostname-based policies), the workflow handles it.
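One way to frame the Step 5 check is as a predicate over the application domain configured in CF Access. This is a minimal sketch under the assumption that the workflow can obtain that domain as a string; the function name and the input shape are hypothetical, and no CF Access API call is shown.

```typescript
// Hypothetical check: true when the Access application is bound to the
// raxx.app hostname rather than to a project-specific .pages.dev hostname.
// An application domain may carry a path (e.g. "raxx.app/admin"), so only
// the host part before the first "/" is compared.
function accessBindingIsHostnameBased(applicationDomain: string): boolean {
  const [host = ""] = applicationDomain.split("/");
  return host.toLowerCase() === "raxx.app";
}

const hostnameBound = accessBindingIsHostnameBased("raxx.app");
const projectBound = accessBindingIsHostnameBased("raxx-app.pages.dev");
```

If the predicate is false, the policy would need updating before cutover rather than during it.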
Cookie + session state
The session cookie is set by Raptor with Domain=.raxx.app; SameSite=None; Secure; HttpOnly. The Next.js middleware at raxx.app reads this cookie via request.cookies.get('session'). The CRA app also reads this cookie. Because both apps are served from raxx.app with the same session cookie domain, there is no cookie migration. Sessions created under CRA remain valid under Next.js and vice versa. No SameSite policy conflict exists because both apps serve from the same origin (https://raxx.app).
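Because both apps consume the same cookie, the read side is framework-independent. The following sketch parses a raw Cookie header for the session value; it stands in for request.cookies.get('session') and is not the actual middleware code. The function name readSessionCookie is an assumption.

```typescript
// Framework-free stand-in for reading the shared session cookie.
// Returns undefined when the header is missing or has no "session" entry.
function readSessionCookie(cookieHeader: string | null): string | undefined {
  if (!cookieHeader) return undefined;
  for (const part of cookieHeader.split(";")) {
    const idx = part.indexOf("=");
    if (idx === -1) continue; // skip malformed pairs without a value
    const name = part.slice(0, idx).trim();
    if (name === "session") return part.slice(idx + 1).trim();
  }
  return undefined;
}

const sessionValue = readSessionCookie("theme=dark; session=abc123");
```

The cookie value is treated as opaque here; validating it remains Raptor's job in either app.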
Risk score
| Dimension | Score (1=low, 5=high) | Notes |
|---|---|---|
| Blast radius | 2 | Atomic — either the DNS CNAME points to Next.js or it doesn't. No partial state. |
| Rollback speed | 1 | CF Pages alias pin, under 2 minutes, no DNS change |
| Test surface | 2 | Single app in production; all tests run against raxx.app directly |
| Operational complexity | 2 | One workflow_dispatch, automated health check, one CF Pages alias operation |
Strategy B — Path-based routing (page-by-page migration)
Description
raxx.app keeps its current CNAME at raxx-app (CRA). Individual routes are moved to the Next.js app by routing specific paths to raxx-prod-next via a CF _redirects or CF Worker rule. For example, /login → raxx-prod-next.pages.dev/login, rest → CRA.
Cutover sequence
- For each page to migrate: update CF _redirects in raxx-app to proxy that path to raxx-prod-next
- Validate each page individually
- Migrate pages one-by-one over weeks until all paths are in Next.js
- Final step: re-point the CNAME entirely (same as Strategy A at completion)
WebAuthn invariant — DISQUALIFYING CONCERN
The WebAuthn ceremony initiates from https://raxx.app and requires the JS context to be served from raxx.app. If the /login page is proxied from raxx-prod-next while the overall app still serves from raxx-app, the window.location.origin seen by the passkey ceremony may differ depending on how the proxy is implemented:
- CF _redirects (transparent proxy / "pass-through"): The browser address bar still shows https://raxx.app/login. window.location.origin is https://raxx.app. RP ID check passes. However, this is CF Pages _redirects proxy behavior, which serves raxx-prod-next content at the raxx.app origin. That is effectively identical to Strategy A on a per-route basis, but with the added complexity of maintaining dual auth state during the transition.
- CF _redirects (302 redirect): The browser redirects to raxx-prod-next.pages.dev/login. RP ID would mismatch (raxx.app vs raxx-prod-next.pages.dev); this disqualifies the redirect variant entirely for any route that runs the WebAuthn ceremony.
The proxy variant avoids the hard disqualification, but introduces dual auth state: the CRA app and the Next.js app both have active route guards and session cookie consumers. A user on /login (Next.js) who then navigates to /dashboard (still CRA until migrated) hits two different route guard implementations. Session cookie sharing works (same domain), but:
- Analytics events come from two different Sentry projects
- Two separate React context trees initialize independently on each navigation
- Error boundaries and loading states are inconsistent between apps
- The feature flag system is potentially inconsistent (flags read from process.env in Next.js middleware vs. localStorage in CRA)
- The CE visual skin must be kept identical in both apps simultaneously during the transition
Risk score
| Dimension | Score (1=low, 5=high) | Notes |
|---|---|---|
| Blast radius | 3 | Gradual, but any page migration failure affects only that page |
| Rollback speed | 3 | Must roll back individual route proxies; state is spread across pages |
| Test surface | 5 | Two apps, two auth states, two Sentry contexts, split CE port in flight |
| Operational complexity | 5 | Weeks of dual-app maintenance; CE skin must stay in sync across both |
Verdict: Strategy B is not recommended. The dual-auth-state and dual-test-surface complexity is not justified for a solo operator with no real user traffic. The WebAuthn ceremony risk (redirect variant is outright disqualified; proxy variant is safe but adds test surface complexity with no benefit) makes B a worse choice than A in every dimension except "blast-radius per deployment" — a benefit that does not matter when there are no users to blast.
Strategy C — CF Worker canary router
Description
A CF Worker is deployed at raxx.app that inspects an opt-in cookie (raxx-next=1) or a canary cohort (1% → 10% → 50% → 100%) and proxies requests to either raxx-app.pages.dev (CRA) or raxx-prod-next.pages.dev (Next.js) based on the cohort bucket.
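The routing decision described above could be sketched as a pure function. Everything here is illustrative: the visitorId input, the FNV-1a hash, and the function names are assumptions, and the sketch omits the Worker fetch handler and any KV lookup for the cohort percentage.

```typescript
type Upstream = "raxx-app.pages.dev" | "raxx-prod-next.pages.dev";

// FNV-1a 32-bit hash reduced to a bucket in [0, 100). Hashing a stable
// visitor id keeps a visitor in the same cohort across requests.
function bucketOf(visitorId: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < visitorId.length; i++) {
    h ^= visitorId.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h % 100;
}

// Opt-in cookie wins; otherwise the cohort percentage decides.
function pickUpstream(
  cookieHeader: string,
  visitorId: string,
  cohortPercent: number,
): Upstream {
  const optedIn = cookieHeader
    .split(";")
    .some((c) => c.trim() === "raxx-next=1");
  if (optedIn || bucketOf(visitorId) < cohortPercent) {
    return "raxx-prod-next.pages.dev";
  }
  return "raxx-app.pages.dev";
}
```

A deterministic bucket reduces, but does not remove, the chance of a visitor flipping between apps: changing the cohort percentage or losing the visitor id still moves them.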
Cutover sequence
- Deploy CF Worker at raxx.app (replaces the CF Pages → Pages direct serving model)
- Configure canary: raxx-next=1 cookie → Next.js; default → CRA
- Set cohort to 1% via Worker KV or env var
- Monitor Sentry for Next.js error rate vs. CRA baseline
- Ramp cohort 1% → 10% → 50% → 100%
- At 100%, retire the Worker and re-point to CF Pages directly (same as Strategy A)
WebAuthn invariant — DISQUALIFYING CONCERN
The CF Worker proxies both apps from behind raxx.app. The window.location.origin seen by the WebAuthn ceremony is https://raxx.app regardless of which backend app is serving the request. At first glance, this preserves the RP ID. However:
A user in the 1% Next.js cohort who initiates a passkey registration ceremony and then (due to a Worker canary flip, a cookie expiry, or a page reload) falls back to the CRA cohort mid-ceremony will encounter a broken ceremony state. The in-progress WebAuthn ceremony is stateful (a challenge is generated by Raptor, stored server-side for the in-flight ceremony). The ceremony completion must reach the same Raptor endpoint regardless of which frontend served the initiation. This is already true in the current design (Raptor owns the ceremony, not the frontend), so the risk is limited to UX inconsistency rather than credential breakage.
The more substantive concern: two React applications simultaneously serving raxx.app creates dual session state. The session cookie is shared (correct), but the Next.js middleware.ts route guard and the CRA RouteGuard.js are simultaneously evaluating that cookie for different users. If a user in the Next.js 10% cohort visits /dashboard, the Next.js middleware handles their session. If a Worker canary flip moves them to the CRA cohort on their next request, the CRA route guard handles their session. Both work, but:
- Analytics diverge between cohorts
- The CE visual port must be complete before ramping cohort, otherwise some users see un-skinned CRA pages while others see CE-skinned Next.js pages
- The Worker itself is a new operational surface that must be maintained, tested, and eventually removed
- For a zero-customer launch (personal-use only), the Worker adds cost and complexity with no measurable blast-radius benefit over Strategy A + CF Pages alias rollback
Risk score
| Dimension | Score (1=low, 5=high) | Notes |
|---|---|---|
| Blast radius | 1 | Canary at 1% means only 1% of traffic hits Next.js at a time |
| Rollback speed | 2 | Set cohort to 0% via Worker KV; Worker itself stays deployed |
| Test surface | 4 | Dual apps, dual session handling, Worker as new failure surface |
| Operational complexity | 4 | CF Worker implementation, KV canary state, cohort ramp automation |
Verdict: Strategy C is not recommended for this launch posture. The canary benefit (real production traffic exercise at low blast radius) is irrelevant when the user population is one operator. The Worker is a net addition to the operational surface. When real customer traffic exists, Strategy C becomes more attractive as a future migration pattern — but by that point, the CRA-to-Next.js migration will be complete.
Decision matrix
| Dimension | A: Hard DNS swap | B: Path-based routing | C: CF Worker canary |
|---|---|---|---|
| Blast radius (1=low) | 2 | 3 | 1 |
| Rollback speed (1=fast) | 1 | 3 | 2 |
| Test surface (1=small) | 2 | 5 | 4 |
| Operational complexity (1=simple) | 2 | 5 | 4 |
| WebAuthn RP ID safety | Full | Partial (proxy variant only) | Partial (dual-session risk) |
| CF Access continuity | Full (hostname policy, unchanged) | Partial (dual project complexity) | Requires Worker CF Access bypass |
| Cookie / session safety | Full (same domain, no migration) | Full (same domain) | Full (same domain) |
| Total (lower=better) | 7 | 16 | 11 |
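The totals row sums only the four numeric dimensions; the three qualitative rows are not scored. A short sketch of that arithmetic, with field names chosen here for illustration:

```typescript
// Numeric scores copied from the decision matrix (1 = best on every row).
type Scores = {
  blastRadius: number;
  rollbackSpeed: number;
  testSurface: number;
  operationalComplexity: number;
};

const matrix: Record<"A" | "B" | "C", Scores> = {
  A: { blastRadius: 2, rollbackSpeed: 1, testSurface: 2, operationalComplexity: 2 },
  B: { blastRadius: 3, rollbackSpeed: 3, testSurface: 5, operationalComplexity: 5 },
  C: { blastRadius: 1, rollbackSpeed: 2, testSurface: 4, operationalComplexity: 4 },
};

// Unweighted sum of the four dimensions.
function total(s: Scores): number {
  return s.blastRadius + s.rollbackSpeed + s.testSurface + s.operationalComplexity;
}

const totals = { A: total(matrix.A), B: total(matrix.B), C: total(matrix.C) };
// totals is { A: 7, B: 16, C: 11 }
```

The sum is unweighted, so Strategy C's advantage on blast radius counts no more than any other row.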
Recommendation
Strategy A. The CNAME re-point is atomic, reversible in under 2 minutes via CF Pages deployment alias, and fully preserves the WebAuthn RP ID invariant. The "no gradual rollout" con is immaterial: the personal-use launch posture has no user traffic to blast. The DNS TTL concern does not apply to Cloudflare proxied records (sub-60-second propagation).
Caveats
- Phase 2 gate is non-negotiable. Do not execute Strategy A until the Playwright E2E suite passes on raxx-staging-next. A failed cutover to a broken Next.js build is the only real risk in Strategy A, and the Phase 2 gate is the mitigation for it.
- CF Pages domain conflict window. Between Step 3 (attach raxx.app to raxx-prod-next) and Step 7 (detach from raxx-app), CF Pages may warn about a duplicate custom domain. The workflow should handle detach before attach, or handle the overlap with a retry. See the sub-card for deploy-antlers-cutover.yml implementation details.
- CF Access re-verification. CF Access hostname policies are typically transparent to CF Pages project changes. However, verify in Step 5 that the CF Access application for raxx.app still evaluates correctly against the new CF Pages project. If CF Access is bound to the Pages project (not just the hostname), it may require re-attachment.
- Sentry project separation. The Next.js app should use a distinct Sentry project (antlers-nextjs, not the CRA project). This ensures error rate baselines are not contaminated during the soak window. The post-cutover smoke card (#2884) gates on this.
Rollout milestones
| Milestone | Condition | Action |
|---|---|---|
| T-0 (cutover) | Phase 2 Playwright E2E green on staging; operator runs workflow_dispatch on deploy-antlers-cutover.yml | CNAME re-pointed; CF Access verified; smoke test runs |
| T+0:05 | Post-cutover smoke passes | Begin Sentry soak window; alert threshold active (3x 7-day baseline → ops@raxx.app) |
| T+24h | Sentry error rate within 2x baseline | No action required; continue soak |
| T+72h | Sentry error rate within 2x baseline; operator confirms no regressions | Soak complete; CRA rollback alias retained but soak period over |
| T+14d | Operator approves retirement | File #2885 (retire frontend/trademaster_ui/) — see sub-card |
| T+14d+ | raxx-app CF Pages project archived | raxx-app project marked inactive; deploy-antlers.yml updated to target raxx-prod-next only |
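The soak thresholds in the milestones table can be read as a small classifier over the Sentry error rate. This is a sketch under stated assumptions: the "investigate" band between 2x and 3x is not defined in this ADR and is added here only to make the gap explicit, and the inclusive/exclusive boundaries are a guess.

```typescript
type SoakVerdict = "continue" | "investigate" | "alert";

// errorRate and baseline7d must use the same unit (e.g. errors per hour).
// At or above 3x the 7-day baseline the alert fires; at or below 2x the soak
// continues. The band in between is an assumption, not part of the ADR.
function evaluateSoak(errorRate: number, baseline7d: number): SoakVerdict {
  if (errorRate >= 3 * baseline7d) return "alert";
  if (errorRate > 2 * baseline7d) return "investigate";
  return "continue";
}

const atBaseline = evaluateSoak(1.0, 1.0);
const elevated = evaluateSoak(2.5, 1.0);
const alerting = evaluateSoak(3.0, 1.0);
```

An "alert" verdict does not trigger rollback by itself; the deployment alias pin remains an operator action.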
Migrations
No schema migrations. No database changes. No Raptor changes. The session cookie shape is unchanged.
One CF Pages infrastructure change: attach raxx.app custom domain to raxx-prod-next, detach from raxx-app. This is reversible at any point before T+14d.
Security considerations
- WebAuthn RP ID invariant: preserved throughout; no action required.
- Session cookies: domain .raxx.app; no change required. HttpOnly; Secure; SameSite=None settings stay in Raptor's Set-Cookie response, not in the frontend deploy.
- CF Access gate: personal-use posture maintained. The gate is on the raxx.app hostname, not the CF Pages project name.
- Sentry DSN: the Next.js Sentry project DSN is a NEXT_PUBLIC_SENTRY_DSN env var at CF Pages project level. It is a public, non-secret value (Sentry DSNs are designed to be client-visible). No rotation required.
- Kill-switch: CF Pages deployment alias pin is the kill-switch. Rollback time < 2 minutes.
- Breach notification path: unchanged. Raptor owns the audit trail. The frontend migration does not affect breach notification routing.
Open questions
The following require explicit operator input before sub-cards #2883 and #2884 are dispatched:
- CF Access binding model: Does the current CF Access application for raxx.app bind to the hostname only, or is it also bound to the CF Pages project name? If project-bound, Step 5 (re-verification) needs an explicit re-attach step in the cutover workflow. The operator should check the CF Zero Trust dashboard: Access > Applications > raxx.app → confirm "Application domain" is set to raxx.app (not raxx-app.pages.dev). If it is set to the .pages.dev URL, the policy must be updated before cutover.
- raxx-prod-next CF Pages project: Does this project already exist (from Phase 2 work on #2882), or does the cutover workflow need to create it? The cutover card (#2883) should not create a new CF Pages project on the fly during a production cutover; project creation is a Phase 2 sub-card.
- Sentry project for Next.js: Has antlers-nextjs been created in Sentry? The post-cutover smoke (#2884) references it. If not created, it should be done in Phase 2.
- Staging Next.js hostname: The cutover plan assumes the raxx-staging-next CF Pages project serves a staging URL (e.g., staging-next.raxx.app or raxx-staging-next.pages.dev). Phase 2 (#2878–#2880) must confirm this before Phase 3 can proceed.
Language choice rationale
Not applicable. This ADR governs a deployment cutover strategy for a frontend surface. No new service is introduced.
Consequences
Positive
- Single app in production after cutover — no dual maintenance burden.
- Route guards are server-side; the entire category of CRA race-condition bugs is eliminated.
- WebAuthn passkeys continue to work without re-enrollment.
- Rollback is faster than the current "re-run CRA deploy workflow" path (CF Pages alias pin vs. triggering a full build).
Negative / risks
- The cutover is atomic — if Next.js has a bug not caught in Phase 2 E2E, all users (including the operator) see it immediately. Mitigation: Phase 2 gate is non-negotiable.
- The 14-day soak window, with the CRA rollback alias available in CF Pages history, assumes the CRA build stays pinned and is not garbage-collected by CF Pages. The working assumption is that CF Pages retains all prior deployments (they are not automatically pruned). Verify this assumption before cutover.
Neutral
- The raxx-app CF Pages project remains as a dormant artifact for the soak window. It can be archived (not deleted) at T+14d.
- deploy-antlers.yml will be updated to target raxx-prod-next as part of Phase 2 (#2882). The cutover plan assumes this is done before T-0.
Alternatives considered
Strategy B — Path-based routing
Rejected because: dual auth state, dual Sentry context, and split CE port create test surface complexity that cannot be justified for a solo operator with zero customer traffic. The WebAuthn ceremony is further complicated by the proxy/redirect ambiguity. The operational drag over weeks of dual-app maintenance exceeds the blast-radius benefit.
Strategy C — CF Worker canary
Rejected because: the canary benefit is irrelevant at zero customer traffic; the CF Worker is a new operational surface with no offsetting benefit at this launch posture. Strategy C is revisitable when Raxx has real user traffic that justifies a gradual ramp.
Security / GDPR checklist
- PII collected: No new PII. The frontend migration does not change Raptor's PII collection.
- Retention period: No change. Raptor and Queue own all retention policies.
- Deletion on DSR: No change. No PII stored in the frontend.
- Audit trail: Cutover execution is recorded in GitHub Actions logs (operator-triggered workflow_dispatch). CF Pages deployment history is the infrastructure audit trail.
- Stored credentials: None. The session cookie is opaque, server-issued, and never stored in JS state. The WebAuthn private key lives in the platform authenticator. This ADR does not introduce new credential storage.
- Breach notification path: No change. Raptor/Queue owns breach notification. The frontend swap does not affect this path.
- Secrets location + rotation: NEXT_PUBLIC_SENTRY_DSN is a CF Pages project env var (public, non-secret). No other secrets in the frontend. All secrets rotate without a redeploy (Raptor Heroku config-vars).
- Kill-switch: CF Pages deployment alias pin. Rollback time < 2 minutes. No DNS change required for rollback.
Revisit when
- When Raxx has real user traffic (>100 DAU): Strategy C (CF Worker canary) becomes worth the operational complexity. Reconsider at that traffic level.
- When the Next.js → Next.js upgrade (v14 → v15+) becomes necessary: the CF Pages project swap model (A) remains the correct approach — deploy a new build, verify on staging, alias-pin the current build for rollback, then cutover.