Raxx · internal docs

ADR 0129 — RBAC V2 Blueprint Cutover Rollout Strategy

Status: Accepted
Date: 2026-06-18 UTC
Deciders: Kristerpher (operator), software-architect
Scope: Console service (all 17 blueprint files in console/app/blueprints/)


Context

The Console has 141 legacy @require_role(...) call sites that resolve against the flat four-level admin_roles table (superadmin / ops / support / readonly). RBAC V2 tables and fine-grained decorators exist in console/app/middleware/rbac.py. Issue #1473 (operator-authorized for pre-launch, 2026-06-18) asks: how do we cut over 141 sites safely without a single all-or-nothing mega-PR that cannot be rolled back at the route level?

Two questions drove this ADR:

  1. Should the cutover use a runtime flag-check inside each decorator (live-flip) or a deployment-time switch (redeploy)?
  2. Should the cutover happen in a single PR or a phased cluster-by-cluster sequence?

Decision

The cutover uses direct decorator replacement in phased blueprint clusters, with FLAG_RBAC_V2 as a deployment gate (not a runtime gate). Shadow dual-mode code is removed in the first sub-card. Each cluster ships independently to staging and soaks before promotion to prod. Rollback is via tagged-SHA redeploy, not flag flip.
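As a rough illustration, direct decorator replacement with a deployment-time gate might look like the sketch below. It is written in plain Python so it runs without Flask. The names require_permission, assert_deploy_gate, GRANTS, and the permission string users.read are assumptions made for this example, not the contents of console/app/middleware/rbac.py. Reading "deployment gate" as "checked once at startup" is also an assumption.

```python
import os
from functools import wraps


class Forbidden(Exception):
    """Stand-in for an HTTP 403 response."""


# Hypothetical grants; the real data would come from the RBAC V2 tables.
GRANTS = {"alice": {"users.read"}, "bob": set()}


def require_permission(permission):
    """Hypothetical V2 decorator. There is no flag check in the request path."""
    def decorator(view):
        @wraps(view)
        def wrapper(user, *args, **kwargs):
            if permission not in GRANTS.get(user, set()):
                raise Forbidden(permission)
            return view(user, *args, **kwargs)
        return wrapper
    return decorator


def assert_deploy_gate(environ=None):
    """Read FLAG_RBAC_V2 once at startup; refuse to boot a V2 build without it."""
    environ = os.environ if environ is None else environ
    if environ.get("FLAG_RBAC_V2", "").lower() not in {"1", "true"}:
        raise RuntimeError("this build requires FLAG_RBAC_V2 to be enabled")


# Before the cutover this line would have been a legacy @require_role(...) call.
@require_permission("users.read")
def list_users(user):
    return 200
```

Because the flag is consulted only at startup in this sketch, each route carries exactly one authorization check, and reverting a cluster means redeploying the previous tagged SHA.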

The per-route permission mapping in docs/architecture/rbac-blueprint-cutover.md §3 is the authoritative correctness artifact. Every sub-card must produce integration tests proving 200/403 behaviour before merging.
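One way to turn the mapping into 200/403 tests is to drive the checks from a table, as in the minimal sketch below. It uses plain Python stand-ins rather than a Flask test client. The views, permission strings, and the EXPECTED table are hypothetical and are not taken from rbac-blueprint-cutover.md §3.

```python
from functools import wraps


class Forbidden(Exception):
    """Stand-in for an HTTP 403 response."""


def require_permission(permission):
    """Hypothetical V2 decorator; `granted` is the caller's permission set."""
    def decorator(view):
        @wraps(view)
        def wrapper(granted):
            if permission not in granted:
                raise Forbidden(permission)
            return view(granted)
        return wrapper
    return decorator


@require_permission("users.read")
def list_users(granted):
    return "ok"


@require_permission("secrets.manage")
def rotate_secret(granted):
    return "ok"


# Hypothetical excerpt of the per-route mapping: (view, required permission).
EXPECTED = [(list_users, "users.read"), (rotate_secret, "secrets.manage")]


def status(view, granted):
    """Translate the view's outcome into the status code a client would see."""
    try:
        view(granted)
        return 200
    except Forbidden:
        return 403


def mapping_failures(expected):
    """Describe every route whose behaviour contradicts the mapping."""
    every_permission = {perm for _, perm in expected}
    failures = []
    for view, perm in expected:
        # Holding exactly the mapped permission must be enough.
        if status(view, {perm}) != 200:
            failures.append(f"{view.__name__}: expected 200 with {perm}")
        # Holding every other permission must not be enough.
        if status(view, every_permission - {perm}) != 403:
            failures.append(f"{view.__name__}: expected 403 without {perm}")
    return failures
```

An empty result from mapping_failures(EXPECTED) means the decorators agree with the table. A route decorated with the wrong permission shows up as a named failure instead of passing silently.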


Language choice rationale

Skipped. This ADR governs an operational/authz rollout decision, not a new service.


Consequences

Positive

Negative / risks

Neutral


Alternatives considered

Alternative A: Runtime flag-check inside each decorator

Each route registers a wrapper that checks FLAG_RBAC_V2 at request time and branches to the legacy or V2 check. This enables live flip without redeploy.

Rejected because: Flask decorators are applied once, at import time, so a per-request flag check would require wrapping every route in an additional callable. That adds significant code complexity and introduces a new surface for decorator-ordering bugs. The existing shadow-check mechanism already demonstrated the fragility of this pattern (it had to be lazy-imported to avoid circular imports). The operational benefit, a live flip, is small: flag flips on Heroku trigger a dyno restart anyway, so a flag-flip restart is only seconds faster than a SHA redeploy.
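For comparison, the rejected runtime pattern might look roughly like the sketch below. It is plain Python, not the Console's code: require_role_or_permission, the role set, the permission string, and the ENV dictionary standing in for the process environment are all assumptions made for illustration.

```python
from functools import wraps


class Forbidden(Exception):
    """Stand-in for an HTTP 403 response."""


def require_role_or_permission(roles, permission, environ):
    """Hypothetical dual-mode decorator: every route carries both checks."""
    def decorator(view):
        @wraps(view)
        def wrapper(user, *args, **kwargs):
            # The flag is read on every request, so both code paths stay live.
            if environ.get("FLAG_RBAC_V2") == "true":
                allowed = permission in user["permissions"]
            else:
                allowed = user["role"] in roles
            if not allowed:
                raise Forbidden(view.__name__)
            return view(user, *args, **kwargs)
        return wrapper
    return decorator


ENV = {"FLAG_RBAC_V2": "false"}  # stand-in for os.environ


@require_role_or_permission({"superadmin", "ops"}, "users.read", ENV)
def list_users(user):
    return 200
```

In this sketch the same user can be allowed under the legacy branch and denied under the V2 branch, so every route needs its role list and its permission maintained and tested side by side for as long as the flag exists.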

Alternative B: Single mega-PR, all 141 sites at once

All blueprints are ported in one PR. Reviewed once, merged once.

Rejected because: A 141-site change with no per-blueprint granularity is unreviewable, cannot be rolled back at the route level, and creates a single point of failure. A wrong mapping in secrets.py would require reverting all blueprints. The phased approach allows secrets (the highest-privilege blueprint) to ship last, after all other clusters have soaked.


Security / GDPR checklist


Revisit when