Console Phase 2 Functional Audit — 2026-05-07 UTC
Status: Phase 2 complete (live functional click-through with operator-driven SSO).
Method: Playwright MCP browser session, operator authenticated interactively (CF Access SSO + passkey + TOTP), agent drove the rest.
Auth identity: kris@moosequest.net, role superadmin.
Active env: prod (banner reads "PRODUCTION", surface badge "active env: prod").
Executive summary
16 surfaces tested; 7 healthy, 4 broken (real bugs), 5 correctly gated (working as designed).
The console-completeness pain has a clearer shape now:
- Real bugs (3): empty
/console/flags, broken/secrets/historytemplate,/console/admins/online404 - Flag-gated 404s (3):
/security,/console/customers/,/ops— these need flag flips, not new code - Role-gated 403 (2):
/billing,/billing/alert-config— operator account is missingconsole-billing-readrole - Empty-data states (1):
/admin/console-versionsshows "no completed deploy on record" because the audit-ingest endpoint hasn't been flipped on yet (#1267 secrets pending in #1314) - Healthy (7): dashboard, /secrets, /console/flags/promotions, /console/deploy-freeze, /status, /admin/console-versions, /dashboard/sites/console-prod
Everything that's not working either has a clear cause + cheap fix, or is correctly gated and just needs a flag flip.
Per-route detail
✓ Working
| # | Route | Title | What's there |
|---|---|---|---|
| 1 | /dashboard |
Dashboard — Raxx Console | 9 tile grid (1 DEGRADED — support.raxx.app http_404 with "investigate" button), Recent activity (15 items), session bar, Sign out + Health link. The Investigate-from-status FreeScout integration is visible and clickable |
| 2 | /secrets |
Secrets — Raxx Console | Full secrets list (~40+ rows) with rotation status badges (auto-rotate ready / No SOP / Mode A — manual). "Rotate now" buttons live. Recent rotations table at top shows multiple "Failed: GET /user/tokens returned status 403" — that's the CF token rotation handler hitting the same 401 we diagnosed earlier today |
| 3 | /console/flags/promotions |
Flag Promotions Queue | Correctly empty: "No active promotions. Mark a flag active for prod on the flags page to start a promotion." Empty state is well-handled |
| 4 | /console/deploy-freeze |
Deploy Freeze — Raxx Console | "Active — deploys proceeding normally" status + "Freeze Deploys" red button. Clean simple page |
| 5 | /status |
Status — Raxx Console | Surface registry table — 12 surfaces with hosting type tags (HEROKU / CONSOLE_SELF / LIGHTSAIL / FREESCOUT / CLOUDFLARE_PAGES). Includes staging surfaces |
| 6 | /admin/console-versions |
Console Versions | Two cards (staging / prod) with deploy state. Currently empty data ("No completed deploy on record. Ref: unknown, Commit SHA: unknown, Status: unknown") because deploy-audit-ingest endpoint hasn't been flipped on yet (#1267 secrets sit in unmerged #1314). Page renders correctly though |
| 7 | /dashboard/sites/console-prod |
console-prod — Surface Detail | Excellent rich page. HEALTHY badge, liveness 102ms HTTP 200, latency history chart with markers, probe history table (last 48 entries with timestamps + OK + latency + HTTP + vendor + error columns), Surface Control Flags toggle (FLAG_ENFORCE_CF_ORIGIN off — block requests bypassing CF). This is the deepest-functioning page in the console |
Screenshots: screenshots/01-dashboard.png, 02-secrets.png, 04-promotions.png, 05-deploy-freeze.png, 08-status.png, 12-console-versions.png, 13-site-detail.png.
🚩 Real bugs
| # | Route | Issue | Severity | Likely cause |
|---|---|---|---|---|
| 8 | /console/flags |
"No feature flags declared in feature_flags.yaml" despite YAML having 40+ flags | HIGH | The console blueprint reads from a different YAML path than the canonical backend_v2/api/feature_flags.yaml. Either console/config/feature_flags.yaml (the gitignored slug copy fixed in PR #1315) wasn't bundled correctly into the prod slug, OR the consumer is looking at the wrong path entirely. Page IS navigable; data layer is empty. The "Feature Flags" UI is essentially non-functional in prod |
| 9 | /secrets/history |
Page renders the rotations table at top, then dumps the same data flat-text below without the base.html template wrapper (no nav, no styling) |
MEDIUM | Template inheritance bug — likely {% extends "base.html" %} missing or {% block content %} mis-scoped. Could also be a server-side concatenation that includes raw rendered output |
| 10 | /console/admins/online |
HTTP 404 despite the route being registered in console/app/blueprints/admins_online.py:91 and the blueprint registered in console/app/__init__.py:108 |
MEDIUM | Either prod slug doesn't include the latest blueprint code (deploy lag), or there's a before_request flag-check that returns 404 when a flag is OFF. Worth verifying via heroku run whether the route is in the live Flask url_map |
Screenshots: screenshots/03-flags.png, 14-secrets-history.png, 06-admins-online-404.png.
🔒 Flag-gated 404s (need flag flip, not code)
| # | Route | Gate | Status |
|---|---|---|---|
| 11 | /security |
flag_console_nav_v2 (default OFF in YAML) |
404 — flip flag to enable |
| 12 | /console/customers/ |
flag_console_customer_admin (default OFF, risk: high) |
404 — flip flag to enable |
| 13 | /ops |
flag_console_claude_menu (default OFF) |
404 — flip flag to enable |
The pattern: the route is wired in code AND the nav-link is wired in base.html, but BOTH are conditional on a flag that defaults OFF in prod. To activate, flip via heroku config:set (per memory feedback_bootstrap_via_heroku.md — first-time bootstrap goes via Heroku CLI direct, not via the in-UI promotions flow that itself needs console_flag_promotions ON).
Screenshots: 07-security-404.png, 10-customers-404.png, 11-ops-404.png.
🔐 Role-gated 403s (RBAC working, role missing)
| # | Route | Required role | Operator has? |
|---|---|---|---|
| 14 | /billing |
console-billing-read |
NO — returns {"error":"forbidden","required_role":"console-billing-read"} |
| 15 | /billing/alert-config |
console-billing-read |
NO — same response |
The RBAC gate is doing exactly what it should. The fix is administrative: assign console-billing-read to the operator's admin record (per memory project_rbac_model.md — fine-grained roles, group composition).
Screenshots: 09-billing.png, 15-billing-alert-config.png.
Operator-visible nav (current state)
When kris@moosequest.net (superadmin) loads the console, the top nav shows:
Dashboard | Issues ↗ | Secrets | Feature Flags | Promotions | Sign out
That's 6 entries visible out of 11 wired in base.html. Missing because their gate flag is OFF in prod:
- Security (gated by
console_nav_v2) - Status (gated by
console_nav_v2) — although the route works, it's not in nav - Customers (gated by
console_customer_admin) - Billing (gated by
billing_summary_api+ RBACconsole-billing-read) - Ops dropdown (gated by
console_claude_menu)
Real-world data observations (interesting bits)
-
Rotation handler failures cluster around
CLOUDFLARE_PAGES_READ_TOKEN. The Recent rotations on /secrets shows multiple consecutive "Failed: GET /user/tokens returned status 403" entries from 2026-04-26 with that same token name. This is the same upstream issue we diagnosed today (CF tokens 401-ing the rotation handler) and still hasn't been resolved. -
DEGRADED tile on
support.raxx.appshows http_404 — operator hasn't pointedsupport.raxx.appat anything yet, so the probe gets 404. The Investigate button auto-files a FreeScout ticket on click — that integration works. Probably worth either (a) pointingsupport.raxx.appsomewhere or (b) marking the surface as "not yet deployed" in the registry so it doesn't show DEGRADED. -
2 admins online confirms the admins-presence widget is collecting data even though the
/console/admins/onlinepage returns 404. So the data layer is working; the page-render layer isn't. -
Probe history is dense — site-detail shows ~30 probe entries across the trailing 48 minutes (every ~30s). All
yes/ 200 / sub-150ms. Console health is solid. -
FLAG_ENFORCE_CF_ORIGINtoggle is OFF on console-prod — meaning the console DOES accept requests directly to its Heroku origin URL bypassing CF Access (the same boundary the hook caught me trying to abuse this morning). When you flip this to ON, the agent's hook denial earlier becomes irrelevant because the bypass is closed at the origin.
Recommendations (priority-ordered)
P0 (next 1-2 hours):
-
Diagnose
/console/flagsempty-state. Runheroku run -a raxx-console-prod cat console/config/feature_flags.yaml | head -30to confirm the slug copy is non-empty. If empty, the deploy bundling is broken (PR #1315 might have been incomplete). If non-empty, the consumer is reading from the wrong path. -
Diagnose
/console/admins/online404. Runheroku run -a raxx-console-prod python -c "from console.app import create_app; print('\n'.join(str(r) for r in create_app().url_map.iter_rules() if 'admin' in str(r)))". If the route is in url_map, it's a flag-before-request gate. If not, prod slug is stale. -
Fix
/secrets/historytemplate bug.console/app/blueprints/secrets.py:405route + corresponding template. Should be a quick template-inheritance fix.
P1 (next day):
-
Audit prod flag state once.
heroku config -a raxx-console-prod | grep ^FLAG_(operator). Decide which gates to flip:FLAG_CONSOLE_NAV_V2is the highest-leverage single flip (unlocks Security + Status nav).FLAG_CONSOLE_CLAUDE_MENUsecond (unlocks Ops dropdown).FLAG_CONSOLE_CUSTOMER_ADMINis high-risk, hold until customer onboarding starts. -
Assign
console-billing-readrole to the operator's admin record so /billing renders. Should be a one-row update to whatever admin/role table backs RBAC. -
Flip
FLAG_ENFORCE_CF_ORIGINon raxx-console-prod to close the Heroku-origin bypass. Defense in depth.
P2 (this week):
-
support.raxx.appeither gets pointed somewhere or marked pre-launch so the dashboard isn't permanently DEGRADED. -
Investigate the recurring
CLOUDFLARE_PAGES_READ_TOKENrotation failures. Multiple historical entries in /secrets — needs root-cause beyond just retrying. -
Document the UA-gating discovered in Phase 1 (see PR #1318) — adding
raxx-console-self-probe/1.0UA requirement to the runbook.
What this report does NOT cover
Phase 2 confirmed that pages render or fail with explicit reason; it did NOT exercise:
- Modals + interactions: deploy modal open/close, secret rotate modal, flag promotion mark/approve/promote
- Animations: tile pulse, breathing-line on DOWN state, build-strip (PR #1264 implementation card not yet built)
- Form submissions: none invoked (intentional — no destructive actions)
- Mobile / accessibility: desktop only, 1440×900 viewport
- Real-time updates: activity feed, alerts drawer
- End-to-end deploy flow: clicking a tile's
deploybutton, watching the modal progress
Those are Phase 2.5 — happy to drive them when needed; the auth-state captured this session keeps working as long as the browser stays open.
Open in your queue
| PR | Status | Note |
|---|---|---|
| #1314 | CLEAN, awaiting merge | SRE provisioning batch — flipping CONSOLE_AUDIT_INGEST_TOKEN is what unblocks the empty-data state on /admin/console-versions |
| #1316 | CLEAN, awaiting merge | Data-scientist Monte Carlo |
| #1317 | CLEAN, awaiting merge | Yesterday's static QA — needs the /auth/* correction noted in #1318 |
| #1318 | CLEAN, awaiting merge | Phase 1 reachability + UA-gating finding |
| #1319 | CLEAN, awaiting merge | Phase 2 Playwright scripts |
| #1161 | UNSTABLE | Fidelity — your N + tier decisions |
This report (Phase 2) lands as a separate PR in a moment.
Audit run by: Claude Code main agent + Playwright MCP, 2026-05-07 UTC Operator hand-off points: SSO + passkey + TOTP at session start. Everything else autonomous.