Date: 2026-05-03
Author: PM agent (raxx-pm-bot)
Parent epic: #907
Trigger: Kristerpher's 2026-05-03 ~06:00 UTC pivot to three-flow rotation architecture
Architect doc (in-flight): docs/architecture/velvet/v2-rotation-flows.md — not yet landed; sections referenced below will be cross-linked once the file is committed
Scope doc (v1, for annotation): docs/architecture/velvet/scope.md — supersession notes at end of this document
Kristerpher's directive replaces the v1 single-handler-pattern model with three named flows and an explicit service-bus registration layer:
| Flow | Description | Terminal signal |
|---|---|---|
| Testing | Validates auth, permissions, and token visibility before any mutation | N/A (read-only pass/fail) |
| Operational | Stage 1 → Stage 2 → Stage 3 orchestrated rotation (verify, mint + distribute, validate + revoke) | completed job status |
| Revocation | Terminate token immediately; no re-mint; wait for 401 to confirm | 401 from consumer |
The service-bus model requires that every system that consumes a token registers against that token so that a rotation event fans out to all registered consumers automatically. This replaces the v1 pattern where the distribute step held a hardcoded destination list per handler.
The modal UI per stage (Stage 1 / 2 / 3 progress visibility + type-to-confirm gates) is now a first-class deliverable, not a post-M3 polish item.
Disposition: KEEP — no body edits needed
The app pair scaffold, Postgres add-on, CF Access pattern, and bootstrap config var strategy are identical in v2. The only addition is that rotation_jobs will eventually grow a flow column (testing / operational / revocation) — that is a schema concern in V3, not V1.
Disposition: KEEP — minor body note needed
The read-through proxy surface is unchanged. However, the audit log event in /value should be enriched to include flow_context when the read is part of a Testing-flow probe. Propose a comment on #909 (not a body edit):
"v2 note: The /value endpoint's audit log line should include an optional flow_context field (testing / operational / revocation) injected by the job runner when the read is part of a named flow. The endpoint itself does not change; the job runner sets context. Flag for implementation PR."
Disposition: REVISE — schema needs two new columns
The v2 three-flow model requires tracking which flow a job belongs to and the current stage within that flow.
Proposed body edits (to propose to card-groomer for the next grooming pass):
- flow column: TEXT NOT NULL CHECK (flow IN ('testing', 'operational', 'revocation')) DEFAULT 'operational'
- stage column: TEXT CHECK (stage IN ('stage_1_verify', 'stage_2_mint_distribute', 'stage_3_validate_revoke')); nullable; populated as the job runner advances. NULL is deliberately not listed inside IN: a nullable column already admits NULL, and a NULL in the IN list makes the check evaluate to NULL (which passes) for every unmatched value, so the constraint would reject nothing.
- status CHECK constraint: extend to include 'revoked' as a terminal state distinct from 'completed' (revocation flow completes differently: old token is gone; new token was never minted)
- subscriber_snapshot JSONB column: captures the registered subscriber list at the time of job creation; immutable after job starts; used for audit and partial-failure replay

No other columns change. Indexes are unaffected.
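The constraint behaviour can be checked locally before the real migration is written. The following is a minimal sketch, assuming SQLite (Python's built-in sqlite3) as a stand-in for Postgres: JSONB is approximated with TEXT, the table is reduced to the columns discussed here, and the helper names are hypothetical. The stage check lists only the three stage names, since a nullable column accepts NULL without it being named in the IN list.

```python
import sqlite3

# Hypothetical, reduced rotation_jobs table: only the columns discussed above.
# SQLite stands in for Postgres, so JSONB is approximated with TEXT.
DDL = """
CREATE TABLE rotation_jobs (
    id INTEGER PRIMARY KEY,
    flow TEXT NOT NULL DEFAULT 'operational'
        CHECK (flow IN ('testing', 'operational', 'revocation')),
    stage TEXT
        CHECK (stage IN ('stage_1_verify', 'stage_2_mint_distribute',
                         'stage_3_validate_revoke')),
    subscriber_snapshot TEXT
)
"""


def make_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(DDL)
    return conn


def try_insert(conn, flow, stage):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO rotation_jobs (flow, stage) VALUES (?, ?)", (flow, stage)
        )
        return True
    except sqlite3.IntegrityError:
        return False


def default_flow(conn):
    """Insert a row with no explicit values and report the flow it received."""
    cur = conn.execute("INSERT INTO rotation_jobs DEFAULT VALUES")
    row = conn.execute(
        "SELECT flow FROM rotation_jobs WHERE id = ?", (cur.lastrowid,)
    ).fetchone()
    return row[0]
```

A NULL stage is accepted (the job has not started), while an unknown flow or stage name is rejected.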
Disposition: REVISE — three flows, not one kickoff shape
The kickoff endpoint must accept a flow parameter:
POST /tokens/{name}/rotate
{
"flow": "operational" | "testing" | "revocation",
"idempotency_key": "..."
}
For the revocation flow, the endpoint name should arguably be POST /tokens/{name}/revoke (separate endpoint) rather than /rotate with flow=revocation. This is an open decision; see Section 5, decision OD-1.
The poll endpoint GET /tokens/{name}/rotations/{job_id} should return stage in addition to status so the console modal can reflect which stage is in progress.
Proposed body edits:
- Scope section: add flow field to request body; add stage field to response body
- Acceptance criteria: add tests for each flow type kickoff; add test for stage field in poll response
- Explicitly document that flow=testing does NOT advance to mint/distribute; the job completes at stage_1_verify with the pseudo-status tested
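A minimal sketch of the revised request and response shapes, assuming the flow defaults to operational when omitted (mirroring the proposed column default) and that OD-1 resolves to a flow param on /rotate. KickoffRequest, parse_kickoff, and poll_response are hypothetical names, not existing code.

```python
from dataclasses import dataclass

VALID_FLOWS = ("operational", "testing", "revocation")


@dataclass(frozen=True)
class KickoffRequest:
    token_name: str
    flow: str
    idempotency_key: str


def parse_kickoff(token_name: str, body: dict) -> KickoffRequest:
    """Validate a POST /tokens/{name}/rotate body; raise ValueError on bad input."""
    # Assumption: a missing flow falls back to the column default.
    flow = body.get("flow", "operational")
    if flow not in VALID_FLOWS:
        raise ValueError(f"unknown flow: {flow!r}")
    key = body.get("idempotency_key")
    if not isinstance(key, str) or not key:
        raise ValueError("idempotency_key is required")
    return KickoffRequest(token_name, flow, key)


def poll_response(job: dict) -> dict:
    """Build the poll payload: status plus the current stage (None before start)."""
    return {
        "job_id": job["id"],
        "flow": job["flow"],
        "status": job["status"],
        "stage": job.get("stage"),
    }
```

If OD-1 lands on a separate /revoke endpoint instead, the same validation would apply with the flow fixed by the route rather than read from the body.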
Disposition: KEEP — no body edits needed
The auth model is flow-agnostic. The scope matrix (read, rotate) covers all three flows. The revocation flow uses rotate scope per D7 resolution. No changes needed here.
Disposition: KEEP — D2 gating still applies
SSM integration is flow-agnostic (it is a backing store, not a handler). Still blocked on D2 confirmation. No body edits needed.
Disposition: REPLACE
The v1 handler pattern (handler.mint / validate / distribute / revoke as a monolithic four-function object) is superseded by the bus-adapter pattern. Instead of a handler that contains its own distribute logic, v2 has a vendor module (token_service/vendors/postmark.py) limited to mint/validate/revoke, with distribution handled by bus adapters that the flow runner fans out to.
Proposed replacement card (new issue, not yet filed):
V7-v2: "Build flow runner — three-stage orchestrator for operational + testing + revocation flows"
- Orchestrates Stage 1 (verify auth + permissions), Stage 2 (mint via vendor module + fan-out to registered subscribers), Stage 3 (validate on new token + execute revoke via vendor module)
- Reads subscriber registry for the credential at job creation time; snaps list to subscriber_snapshot JSONB
- Advances stage column in rotation_jobs as it progresses
- Emits per-stage audit events
- Testing flow: executes Stage 1 only; no mutations; job status = tested
- Revocation flow: skips Stage 2 mint; executes Stage 3 revoke only; job status = revoked
See Section 3 (New Card Slate) for the full V7-v2 and related new cards.
Disposition: REPLACE
This card's entire "four-function handler" shape is v1. In v2, the Heroku-specific logic splits into:
- Vendor module (token_service/vendors/heroku.py): mint (OAuth authorization creation) + validate (GET /account) + revoke (delete OAuth authorization). No distribute logic.
- Bus adapter (token_service/adapters/heroku_config_var.py): one adapter instance per Heroku app; each app in the distribution list is a separate registered subscriber with a named adapter

Proposed replacement card:
V8-v2: "Heroku vendor module (mint/validate/revoke) + config-var bus adapter"
- Vendor module: mint, validate, revoke via Heroku Platform API (HTTP-only, no CLI)
- Bus adapter: HerokuConfigVarAdapter(app_name) — registered once per Heroku app in the subscriber registry; pushes new token via PATCH /apps/{app}/config-vars
- GH Actions secret becomes a separate bus adapter GithubActionsSecretAdapter (see new card slate)
- Explicit dependency on #925 landing before GH Actions secret adapter can work
- Deprecation marker on console/app/services/rotation_handlers/heroku.py — see Section 6
Disposition: REPLACE
Same pattern as V8. The Cloudflare-specific token resolution and verification logic moves to a vendor module; the Infisical write (currently the only distribute destination) becomes a bus adapter entry.
Proposed replacement card:
V9-v2: "Cloudflare vendor module (mint/validate/revoke) + token-store bus adapter"
- Vendor module: mint (POST /user/tokens), validate (GET /user/tokens/verify), revoke (DELETE /user/tokens/{id})
- Companion-secret pattern for __CF_TOKEN_ID storage is retained
- token_service/adapters/infisical_write.py is the generic bus adapter that handles the Infisical write destination — reused across all credentials, not Cloudflare-specific
Note: V7, V8, V9 replacements together constitute the bus-adapter system's first three concrete implementations. They should be filed as a set with clear dependencies.
Disposition: KEEP — add one note
Still valid. Add a comment note: in v2, the migrated callsite reads from GET /tokens/{name}/value on the Velvet API exactly as specified; the bus architecture does not change how consumers read tokens, only how rotations distribute to them. No body edits needed; the card stands.
Disposition: REVISE — scope expands significantly
The handler-author guide content must be replaced with bus-adapter-author guide content. The four-function interface contract is no longer the extension point. The extension point is:
1. Writing a vendor module (thin: mint/validate/revoke only)
2. Writing a bus adapter (how to push a new token value to a specific destination)
3. Registering a subscriber (mapping credential name + consumer system to an adapter instance)

Additionally, the runbook must cover:
- The three-flow modal UX: what each stage shows, what the operator does if a stage stalls
- How to perform a revocation from the UI vs. API
- How to inspect the subscriber registry for a given credential
- The deprecation marker convention for v1 handlers
Proposed body edit for card-groomer:
"Scope section rewrite: replace 'four-function interface contract' sections with vendor-module + bus-adapter-author content per v2 architecture. Runbook section additions: three-flow modal UX ops guide, revocation flow SOP, subscriber registry inspection, v1 handler deprecation pattern. Both docs still gate on v2 handlers landing (now V7-v2, V8-v2, V9-v2)."
The v2 architecture requires approximately 20 cards. Nine carry over from v1 (some revised); eleven are new. The three-flow split plus service-bus infrastructure drives the new count.
| # | Title | Disposition | Milestone |
|---|---|---|---|
| #908 | Scaffold Heroku app pair + Postgres add-on | KEEP | M1 |
| #909 | GET /tokens/{name} read-through proxy | KEEP | M1 |
| #910 | rotation_jobs schema + migration | REVISE (flow, stage, subscriber_snapshot columns) | M1 |
| #911 | POST /tokens/{name}/rotate kickoff + poll | REVISE (flow param, stage in response, separate /revoke endpoint TBD per OD-1) | M1 |
| #912 | Service-token auth middleware | KEEP | M1 |
NV1: "Build subscriber registry — per-credential consumer registration + snapshot on job start"
Parent: #907 | Milestone: M2.5 | Depends on: #910 (rotation_jobs schema)
User story: As the Velvet flow runner, I want to query a registry of which systems have subscribed to a given credential so that a rotation job fans out to all of them without per-handler hardcoded lists.
Scope:
- subscriber_registry Postgres table: (id, credential_name, consumer_name, adapter_class, adapter_config JSONB, enabled BOOL, env)
- GET /tokens/{name}/subscribers — list registered subscribers for a credential
- POST /tokens/{name}/subscribers — register a new subscriber (operator action; not automated)
- At job creation, the flow runner snapshots the enabled subscriber list into rotation_jobs.subscriber_snapshot JSONB
- Infisical write adapter is auto-registered for all Infisical-backed credentials on first rotation
Acceptance criteria:
- Subscriber list for HEROKU_API_KEY contains at minimum: Infisical write, raxx-console-prod config-var, raxx-console-staging config-var, raxx-api-prod config-var, raxx-api-staging config-var, GH Actions secret
- subscriber_snapshot on a new job matches the enabled subscriber list at job creation time
- Disabling a subscriber prevents it from receiving updates on the next rotation (does not revoke the old token from that destination)
Risks:
- Empty subscriber list: if a credential has zero enabled subscribers, the rotate job would complete but nothing gets updated. Mitigation: fail the job at Stage 2 with error: no subscribers registered if snapshot is empty.
- Registry bootstrapping: the registry needs to be seeded before the first rotation. Mitigation: provide a migration seed script that pre-registers known subscribers from the v1 hardcoded lists.
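The snapshot and empty-list behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the registry implementation: the registry is a plain in-memory list rather than the Postgres table, and Subscriber, snapshot_subscribers, and require_subscribers are hypothetical names.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class Subscriber:
    credential_name: str
    consumer_name: str
    adapter_class: str
    adapter_config: dict = field(default_factory=dict)
    enabled: bool = True


class NoSubscribersError(Exception):
    """Raised at Stage 2 when a job's snapshot contains no subscribers."""


def snapshot_subscribers(registry, credential_name):
    """Return the enabled subscribers for one credential as plain dicts.

    The result is what would be serialized into
    rotation_jobs.subscriber_snapshot at job creation time.
    """
    return [
        asdict(s)
        for s in registry
        if s.credential_name == credential_name and s.enabled
    ]


def require_subscribers(snapshot):
    """Stage 2 precondition: refuse to proceed on an empty snapshot."""
    if not snapshot:
        raise NoSubscribersError("no subscribers registered")
    return snapshot


def to_jsonb(snapshot):
    """Serialize deterministically so two snapshots can be compared in audits."""
    return json.dumps(snapshot, sort_keys=True)
```

Disabled subscribers never enter the snapshot, so toggling one off affects only jobs created afterwards.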
NV2: "Implement bus adapter base class + Infisical-write adapter (first concrete adapter)"
Parent: #907 | Milestone: M2.5 | Depends on: NV1
User story: As a handler author, I want a base class and working reference implementation for a bus adapter so that new adapters follow a consistent interface and the Infisical write case works out of the box.
Scope:
- token_service/adapters/base.py — BusAdapter abstract class: push(credential_name, new_value, context) -> AdapterResult
- token_service/adapters/infisical_write.py — writes new value to Infisical via authorized client; returns AdapterResult(destination, ok, error_message)
- The flow runner calls adapter.push(...) for each subscriber in subscriber_snapshot; collects results; partial failure does not abort remaining adapters
- Per-adapter result is stored back to rotation_jobs.subscriber_snapshot (each entry is updated in place with its outcome; the set of subscribers captured at job creation does not change)
Acceptance criteria:
- push() on InfisicalWriteAdapter updates the Infisical secret and returns ok=True
- A failed push() on one adapter does not raise; it returns ok=False with error_message populated
- The flow runner collects all results; if any adapter fails, job status becomes completed_partial (new status value — add to V3 schema revision)
- Tests: happy path, single adapter failure, all adapters fail (job → failed)
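One way the adapter contract and fan-out could look, as a sketch only. The source specifies push(credential_name, new_value, context) -> AdapterResult; splitting the destination-specific work into a _deliver hook, and the InMemoryAdapter used to exercise it, are assumptions of this example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class AdapterResult:
    destination: str
    ok: bool
    error_message: Optional[str] = None


class BusAdapter(ABC):
    destination = "unknown"

    @abstractmethod
    def _deliver(self, credential_name, new_value, context):
        """Destination-specific write; may raise on failure."""

    def push(self, credential_name, new_value, context) -> AdapterResult:
        """Never raises: failures are converted into ok=False results."""
        try:
            self._deliver(credential_name, new_value, context)
            return AdapterResult(self.destination, ok=True)
        except Exception as exc:
            # A real adapter must ensure the exception text cannot contain
            # the token value before storing or logging it.
            return AdapterResult(self.destination, ok=False, error_message=str(exc))


class InMemoryAdapter(BusAdapter):
    """Hypothetical test double standing in for InfisicalWriteAdapter."""

    destination = "memory"

    def __init__(self):
        self.store = {}

    def _deliver(self, credential_name, new_value, context):
        self.store[credential_name] = new_value


def fan_out(adapters, credential_name, new_value, context=None):
    """Push to every adapter; one failure does not stop the others."""
    results = [a.push(credential_name, new_value, context or {}) for a in adapters]
    ok_count = sum(r.ok for r in results)
    if results and ok_count == len(results):
        outcome = "completed"
    elif ok_count == 0:
        outcome = "failed"
    else:
        outcome = "completed_partial"
    return results, outcome
```

Keeping the try/except in the base class means an adapter author only writes the happy path, and the "does not raise" acceptance criterion holds for every subclass.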
NV3: "Implement Heroku config-var bus adapter + GH Actions secret bus adapter"
Parent: #907 | Milestone: M2.5 | Depends on: NV2, and #925 for GH Actions adapter
User story: As the flow runner distributing a HEROKU_API_KEY rotation, I want dedicated adapters for Heroku config-var writes and GH Actions secrets so that each destination is independently testable and re-usable across any credential that needs those destinations.
Scope:
- token_service/adapters/heroku_config_var.py — HerokuConfigVarAdapter(app_name): PATCH /apps/{app}/config-vars via Heroku Platform API; no CLI
- token_service/adapters/github_actions_secret.py — GithubActionsSecretAdapter(secret_name): PyNaCl-encrypted PUT to GH Secrets API; no CLI; depends on GITHUB_API_SECRETS_TOKEN (#925)
- Both adapters: log <REDACTED> for any token value in structured logs
Acceptance criteria:
- HerokuConfigVarAdapter("raxx-console-prod").push("HEROKU_API_KEY", new_val, ctx) sets the config var on that app (verified by GET /apps/raxx-console-prod/config-vars)
- GithubActionsSecretAdapter("HEROKU_API_KEY").push(...) updates the GH Actions secret (verified by GH REST API); fails gracefully if GITHUB_API_SECRETS_TOKEN is missing (returns ok=False, does not raise)
- No subprocess.run, os.system, or CLI invocations in either adapter
- Tests: mock Heroku API success, mock Heroku API 4xx, GH secret encrypted correctly, GH secret missing token returns ok=False
Risks:
- #925 not yet landed: GH Actions adapter cannot be fully exercised in prod until GITHUB_API_SECRETS_TOKEN is provisioned. Mitigation: adapter degrades gracefully; the subscriber can be registered but disabled until the token lands.
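A sketch of the HTTP-only config-var write, using only the standard library and building the request without sending it. The PATCH /apps/{app}/config-vars route and the version=3 Accept header follow the Heroku Platform API; the function names, the api_key parameter, and the log-line shape are assumptions of this example. The GH Actions adapter is not sketched here because it depends on PyNaCl and on #925.

```python
import json
import urllib.request

HEROKU_API = "https://api.heroku.com"


def build_config_var_request(app_name, var_name, new_value, api_key):
    """Build (but do not send) the PATCH that sets one config var on one app."""
    body = json.dumps({var_name: new_value}).encode("utf-8")
    return urllib.request.Request(
        url=f"{HEROKU_API}/apps/{app_name}/config-vars",
        data=body,
        method="PATCH",
        headers={
            "Accept": "application/vnd.heroku+json; version=3",
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def redacted_log_line(app_name, var_name):
    """Structured log entry for a push; the token value is never included."""
    return {
        "event": "adapter.push",
        "destination": f"heroku:{app_name}",
        "credential": var_name,
        "value": "<REDACTED>",
    }
```

An adapter's delivery step would pass the built request to urllib.request.urlopen and treat any HTTP error as an ok=False result.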
NV4: "Implement Cloudflare user-API-token vendor module + token-store adapter"
Parent: #907 | Milestone: M3 | Depends on: NV2 (adapter base)
This is the replacement for V9 (#916). Vendor module only (mint/validate/revoke); the Infisical write destination reuses InfisicalWriteAdapter from NV2 — no custom distribute logic needed.
Scope:
- token_service/vendors/cloudflare_user_api_token.py: mint, validate, revoke
- Companion-secret pattern retained for __CF_TOKEN_ID
- CF auth error 10000 surfaced with remediation hint in error_message
- Register CLOUDFLARE_RAXX_AUTOMATION_API_TOKEN (and other CF tokens per vault taxonomy) as using this vendor module
NV5: "Implement three-stage flow runner (testing / operational / revocation)"
Parent: #907 | Milestone: M2.5 | Depends on: NV1, NV2, #910 (schema with flow + stage columns), #911 (kickoff endpoint)
This is the replacement for V7 (#914) at the orchestration layer.
User story: As an operator triggering any rotation, I want the flow runner to execute the correct stage sequence for the requested flow so that I never need to manually coordinate mint, distribute, and revoke steps.
Scope:
Testing flow (Stage 1 only, no mutations):
- Stage 1: authenticate against the vendor API using the current token; verify the token is visible and the caller has permission to manipulate it
- Job status → tested; no job rows other than this job are modified
- Terminal signal: pass/fail per credential
Operational flow (all three stages):
- Stage 1: same as testing flow — if Stage 1 fails, abort before any mutations
- Stage 2: call vendor.mint(); on success, fan out to all registered subscribers via adapters; update subscriber_snapshot with per-adapter results; if any adapter fails → completed_partial but continue to Stage 3
- Stage 3: call vendor.validate(new_token) for each subscriber that succeeded; on all-pass, call vendor.revoke(old_token); job → completed; if validate fails → do NOT revoke; job → failed (old token still valid)
- Terminal signal: completed (all subscribers updated + revoke executed) or completed_partial (some subscribers failed but revoke still executed) or failed (validate failed, old token preserved)
Revocation flow (revoke only, no mint):
- No Stage 1 or Stage 2 — skip directly to Stage 3
- Stage 3: call vendor.revoke(current_token) only; wait for 401 from a validation probe against the old token to confirm revocation
- Job → revoked on confirmation
- Terminal signal: 401 from probe = confirmed revoked
- Note: this is Kristerpher's insight — revocation is architecturally the same as Stage 3 of the operational flow, extracted as a standalone trigger
Acceptance criteria:
- Testing flow for POSTMARK_SERVER_TOKEN: job lands in tested status; no Infisical mutation; Stage 1 result logged
- Operational flow for POSTMARK_SERVER_TOKEN: job progresses pending → stage_1_verify → stage_2_mint_distribute → stage_3_validate_revoke → completed; all subscribers updated
- Revocation flow for any credential: job progresses to revoked; probe confirms 401 from old token
- If Stage 1 fails in operational flow, job → failed immediately; no mutations
- If Stage 3 validate fails in operational flow, old token NOT revoked; job → failed; audit log captures old_token_preserved: true
- rotation_jobs.stage column advances correctly at each transition
Risks:
- Stage 3 validate/revoke atomicity: validate passes but revoke fails (vendor API error). Mitigation: revoke is retried up to 3 times; if revoke fails after retries, job → completed_partial; old token remains valid alongside new token; alert operator.
- Subscriber partial failure at Stage 2: one destination fails but revoke still executes. Mitigation: completed_partial status makes partial failure visible; runbook documents the recovery procedure.
- Testing flow false negative: Stage 1 probe fails not because the token is invalid but because the vendor API is temporarily down. Mitigation: surface the error message from the vendor API in the testing flow result; operator can distinguish network errors from auth errors.
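The stage sequencing above can be sketched as a small runner. Several points are assumptions of this example rather than decisions from the card: Stage 1 is modelled as vendor.validate(current_token), a failed Stage 1 in the testing flow yields failed, an all-adapters-failed Stage 2 yields failed without revoking, OD-2 is taken as option A, and FakeVendor is an in-memory stand-in rather than a real vendor module.

```python
FLOW_STAGES = {
    "testing": ["stage_1_verify"],
    "operational": ["stage_1_verify", "stage_2_mint_distribute", "stage_3_validate_revoke"],
    "revocation": ["stage_3_validate_revoke"],
}


class FakeVendor:
    """In-memory stand-in for a vendor module; not a real vendor API."""

    def __init__(self, new_token_valid=True):
        self.live = {"old-token"}
        self.new_token_valid = new_token_valid

    def validate(self, token):
        return token in self.live

    def mint(self):
        if self.new_token_valid:
            self.live.add("new-token")
        return "new-token"

    def revoke(self, token):
        self.live.discard(token)
        return True


def revoke_with_retry(vendor, token, attempts=3):
    """Call vendor.revoke up to `attempts` times; True once one call succeeds."""
    return any(vendor.revoke(token) for _ in range(attempts))


def run_flow(flow, vendor, distribute, current_token):
    """Run one flow and return (status, stages_entered).

    `distribute(new_token)` is a hypothetical callable returning one bool per
    subscriber (True means that adapter push succeeded).
    """
    stages = FLOW_STAGES[flow]
    history = []
    partial = False
    new_token = None

    if "stage_1_verify" in stages:
        history.append("stage_1_verify")
        if not vendor.validate(current_token):
            return "failed", history  # abort before any mutation
        if flow == "testing":
            return "tested", history

    if "stage_2_mint_distribute" in stages:
        history.append("stage_2_mint_distribute")
        new_token = vendor.mint()
        outcomes = distribute(new_token)
        if not any(outcomes):
            return "failed", history  # no subscriber received the new token
        partial = not all(outcomes)

    history.append("stage_3_validate_revoke")
    if flow == "operational" and not vendor.validate(new_token):
        return "failed", history  # old token is NOT revoked
    if not revoke_with_retry(vendor, current_token):
        return ("failed" if flow == "revocation" else "completed_partial"), history
    if flow == "revocation":
        # Probe the old token: a rejected probe stands in for the 401 signal.
        return ("revoked" if not vendor.validate(current_token) else "failed"), history
    return ("completed_partial" if partial else "completed"), history
```

The revocation branch reuses the Stage 3 revoke path unchanged, which is the point of treating revocation as Stage 3 extracted into a standalone trigger.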
NV6: "Postmark vendor module (validate-only; mint is pre-staged)"
Parent: #907 | Milestone: M3 | Depends on: NV5
Thin card. The Postmark-specific logic from V7 (#914) that is not superseded by the flow runner: the validate() call and the pre-staged mint pattern. No distribute logic (that is InfisicalWriteAdapter).
NV7: "Heroku vendor module (mint/validate/revoke) — extracted from V8-v2"
Parent: #907 | Milestone: M3 | Depends on: NV5
The Heroku OAuth mint, GET /account validate, and authorization delete revoke — extracted as a vendor module. The config-var distribution is NV3's adapters. This is the replacement for V8 (#915) at the vendor logic layer.
NV8: "Build rotation modal — three-stage progress UI with per-stage status indicators"
Parent: #907 | Milestone: M3 | Depends on: NV5, #911 (poll endpoint returns stage)
User story: As a console operator, I want the rotation modal to show which stage is currently executing so that I understand what is happening and can react if something stalls.
Scope:
- Three visual stages in the modal: Stage 1 (Verify), Stage 2 (Mint + Distribute), Stage 3 (Validate + Revoke)
- Each stage: pending (gray), in-progress (animated), success (green check), failed (red x with error message)
- Modal polls GET /tokens/{name}/rotations/{job_id} for stage and status updates
- Stage 2 shows a subscriber table: each subscriber row updates in real-time as adapter results come in from subscriber_snapshot
- Banner color: red for prod rotation, purple for staging (per console env-switcher memory)
- Flow label displayed in modal header: "Testing" / "Operational" / "Revocation"
Acceptance criteria:
- Stage indicators advance correctly as the job progresses through stages
- Subscriber table renders all entries from subscriber_snapshot; failed rows show error_message
- Revocation flow modal shows only Stage 3 section (Stages 1 and 2 not rendered)
- Testing flow modal shows only Stage 1 section with pass/fail indicators
- Modal is accessible from the Security > Token Management view
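The indicator logic can be expressed as a pure function of the poll response, which makes it testable without a browser. This is a sketch in Python for consistency with the service code; the console would implement the equivalent in its own front-end language. The function name, the indicator labels, the "running" status used in the usage below, and the choice to render completed_partial stages as success are assumptions.

```python
STAGES = ["stage_1_verify", "stage_2_mint_distribute", "stage_3_validate_revoke"]

# Which stage sections each flow renders in the modal.
VISIBLE = {
    "testing": STAGES[:1],
    "operational": STAGES,
    "revocation": STAGES[2:],
}

TERMINAL_OK = {"completed", "completed_partial", "tested", "revoked"}


def stage_indicators(flow, stage, status):
    """Map a poll response (flow, stage, status) to one indicator per visible stage."""
    visible = VISIBLE[flow]
    indicators = {}
    for name in visible:
        if status in TERMINAL_OK:
            indicators[name] = "success"
        elif stage is None:
            indicators[name] = "pending"  # job created but not started
        elif visible.index(name) < visible.index(stage):
            indicators[name] = "success"  # earlier stages already passed
        elif name == stage:
            indicators[name] = "failed" if status == "failed" else "in_progress"
        else:
            indicators[name] = "pending"
    return indicators
```

For example, `stage_indicators("operational", "stage_2_mint_distribute", "running")` marks Stage 1 as success, Stage 2 as in_progress, and Stage 3 as pending.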
NV9: "Subscriber-table view in Security console — per-credential registry inspector"
Parent: #907 | Milestone: M3 | Depends on: NV1 (subscriber registry)
User story: As an operator, I want to see which systems are registered as subscribers for a given credential so that I know all the destinations that will receive an update on the next rotation.
Scope:
- New panel in Security > Token Management: "Subscribers" tab per credential
- Lists consumer_name, adapter_class, enabled toggle (operator can disable without deleting)
- Shows last push result from subscriber_snapshot of the most recent completed job
- Operator can add a subscriber via UI (pre-populates adapter_config form based on adapter_class selection)
NV10: "Type-to-confirm gate for Operational and Revocation flows"
Parent: #907 | Milestone: M3 | Depends on: NV8
User story: As a console operator, I want a type-to-confirm dialog before a mutation-bearing rotation (or revocation) fires so that accidental clicks on a prod credential do not trigger irreversible actions.
Scope:
- Before Stage 2 begins in Operational flow: modal prompts "Type the credential name to confirm"; entry must match exactly; then Stage 2 proceeds
- Before Revocation flow starts: same gate; prompt text includes: "This will immediately terminate the token. This action cannot be undone."
- Testing flow: no confirmation gate (read-only; no mutations)
- Gate is in the UI only; the API layer does not enforce it (the console operator role implies trust; the UI gate is a UX safeguard, not a security boundary)
NV11: "Update operator runbook + bus-adapter-author guide for v2 architecture"
Parent: #907 | Milestone: M3 | Depends on: NV5, NV6, NV7, NV8
This replaces V11 (#918) scope. Same deliverable location (docs/architecture/velvet/), rewritten for v2 concepts.
Scope changes vs. V11:
- "Handler-author guide" → "Bus-adapter-author guide": vendor module interface, adapter interface, subscriber registration steps
- Runbook additions: three-flow SOP (testing, operational, revocation), subscriber registry inspection, completed_partial recovery, revocation confirmation verification
- Reference docs/architecture/velvet/v2-rotation-flows.md throughout once that doc lands
| Category | v1 cards | v2 cards | Net change |
|---|---|---|---|
| Infrastructure (keep/revise) | 5 | 5 | 0 |
| Infrastructure (new: service bus) | 0 | 4 (NV1-NV4) | +4 |
| Flow runner | 0 | 1 (NV5) | +1 |
| Vendor modules | 3 (V7-V9) | 3 (NV6, NV7, NV4) | 0 (replaced) |
| UI | 0 | 4 (NV8-NV11) | +4 |
| SSM (M2) | 1 | 1 | 0 |
| Migration + docs | 2 (V10, V11) | 2 | 0 |
| Total | 11 | 20 | +9 |
M1 scope does not change under v2. The Heroku app pair, Postgres schema, read proxy, rotate kickoff stub, and auth middleware are all v2-compatible as filed. The schema additions for v2 (flow, stage, subscriber_snapshot columns) are small enough to land in V3 revision before M1 is cut.
M1 success gate (unchanged): GET /tokens/POSTMARK_SERVER_TOKEN with a valid service token returns correct metadata from Velvet staging.
M1 v2 additions:
- rotation_jobs schema includes flow, stage, subscriber_snapshot columns (V3 revision)
- Kickoff endpoint accepts flow param (V4 revision); revoke endpoint question (OD-1) resolved before M1 is cut
M2 scope does not change. It is a backing-store concern that is orthogonal to the bus architecture.
M2 success gate (unchanged): SSM-backed credential readable through GET /tokens/{name}/value.
Hard gate still in effect: D2 (SSM path convention) must be confirmed before M2 starts.
Scope: NV1 (subscriber registry), NV2 (adapter base + Infisical write adapter), NV3 (Heroku + GH Actions adapters), NV5 (flow runner — testing + operational flows), and the existing subscriber pre-registration seed migration.
M2.5 success gate: Operational flow for HEROKU_API_KEY in staging completes with all four Heroku config-var adapters updated and GH Actions secret updated; rotation_jobs row shows completed with correct stage history in subscriber_snapshot.
This milestone de-risks the entire bus model before any UI is built. The UI cards (NV8-NV10) can be developed against a working API.
v2 M3 scope: NV6 (Postmark vendor module), NV8 (three-stage modal UI), NV9 (subscriber-table view), NV10 (type-to-confirm), NV11 (runbook v2), #917 (first console callsite migration)
M3 success gate: Console operator can trigger a Testing flow, Operational flow, and Revocation flow for POSTMARK_SERVER_TOKEN from the UI. All three flows complete correctly, modal reflects per-stage progress, subscriber table shows Infisical write result. Revocation flow terminates the old token and the probe confirms 401.
The revocation flow is architecturally simpler than the operational flow (Stage 3 only), but it is UI-complete and deserves a standalone hardening milestone before it becomes a first-class operator tool.
M4 scope:
- Revocation flow hardened to production: type-to-confirm gate live, audit event vault.rotation.revoked fires, probe confirmation logged
- Revocation flow available for all registered credentials (not just Postmark)
- completed_partial recovery runbook finalized
- Load test: concurrent revocation + rotation on different credentials does not deadlock
M4 gate: M3 complete; no M4 card is started before M3 success gate is passed.
Kristerpher needs a binary answer on each of these before the corresponding card is dispatched. The same questions are flagged in docs/architecture/velvet/v2-rotation-flows.md — answer once, both docs converge.
| ID | Question | Cards gated | Options |
|---|---|---|---|
| OD-1 | Is revocation a separate endpoint (POST /tokens/{name}/revoke) or the rotate endpoint with flow=revocation? | V4 revision, NV5, NV8, NV10 | A) Separate endpoint (cleaner semantics; revocation is never a rotation); B) flow param on /rotate (one fewer route; matches the v2 model where revocation re-uses Stage 3 logic) |
| OD-2 | What is the completed_partial policy? If some subscribers fail in Stage 2 but validate passes, does Stage 3 revoke still execute? | NV5 | A) Yes: revoke always executes if validate passes, regardless of subscriber partial failure; B) No: all subscribers must succeed before revoke; partial failure holds old token valid |
| OD-3 | Does the subscriber registry live in Postgres (on the Velvet app) or in Infisical metadata? | NV1 | A) Postgres (consistent with rotation_jobs; queryable via SQL; auditable); B) Infisical metadata (no new table; but couples the bus registry to the secret store being rotated, which is a circular dependency risk) |
| OD-4 | What is the Testing flow output surface? Job row only (operator reads from GET /rotations/{id}) or does it push a structured result to the console Status page? | NV5, NV8 | A) Job row only (operator pulls); B) Testing flow result is a lightweight "health probe" event surfaced on the Console Status page |
| D2 | SSM path convention: confirm /raxx/{env}/{vendor}/{name} before M2 starts | #913 V6 | Pre-existing open decision; needs explicit Y/N |
| D3 | Auth model: confirm per-caller scoped tokens vs. single global key before V5 (#912) is finalized | #912 V5 | Pre-existing open decision |
The architect agent is producing docs/architecture/velvet/v2-rotation-flows.md in parallel. This PM doc should be cross-referenced from that doc, and vice versa. Once the architect's doc lands:
- Cross-link this doc to the v2-rotation-flows.md sections covering the testing, operational, and revocation flows.
- Reconcile the open decisions with v2-rotation-flows.md. If the architect's doc resolves any of them structurally, update the OD table above and un-gate the corresponding cards immediately.

The Heroku Mode A handler that shipped in PR #906 and PR #934 is operational and correct (CLI-free, HTTP-only). It should remain active until the Velvet v2 bus is live in production with the Heroku config-var adapter and GH Actions secret adapter both confirmed working end-to-end (M2.5 success gate).
At that point:
- Add a file-level deprecation comment to console/app/services/rotation_handlers/heroku.py:
# DEPRECATED: Superseded by token_service/vendors/heroku.py + HerokuConfigVarAdapter (Velvet v2).
# Do not extend. Remove after Velvet M2.5 is verified in production.
# Tracked: https://github.com/raxx-app/TradeMasterAPI/issues/907
- The console rotation UI stays on this handler until V10 (#917) console callsite migration lands
- Do NOT remove the handler before V10 is merged and smoke-tested in production
The CF Access service token provisioning SOP (docs/ops/runbooks/cf-access-service-token-provisioning.md) is a manual runbook, not a handler. It will eventually be superseded by a Velvet Cloudflare-vendor + Infisical-write adapter workflow (NV4), but that is M3 scope. No deprecation action now.
Once docs/architecture/velvet/scope.md is created, annotate the following sections:
| v1 scope section | v2 status |
|---|---|
| "Rotation Handler Abstraction" (four-function interface) | Superseded by v2 vendor module + bus adapter pattern; refer to v2-rotation-flows.md |
| "M3 — First rotation end-to-end" success gate | Updated by v2 M3 re-plan; see Section 4 of this doc |
| Handler registry HANDLER_REGISTRY = {name: handler} | Superseded by subscriber registry + adapter class mapping (NV1); refer to v2-rotation-flows.md |
| Distribute step inside handler | Superseded by bus adapter fan-out in flow runner Stage 2; refer to NV2/NV3 |
Sections not affected: M1, M2, app pair scaffold, Postgres state machine basics, auth model, audit log shape.
Top-3 blast-radius changes:
1. V7/V8/V9 (#914/#915/#916): DROP and replace with bus-adapter split. Three vendor modules + four bus adapter cards + one flow runner card replace three monolithic handler cards. This is the largest structural change and cannot be undone once the subscriber registry is seeded.
2. V3/V4 (#910/#911): schema and kickoff endpoint must be revised before M1 is cut. The flow, stage, and subscriber_snapshot columns + the flow param on the kickoff endpoint are foundational. Every card that comes after M1 assumes these fields exist. A rework after M1 deploys would require a Postgres migration on a live app.
3. Revocation flow endpoint shape (OD-1): decision gates the entire UI layer. If revocation is a separate endpoint (/revoke), the modal, type-to-confirm, and audit log events are named differently than if it is a flow param on /rotate. This decision should be made before NV8 (modal UI) is dispatched.
Six decisions need answers before dispatch begins: OD-1 through OD-4 plus the pre-existing D2 and D3. None are hard to answer, and most are binary. Recommend that Kristerpher review Section 5 and mark each one in a comment on #907.
No new issues filed yet per instructions. This doc is the review artifact. File-and-dispatch on Kristerpher's go.