ADR-0041 — Velvet consumer registration: runtime API + manifest bootstrap (supersedes ADR-0040)
Status: Accepted (2026-05-03)
Supersedes: [ADR-0040](https://internal-docs.raxx.app/architecture/adr/0040-velvet-consumer-registration-static-manifest.html)
Related: [ADR-0037](https://internal-docs.raxx.app/architecture/adr/0037-velvet-service-bus-subscription-model.html), [ADR-0038](https://internal-docs.raxx.app/architecture/adr/0038-velvet-three-stage-operational-flow.html), [ADR-0039](https://internal-docs.raxx.app/architecture/adr/0039-velvet-revocation-401-criterion.html)
Context
ADR-0040 (merged 2026-05-03 ~06:15 UTC in PR #944) chose static-manifest-only registration for Velvet consumers. The reasoning was security: a runtime registration endpoint widens attack surface.
In conversation 2026-05-03 ~07:00 UTC, the operator answered OQ1 explicitly:
"Runtime API — I'm looking for the flow that makes this a bit more dynamic. I feel like holding values is catastrophic at some point. If you disagree, state your reasons."
Operator's concern: a static manifest drifts from code-reality over time. Consumers added to the codebase without manifest updates are silently excluded from rotations. Consumers removed from the codebase but still in the manifest get repeatedly poked with credentials they no longer use. Drift is a footgun, especially as the consumer count grows.
This ADR supersedes ADR-0040 with a hybrid model that captures both arguments.
Decision
Hybrid registration: manifest is the bootstrap seed; runtime API is the durable source of truth.
- The `subscription-manifest.yml` defines the initial registry state at deploy time. It seeds the `subscribers` table on first boot and is checked in to source control.
- A runtime registration API (`POST /api/v1/subscribers`) is the durable path for adding, updating, and removing subscribers without redeploys. After bootstrap, the table is the source of truth; the manifest can drift without breaking anything.
- A drift-reporter runs on each Velvet startup: it diffs the manifest against the live `subscribers` table and writes a `velvet.subscribers.drift` audit row showing `manifest_only`, `db_only`, and `mismatched_fields` sets. The drift is informational, not corrective; operators decide whether to re-seed.
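The drift-reporter's diff can be sketched as plain set arithmetic. This is a minimal illustration, not the shipped implementation: the `DriftReport` class, the `compute_drift` function, and the choice to key both sides by subscriber name are assumptions made for the example. Only the three result names (`manifest_only`, `db_only`, `mismatched_fields`) come from this ADR.

```python
from dataclasses import dataclass, field


@dataclass
class DriftReport:
    """Hypothetical shape of one velvet.subscribers.drift audit row."""
    manifest_only: set[str] = field(default_factory=set)
    db_only: set[str] = field(default_factory=set)
    mismatched_fields: dict[str, set[str]] = field(default_factory=dict)


def compute_drift(manifest: dict[str, dict], table: dict[str, dict]) -> DriftReport:
    """Diff manifest entries against live subscriber rows, keyed by subscriber name."""
    report = DriftReport(
        manifest_only=set(manifest) - set(table),
        db_only=set(table) - set(manifest),
    )
    for name in set(manifest) & set(table):
        # Compare only the fields the manifest declares; columns that exist
        # solely in the table (for example expires_at) are not counted as drift.
        changed = {k for k, v in manifest[name].items() if table[name].get(k) != v}
        if changed:
            report.mismatched_fields[name] = changed
    return report
```

Because the report is informational, a caller would write it to the audit log and return, without touching either the manifest or the table.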
Security mitigations for the runtime endpoint
The OQ1 security concern from ADR-0040 stands; the mitigations make the runtime endpoint safe:
- Caller authentication: every registration request carries a per-caller scoped Velvet service token (D3, locked 2026-05-03). The middleware (V5/#912) verifies the token's identity against the authz table and rejects unknown callers.
- mTLS on `/api/v1/subscribers`: all registration traffic flows over a mutually authenticated TLS channel. The endpoint is internal-only and not reachable from the public internet. A Cloudflare Access service-token gate sits at the edge, with Velvet's own per-caller token at the app layer (defense in depth).
- Registration TTL with periodic re-register: every subscriber row carries `expires_at = registered_at + 24h`. Consumers re-register hourly via a side-effect of normal startup; rows past their TTL are flagged `stale` and excluded from rotation distribute. Stale rows persist for 7 days for audit visibility, then auto-prune.
- Allowlist of known consumer identities: the `velvet_caller_authz` table enumerates which callers may register subscribers for which token names. A console caller cannot register a subscriber for `STRIPE_RESTRICTED_KEY`, only for credentials it actually uses. The authz table is hand-managed (no runtime registration of callers themselves; that ladder ends).
- Audit on every registration: `velvet.subscriber.registered`, `velvet.subscriber.updated`, `velvet.subscriber.removed`, and `velvet.subscriber.expired` events. Tail the audit log to spot anomalies.
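Two of the mitigations above, the caller allowlist and the registration TTL, reduce to small pure functions. The sketch below is illustrative only. `CALLER_AUTHZ`, `may_register`, `row_state`, and the token name `CONSOLE_SESSION_KEY` are hypothetical. It also assumes that the 7-day retention window starts at `expires_at` and that a row counts as stale from the instant it reaches `expires_at`; the ADR does not pin down either boundary.

```python
from datetime import datetime, timedelta, timezone

REGISTRATION_TTL = timedelta(hours=24)  # expires_at = registered_at + 24h
STALE_RETENTION = timedelta(days=7)     # stale rows are kept for audit, then pruned

# Hypothetical stand-in for velvet_caller_authz: caller -> token names it may register for.
CALLER_AUTHZ: dict[str, set[str]] = {
    "console": {"CONSOLE_SESSION_KEY"},  # illustrative token name, not from the ADR
}


def may_register(caller: str, token_name: str) -> bool:
    """Allow a registration only if the caller is allowlisted for that token name."""
    return token_name in CALLER_AUTHZ.get(caller, set())


def row_state(registered_at: datetime, now: datetime) -> str:
    """Classify a subscriber row as 'active', 'stale', or 'prune'."""
    expires_at = registered_at + REGISTRATION_TTL
    if now < expires_at:
        return "active"   # eligible for rotation
    if now < expires_at + STALE_RETENTION:
        return "stale"    # excluded from rotation, retained for audit
    return "prune"        # past the retention window


registered = datetime(2026, 5, 3, 7, 0, tzinfo=timezone.utc)
print(row_state(registered, registered + timedelta(hours=25)))
print(may_register("console", "STRIPE_RESTRICTED_KEY"))
```

Unknown callers fall through to an empty set, so the default is to deny.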
Why hybrid (not pure-runtime)
A pure-runtime model has a chicken-and-egg failure mode: when Velvet itself first boots, the registry is empty. No consumer can register because no consumer knows Velvet exists yet. First rotation against a cold registry would silently distribute to nobody. The bootstrap manifest avoids this by seeding the initial set; runtime API takes over from there.
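The cold-start guard can be sketched as a seed step that runs only against an empty registry. This is a sketch under stated assumptions: `seed_if_empty` is a hypothetical helper, the registry is modelled as an in-memory dict rather than the `subscribers` table, and "first boot" is approximated as "the table has no rows".

```python
def seed_if_empty(table: dict[str, dict], manifest: dict[str, dict]) -> int:
    """Copy manifest entries into an empty registry; return the number of rows seeded."""
    if table:
        # Registry already populated: the table is the source of truth,
        # so the manifest is not re-applied.
        return 0
    for name, entry in manifest.items():
        table[name] = {**entry, "source": "manifest_seed"}
    return len(manifest)
```

Seeding only when the table is empty keeps later manifest edits from overwriting rows that consumers have since registered at runtime.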
Consequences
Positive:
- Drift between code and config is eliminated: consumers register from inside their own startup paths.
- New consumers onboard with code, not config; no separate manifest PR is needed.
- Removed consumers self-deregister or expire via TTL.
- Manifest stays useful for first-deploy and DR scenarios.

Negative:
- M1 implementation cost grows: NV1 (#945) now needs both a manifest loader AND a runtime registration handler with mTLS termination. Estimated +1 to 2 days.
- The drift-reporter is an additional moving piece that itself needs monitoring (drift between drift-reporter expectations and reality is a real meta-problem).
- Per-caller authz table is hand-managed, which adds operator overhead when onboarding new caller classes.
Migration path:
- M1 ships the manifest loader first (NV1 baseline).
- M1.5 ships the runtime API + drift reporter (additive; no removal of manifest support).
- Existing subscribers from the manifest are auto-imported on first M1.5 boot, marked source: manifest_seed.
- Subscribers re-registering via runtime API update their row to source: runtime_api.
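The `source` transition in the last two steps can be sketched as an upsert. The `register` function and the in-memory dict standing in for the `subscribers` table are assumptions made for illustration. The `source` values, the 24h TTL, and the event names are taken from this ADR.

```python
from datetime import datetime, timedelta, timezone


def register(table: dict[str, dict], name: str, fields: dict, now: datetime) -> str:
    """Upsert a subscriber row via the runtime path and return the audit event name."""
    existing = table.get(name)
    event = "velvet.subscriber.updated" if existing else "velvet.subscriber.registered"
    table[name] = {
        **(existing or {}),
        **fields,
        "source": "runtime_api",  # overwrites manifest_seed on first re-register
        "registered_at": now,
        "expires_at": now + timedelta(hours=24),
    }
    return event


now = datetime(2026, 5, 3, 8, 0, tzinfo=timezone.utc)
table = {"console": {"token_name": "X", "source": "manifest_seed"}}
print(register(table, "console", {}, now), table["console"]["source"])
```

A row that never re-registers keeps `source: manifest_seed`, which makes seeded-but-silent consumers easy to pick out.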
Open work
- NV1 (#945) body needs revision to reflect this hybrid model. Add manifest loader + runtime registration as two complementary features.
- The `velvet_caller_authz` table needs a separate sub-card; not part of NV1.
- The drift reporter needs a separate sub-card; could fold into NV1 or be a tiny standalone NV1.5.
Decision matrix considered
| Option | Pros | Cons |
|---|---|---|
| Pure static manifest (ADR-0040) | Smallest attack surface; everything reviewable in PRs | Drifts catastrophically; cold-start consumers absent from rotation |
| Pure runtime API | Self-healing, no drift | Cold-start with empty registry; no DR seed; no source-control history |
| Hybrid (this ADR) | Bootstrap from manifest, durable via API, drift visible via reporter | More code to ship in M1; operator overhead on authz table |
References
- Memory: `project_velvet_v2_design.md` (locked OQ1 answer 2026-05-03 ~07:00 UTC)
- ADR-0040 (superseded): retained in tree for history; carries Status: Superseded by [ADR-0041](https://internal-docs.raxx.app/architecture/adr/0041-velvet-runtime-registration-supersedes-0040.html)
- Architect's design doc: `docs/architecture/velvet/v2-rotation-flows.md`
- Cards: #945 (NV1) needs revision per this ADR; #910 schema needs `subscribers.expires_at` + `source` columns