ADR-0071: Stripe Billing Tables — Queue as the Authoritative Store
Status: Accepted
Date: 2026-05-11 UTC (v2 revised from 2026-05-10 Proposed; v3 addendum 2026-05-11 UTC)
Supersedes: 0071-stripe-billing-console-db-vs-hybrid.md (original framing: Console-DB-vs-Raptor-DB Hybrid)
Refs: #405 (parent card), docs/architecture/stripe-customer-billing.md
Context
The Console billing design (card #405) introduces three new tables: billing_customer, billing_subscription, and billing_invoice. These tables are populated from Stripe webhooks and support the operator billing console.
The original design doc (2026-05-10) recommended placing the authoritative record in Console-DB (Heroku Postgres, Alembic-managed, operator-facing), with a PII-free mirror in Raptor-DB. That framing predated the operator's 2026-05-09 decision locking Queue as the owner of all customer identity, sessions, RBAC, and audit.
The operator surfaced the conflict in a PR #1604 comment (2026-05-11 ~04:26 UTC): "Didn't we break out another service for console and raptor. Should we have billing on either of these two?"
This ADR documents the revised decision: Queue is the authoritative store. The original ADR file (0071-stripe-billing-console-db-vs-hybrid.md) is superseded and should be treated as historical only.
Decision
Queue-DB holds the authoritative Stripe billing record. Raptor holds a PII-free 4-column mirror for paywall enforcement. Console is a pure reader of Queue's HTTP API — no billing schema in Console-DB.
| Service | Role |
|---|---|
| Queue-DB | Authoritative: billing_customer, billing_subscription, billing_invoice, audit/dedup tables |
| Raptor-DB | PII-free mirror: billing_subscription_mirror (4 columns, updated by Queue fan-out) |
| Console | Reader: calls Queue /api/internal/billing/*; no local billing schema |
v3 Addendum: Fail-closed on JIT mirror check (2026-05-11 UTC)
Decision: When Raptor's paywall JIT check finds no billing_subscription_mirror row for an authenticated customer, Raptor returns 402 Payment Required (fail-closed) rather than granting access (fail-open).
Context: Operator review (PR #1604, 2026-05-11) added the JIT mirror check as an explicit failure mode. The architect must choose a default: fail-closed (deny access when mirror row is absent) or fail-open (grant access when mirror row is absent, assuming sync lag).
Reasoning for fail-closed:
A missing mirror row is a data-sync anomaly for a subscriber, not a normal state. Valid customers who have subscribed and been processed by Queue will have a mirror row; for such a customer, the only condition that produces a missing row is a sync failure between Queue and Raptor. Raptor cannot tell that case apart from a genuine non-subscriber, so two scenarios must be weighed:
- Customer IS subscribed, mirror row is missing (sync lag). Result of fail-closed: customer temporarily loses access (false negative). Operator console detects the anomaly via Sentry CRIT alert; mirror is resynchronized; access is restored within minutes.
- Customer is NOT subscribed (e.g., fraudulent session token or subscription was just canceled), mirror row correctly absent. Result of fail-open: customer gains unauthorized access (false positive = paywall bypass). This is a revenue and compliance failure.
Fail-open prioritizes availability at the cost of the paywall's integrity. For a trading platform where subscription tier gates live-execution features, a paywall bypass is the more severe failure mode. Fail-closed causes a recoverable support ticket; fail-open causes an undetected paywall breach.
Consequences of fail-closed:
- Positive: paywall cannot be bypassed via a sync-lag window. Security and revenue model integrity are preserved.
- Negative: customers with valid subscriptions may receive a transient 402 if mirror sync is lagged. This is mitigated by: (a) the Sentry CRIT alert that fires on every missing-row hit, (b) the nightly reconciler that closes mirror gaps, (c) the short fan-out latency under normal conditions.
- Sentry alert billing.mirror.missing_row must fire on EVERY fail-closed occurrence, not just the first. The operator uses this signal to detect any mirror-sync degradation before it affects more customers.
Revisit condition: If fail-closed produces more than 5 customer-reported access denials per week in production, the team should re-evaluate. Exceeding that threshold would point to systemic mirror-sync issues that need architectural attention rather than a flip to fail-open.
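The fail-closed rule in this addendum can be illustrated with a minimal sketch. It is not Raptor's actual handler: the `MirrorRow` shape, the `"active"` status value, the in-memory `mirror` mapping, and the `alert` callback are all hypothetical stand-ins, since this ADR does not list the four mirror columns or the alerting API. Treating a non-active row as a 402 is likewise an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional

HTTP_OK = 200
HTTP_PAYMENT_REQUIRED = 402


@dataclass(frozen=True)
class MirrorRow:
    # Hypothetical shape: the ADR does not enumerate the mirror columns.
    customer_id: str
    status: str


def jit_paywall_check(
    customer_id: str,
    mirror: Mapping[str, MirrorRow],
    alert: Callable[[str, str], None],
) -> int:
    """Return an HTTP status code for an already-authenticated customer."""
    row: Optional[MirrorRow] = mirror.get(customer_id)
    if row is None:
        # Fail-closed: a missing row is never assumed to be sync lag.
        # The alert fires on every occurrence, not only the first one.
        alert("billing.mirror.missing_row", customer_id)
        return HTTP_PAYMENT_REQUIRED
    if row.status != "active":  # assumed status value
        return HTTP_PAYMENT_REQUIRED
    return HTTP_OK
```

Passing the alert sink in as a callable keeps the sketch testable without a Sentry client; a real handler would presumably wire this to whatever emits the CRIT alert.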
Consequences
Positive:
- Billing records co-locate with the customer identity record they describe. A Stripe customer IS a Queue customer. Splitting them across services would require a sync relationship between the authoritative customer row (Queue) and a secondary billing record (Console) — unnecessary coupling.
- Queue already owns RBAC primitives. queue-billing-read, queue-billing-write, and queue-billing-mutate permissions fit naturally into the existing <app>-<resource>-<level> naming convention.
- All billing PII is isolated in Queue-DB. Raptor's mirror has no PII. Console has no billing schema. The PII surface is minimal and auditable at a single service.
- Queue outage does not block Raptor's paywall checks — the PII-free mirror is available locally in Raptor even while Queue is down. Raptor's availability is not contingent on Queue's availability on the hot request path.
- Console operator views are naturally fresh — Console reads Queue live at render time, so there is no risk of Console showing stale data from a local cache that has drifted from Queue.
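As an illustration of the <app>-<resource>-<level> permission naming convention mentioned above, the sketch below splits a permission name into its three parts. The `Permission` type and `parse_permission` helper are hypothetical, and the sketch assumes that none of the three components contains a hyphen; Queue's real RBAC primitives may work differently.

```python
from typing import NamedTuple


class Permission(NamedTuple):
    app: str
    resource: str
    level: str


def parse_permission(name: str) -> Permission:
    """Split '<app>-<resource>-<level>' into its parts.

    Assumes no component contains a hyphen (an assumption of this sketch).
    """
    parts = name.split("-")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not an <app>-<resource>-<level> name: {name!r}")
    return Permission(*parts)


billing_permissions = [
    parse_permission(n)
    for n in ("queue-billing-read", "queue-billing-write", "queue-billing-mutate")
]
```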
Negative:
- Queue gains a Stripe webhook surface it did not previously have. STRIPE_RESTRICTED_KEY and STRIPE_WEBHOOK_SECRET must be provisioned in Infisical under /Raxx/Queue/Billing/Stripe/.
- The fan-out from Queue to Raptor's mirror endpoint introduces an async sync step. The mirror can be stale by up to 24h if fan-out fails (bounded by the nightly reconciler), so a canceled subscription could remain active in the mirror for that window.
- Two migration chains: Queue Alembic for the authoritative tables, Raptor Alembic for the mirror table. The Raptor migration is blocked on epic #1556 completing.
- Sub-cards #406–#409 were originally targeted at area:console. They must be retargeted to area:queue. PM action required.
- Console admin views that previously would have read a local Console-DB table now require a service-to-service call to Queue. Console must handle Queue degradation gracefully (surface an error state rather than stale data).
- Fail-closed on missing mirror row (v3 addendum) may cause transient 402 errors for valid customers during sync-lag windows.
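The role of the nightly reconciler in bounding mirror staleness can be sketched as a comparison between Queue's authoritative state and Raptor's mirror. This is a hypothetical planning step only: the function name, the customer-id-to-status dictionaries, and the three result buckets are assumptions for illustration, not the reconciler's actual design.

```python
from typing import Dict, List, Tuple


def plan_reconciliation(
    authoritative: Dict[str, str],
    mirror: Dict[str, str],
) -> Tuple[List[str], List[str], List[str]]:
    """Compare customer_id -> status maps and list what needs repair.

    Returns (missing, stale, orphaned) customer ids:
      missing  - in Queue but absent from the mirror (would trigger a 402)
      stale    - present in both, but the mirror status differs
      orphaned - in the mirror but no longer in Queue
    """
    missing = sorted(c for c in authoritative if c not in mirror)
    stale = sorted(
        c for c in authoritative if c in mirror and mirror[c] != authoritative[c]
    )
    orphaned = sorted(c for c in mirror if c not in authoritative)
    return missing, stale, orphaned
```

A canceled subscription that the mirror still shows as active would land in the `stale` bucket, which is the case this section flags as remaining open until the reconciler runs.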
Alternatives Considered
Console-DB as authority (original Hybrid framing, 2026-05-10)
The original design placed the authoritative record in Console-DB with a 4-column PII-free mirror in Raptor. Console-DB was selected because it was the natural home for operator admin data and was already Alembic-managed.
Rejected because:
- Console is an admin plane. It does not own customer identity. Queue owns the canonical customer row per the 2026-05-09 operator decision. Placing billing in Console would create two separate services (Console and Queue) each claiming partial authority over a single customer entity, requiring sync between them.
- Console-DB-as-authority inverts the dependency: Queue would need to call Console to know a customer's billing status, or Console would need to push billing state into Queue. Either direction is wrong. Queue should be the source of truth for all customer-facing state.
Raptor-DB as authority
All billing tables in Raptor-DB. Console queries Raptor's DB or a Raptor API for admin views.
Rejected because:
- Billing PII (customer email, address, last4) enters the trading app DB. Raptor already has a complex migration story (epic #1556). Adding billing PII expands the PII surface of the customer-facing service unnecessarily.
- Raptor should remain focused on trading operations. Giving it a Stripe webhook surface, PII fields, and billing admin logic crosses the service boundary the architecture is trying to maintain.
Console-DB only (no Raptor mirror)
All billing in Console-DB; Raptor calls Console synchronously on every paywall check.
Rejected because:
- Adds synchronous latency to every Raptor API call that gates on plan tier. Console outage = Raptor paywall failure = potential customer lockout or unintended fail-open. This coupling is unacceptable at v1.
Fail-open on missing mirror row
Grant access when billing_subscription_mirror row is absent, assuming the absence reflects sync lag rather than a non-subscriber.
Rejected because:
- A missing mirror row cannot be distinguished from a non-subscriber at the Raptor level without a synchronous Queue call (which defeats the mirror's purpose). Fail-open grants access to both cases, meaning any mirror-sync gap is a paywall bypass window. The integrity failure mode is worse than the availability failure mode for this product.
Notes
The 4-column PII-free mirror in Raptor (billing_subscription_mirror) is common to both the original Hybrid design and this revised design. The difference is only in where the authoritative record lives: Queue-DB (this ADR) vs. Console-DB (original). The mirror table definition, LWW guard, and fan-out mechanics are unchanged.
The v3 addendum (fail-closed on JIT check) does not change the authority model — it only specifies Raptor's behavior when the mirror is in an unexpected state.
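For readers unfamiliar with the LWW (last-write-wins) guard referenced in these notes, the following is a generic sketch of the idea, not the mirror's actual implementation. The `MirrorUpdate` fields, the integer `version` ordering key, and the in-memory dictionary are hypothetical; the real guard may compare a timestamp or an event sequence inside the database.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class MirrorUpdate:
    customer_id: str
    status: str
    version: int  # hypothetical ordering key (event time or sequence number)


def apply_update(mirror: Dict[str, MirrorUpdate], update: MirrorUpdate) -> bool:
    """Apply a fan-out update only if it is newer than the stored row.

    Returns True when the mirror changed, False when the update was
    discarded as out-of-order or a duplicate delivery.
    """
    current = mirror.get(update.customer_id)
    if current is not None and current.version >= update.version:
        return False  # an equal or newer write already won
    mirror[update.customer_id] = update
    return True
```

With a guard of this kind, a delayed "active" event arriving after a newer "canceled" event is dropped rather than re-opening access.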