ADR-0073: Stripe Billing v1 Implementation Home — Raptor Stopgap

Status: Superseded by ADR-0075 (operator override 2026-05-11 ~21:08 UTC)
Date: 2026-05-11 UTC
Refs: #405 (parent billing design card), #403 (billing console epic), docs/architecture/stripe-customer-billing.md (v3 design), ADR-0071 (Queue-as-authority), ADR-0065 (Queue strangler-fig), ADR-0075 (override)
Supersedes: None. This ADR does not supersede ADR-0071; it defers it.


Operator override 2026-05-11 ~21:08 UTC

"Billing is going to Queue, if it means slippage, so be it. Do not put billing in stopgap mode. Go ahead and wait for full onboarding with stripe once queue is built."

This ADR's Path B recommendation (Raptor stopgap) is rejected. The architectural integrity of Queue-as-authority (ADR-0071) outweighs the launch-deadline pressure. v1 launch proceeds without in-product paid signup; manual Stripe dashboard ops cover early customers until Queue + billing land. See ADR-0075 for the override + revised plan.

The analysis below is preserved for historical record but should not drive new work.


Context

ADR-0071 and stripe-customer-billing.md v3 (2026-05-11 UTC) target Queue as the authoritative home for billing tables (billing_customer, billing_subscription, billing_invoice, and supporting tables). That decision is architecturally correct.

The problem is timing. Queue does not exist as code. The queue/ directory is not in the repository. No Queue Alembic migration chain has been initialized. The Queue Phase 1 design (5 dev-days) + Phase 2 cutover (3 dev-days) together total ~8 dev-days of work that must complete before a single billing table can land. The v1 launch deadline is 2026-05-23 UTC, leaving 12 days. If any Queue Phase 1 work blocks on review, CI, or coordination, the margin collapses.

The three options under consideration:

| Path | Description |
|------|-------------|
| A | Build minimal Queue first, then billing on top |
| B | Build Stripe chain in Raptor as v1 stopgap; migrate to Queue post-launch |
| C | Defer billing entirely; charge manually via Stripe dashboard at launch |

Decision

Path B: Stripe billing tables land in Raptor's Postgres for v1. Migration to Queue is the planned post-launch epic.

The v3 schema is unchanged. The tables are portable. The decision is about physical hosting, not schema or logic.


Path A Cost — Why Not Now

"Minimal Queue" for billing is not trivial. The queue design (ADR-0065) calls for Queue to be a strangler-fig co-located with Raptor (same dyno, same Postgres DB, queue_ namespace prefix on tables). This is architecturally better than a separate app, but it still requires:

  1. queue/ directory scaffold (Flask app factory, blueprints, services, middleware)
  2. Queue DB migrations: queue_customers, queue_webauthn_credentials, queue_sessions, queue_email_verifications, queue_backup_codes, queue_webauthn_challenges, queue_customer_roles — the full core schema, even if billing is the only active feature
  3. Service-to-service auth middleware (QUEUE_SERVICE_TOKEN_* from env)
  4. JWT mint + RS256 signing infrastructure (QUEUE_JWT_SIGNING_KEY)
  5. Blueprint mount in Raptor's app factory behind FLAG_QUEUE_V1
  6. CI test harness extended to cover Queue's blueprints
  7. Infisical secret paths provisioned: /Raxx/Queue/Billing/Stripe/
  8. Then — and only then — the billing migration can reference Queue's queue_customers.id as the FK anchor

The Queue migration plan estimates Phase 1 at 5 dev-days and Phase 2 (cutover) at 3 dev-days. Billing on top is another 3–5 dev-days (service layer, webhook handler, mirror table, JIT check, console views). Realistic total: 11–13 dev-days.

With 12 days to v1, this requires zero days of slip in Queue scaffolding, zero review latency, and zero bugs in the migration chain. That is not a realistic assumption. Any slip means launching with no billing at all, which is worse than an explicit stopgap.
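The schedule arithmetic behind that claim can be checked directly. This is a worked sketch using only the figures quoted in this ADR; it assumes a single serial work stream and counts every calendar day as a dev-day, which is optimistic.

```python
from datetime import date

# Dev-day figures quoted in this ADR.
QUEUE_PHASE_1 = 5                 # Queue Phase 1
QUEUE_PHASE_2 = 3                 # Queue Phase 2 (cutover)
BILLING_LOW, BILLING_HIGH = 3, 5  # billing built on top of Queue

adr_date = date(2026, 5, 11)
launch_date = date(2026, 5, 23)
days_to_launch = (launch_date - adr_date).days

path_a_low = QUEUE_PHASE_1 + QUEUE_PHASE_2 + BILLING_LOW
path_a_high = QUEUE_PHASE_1 + QUEUE_PHASE_2 + BILLING_HIGH

# Assumption: one serial work stream, every calendar day counted as a dev-day.
slack_best_case = days_to_launch - path_a_low
slack_worst_case = days_to_launch - path_a_high

print(f"days to launch: {days_to_launch}")
print(f"Path A estimate: {path_a_low}-{path_a_high} dev-days")
print(f"slack: {slack_worst_case} to {slack_best_case} days")
```

Even the low end of the estimate leaves a single day of slack, and the high end overruns the launch date.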

Path A's "architectural purity" argument fails on its own terms: cutting corners in Queue's core identity infrastructure to make a billing deadline would produce the kind of auth/session bugs that are far more expensive to fix post-launch than a planned schema migration. Queue deserves to be built correctly.


Path C Cost — Why Not Defer

Path C (no billing at launch) is operationally viable only if every v1 customer is free or operator-comped. The launch plan includes paid tiers. Without an in-product signup flow, each paid customer would have to be charged manually via the Stripe dashboard.

Path C defers billing by definition, but it is not a clean deferral: it is an ops liability that grows with every paid customer onboarded manually. The schema and logic are already designed (v3 is ready), so the only argument against Path C is that the work exists and can ship. Path C is the right answer only if Path B has technical blockers that Path C avoids. It does not.


Path B Rationale

The v3 billing schema does not require Queue to exist. The billing_customer table has a queue_customer_id foreign-reference column, but that column is TEXT (UUID) — not a hard FK into a Queue-owned table. It is a cross-service reference that Queue will eventually own, but Raptor's Postgres can hold the text field with an index regardless of whether Queue has been provisioned.
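A minimal sketch of what a soft reference looks like in practice, using SQLite from the standard library as a stand-in for Raptor's Postgres. Only billing_customer and queue_customer_id come from the design; the other column, the index name, and the helper functions are hypothetical. With no database-level FK, the application has to validate the value itself.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is on, yet no FK is declared below
conn.execute(
    """
    CREATE TABLE billing_customer (
        id INTEGER PRIMARY KEY,
        queue_customer_id TEXT NOT NULL
    )
    """
)
# Plain index on the text column; there is no REFERENCES clause anywhere.
conn.execute(
    "CREATE INDEX ix_billing_customer_queue_customer_id "
    "ON billing_customer (queue_customer_id)"
)


def is_uuid(value):
    """Return True if `value` parses as a UUID string."""
    try:
        uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    return True


def insert_billing_customer(conn, queue_customer_id):
    """Application-level guard standing in for the missing DB constraint."""
    if not is_uuid(queue_customer_id):
        raise ValueError(f"not a UUID: {queue_customer_id!r}")
    with conn:
        cur = conn.execute(
            "INSERT INTO billing_customer (queue_customer_id) VALUES (?)",
            (queue_customer_id,),
        )
    return cur.lastrowid


row_id = insert_billing_customer(conn, str(uuid.uuid4()))
```

The guard only checks the shape of the value. It cannot confirm that the referenced customer exists, which is the gap a hard FK would close.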

What actually changes in Path B vs the v3 design:

| Dimension | v3 design (Queue-as-authority) | v1 Path B (Raptor stopgap) |
|-----------|--------------------------------|----------------------------|
| Table location | queue/migrations/ | backend_v2/db/migrations/ |
| FK anchor | queue_customers.id hard FK | Text UUID reference (soft reference, no DB FK) |
| Webhook handler | queue/blueprints/billing_webhook.py | backend_v2/api/routes/billing_webhook.py |
| Stripe service layer | queue/services/stripe_client.py | backend_v2/api/services/stripe_client.py |
| Internal billing API | queue/api/internal/billing/* | Raptor internal routes |
| Mirror table | Raptor mirror of Queue-DB | Self-mirror (same DB; billing_subscription_mirror is redundant and skipped for v1) |
| Console access pattern | Queue HTTP API | Raptor HTTP API |
| Secrets path | /Raxx/Queue/Billing/Stripe/ | /Raxx/Raptor/Billing/Stripe/ (or the operator migrates the path at Queue cutover, which is low-effort) |
| Alembic chain | Queue's chain | Raptor's existing chain (next available revision after #1556 Postgres migration) |

The schema content is identical. billing_customer, billing_subscription, billing_invoice, processed_stripe_events, billing_action_log, billing_reconcile_log — same columns, same indexes, same invariants.

The mirror table simplification: In the v3 design, Queue fans out to Raptor's billing_subscription_mirror because Queue and Raptor are separate authorities. In the Path B stopgap, billing lives in Raptor's Postgres already — the JIT paywall check reads billing_subscription directly (or a view) without a cross-service fan-out. The mirror table concept survives in the post-Queue migration because Queue will eventually need to push state to Raptor. For v1, the fan-out is an intra-process query.
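As an illustration of the direct read, the sketch below queries billing_subscription in the same database with no mirror or cross-service call. The column names, the has_paid_access helper, and the choice of which Stripe subscription statuses grant access are all assumptions; the v3 design document is the authority on the real schema. SQLite stands in for Postgres.

```python
import sqlite3

# Assumption: these two Stripe subscription statuses grant paid access.
ENTITLED_STATUSES = ("active", "trialing")


def has_paid_access(conn, queue_customer_id):
    """JIT paywall check: a single same-database read."""
    placeholders = ", ".join("?" for _ in ENTITLED_STATUSES)
    row = conn.execute(
        "SELECT 1 FROM billing_subscription "
        f"WHERE queue_customer_id = ? AND status IN ({placeholders}) LIMIT 1",
        (queue_customer_id, *ENTITLED_STATUSES),
    ).fetchone()
    return row is not None


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE billing_subscription ("
    "id INTEGER PRIMARY KEY, queue_customer_id TEXT NOT NULL, status TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO billing_subscription (queue_customer_id, status) VALUES (?, ?)",
    [("cust-a", "active"), ("cust-b", "canceled")],
)
conn.commit()
```

A customer with no subscription row is treated the same as one with a lapsed subscription: no access.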

What does not change: Every invariant in stripe-customer-billing.md §2 holds. No stored credentials. PII fields identical. Audit trail via billing_action_log with KMS HMAC chain. GDPR erasure path identical. Kill-switch flag (FLAG_BILLING_RAPTOR_API) isolates billing routes. Paper-first gating unaffected.
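One way the kill switch could isolate billing routes is a guard evaluated on every request. The sketch is framework-free for brevity; flag_enabled, billing_route, the accepted truthy spellings, the default when the flag is unset, and the 404 response are all hypothetical. Only the flag name comes from this ADR.

```python
import os
from functools import wraps

TRUTHY = {"1", "true", "yes", "on"}


def flag_enabled(name, env=None, default=False):
    """Read a boolean feature flag from the environment (or a supplied mapping)."""
    source = os.environ if env is None else env
    raw = source.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUTHY


def billing_route(handler, env=None):
    """Wrap a handler so it answers 404 whenever the kill switch is off."""
    @wraps(handler)
    def guarded(*args, **kwargs):
        # Evaluated per call, so flipping the flag takes effect without a redeploy.
        if not flag_enabled("FLAG_BILLING_RAPTOR_API", env):
            return 404, {"error": "not found"}
        return handler(*args, **kwargs)
    return guarded


def get_invoice(invoice_id):
    return 200, {"invoice_id": invoice_id}


demo_env = {"FLAG_BILLING_RAPTOR_API": "true"}
guarded_get_invoice = billing_route(get_invoice, env=demo_env)
```

Defaulting to off when the flag is unset is a fail-closed choice made for this sketch, not something the ADR specifies.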


Post-Launch Migration Plan (Path B → Queue)

The migration from Raptor to Queue is the explicit post-v1 epic. Prerequisites:

  1. Queue Phase 1 complete (queue/ directory, core tables, API surface, FLAG_QUEUE_V1 live)
  2. Queue Phase 2 complete (Raptor + Antlers auth fully on Queue, queue_customers is the canonical identity table)

Migration steps:

  1. Create Queue billing migration: queue/migrations/versions/queue_0010_stripe_billing_tables.py — this is the v3 migration, unchanged. Tables land in Queue's schema namespace.
  2. Backfill script: queue/ops/backfill_billing_from_raptor.py — copies rows from Raptor's billing_* tables into Queue's tables with row-count validation and idempotency guard. Runs in one transaction; exits non-zero on count mismatch.
  3. Dual-write period: after Queue billing tables exist, Raptor's webhook handler writes to both Raptor-DB and Queue-DB (flag: FLAG_BILLING_DUAL_WRITE). Stripe webhooks continue to land on Raptor's endpoint during transition.
  4. Console retargeting: Console operator views switch from Raptor billing API to Queue's /api/internal/billing/* (a URL change, not a data-shape change).
  5. Cutover: FLAG_BILLING_RAPTOR_API=false disables Raptor billing routes. Stripe webhook endpoint is updated in the Stripe dashboard to Queue's URL.
  6. Raptor cleanup: billing_* tables remain in Raptor-DB for 30 days (read-only), then dropped via migration.
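The backfill in step 2 could take roughly the following shape. This is a sketch, not the planned script: two in-memory SQLite databases stand in for Raptor-DB and Queue-DB, the column list is hypothetical, and a single table is copied where the real script would cover every billing_* table. It shows the three properties the step names: an idempotency guard, row-count validation, and a single transaction that rolls back on mismatch.

```python
import sqlite3


class CountMismatch(Exception):
    """Destination row count differs from the source after the copy."""


def backfill_table(source, dest, table, columns):
    # `table` and `columns` are trusted constants here, never user input.
    col_list = ", ".join(columns)
    marks = ", ".join("?" for _ in columns)
    rows = source.execute(f"SELECT {col_list} FROM {table}").fetchall()
    with dest:  # one transaction: commit on success, roll back on exception
        # INSERT OR IGNORE is the idempotency guard (SQLite syntax;
        # Postgres would use ON CONFLICT DO NOTHING).
        dest.executemany(
            f"INSERT OR IGNORE INTO {table} ({col_list}) VALUES ({marks})", rows
        )
        copied = dest.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
        if copied != len(rows):
            raise CountMismatch(f"{table}: source={len(rows)} dest={copied}")
    return copied


def run(source, dest):
    """Return a process exit code: 0 on success, 1 on count mismatch."""
    try:
        backfill_table(source, dest, "billing_customer", ("id", "queue_customer_id"))
    except CountMismatch as exc:
        print(f"backfill aborted: {exc}")
        return 1
    return 0


DDL = "CREATE TABLE billing_customer (id TEXT PRIMARY KEY, queue_customer_id TEXT NOT NULL)"
raptor_db = sqlite3.connect(":memory:")
queue_db = sqlite3.connect(":memory:")
for db in (raptor_db, queue_db):
    db.execute(DDL)
raptor_db.executemany(
    "INSERT INTO billing_customer VALUES (?, ?)", [("bc_1", "q-1"), ("bc_2", "q-2")]
)
raptor_db.commit()

first_run = run(raptor_db, queue_db)
second_run = run(raptor_db, queue_db)  # re-running copies nothing new
```

A script entry point would pass the result of run() to sys.exit so that a mismatch surfaces as a non-zero exit status.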

Estimated effort: 3–5 dev-days. No customer-visible impact. Zero-downtime via dual-write + flag cutover.
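The dual-write period can be sketched as a webhook handler that always writes to Raptor-DB and additionally writes to Queue-DB when the flag is on. The handler shape and the column names are assumptions; processed_stripe_events is the table named in the v3 schema, and the event `id` and `type` fields follow Stripe's event object. Keying both writes on the event ID keeps redelivered webhooks harmless.

```python
import sqlite3

DDL = (
    "CREATE TABLE processed_stripe_events ("
    "event_id TEXT PRIMARY KEY, event_type TEXT NOT NULL)"
)


def record_event(conn, event):
    """Idempotent insert keyed on the Stripe event ID. Returns True if newly written."""
    with conn:
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_stripe_events (event_id, event_type) "
            "VALUES (?, ?)",
            (event["id"], event["type"]),
        )
    return cur.rowcount == 1


def handle_webhook(event, raptor_db, queue_db, dual_write):
    """`dual_write` carries the value of FLAG_BILLING_DUAL_WRITE."""
    result = {"raptor": record_event(raptor_db, event)}  # Raptor stays primary
    if dual_write:
        result["queue"] = record_event(queue_db, event)
    return result


raptor_db = sqlite3.connect(":memory:")
queue_db = sqlite3.connect(":memory:")
for db in (raptor_db, queue_db):
    db.execute(DDL)

event = {"id": "evt_demo_1", "type": "invoice.paid"}
before_flag = handle_webhook(event, raptor_db, queue_db, dual_write=False)
after_flag = handle_webhook(event, raptor_db, queue_db, dual_write=True)
```

The second call shows why the backfill has to precede or accompany the flag flip: events handled before dual-write was enabled exist only in Raptor-DB until something copies them across.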

Risk at migration time: Low. The schema is identical. The data volume at launch will be small (pre-scale v1 customers). The FK softness (text UUID vs hard FK) is resolved when Queue's queue_customers table exists and the Queue billing_customer row is inserted with queue_customer_id = the already-known Queue customer UUID.
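Before the soft reference is tightened into a hard FK, it would be prudent to confirm that every stored queue_customer_id matches a row in queue_customers. The check below is a hypothetical pre-migration query, not part of the plan above; the table layouts are reduced to the two columns involved and SQLite stands in for Postgres.

```python
import sqlite3


def find_orphaned_references(conn):
    """Return queue_customer_id values that have no matching queue_customers row."""
    rows = conn.execute(
        """
        SELECT bc.queue_customer_id
        FROM billing_customer AS bc
        LEFT JOIN queue_customers AS qc ON qc.id = bc.queue_customer_id
        WHERE qc.id IS NULL
        ORDER BY bc.queue_customer_id
        """
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE queue_customers (id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE billing_customer (id TEXT PRIMARY KEY, queue_customer_id TEXT NOT NULL)"
)
conn.execute("INSERT INTO queue_customers VALUES ('q-1')")
conn.executemany(
    "INSERT INTO billing_customer VALUES (?, ?)",
    [("bc_1", "q-1"), ("bc_2", "q-missing")],
)
conn.commit()

orphans = find_orphaned_references(conn)
```

An empty result means the hard FK can be added without rejecting existing rows; anything else needs reconciling first.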


Honest Accounting of Path B Debt

This section exists because the architectural debt of Path B should not be papered over.

What we are explicitly accepting:

When this debt comes due: At Queue Phase 2 completion. That is the earliest the billing migration can run. If Queue Phase 2 slips past 3 months post-launch, the debt compounds: more customer rows to backfill, longer dual-write window, more Raptor–Queue divergence risk. The migration epic should be scheduled before Queue Phase 2 is one month out.


Alternatives Considered

Path A: Build minimal Queue first

Rejected — see "Path A Cost" above. 11–13 dev-days of combined Queue + billing work against a 12-day launch window with zero slip budget. The risk is that a blocked Queue scaffold delays billing to post-launch anyway, but without the explicit plan of Path B. Architectural purity is not an argument that survives a launch date.

The more important point: Queue should not be built under schedule pressure. Queue is the identity/session/RBAC service for all customers. If Queue is rushed to host billing, the auth bugs that result are worse than a Raptor billing stopgap.

Path C: Defer billing entirely

Rejected — see "Path C Cost" above. The schema and logic exist. The Stripe test keys are in vault at /MooseQuest/stripe/. Path C produces ongoing manual ops work that grows with customer count. Path B ships the billing chain and eliminates that ops liability.

Path B variant: Queue strangler-fig, billing in Queue namespace anyway

Considered: use Queue's strangler-fig pattern (ADR-0065) to mount Queue as a blueprint inside Raptor, and land billing in the Queue blueprint even before the full Queue Phase 1 is complete — i.e., skip Queue's auth/session infrastructure and only scaffold the billing blueprint.

Rejected because this creates a partial Queue that has billing tables but no queue_customers table, no session infrastructure, and no RBAC. The billing_customer.queue_customer_id FK would reference a nonexistent table. The result is a Queue-shaped API endpoint backed by Raptor identity primitives — which is exactly what Raptor already is. The only gain is a queue/blueprints/ directory instead of backend_v2/api/routes/; the actual technical risk is the same as Path B with additional scaffolding overhead.


Consequences

Positive:

  - Billing chain (#406–#409, #1630–#1635) can be claimed by feature-developer immediately after this ADR lands. No Queue prerequisite.
  - The v3 schema ships unchanged. Zero rework when Queue migration runs.
  - Launch on 2026-05-23 UTC with a working paid signup flow.
  - Queue is built correctly and on its own timeline post-launch.

Negative:

  - Billing PII temporarily in Raptor-DB, violating ADR-0071's separation principle for the v1 period. This is the main cost.
  - Soft FK on billing_customer.queue_customer_id requires application-level guard during v1.
  - Post-launch migration epic is real engineering work (3–5 dev-days) that must be scheduled before Queue Phase 2 closes.
  - Two migration chains will both touch billing tables. The Queue billing migration must be written with awareness of the Raptor rows it replaces.


Action Items for PM

  1. Retarget #406, #407, #408, #409, #1630, #1631, #1632, #1633, #1635 from area:queue to area:raptor (or area:backend-v2). Feature-developer can claim these cards now.
  2. File a post-launch Queue billing migration epic linking to this ADR.
  3. Ensure the Queue Phase 2 timeline includes "schedule billing migration" as a prerequisite gate before Phase 2 is declared complete.