Queue Stripe Webhook + Billing Layer — Design
Status: Design v1 — ready for implementation
Date: 2026-05-14 UTC
Author: software-architect
Refs: #1682 (webhook handler), #1632 (founders backfill), #2003 (pre-launch punch list), ADR-0076, ADR-0075, ADR-0071
Parent docs: docs/architecture/stripe-customer-billing.md v3 (schema authority), docs/architecture/queue/queue-phase1-design.md (service layout), docs/architecture/stripe-billing-gap-analysis-2026-05-13.md (gap inventory)
ADR produced: ADR-0088 (idempotency key strategy)
1. Context
Queue (C++) is the authoritative billing service. As of 2026-05-13 UTC the foundation layers are shipped: schema (6 tables via sqitch), Stripe C++ service layer, mirror-sync endpoint, reconcile endpoint, billing CRUD read endpoints, and GoogleTest scaffolding. The single remaining Queue-side blocker for paid subscriptions is the Stripe webhook handler (#1682). This design document covers:
- Webhook ingestion endpoint (HMAC, idempotency, pipeline)
- Idempotent upsert logic for all v1 event types
- Mirror fan-out to Raptor's billing_subscription_mirror
- Receipt + welcome email via Postmark
- Subscribe and Stripe Portal endpoints (minimal v1 shape)
- Raptor paywall middleware (JIT mirror check)
- Founders tier Price ID backfill (#1632)
- Sub-card slate with T-9 sequencing
This document does not re-specify the schema (that lives in stripe-customer-billing.md §4). It provides the behavioral contracts, failure modes, and migration plan for the missing pieces.
2. Invariants
These platform constraints are non-negotiable. Any sub-card that requires violating one must stop and escalate.
| # | Invariant |
|---|---|
| I-1 | No stored credentials. STRIPE_RESTRICTED_KEY and STRIPE_WEBHOOK_SECRET are fetched from Infisical at Queue process startup; held in process memory only; never written to DB, logs, or any persistent store. |
| I-2 | Audit trail for every money-state change. Every subscription or invoice mutation writes a row to billing_action_log with KMS HMAC chain integrity. |
| I-3 | GDPR by default. billing_email and postal address fields are PII. Retention 7 years post-deletion. DSR anonymization path exists. Breach-notification automation must include billing tables. |
| I-4 | Fail-closed. FLAG_QUEUE_BILLING=false returns 503 on all billing routes without redeploy. Missing mirror row → Raptor returns 402 (not 200). |
| I-5 | Vendor-name-free customer copy. Receipt and notification emails say "your Raxx subscription", never "Stripe" or "Alpaca". No broker names in customer-facing surfaces. |
| I-6 | Retrospective framing. Receipts describe what happened ("your payment was processed on DATE"), not what will happen. No forward-looking copy in transactional emails. |
| I-7 | Hide, don't gray. Billing UI surfaces for unsubscribed users are absent from the DOM, not disabled. See §6 for subscribe flow shape. |
| I-8 | All timestamps UTC. Every column TIMESTAMPTZ, every log line ISO 8601 UTC. |
| I-9 | Paper-first gating is orthogonal. The paywall blocks unpaid-tier access; paper-profitable-for-N-cycles blocks live trading for all tiers. Both checks must coexist and are independent. |
| I-10 | Secrets path: /Raxx/Queue/Billing/Stripe/ in Infisical. STRIPE_WEBHOOK_SECRET must be promoted from the legacy /MooseQuest/stripe/ path before E-1 can be claimed. Operator action. |
3. Data Model
Schema is fully specified in stripe-customer-billing.md §4 and implemented in queue/migrations/sqitch/. This section documents the tables touched by the webhook handler and their LWW guard semantics.
3.1 Tables written by the webhook handler
| Table | Natural key | LWW guard column | Dedup table |
|---|---|---|---|
| billing_customer | stripe_customer_id | updated_at | processed_stripe_events |
| billing_subscription | stripe_subscription_id | updated_at | processed_stripe_events |
| billing_invoice | stripe_invoice_id | updated_at | processed_stripe_events |
| billing_subscription_mirror | queue_customer_id | updated_at | n/a (fan-out target) |
| billing_action_log | append-only id UUID | n/a | n/a |
| processed_stripe_events | event_id TEXT PK | n/a | this IS the dedup table |
3.2 processed_stripe_events TTL policy
Rows in processed_stripe_events are retained for 72 hours. Stripe's retry window is ≤72 hours. A row older than 72h is safe to prune by a nightly cleanup job. The dedup check is: SELECT 1 FROM processed_stripe_events WHERE event_id = $1. Any row hit = return 200 immediately.
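The dedup check and the nightly prune can be sketched as below. This is an illustration only: Queue is C++ on Postgres, whereas the sketch uses Python with an in-memory SQLite table so that it runs standalone. The function names and the created_at TEXT column holding ISO 8601 UTC strings are assumptions made for the example, not the shipped schema.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(hours=72)  # retention window stated in this section


def already_processed(conn, event_id):
    """Dedup check: True if this Stripe event ID has been recorded before."""
    row = conn.execute(
        "SELECT 1 FROM processed_stripe_events WHERE event_id = ?", (event_id,)
    ).fetchone()
    return row is not None


def prune_expired(conn, now):
    """Nightly cleanup: delete rows older than the retention window.

    Returns the number of rows removed.
    """
    cutoff = (now - RETENTION).isoformat()
    cur = conn.execute(
        "DELETE FROM processed_stripe_events WHERE created_at < ?", (cutoff,)
    )
    conn.commit()
    return cur.rowcount


# Demo data: one row outside the window, one inside it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE processed_stripe_events ("
    "event_id TEXT PRIMARY KEY, created_at TEXT NOT NULL)"
)
now = datetime(2026, 5, 14, 12, 0, tzinfo=timezone.utc)
for event_id, age_hours in (("evt_old", 80), ("evt_new", 1)):
    conn.execute(
        "INSERT INTO processed_stripe_events VALUES (?, ?)",
        (event_id, (now - timedelta(hours=age_hours)).isoformat()),
    )
conn.commit()
```

Comparing ISO 8601 strings lexicographically only works here because every value carries the same UTC offset and format; a Postgres implementation would compare TIMESTAMPTZ values directly.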
3.3 Tier downgrade detection
On every customer.subscription.updated or customer.subscription.deleted event, after the upsert, the handler compares the new plan_tier value against the value stored before the write:
if (previous_plan_tier != 'free' && new_plan_tier == 'free') OR
(previous_plan_tier == 'pro_plus' && new_plan_tier in ('pro', 'founders', 'free')) OR
(previous_plan_tier == 'pro' && new_plan_tier in ('founders', 'free'))
→ SET feature_locked_at = NOW() WHERE feature_locked_at IS NULL
feature_locked_at is only written once — it records the first moment of downgrade. Subsequent churns do not overwrite it. The billing_action_log row records both previous_plan_tier and new_plan_tier for audit.
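The rule above can be expressed as a small pure function. The following Python sketch mirrors the three clauses literally; the function names are hypothetical, and the real detector lives in the C++ webhook handler.

```python
from datetime import datetime, timezone
from typing import Optional


def is_downgrade(previous_plan_tier: str, new_plan_tier: str) -> bool:
    """Return True when the tier change matches one of the three downgrade clauses."""
    if previous_plan_tier != "free" and new_plan_tier == "free":
        return True
    if previous_plan_tier == "pro_plus" and new_plan_tier in ("pro", "founders", "free"):
        return True
    if previous_plan_tier == "pro" and new_plan_tier in ("founders", "free"):
        return True
    return False


def next_feature_locked_at(
    previous_plan_tier: str,
    new_plan_tier: str,
    current_locked_at: Optional[datetime],
    now: datetime,
) -> Optional[datetime]:
    """Write-once semantics: only set the lock timestamp if it is still NULL."""
    if current_locked_at is None and is_downgrade(previous_plan_tier, new_plan_tier):
        return now
    return current_locked_at


first_lock = datetime(2026, 5, 14, tzinfo=timezone.utc)
later = datetime(2026, 6, 1, tzinfo=timezone.utc)
```

Keeping the comparison free of I/O makes it straightforward to unit-test each tier transition independently of the upsert.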
4. APIs / Contracts
4.1 Webhook ingestion — POST /api/v1/billing/webhook
Auth: None (public endpoint, protected by HMAC signature verification). Must NOT be behind the internal_auth_filter.
Flag gate: FLAG_QUEUE_BILLING — returns 503 if false.
Request: Raw body (preserve exactly for HMAC); Stripe-Signature header.
Pipeline (steps 1-4 run before the transaction opens; steps 5-12 run within a single Postgres transaction; steps 13-14 run post-commit):
1. Extract Stripe-Signature header. Parse t= and v1= components.
2. Timestamp tolerance check: abs(NOW_UTC - t) > 300 seconds → return 400.
(Configurable via STRIPE_WEBHOOK_TOLERANCE_SECONDS env var, default 300.)
Fire Sentry CRIT: billing.webhook.stale_timestamp. No DB write.
3. HMAC-SHA-256 verify: compute HMAC over "<t>.<raw_body>" using STRIPE_WEBHOOK_SECRET.
Use OpenSSL CRYPTO_memcmp for constant-time comparison.
Mismatch → return 400. Fire Sentry CRIT: billing.webhook.hmac_failure. No DB write.
4. Parse event JSON (nlohmann/json). Parse failure → return 400. Fire Sentry ERROR.
5. BEGIN TRANSACTION
6. Check processed_stripe_events for event.id. If found → COMMIT; return 200 (idempotent).
7. Route by event.type. Unrecognized type → INSERT processed_stripe_events; COMMIT; return 200.
8. Upsert target row with LWW guard:
ON CONFLICT (stripe_*_id) DO UPDATE SET ...
WHERE billing_table.updated_at < EXCLUDED.updated_at
9. Run downgrade detector (subscription events only). Set feature_locked_at if applicable.
10. INSERT processed_stripe_events (event.id, created_at = NOW()).
11. INSERT billing_action_log row with KMS HMAC chain.
12. COMMIT
13. [Post-commit, non-transactional] POST /api/internal/billing/mirror-sync to Raptor.
Fire-and-log. Failure → Sentry WARN: billing.mirror.fan_out_failure. Does NOT re-open txn.
14. [Post-commit, non-transactional] If checkout.session.completed or
invoice.payment_succeeded on new subscription → trigger welcome/receipt email via Postmark.
Fire-and-log. Failure → Sentry WARN: billing.email.dispatch_failure. Does NOT affect 200.
15. Return 200.
DB write failure (steps 8-12): Return 500 to Stripe. Stripe retries with exponential backoff. Because the transaction rolled back, no processed_stripe_events row exists for the event, so the retry is processed normally.
Decision: 5xx on DB failure, not 2xx + local queue. See ADR-0088 §Decision.
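Steps 1-3 of the pipeline (header parsing, timestamp tolerance, HMAC verification) can be sketched as follows. The production handler is C++ with OpenSSL and CRYPTO_memcmp; this Python version uses the standard library's hmac.compare_digest as the constant-time comparison and is meant only to show the shape of the check. The function names are hypothetical, and accepting several v1 entries in one header is an assumption of the sketch.

```python
import hashlib
import hmac
import time

TOLERANCE_SECONDS = 300  # default for STRIPE_WEBHOOK_TOLERANCE_SECONDS


def parse_signature_header(header: str):
    """Split 't=<unix>,v1=<hex>[,v1=<hex>...]' into (timestamp, [signatures])."""
    timestamp = None
    signatures = []
    for part in header.split(","):
        key, _, value = part.strip().partition("=")
        if key == "t":
            timestamp = int(value)
        elif key == "v1":
            signatures.append(value)
    if timestamp is None or not signatures:
        raise ValueError("malformed Stripe-Signature header")
    return timestamp, signatures


def verify(raw_body: bytes, header: str, secret: str,
           now=None, tolerance: int = TOLERANCE_SECONDS) -> bool:
    """Return True only if the timestamp is fresh and a v1 signature matches."""
    timestamp, signatures = parse_signature_header(header)
    now = time.time() if now is None else now
    if abs(now - timestamp) > tolerance:
        return False  # stale timestamp: the handler answers 400
    # The signed payload is "<t>.<raw_body>", using the exact bytes received.
    signed_payload = str(timestamp).encode() + b"." + raw_body
    expected = hmac.new(secret.encode(), signed_payload, hashlib.sha256).hexdigest()
    # compare_digest runs in time independent of where the inputs differ.
    return any(
        hmac.compare_digest(expected.encode(), sig.encode()) for sig in signatures
    )
```

The raw body must be passed through untouched; re-serializing the parsed JSON before hashing would change the bytes and break verification.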
4.2 Event types handled in v1
| Event type | Target table | Action |
|---|---|---|
| customer.created | billing_customer | INSERT; create queue_customer_id mapping if missing |
| customer.updated | billing_customer | UPSERT with LWW |
| customer.deleted | billing_customer | Soft-delete: set deleted_at = NOW(); cascade subscription cancel |
| customer.subscription.created | billing_subscription | UPSERT; init feature_locked_at = NULL |
| customer.subscription.updated | billing_subscription | UPSERT with LWW; run downgrade detector |
| customer.subscription.deleted | billing_subscription | UPSERT: status = 'canceled', canceled_at = NOW(); run downgrade detector |
| invoice.payment_succeeded | billing_invoice | UPSERT; status = 'paid', paid_at; trigger receipt email |
| invoice.payment_failed | billing_invoice | UPSERT; status = 'open'; trigger payment-failure notification |
| invoice.created | billing_invoice | UPSERT |
| invoice.updated | billing_invoice | UPSERT with LWW |
| invoice.voided | billing_invoice | UPSERT; status = 'void' |
| checkout.session.completed | billing_customer + billing_subscription | Confirm customer provisioning; trigger welcome email |
All other event types: INSERT into processed_stripe_events (dedup); return 200 without further action.
invoice.created vs. invoice.payment_succeeded order risk: Stripe may deliver invoice.payment_succeeded before invoice.created for fast payments. LWW guard handles this: the later-arriving invoice.created event has an earlier created timestamp and loses the LWW comparison against the already-upserted row with the full payment data.
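The out-of-order case can be demonstrated with a runnable sketch. It uses Python with in-memory SQLite (3.24 or newer, for upsert support) rather than Postgres, a reduced three-column billing_invoice table, and integer Unix timestamps. It also assumes that updated_at is populated from the event's created timestamp, which is what the paragraph above implies. All of these are simplifications for illustration.

```python
import sqlite3

# Upsert with a last-writer-wins guard: the UPDATE branch only fires when the
# incoming row is strictly newer than the stored one.
UPSERT = """
INSERT INTO billing_invoice (stripe_invoice_id, status, updated_at)
VALUES (?, ?, ?)
ON CONFLICT (stripe_invoice_id) DO UPDATE SET
    status = excluded.status,
    updated_at = excluded.updated_at
WHERE billing_invoice.updated_at < excluded.updated_at
"""


def apply_event(conn, invoice_id, status, event_ts):
    """Apply one invoice event; stale events are silently ignored by the guard."""
    conn.execute(UPSERT, (invoice_id, status, event_ts))
    conn.commit()


def current_status(conn, invoice_id):
    row = conn.execute(
        "SELECT status FROM billing_invoice WHERE stripe_invoice_id = ?",
        (invoice_id,),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE billing_invoice ("
    "stripe_invoice_id TEXT PRIMARY KEY, status TEXT NOT NULL, "
    "updated_at INTEGER NOT NULL)"
)
# invoice.payment_succeeded is delivered first, carrying the later timestamp.
apply_event(conn, "in_123", "paid", 1778760010)
# invoice.created is delivered second, carrying the earlier timestamp.
apply_event(conn, "in_123", "open", 1778760000)
```

After both calls the row still reads paid, because the late invoice.created event fails the updated_at comparison.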
4.3 Subscribe endpoint — POST /api/v1/billing/subscribe
Auth: Customer bearer token via Raptor's auth layer (Phase 1; Queue Phase 2 will own auth natively).
Request:
{ "plan_tier": "pro" | "pro_plus", "stripe_payment_method_id": "pm_..." }
Pipeline:
1. Resolve queue_customer_id from bearer token claims.
2. Check for existing active billing_subscription row → 409 SUBSCRIPTION_ALREADY_ACTIVE if found.
3. CustomerService::get_or_create() — retrieve existing Stripe customer or create a new one.
4. SubscriptionService::create() — create Stripe subscription with price ID from env (STRIPE_PRICE_ID_PRO or STRIPE_PRICE_ID_PRO_PLUS).
5. Insert billing_customer + billing_subscription rows optimistically (webhook will confirm).
6. Append billing_action_log row.
7. Return { "client_secret": "...", "subscription_id": "sub_..." } for Antlers to confirm payment via Stripe.js.
Price IDs read from env at startup; never hardcoded. If price ID env var is missing → return 503 with Sentry CRIT: billing.subscribe.missing_price_id.
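A minimal sketch of the price ID lookup, written in Python for illustration. The environment variable names come from this section; the function name, the exception type, and the treatment of an unknown plan_tier as a ValueError are assumptions of the sketch.

```python
import os

# Mapping from the request's plan_tier to the env var holding its price ID.
PRICE_ENV_VARS = {
    "pro": "STRIPE_PRICE_ID_PRO",
    "pro_plus": "STRIPE_PRICE_ID_PRO_PLUS",
}


class MissingPriceId(Exception):
    """Raised when the env var for a tier is unset; the handler maps this to 503."""


def resolve_price_id(plan_tier, env=None):
    """Resolve the Stripe price ID for a tier from configuration, never from the request."""
    env = os.environ if env is None else env
    if plan_tier not in PRICE_ENV_VARS:
        raise ValueError(f"unsupported plan_tier: {plan_tier!r}")
    var_name = PRICE_ENV_VARS[plan_tier]
    price_id = env.get(var_name, "")
    if not price_id:
        raise MissingPriceId(var_name)
    return price_id
```

Because the caller only supplies plan_tier, a customer cannot steer the subscription toward an arbitrary price object.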
4.4 Stripe Portal session — POST /api/v1/billing/portal
Auth: Customer bearer token.
Pipeline:
1. Resolve queue_customer_id → look up stripe_customer_id in billing_customer.
2. Call stripe.BillingPortal.Session.create({ customer, return_url }).
3. Append billing_action_log row (action = 'portal_session_created').
4. Return { "url": "https://billing.stripe.com/..." }.
This is the v1 mechanism for self-service cancellation, payment method updates, and invoice history.
4.5 Internal mirror-sync — POST /api/internal/billing/mirror-sync
Already shipped (closed #1684). Called by the webhook handler post-commit. Updates billing_subscription_mirror in Raptor with { queue_customer_id, plan_tier, status, current_period_end, updated_at }. LWW guard on updated_at on the Raptor side. Raptor endpoint is at /api/internal/billing/mirror-sync; authenticated via service bearer token (QUEUE_TO_RAPTOR_INTERNAL_TOKEN from env).
5. State Machines and Sequences
5.1 Webhook processing (happy path)
sequenceDiagram
participant S as Stripe
participant WH as Queue POST /api/v1/billing/webhook
participant DB as Queue-DB (Postgres)
participant R as Raptor mirror-sync
participant PM as Postmark
S->>WH: POST event (Stripe-Signature header)
WH->>WH: Timestamp tolerance check (±5 min)
WH->>WH: HMAC-SHA-256 verify (constant-time)
note over WH: 400 + Sentry CRIT on either failure
WH->>DB: BEGIN TRANSACTION
WH->>DB: SELECT processed_stripe_events WHERE event_id = ?
alt already seen
WH->>DB: COMMIT
WH-->>S: 200 (idempotent)
else new event
WH->>DB: UPSERT billing_* (LWW guard)
WH->>WH: downgrade detector (subscription events)
WH->>DB: INSERT processed_stripe_events
WH->>DB: INSERT billing_action_log (KMS HMAC chain)
WH->>DB: COMMIT
WH->>R: POST mirror-sync (fire-and-log)
WH->>PM: dispatch email if applicable (fire-and-log)
WH-->>S: 200
end
5.2 Subscription tier state machine
stateDiagram-v2
[*] --> free : signup
free --> active_founders : checkout.session.completed (founders)
free --> active_pro : checkout.session.completed (pro)
free --> active_pro_plus : checkout.session.completed (pro_plus)
active_pro --> past_due : invoice.payment_failed
active_pro_plus --> past_due : invoice.payment_failed
active_founders --> past_due : invoice.payment_failed
past_due --> active_pro : invoice.payment_succeeded
past_due --> active_pro_plus : invoice.payment_succeeded
past_due --> unpaid : grace period exhausted
unpaid --> canceled : subscription.deleted
active_pro --> canceled : subscription.deleted
active_pro_plus --> canceled : subscription.deleted
canceled --> free : mirror-sync sets plan_tier=free
note right of past_due : feature_locked_at set on downgrade detection
5.3 Customer subscribe flow (E-2 + E-5)
sequenceDiagram
participant A as Antlers (browser)
participant R as Raptor (auth proxy)
participant Q as Queue POST /subscribe
participant ST as Stripe API
A->>R: POST /api/billing/subscribe (proxied)
R->>Q: POST /api/v1/billing/subscribe (bearer token forwarded)
Q->>ST: CustomerService::get_or_create
Q->>ST: SubscriptionService::create (price_id from env)
ST-->>Q: { client_secret, subscription_id }
Q->>DB: INSERT billing_customer + billing_subscription
Q->>DB: INSERT billing_action_log
Q-->>A: { client_secret, subscription_id }
A->>ST: stripe.confirmPayment(client_secret) [Stripe.js]
ST-->>A: payment confirmed
ST->>Q: POST /webhook (customer.subscription.created + invoice.payment_succeeded)
Q->>DB: UPSERT billing rows (webhook pipeline)
Q->>R: POST mirror-sync
note over R: billing_subscription_mirror updated
note over A: Antlers polls /api/billing/snapshot until status=active
6. Raptor Paywall Middleware
This is E-4 scope — must ship alongside or immediately after E-1.
6.1 Migration
New Alembic migration in Raptor's chain (after current chain head — feature-developer confirms revision before claiming):
CREATE TABLE billing_subscription_mirror (
queue_customer_id TEXT PRIMARY KEY,
plan_tier TEXT NOT NULL CHECK (plan_tier IN ('free','founders','pro','pro_plus')),
status TEXT NOT NULL CHECK (status IN (
'active','trialing','past_due','canceled',
'unpaid','incomplete','incomplete_expired')),
current_period_end TIMESTAMPTZ NOT NULL,
updated_at TIMESTAMPTZ NOT NULL
);
CREATE INDEX idx_bsm_status ON billing_subscription_mirror (status)
WHERE status = 'active';
6.2 JIT check middleware
Applied to gated routes (Pro+ feature routes). Logic:
1. Extract queue_customer_id from session token claims.
2. SELECT plan_tier, status FROM billing_subscription_mirror
WHERE queue_customer_id = $1.
3. If no row found → return 402. Fire Sentry CRIT: billing.mirror.missing_row.
4. If status NOT IN ('active', 'trialing') → return 402.
5. If plan_tier does not satisfy the route's required tier → return 402.
6. Otherwise → proceed.
Fail-closed is the invariant. Missing row returns 402, not 200. See ADR-0088 for rationale.
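The decision logic in steps 3-6 can be sketched as a pure function, separate from the database lookup and the web framework. The tier ordering free < founders < pro < pro_plus is an assumption inferred from the downgrade detector in §3.3, and the function name is hypothetical.

```python
from typing import Optional, Tuple

# Assumed ordering, inferred from the downgrade rules in section 3.3.
TIER_RANK = {"free": 0, "founders": 1, "pro": 2, "pro_plus": 3}
ALLOWED_STATUSES = {"active", "trialing"}


def paywall_decision(
    mirror_row: Optional[Tuple[str, str]], required_tier: str
) -> Optional[int]:
    """Return 402 to block the request, or None to let it proceed.

    mirror_row is (plan_tier, status) from billing_subscription_mirror,
    or None when no row exists for the customer.
    """
    if mirror_row is None:
        return 402  # fail-closed: a missing row never grants access
    plan_tier, status = mirror_row
    if status not in ALLOWED_STATUSES:
        return 402
    # Unknown tiers rank below 'free', so they are blocked as well.
    if TIER_RANK.get(plan_tier, -1) < TIER_RANK[required_tier]:
        return 402
    return None
```

The Sentry CRIT for a missing row would be raised by the caller, which keeps this function side-effect free.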
6.3 Mirror-sync receiver (POST /api/internal/billing/mirror-sync)
Authenticated via QUEUE_TO_RAPTOR_INTERNAL_TOKEN service bearer token. On receipt:
INSERT INTO billing_subscription_mirror
(queue_customer_id, plan_tier, status, current_period_end, updated_at)
VALUES ($1, $2, $3, $4, $5)
ON CONFLICT (queue_customer_id) DO UPDATE SET
plan_tier = EXCLUDED.plan_tier,
status = EXCLUDED.status,
current_period_end = EXCLUDED.current_period_end,
updated_at = EXCLUDED.updated_at
WHERE billing_subscription_mirror.updated_at < EXCLUDED.updated_at;
Returns 200 regardless of whether LWW rejected the update. The caller (Queue) logs the response but does not retry on a 200 with a "no update" body.
7. Receipt and Welcome Email Path
7.1 Events that trigger email dispatch
| Trigger event | Template | Idempotency guard |
|---|---|---|
| checkout.session.completed (new subscription) | raxx-welcome-v1 | billing_action_log row with action='welcome_email_dispatched'; check before sending |
| invoice.payment_succeeded (renewal) | raxx-receipt-v1 | Same guard on action='receipt_email_dispatched' + entity_id = stripe_invoice_id |
| invoice.payment_failed | raxx-payment-failed-v1 | Same guard on action='payment_failed_email_dispatched' + entity_id = stripe_invoice_id |
7.2 Idempotency across Stripe redeliveries
Before dispatching any email, the handler checks:
SELECT 1 FROM billing_action_log
WHERE action = $action_type AND entity_id = $entity_id
LIMIT 1
If a row exists, skip the email dispatch. This ensures Stripe event redeliveries do not send duplicate receipts.
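The guard can be sketched as a check-then-send helper. This Python example uses an in-memory SQLite table with only the two columns the query needs. The helper name and the choice to write the log row after a successful send are assumptions; this section does not specify when the dispatch row is written.

```python
import sqlite3


def dispatch_once(conn, action, entity_id, send):
    """Call send() unless billing_action_log already records this dispatch.

    Returns True if the email was sent, False if it was skipped as a duplicate.
    """
    seen = conn.execute(
        "SELECT 1 FROM billing_action_log "
        "WHERE action = ? AND entity_id = ? LIMIT 1",
        (action, entity_id),
    ).fetchone()
    if seen:
        return False
    send()
    # Assumption: the dispatch row is recorded only after send() returns.
    conn.execute(
        "INSERT INTO billing_action_log (action, entity_id) VALUES (?, ?)",
        (action, entity_id),
    )
    conn.commit()
    return True


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE billing_action_log (action TEXT NOT NULL, entity_id TEXT NOT NULL)"
)
sent = []
first = dispatch_once(
    conn, "receipt_email_dispatched", "in_123", lambda: sent.append("in_123")
)
# A redelivered event reaches the same guard and is skipped.
second = dispatch_once(
    conn, "receipt_email_dispatched", "in_123", lambda: sent.append("in_123")
)
```

If send() raises, no row is written, so a later redelivery would attempt the email again.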
7.3 Template copy constraints
- No mention of "Stripe", "Alpaca", or any vendor name in customer-facing copy. Use "your Raxx subscription" and "Raxx" only.
- Retrospective framing: "Your payment of $29.00 USD was processed on 2026-05-14." Not "Your subscription will renew on...".
- billing_email is the delivery address. Never logged in plaintext.
- Postmark sender: no-reply@raxx.app. Template IDs stored in Infisical at /Raxx/Queue/Billing/Postmark/ as POSTMARK_TEMPLATE_WELCOME, POSTMARK_TEMPLATE_RECEIPT, POSTMARK_TEMPLATE_PAYMENT_FAILED.
7.4 Dispatch is fire-and-log
Email dispatch is post-commit and non-transactional. On Postmark API failure: Sentry WARN (billing.email.dispatch_failure); the webhook handler still returns 200 to Stripe. Email delivery is best-effort at the webhook layer; failed dispatches are visible in the Sentry WARN stream for manual follow-up.
8. Founders Tier Price ID Backfill (#1632)
This is a one-shot ops script, not a migration. Full specification is in #1632.
8.1 When it runs
Post-live-mode activation only. Staging can use test-mode price IDs (script works against test mode). The script must NOT run until:
- Operator creates founders-tier Stripe Product + Price object in live mode.
- Stripe account is out of test mode.
8.2 Data shape
Script targets: billing_subscription WHERE plan_tier = 'founders' AND stripe_price_id IS NULL.
Write: UPDATE billing_subscription SET stripe_price_id = $1 WHERE id = $2.
Audit: one billing_action_log row at completion: action = 'ops_backfill_founders_price_id', actor_id = 'operator:manual', payload = { count, stripe_price_id }.
8.3 Execution plan
1. python queue/ops/backfill_founders_price_ids.py --dry-run --founders-product-id prod_XXX --env staging
2. Verify output shows expected rows and resolved price ID.
3. python queue/ops/backfill_founders_price_ids.py --founders-product-id prod_XXX --env staging
4. Verify billing_action_log row written; verify stripe_price_id populated.
5. Repeat steps 1-4 for --env prod after live-mode activation.
6. Archive: mv queue/ops/backfill_founders_price_ids.py queue/ops/archive/backfill_founders_price_ids_DONE_<date>.py
8.4 Rollback
The script is idempotent. Rolling back means nulling the stripe_price_id for incorrectly set rows — a separate operator script. The risk is low because the script validates exactly-one-active-price before writing. If rollback is needed, write a targeted UPDATE billing_subscription SET stripe_price_id = NULL WHERE plan_tier = 'founders' AND stripe_price_id = 'bad_price_id' with an audit log row.
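The exactly-one-active-price validation mentioned above might look like the following. This is a hypothetical helper, not the script specified in #1632; it assumes the prices for the founders product have already been fetched and are represented as dicts with id and active keys.

```python
def resolve_single_active_price(prices):
    """Return the ID of the only active price, or raise if there is not exactly one.

    `prices` is a list of dicts with at least 'id' and 'active' keys,
    one per price object attached to the founders product.
    """
    active_ids = [p["id"] for p in prices if p.get("active")]
    if len(active_ids) != 1:
        raise ValueError(
            f"expected exactly one active price, found {len(active_ids)}"
        )
    return active_ids[0]


sample_prices = [
    {"id": "price_old", "active": False},
    {"id": "price_current", "active": True},
]
```

Refusing to write when zero or several active prices exist is what keeps the rollback risk low: the script either resolves one unambiguous price ID or changes nothing.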
9. Migrations and Rollout
9.1 Queue-side migrations
All Queue billing schema is already deployed via sqitch (6 migrations, confirmed shipped). No new Queue schema migration is required for the webhook handler or subscribe/portal endpoints.
9.2 Raptor-side migration (E-4)
New Alembic revision (one file) adding billing_subscription_mirror table. Feature-developer confirms next revision number before filing the PR. Blocked on nothing — Raptor Postgres migration confirmed shipped.
9.3 Feature flags
| Flag | Location | Effect when false |
|---|---|---|
| FLAG_QUEUE_BILLING | Infisical + Heroku config, raxx-queue-* | Queue returns 503 on all billing routes |
| FLAG_BILLING_RAPTOR_API | Infisical + Heroku config, raxx-api-* | Raptor hides billing API routes |
| FLAG_BILLING_AUDIT_WRITES | Infisical + Heroku config, raxx-queue-* | Circuit-breaker: halts billing_action_log writes on KMS chain break |
| FLAG_ANTLERS_SUBSCRIBE | backend_v2/api/feature_flags.yaml | Antlers subscribe page absent from routing |
9.4 Rollout phases
| Phase | Gate | What's live |
|---|---|---|
| Dark | All flags false | Queue deployed; schema exists; no routes exposed |
| Flag (staging) | FLAG_QUEUE_BILLING=true on raxx-queue-staging only | Webhook endpoint live on staging; Stripe CLI replay |
| Flag — prod | After staging soak (E-7) passes | Webhook live on prod with test-mode Stripe keys |
| Beta | Stripe test-mode end-to-end subscribe flow confirmed | All E-1 through E-6 on prod, test-mode keys |
| GA | EIN confirmed; operator swaps to live-mode keys | Live billing active; founders backfill (#1632) runs |
9.5 Live-mode key swap (no redeploy required)
1. Operator: Stripe dashboard → create live-mode products + prices.
2. Operator: update Infisical paths:
/Raxx/Queue/Billing/Stripe/STRIPE_RESTRICTED_KEY → rk_live_...
/Raxx/Queue/Billing/Stripe/STRIPE_WEBHOOK_SECRET → whsec_... (live-mode endpoint secret)
/Raxx/Queue/Billing/Stripe/STRIPE_PRICE_ID_PRO → price_... (live-mode price)
/Raxx/Queue/Billing/Stripe/STRIPE_PRICE_ID_PRO_PLUS → price_... (live-mode price)
3. heroku restart -a raxx-queue-prod (Queue reads secrets at startup)
4. Run founders backfill (#1632) against prod.
10. Failure and Rollback Story
10.1 Decision: 5xx on DB failure — Queue returns failure code to Stripe
Chosen path: On DB write failure mid-webhook, return 500. Stripe retries with exponential backoff for up to 72 hours. The processed_stripe_events dedup table was not committed, so the retry is processed as a new event without double-write.
Why not 2xx + local queue: A local queue (Redis, BullMQ, SQS) adds an infra dependency, a new failure mode, and a deduplication complexity that the Stripe retry mechanism already provides. Stripe's retry semantics are well-defined and correct for this use case. The billing_action_log + processed_stripe_events pair is sufficient to reconstruct state on any failure mode.
See ADR-0088 for the full argument.
10.2 Failure modes summary
| Failure | Queue behavior | Stripe behavior | Alert |
|---|---|---|---|
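| HMAC mismatch | Return 400; no DB write | Retries (any non-2xx counts as a failed delivery); retries keep failing until the secret mismatch is fixed | Sentry CRIT billing.webhook.hmac_failure |
| Stale timestamp | Return 400; no DB write | Retries (any non-2xx counts as a failed delivery) | Sentry CRIT billing.webhook.stale_timestamp |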
| DB write failure | Return 500 | Retries for up to 72h | Sentry ERROR on each failure |
| Mirror fan-out failure | Return 200; log failure | No retry needed | Sentry WARN; reconciler corrects within 24h |
| Email dispatch failure | Return 200; log failure | No retry needed | Sentry WARN; manual follow-up |
| Missing mirror row on Raptor JIT check | Return 402 to customer | n/a | Sentry CRIT billing.mirror.missing_row |
| FLAG_QUEUE_BILLING=false | Return 503 | Retries for up to 72h | No alert (expected kill-switch state) |
| KMS chain break | Circuit-breaker disables audit writes | n/a | Sentry CRIT; FLAG_BILLING_AUDIT_WRITES=false halts writes |
10.3 Kill-switch and rollback
Immediate kill-switch: heroku config:set FLAG_QUEUE_BILLING=false -a raxx-queue-prod takes effect within seconds. Queue returns 503 on all billing routes. Stripe queues webhooks for 72h; they replay automatically when the flag is re-enabled.
Rollback of Raptor mirror migration: alembic downgrade -1 drops billing_subscription_mirror. Paywall middleware must be disabled before or simultaneously — gated behind FLAG_BILLING_RAPTOR_API.
11. Security Considerations
11.1 Webhook endpoint security
- HMAC-SHA-256 verification using OpenSSL EVP (same pattern as Postmark HMAC in SC-E10).
- Constant-time comparison via CRYPTO_memcmp to prevent timing attacks.
- Timestamp tolerance: 5-minute window (configurable). Stale events are rejected before any DB read.
- The endpoint is not behind internal_auth_filter (it is Stripe-facing), but it is behind flag_gate_filter.
- STRIPE_WEBHOOK_SECRET is never logged. If a log line is needed for debug, log only the first 8 characters of the computed HMAC.
11.2 Subscribe endpoint security
- Requires customer bearer token (Raptor auth layer in Phase 1).
- Price IDs come from env, never from the request body. A customer cannot request an arbitrary price ID.
- Active-subscription guard prevents double-subscribe race.
11.3 PII handling
- billing_email is never written to any log. The Postmark dispatch call logs only stripe_invoice_id, not the recipient address.
- Billing tables are in scope for breach-notification automation. Operator action: add billing_customer and billing_subscription to the breach-scope inventory before GA.
11.4 Secrets
| Secret | Infisical path | Rotatable without redeploy? |
|---|---|---|
| STRIPE_RESTRICTED_KEY | /Raxx/Queue/Billing/Stripe/STRIPE_RESTRICTED_KEY | Yes (heroku restart -a raxx-queue-prod) |
| STRIPE_WEBHOOK_SECRET | /Raxx/Queue/Billing/Stripe/STRIPE_WEBHOOK_SECRET | Yes (same) |
| STRIPE_PRICE_ID_PRO | /Raxx/Queue/Billing/Stripe/STRIPE_PRICE_ID_PRO | Yes |
| STRIPE_PRICE_ID_PRO_PLUS | /Raxx/Queue/Billing/Stripe/STRIPE_PRICE_ID_PRO_PLUS | Yes |
| QUEUE_TO_RAPTOR_INTERNAL_TOKEN | /Raxx/Queue/Internal/QUEUE_TO_RAPTOR_INTERNAL_TOKEN | Yes |
| Postmark template IDs | /Raxx/Queue/Billing/Postmark/POSTMARK_TEMPLATE_* | Yes |
Pre-flight operator action: STRIPE_WEBHOOK_SECRET must be promoted from /MooseQuest/stripe/STRIPE_WEBHOOK_SECRET to /Raxx/Queue/Billing/Stripe/STRIPE_WEBHOOK_SECRET before E-1 can be claimed. Test keys only until EIN lands.
12. Sub-Card Slate + T-9 Timeline
These are implementation cards for feature-developer. The operator and PM will spawn GitHub issues from this slate.
Card E-1 (critical path): Stripe webhook handler
Files: queue/src/stripe/hmac_util.h/.cpp, queue/src/handlers/billing/webhook_handler.h/.cpp; promote test stubs in queue/tests/unit/test_hmac_util.cpp + test_webhook_processor.cpp to real implementations; promote GTEST_SKIP in queue/tests/integration/test_billing_webhook_integration.cpp.
Size: M (2-3 days)
Pre-flight: Operator promotes STRIPE_WEBHOOK_SECRET to /Raxx/Queue/Billing/Stripe/.
Card E-2: Subscribe endpoint
Files: queue/src/handlers/billing/subscribe_handler.h/.cpp; register route in main.cpp.
Size: M (2-3 days)
Depends on: E-1 (webhook must exist to receive subscription.created confirmation)
Card E-3: Stripe Portal session endpoint
Files: queue/src/handlers/billing/portal_handler.h/.cpp; register route.
Size: S (1 day)
Depends on: E-2 (customer must exist in Stripe)
Card E-4: Raptor billing_subscription_mirror + paywall middleware
Files: New Alembic migration; backend_v2/api/middleware/billing_paywall.py; backend_v2/api/routes/billing_mirror_sync.py.
Size: M (2-3 days)
Depends on: E-1 shipped to staging (fan-out must push data to test against)
Card E-5: Antlers subscribe flow (Payment Element)
Files: frontend/trademaster_ui/src/pages/Subscribe/; feature-flagged behind FLAG_ANTLERS_SUBSCRIBE.
Size: L (3-4 days)
Depends on: E-2 (subscribe endpoint); answer to OQ-1 (Checkout vs Payment Element) needed first
Open question blocking: OQ-1 — operator must choose Stripe Checkout (hosted redirect) vs Payment Element (embedded). Payment Element = better UX, +1 day impl. Checkout = faster, redirect-based.
Card E-6: Payment failure email + grace period logic
Files: Postmark template raxx-payment-failed-v1; grace-period logic in webhook handler (extension of E-1 scope or follow-on).
Size: M (2-3 days)
Depends on: E-1; OQ-2 answer needed (grace period duration: 7 days? two-failure rule?)
Card E-7: FLAG_QUEUE_BILLING staging soak + flip SOP
Files: docs/ops/runbooks/billing/flag-queue-billing-flip.md.
Size: S (1 day ops)
Depends on: E-1, E-4; operator registers staging webhook endpoint in Stripe dashboard
T-9 Timeline (2026-05-14 → 2026-05-23 UTC)
Day 1 (May 14): E-1 claimed. Operator promotes STRIPE_WEBHOOK_SECRET.
Day 2-3 (May 15-16): E-1 implementation. E-2 and E-3 in parallel if 2nd dev available.
Day 4 (May 17): E-1 merged to staging. E-4 starts (mirror migration + paywall).
Day 5-6 (May 18-19): E-4 continues. E-5 starts (requires OQ-1 answer by May 16).
Day 6 (May 19): E-6 starts in parallel with E-5 (requires OQ-2 answer by May 16).
Day 7 (May 20): E-2, E-3, E-4, E-6 merged. E-5 in final review.
Day 8 (May 21): E-5 merged. E-7 starts (staging soak, all event types).
Day 9 (May 22): E-7 SOP finalized. All cards merged to staging. Prod cutover begins.
Day 10 (May 23): FLAG_QUEUE_BILLING=true on prod. Test-mode billing live. Launch.
Critical path: E-1. Every other card depends on it. Zero slip budget — E-1 must start today.
With one developer: realistic window is 11-13 days. With two developers parallelizing E-2/E-3/E-4 against E-1, 9-10 days is achievable. The T-9 timeline requires two-developer parallelism from Day 2.
13. Open Questions
These require operator decisions before the blocking sub-cards can be claimed.
- OQ-1 (blocks E-5, needed by 2026-05-16 UTC): Checkout integration shape. Stripe Checkout (hosted redirect, faster to build) vs. Payment Element (embedded, better UX continuity, +1 day). Both are valid for v1. Default recommendation: Stripe Checkout for speed; Payment Element post-v1 polish.
- OQ-2 (blocks E-6, needed by 2026-05-16 UTC): Grace period policy. How many days between invoice.payment_failed and access suspension? One failure or two before status = 'unpaid'? Recommendation: 7-day grace, suspend on second failure within that window. Requires operator confirmation.
- OQ-3 (blocks E-5): Does the subscribe flow live at /subscribe in Antlers, inside the onboarding wizard, or triggered from the dashboard? The getraxx.com pricing page links to #pricing which is currently a dead hash. Operator to confirm the UX entry point.
- OQ-4 (operator action, blocks E-1): Promote STRIPE_WEBHOOK_SECRET from /MooseQuest/stripe/STRIPE_WEBHOOK_SECRET to /Raxx/Queue/Billing/Stripe/STRIPE_WEBHOOK_SECRET in Infisical. Test-mode key only; live-mode follows with EIN.
- OQ-5 (operator action, needed before GA): Billing tables (billing_customer, billing_subscription, billing_invoice) must be added to the breach-notification scope inventory. Is there an existing inventory document to update, or does this need a new tracking card?
- OQ-6 (informational; must fix before E-5 ships): docs/marketing/pricing-v2.md lists "Email alerts | Yes | Yes | Yes + SMS/webhook" for Pro+ tier. SMS conflicts with the no-SMS-channel invariant. The SMS/webhook copy must be corrected before the subscribe UI ships against this pricing copy.