Queue Phase 1 — C++ Foundation + Billing
Status: Design v1
Owner: software-architect
Date: 2026-05-11 UTC
Governing ADR: ADR-0076 (C++ language selection, timeline assessment, stack picks)
Milestone: raxx.app v1 — first paid customer
Refs:
- docs/architecture/queue/design.md — full Queue identity service (Phase 2+)
- docs/architecture/stripe-customer-billing.md v3 — billing schema (unchanged)
- ADR-0071 — Queue as billing authority
- ADR-0076 — C++ selection + timeline
1. Context
This document governs Queue Phase 1: the minimum C++ service to host billing safely. It is not the full Queue identity service. That is Phase 2 (queue/design.md).
Why Phase 1 is billing-only: The operator's 2026-05-11 UTC decisions established two things simultaneously — Queue ships in C++, and billing ships in Queue for v1. Queue's full identity service (WebAuthn, sessions, JWT, RBAC, audit consolidation) is architecturally correct but out of scope for Phase 1. Phase 1 establishes:
- The C++ build infrastructure (Dockerfile, CMake, vcpkg, Heroku container stack)
- The Postgres schema for billing (6 tables, unchanged from stripe-customer-billing.md v3)
- The minimum HTTP surface to process Stripe webhooks, fan out to Raptor's mirror, and serve Console reads
Raptor's Python auth layer continues handling customer authentication during Phase 1. Phase 2 brings WebAuthn and session management into Queue.
2. Invariants
All TradeMasterAPI invariants apply. The following are specifically material to this service:
| # | Invariant |
|---|---|
| I-1 | No stored credentials. STRIPE_RESTRICTED_KEY and STRIPE_WEBHOOK_SECRET are read from Infisical at process startup, held in memory, never written to the database or any log. |
| I-2 | All timestamps UTC. Every timestamp column is TIMESTAMPTZ. Every log line includes an ISO 8601 UTC timestamp. |
| I-3 | Audit trail for every money-state change. All mutations to billing_customer, billing_subscription, billing_invoice emit a row to billing_action_log with KMS HMAC chain integrity. |
| I-4 | GDPR by default. Billing PII has a 7-year retention floor, DSR anonymization path, and breach-notification coverage. |
| I-5 | Fail-closed. FLAG_QUEUE_BILLING=false returns 503 on all billing routes. FLAG_BILLING_AUDIT_WRITES=false halts new billing_action_log writes on chain break. |
| I-6 | Memory safety discipline. No raw new/delete. All resources RAII-managed. No raw char* for PII. AddressSanitizer and UBSan enabled in CI debug build. |
| I-7 | No inline secrets. All secrets from Infisical at startup. scripts/ci/check_no_credential_fields.sh grep covers queue/. |
| I-8 | Stripe is authoritative for subscription state. Queue's DB reflects Stripe state via webhook upsert + nightly reconciler. Queue never overrides Stripe except via explicit operator action logged to billing_action_log. |
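The chain integrity named in I-3 can be sketched as follows, assuming each billing_action_log row stores MAC(previous_hash || row_payload) as in the webhook sequence in section 6. The MAC is injected as a function so the sketch compiles without AWS; in the design it would be a KMS GenerateMac call. `ActionLogRow`, `MacFn`, `append_row`, and `first_broken_link` are hypothetical names, and the empty-string genesis hash is an assumption.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical shape of one billing_action_log row; only chain fields are shown.
struct ActionLogRow {
    std::string payload;          // canonical serialization of the row (assumed)
    std::string hmac_chain_hash;  // MAC(previous_hash || payload)
};

// Injected MAC; stands in for AWS KMS GenerateMac in this sketch.
using MacFn = std::function<std::string(const std::string&)>;

// Appends a row whose hash covers the previous row's hash plus the new payload.
// Assumption: the first row chains from an empty previous hash.
inline void append_row(std::vector<ActionLogRow>& log, const std::string& payload,
                       const MacFn& mac) {
    const std::string previous = log.empty() ? std::string{} : log.back().hmac_chain_hash;
    log.push_back({payload, mac(previous + payload)});
}

// Returns the index of the first row that breaks the chain, or -1 if intact.
inline long first_broken_link(const std::vector<ActionLogRow>& log, const MacFn& mac) {
    std::string previous;
    for (std::size_t i = 0; i < log.size(); ++i) {
        if (mac(previous + log[i].payload) != log[i].hmac_chain_hash) {
            return static_cast<long>(i);
        }
        previous = log[i].hmac_chain_hash;
    }
    return -1;
}
```

A verifier of this shape is what would detect the chain break that FLAG_BILLING_AUDIT_WRITES reacts to in I-5.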
3. Repository Layout
queue/
CMakeLists.txt ← root build file; vcpkg toolchain included
vcpkg.json ← dependency manifest: drogon, libpqxx, nlohmann-json, jwt-cpp, spdlog, sentry-native, curl
Dockerfile ← multi-stage: build stage (gcc:13-bookworm), runtime stage (debian:bookworm-slim)
Procfile ← not used in Heroku container mode (CMD in Dockerfile)
.dockerignore
src/
main.cpp ← Drogon app startup; load env; set routes; app().run()
config/
app_config.hpp ← typed config struct; populated from env at startup
secrets.cpp / .hpp ← Infisical fetch at startup; no secrets in global state
controllers/
health_controller.cpp ← GET /health
billing_customer_controller.cpp
billing_subscription_controller.cpp
billing_webhook_controller.cpp
billing_internal_controller.cpp ← /api/internal/* (mirror-sync, console reads)
services/
stripe_client.cpp / .hpp ← libcurl wrapper for Stripe REST API
webhook_processor.cpp / .hpp ← HMAC verify, dedup, upsert, mirror fan-out
billing_action_log.cpp / .hpp ← KMS HMAC chain writer
middleware/
internal_auth_filter.cpp ← Bearer token validation for /api/internal/*
flag_gate_filter.cpp ← Returns 503 when FLAG_QUEUE_BILLING=false
db/
db_client.hpp ← Thin wrapper around Drogon's DbClient; RAII transactions
util/
hmac_util.cpp / .hpp ← OpenSSL EVP HMAC-SHA-256 for webhook signature verify
include/
queue/ ← public headers (types, error codes, response shapes)
tests/
unit/
test_hmac_util.cpp
test_webhook_processor.cpp
test_stripe_client_mock.cpp
integration/
test_billing_webhook_integration.cpp
docker-compose.test.yml ← postgres:16 + queue binary against real DB
migrations/
sqitch/
sqitch.conf
sqitch.plan
deploy/
01-billing-schema.sql
02-billing-subscription-mirror.sql
03-billing-action-log.sql
04-billing-processed-events.sql
05-billing-reconcile-log.sql
06-billing-reliability-view.sql
revert/
(reverse order of deploy/)
verify/
(assert tables exist + check indexes)
ops/
backfill_billing_from_raptor.py ← post-launch data migration script (Python OK for one-shot ops)
reconciler_check.sh ← calls /api/internal/billing/reconcile; run by GH Actions cron
sops/
rotation/
stripe-restricted-key.md
queue-service-tokens.md
4. Data Model
Schema is unchanged from stripe-customer-billing.md v3. Restated here for completeness with C++-specific notes.
Migration chain (sqitch)
| Migration | Tables | Notes |
|---|---|---|
| 01-billing-schema | billing_customer, billing_subscription, billing_invoice | Core billing tables |
| 02-billing-subscription-mirror | billing_subscription_mirror | PII-free; Raptor reads this |
| 03-billing-action-log | billing_action_log | Money-state audit; KMS HMAC chain |
| 04-billing-processed-events | processed_stripe_events | Idempotency dedup table |
| 05-billing-reconcile-log | billing_reconcile_log | Nightly reconciler drift log |
| 06-billing-reliability-view | v_customer_payment_reliability | Derived view; see stripe-customer-billing.md §4.6 |
All migrations are additive in Phase 1. sqitch revert drops them in reverse order. No data loss on revert (pre-cutover; no real billing data yet).
C++ model types
Each billing table maps to a plain C++ struct:
// src/models/billing_customer.hpp
#pragma once

#include <optional>
#include <string>

// One row of billing_customer. PII members are marked so log_safe() can omit them.
struct BillingCustomer {
    std::string id;                                  // UUID string
    std::string queue_customer_id;
    std::string stripe_customer_id;
    std::string billing_email;                       // PII; never logged
    std::optional<std::string> billing_name;         // PII; optional
    std::optional<std::string> address_line1;        // PII
    std::optional<std::string> address_line2;
    std::optional<std::string> address_city;
    std::optional<std::string> address_state;
    std::optional<std::string> address_postal_code;
    std::optional<std::string> address_country;      // retained post-erasure
    std::optional<std::string> default_pm_last4;
    std::optional<std::string> default_pm_brand;
    std::string customer_segment;                    // enum: 'founders'|'organic'|...
    std::optional<std::string> acquisition_source;
    std::string stripe_created_at;                   // ISO 8601 UTC
    std::string created_at;
    std::string updated_at;
};
PII fields are annotated in comments. The logging layer has a log_safe() variant that omits all PII-annotated fields.
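A minimal sketch of what a log_safe() variant could look like, assuming it is a free function that formats only non-PII members. `BillingCustomerView` is a trimmed, hypothetical stand-in for BillingCustomer, and the output format is illustrative rather than the service's actual log layout.

```cpp
#include <optional>
#include <sstream>
#include <string>

// Trimmed, hypothetical subset of BillingCustomer for illustration only.
struct BillingCustomerView {
    std::string id;
    std::string stripe_customer_id;
    std::string billing_email;                // PII; never logged
    std::optional<std::string> billing_name;  // PII
    std::string customer_segment;
};

// Builds a log line from non-PII fields only. PII members are never read,
// so a formatting mistake cannot leak them.
inline std::string log_safe(const BillingCustomerView& c) {
    std::ostringstream out;
    out << "billing_customer id=" << c.id
        << " stripe_customer_id=" << c.stripe_customer_id
        << " customer_segment=" << c.customer_segment;
    return out.str();
}
```

Keeping the allow-list inside one function means a new PII column is excluded by default until someone deliberately adds it to the output.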
5. API Surface (Phase 1)
All responses are JSON. All errors:
{ "error": { "code": "machine_readable", "message": "human readable" } }
Health
GET /health
No auth. Returns 200 within 1 second or Heroku restarts the dyno.
{ "status": "ok", "service": "queue", "version": "0.1.0" }
If DB connection fails: returns 503 {"error":{"code":"db_unavailable","message":"..."}}.
Billing — public surface (called by Raptor/Console via service token)
POST /api/v1/billing/customers
Auth: Bearer service token (QUEUE_SERVICE_TOKEN_RAPTOR or QUEUE_SERVICE_TOKEN_CONSOLE)
Body: Full BillingCustomer fields (JSON).
Response 201: {"id":"<uuid>"} — creates billing_customer row; emits billing_action_log entry.
GET /api/v1/billing/customers/:id
Auth: Bearer service token.
Response 200: BillingCustomer JSON (all fields, including PII — caller is Console). Returns 404 if not found.
GET /api/v1/billing/subscriptions/:queue_customer_id
Auth: Bearer service token.
Response 200:
{
"subscription": { /* BillingSubscription */ },
"plan_tier": "founders",
"status": "active",
"current_period_end": "2026-06-11T21:00:00Z"
}
Returns the most recent row whose status is active or trialing. Returns 404 if no such row exists.
Stripe Webhook Receiver
POST /api/v1/billing/webhook
Auth: None (Stripe calls this). Stripe-Signature header verified via HMAC-SHA-256 (STRIPE_WEBHOOK_SECRET).
Body: Stripe event JSON.
Processing pipeline (steps 3 to 8 run inside a single DB transaction; step 9 runs after commit):
1. Verify Stripe-Signature header → 400 immediately on failure (security event; Sentry CRIT)
2. Parse event JSON (nlohmann/json)
3. Check processed_stripe_events for event.id → 200 immediately if already seen (idempotent)
4. Route by event type: customer.*, customer.subscription.*, invoice.*
5. Upsert billing row (LWW guard on updated_at)
6. Detect tier downgrade; set feature_locked_at if new tier < previous tier
7. Insert processed_stripe_events row
8. Insert billing_action_log row (KMS HMAC chain)
9. Fan out mirror sync to Raptor (POST /api/internal/billing/mirror-sync via libcurl)
10. Return 200 to Stripe
Response 200 once the transaction commits, including when the mirror fan-out in step 9 fails afterward. The transaction is atomic: if it fails, return 500 so Stripe retries. Returning 200 after a failed DB write would be incorrect, because Stripe retries only on non-2xx responses and the event would be lost.
Handled event types: customer.created, customer.updated, customer.deleted, customer.subscription.created, customer.subscription.updated, customer.subscription.deleted, invoice.created, invoice.updated, invoice.payment_succeeded, invoice.payment_failed, invoice.voided
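Step 1 of the pipeline can be sketched as below. Stripe's header has the form `t=<unix seconds>,v1=<hex signature>`, and the signed payload is the timestamp, a literal `.`, and the raw request body. The HMAC itself is injected as a function so the sketch compiles without OpenSSL; hmac_util would supply the real one. `HexMacFn`, `constant_time_equal`, `verify_stripe_signature`, and the tolerance parameter are hypothetical names, not existing Queue code.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hex-encoded HMAC-SHA-256 of `message` under STRIPE_WEBHOOK_SECRET (injected).
using HexMacFn = std::function<std::string(const std::string& message)>;

// Compares two strings without exiting early on the first differing byte.
inline bool constant_time_equal(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) return false;
    unsigned char diff = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        diff |= static_cast<unsigned char>(a[i]) ^ static_cast<unsigned char>(b[i]);
    }
    return diff == 0;
}

// Verifies "t=<unix seconds>,v1=<hex>[,v1=<hex>...]" against the raw body.
inline bool verify_stripe_signature(const std::string& header, const std::string& body,
                                    std::int64_t now_unix, std::int64_t tolerance_seconds,
                                    const HexMacFn& mac) {
    std::string timestamp;
    std::vector<std::string> candidates;
    std::size_t start = 0;
    while (start <= header.size()) {  // split the header on commas
        std::size_t end = header.find(',', start);
        if (end == std::string::npos) end = header.size();
        const std::string part = header.substr(start, end - start);
        if (part.rfind("t=", 0) == 0) timestamp = part.substr(2);
        else if (part.rfind("v1=", 0) == 0) candidates.push_back(part.substr(3));
        start = end + 1;
    }
    if (timestamp.empty() || candidates.empty()) return false;

    std::int64_t signed_at = 0;  // digits only; bounded to avoid overflow
    for (char ch : timestamp) {
        if (ch < '0' || ch > '9' || signed_at > 100000000000LL) return false;
        signed_at = signed_at * 10 + (ch - '0');
    }
    std::int64_t age = now_unix - signed_at;
    if (age < 0) age = -age;
    if (age > tolerance_seconds) return false;  // reject replayed or skewed events

    const std::string expected = mac(timestamp + "." + body);
    bool ok = false;
    for (const std::string& candidate : candidates) {
        ok = constant_time_equal(expected, candidate) || ok;
    }
    return ok;
}
```

The body passed in must be the raw bytes as received; re-serializing the parsed JSON before verification would change the signed payload.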
Internal surface (mTLS not in Phase 1; Bearer token)
POST /api/internal/billing/mirror-sync
Auth: Bearer QUEUE_SERVICE_TOKEN_RAPTOR
Purpose: Called by Queue's own webhook processor after a subscription upsert; also callable by the nightly reconciler. Updates Raptor's billing_subscription_mirror.
Body:
{
"queue_customer_id": "<uuid>",
"plan_tier": "founders",
"status": "active",
"current_period_end": "2026-06-11T21:00:00Z",
"updated_at": "2026-05-11T21:00:00Z"
}
Response 204 on success. Queue fans out via libcurl POST to Raptor's RAPTOR_BASE_URL/api/internal/billing/mirror-sync with the same payload. Failure is logged (Sentry WARN) but does not fail the webhook transaction.
POST /api/internal/billing/reconcile
Auth: Bearer service token (QUEUE_SERVICE_TOKEN_CRON for the GH Actions reconciler, or QUEUE_SERVICE_TOKEN_CONSOLE)
Purpose: Triggers nightly reconciliation. Calls Stripe API, compares to DB, writes billing_reconcile_log rows. Does not auto-correct.
Response 200:
{ "mismatches_found": 0, "checked_subscriptions": 47 }
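The compare-and-log behavior can be sketched as a pure function over two snapshots, assuming subscription status is the compared field. `ReconcileMismatch`, `ReconcileResult`, and `reconcile` are hypothetical names; in the service the inputs would come from the Stripe API and Queue's DB, and each mismatch would become a billing_reconcile_log row.

```cpp
#include <map>
#include <string>
#include <vector>

struct ReconcileMismatch {
    std::string subscription_id;
    std::string stripe_status;
    std::string db_status;  // empty when the row is missing from Queue's DB
};

struct ReconcileResult {
    int checked_subscriptions = 0;
    std::vector<ReconcileMismatch> mismatches;
};

// Compares Stripe's view with the DB view. Both inputs are const: drift is
// reported, never corrected, matching "Does not auto-correct".
inline ReconcileResult reconcile(const std::map<std::string, std::string>& stripe_status_by_id,
                                 const std::map<std::string, std::string>& db_status_by_id) {
    ReconcileResult result;
    for (const auto& [id, stripe_status] : stripe_status_by_id) {
        ++result.checked_subscriptions;
        const auto it = db_status_by_id.find(id);
        const std::string db_status =
            (it == db_status_by_id.end()) ? std::string{} : it->second;
        if (db_status != stripe_status) {
            result.mismatches.push_back({id, stripe_status, db_status});
        }
    }
    return result;
}
```

The response field mismatches_found would then correspond to `mismatches.size()`.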
6. Webhook Sequence
sequenceDiagram
participant S as Stripe
participant WH as Queue /api/v1/billing/webhook
participant QDB as Queue-DB (Postgres)
participant KMS as AWS KMS
participant R as Raptor /api/internal/billing/mirror-sync
S->>WH: POST event (Stripe-Signature header)
WH->>WH: HMAC verify (OpenSSL EVP + STRIPE_WEBHOOK_SECRET)
alt Signature invalid
WH-->>S: 400 (security event; Sentry CRIT)
else Signature valid
WH->>QDB: BEGIN TRANSACTION
WH->>QDB: SELECT FROM processed_stripe_events WHERE event_id = ?
alt Already processed
WH->>QDB: ROLLBACK
WH-->>S: 200 (idempotent)
else New event
WH->>QDB: UPSERT billing_* row (LWW guard on updated_at)
WH->>KMS: GenerateMac(previous_hash || row_payload)
KMS-->>WH: hmac_hash
WH->>QDB: INSERT billing_action_log (hmac_chain_hash)
WH->>QDB: INSERT processed_stripe_events
WH->>QDB: COMMIT
WH->>R: POST /api/internal/billing/mirror-sync (libcurl; fire-and-log)
WH-->>S: 200
end
end
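The dedup check and LWW guard in the new-event branch can be modeled in memory as below. This is a sketch, not the SQL the service would run: `BillingStore`, `StripeEvent`, `SubscriptionRow`, and `Outcome` are hypothetical, timestamps are assumed to be Unix seconds, and the tie rule (an event with an equal timestamp is applied) is an assumption the source does not state.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <unordered_set>

struct SubscriptionRow {
    std::string status;
    std::int64_t updated_at = 0;  // Unix seconds, UTC (assumed representation)
};

struct StripeEvent {
    std::string event_id;
    std::string queue_customer_id;
    std::string status;
    std::int64_t updated_at = 0;
};

enum class Outcome { Duplicate, Stale, Applied };

// In-memory stand-in for processed_stripe_events plus one billing table.
class BillingStore {
public:
    Outcome apply(const StripeEvent& ev) {
        // Dedup: a replayed event.id is acknowledged without touching rows.
        if (processed_.count(ev.event_id) != 0) return Outcome::Duplicate;
        processed_.insert(ev.event_id);

        // LWW guard: an older event never overwrites a strictly newer row.
        const auto it = rows_.find(ev.queue_customer_id);
        if (it != rows_.end() && it->second.updated_at > ev.updated_at) {
            return Outcome::Stale;
        }
        rows_[ev.queue_customer_id] = SubscriptionRow{ev.status, ev.updated_at};
        return Outcome::Applied;
    }

    // Returns the stored row, or nullptr if the customer has none.
    const SubscriptionRow* find(const std::string& queue_customer_id) const {
        const auto it = rows_.find(queue_customer_id);
        return it == rows_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_set<std::string> processed_;
    std::unordered_map<std::string, SubscriptionRow> rows_;
};
```

A stale event is still recorded as processed, so a later Stripe retry of the same event.id returns the idempotent 200 rather than re-running the guard.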
7. Internal Auth Model
Phase 1 uses Bearer tokens, not mTLS. Each calling service has a dedicated token:
| Service | Token env var (Queue side) | Set on service |
|---|---|---|
| Raptor | QUEUE_SERVICE_TOKEN_RAPTOR | Raptor env: QUEUE_BEARER_TOKEN=<same> |
| Console | QUEUE_SERVICE_TOKEN_CONSOLE | Console env: QUEUE_BEARER_TOKEN=<same> |
| GH Actions reconciler | QUEUE_SERVICE_TOKEN_CRON | GH Actions secret |
Tokens are loaded at startup into an in-memory std::unordered_set<std::string>. The internal_auth_filter Drogon middleware checks the Authorization: Bearer <token> header against this set before routing to any /api/internal/* handler.
Token rotation: update Infisical, restart raxx-queue-{prod,staging} dyno. No redeploy required.
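The token check can be sketched independently of Drogon as a function from header to status. `parse_bearer` and `check_internal_auth` are hypothetical names, and the 401 status for a missing or unknown token is an assumption; the design does not specify the rejection code.

```cpp
#include <optional>
#include <string>
#include <unordered_set>

// Extracts the token from "Bearer <token>"; nullopt for any other shape.
inline std::optional<std::string> parse_bearer(const std::string& header) {
    static const std::string prefix = "Bearer ";
    if (header.size() <= prefix.size() ||
        header.compare(0, prefix.size(), prefix) != 0) {
        return std::nullopt;
    }
    return header.substr(prefix.size());
}

// Returns 0 to let the request through, or the HTTP status to reject with.
// Assumption: 401 for a missing, malformed, or unknown token.
inline int check_internal_auth(const std::string& authorization_header,
                               const std::unordered_set<std::string>& tokens) {
    const auto token = parse_bearer(authorization_header);
    if (!token || tokens.count(*token) == 0) return 401;
    return 0;
}
```

Because the set is built once at startup, a rotated token only takes effect after the dyno restart described above.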
8. Build + Deploy Pipeline
Dockerfile (multi-stage)
Stage 1 — build (gcc:13-bookworm)
- apt-get: cmake, ninja, libssl-dev, libcurl4-openssl-dev, libpq-dev, uuid-dev, git
- vcpkg install (from vcpkg.json manifest) — layer-cached in CI via GitHub Actions cache
- cmake configure + ninja build
- strip binary
Stage 2 — runtime (debian:bookworm-slim)
- apt-get: libssl3, libcurl4, libpq5 (runtime deps only)
- COPY --from=build /app/queue_server /usr/local/bin/queue_server
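- CMD queue_server --port $PORT (shell form, so $PORT is expanded at container start; the exec form ["queue_server", "--port", "$PORT"] would pass the literal string "$PORT")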
Both apps must be on Heroku's container stack (heroku stack:set container). The heroku.yml declares a single web process type.
GH Actions workflow (.github/workflows/queue-deploy.yml)
Trigger: push to main (path filter: queue/**)
Jobs:
1. build-test:
- Restore vcpkg cache (key: vcpkg-${{ hashFiles('queue/vcpkg.json') }})
- docker build --target build-stage (outputs test binary)
- Run unit tests (ctest)
- Run integration tests (docker-compose.test.yml)
- Save vcpkg cache
2. build-release (needs: build-test):
- docker build --target runtime-stage
- docker push registry.heroku.com/raxx-queue-staging/web
- heroku container:release web -a raxx-queue-staging
- Wait for health check: curl https://raxx-queue-staging.herokuapp.com/health
3. promote-to-prod (needs: build-release, manual approval gate):
- docker tag ... registry.heroku.com/raxx-queue-prod/web
- heroku container:release web -a raxx-queue-prod
- Wait for health check
9. Migrations Deployment
sqitch runs in the Heroku release phase. Add to heroku.yml:
release:
command:
- sqitch deploy --verify db:pg:$DATABASE_URL
sqitch is installed in the runtime Docker image. Migration failures abort the release (Heroku release phase contract: a non-zero exit fails the release, and the previous release keeps running).
Rollback: sqitch revert is available via heroku run sqitch revert db:pg:$DATABASE_URL. This drops all Phase 1 tables — only safe pre-cutover with no real billing data. Post-cutover rollback is a data migration, not a sqitch revert.
10. Rollout Plan
| Phase | Gate | What changes |
|---|---|---|
| Dark | FLAG_QUEUE_BILLING=false (default) | Queue deployed; all billing routes return 503. Migrations applied. Health check responds. |
| Internal flag | FLAG_QUEUE_BILLING=true on staging only | Billing routes live on raxx-queue-staging. Stripe test-mode webhook pointed at staging. |
| Integration test | Stripe test-mode events replay cleanly; webhook idempotency confirmed; mirror sync to Raptor staging verified | — |
| Beta | FLAG_QUEUE_BILLING=true on prod | Live webhook; Stripe live-mode endpoint registered to raxx-queue-prod. Console reads prod Queue API. |
| GA | 48h soak with no P0/P1 billing incidents | Flag gate removed; always-on. |
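The Dark-phase default can be sketched as a fail-closed flag parse. This assumes only the exact string "true" opens the gate, which is stricter than the source states; `read_env`, `billing_enabled`, and `gate_status` are hypothetical names for what flag_gate_filter would do.

```cpp
#include <cstdlib>
#include <optional>
#include <string>

// Reads an environment variable; nullopt when it is unset.
inline std::optional<std::string> read_env(const char* name) {
    const char* value = std::getenv(name);
    if (value == nullptr) return std::nullopt;
    return std::string(value);
}

// Fail-closed: unset, empty, or any value other than "true" keeps billing off.
inline bool billing_enabled(const std::optional<std::string>& flag_value) {
    return flag_value.has_value() && *flag_value == "true";
}

// Returns 0 to let the request through, or 503 when the gate is closed.
inline int gate_status(const std::optional<std::string>& flag_value) {
    return billing_enabled(flag_value) ? 0 : 503;
}
```

Under these assumptions a filter would call `gate_status(read_env("FLAG_QUEUE_BILLING"))` per request, or cache the result at startup if flag changes are expected to require a restart.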
11. Security Considerations
| Question | Answer |
|---|---|
| What PII does this collect? | billing_email, billing_name, address fields in billing_customer. See stripe-customer-billing.md §7.1. |
| What is the retention period? | 7 years post-customer-deletion (SOC2/tax compliance floor). After 7 years: anonymize in-place. |
| How is it deleted on DSR? | Anonymize in-place: billing_email → tombstone token; address fields → NULL. Invoice rows retained for tax. Tracked in #1630. |
| What is logged for audit? | All money-state mutations in billing_action_log with KMS HMAC chain. Stripe event dedup in processed_stripe_events. spdlog INFO for every successful webhook event (no PII in log lines; only event_id, event_type, stripe_customer_id). |
| Does any part store a credential that could be replayed? | No. STRIPE_RESTRICTED_KEY and STRIPE_WEBHOOK_SECRET read from Infisical at startup; held in process memory only; never written to DB or logs. |
| What happens on breach? | 72h GDPR Art. 33 notification. Queue-DB billing tables added to breach-scope inventory. Existing breach-notification automation path handles the notification. |
| Where are secrets? | Infisical /Raxx/Queue/Billing/Stripe/ for Stripe keys; /Raxx/Queue/ for service tokens. All rotatable without redeploy. |
| Kill-switch? | FLAG_QUEUE_BILLING=false returns 503 on all billing routes. FLAG_BILLING_AUDIT_WRITES=false halts KMS chain writes on chain break (W-KMS scenario). |
| Memory safety? | AddressSanitizer + UBSan in CI debug build. No raw new/delete. RAII throughout. No raw char* for PII strings. |
12. Open Questions
OQ-1 — Language confirmation after timeline numbers: The honest estimate for C++ Phase 1 billing from scratch is 22–32 days. 2026-05-23 UTC is not achievable. Operator said timeline is a target; this is the real number. Does the operator confirm: proceed with C++ and accept the slip? (Operator's 21:27 UTC statement points to yes; confirmed in ADR-0076 OQ-1.)
OQ-2 — Phase 1 identity scope: This design defers WebAuthn and sessions to Phase 2. Raptor's Python auth handles customer authentication during Phase 1. Is the operator comfortable with this split?
OQ-3 — Postgres instance for Queue: Does Queue share Raptor's Postgres instance (Heroku Standard-0 add-on) in Phase 1, or does it get its own? Sharing is simpler for Phase 1 but couples the two services at the DB layer. Own instance is cleaner but costs ~$50/mo more. Recommendation: share in Phase 1 (billing tables live in Queue's schema namespace); own instance in Phase 3 (per queue/migration-plan.md).
OQ-4 — DSR and retention for v1 launch: