Billing Data Model
Status: Reference (retrospective) Epic: #757 — Fixed-Cost Billing Dashboard Refs: #758 (schema), #760 (Heroku collector), #761 (AWS collector), #762 (CF collector), #763 (Postmark collector), #764 (fixed-cost seed), #765 (aggregator), #766 (RBAC), #768 (UI panel), #769 (alert engine), #770 (alert config)
1. Context
Epic #757 built a consolidated ops-internal billing view so the team can see platform infrastructure spend for the current calendar month without logging into each vendor portal separately. The data model was built incrementally via five sub-cards (collectors + seed + aggregator) and this document consolidates what was produced into a single reference.
The model covers operator-side spend tracking only — it has no concept of customer subscriptions, invoices to customers, or Stripe charge records. Those are future concerns and the open questions section flags where they would plug in.
2. Invariants
- No vendor credential is stored in the database. Collectors read API keys from environment variables that are injected at runtime from vault.
- The
raw_payload_jsoncolumn stores sanitized vendor responses. Any field that could contain an auth token is stripped before storage; the collector comments document which fields are sanitized per vendor. - All timestamps are UTC ISO-8601 strings. The schema was designed for SQLite
(
TEXTcolumns);TIMESTAMPTZ NOT NULLvariants are noted inline for a future Postgres migration. - Collectors never UPDATE or DELETE existing rows for past billing periods.
Idempotency within the current period is achieved by a DELETE + INSERT
transaction on
(vendor, period_start). Historical rows are append-only. - Spend amounts are stored as
REAL/NUMERIC(10,4). Sub-cent rounding is not material for the current use case; this is noted as a revisit trigger if the platform adds metered billing with high precision requirements. - Every access to the billing panel routes through the
console-billing-readRBAC v2 role check before any data is returned.
3. Data Model
3.1 Tables (Raptor — SQLite)
All five tables live in the Raptor (backend_v2) SQLite database, governed by
sequential migrations under backend_v2/db/migrations/.
vendor_billing_snapshots (migration 011)
One row per collector run per vendor per billing period. Collectors INSERT; they never UPDATE or DELETE a row for a past period.
vendor_billing_snapshots
------------------------
id INTEGER PK AUTOINCREMENT
vendor TEXT NOT NULL -- 'heroku' | 'aws' | 'cloudflare' | 'postmark'
period_start TEXT NOT NULL -- YYYY-MM-DD (billing period first day)
period_end TEXT NOT NULL -- YYYY-MM-DD (billing period last day)
fetched_at TEXT NOT NULL -- UTC ISO-8601 timestamp of collector run
current_spend_usd REAL NOT NULL -- spend at fetch time (USD)
projected_spend_usd REAL NULL -- end-of-period projection; NULL if unknown
coverage_type TEXT NOT NULL -- CHECK IN ('api', 'derived', 'fixed')
raw_payload_json TEXT NULL -- sanitized vendor response blob
INDEX idx_vbs_vendor_period ON (vendor, period_start)
coverage_type semantics:
- api — spend fetched directly from a vendor invoice API (Heroku, AWS)
- derived — spend calculated from usage metrics × a pricing table (Cloudflare, Postmark)
- fixed — flat operator-entered amount (unused in this table; used in vendor_billing_fixed)
vendor_billing_fixed (migration 011)
One row per vendor for flat-rate costs entered or computed by the operator.
The fixed_costs.py seed script rebuilds this table from billing_fixed_costs.yaml on each runner pass.
vendor_billing_fixed
--------------------
id INTEGER PK AUTOINCREMENT
vendor TEXT NOT NULL UNIQUE -- e.g. 'github', 'infisical', '1password'
label TEXT NULL -- human-readable display name
monthly_amount_usd REAL NOT NULL -- known flat monthly cost (USD)
note TEXT NULL -- operator note (plan name, tier, etc.)
updated_at TEXT NULL -- UTC ISO-8601 timestamp of last seed run
UNIQUE INDEX idx_vbf_vendor ON (vendor)
Amount resolution priority (applied by fixed_costs.py):
1. annual_total_usd / 12
2. seats * tier_rate_usd
3. monthly_amount_usd (explicit)
4. 0.00 with a [NEEDS OPERATOR INPUT] note prefix
vendor_billing_alert_config (migration 012)
Per-vendor and aggregate spend threshold configuration. The alert engine reads this table to decide when to fire breach emails.
vendor_billing_alert_config
---------------------------
id INTEGER PK AUTOINCREMENT
vendor TEXT NOT NULL UNIQUE -- vendor key or '_aggregate'
threshold_usd NUMERIC(10,4) NULL -- monthly threshold in USD; NULL = disabled
enabled INTEGER NOT NULL -- CHECK IN (0, 1); default 1
UNIQUE INDEX idx_vbac_vendor ON (vendor)
The sentinel _aggregate vendor key represents the total-spend threshold across
all vendors combined. The alert engine evaluates it separately from
per-vendor thresholds.
Default seeds (applied at app boot by alert_config.py):
| vendor | threshold_usd |
|---|---|
_aggregate |
200.00 |
heroku |
50.00 |
aws |
30.00 |
vendor_billing_alert_log (migration 013)
Append-only log of alert emails sent. Used by the alert engine for 24-hour deduplication.
vendor_billing_alert_log
------------------------
id INTEGER PK AUTOINCREMENT
vendor TEXT NOT NULL -- vendor key or '_aggregate'
period_start TEXT NOT NULL -- YYYY-MM-DD billing period start
alert_type TEXT NOT NULL -- CHECK IN ('vendor', 'aggregate')
sent_at TEXT NOT NULL -- UTC ISO-8601 timestamp of send
threshold_pct REAL NOT NULL -- projected/threshold * 100 at send time
INDEX idx_vbal_vendor_period ON (vendor, period_start, sent_at)
console_billing_action_log (console Alembic migration 0004)
Tracks cancel / refund / credit / retention operations against customer accounts.
Lives in the console (Alembic-managed) database, not Raptor. Used by the
@require_permission decorator to enforce a rolling 24-hour ops ceiling and
the pro-plus retention gate.
console_billing_action_log
--------------------------
id TEXT (UUID) PK
admin_id TEXT (UUID) FK → admins.id ON DELETE SET NULL; nullable
action_type TEXT NOT NULL -- CHECK IN ('cancel','refund','credit','retention_attempted')
customer_id TEXT (UUID) nullable
customer_tier TEXT nullable
created_at TIMESTAMPTZ NOT NULL
INDEX on admin_id
INDEX on created_at
3.2 Entity Relationships
vendor_billing_snapshots (one per vendor per period per collector run)
vendor ──────────────────────────────────────────────────────────┐
│ shared key space
vendor_billing_fixed (one per vendor, flat-rate) │
vendor ──────────────────────────────────────────────────────────┘
│
▼
aggregator.py (aggregate_billing)
│
├─── reads vendor_billing_snapshots (latest per vendor per period)
└─── reads vendor_billing_fixed (all rows)
│
▼
aggregated result dict
{ period, vendors[], totals }
│
┌───────────────┴──────────────────────┐
▼ ▼
alert_engine.py console billing panel
reads vendor_billing_alert_config /billing (GET)
writes vendor_billing_alert_log /api/billing/summary
4. Vendor Cost Ingestion — Collectors
All four active collectors share the same write contract: one row to
vendor_billing_snapshots per run per billing period. Within the current
period, a DELETE + INSERT transaction replaces the previous row.
Heroku (heroku_collector.py, issue #760)
- Source:
GET /account/invoiceson the Heroku Platform API v3 coverage_type = 'api'- Locates the current-period invoice by
state = 'pending'|'open' projected_spend_usd = current_spend_usd(Heroku invoices are real-time accumulating)- Auth:
HEROKU_BILLING_API_KEYfrom env (vault path/Raxx/Console/Billing/) - Feature flag:
FLAG_HEROKU_BILLING_COLLECTOR
AWS (aws_collector.py, issue #761)
- Source:
ce:GetCostAndUsage(MTD) +ce:GetCostForecast+cloudwatch:GetMetricStatistics(Lightsail egress) coverage_type = 'api'- Data lag: ~24 hours (Cost Explorer lag); stored in
raw_payload_json["data_lag_hours"] - Rate-capped: skips execution if the most recent
fetched_atforvendor="aws"is under 6 hours old (≤4 API calls/day, capping Cost Explorer charges at ~$1.20/mo) projected_spend_usdfromce:GetCostForecast; stored asNULLwhen AWS returnsDataUnavailableException(normal early in billing cycle)- Egress overage from Lightsail CloudWatch metrics is added to projected spend when
AWS_LIGHTSAIL_INSTANCE_NAMEis set - Auth:
AWS_BILLING_ACCESS_KEY_ID+AWS_BILLING_SECRET_ACCESS_KEYfrom env - Feature flag:
FLAG_AWS_BILLING_COLLECTOR
Cloudflare (cloudflare_collector.py, issue #762)
- Source:
GET /zones/{zone_id}/subscriptionper zone, enumerated frombackend_v2/config/billing_cloudflare_zones.yaml coverage_type = 'derived'(reads plan rates, not an invoice)- Zone IDs in YAML (not hardcoded); operator edits YAML to add/remove domains
- HTTP 404 on a zone = free-tier ($0 contribution); not treated as an error
projected_spend_usd = current_spend_usd(flat subscription)- Auth:
CLOUDFLARE_BILLING_TOKENfrom env; must use the automation token scope, not the DNS-edit token - Feature flag:
FLAG_CLOUDFLARE_BILLING_COLLECTOR
Postmark (postmark_collector.py, issue #763)
- Source:
GET /stats/outboundwith current-month date range coverage_type = 'derived'(email count × pricing tier rate)- Pricing table hardcoded in module; annotated with
pricing_as_ofdate - Tiers:
developer(free, default),basic,pro,platform; tier set viaPOSTMARK_BILLING_TIER - Projection: linear extrapolation —
(current_spend / days_elapsed) * days_in_month - Auth:
POSTMARK_SERVER_TOKENfrom env - Feature flag:
FLAG_POSTMARK_BILLING_COLLECTOR
Fixed costs (fixed_costs.py, issue #764)
Not a network collector. Reads backend_v2/config/billing_fixed_costs.yaml and seeds vendor_billing_fixed. Runs as step 5 in runner.py. Idempotent; re-running rebuilds all rows from YAML.
5. Aggregator Pattern (issue #765)
aggregator.py:aggregate_billing() is the single read entry point for the
billing panel and the alert engine. It never writes.
Read contract:
1. Determine period_start and period_end for the current calendar month.
2. From vendor_billing_snapshots: fetch the single most-recent row per vendor
where period_start matches the current month (max fetched_at per vendor).
3. From vendor_billing_fixed: fetch all rows.
4. Merge into a vendor list; compute totals.
Vendor merging rule: snapshot vendors and fixed vendors are additive. If a vendor appears in both tables, both entries appear in the result (they represent distinct cost components — dynamic usage vs. known flat fee). There is no deduplication by vendor key across the two tables.
has_null_entries: set to true when any vendor_billing_fixed row has
monthly_amount_usd = 0.00 AND the note contains the needs operator input
marker. Distinguishes confirmed-free vendors from unresolved entries.
data_lag_hours: floor-hours since the most-recent fetched_at for
snapshot vendors. null for fixed-cost vendors (no collection timestamp).
Return shape:
{
"period": { "start": "YYYY-MM-DD", "end": "YYYY-MM-DD" },
"vendors": [
{
"vendor": str,
"label": str | null,
"current_spend_usd": float,
"projected_spend_usd": float | null,
"coverage_type": "api" | "derived" | "fixed",
"data_lag_hours": int | null,
"needs_operator_input": bool
}, ...
],
"totals": {
"current_spend_usd": float,
"projected_spend_usd": float,
"tracked_vendor_count": int,
"has_null_entries": bool
}
}
6. RBAC Overlay (issue #766, migration 0014)
Access to billing data is restricted at two layers.
Console routes (/billing, /billing/alert-config) require the
console-billing-read RBAC v2 role, enforced by @require_rbac_role decorator.
RBAC v2 role: console-billing-read
- Permission: console:billing:read
- Default assignment: break-glass group only (default-deny for all other groups)
- Operators explicitly grant it to groups via rbac_group_roles
Feature flag: FLAG_BILLING_SUMMARY_API must also be on for the routes to
serve data. When off the routes return 404 regardless of role.
Field-level visibility: there is currently no column-level access control
within the billing panel — any user with console-billing-read sees all dollar
amounts. A future billing-admin vs. billing-support distinction (where
support sees usage rates but not USD amounts) is not yet implemented. See
Open Questions.
Audit trail: The console_billing_action_log table in the console DB
captures cancel / refund / credit / retention events tied to admin_id. This
is the per-action audit trail for ops that affect customer billing state. It is
distinct from the vendor-spend snapshot tables.
7. Scheduler / Runner
runner.py is the daily cron entry point, invoked by
.github/workflows/billing-collector-cron.yml at 06:00 UTC.
Execution order:
1. Heroku collector
2. AWS collector (rate-cap applies)
3. Cloudflare collector
4. Postmark collector
5. Fixed-costs seed
6. Alert engine (only when FLAG_BILLING_THRESHOLD_ALERTS=true and at least one collector succeeded)
Each step is independently wrapped in try/except. A single collector failure does not abort the remaining collectors. Exit code is 0 if at least one collector succeeded; 1 if all failed.
DB path injected via BILLING_DB_PATH environment variable from the workflow.
8. UI Consumers (issue #768)
The console billing panel at /billing renders vendor cards with:
- Per-vendor: current spend, projected spend, coverage type, data lag indicator
- Aggregate totals: current spend, projected spend, vendor count, null-entries warning
- Alert threshold indicators (when FLAG_BILLING_THRESHOLD_ALERTS is on)
The panel calls /api/billing/summary via client-side JS on mount; that route
calls aggregate_billing() and serializes the result dict.
The alert config page at /billing/alert-config calls /api/billing/alert-config
to read/write vendor_billing_alert_config rows.
Both routes require FLAG_BILLING_SUMMARY_API=true and console-billing-read role.
9. Future Hooks
These are not designed here. They are noted as integration points.
- Stripe collector: a fifth dynamic-collector entry in
runner.py, writing tovendor_billing_snapshotswithvendor='stripe'. Requires a Stripe restricted key withBalance: Readscope in vault. - Customer-facing invoice surface: would introduce new tables
(
customer_invoices,invoice_line_items) in a separate migration set. The current vendor-spend model is an ops-internal cost ledger and does not map directly to customer invoices. - Invoice generation / PDF receipts: a post-aggregation step that would read
the aggregator result and render a document. Would plug in after step 5 in
runner.py. - Chargeback handling: would extend
console_billing_action_logwith achargeback_disputeaction type and link to a disputes table. - Field-level RBAC (billing-support vs. billing-admin): would require a
second role (
console-billing-admin) with a separate permission forconsole:billing:amounts:read;console-billing-readwould be scoped to usage metrics only.
10. Open Questions
These surfaced during consolidation and need a decision before the next billing epic is scoped.
-
Snapshot vs. fixed table split — unify before Stripe?
vendor_billing_snapshotsandvendor_billing_fixedserve different write patterns (append-style collector rows vs. idempotent seed rows) but the aggregator merges them into the same vendor list. Adding a Stripe collector would go into snapshots; that is straightforward. The open question is whether the UI should expose the split (showing "dynamic" vs. "fixed" vendor categories) or flatten it before adding more collectors. Decision affects the dashboard design for the next epic. -
No UNIQUE constraint on
(vendor, period_start)invendor_billing_snapshots. Migration 011 does not enforce uniqueness at the DB level on that pair. Idempotency is enforced in application code (DELETE + INSERT transaction per collector). A future concurrency bug or direct DB write could produce duplicate rows that confuse the aggregator'sMAX(fetched_at)query. Should a UNIQUE constraint be added in a follow-on migration, or is the application- level guard sufficient? -
Postmark pricing table is hardcoded in source.
postmark_collector.pycontains apricing_as_ofdate and a tier dict. If Postmark reprices, an operator must update the source file and redeploy. Is a YAML-backed pricing config (similar tobilling_fixed_costs.yaml) preferred for operator self-service, or is the code-level constant acceptable given infrequent repricing? -
AWS collector is in
runner.pyas a stub comment referencing #761, but the moduleaws_collector.pyis fully implemented. The runner's step 2 reads# TODO(#761): replace this stub...but the collector code exists. This is a code discrepancy: the runner does not callcollect_aws_billing(). Was the integration step missed, or is it gated pending a specific infra prerequisite? -
Field-level billing RBAC not yet implemented. Currently anyone with
console-billing-readsees all dollar amounts. Is abilling-supportrole (usage rates only, no USD amounts) needed before support team access is granted, or is the default-deny posture (only break- glass has the role today) sufficient for now? -
console_billing_action_logvs. vendor-spend tables — are they linked? The action log (cancel/refund/credit events) lives in the console Alembic DB while vendor-spend tables live in the Raptor SQLite DB. There is no FK or join across them. If a future billing reconciliation feature needs to correlate refunds with monthly vendor costs, a cross-DB reporting pattern or a data export step would be required.
11. Security Checklist
- PII collected: none in the billing snapshot tables.
console_billing_action_logstorescustomer_id(UUID) andcustomer_tier(string); no name, email, or financial detail. - Retention: snapshot rows are not purged; there is no TTL policy in the current schema. If vendor API responses in
raw_payload_jsoncontain account-level metadata that is considered PII, a retention policy should be scoped. - Deletion on DSR: no current mechanism to purge billing rows on a customer data subject request. Relevant only if
customer_idinconsole_billing_action_logis scoped to a data subject. - Audit log: every billing action against a customer account is written to
console_billing_action_logwithadmin_id,action_type, andcreated_at. This is the money-state audit trail. - No stored credentials: confirmed. All vendor API keys are injected via env/vault at runtime.
raw_payload_jsonfields that could carry auth material are stripped by each collector before storage. - Breach notification: no automated breach notification is wired into the billing data path. If
raw_payload_jsonwere to contain sensitive material that was exposed, the existing GDPR breach-notification automation (separate from this module) would apply. - Secrets location:
HEROKU_BILLING_API_KEY,CLOUDFLARE_BILLING_TOKEN,POSTMARK_SERVER_TOKEN,AWS_BILLING_ACCESS_KEY_ID,AWS_BILLING_SECRET_ACCESS_KEYall live in Infisical vault under/Raxx/Console/Billing/. All are rotatable without schema changes. - Kill-switch:
FLAG_BILLING_SUMMARY_APIgates all billing panel routes. Individual collector flags (FLAG_HEROKU_BILLING_COLLECTOR, etc.) gate each collector independently. Setting any flag tofalseviaheroku config:setstops data collection without a deploy.