Raxx · internal docs


Velvet v2 Infrastructure: Cost + Compliance Analysis

Issue #908 — Pre-provisioning briefing for Kristerpher

Status: research-only. This document does NOT constitute legal or tax advice. Before acting on DPA, data-classification, or vendor-contract questions, consult an attorney licensed in your operating jurisdiction and a CPA for expense treatment. Last updated: 2026-05-04. Sources as of that date — verify freshness before committing.


TL;DR

The two-app / two-Postgres layout proposed in #908 costs ~$150/mo (not $200) because Standard-1X dynos are $25 each, not $50. The bigger issue surfaced by this research is Heroku's corporate posture: Salesforce froze enterprise Heroku feature development in February 2026 and ended new enterprise sales — the platform is in a sustaining-engineering phase with no published EOL date but clear sunset signals. For a service that sits on the critical rotation path, that vendor risk deserves an explicit architectural decision now, before any deploy automation is wired in. The recommended architecture (Section 5) is the two-app Heroku layout but with a 12-month migration checkpoint to AWS-native if Heroku announces EOL or drops Standard tier support.


1. Cost Breakdown — Confirmed Numbers

1a. Heroku rate card (confirmed May 2026)

Resource | Unit price | Source
Standard-1X dyno | $25.00/mo | heroku.com/pricing
Eco dyno | $5.00/mo (shared pool, sleeps) | heroku.com/pricing
Heroku Postgres Essential-0 | $5.00/mo (1 GB, 20 conn, no rollback) | elements.heroku.com/addons/heroku-postgresql
Heroku Postgres Essential-2 | $20.00/mo (32 GB, 40 conn, no rollback) | elements.heroku.com/addons/heroku-postgresql
Heroku Postgres Standard-0 | $50.00/mo (4 GB RAM, 64 GB disk, 200 conn, 4-day rollback, dedicated) | elements.heroku.com/addons/heroku-postgresql
Heroku Postgres Standard-2 | $200.00/mo (8 GB RAM, 256 GB disk, 500 conn) | elements.heroku.com/addons/heroku-postgresql

1b. #908 as-proposed: two Standard-1X + two Standard-0 Postgres

Line item | Monthly | Notes
raxx-velvet-prod dyno (Standard-1X) | $25.00 | Always-on, 512 MB RAM
raxx-velvet-staging dyno (Standard-1X) | $25.00 | Always-on
raxx-velvet-prod Postgres (Standard-0) | $50.00 | 4 GB RAM, 64 GB disk, 4-day rollback
raxx-velvet-staging Postgres (Standard-0) | $50.00 | Same tier; see Section 3b for staging tier discussion
Total new spend | $150.00/mo |
Existing Heroku baseline | ~$80.00/mo | raxx-console + raxx-api (prod + staging each)
Total Heroku after #908 | ~$230.00/mo |

Your estimate was $200/mo; actual new spend is $150/mo. The error: Standard-1X is $25, not $50. No $50 Standard-1X exists in the current rate card; $50/mo is the Standard-2X price, and the next tier above that is Performance-M at $250/mo.

1c. Projections

Horizon | New Velvet spend | Total Heroku (incl. baseline)
12 months | $1,800 | ~$2,760
36 months | $5,400 | ~$8,280

These projections assume no dyno scaling, no staging-tier downgrade, and no Heroku price changes. Vendor risk (Section 3a) makes the 36-month figure speculative: Velvet may not run on Heroku for that full window.
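The projection arithmetic can be reproduced with a short Python sketch. The line items and the ~$80/mo baseline come from Section 1b; the function name `project` and the flat-price assumption are illustrative only and not part of any Raxx tooling.

```python
# Monthly line items from Section 1b (USD).
NEW_VELVET_SPEND = {
    "raxx-velvet-prod dyno (Standard-1X)": 25.00,
    "raxx-velvet-staging dyno (Standard-1X)": 25.00,
    "raxx-velvet-prod Postgres (Standard-0)": 50.00,
    "raxx-velvet-staging Postgres (Standard-0)": 50.00,
}
EXISTING_BASELINE = 80.00  # approximate: raxx-console + raxx-api


def project(months: int) -> tuple[float, float]:
    """Return (new Velvet spend, total Heroku spend) over `months` at flat prices."""
    monthly_new = sum(NEW_VELVET_SPEND.values())
    return monthly_new * months, (monthly_new + EXISTING_BASELINE) * months


for horizon in (12, 36):
    new_spend, total = project(horizon)
    print(f"{horizon} months: new=${new_spend:,.0f} total=~${total:,.0f}")
```

Changing a single entry in `NEW_VELVET_SPEND` (for example, a cheaper staging Postgres tier) shows how sensitive the totals are to each line item.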


2. Alternative Architectures

(a) Single-instance Velvet — one app + one Postgres

What you lose: The staging environment for Velvet itself. Rotation flows for staging tokens (e.g., raxx-api-staging Heroku config vars) would run against prod Velvet. This means a bug in a new adapter (e.g., the GitHub Actions adapter in B6) that causes an unhandled exception crashes prod Velvet mid-rotation, leaving a rotation job in minting or distributing state with no clean recovery path in a non-prod lane.

Given the v2 design's three-stage state machine and the explicit design principle that testing-flow rotations run against throwaway tokens, the staging Velvet instance exists primarily to validate the service-bus fan-out and consumer adapters before they touch prod credentials. Collapsing to single-instance eliminates that safety lane entirely.

Cost diff:

Item | Monthly savings vs. proposed
Drop raxx-velvet-staging dyno | -$25.00
Drop staging Postgres (Standard-0) | -$50.00
Net | -$75.00/mo ($75/mo instead of $150/mo)

Verdict: Viable only if Velvet's flow_type=testing path is considered sufficient as a safety lane, and if all adapter development happens in a feature branch that runs against a local Postgres in CI. The ops risk is non-trivial: the 2026-05-02 incident (rot_2b1e) happened precisely because there was no staging rehearsal path to catch the old-token auth-probe gap. Losing the staging instance trades $75/mo in savings for the same class of incident risk.


(b) Eco / Mini tier dynos — does cold-start kill the use case?

The Eco dyno ($5/mo) sleeps after 30 minutes of inactivity. Wake time is typically 5–15 seconds but can reach 30+ seconds under cold JVM/Python startup.

Velvet's use pattern: rotation jobs are operator-initiated, not ambient. Between rotations the dyno may be idle for hours or days. A sleeping Velvet means:

Verdict: Eco tier is architecturally incompatible with Velvet's role as an operator-critical service. Issue #908 already documents this risk and specifies Standard-1X as the minimum; this research confirms that choice.

The saving (about $40/mo: two Eco dynos at $5 each versus $50/mo for two Standard-1X dynos) does not justify the operational risk.


(c) Co-locate Velvet on console-prod (no separate Heroku apps)

Velvet as a Flask blueprint registered in the existing raxx-console-prod app.

Cost: $0 additional. Velvet runs on the existing console dyno.

Risks:

  1. Every console deploy restarts Velvet. Velvet holds in-memory state during active rotation jobs (the SSE stream to the console UI, the fan-out worker pool). A dyno restart mid-rotation drops all in-progress distribute/validate operations. The operator sees a hung modal. Recovery requires manually querying the Postgres rotation_jobs table to determine which consumers received the new token and which did not. This is exactly the rot_2b1e recovery scenario.

  2. Console blast radius. A Velvet bug (e.g., an unhandled exception in the GitHub Actions NaCl adapter) that crashes the worker also takes down the console UI. An operator trying to investigate a status incident can't reach the console because Velvet's rotate endpoint threw a 500.

  3. RBAC separation. The v2 design (invariant I3, ADR-0037) requires that Velvet's service-token auth is separate from the console's session auth. Co-location doesn't prevent this technically but it increases the surface area for auth middleware to accidentally bleed scope — a rotation-scoped service token issued to a CI job could accidentally hit a console-only endpoint on the same process.

  4. Postgres schema isolation. Co-location implies sharing the console's Postgres, which means rotation_jobs and rotation_job_consumers tables live in the same database as console session data. Rotation audit records (operator_id, credential_name, timestamps) and console session data become a joint recovery unit — a console Postgres failover takes out rotation history at the same time.

Verdict: Not viable for a production rotation service. The restart-mid-rotation risk alone rules it out. The $150/mo separation cost is the correct expense for the isolation this use case requires.
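The manual recovery step described in risk 1 (working out which consumers already hold the new token) might look roughly like the query below. This is a sketch only: the column names `job_id`, `consumer_id`, and `distribute_status`, the status values, and the job ID `rot_example` are all hypothetical, and SQLite stands in for Postgres so the example is self-contained.

```python
import sqlite3

# Hypothetical schema: column names and status values are illustrative only.
SCHEMA = """
CREATE TABLE rotation_job_consumers (
    job_id TEXT NOT NULL,
    consumer_id TEXT NOT NULL,
    distribute_status TEXT NOT NULL
);
"""


def split_consumers(conn, job_id):
    """Return (received, pending) consumer IDs for one rotation job."""
    rows = conn.execute(
        "SELECT consumer_id, distribute_status FROM rotation_job_consumers "
        "WHERE job_id = ? ORDER BY consumer_id",
        (job_id,),
    ).fetchall()
    received = [c for c, status in rows if status == "done"]
    pending = [c for c, status in rows if status != "done"]
    return received, pending


conn = sqlite3.connect(":memory:")  # stand-in for the real Postgres database
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO rotation_job_consumers VALUES (?, ?, ?)",
    [
        ("rot_example", "consumer-a", "done"),
        ("rot_example", "consumer-b", "pending"),
        ("rot_example", "consumer-c", "done"),
    ],
)
print(split_consumers(conn, "rot_example"))
```

Because this state lives in Postgres rather than in dyno memory, the split survives a restart. What a restart loses is the in-flight work, which is why the operator has to reconcile by hand.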


(d) AWS-native: ECS Fargate + RDS PostgreSQL

Raxx already has AWS account 521228113048. SSM Parameter Store is already in use for AWS-resident workloads.

Pricing (us-east-1, May 2026):

Resource | Config | Monthly estimate
Fargate task (prod) | 0.25 vCPU, 0.5 GB RAM, ~730 hrs/mo | ~$7.39 compute + ~$1.62 memory = ~$9/mo
Fargate task (staging) | Same | ~$9/mo
RDS PostgreSQL t3.micro (prod) | 1 GB RAM, 2 vCPU, 20 GB gp3 storage | ~$22/mo instance + ~$2.30/mo storage = ~$24/mo
RDS PostgreSQL t3.micro (staging) | Same | ~$24/mo
ALB (Application Load Balancer, prod) | ~$16/mo base + LCU charges | ~$18/mo
ALB (staging) | Same | ~$18/mo
ECR (container registry) | Minimal at this scale | ~$1/mo
CloudWatch logs | Minimal | ~$2/mo
Total AWS-native | | ~$105/mo

Notes on these estimates:

  - Fargate pricing: $0.04048/vCPU-hr + $0.004445/GB-hr (us-east-1, Linux/x86). Source: aws.amazon.com/fargate/pricing/
  - RDS t3.micro: ~$0.018/hr on-demand = ~$13/mo; t3.small ~$0.036/hr = ~$26/mo. Source: instances.vantage.sh/aws/rds/db.t3.micro. Storage: 20 GB gp3 at $0.115/GB-mo = $2.30/mo.
  - ALB: $0.008/LCU-hr + $0.0225/ALB-hr base = ~$16-18/mo at very low traffic.
  - RDS t3.micro does not include read replicas or Multi-AZ. Multi-AZ doubles the RDS cost (~$26/mo → ~$52/mo per instance).
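The per-hour rates quoted in these notes convert to monthly figures as sketched below. The rates are the ones stated above; the 730-hour month matches the table, while the function names and the idea of passing an average LCU count are assumptions made for illustration.

```python
HOURS_PER_MONTH = 730

# Rates quoted in the notes above (us-east-1, on-demand).
FARGATE_VCPU_HR = 0.04048
FARGATE_GB_HR = 0.004445
ALB_BASE_HR = 0.0225
ALB_LCU_HR = 0.008
GP3_GB_MONTH = 0.115


def fargate_monthly(vcpu: float, memory_gb: float) -> float:
    """Monthly cost of one always-on Fargate task."""
    return (vcpu * FARGATE_VCPU_HR + memory_gb * FARGATE_GB_HR) * HOURS_PER_MONTH


def alb_monthly(avg_lcus: float) -> float:
    """Monthly ALB cost: fixed hourly base plus average LCU consumption."""
    return (ALB_BASE_HR + avg_lcus * ALB_LCU_HR) * HOURS_PER_MONTH


def gp3_monthly(storage_gb: float) -> float:
    """Monthly gp3 storage cost for an RDS volume."""
    return storage_gb * GP3_GB_MONTH


print(f"Fargate 0.25 vCPU / 0.5 GB: ${fargate_monthly(0.25, 0.5):.2f}/mo")
print(f"ALB at 0 LCU (base only):   ${alb_monthly(0):.2f}/mo")
print(f"gp3 20 GB:                  ${gp3_monthly(20):.2f}/mo")
```

The ALB figure depends on actual traffic, so `avg_lcus` should come from measurement rather than from this estimate.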

AWS-native vs. Heroku comparison:

Dimension | Heroku (proposed) | AWS-native
Monthly cost | $150/mo | ~$105/mo (no Multi-AZ) / ~$157/mo (with Multi-AZ)
Setup time | ~2 hrs (CLI + Procfile) | ~8-16 hrs (ECS task defs, VPC, ALB, IAM, RDS)
Ops complexity | Very low | Moderate-to-high
Existing IAM | N/A | claude-infisical-bootstrap only (#334 pending)
SSM integration | Via env vars on dyno | Native (same VPC)
Vendor risk | High (see Section 3a) | Low (AWS not going anywhere)
CF Access pattern | Existing console pattern reusable | Requires separate WAF/ALB integration
RDS backup default | Automated 7-day (can extend to 35 days) | Same

Verdict (this option): AWS-native costs about the same as Heroku once you add Multi-AZ (which you'd want on a rotation service). The setup cost is 4-8x higher and the IAM posture in account 521228113048 is not yet production-hardened (#334 is pending). AWS-native is the correct long-term destination if Heroku announces EOL — plan the architecture to support a migration, but don't build it today.


(e) Fly.io / Railway / Render

Fly.io:

Railway:

Render:

Verdict (alternatives): None of these is a meaningful improvement over the Heroku proposal for Velvet's specific profile — operator-triggered, low concurrency, needs reliable uptime, needs audit-grade Postgres, needs CF Access integration, needs to fit the existing Raxx ops pattern. Fly.io is the most viable alternative but requires DPA investigation and new ops tooling. AWS-native is the more natural migration target given existing infrastructure.


3. Vendor Risk + Lock-in

3a. Heroku corporate posture — material risk assessment

What happened (February 2026):

What is still available:

  - Credit-card (pay-as-you-go) customers can continue provisioning apps and add-ons, including Standard dynos and Standard-0 Postgres, as of May 2026.
  - No published EOL date for Standard/Eco tiers.
  - Enterprise contracts (not applicable to Raxx's current posture) are no longer sold.

Risk assessment for Velvet:

Risk | Likelihood | Impact | Notes
Heroku announces Standard-tier EOL in 12 months | Medium | Critical | Migration to AWS ~2-4 weeks of eng work
Heroku raises Standard-0 Postgres pricing | Medium | Low-Medium | Still competitive at 2x current price
Heroku has an extended outage (>4 hrs) during a rotation | Low-Medium | High | Velvet design has abort/resume; actual impact is delayed rotation, not data loss
Heroku is fully shut down with <6 months notice | Low | Critical | Historical precedent: Salesforce gave 18 months notice on free tier EOL

Lock-in posture: Velvet's Heroku dependency is shallow — the app itself is a portable Flask container; Postgres schema is standard SQL; the only Heroku-specific items are Procfile, heroku config:set, and the add-on attachment. A migration to ECS+RDS or Fly.io is a sprint-level task, not a quarter-level task, if the architecture keeps the app 12-factor clean.

Recommendation: Design the #908 scaffold explicitly for portability. Do not use Heroku-specific Postgres features (no Heroku-specific connection pooling extensions, no Heroku-only backup APIs). Use standard DATABASE_URL env var. Document the migration path in the runbook (B11).
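A minimal sketch of the "DATABASE_URL only" rule follows. It assumes Velvet connects through SQLAlchemy (the doc mentions Alembic, but the exact driver setup is not confirmed here); the function name `database_settings` and the returned keys are hypothetical. The one scheme rewrite is needed because Heroku issues `postgres://` URLs, while SQLAlchemy 1.4 and later accept only `postgresql://`.

```python
import os
from urllib.parse import urlsplit


def database_settings(env=os.environ) -> dict:
    """Read connection settings from DATABASE_URL only (12-factor style)."""
    raw = env.get("DATABASE_URL")
    if not raw:
        raise RuntimeError("DATABASE_URL is not set")
    # Heroku hands out postgres:// URLs; SQLAlchemy 1.4+ accepts only postgresql://.
    if raw.startswith("postgres://"):
        raw = "postgresql://" + raw[len("postgres://"):]
    parts = urlsplit(raw)
    if parts.scheme != "postgresql":
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    return {
        "url": raw,
        "host": parts.hostname,
        "port": parts.port or 5432,
        "database": parts.path.lstrip("/"),
    }
```

Nothing in this function refers to Heroku add-on names or APIs, so the same code would read an RDS or Fly.io connection string unchanged.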


3b. Standard-0 vs Essential tier — durability + Velvet threat model

Feature | Essential-0 ($5/mo) | Essential-2 ($20/mo) | Standard-0 ($50/mo)
Storage | 1 GB | 32 GB | 64 GB
Connections | 20 | 40 | 200
Dedicated server | No | No | Yes
Fork / Follow | No | No | Yes
Point-in-time rollback | No | No | Yes (4-day window)
Logical backups (PGBackups) | 7 daily, 1 weekly | Same | Same + fork/follow
Downtime tolerance (SLA basis) | <4 hrs/mo | <4 hrs/mo | <1 hr/mo

Does Velvet's threat model demand Standard-0?

The rotation_jobs and rotation_job_consumers tables are an audit log and state machine, not application data. The key question is: what happens if that Postgres is unavailable or corrupt?

Verdict on Standard-0 for staging: The staging Postgres does not require Standard-0. Essential-2 ($20/mo) provides sufficient storage and connections for a staging rotation service. The rollback window (4-day PITR) is a prod concern, not a staging concern.

Revised cost with Essential-2 for staging:

Item | Monthly
raxx-velvet-prod dyno (Standard-1X) | $25.00
raxx-velvet-staging dyno (Standard-1X) | $25.00
raxx-velvet-prod Postgres (Standard-0) | $50.00
raxx-velvet-staging Postgres (Essential-2) | $20.00
Total new spend | $120.00/mo
Total Heroku after #908 (revised) | ~$200.00/mo

Savings: $30/mo ($360/yr) vs. dual Standard-0.


4. Compliance + Data Classification

4a. What Velvet's Postgres stores

Per the v2 design doc (docs/architecture/velvet/v2-rotation-flows.md, Section 9):

Data field | Classification | Personal data?
rotation_jobs.id (UUID) | Job identifier | No
rotation_jobs.credential_name | Token taxonomy name (e.g., HK_PLATFORM_FULL) | No; refers to a system credential, not a person
rotation_jobs.operator_id | Console account identifier; design doc says "opaque UUID" | Potentially yes; see below
rotation_jobs.new_token_hash / old_token_hash | SHA-256 hex of credential value | No; hashes are not credential values
rotation_jobs.error_message | Error detail, no credential values per invariant I1 | No
rotation_job_consumers.* | Consumer IDs, distribute/validate status, HTTP status codes | No
Audit timestamps (UTC) | All *_at fields | No, by themselves
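As an illustration of the new_token_hash / old_token_hash fields, here is a minimal sketch of how a SHA-256 hex digest could be computed and compared. The helper names are hypothetical, and the v2 design doc remains the authority on how Velvet actually derives these values.

```python
import hashlib
import hmac


def token_hash(token: str) -> str:
    """SHA-256 hex digest of a credential value, suitable for an audit row."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def matches(token: str, stored_hash: str) -> bool:
    """Check a candidate token against a stored hash in constant time."""
    return hmac.compare_digest(token_hash(token), stored_hash)
```

Storing only the digest lets the audit log show that a specific credential value was rotated without the row ever containing that value.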

The operator_id question:

The v2 design calls operator_id an "opaque UUID" — but its value comes from the console's authentication system. If the console maps operator_id to a real user account (email address, name), then operator_id in a rotation_jobs row is a pseudonymous personal data point under GDPR Article 4(1): it can be re-identified by cross-referencing the console's user table.

Whether operator_id constitutes personal data depends on:

  1. Whether Velvet can, in the context of Raxx's operations, reasonably re-identify the individual from that UUID. If Raxx is the controller and maintains a user table mapping UUID → email, the answer is almost certainly yes.

  2. Whether the operator_id is a Raxx internal identity (a UUID Raxx assigns) or a third-party identity (e.g., a GitHub user ID). If GitHub-linked, it may be considered personal data.

Consult an attorney on whether operator_id as stored constitutes personal data under GDPR (if Raxx processes any EU-resident operator data) or CCPA (if Raxx later employs operators who are California residents). This is a question for a privacy attorney, not a technical decision.

CCPA applicability today: CCPA applies to a for-profit business that meets at least one threshold: annual gross revenue above $25 million (adjusted periodically for inflation), buying, selling, or sharing the personal information of 100,000+ California consumers or households, OR deriving 50%+ of annual revenue from selling or sharing personal information. Raxx is pre-launch with no paying customers, so none of these thresholds is met today. Revisit this if Raxx approaches a threshold or onboards employees or contractors whose personal data Velvet's audit log records. Source: California Civil Code §1798.140(d).

4b. Heroku as a processor — DPA implications

Current posture (no paying customers): Heroku is a processor of operator data, with Raxx as the controller. The Salesforce DPA (April 2026 revision) covers Heroku and is available at:

https://www.salesforce.com/content/dam/web/en_us/www/documents/legal/Agreements/data-processing-addendum.pdf

Heroku's GDPR guide (devcenter.heroku.com/articles/gdpr) confirms Heroku's role as a processor under GDPR — they process data only on customer (controller) instructions.

When Raxx takes paying customers: If Velvet's audit log contains operator_id values that are personal data (see 4a above), and those operators are EU-resident or California-resident users, Raxx as a controller would need:

  1. A DPA with Heroku. The Salesforce DPA covers this; no custom negotiation is required for standard pay-as-you-go accounts, because customers "accept" it by accepting the Heroku ToS.

  2. A Record of Processing Activities (RoPA) entry for Velvet's audit log, covering the data category, purpose (security audit trail), retention period (2 years per v2 design), and processor (Heroku/Salesforce).

Consult an attorney on whether Raxx's current Heroku ToS acceptance is sufficient DPA coverage under GDPR Article 28, or whether a separately executed DPA with Salesforce is advisable given the credential-management nature of Velvet's data.

4c. Tax treatment of Heroku spend

Heroku (operated by Salesforce, Inc., a US corporation) is a domestic vendor. The $150/mo Velvet infrastructure spend is a business operating expense in the software/services category.

Pre-incorporation (Raxx operating as an individual or informally): the expense is deductible as a Schedule C business expense if Raxx is a sole proprietorship, or as an ordinary business expense for any entity that exists at the time of payment.

Consult your CPA on: (1) whether pre-incorporation infrastructure expenses are deductible and what documentation is required; (2) whether any Heroku costs need to be capitalized rather than expensed if they are directly attributable to building a product vs. ongoing operations; (3) how to account for vendor migration costs (AWS transition) if they occur.

No unusual tax treatment is flagged by this research. Heroku is a straightforward US vendor expense. No sales tax implications visible at this scale (confirm with CPA for California nexus).


5. Recommendation

Pick: Heroku Standard-1X dynos + Standard-0 prod Postgres + Essential-2 staging Postgres, with a 12-month Heroku EOL checkpoint.

Why

  1. Lowest immediate eng cost. The Heroku scaffold (#908) reuses the exact pattern already running for raxx-console and raxx-api. The CF Access policy, heroku config:set pattern, Procfile, and ops tooling are already understood. A net-new ECS+RDS deployment would cost 8-16 hours of infrastructure work before any Velvet code ships — that's time better spent on B2-B5 (the actual rotation logic).

  2. Cost is confirmed lower than estimated. $150/mo ($120/mo with Essential-2 staging) is a defensible spend for a service that replaces manual, error-prone credential management. The 2026-05-02 incident (rot_2b1e) required manual hand-sync across four Heroku apps — that's ops time that costs more than $150/mo if it recurs.

  3. Portability-first scaffold. Design #908 to be 12-factor portable: DATABASE_URL only, no Heroku-specific add-on APIs, standard Alembic migrations, no heroku CLI in the dyno (per acceptance criteria). A future migration to ECS+RDS or Fly.io is a 1-sprint task, not a re-architecture.

  4. Vendor risk is manageable at this stage. Heroku has not announced a Standard-tier EOL. The sustaining-engineering mode means stability without new features — acceptable for a scaffold that Raxx controls entirely. Set a calendar checkpoint at 2027-05-01 (12 months) to re-evaluate. If Heroku announces Standard-tier EOL before then, treat migration as a P0 sprint.

  5. Standard-0 prod Postgres is the right call. The 4-day PITR window justifies the cost delta over Essential tier. Losing rotation audit logs due to a Postgres failure on a credential-management service would be a material ops incident.

  6. Essential-2 for staging Postgres. Staging does not need PITR. The $30/mo saving across 12 months is $360 — worth capturing.

Followup decisions required before #908 ships

Decision | Owner | Deadline
Confirm Velvet staging can run against Essential-2 Postgres | Kristerpher (ops) | Before provisioning
Confirm operator_id is a Raxx-internal opaque UUID (not email, not GitHub ID) | Architect | Before B2 schema merge
Review Salesforce DPA: confirm standard ToS acceptance is sufficient for current ops posture | Attorney | Before Velvet processes any personal data
Document Heroku EOL checkpoint in ops calendar | Kristerpher | With #908 merge
Set up Velvet migration runbook stub (B11) with AWS-native as the documented fallback | Eng | With B11 card

6. Questions for Kristerpher

For attorney

  1. DPA sufficiency: Does accepting Heroku/Salesforce's standard Terms of Service (which incorporate the Salesforce DPA by reference) constitute a valid Article 28 GDPR data processing agreement for Raxx's use case, given that Velvet's Postgres may store pseudonymous personal data (operator_id)?

  2. operator_id classification: If operator_id in rotation_jobs is a Raxx-internal UUID that maps to a named employee or contractor in a separate system, does it constitute personal data under GDPR or CCPA? What documentation is needed to justify the 2-year retention period?

  3. Vendor lock-in risk (contractual): Raxx has no formal contract with Heroku beyond click-through ToS. If Heroku announces EOL with less than 90 days notice, what obligations (if any) does Salesforce have to provide data export and migration support? Is the Salesforce DPA + ToS sufficient, or should Raxx negotiate a service continuity addendum?

  4. IP assignment for Velvet code: If any Velvet development is done by contractors (not employees), are the IP assignment clauses in place? Velvet's rotation logic constitutes proprietary business logic, not just infrastructure. This is an existing open question in docs/business/ip-assignment.md.

For CPA

  1. Expense deductibility (pre-incorporation): Heroku spend is currently paid by the operator personally. Is the $150/mo Velvet infrastructure spend deductible as a startup expense, and what documentation is needed to support that deduction?

  2. Capitalization vs. expensing: Velvet is being built as a component of Raxx, a pre-revenue product. Are the Heroku infrastructure costs (and the engineering time) capitalizable under Section 197 / ASC 350-40, or are they currently expensable as R&D? This changes when Raxx incorporates and sets accounting policy.

  3. Vendor migration cost treatment: If Raxx migrates from Heroku to AWS in 12-18 months, are the migration costs (eng time, parallel-run overlap costs) capitalizable or expensed in the period incurred?

  4. Sales tax: Does Heroku charge sales tax on Standard-1X dynos in California? Is that expense properly accounted for?


Sources