ADR 0063 — Scale Tier Latency Budget + Numeric Upgrade Triggers for Shadow-Table Writes
Status: Accepted
Date: 2026-05-09 UTC
Deciders: operator (Kristerpher), software-architect
Refs: customer-audit-unified/design.md §9, migration-plan.md §trigger-to-async, ADR-0059
Context
ADR-0059 chose synchronous Postgres triggers over application dual-write or WAL CDC for shadow table writes. The trade-off is trigger overhead: each INSERT/UPDATE/DELETE on a customer-touching primary table fires a trigger that synchronously inserts into the corresponding *_history table.
At pre-launch volume (hundreds of mutations per day), this overhead is negligible. At 10K customers with market-open burst write patterns, it becomes a measurable latency contribution. ADR-0059 noted "add WAL CDC as a scale-trigger upgrade path documented in a follow-up ADR." This is that ADR.
The operator decision is: start synchronous, watch one metric, upgrade on a numeric threshold — not on a subjective judgment call. Upgrade criteria must be specific enough that the team does not need to redesign under pressure when the threshold trips.
Decision
Three-tier upgrade path with explicit numeric thresholds:
Tier 1 — Synchronous triggers (v1 through ~1K customers)
Postgres AFTER triggers on customers, subscriptions, positions, trades, sessions, passkeys. Shadow INSERT fires within the same transaction as the primary mutation.
Metric to watch: audit.trigger.insert_p99_ms — Sentry custom measurement on the p99 INSERT latency of the trades table (the highest-volume table). Measured by the application, recorded weekly via sentry.set_measurement.
Capacity at this tier:
| Metric | v1 (100 customers) | 1K customers |
|---|---|---|
| Sustained writes/sec | 0.01 | 0.14 |
| Burst writes/sec (market open) | 0.5 | 5 |
| Estimated trigger overhead per trade INSERT | ~0.3–0.5 ms | ~0.3–0.5 ms |
| Estimated p99 INSERT latency (trades table) | < 5 ms | < 8 ms |
Tier 2 — Async shadow writes via pg_notify + background worker
Upgrade condition: p99 INSERT latency on trades table exceeds 50 ms sustained for 7 consecutive days.
The "7 consecutive days" window prevents reacting to transient spikes (hot Heroku dyno, network jitter). The 50 ms threshold is chosen such that the user experience of trade submission is not measurably degraded (typical perceived UI latency threshold is 100–200 ms; 50 ms leaves headroom).
What changes:
- Triggers are replaced with NOTIFY calls: NOTIFY audit_shadow_channel, '<json_payload>'.
- Background worker (jobs/shadow_writer.py) subscribes via LISTEN audit_shadow_channel. Inserts rows into *_history tables asynchronously.
- Lag monitoring: queue depth measured as the worker's processed_at - notified_at delta. Alert if depth > 1,000 items (approximately 1 minute of sustained 10K-customer burst).
- During cutover: brief dual-path window (both trigger + worker) to ensure no rows are dropped.
Trade-off accepted: shadow tables lag the primary table by up to ~100–500 ms under load. This is documented in runbooks: "shadow tables have eventual-consistency lag under high write load."
Capacity at this tier:
| Metric | 1K customers | 10K customers |
|---|---|---|
| Shadow write overhead on primary INSERT | ~0.05 ms (notify only) | ~0.05 ms |
| Worker throughput | 1K rows/sec (single worker) | 5K rows/sec (3 workers) |
| Max lag under burst | ~0.5s | ~2s |
Tier 3 — WAL Change Data Capture (CDC)
Upgrade condition: customer_audit_events exceeds 100 million rows OR 10,000 active customers (first hit).
At this scale, even async worker overhead from pg_notify may not keep pace with write bursts. Postgres logical replication (WAL CDC) is the appropriate mechanism.
What changes:
- Create a Postgres logical replication slot.
- CDC consumer process (e.g., Debezium or a custom consumer) subscribes to the WAL stream.
- Shadow rows are written by the CDC consumer; triggers are removed entirely.
- customer_audit_events remains writer-endpoint-based (no change; CDC is only for shadow tables).
Trade-off accepted: WAL CDC adds a new infrastructure component (replication slot, consumer process, monitoring). This is justified at 10K customers; it is not justified pre-launch.
Decision criteria to promote: when either numeric threshold is met, the team has a documented 1–2 week path (not a redesign) to Tier 3. The schema remains compatible — shadow tables do not change, only the write mechanism changes.
Partitioning Thresholds
| Threshold | Action |
|---|---|
customer_audit_events reaches 1 million rows |
Declare TimescaleDB hypertable with weekly chunks on at_utc |
customer_audit_events reaches 100 million rows |
Enable hypertable compression on chunks > 90 days. Review chunk interval. |
Any single *_history table reaches 1 million rows |
Add same TimescaleDB hypertable treatment |
Partitioning is declarative: the threshold triggers a schema migration (Alembic), not a code change.
Consequences
Positive
- The team has a clear, pre-agreed contract for when to upgrade. No subjective judgment required under pressure.
- The metric (
audit.trigger.insert_p99_ms) is measured from Day 1, so the trend is visible before the threshold is hit. - The two-step upgrade path (async before CDC) avoids premature infrastructure complexity. CDC is a meaningful operational commitment; 50 ms trigger latency is the right signal for the first upgrade, not as an anticipatory measure.
- Capacity numbers are documented in the design doc rather than requiring rediscovery at upgrade time.
Negative
- The 50 ms threshold is an engineering estimate, not a load-tested measurement. It could be wrong. Mitigation: the Sentry metric captures actual data from Day 1; if the threshold appears to be too high or too low after observing real data, it can be adjusted before being hit in anger.
- The async Tier 2 path introduces eventual consistency in shadow tables. Features that depend on shadow table reads (e.g., a "what did this row look like at time T?" forensic query) must document that they may see stale data up to the lag window.
- WAL CDC (Tier 3) adds Heroku Postgres configuration complexity (replication slots have their own management overhead). The team must monitor replication lag.
Alternatives Considered
Upgrade on subjective judgment ("when it feels slow")
No numeric threshold; team upgrades when query performance becomes a concern.
Rejected: "when it feels slow" is undefined and creates debate under pressure. Numeric thresholds let the team make the upgrade decision proactively and without friction.
Pre-emptive async from Day 1
Ship Tier 2 (async shadow writes via pg_notify) from the start, bypassing Tier 1.
Rejected: introduces a background worker dependency and eventual-consistency lag before they are needed. At pre-launch scale, synchronous triggers add < 1 ms per mutation. The simpler system is correct until simplicity becomes the wrong tradeoff.
Skip Tier 2, go directly to WAL CDC at 10K customers
No intermediate pg_notify tier; upgrade directly from triggers to WAL CDC.
Rejected: the gap between 50 ms trigger latency and 10K customers / 100M rows may be many months. During that period the team has no upgrade path; the only option would be an emergency CDC deployment. The pg_notify intermediate tier provides a fast, low-risk upgrade path that can be executed in 1–2 days.