ADR 0110 — MBT Phase 2: Intraday 1-Minute Bar Feed
Status: Proposed
Date: 2026-05-28 UTC
Deciders: product owner (Kristerpher), software-architect
Scope: Raptor (backend_v2/), MBT fill engine, bar-cache layer
Refs: ADR-0108 OQ-1 (resolved), ADR-0108 (MBT engine design), ADR-0109 (BYOB roadmap), PR #3023 (Bundle 2 historical_bars pattern)
Context
ADR-0108 shipped the MBT engine design with end-of-day bars as the v1 bar source (Phase 1). OQ-1 in that ADR asked: should Phase 2 fetch intraday bars for fill evaluation, and if so, at what granularity?
OQ-1 is now resolved: intraday 1-minute bars.
Why 1-min over 5-min
Limit orders are the primary beneficiary. A limit order fills if price touches the limit level while the order is live. A 5-min bar's high/low range shows that a touch happened somewhere in the five minutes but not when, so an order submitted or cancelled mid-bar can be matched against a price that printed while it was not resting. 1-min bars narrow this timing window by 5x, providing materially more realistic fill simulation for the options-heavy and limit-heavy strategies the strategy library (ADR-0107) supports.
This matches HFT-adjacent customer expectations: a user who has traded intraday will reject fill results that differ obviously from what they know happened. 1-min reduces that gap to a defensible residual.
Cost trade-off accepted
Relative to EOD:
- Volume: ~390 bars/symbol/day vs 1, i.e. roughly 390x the raw row count.
- Vs 5-min: ~5x more rows (390 vs 78 bars per session) for a meaningful fill-fidelity improvement.
- Storage at 100 symbols × 5 years: ~50M rows, ~10GB data + ~5GB index. Postgres handles this with proper partitioning-friendly indexes.
- API cost: Alpaca's 1-min bars endpoint is rate-limited at 200 req/min; each request returns up to 10,000 bars (~25 trading days for one symbol). Initial warm for 100 symbols × 252 days ≈ 1,000 requests → ~5 minutes at full rate. Acceptable; rate limit is not a constraint at v1 scale.
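The API-cost arithmetic above can be reproduced with a short sketch. The function name `warm_budget` is hypothetical; the constants are the figures quoted in this section (390 bars per session, 10,000 bars per request, 200 requests per minute), and the per-symbol rounding assumes pagination never spans symbols.

```python
import math

BARS_PER_SESSION = 390         # 6.5 RTH hours x 60 minutes
MAX_BARS_PER_REQUEST = 10_000  # per-request cap quoted in this ADR
RATE_LIMIT_PER_MIN = 200       # requests per minute quoted in this ADR


def warm_budget(symbols: int, trading_days: int) -> tuple[int, int, float]:
    """Return (total_bars, requests, minutes_at_full_rate) for an initial warm."""
    bars_per_symbol = trading_days * BARS_PER_SESSION
    # Assumption: each symbol is paginated on its own, so round up per symbol.
    requests_per_symbol = math.ceil(bars_per_symbol / MAX_BARS_PER_REQUEST)
    requests = symbols * requests_per_symbol
    return symbols * bars_per_symbol, requests, requests / RATE_LIMIT_PER_MIN
```

For 100 symbols and 252 trading days this gives 9,828,000 bars, 1,000 requests, and 5.0 minutes, matching the estimate above.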
Invariants
- No stored credentials: the Alpaca market-data key (ALPACA_MARKET_DATA_KEY) lives in the secret store, never in code or committed files.
- Audit trail for every state change that affects money: every fill written by the intraday bar engine is audited identically to EOD fills (ADR-0108 §5).
- Paper-first gating: intraday fill quality does not bypass the paper-profitable graduation gate (PR #3021); it improves the quality of evidence gathered there.
- Credentials into infra, not code: Alpaca API key remains in env/secret store.
Decision
MBT Phase 2 upgrades the bar source to intraday 1-minute bars fetched from
Alpaca's /v2/stocks/{symbol}/bars?timeframe=1Min REST endpoint. A dedicated
historical_bars_1min table stores the cache. A nightly batch job warm-fills
yesterday's bars for all symbols active in any live strategy; on-demand fetch
fills gaps when a backtest or fill evaluation requests a date range not yet cached.
Streaming (WebSocket) ingestion is deferred to v2.
Data Model
New table: historical_bars_1min
A dedicated table, separate from the existing historical_bars (EOD) table.
Reasons: EOD bars remain lightweight for strategy-library uses that only need
daily resolution; pruning and archival policies differ; partitioning (if added
post-v1) is cleaner on a single-timeframe table.
CREATE TABLE historical_bars_1min (
    symbol      TEXT        NOT NULL,
    timestamp   TIMESTAMPTZ NOT NULL,               -- bar open time, UTC
    open        NUMERIC(12,4),
    high        NUMERIC(12,4),
    low         NUMERIC(12,4),
    close       NUMERIC(12,4),
    volume      BIGINT,
    adjusted    BOOLEAN     NOT NULL DEFAULT FALSE, -- TRUE after split reconciliation
    fetched_at  TIMESTAMPTZ NOT NULL DEFAULT now(), -- when we pulled this bar
    PRIMARY KEY (symbol, timestamp)
);
CREATE INDEX idx_hb1m_symbol_ts ON historical_bars_1min (symbol, timestamp DESC);
Why not extend historical_bars with a timeframe column?
The historical_bars primary key is (symbol, date). Extending with
(symbol, date, timeframe) requires dropping and recreating the primary key, a
table-lock operation on the existing data set. A dedicated table avoids that
migration risk. See Alternatives section.
Migration path
Feature-developer sub-card carries the Alembic migration with the
-- POSTGRES-ONLY sentinel for any PL/pgSQL blocks. Rollback: DROP TABLE
historical_bars_1min; — no data in the existing historical_bars table is
affected.
Bar Feed Design
Source
Alpaca /v2/stocks/{symbol}/bars with timeframe=1Min, adjustment=split,
feed=iex (free tier) or feed=sip (if operator has an Alpaca data subscription).
Crypto is out of scope for v1 (deferred until customer demand confirms the
investment). Pre-market and after-hours bars default to excluded (RTH only):
start=09:30:00, end=16:00:00 per trading session in the query. Extended-hours
opt-in is a per-strategy flag deferred to a follow-on card.
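As an illustration of the source parameters, the sketch below builds the query for one RTH session. The helper name `rth_bar_request` is hypothetical, and the `data.alpaca.markets` host and the `limit` parameter are assumptions not stated in this ADR. No request is sent and no credentials are handled here.

```python
from datetime import date, datetime, time, timezone
from zoneinfo import ZoneInfo

NEW_YORK = ZoneInfo("America/New_York")
# Assumption: market-data REST host; this ADR only names the path.
BARS_URL = "https://data.alpaca.markets/v2/stocks/{symbol}/bars"


def rth_bar_request(symbol: str, session: date, feed: str = "iex") -> tuple[str, dict[str, str]]:
    """Return (url, query params) for one regular-trading-hours session."""

    def to_utc(local_time: time) -> str:
        # Convert an Eastern wall-clock time on the session date to RFC 3339 UTC.
        local = datetime.combine(session, local_time, tzinfo=NEW_YORK)
        return local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    params = {
        "timeframe": "1Min",
        "adjustment": "split",
        "feed": feed,                  # "iex" (free tier) or "sip"
        "start": to_utc(time(9, 30)),  # RTH open
        "end": to_utc(time(16, 0)),    # RTH close
        "limit": "10000",              # assumption: request the per-call maximum
    }
    return BARS_URL.format(symbol=symbol), params
```

Converting through the exchange time zone, rather than hard-coding a UTC offset, keeps the window correct across daylight-saving changes.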
Batch warm job (v1)
A nightly Celery task (tasks/warm_intraday_bars.py) runs after market close
(~17:00 ET / 22:00 UTC):
- Query the strategies table for all symbols referenced in any active strategy.
- For each symbol, call Alpaca /v2/stocks/{symbol}/bars?timeframe=1Min for yesterday's session.
- Upsert rows into historical_bars_1min (ON CONFLICT DO UPDATE on (symbol, timestamp) to handle the 30-day split-reconciliation window).
- Rate-limit guard: exponential backoff on HTTP 429; max 3 retries; alert via Sentry if all retries exhausted.
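The rate-limit guard in the last step could look roughly like the following. This is a minimal sketch, not the task's actual code: `RateLimited`, `fetch_with_backoff`, and the 1-second base delay are hypothetical, and the fetch callable is assumed to raise `RateLimited` when Alpaca answers HTTP 429.

```python
import time
from typing import Callable


class RateLimited(Exception):
    """Raised by the fetch callable when the API answers HTTP 429 (assumed)."""


def fetch_with_backoff(
    fetch: Callable[[], list],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Call fetch(); on RateLimited, wait base_delay * 2**attempt and retry."""
    for attempt in range(max_retries + 1):  # one initial try plus max_retries
        try:
            return fetch()
        except RateLimited:
            if attempt == max_retries:
                # Retries exhausted: re-raise so the caller can alert (e.g. Sentry).
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Injecting `sleep` keeps the helper testable without real delays; with the defaults the waits are 1 s, 2 s, and 4 s before the final failure propagates.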
On-demand gap fill (extending Bundle 2 pattern)
When the MBT fill engine or backtest runner requests bars for (symbol, start,
end) and finds a gap in historical_bars_1min, it calls
historical_bars_service.py (PR #3023) extended to accept timeframe='1min'.
The on-demand fetch pulls from Alpaca, caches the result, and returns the bars.
This mirrors the existing EOD on-demand pattern exactly.
Cache read path
SELECT * FROM historical_bars_1min
WHERE symbol = :sym
AND timestamp BETWEEN :start AND :end
ORDER BY timestamp ASC;
If the result set is empty or has gaps > 1 trading day, trigger on-demand fetch.
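One way to read the gap rule is "any trading session in the requested range with no cached bars at all". The sketch below implements that reading only. `missing_sessions` is a hypothetical helper, and it treats every weekday as a trading day, so a real implementation would need an exchange calendar to skip holidays.

```python
from datetime import date, datetime, timedelta


def missing_sessions(bar_times: list[datetime], start: date, end: date) -> list[date]:
    """Return weekdays in [start, end] that have no cached bar.

    Assumes bar_times are UTC bar-open timestamps for RTH bars, whose UTC
    calendar date equals the session date.
    """
    cached_days = {ts.date() for ts in bar_times}
    missing = []
    day = start
    while day <= end:
        # Assumption: Monday-Friday are trading days (holidays are ignored).
        if day.weekday() < 5 and day not in cached_days:
            missing.append(day)
        day += timedelta(days=1)
    return missing
```

An empty result set falls out of the same logic: every weekday in the range is reported missing, which would trigger the on-demand fetch for the whole range.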
Split reconciliation
Alpaca may retrospectively adjust bars after a split. The nightly job re-pulls
the trailing 30 days with adjustment=split and upserts, setting
adjusted = TRUE on touched rows. This ensures cached bars reflect the
corrected prices that fill simulation will use.
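The upsert behaviour described here can be exercised with an in-memory SQLite database as a stand-in for Postgres. This is a sketch under that assumption: column types are simplified, `open_demo_db` and `upsert_bar` are hypothetical names, and the production statement would run against the Postgres table defined in the Data Model section.

```python
import sqlite3

SCHEMA = """
CREATE TABLE historical_bars_1min (
    symbol TEXT NOT NULL,
    timestamp TEXT NOT NULL,
    open REAL, high REAL, low REAL, close REAL,
    volume INTEGER,
    adjusted INTEGER NOT NULL DEFAULT 0,
    fetched_at TEXT NOT NULL,
    PRIMARY KEY (symbol, timestamp)
)
"""

# First insert leaves adjusted = 0; a conflicting re-pull overwrites the
# prices, marks the row adjusted, and refreshes fetched_at.
UPSERT = """
INSERT INTO historical_bars_1min
    (symbol, timestamp, open, high, low, close, volume, adjusted, fetched_at)
VALUES (:symbol, :timestamp, :open, :high, :low, :close, :volume, 0, :fetched_at)
ON CONFLICT (symbol, timestamp) DO UPDATE SET
    open = excluded.open, high = excluded.high, low = excluded.low,
    close = excluded.close, volume = excluded.volume,
    adjusted = 1, fetched_at = excluded.fetched_at
"""


def open_demo_db() -> sqlite3.Connection:
    """Create the demo table in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def upsert_bar(conn: sqlite3.Connection, bar: dict, fetched_at: str) -> None:
    """Insert a bar, or overwrite the cached row if (symbol, timestamp) exists."""
    conn.execute(UPSERT, {**bar, "fetched_at": fetched_at})
```

As written, every row touched by a re-pull is marked adjusted, which follows the description above; a variant that sets the flag only when prices actually change would be a separate design choice.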
Fill Engine Integration
No change to the fill(order, bar) → fill_result | None interface defined in
ADR-0108 §5. The fill engine is parameterized on bar source; Phase 2 passes
1-min bars instead of EOD bars. The fill model semantics (limit touches
bar.low/bar.high, market order fills at midpoint) are identical — 1-min
bars simply provide a tighter price window.
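A minimal sketch of these semantics follows, to show why the same function works at either resolution. The `Bar` and `Order` types are hypothetical, the function returns a fill price rather than the fill_result object from ADR-0108 §5, and two details are assumptions: the market-order midpoint is taken as (high + low) / 2, and a touched limit order fills at its limit price.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Bar:
    open: Decimal
    high: Decimal
    low: Decimal
    close: Decimal


@dataclass(frozen=True)
class Order:
    side: str                              # "buy" or "sell"
    kind: str                              # "limit" or "market"
    limit_price: Optional[Decimal] = None  # required for limit orders


def fill(order: Order, bar: Bar) -> Optional[Decimal]:
    """Return the simulated fill price, or None if the bar does not fill the order."""
    if order.kind == "market":
        return (bar.high + bar.low) / 2    # assumed midpoint definition
    if order.limit_price is None:
        raise ValueError("limit order requires limit_price")
    if order.side == "buy" and bar.low <= order.limit_price:
        return order.limit_price           # price traded at or below the limit
    if order.side == "sell" and bar.high >= order.limit_price:
        return order.limit_price           # price traded at or above the limit
    return None
```

Because the function only reads the bar's high and low, swapping an EOD bar for a 1-min bar changes the width of the price window and nothing else.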
Mode parameter
mbt_fill_engine.py already accepts a mode parameter ('live' vs 'replay').
Phase 2 adds a bar_resolution parameter: '1day' (default, Phase 1) or
'1min' (Phase 2). The bar-resolution is selected at fill-engine startup based
on FLAG_MBT_INTRADAY_BARS.
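The startup selection might be as small as the sketch below. The function name `resolve_bar_resolution` and the table mapping are hypothetical, and it assumes the flag is read from the environment as the string "0" or "1".

```python
import os
from typing import Mapping, Optional

# Hypothetical mapping from resolution to the cache table that backs it.
BAR_TABLES = {"1day": "historical_bars", "1min": "historical_bars_1min"}


def resolve_bar_resolution(env: Optional[Mapping[str, str]] = None) -> str:
    """Return '1min' when FLAG_MBT_INTRADAY_BARS is '1', else the Phase 1 default."""
    env = os.environ if env is None else env
    return "1min" if env.get("FLAG_MBT_INTRADAY_BARS", "0") == "1" else "1day"
```

Defaulting to '1day' when the flag is unset or off means a missing configuration value falls back to Phase 1 behaviour.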
Rollout Sequence
Phase 1 (current, [ADR-0108](https://internal-docs.raxx.app/architecture/adr/0108-mbt-engine-design.html))
FLAG_MBT_ENGINE=0 → FLAG_MBT_ENGINE=1
Bar source: historical_bars (EOD)
Phase 2 (this ADR)
FLAG_MBT_INTRADAY_BARS=0 (new flag, default off)
→ Ship table + warm job + on-demand fetch extension
→ Operator enables FLAG_MBT_INTRADAY_BARS for dogfood
→ 7-day comparison: same orders evaluated against EOD vs 1-min bars
→ Flag promoted to default-on for new strategies; existing strategies
migrate on next fill-cycle
Rollback at any point: FLAG_MBT_INTRADAY_BARS=0 reverts the fill engine to
EOD bars. The historical_bars_1min table is left in place (no data loss).
Bar Volume Estimates
| Scope | Bars |
|---|---|
| Per symbol, per trading day | 390 (6.5 hr × 60 min) |
| Per symbol, per year | ~98,000 (252 days × 390) |
| Per symbol, 5 years | ~490,000 |
| 100 symbols × 5 years | ~49,000,000 |
| Raw storage (est. 200 bytes/row) | ~10 GB |
| Index storage (est.) | ~5 GB |
| Total | ~15 GB |
Postgres B-tree on (symbol, timestamp DESC) covers the primary read pattern.
If storage grows beyond 100 GB (more symbols, longer history), add
pg_partman-based range partitioning by month — this is a post-v1 concern.
Streaming (Deferred to v2)
Alpaca's wss://stream.data.alpaca.markets/v2/iex provides real-time 1-min bar
events. This would allow the fill engine to evaluate resting orders against live
intraday bars, turning MBT into a near-real-time paper engine during market hours.
Deferred reasons: (1) operational complexity of a persistent WebSocket connection from Raptor; (2) Celery + Redis are already the async pattern — a WebSocket subscriber is a new infra concern; (3) at v1 scale the demand for sub-minute paper fills has not been confirmed. Revisit trigger: customer feedback post-Phase 2 dogfood, or MBT order volume triggers the C-2 Tier 1 criterion from the language tier policy.
Language Choice Rationale
This ADR extends the MBT module within Raptor — no new standalone service is introduced. Raptor is Tier 2 (Python) per ADR-0108 §10. The bar-cache layer (nightly Celery task, on-demand fetch) follows the same classification: domain logic, no p99 < 5ms latency budget, no auth material in scope.
Service: historical_bars_1min cache layer within Raptor
Language tier: Tier 2 — Python
Rationale: The batch warm job and on-demand fetch run on Celery workers at schedule/request frequency — not on a hot latency path. The fill-engine parameterization (passing 1-min bars instead of EOD bars) is in-process within Raptor's request handling. No Tier 1 criteria (C-1 through C-6) are met.
API contract portability: The HTTP contract (trading routes, bar-source flag)
is unchanged from ADR-0108. A future Tier 1 port of the fill engine would
consume the same historical_bars_1min table via the same SQL interface without
redesign.
Alternatives Considered
Extend historical_bars with a timeframe column
ALTER TABLE historical_bars ADD COLUMN timeframe TEXT NOT NULL DEFAULT '1Day';
ALTER TABLE historical_bars DROP CONSTRAINT historical_bars_pkey;
ALTER TABLE historical_bars ADD PRIMARY KEY (symbol, date, timeframe);
Rejected. Dropping and re-adding the primary key is a table-lock operation that
blocks reads during migration on a live production table. The schema conflation
of EOD and 1-min data in one table also complicates pruning: EOD bars are kept
indefinitely (lightweight, used by the strategy library); 1-min bars have a
retention limit (5 years for Pro+). A single-table design would require
WHERE timeframe = '1Day' predicates everywhere EOD access occurs today.
5-minute bars
Considered and rejected by operator (this ADR records that decision). 5-min bars are ~5x fewer rows than 1-min (78 vs 390 per session) but provide ~5x worse limit-touch granularity. For limit-order-heavy strategies (the primary MBT use case), the fill fidelity loss is not acceptable. 1-min is the minimum granularity where limit simulation becomes defensible.
Third-party bar data vendor (Polygon.io, Databento)
Deferred. Alpaca's market-data endpoint is already in use for EOD bars
(PR #3023): same credential, same API client, no new vendor. Polygon and Databento
offer tick-level data that would improve fill fidelity further, but introduce new vendor contracts, billing, and credential surface. Revisit if Alpaca's free tier rate limits become a sustained constraint post-launch.
Pre-warm all symbols at startup
Rejected. At 100+ symbols × 5 years, that is ~50M rows fetched at startup: roughly 5,000 requests at 10,000 bars each, or about 25 minutes of Alpaca API calls at the 200 req/min limit before the service is warm. Opportunistic (request-triggered) + nightly incremental warm provides the same coverage with no startup burst.
Security / GDPR Checklist
- PII collected: none. historical_bars_1min contains market data (OHLCV for publicly traded symbols): no user identifiers, no trade amounts, no financial PII. symbol and timestamp are not PII.
- Retention period: market data is not subject to GDPR retention limits (not personal data). Operational retention: 5 years for active symbols, older data archived to S3 cold storage (post-v1). EOD bars in historical_bars are not affected.
- Deletion on DSR: not applicable. The table contains no user-linked rows. DSR erasure for MBT trade history is handled by the paper_orders, paper_positions, and paper_accounts CASCADE DELETE (ADR-0108 §9).
- Audit trail: the nightly warm job and on-demand fetch write Sentry breadcrumbs and application logs. No audit_log rows are written, because bar ingestion affects no money or permissions. Fill decisions that use these bars are audited at the fill layer (ADR-0108 §5).
- Stored credentials: none specific to this ADR. ALPACA_MARKET_DATA_KEY lives in the secret store, not in code. Same credential used for EOD bars.
- Breach notification path: market data is not personal data. A breach of historical_bars_1min exposes no user PII. Standard Sentry alerting covers unexpected access patterns.
- Secrets location + rotation: ALPACA_MARKET_DATA_KEY in Infisical / Heroku config. Rotatable without redeploy (Velvet rotation pattern). No new secrets introduced.
- Kill-switch: FLAG_MBT_INTRADAY_BARS=0 reverts the fill engine to EOD bars instantly. MBT_TRADING_DISABLED=1 (ADR-0108) remains the order-level kill-switch. Warm job can be disabled by removing the Celery beat schedule entry without a redeploy.
Open Questions
- Crypto in scope? Do 1-min bars extend to crypto symbols? Alpaca exposes /v1beta3/crypto/bars?timeframe=1Min. Default: deferred until confirmed customer demand. Requires operator decision before the crypto sub-card can be claimed.
- Extended-hours bars? Pre-market (04:00–09:30 ET) and after-hours (16:00–20:00 ET) bars are available from Alpaca but are noisier and have wider spreads. Default: RTH only. Allow per-strategy opt-in flag deferred to a follow-on card. No operator decision needed before v1 sub-cards.
- Retention archive to S3. Bars older than 5 years for Pro+ (or 3 years for Pro, 90 days for Free) could be archived to S3 Glacier. This is a post-v1 concern but the schema (fetched_at, adjusted columns) supports a future archival job without a schema change.
Risks
| Risk | Mitigation |
|---|---|
| Alpaca rate-limit hit during warm burst | Exponential backoff, max 3 retries; Sentry alert on exhaustion; warm job runs off-peak (22:00 UTC) |
| Partial bar cached (bar not yet closed) | Warm job runs after 22:00 UTC — all RTH bars for the prior day are closed by then. Intraday on-demand fetch flags bars where timestamp > session_close_utc - 1min as potentially partial |
| Split adjustment after initial cache | Nightly re-pull of trailing 30 days with adjusted=TRUE upsert; fetched_at updated so reconciliation is auditable |
| Storage growth beyond 15 GB estimate | Index on (symbol, timestamp DESC) supports efficient range pruning; pg_partman available as post-v1 option |
| Alpaca data quality gaps (missing bars) | On-demand fetch logs gaps to Sentry; fill engine handles gap as "no bar available — use previous bar or queue to next available bar" per ADR-0108 §5 |
Consequences
Positive
- Limit-order fill simulation becomes defensible for intraday strategies — a 5x improvement in price-touch granularity over 5-min, 390x over EOD.
- Bar cache serves both the MBT fill engine and the backtest runner (ADR-0108 §6 backtest reuse principle), no fork.
- BYOB paper parity confirmed: same historical_bars_1min table serves all broker-agnostic fill simulation per ADR-0109 relationship.
- Nightly warm job decouples Alpaca API latency from the order-submission hot path (on-demand fetch is a cold path only).
Negative / risks
- ~15 GB additional storage at 100 symbols × 5 years; grows linearly with active symbol count and retention window.
- Nightly warm job adds a Celery beat task; failure is a silent detection gap unless Sentry alert fires (mitigated by alerting design above).
- Fill fidelity still lower than tick-level NBBO data; disclosed in UI.
Neutral
- ALPACA_MARKET_DATA_KEY credential surface is unchanged: same key, same usage pattern, just more API calls.
- EOD historical_bars table and its usage are entirely unaffected.
References
- [ADR-0108](https://internal-docs.raxx.app/architecture/adr/0108-mbt-engine-design.html) — MBT engine design; OQ-1 resolved here
- [ADR-0109](https://internal-docs.raxx.app/architecture/adr/0109-byob-roadmap.html) — BYOB roadmap; MBT is the permanent paper layer
- [ADR-0107](https://internal-docs.raxx.app/architecture/adr/0107-strategy-library.html) — strategy library; fill compliance integration
- PR #3023 (Bundle 2): historical_bars on-demand pattern this ADR extends
- Alpaca Market Data API docs: /v2/stocks/{symbol}/bars
Revisit when
- Customer demand confirms crypto strategy interest → extend to /v1beta3/crypto/bars.
- Alpaca 429 rate-limit events sustain beyond one warm cycle → evaluate Polygon or Databento as alternative data vendors.
- MBT order volume triggers C-2 Tier 1 criterion (p99 > 100ms sustained) → re-evaluate whether bar-cache layer warrants a Tier 1 promotion.
- Streaming tick fills are requested by customers post-Phase 2 dogfood → revisit WebSocket ingestion deferred above.