Raxx · internal docs

ADR 0110 — MBT Phase 2: Intraday 1-Minute Bar Feed

Status: Proposed
Date: 2026-05-28 UTC
Deciders: product owner (Kristerpher), software-architect
Scope: Raptor (backend_v2/), MBT fill engine, bar-cache layer
Refs: ADR-0108 OQ-1 (resolved), ADR-0108 (MBT engine design), ADR-0109 (BYOB roadmap), PR #3023 (Bundle 2 historical_bars pattern)


Context

ADR-0108 shipped the MBT engine design with end-of-day bars as the v1 bar source (Phase 1). OQ-1 in that ADR asked: should Phase 2 fetch intraday bars for fill evaluation, and if so, at what granularity?

OQ-1 is now resolved: intraday 1-minute bars.

Why 1-min over 5-min

Limit orders are the primary beneficiary. A limit order fills if price touches the limit level within a bar. A 5-min bar's high/low range shows that a touch occurred but not when within those five minutes, or in what order it happened relative to other price moves, so the simulated fill time can be off by up to five minutes and sequencing against other orders is ambiguous. 1-min bars narrow this window by 5x, providing materially more realistic fill simulation for the options-heavy and limit-heavy strategies the strategy library (ADR-0107) supports.

This matches HFT-adjacent customer expectations: a user who has traded intraday will reject fill results that differ obviously from what they know happened. 1-min reduces that gap to a defensible residual.

Cost trade-off accepted

Relative to EOD:

Invariants


Decision

MBT Phase 2 upgrades the bar source to intraday 1-minute bars fetched from Alpaca's /v2/stocks/{symbol}/bars?timeframe=1Min REST endpoint. A dedicated historical_bars_1min table stores the cache. A nightly batch job warm-fills yesterday's bars for all symbols active in any live strategy; on-demand fetch fills gaps when a backtest or fill evaluation requests a date range not yet cached. Streaming (WebSocket) ingestion is deferred to v2.


Data Model

New table: historical_bars_1min

A dedicated table, separate from the existing historical_bars (EOD) table. Reasons: EOD bars remain lightweight for strategy-library uses that only need daily resolution; pruning and archival policies differ; partitioning (if added post-v1) is cleaner on a single-timeframe table.

CREATE TABLE historical_bars_1min (
    symbol      TEXT        NOT NULL,
    timestamp   TIMESTAMPTZ NOT NULL,   -- bar open time, UTC
    open        NUMERIC(12,4),
    high        NUMERIC(12,4),
    low         NUMERIC(12,4),
    close       NUMERIC(12,4),
    volume      BIGINT,
    adjusted    BOOLEAN     NOT NULL DEFAULT FALSE,  -- TRUE after split reconciliation
    fetched_at  TIMESTAMPTZ NOT NULL DEFAULT now(),  -- when we pulled this bar
    PRIMARY KEY (symbol, timestamp)
);

CREATE INDEX idx_hb1m_symbol_ts ON historical_bars_1min (symbol, timestamp DESC);

Why not extend historical_bars with a timeframe column?

The historical_bars primary key is (symbol, date). Extending with (symbol, date, timeframe) requires dropping and recreating the primary key, a table-lock operation on the existing data set. A dedicated table avoids that migration risk. See Alternatives section.

Migration path

Feature-developer sub-card carries the Alembic migration with the -- POSTGRES-ONLY sentinel for any PL/pgSQL blocks. Rollback: DROP TABLE historical_bars_1min; — no data in the existing historical_bars table is affected.


Bar Feed Design

Source

Alpaca /v2/stocks/{symbol}/bars with timeframe=1Min, adjustment=split, feed=iex (free tier) or feed=sip (if operator has an Alpaca data subscription).

Crypto is out of scope for v1 (deferred until customer demand confirms the investment). Pre-market and after-hours bars default to excluded (RTH only): start=09:30:00, end=16:00:00 per trading session in the query. Extended-hours opt-in is a per-strategy flag deferred to a follow-on card.
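The RTH-only window above can be sketched as a small helper that converts 09:30 to 16:00 New York time into UTC query parameters for one session. The function name rth_query_params is hypothetical, and the sketch assumes a full-length session (no early-close handling); the parameter names are the ones listed under Source.

```python
from datetime import date, datetime, time, timezone
from zoneinfo import ZoneInfo

NY = ZoneInfo("America/New_York")


def rth_query_params(session: date, feed: str = "iex") -> dict[str, str]:
    """Build bar-query parameters for one regular-trading-hours session.

    Hypothetical helper: assumes a full 09:30-16:00 ET session and does
    not account for early-close days.
    """
    start = datetime.combine(session, time(9, 30), tzinfo=NY).astimezone(timezone.utc)
    end = datetime.combine(session, time(16, 0), tzinfo=NY).astimezone(timezone.utc)
    return {
        "timeframe": "1Min",
        "adjustment": "split",
        "feed": feed,
        # RFC 3339 timestamps in UTC with a trailing "Z"
        "start": start.isoformat().replace("+00:00", "Z"),
        "end": end.isoformat().replace("+00:00", "Z"),
    }


params = rth_query_params(date(2026, 5, 27))
```

Doing the conversion through the America/New_York zone rather than a fixed offset keeps the window correct across daylight saving changes: the same session opens at 13:30 UTC in summer and 14:30 UTC in winter.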

Batch warm job (v1)

A nightly Celery task (tasks/warm_intraday_bars.py) runs after market close at 22:00 UTC (17:00 ET in winter, 18:00 ET during daylight saving time):

  1. Query strategies table for all symbols referenced in any active strategy.
  2. For each symbol, call Alpaca /v2/stocks/{symbol}/bars?timeframe=1Min for yesterday's session.
  3. Upsert rows into historical_bars_1min (ON CONFLICT (symbol, timestamp) DO UPDATE, so that re-pulled bars inside the 30-day split-reconciliation window overwrite the cached values).
  4. Rate-limit guard: exponential backoff on HTTP 429; max 3 retries; alert via Sentry if all retries exhausted.
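The rate-limit guard in step 4 could look roughly like the sketch below. It is not the task's actual code: RateLimited, fetch_with_backoff, and the 1-second base delay are assumptions made for illustration, and the Sentry alert is left to the caller that catches the final exception.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical exception the fetch callable raises on HTTP 429."""


def fetch_with_backoff(
    fetch: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,  # assumed value; the ADR does not specify one
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(); on RateLimited wait base_delay * 2**attempt, then retry.

    One initial attempt plus up to max_retries retries. When the last
    retry also fails, the exception propagates so the caller can alert.
    """
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except RateLimited:
            if attempt == max_retries:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Passing sleep in as a parameter lets a unit test record the delays (1 s, 2 s, 4 s with the defaults) without actually waiting.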

On-demand gap fill (extending Bundle 2 pattern)

When the MBT fill engine or backtest runner requests bars for (symbol, start, end) and finds a gap in historical_bars_1min, it calls historical_bars_service.py (PR #3023) extended to accept timeframe='1min'. The on-demand fetch pulls from Alpaca, caches the result, and returns the bars. This mirrors the existing EOD on-demand pattern exactly.

Cache read path

SELECT * FROM historical_bars_1min
WHERE symbol = :sym
  AND timestamp BETWEEN :start AND :end
ORDER BY timestamp ASC;

If the result set is empty or has gaps > 1 trading day, trigger on-demand fetch.
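One way to express the gap check is to compare the cached timestamps against the list of sessions the caller expects to be covered. The sketch below assumes a hypothetical helper name and assumes the caller already has the expected trading dates (for example from a market calendar); it only detects sessions with no bars at all, not missing minutes inside a session.

```python
from datetime import date, datetime, timezone


def sessions_missing_bars(
    cached_timestamps: list[datetime],
    expected_sessions: list[date],
) -> list[date]:
    """Return the expected session dates that have no cached bar.

    Timestamps must be timezone-aware. RTH bars fall between 13:30 and
    21:00 UTC, so the UTC calendar date equals the session date.
    """
    have = {ts.astimezone(timezone.utc).date() for ts in cached_timestamps}
    return [d for d in expected_sessions if d not in have]


cached = [
    datetime(2026, 5, 26, 13, 30, tzinfo=timezone.utc),
    datetime(2026, 5, 28, 19, 59, tzinfo=timezone.utc),
]
expected = [date(2026, 5, 26), date(2026, 5, 27), date(2026, 5, 28)]
missing = sessions_missing_bars(cached, expected)
```

A non-empty result (here the 2026-05-27 session) would be the trigger for an on-demand fetch; an empty cache returns every expected session.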

Split reconciliation

Alpaca may retrospectively adjust bars after a split. The nightly job re-pulls the trailing 30 days with adjustment=split and upserts, setting adjusted = TRUE on touched rows. This ensures cached bars reflect the corrected prices that fill simulation will use.
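The reconciliation upsert can be illustrated with a self-contained sketch. It uses Python's built-in sqlite3 as a stand-in for Postgres so it runs anywhere; the ON CONFLICT ... DO UPDATE form is the same, but the column types are simplified and the symbol and prices are made up for the example.

```python
import sqlite3

UPSERT = """
INSERT INTO historical_bars_1min
    (symbol, timestamp, open, high, low, close, volume, adjusted, fetched_at)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
ON CONFLICT (symbol, timestamp) DO UPDATE SET
    open = excluded.open, high = excluded.high, low = excluded.low,
    close = excluded.close, volume = excluded.volume,
    adjusted = excluded.adjusted, fetched_at = excluded.fetched_at
"""

conn = sqlite3.connect(":memory:")
# Simplified stand-in schema (SQLite types, not the Postgres DDL)
conn.execute(
    """CREATE TABLE historical_bars_1min (
        symbol TEXT NOT NULL, timestamp TEXT NOT NULL,
        open REAL, high REAL, low REAL, close REAL, volume INTEGER,
        adjusted INTEGER NOT NULL DEFAULT 0, fetched_at TEXT NOT NULL,
        PRIMARY KEY (symbol, timestamp))"""
)

ts = "2026-05-27T13:30:00Z"
# Initial nightly pull, before any split is known
conn.execute(UPSERT, ("XYZ", ts, 200.0, 202.0, 199.0, 201.0, 1000, 0,
                      "2026-05-27T22:00:00Z"))
# Re-pull a week later after a hypothetical 2-for-1 split
conn.execute(UPSERT, ("XYZ", ts, 100.0, 101.0, 99.5, 100.5, 2000, 1,
                      "2026-06-03T22:00:00Z"))

row = conn.execute(
    "SELECT close, adjusted, fetched_at FROM historical_bars_1min"
).fetchone()
row_count = conn.execute("SELECT COUNT(*) FROM historical_bars_1min").fetchone()[0]
```

After the second statement the table still holds a single row for that minute, now carrying the adjusted prices, adjusted set, and the later fetched_at, which is what makes the reconciliation auditable.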


Fill Engine Integration

No change to the fill(order, bar) → fill_result | None interface defined in ADR-0108 §5. The fill engine is parameterized on bar source; Phase 2 passes 1-min bars instead of EOD bars. The fill model semantics (limit touches bar.low/bar.high, market order fills at midpoint) are identical — 1-min bars simply provide a tighter price window.
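To make the touch semantics concrete, here is a minimal sketch of a fill function under stated assumptions. The Bar and Order classes are hypothetical stand-ins for the types in ADR-0108 §5, the function returns a fill price rather than a full fill_result, a touched limit is assumed to fill at the limit price, and "midpoint" is assumed to mean the midpoint of the bar's high and low.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Bar:
    open: Decimal
    high: Decimal
    low: Decimal
    close: Decimal


@dataclass(frozen=True)
class Order:
    side: str  # "buy" or "sell"
    kind: str  # "limit" or "market"
    limit_price: Optional[Decimal] = None


def fill(order: Order, bar: Bar) -> Optional[Decimal]:
    """Return a fill price, or None if the order does not fill on this bar."""
    if order.kind == "market":
        # Assumption: midpoint of the bar's high/low range
        return (bar.high + bar.low) / 2
    if order.limit_price is None:
        raise ValueError("limit order requires limit_price")
    # A buy limit is touched when the bar trades at or below the limit
    if order.side == "buy" and bar.low <= order.limit_price:
        return order.limit_price
    # A sell limit is touched when the bar trades at or above the limit
    if order.side == "sell" and bar.high >= order.limit_price:
        return order.limit_price
    return None


bar = Bar(Decimal("100.00"), Decimal("100.50"), Decimal("99.50"), Decimal("100.25"))
touched = fill(Order("buy", "limit", Decimal("99.75")), bar)
untouched = fill(Order("buy", "limit", Decimal("99.00")), bar)
```

The same function works unchanged whether bar is a daily or a 1-minute bar, which is the point of parameterizing the engine on bar source.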

Mode parameter

mbt_fill_engine.py already accepts a mode parameter ('live' vs 'replay'). Phase 2 adds a bar_resolution parameter: '1day' (default, Phase 1) or '1min' (Phase 2). The bar-resolution is selected at fill-engine startup based on FLAG_MBT_INTRADAY_BARS.
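The flag-to-resolution mapping might be as small as the sketch below. The helper name is hypothetical, and it assumes the flag is read from the environment as the string "0" or "1", matching the FLAG_MBT_INTRADAY_BARS=0 notation used in this ADR.

```python
import os
from typing import Mapping


def resolve_bar_resolution(env: Mapping[str, str] = os.environ) -> str:
    """Pick the bar resolution once, at fill-engine startup.

    Anything other than "1" (including an unset flag) keeps the
    Phase 1 default of daily bars.
    """
    return "1min" if env.get("FLAG_MBT_INTRADAY_BARS", "0") == "1" else "1day"


resolution = resolve_bar_resolution({"FLAG_MBT_INTRADAY_BARS": "1"})
```

Defaulting to '1day' on any unexpected value keeps rollback safe: clearing or mistyping the flag falls back to Phase 1 behaviour.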


Rollout Sequence

Phase 1 (current, [ADR-0108](https://internal-docs.raxx.app/architecture/adr/0108-mbt-engine-design.html))
  FLAG_MBT_ENGINE=0 → FLAG_MBT_ENGINE=1
  Bar source: historical_bars (EOD)

Phase 2 (this ADR)
  FLAG_MBT_INTRADAY_BARS=0 (new flag, default off)
  → Ship table + warm job + on-demand fetch extension
  → Operator enables FLAG_MBT_INTRADAY_BARS for dogfood
  → 7-day comparison: same orders evaluated against EOD vs 1-min bars
  → Flag promoted to default-on for new strategies; existing strategies
     migrate on next fill-cycle

Rollback at any point: FLAG_MBT_INTRADAY_BARS=0 reverts the fill engine to EOD bars. The historical_bars_1min table is left in place (no data loss).


Bar Volume Estimates

| Scope | Estimate |
|---|---|
| Per symbol, per trading day | 390 bars (6.5 hr × 60 min) |
| Per symbol, per year | ~98,000 bars (252 days × 390) |
| Per symbol, 5 years | ~490,000 bars |
| 100 symbols × 5 years | ~49,000,000 bars |
| Raw storage (est. 200 bytes/row) | ~10 GB |
| Index storage (est.) | ~5 GB |
| Total | ~15 GB |

Postgres B-tree on (symbol, timestamp DESC) covers the primary read pattern. If storage grows beyond 100 GB (more symbols, longer history), add pg_partman-based range partitioning by month — this is a post-v1 concern.
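The estimates above follow from simple arithmetic, reproduced here so the inputs can be varied. The function name is hypothetical; the 200 bytes/row figure is the estimate stated in this section, and index storage is not modelled.

```python
BARS_PER_SESSION = int(6.5 * 60)  # 390 one-minute bars in a 6.5-hour session
SESSIONS_PER_YEAR = 252


def estimate_rows_and_raw_gb(
    symbols: int, years: int, bytes_per_row: int = 200
) -> tuple[int, float]:
    """Return (row count, raw table size in GB) for the given coverage."""
    rows = symbols * years * SESSIONS_PER_YEAR * BARS_PER_SESSION
    return rows, rows * bytes_per_row / 1e9


rows, raw_gb = estimate_rows_and_raw_gb(symbols=100, years=5)
# rows is 49,140,000 and raw_gb is about 9.83, matching the rounded
# ~49,000,000 bars and ~10 GB figures
```

At the same row size, the 100 GB partitioning threshold corresponds to roughly ten times this coverage, for example about 1,000 symbols over 5 years.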


Streaming (Deferred to v2)

Alpaca's wss://stream.data.alpaca.markets/v2/iex provides real-time 1-min bar events. This would allow the fill engine to evaluate resting orders against live intraday bars, turning MBT into a near-real-time paper engine during market hours.

Deferred reasons: (1) operational complexity of a persistent WebSocket connection from Raptor; (2) Celery + Redis are already the async pattern — a WebSocket subscriber is a new infra concern; (3) at v1 scale the demand for sub-minute paper fills has not been confirmed. Revisit trigger: customer feedback post-Phase 2 dogfood, or MBT order volume triggers the C-2 Tier 1 criterion from the language tier policy.


Language Choice Rationale

This ADR extends the MBT module within Raptor — no new standalone service is introduced. Raptor is Tier 2 (Python) per ADR-0108 §10. The bar-cache layer (nightly Celery task, on-demand fetch) follows the same classification: domain logic, no p99 < 5ms latency budget, no auth material in scope.

Service: historical_bars_1min cache layer within Raptor

Language tier: Tier 2 — Python

Rationale: The batch warm job and on-demand fetch run on Celery workers at schedule/request frequency — not on a hot latency path. The fill-engine parameterization (passing 1-min bars instead of EOD bars) is in-process within Raptor's request handling. No Tier 1 criteria (C-1 through C-6) are met.

API contract portability: The HTTP contract (trading routes, bar-source flag) is unchanged from ADR-0108. A future Tier 1 port of the fill engine would consume the same historical_bars_1min table via the same SQL interface without redesign.


Alternatives Considered

Extend historical_bars with a timeframe column

ALTER TABLE historical_bars ADD COLUMN timeframe TEXT NOT NULL DEFAULT '1Day';
ALTER TABLE historical_bars DROP CONSTRAINT historical_bars_pkey;
ALTER TABLE historical_bars ADD PRIMARY KEY (symbol, date, timeframe);

Rejected. Dropping and re-adding the primary key is a table-lock operation that blocks reads during migration on a live production table. The schema conflation of EOD and 1-min data in one table also complicates pruning: EOD bars are kept indefinitely (lightweight, used by the strategy library); 1-min bars have a retention limit (5 years for Pro+). A single-table design would require WHERE timeframe = '1Day' predicates everywhere EOD access occurs today.

5-minute bars

Considered and rejected by operator (this ADR records that decision). 5-min bars are ~5x fewer rows than 1-min (78 vs 390 per session) but provide ~5x coarser limit-touch granularity. For limit-order-heavy strategies (the primary MBT use case), the fill fidelity loss is not acceptable. 1-min is the minimum granularity where limit simulation becomes defensible.

Third-party bar data vendor (Polygon.io, Databento)

Deferred. Alpaca's market-data endpoint is already in use for EOD bars (PR #3023): same credential, same API client, no new vendor. Polygon and Databento offer tick-level data that would improve fill fidelity further, but introduce new vendor contracts, billing, and credential surface. Revisit if Alpaca's free tier rate limits become a sustained constraint post-launch.

Pre-warm all symbols at startup

Rejected. At 100+ symbols × 5 years, that is ~50M rows fetched at startup — a 5-hour Alpaca API job at maximum rate. Opportunistic (request-triggered) + nightly incremental warm provides the same coverage with no startup burst.


Security / GDPR Checklist


Open Questions

  1. Crypto in scope? Do 1-min bars extend to crypto symbols? Alpaca exposes /v1beta3/crypto/bars?timeframe=1Min. Default: deferred until confirmed customer demand. Requires operator decision before the crypto sub-card can be claimed.

  2. Extended-hours bars? Pre-market (04:00–09:30 ET) and after-hours (16:00–20:00 ET) bars are available from Alpaca but are noisier and have wider spreads. Default: RTH only. Allow per-strategy opt-in flag deferred to a follow-on card. No operator decision needed before v1 sub-cards.

  3. Retention archive to S3. Bars older than 5 years for Pro+ (or 3 years for Pro, 90 days for Free) could be archived to S3 Glacier. This is a post-v1 concern but the schema (fetched_at, adjusted columns) supports a future archival job without a schema change.
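If the archival job in question 3 is built, its cutoff computation could be sketched as follows. The tier keys and function name are hypothetical, and years are approximated as 365 days; the retention periods are the ones quoted in the question.

```python
from datetime import datetime, timedelta, timezone

# Retention periods from open question 3; tier keys are illustrative
RETENTION = {
    "free": timedelta(days=90),
    "pro": timedelta(days=3 * 365),
    "pro_plus": timedelta(days=5 * 365),
}


def archive_cutoff(tier: str, now: datetime) -> datetime:
    """Bars with a timestamp earlier than the returned value are archivable."""
    return now - RETENTION[tier]


now = datetime(2026, 5, 28, tzinfo=timezone.utc)
free_cutoff = archive_cutoff("free", now)
```

The cutoff depends only on the bar timestamp and the tier, which is why no schema change would be needed to support it.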


Risks

| Risk | Mitigation |
|---|---|
| Alpaca rate-limit hit during warm burst | Exponential backoff, max 3 retries; Sentry alert on exhaustion; warm job runs off-peak (22:00 UTC) |
| Partial bar cached (bar not yet closed) | Warm job runs after 22:00 UTC, by which time all RTH bars for the prior day are closed. Intraday on-demand fetch flags bars where timestamp > session_close_utc - 1min as potentially partial |
| Split adjustment after initial cache | Nightly re-pull of trailing 30 days with adjusted=TRUE upsert; fetched_at updated so reconciliation is auditable |
| Storage growth beyond 15 GB estimate | Index on (symbol, timestamp DESC) supports efficient range pruning; pg_partman available as post-v1 option |
| Alpaca data quality gaps (missing bars) | On-demand fetch logs gaps to Sentry; fill engine handles gap as "no bar available: use previous bar or queue to next available bar" per ADR-0108 §5 |

Consequences

Positive

Negative / risks

Neutral


References


Revisit when