Rolling-Income Monte Carlo Simulator + Regime-Aware Entry Gate

Strategy / Infrastructure ID: mc-regime-gate
Status: Research (reference implementation complete; walk-forward backtest pending data licensing resolution)
Date: 2026-05-06
Author: Data-scientist agent (Raxx / MQ-A layer)
GitHub issue: #246
Related epic: #79 (Backtesting Lab)
Depends on: markov-fit-analysis.md, 2026-05-05-layered-covered-call-strategy.md, historical-options-data-vendors.md
Reference implementation: docs/data-science/reference-impl/mc_regime_gate/

1. Goal and Invariants

What this package does

The Monte Carlo simulator + regime-aware entry gate is a structure tool for MQ-A. Given the user's stated strategy parameters (entry credit threshold, DTE band, delta band, position sizing, exit rule, roll rule), it:

Classifies the current market into one of three volatility regimes (four in the optional variant described in §3.3) derived from the HMM work in markov-fit-analysis.md.
Queries the historical record of how the user's specific rule has triggered and resolved within that regime over the most recent 24 months of paper-trade or backtest data.
Runs a Monte Carlo bootstrap over that regime-conditional history to produce a distribution of outcomes (not a point prediction) for a fresh cycle entry under current conditions.
Surfaces a confidence indicator — High / Moderate / Low / Outside-Distribution — reflecting how closely today's regime + IV context resembles the historical sample the rule was calibrated on.

The output is retrospective analysis: "given your rule and this regime, here is the distribution of what happened historically." It does not forecast future outcomes.

What this package does NOT do

It does NOT auto-fire orders or recommend trades. Execution is deterministic and user-directed (see feedback_deterministic_execution_ai_augments.md). The regime gate is information, not instruction.
It does NOT predict whether the next cycle will be profitable. Every output is framed in the past tense with explicit historical markers.
It does NOT override the user's stated entry rule. If the user's rule says "enter when IVR > 50 and DTE = 38," the simulator uses that rule as the trigger definition and samples historical instances that matched it.
It does NOT replace the hard stops (2× credit stop, 21-DTE exit, covered-call constraint). Those are enforced by the MBT execution layer regardless of what the regime gate reports.
It does NOT output a recommendation in the sense of Investment Advisers Act §202(a)(11). The confidence indicator communicates statistical similarity between current conditions and the historical sample — not counsel on whether to trade.

Invariant summary

Invariant	Mechanism
No auto-fire	Regime gate output is read-only for the user; order routing is not gated on it
No prediction	All output phrased as "in the tested period, this rule triggered X times in this regime; outcomes distributed as Y"
No personalized advice	Output is statistical description of the user's own configured rule on historical data
Deterministic execution	The existing MBT rule engine fires based on its own criteria; MQ-A regime signal is advisory display only
Reproducibility	All Monte Carlo runs produce a seed-locked artifact; same seed = same output

2. Inputs

2.1 User strategy parameters (sourced from user's MBT profile)

These are the parameters the user has already committed to in their strategy configuration. The simulator does not ask the user to supply new parameters — it reads what they have set.

Parameter	Type	Example	Source
`strategy_type`	enum	`iron_condor`, `credit_spread_put`, `csp`, `covered_call`	MBT profile
`entry_ivr_min`	float [0, 100]	50.0	MBT profile
`dte_min`	int	30	MBT profile
`dte_max`	int	45	MBT profile
`short_delta_min`	float	0.15	MBT profile
`short_delta_max`	float	0.20	MBT profile
`entry_credit_min_pct_width`	float	0.30	MBT profile (credit ≥ 30% of width)
`position_size_pct_notional`	float	0.05	MBT profile (5% of account per cycle)
`profit_take_pct`	float	0.50	MBT profile (close at 50% of max profit)
`stop_loss_multiple`	float	2.0	MBT profile (exit at 2× credit received)
`dte_exit`	int	21	MBT profile (time-exit if not closed)
`roll_rule`	string	`roll_up_for_credit`	MBT profile
`underlying`	string	`SPY`	User-selected
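
As a minimal sketch, the parameter table above could be mirrored by an immutable container that the simulator reads but never mutates. The class name `StrategyParams` and the range checks are assumptions made for illustration; the source does not describe the actual MBT profile schema.

```python
from dataclasses import dataclass

# Strategy types listed in the table above.
ALLOWED_STRATEGIES = {"iron_condor", "credit_spread_put", "csp", "covered_call"}


@dataclass(frozen=True)
class StrategyParams:
    """Hypothetical read-only view of the MBT profile fields in section 2.1."""

    strategy_type: str
    entry_ivr_min: float
    dte_min: int
    dte_max: int
    short_delta_min: float
    short_delta_max: float
    entry_credit_min_pct_width: float
    position_size_pct_notional: float
    profit_take_pct: float
    stop_loss_multiple: float
    dte_exit: int
    roll_rule: str
    underlying: str

    def __post_init__(self) -> None:
        # Range checks are assumptions inferred from the table's Type column.
        if self.strategy_type not in ALLOWED_STRATEGIES:
            raise ValueError(f"unknown strategy_type: {self.strategy_type!r}")
        if not 0.0 <= self.entry_ivr_min <= 100.0:
            raise ValueError("entry_ivr_min must lie in [0, 100]")
        if self.dte_min > self.dte_max:
            raise ValueError("dte_min must not exceed dte_max")
        if self.short_delta_min > self.short_delta_max:
            raise ValueError("short_delta_min must not exceed short_delta_max")


# Example values copied from the table above.
example_params = StrategyParams(
    strategy_type="iron_condor", entry_ivr_min=50.0, dte_min=30, dte_max=45,
    short_delta_min=0.15, short_delta_max=0.20, entry_credit_min_pct_width=0.30,
    position_size_pct_notional=0.05, profit_take_pct=0.50, stop_loss_multiple=2.0,
    dte_exit=21, roll_rule="roll_up_for_credit", underlying="SPY",
)
```

Freezing the dataclass matches the stated design intent that the simulator reads what the user has set rather than supplying new parameters.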

2.2 Historical data inputs

Data	Source	Window	Cadence	License status
SPY / target underlying OHLCV (daily)	Alpaca Market Data (current subscription)	5 years	Daily EOD	Included in $90.75/mo Algo Trader Plus
CBOE VIX daily close	CBOE public data feed (free)	5 years	Daily EOD	Public domain; no license issue
CBOE VIX3M daily close	CBOE public data feed (free)	5 years	Daily EOD	Public domain; no license issue
Historical options chain (strikes, bid/ask, delta, IV, expiry)	ORATS or equivalent (see `historical-options-data-vendors.md`)	5 years minimum; ideally 2007–present	Daily EOD snapshot	BLOCKED: enterprise license required before use in production backtest
IVR (IV rank, 52-week)	Derived from historical options chain	Rolling 252-day	Derived at compute time	Derived from above; same license dependency
Paper-trade log (user's own cycles)	Internal MBT database	All user cycles	Per-trade	Internal only; no license issue
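
The table lists IVR as a rolling 252-day derivation but does not spell out the formula. The sketch below assumes the common definition (today's IV as a percentage of its trailing min-to-max range); the function name `iv_rank` and the sample values are hypothetical.

```python
import pandas as pd


def iv_rank(iv: pd.Series, window: int = 252) -> pd.Series:
    """IV rank in [0, 100]: where today's IV sits inside its trailing range.

    Assumed definition: 100 * (IV - min) / (max - min) over `window` days.
    """
    low = iv.rolling(window).min()
    high = iv.rolling(window).max()
    span = high - low
    # A flat IV history has no range, so the rank is left undefined (NaN).
    return 100.0 * (iv - low) / span.where(span > 0)


# Tiny illustrative series with a 3-day window instead of 252.
sample_iv = pd.Series([0.10, 0.20, 0.15, 0.30, 0.12])
sample_rank = iv_rank(sample_iv, window=3)
```

The first `window - 1` values are NaN because the trailing range is not yet defined, which is one reason the historical chain window needs to be longer than the analysis window.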

Data licensing note: The reference implementation runs on synthetic data and yfinance-sourced SPY/VIX data for demonstration. Production-grade backtesting against real options chains requires a licensed historical options dataset. ORATS at the enterprise tier is the recommended path per historical-options-data-vendors.md §8. Do not run production backtests against real options chains until the licensing review with counsel is complete.

2.3 Regime signal inputs (from the MQ-A regime service)

The regime signal is computed nightly as a separate upstream step (documented in markov-fit-analysis.md Application A). This package consumes that signal as an input.

{
  "as_of_date": "2026-05-06",
  "state_probabilities": {"calm": 0.72, "elevated": 0.24, "stress": 0.04},
  "current_state": "calm",
  "entry_gate": "open",
  "model_version": "hmm-v1.0"
}
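
A consumer of this payload may want to sanity-check it before use. The sketch below is one way to do that under stated assumptions: that `current_state` is the most probable state, and that `entry_gate` takes one of the three values named in §5.3. The function name `parse_regime_signal` is hypothetical.

```python
import json

STATES = ("calm", "elevated", "stress")
GATES = ("open", "caution", "closed")


def parse_regime_signal(raw: str) -> dict:
    """Parse the nightly regime payload and apply basic sanity checks."""
    signal = json.loads(raw)
    probs = signal["state_probabilities"]
    if set(probs) != set(STATES):
        raise ValueError(f"expected states {STATES}, got {sorted(probs)}")
    if abs(sum(probs.values()) - 1.0) > 1e-6:
        raise ValueError("state probabilities must sum to 1")
    # Assumption: current_state is the highest-probability state.
    if signal["current_state"] != max(probs, key=probs.get):
        raise ValueError("current_state is not the most probable state")
    if signal["entry_gate"] not in GATES:
        raise ValueError(f"unknown entry_gate: {signal['entry_gate']!r}")
    return signal


payload = """{
  "as_of_date": "2026-05-06",
  "state_probabilities": {"calm": 0.72, "elevated": 0.24, "stress": 0.04},
  "current_state": "calm",
  "entry_gate": "open",
  "model_version": "hmm-v1.0"
}"""
signal = parse_regime_signal(payload)
```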

3. Regime Classification

This package inherits the regime model from markov-fit-analysis.md. The classification is summarized here for completeness.

3.1 Regime set

A 3-state HMM is fitted on daily VIX log-returns and the SPY 20-day realized-vs-implied volatility ratio. The three states map to:

State	Label	Approximate VIX range	Typical conditions
0	Calm	VIX < 18	Low volatility, no trending crisis; iron condors and CSPs perform well in historical samples
1	Elevated	VIX 18–28	Elevated uncertainty; strategies viable but require wider wings / smaller size per the parameter-tuner lookup in `markov-fit-analysis.md` Application B
2	Stress	VIX > 28	Crisis or sharp vol spike; historical samples show most strategy failures concentrated here; regime gate outputs "closed" when P(state=2) ≥ 0.65

The Monte Carlo in this package operates within a regime: it bootstrap-samples only from historical cycles that occurred while the same regime was active. This is the "regime-aware" piece — a naive bootstrap would sample across all regimes indiscriminately and dilute the regime-specific signal.

3.2 Regime detection for the target window

At runtime, the HMM forward algorithm assigns a filtered probability P(state_t | observations_1:t) to each day in the historical window; the forward-backward smoother additionally conditions on later observations, giving P(state_t | observations_1:T). Each historical cycle is tagged with the regime that was active at its entry date, not at its expiry date. This is the correct point-in-time attribution: the entry decision was made under the conditions of the entry-date regime.

Look-ahead bias note: Labeling history with the smoothed (two-sided, forward-backward) posterior or with the Viterbi path introduces look-ahead bias, because both use future observations to refine past state assignments. For retrospective analysis of the user's own past cycles, the smoothed path is acceptable since the user is not making forward decisions. For any live signal that gates a future trade, the Hamilton filter (causal, one-sided) must be used instead. The reference implementation documents which mode is active in the RegimeClassifier.classify() call.
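
To make the causal versus two-sided distinction concrete, here is a minimal numpy sketch of a forward filter and a backward smoothing pass for a Gaussian-emission HMM. It is not the reference implementation; the function names, the 2-state setup, and every parameter value are hypothetical.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2 * np.pi))


def forward_filter(obs, trans, means, stds, init):
    """Filtered P(state_t | obs_1..t): uses only past and present observations."""
    n_obs, n_states = len(obs), len(init)
    filtered = np.zeros((n_obs, n_states))
    predicted = np.zeros((n_obs, n_states))  # P(state_t | obs_1..t-1)
    prior = np.asarray(init, dtype=float)
    for t in range(n_obs):
        predicted[t] = prior
        joint = prior * gaussian_pdf(obs[t], means, stds)
        filtered[t] = joint / joint.sum()
        prior = filtered[t] @ trans  # one-step-ahead state prediction
    return filtered, predicted


def backward_smooth(filtered, predicted, trans):
    """Smoothed P(state_t | obs_1..T): every day also sees later observations."""
    smoothed = filtered.copy()
    for t in range(len(filtered) - 2, -1, -1):
        ratio = smoothed[t + 1] / predicted[t + 1]
        smoothed[t] = filtered[t] * (trans @ ratio)
    return smoothed


# Hypothetical 2-state parameters (calm, stress) for a daily observation series.
trans = np.array([[0.97, 0.03], [0.10, 0.90]])
means = np.array([0.0, 0.02])
stds = np.array([0.04, 0.12])
init = np.array([0.8, 0.2])
obs = np.array([0.01, -0.02, 0.00, 0.25, 0.18, -0.15])

filtered, predicted = forward_filter(obs, trans, means, stds, init)
smoothed = backward_smooth(filtered, predicted, trans)
```

The filtered probabilities for the first three days do not change when later observations are appended, whereas the smoothed ones can. That property is what makes the filter suitable for a live gate and the smoother suitable only for retrospective tagging.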

3.3 Expanded 4-regime variant (optional)

For strategies with a strong directional component (credit spreads), a 4-state regime that adds a trend dimension may be useful:

State	Label	Approximate conditions
0	Low-vol trend-up	VIX < 18, SPY above 50-day MA, positive 20-day momentum
1	Low-vol mean-reverting	VIX < 18, SPY in range, low momentum
2	High-vol trend-down	VIX > 18, SPY below 50-day MA, negative momentum
3	High-vol spike / stress	VIX > 28, sharp realized-vol increase

The 4-state variant is architecturally identical to the 3-state version but requires a larger historical sample per state to estimate transition matrices reliably (see markov-fit-analysis.md §4.2 on data hungriness). This package ships with 3-state as default; 4-state is a configuration option requiring n_states=4 in the HMMRegimeClassifier.

4. Monte Carlo Procedure

4.1 Bootstrap vs. parametric

This package uses stratified residual bootstrap rather than parametric Monte Carlo. The reasons:

Options P/L distributions are not Gaussian. Fat tails, skewness from gamma exposure, and discrete premium amounts make parametric assumptions unreliable without substantial distributional fitting work that is itself a research task.
The user's own historical cycles (or the paper-trade backtest cycles) are the most informative population to resample from. Bootstrapping preserves the empirical distribution including its non-normality.
A parametric Monte Carlo requires fitting parameters (mean, variance, skewness of per-cycle P/L) that are estimated from small samples (a typical user might have 20–50 cycles in a given regime). Bootstrap confidence intervals on small samples are more honest about uncertainty than parametric intervals.

The bootstrap procedure is regime-stratified: only historical cycles where regime_at_entry == current_regime are eligible for resampling. If there are fewer than 15 regime-matched cycles in the historical sample, the simulator flags this explicitly and widens the bootstrap to include adjacent regimes, with a warning in the output.
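
The pool-selection rule above can be sketched as follows. The adjacency map (calm next to elevated, elevated next to stress) and the dictionary-based cycle records are assumptions for illustration; the source only states that the pool widens to "adjacent regimes" below 15 cycles.

```python
# Assumed ordering calm < elevated < stress defines which regimes are adjacent.
ADJACENT = {
    "calm": ("elevated",),
    "elevated": ("calm", "stress"),
    "stress": ("elevated",),
}
MIN_POOL = 15  # threshold stated in the text above


def select_pool(cycles, current_regime, min_pool=MIN_POOL):
    """Return (pool, widened): regime-matched cycles, widened if too few."""
    matched = [c for c in cycles if c["regime_at_entry"] == current_regime]
    if len(matched) >= min_pool:
        return matched, False
    neighbours = ADJACENT[current_regime]
    extra = [c for c in cycles if c["regime_at_entry"] in neighbours]
    # The caller is expected to surface a warning whenever widened is True.
    return matched + extra, True


# Hypothetical history: 10 calm, 7 elevated, 3 stress cycles.
history = (
    [{"regime_at_entry": "calm", "net_pnl": 100.0}] * 10
    + [{"regime_at_entry": "elevated", "net_pnl": -50.0}] * 7
    + [{"regime_at_entry": "stress", "net_pnl": -300.0}] * 3
)
pool, widened = select_pool(history, "calm")
```

Returning the `widened` flag alongside the pool keeps the warning decision with the caller, so the output can state explicitly that adjacent-regime cycles were mixed in.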

4.2 Number of paths

Default: 10,000 bootstrap paths. Each path is one randomly sampled set of N cycles, where N is the number of cycles in the user's configured forward window. N defaults to 12, which represents approximately one year of monthly condors or about six months of biweekly cycles.

10,000 paths is sufficient to stabilize the 5th and 95th percentile estimates to within ±1–2% for typical P/L distributions. Runtime on a modern laptop: approximately 0.3–0.8 seconds for 10,000 paths × 12 cycles with vectorized numpy operations. This is within the latency budget for an on-demand API call.
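
A minimal sketch of the vectorized resampling step, assuming the pool is a flat array of per-cycle net P/L values. The function name `bootstrap_paths` and the pool values are hypothetical; the seed default mirrors the `random_seed=42` used by the demo.

```python
import numpy as np


def bootstrap_paths(cycle_pnl, n_paths=10_000, n_cycles=12, random_seed=42):
    """Resample per-cycle net P/L with replacement into (n_paths, n_cycles)."""
    pool = np.asarray(cycle_pnl, dtype=float)
    if pool.size == 0:
        raise ValueError("bootstrap pool is empty")
    rng = np.random.default_rng(random_seed)
    idx = rng.integers(0, pool.size, size=(n_paths, n_cycles))
    return pool[idx]  # fancy indexing draws every path at once, no Python loop


# Hypothetical regime-matched pool of per-cycle net P/L in dollars.
pool = [120.0, 95.0, -310.0, 140.0, 60.0, -45.0, 155.0, 110.0, -280.0, 130.0]
paths = bootstrap_paths(pool)
net_pnl = paths.sum(axis=1)
percentiles = np.percentile(net_pnl, [5, 25, 50, 75, 95])
```

Because the generator is seeded, calling `bootstrap_paths(pool)` twice returns identical arrays, which is the property the reproducibility invariant relies on.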

4.3 Outcome metrics per path

Each path produces the following metrics, which are then aggregated across all 10,000 paths to yield percentile distributions:

Metric	Definition	Why it matters
`total_credit_collected`	Sum of entry credits across N cycles	Gross income before exits and losses
`net_pnl`	Sum of (entry credit − exit cost − commissions) across N cycles	The bottom line; reflects stops and profit-takes
`win_rate`	Fraction of cycles that closed at profit	Regime-conditional win rate; key teaching metric
`max_drawdown`	Largest peak-to-trough equity decline within the N-cycle path	Tail-loss indicator; captures the worst decline inside each path rather than the path's average outcome
`sharpe_ratio`	Mean net P/L per cycle divided by std dev, annualized	Risk-adjusted summary; requires ≥30 cycles to be meaningful
`days_in_trade_mean`	Average holding period across winning cycles	Characterizes time-in-position; input for capital efficiency
`stop_trigger_rate`	Fraction of cycles that hit the 2× stop	Strategy health indicator; elevated stop rate signals regime mismatch

4.4 Aggregated output (the "distribution report")

The 10,000-path distribution is summarized as:

{
  "strategy_id": "iron_condor",
  "underlying": "SPY",
  "regime": "calm",
  "n_historical_cycles_in_regime": 38,
  "n_bootstrap_paths": 10000,
  "forward_window_cycles": 12,
  "as_of_date": "2026-05-06",
  "metrics": {
    "net_pnl": {
      "p5": -412.50,
      "p25": 180.00,
      "p50": 520.00,
      "p75": 890.00,
      "p95": 1640.00
    },
    "win_rate": {
      "p5": 0.50,
      "p25": 0.67,
      "p50": 0.75,
      "p75": 0.83,
      "p95": 0.92
    },
    "max_drawdown": {
      "p5": -95.00,
      "p50": -280.00,
      "p95": -1100.00
    },
    "stop_trigger_rate": {
      "p50": 0.12,
      "p95": 0.33
    }
  },
  "confidence_indicator": "HIGH",
  "confidence_rationale": "38 historical regime-matched cycles; current IV context within 1.2 std dev of regime mean"
}

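All dollar amounts are per-unit (1 contract / 100-share lot). The MBT layer scales by the user's configured position size. In the example above, the max_drawdown percentiles are ordered by drawdown magnitude, so p95 is the deepest drawdown.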

5. Regime-Aware Entry Gate Logic

5.1 The confidence indicator

The confidence indicator summarizes how reliable the Monte Carlo output is expected to be, given:

Sample size — how many historical cycles in the current regime are in the bootstrap population.
IV context match — how similar current IVR, VIX level, and VIX term structure slope are to the mean of the historical regime-matched sample.
Regime stability — the probability of remaining in the current regime over the strategy's expected hold period (from the HMM transition matrix).

Indicator	Criteria	Meaning
HIGH	≥30 regime-matched cycles AND current IV context within 1.5 std dev of regime mean AND P(regime transition within DTE_max days) < 0.20	Historical sample is large and closely resembles current conditions; bootstrap output is most informative
MODERATE	15–29 cycles OR IV context 1.5–2.5 std dev from mean OR transition P 0.20–0.40	Reasonable basis for the distribution; note the specific constraint in the output
LOW	8–14 cycles OR IV context > 2.5 std dev from mean OR transition P 0.40–0.65	Small sample or unusual conditions; bootstrap distribution is wide; treat as rough reference only
OUTSIDE-DISTRIBUTION	< 8 cycles in regime OR current conditions > 3 std dev from any regime center	Current conditions have little precedent in the historical data; the simulator has no reliable basis for a distribution
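
The tier table can be read as a cascade evaluated from the most restrictive tier downward. The sketch below takes that reading, which is an assumption: the table does not state evaluation order, boundary handling at exactly 1.5 or 2.5 standard deviations, or what happens when the transition probability exceeds 0.65 (treated as LOW here). It also simplifies "3 std dev from any regime center" to distance from the current regime's mean. The transition probability is computed as one minus the self-transition probability raised to the number of days, which assumes a first-order chain with a constant transition matrix. All names and matrix values are hypothetical.

```python
import numpy as np


def prob_leave_within(trans, state, n_days):
    """P(at least one move out of `state` within n_days) for a Markov chain."""
    return 1.0 - trans[state, state] ** n_days


def confidence_indicator(n_cycles, iv_z, p_transition):
    """Map sample size, IV distance, and transition risk to a tier.

    iv_z is the absolute distance of current IV context from the regime
    mean, in standard deviations. The most restrictive tier is checked first.
    """
    if n_cycles < 8 or iv_z > 3.0:
        return "OUTSIDE-DISTRIBUTION"
    if n_cycles < 15 or iv_z > 2.5 or p_transition >= 0.40:
        return "LOW"
    if n_cycles < 30 or iv_z > 1.5 or p_transition >= 0.20:
        return "MODERATE"
    return "HIGH"


# Hypothetical daily transition matrix over (calm, elevated, stress).
trans = np.array([
    [0.996, 0.0035, 0.0005],
    [0.030, 0.950, 0.020],
    [0.005, 0.075, 0.920],
])
p_leave_calm = prob_leave_within(trans, state=0, n_days=45)
tier = confidence_indicator(n_cycles=38, iv_z=1.2, p_transition=p_leave_calm)
```

Under this interpretation, a transition probability below 0.20 over a 45-day window requires a daily self-transition probability above roughly 0.995, so the HIGH tier is only reachable in very persistent regimes.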

5.2 What the indicator does NOT do

The confidence indicator does not tell the user whether to enter a trade. It surfaces statistical similarity. A LOW indicator means the historical basis is thin, not that the trade is bad. A HIGH indicator means conditions closely resemble historical samples, not that the outcome will match the median.

This distinction is important for user-facing copy. The Antlers card should say: "Your rule has triggered in similar conditions 38 times historically. Here is the distribution of outcomes across those cycles." It does not say "enter" or "do not enter."

5.3 Entry gate vs. confidence indicator

These are distinct outputs from two distinct upstream systems:

Signal	Source	What it means	User action
`entry_gate` (open/caution/closed)	HMM regime service (`markov-fit-analysis.md` App A)	Whether current regime is historically hostile to the strategy as a class	Advisory; user retains control
`confidence_indicator` (High/Moderate/Low/Outside-Distribution)	Monte Carlo bootstrap (this package)	How representative the historical sample behind the distribution is	Contextual; informs how much weight to give the distribution report

The two signals can disagree. Example: entry_gate = "caution" (P(stress) = 0.45, elevated but not closed) and confidence_indicator = "HIGH" (40 regime-matched historical cycles, current IV exactly on the historical mean). That combination means: "the regime is elevated and the regime-aware distribution reflects that — 40 historical elevated-regime cycles give us a reliable picture of what happened." The user sees both signals and makes their own call.

6. Reference Python Implementation

Location: docs/data-science/reference-impl/mc_regime_gate/

Run command:

python -m mc_regime_gate.demo

Dependencies: pandas, numpy, scipy, yfinance — all freely available; no proprietary data dependency for the demo. The demo uses synthetic options cycle data and yfinance-sourced SPY daily closes + CBOE VIX.

Module structure:

mc_regime_gate/
  __init__.py
  regime_classifier.py   — HMM wrapper; 3-state Gaussian HMM on VIX/SPY realized-vol
  bootstrap_engine.py    — stratified residual bootstrap; 10K paths; vectorized numpy
  confidence_scorer.py   — computes confidence indicator from sample size + IV distance + transition P
  data_loader.py         — loads yfinance SPY/VIX for demo; stub for ORATS integration
  models.py              — dataclasses: CycleRecord, RegimeState, MCResult, ConfidenceReport
  demo.py                — end-to-end demo on synthetic + yfinance data; prints result table

Key design decisions:

The HMM is loaded at import time from a pre-serialized model file (hmm_regime_v1.pkl) if present, or fitted fresh from the loaded price data if not. This mirrors the production pattern where MQ-A loads a quarterly-recalibrated model artifact at Celery task startup.
The bootstrap engine is fully vectorized using numpy array operations — no Python-level loop over paths. 10,000 paths on 12 cycles runs in < 1 second on a single core.
All random operations accept a random_seed parameter for reproducibility. Demo uses random_seed=42.
The reference implementation does NOT import hmmlearn directly — it uses a minimal HMM implementation backed by scipy and numpy so the demo runs without additional pip install steps beyond the standard scientific stack. Production deployment would swap in hmmlearn or statsmodels.tsa.regime_switching for robustness (see markov-fit-analysis.md §5).

7. Historical Scenario Backtest — SPY Iron Condor, 2022

7.1 Setup

2022 was a sustained high-volatility-trend regime: the Federal Reserve's rate-hiking cycle drove a sustained drawdown (the S&P 500 lost 19.4% on a price basis in 2022, roughly 18% with dividends reinvested) and VIX spent most of the year between roughly 20 and 35. This is an archetypal "strategy failure year" for iron condors, because the directional trending move violates the range-bound assumption. That makes 2022 the hardest test for the regime gate: can it discriminate against entries that the rule's simple IVR > 50 filter would accept?

Backtest parameters:

Parameter	Value
Underlying	SPY
Strategy	Iron condor, 30–45 DTE
Short delta	16 (standard; fixed)
Entry rule	IVR > 50 (naive baseline)
Exit rules	50% profit-take; 2× stop; 21-DTE time-exit
Commissions	$0.65/contract ($2.60 per four-leg condor each way; $5.20 round-trip)
Fill model	NBBO mid (matching MBT fill model)
Data source	Synthetic data calibrated to 2022 SPY/VIX realized statistics
Regime gate threshold	Suppress new opens when P(state=Stress) ≥ 0.40
Training window	Jan 2017 – Dec 2021 (HMM fitted on this period)
Test window	Jan 2022 – Dec 2022

Note: Options chain data for this scenario uses synthetic chains generated from the Black-Scholes model with realized 2022 SPY/VIX parameters. This is NOT a production-grade backtest. A production backtest requires licensed historical options chains (ORATS or equivalent). Results here are illustrative of regime-gate discrimination, not a robust out-of-sample estimate of live-trading P/L.

7.2 Results table

Metric	Naive (IVR > 50, no gate)	Regime-gated (suppress when P(stress) ≥ 0.40)
Total cycles opened	24	9
Win rate	45.8% (11/24)	66.7% (6/9)
Mean P/L per cycle (1 condor, 1 lot)	−$87.50	+$142.00
Total P/L, full year	−$2,100	+$1,278
Max drawdown (equity curve)	−$1,640	−$380
Cycles hitting 2× stop	10 (41.7%)	2 (22.2%)
Trades suppressed by gate	—	15
Mean days in trade (winners)	14.2	18.6
Sharpe (annualized, per-cycle)	−0.41	+0.88

Confidence indicator distribution for regime-gated entries:

- HIGH: 6 of 9 entries (67%)
- MODERATE: 2 of 9 entries (22%)
- LOW: 1 of 9 entries (11%)
- OUTSIDE-DISTRIBUTION: 0 entries (gate blocked all such setups)

7.3 Interpretation

The regime gate reduced the number of entries from 24 to 9, filtering out 15 cycles that occurred during elevated-stress or stress-regime periods (P(stress) ≥ 0.40). Assuming the 9 gated entries are a subset of the 24 naive entries, the results table implies that 10 of those 15 suppressed trades were losers in the naive simulation (13 − 3) and 5 were winners (11 − 6), so one third of the suppressed trades (5 of 15) would have been profitable. This is the expected cost of a regime filter: it reduces exposure at the price of missing some winners.

The 2022 year in the synthetic simulation shows the regime gate improving win rate from 45.8% to 66.7% and converting a −$2,100 year to a +$1,278 year. These numbers should not be quoted as expected live-trading performance — they are from synthetic data on a single stressful year. The honest summary: in the synthetic scenario, the regime gate has the effect that theory predicts.

Statistical note: 24 cycles in the naive case and 9 in the gated case are too few to establish statistical significance on win rate or P/L distributions. A production-grade backtest over 10+ years and multiple underlyings would be required to make claims about statistical significance with reasonable confidence (Sharpe > 1.5 with p < 0.05 on 10-year out-of-sample requires approximately 120 monthly cycles or 250+ biweekly cycles). This synthetic demo is a proof-of-concept, not a publishable research result.

7.4 Walk-forward validation requirement

Before this package is promoted from research to walk-forward-pass status, the following must be completed on real licensed data:

Train HMM on Jan 2007 – Dec 2019 (training window from markov-fit-analysis.md §3, Application A).
Test on Jan 2020 – Dec 2025 out-of-sample (includes COVID crash, 2022 bear, 2023–24 bull).
Walk-forward with quarterly re-training window (re-fit HMM on rolling 5-year window; evaluate on next 3 months; advance window; repeat).
Confirm that the regime gate's win-rate improvement on the out-of-sample period is statistically significant at p < 0.10 (given the low cycle count, p < 0.05 may not be achievable for individual underlyings; aggregate across underlyings or use permutation test).
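
The permutation test mentioned in the last step can be sketched as below: shuffle the pooled win/loss outcomes, re-split them into two groups of the original sizes, and count how often the shuffled win-rate gap is at least as large as the observed one. The function name and the outcome counts are illustrative only, not results from this package.

```python
import numpy as np


def permutation_pvalue(wins_a, wins_b, n_perm=10_000, random_seed=42):
    """One-sided p-value for win_rate(a) > win_rate(b) under shuffled labels."""
    a = np.asarray(wins_a, dtype=float)
    b = np.asarray(wins_b, dtype=float)
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    rng = np.random.default_rng(random_seed)
    hits = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(pooled)
        diff = shuffled[: a.size].mean() - shuffled[a.size:].mean()
        hits += int(diff >= observed)
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Illustrative outcomes only (1 = win, 0 = loss).
gated = [1] * 20 + [0] * 10
ungated = [1] * 14 + [0] * 16
p_value = permutation_pvalue(gated, ungated)
```

A permutation test makes no distributional assumption about per-cycle outcomes, which suits the small, non-Gaussian samples this section describes. It does assume the two groups are disjoint sets of cycles.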

8. Risk Analysis

8.1 Bias sources

Look-ahead bias (regime labeling): If the HMM smoothed posterior or the Viterbi state sequence is used to label historical cycles, future observations influence past regime assignments. This means the regime labels are cleaner than they would have been if computed in real time. Mitigation: in the production signal, use the Hamilton forward filter (causal); in the retrospective per-user-cycle tagging, the smoothed path is acceptable but must be flagged in the model card.

Look-ahead bias (options data): Point-in-time options data reconstruction is non-trivial. End-of-day snapshots from ORATS are taken after market close, meaning the data reflects the day's closing mid-prices. Intraday entry modeling would require intraday snapshots. The EOD snapshot assumption introduces a timing bias: actual fills at entry may be better or worse than EOD mid depending on time-of-day execution. The reference implementation uses EOD data and documents this assumption.

Regime mis-classification: A 3-state HMM will mis-classify regime approximately 10–15% of the time based on empirical HMM literature for daily equity data. During the transition period (calm → elevated or elevated → stress), the HMM posterior probability is uncertain and the classifier may lag the actual regime shift by 2–5 days. For 30-DTE strategies, a 5-day lag on regime detection can mean entering a position at the beginning of a stress episode before the gate closes. The 2× stop rule is the correct safety net for this failure mode; the regime gate does not replace it.

Survivorship bias (underlying universe): SPY is the safest choice for regime modeling because it is the aggregate index and has continuous history. Individual equity underlyings (AMZN, AAPL) may have idiosyncratic events (earnings, regulatory news) that are not captured by the index-level regime model. A regime that is "calm" for SPY may be highly volatile for an individual name. This package is calibrated for index-level or broad-ETF underlyings. For individual equities, the confidence indicator will more frequently return LOW or OUTSIDE-DISTRIBUTION because the historical sample of regime-matched cycles for a specific ticker is smaller.

Options data sparsity (strike granularity): Historical EOD options snapshots capture a discrete set of available strikes at a point in time. Near-the-money strikes are populated densely; far-OTM strikes in illiquid periods may have missing or stale bid/ask data. The reference implementation does not simulate strike sparsity. A production backtest should flag cycles where the target delta strike was not available and either interpolate or exclude that cycle.

Small sample in extreme regimes: As noted in markov-fit-analysis.md §4.2, the stress regime comprises approximately 250–350 days in the 2007–2025 period. With 30-DTE strategies, that translates to roughly 8–12 complete stress-regime cycles on SPY (many cycles straddle regime transitions). The LOW and OUTSIDE-DISTRIBUTION confidence tiers exist precisely to surface this sample sparsity to the user.

8.2 What this package cannot tell the user

Whether the strategy will be profitable in the future. The Monte Carlo samples from the past; future regimes may differ structurally (e.g., a sustained low-VIX regime with structurally different term structure dynamics than any prior low-VIX period).
Whether the current cycle will match the historical distribution. Path-dependent events (earnings surprises, flash crashes, macro shocks) occur within cycles and are not captured by regime-level statistics.
Optimal position sizing. Kelly criterion or risk-of-ruin sizing requires reliable estimates of win rate and payout ratio, which are themselves uncertain outputs of the bootstrap. The package surfaces the distribution; position sizing is the user's decision.
Tax treatment or cost basis implications. The package tracks gross P/L only. Tax lot accounting and capital gains treatment are outside scope (consistent with the LCC strategy spec).
Whether the data used to fit the HMM is representative of future market structure. Market microstructure, options market liquidity, and the volatility risk premium are not stationary. A regime model fitted on 2007–2025 may not generalize to 2030+ without recalibration.

9. Handoff Packet for Feature-Developer

9.1 Architecture split

Concern	Layer	Notes
HMM regime model fitting and serialization	MQ-A (offline, quarterly batch job)	Runs in a Celery task or standalone script on a schedule; outputs `.pkl` artifact
Nightly regime signal computation (forward filter)	MQ-A (nightly Celery task `mq_a.compute_regime_signal`)	Defined in `markov-fit-analysis.md` Application A; reuse that signal as input here
Monte Carlo bootstrap engine	MQ-A (on-demand Celery task `mq_a.run_mc_bootstrap`)	Triggered by user action; async; result stored in DB
Confidence scorer	MQ-A (inline within `mq_a.run_mc_bootstrap`)	Runs at end of bootstrap; small compute footprint
REST endpoints	Raptor (Flask blueprints under `/api/mq-a/`)	Thin wrappers over Celery async dispatch and DB reads
UI regime card + distribution display	Antlers	Reads from Raptor endpoints; display only

9.2 API surface

Trigger Monte Carlo run (async):

POST /api/mq-a/mc-bootstrap
Body: {
  "strategy_type": "iron_condor",
  "underlying": "SPY",
  "forward_cycles": 12
}
Response: {
  "task_id": "abc123",
  "status": "queued",
  "estimated_latency_ms": 800
}

Poll result:

GET /api/mq-a/mc-bootstrap/{task_id}
Response: MCResult JSON (see §4.4 schema above) or {"status": "pending"}
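
A client-side polling loop for the two endpoints above might look like the sketch below. To keep it free of any HTTP library, the request itself is injected as a hypothetical `fetch` callable that wraps `GET /api/mq-a/mc-bootstrap/{task_id}` and returns the decoded JSON body; the timeout and interval defaults are assumptions.

```python
import time
from typing import Callable


def poll_mc_result(
    fetch: Callable[[str], dict],
    task_id: str,
    timeout_s: float = 5.0,
    interval_s: float = 0.2,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll until the body is no longer {"status": "pending"} or time runs out."""
    deadline = clock() + timeout_s
    while True:
        body = fetch(task_id)
        # A finished MCResult has no "status" key, so it falls through here.
        if body.get("status") != "pending":
            return body
        if clock() >= deadline:
            raise TimeoutError(f"MC task {task_id} still pending after {timeout_s}s")
        sleep(interval_s)


# Stand-in for the HTTP call: pending twice, then a (truncated) result body.
_responses = iter([
    {"status": "pending"},
    {"status": "pending"},
    {"strategy_id": "iron_condor", "regime": "calm"},
])
result = poll_mc_result(lambda task_id: next(_responses), "abc123",
                        sleep=lambda seconds: None)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting.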

Current regime (from existing Application A endpoint):

GET /api/mq-a/regime/current
Response: RegimeState JSON (see §2.3 schema above)

Bootstrap history for a user:

GET /api/mq-a/mc-bootstrap/history?underlying=SPY&strategy=iron_condor&limit=10
Response: list of MCResult objects (most recent first)

9.3 Database additions

New table: mq_a_mc_results

Column	Type	Notes
`id`	UUID	Primary key
`user_id`	UUID	FK to users table
`underlying`	varchar(10)	SPY, QQQ, etc.
`strategy_type`	varchar(32)	iron_condor, csp, etc.
`regime_at_run`	varchar(16)	calm, elevated, stress
`n_historical_cycles`	int	How many regime-matched cycles used
`n_bootstrap_paths`	int	10000 default
`forward_cycles`	int	User-configured window
`metrics_json`	JSONB	Full MCResult metrics dictionary
`confidence_indicator`	varchar(24)	HIGH, MODERATE, LOW, OUTSIDE-DISTRIBUTION
`confidence_rationale`	text	Human-readable explanation
`random_seed`	bigint	For reproducibility
`computed_at`	timestamptz	UTC

Schema addition to mbt_orders (or a new mbt_order_metadata table):

ALTER TABLE mbt_orders ADD COLUMN regime_at_entry varchar(16);
ALTER TABLE mbt_orders ADD COLUMN mc_result_id uuid REFERENCES mq_a_mc_results(id);

This enables the retrospective query "for all cycles where regime_at_entry = 'elevated', what was the actual P/L distribution versus the distribution the MC report showed?", which provides a feedback loop for model validation.
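
One simple form of that validation loop is a coverage check: for each stored MC run, compare the realized net P/L over the same forward window with the stored p5 to p95 band. A well-calibrated band would contain about 90% of realized outcomes. The record layout and the name `band_coverage` are hypothetical; the values are illustrative.

```python
def band_coverage(records):
    """Fraction of records whose realized net P/L lies inside the p5-p95 band.

    Each record pairs one stored MC run with the realized net P/L over the
    same forward window (not a single cycle), so the units match net_pnl.
    """
    if not records:
        raise ValueError("no records to evaluate")
    inside = sum(r["p5"] <= r["realized_net_pnl"] <= r["p95"] for r in records)
    return inside / len(records)


# Illustrative records only.
records = [
    {"p5": -412.5, "p95": 1640.0, "realized_net_pnl": 520.0},
    {"p5": -412.5, "p95": 1640.0, "realized_net_pnl": -900.0},
    {"p5": -300.0, "p95": 1200.0, "realized_net_pnl": 150.0},
]
coverage = band_coverage(records)
```

Coverage well below 0.90 over many runs would suggest the bootstrap pool understates tail outcomes for that regime.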

9.4 Flag-gating recommendation

This feature should be gated behind FLAG_MQ_A_MC_REGIME_GATE (feature flag).

Default: OFF on both staging and prod at launch. This is a new MQ-A capability that depends on the HMM regime signal being stable in production. The regime signal (Application A from markov-fit-analysis.md) should ship and be validated first, then the Monte Carlo layer is enabled as a follow-on.
Staging enable: Once the regime signal is validated on staging, enable FLAG_MQ_A_MC_REGIME_GATE on staging for internal testing.
Prod enable: After 30 days of staging validation showing no regressions in the regime signal and successful MC runs on demand, promote to prod.
Risk classification: HIGH (customer-facing: yes; the Antlers distribution card is a user-visible surface). All flag promotions require the two-reviewer approval flow.

9.5 Compute and storage budget

Scale	MC runs/day	Storage/day	Compute/run	Notes
100 users	~50 on-demand + nightly regime signal	~5KB per result × 50 = 250KB/day	< 1 second CPU	Negligible at this scale
1,000 users	~500 on-demand	~2.5MB/day	< 1 second CPU	Redis task queue handles concurrency; no scaling concern
10,000 users	~5,000 on-demand	~25MB/day	< 1 second CPU	At 10K scale, consider caching regime-level MC results (not user-specific) to avoid redundant computation

The HMM model artifact is < 1MB and loads at Celery task startup. The dominant cost is the options chain history load for the bootstrap population — approximately 5–10MB per underlying per 5-year window, loaded once per nightly batch and cached in Redis. Individual MC runs read from the cached bootstrap population, not from disk.

9.6 Monitoring guidance

Alert if mq_a.compute_regime_signal task has not completed by 22:00 UTC (regime signal is input to the MC; stale regime = stale confidence indicator).
Alert if any mq_a.run_mc_bootstrap task takes > 5 seconds (indicates either a very large historical cycle pool or a resource contention issue).
Log confidence_indicator == "OUTSIDE-DISTRIBUTION" at INFO level. If > 20% of runs in a 24-hour window return OUTSIDE-DISTRIBUTION, investigate whether the regime model needs recalibration.
Track n_historical_cycles_in_regime per underlying per regime in a time-series metric. If this drops below 15 for a given underlying, the confidence indicator will persistently return LOW/OUTSIDE-DISTRIBUTION and a model card update may be needed to note the limitation.
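
The OUTSIDE-DISTRIBUTION rule above reduces to a share computed over a trailing 24-hour window. A minimal sketch, assuming run records are available as `(computed_at, confidence_indicator)` pairs; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def outside_distribution_share(runs, now, window=timedelta(hours=24)):
    """Share of runs in the trailing window that returned OUTSIDE-DISTRIBUTION."""
    recent = [indicator for ts, indicator in runs if now - window <= ts <= now]
    if not recent:
        return 0.0
    return sum(ind == "OUTSIDE-DISTRIBUTION" for ind in recent) / len(recent)


def should_investigate(runs, now, threshold=0.20):
    """True when the share exceeds the 20% threshold from the guidance above."""
    return outside_distribution_share(runs, now) > threshold


# Illustrative run log: 2 of the 5 runs inside the window are outside-distribution.
now = datetime(2026, 5, 6, 22, 0, tzinfo=timezone.utc)
runs = [
    (now - timedelta(hours=1), "HIGH"),
    (now - timedelta(hours=3), "OUTSIDE-DISTRIBUTION"),
    (now - timedelta(hours=5), "MODERATE"),
    (now - timedelta(hours=8), "OUTSIDE-DISTRIBUTION"),
    (now - timedelta(hours=20), "LOW"),
    (now - timedelta(hours=30), "OUTSIDE-DISTRIBUTION"),  # outside the window
]
alert = should_investigate(runs, now)
```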

10. Open Questions

These are unresolved at the time of this research package and require operator input before feature-developer can build.

OQ-1 — Options data licensing (hard blocker)

Question: Has the licensing review for ORATS enterprise terms been completed? Until it has, the production Monte Carlo bootstrap cannot be filled from real historical options chains. The demo runs on synthetic data; the production feature does not.

Dependency: historical-options-data-vendors.md §8 outlines the required legal review. Matthew Crosby (IP counsel, engaged) or a contract specialist must confirm ORATS enterprise terms and OPRA non-display use classification before any historical options data is used in production backtests.

Impact if unresolved: The MC bootstrap will use only the user's own paper-trade cycle history (which may be too small for statistically meaningful results) or synthetic data. Neither is suitable for a marketable product.

OQ-2 — Minimum cycle count threshold for "useful" output

Question: What is the minimum number of regime-matched historical cycles that constitutes a useful bootstrap population? The reference implementation uses 15 as the threshold before falling back to adjacent-regime expansion, and 8 as the threshold before returning OUTSIDE-DISTRIBUTION. Are these thresholds appropriate from a product perspective, or should the UI simply not display the MC distribution until a minimum threshold is met?

Impact: This determines when new users (who have few historical cycles) see the MC feature vs. a "not enough data" placeholder.
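To make the two thresholds concrete, the selection logic might look like the sketch below. It rests on an assumption the text does not confirm: the floor of 8 is checked after adjacent-regime expansion rather than before it. The regime labels, the adjacency map, and the status strings other than OUTSIDE-DISTRIBUTION are hypothetical placeholders.

```python
MIN_REGIME_CYCLES = 15  # below this, expand into adjacent regimes
MIN_USABLE_CYCLES = 8   # below this (after expansion, by assumption), give up


def select_bootstrap_pool(cycles_by_regime, regime, adjacent):
    """Pick the historical cycle pool for a bootstrap and report how it was built.

    cycles_by_regime: dict mapping regime label -> list of historical cycle outcomes
    regime:           the current regime label
    adjacent:         dict mapping regime label -> list of neighbouring regime labels
    Returns (pool, status).
    """
    pool = list(cycles_by_regime.get(regime, []))
    if len(pool) >= MIN_REGIME_CYCLES:
        return pool, "REGIME_ONLY"

    # Too few regime-matched cycles: widen the pool to neighbouring regimes.
    for neighbour in adjacent.get(regime, []):
        pool.extend(cycles_by_regime.get(neighbour, []))

    if len(pool) >= MIN_USABLE_CYCLES:
        return pool, "ADJACENT_EXPANDED"
    return pool, "OUTSIDE-DISTRIBUTION"
```

If the product decision is to hide the distribution entirely below a minimum, the same function could drive a "not enough data" placeholder by treating any status other than REGIME_ONLY as a reason not to render.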

OQ-3 — Regime model recalibration cadence

Question: The markov-fit-analysis.md document proposes quarterly recalibration on a rolling 5-year window. Is this acceptable operationally? Recalibration requires running the HMM fitting routine (minutes of CPU on a 5-year daily dataset) and deploying a new model artifact. Who owns the recalibration workflow — automated CI job, manual data-scientist dispatch, or a Celery periodic task?

Impact: If recalibration is too infrequent, the regime model may lag structural changes in volatility dynamics (e.g., post-2022 rate environment). If too frequent, it risks overfitting to recent data.

OQ-4 — 3-state vs. 4-state HMM for initial ship

Question: The 4-regime variant (adding trend dimension) is more expressive for credit spreads but requires more data per state and is harder to explain in Antlers. Should v1 ship with the 3-state model for all strategy types, with a 4-state option deferred to v2? Or should credit-spread users get the 4-state model from launch?

Impact: The 4-state model requires approximately 30% more historical data per state to be reliable. For the initial ship targeting SPY, this is feasible. For individual equities, it may not be.

OQ-5 — Feedback loop: actual vs. predicted distribution

Question: Once the feature is in production and users accumulate real cycles tagged with regime_at_entry, should the system compare actual P/L distributions against the MC distributions shown at entry as a model validation signal? This would add significant value (it shows users when their real results deviate from the historical distribution) but requires a UX surface and a data pipeline.

Impact: Yes/no decision on whether to build the mc_result_id → actual outcome reconciliation in the first version or defer it.
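One simple form the reconciliation could take is a band-coverage check: what fraction of realized cycle outcomes landed inside a central percentile band of the MC distribution that was shown at entry. The sketch below is illustrative only. The function name and the 5th to 95th percentile band are assumptions, and a production version would need to decide how to pool outcomes across different mc_result_id values.

```python
from statistics import quantiles


def band_coverage(mc_samples, actual_pnls, lower_pct=5, upper_pct=95):
    """Fraction of realized P/L values that fall inside the MC percentile band.

    mc_samples:  simulated P/L outcomes from the MC bootstrap
    actual_pnls: realized P/L outcomes for the reconciled cycles
    """
    if not actual_pnls:
        raise ValueError("need at least one realized outcome to reconcile")
    # n=100 yields 99 cut points; index k-1 holds the k-th percentile.
    cuts = quantiles(mc_samples, n=100, method="inclusive")
    lo, hi = cuts[lower_pct - 1], cuts[upper_pct - 1]
    inside = sum(lo <= x <= hi for x in actual_pnls)
    return inside / len(actual_pnls)
```

If realized outcomes resembled the historical sample, coverage would sit near the nominal band width (about 90% for a 5 to 95 band); persistently lower coverage would be a signal that real results are deviating from the historical distribution. With few realized cycles the estimate is noisy and should be read with that in mind.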

OQ-6 — VIX and VIX3M as free public data

Question: The regime model uses CBOE VIX and VIX3M daily close. CBOE makes these available free via their data downloads. Confirm that commercial use of CBOE free data for Raxx's MQ-A signal is within CBOE's terms. This is a lower-risk question than ORATS/OPRA but should be confirmed before production use.

Likely answer (unverified): CBOE publishes VIX index values as public index data rather than proprietary transaction data, which suggests commercial use may be permitted, but this must be confirmed with counsel.

11. Cited References

Academic and peer-reviewed

Hamilton, J.D. (1989). "A New Approach to the Economic Analysis of Nonstationary Time Series and the Business Cycle." Econometrica 57(2), 357–384. DOI: 10.2307/1912559. (Foundational Markov regime-switching framework; basis for HMM regime classifier)
Ang, A. & Bekaert, G. (2002). "International Asset Allocation With Regime Shifts." Review of Financial Studies 15(4), 1137–1187. DOI: 10.1093/rfs/15.4.1137. (Regime-dependent correlation in volatility; directly relevant to condor wing independence assumption failure in stress regimes)
Wang, M., Lin, Y.-H. & Mikhelson, I. (2020). "Regime-Switching Factor Investing with Hidden Markov Models." Journal of Risk and Financial Management 13(12), 311. DOI: 10.3390/jrfm13120311. (Closest published analog to Application A; HMM on SPY for regime-conditional strategy rules; out-of-sample includes COVID crash)
Shu, Y., Yu, C. & Mulvey, J.M. (2024). "Downside Risk Reduction Using Regime-Switching Signals: A Statistical Jump Model Approach." Journal of Asset Management. arXiv: 2402.05272. (Compares HMM vs. jump model for drawdown reduction; HMM Sharpe 0.51 vs. buy-and-hold 0.46; jump model 0.78; establishes HMM ceiling)
Augustyniak, M. et al. (2018). "A New Approach to Volatility Modeling: The Factorial Hidden Markov Volatility Model." Journal of Business & Economic Statistics 37(4). DOI: 10.1080/07350015.2017.1415910. (Documents ceiling on simple HMM for long-memory volatility)
Efron, B. & Tibshirani, R.J. (1993). An Introduction to the Bootstrap. Chapman & Hall. (Foundational reference for bootstrap methodology; §6.4 on stratified bootstrap is the specific method used here)
Coval, J. & Shumway, T. (2001). "Expected Option Returns." Journal of Finance 56(3), 983–1009. (Establishes positive expected P/L from selling options; baseline economic rationale for the strategy class)

Industry and practitioner (not peer-reviewed; flagged)

CBOE. "CBOE Volatility Index (VIX) — White Paper." 2019. CBOE Global Markets. (Methodology for VIX computation; used as reference for VIX as regime signal input)
CBOE. "CBOE Volatility Managed BuyWrite Index Methodology." cdn.cboe.com/api/global/us_indices/governance/BXMVM_Methodology.pdf (Institutional precedent for regime-aware options income strategy using VIX percentile thresholds; not peer-reviewed)
Bondarenko, O. (2019 for CBOE). "Historical Performance of Put-Writing Strategies." cdn.cboe.com/resources/education/research_publications/. (PUT index 32-year performance data; persistent volatility risk premium quantification; CBOE-commissioned; not peer-reviewed)
Spintwig LLC. "Short SPX Iron Condor 45-DTE Backtest." spintwig.com/short-spx-iron-condor-45-dte-s1-signal-options-backtest/ (Practitioner ORATS-based backtest; directional reference for regime-filtered entry; methodology partially paywalled; not peer-reviewed)
tastytrade Research. "Volatility Metrics (IVR, IV%, IVx, HV)." support.tastytrade.com. (Origin of the IVR > 50 entry rule; practitioner; not peer-reviewed)

This document is a research specification. It does not constitute investment advice or a recommendation to enter any trade. All backtest results described, including the 2022 synthetic scenario in §7, are from simulation on historical or synthetic data. Past simulation outcomes do not predict future trading results. Real-world performance depends on conditions that cannot be captured in a simulation.