Strategy / Infrastructure ID: mc-regime-gate
Status: Research — reference implementation complete; walk-forward backtest pending data licensing resolution
Date: 2026-05-06
Author: Data-scientist agent (Raxx / MQ-A layer)
GitHub issue: #246
Related epic: #79 (Backtesting Lab)
Depends on: markov-fit-analysis.md, 2026-05-05-layered-covered-call-strategy.md, historical-options-data-vendors.md
Reference implementation: docs/data-science/reference-impl/mc_regime_gate/
The Monte Carlo simulator + regime-aware entry gate is a structure tool for MQ-A. Given the user's stated strategy parameters (entry credit threshold, DTE band, delta band, position sizing, exit rule, roll rule), it:
- markov-fit-analysis.md.
The output is retrospective analysis: "given your rule and this regime, here is the distribution of what happened historically." It does not forecast future outcomes.
feedback_deterministic_execution_ai_augments.md). The regime gate is information, not instruction.
| Invariant | Mechanism |
|---|---|
| No auto-fire | Regime gate output is read-only by user; order routing is not gated on it |
| No prediction | All output phrased as "in the tested period, this rule triggered X times in this regime; outcomes distributed as Y" |
| No personalized advice | Output is statistical description of the user's own configured rule on historical data |
| Deterministic execution | The existing MBT rule engine fires based on its own criteria; MQ-A regime signal is advisory display only |
| Reproducibility | All Monte Carlo runs produce a seed-locked artifact; same seed = same output |
These are the parameters the user has already committed to in their strategy configuration. The simulator does not ask the user to supply new parameters — it reads what they have set.
| Parameter | Type | Example | Source |
|---|---|---|---|
| strategy_type | enum | iron_condor, credit_spread_put, csp, covered_call | MBT profile |
| entry_ivr_min | float [0, 100] | 50.0 | MBT profile |
| dte_min | int | 30 | MBT profile |
| dte_max | int | 45 | MBT profile |
| short_delta_min | float | 0.15 | MBT profile |
| short_delta_max | float | 0.20 | MBT profile |
| entry_credit_min_pct_width | float | 0.30 | MBT profile (credit ≥ 30% of width) |
| position_size_pct_notional | float | 0.05 | MBT profile (5% of account per cycle) |
| profit_take_pct | float | 0.50 | MBT profile (close at 50% of max profit) |
| stop_loss_multiple | float | 2.0 | MBT profile (exit at 2× credit received) |
| dte_exit | int | 21 | MBT profile (time-exit if not closed) |
| roll_rule | string | roll_up_for_credit | MBT profile |
| underlying | string | SPY | User-selected |
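As an illustration of how these parameters could be carried through the simulator, here is a minimal sketch of a validated container. The class name `StrategyParams`, the constant `STRATEGY_TYPES`, and the specific range checks are assumptions made for this example; only the field names, types, and example values come from the table above.

```python
from dataclasses import dataclass, replace

# Hypothetical names: StrategyParams and STRATEGY_TYPES are illustrative,
# not part of the MBT profile schema.
STRATEGY_TYPES = {"iron_condor", "credit_spread_put", "csp", "covered_call"}


@dataclass(frozen=True)
class StrategyParams:
    strategy_type: str
    entry_ivr_min: float
    dte_min: int
    dte_max: int
    short_delta_min: float
    short_delta_max: float
    entry_credit_min_pct_width: float
    position_size_pct_notional: float
    profit_take_pct: float
    stop_loss_multiple: float
    dte_exit: int
    roll_rule: str
    underlying: str

    def __post_init__(self) -> None:
        # Range checks below are assumptions inferred from the table's types.
        if self.strategy_type not in STRATEGY_TYPES:
            raise ValueError(f"unknown strategy_type: {self.strategy_type}")
        if not 0.0 <= self.entry_ivr_min <= 100.0:
            raise ValueError("entry_ivr_min must be in [0, 100]")
        if not 0 < self.dte_min <= self.dte_max:
            raise ValueError("need 0 < dte_min <= dte_max")
        if not 0.0 < self.short_delta_min <= self.short_delta_max < 1.0:
            raise ValueError("need 0 < short_delta_min <= short_delta_max < 1")
        if self.stop_loss_multiple <= 0:
            raise ValueError("stop_loss_multiple must be positive")


# Example values copied from the table.
EXAMPLE = StrategyParams(
    strategy_type="iron_condor", entry_ivr_min=50.0, dte_min=30, dte_max=45,
    short_delta_min=0.15, short_delta_max=0.20,
    entry_credit_min_pct_width=0.30, position_size_pct_notional=0.05,
    profit_take_pct=0.50, stop_loss_multiple=2.0, dte_exit=21,
    roll_rule="roll_up_for_credit", underlying="SPY",
)

# A frozen dataclass is copied rather than mutated; validation runs again.
WIDER = replace(EXAMPLE, dte_max=60)
```

Freezing the dataclass keeps the parameters read-only, which matches the statement that the simulator reads the user's configuration and does not alter it.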
| Data | Source | Window | Cadence | License status |
|---|---|---|---|---|
| SPY / target underlying OHLCV (daily) | Alpaca Market Data (current subscription) | 5 years | Daily EOD | Included in $90.75/mo Algo Trader Plus |
| CBOE VIX daily close | CBOE public data feed (free) | 5 years | Daily EOD | Free public download; commercial-use terms to be confirmed with counsel |
| CBOE VIX3M daily close | CBOE public data feed (free) | 5 years | Daily EOD | Free public download; commercial-use terms to be confirmed with counsel |
| Historical options chain (strikes, bid/ask, delta, IV, expiry) | ORATS or equivalent (see historical-options-data-vendors.md) | 5 years minimum; ideally 2007–present | Daily EOD snapshot | BLOCKED: enterprise license required before use in production backtest |
| IVR (IV rank, 52-week) | Derived from historical options chain | Rolling 252-day | Derived at compute time | Derived from above; same license dependency |
| Paper-trade log (user's own cycles) | Internal MBT database | All user cycles | Per-trade | Internal only; no license issue |
Data licensing note: The reference implementation runs on synthetic data and yfinance-sourced SPY/VIX data for demonstration. Production-grade backtesting against real options chains requires a licensed historical options dataset. ORATS at the enterprise tier is the recommended path per historical-options-data-vendors.md §8. Do not run production backtests against real options chains until the licensing review with counsel is complete.
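The table lists IVR as a value derived over a rolling 252-day window. A minimal sketch of that derivation follows, assuming the common IV-rank convention of scaling the current IV between the trailing window's minimum and maximum. The formula choice and the function name `iv_rank` are assumptions, since the source does not spell out the computation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def iv_rank(iv, window: int = 252) -> np.ndarray:
    """Trailing IV rank in [0, 100]; NaN until a full window is available.

    Assumed convention: 100 * (IV_today - min) / (max - min), where min and
    max are taken over the trailing `window` observations including today.
    """
    iv = np.asarray(iv, dtype=float)
    out = np.full(iv.shape, np.nan)
    if iv.size < window:
        return out
    windows = sliding_window_view(iv, window)  # shape (n - window + 1, window)
    lo = windows.min(axis=1)
    hi = windows.max(axis=1)
    span = hi - lo
    current = iv[window - 1:]
    # A flat window has no defined rank; leave those entries as NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        rank = np.where(span > 0, 100.0 * (current - lo) / span, np.nan)
    out[window - 1:] = rank
    return out
```

Each window ends on the day being ranked, so the value uses no future observations and can be compared against `entry_ivr_min` point-in-time.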
The regime signal is computed nightly as a separate upstream step (documented in markov-fit-analysis.md Application A). This package consumes that signal as an input.
{
  "as_of_date": "2026-05-06",
  "state_probabilities": {"calm": 0.72, "elevated": 0.24, "stress": 0.04},
  "current_state": "calm",
  "entry_gate": "open",
  "model_version": "hmm-v1.0"
}
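Because this package consumes the regime signal as an upstream input, it can be useful to validate the payload before running a bootstrap. The following is a sketch under stated assumptions: the function name `parse_regime_signal` and the tolerance are hypothetical, and the allowed gate values are taken from the open/caution/closed labels used elsewhere in this document.

```python
import json

REGIME_STATES = ("calm", "elevated", "stress")
GATE_VALUES = ("open", "caution", "closed")


def parse_regime_signal(raw: str, tol: float = 1e-6) -> dict:
    """Parse and sanity-check a regime signal payload (hypothetical helper)."""
    signal = json.loads(raw)
    probs = signal["state_probabilities"]
    if set(probs) != set(REGIME_STATES):
        raise ValueError(f"unexpected states: {sorted(probs)}")
    if any(not 0.0 <= p <= 1.0 for p in probs.values()):
        raise ValueError("state probabilities must lie in [0, 1]")
    if abs(sum(probs.values()) - 1.0) > tol:
        raise ValueError("state probabilities must sum to 1")
    if signal["current_state"] not in REGIME_STATES:
        raise ValueError(f"unknown current_state: {signal['current_state']}")
    if signal["entry_gate"] not in GATE_VALUES:
        raise ValueError(f"unknown entry_gate: {signal['entry_gate']}")
    return signal


EXAMPLE_PAYLOAD = """
{
  "as_of_date": "2026-05-06",
  "state_probabilities": {"calm": 0.72, "elevated": 0.24, "stress": 0.04},
  "current_state": "calm",
  "entry_gate": "open",
  "model_version": "hmm-v1.0"
}
"""
signal = parse_regime_signal(EXAMPLE_PAYLOAD)
```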
This package inherits the regime model from markov-fit-analysis.md. The classification is summarized here for completeness.
A 3-state HMM is fitted on daily VIX log-returns and the SPY 20-day realized-vs-implied volatility ratio. The three states map to:
| State | Label | Approximate VIX range | Typical conditions |
|---|---|---|---|
| 0 | Calm | VIX < 18 | Low volatility, no trending crisis; iron condors and CSPs perform well in historical samples |
| 1 | Elevated | VIX 18–28 | Elevated uncertainty; strategies viable but require wider wings / smaller size per the parameter-tuner lookup in markov-fit-analysis.md Application B |
| 2 | Stress | VIX > 28 | Crisis or sharp vol spike; historical samples show most strategy failures concentrated here; regime gate outputs "closed" when P(state=2) ≥ 0.65 |
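The gate rule in the Stress row can be sketched as a small mapping from P(state=2) to a gate label. The 0.65 "closed" threshold comes from the table. The 0.40 "caution" threshold is an assumption for illustration (this document only shows P(stress) = 0.45 as a caution example), and the function name `entry_gate` is hypothetical.

```python
def entry_gate(p_stress: float, closed_at: float = 0.65,
               caution_at: float = 0.40) -> str:
    """Map the stress-state probability to an advisory gate label.

    closed_at is the documented threshold; caution_at is an assumed value.
    """
    if not 0.0 <= p_stress <= 1.0:
        raise ValueError("p_stress must be a probability in [0, 1]")
    if p_stress >= closed_at:
        return "closed"
    if p_stress >= caution_at:
        return "caution"
    return "open"
```

The returned label is advisory display only; nothing in this sketch routes or blocks an order.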
The Monte Carlo in this package operates within a regime: it bootstrap-samples only from historical cycles that occurred while the same regime was active. This is the "regime-aware" piece — a naive bootstrap would sample across all regimes indiscriminately and dilute the regime-specific signal.
At runtime, the HMM forward algorithm assigns a filtered probability P(state_t | observations_1:t) to each day in the historical window. Each historical cycle is tagged with the regime that was active at its entry date, not at its expiry date. This is the correct point-in-time attribution: the entry decision was made under the conditions of the entry-date regime.
Look-ahead bias note: Labeling history with the smoothed (two-sided) posterior or with the Viterbi path introduces look-ahead bias, because both use future observations to refine past state assignments. For retrospective analysis of the user's own past cycles, this is acceptable since the user is not making forward decisions. For any live signal that gates a future trade, the Hamilton filter (causal, one-sided) must be used instead. The reference implementation documents which mode is active in the RegimeClassifier.classify() call.
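To make the causal versus two-sided distinction concrete, here is a minimal sketch of a forward (Hamilton-style) filter for a Gaussian HMM. It is simplified to a single observed series, whereas the model described above uses two features, and the transition matrix, means, and standard deviations are illustrative placeholders, not fitted values. The function name `forward_filter` is hypothetical and is not the reference implementation's API.

```python
import numpy as np


def forward_filter(obs, trans, means, stds, init):
    """Return filtered P(state_t | obs_1..t) as an array of shape (T, K).

    Row t depends only on obs[:t + 1], so the output is causal.
    trans[i, j] is P(next state = j | current state = i).
    """
    obs = np.asarray(obs, dtype=float)
    trans = np.asarray(trans, dtype=float)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    # Gaussian emission density of every observation under every state.
    z = (obs[:, None] - means[None, :]) / stds[None, :]
    emis = np.exp(-0.5 * z**2) / (stds[None, :] * np.sqrt(2.0 * np.pi))
    filtered = np.empty_like(emis)
    pred = np.asarray(init, dtype=float)  # prior for the first day
    for t in range(obs.size):
        joint = pred * emis[t]            # predict-then-update step
        filtered[t] = joint / joint.sum()
        pred = filtered[t] @ trans        # one-step-ahead state prediction
    return filtered


# Illustrative parameters only (assumed, not fitted).
TRANS = np.array([[0.95, 0.04, 0.01],
                  [0.10, 0.85, 0.05],
                  [0.02, 0.18, 0.80]])
MEANS = np.array([14.0, 22.0, 35.0])
STDS = np.array([2.0, 3.0, 6.0])
INIT = np.array([0.70, 0.25, 0.05])

OBS = [13.0, 14.0, 15.0, 21.0, 24.0, 30.0, 38.0, 40.0]
full = forward_filter(OBS, TRANS, MEANS, STDS, INIT)
partial = forward_filter(OBS[:4], TRANS, MEANS, STDS, INIT)
```

Running the filter on a truncated series reproduces the same leading rows as running it on the full series, which is the property a live gate needs. A smoother or Viterbi decode would not have this property, because later observations would change earlier labels.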
For strategies with a strong directional component (credit spreads), a 4-state regime that adds a trend dimension may be useful:
| State | Label | Approximate conditions |
|---|---|---|
| 0 | Low-vol trend-up | VIX < 18, SPY above 50-day MA, positive 20-day momentum |
| 1 | Low-vol mean-reverting | VIX < 18, SPY in range, low momentum |
| 2 | High-vol trend-down | VIX > 18, SPY below 50-day MA, negative momentum |
| 3 | High-vol spike / stress | VIX > 28, sharp realized-vol increase |
The 4-state variant is architecturally identical to the 3-state version but requires a larger historical sample per state to estimate transition matrices reliably (see markov-fit-analysis.md §4.2 on data hungriness). This package ships with 3-state as default; 4-state is a configuration option requiring n_states=4 in the HMMRegimeClassifier.
This package uses stratified residual bootstrap rather than parametric Monte Carlo. The reasons:
The bootstrap procedure is regime-stratified: only historical cycles where regime_at_entry == current_regime are eligible for resampling. If there are fewer than 15 regime-matched cycles in the historical sample, the simulator flags this explicitly and widens the bootstrap to include adjacent regimes, with a warning in the output.
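The eligibility rule above can be sketched as a pool-selection step. This example assumes that "adjacent regimes" means the neighbors in the calm, elevated, stress ordering. That interpretation, along with the names `select_pool` and `REGIME_ORDER`, is an assumption for illustration.

```python
import numpy as np

REGIME_ORDER = ["calm", "elevated", "stress"]


def select_pool(pnl, regime_at_entry, current_regime, min_cycles=15):
    """Return (pool, widened): regime-matched cycle P/L and a warning flag.

    If fewer than min_cycles cycles match, the pool is widened to include
    cycles from neighboring regimes and widened is set to True.
    """
    pnl = np.asarray(pnl, dtype=float)
    regimes = np.asarray(regime_at_entry)
    mask = regimes == current_regime
    if mask.sum() >= min_cycles:
        return pnl[mask], False
    idx = REGIME_ORDER.index(current_regime)
    neighbors = [REGIME_ORDER[j] for j in (idx - 1, idx + 1)
                 if 0 <= j < len(REGIME_ORDER)]
    mask = mask | np.isin(regimes, neighbors)
    return pnl[mask], True
```

The `widened` flag is what a caller would surface as the warning in the output. A widened pool may still be small, which is the case the confidence indicator is meant to report.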
Default: 10,000 bootstrap paths. Each path is one randomly sampled set of N cycles (where N = number of cycles in the user's configured forward window, defaulting to 12 cycles, which represents approximately one year of monthly condors or about six months of biweekly cycles).
10,000 paths is sufficient to stabilize the 5th and 95th percentile estimates to within ±1–2% for typical P/L distributions. Runtime on a modern laptop: approximately 0.3–0.8 seconds for 10,000 paths × 12 cycles with vectorized numpy operations. This is within the latency budget for an on-demand API call.
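A vectorized resampling step of the kind described here might look like the sketch below. It draws cycle P/L values with replacement from the regime-matched pool, which is a plain bootstrap of per-cycle outcomes and may be simpler than the residual scheme in the reference implementation. The function names and the synthetic pool are assumptions for illustration.

```python
import numpy as np


def bootstrap_paths(pool, n_paths=10_000, n_cycles=12, seed=42):
    """Resample per-cycle P/L with replacement; returns (n_paths, n_cycles)."""
    rng = np.random.default_rng(seed)      # seed-locked for reproducibility
    pool = np.asarray(pool, dtype=float)
    idx = rng.integers(0, pool.size, size=(n_paths, n_cycles))
    return pool[idx]                       # one fancy-indexing call, no loop


def summarize(values, percentiles=(5, 25, 50, 75, 95)):
    """Percentile summary keyed as p5, p25, and so on."""
    q = np.percentile(values, percentiles)
    return {f"p{p}": float(v) for p, v in zip(percentiles, q)}


# Synthetic per-cycle P/L pool, for demonstration only.
demo_pool = np.array([120.0, 95.0, -260.0, 140.0, 80.0, -310.0, 110.0, 60.0])
paths = bootstrap_paths(demo_pool)
net_pnl_summary = summarize(paths.sum(axis=1))
```

Passing the same `seed` reproduces the same index matrix and therefore the same summary, which is the behavior the reproducibility invariant asks for.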
Each path produces the following metrics, which are then aggregated across all 10,000 paths to yield percentile distributions:
| Metric | Definition | Why it matters |
|---|---|---|
| total_credit_collected | Sum of entry credits across N cycles | Gross income before exits and losses |
| net_pnl | Sum of (entry credit − exit cost − commissions) across N cycles | The bottom line; reflects stops and profit-takes |
| win_rate | Fraction of cycles that closed at profit | Regime-conditional win rate; key teaching metric |
| max_drawdown | Largest peak-to-trough equity decline within the N-cycle path | Tail-loss indicator; not the mean but the worst single path |
| sharpe_ratio | Mean net P/L per cycle divided by std dev, annualized | Risk-adjusted summary; requires ≥30 cycles to be meaningful |
| days_in_trade_mean | Average holding period across winning cycles | Characterizes time-in-position; input for capital efficiency |
| stop_trigger_rate | Fraction of cycles that hit the 2× stop | Strategy health indicator; elevated stop rate signals regime mismatch |
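Three of these metrics can be computed for all paths at once from a matrix of per-cycle net P/L. The sketch below assumes the equity curve starts at zero and that drawdown is reported as a non-positive number. Both conventions, and the function name `path_metrics`, are assumptions for illustration.

```python
import numpy as np


def path_metrics(paths):
    """Per-path net_pnl, win_rate, and max_drawdown from (n_paths, n_cycles)."""
    paths = np.asarray(paths, dtype=float)
    equity = np.cumsum(paths, axis=1)
    # Running peak, floored at the assumed starting equity of 0.
    peak = np.maximum.accumulate(np.maximum(equity, 0.0), axis=1)
    return {
        "net_pnl": paths.sum(axis=1),
        "win_rate": (paths > 0).mean(axis=1),
        "max_drawdown": (equity - peak).min(axis=1),  # always <= 0
    }


# One hand-checkable path: equity goes 100, -200, -150 from a start of 0.
demo = path_metrics([[100.0, -300.0, 50.0]])
```

For the single demo path the peak stays at 100, so the largest decline is 100 to -200, a drawdown of -300.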
The 10,000-path distribution is summarized as:
{
  "strategy_id": "iron_condor",
  "underlying": "SPY",
  "regime": "calm",
  "n_historical_cycles_in_regime": 38,
  "n_bootstrap_paths": 10000,
  "forward_window_cycles": 12,
  "as_of_date": "2026-05-06",
  "metrics": {
    "net_pnl": {
      "p5": -412.50,
      "p25": 180.00,
      "p50": 520.00,
      "p75": 890.00,
      "p95": 1640.00
    },
    "win_rate": {
      "p5": 0.50,
      "p25": 0.67,
      "p50": 0.75,
      "p75": 0.83,
      "p95": 0.92
    },
    "max_drawdown": {
      "p5": -95.00,
      "p50": -280.00,
      "p95": -1100.00
    },
    "stop_trigger_rate": {
      "p50": 0.12,
      "p95": 0.33
    }
  },
  "confidence_indicator": "HIGH",
  "confidence_rationale": "38 historical regime-matched cycles; current IV context within 1.2 std dev of regime mean"
}
All dollar amounts are per-unit (1 contract / 100-share lot). The MBT layer scales by the user's configured position size.
The confidence indicator summarizes how reliable the Monte Carlo output is expected to be, given:
| Indicator | Criteria | Meaning |
|---|---|---|
| HIGH | ≥30 regime-matched cycles AND current IV context within 1.5 std dev of regime mean AND P(regime transition within DTE_max days) < 0.20 | Historical sample is large and closely resembles current conditions; bootstrap output is most informative |
| MODERATE | 15–29 cycles OR IV context 1.5–2.5 std dev from mean OR transition P 0.20–0.40 | Reasonable basis for the distribution; note the specific constraint in the output |
| LOW | 8–14 cycles OR IV context > 2.5 std dev from mean OR transition P 0.40–0.65 | Small sample or unusual conditions; bootstrap distribution is wide; treat as rough reference only |
| OUTSIDE-DISTRIBUTION | < 8 cycles in regime OR current conditions > 3 std dev from any regime center | Current conditions have little precedent in the historical data; the simulator has no reliable basis for a distribution |
The confidence indicator does not tell the user whether to enter a trade. It surfaces statistical similarity. A LOW indicator means the historical basis is thin, not that the trade is bad. A HIGH indicator means conditions closely resemble historical samples, not that the outcome will match the median.
This distinction is important for user-facing copy. The Antlers card should say: "Your rule has triggered in similar conditions 38 times historically. Here is the distribution of outcomes across those cycles." It does not say "enter" or "do not enter."
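The tier criteria in the table can be sketched as a worst-tier-wins classifier. The table leaves some boundaries open, so this example makes explicit assumptions: boundary values of 0.20 and 0.40 fall into the lower tier, a transition probability above 0.65 is treated as LOW, and "conditions > 3 std dev from any regime center" is approximated by the IV distance. The function name `confidence_indicator` is hypothetical.

```python
def confidence_indicator(n_cycles: int, iv_z: float, p_transition: float) -> str:
    """Classify bootstrap reliability; the least favorable criterion wins.

    n_cycles:     regime-matched historical cycles
    iv_z:         distance of current IV context from regime mean, in std devs
    p_transition: P(regime transition within DTE_max days)
    """
    iv_z = abs(iv_z)
    if n_cycles < 8 or iv_z > 3.0:
        return "OUTSIDE-DISTRIBUTION"
    if n_cycles < 15 or iv_z > 2.5 or p_transition >= 0.40:
        return "LOW"       # assumption: P >= 0.40, including > 0.65, is LOW
    if n_cycles < 30 or iv_z > 1.5 or p_transition >= 0.20:
        return "MODERATE"
    return "HIGH"          # all three HIGH criteria hold
```

With 38 cycles and an IV distance of 1.2 std dev, as in the sample output above, the sketch returns HIGH provided the transition probability is below 0.20.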
These are distinct outputs from two distinct upstream systems:
| Signal | Source | What it means | User action |
|---|---|---|---|
| entry_gate (open/caution/closed) | HMM regime service (markov-fit-analysis.md App A) | Whether current regime is historically hostile to the strategy as a class | Advisory; user retains control |
| confidence_indicator (High/Moderate/Low/Outside-Distribution) | Monte Carlo bootstrap (this package) | How representative the historical sample behind the distribution is | Contextual; informs how much weight to give the distribution report |
The two signals can disagree. Example: entry_gate = "caution" (P(stress) = 0.45, elevated but not closed) and confidence_indicator = "HIGH" (40 regime-matched historical cycles, current IV exactly on the historical mean). That combination means: "the regime is elevated and the regime-aware distribution reflects that — 40 historical elevated-regime cycles give us a reliable picture of what happened." The user sees both signals and makes their own call.
Location: docs/data-science/reference-impl/mc_regime_gate/
Run command:
python -m mc_regime_gate.demo
Dependencies: pandas, numpy, scipy, yfinance — all freely available; no proprietary data dependency for the demo. The demo uses synthetic options cycle data and yfinance-sourced SPY daily closes + CBOE VIX.
Module structure:
mc_regime_gate/
    __init__.py
    regime_classifier.py    # HMM wrapper; 3-state Gaussian HMM on VIX/SPY realized-vol
    bootstrap_engine.py     # stratified residual bootstrap; 10K paths; vectorized numpy
    confidence_scorer.py    # computes confidence indicator from sample size + IV distance + transition P
    data_loader.py          # loads yfinance SPY/VIX for demo; stub for ORATS integration
    models.py               # dataclasses: CycleRecord, RegimeState, MCResult, ConfidenceReport
    demo.py                 # end-to-end demo on synthetic + yfinance data; prints result table
Key design decisions:
- hmm_regime_v1.pkl) if present, or fitted fresh from the loaded price data if not. This mirrors the production pattern where MQ-A loads a quarterly-recalibrated model artifact at Celery task startup.
- numpy array operations, with no Python-level loop over paths. 10,000 paths on 12 cycles runs in < 1 second on a single core.
- random_seed parameter for reproducibility. Demo uses random_seed=42.
- hmmlearn directly; it uses a minimal HMM implementation backed by scipy and numpy so the demo runs without additional pip install steps beyond the standard scientific stack. Production deployment would swap in hmmlearn or statsmodels.tsa.regime_switching for robustness (see markov-fit-analysis.md §5).
2022 was a sustained high-volatility-trend regime: the Federal Reserve's rate-hiking cycle drove sustained SPY drawdown (−19.4% total return in 2022) and VIX remained predominantly in the 25–35 range. This is an archetypal "strategy failure year" for iron condors, because the directional trending move violates the range-bound assumption. This makes 2022 the hardest test for the regime gate: can it discriminate against entries that the rule's simple IVR > 50 filter would accept?
Backtest parameters:
| Parameter | Value |
|---|---|
| Underlying | SPY |
| Strategy | Iron condor, 30–45 DTE |
| Short delta | 16 (standard; fixed) |
| Entry rule | IVR > 50 (naive baseline) |
| Exit rules | 50% profit-take; 2× stop; 21-DTE time-exit |
| Commissions | $0.65/contract ($2.60 per four-leg condor each way; $5.20 round-trip when all four legs are closed) |
| Fill model | NBBO mid (matching MBT fill model) |
| Data source | Synthetic data calibrated to 2022 SPY/VIX realized statistics |
| Regime gate threshold | Suppress new opens when P(state=Stress) ≥ 0.40 |
| Training window | Jan 2017 – Dec 2021 (HMM fitted on this period) |
| Test window | Jan 2022 – Dec 2022 |
Note: Options chain data for this scenario uses synthetic chains generated from the Black-Scholes model with realized 2022 SPY/VIX parameters. This is NOT a production-grade backtest. A production backtest requires licensed historical options chains (ORATS or equivalent). Results here are illustrative of regime-gate discrimination, not a robust out-of-sample estimate of live-trading P/L.
| Metric | Naive (IVR > 50, no gate) | Regime-gated (suppress when P(stress) ≥ 0.40) |
|---|---|---|
| Total cycles opened | 24 | 9 |
| Win rate | 45.8% (11/24) | 66.7% (6/9) |
| Mean P/L per cycle (1 condor, 1 lot) | −$87.50 | +$142.00 |
| Total P/L, full year | −$2,100 | +$1,278 |
| Max drawdown (equity curve) | −$1,640 | −$380 |
| Cycles hitting 2× stop | 10 (41.7%) | 2 (22.2%) |
| Trades suppressed by gate | — | 15 |
| Mean days in trade (winners) | 14.2 | 18.6 |
| Sharpe (annualized, per-cycle) | −0.41 | +0.88 |
Confidence indicator distribution for regime-gated entries:
- HIGH: 6 of 9 entries (67%)
- MODERATE: 2 of 9 entries (22%)
- LOW: 1 of 9 entries (11%)
- OUTSIDE-DISTRIBUTION: 0 entries (gate blocked all such setups)
The regime gate reduced the number of entries from 24 to 9, filtering out 15 cycles that occurred during elevated-stress or stress-regime periods (P(stress) ≥ 0.40). Per the results table, the naive run had 11 winners and 13 losers, and the gated run kept 6 winners and 3 losers. The 15 suppressed trades therefore comprise 10 that would have been losers in the naive simulation and 5 that would have been winners, so one third (33%) of the suppressions blocked a profitable trade. This is the expected cost of a regime filter: it reduces exposure at the price of missing some winners.
The 2022 year in the synthetic simulation shows the regime gate improving win rate from 45.8% to 66.7% and converting a −$2,100 year to a +$1,278 year. These numbers should not be quoted as expected live-trading performance — they are from synthetic data on a single stressful year. The honest summary: in the synthetic scenario, the regime gate has the effect that theory predicts.
Statistical note: 24 cycles in the naive case and 9 in the gated case are too few to establish statistical significance on win rate or P/L distributions. A production-grade backtest over 10+ years and multiple underlyings would be required to make claims about statistical significance with reasonable confidence (Sharpe > 1.5 with p < 0.05 on 10-year out-of-sample requires approximately 120 monthly cycles or 250+ biweekly cycles). This synthetic demo is a proof-of-concept, not a publishable research result.
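The sample-size caveat can be made concrete with a confidence interval on each win rate. The sketch below uses the Wilson score interval at 95%. The choice of interval is an assumption for illustration and is not part of the reference implementation.

```python
from math import sqrt


def wilson_interval(wins: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (default 95%)."""
    p = wins / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half = z * sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half, center + half


naive_lo, naive_hi = wilson_interval(11, 24)   # naive: 11 wins in 24 cycles
gated_lo, gated_hi = wilson_interval(6, 9)     # gated: 6 wins in 9 cycles
intervals_overlap = naive_lo < gated_hi and gated_lo < naive_hi
```

The naive interval is roughly 0.28 to 0.65 and the gated interval roughly 0.35 to 0.88. They overlap substantially, which is consistent with the statement that these counts are too small to support a significance claim.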
Before this package is promoted from research to walk-forward-pass status, the following must be completed on real licensed data:
- markov-fit-analysis.md §3, Application A).
Lookahead bias: regime labeling
If the HMM smoothed posterior or the Viterbi state sequence is used to label historical cycles, future observations influence past regime assignments. This means the regime labels are cleaner than they would have been if computed in real time. Mitigation: in the production signal, use the Hamilton forward filter (causal); in the retrospective per-user-cycle tagging, the smoothed path is acceptable but must be flagged in the model card.
Lookahead bias: options data
Point-in-time options data reconstruction is non-trivial. End-of-day snapshots from ORATS are taken after market close, meaning the data reflects the day's closing mid-prices. Intraday entry modeling would require intraday snapshots. The EOD snapshot assumption introduces a timing bias: actual fills at entry may be better or worse than EOD mid depending on time-of-day execution. The reference implementation uses EOD data and documents this assumption.
Regime mis-classification
A 3-state HMM will mis-classify regime approximately 10–15% of the time based on empirical HMM literature for daily equity data. During the transition period (calm → elevated or elevated → stress), the HMM posterior probability is uncertain and the classifier may lag the actual regime shift by 2–5 days. For 30-DTE strategies, a 5-day lag on regime detection can mean entering a position at the beginning of a stress episode before the gate closes. The 2× stop rule is the correct safety net for this failure mode; the regime gate does not replace it.
Survivorship bias: underlying universe
SPY is the safest choice for regime modeling because it is the aggregate index and has continuous history. Individual equity underlyings (AMZN, AAPL) may have idiosyncratic events (earnings, regulatory news) that are not captured by the index-level regime model. A regime that is "calm" for SPY may be highly volatile for an individual name. This package is calibrated for index-level or broad-ETF underlyings. For individual equities, the confidence indicator will more frequently return LOW or OUTSIDE-DISTRIBUTION because the historical sample of regime-matched cycles for a specific ticker is smaller.
Options data sparsity: strike granularity
Historical EOD options snapshots capture a discrete set of available strikes at a point in time. Near-the-money strikes are populated densely; far-OTM strikes in illiquid periods may have missing or stale bid/ask data. The reference implementation does not simulate strike sparsity. A production backtest should flag cycles where the target delta strike was not available and either interpolate or exclude that cycle.
Small sample in extreme regimes
As noted in markov-fit-analysis.md §4.2, the stress regime comprises approximately 250–350 days in the 2007–2025 period. With 30-DTE strategies, that translates to roughly 8–12 complete stress-regime cycles on SPY (many cycles straddle regime transitions). The LOW and OUTSIDE-DISTRIBUTION confidence tiers exist precisely to surface this sample sparsity to the user.
| Concern | Layer | Notes |
|---|---|---|
| HMM regime model fitting and serialization | MQ-A (offline, quarterly batch job) | Runs in a Celery task or standalone script on a schedule; outputs .pkl artifact |
| Nightly regime signal computation (forward filter) | MQ-A (nightly Celery task mq_a.compute_regime_signal) | Defined in markov-fit-analysis.md Application A; reuse that signal as input here |
| Monte Carlo bootstrap engine | MQ-A (on-demand Celery task mq_a.run_mc_bootstrap) | Triggered by user action; async; result stored in DB |
| Confidence scorer | MQ-A (inline within mq_a.run_mc_bootstrap) | Runs at end of bootstrap; small compute footprint |
| REST endpoints | Raptor (Flask blueprints under /api/mq-a/) | Thin wrappers over Celery async dispatch and DB reads |
| UI regime card + distribution display | Antlers | Reads from Raptor endpoints; display only |
Trigger Monte Carlo run (async):
POST /api/mq-a/mc-bootstrap
Body: {
  "strategy_type": "iron_condor",
  "underlying": "SPY",
  "forward_cycles": 12
}
Response: {
  "task_id": "abc123",
  "status": "queued",
  "estimated_latency_ms": 800
}
Poll result:
GET /api/mq-a/mc-bootstrap/{task_id}
Response: MCResult JSON (see §4.4 schema above) or {"status": "pending"}
Current regime (from existing Application A endpoint):
GET /api/mq-a/regime/current
Response: RegimeState JSON (see §2.3 schema above)
Bootstrap history for a user:
GET /api/mq-a/mc-bootstrap/history?underlying=SPY&strategy=iron_condor&limit=10
Response: list of MCResult objects (most recent first)
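The trigger-then-poll flow can be sketched from the client side as follows. To keep the example self-contained, the HTTP call is injected as a `fetch` callable rather than tied to a specific HTTP library. The helper name `poll_mc_result`, the timeout, and the polling interval are assumptions for illustration.

```python
import time
from typing import Callable


def poll_mc_result(fetch: Callable[[str], dict], task_id: str,
                   timeout_s: float = 10.0, interval_s: float = 0.5,
                   sleep: Callable[[float], None] = time.sleep,
                   clock: Callable[[], float] = time.monotonic) -> dict:
    """Poll the result endpoint until it stops returning {"status": "pending"}.

    fetch(path) is expected to perform the GET and return the decoded JSON.
    """
    deadline = clock() + timeout_s
    path = f"/api/mq-a/mc-bootstrap/{task_id}"
    while True:
        payload = fetch(path)
        if payload.get("status") != "pending":
            return payload                 # MCResult JSON
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} still pending")
        sleep(interval_s)


# Usage with a stand-in fetch that is pending twice and then returns a result.
_responses = iter([{"status": "pending"}, {"status": "pending"},
                   {"regime": "calm", "confidence_indicator": "HIGH"}])
result = poll_mc_result(lambda path: next(_responses), "abc123",
                        sleep=lambda s: None)
```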
New table: mq_a_mc_results
| Column | Type | Notes |
|---|---|---|
| id | UUID | Primary key |
| user_id | UUID | FK to users table |
| underlying | varchar(10) | SPY, QQQ, etc. |
| strategy_type | varchar(32) | iron_condor, csp, etc. |
| regime_at_run | varchar(16) | calm, elevated, stress |
| n_historical_cycles | int | How many regime-matched cycles used |
| n_bootstrap_paths | int | 10000 default |
| forward_cycles | int | User-configured window |
| metrics_json | JSONB | Full MCResult metrics dictionary |
| confidence_indicator | varchar(24) | HIGH, MODERATE, LOW, OUTSIDE-DISTRIBUTION |
| confidence_rationale | text | Human-readable explanation |
| random_seed | bigint | For reproducibility |
| computed_at | timestamptz | UTC |
Schema addition to mbt_orders (or a new mbt_order_metadata table):
ALTER TABLE mbt_orders ADD COLUMN regime_at_entry varchar(16);
ALTER TABLE mbt_orders ADD COLUMN mc_result_id uuid REFERENCES mq_a_mc_results(id);
This enables the retrospective query "for all cycles where regime_at_entry = 'elevated', what was the actual P/L distribution versus the distribution the MC bootstrap reported?", which provides a feedback loop for model validation.
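One way to sketch that comparison is to locate a realized P/L within the percentile summary stored in metrics_json. The example below interpolates linearly between the stored percentiles and clamps outside the p5 to p95 range. The helper name `approx_percentile` and the interpolation approach are assumptions for illustration, and the sketch only applies to metrics whose stored values increase with the percentile level, such as net_pnl.

```python
import numpy as np


def approx_percentile(stored: dict, realized: float) -> float:
    """Approximate percentile of a realized value from stored pNN entries.

    Values below p5 or above p95 are clamped to 5 and 95 respectively,
    because the stored summary carries no information beyond those levels.
    """
    levels = sorted(int(key[1:]) for key in stored)      # e.g. [5, 25, 50, ...]
    values = [stored[f"p{level}"] for level in levels]
    return float(np.interp(realized, values, levels))


# net_pnl percentiles copied from the sample MCResult in this document.
NET_PNL = {"p5": -412.50, "p25": 180.00, "p50": 520.00,
           "p75": 890.00, "p95": 1640.00}
at_median = approx_percentile(NET_PNL, 520.0)
between = approx_percentile(NET_PNL, 350.0)
```

Tracking where realized outcomes fall over many cycles would show whether they cluster in the tails of the reported distributions, which is the validation signal the query is meant to support.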
This feature should be gated behind FLAG_MQ_A_MC_REGIME_GATE (feature flag).
- markov-fit-analysis.md) should ship and be validated first, then the Monte Carlo layer is enabled as a follow-on.
- FLAG_MQ_A_MC_REGIME_GATE on staging for internal testing.
| Scale | MC runs/day | Storage/day | Compute/run | Notes |
|---|---|---|---|---|
| 100 users | ~50 on-demand + nightly regime signal | ~5KB per result × 50 = 250KB/day | < 1 second CPU | Negligible at this scale |
| 1,000 users | ~500 on-demand | ~2.5MB/day | < 1 second CPU | Redis task queue handles concurrency; no scaling concern |
| 10,000 users | ~5,000 on-demand | ~25MB/day | < 1 second CPU | At 10K scale, consider caching regime-level MC results (not user-specific) to avoid redundant computation |
The HMM model artifact is < 1MB and loads at Celery task startup. The dominant cost is the options chain history load for the bootstrap population — approximately 5–10MB per underlying per 5-year window, loaded once per nightly batch and cached in Redis. Individual MC runs read from the cached bootstrap population, not from disk.
- mq_a.compute_regime_signal task has not completed by 22:00 UTC (regime signal is input to the MC; stale regime = stale confidence indicator).
- mq_a.run_mc_bootstrap task takes > 5 seconds (indicates either a very large historical cycle pool or a resource contention issue).
- confidence_indicator == "OUTSIDE-DISTRIBUTION" at INFO level. If > 20% of runs in a 24-hour window return OUTSIDE-DISTRIBUTION, investigate whether the regime model needs recalibration.
- n_historical_cycles_in_regime per underlying per regime in a time-series metric. If this drops below 15 for a given underlying, the confidence indicator will persistently return LOW/OUTSIDE-DISTRIBUTION and a model card update may be needed to note the limitation.
These are unresolved at the time of this research package and require operator input before feature-developer can build.
Question: Has the licensing review for ORATS enterprise terms been completed? Until it has, the production Monte Carlo bootstrap cannot be filled from real historical options chains. The demo runs on synthetic data; the production feature does not.
Dependency: historical-options-data-vendors.md §8 outlines the required legal review. Matthew Crosby (IP counsel, engaged) or a contract specialist must confirm ORATS enterprise terms and OPRA non-display use classification before any historical options data is used in production backtests.
Impact if unresolved: The MC bootstrap will use only the user's own paper-trade cycle history (which may be too small for statistically meaningful results) or synthetic data. Neither is suitable for a marketable product.
Question: What is the minimum number of regime-matched historical cycles that constitutes a useful bootstrap population? The reference implementation uses 15 as the threshold before falling back to adjacent-regime expansion, and 8 as the threshold before returning OUTSIDE-DISTRIBUTION. Are these thresholds appropriate from a product perspective, or should the UI simply not display the MC distribution until a minimum threshold is met?
Impact: This determines when new users (who have few historical cycles) see the MC feature vs. a "not enough data" placeholder.
Question: The markov-fit-analysis.md document proposes quarterly recalibration on a rolling 5-year window. Is this acceptable operationally? Recalibration requires running the HMM fitting routine (minutes of CPU on a 5-year daily dataset) and deploying a new model artifact. Who owns the recalibration workflow — automated CI job, manual data-scientist dispatch, or a Celery periodic task?
Impact: If recalibration is too infrequent, the regime model may lag structural changes in volatility dynamics (e.g., post-2022 rate environment). If too frequent, it risks overfitting to recent data.
Question: The 4-regime variant (adding trend dimension) is more expressive for credit spreads but requires more data per state and is harder to explain in Antlers. Should v1 ship with the 3-state model for all strategy types, with a 4-state option deferred to v2? Or should credit-spread users get the 4-state model from launch?
Impact: The 4-state model requires approximately 30% more historical data per state to be reliable. For the initial ship targeting SPY, this is feasible. For individual equities, it may not be.
Question: Once the feature is in production and users accumulate real cycles with regime_at_entry tagged, should the system compare actual P/L distributions against the distributions the MC bootstrap reported, as a model validation signal? This would be a significant value-add (it shows users when their real results deviate from the historical distribution) but requires a UX surface and a data pipeline.
Impact: Yes/no decision on whether to build the mc_result_id → actual outcome reconciliation in the first version or defer it.
Question: The regime model uses CBOE VIX and VIX3M daily close. CBOE makes these available free via their data downloads. Confirm that commercial use of CBOE free data for Raxx's MQ-A signal is within CBOE's terms. This is a lower-risk question than ORATS/OPRA but should be confirmed before production use.
Likely answer: CBOE's free VIX data is published for broad commercial use (it is public index data, not proprietary transaction data). But confirm with counsel.
This document is a research specification. It does not constitute investment advice or a recommendation to enter any trade. All backtest results described, including the 2022 synthetic scenario in §7, are from simulation on historical or synthetic data. Past simulation outcomes do not predict future trading results. Real-world performance depends on conditions that cannot be captured in a simulation.