Data Science — Strategy Inventory

This README is the authoritative inventory of all active research strategies and their pipeline status.

Status stages: research → backtested → walk-forward-pass → ready-for-feature-dev → in-product
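The stage sequence above can be modeled as an ordered enum so that tooling can compare or sort strategies by pipeline position. This is a minimal sketch: the names `Stage`, `parse_stage`, `stage_label`, and `next_stage` are hypothetical and do not come from any reference implementation in this repository. It models only the listed order, not any rule about which transitions are allowed.

```python
from enum import IntEnum
from typing import Optional


class Stage(IntEnum):
    """Pipeline stages in the order listed in this README (hypothetical helper)."""
    RESEARCH = 0
    BACKTESTED = 1
    WALK_FORWARD_PASS = 2
    READY_FOR_FEATURE_DEV = 3
    IN_PRODUCT = 4


def parse_stage(label: str) -> Stage:
    """Map a README status string such as 'walk-forward-pass' to a Stage."""
    return Stage[label.strip().upper().replace("-", "_")]


def stage_label(stage: Stage) -> str:
    """Inverse of parse_stage: Stage.IN_PRODUCT -> 'in-product'."""
    return stage.name.lower().replace("_", "-")


def next_stage(stage: Stage) -> Optional[Stage]:
    """Return the following stage in the listed order, or None at the end."""
    return Stage(stage + 1) if stage < Stage.IN_PRODUCT else None
```

For example, `parse_stage("research") < parse_stage("ready-for-feature-dev")` evaluates to `True`, which is enough to sort the inventory by status.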

Active Strategies

lcc-income-cycle — Layered Covered Call Income Cycle

Field	Value
Status	ready-for-feature-dev
Date Added	2026-05-05
OQs Locked	2026-06-05 (all 5 resolved — see #3282)
Ticker Scope	AMZN (Phase 1); AAPL, MSFT, NVDA, SPY, QQQ (Phase 2)
Strategy Type	Income / Share + Covered Call
Walk-Forward Validated	No — backtest not yet run; OQs were the blocker; data licensing (ORATS) still required for historical backtest run
Ready for Feature-Dev	Yes — all 5 open questions resolved; PM can file engineering cards

Locked decisions summary:
- OQ-1 (max roll debit): per-strategy-config, default 50% of original credit; system refuses above threshold + 3-option operator card — locked 2026-06-05
- OQ-2 (early assignment): 4-option educational full-screen card, no autonomous action, state=EARLY_ASSIGNMENT_PENDING — locked 2026-06-05
- OQ-3 (uncovered shares / shares-below-N): auto-disable + push + email alert; 3-option operator card — locked 2026-06-05
- OQ-4 (sentiment staleness): 1 market day; High-Uncertainty fallback; strategy continues — locked 2026-06-05
- OQ-5 (tax-lot default): FIFO (IRS default); per-trade override at order ticket — locked 2026-06-05
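As an illustration of the OQ-1 rule, the threshold check might look like the sketch below. The names `RollPolicy` and `roll_allowed` are hypothetical and are not taken from the reference implementation. The sketch assumes that a debit exactly at the threshold is allowed, since the decision says the system refuses rolls above it, and it leaves the 3-option operator card out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RollPolicy:
    """Per-strategy roll configuration (hypothetical name)."""
    # Default from OQ-1: a roll may cost at most 50% of the original credit.
    max_debit_fraction: float = 0.50


def roll_allowed(original_credit: float, roll_debit: float,
                 policy: RollPolicy = RollPolicy()) -> bool:
    """Return True if the roll debit is at or below the configured threshold.

    Assumption: both amounts are per-share dollar values and the original
    credit is positive. A False result is where the system would refuse the
    roll and hand the decision to the operator.
    """
    if original_credit <= 0:
        raise ValueError("original_credit must be positive")
    return roll_debit <= policy.max_debit_fraction * original_credit
```

With the default policy, a call sold for a 2.00 credit could be rolled for a debit of up to 1.00; a strategy configured with `RollPolicy(max_debit_fraction=0.25)` would cap the same roll at 0.50.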

Documents:
- Strategy spec: docs/data-science/2026-05-05-layered-covered-call-strategy.md
- Data schema: docs/data-science/strategies/lcc-income-cycle/data-schema.md
- Backtest config: docs/data-science/strategies/lcc-income-cycle/backtest-config.json
- Failure modes: docs/data-science/strategies/lcc-income-cycle/failure-modes.md
- Alert spec: docs/data-science/strategies/lcc-income-cycle/alert-spec.md
- Model card: docs/data-science/strategies/lcc-income-cycle/model-card.md
- Reference impl: docs/data-science/reference-impl/layered_covered_call/

Engineering handoff: 7 atomic engineering cards defined in the strategy spec's ## Engineering handoff section. Estimated total feature-dev effort: 7.5 days. PM should file cards against #3282.

sentiment-journal-shape-1 — Personal Sentiment Journal (Shape 1)

Field	Value
Status	ready-for-feature-dev
Date Added	2026-06-05
Strategy Type	User-Annotation / Behavioral Quality Tracking
Instrument Class	All (iron condor, vertical spread, CSP, covered call, equity)
Walk-Forward Validated	N/A — data-capture + query feature, not a signal
Ready for Feature-Dev	Yes — no open blockers

Documents:
- Strategy brief: docs/data-science/2026-06-05-personal-sentiment-journal-shape-1.md
- Reference impl: docs/data-science/reference-impl/sentiment_journal/
  - taxonomy.py — pre/post label enums + JSON schemas
  - schema.sql — proposed DDL (migration 046 shape)
  - query.py — backtest-query reference function
  - demo.py — worked example with toy data (runs clean, stdlib + pandas/numpy only)

Taxonomy decision:
- Pre-trade: Bullish / Bearish / Neutral / HighUncertainty (extends existing LCC set to all structures)
- Post-trade: Disciplined / Patient / Adjusted / Panicked / Surprised (five labels, behavioral-quality axis)
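The two label sets above could be expressed as enums along the following lines. This is an illustrative sketch only, not the contents of taxonomy.py: the class names `PreTradeLabel` and `PostTradeLabel` and the helper `validate_annotation` are hypothetical, and the sketch assumes a post-trade label may be absent while a trade is still open.

```python
from enum import Enum
from typing import Optional, Tuple


class PreTradeLabel(str, Enum):
    """Pre-trade sentiment labels, as listed in the taxonomy decision."""
    BULLISH = "Bullish"
    BEARISH = "Bearish"
    NEUTRAL = "Neutral"
    HIGH_UNCERTAINTY = "HighUncertainty"


class PostTradeLabel(str, Enum):
    """Post-trade behavioral-quality labels (five values)."""
    DISCIPLINED = "Disciplined"
    PATIENT = "Patient"
    ADJUSTED = "Adjusted"
    PANICKED = "Panicked"
    SURPRISED = "Surprised"


def validate_annotation(
    pre: str, post: Optional[str] = None
) -> Tuple[PreTradeLabel, Optional[PostTradeLabel]]:
    """Convert raw strings to enum members; unknown labels raise ValueError.

    Assumption: the post-trade label is optional until the trade closes.
    """
    pre_label = PreTradeLabel(pre)
    post_label = PostTradeLabel(post) if post is not None else None
    return pre_label, post_label
```

Because both enums subclass `str`, members serialize to their plain label text, which keeps them compatible with a JSON schema that enumerates the same strings.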

Open questions before productization: None blocking v1. BLR (business-legal-researcher) is reviewing legal posture in parallel; iterate post-handoff if its findings change the framing.

Next step: feature-developer picks up strategy brief (§6 Handoff Packet). Console migration 0145 required in same PR as implementation.

etf-nav-discount — ETF NAV Discount Entry

Field	Value
Status	research
Date Added	2026-05-12
GitHub Issue	#479
Source	User feedback (iMessage, 2026-04-29 UTC)
Strategy Type	Equity / Scheduled Conditional Buy
Instrument Class	US-listed ETFs (equity, bond, commodity)
Walk-Forward Validated	No — backtest not yet run on real data
Ready for Feature-Dev	No — 5 open questions; regulatory clearance is the hard blocker

Documents:
- Strategy spec: docs/data-science/strategies/etf-nav-discount/spec.md
- Data schema: docs/data-science/strategies/etf-nav-discount/data-schema.md
- Backtest config: docs/data-science/strategies/etf-nav-discount/backtest-config.json
- Failure modes: docs/data-science/strategies/etf-nav-discount/failure-modes.md
- Model card: docs/data-science/strategies/etf-nav-discount/model-card.md
- Reference impl: docs/data-science/reference-impl/etf_nav_discount/
- NL parsing research: docs/architecture/research/nl-strategy-dsl-parsing.md

Open questions before productization:
1. (OQ-1 — hard blocker) business-legal-researcher must scope the investment-adviser registration question for automated execution of user-authored AI-parsed strategies.
2. (OQ-2 — hard blocker) Commercial licensing for NAV data (fund sites, ETF.com). BLR must confirm before scraper ships to production.
3. (OQ-3 — backtest blocker) Backtest not yet run on real data. Assign to feature-developer or Kristerpher with Alpaca sandbox access. Runtime ~3 min.
4. (OQ-4 — product decision) Kristerpher's headline-vs-adjacent call per issue #479.
5. (OQ-5 — UX) Human-confirm flow: every trigger, or only first live-trading activation?

Next step: Kristerpher decides on OQ-4 (headline vs. adjacent). BLR dispatched on OQ-1 + OQ-2 in parallel. Once OQ-1 is cleared, assign OQ-3 backtest run.

mc-regime-gate — Rolling-Income Monte Carlo + Regime-Aware Entry Gate

Field	Value
Status	research
Date Added	2026-05-06
GitHub Issue	#246
Epic	#79 (Backtesting Lab)
Strategy Type	Infrastructure / Signal — supports IC-001, CS-001, CSP-001, lcc-income-cycle
Walk-Forward Validated	No — demo runs on synthetic data; production backtest blocked on data licensing
Ready for Feature-Dev	No — 6 open questions; data licensing is the hard blocker

Documents:
- Research package: docs/data-science/2026-05-06-rolling-income-monte-carlo-regime-gate.md
- Reference impl: docs/data-science/reference-impl/mc_regime_gate/

Open questions before productization:
1. (OQ-1 — hard blocker) ORATS enterprise license confirmed? Production backtest blocked until yes.
2. (OQ-2) Minimum cycle count for "useful" output — 8/15 thresholds acceptable?
3. (OQ-3) HMM recalibration cadence — quarterly OK? Who owns the workflow?
4. (OQ-4) 3-state vs. 4-state HMM for initial ship — 3-state for all, or 4-state for credit spreads?
5. (OQ-5) Build the MC-vs-actual feedback loop in v1, or defer?
6. (OQ-6) Confirm CBOE free VIX data commercial use is in-scope with counsel.

Next step: Resolve OQ-1 (data licensing) with Matthew Crosby or contract counsel. Once resolved, the production backtest can run on real data; a completed run moves this strategy to backtested, and walk-forward validation can then begin.

Prior Research (Pre-LCC)

Iron Condor / Credit Spread / CSP Framework (source brief)

Status: Source material ingested; not yet formalized into a strategy spec.

Source documents in docs/data-science/sources/:
- strategy.md — core strategy description (iron condors, credit spreads, CSPs)
- execution_workflow.md — current workflow gaps
- strengths_weaknesses.md — Kristerpher's self-assessment
- data_schema.json — original trade-tracking schema (iron condor oriented)
- agent_prompt.txt — prior automation prompt
- roadmap.md — original build plan

The LCC strategy (above) is the first formally specified and researched strategy. The iron condor / CSP strategies from the source brief are candidates for a second research sprint once LCC has shipped to feature-dev.

Supporting Research

docs/data-science/historical-options-data-vendors.md — vendor comparison for historical options data (ORATS, Tradier, CBOE DataShop, etc.)
docs/data-science/markov-fit-analysis.md — prior Markov chain analysis (context for volatility regime modeling)
docs/architecture/research/nl-strategy-dsl-parsing.md — NL-to-DSL parsing approach comparison for the natural-language strategy authoring surface (issue #479)

Backtest Artifacts

Run artifacts live under docs/data-science/backtests/<strategy-id>/<YYYY-MM-DD-run-id>/. No backtest runs have been executed yet. The lcc-income-cycle backtest config is ready; it requires confirmed access to historical options chain data (ORATS or Tradier subscription) before the run can produce valid results.
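The artifact directory convention above can be built programmatically, as in the sketch below. The helper name `artifact_dir` is hypothetical, and the sketch assumes that `<YYYY-MM-DD-run-id>` means an ISO date, a hyphen, and then a free-form run identifier; the README does not define the run-id format.

```python
from datetime import date
from pathlib import PurePosixPath

# Root taken from this README; repository-relative, POSIX-style separators.
BACKTEST_ROOT = PurePosixPath("docs/data-science/backtests")


def artifact_dir(strategy_id: str, run_date: date, run_id: str) -> PurePosixPath:
    """Return the directory for one backtest run.

    Assumption: the final path component is '<ISO date>-<run id>'.
    """
    if not strategy_id or not run_id:
        raise ValueError("strategy_id and run_id must be non-empty")
    return BACKTEST_ROOT / strategy_id / f"{run_date.isoformat()}-{run_id}"
```

For instance, `artifact_dir("lcc-income-cycle", date(2026, 6, 5), "run01")` yields `docs/data-science/backtests/lcc-income-cycle/2026-06-05-run01`. The run id `run01` is made up for illustration, since no runs have been executed yet.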

The mc-regime-gate demo runs on synthetic data and requires no historical options data. Production backtest is blocked on the ORATS enterprise licensing review.

The etf-nav-discount backtest config is ready; it requires Alpaca historical bars (Alpaca Basic subscription) plus a NAV data CSV. Runtime estimate: ~3 minutes. Productization is blocked on OQ-1 (regulatory clearance); the backtest itself can run for research validation once OQ-3 is assigned.