Data Science — Strategy Inventory
This README is the authoritative inventory of all active research strategies and their pipeline status.
Status stages:
research → backtested → walk-forward-pass → ready-for-feature-dev → in-product
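The stage order above can be sketched as an ordered enum, which makes it easy to compare or advance a strategy's status. This is an illustrative sketch only: `Stage`, `parse_stage`, and `next_stage` are hypothetical names and are not part of any reference implementation listed in this README.

```python
from enum import IntEnum
from typing import Optional


class Stage(IntEnum):
    """Pipeline stages in the order listed above (lowest = earliest)."""
    RESEARCH = 0
    BACKTESTED = 1
    WALK_FORWARD_PASS = 2
    READY_FOR_FEATURE_DEV = 3
    IN_PRODUCT = 4


def parse_stage(label: str) -> Stage:
    """Map a Status cell such as 'ready-for-feature-dev' to a Stage."""
    return Stage[label.strip().upper().replace("-", "_")]


def next_stage(stage: Stage) -> Optional[Stage]:
    """Return the following stage, or None once a strategy is in-product."""
    return Stage(stage + 1) if stage < Stage.IN_PRODUCT else None
```

For example, `parse_stage("research") < parse_stage("backtested")` holds because the enum members carry their pipeline position.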
Active Strategies
lcc-income-cycle — Layered Covered Call Income Cycle
| Field | Value |
|---|---|
| Status | ready-for-feature-dev |
| Date Added | 2026-05-05 |
| OQs Locked | 2026-06-05 (all 5 resolved — see #3282) |
| Ticker Scope | AMZN (Phase 1); AAPL, MSFT, NVDA, SPY, QQQ (Phase 2) |
| Strategy Type | Income / Share + Covered Call |
| Walk-Forward Validated | No — backtest not yet run; OQs were the blocker; data licensing (ORATS) still required for historical backtest run |
| Ready for Feature-Dev | Yes — all 5 open questions resolved; PM can file engineering cards |
Locked decisions summary:
- OQ-1 (max roll debit): per-strategy-config, default 50% of original credit; system refuses above threshold + 3-option operator card — locked 2026-06-05
- OQ-2 (early assignment): 4-option educational full-screen card, no autonomous action, state=EARLY_ASSIGNMENT_PENDING — locked 2026-06-05
- OQ-3 (uncovered shares / shares-below-N): auto-disable + push + email alert; 3-option operator card — locked 2026-06-05
- OQ-4 (sentiment staleness): 1 market day; High-Uncertainty fallback; strategy continues — locked 2026-06-05
- OQ-5 (tax-lot default): FIFO (IRS default); per-trade override at order ticket — locked 2026-06-05
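Two of the locked decisions, OQ-1 and OQ-4, reduce to simple threshold rules. The sketch below shows one way they might be expressed. It makes several assumptions: `RollConfig`, `roll_allowed`, `market_days_between`, and `effective_sentiment` are hypothetical names, a debit exactly at the threshold is treated as allowed, and "market day" is approximated as a weekday with no exchange-holiday calendar. The strategy spec remains the source of truth for the actual behavior.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RollConfig:
    # Hypothetical per-strategy setting; 0.50 mirrors the documented default.
    max_roll_debit_fraction: float = 0.50


def roll_allowed(original_credit: float, roll_debit: float,
                 cfg: RollConfig = RollConfig()) -> bool:
    """OQ-1: refuse a roll whose debit exceeds the configured share of the credit."""
    if original_credit <= 0:
        raise ValueError("original_credit must be positive")
    return roll_debit <= cfg.max_roll_debit_fraction * original_credit


def market_days_between(start: date, end: date) -> int:
    """Count weekdays in (start, end]; holidays are ignored in this sketch."""
    days = 0
    current = start
    while current < end:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 .. Friday=4
            days += 1
    return days


def effective_sentiment(label: str, observed_on: date, as_of: date,
                        max_age_market_days: int = 1) -> str:
    """OQ-4: fall back to HighUncertainty when the reading is stale."""
    if market_days_between(observed_on, as_of) > max_age_market_days:
        return "HighUncertainty"
    return label
```

Under these assumptions, a Friday reading is still usable on the following Monday (one market day old), while a Thursday reading evaluated on Monday falls back to `HighUncertainty` and the strategy continues.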
Documents:
- Strategy spec: docs/data-science/2026-05-05-layered-covered-call-strategy.md
- Data schema: docs/data-science/strategies/lcc-income-cycle/data-schema.md
- Backtest config: docs/data-science/strategies/lcc-income-cycle/backtest-config.json
- Failure modes: docs/data-science/strategies/lcc-income-cycle/failure-modes.md
- Alert spec: docs/data-science/strategies/lcc-income-cycle/alert-spec.md
- Model card: docs/data-science/strategies/lcc-income-cycle/model-card.md
- Reference impl: docs/data-science/reference-impl/layered_covered_call/
Engineering handoff:
Seven atomic engineering cards are defined in the "Engineering handoff" section of the strategy spec.
Estimated total feature-dev effort: 7.5 days. PM should file cards against #3282.
sentiment-journal-shape-1 — Personal Sentiment Journal (Shape 1)
| Field | Value |
|---|---|
| Status | ready-for-feature-dev |
| Date Added | 2026-06-05 |
| Strategy Type | User-Annotation / Behavioral Quality Tracking |
| Instrument Class | All (iron condor, vertical spread, CSP, covered call, equity) |
| Walk-Forward Validated | N/A — data-capture + query feature, not a signal |
| Ready for Feature-Dev | Yes — no open blockers |
Documents:
- Strategy brief: docs/data-science/2026-06-05-personal-sentiment-journal-shape-1.md
- Reference impl: docs/data-science/reference-impl/sentiment_journal/
  - taxonomy.py — pre/post label enums + JSON schemas
  - schema.sql — proposed DDL (migration 046 shape)
  - query.py — backtest-query reference function
  - demo.py — worked example with toy data (runs clean, stdlib + pandas/numpy only)
Taxonomy decision:
- Pre-trade: Bullish / Bearish / Neutral / HighUncertainty (extends existing LCC set to all structures)
- Post-trade: Disciplined / Patient / Adjusted / Panicked / Surprised (five labels, behavioral-quality axis)
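The two label sets above map naturally onto string enums. The following is a minimal sketch of that idea and is not the contents of `taxonomy.py`: the class names `PreTradeSentiment` and `PostTradeLabel` and the helper `validate_annotation` are assumptions made for illustration.

```python
from enum import Enum
from typing import Optional, Tuple


class PreTradeSentiment(str, Enum):
    BULLISH = "Bullish"
    BEARISH = "Bearish"
    NEUTRAL = "Neutral"
    HIGH_UNCERTAINTY = "HighUncertainty"


class PostTradeLabel(str, Enum):
    DISCIPLINED = "Disciplined"
    PATIENT = "Patient"
    ADJUSTED = "Adjusted"
    PANICKED = "Panicked"
    SURPRISED = "Surprised"


def validate_annotation(
    pre: str, post: Optional[str] = None
) -> Tuple[PreTradeSentiment, Optional[PostTradeLabel]]:
    """Coerce raw strings to enum members; raises ValueError on unknown labels.

    The post-trade label is optional here on the assumption that it is
    recorded only after a trade closes.
    """
    pre_label = PreTradeSentiment(pre)
    post_label = PostTradeLabel(post) if post is not None else None
    return pre_label, post_label
```

Calling `validate_annotation("Bullish", "Patient")` returns the matching enum members, while an unlisted label such as `"Greedy"` raises `ValueError`.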
Open questions before productization: None blocking v1. BLR (business-legal-researcher) is reviewing legal posture in parallel; iterate post-handoff if its findings change the framing.
Next step: feature-developer picks up strategy brief (§6 Handoff Packet). Console migration 0145 required in same PR as implementation.
etf-nav-discount — ETF NAV Discount Entry
| Field | Value |
|---|---|
| Status | research |
| Date Added | 2026-05-12 |
| GitHub Issue | #479 |
| Source | User feedback (iMessage, 2026-04-29 UTC) |
| Strategy Type | Equity / Scheduled Conditional Buy |
| Instrument Class | US-listed ETFs (equity, bond, commodity) |
| Walk-Forward Validated | No — backtest not yet run on real data |
| Ready for Feature-Dev | No — 5 open questions; regulatory clearance is the hard blocker |
Documents:
- Strategy spec: docs/data-science/strategies/etf-nav-discount/spec.md
- Data schema: docs/data-science/strategies/etf-nav-discount/data-schema.md
- Backtest config: docs/data-science/strategies/etf-nav-discount/backtest-config.json
- Failure modes: docs/data-science/strategies/etf-nav-discount/failure-modes.md
- Model card: docs/data-science/strategies/etf-nav-discount/model-card.md
- Reference impl: docs/data-science/reference-impl/etf_nav_discount/
- NL parsing research: docs/architecture/research/nl-strategy-dsl-parsing.md
Open questions before productization:
1. (OQ-1 — hard blocker) business-legal-researcher must scope the investment-adviser
registration question for automated execution of user-authored AI-parsed strategies.
2. (OQ-2 — hard blocker) Commercial licensing for NAV data (fund sites, ETF.com).
BLR must confirm before scraper ships to production.
3. (OQ-3 — backtest blocker) Backtest not yet run on real data. Assign to
feature-developer or Kristerpher with Alpaca sandbox access. Runtime ~3 min.
4. (OQ-4 — product decision) Kristerpher's headline-vs-adjacent call per issue #479.
5. (OQ-5 — UX) Human-confirm flow: every trigger, or only first live-trading activation?
Next step: Kristerpher decides on OQ-4 (headline vs. adjacent). BLR dispatched on OQ-1 + OQ-2 in parallel. Once OQ-1 is cleared, assign OQ-3 backtest run.
mc-regime-gate — Rolling-Income Monte Carlo + Regime-Aware Entry Gate
| Field | Value |
|---|---|
| Status | research |
| Date Added | 2026-05-06 |
| GitHub Issue | #246 |
| Epic | #79 (Backtesting Lab) |
| Strategy Type | Infrastructure / Signal — supports IC-001, CS-001, CSP-001, lcc-income-cycle |
| Walk-Forward Validated | No — demo runs on synthetic data; production backtest blocked on data licensing |
| Ready for Feature-Dev | No — 6 open questions; data licensing is the hard blocker |
Documents:
- Research package: docs/data-science/2026-05-06-rolling-income-monte-carlo-regime-gate.md
- Reference impl: docs/data-science/reference-impl/mc_regime_gate/
Open questions before productization:
1. (OQ-1 — hard blocker) ORATS enterprise license confirmed? Production backtest blocked until yes.
2. (OQ-2) Minimum cycle count for "useful" output — 8/15 thresholds acceptable?
3. (OQ-3) HMM recalibration cadence — quarterly OK? Who owns the workflow?
4. (OQ-4) 3-state vs. 4-state HMM for initial ship — 3-state for all, or 4-state for credit spreads?
5. (OQ-5) Build the MC-vs-actual feedback loop in v1, or defer?
6. (OQ-6) Confirm CBOE free VIX data commercial use is in-scope with counsel.
Next step: Resolve OQ-1 (data licensing) with Matthew Crosby or contract counsel. Once
resolved, this moves to backtested and the walk-forward validation can begin on real data.
Prior Research (Pre-LCC)
Iron Condor / Credit Spread / CSP Framework (source brief)
Status: Source material ingested; not yet formalized into a strategy spec.
Source documents in docs/data-science/sources/:
- strategy.md — core strategy description (iron condors, credit spreads, CSPs)
- execution_workflow.md — current workflow gaps
- strengths_weaknesses.md — Kristerpher's self-assessment
- data_schema.json — original trade-tracking schema (iron condor oriented)
- agent_prompt.txt — prior automation prompt
- roadmap.md — original build plan
The LCC strategy (above) is the first formally specified and researched strategy. The iron condor / CSP strategies from the source brief are candidates for a second research sprint once LCC has shipped to feature-dev.
Supporting Research
- docs/data-science/historical-options-data-vendors.md — vendor comparison for historical options data (ORATS, Tradier, CBOE DataShop, etc.)
- docs/data-science/markov-fit-analysis.md — prior Markov chain analysis (context for volatility regime modeling)
- docs/architecture/research/nl-strategy-dsl-parsing.md — NL-to-DSL parsing approach comparison for the natural-language strategy authoring surface (issue #479)
Backtest Artifacts
Run artifacts live under docs/data-science/backtests/<strategy-id>/<YYYY-MM-DD-run-id>/.
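As a sketch of the directory convention above, the helper below builds a run directory from a strategy id, a run date, and a run identifier. It assumes the final path segment is the ISO date followed by a hyphen and a free-form run id; `run_dir` and `BACKTEST_ROOT` are hypothetical names.

```python
from datetime import date
from pathlib import PurePosixPath

# Root taken from the convention stated above.
BACKTEST_ROOT = PurePosixPath("docs/data-science/backtests")


def run_dir(strategy_id: str, run_date: date, run_id: str) -> PurePosixPath:
    """Return <root>/<strategy-id>/<YYYY-MM-DD>-<run-id> (assumed layout)."""
    if not strategy_id or not run_id:
        raise ValueError("strategy_id and run_id must be non-empty")
    return BACKTEST_ROOT / strategy_id / f"{run_date.isoformat()}-{run_id}"
```

For instance, `run_dir("lcc-income-cycle", date(2026, 6, 5), "baseline")` yields `docs/data-science/backtests/lcc-income-cycle/2026-06-05-baseline`, where `baseline` is a made-up run id.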
No backtest runs have been executed yet. The lcc-income-cycle backtest config is ready;
it requires confirmed access to historical options chain data (ORATS or Tradier subscription)
before the run can produce valid results.
The mc-regime-gate demo runs on synthetic data and requires no historical options data.
Production backtest is blocked on the ORATS enterprise licensing review.
The etf-nav-discount backtest config is ready; it requires Alpaca historical bars
(Alpaca Basic subscription) + NAV data CSV. Runtime estimate: ~3 minutes.
Productization is blocked on OQ-1 (regulatory clearance); the backtest itself can run
for research validation once OQ-3 is assigned.