Handoff Packet — Issue #84
Feature-developer: read this first
What to build
Add six new fields to the calculate_metrics() function in backend_v2/api/routes/backtest.py and surface them in the Summary tab of BacktestResults.js.
No new data fetches. No new DB tables. No new API endpoints. This is a pure metrics extension.
Backend — exact file to touch
File: backend_v2/api/routes/backtest.py
Function: calculate_metrics(initial_capital, equity_curve, trades) — lines 789–874
Step 1: Add calculate_risk_metrics() as a standalone helper
Copy the full calculate_risk_metrics() function from docs/research/issue-84/reference-implementation.py into backtest.py. Place it directly above calculate_metrics().
Import nothing new — numpy and math are already imported.
Step 2: Wire it into calculate_metrics()
Inside calculate_metrics(), the existing code already computes:
- daily_returns (list of floats): lines 829–833
- annualized_return (float, decimal form before the * 100): line 816
- max_drawdown (float, decimal form before the * 100): lines 819–824
- equity_values (list of floats): line 809
Add this block before the final return statement (after line 853, before line 855):
    # issue-84: extended risk metrics
    risk_metrics = calculate_risk_metrics(
        daily_returns=daily_returns,
        annualized_return=annualized_return,  # decimal, pre-percentage
        max_drawdown=max_drawdown,  # decimal, pre-percentage
        equity_values=equity_values,
        benchmark_returns=None,
    )
Then merge into the return dict:
    return {
        "total_return": round(total_return * 100, 2),
        "annualized_return": round(annualized_return * 100, 2),
        "max_drawdown": round(max_drawdown * 100, 2),
        "sharpe_ratio": round(sharpe_ratio, 2),
        ...existing fields...,
        # issue-84 additions:
        **risk_metrics,
    }
Critical: annualized_return and max_drawdown are in decimal form inside calculate_metrics() before being multiplied by 100 at return time. Pass the decimal values to calculate_risk_metrics(). The helper uses them in decimal form internally.
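One thing worth guarding when unpacking **risk_metrics into the return dict: in a Python dict literal, a later key silently overwrites an earlier one with the same name. The sketch below shows one way to make such a collision loud instead of silent. The helper name merge_metrics and the sample values are hypothetical and are not part of backtest.py.

```python
def merge_metrics(base: dict, extra: dict) -> dict:
    """Merge two metric dicts, refusing to overwrite an existing key.

    Hypothetical helper for illustration; not part of backtest.py.
    """
    overlap = base.keys() & extra.keys()
    if overlap:
        raise ValueError(
            f"risk metrics would overwrite existing fields: {sorted(overlap)}"
        )
    return {**base, **extra}


# Made-up sample values, already in the response's percentage/ratio form.
existing = {"total_return": 12.5, "sharpe_ratio": 1.1}
risk = {"sortino_ratio": 1.6, "var_95": 2.1}
merged = merge_metrics(existing, risk)
```

A check like this could live in a unit test rather than in the request path; either way it backs the claim that the new fields are purely additive.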
New fields added to the response payload
| Field | Type | Notes |
|---|---|---|
| sortino_ratio | float or null | null when < 20 obs or no losing days |
| calmar_ratio | float or null | null when max_drawdown == 0 or < 20 obs |
| var_95 | float or null | positive %, e.g. 2.1 |
| var_99 | float or null | positive % |
| cvar_95 | float or null | positive %, always >= var_95 |
| cvar_99 | float or null | positive %, always >= var_99 |
| ulcer_index | float or null | lower is better |
| beta | null | always null in v1 |
| insufficient_data | bool | true when obs < 20 |
| observations | int | count of daily returns used |
These fields are also propagated in run_strategy_comparison() via the strategy_metrics dict inside the loop (line 939) — no change needed there because it already uses {**strategy_metrics} in the per-strategy result. Verify this after your change.
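For orientation, here is a minimal sketch of how the VaR, CVaR, and Ulcer Index rows of the table could be produced with the conventions stated there (losses reported as positive percentages, null below 20 observations). It is not the reference implementation in docs/research/issue-84/reference-implementation.py, which remains the source of truth. The function name sketch_tail_metrics, the use of np.percentile with its default interpolation, and the whole-series Ulcer Index are assumptions made for illustration. How an all-positive return series should be reported is not specified here and is left to the reference implementation.

```python
import numpy as np

MIN_OBSERVATIONS = 20  # threshold taken from the field table


def sketch_tail_metrics(daily_returns, equity_values):
    """Illustrative only: historical VaR/CVaR and Ulcer Index.

    Inputs are decimal daily returns and a strictly positive equity curve.
    """
    returns = np.asarray(daily_returns, dtype=float)
    result = {
        "var_95": None, "var_99": None,
        "cvar_95": None, "cvar_99": None,
        "ulcer_index": None,
        "insufficient_data": returns.size < MIN_OBSERVATIONS,
        "observations": int(returns.size),
    }
    if result["insufficient_data"]:
        return result

    for level in (95, 99):
        # Historical VaR: the (100 - level)th percentile of daily returns,
        # sign-flipped so that a loss is reported as a positive percentage.
        cutoff = np.percentile(returns, 100 - level)
        # Returns at or beyond the cutoff; never empty, because the
        # smallest return is always <= any percentile of the sample.
        tail = returns[returns <= cutoff]
        result[f"var_{level}"] = round(float(-cutoff * 100), 2)
        # CVaR: mean of the tail, so it is at least as large as VaR.
        result[f"cvar_{level}"] = round(float(-tail.mean() * 100), 2)

    # Ulcer Index: root-mean-square of percentage drawdowns from the
    # running peak, computed over the whole series (an assumption).
    equity = np.asarray(equity_values, dtype=float)
    peaks = np.maximum.accumulate(equity)
    drawdown_pct = (equity - peaks) / peaks * 100
    result["ulcer_index"] = round(float(np.sqrt(np.mean(drawdown_pct ** 2))), 2)
    return result
```

Because every tail element is at or below the cutoff, the cvar >= var invariant from the table holds by construction in this sketch, which is the property test_var_cvar_invariant is meant to pin down.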
Frontend — exact file to touch
File: frontend/trademaster_ui/src/components/BacktestResults.js
Tab: Summary → "Performance Metrics" card (lines 354–391)
Changes needed
- Add a "Risk Metrics" card as a third card below "Performance Metrics" in the Summary tab. Suggested placement: after the existing <Col md={6}> Performance Metrics block, add a new <Row> with a full-width card.
- Fields to display:
| Label | Key | Format |
|---|---|---|
| Sortino Ratio | results.sortino_ratio | safeToFixed(value, 2) |
| Calmar Ratio | results.calmar_ratio | safeToFixed(value, 2) |
| VaR (95%) | results.var_95 | safeToFixed(value, 2) + '%' |
| VaR (99%) | results.var_99 | safeToFixed(value, 2) + '%' |
| CVaR / Exp. Shortfall (95%) | results.cvar_95 | safeToFixed(value, 2) + '%' |
| CVaR / Exp. Shortfall (99%) | results.cvar_99 | safeToFixed(value, 2) + '%' |
| Ulcer Index | results.ulcer_index | safeToFixed(value, 2) |
- Insufficient-data notice: When results.insufficient_data === true, render a Bootstrap <Alert variant="info"> above the Risk Metrics card:
    {results.insufficient_data && (
      <Alert variant="info">
        Risk metrics require at least 20 trading days of data
        ({results.observations ?? 0} days in this run).
      </Alert>
    )}
- Obfuscate mode: VaR and CVaR are percentage fields, not dollar amounts, so they are not passed through formatMoney(). They describe percentage loss, not absolute dollar loss. Render them as plain percentage strings regardless of obfuscate mode.
- safeToFixed already handles null/undefined by returning '--', so no additional null guard is needed on the display side.
- The comparison view (BacktestingResults.js) and the metrics table in the compare_strategies response also propagate these fields via the metrics array. That component will display them automatically if it uses strategy_metrics from the API. Check whether it has its own hardcoded field list and update accordingly.
What MUST stay retrospective
All six metrics are computed from the user's own historical equity curve. The UI copy must not frame them predictively. Suggested labels:
- "VaR (95%)" — NOT "projected loss"
- "Worst-day loss at 95% (historical)" — acceptable
- The tooltip or footnote on the Risk Metrics card should read: "Calculated from your historical backtest run. Reflects past performance of this strategy on this data."
Feature flag
No feature flag needed. These are additive metrics on an existing endpoint. They do not change any existing field values.
Tests to write
Backend (backend_v2/tests/)
- Unit test for calculate_risk_metrics(): create tests/test_backtest_risk_metrics.py:
  - test_sortino_happy_path: 30 synthetic returns with some negatives; assert sortino_ratio is float and > 0.
  - test_sortino_no_downside_returns: all-positive returns; assert sortino_ratio is None.
  - test_calmar_zero_drawdown: flat equity curve; assert calmar_ratio is None.
  - test_var_cvar_invariant: assert cvar_95 >= var_95 and cvar_99 >= var_99 on 50 random samples.
  - test_ulcer_index_flat_curve: constant equity; assert ulcer_index == 0.0.
  - test_ulcer_index_monotone_decline: steadily declining equity; assert ulcer_index > 0.
  - test_insufficient_data_guard: pass 5 returns; assert insufficient_data is True and all metric fields are None.
  - test_insufficient_data_boundary: pass exactly 20 returns; assert insufficient_data is False.
- Integration test on /api/backtest response: extend backtest_export_api_tests.py or add a new integration test:
  - Mock get_market_data_service() to return 30+ synthetic bars.
  - POST to /api/backtest and assert the response contains sortino_ratio, calmar_ratio, var_95, var_99, cvar_95, cvar_99, ulcer_index keys.
  - Assert all values are either float or null (not missing entirely).
  - Assert beta is always null in v1.
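The unit tests above all need small synthetic inputs. One possible way to build them reproducibly is sketched below. The helper names synthetic_returns and equity_from_returns, the seed, and the starting capital are assumptions for illustration. Note that this sketch yields an equity curve one point longer than the return series; whether calculate_metrics() aligns the two that way should be checked against the existing code.

```python
import numpy as np


def synthetic_returns(n, seed=0, mean=0.001, std=0.01):
    """Return n reproducible pseudo-random daily returns in decimal form."""
    rng = np.random.default_rng(seed)
    return rng.normal(mean, std, size=n).tolist()


def equity_from_returns(returns, initial_capital=10_000.0):
    """Compound decimal returns into an equity curve starting at initial_capital."""
    curve = [initial_capital]
    for r in returns:
        curve.append(curve[-1] * (1.0 + r))
    return curve


# Inputs matching the test cases listed above.
flat_curve = equity_from_returns([0.0] * 30)        # zero drawdown, constant equity
declining_curve = equity_from_returns([-0.01] * 30)  # monotone decline
all_positive = [abs(r) + 1e-6 for r in synthetic_returns(30)]  # no losing days
boundary = synthetic_returns(20)                     # exactly at the 20-observation threshold
too_short = synthetic_returns(5)                     # below the threshold
```

Fixing the seed keeps test_var_cvar_invariant deterministic across runs while still exercising a non-trivial return distribution.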
Frontend (frontend/trademaster_ui/src/tests/)
- Component test: extend backtestResultsViewMode.test.js or add backtestRiskMetrics.test.js:
  - Render BacktestResults with a mock result containing all six new fields.
  - Assert each field label appears in the document.
  - Render with insufficient_data: true and assert the info alert is visible.
  - Render with all six fields as null and assert -- appears in each cell (safeToFixed behavior).
  - Assert VaR/CVaR values render with % suffix.
  - Assert VaR/CVaR do NOT go through formatMoney when isObfuscated=true; verify the raw % label is still shown.
Estimated scope
Backend: ~60 lines (new helper + wiring). Frontend: ~80 lines (new card + alert). Tests: ~100 lines.
No migrations. No new dependencies. numpy is already imported.
Open questions for operator
None. Card is fully specified. The only deferred item is Beta to SPY — the card says "Beta (when benchmark provided)" and v1 always returns null. A follow-on card can wire in SPY bars if Kristerpher wants it.