Raxx · internal docs


ADR 0057 — Reasonator re-scoring: model SHA as first-class provenance field

Status: Accepted
Date: 2026-05-09 UTC
Refs: #1385, docs/architecture/reasonator/design.md (Decision 4), #1380 (Decision 2, model version governance)


Context

FinBERT model weights evolve. A ProsusAI/finbert checkpoint update could change the score for an identical headline. Without version provenance, a historical score is ambiguous: which model produced it, and was it computed before or after the model update?

The operator requirement (from the handoff doc, OQ-2): every score must carry the exact model SHA used to produce it. If the model updates, historical rows must be re-scorable, with the provenance of both the old and the new score preserved.

Two re-scoring strategies were considered: overwrite-in-place, and append-with-audit.


Decision

Append-with-audit. sentiment_events stores the current (latest) score. Every score write, whether initial or re-score, is appended to sentiment_score_audit with the new score and, for re-scores, the previous score and its model SHA.
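The write pattern above can be sketched with an in-memory SQLite database. This is a minimal illustration, not the Raptor implementation: only the table names, sentiment_score, and scorer_model_version come from this ADR. The other columns (event_id, audit_id, new_score, new_model_sha, previous_score, previous_model_sha), the write_score helper, and the choice to read the previous score from sentiment_events are assumptions made for the example.

```python
import sqlite3

# Hypothetical schema: column names other than sentiment_score and
# scorer_model_version are assumed for illustration.
SCHEMA = """
CREATE TABLE sentiment_events (
    event_id TEXT PRIMARY KEY,
    sentiment_score REAL NOT NULL,
    scorer_model_version TEXT NOT NULL
);
CREATE TABLE sentiment_score_audit (
    audit_id INTEGER PRIMARY KEY AUTOINCREMENT,
    event_id TEXT NOT NULL,
    new_score REAL NOT NULL,
    new_model_sha TEXT NOT NULL,
    previous_score REAL,
    previous_model_sha TEXT
);
"""


def write_score(conn, event_id, score, model_sha):
    """Set the current score and append one audit row in a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        prev = conn.execute(
            "SELECT sentiment_score, scorer_model_version "
            "FROM sentiment_events WHERE event_id = ?",
            (event_id,),
        ).fetchone()
        # An initial write has no previous score, so both fields stay NULL.
        prev_score, prev_sha = prev if prev else (None, None)
        conn.execute(
            "INSERT OR REPLACE INTO sentiment_events "
            "(event_id, sentiment_score, scorer_model_version) VALUES (?, ?, ?)",
            (event_id, score, model_sha),
        )
        conn.execute(
            "INSERT INTO sentiment_score_audit "
            "(event_id, new_score, new_model_sha, previous_score, previous_model_sha) "
            "VALUES (?, ?, ?, ?, ?)",
            (event_id, score, model_sha, prev_score, prev_sha),
        )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
write_score(conn, "evt-1", 0.42, "sha-old")  # initial score
write_score(conn, "evt-1", 0.57, "sha-new")  # re-score after a model update
```

After the two writes, sentiment_events holds a single row with the latest score, while sentiment_score_audit holds two rows, the second of which records the old score and its SHA.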

POST /v1/score/rescore carries previous_score and previous_model_sha in the request; Reasonator echoes them in the response alongside the new scores. Raptor writes the full comparison row to sentiment_score_audit.
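The echo behavior described above might look like the following sketch. Only previous_score and previous_model_sha are named in this ADR; the other fields (event_id, headline, new_score, new_model_sha), the handle_rescore function, and the stub scorer are hypothetical and stand in for the real Reasonator handler.

```python
from dataclasses import asdict, dataclass
from typing import Callable


@dataclass(frozen=True)
class RescoreRequest:
    event_id: str            # assumed identifier field
    headline: str            # assumed input text field
    previous_score: float
    previous_model_sha: str


@dataclass(frozen=True)
class RescoreResponse:
    event_id: str
    new_score: float         # assumed field name
    new_model_sha: str       # assumed field name
    previous_score: float    # echoed unchanged from the request
    previous_model_sha: str  # echoed unchanged from the request


def handle_rescore(
    req: RescoreRequest,
    score_fn: Callable[[str], float],
    current_model_sha: str,
) -> RescoreResponse:
    """Score the headline with the loaded model and echo the previous provenance."""
    return RescoreResponse(
        event_id=req.event_id,
        new_score=score_fn(req.headline),
        new_model_sha=current_model_sha,
        previous_score=req.previous_score,
        previous_model_sha=req.previous_model_sha,
    )


req = RescoreRequest("evt-1", "Shares rally after earnings beat", 0.42, "sha-old")
# A stub scorer stands in for the FinBERT model here.
resp = handle_rescore(req, score_fn=lambda text: 0.57, current_model_sha="sha-new")
payload = asdict(resp)  # the comparison row the caller would persist
```

Because the response carries both the old and the new provenance, the caller can write the audit row without a second lookup.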

scorer_model_version in sentiment_events is always the SHA of the model that produced the current score.

FINBERT_MODEL_SHA is an env var in the Reasonator config, sourced from Infisical. Changing the SHA and redeploying Reasonator (or sending it SIGHUP) triggers a model reload. A re-score sweep job is then triggered to update historical rows.
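As a rough illustration of how a sweep could pick its targets, the sketch below reads FINBERT_MODEL_SHA and selects rows whose scorer_model_version differs from it. The helper names (current_model_sha, stale_events), the in-memory row format, and the selection rule itself are assumptions; the ADR does not specify how the sweep job chooses rows.

```python
import os
from typing import Iterable, Mapping


def current_model_sha(env: Mapping[str, str] = os.environ) -> str:
    """Return the configured model SHA, failing loudly if it is missing."""
    sha = env.get("FINBERT_MODEL_SHA", "").strip()
    if not sha:
        raise RuntimeError("FINBERT_MODEL_SHA is not set")
    return sha


def stale_events(events: Iterable[Mapping[str, object]], sha: str) -> list:
    """Keep rows whose current score was produced by a different model SHA."""
    return [e for e in events if e["scorer_model_version"] != sha]


# Illustrative rows; a real sweep would query sentiment_events instead.
events = [
    {"event_id": "evt-1", "scorer_model_version": "sha-old"},
    {"event_id": "evt-2", "scorer_model_version": "sha-new"},
]
sha = current_model_sha({"FINBERT_MODEL_SHA": "sha-new"})
to_rescore = stale_events(events, sha)
```

Selecting on scorer_model_version keeps the sweep safe to re-run in this sketch: rows already re-scored with the current SHA are skipped on the next pass.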


Consequences


Alternatives Considered

Overwrite-in-place: Simpler: just update sentiment_events.sentiment_score and scorer_model_version. This loses the old score permanently. Rejected: the operator's reproducibility requirement means the old score must be preserved for comparison.

Separate score versions table (one row per version per event): More normalized. Querying current scores requires a MAX(scored_at) join. Higher query complexity with no additional benefit over the append-to-audit pattern. Rejected.
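To make the query cost of this alternative concrete, here is a sketch of the MAX(scored_at) join it would need. The score_versions table, its columns, and the sample rows are hypothetical and exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE score_versions (
    event_id TEXT NOT NULL,
    model_sha TEXT NOT NULL,
    score REAL NOT NULL,
    scored_at TEXT NOT NULL
);
-- Illustrative rows only.
INSERT INTO score_versions VALUES
    ('evt-1', 'sha-old', 0.42, '2026-05-01T00:00:00Z'),
    ('evt-1', 'sha-new', 0.57, '2026-05-09T00:00:00Z'),
    ('evt-2', 'sha-new', -0.10, '2026-05-09T00:00:00Z');
""")

# Reading the current score means joining each event to its latest version.
CURRENT_SCORES = """
SELECT v.event_id, v.score, v.model_sha
FROM score_versions AS v
JOIN (
    SELECT event_id, MAX(scored_at) AS latest
    FROM score_versions
    GROUP BY event_id
) AS m ON m.event_id = v.event_id AND m.latest = v.scored_at
ORDER BY v.event_id
"""
current = conn.execute(CURRENT_SCORES).fetchall()
```

Under the accepted design the same read is a plain SELECT on sentiment_events, since that table always holds the latest score.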

No re-scoring: Treat each model SHA as producing a distinct, non-comparable score series. Simple, but users cannot get a consistent historical view after a model update. Rejected: the operator explicitly requires re-scoring as a supported operation.