
Reasonator API Contract

Service: Reasonator (sentiment scoring service)
Version: v1
Date: 2026-05-09 UTC
Refs: docs/architecture/reasonator/design.md, #1385, #89

All times UTC ISO 8601. All requests authenticated via Bearer token. All responses application/json.


Authentication

Every request must carry:

Authorization: Bearer <REASONATOR_SERVICE_TOKEN>
X-Raxx-Tier: pro_plus | pro
X-Raxx-Request-ID: <uuid>     # caller-generated; echoed in response for tracing

On auth failure: 401 Unauthorized

{
  "error": "unauthorized",
  "message": "Invalid or missing service token"
}

On missing tier header: 400 Bad Request

{
  "error": "missing_tier_header",
  "message": "X-Raxx-Tier header is required"
}
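
For reference, a minimal client-side sketch of the required headers (Python with requests; REASONATOR_URL and the env var lookup are illustrative, not canonical):

import os
import uuid
import requests

REASONATOR_URL = "https://reasonator.internal.example"  # illustrative base URL

def reasonator_headers(tier: str) -> dict:
    """Build the three headers required on every Reasonator call."""
    return {
        "Authorization": f"Bearer {os.environ['REASONATOR_SERVICE_TOKEN']}",
        "X-Raxx-Tier": tier,                     # "pro_plus" or "pro"
        "X-Raxx-Request-ID": str(uuid.uuid4()),  # caller-generated, echoed back
    }

resp = requests.get(f"{REASONATOR_URL}/v1/health", headers=reasonator_headers("pro"))
resp.raise_for_status()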

Endpoints

POST /score

Purpose: Synchronous scoring of a batch of headlines. Pro+ real-time path.
Tier: Available to both pro and pro_plus. Pro+ requests are processed before Pro requests in the internal priority queue.
Latency targets:
- pro_plus, ≤10 headlines: p99 < 2s
- pro_plus, ≤50 headlines: p99 < 8s
- pro, any size: best-effort, no SLA (use /score/batch for large Pro batches)

Request:

{
  "model_sha": "abc123def456",
  "headlines": [
    {
      "id": "se_1234",
      "text": "Apple surges on record earnings beat",
      "symbol": "AAPL",
      "published_at": "2026-04-14T10:32:00Z"
    }
  ]
}

Fields:
- model_sha — required. Must match the currently loaded model SHA. If it does not match, Reasonator returns 409 Conflict (see error codes). This enforces score provenance.
- headlines[].id — caller-assigned opaque ID (echoed in response). Use sentiment_events.id as the value.
- headlines[].text — the headline string. Max 512 characters.
- headlines[].symbol — ticker symbol.
- headlines[].published_at — UTC ISO 8601.

Max batch size: 100 headlines per request. Larger batches → use /score/batch.
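
A sketch of how a caller could stay on the synchronous path by splitting an oversized list into /score-sized chunks (reuses reasonator_headers and REASONATOR_URL from the Authentication sketch; names are illustrative):

import requests

def score_all(headlines, model_sha, tier="pro_plus", chunk_size=100):
    """POST /score in chunks of at most 100 headlines each."""
    results, unscored = [], []
    for i in range(0, len(headlines), chunk_size):
        resp = requests.post(
            f"{REASONATOR_URL}/v1/score",
            headers=reasonator_headers(tier),
            json={"model_sha": model_sha, "headlines": headlines[i:i + chunk_size]},
        )
        resp.raise_for_status()
        body = resp.json()
        results.extend(body["results"])
        unscored.extend(body["unscored"])
    return results, unscored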

Response 200:

{
  "request_id": "req_uuid",
  "model_sha": "abc123def456",
  "model_name": "ProsusAI/finbert",
  "scored_at": "2026-05-09T14:22:00Z",
  "results": [
    {
      "id": "se_1234",
      "label": "positive",
      "score": 0.79,
      "confidence": 0.91
    }
  ],
  "unscored": [],
  "latency_ms": 312
}

Fields:
- label — one of positive, negative, neutral.
- score — float in [-1.0, +1.0]: the label sign (+1 for positive, -1 for negative, 0 for neutral) scaled by confidence.
- confidence — softmax probability of the winning class, in [0.0, 1.0].
- unscored — array of id values that could not be scored (e.g., text too short, scoring error). Nothing is ever dropped silently.
- latency_ms — wall-clock time for the scoring operation only (not network).
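
Per the score definition above, score is recoverable from label and confidence; a one-line reconstruction for sanity checks (sketch):

_LABEL_SIGN = {"positive": 1.0, "negative": -1.0, "neutral": 0.0}

def expected_score(label: str, confidence: float) -> float:
    """Reconstruct score as the label sign scaled by confidence."""
    return _LABEL_SIGN[label] * confidence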

Error responses:
- 400 — validation failure (missing fields, batch too large)
- 401 — auth failure
- 409 Conflict — model_sha does not match loaded model
- 429 Too Many Requests — rate limit exceeded (see rate limit headers)
- 503 Service Unavailable — model not yet loaded (during warm-up window)
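
One hedged pattern for the 409 path: re-read the loaded SHA from /health and retry once. Whether to retry automatically or surface the mismatch is the caller's policy decision; this is a sketch, not mandated behavior:

import requests

def score_with_sha_refresh(headlines, model_sha, tier="pro_plus"):
    """POST /score, refreshing model_sha from /health on a 409 mismatch."""
    for _ in range(2):
        resp = requests.post(
            f"{REASONATOR_URL}/v1/score",
            headers=reasonator_headers(tier),
            json={"model_sha": model_sha, "headlines": headlines},
        )
        if resp.status_code != 409:
            resp.raise_for_status()
            return resp.json()
        # Loaded model changed under us: pick up the new SHA and retry once.
        health = requests.get(f"{REASONATOR_URL}/v1/health",
                              headers=reasonator_headers(tier))
        model_sha = health.json()["model_sha"]
    resp.raise_for_status()  # still 409 after refresh: raise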


POST /score/batch

Purpose: Asynchronous bulk scoring. Pro background path. Accepts up to 2,000 headlines per job.
Tier: pro and pro_plus. Pro jobs are lower priority in the internal queue.

Request:

{
  "model_sha": "abc123def456",
  "job_id": "caller-assigned-uuid",
  "headlines": [
    {
      "id": "se_1234",
      "text": "Apple surges on record earnings beat",
      "symbol": "AAPL",
      "published_at": "2026-04-14T10:32:00Z"
    }
  ]
}

job_id is caller-assigned (Raptor generates the UUID). This makes the endpoint idempotent — submitting the same job_id twice returns the existing job status rather than creating a duplicate.
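
Because resubmission is idempotent, the caller can derive job_id deterministically from a stable job key, so a crashed-and-restarted submitter gets the existing job back instead of enqueuing a duplicate. A sketch (the uuid5 namespace and key format are illustrative assumptions, not how Raptor necessarily generates IDs):

import uuid

# Illustrative namespace; any fixed UUID works as long as it never changes.
RAPTOR_JOB_NS = uuid.uuid5(uuid.NAMESPACE_DNS, "raptor.raxx.internal")

def batch_job_id(sweep_key: str) -> str:
    """Same sweep_key -> same job_id, on every retry."""
    return str(uuid.uuid5(RAPTOR_JOB_NS, sweep_key))

job_id = batch_job_id("2026-05-09:pro:hourly-sweep")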

Response 202 Accepted:

{
  "job_id": "caller-assigned-uuid",
  "status": "queued",
  "queued_at": "2026-05-09T14:22:00Z",
  "estimated_completion_at": "2026-05-09T14:27:00Z"
}

GET /score/batch/{job_id}

Purpose: Poll job status and retrieve results when complete.

Response 200 (pending):

{
  "job_id": "caller-assigned-uuid",
  "status": "processing",
  "progress": 0.42,
  "queued_at": "2026-05-09T14:22:00Z",
  "estimated_completion_at": "2026-05-09T14:27:00Z"
}

Response 200 (complete):

{
  "job_id": "caller-assigned-uuid",
  "status": "complete",
  "model_sha": "abc123def456",
  "model_name": "ProsusAI/finbert",
  "completed_at": "2026-05-09T14:26:44Z",
  "results": [
    {
      "id": "se_1234",
      "label": "positive",
      "score": 0.79,
      "confidence": 0.91
    }
  ],
  "unscored": [],
  "total_scored": 498,
  "total_unscored": 2,
  "duration_ms": 284400
}

Response 404: Job not found (expired or never submitted). Results expire after 10 minutes. Raptor must collect results within this window.

Response 200 (failed):

{
  "job_id": "caller-assigned-uuid",
  "status": "failed",
  "error": "model_load_error",
  "message": "FinBERT model could not be loaded",
  "failed_at": "2026-05-09T14:22:31Z"
}
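
A polling sketch that treats complete and failed as terminal and respects the 10-minute result window (intervals and timeout are illustrative choices):

import time
import requests

def wait_for_batch(job_id, tier="pro", poll_interval=15, timeout=1800):
    """Poll GET /score/batch/{job_id} until a terminal status or timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        resp = requests.get(f"{REASONATOR_URL}/v1/score/batch/{job_id}",
                            headers=reasonator_headers(tier))
        if resp.status_code == 404:
            raise RuntimeError(f"job {job_id} not found or results expired")
        resp.raise_for_status()
        body = resp.json()
        if body["status"] in ("complete", "failed"):
            return body
        time.sleep(poll_interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")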

POST /score/rescore

Purpose: Re-score a batch of previously-scored headlines with a new model SHA. Used when the FinBERT model is updated.

Request:

{
  "new_model_sha": "def789abc012",
  "headlines": [
    {
      "id": "se_1234",
      "text": "Apple surges on record earnings beat",
      "symbol": "AAPL",
      "published_at": "2026-04-14T10:32:00Z",
      "previous_score": 0.79,
      "previous_model_sha": "abc123def456"
    }
  ]
}

previous_score and previous_model_sha are carried for audit — Reasonator echoes them in the response alongside the new scores. This allows Raptor to write both old and new scores to sentiment_score_audit.

Max batch size: 500 headlines per request.

Response 200:

{
  "new_model_sha": "def789abc012",
  "model_name": "ProsusAI/finbert",
  "rescored_at": "2026-05-09T14:22:00Z",
  "results": [
    {
      "id": "se_1234",
      "label": "positive",
      "score": 0.81,
      "confidence": 0.93,
      "previous_score": 0.79,
      "previous_model_sha": "abc123def456",
      "score_delta": 0.02
    }
  ]
}

score_delta is the signed difference (new_score - previous_score); a positive delta means the new model rates the headline more positively than before. Useful for monitoring model drift.
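
A sketch of the kind of drift check score_delta enables (the threshold is an illustrative choice, not a documented limit):

def mean_abs_delta(results) -> float:
    """Average |score_delta| across a rescored batch, as a drift signal."""
    deltas = [abs(r["score_delta"]) for r in results]
    return sum(deltas) / len(deltas) if deltas else 0.0

DRIFT_THRESHOLD = 0.15  # illustrative; tune against historical rescore sweeps

sample = [{"score_delta": 0.02}, {"score_delta": -0.31}]
if mean_abs_delta(sample) > DRIFT_THRESHOLD:
    print("warning: large average score shift after model update")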

Note: /score/rescore requires Reasonator to have the new_model_sha already loaded. If Reasonator is running with a different SHA, it returns 409 Conflict. The re-scoring sweep job must coordinate a model reload before calling this endpoint.


GET /health

Purpose: Liveness + readiness check. Used by keep-alive cron and deploy health gate.

Response 200:

{
  "status": "ok",
  "model_loaded": true,
  "model_sha": "abc123def456",
  "model_name": "ProsusAI/finbert",
  "queue_depth": {
    "pro_plus": 0,
    "pro": 3
  },
  "uptime_seconds": 3600,
  "version": "1.0.0"
}

Response 503 (model not loaded):

{
  "status": "warming_up",
  "model_loaded": false,
  "model_sha": null,
  "eta_seconds": 30
}

The deploy health check must tolerate a 503 with status: "warming_up" during the first 120 seconds after dyno start (model download + load time). After 120 seconds, a 503 is a genuine failure.
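
A sketch of a deploy gate that honors the 120-second warm-up allowance (the budget matches the note above; everything else is illustrative):

import time
import requests

def deploy_health_gate(base_url, headers, warmup_budget=120, poll=5):
    """Pass once /health returns 200; tolerate warming_up 503s within budget."""
    start = time.monotonic()
    while True:
        resp = requests.get(f"{base_url}/v1/health", headers=headers)
        if resp.status_code == 200:
            return resp.json()
        body = resp.json()
        warming = resp.status_code == 503 and body.get("status") == "warming_up"
        if not warming or time.monotonic() - start > warmup_budget:
            raise RuntimeError(f"health gate failed: {resp.status_code} {body}")
        time.sleep(poll)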


Rate Limit Headers

All responses include:

X-Reasonator-RateLimit-Tier: pro_plus
X-Reasonator-RateLimit-Limit: 10
X-Reasonator-RateLimit-Remaining: 7
X-Reasonator-RateLimit-Reset: 1746799320
X-Reasonator-Request-ID: <echoed from request>

On 429 Too Many Requests:

{
  "error": "rate_limit_exceeded",
  "retry_after_seconds": 12
}
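
One way a client might honor the 429 payload (sketch; the backoff policy and retry budget are the caller's choice):

import time
import requests

def post_with_backoff(url, headers, payload, max_retries=3):
    """POST, sleeping out rate-limit windows using retry_after_seconds."""
    resp = requests.post(url, headers=headers, json=payload)
    for _ in range(max_retries):
        if resp.status_code != 429:
            break
        time.sleep(resp.json().get("retry_after_seconds", 10))
        resp = requests.post(url, headers=headers, json=payload)
    return resp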

Error Code Reference

HTTP status   error field           Meaning
400           validation_error      Missing or invalid request fields
400           batch_too_large       Batch exceeds max size for this endpoint
400           headline_too_long     One or more headlines exceed 512 chars
401           unauthorized          Missing or invalid Bearer token
404           job_not_found         Batch job ID not found or expired
409           model_sha_mismatch    Requested model SHA != loaded model SHA
429           rate_limit_exceeded   Tier rate limit exceeded
500           internal_error        Unexpected error — check Sentry
503           model_warming_up      Model not yet loaded (startup)
503           scorer_unavailable    Scoring pipeline error (not a transient warm-up)

Versioning

The API is versioned via the path prefix /v1/ — omitted in this document for clarity but required in production. Example: POST /v1/score. The v1 prefix is set in the Reasonator Flask app factory via a blueprint prefix.
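
A minimal sketch of the versioned blueprint wiring (names are illustrative; the actual app factory lives in the Reasonator repo):

from flask import Blueprint, Flask, jsonify

v1 = Blueprint("v1", __name__, url_prefix="/v1")  # version prefix lives here

@v1.route("/health")
def health():
    return jsonify({"status": "ok"})  # illustrative stub

def create_app() -> Flask:
    app = Flask(__name__)
    app.register_blueprint(v1)  # all v1 routes mount under /v1/...
    return app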

Breaking changes require a new version prefix. The old version remains available for a documented deprecation window (minimum 30 days).