Raxx · internal docs

CI Health Gate Runbook

This runbook covers triage for Sprint readiness CI failures in:

What the gate currently checks

  1. scripts/ci/run_smoke.sh
     - backend_v2 integration smoke tests
     - frontend integration smoke test
  2. backend_v2/tests/integration/cli_hooks_smoke.py
     - placeholder synthetic checks for upcoming CLI doctor and data commands
  3. scripts/ci/validate_chart_exports.py (non-blocking in current wave)
     - CSV: header/schema + row quality + fixture content diff checks
     - PDF: header/EOF/page-marker integrity checks when artifact exists
     - JPEG: SOI/EOI/dimension integrity checks when artifact exists
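
The PDF and JPEG integrity checks listed above can be illustrated with a minimal sketch. This is not the content of scripts/ci/validate_chart_exports.py; the function names are hypothetical, and the page-marker and dimension checks are left out. It only shows the header/EOF and SOI/EOI idea, and skips artifacts that do not exist.

```python
from pathlib import Path

JPEG_SOI = b"\xff\xd8"  # Start Of Image marker
JPEG_EOI = b"\xff\xd9"  # End Of Image marker


def check_pdf_bytes(data: bytes) -> list[str]:
    """Return a list of problems found in PDF bytes (empty if none)."""
    problems = []
    if not data.startswith(b"%PDF-"):
        problems.append("missing %PDF- header")
    # The %%EOF marker is expected near the end of the file.
    if b"%%EOF" not in data[-1024:]:
        problems.append("missing %%EOF trailer")
    return problems


def check_jpeg_bytes(data: bytes) -> list[str]:
    """Return a list of problems found in JPEG bytes (empty if none)."""
    problems = []
    if not data.startswith(JPEG_SOI):
        problems.append("missing SOI marker")
    if not data.endswith(JPEG_EOI):
        problems.append("missing EOI marker")
    return problems


def check_artifact(path: Path) -> list[str]:
    """Check one artifact; a missing file is skipped rather than failed."""
    if not path.exists():
        return []
    data = path.read_bytes()
    suffix = path.suffix.lower()
    if suffix == ".pdf":
        return check_pdf_bytes(data)
    if suffix in (".jpg", ".jpeg"):
        return check_jpeg_bytes(data)
    return [f"unsupported artifact type: {path.suffix}"]
```

An empty list means the artifact passed or was absent. Because this step is non-blocking in the current wave, a caller would presumably log any reported problems without failing the gate.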

Fast triage flow

  1. Open the failed GitHub Actions run URL from the PR comment.
  2. Identify which step failed:
     - Run health gate checks
     - Slack Notify
  3. Reproduce locally from repo root:
     - bash scripts/ci/run_health_gate.sh
  4. Fix and rerun only the failing scope first, then rerun the full gate.
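
When reproducing locally, it can help to capture the exit code and only the tail of the output, since the failing check is usually reported last. The helper below is a hypothetical convenience wrapper, not part of the repository; only the command in step 3 comes from this runbook.

```python
import subprocess


def run_step(cmd: list[str], tail_lines: int = 20) -> tuple[int, str]:
    """Run a command; return its exit code and the last lines of combined output."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so ordering matches the CI log
        text=True,
    )
    tail = "\n".join(result.stdout.splitlines()[-tail_lines:])
    return result.returncode, tail
```

From the repo root, `run_step(["bash", "scripts/ci/run_health_gate.sh"])` would return the gate's exit code together with its last 20 lines of output. The same call works for a narrower script when rerunning only the failing scope.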

Common failure patterns

Secret-safe behavior

Incremental hardening backlog (next step to production-grade)

  1. Replace placeholder CLI hooks with real doctor command checks.
  2. Add explicit synthetic probes for critical data/trading contracts with retry policy.
  3. Add actionlint or equivalent workflow lint enforcement in CI.
  4. Add alert routing/escalation policy for repeated gate failures.
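
For backlog item 2, one possible shape for a retry policy is a bounded retry loop with exponential backoff. This is a sketch only: `probe_with_retry`, its parameters, and the choice of backoff are assumptions for illustration, not a decided design.

```python
import time
from typing import Callable


def probe_with_retry(
    probe: Callable[[], bool],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe until it returns True or attempts run out.

    Waits base_delay * 2**n seconds after the n-th failed attempt,
    with no wait after the final attempt.
    """
    for attempt in range(attempts):
        try:
            if probe():
                return True
        except Exception:
            # An exception counts as a failed attempt in this sketch.
            pass
        if attempt < attempts - 1:
            sleep(base_delay * (2 ** attempt))
    return False
```

Passing `sleep` as a parameter keeps the policy testable without real delays. A probe that still fails after the last attempt returns False, which the gate could treat as a hard failure.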