Raxx · internal docs

internal · gated

DET-OPS-002 — Postgres p99 latency drift

Rule ID: DET-OPS-002
Title: Postgres p99 query latency exceeding 2× the 7-day baseline per statement
Category: ops
Last validated: 2026-06-04 (initial catalog, dormant)
State: dormant (requires pg_stat_statements extension verified active on raxx-api-prod Postgres)

Telemetry source

Statistical method + baseline window

Threshold + expected FP rate

Alert route

Escalation owner

Test fixture / synthetic positive

See _fixtures/postgres_p99_drift_positive.json for a synthetic pg_stat_statements snapshot in which the SELECT FROM strategies WHERE user_id = ? statement shows a mean of 184 ms today against a baseline of μ = 42 ms, roughly 4.4× the baseline.
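
The comparison this fixture is meant to trigger can be sketched as below. This is a minimal illustration, not the rule's actual implementation: the `StatementSample` type, its field names, and the `drifted` helper are hypothetical, and the real JSON schema of the fixture is not shown in this document. Only the 2× factor comes from the rule title.

```python
from dataclasses import dataclass

# Taken from the rule title: alert when latency exceeds 2x the 7-day baseline.
DRIFT_FACTOR = 2.0


@dataclass(frozen=True)
class StatementSample:
    """One statement's latency today and its baseline (hypothetical shape)."""
    query: str          # normalized statement text
    current_ms: float   # latency observed today
    baseline_ms: float  # 7-day baseline for the same statement


def drifted(samples, factor=DRIFT_FACTOR):
    """Return (query, ratio) pairs whose latency exceeds factor x baseline, worst first."""
    hits = []
    for s in samples:
        if s.baseline_ms <= 0:
            # No usable baseline: skip instead of dividing by zero.
            continue
        ratio = s.current_ms / s.baseline_ms
        if ratio > factor:
            hits.append((s.query, ratio))
    return sorted(hits, key=lambda h: h[1], reverse=True)
```

Fed the fixture's numbers (184 ms today, 42 ms baseline), the statement is flagged with a ratio of about 4.4. A statement at exactly 2× is not flagged, because the rule says "exceeding".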

What to do when this fires

  1. Identify the statement(s), then look for a likely cause: a recent deploy that changed the query, a new feature flag that altered the working set, or a missing index on a new column.
  2. Run EXPLAIN ANALYZE against the live statement to check whether the plan changed. Note that EXPLAIN ANALYZE actually executes the statement.
  3. If a plan flip explains the degradation: dispatch sre-agent for the appropriate remedy (index, query rewrite, table tuning).
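  4. Re-baseline pg_stat_statements after remediation by running SELECT pg_stat_statements_reset(); and mark the next baseline window's first day as re-baselined in _log/. Called with no arguments, the reset discards statistics for every statement, not only the remediated one.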
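
For steps 2 and 3, one way to make "the plan changed" concrete is to compare the node structure of two plans captured with EXPLAIN (FORMAT JSON). The sketch below assumes a previously saved baseline plan exists, which this document does not state. The helper names `plan_shape` and `plan_flipped` are hypothetical. It relies only on the "Plan", "Node Type", "Relation Name", and "Plans" keys that Postgres emits in JSON plans.

```python
import json


def plan_shape(explain_json):
    """Flatten EXPLAIN (FORMAT JSON) output into (depth, node type, relation) tuples."""
    # Accept either the raw JSON text or an already-parsed structure.
    doc = json.loads(explain_json) if isinstance(explain_json, str) else explain_json
    shape = []

    def walk(node, depth):
        shape.append((depth, node["Node Type"], node.get("Relation Name")))
        for child in node.get("Plans", []):
            walk(child, depth + 1)

    # The JSON format is a one-element list whose item holds the root "Plan".
    walk(doc[0]["Plan"], 0)
    return shape


def plan_flipped(baseline_json, current_json):
    """True when the two plans differ in node types, nesting, or relations."""
    return plan_shape(baseline_json) != plan_shape(current_json)
```

Comparing only the plan shape ignores cost and timing fields, which vary between runs even when the plan is unchanged. A switch from an Index Scan to a Seq Scan on the same relation would register as a flip.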

What NOT to do