DET-DATA-001 — audit log gap window
Rule ID: DET-DATA-001
Title: Gap > 5 minutes in audit-event write cadence during business hours
Category: data
Last validated: 2026-06-04 (initial catalog, dormant — needs post-launch baseline)
State: dormant — placeholder threshold active (30 min) pre-launch; tighten to 5 min once 7d of real-customer cadence baseline exists
Telemetry source
- Postgres tables: customer_audit_events (Raptor-side) and console_audit_events (Console-side).
- Writers: backend_v2/api/services/customer_audit_writer_service.py, console/app/services/customer_audit.py, console/app/services/audit.py.
- Query: SELECT max(created_at) FROM customer_audit_events; and the analogous query for the console table. Compare the result to now() minus baseline-expected-gap.
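The comparison step can be sketched in Python as below. This is a minimal illustration, not the rule's actual implementation: the function names write_gap and gap_exceeded are hypothetical, and the sketch assumes the caller has already fetched max(created_at) for a table as a timezone-aware UTC datetime.

```python
from datetime import datetime, timedelta, timezone


def write_gap(last_created_at: datetime, now: datetime) -> timedelta:
    """Return how long ago the most recent audit row was written.

    Both arguments must be timezone-aware; last_created_at is assumed to be
    the result of SELECT max(created_at) for one audit table.
    """
    if last_created_at.tzinfo is None or now.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware")
    # Clamp at zero so small clock skew never yields a negative gap.
    return max(now - last_created_at, timedelta(0))


def gap_exceeded(last_created_at: datetime, now: datetime,
                 expected_gap: timedelta) -> bool:
    """True when the last write is older than now minus the expected gap."""
    return write_gap(last_created_at, now) > expected_gap


if __name__ == "__main__":
    now = datetime(2026, 6, 4, 16, 0, tzinfo=timezone.utc)
    last = now - timedelta(minutes=47)
    print(write_gap(last, now), gap_exceeded(last, now, timedelta(minutes=30)))
```

The same check would run once per table, since either writer can stall independently.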
Statistical method + baseline window
- Method: time-gap detection on max(created_at) timestamp per table.
- Baseline window: 7 days, time-of-day-aware. Expected normal-cadence gap is hour-of-day-dependent (off-hours much sparser).
- Fire condition: observed gap exceeds 99.9th percentile of historical hour-of-day gap distribution, AND observed gap > 5 min absolute floor.
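A rough sketch of the fire condition follows, assuming the 7-day baseline is available as a list of write timestamps. The helper names are hypothetical, the nearest-rank percentile is one of several reasonable definitions, and staying silent when an hour has no baseline samples is an assumption rather than documented behavior.

```python
import math
from collections import defaultdict
from datetime import datetime, timedelta, timezone

ABSOLUTE_FLOOR = timedelta(minutes=5)  # absolute floor from the fire condition


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest sample with >= pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def hourly_gap_baseline(write_times):
    """Group inter-write gaps by the UTC hour in which each gap started."""
    ordered = sorted(write_times)
    gaps = defaultdict(list)
    for earlier, later in zip(ordered, ordered[1:]):
        gaps[earlier.hour].append(later - earlier)
    return gaps


def should_fire(observed_gap, hour, gaps_by_hour, pct=99.9):
    """Fire only if the gap beats both the hourly percentile and the floor."""
    samples = gaps_by_hour.get(hour)
    if not samples:
        return False  # assumption: no baseline for this hour means no alert
    threshold = percentile_nearest_rank(samples, pct)
    return observed_gap > threshold and observed_gap > ABSOLUTE_FLOOR
```

Requiring both conditions means a very busy hour, where the 99.9th percentile gap might be seconds, still cannot fire below the 5-minute floor.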
Threshold + expected FP rate
- Pre-launch placeholder: gap > 30 min during 12:00–22:30 UTC weekdays. Off-hours: gap > 4 hours.
- Post-launch tightening: dynamic 99.9th percentile with 5-min absolute floor during 12:00–22:30 UTC weekdays.
- Expected FP rate (post-launch): < 1 per month. Genuine quiet windows do occur (overnight, no operator activity, no customer activity); the time-of-day baseline accounts for these, so they are not flagged.
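The pre-launch placeholder thresholds can be expressed as a small lookup. This is a sketch under stated assumptions: the function names are hypothetical, and treating 22:30 UTC as an exclusive end of the business window is a guess, since the rule does not say which side the boundary falls on.

```python
from datetime import datetime, time, timedelta, timezone

BUSINESS_START = time(12, 0)   # 12:00 UTC
BUSINESS_END = time(22, 30)    # 22:30 UTC, treated as exclusive (assumption)


def is_business_hours(ts: datetime) -> bool:
    """True for Monday to Friday between 12:00 and 22:30 UTC."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    utc = ts.astimezone(timezone.utc)
    return utc.weekday() < 5 and BUSINESS_START <= utc.time() < BUSINESS_END


def placeholder_threshold(ts: datetime) -> timedelta:
    """Pre-launch gap threshold: 30 min in business hours, 4 hours otherwise."""
    return timedelta(minutes=30) if is_business_hours(ts) else timedelta(hours=4)
```

Once the 7-day baseline exists, the business-hours branch would be replaced by the dynamic percentile with the 5-minute floor.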
Alert route
- CRITICAL (gap > 60 min during business hours): #raxx-ops-alert-sev1. Either the audit writer has failed (data-integrity SEV1) or an attacker is intentionally pausing the log (security SEV1).
- HIGH (gap 30–60 min business hours): #raxx-ops-alert-sev2-5 / #raxx-ops-alert-sev2.
- MEDIUM (gap 5–30 min business hours, post-baseline): ops@ daily digest.
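The severity bands above could be mapped as in the sketch below. The function name severity_for_gap and the post_baseline flag are hypothetical, the caller is assumed to have already confirmed the gap falls in business hours, and the handling of exact boundaries (30 and 60 minutes) is an assumption because the rule leaves them open.

```python
from datetime import timedelta
from typing import Optional


def severity_for_gap(gap: timedelta, post_baseline: bool = False) -> Optional[str]:
    """Map a business-hours gap to a severity label, or None if no alert.

    Assumed boundaries: exactly 60 min is HIGH, exactly 30 min is MEDIUM.
    """
    if gap > timedelta(minutes=60):
        return "CRITICAL"   # routed to the SEV1 channel
    if gap > timedelta(minutes=30):
        return "HIGH"       # routed to the SEV2 channels
    if post_baseline and gap > timedelta(minutes=5):
        return "MEDIUM"     # daily digest, only once the baseline exists
    return None


if __name__ == "__main__":
    print(severity_for_gap(timedelta(minutes=47)))  # HIGH
```

The 47-minute gap used in the test fixture lands in the HIGH band under these assumptions.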
Escalation owner
- sre-agent — first responder. Possible writer outage, DB connection pool exhaustion, replica lag.
- security-agent if sre-agent rules out an operational cause AND there are no operator-action audit rows in the gap window despite known operator activity (which suggests an adversary is suppressing the log).
Test fixture / synthetic positive
See _fixtures/audit_log_gap_window_positive.json for a synthetic detection-run state showing a 47-minute gap at 16:00 UTC on a weekday.
What to do when this fires
- Confirm operator was active in the gap window (Slack/git activity, recent deploy). Active operator + no audit rows = writer-side bug or DB-side issue.
- Check Heroku Postgres status, recent migrations, and writer-service Sentry events for the same window.
- If sre-agent confirms writer outage: dispatch sre-agent for remediation; tag fire as confirmed-operational after fix.
- If no operational cause: dispatch security-agent SEV1 — hash chain (DET-DATA-002) becomes the next check; if hash chain is intact across the gap, the gap is benign (no writes occurred); if hash chain is broken, the gap is adversarial.
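The decision flow in the steps above can be summarized as a small function. This is only an illustrative sketch: the Outcome enum and the triage function are hypothetical, and of the three labels only confirmed-operational is a tag named in this rule; the other two are placeholders for the benign and adversarial conclusions.

```python
from enum import Enum


class Outcome(Enum):
    CONFIRMED_OPERATIONAL = "confirmed-operational"  # tag named in the rule
    BENIGN = "benign"              # hypothetical label
    ADVERSARIAL = "adversarial"    # hypothetical label


def triage(operational_cause_found: bool, hash_chain_intact: bool) -> Outcome:
    """Apply the runbook order: operational cause first, then the hash chain."""
    if operational_cause_found:
        return Outcome.CONFIRMED_OPERATIONAL
    # No operational cause: the DET-DATA-002 hash chain decides.
    return Outcome.BENIGN if hash_chain_intact else Outcome.ADVERSARIAL
```

The hash-chain result only matters once an operational cause has been ruled out, which mirrors the hand-off from sre-agent to security-agent.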
What NOT to do
- Do not assume a quiet gap is benign because nothing else is firing. Audit-log silence during expected activity is itself the signal.
- Do not extend the gap threshold "to reduce noise". The rule tightens to 5 min post-baseline by design, because audit-log integrity is non-negotiable.
- Do not bypass DET-DATA-002 (hash chain) when DET-DATA-001 fires. Both run; both contribute to the diagnosis.