Session-bootstrap runbook

System: operator shell + agent shell credential bootstrapping
Owner: operator (Kristerpher)
Issues: #680, #2330
Script: scripts/ops/session-bootstrap.sh
Cron companion: scripts/ops/session-bootstrap-cron.sh
Last incident: 2026-05-17 (stale Infisical machine identity + expired CF_WAF_EDIT_RAXX_APP blocked WAF deploy; #2330)
Last reviewed: 2026-05-18


Purpose

Every fresh operator or agent shell needs six credentials to do programmatic ops: the Infisical machine identity, three Cloudflare API tokens, a Heroku API key, and a GitHub PAT. These credentials are sourced from the operator's local environment (Keychain or shell exports). They drift silently — CF tokens expire, Heroku authorizations get revoked, GitHub PATs hit their TTL, and Infisical machine identity secrets can be rotated out from under an active session.

This runbook covers:

  1. When to run the bootstrap
  2. One-time Keychain setup (store credentials once, bootstrap reads them forever)
  3. What each PASS/FAIL/SKIP means and how to fix it
  4. Rotating each credential

When to run

| Trigger | Command |
|---|---|
| Fresh terminal session before ops work | source scripts/ops/session-bootstrap.sh |
| Agent just failed with "API token invalid" or "401" | bash scripts/ops/session-bootstrap.sh (diagnose first) |
| After rotating any credential | bash scripts/ops/session-bootstrap.sh (verify the new value works) |
| Daily proactive check (automated via cron) | bash scripts/ops/session-bootstrap-cron.sh |

Use source when you want the credentials exported into your current shell. Use bash when you just want a PASS/FAIL diagnostic without changing your env.
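
The difference comes down to process scope: bash runs the script in a child process whose exports vanish when it exits, while source runs it in the current shell. A minimal demonstration, using a throwaway script and a hypothetical variable name DEMO_TOKEN (neither is part of the real bootstrap):

```shell
# Throwaway stand-in for the bootstrap; DEMO_TOKEN is a hypothetical name.
demo_script="$(mktemp)"
printf 'export DEMO_TOKEN="from-script"\n' > "$demo_script"

unset DEMO_TOKEN

# Child process: the export is lost when bash exits.
bash "$demo_script"
after_bash="${DEMO_TOKEN:-unset}"

# Current shell: the export persists.
. "$demo_script"          # "." is the portable spelling of "source"
after_source="${DEMO_TOKEN:-unset}"

echo "after bash:   $after_bash"
echo "after source: $after_source"
rm -f "$demo_script"
```

The first line reports unset and the second reports from-script, which is why a diagnostic run with bash leaves your environment untouched.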


Layered auth picture

  Agent / operator request
       │
       ▼
  Cloudflare Access edge  ← CF_ACCESS_CLIENT_ID + CF_ACCESS_CLIENT_SECRET
       │                    (GH Actions secrets — not in scope for this script)
       ▼
  Infisical vault          ← INFISICAL_CLIENT_ID + CLIENT_SECRET   ← verified by this script
       │                    (machine identity universal-auth login)
       ▼
  Vault entries:
    CLOUDFLARE_ACCESS_MGMT_TOKEN   ← verified by this script
    CF_WORKER_DEPLOY               ← verified by this script
    CF_WAF_EDIT_RAXX_APP           ← verified by this script
    HEROKU_API_KEY                 ← verified by this script
    GITHUB_API_SECRETS_TOKEN       ← verified by this script
       │
       ▼
  Downstream APIs (CF DNS/Workers, Heroku, GitHub)

CF Access is the FIRST gate. If it's broken, everything behind it is unreachable. The CF Access gate creds (CF_ACCESS_CLIENT_ID/SECRET) are provisioned separately (GH Actions secrets + Keychain) and are not validated by this script — they're infrastructure-level, not op-level.


Credentials verified

| Credential | Purpose | Validation endpoint |
|---|---|---|
| INFISICAL_CLIENT_ID/SECRET | Infisical machine identity (gates all vault reads) | POST /api/v1/auth/universal-auth/login (HTTP 200 = valid) |
| CLOUDFLARE_ACCESS_MGMT_TOKEN | CF management (Access, DNS edit, accounts) | GET /client/v4/user/tokens/verify |
| CF_WORKER_DEPLOY | Workers + Routes deploy (Zone:Workers Routes:Edit scope) | GET /client/v4/user/tokens/verify |
| CF_WAF_EDIT_RAXX_APP | WAF rules edit on raxx.app zone (WAF:Edit, Zone:Read) | GET /client/v4/user/tokens/verify |
| HEROKU_API_KEY | Heroku Platform API (apps, config vars, dynos) | GET /account |
| GITHUB_API_SECRETS_TOKEN | GitHub ops API (secrets, Actions, repos) | GET /user |

All validation calls are read-only. No write operations are performed.
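
As an illustration of what one of these read-only checks might look like, here is a sketch for the Cloudflare token verify endpoint. The function names cf_body_ok and check_cf_token are hypothetical, and matching on the response text instead of parsing JSON is an assumption made to avoid a jq dependency; the real script may do this differently.

```shell
# Returns 0 if a Cloudflare API response body reports success, 1 otherwise.
# Text match instead of JSON parsing: an assumption, not the script's method.
cf_body_ok() {
  printf '%s' "$1" | grep -Eq '"success"[[:space:]]*:[[:space:]]*true'
}

# Read-only token check (GET only). Defined but not invoked here, because
# it needs network access. Returns 2 when curl gets no usable response.
check_cf_token() {
  local body
  body="$(curl -sS --max-time 10 \
    -H "Authorization: Bearer $1" \
    "https://api.cloudflare.com/client/v4/user/tokens/verify")" || return 2
  cf_body_ok "$body"
}
```

Usage would be along the lines of check_cf_token "$CF_WORKER_DEPLOY" && echo PASS. Only a GET is issued, so nothing on the Cloudflare side changes.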

CF_WAF_EDIT_RAXX_APP is sourced from vault (/MooseQuest/cloudflare/, env prod) or from the macOS Keychain (raxx-cf-waf-edit-raxx-app). If both are absent the check FAILs. The companion secret CF_WAF_EDIT_RAXX_APP__EXPIRES_AT is printed alongside the PASS line when present, giving advance notice of upcoming expiry.


One-time Keychain setup

Run once per machine to store credentials in the macOS Keychain. The bootstrap script reads from Keychain automatically, so you don't need to export vars manually each session.

# Replace <VALUE> with the actual credential value each time.
# -T /bin/bash allows unattended reads without Touch ID prompt.
# -U updates if the entry already exists.

# Infisical machine identity creds come from vault — export them directly
# rather than storing in Keychain (they are env vars, not secrets here).
# Set in ~/.zshenv:
#   export INFISICAL_CLIENT_ID="<value from vault /MooseQuest/infisical/>"
#   export INFISICAL_CLIENT_SECRET="<value from vault /MooseQuest/infisical/>"

security add-generic-password \
  -s "raxx-cf-access-mgmt-token" \
  -a "claude-bootstrap" \
  -w "<CLOUDFLARE_ACCESS_MGMT_TOKEN value>" \
  -T /bin/bash -U

security add-generic-password \
  -s "raxx-cf-worker-deploy-token" \
  -a "claude-bootstrap" \
  -w "<CF_WORKER_DEPLOY value>" \
  -T /bin/bash -U

security add-generic-password \
  -s "raxx-cf-waf-edit-raxx-app" \
  -a "claude-bootstrap" \
  -w "<CF_WAF_EDIT_RAXX_APP value from vault /MooseQuest/cloudflare/>" \
  -T /bin/bash -U

security add-generic-password \
  -s "raxx-heroku-api-key" \
  -a "claude-bootstrap" \
  -w "<HEROKU_API_KEY value>" \
  -T /bin/bash -U

security add-generic-password \
  -s "raxx-github-api-secrets-token" \
  -a "claude-bootstrap" \
  -w "<GITHUB_API_SECRETS_TOKEN value>" \
  -T /bin/bash -U

After storing, verify the bootstrap reads them:

# Start a clean subshell with no credential env vars to test Keychain reads.
env -i HOME="${HOME}" PATH="${PATH}" bash scripts/ops/session-bootstrap.sh
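
To spot-check a single entry, or to see roughly how a lookup with a Keychain fallback could work, the sketch below may help. lookup_credential is a hypothetical helper, and the precedence (exported variable first, Keychain second) is an assumption; this runbook only states that both sources are consulted.

```shell
# Resolve a credential: exported env var first, then the macOS Keychain.
# $1 = variable name, $2 = Keychain service name (as used in the setup above).
# The env-before-Keychain order is an assumption.
lookup_credential() {
  local value
  # printenv only sees exported variables and fails if the name is unset.
  if value="$(printenv "$1")" && [ -n "$value" ]; then
    printf '%s' "$value"
    return 0
  fi
  # The security tool exists only on macOS; elsewhere report "not found".
  command -v security >/dev/null 2>&1 || return 1
  security find-generic-password -s "$2" -a "claude-bootstrap" -w 2>/dev/null
}
```

For example, lookup_credential HEROKU_API_KEY raxx-heroku-api-key >/dev/null && echo found confirms an entry is readable without printing its value.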

Security note on -T /bin/bash

-T /bin/bash means any bash process on this Mac can read these Keychain entries without a Touch ID prompt. This is acceptable for the credentials covered here (all rotatable, none are root/private keys). If you prefer the Touch ID friction, omit -T /bin/bash — the bootstrap will prompt for Keychain access each time it reads a credential.


Shell RC integration (optional)

To automatically bootstrap on every new terminal:

# Add to ~/.zshrc (interactive sessions only):
if [[ -f "${HOME}/repo/TradeMasterAPI/scripts/ops/session-bootstrap.sh" ]]; then
  source "${HOME}/repo/TradeMasterAPI/scripts/ops/session-bootstrap.sh"
fi

Or for a quieter approach that only alerts on failure:

# Only print output if bootstrap fails:
if ! bash "${HOME}/repo/TradeMasterAPI/scripts/ops/session-bootstrap.sh" >/dev/null 2>&1; then
  echo "[session-bootstrap] FAIL — run: bash scripts/ops/session-bootstrap.sh"
fi

What PASS/FAIL/SKIP means

[PASS] — credential valid

The validation endpoint returned the expected success response. The credential is active and usable for the current session.

[FAIL] — credential invalid or missing

The validation endpoint returned an error (401, 403, CF success:false) or the credential was not found in either the environment or the Keychain.

Each FAIL line includes a rotate via: pointer:

| Credential | Rotation doc |
|---|---|
| INFISICAL_CLIENT_ID/SECRET | docs/ops/runbooks/rotation/infisical-service-token.md |
| CLOUDFLARE_ACCESS_MGMT_TOKEN | docs/ops/runbooks/rotation/cloudflare-user-api-token.md |
| CF_WORKER_DEPLOY | docs/ops/runbooks/rotation/cloudflare-user-api-token.md + #1093 (scope: Zone:Workers Routes:Edit) |
| CF_WAF_EDIT_RAXX_APP | docs/ops/runbooks/cloudflare-tokens.md (Failure mode B) |
| HEROKU_API_KEY | docs/ops/runbooks/rotation/heroku-platform-token.md |
| GITHUB_API_SECRETS_TOKEN | docs/ops/runbooks/rotation/github-pat.md |

After rotating, store the new value in Keychain (see "One-time Keychain setup" above, using -U to update). Then re-run the bootstrap to confirm PASS.

[SKIP] — check could not run

The prerequisite for this check was not met:

| SKIP reason | Meaning | Action |
|---|---|---|
| infisical CLI not installed | infisical binary not in PATH | Install: brew install infisical/get-cli/infisical. The machine identity HTTP check still runs; only the CLI-version line is skipped. |

SKIP does not cause the bootstrap to exit non-zero. Only FAIL does.
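
The exit-status rule can be expressed in a few lines. This is a sketch of the rule as stated, not the script's actual code; bootstrap_exit_status is a hypothetical name.

```shell
# Given one status word per check, return 1 if any is FAIL, else 0.
# PASS and SKIP never affect the exit status.
bootstrap_exit_status() {
  local status
  for status in "$@"; do
    [ "$status" = "FAIL" ] && return 1
  done
  return 0
}
```

So a run that yields PASS, SKIP, PASS exits 0, while a single FAIL anywhere in the list makes the whole run exit 1. That is what lets callers gate on the script with a plain if.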

Note: as of #2330, missing INFISICAL_CLIENT_ID/SECRET is a FAIL, not a SKIP. The machine identity check is now a first-class credential — it must pass for vault reads to work.


Failure modes

Failure mode A: Cloudflare token expired or revoked

Symptom:

[FAIL] CLOUDFLARE_ACCESS_MGMT_TOKEN    (redacted cfto…cdef) — CF returned success:false
       rotate via: docs/ops/runbooks/rotation/cloudflare-user-api-token.md

Cause: The Cloudflare user API token has expired (if expires_at was set at creation) or was revoked (manual revocation or the PUT /value roll ran but the new value wasn't stored). The old secret is invalidated immediately on roll.

Fix:

  1. Follow docs/ops/runbooks/rotation/cloudflare-user-api-token.md.
  2. For CF_WORKER_DEPLOY: ensure the new token has scope Zone:Workers Routes:Edit (see #1093 for the scope confusion incident).
  3. Store the new value in Keychain. The command below targets CLOUDFLARE_ACCESS_MGMT_TOKEN; for CF_WORKER_DEPLOY use the service name raxx-cf-worker-deploy-token instead:

     security add-generic-password -s "raxx-cf-access-mgmt-token" \
       -a "claude-bootstrap" -w "<NEW_VALUE>" -T /bin/bash -U

  4. Re-run bootstrap to confirm PASS.

Failure mode B: Heroku authorization revoked

Symptom:

[FAIL] HEROKU_API_KEY    (redacted hero…1234) — HTTP 401 (invalid or revoked)
       rotate via: docs/ops/runbooks/rotation/heroku-platform-token.md

Cause: The Heroku authorization was explicitly revoked (via heroku authorizations:revoke) or it expired (non-global scope tokens have optional expiry). Long-lived direct-authorization tokens do not auto-expire; for those, revocation is always explicit.

Fix:

  1. Follow docs/ops/runbooks/rotation/heroku-platform-token.md.
  2. Update Keychain:

     security add-generic-password -s "raxx-heroku-api-key" \
       -a "claude-bootstrap" -w "<NEW_VALUE>" -T /bin/bash -U

  3. Re-run bootstrap.

Failure mode C: GitHub PAT expired

Symptom:

[FAIL] GITHUB_API_SECRETS_TOKEN    (redacted gith…cdef) — HTTP 401 (invalid or expired)
       rotate via: docs/ops/runbooks/rotation/github-pat.md

Cause: Fine-grained PATs have mandatory expiry (max 1 year, we use 90 days). Classic PATs do not auto-expire but can be revoked.

Fix:

  1. Follow docs/ops/runbooks/rotation/github-pat.md.
  2. Update Keychain:

     security add-generic-password -s "raxx-github-api-secrets-token" \
       -a "claude-bootstrap" -w "<NEW_VALUE>" -T /bin/bash -U

  3. Re-run bootstrap.

Failure mode D: credential not in env or Keychain

Symptom:

[FAIL] HEROKU_API_KEY    (not in env or Keychain)
       rotate via: docs/ops/runbooks/rotation/heroku-platform-token.md

Cause: The credential is not exported in the environment and has no Keychain entry on this machine. Either this is a fresh machine where the one-time setup was never run, or the Keychain was reset or migrated.

Fix: Run the one-time Keychain setup for the missing credential (see above).

Failure mode E: Infisical machine identity stale or missing

Symptom:

[FAIL] INFISICAL_CLIENT_ID/SECRET         HTTP 401 — credentials rejected
       rotate via: docs/ops/runbooks/rotation/infisical-service-token.md

or

[FAIL] INFISICAL_CLIENT_ID/SECRET         (not set in environment)
       rotate via: docs/ops/runbooks/rotation/infisical-service-token.md

Cause: The INFISICAL_CLIENT_ID/SECRET in the current env are stale or were rotated without updating the local env. Every vault read (including pulling CF_WAF_EDIT_RAXX_APP and other op credentials) will fail if this check fails.

Fix:

  1. Check the current values: printf '%s' "${INFISICAL_CLIENT_ID:0:4}…" (partial only; do not print the full value).
  2. Pull fresh values from the Infisical web UI or vault (path: /MooseQuest/infisical/).
  3. Update ~/.zshenv with the new values.
  4. Open a new shell (or source ~/.zshenv) to pick up the new values.
  5. Re-run bootstrap to confirm PASS.

If the machine identity itself needs to be recreated (not just rotated), see: docs/ops/runbooks/rotation/infisical-service-token.md.
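
When comparing a local value against the vault, it helps to print only a fingerprint of the secret, in the same shape as the (redacted cfto…cdef) markers in the FAIL lines. A small sketch, where redact is a hypothetical helper and the choice to fully mask short values is an assumption:

```shell
# Print the first and last four characters of a secret, joined by an ellipsis.
# Values of 8 characters or fewer are fully masked so nothing leaks.
redact() {
  local s="$1"
  if [ "${#s}" -le 8 ]; then
    printf '%s' '(hidden)'
    return 0
  fi
  printf '%s…%s' "$(printf '%s' "$s" | cut -c1-4)" "$(printf '%s' "$s" | tail -c 4)"
}
```

For example, redact "$INFISICAL_CLIENT_SECRET" prints eight characters at most, which is usually enough to tell whether two copies of a credential match.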

Failure mode F: CF_WAF_EDIT_RAXX_APP expired or missing

Symptom:

[FAIL] CF_WAF_EDIT_RAXX_APP               (redacted cfwa…cdef) — CF returned success:false
       rotate via: docs/ops/runbooks/cloudflare-tokens.md (Failure mode B)

or

[FAIL] CF_WAF_EDIT_RAXX_APP               (not in env or Keychain)
       rotate via: docs/ops/runbooks/cloudflare-tokens.md (Failure mode B)

Cause: The CF_WAF_EDIT_RAXX_APP token has expired (90-day rotation cadence), was revoked, or was never populated in the Keychain on this machine.

Fix:

  1. Follow docs/ops/runbooks/cloudflare-tokens.md → Failure mode B.
  2. Store the new value in Keychain:

     security add-generic-password -s "raxx-cf-waf-edit-raxx-app" \
       -a "claude-bootstrap" -w "<NEW_VALUE>" -T /bin/bash -U

  3. Update the CF_WAF_EDIT_RAXX_APP__EXPIRES_AT companion in vault to 90 days from today.
  4. Re-run bootstrap to confirm PASS.
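
Step 3 needs a date 90 days out. The sketch below prints one, assuming the companion secret stores a plain YYYY-MM-DD date in UTC (this runbook does not state the actual format, so check an existing value first). date_in_days is a hypothetical helper name.

```shell
# Print the date N days from today (UTC) as YYYY-MM-DD.
# Tries BSD date syntax (macOS) first, then GNU date syntax (Linux).
# The YYYY-MM-DD output format is an assumption about the vault value.
date_in_days() {
  date -u -v+"$1"d +%Y-%m-%d 2>/dev/null || date -u -d "+$1 days" +%Y-%m-%d
}

date_in_days 90
```

The fallback exists because the two date implementations accept different offset flags; on macOS only the first form runs.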

Failure mode G: network / CF Access gate down

Symptom:

[FAIL] CLOUDFLARE_ACCESS_MGMT_TOKEN    (redacted cfto…cdef) — no response (network/CF down?)

Cause: The CF API is unreachable (network failure, CF outage, or the CF Access gate in front of vault.raxx.app is blocking the request).

Fix:

  1. Check Cloudflare status at https://www.cloudflarestatus.com
  2. Verify your network has internet access: curl -sS https://1.1.1.1 >/dev/null && echo ok
  3. If CF Access is blocking: check that CF_ACCESS_CLIENT_ID/SECRET are valid for the Access application on vault.raxx.app.
  4. If it's an outage, wait and re-run once Cloudflare is healthy.
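
To tell "no response" apart from a rejected credential by hand, curl's reported status code is enough: it prints 000 when no HTTP response arrived at all. The helper names below are hypothetical and the messages only approximate the bootstrap's wording.

```shell
# Map curl's %{http_code} to a verdict.
# curl reports 000 when no HTTP response arrived (DNS, TCP, TLS, timeout).
classify_http_code() {
  case "$1" in
    200)     echo "PASS" ;;
    401|403) echo "FAIL: credential rejected" ;;
    000)     echo "FAIL: no response (network/CF down?)" ;;
    *)       echo "FAIL: unexpected HTTP $1" ;;
  esac
}

# Probe a URL and classify the result. Extra curl arguments (headers, etc.)
# are passed through. Defined but not invoked here; it needs network access.
probe() {
  local code
  code="$(curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$@")" || true
  classify_http_code "${code:-000}"
}
```

Any verdict other than the "no response" one means the request reached a server, which points away from a network outage and toward the credential or the Access gate.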


Daily proactive check (cron)

scripts/ops/session-bootstrap-cron.sh runs the bootstrap silently and sends a Slack DM (channel D0AJ7K184TV) only when a credential fails. Use it for daily early-warning without interactive interruption.
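
The "silent unless it fails" pattern can be sketched as follows. notify_failure is a hypothetical stand-in that writes to stderr; the real companion script sends a Slack DM, and how it does so is not described here.

```shell
# Hypothetical notifier: stands in for the Slack DM sent by the real script.
notify_failure() {
  echo "ALERT: $1" >&2
}

# Run a command with all output discarded; notify only on non-zero exit.
alert_on_failure() {
  if "$@" >/dev/null 2>&1; then
    return 0
  fi
  notify_failure "command failed: $*"
  return 1
}
```

Under these assumptions the cron entry point would reduce to alert_on_failure bash scripts/ops/session-bootstrap.sh, producing no output at all on a healthy day.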

Install via crontab:

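# Run at 08:00 daily, alert on failure only.
# Note: cron evaluates this in the machine's local time zone, which is UTC only if the system clock is set to UTC.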
0 8 * * * /path/to/repo/scripts/ops/session-bootstrap-cron.sh

Or via launchd on macOS (preferred — survives sleep/wake):

<?xml version="1.0" encoding="UTF-8"?>
<!-- ~/Library/LaunchAgents/app.raxx.session-bootstrap.plist -->
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>app.raxx.session-bootstrap</string>
  <key>ProgramArguments</key>
  <array>
    <string>/bin/bash</string>
    <string>/path/to/repo/scripts/ops/session-bootstrap-cron.sh</string>
  </array>
  <key>StartCalendarInterval</key>
  <dict>
    <key>Hour</key>
    <integer>8</integer>
    <key>Minute</key>
    <integer>0</integer>
  </dict>
  <key>StandardOutPath</key>
  <string>/tmp/raxx-session-bootstrap.log</string>
  <key>StandardErrorPath</key>
  <string>/tmp/raxx-session-bootstrap.log</string>
</dict>
</plist>

Load with: launchctl load ~/Library/LaunchAgents/app.raxx.session-bootstrap.plist


Idempotency

The bootstrap is safe to run multiple times:

  - Each run validates from scratch (no cached results used in validation).
  - The only side effect is writing ~/.raxx-session-bootstrap-last-validated with the UTC timestamp of the last successful run.
  - Running twice produces identical output (barring credential state changes).

Override the state file path for testing:

RAXX_BOOTSTRAP_STATE_FILE=/tmp/test-state bash scripts/ops/session-bootstrap.sh
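
For reference, a state-file write that honors this override could look like the sketch below. write_state_file is a hypothetical name and the ISO 8601 timestamp format is an assumption; the runbook only says the file holds a UTC timestamp.

```shell
# Record the time of a successful run.
# Honors RAXX_BOOTSTRAP_STATE_FILE; falls back to the default path otherwise.
# The ISO 8601 format is an assumption about what the real script writes.
write_state_file() {
  local path="${RAXX_BOOTSTRAP_STATE_FILE:-$HOME/.raxx-session-bootstrap-last-validated}"
  date -u +%Y-%m-%dT%H:%M:%SZ > "$path"
}
```

Because the file is overwritten on each successful run, repeated runs leave exactly one line behind, which keeps the side effect idempotent.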

Emergency stop

If you need to kill a stuck credential validation (e.g., curl hanging on a network timeout), the script respects Ctrl-C. Each validation has a --max-time 10 timeout, so the maximum stuck time is 6 × 10s = 60s for all six credentials.

To force-clear all credential exports from the current shell:

unset INFISICAL_CLIENT_ID INFISICAL_CLIENT_SECRET \
      CLOUDFLARE_ACCESS_MGMT_TOKEN CF_WORKER_DEPLOY CF_WAF_EDIT_RAXX_APP \
      HEROKU_API_KEY GITHUB_API_SECRETS_TOKEN

Refs