Session-bootstrap runbook
System: operator shell + agent shell credential bootstrapping
Owner: operator (Kristerpher)
Issues: #680, #2330
Script: scripts/ops/session-bootstrap.sh
Cron companion: scripts/ops/session-bootstrap-cron.sh
Last incident: 2026-05-17 (stale Infisical machine identity + expired CF_WAF_EDIT_RAXX_APP blocked WAF deploy; #2330)
Last reviewed: 2026-05-18
Purpose
Every fresh operator or agent shell needs six credentials to do programmatic ops: the Infisical machine identity, three Cloudflare API tokens, a Heroku API key, and a GitHub PAT. These credentials are sourced from the operator's local environment (Keychain or shell exports). They drift silently — CF tokens expire, Heroku authorizations get revoked, GitHub PATs hit their TTL, and Infisical machine identity secrets can be rotated out from under an active session.
This runbook covers:
- When to run the bootstrap
- One-time Keychain setup (store credentials once; the bootstrap reads them on every run until they are rotated)
- What each PASS/FAIL/SKIP means and how to fix it
- Rotating each credential
When to run
| Trigger | Command |
|---|---|
| Fresh terminal session before ops work | source scripts/ops/session-bootstrap.sh |
| Agent just failed with "API token invalid" or "401" | bash scripts/ops/session-bootstrap.sh (diagnose first) |
| After rotating any credential | bash scripts/ops/session-bootstrap.sh (verify the new value works) |
| Daily proactive check (automated via cron) | bash scripts/ops/session-bootstrap-cron.sh |
Use source when you want the credentials exported into your current shell.
Use bash when you just want a PASS/FAIL diagnostic without changing your env.
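The difference between the two invocations can be seen with a throwaway script. This is a minimal sketch: the temp file and DEMO_TOKEN are hypothetical stand-ins for the bootstrap and its exports, not part of the real script.

```shell
# Hypothetical stand-in for the bootstrap: it exports one variable.
demo_script="$(mktemp)"
printf 'export DEMO_TOKEN="from-bootstrap"\n' > "$demo_script"

# bash runs the script in a child process; the export dies with that process.
unset DEMO_TOKEN
bash "$demo_script"
after_bash="${DEMO_TOKEN:-<unset>}"

# source (spelled "." here for portability) runs it in the current shell,
# so the export persists.
. "$demo_script"
after_source="${DEMO_TOKEN:-<unset>}"

rm -f "$demo_script"
echo "after bash:   $after_bash"
echo "after source: $after_source"
```

The first line reports the variable as unset and the second reports from-bootstrap, which is why the diagnostic form leaves your environment untouched.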
Layered auth picture
Agent / operator request
│
▼
Cloudflare Access edge ← CF_ACCESS_CLIENT_ID + CF_ACCESS_CLIENT_SECRET
│ (GH Actions secrets — not in scope for this script)
▼
Infisical vault ← INFISICAL_CLIENT_ID + CLIENT_SECRET ← verified by this script
│ (machine identity universal-auth login)
▼
Vault entries:
CLOUDFLARE_ACCESS_MGMT_TOKEN ← verified by this script
CF_WORKER_DEPLOY ← verified by this script
CF_WAF_EDIT_RAXX_APP ← verified by this script
HEROKU_API_KEY ← verified by this script
GITHUB_API_SECRETS_TOKEN ← verified by this script
│
▼
Downstream APIs (CF DNS/Workers, Heroku, GitHub)
CF Access is the FIRST gate. If it's broken, everything behind it is
unreachable. The CF Access gate creds (CF_ACCESS_CLIENT_ID/SECRET) are
provisioned separately (GH Actions secrets + Keychain) and are not validated
by this script — they're infrastructure-level, not op-level.
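To make the first two layers concrete, here is a rough sketch of a machine identity login sent through the CF Access gate. CF-Access-Client-Id and CF-Access-Client-Secret are the standard Cloudflare Access service-token headers, and the login path is the one listed under "Credentials verified". The VAULT_URL variable and its default host are assumptions, and the function names are hypothetical; the real script may build the request differently.

```shell
# Build the universal-auth login body. Assumes the values contain no
# characters that need JSON escaping (quotes, backslashes).
infisical_login_payload() {
  printf '{"clientId":"%s","clientSecret":"%s"}' "$1" "$2"
}

# Print only the HTTP status of the login call; 200 means the identity is valid.
# VAULT_URL is a hypothetical override; the default host is an assumption.
infisical_login_status() {
  infisical_login_payload "$INFISICAL_CLIENT_ID" "$INFISICAL_CLIENT_SECRET" |
    curl -sS -o /dev/null -w '%{http_code}' --max-time 10 \
      -X POST "${VAULT_URL:-https://vault.raxx.app}/api/v1/auth/universal-auth/login" \
      -H "CF-Access-Client-Id: ${CF_ACCESS_CLIENT_ID}" \
      -H "CF-Access-Client-Secret: ${CF_ACCESS_CLIENT_SECRET}" \
      -H "Content-Type: application/json" \
      --data @-
}
```

The body is piped on stdin (--data @-) so the client secret does not appear in the process list. A non-200 status cannot by itself tell you which layer rejected the request, which is why the CF Access creds are worth checking separately when the vault looks unreachable.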
Credentials verified
| Credential | Purpose | Validation endpoint |
|---|---|---|
| INFISICAL_CLIENT_ID/SECRET | Infisical machine identity; gates all vault reads | POST /api/v1/auth/universal-auth/login (HTTP 200 = valid) |
| CLOUDFLARE_ACCESS_MGMT_TOKEN | CF management (Access, DNS edit, accounts) | GET /client/v4/user/tokens/verify |
| CF_WORKER_DEPLOY | Workers + Routes deploy (Zone:Workers Routes:Edit scope) | GET /client/v4/user/tokens/verify |
| CF_WAF_EDIT_RAXX_APP | WAF rules edit on raxx.app zone (WAF:Edit, Zone:Read) | GET /client/v4/user/tokens/verify |
| HEROKU_API_KEY | Heroku Platform API (apps, config vars, dynos) | GET /account |
| GITHUB_API_SECRETS_TOKEN | GitHub ops API (secrets, Actions, repos) | GET /user |
All validation calls are read-only. No write operations are performed.
CF_WAF_EDIT_RAXX_APP is sourced from vault (/MooseQuest/cloudflare/, env prod) or from the
macOS Keychain (raxx-cf-waf-edit-raxx-app). If both are absent the check FAILs.
The companion secret CF_WAF_EDIT_RAXX_APP__EXPIRES_AT is printed alongside the PASS line when
present, giving advance notice of upcoming expiry.
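The bearer-token checks listed in this section all follow the same shape: one read-only GET, then a decision based on the response. A minimal sketch of that shape follows. The function names are hypothetical, and the status-to-message mapping is modeled on the FAIL lines shown in this runbook rather than taken from the script itself.

```shell
# Map an HTTP status code to a bootstrap-style result line.
classify_status() {
  local name="$1" status="$2"
  case "$status" in
    200)     echo "[PASS] $name" ;;
    401|403) echo "[FAIL] $name: HTTP $status (invalid or revoked)" ;;
    000)     echo "[FAIL] $name: no response (network/CF down?)" ;;
    *)       echo "[FAIL] $name: unexpected HTTP $status" ;;
  esac
}

# Read-only GET with a bearer token; prints only the status code.
# curl reports 000 when it gets no HTTP response at all.
http_status() {
  local url="$1" token="$2"
  curl -sS -o /dev/null -w '%{http_code}' --max-time 10 \
    -H "Authorization: Bearer ${token}" "$url" 2>/dev/null
}

# Example (makes a network call, so it is left commented out):
# classify_status GITHUB_API_SECRETS_TOKEN \
#   "$(http_status https://api.github.com/user "$GITHUB_API_SECRETS_TOKEN")"
```

The Cloudflare verify endpoint also returns a JSON body with a success field, so a fuller check would inspect that as well as the status code.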
One-time Keychain setup
Run once per machine to store credentials in the macOS Keychain. The bootstrap script reads from Keychain automatically, so you don't need to export vars manually each session.
# Replace <VALUE> with the actual credential value each time.
# -T /bin/bash allows unattended reads without Touch ID prompt.
# -U updates if the entry already exists.
# Infisical machine identity creds come from vault. The bootstrap reads them
# from env vars rather than Keychain, so export them directly.
# Set in ~/.zshenv:
# export INFISICAL_CLIENT_ID="<value from vault /MooseQuest/infisical/>"
# export INFISICAL_CLIENT_SECRET="<value from vault /MooseQuest/infisical/>"
security add-generic-password \
-s "raxx-cf-access-mgmt-token" \
-a "claude-bootstrap" \
-w "<CLOUDFLARE_ACCESS_MGMT_TOKEN value>" \
-T /bin/bash -U
security add-generic-password \
-s "raxx-cf-worker-deploy-token" \
-a "claude-bootstrap" \
-w "<CF_WORKER_DEPLOY value>" \
-T /bin/bash -U
security add-generic-password \
-s "raxx-cf-waf-edit-raxx-app" \
-a "claude-bootstrap" \
-w "<CF_WAF_EDIT_RAXX_APP value from vault /MooseQuest/cloudflare/>" \
-T /bin/bash -U
security add-generic-password \
-s "raxx-heroku-api-key" \
-a "claude-bootstrap" \
-w "<HEROKU_API_KEY value>" \
-T /bin/bash -U
security add-generic-password \
-s "raxx-github-api-secrets-token" \
-a "claude-bootstrap" \
-w "<GITHUB_API_SECRETS_TOKEN value>" \
-T /bin/bash -U
After storing, verify the bootstrap reads them:
# Start a clean subshell with no credential env vars to test Keychain reads.
env -i HOME="${HOME}" PATH="${PATH}" bash scripts/ops/session-bootstrap.sh
Security note on -T /bin/bash
-T /bin/bash means any bash process on this Mac can read these Keychain
entries without a Touch ID prompt. This is acceptable for the credentials
covered here (all rotatable, none are root/private keys). If you prefer the
Touch ID friction, omit -T /bin/bash — the bootstrap will prompt for
Keychain access each time it reads a credential.
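For orientation, an "environment first, then Keychain" lookup could look roughly like the sketch below. The function name and the precedence order are assumptions; security find-generic-password with -w (print only the password) is the standard read counterpart of the add-generic-password calls above.

```shell
# Resolve a credential: environment variable first, then macOS Keychain.
# Usage: resolve_credential VAR_NAME KEYCHAIN_SERVICE
# VAR_NAME must be a trusted, hard-coded name because it is passed to eval.
resolve_credential() {
  local var_name="$1" service="$2" value
  # Indirect lookup of the variable named by $var_name.
  eval "value=\${$var_name:-}"
  if [ -z "$value" ] && command -v security >/dev/null 2>&1; then
    value="$(security find-generic-password -s "$service" \
      -a "claude-bootstrap" -w 2>/dev/null)" || value=""
  fi
  [ -n "$value" ] || return 1
  printf '%s' "$value"
}

# Example:
# HEROKU_API_KEY="$(resolve_credential HEROKU_API_KEY raxx-heroku-api-key)" \
#   && export HEROKU_API_KEY
```

On a machine without the security binary the lookup simply falls through to "not found", which corresponds to the "(not in env or Keychain)" FAIL case.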
Shell RC integration (optional)
To automatically bootstrap on every new terminal:
# Add to ~/.zshrc (interactive sessions only):
if [[ -f "${HOME}/repo/TradeMasterAPI/scripts/ops/session-bootstrap.sh" ]]; then
source "${HOME}/repo/TradeMasterAPI/scripts/ops/session-bootstrap.sh"
fi
Or for a quieter approach that only alerts on failure:
# Only print output if bootstrap fails:
if ! bash "${HOME}/repo/TradeMasterAPI/scripts/ops/session-bootstrap.sh" >/dev/null 2>&1; then
echo "[session-bootstrap] FAIL — run: bash scripts/ops/session-bootstrap.sh"
fi
What PASS/FAIL/SKIP means
[PASS] — credential valid
The validation endpoint returned the expected success response. The credential is active and usable for the current session.
[FAIL] — credential invalid or missing
The validation endpoint returned an error (401, 403, CF success:false) or the credential was not found in either the environment or the Keychain.
Each FAIL line includes a "rotate via:" pointer to the matching rotation doc:
| Credential | Rotation doc |
|---|---|
| INFISICAL_CLIENT_ID/SECRET | docs/ops/runbooks/rotation/infisical-service-token.md |
| CLOUDFLARE_ACCESS_MGMT_TOKEN | docs/ops/runbooks/rotation/cloudflare-user-api-token.md |
| CF_WORKER_DEPLOY | docs/ops/runbooks/rotation/cloudflare-user-api-token.md + #1093 (scope: Zone:Workers Routes:Edit) |
| CF_WAF_EDIT_RAXX_APP | docs/ops/runbooks/cloudflare-tokens.md (Failure mode B) |
| HEROKU_API_KEY | docs/ops/runbooks/rotation/heroku-platform-token.md |
| GITHUB_API_SECRETS_TOKEN | docs/ops/runbooks/rotation/github-pat.md |
After rotating, store the new value in Keychain (see "One-time Keychain setup"
above, using -U to update). Then re-run the bootstrap to confirm PASS.
[SKIP] — check could not run
The prerequisite for this check was not met:
| SKIP reason | Meaning | Action |
|---|---|---|
| infisical CLI not installed | infisical binary not in PATH | Install: brew install infisical/get-cli/infisical. The machine identity HTTP check still runs; only the CLI-version line is skipped. |
SKIP does not cause the bootstrap to exit non-zero. Only FAIL does.
Note: as of #2330, missing INFISICAL_CLIENT_ID/SECRET is a FAIL, not a SKIP. The machine
identity check is now a first-class credential — it must pass for vault reads to work.
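The exit-status rule (only FAIL is fatal, SKIP is not) can be sketched as a small tally over the result lines. The function name and the summary format are hypothetical; only the rule itself comes from this runbook.

```shell
# Read result lines on stdin, print a tally, and return non-zero
# only when at least one line starts with [FAIL].
summarize_results() {
  local pass=0 fail=0 skip=0 line
  while IFS= read -r line; do
    case "$line" in
      "[PASS]"*) pass=$((pass + 1)) ;;
      "[FAIL]"*) fail=$((fail + 1)) ;;
      "[SKIP]"*) skip=$((skip + 1)) ;;
    esac
  done
  echo "pass=$pass fail=$fail skip=$skip"
  [ "$fail" -eq 0 ]
}

# Example:
# bash scripts/ops/session-bootstrap.sh | summarize_results
```

A run with one PASS and one SKIP still returns 0, so a wrapper such as the cron companion only needs to test the exit status.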
Failure modes
Failure mode A: Cloudflare token expired or revoked
Symptom:
[FAIL] CLOUDFLARE_ACCESS_MGMT_TOKEN (redacted cfto…cdef) — CF returned success:false
rotate via: docs/ops/runbooks/rotation/cloudflare-user-api-token.md
Cause: The Cloudflare user API token has expired (if expires_at was set
at creation) or was revoked (manual revocation or the PUT /value roll ran
but the new value wasn't stored). The old secret is invalidated immediately on roll.
Fix:
1. Follow docs/ops/runbooks/rotation/cloudflare-user-api-token.md.
2. For CF_WORKER_DEPLOY: ensure the new token has scope Zone:Workers Routes:Edit
(see #1093 for the scope confusion incident).
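3. Store the new value in Keychain. The command below targets CLOUDFLARE_ACCESS_MGMT_TOKEN;
for CF_WORKER_DEPLOY use the service name raxx-cf-worker-deploy-token instead:
security add-generic-password -s "raxx-cf-access-mgmt-token" \
-a "claude-bootstrap" -w "<NEW_VALUE>" -T /bin/bash -U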
4. Re-run bootstrap to confirm PASS.
Failure mode B: Heroku authorization revoked
Symptom:
[FAIL] HEROKU_API_KEY (redacted hero…1234) — HTTP 401 (invalid or revoked)
rotate via: docs/ops/runbooks/rotation/heroku-platform-token.md
Cause: The Heroku authorization was explicitly revoked (via
heroku authorizations:revoke) or, if it was created with an expiry, has expired
(non-global scope tokens have optional expiry). Long-lived direct-authorization
tokens do not auto-expire, so for those a 401 always means explicit revocation.
Fix:
1. Follow docs/ops/runbooks/rotation/heroku-platform-token.md.
2. Update Keychain:
security add-generic-password -s "raxx-heroku-api-key" \
-a "claude-bootstrap" -w "<NEW_VALUE>" -T /bin/bash -U
3. Re-run bootstrap.
Failure mode C: GitHub PAT expired
Symptom:
[FAIL] GITHUB_API_SECRETS_TOKEN (redacted gith…cdef) — HTTP 401 (invalid or expired)
rotate via: docs/ops/runbooks/rotation/github-pat.md
Cause: Fine-grained PATs have mandatory expiry (max 1 year, we use 90 days). Classic PATs do not auto-expire but can be revoked.
Fix:
1. Follow docs/ops/runbooks/rotation/github-pat.md.
2. Update Keychain:
security add-generic-password -s "raxx-github-api-secrets-token" \
-a "claude-bootstrap" -w "<NEW_VALUE>" -T /bin/bash -U
3. Re-run bootstrap.
Failure mode D: credential not in env or Keychain
Symptom:
[FAIL] HEROKU_API_KEY (not in env or Keychain)
rotate via: docs/ops/runbooks/rotation/heroku-platform-token.md
Cause: This is a fresh machine where the Keychain entry was never created (first-time setup), or the Keychain was reset or migrated.
Fix: Run the one-time Keychain setup for the missing credential (see above).
Failure mode E: Infisical machine identity stale or missing
Symptom:
[FAIL] INFISICAL_CLIENT_ID/SECRET HTTP 401 — credentials rejected
rotate via: docs/ops/runbooks/rotation/infisical-service-token.md
or
[FAIL] INFISICAL_CLIENT_ID/SECRET (not set in environment)
rotate via: docs/ops/runbooks/rotation/infisical-service-token.md
Cause: The INFISICAL_CLIENT_ID/SECRET in the current env are stale or
were rotated without updating the local env. Every vault read (including pulling
CF_WAF_EDIT_RAXX_APP and other op credentials) will fail if this check fails.
Fix:
1. Check the current values: printf '%s' "${INFISICAL_CLIENT_ID:0:4}…" (partial only — do not print full value).
2. Pull fresh values from the Infisical web UI or vault (path: /MooseQuest/infisical/).
3. Update ~/.zshenv with the new values.
4. Open a new shell (or source ~/.zshenv) to pick up the new values.
5. Re-run bootstrap to confirm PASS.
If the machine identity itself needs to be recreated (not just rotated), see:
docs/ops/runbooks/rotation/infisical-service-token.md.
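Step 1 above prints only a prefix, and the FAIL lines in this runbook show values in a prefix…suffix form such as hero…1234. A minimal sketch of that style of redaction is below. The function name, the four-character windows, and the full masking of short values are assumptions; the script's own redaction logic is not shown here.

```shell
# Print the first four and last four characters of a value, joined by an
# ellipsis. Values of eight characters or fewer are masked entirely so the
# two windows can never reveal the whole secret.
redact() {
  local value="$1" head tail
  if [ "${#value}" -le 8 ]; then
    printf '%s' '****'
    return 0
  fi
  head="${value%"${value#????}"}"   # strip everything after the first 4 chars
  tail="${value#"${value%????}"}"   # strip everything before the last 4 chars
  printf '%s…%s' "$head" "$tail"
}

# Example:
# echo "INFISICAL_CLIENT_ID is $(redact "$INFISICAL_CLIENT_ID")"
```

Using a helper like this when comparing values across shells avoids pasting a full secret into a terminal or a chat.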
Failure mode F: CF_WAF_EDIT_RAXX_APP expired or missing
Symptom:
[FAIL] CF_WAF_EDIT_RAXX_APP (redacted cfwa…cdef) — CF returned success:false
rotate via: docs/ops/runbooks/cloudflare-tokens.md (Failure mode B)
or
[FAIL] CF_WAF_EDIT_RAXX_APP (not in env or Keychain)
rotate via: docs/ops/runbooks/cloudflare-tokens.md (Failure mode B)
Cause: The CF_WAF_EDIT_RAXX_APP token has expired (90-day rotation cadence), was
revoked, or was never populated in the Keychain on this machine.
Fix:
1. Follow docs/ops/runbooks/cloudflare-tokens.md → Failure mode B.
2. Store the new value in Keychain:
security add-generic-password -s "raxx-cf-waf-edit-raxx-app" \
-a "claude-bootstrap" -w "<NEW_VALUE>" -T /bin/bash -U
3. Update the CF_WAF_EDIT_RAXX_APP__EXPIRES_AT companion in vault to 90 days from today.
4. Re-run bootstrap to confirm PASS.
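For step 3, the new expiry date can be computed rather than counted by hand. The sketch below assumes the companion secret stores a plain YYYY-MM-DD date, which this runbook does not confirm; adjust the format if the vault entry uses something else. The function name is hypothetical.

```shell
# Print the date N days after START (YYYY-MM-DD), in UTC.
# Handles both GNU date (Linux) and BSD date (macOS).
date_plus_days() {
  local start="$1" days="$2"
  if date --version >/dev/null 2>&1; then
    # GNU date understands relative expressions via -d.
    date -u -d "${start} +${days} days" +%Y-%m-%d
  else
    # BSD date: -j means "do not set the clock", -v applies an offset.
    date -u -j -v "+${days}d" -f "%Y-%m-%d" "$start" +%Y-%m-%d
  fi
}

# Example: value for CF_WAF_EDIT_RAXX_APP__EXPIRES_AT after rotating today.
# date_plus_days "$(date -u +%Y-%m-%d)" 90
```

For a rotation on 2026-05-18, this yields 2026-08-16.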
Failure mode G: network / CF Access gate down
Symptom:
[FAIL] CLOUDFLARE_ACCESS_MGMT_TOKEN (redacted cfto…cdef) — no response (network/CF down?)
Cause: The CF API is unreachable (network failure, CF outage, or the CF Access gate in front of vault.raxx.app is blocking the request).
Fix:
1. Check Cloudflare status at https://www.cloudflarestatus.com
2. Verify your network has internet access: curl -sS https://1.1.1.1 >/dev/null && echo ok
3. If CF Access is blocking: check that CF_ACCESS_CLIENT_ID/SECRET are valid
for the Access application on vault.raxx.app.
4. If it's an outage, wait and re-run once Cloudflare is healthy.
Daily proactive check (cron)
scripts/ops/session-bootstrap-cron.sh runs the bootstrap silently and sends
a Slack DM (channel D0AJ7K184TV) only when a credential fails. Use it for
daily early-warning without interactive interruption.
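The "silent unless something fails" behavior can be sketched as a small wrapper. This is illustrative only: notify is a hypothetical placeholder for whatever delivers the Slack DM, and the real cron companion may be structured differently.

```shell
# Hypothetical notifier; replace with the real Slack delivery mechanism.
notify() {
  printf 'ALERT: %s\n' "$1" >&2
}

# Run a command quietly. If it exits non-zero, pass its FAIL lines to
# notify; otherwise produce no output at all.
run_quiet_alert_on_fail() {
  local output status=0
  output="$("$@" 2>&1)" || status=$?
  if [ "$status" -ne 0 ]; then
    notify "$(printf '%s\n' "$output" | grep '^\[FAIL\]')"
  fi
  return "$status"
}

# Example:
# run_quiet_alert_on_fail bash scripts/ops/session-bootstrap.sh
```

Forwarding only the FAIL lines keeps the alert short; PASS and SKIP lines stay out of the message.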
Install via crontab:
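# Run at 08:00 daily, alert on failure only. cron uses the machine's local
# time zone, so this is 08:00 UTC only if the machine clock is set to UTC.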
0 8 * * * /path/to/repo/scripts/ops/session-bootstrap-cron.sh
Or via launchd on macOS (preferred — survives sleep/wake):
<!-- ~/Library/LaunchAgents/app.raxx.session-bootstrap.plist -->
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>app.raxx.session-bootstrap</string>
<key>ProgramArguments</key>
<array>
<string>/bin/bash</string>
<string>/path/to/repo/scripts/ops/session-bootstrap-cron.sh</string>
</array>
<key>StartCalendarInterval</key>
<dict>
<key>Hour</key>
<integer>8</integer>
<key>Minute</key>
<integer>0</integer>
</dict>
<key>StandardOutPath</key>
<string>/tmp/raxx-session-bootstrap.log</string>
<key>StandardErrorPath</key>
<string>/tmp/raxx-session-bootstrap.log</string>
</dict>
</plist>
Load with: launchctl load ~/Library/LaunchAgents/app.raxx.session-bootstrap.plist
Idempotency
The bootstrap is safe to run multiple times:
- Each run validates from scratch (no cached results used in validation).
- The only side effect is writing ~/.raxx-session-bootstrap-last-validated
with the UTC timestamp of the last successful run.
- Running twice produces identical output (barring credential state changes).
Override the state file path for testing:
RAXX_BOOTSTRAP_STATE_FILE=/tmp/test-state bash scripts/ops/session-bootstrap.sh
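The state-file side effect amounts to a single write. The sketch below assumes an ISO 8601 UTC timestamp; the exact format the script writes is not specified in this runbook, and the function name is hypothetical.

```shell
# Record the time of a successful run. Honors RAXX_BOOTSTRAP_STATE_FILE,
# falling back to the default path named in this section.
record_success() {
  local state_file="${RAXX_BOOTSTRAP_STATE_FILE:-${HOME}/.raxx-session-bootstrap-last-validated}"
  date -u +%Y-%m-%dT%H:%M:%SZ > "$state_file"
}

# Example: write to a scratch file instead of the real state file.
# RAXX_BOOTSTRAP_STATE_FILE=/tmp/test-state record_success
```

Because the file is overwritten rather than appended to, repeated runs leave a single timestamp behind.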
Emergency stop
If you need to kill a stuck credential validation (e.g., curl hanging on a
network timeout), the script respects Ctrl-C. Each validation has a
--max-time 10 timeout, so the maximum stuck time is 6 × 10s = 60s for all
six credentials.
To force-clear all credential exports from the current shell:
unset INFISICAL_CLIENT_ID INFISICAL_CLIENT_SECRET \
CLOUDFLARE_ACCESS_MGMT_TOKEN CF_WORKER_DEPLOY CF_WAF_EDIT_RAXX_APP \
HEROKU_API_KEY GITHUB_API_SECRETS_TOKEN
Refs
- Issues: #680, #2330
- Related: docs/ops/runbooks/rotation/infisical-service-token.md
- Related: docs/ops/runbooks/rotation/cloudflare-user-api-token.md
- Related: docs/ops/runbooks/cloudflare-tokens.md (CF_WAF_EDIT_RAXX_APP rotation)
- Related: docs/ops/runbooks/rotation/heroku-platform-token.md
- Related: docs/ops/runbooks/rotation/github-pat.md
- Related: #1093 (CF_WORKER_DEPLOY scope: Zone:Workers Routes:Edit)
- Memory: project_session_env_staleness.md
- 2026-04-30 incident: stale CF tokens blocked Postmark DNS work
- 2026-05-17 incident: stale Infisical machine identity + expired CF_WAF_EDIT_RAXX_APP blocked WAF deploy (#2330)