Per-PR Context Swap — Agent Identity Routing

Status: Design complete; sub-cards filed
Date: 2026-05-16 UTC
Tracking: #2070
ADR: 0128
Owner: software-architect

1. Context

Every agent dispatched from the main orchestrator thread must open GitHub PRs, file issues, and push commits. Today all three GitHub Apps (raxx-dev-bot, raxx-ops-bot, raxx-pm-bot) are defined and mapped to agent classes in scripts/agents/agent_bot_map.yaml. The mapping is canonical but not enforced at spawn time — agents that skip the wrapper default to the operator's PAT, which attributes the PR to MooseQuest (Kristerpher).

This has two concrete consequences:

Self-approval block. GitHub prevents an author from approving their own PRs. If every agent-dispatched PR authors as MooseQuest, Kristerpher cannot use the standard review flow — he must admin-merge, bypassing the approval gate.
Audit trail degradation. When eight PRs from a single session all author as MooseQuest, there is no signal distinguishing operator-authored work from agent-dispatched work. The workflow UUID trace (see workflow-uuid-tracing.md) relies on PR author as one classification signal.

The operator picked Option C, per-PR context swap (see #2070): the bot identity that opens a PR is determined by the agent type that produced it, not by a shared default. This gives the most precise attribution and the best audit trail, at the cost of higher implementation complexity.

2. Invariants

These constraints are non-negotiable and take precedence over all design choices:

No stored credentials. Bot private key PEMs live in Infisical only. They are never written to disk, never echoed to logs, and never committed to any file. Any heroku config:set call that sets a token value must silence stdout (>/dev/null).
Secrets are rotatable without redeploy. Rotating a bot PEM in Infisical takes effect on the next agent spawn — no local file update, no operator action.
Audit trail for every state change that affects permissions or PR authorship. The agent type, workflow UUID, and bot identity used are emitted in the PR body.
Vault-first secret access. Per feedback_secrets_in_vault_sop, the orchestrator never prompts the operator for tokens. All credentials flow from Infisical via the existing mint_github_token.py path.
No operator-identity PR authorship. After rollout, zero agent-dispatched PRs may author as MooseQuest. The self-approval block must be structurally impossible, not just avoided by convention.

3. Current State Audit

3.1 Agent-to-bot mapping (canonical, from `agent_bot_map.yaml`)

Agent class	Bot identity	Vault path
`feature-developer`, `ux-polisher`, `ux-designer`	`raxx-dev-bot`	`/MooseQuest/raxx-dev-bot/`
`sre-agent`, `security-agent`, `card-groomer`	`raxx-ops-bot`	`/MooseQuest/raxx-ops-bot/`
`product-manager`, `software-architect`, `marketing-strategist`, `business-legal-researcher`, `data-scientist`	`raxx-pm-bot`	`/MooseQuest/raxx-pm-bot/`

The mapping is complete. No new GitHub Apps are required. The "conductor" role (main orchestrator thread) does not open PRs — it dispatches agents. The orchestrator's only GitHub actions are issue comments and agent spawn calls.

3.2 Identity verification

None of the three Apps maps to the operator's GitHub account (MooseQuest). Once every agent-dispatched PR routes through these identities, the operator is no longer the PR author and can approve through the standard review flow.

3.3 Current gap

agent_bot_map.yaml exists and is correct, but:

Agent prompt templates do not uniformly instruct agents to call with_bot_token.sh or mint_github_token.py before PR creation.
The orchestrator does not inject a GH_TOKEN_<BOT> env var into agent spawn context — agents are responsible for self-service token minting, but this depends on consistent prompt compliance.
No smoke test verifies that a PR opened by a given agent class actually authors as the expected bot.
No PR body convention stamps the workflow UUID or agent type for audit tracing.

4. Token Routing Design

4.1 Vault paths (existing, no changes required)

/MooseQuest/raxx-dev-bot/APP_ID
/MooseQuest/raxx-dev-bot/INSTALLATION_ID
/MooseQuest/raxx-dev-bot/PRIVATE_KEY_PEM

/MooseQuest/raxx-ops-bot/APP_ID
/MooseQuest/raxx-ops-bot/INSTALLATION_ID
/MooseQuest/raxx-ops-bot/PRIVATE_KEY_PEM

/MooseQuest/raxx-pm-bot/APP_ID
/MooseQuest/raxx-pm-bot/INSTALLATION_ID
/MooseQuest/raxx-pm-bot/PRIVATE_KEY_PEM

The vault paths are already defined and populated per the existing agent-github-identity.md design. No new secrets or vault folders are needed.

4.2 Resolver: agent type → bot name

The resolver is scripts/agents/agent_bot_map.yaml. No code changes; this file is already the canonical source. Any new agent class added to the codebase MUST have an entry in this map before its first PR can open.
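
As a rough illustration of the lookup, the sketch below mirrors the mapping table from section 3.1 in memory and refuses unknown agent classes. The real schema of agent_bot_map.yaml is not shown in this document, so the dictionary shape and the names resolve_bot, vault_folder, and UnmappedAgentError are hypothetical.

```python
# Hypothetical in-memory mirror of scripts/agents/agent_bot_map.yaml.
# The real file's schema is not shown here, so this shape is an assumption.
BOT_TO_AGENTS = {
    "raxx-dev-bot": ["feature-developer", "ux-polisher", "ux-designer"],
    "raxx-ops-bot": ["sre-agent", "security-agent", "card-groomer"],
    "raxx-pm-bot": [
        "product-manager",
        "software-architect",
        "marketing-strategist",
        "business-legal-researcher",
        "data-scientist",
    ],
}

# Invert once so lookups are keyed by agent class.
AGENT_TO_BOT = {
    agent: bot for bot, agents in BOT_TO_AGENTS.items() for agent in agents
}


class UnmappedAgentError(KeyError):
    """Raised when an agent class has no entry in the map."""


def resolve_bot(agent_class: str) -> str:
    """Return the bot name for an agent class, refusing unknown classes."""
    try:
        return AGENT_TO_BOT[agent_class]
    except KeyError:
        raise UnmappedAgentError(
            f"{agent_class!r} has no entry in agent_bot_map.yaml"
        ) from None


def vault_folder(bot: str) -> str:
    """Vault folder for a bot, following the /MooseQuest/<bot-name>/ layout."""
    return f"/MooseQuest/{bot}/"
```

Failing loudly on an unmapped class, rather than defaulting to some bot, is one way to honour the rule that a new agent class needs a map entry before its first PR.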

4.3 Token injection — orchestrator responsibility

When the orchestrator dispatches an agent, it reads agent_bot_map.yaml to determine the agent's bot, then either:

(Preferred, session pattern) mints a scoped GH_TOKEN for that bot and passes it into the agent's environment at spawn time. The agent uses it for all gh calls without additional token minting.
(Fallback, per-call pattern) relies on the agent's prompt to call with_bot_token.sh <bot-name> gh ... for each individual gh call.

The session pattern is preferred for agents that make ≥3 gh calls (most feature-developer and software-architect runs). The per-call pattern is acceptable for lightweight agents (card-groomer filing a single issue).
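
The choice between the two patterns, and the injection step itself, might look roughly like the sketch below. The function names and the idea of estimating the call count up front are assumptions for illustration; the only external fact relied on is that the gh CLI reads its token from the GH_TOKEN environment variable.

```python
from typing import Dict, Mapping

# Threshold taken from the ">=3 gh calls" guidance above.
SESSION_CALL_THRESHOLD = 3


def choose_pattern(expected_gh_calls: int) -> str:
    """Pick 'session' for gh-heavy agents and 'per-call' for lightweight ones."""
    if expected_gh_calls >= SESSION_CALL_THRESHOLD:
        return "session"
    return "per-call"


def build_spawn_env(base_env: Mapping[str, str], token: str) -> Dict[str, str]:
    """Return a copy of base_env with GH_TOKEN set for the spawned agent.

    The caller's environment is never mutated, so the token stays scoped
    to the child process that receives the returned mapping.
    """
    if not token:
        raise ValueError("refusing to spawn an agent with an empty GH_TOKEN")
    env = dict(base_env)
    env["GH_TOKEN"] = token
    return env
```

A caller could pass the returned mapping as the env argument of subprocess.Popen when spawning the agent, which keeps the token out of the orchestrator's own environment and off disk.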

4.4 Failure mode and fallback

If token minting fails, with_bot_token.sh logs a warning to stderr and falls back to the operator PAT. This fallback must be surfaced in the agent's final output so the operator notices. A PR opened via the fallback path is still functional but shows MooseQuest as author, so the self-approval block reappears and the no-operator-authorship invariant in section 2 is violated. The fallback is therefore acceptable only for agent tasks that don't open PRs.
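
The fallback rule can be sketched in Python as follows. This is not the logic of with_bot_token.sh itself; select_token, TokenChoice, and the opens_pr flag are hypothetical, and refusing outright when the task opens a PR is a stricter reading adopted here as an assumption.

```python
import sys
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TokenChoice:
    token: str
    identity: str            # bot name, or "operator-pat" when the fallback was used
    warning: Optional[str]   # set only when the fallback was used


def select_token(
    bot: str,
    mint: Callable[[str], str],
    operator_pat: str,
    opens_pr: bool,
) -> TokenChoice:
    """Try to mint a bot token; fall back to the operator PAT only for non-PR work."""
    try:
        return TokenChoice(token=mint(bot), identity=bot, warning=None)
    except Exception as exc:
        # Report only the exception type so no secret material reaches the logs.
        reason = type(exc).__name__
        if opens_pr:
            # Assumption: a PR authored via the PAT would break the invariant, so stop.
            raise RuntimeError(
                f"token mint for {bot} failed ({reason}); not opening a PR as operator"
            ) from exc
        warning = f"WARNING: token mint for {bot} failed ({reason}); using operator PAT"
        print(warning, file=sys.stderr)
        return TokenChoice(token=operator_pat, identity="operator-pat", warning=warning)
```

Returning the warning alongside the token lets the agent repeat it in its final output instead of relying on stderr alone.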

5. Audit Trail Extension

Each agent-dispatched PR body must include the following four-field audit block as a trailing section (separated by ---):

Agent: <agent-class>
Bot identity: <raxx-dev-bot | raxx-ops-bot | raxx-pm-bot>
Workflow UUID: <wfl_* if dispatched from orchestrator; n/a for standalone>
Refs #<parent-issue>

Field semantics:

Agent: — the agent class that produced this PR (e.g. feature-developer, sre-agent). Matches the name: frontmatter in .claude/agents/<agent>.md.
Bot identity: — the bot login used to open the PR per ADR-0096.
Workflow UUID: — the wfl_* UUID from the orchestrator session. Set to n/a when invoked standalone; never omit or fabricate.
Refs #NNN: — parent issue cross-reference. Use Closes #NNN in the Why section for auto-close on merge; repeat as Refs #NNN here for tracing.
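
A small helper that renders the block from the four fields above could look like this. The function name and signature are hypothetical; the field labels, the --- separator, and the n/a rule come from this section.

```python
from typing import Optional


def render_audit_block(
    agent: str,
    bot: str,
    parent_issue: int,
    workflow_uuid: Optional[str] = None,
) -> str:
    """Render the trailing audit block for a PR body.

    A missing workflow UUID is written as "n/a" rather than omitted,
    matching the rule for standalone invocations.
    """
    uuid_value = workflow_uuid if workflow_uuid else "n/a"
    return "\n".join(
        [
            "---",
            f"Agent: {agent}",
            f"Bot identity: {bot}",
            f"Workflow UUID: {uuid_value}",
            f"Refs #{parent_issue}",
        ]
    )
```

The rendered string would be appended after the rest of the PR body, so the block always sits in the trailing position.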

This convention is enforced through agent prompt templates (SC-IDENT-3, #2288) and validated by scripts/ci/check_agent_pr_audit_trail.py (SC-IDENT-4, #2291), which warns pre-launch and fails post-launch (--enforce flag).
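
For orientation only, the kind of check such a script performs might resemble the sketch below. It is not the contents of check_agent_pr_audit_trail.py; the regular expressions, the function names, and the mapping of the --enforce flag to an exit code are assumptions.

```python
import re
from typing import List

# One pattern per required field; each must match a whole line of the PR body.
REQUIRED_PATTERNS = {
    "Agent": re.compile(r"^Agent: \S+", re.MULTILINE),
    "Bot identity": re.compile(
        r"^Bot identity: raxx-(dev|ops|pm)-bot\s*$", re.MULTILINE
    ),
    "Workflow UUID": re.compile(
        r"^Workflow UUID: (wfl_\S+|n/a)\s*$", re.MULTILINE
    ),
    "Refs": re.compile(r"^Refs #\d+\s*$", re.MULTILINE),
}


def missing_audit_fields(pr_body: str) -> List[str]:
    """Return the names of audit fields that are absent or malformed."""
    return [
        name
        for name, pattern in REQUIRED_PATTERNS.items()
        if not pattern.search(pr_body)
    ]


def exit_code(missing: List[str], enforce: bool) -> int:
    """Warn-only mode always exits 0; enforce mode fails on any missing field."""
    return 1 if (missing and enforce) else 0
```

Keeping the warn and fail behaviours behind a single boolean makes the pre-launch to post-launch switch a one-flag change.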

Canonical reference for the block format: .claude/agents/_shared-conventions.md §Audit trail block convention.

Traceability chain: GitHub PR author (raxx-dev-bot[bot]) → PR body Agent: feature-developer → Workflow UUID: wfl_* → full trace in trace_events table (see workflow-uuid-tracing.md).

6. Agent Prompt Template Additions

Each agent definition at .claude/agents/<agent>.md requires a "GitHub identity" preamble (5–8 lines) that:

Names the bot: "You open PRs and file issues as raxx-dev-bot."
References the session pattern: "For ≥3 gh calls, use the session pattern."
States the PR body convention: "Include the Agent: / Bot identity: / Workflow UUID: / Refs audit block as the trailing section of the PR body."
States the fallback warning requirement: "If token mint fails, surface the warning in your final output."

The software-architect and sre-agent definitions already include partial GitHub identity preambles (referencing with_bot_token.sh). The update standardizes the format across all agents and adds the PR body convention.

7. Sequence Diagram

sequenceDiagram
    participant Op as Operator (Kristerpher)
    participant Orch as Orchestrator (main thread)
    participant Map as agent_bot_map.yaml
    participant Vault as Infisical Vault
    participant GH as GitHub API
    participant Agent as Dispatched Agent

    Op->>Orch: dispatch feature-developer on #2150
    Orch->>Map: lookup bot for feature-developer
    Map-->>Orch: raxx-dev-bot
    Orch->>Vault: GET /MooseQuest/raxx-dev-bot/{APP_ID,INSTALLATION_ID,PRIVATE_KEY_PEM}
    Vault-->>Orch: credentials
    Orch->>GH: POST /app/installations/{id}/access_tokens (JWT signed with PEM)
    GH-->>Orch: GH_TOKEN=ghs_...  (1-hour validity)
    Orch->>Agent: spawn with GH_TOKEN injected
    Agent->>Agent: implements feature
    Agent->>GH: gh pr create (GH_TOKEN = raxx-dev-bot token)
    GH-->>Agent: PR #NNNN authored by raxx-dev-bot[bot]
    Agent-->>Orch: PR URL + sub-card links
    Orch-->>Op: "PR #NNNN ready for review (authored by raxx-dev-bot)"
    Op->>GH: approve PR #NNNN  ← self-approval block GONE
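
The token-mint step in the diagram can be sketched as the claims and request path the orchestrator would need. This is not the code in mint_github_token.py; the function names are hypothetical, and RS256 signing with the PEM is deliberately left out because it needs a third-party library. The claim names and the 10-minute cap follow GitHub's documented App JWT format.

```python
import time
from typing import Dict, Optional, Union

JWT_MAX_LIFETIME_S = 600  # GitHub rejects App JWTs that expire more than 10 minutes out
CLOCK_SKEW_S = 60         # backdate iat slightly to tolerate clock drift


def app_jwt_claims(
    app_id: str,
    now: Optional[int] = None,
    lifetime_s: int = 540,
) -> Dict[str, Union[int, str]]:
    """Build the claims for a GitHub App JWT (to be signed with RS256 elsewhere)."""
    if not 0 < lifetime_s <= JWT_MAX_LIFETIME_S:
        raise ValueError("lifetime_s must be between 1 and 600 seconds")
    issued = int(time.time()) if now is None else now
    return {
        "iat": issued - CLOCK_SKEW_S,
        "exp": issued + lifetime_s,
        "iss": app_id,
    }


def access_token_path(installation_id: str) -> str:
    """REST path that exchanges the signed JWT for an installation token."""
    return f"/app/installations/{installation_id}/access_tokens"
```

The signed JWT is short-lived and only used for the exchange; the ghs_* token returned by the POST is what gets injected into the agent.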

8. Migrations

No schema migrations. No application code changes.

Changes are confined to:

scripts/agents/agent_bot_map.yaml: already correct; no changes needed
.claude/agents/*.md: prompt template additions (6 files)
scripts/agents/mint_github_token.py: potential minor update if the orchestrator inject path needs a --agent-class flag (feature-developer evaluates this in SC-IDENT-2)

Rollback: revert .claude/agents/*.md prompt changes. All agents fall back to per-call with_bot_token.sh usage or operator PAT. No data migration needed.

9. Rollout Plan

Phase	Gate	Description
Dark	SC-IDENT-5 complete	Verify all 3 Apps are operational; tokens mint clean from vault
SC-IDENT-1	vault entries confirmed	Confirm vault path schema; document any gaps
SC-IDENT-2	SC-IDENT-1 done	Orchestrator injects per-agent GH_TOKEN at spawn time
SC-IDENT-3	SC-IDENT-2 done	Agent prompt templates updated; PR body convention added
SC-IDENT-4	SC-IDENT-3 done	Audit header block (Agent/Bot/Workflow UUID) in all agent PR bodies
SC-IDENT-6	SC-IDENT-3 done	Smoke test: open 1 PR per agent type; verify author is the expected bot
GA	SC-IDENT-6 passes	Zero agent-dispatched PRs authored as MooseQuest

All phases can run sequentially within one sprint. No feature flags needed — this is infrastructure-layer plumbing, not a user-visible feature.
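
The SC-IDENT-6 smoke test boils down to comparing a PR's author login against the bot expected for the agent class. A minimal sketch of that comparison follows, assuming the login has already been fetched from the GitHub API and uses the <bot>[bot] form shown in this document; the function names are hypothetical.

```python
from typing import List


def expected_author(bot: str) -> str:
    """Author login as it appears in this document, e.g. raxx-dev-bot[bot]."""
    return f"{bot}[bot]"


def check_pr_author(
    agent_class: str,
    bot: str,
    actual_author: str,
    operator_login: str = "MooseQuest",
) -> List[str]:
    """Return a list of problems; an empty list means the smoke test passed."""
    problems = []
    if actual_author == operator_login:
        problems.append(
            f"{agent_class}: PR authored as operator {operator_login} (fallback path?)"
        )
    elif actual_author != expected_author(bot):
        problems.append(
            f"{agent_class}: expected {expected_author(bot)}, got {actual_author}"
        )
    return problems
```

Separating the operator-login case from a plain mismatch makes the most serious failure, a silent PAT fallback, easy to spot in the smoke test output.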

10. Security Considerations

PII collected: None. Bot identities and workflow UUIDs are operational metadata, not personal data.

Retention: PR body audit headers are retained for the lifetime of the PR. Vault credentials rotate on a 365-day cadence per docs/ops/runbooks/rotation/github-app-installation-token.md.

DSR: Not applicable — no user PII flows through bot identity routing.

Credential replay: Bot PEM keys are stored only in Infisical; the mint path reads them into memory to sign the App JWT and never persists them. The minted token (ghs_*) has 1-hour validity. The orchestrator's only durable artifact is the Infisical Machine Identity (INFISICAL_CLIENT_ID + INFISICAL_CLIENT_SECRET), which does not grant GitHub access directly.

Audit trail: Every PR opened by an agent bot is auditable by bot identity on GitHub (filter by author). The PR body convention adds agent class and workflow UUID for cross-reference with the trace store.

Secrets: PRIVATE_KEY_PEM for all three bots lives at /MooseQuest/<bot-name>/PRIVATE_KEY_PEM in Infisical. Rotatable without redeploy. The mint script reads them live on every invocation — no local cache.

Kill-switch: Revoking a bot's GitHub App installation immediately prevents any further token minting for that bot. Agents fall back to operator PAT (with warning). Full revocation of all three Apps is the nuclear kill-switch for agent-dispatched GitHub actions.

Breach: If a minted ghs_* token is leaked, it expires in ≤1 hour and is scoped to the single installation. Impact: an attacker could open PRs as the bot identity for up to 1 hour. Mitigation: revoke the App installation immediately upon detection. No credential has replay value beyond the 1-hour window.
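
Because the minted token is only valid for about an hour, a long session-pattern run may want to check how much lifetime remains before a late gh call. The sketch below assumes the expires_at timestamp from GitHub's access-token response has been kept alongside the token; the function names and the 5-minute margin are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_expires_at(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as 2026-05-16T13:00:00Z."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def needs_remint(
    expires_at: str,
    now: datetime,
    margin: timedelta = timedelta(minutes=5),
) -> bool:
    """True when the token is expired or will expire within the safety margin."""
    return parse_expires_at(expires_at) - now <= margin
```

Passing the current time in as an argument keeps the check deterministic in tests; a caller would typically supply datetime.now(timezone.utc).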

11. Open Questions

None blocking sub-card implementation. Operator has locked Option C. All vault paths are already populated.

One deferred question for post-rollout: should the orchestrator's own issue comments (not PR opens) also use a bot identity, or is operator-attributed orchestrator commentary acceptable? This is aesthetic, not structural — the self-approval block only applies to PR authorship.

12. References

agent-github-identity.md: base design (3-bot model)
workflow-uuid-tracing.md: UUID tracing for audit
adr/0128-per-pr-context-swap-agent-identity.md
scripts/agents/agent_bot_map.yaml: canonical agent→bot mapping
scripts/agents/mint_github_token.py: token mint helper
scripts/agents/with_bot_token.sh: per-call wrapper
Issue #2070: parent card (operator decision)
Issues #335, #336: existing App provisioning