Queue C++ Scaffold Review — vcpkg Discipline Post-Incident

Status: Accepted
Date: 2026-05-13 UTC
Author: software-architect
Refs: #2021, #2028, #2029, #2030, #2031
Governing decisions: [[project_language_tier_philosophy]], [[project_queue_identity_service]], ADR-0076
ADRs produced: ADR-0085, ADR-0086, ADR-0087


1. Context

Queue is the first tier-1 C++ service per [[project_language_tier_philosophy]]. It owns customer source-of-truth, RBAC, sessions, audit, and Stripe billing per [[project_queue_identity_service]]. The decision to build in C++ was made explicitly and correctly (ADR-0076): quality over timeline, no planned rewrite.

On 2026-05-13 UTC, three back-to-back vcpkg/Dockerfile bugs surfaced during the first staging deploy for #2021. Each took a full Docker build (~20 minutes) to discover. Further bugs of the same class may exist but remain unconfirmed until the full audit runs. This document:

  1. Analyzes the incident and its process root cause.
  2. Classifies the full failure taxonomy for this class of bug.
  3. Establishes dep-pinning discipline for Queue and all future tier-1 C++ services.
  4. Specifies the CI guard (#2030) that closes the detection gap.
  5. Names the current remediation path.
  6. Answers whether the Queue C++ scaffold is fundamentally sound.
  7. Defines how this discipline propagates to future tier-1 services.

This is a process and culture review as much as a technical one. The bugs themselves were trivial to fix. The problem is that they should not have reached a deploy attempt.


2. Invariants

These TradeMasterAPI invariants are material to this review:

| # | Invariant |
|---|---|
| I-1 | Audit trail for every state change. vcpkg.json is a build contract. A change that has not been locally validated before PR open is a state change without verification, the equivalent of an unreviewed schema migration. |
| I-2 | No stored credentials. Not directly implicated here, but the same discipline that prevents credential drift prevents dependency contract drift. |
| I-3 | Paper-first gating. Analogously: build-first gating. A vcpkg.json that has never been run through vcpkg install is analogous to a strategy that has never run in paper mode. It must not be deployed to production. |

Additional process invariant added by this review:

P-1: A vcpkg.json must be verified against its pinned builtin-baseline via a clean vcpkg install in a fresh container before any PR that touches it is opened. No exceptions. No fix-forward.


3. Section A — Incident Summary

Timeline (all UTC 2026-05-13)

Time Event
~14:00 First deploy attempt for #2021. Docker build fails inside the vcpkg install step.
~14:20 Root cause: --depth 1 in git clone for vcpkg means the builtin-baseline SHA (3508985146f1b1d248c67ead13f8f54be5b4f5da) is not in the shallow object store. vcpkg cannot resolve baseline versions. Fixed in #2028: remove --depth 1.
~14:30 Second deploy attempt. New failure: error: drogon does not have a feature named openssl.
~14:50 Root cause: queue/vcpkg.json declared "features": ["postgres", "openssl"] for drogon. At the pinned baseline (3508985146f1b1d248c67ead13f8f54be5b4f5da), drogon's valid feature set is ctl, mysql, orm, postgres, redis, sqlite3, yaml. No openssl feature exists: openssl is a separate top-level dependency in the manifest, not a user-selectable drogon feature. Fixed in #2031: remove openssl from drogon's feature list.
~15:00 Third deploy attempt. New failure: libpqxx 7.9.1 does not exist at the pinned baseline. The vcpkg port registry at that baseline SHA jumps from 7.9.0#1 directly to 7.9.2. The declared "version>=": "7.9.1" cannot be satisfied.
~15:10 SRE escalates per the "no more fix-forwards" rule. Pattern is clear: queue/vcpkg.json was authored without ever running vcpkg install against its pinned builtin-baseline.

Bug catalogue

| # | Bug | Category | PR |
|---|---|---|---|
| 1 | --depth 1 prevents baseline SHA resolution | Dockerfile infrastructure | #2028 |
| 2 | openssl is not a valid drogon feature at pinned baseline | Feature-not-defined-at-baseline | #2031 |
| 3 | libpqxx 7.9.1 does not exist at pinned baseline | Version-not-in-registry-at-baseline | Unfixed (Card A) |
| 4+ | Possible: sentry-native, jwt-cpp, gtest, spdlog, nlohmann-json version constraints not verified | Same class as bug 3 | Unfixed (Card A) |

Root cause analysis: why did this happen?

The root cause is a process gap, not a coding error.

queue/vcpkg.json was authored during the initial scaffold work, which per ADR-0076's timeline estimate was expected to take 4-6 days and is explicitly noted as "the riskiest unknown." The author declared dependency versions that seemed reasonable from documentation and from current vcpkg HEAD — but never ran vcpkg install against the pinned builtin-baseline SHA in a fresh container.

This is structurally identical to authoring a database migration without running it locally first. The vcpkg manifest is a build contract. Its builtin-baseline locks the port registry to a specific point in time. Any version>= declaration that is not satisfiable at that exact baseline SHA will fail at build time — and the only way to verify it is to run vcpkg install.

The specific contributing factors:

  1. No local validation gate. There is no step in the PR checklist or SDLC SOP that requires the author to run vcpkg install before opening a PR. There is no CI step that runs it on PR.

  2. Baseline was chosen by SHA, not by iterating from a known-good state. The baseline 3508985146f1b1d248c67ead13f8f54be5b4f5da was pinned without verifying that all declared versions exist at that SHA. The correct workflow is: pick a baseline, run vcpkg install, observe what it actually resolves, update the manifest to match.

  3. First-time C++ infrastructure tax. ADR-0076 Risk R-2 explicitly called this out: "The first container build WILL have problems." What it did not anticipate is that the problems would arrive serially, each requiring a 20-minute build iteration to discover. The risk was correctly identified but the mitigation (validate locally before CI) was not enforced.

  4. No vcpkg-lock.json committed. vcpkg supports a machine-generated lockfile. Without it, vcpkg install is the only mechanism to verify the manifest is consistent. With a lockfile committed, the author proves the manifest resolves by the act of generating the lock.

Why fix-forward is wrong here

Per [[feedback_no_naive_conflict_resolver]], the "audit-once-not-iterate-bug-by-bug" principle applies. Fix-forward (applying one-line fix PRs sequentially, one per 20-minute build cycle) is the wrong remediation pattern because:

  1. It does not enumerate the failure space. Three bugs surfaced from three deploy attempts. There are likely more (all 9 packages in queue/vcpkg.json are unverified against the pinned baseline). Fix-forwarding discovers them one at a time at 20 minutes each.

  2. It does not close the process gap. Fixing bug 3 does not prevent bug 4. Only a complete audit of all packages (Card A) and a CI guard (Card B) close the loop.

  3. It creates a misleading history. A series of one-line fix PRs obscures the root cause in the PR history. A single Card A PR that audits all packages and documents the verified state is the correct artifact.

The operator's "no more fix-forwards" rule is correct. Card A is one PR that audits all 9 packages at once.


4. Section B — Failure-Class Taxonomy

Class 1: Version-not-in-registry-at-baseline

What causes it: A version>= or version constraint in vcpkg.json specifies a version that does not exist in the vcpkg port registry at the pinned builtin-baseline SHA. The registry at any given SHA has a specific set of available versions per package; if the declared minimum is between two available versions, or above all available versions, vcpkg install fails immediately.

Examples: libpqxx 7.9.1 at baseline 3508985146f1b1d248c67ead13f8f54be5b4f5da — registry jumps from 7.9.0#1 to 7.9.2, so 7.9.1 does not exist.

How to detect before deploy: Run vcpkg install --triplet x64-linux --x-manifest-root=<dir> in a fresh container cloned from the pinned baseline. Fails immediately with "version X not found in database."

How to fix: Change version>= to the nearest available version at the baseline that satisfies the need. Or bump the baseline to a SHA where the desired version exists.

CI guard: vcpkg install --dry-run on PR catches this in seconds (see Section D).
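
The Class 1 condition can be illustrated with a minimal Python sketch. It assumes the versions-database layout used by the vcpkg repository (versions/<first letter>-/<port>.json, holding a "versions" array whose entries carry one of the version, version-semver, version-date, or version-string keys plus a port-version). The helper names are hypothetical, and this is an illustration of why the constraint fails, not a substitute for running vcpkg install.

```python
import json
import subprocess

# Keys a versions-database entry may use for its version text.
VERSION_KEYS = ("version", "version-semver", "version-date", "version-string")


def load_versions_db(vcpkg_root: str, baseline_sha: str, port: str) -> dict:
    """Read a port's versions file as it existed at the baseline commit."""
    path = f"versions/{port[0]}-/{port}.json"
    out = subprocess.run(
        ["git", "-C", vcpkg_root, "show", f"{baseline_sha}:{path}"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def version_in_database(db: dict, version: str, port_version: int = 0) -> bool:
    """True if the exact version (and port-version) appears in the database."""
    for entry in db.get("versions", []):
        text = next((entry[k] for k in VERSION_KEYS if k in entry), None)
        if text == version and entry.get("port-version", 0) == port_version:
            return True
    return False
```

Calling version_in_database on the libpqxx database loaded at the pinned baseline with "7.9.1" would be expected to return False if the registry really does skip from 7.9.0#1 to 7.9.2, as observed in the incident.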

Class 2: Feature-not-defined-at-baseline

What causes it: A features list in a dependency block names a feature that the port does not define at the pinned baseline SHA. vcpkg port manifests evolve; a feature that exists in current HEAD may not have existed at an older baseline.

Examples: drogon openssl feature at baseline 3508985146f1b1d248c67ead13f8f54be5b4f5da — drogon at that SHA defines ctl, mysql, orm, postgres, redis, sqlite3, yaml. No openssl feature.

How to detect before deploy: Same vcpkg install run. Fails immediately with "package X does not have a feature named Y."

How to fix: Remove the invalid feature, or replace it with the correct mechanism (in this case, openssl is a top-level dep, not a drogon feature).

CI guard: Same vcpkg install --dry-run catches this.
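
A similar sketch for Class 2, assuming the port's own vcpkg.json (read at the baseline commit) lists its features either as an object keyed by feature name or, in older ports, as an array of objects with a "name" field. The function name is hypothetical, and reserved feature names are not handled here.

```python
def unknown_features(port_manifest: dict, requested: list[str]) -> list[str]:
    """Return requested feature names that the port does not define."""
    features = port_manifest.get("features", {})
    if isinstance(features, list):
        # Older layout: a list of {"name": ..., "description": ...} objects.
        defined = {feature["name"] for feature in features}
    else:
        # Current layout: an object keyed by feature name.
        defined = set(features)
    return [name for name in requested if name not in defined]
```

With drogon's feature set from the incident (ctl, mysql, orm, postgres, redis, sqlite3, yaml), a request for postgres and openssl reports only openssl as unknown.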

Class 3: Cross-package version incompatibility

What causes it: Two packages in the manifest have conflicting transitive version requirements that vcpkg cannot satisfy simultaneously. Unlike Python (pip/poetry), vcpkg resolves dependencies at the binary level; some version conflicts produce linker errors that --dry-run may not catch.

Current exposure: Not yet observed in Queue. Possible as the dep set grows (jwt-cpp + openssl version coupling; sentry-native + curl version coupling).

How to detect: Only a full build against the resolved package set will catch linker-level conflicts. --dry-run catches resolution-phase conflicts.

How to fix: Add an overrides block in vcpkg.json to pin the conflicting package, or bump the baseline.

CI guard: A full build in CI (the existing Docker build step) catches this. The --dry-run guard does not prevent this class — it catches resolution failures, not link-time failures.

Class 4: Build-cache pollution

What causes it: GH Actions vcpkg binary cache (VCPKG_DEFAULT_BINARY_CACHE) retains a compiled binary for a package at a previous version. If the manifest changes a version constraint and the cache key does not change (i.e., hashFiles('queue/vcpkg.json') alone is used), stale cached binaries can be used silently, producing a binary that does not match the current manifest.

Current exposure: The Dockerfile's RUN mkdir -p /opt/vcpkg-cache step and GH Actions cache keyed on hashFiles('queue/vcpkg.json') are the current cache mechanism. If the baseline SHA changes but the manifest file's hash does not (impossible in practice — the baseline is in vcpkg.json), the cache could be stale. In practice this is low-risk because the cache key includes vcpkg.json content.

How to detect: Periodically run a full cold-cache build in CI (e.g., monthly or on baseline bump). Compare the vcpkg binary ABI log.

How to fix: Bust the cache by changing the cache key strategy (include vcpkg HEAD SHA in the key, not just vcpkg.json hash).

CI guard: Low priority for v1. Revisit when the first baseline bump lands.
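
One way to express the cache-key fix is sketched below, assuming the key is computed by a small helper before the cache step runs. The function name and the vcpkg-bin- prefix are hypothetical; the point is only that the key should change whenever the manifest content or the vcpkg tool commit changes.

```python
import hashlib


def binary_cache_key(manifest_bytes: bytes, vcpkg_tool_sha: str,
                     triplet: str = "x64-linux") -> str:
    """Derive a cache key from the manifest content, tool commit, and triplet."""
    digest = hashlib.sha256()
    for part in (manifest_bytes, vcpkg_tool_sha.encode(), triplet.encode()):
        # Length-prefix each part so adjacent parts cannot blur together.
        digest.update(len(part).to_bytes(8, "big"))
        digest.update(part)
    return f"vcpkg-bin-{digest.hexdigest()[:16]}"
```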

Class 5: Stale baseline

What causes it: The builtin-baseline SHA was correct when pinned, but as the codebase ages and dependencies need updating, the gap between the baseline SHA and current vcpkg HEAD widens. A stale baseline means any new package added to the manifest must have a version that exists at the old SHA — limiting which versions are available. Over time, the pinned baseline becomes a constraint that blocks legitimate dependency updates.

Current exposure: Baseline 3508985146f1b1d248c67ead13f8f54be5b4f5da was current as of Queue's scaffolding (approximately 2026-05-11 UTC). It will age.

How to detect: When adding a new package or updating a version constraint fails with "version not in database," it often indicates the baseline needs bumping. vcpkg's x-update-baseline command can identify an appropriate newer baseline.

How to fix: Bump builtin-baseline to a recent SHA. Run vcpkg install in a fresh container. Address any version constraint changes the bump surfaces.

Ownership: A quarterly baseline bump is sufficient for v1 stage. Assign to the engineer who last touched queue/vcpkg.json.


5. Section C — Dep-Pinning Discipline for Queue

The mandatory validation step

Before any PR that adds or changes queue/vcpkg.json is opened, the author must run:

# From inside a fresh gcc:13-bookworm container, with a full (non-shallow) vcpkg clone:
git clone https://github.com/microsoft/vcpkg.git /opt/vcpkg
/opt/vcpkg/bootstrap-vcpkg.sh -disableMetrics
/opt/vcpkg/vcpkg install --triplet x64-linux --x-manifest-root=/path/to/queue/

This must succeed without errors before the PR is opened. The PR description must state: "Verified: vcpkg install ran clean in container on [date] UTC."

This is P-1 from the invariants section. It is not optional. The CI guard (Card B) enforces this post-PR but the author's local run is the first gate.

A convenience script at queue/scripts/verify-vcpkg-local.sh (Card C) documents the exact Docker invocation for authors who do not want to manually manage the container.
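
As a rough illustration of what that invocation could look like, the sketch below builds a docker argument list from the three commands above. Everything beyond those commands is an assumption: the /src/queue mount point, the read-only mount, and the use of --x-install-root to keep vcpkg's output outside the mounted source tree. The gcc:13-bookworm image may also need extra packages (for example zip, cmake, or ninja) before bootstrap and install succeed.

```python
def verify_command(queue_dir: str, image: str = "gcc:13-bookworm") -> list[str]:
    """Build a docker argv that runs a clean vcpkg install for queue_dir."""
    inner = " && ".join([
        # Full clone on purpose: a shallow clone cannot resolve the baseline.
        "git clone https://github.com/microsoft/vcpkg.git /opt/vcpkg",
        "/opt/vcpkg/bootstrap-vcpkg.sh -disableMetrics",
        "/opt/vcpkg/vcpkg install --triplet x64-linux"
        " --x-manifest-root=/src/queue"
        " --x-install-root=/tmp/vcpkg_installed",
    ])
    return [
        "docker", "run", "--rm",
        "-v", f"{queue_dir}:/src/queue:ro",  # hypothetical mount point
        image, "bash", "-c", inner,
    ]
```

Passing the returned list to subprocess.run with check=True would make a non-zero vcpkg exit code fail the local verification.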

Pin policy: version>= vs exact version

This is a real tradeoff (see ADR-0085):

| Policy | Pros | Cons |
|---|---|---|
| version>= (current) | Looser: less maintenance overhead when bumping baseline; allows automatic minor/patch upgrades within the same baseline | Versions that satisfy >= but are not the one you tested can introduce behavioral differences; harder to reproduce exact state |
| Exact version | Fully reproducible state; every dev and CI run resolves identically; the PR review can verify exact versions | Requires explicit update PR to upgrade any package; more maintenance churn on baseline bumps |

Recommendation for Queue: Use exact version for all packages. Queue is a tier-1 service handling billing PII and money-state mutations. Reproducibility and auditability outweigh the minor maintenance overhead. The CI guard (Card B) will catch version-not-available errors at PR time, making the maintenance cost low.

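The transition from version>= to exact pins is part of Card A (the baseline audit). In a vcpkg manifest, a dependency entry accepts only version>=; an exact pin is expressed as an entry in the top-level overrides array.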
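
A minimal sketch of that transition, assuming exact pins are written as entries in vcpkg's top-level overrides array. The function name is hypothetical, the resolved versions are whatever the Card A audit observes, and whether the existing version>= entries should also be adjusted is left to that audit.

```python
def pin_exact(manifest: dict, resolved: dict[str, str]) -> dict:
    """Return a copy of manifest with one overrides entry per resolved package."""
    pinned = dict(manifest)
    # Start from any overrides already present so they are not lost.
    entries = {entry["name"]: entry for entry in manifest.get("overrides", [])}
    for name, version in resolved.items():
        entries[name] = {"name": name, "version": version}
    # Sort by name so regenerated manifests produce stable diffs.
    pinned["overrides"] = sorted(entries.values(), key=lambda e: e["name"])
    return pinned
```

The input manifest is left untouched, so the result can be diffed against the original before it is written back to queue/vcpkg.json.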

Baseline-update cadence

| Trigger | Action |
|---|---|
| A new package is added that does not exist at the current baseline | Bump baseline to a SHA where the package exists. Run vcpkg install clean. Update all version constraints to match new baseline. |
| A security advisory requires a dep upgrade that the current baseline cannot satisfy | Emergency baseline bump. Same procedure. |
| Quarterly cadence (planned) | Engineer who last touched queue/vcpkg.json runs vcpkg x-update-baseline to get a candidate newer SHA. Runs vcpkg install clean. Files a PR with the result. No urgency required if no security or compatibility trigger. |

Baseline bump is always a standalone PR. It is never combined with feature work. This isolates the "changed 5 package versions simultaneously" surface from application logic changes.

Documentation that lives next to vcpkg.json

A queue/vcpkg-notes.md file (Card C scope) must contain:

  1. The date the current baseline SHA was last verified.
  2. For each dependency: why it was chosen (one sentence), and whether any features were deliberately excluded.
  3. Any overrides block entries and the reason.
  4. The last person to run a clean local vcpkg install and on what date.

This is not a heavyweight document. It is 2-3 paragraphs. Its purpose is to make the next engineer's baseline bump faster by recording what was considered.


6. Section D — CI Guard (#2030 integration)

What the guard does

A new GH Actions job vcpkg-manifest-check triggers on any PR that touches queue/vcpkg.json or queue/Dockerfile. It:

  1. Checks out the PR branch.
  2. Clones vcpkg (full clone, not shallow) at the baseline SHA specified in queue/vcpkg.json.
  3. Runs vcpkg install --dry-run --triplet x64-linux --x-manifest-root=queue/.
  4. Exits with the vcpkg exit code. Non-zero = PR check fails.
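
The steps above can be sketched as two small helpers, assuming the guard is driven by a script rather than inline workflow YAML. The function names are hypothetical; the vcpkg arguments are the ones listed in step 3.

```python
import json
import re

SHA_RE = re.compile(r"[0-9a-f]{40}")


def read_baseline(manifest_text: str) -> str:
    """Extract builtin-baseline from vcpkg.json and check it looks like a SHA."""
    baseline = json.loads(manifest_text).get("builtin-baseline", "")
    if not SHA_RE.fullmatch(baseline):
        raise ValueError("builtin-baseline is missing or not a 40-character SHA")
    return baseline


def dry_run_argv(vcpkg_root: str, manifest_root: str) -> list[str]:
    """Argument list for the resolution-only install used by the guard."""
    return [
        f"{vcpkg_root}/vcpkg", "install", "--dry-run",
        "--triplet", "x64-linux",
        f"--x-manifest-root={manifest_root}",
    ]
```

Running the argv with subprocess.run and returning its returncode as the job's exit status gives the pass/fail behavior described in step 4.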

What failure shape it produces

The --dry-run flag causes vcpkg to:

  - Resolve all version constraints against the registry at the baseline SHA.
  - Validate all feature names against port manifests at that SHA.
  - Print the resolved package graph.
  - Exit 0 if the graph is consistent; non-zero otherwise.

The error messages that would have caught today's bugs (wording is approximate; the exact text depends on the vcpkg tool version):

# Bug 2 (would have been caught):
error: drogon does not have a feature named 'openssl'

# Bug 3 (would have been caught):
error: libpqxx@7.9.1: could not satisfy dependency constraints
  available versions: 7.9.0, 7.9.0#1, 7.9.2, ...

Note: --dry-run does not compile packages. It only resolves the graph. It catches Class 1 and Class 2 failures (see Section B). It does not catch Class 3 (link-time conflicts) — the full Docker build job catches those.

Note: Bug 1 (shallow clone preventing baseline resolution) is not caught by the dry-run itself. The guard performs its own full clone, so if the Dockerfile were to reintroduce --depth 1, the guard would still pass. To catch that regression, a separate step should verify that the Dockerfile does not use --depth for the vcpkg clone (a grep check, ~3 lines).
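
A sketch of such a Dockerfile check, written in Python rather than grep so that a RUN instruction split across continuation lines is still caught. The function name is hypothetical, and the match is deliberately coarse: any instruction mentioning git clone, --depth, and vcpkg together is flagged, which could produce a false positive if an unrelated shallow clone shares a RUN with the vcpkg clone.

```python
def has_shallow_vcpkg_clone(dockerfile_text: str) -> bool:
    """True if a Dockerfile instruction shallow-clones vcpkg."""
    # Join backslash-continued lines so a multi-line RUN is checked as one.
    joined = dockerfile_text.replace("\\\n", " ")
    for line in joined.splitlines():
        if line.lstrip().startswith("#"):
            continue  # ignore comment lines
        if "git clone" in line and "--depth" in line and "vcpkg" in line:
            return True
    return False
```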

Caching strategy

The --dry-run step does not compile packages, so it is fast (~30-60 seconds). However, the vcpkg clone itself is ~600MB and slow on a cold runner. Cache it:

- uses: actions/cache@v3
  with:
    path: /opt/vcpkg
    key: vcpkg-tool-${{ hashFiles('queue/vcpkg.json') }}
    restore-keys: vcpkg-tool-

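The cache key includes the vcpkg.json hash. When vcpkg.json is unchanged, the cached clone is reused as-is. When it changes (including on a baseline bump), the exact key misses and the restore-keys prefix restores the most recent earlier clone, which the job must then update with git fetch so that the new baseline SHA is present. Total job time: roughly 30-60 seconds on a warm cache, 2-3 minutes on cold.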

This does NOT use the same cache as the full Docker build's binary cache. The guard only needs the vcpkg tool and the port registry; it never compiles anything.

Tradeoff analysis

| Dimension | Cost | Benefit |
|---|---|---|
| GH Actions minutes | ~0.5-3 min per PR touching queue/vcpkg.json (warm/cold) | Catches Class 1 + Class 2 bugs in seconds vs 20-min build |
| Maintenance | Minimal: the check is a vcpkg CLI invocation, not custom logic | No false positives once the manifest is in a valid state |
| Timing | Catches bug at PR review time, not at deploy attempt | Prevents the "3 bugs in 3 deploy attempts" pattern entirely |

The tradeoff is unambiguously worth it. A few minutes at most per PR that touches queue/vcpkg.json is negligible next to the 60+ minutes of deploy-attempt time lost on 2026-05-13 UTC.


7. Section E — Current State Remediation (Card A)

The operator selected Option B: a single, complete audit PR rather than continued fix-forwarding.

Card A (defined in Section 15) performs the following in one PR:

  1. Spin up gcc:13-bookworm container with a full vcpkg clone at builtin-baseline=3508985146f1b1d248c67ead13f8f54be5b4f5da.
  2. For each of the 9 packages in queue/vcpkg.json, identify the nearest available version at that baseline that satisfies the service's actual requirement.
  3. Migrate all version>= constraints to exact version constraints at the resolved versions.
  4. Resolve the libpqxx 7.9.1 bug: update to 7.9.2 or the nearest available.
  5. Confirm no other packages exhibit the same problem (sentry-native, jwt-cpp, gtest, spdlog, nlohmann-json, curl, openssl, drogon).
  6. Run vcpkg install clean. Zero errors. Zero warnings about unavailable versions.
  7. Optionally commit a vcpkg-lock.json if the installed set can be locked (see Section F).
  8. Add queue/vcpkg-notes.md documenting the baseline verification date and per-package rationale.

This PR unblocks #2021.

Card A does NOT modify application code. It is a queue/vcpkg.json + queue/vcpkg-notes.md change only.


8. Section F — Should We Change the Queue C++ Scaffold?

Drogon version: stay at 1.9.6 or upgrade baseline?

Recommendation: keep baseline 3508985146f1b1d248c67ead13f8f54be5b4f5da, which pins Drogon at 1.9.6, for v1.

Rationale:

The one exception: if Card A's audit reveals that no stable set of versions is satisfiable at the current baseline (unlikely given Drogon 1.9.6 was chosen from this baseline), then bumping to a newer baseline becomes part of Card A.

Should vcpkg-lock.json be committed?

vcpkg can write a machine-generated vcpkg-lock.json. The exact mechanism for generating and consuming it depends on the vcpkg tool release and should be confirmed against the documentation for the pinned tool version before it is relied on. The tradeoff (see ADR-0086):

| Aspect | With vcpkg-lock.json | Without |
|---|---|---|
| Reproducibility | Fully reproducible: every build gets the exact same binary set | Reproducible within a baseline but version>= allows variation |
| Proof of authorship | Committing the lock proves the author ran vcpkg install | Author must self-attest |
| Maintenance overhead | Lock must be regenerated on every dep change | No regeneration step |
| Noise in PR diffs | Lock file is large and machine-generated; clutters diff | Cleaner PRs |

Recommendation: Do not commit vcpkg-lock.json initially. The mandatory local validation step (P-1) and the exact-version pin policy (Section C) together provide equivalent reproducibility guarantees without the maintenance overhead of a regenerated lockfile. If Queue's dep set grows or baseline drift becomes a recurring problem, revisit locking in a future Card F sprint.


9. Section G — Process Propagation to Future Tier-1 Services

If a second C++ service ships (Velvet rewrite, MQ-A rewrite), it will start with a vcpkg.json. Without documented discipline, it will repeat today's incident.

Tier-1 C++ service onboarding checklist

This checklist lives at docs/sdlc/cpp-service-onboarding.md (Card C scope). Every new tier-1 C++ service scaffold PR must have this checklist completed in the PR description:

[ ] vcpkg.json authored with exact `version` constraints (not `version>=`)
[ ] builtin-baseline SHA chosen and documented in vcpkg-notes.md
[ ] vcpkg install run clean in a fresh gcc:13-bookworm container against the pinned baseline
[ ] All packages verified: no "version not in database" errors, no "feature not found" errors
[ ] Dockerfile uses full git clone (no --depth 1) for vcpkg
[ ] vcpkg-notes.md added alongside vcpkg.json with baseline date + per-package rationale
[ ] CI vcpkg-manifest-check job is wired for the new service's path
[ ] AddressSanitizer + UBSan enabled in the debug/CI build
[ ] No raw new/delete in any path touching PII or money state
[ ] Secrets read from Infisical at startup; no secrets in build system or Dockerfile ARGs

Who owns the SOP

The docs/sdlc/cpp-service-onboarding.md document is owned by the software-architect role. It is updated when:

  - A new C++ service is added (checklist validated against the new service's reality).
  - A new failure class is discovered that the checklist does not cover.

feature-developer claims the checklist before opening a scaffold PR. card-groomer verifies the checklist is present in the PR description before accepting the card.
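
The card-groomer check could be partly automated. The sketch below assumes the PR description carries the checklist as "[ ]" / "[x]" lines, with or without a leading list dash, and reports any item left unchecked. The function name is hypothetical, and the sketch does not verify that all ten items are present.

```python
import re

# Matches "[ ] text", "[x] text", "- [ ] text", and "* [X] text".
ITEM_RE = re.compile(r"^\s*(?:[-*]\s+)?\[( |x|X)\]\s+(.*\S)\s*$")


def unchecked_items(pr_body: str) -> list[str]:
    """Return the text of every checklist item that is still unchecked."""
    unchecked = []
    for line in pr_body.splitlines():
        match = ITEM_RE.match(line)
        if match and match.group(1) == " ":
            unchecked.append(match.group(2))
    return unchecked
```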

Does today's incident warrant pausing tier-1 C++ ambitions?

No. The Queue C++ scaffold is fundamentally sound.

The incident exposed a process gap (no mandatory local validation before PR), not an architectural gap. The application stack (Drogon framework selection, libpqxx, CMake/sqitch/ninja pipeline, multi-stage Dockerfile, Heroku container deployment model, GoogleTest suite) is correct. ADR-0076's stack choices were made with alternatives considered; none of them is implicated by today's bugs.

What today's incident reveals is that the first time a team builds C++ infrastructure, the build contract (vcpkg.json) demands the same verification rigor as application code — but this step was not yet encoded in the process. Card B (CI guard) and Card C (SOP document) close this gap permanently.

The tier-1 C++ ambition is correct. The process that enforces it was incomplete. This review completes it.


10. Sequence: Card A Unblocking #2021

sequenceDiagram
    participant Dev as feature-developer
    participant Docker as gcc:13-bookworm container
    participant vcpkg as vcpkg (full clone, pinned baseline)
    participant PR as GitHub PR (Card A)
    participant CI as GH Actions (Card B guard)
    participant 2021 as Issue #2021

    Dev->>Docker: docker run gcc:13-bookworm
    Dev->>vcpkg: git clone vcpkg (full clone, no --depth); bootstrap
    Dev->>vcpkg: vcpkg install --x-manifest-root=queue/
    vcpkg-->>Dev: Resolve 9 packages; note version errors
    Dev->>Dev: Fix version constraints (exact pin at nearest available)
    Dev->>vcpkg: vcpkg install (repeat until clean)
    Dev->>PR: Open Card A PR (vcpkg.json + vcpkg-notes.md)
    PR->>CI: vcpkg-manifest-check job runs (Card B guard)
    CI-->>PR: PASS
    PR-->>2021: Merges; blocks lifted; Docker build proceeds

11. Migrations

Not applicable for this review. No schema changes. No application code changes.


12. Rollout Plan

| Step | Action | Owner |
|---|---|---|
| Immediate | Card A: full vcpkg.json audit + exact-pin PR | feature-developer |
| Within 3 days | Card B: CI vcpkg-manifest-check job | feature-developer |
| Within 5 days | Card C: cpp-service-onboarding.md SOP | software-architect |
| Within 5 days | Card D: document pinning policy decision (exact vs >=) | software-architect |
| Post-v1 (optional) | Card E: baseline bump to current vcpkg HEAD | feature-developer |
| Post-v1 (optional) | Card F: evaluate vcpkg-lock.json discipline | feature-developer |

Card A is the critical-path unblock for #2021. All other cards harden the process but do not block #2021.


13. Security Considerations

This incident has no direct security impact: the bugs are in build-time dependency resolution, not in runtime behavior or credential handling. The vcpkg.json does not contain secrets. No invariants were violated.

However, the following security-adjacent implications apply:

No credential, PII, audit, or DSR implications.


14. Open Questions

OQ-1 — Baseline bump scope for Card A: If auditing the current baseline reveals more than 1-2 packages need version adjustments, should Card A simultaneously bump to a newer baseline where all packages have clean version availability? Or should it stay at the current baseline and work within its constraints? Recommendation: stay at current baseline for Card A; baseline bump is Card E.

OQ-2 — CI runner for vcpkg-manifest-check: The vcpkg install --dry-run step is Linux-only (x64-linux triplet). GH Actions ubuntu-latest is correct. If the project migrates to Ubicloud runners (per [[project_ci_billing]]), confirm the runner image has git and basic build tools available (they do on standard Ubicloud images). No decision needed now.

OQ-3 — vcpkg-lock.json format stability: vcpkg's lockfile format has changed across versions. If we adopt lockfiles (Card F), pin the vcpkg tool version alongside the baseline SHA. This is a decision for Card F, not now.


15. Sub-Cards (for PM to file)

The following cards are scoped for feature-developer or software-architect to claim after card-groomer grooming. Ordered by criticality.


Card A — Queue vcpkg.json full audit + lock against pinned baseline

Blocks: #2021
Size: M (2-4 hours; container work + iteration)
Risk: Medium (could surface additional version mismatches requiring judgment calls)
Dependencies: None — can be claimed immediately.

Title: fix(queue): audit all vcpkg.json packages against pinned baseline; exact-pin versions

Body: The three deploy bugs on 2026-05-13 UTC reveal that queue/vcpkg.json was never run through vcpkg install against its pinned builtin-baseline=3508985146f1b1d248c67ead13f8f54be5b4f5da. This card completes that audit and converts all version>= constraints to exact pinned versions.

Acceptance criteria:

- [ ] Spin up gcc:13-bookworm container; full (non-shallow) vcpkg clone.
- [ ] Run vcpkg install --triplet x64-linux --x-manifest-root=queue/ against the pinned baseline.
- [ ] For each of the 9 packages: identify the exact available version at the baseline SHA. Convert version>= to version at the resolved value.
- [ ] Resolve libpqxx 7.9.1 → nearest available (expected: 7.9.2).
- [ ] Confirm no other packages have unavailable versions at the baseline.
- [ ] vcpkg install exits 0 with no errors.
- [ ] Add queue/vcpkg-notes.md: baseline SHA, verification date (UTC), per-package rationale (one sentence each).
- [ ] PR description states: "Verified: vcpkg install ran clean in container on [date] UTC."
- [ ] Do NOT modify application code. vcpkg.json + vcpkg-notes.md only.

Refs: #2021, docs/architecture/queue-cpp-scaffold-review-2026-05-13.md


Card B — Implement CI vcpkg dry-run guard (#2030)

Blocks: Future vcpkg.json changes
Size: M (half-day; GH Actions YAML + caching setup)
Risk: Low
Dependencies: Card A (should be green before the guard is wired so the initial check passes).

Title: reliability(ci): add vcpkg-manifest-check job for queue/vcpkg.json changes

Body: Closes #2030. Adds a GH Actions job that runs vcpkg install --dry-run on every PR touching queue/vcpkg.json or queue/Dockerfile. Catches Class 1 (version-not-in-registry) and Class 2 (feature-not-defined) failures at PR time instead of at deploy attempt.

Acceptance criteria:

- [ ] New job vcpkg-manifest-check in .github/workflows/ (new file or added to existing queue CI workflow).
- [ ] Triggers on pull_request when queue/vcpkg.json or queue/Dockerfile is in the changed files.
- [ ] Runs vcpkg install --dry-run --triplet x64-linux --x-manifest-root=queue/ using a full (non-shallow) vcpkg clone at the builtin-baseline SHA from the PR's queue/vcpkg.json.
- [ ] GH Actions cache keyed on hashFiles('queue/vcpkg.json') for the vcpkg tool clone. Warm-cache job time target: under 60 seconds.
- [ ] Job fails the PR on non-zero vcpkg exit code.
- [ ] Add a grep step that fails if queue/Dockerfile contains git clone --depth for the vcpkg clone (prevents regression of #2028).
- [ ] Tested: a deliberately-broken vcpkg.json (invalid feature name) fails the check; a valid vcpkg.json passes.
- [ ] Does NOT add time to PRs that don't touch queue/vcpkg.json or queue/Dockerfile.

Refs: #2030, docs/architecture/queue-cpp-scaffold-review-2026-05-13.md


Card C — SDLC doc: C++ service onboarding checklist

Blocks: Future tier-1 C++ services (Velvet rewrite, MQ-A rewrite)
Size: S (2-3 hours; doc-only)
Risk: Low
Dependencies: None. Can be written immediately.

Title: docs(sdlc): add cpp-service-onboarding.md with vcpkg discipline checklist

Body: Creates docs/sdlc/cpp-service-onboarding.md. Documents the mandatory checklist every new C++ service scaffold must complete before the scaffold PR is opened. Prevents the 2026-05-13 UTC incident from recurring on Velvet or MQ-A.

Acceptance criteria:

- [ ] New file at docs/sdlc/cpp-service-onboarding.md.
- [ ] Contains the 10-item checklist from Section G of docs/architecture/queue-cpp-scaffold-review-2026-05-13.md.
- [ ] Documents the exact docker run command for local vcpkg verification.
- [ ] Documents the exact vcpkg version vs version>= policy decision (per ADR-0085).
- [ ] Documents the baseline-bump cadence (quarterly or security-triggered).
- [ ] Links to docs/architecture/queue-cpp-scaffold-review-2026-05-13.md as the incident background.
- [ ] Doc is under 500 words. It is a checklist, not an essay.

Refs: docs/architecture/queue-cpp-scaffold-review-2026-05-13.md


Card D — Document and lock the Queue dep-pinning policy decision

Blocks: Card B (the CI guard needs to know which policy is enforced)
Size: S (1-2 hours; ADR update)
Risk: Low
Dependencies: None.

Title: docs(adr): ADR-0085 - vcpkg version pinning policy for tier-1 C++ services

Body: ADR-0085 is drafted in docs/architecture/adr/0085-vcpkg-version-pinning-policy.md. It records the decision: exact version pinning (not version>=) for all tier-1 C++ services, with rationale and alternatives considered. This card moves it from "Proposed" to "Accepted" after operator review.

Acceptance criteria:

- [ ] ADR-0085 status changed from "Proposed" to "Accepted."
- [ ] Any final objections to the exact-version policy surface and are addressed in the ADR's Consequences section.
- [ ] docs/architecture/queue-cpp-scaffold-review-2026-05-13.md §C references the accepted ADR.

Refs: ADR-0085, docs/architecture/queue-cpp-scaffold-review-2026-05-13.md


Card E (optional) — Bump vcpkg builtin-baseline to current

Does not block v1 launch.
Size: M (2-4 hours; re-audit all packages at new baseline)
Risk: Medium (new baseline may surface additional version constraint changes)
Dependencies: Card A must be merged and green first.

Title: chore(queue): bump vcpkg builtin-baseline to current HEAD; re-verify all packages

Body: Optional post-v1 maintenance. Moves queue/vcpkg.json's builtin-baseline from 3508985146f1b1d248c67ead13f8f54be5b4f5da to a recent vcpkg HEAD SHA. Gets newer package versions (potentially including security patches). Card A must be complete and the CI guard (Card B) must be wired before this card is claimed.

Acceptance criteria:

- [ ] builtin-baseline updated to a SHA from within the last 30 days.
- [ ] vcpkg install runs clean in a fresh container at the new baseline.
- [ ] All 9 package version constraints updated to exact versions available at the new baseline.
- [ ] CI vcpkg-manifest-check (Card B) passes on the PR.
- [ ] queue/vcpkg-notes.md updated with new verification date.
- [ ] PR is standalone: no application code changes.

Refs: docs/architecture/queue-cpp-scaffold-review-2026-05-13.md §F


Card F (optional) — Evaluate and adopt vcpkg-lock.json discipline

Does not block v1 launch.
Size: S-M (research + decision + optional implementation)
Risk: Low
Dependencies: Card A, Card B.

Title: chore(queue): evaluate vcpkg-lock.json; adopt or explicitly decline

Body: vcpkg supports a machine-generated lockfile that proves the author ran vcpkg install and provides fully reproducible builds independent of exact-version pinning alone. ADR-0086 records the current decision (no lockfile for v1). This card re-evaluates after v1 launch with operational experience.

Acceptance criteria:

- [ ] Review vcpkg lockfile format stability at current vcpkg version.
- [ ] If adopted: vcpkg-lock.json committed alongside vcpkg.json; regeneration step documented in onboarding checklist (Card C update).
- [ ] If declined again: ADR-0086 updated with additional rationale from operational experience.
- [ ] Either way: decision documented.

Refs: ADR-0086, docs/architecture/queue-cpp-scaffold-review-2026-05-13.md §F