Queue C++ Scaffold Review — vcpkg Discipline Post-Incident
Status: Accepted
Date: 2026-05-13 UTC
Author: software-architect
Refs: #2021, #2028, #2029, #2030, #2031
Governing decisions: [[project_language_tier_philosophy]], [[project_queue_identity_service]], ADR-0076
ADRs produced: ADR-0085, ADR-0086, ADR-0087
1. Context
Queue is the first tier-1 C++ service per [[project_language_tier_philosophy]]. It owns customer source-of-truth, RBAC, sessions, audit, and Stripe billing per [[project_queue_identity_service]]. The decision to build in C++ was made explicitly and correctly (ADR-0076): quality over timeline, no planned rewrite.
On 2026-05-13 UTC, three back-to-back vcpkg/Dockerfile bugs surfaced during the first staging deploy attempt for #2021. Each required a full Docker build (~20 minutes) to discover and fix. Further bugs of the same class are possible but remain unconfirmed until the full manifest is audited. This document:
- Analyzes the incident and its process root cause.
- Classifies the full failure taxonomy for this class of bug.
- Establishes dep-pinning discipline for Queue and all future tier-1 C++ services.
- Specifies the CI guard (#2030) that closes the detection gap.
- Names the current remediation path.
- Answers whether the Queue C++ scaffold is fundamentally sound.
- Defines how this discipline propagates to future tier-1 services.
This is a process and culture review as much as a technical one. The bugs themselves were trivial to fix. The problem is that they should not have reached a deploy attempt.
2. Invariants
These TradeMasterAPI invariants are material to this review:
| # | Invariant |
|---|---|
| I-1 | Audit trail for every state change. vcpkg.json is a build contract. A change that has not been locally validated before PR open is a state change without verification — the equivalent of an unreviewed schema migration. |
| I-2 | No stored credentials. Not directly implicated here, but the same discipline that prevents credential drift prevents dependency contract drift. |
| I-3 | Paper-first gating. Analogously: build-first gating. A vcpkg.json that has never been run through vcpkg install is analogous to a strategy that has never run in paper mode. It cannot be deployed in production state. |
Additional process invariant added by this review:
P-1: A vcpkg.json must be verified against its pinned builtin-baseline via a clean vcpkg install in a fresh container before any PR that touches it is opened. No exceptions. No fix-forward.
3. Section A — Incident Summary
Timeline (all UTC 2026-05-13)
| Time | Event |
|---|---|
| ~14:00 | First deploy attempt for #2021. Docker build fails inside the vcpkg install step. |
| ~14:20 | Root cause: --depth 1 in git clone for vcpkg means the builtin-baseline SHA (3508985146f1b1d248c67ead13f8f54be5b4f5da) is not in the shallow object store. vcpkg cannot resolve baseline versions. Fixed in #2028: remove --depth 1. |
| ~14:30 | Second deploy attempt. New failure: error: drogon does not have a feature named openssl. |
| ~14:50 | Root cause: queue/vcpkg.json declared "features": ["postgres", "openssl"] for drogon. At the pinned baseline (3508985146f1b1d248c67ead13f8f54be5b4f5da), drogon's valid feature set is ctl, mysql, orm, postgres, redis, sqlite3, yaml. No openssl feature exists: drogon picks up openssl as a dependency, not as a user-selectable feature. Fixed in #2031: remove openssl from drogon's feature list. |
| ~15:00 | Third deploy attempt. New failure: libpqxx 7.9.1 does not exist at the pinned baseline. The vcpkg port registry at that baseline SHA jumps from 7.9.0#1 directly to 7.9.2. The declared "version>=": "7.9.1" cannot be satisfied. |
| ~15:10 | SRE escalates per the "no more fix-forwards" rule. Pattern is clear: queue/vcpkg.json was authored without ever running vcpkg install against its pinned builtin-baseline. |
Bug catalogue
| # | Bug | Category | PR |
|---|---|---|---|
| 1 | --depth 1 prevents baseline SHA resolution | Dockerfile infrastructure | #2028 |
| 2 | openssl is not a valid drogon feature at pinned baseline | Feature-not-defined-at-baseline | #2031 |
| 3 | libpqxx 7.9.1 does not exist at pinned baseline | Version-not-in-registry-at-baseline | Unfixed (Card A) |
| 4+ | Possible: sentry-native, jwt-cpp, gtest, spdlog, nlohmann-json version constraints not verified | Same class as bug 3 | Unfixed (Card A) |
Root cause analysis: why did this happen?
The root cause is a process gap, not a coding error.
queue/vcpkg.json was authored during the initial scaffold work, which per ADR-0076's timeline estimate was expected to take 4-6 days and is explicitly noted as "the riskiest unknown." The author declared dependency versions that seemed reasonable from documentation and from current vcpkg HEAD — but never ran vcpkg install against the pinned builtin-baseline SHA in a fresh container.
This is structurally identical to authoring a database migration without running it locally first. The vcpkg manifest is a build contract. Its builtin-baseline locks the port registry to a specific point in time. Any version>= declaration that is not satisfiable at that exact baseline SHA will fail at build time — and the only way to verify it is to run vcpkg install.
The specific contributing factors:
- No local validation gate. There is no step in the PR checklist or SDLC SOP that requires the author to run vcpkg install before opening a PR. There is no CI step that runs it on PR.
- Baseline was chosen by SHA, not by iterating from a known-good state. The baseline 3508985146f1b1d248c67ead13f8f54be5b4f5da was pinned without verifying that all declared versions exist at that SHA. The correct workflow is: pick a baseline, run vcpkg install, observe what it actually resolves, update the manifest to match.
- First-time C++ infrastructure tax. ADR-0076 Risk R-2 explicitly called this out: "The first container build WILL have problems." What it did not anticipate is that the problems would arrive serially, each requiring a 20-minute build iteration to discover. The risk was correctly identified but the mitigation (validate locally before CI) was not enforced.
- No vcpkg-lock.json committed. vcpkg supports a machine-generated lockfile. Without it, vcpkg install is the only mechanism to verify the manifest is consistent. With a lockfile committed, the author proves the manifest resolves by the act of generating the lock.
Why fix-forward is wrong here
Per [[feedback_no_naive_conflict_resolver]]: the "audit-once-not-iterate-bug-by-bug" principle. Fix-forward — applying one-line fix PRs sequentially, one per 20-minute build cycle — is the wrong remediation pattern because:
- It does not enumerate the failure space. Three bugs surfaced from three deploy attempts. There are likely more (all 9 packages in queue/vcpkg.json are unverified against the pinned baseline). Fix-forwarding discovers them one at a time at 20 minutes each.
- It does not close the process gap. Fixing bug 3 does not prevent bug 4. Only a complete audit of all packages (Card A) and a CI guard (Card B) close the loop.
- It creates a misleading history. A series of one-line fix PRs obscures the root cause in the PR history. A single Card A PR that audits all packages and documents the verified state is the correct artifact.
The operator's "no more fix-forwards" rule is correct. Card A is one PR that audits all 9 packages at once.
4. Section B — Failure-Class Taxonomy
Class 1: Version-not-in-registry-at-baseline
What causes it: A version>= or version constraint in vcpkg.json specifies a version that does not exist in the vcpkg port registry at the pinned builtin-baseline SHA. The registry at any given SHA has a specific set of available versions per package; if the declared minimum is between two available versions, or above all available versions, vcpkg install fails immediately.
Examples: libpqxx 7.9.1 at baseline 3508985146f1b1d248c67ead13f8f54be5b4f5da — registry jumps from 7.9.0#1 to 7.9.2, so 7.9.1 does not exist.
How to detect before deploy: Run vcpkg install --triplet x64-linux --x-manifest-root=<dir> in a fresh container cloned from the pinned baseline. Fails immediately with "version X not found in database."
How to fix: Change version>= to the nearest available version at the baseline that satisfies the need. Or bump the baseline to a SHA where the desired version exists.
CI guard: vcpkg install --dry-run on PR catches this in seconds (see Section D).
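As a rough illustration of the Class 1 check, the sketch below inspects a port's version database file (the versions/<letter>-/<port>.json layout used by the vcpkg registry) and reports whether a declared version exists, plus the nearest higher entry if it does not. The function names and sample data are illustrative assumptions, and the dotted-numeric ordering is a simplification that only handles plain numeric versions. In practice the JSON text would come from the pinned baseline, for example via git show <baseline>:versions/l-/libpqxx.json.

```python
import json


def parse_entries(db_text):
    """Return (version, port_version) pairs from a versions-database file."""
    entries = []
    for item in json.loads(db_text)["versions"]:
        # The version key varies by scheme; take whichever one is present.
        for key in ("version", "version-semver", "version-date", "version-string"):
            if key in item:
                entries.append((item[key], item.get("port-version", 0)))
                break
    return entries


def sort_key(entry):
    """Simplified ordering: dotted numeric parts first, then port-version."""
    version, port_version = entry
    return tuple(int(part) for part in version.split(".")), port_version


def version_exists(db_text, declared):
    """True if the declared version string appears in the database."""
    return any(version == declared for version, _ in parse_entries(db_text))


def nearest_available(db_text, declared):
    """Smallest database entry at or above the declared version, or None."""
    floor = (tuple(int(part) for part in declared.split(".")), 0)
    candidates = [e for e in parse_entries(db_text) if sort_key(e) >= floor]
    return min(candidates, key=sort_key) if candidates else None


# Sample data mirroring the libpqxx gap described above (illustrative only).
SAMPLE_DB = json.dumps({"versions": [
    {"version": "7.9.2", "port-version": 0, "git-tree": "placeholder"},
    {"version": "7.9.0", "port-version": 1, "git-tree": "placeholder"},
    {"version": "7.9.0", "port-version": 0, "git-tree": "placeholder"},
]})

print(version_exists(SAMPLE_DB, "7.9.1"))     # False
print(nearest_available(SAMPLE_DB, "7.9.1"))  # ('7.9.2', 0)
```

This is only a pre-check for authors; the authoritative answer is still whatever vcpkg install resolves against the pinned baseline.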
Class 2: Feature-not-defined-at-baseline
What causes it: A features list in a dependency block names a feature that the port does not define at the pinned baseline SHA. vcpkg port manifests evolve; a feature that exists in current HEAD may not have existed at an older baseline.
Examples: drogon openssl feature at baseline 3508985146f1b1d248c67ead13f8f54be5b4f5da — drogon at that SHA defines ctl, mysql, orm, postgres, redis, sqlite3, yaml. No openssl feature.
How to detect before deploy: Same vcpkg install run. Fails immediately with "package X does not have a feature named Y."
How to fix: Remove the invalid feature, or replace it with the correct mechanism (in this case, openssl is a top-level dep, not a drogon feature).
CI guard: Same vcpkg install --dry-run catches this.
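The Class 2 check can be pictured the same way. The sketch below compares the features a manifest requests against the features a port manifest defines; the helper names and the stand-in port data are assumptions for illustration, with the feature list taken from the drogon example above.

```python
def feature_names(port_manifest):
    """Collect the feature names a port manifest defines."""
    declared = port_manifest.get("features", {})
    if isinstance(declared, dict):
        return set(declared)
    # Some manifests list features as objects carrying a "name" field.
    return {feature["name"] for feature in declared}


def unknown_features(port_manifest, requested):
    """Return the requested features that the port does not define."""
    known = feature_names(port_manifest)
    # A requested feature may be a plain string or an object with a "name".
    names = [f["name"] if isinstance(f, dict) else f for f in requested]
    return [name for name in names if name not in known]


# Illustrative stand-in for drogon's port manifest at the pinned baseline.
DROGON_PORT = {
    "name": "drogon",
    "features": {
        name: {"description": "placeholder"}
        for name in ["ctl", "mysql", "orm", "postgres", "redis", "sqlite3", "yaml"]
    },
}

print(unknown_features(DROGON_PORT, ["postgres", "openssl"]))  # ['openssl']
```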
Class 3: Cross-package version incompatibility
What causes it: Two packages in the manifest have conflicting transitive version requirements that vcpkg cannot satisfy simultaneously. Unlike Python (pip/poetry), vcpkg resolves dependencies at the binary level; some version conflicts produce linker errors that --dry-run may not catch.
Current exposure: Not yet observed in Queue. Possible as the dep set grows (jwt-cpp + openssl version coupling; sentry-native + curl version coupling).
How to detect: Only a full build against the resolved package set will catch linker-level conflicts. --dry-run catches resolution-phase conflicts.
How to fix: Add an overrides block in vcpkg.json to pin the conflicting package, or bump the baseline.
CI guard: A full build in CI (the existing Docker build step) catches this. The --dry-run guard does not prevent this class — it catches resolution failures, not link-time failures.
Class 4: Build-cache pollution
What causes it: GH Actions vcpkg binary cache (VCPKG_DEFAULT_BINARY_CACHE) retains a compiled binary for a package at a previous version. If the manifest changes a version constraint and the cache key does not change (i.e., hashFiles('queue/vcpkg.json') alone is used), stale cached binaries can be used silently, producing a binary that does not match the current manifest.
Current exposure: The Dockerfile's RUN mkdir -p /opt/vcpkg-cache step and GH Actions cache keyed on hashFiles('queue/vcpkg.json') are the current cache mechanism. If the baseline SHA changes but the manifest file's hash does not (impossible in practice — the baseline is in vcpkg.json), the cache could be stale. In practice this is low-risk because the cache key includes vcpkg.json content.
How to detect: Periodically run a full cold-cache build in CI (e.g., monthly or on baseline bump). Compare the vcpkg binary ABI log.
How to fix: Bust the cache by changing the cache key strategy (include vcpkg HEAD SHA in the key, not just vcpkg.json hash).
CI guard: Low priority for v1. Revisit when the first baseline bump lands.
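The cache-key change suggested for Class 4 could look roughly like the following. This is a minimal sketch, assuming the workflow can supply the vcpkg checkout's HEAD SHA as a string; the function name, prefix, and key layout are hypothetical and not part of the current pipeline.

```python
import hashlib


def binary_cache_key(manifest_text, vcpkg_head_sha, prefix="vcpkg-bin"):
    """Build a key that changes with either the manifest or the vcpkg checkout."""
    manifest_digest = hashlib.sha256(manifest_text.encode("utf-8")).hexdigest()[:16]
    # Including the checkout SHA busts the cache even if the manifest is unchanged.
    return f"{prefix}-{vcpkg_head_sha[:12]}-{manifest_digest}"


manifest = '{"name": "queue", "dependencies": []}'
key_old = binary_cache_key(manifest, "a" * 40)
key_new = binary_cache_key(manifest, "b" * 40)
print(key_old != key_new)  # True
```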
Class 5: Stale baseline
What causes it: The builtin-baseline SHA was correct when pinned, but as the codebase ages and dependencies need updating, the gap between the baseline SHA and current vcpkg HEAD widens. A stale baseline means any new package added to the manifest must have a version that exists at the old SHA — limiting which versions are available. Over time, the pinned baseline becomes a constraint that blocks legitimate dependency updates.
Current exposure: Baseline 3508985146f1b1d248c67ead13f8f54be5b4f5da was current as of Queue's scaffolding (approximately 2026-05-11 UTC). It will age.
How to detect: When adding a new package or updating a version constraint fails with "version not in database," it often indicates the baseline needs bumping. vcpkg's x-update-baseline command can identify an appropriate newer baseline.
How to fix: Bump builtin-baseline to a recent SHA. Run vcpkg install in a fresh container. Address any version constraint changes the bump surfaces.
Ownership: A quarterly baseline bump is sufficient for v1 stage. Assign to the engineer who last touched queue/vcpkg.json.
5. Section C — Dep-Pinning Discipline for Queue
The mandatory validation step
Before any PR that adds or changes queue/vcpkg.json is opened, the author must run:
# From inside a fresh gcc:13-bookworm container, with a full (non-shallow) vcpkg clone:
git clone https://github.com/microsoft/vcpkg.git /opt/vcpkg
/opt/vcpkg/bootstrap-vcpkg.sh -disableMetrics
/opt/vcpkg/vcpkg install --triplet x64-linux --x-manifest-root=/path/to/queue/
This must succeed without errors before the PR is opened. The PR description must state: "Verified: vcpkg install ran clean in container on [date] UTC."
This is P-1 from the invariants section. It is not optional. The CI guard (Card B) enforces this post-PR but the author's local run is the first gate.
A convenience script at queue/scripts/verify-vcpkg-local.sh (Card C) documents the exact Docker invocation for authors who do not want to manually manage the container.
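One possible shape for that convenience wrapper is sketched below: it only assembles the docker run argument list for the three commands above, so the exact invocation is documented in one place. The function name, the /src/queue mount point, and the use of --x-install-root to keep installed files out of the working tree are assumptions for illustration, not the contents of the Card C script.

```python
def local_verify_command(queue_dir, image="gcc:13-bookworm"):
    """Assemble a docker invocation that runs the mandatory validation step."""
    steps = [
        # Full clone: a shallow clone cannot resolve the pinned baseline.
        "git clone https://github.com/microsoft/vcpkg.git /opt/vcpkg",
        "/opt/vcpkg/bootstrap-vcpkg.sh -disableMetrics",
        "/opt/vcpkg/vcpkg install --triplet x64-linux"
        " --x-manifest-root=/src/queue --x-install-root=/tmp/vcpkg_installed",
    ]
    return [
        "docker", "run", "--rm",
        "-v", f"{queue_dir}:/src/queue",
        image, "bash", "-c", " && ".join(steps),
    ]


cmd = local_verify_command("/home/dev/repo/queue")
print(" ".join(cmd[:6]))
```

The list can be handed to subprocess.run(cmd, check=True); a non-zero exit means the manifest is not ready for a PR.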
Pin policy: version>= vs exact version
This is a real tradeoff (see ADR-0085):
| Policy | Pros | Cons |
|---|---|---|
| version>= (current) | Looser: less maintenance overhead when bumping baseline; allows automatic minor/patch upgrades within the same baseline | Versions that satisfy >= but are not the one you tested can introduce behavioral differences; harder to reproduce exact state |
| Exact version | Fully reproducible state; every dev and CI run resolves identically; the PR review can verify exact versions | Requires explicit update PR to upgrade any package; more maintenance churn on baseline bumps |
Recommendation for Queue: Use exact version for all packages. Queue is a tier-1 service handling billing PII and money-state mutations. Reproducibility and auditability outweigh the minor maintenance overhead. The CI guard (Card B) will catch version-not-available errors at PR time, making the maintenance cost low.
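The transition from version>= to exact pins is part of Card A (the baseline audit). Note that in a vcpkg manifest, dependency entries accept only version>=; an exact pin is expressed as an entry in the top-level overrides array.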
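A minimal sketch of that transition follows, assuming exact pins are recorded in the manifest's top-level overrides array and that the resolved versions come from a clean vcpkg install. The pin_exact helper and the example-port entry are hypothetical; only the libpqxx 7.9.2 value comes from this review.

```python
import copy


def pin_exact(manifest, resolved):
    """Return a copy of the manifest with one overrides entry per resolved package."""
    pinned = copy.deepcopy(manifest)
    overrides = {entry["name"]: entry for entry in pinned.get("overrides", [])}
    for name, version in resolved.items():
        # "1.2.3#1" carries a port-version after the hash sign.
        base, _, port_version = version.partition("#")
        entry = {"name": name, "version": base}
        if port_version:
            entry["port-version"] = int(port_version)
        overrides[name] = entry
    # Sort for stable, reviewable diffs.
    pinned["overrides"] = sorted(overrides.values(), key=lambda e: e["name"])
    return pinned


manifest = {
    "name": "queue",
    "dependencies": [{"name": "libpqxx", "version>=": "7.9.1"}],
}
pinned = pin_exact(manifest, {"libpqxx": "7.9.2", "example-port": "1.2.3#1"})
print(pinned["overrides"])
```

The input manifest is left untouched, so the result can be diffed against the original before anything is written back to queue/vcpkg.json.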
Baseline-update cadence
| Trigger | Action |
|---|---|
| A new package is added that does not exist at the current baseline | Bump baseline to a SHA where the package exists. Run vcpkg install clean. Update all version constraints to match new baseline. |
| A security advisory requires a dep upgrade that the current baseline cannot satisfy | Emergency baseline bump. Same procedure. |
| Quarterly cadence (planned) | Engineer who last touched queue/vcpkg.json runs vcpkg x-update-baseline to get a candidate newer SHA. Runs vcpkg install clean. Files a PR with the result. No urgency required if no security or compatibility trigger. |
Baseline bump is always a standalone PR. It is never combined with feature work. This isolates the "changed 5 package versions simultaneously" surface from application logic changes.
Documentation that lives next to vcpkg.json
A queue/vcpkg-notes.md file (Card C scope) must contain:
- The date the current baseline SHA was last verified.
- For each dependency: why it was chosen (one sentence), and whether any features were deliberately excluded.
- Any overrides block entries and the reason.
- The last person to run a clean local vcpkg install and on what date.
This is not a heavyweight document. It is 2-3 paragraphs. Its purpose is to make the next engineer's baseline bump faster by recording what was considered.
6. Section D — CI Guard (#2030 integration)
What the guard does
A new GH Actions job vcpkg-manifest-check triggers on any PR that touches queue/vcpkg.json or queue/Dockerfile. It:
- Checks out the PR branch.
- Clones vcpkg (full clone, not shallow) at the baseline SHA specified in queue/vcpkg.json.
- Runs vcpkg install --dry-run --triplet x64-linux --x-manifest-root=queue/.
- Exits with the vcpkg exit code. Non-zero = PR check fails.
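The step that reads the baseline SHA out of the manifest might be sketched as below. The read_baseline name is hypothetical, and the 40-hex-character check is an assumption about how strictly the guard should validate its input.

```python
import json
import re


def read_baseline(manifest_text):
    """Extract builtin-baseline from a vcpkg.json and check it looks like a full SHA."""
    baseline = json.loads(manifest_text).get("builtin-baseline")
    if not isinstance(baseline, str) or not re.fullmatch(r"[0-9a-f]{40}", baseline):
        raise ValueError("vcpkg.json has no valid builtin-baseline SHA")
    return baseline


sample = json.dumps({
    "name": "queue",
    "builtin-baseline": "3508985146f1b1d248c67ead13f8f54be5b4f5da",
    "dependencies": [],
})
print(read_baseline(sample))  # 3508985146f1b1d248c67ead13f8f54be5b4f5da
```

Failing fast on a missing or truncated SHA keeps the job from silently resolving against whatever commit the clone happens to have checked out.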
What failure shape it produces
The --dry-run flag causes vcpkg to:
- Resolve all version constraints against the registry at the baseline SHA.
- Validate all feature names against port manifests at that SHA.
- Print the resolved package graph.
- Exit 0 if the graph is consistent; non-zero otherwise.
The error shapes that would have caught today's bugs (bug 2's message matches the one observed in the failed build; bug 3's wording is approximate):
# Bug 2 (would have been caught):
error: drogon does not have a feature named 'openssl'
# Bug 3 (would have been caught):
error: libpqxx@7.9.1: could not satisfy dependency constraints
available versions: 7.9.0, 7.9.0#1, 7.9.2, ...
Note: --dry-run does not compile packages. It only resolves the graph. It catches Class 1 and Class 2 failures (see Section B). It does not catch Class 3 (link-time conflicts) — the full Docker build job catches those.
Note: Bug 1 (shallow clone preventing baseline resolution) is not caught by the guard as described. The guard performs its own full clone, so if the Dockerfile were to reintroduce --depth 1, the guard would still pass. To catch a Dockerfile regression on --depth 1, a separate step should validate that the Dockerfile does not contain --depth 1 for the vcpkg clone (a grep check, ~3 lines).
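The Dockerfile regression check could be a plain grep; the same idea is sketched here in Python so that multi-line RUN instructions are handled. The function name and sample Dockerfile fragments are illustrative assumptions, and the match is deliberately coarse (any git clone line that mentions both vcpkg and --depth).

```python
import re


def shallow_vcpkg_clones(dockerfile_text):
    """Return Dockerfile instructions that shallow-clone vcpkg."""
    # Join backslash-continued lines so a multi-line RUN is checked as one.
    joined = dockerfile_text.replace("\\\n", " ")
    hits = []
    for line in joined.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip comments
        if re.search(r"\bgit\s+clone\b", line) and "vcpkg" in line and "--depth" in line:
            hits.append(line.strip())
    return hits


BAD = "RUN git clone \\\n    --depth 1 https://github.com/microsoft/vcpkg.git /opt/vcpkg\n"
GOOD = "RUN git clone https://github.com/microsoft/vcpkg.git /opt/vcpkg\n"

print(len(shallow_vcpkg_clones(BAD)))   # 1
print(len(shallow_vcpkg_clones(GOOD)))  # 0
```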
Caching strategy
The --dry-run step does not compile packages, so it is fast (~30-60 seconds). However, the vcpkg clone itself is ~600MB and slow on a cold runner. Cache it:
- uses: actions/cache@v3
  with:
    path: /opt/vcpkg
    key: vcpkg-tool-${{ hashFiles('queue/vcpkg.json') }}
    restore-keys: vcpkg-tool-
The cache key includes the vcpkg.json hash. On a baseline bump the key changes and vcpkg is re-cloned. On unchanged vcpkg.json the cached clone is reused. Total job time: roughly 30-60 seconds on a warm cache (dominated by the --dry-run step), 2-3 minutes on cold.
This does NOT use the same cache as the full Docker build's binary cache. The guard only needs the vcpkg tool and the port registry; it never compiles anything.
Tradeoff analysis
| Dimension | Cost | Benefit |
|---|---|---|
| GH Actions minutes | ~0.5-2 min per PR touching queue/vcpkg.json (warm/cold) | Catches Class 1 + Class 2 bugs in seconds vs 20-min build |
| Maintenance | Minimal — the check is a vcpkg CLI invocation, not custom logic | No false positives once the manifest is in a valid state |
| Timing | Catches bug at PR review time, not at deploy attempt | Prevents the "3 bugs in 3 deploy attempts" pattern entirely |
The tradeoff is unambiguously worth it. 0.5-2 min per PR that touches queue/vcpkg.json is negligible. The 60+ minutes of deploy-attempt time lost on 2026-05-13 UTC represents an order-of-magnitude worse alternative.
7. Section E — Current State Remediation (Card A)
The operator selected Option B: a single, complete audit PR rather than continued fix-forwarding.
Card A (defined in Section 15) performs the following in one PR:
- Spin up gcc:13-bookworm container with a full vcpkg clone at builtin-baseline=3508985146f1b1d248c67ead13f8f54be5b4f5da.
- For each of the 9 packages in queue/vcpkg.json, identify the nearest available version at that baseline that satisfies the service's actual requirement.
- Migrate all version>= constraints to exact version constraints at the resolved versions.
- Resolve the libpqxx 7.9.1 bug: update to 7.9.2 or the nearest available.
- Confirm no other packages exhibit the same problem (sentry-native, jwt-cpp, gtest, spdlog, nlohmann-json, curl, openssl, drogon).
- Run vcpkg install clean. Zero errors. Zero warnings about unavailable versions.
- Optionally commit a vcpkg-lock.json if the installed set can be locked (see Section F).
- Add queue/vcpkg-notes.md documenting the baseline verification date and per-package rationale.
This PR unblocks #2021.
Card A does NOT modify application code. It is a queue/vcpkg.json + queue/vcpkg-notes.md change only.
8. Section F — Should We Change the Queue C++ Scaffold?
Drogon version: stay at 1.9.6 or upgrade baseline?
Recommendation: keep baseline 3508985146f1b1d248c67ead13f8f54be5b4f5da, which pins Drogon at 1.9.6, for v1.
Rationale:
- Drogon 1.9.6 is a stable release with all the features Queue Phase 1 needs (Postgres support, async handlers, middleware filters, JSON response).
- Bumping the baseline to current vcpkg HEAD would surface a new set of potentially-unverified versions across all 9 packages simultaneously — exactly the risk we are trying to avoid until Card A completes.
- The correct sequence is: Card A audits and locks the current baseline cleanly, then an optional Card E (baseline bump to current) is done as a standalone PR with full re-verification.
- Baseline bumps are routine maintenance. They should not be mixed with the audit that establishes the clean baseline state.
The one exception: if Card A's audit reveals that no stable set of versions is satisfiable at the current baseline (unlikely given Drogon 1.9.6 was chosen from this baseline), then bumping to a newer baseline becomes part of Card A.
Should vcpkg-lock.json be committed?
vcpkg can write a machine-generated vcpkg-lock.json. The exact mechanism for generating it, and how much of the resolved package set it actually records, has not been verified in this review; that verification is in scope for Card F. The tradeoff (see ADR-0086):
| Aspect | With vcpkg-lock.json | Without |
|---|---|---|
| Reproducibility | Fully reproducible — every build gets the exact same binary set | Reproducible within a baseline but version>= allows variation |
| Proof of authorship | Committing the lock proves the author ran vcpkg install | Author must self-attest |
| Maintenance overhead | Lock must be regenerated on every dep change | No regeneration step |
| Noise in PR diffs | Lock file is large and machine-generated; clutters diff | Cleaner PRs |
Recommendation: Do not commit vcpkg-lock.json initially. The mandatory local validation step (P-1) and the exact-version pin policy (Section C) together provide equivalent reproducibility guarantees without the maintenance overhead of a regenerated lockfile. If Queue's dep set grows or baseline drift becomes a recurring problem, revisit locking in a future Card F sprint.
9. Section G — Process Propagation to Future Tier-1 Services
If a second C++ service ships (Velvet rewrite, MQ-A rewrite), it will start with a vcpkg.json. Without documented discipline, it will repeat today's incident.
Tier-1 C++ service onboarding checklist
This checklist lives at docs/sdlc/cpp-service-onboarding.md (Card C scope). Every new tier-1 C++ service scaffold PR must have this checklist completed in the PR description:
[ ] vcpkg.json authored with exact `version` constraints (not `version>=`)
[ ] builtin-baseline SHA chosen and documented in vcpkg-notes.md
[ ] vcpkg install run clean in a fresh gcc:13-bookworm container against the pinned baseline
[ ] All packages verified: no "version not in database" errors, no "feature not found" errors
[ ] Dockerfile uses full git clone (no --depth 1) for vcpkg
[ ] vcpkg-notes.md added alongside vcpkg.json with baseline date + per-package rationale
[ ] CI vcpkg-manifest-check job is wired for the new service's path
[ ] AddressSanitizer + UBSan enabled in the debug/CI build
[ ] No raw new/delete in any path touching PII or money state
[ ] Secrets read from Infisical at startup; no secrets in build system or Dockerfile ARGs
Who owns the SOP
The docs/sdlc/cpp-service-onboarding.md document is owned by the software-architect role. It is updated when:
- A new C++ service is added (checklist validated against the new service's reality).
- A new failure class is discovered that the checklist does not cover.
feature-developer claims the checklist before opening a scaffold PR. card-groomer verifies the checklist is present in the PR description before accepting the card.
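The card-groomer check could be partly automated. The sketch below lists checklist items that are still unticked in a PR description; the function name and the sample description are assumptions, and it only understands the "[ ]" / "[x]" markers used in the checklist above.

```python
import re


def unchecked_items(pr_description):
    """Return the text of every checklist item still marked '[ ]'."""
    pattern = re.compile(r"^\s*(?:[-*]\s+)?\[ \]\s*(.+)$", re.MULTILINE)
    return [match.group(1).strip() for match in pattern.finditer(pr_description)]


description = (
    "[x] vcpkg.json authored with exact version constraints\n"
    "[ ] Dockerfile uses full git clone (no --depth 1) for vcpkg\n"
    "- [ ] CI vcpkg-manifest-check job is wired for the new service's path\n"
)

for item in unchecked_items(description):
    print("unchecked:", item)
```

An empty result does not prove the checklist is present, so a real check would also confirm that the expected number of items appears in the description.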
Does today's incident warrant pausing tier-1 C++ ambitions?
No. The Queue C++ scaffold is fundamentally sound.
The incident exposed a process gap (no mandatory local validation before PR), not an architectural gap. The stack itself (Drogon framework selection, libpqxx, CMake/sqitch/ninja pipeline, multi-stage Dockerfile, Heroku container deployment model, GoogleTest suite) is correct. ADR-0076's stack choices were made with alternatives considered; none of them are implicated by today's bugs.
What today's incident reveals is that the first time a team builds C++ infrastructure, the build contract (vcpkg.json) demands the same verification rigor as application code — but this step was not yet encoded in the process. Card B (CI guard) and Card C (SOP document) close this gap permanently.
The tier-1 C++ ambition is correct. The process that enforces it was incomplete. This review completes it.
10. Sequence: Card A Unblocking #2021
sequenceDiagram
participant Dev as feature-developer
participant Docker as gcc:13-bookworm container
participant vcpkg as vcpkg (full clone, pinned baseline)
participant PR as GitHub PR (Card A)
participant CI as GH Actions (Card B guard)
participant 2021 as Issue #2021
Dev->>Docker: docker run gcc:13-bookworm
Dev->>vcpkg: git clone vcpkg (full, no --depth); bootstrap
Dev->>vcpkg: vcpkg install --x-manifest-root=queue/
vcpkg-->>Dev: Resolve 9 packages; note version errors
Dev->>Dev: Fix version constraints (exact pin at nearest available)
Dev->>vcpkg: vcpkg install (repeat until clean)
Dev->>PR: Open Card A PR (vcpkg.json + vcpkg-notes.md)
PR->>CI: vcpkg-manifest-check job runs (Card B guard)
CI-->>PR: PASS
PR-->>2021: Merges; blocks lifted; Docker build proceeds
11. Migrations
Not applicable for this review. No schema changes. No application code changes.
12. Rollout Plan
| Step | Action | Owner |
|---|---|---|
| Immediate | Card A: full vcpkg.json audit + exact-pin PR | feature-developer |
| Within 3 days | Card B: CI vcpkg-manifest-check job | feature-developer |
| Within 5 days | Card C: cpp-service-onboarding.md SOP | software-architect |
| Within 5 days | Card D: document pinning policy decision (exact vs >=) | software-architect |
| Post-v1 (optional) | Card E: baseline bump to current vcpkg HEAD | feature-developer |
| Post-v1 (optional) | Card F: evaluate vcpkg-lock.json discipline | feature-developer |
Card A is the critical-path unblock for #2021. All other cards harden the process but do not block #2021.
13. Security Considerations
This incident has no direct security impact: the bugs are in build-time dependency resolution, not in runtime behavior or credential handling. The vcpkg.json does not contain secrets. No invariants were violated.
However, the following security-adjacent implications apply:
- Stale dep versions are a security surface. A baseline that is never bumped will eventually fall behind security advisories. The quarterly baseline-bump cadence (Section C) is the mitigation.
- Exact version pinning aids vulnerability scanning. Security scanners (Dependabot, Snyk, custom tooling) can only report on versions they can identify. version>= declarations with unresolved floor values are harder to scan than exact pins.
- Build supply chain integrity. The Dockerfile's full vcpkg clone (post #2028) fetches from the official microsoft/vcpkg repository. The pinned baseline SHA is immutable: the registry content at that SHA cannot change retroactively. This is the correct supply-chain model.
No credential, PII, audit, or DSR implications.
14. Open Questions
OQ-1 — Baseline bump scope for Card A: If auditing the current baseline reveals more than 1-2 packages need version adjustments, should Card A simultaneously bump to a newer baseline where all packages have clean version availability? Or should it stay at the current baseline and work within its constraints? Recommendation: stay at current baseline for Card A; baseline bump is Card E.
OQ-2 - CI runner for vcpkg-manifest-check: The vcpkg install --dry-run step is Linux-only (x64-linux triplet). GH Actions ubuntu-latest is correct. If the project migrates to Ubicloud runners (per [[project_ci_billing]]), confirm the runner image has git and basic build tools available (they do on standard Ubicloud images). No decision needed now.
OQ-3 — vcpkg-lock.json format stability: vcpkg's lockfile format has changed across versions. If we adopt lockfiles (Card F), pin the vcpkg tool version alongside the baseline SHA. This is a decision for Card F, not now.
15. Sub-Cards (for PM to file)
The following cards are scoped for feature-developer or software-architect to claim after card-groomer grooming. Ordered by criticality.
Card A — Queue vcpkg.json full audit + lock against pinned baseline
Blocks: #2021
Size: M (2-4 hours; container work + iteration)
Risk: Medium (could surface additional version mismatches requiring judgment calls)
Dependencies: None — can be claimed immediately.
Title: fix(queue): audit all vcpkg.json packages against pinned baseline; exact-pin versions
Body:
The three deploy bugs on 2026-05-13 UTC reveal that queue/vcpkg.json was never run through vcpkg install against its pinned builtin-baseline=3508985146f1b1d248c67ead13f8f54be5b4f5da. This card completes that audit and converts all version>= constraints to exact pinned versions.
Acceptance criteria:
- [ ] Spin up gcc:13-bookworm container; full (non-shallow) vcpkg clone.
- [ ] Run vcpkg install --triplet x64-linux --x-manifest-root=queue/ against the pinned baseline.
- [ ] For each of the 9 packages: identify the exact available version at the baseline SHA. Convert version>= to version at the resolved value.
- [ ] Resolve libpqxx 7.9.1 → nearest available (expected: 7.9.2).
- [ ] Confirm no other packages have unavailable versions at the baseline.
- [ ] vcpkg install exits 0 with no errors.
- [ ] Add queue/vcpkg-notes.md: baseline SHA, verification date (UTC), per-package rationale (one sentence each).
- [ ] PR description states: "Verified: vcpkg install ran clean in container on [date] UTC."
- [ ] Do NOT modify application code. vcpkg.json + vcpkg-notes.md only.
Refs: #2021, docs/architecture/queue-cpp-scaffold-review-2026-05-13.md
Card B — Implement CI vcpkg dry-run guard (#2030)
Blocks: Future vcpkg.json changes
Size: M (half-day; GH Actions YAML + caching setup)
Risk: Low
Dependencies: Card A (should be green before the guard is wired so the initial check passes).
Title: reliability(ci): add vcpkg-manifest-check job for queue/vcpkg.json changes
Body:
Closes #2030. Adds a GH Actions job that runs vcpkg install --dry-run on every PR touching queue/vcpkg.json or queue/Dockerfile. Catches Class 1 (version-not-in-registry) and Class 2 (feature-not-defined) failures at PR time instead of at deploy attempt.
Acceptance criteria:
- [ ] New job vcpkg-manifest-check in .github/workflows/ (new file or added to existing queue CI workflow).
- [ ] Triggers on pull_request when queue/vcpkg.json or queue/Dockerfile is in the changed files.
- [ ] Runs vcpkg install --dry-run --triplet x64-linux --x-manifest-root=queue/ using a full (non-shallow) vcpkg clone at the builtin-baseline SHA from the PR's queue/vcpkg.json.
- [ ] GH Actions cache keyed on hashFiles('queue/vcpkg.json') for the vcpkg tool clone. Warm-cache job time target: under 60 seconds.
- [ ] Job fails the PR on non-zero vcpkg exit code.
- [ ] Add a grep step that fails if queue/Dockerfile contains git clone --depth for the vcpkg clone (prevents regression of #2028).
- [ ] Tested: a deliberately-broken vcpkg.json (invalid feature name) fails the check; a valid vcpkg.json passes.
- [ ] Does NOT add time to PRs that don't touch queue/vcpkg.json or queue/Dockerfile.
Refs: #2030, docs/architecture/queue-cpp-scaffold-review-2026-05-13.md
Card C — SDLC doc: C++ service onboarding checklist
Blocks: Future tier-1 C++ services (Velvet rewrite, MQ-A rewrite)
Size: S (2-3 hours; doc-only)
Risk: Low
Dependencies: None. Can be written immediately.
Title: docs(sdlc): add cpp-service-onboarding.md with vcpkg discipline checklist
Body:
Creates docs/sdlc/cpp-service-onboarding.md. Documents the mandatory checklist every new C++ service scaffold must complete before the scaffold PR is opened. Prevents the 2026-05-13 UTC incident from recurring on Velvet or MQ-A.
Acceptance criteria:
- [ ] New file at docs/sdlc/cpp-service-onboarding.md.
- [ ] Contains the 10-item checklist from Section G of docs/architecture/queue-cpp-scaffold-review-2026-05-13.md.
- [ ] Documents the exact docker run command for local vcpkg verification.
- [ ] Documents the exact vcpkg version vs version>= policy decision (per ADR-0085).
- [ ] Documents the baseline-bump cadence (quarterly or security-triggered).
- [ ] Links to docs/architecture/queue-cpp-scaffold-review-2026-05-13.md as the incident background.
- [ ] Doc is under 500 words. It is a checklist, not an essay.
Refs: docs/architecture/queue-cpp-scaffold-review-2026-05-13.md
Card D — Document and lock the Queue dep-pinning policy decision
Blocks: Card B (the CI guard needs to know which policy is enforced)
Size: S (1-2 hours; ADR update)
Risk: Low
Dependencies: None.
Title: docs(adr): ADR-0085 - vcpkg version pinning policy for tier-1 C++ services
Body:
ADR-0085 is drafted in docs/architecture/adr/0085-vcpkg-version-pinning-policy.md. It records the decision: exact version pinning (not version>=) for all tier-1 C++ services, with rationale and alternatives considered. This card moves it from "Proposed" to "Accepted" after operator review.
Acceptance criteria:
- [ ] ADR-0085 status changed from "Proposed" to "Accepted."
- [ ] Any final objections to the exact-version policy surface and are addressed in the ADR's Consequences section.
- [ ] docs/architecture/queue-cpp-scaffold-review-2026-05-13.md §C references the accepted ADR.
Refs: ADR-0085, docs/architecture/queue-cpp-scaffold-review-2026-05-13.md
Card E (optional) — Bump vcpkg builtin-baseline to current
Does not block v1 launch.
Size: M (2-4 hours; re-audit all packages at new baseline)
Risk: Medium (new baseline may surface additional version constraint changes)
Dependencies: Card A must be merged and green first.
Title: chore(queue): bump vcpkg builtin-baseline to current HEAD; re-verify all packages
Body:
Optional post-v1 maintenance. Moves queue/vcpkg.json's builtin-baseline from 3508985146f1b1d248c67ead13f8f54be5b4f5da to a recent vcpkg HEAD SHA. Gets newer package versions (potentially including security patches). Card A must be complete and the CI guard (Card B) must be wired before this card is claimed.
Acceptance criteria:
- [ ] builtin-baseline updated to a SHA from within the last 30 days.
- [ ] vcpkg install runs clean in a fresh container at the new baseline.
- [ ] All 9 package version constraints updated to exact versions available at the new baseline.
- [ ] CI vcpkg-manifest-check (Card B) passes on the PR.
- [ ] queue/vcpkg-notes.md updated with new verification date.
- [ ] PR is standalone — no application code changes.
Refs: docs/architecture/queue-cpp-scaffold-review-2026-05-13.md §F
Card F (optional) — Evaluate and adopt vcpkg-lock.json discipline
Does not block v1 launch.
Size: S-M (research + decision + optional implementation)
Risk: Low
Dependencies: Card A, Card B.
Title: chore(queue): evaluate vcpkg-lock.json; adopt or explicitly decline
Body:
vcpkg supports a machine-generated lockfile that proves the author ran vcpkg install and provides fully reproducible builds independent of exact-version pinning alone. ADR-0086 records the current decision (no lockfile for v1). This card re-evaluates after v1 launch with operational experience.
Acceptance criteria:
- [ ] Review vcpkg lockfile format stability at current vcpkg version.
- [ ] If adopted: vcpkg-lock.json committed alongside vcpkg.json; regeneration step documented in onboarding checklist (Card C update).
- [ ] If declined again: ADR-0086 updated with additional rationale from operational experience.
- [ ] Either way: decision documented.
Refs: ADR-0086, docs/architecture/queue-cpp-scaffold-review-2026-05-13.md §F