ADR-0087: CI Guard for vcpkg Manifest Changes

Status: Accepted
Date: 2026-05-13 UTC
Author: software-architect
Closes: #2030
Incident background: docs/architecture/queue-cpp-scaffold-review-2026-05-13.md


Context

The three vcpkg bugs on 2026-05-13 UTC (#2028, #2029, and bug 3 in Card A) each took ~20 minutes to discover because the only way to run vcpkg install was a full Docker build. No step in the CI pipeline validated the manifest before the full build was attempted.

#2030 was filed by the SRE agent to address this. This ADR records the design decision for the guard.


Decision

Add a vcpkg-manifest-check GH Actions job that runs on every PR touching queue/vcpkg.json or queue/Dockerfile.

The job runs vcpkg install --dry-run against the manifest and fails the PR if vcpkg exits non-zero. This catches Class 1 (version-not-in-registry) and Class 2 (feature-not-defined) bugs in seconds on a warm cache, rather than after a ~20-minute build attempt.

See Card B in docs/architecture/queue-cpp-scaffold-review-2026-05-13.md for the full implementation specification.
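
As a rough illustration of the decision above (not the Card B specification), the check could be wrapped in a small shell function. The function name check_manifest, the VCPKG_BIN variable, and the default executable path /opt/vcpkg/vcpkg are assumptions made for this sketch; only vcpkg install --dry-run and the queue/ manifest location come from this ADR.

```shell
#!/usr/bin/env bash
# Sketch of the manifest check step.
# Assumptions (hypothetical, not from the ADR):
#   VCPKG_BIN  absolute path to a bootstrapped vcpkg executable
#   $1         directory that contains vcpkg.json (defaults to queue)

check_manifest() {
  local manifest_root="${1:-queue}"
  local vcpkg_bin="${VCPKG_BIN:-/opt/vcpkg/vcpkg}"

  if [ ! -f "${manifest_root}/vcpkg.json" ]; then
    echo "error: ${manifest_root}/vcpkg.json not found" >&2
    return 2
  fi

  # Run from the manifest directory so vcpkg picks up vcpkg.json.
  # --dry-run plans the install without building any package.
  if (cd "${manifest_root}" && "${vcpkg_bin}" install --dry-run); then
    echo "vcpkg manifest check passed"
  else
    local rc=$?
    echo "error: vcpkg install --dry-run failed (exit ${rc})" >&2
    return 1
  fi
}
```

A workflow step could then call check_manifest queue and let the non-zero return fail the job. Keeping the executable path in a variable also makes the function easy to exercise locally with a stub in place of vcpkg.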


What the guard catches

Bug class | Caught? | Notes
--- | --- | ---
Version not in registry at baseline (Class 1) | Yes | --dry-run resolves the graph
Feature not defined at baseline (Class 2) | Yes | --dry-run validates feature names
Cross-package version incompatibility at link time (Class 3) | No | Only caught by full build
Build-cache pollution (Class 4) | No | Separate cache-busting strategy
Stale baseline (Class 5) | Partially | --dry-run fails when new packages can't be satisfied; periodic baseline bump is the primary mitigation
--depth 1 regression in Dockerfile | Yes | Explicit grep step added alongside the dry-run

The guard does not replace the full Docker build job. It is a fast first gate that catches the most common manifest authoring errors.
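
The explicit grep step for the --depth 1 regression could look roughly like the following. This is a sketch under stated assumptions: the function name check_no_shallow_clone and the exact pattern are illustrative and not taken from the implementation specification. The pattern matches both --depth 1 and --depth=1 but not deeper values such as --depth 10.

```shell
#!/usr/bin/env bash
# Sketch of the shallow-clone guard for the Dockerfile.
# The function name and the default path are illustrative assumptions.

check_no_shallow_clone() {
  local dockerfile="${1:-queue/Dockerfile}"

  if [ ! -f "${dockerfile}" ]; then
    echo "error: ${dockerfile} not found" >&2
    return 2
  fi

  # -n prints the offending line numbers; -e keeps grep from reading the
  # pattern as an option. The trailing group rejects --depth 10, 100, etc.
  if grep -nE -e '--depth[ =]1([^0-9]|$)' "${dockerfile}" >&2; then
    echo "error: shallow clone (--depth 1) found in ${dockerfile}" >&2
    return 1
  fi

  echo "no shallow clone in ${dockerfile}"
}
```

Running check_no_shallow_clone queue/Dockerfile in the same job as the dry-run would fail the PR when a shallow clone is reintroduced.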


Caching

The vcpkg install --dry-run step does not compile packages, so the check itself is fast. A full clone of the vcpkg repository, however, is ~600MB. To avoid repeating that clone on every run, the job uses the GitHub Actions cache, keyed on hashFiles('queue/vcpkg.json'):

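- uses: actions/cache@v3
  with:
    path: /opt/vcpkg
    # The exact key changes whenever queue/vcpkg.json changes.
    key: vcpkg-tool-${{ hashFiles('queue/vcpkg.json') }}
    # On an exact-key miss, fall back to the most recent cache with this prefix.
    restore-keys: vcpkg-tool-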

On a warm cache: job completes in 10-20 seconds.
On a cold cache (first run, or after vcpkg.json change): job completes in 2-3 minutes.

This is separate from the binary package cache used by the full Docker build.


Consequences

Positive:
- Catches the most common vcpkg manifest authoring errors at PR time, not at deploy attempt.
- Adds no time to PRs that touch neither queue/vcpkg.json nor queue/Dockerfile, because the job does not trigger.
- Extensible: when a second C++ service is added, extend the trigger paths rather than creating a new job.

Negative:
- Adds 10-20 seconds on a warm cache (2-3 minutes on a cold cache) to PRs that touch queue/vcpkg.json or queue/Dockerfile. This is intentional.
- Does not catch Class 3 (link-time) conflicts; those still require a full build to surface.


Alternatives Considered

Validate on every PR (not just vcpkg.json changes)

Rejected: Running the check on every PR would slow down all PRs for no benefit. Changes to queue/vcpkg.json are infrequent.

Vendor all dependencies (no vcpkg)

Copy all C++ dependencies into queue/vendor/. Eliminates the external registry dependency entirely.

Rejected: Vendoring ~9 C++ libraries at their full source sizes would add hundreds of MB to the repository. Updating any dependency becomes a manual copy operation. The supply-chain benefits of vcpkg (signed releases, port overlays, ABI caching) are lost. Not appropriate for a service this size.

Switch to CMake FetchContent (no vcpkg)

CMake FetchContent can download and build dependencies at configure time using a CMakeLists.txt declaration. No separate package manager.

Rejected for now: FetchContent does not have a centralized port registry, so version discovery and conflict detection are manual. vcpkg's baseline model is strictly better for reproducibility. If vcpkg ever becomes untenable, FetchContent is the fallback: ADR-0076 notes that Drogon can be vendored via FetchContent as a bus-factor mitigation.