Raxx · internal docs


CI Migration Candidates — Architect Filtering

Date: 2026-05-13 UTC
Author: architect-agent
Status: Proposal (decision pending BLR financial/licensing review)
Related issues: #726, #728, #1595
Source universe: https://github.com/ligurio/awesome-ci


1. Executive Summary

GitHub Actions is adequate for Raxx's workload today but is accumulating friction: a YAML parse error has wasted ~35 runner allocations per day since 2026-05-06, environment protection rules are unavailable despite the paid plan (GH support ticket in progress), and two workflows sharing the name "CI" make branch-protection semantics unreliable. Operator memory already notes Ubicloud as the standing drop-in candidate (#728) once monthly burn exceeds 5,000 minutes. This document widens the lens: the 60+ tools in the awesome-ci universe were filtered against Raxx's workload shape, leaving three shortlisted candidates plus one architect-added tool (Buildkite) that is not in the list. No migration should begin before the 2026-05-23 launch. This document is input for BLR financial/licensing research and for an operator decision after launch.


2. Current GHA State and Friction Signals

| Signal | Impact |
|---|---|
| ci.yml YAML parse error at line 222 (since 2026-05-06) | ~35 wasted runner allocations/day; CI status unreliable |
| Two workflows named "CI" (ci.yml, ci-pr.yml) | Branch protection cannot distinguish them; gate is semantically broken |
| Environment protection rules entitlement bug on Team plan | Cannot enforce deployment approvals; GH support ticket in progress |
| Nightly security scan PR-creation failure recurring (#1820 fix did not hold; #1975 filed) | Ops gap; security coverage is intermittent |
| 34 total workflows; 30+ runs/24h | High runner consumption; potential billing threshold risk per #726 |
| Ubicloud drop-in already on the roadmap (#728) | Operator has pre-committed to one migration path; architect review may confirm or redirect |
| Pre-launch: 2026-05-23 hard deadline | Any migration is post-launch work; no changes to CI infra in the next 10 days |
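
The duplicate-name signal can be checked mechanically. The following is a minimal sketch, assuming each workflow declares its display name on an unindented `name:` line; the function names are illustrative and not part of any existing Raxx tooling. It uses a regular expression rather than a YAML parser so that it still works on a file that fails to parse, as ci.yml currently does.

```python
import re
from collections import defaultdict

# Matches an unindented `name:` key, with optional surrounding quotes.
# Assumption: the display name sits on one line and carries no trailing comment.
_NAME_RE = re.compile(r"""^name:[ \t]*["']?(.*?)["']?[ \t]*$""", re.MULTILINE)


def workflow_name(text: str, fallback: str) -> str:
    """Return the workflow's display name, or `fallback` if none is declared."""
    match = _NAME_RE.search(text)
    return match.group(1) if match else fallback


def find_duplicate_names(workflows: dict[str, str]) -> dict[str, list[str]]:
    """Map each display name shared by several files to the files that use it.

    `workflows` maps a file name to that file's contents.
    """
    by_name = defaultdict(list)
    for filename, text in sorted(workflows.items()):
        by_name[workflow_name(text, fallback=filename)].append(filename)
    return {name: files for name, files in by_name.items() if len(files) > 1}
```

In practice the input mapping could be built with `{p.name: p.read_text() for p in pathlib.Path(".github/workflows").glob("*.yml")}`; any non-empty result is a naming conflict of the kind described in the table.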

3. Workload Shape

| Dimension | Detail |
|---|---|
| Languages | Python 3.11, JavaScript/React (CRA), C++ (CMake), HCL (Terraform) |
| Services | backend_v2 (Flask), console (Flask), velvet (Flask), queue (C++) |
| Frontend | frontend/trademaster_ui (CRA, Node 18+) |
| IaC roots | cf-access, cf-pages, freescout, queue, waf, support-attachments, email-delivery-stack, sso-oidc-gateway |
| Migrations | Alembic on Postgres (queue, console, velvet) + SQLite (backend_v2) |
| Cron workflows | daily-card-groomer, nightly-security-scan, billing-collector, drift-orchestrator, flag-drift, freescout-backup, ci-digest (7 distinct schedules) |
| Operator triggers | workflow_dispatch calls from Console Ops Dispatch surface |
| Deploy targets | Heroku (3 apps), Cloudflare Pages (5 sites), Cloudflare Workers |
| Notifications | Postmark (email), Slack (ops digest + incident pings) |
| Composite actions | notify-deploy-status, load-vault-secrets |
| Secret sources | Infisical (vault.raxx.app), AWS SSM |
| Team size | Solo operator + AI agent fleet |
| Repository host | GitHub (permanent; not under discussion) |

4. Filter Rationale

All 60+ tools in the awesome-ci list were tested against these hard filters:

Hard filters (any fail = excluded):

  1. Must connect to GitHub as SCM source (repo stays on GitHub).
  2. Must support cron-equivalent scheduling (7 cron workflows).
  3. Must support operator-triggered runs (workflow_dispatch equivalent).
  4. Must support YAML or declarative pipeline definition (no GUI-only tools).
  5. Must support secret injection compatible with Infisical and/or AWS SSM (via OIDC, env-var API, or generic secret store hookup).
  6. Must NOT require a JVM on the runner agent side. Jenkins, Bamboo, TeamCity, and GoCD are excluded on ergonomic grounds: the Python/C++ workload has no JVM dependency, and adding one inflates runner images and ops overhead for a solo operator.
  7. Must be actively maintained (not abandoned; checked via last commit date and star count signals).
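
The "any fail = excluded" rule above can be sketched as a small predicate table. This is an illustration only, assuming each tool is summarised by one boolean per hard filter; the `Tool` record and its field names are hypothetical and are not how the filtering was actually recorded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """Hypothetical per-tool record: one boolean per hard filter."""
    name: str
    github_scm: bool            # filter 1
    cron: bool                  # filter 2
    manual_trigger: bool        # filter 3
    declarative_pipeline: bool  # filter 4
    secret_injection: bool      # filter 5
    needs_jvm_agent: bool       # filter 6 (True means the filter fails)
    maintained: bool            # filter 7


# Each entry pairs a label with a predicate that returns True when the tool passes.
HARD_FILTERS = [
    ("1: GitHub as SCM source", lambda t: t.github_scm),
    ("2: cron-equivalent scheduling", lambda t: t.cron),
    ("3: operator-triggered runs", lambda t: t.manual_trigger),
    ("4: YAML or declarative pipelines", lambda t: t.declarative_pipeline),
    ("5: secret injection", lambda t: t.secret_injection),
    ("6: no JVM on the runner agent", lambda t: not t.needs_jvm_agent),
    ("7: actively maintained", lambda t: t.maintained),
]


def failed_filters(tool: Tool) -> list[str]:
    """Return the labels of every hard filter the tool fails."""
    return [label for label, passes in HARD_FILTERS if not passes(tool)]


def survivors(tools: list[Tool]) -> list[str]:
    """Names of tools that pass every hard filter (any fail = excluded)."""
    return [t.name for t in tools if not failed_filters(t)]
```

Returning the list of failed labels, rather than a bare pass/fail, keeps the exclusion reason attached to each tool, which is the information the Excluded List records.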

Soft filters (used to rank survivors):


5. Excluded List

| Tool | Reason excluded |
|---|---|
| Abstruse CI | Last meaningful activity 2021; effectively abandoned. |
| Agola | No native cron scheduling; requires external trigger for time-based runs. |
| Appcircle.io | Mobile-only (iOS/Android/Flutter); does not support Python/C++/Terraform workloads. |
| App Center (Azure) | Mobile-only CI; Microsoft-linked; does not support our workload shape. |
| AppVeyor | .NET-first; Linux support is secondary; primary identity is Windows/C# shops. |
| Assertible | Post-deployment API testing only, not a general CI platform. |
| Azure DevOps Pipelines | Vendor lock-in is high (Microsoft ecosystem); trades GH lock for Azure lock; YAML syntax divergence is large. |
| Bamboo | Requires Atlassian stack; JVM-based agent; excluded by JVM filter. |
| Betterscan | Static analysis only (SAST), not a CI platform. |
| BitBucket Pipelines | Requires BitBucket SCM; repo stays on GitHub; hard filter 1 fails. |
| Bitrise | Mobile-only; iOS/Android/React Native focus; excludes Python/C++ infra workloads. |
| builds.sr.ht | GitHub integration is indirect (via dispatch.sr.ht); no native GitHub status checks; weak PR integration. |
| Buddy | Paid-only at any meaningful scale; GUI-heavy workflow builder; YAML support is partial. |
| CDS (OVH) | Enterprise-grade complexity; Kubernetes required for self-host; high ops overhead for solo. |
| Chrono CI | Appears unmaintained; site loads but no recent public activity or changelog. |
| CICube | Analytics/observability layer over GHA, not a CI replacement; relevant as a companion tool but out of scope for this comparison. |
| CircleCI | Cloud-only at reasonable price; no self-hosted compute for free tier; pricing changes 2023 hurt OSS trust; minor vendor lock-in concern. |
| Codacy | Code quality analysis only; not a CI execution platform. |
| Code Climate | Code quality analysis only. |
| CodeFresh | Docker/Kubernetes-native CI; adds K8s dependency Raxx does not have; overkill for Heroku+CF targets. |
| Codemagic | Flutter/mobile focus; does not address Python/C++/Terraform shape. |
| Codeship | Cloud-only; pricing moved to Cloudbees ownership; uncertain maintenance trajectory. |
| Concourse CI | Self-hosted; steep learning curve (unique pipeline-as-code model differs from all others); no YAML that maps to GHA concepts; high migration cost; small community. |
| Continua CI | Windows-first; requires FinalBuilder integration; does not fit Linux-native stack. |
| Continuous (continuous.sh) | EU-hosted managed runners for GHA/GitLab CI; a runner alternative only, not a CI system replacement. Relevant as a cost solution but out of scope for full migration comparison. |
| continuousphp | PHP-only. |
| Coveralls | Coverage reporting only; not a CI execution platform. |
| Coverity | Static analysis only (C/C++/Java/C#). |
| Crow CI | Very early-stage; limited documentation; GitHub support listed but production-readiness unclear. |
| Dagger | Pipeline-as-code SDK (runs inside existing CI); not a standalone CI replacement; would run on top of GHA or another runner host. |
| Drone | Community edition maintenance has stalled since Harness acquisition; Woodpecker (community fork) is the active successor; see Woodpecker entry. |
| Ebert | Static analysis PR comments only; not a CI execution platform. |
| Evergreen | MongoDB-internal tool open-sourced; limited external adoption; complex distributed setup; no practical path for solo operator. |
| flow.ci | Last release 2022; community inactive. |
| GitGud | GitLab-hosted; requires migrating repo off GitHub; hard filter 1 fails. |
| GitLab CI/CD | Requires GitLab SCM or cross-SCM mirroring; repo stays on GitHub; native GitHub status check integration is a workaround, not first-class; trades one lock for another with worse economics at this scale. |
| gitlab-ci-local | Local dev runner for GitLab CI YAML; not a hosted CI service. |
| GoCD | JVM agent required on runner (Go server + agent are Java); excluded by JVM filter. |
| Hound CI | Code style/lint comments only; not a CI execution platform. |
| Hydra | Nix-based; requires Nix toolchain on all runners; adds a heavy infrastructure dependency. |
| Jaypore CI | Gitea-only SCM; hard filter 1 fails. |
| Jenkins | JVM on controller and agents; plugin ecosystem maintenance burden; excluded by JVM filter + ops overhead for solo. |
| Kraken CI | Test-execution focus; complex distributed setup (Starlark/Python workflows); low adoption outside its niche. |
| Laminar CI | Minimal CGI-based CI; no concept of pull request integration or status checks; effectively a job runner, not a CI platform. |
| minci | Toy/research project; no documentation or production use evidence. |
| mvoCI | Very early-stage; minimal documentation and adoption. |
| PandaCI | TypeScript-only pipeline definitions; does not support YAML pipelines; young project. |
| Peakflow | Last meaningful activity 2019; abandoned. |
| Pipelight | CLI automation tool, not a hosted CI platform; no pull request integration. |
| Previs | Local Travis CI runner; no hosted component; not a CI migration candidate. |
| Probo.CI | Focused on environment-per-PR for Drupal/CMS stacks; does not generalize to Python/C++ workloads. |
| RazorOps | Container-native Kubernetes CI; adds K8s dependency; unclear longevity (small startup, limited public traction). |
| Saucelabs | Browser/device testing cloud only; not a general CI platform. |
| Scrutinizer | Code quality analysis with limited build execution; primary value is language-specific quality metrics, not build/deploy pipelines. |
| Semaphore CE (self-hosted) | Interesting; see shortlist note. Requires Kubernetes to self-host; ops overhead too high for solo pre-launch. Cloud version has competitive pricing but is SaaS lock. Borderline excluded. |
| Sider | Code review automation (PR comments); not a CI execution platform. |
| SonarQube | Static analysis; integrates as a CI step but is not a CI platform itself. |
| StyleCI | PHP/JS code style enforcement only. |
| SurplusCI | Dedicated runner add-on (not a CI replacement); plugs into existing CI. Similar to Continuous.sh in scope. |
| TeamCity | JVM server and agent; excluded by JVM filter. |
| Tekton | Kubernetes-native CRD pipeline system; adds K8s dependency; no practical self-host path for solo operator without a K8s cluster. |
| Travis CI | Post-Idera acquisition pricing removed free tiers; ongoing community trust deficit; not recommended for new projects. |
| Thundra Foresight | Java-only test visibility; not a CI platform. |
| Vela | Docker-based; GitHub support via webhook; self-hosted required; limited community and documentation; uncertain maintenance. |
| Wercker | Oracle acquisition; service appears to be winding down; no safe choice for new adoption. |
| Zuul | OpenStack project gating system; requires Gerrit or deep GitHub configuration; enterprise complexity for solo operator. |

6. Shortlist

Three candidates survive hard filters plus the architect-added Buildkite. Buildkite is not in the awesome-ci list; it is included because it is a widely-deployed production tool with direct relevance to Raxx's hybrid (cloud-agent + self-hosted) shape and because issue #728 (Ubicloud) implies operator comfort with hybrid runner models.


6.1 Ubicloud Managed Runners (GHA-compatible)

URL: https://www.ubicloud.com/use-cases/github-actions
License: Ubicloud core is Apache 2.0 (OSS); managed runner service is SaaS
Maintainer: Ubicloud Inc. (Y Combinator W24)

Fit summary: This is not a CI platform replacement; it is a drop-in GitHub Actions runner replacement. Pipelines stay as .github/workflows/*.yml files, and the only YAML edit is the runner label. Ubicloud runners are already named as the standing trigger in operator memory (project_ci_billing: "Ubicloud drop-in if burn sustains >5,000 min/month"). The value proposition is pure cost reduction: Ubicloud charges approximately 1/10 the per-minute rate of GitHub-hosted runners and supports GitHub OIDC token exchange, which is the mechanism Raxx uses to authenticate into AWS SSM and Infisical. Issue #728 already exists and is groomed. This is the lowest-friction path.

Top 3 fits:

  1. Near-zero YAML migration: the existing 34 workflows run unchanged apart from the runner label.
  2. GitHub OIDC support means AWS SSM and Infisical auth are unaffected.
  3. Directly addresses the cost-trigger threshold already set by the operator (#728).

Top 3 misfits / risks:

  1. SaaS dependency on a young startup (YC W24, 2024 founding); longevity unproven.
  2. Does not address the structural friction signals (duplicate CI name, missing environment protection rules). Those are GHA platform issues that Ubicloud cannot fix.
  3. Price and performance claims (10x cheaper, comparable speed) have not been validated against Raxx's C++ CMake build times or Terraform plan times; BLR should confirm benchmark data.

Migration effort: XS. No pipeline logic changes; the only edit is swapping the runner label from ubuntu-latest to ubicloud-standard-2 (or equivalent) in the workflow files. This can be done workflow by workflow as a canary.
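
The label swap itself can be scripted. Below is a minimal sketch, assuming jobs use the plain scalar form `runs-on: ubuntu-latest`; array forms and matrix expressions such as `${{ matrix.os }}` are deliberately left untouched and would need manual review. The function name is illustrative.

```python
import re

# Matches a `runs-on: ubuntu-latest` line with any indentation, optional quotes,
# and an optional trailing comment. Other labels (e.g. ubuntu-22.04) do not match.
_RUNS_ON_RE = re.compile(
    r"""^(?P<prefix>[ \t]*runs-on:[ \t]*)["']?ubuntu-latest["']?(?P<suffix>[ \t]*(?:#.*)?)$""",
    re.MULTILINE,
)


def swap_runner_label(text: str, new_label: str = "ubicloud-standard-2") -> tuple[str, int]:
    """Return the rewritten workflow text and the number of labels replaced."""
    return _RUNS_ON_RE.subn(
        lambda m: f"{m['prefix']}{new_label}{m['suffix']}",
        text,
    )
```

Returning the replacement count makes it easy to confirm, per workflow, that the canary touched exactly the jobs intended and nothing else.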

What you would give up: Nothing in terms of GHA features; all GHA syntax and APIs remain available. You give up GitHub's runner availability SLA (Ubicloud is smaller, with less redundancy).

What you would gain: Estimated 60-80% cost reduction on runner minutes. Eliminates the billing-threshold anxiety driving #726 and #728.
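
The arithmetic behind the cost claim can be made explicit. This is a sketch under stated assumptions: the 5,000 min/month trigger and the "approximately 1/10" rate ratio come from this document, while the GHA per-minute rate is a caller-supplied input that BLR still needs to confirm. No real price is hard-coded, and the function names are illustrative.

```python
BURN_THRESHOLD_MIN_PER_MONTH = 5_000  # operator trigger cited in this document


def over_threshold(monthly_minutes: float) -> bool:
    """True when monthly burn exceeds the operator's Ubicloud trigger."""
    return monthly_minutes > BURN_THRESHOLD_MIN_PER_MONTH


def estimate_savings(monthly_minutes: float, gha_rate: float,
                     ubicloud_ratio: float = 0.10) -> dict[str, float]:
    """Compare runner-minute cost before and after a runner swap.

    gha_rate is the current per-minute rate (an input, not a quoted price);
    ubicloud_ratio is the assumed Ubicloud/GHA rate ratio (about 1/10 per
    the fit summary, unverified).
    """
    before = monthly_minutes * gha_rate
    after = before * ubicloud_ratio
    return {
        "before": before,
        "after": after,
        "saved": before - after,
        "reduction_pct": 100.0 * (1.0 - ubicloud_ratio),
    }
```

Under a flat 1/10 ratio the runner-minute component drops by 90%; the 60-80% estimate above is more conservative, and which figure holds depends on the pricing and job mix BLR confirms.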


6.2 Woodpecker CI (self-hosted)

URL: https://woodpecker-ci.org/
License: Apache 2.0
Maintainer: Woodpecker CI community (active fork of Drone CI post-Harness acquisition)
Stars: ~4,500 (woodpecker-ci/woodpecker)
Last commit: Active (weekly releases as of early 2026)

Fit summary: Woodpecker is the community-maintained successor to Drone CI. It runs as a small Go binary + agent pair with a Docker executor. Pipeline definitions are YAML files stored in the repository. Woodpecker connects to GitHub via OAuth app and posts status checks natively. It supports cron pipelines, manual triggers (equivalent to workflow_dispatch via the UI or API), and secrets storage. The self-hosted model means runner costs are infrastructure costs (a $5-$12/month VPS or an existing EC2 instance). Woodpecker does not have a native Infisical integration, but its secret injection model (env vars at pipeline runtime) is compatible with a pre-step that calls the Infisical API or AWS SSM Parameter Store directly. The YAML syntax is similar to Drone and partially similar to GHA; migration requires a full rewrite of pipeline files.

Top 3 fits:

  1. Zero ongoing SaaS cost: infrastructure cost only (a single VPS can host the server + agent).
  2. Apache 2.0 license; no vendor lock-in; operator owns the data.
  3. Active community with frequent releases; not at risk of Harness-style acquisition disruption.

Top 3 misfits / risks:

  1. Full YAML rewrite required: 34 workflows need to be ported. Woodpecker's pipeline model (steps inside a pipeline, no composite-action equivalent) maps imperfectly to GHA's uses: composite actions; load-vault-secrets and notify-deploy-status would need rewriting as pre/post steps.
  2. No native equivalent of environment protection rules or deployment environments, so the GHA entitlement gap is replaced with a different feature gap.
  3. Ops overhead: the operator must maintain the Woodpecker server binary, agent(s), and a database for pipeline state (Postgres here; Woodpecker defaults to embedded SQLite). Pre-launch this adds risk.

Migration effort: L. All 34 workflow files need full rewrites. Composite actions become Woodpecker pipeline templates or shell scripts. Cron syntax is different. The secret injection model requires wrapping Infisical/SSM calls into pipeline init steps for all 34 workflows. Estimated 2-3 weeks of focused migration work post-launch.
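
A pipeline init step of the kind described above could look like the following. This is a minimal sketch, assuming the AWS CLI is installed and authenticated on the runner; the environment variable and parameter names in the usage note are hypothetical, and an Infisical variant would follow the same shape with a different command. The `run` parameter exists only so the command runner can be replaced in tests.

```python
import subprocess


def ssm_command(parameter_name: str) -> list[str]:
    """Build the AWS CLI call that prints one decrypted SSM parameter value."""
    return [
        "aws", "ssm", "get-parameter",
        "--name", parameter_name,
        "--with-decryption",
        "--query", "Parameter.Value",
        "--output", "text",
    ]


def fetch_secrets(names: dict[str, str], run=subprocess.run) -> dict[str, str]:
    """Resolve secrets for a pipeline step.

    `names` maps an environment variable name to an SSM parameter name.
    Raises subprocess.CalledProcessError if any lookup fails, so the step
    aborts instead of continuing with a missing secret.
    """
    env = {}
    for var, parameter_name in names.items():
        result = run(ssm_command(parameter_name),
                     check=True, capture_output=True, text=True)
        env[var] = result.stdout.strip()
    return env
```

A step would then launch its real command with the merged environment, for example `subprocess.run(cmd, env={**os.environ, **fetch_secrets({"POSTMARK_TOKEN": "/example/postmark_token"})}, check=True)`, which keeps secret values out of the pipeline YAML.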

What you would give up:

- GHA Marketplace actions (each third-party uses: step needs a Woodpecker equivalent or shell substitution)
- Composite actions (load-vault-secrets, notify-deploy-status need to be rebuilt)
- GitHub-native deployment environments and environment protection rules
- The agent dispatch surface in Console (Ops Dispatch currently calls workflow_dispatch via GH API; Woodpecker has an API but Ops Dispatch would need updating)

What you would gain:

- Infrastructure cost predictability (fixed VPS cost vs per-minute billing)
- Full data ownership / no GH billing entanglement
- Pipeline YAML lives in the repo alongside the application code, not in .github/ with GHA-specific syntax


6.3 Buildkite (architect addition — not in awesome-ci list)

URL: https://buildkite.com/
License: Buildkite Agent is MIT (open source); platform is commercial SaaS
Maintainer: Buildkite Pty Ltd (founded 2013; profitable, independent)
Stars: ~3,000 (buildkite/agent)

Why included despite not being in awesome-ci: Buildkite is a production-grade CI platform used by Shopify, Airbnb, Canva, and others at scale. It follows a hybrid model: the SaaS controller handles scheduling, pipeline UI, and PR status checks, while compute runs on operator-owned infrastructure (EC2, Heroku, local machine). The agent is MIT-licensed and open source. This model directly addresses Raxx's tension between GHA's managed convenience and the cost/control concerns driving #728. The awesome-ci list's omission is a meaningful gap in that list.

Fit summary: Buildkite pipelines are YAML files in the repository (.buildkite/pipeline.yml). The platform connects to GitHub via OAuth app and posts status checks natively. It supports cron-equivalent scheduled builds, manual triggers (webhook + API, direct equivalent to workflow_dispatch), and a secret injection model based on environment hooks. Infisical integration can be achieved via the agent's pre-command hook calling the Infisical CLI before each step. AWS SSM integration is similar. The free tier covers 1 user and 1 agent — exactly Raxx's current shape. Paid tiers start at $35/month for small teams.

Top 3 fits:

  1. Hybrid model: SaaS controller (no infra to maintain) + operator-owned compute (cost control). Directly matches the operator's stated comfort with self-hosted runners.
  2. The free tier (1 user, 1 agent) covers Raxx's solo-operator shape at $0 for the platform portion.
  3. Mature, stable platform (12 years); no acquisition risk visible; bootstrapped/profitable.

Top 3 misfits / risks:

  1. YAML migration required: .buildkite/pipeline.yml syntax is simpler than GHA but not compatible. 34 workflows need rewriting (roughly S-M effort per workflow).
  2. The SaaS controller layer, while small, is still a vendor dependency. The agent is open source but the scheduling/UI plane is not.
  3. GHA Marketplace actions have no Buildkite equivalent; each third-party uses: step needs a shell script substitution or a Docker plugin.

Migration effort: M. Pipeline syntax is simpler than GHA (fewer special constructs). Composite actions map to pipeline upload steps or plugins. Cron and dispatch are first-class. Estimated 1-2 weeks of focused migration work post-launch. The migration is smaller than Woodpecker's because Buildkite's YAML model is shell-first and does not require a full shift in mental model.

What you would give up:

- GHA Marketplace ecosystem (all uses: third-party actions need shell equivalents)
- GitHub deployment environments and protection rules (Buildkite has its own deployment model)
- Console Ops Dispatch workflow_dispatch call (needs updating to a Buildkite API trigger)
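
To size the Ops Dispatch change, the two trigger calls can be compared side by side. The sketch below only builds the HTTP requests and does not send them. The function names, the example owner/repo/pipeline slugs in the usage note, and the choice to map workflow_dispatch inputs onto Buildkite build environment variables are assumptions for illustration; they do not describe how Ops Dispatch is implemented today.

```python
import json
import urllib.request


def github_dispatch_request(owner: str, repo: str, workflow_file: str,
                            ref: str, inputs: dict, token: str) -> urllib.request.Request:
    """Build the GitHub REST call that fires a workflow_dispatch event."""
    url = (f"https://api.github.com/repos/{owner}/{repo}"
           f"/actions/workflows/{workflow_file}/dispatches")
    body = json.dumps({"ref": ref, "inputs": inputs}).encode()
    return urllib.request.Request(url, data=body, method="POST", headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
        "Content-Type": "application/json",
    })


def buildkite_build_request(org: str, pipeline: str, branch: str,
                            env: dict, token: str,
                            message: str = "Triggered from Ops Dispatch") -> urllib.request.Request:
    """Build the Buildkite REST call that creates a build on a pipeline.

    Assumption: values that were workflow_dispatch inputs are passed as
    build environment variables.
    """
    url = (f"https://api.buildkite.com/v2/organizations/{org}"
           f"/pipelines/{pipeline}/builds")
    body = json.dumps({
        "commit": "HEAD",
        "branch": branch,
        "message": message,
        "env": env,
    }).encode()
    return urllib.request.Request(url, data=body, method="POST", headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    })
```

Either request would be sent with `urllib.request.urlopen(req)`. The two calls have the same shape (one authenticated POST with a small JSON body), which suggests the Ops Dispatch update is mostly a change of URL, payload keys, and token rather than a redesign.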

What you would gain:

- Compute cost control (runner is operator-owned EC2 or any machine)
- Simpler pipeline YAML (Buildkite's model is closer to sequential shell steps; less YAML magic)
- Better pipeline visualization and test analytics (built-in)
- Stable vendor with 12-year track record


6.4 Cirrus CI (cloud, pay-per-second)

URL: https://cirrus-ci.org/
License: Cirrus CI is proprietary SaaS; some components open source
Maintainer: Cirrus Labs (small team; active)

Fit summary: Cirrus CI integrates natively with GitHub and posts status checks. It supports Linux, macOS, Windows, and FreeBSD. Configuration is a .cirrus.yml YAML file in the repository. Pricing is per-second for private projects. The platform supports cron tasks, manual triggers via the API, and bring-your-own-compute (GCP, AWS, Azure task definitions). For Raxx's workload, the relevant feature is that it runs Docker containers and arbitrary shell commands, so Python, Node, C++, and Terraform all work. Secret injection is via encrypted variables stored in the Cirrus dashboard (not native Infisical/SSM), but a pre-step calling the Infisical CLI or AWS CLI achieves the same effect. The platform is lean and has no JVM dependency.

Top 3 fits:

  1. Native GitHub integration with first-class status checks and PR comments.
  2. Pay-per-second billing (not per-minute) reduces waste from short jobs.
  3. Bring-your-own-compute option (GCP/AWS task definitions) for cost control.

Top 3 misfits / risks:

  1. Smaller team behind the product; longevity risk if the company does not scale.
  2. YAML migration required: .cirrus.yml syntax is different from GHA; estimated M effort.
  3. Weaker ecosystem than GHA or Buildkite; no marketplace-style reusable components.

Migration effort: M. YAML rewrite required for all 34 workflows. Cron and manual triggers are supported but with different syntax. Pre-step secret injection needs to be coded for each workflow. Estimated 1-2 weeks post-launch.

What you would give up:

- GHA Marketplace actions
- Composite actions (need shell equivalents)
- GitHub deployment environments

What you would gain:

- Per-second billing (lower waste than GHA per-minute)
- Bring-your-own-compute for larger jobs (C++ CMake builds, Terraform plan)


7. Recommendation

Leading candidate: Ubicloud Managed Runners (6.1)

Rationale: The explicit operator memory entry (project_ci_billing) already names Ubicloud as the drop-in trigger at 5,000 min/month, and issue #728 (groomed, open) is the implementation card. This candidate requires no YAML changes beyond the runner label: all 34 workflows, composite actions, secret injection patterns, and the Console Ops Dispatch workflow_dispatch calls otherwise remain unchanged. The migration risk is XS. The cost reduction is material. BLR's primary questions are runner pricing validation and startup longevity diligence.

This candidate does not solve the GHA structural friction signals (duplicate CI name, missing environment protection rules). Those are separate cleanup tasks independent of runner choice. Ubicloud is a cost solution, not a platform solution.

Runner-up: Buildkite (6.3)

If BLR's diligence disqualifies Ubicloud (e.g., startup longevity concern, pricing not competitive at Raxx's actual volume), Buildkite is the runner-up. It is the only other candidate with a credible free-tier path for a solo operator, a mature commercial track record, and a hybrid model that keeps compute costs on operator-owned infrastructure. Migration effort is M (1-2 weeks post-launch) rather than XS, but the trade is a stable 12-year vendor vs a 2-year-old startup.


8. Open Questions for BLR

Ubicloud (leading candidate):

  1. What is Ubicloud's current per-minute pricing for standard-2 (2 vCPU, 8 GB) and standard-8 (8 vCPU, 32 GB) runners, and how does it compare to GHA's ubuntu-latest at Raxx's actual monthly consumption (current average: 30+ runs/day across 34 workflows)?
  2. What are the SLA, data residency, and security certifications for Ubicloud's managed runner fleet? Does Ubicloud publish a SOC 2 report or equivalent?
  3. What are the contract terms for Ubicloud's managed runner service: is it month-to-month, and is there a fallback (self-hosted Ubicloud OSS) if the SaaS is discontinued?

Buildkite (runner-up):

  4. Does the Buildkite free tier (1 user, 1 agent) have any restrictions on build minutes, artifact storage, or pipeline count that would constrain Raxx's 34-workflow shape? What is the price of the first paid tier and what does it add?
  5. What are Buildkite's data processing terms under GDPR? Specifically, does pipeline log data (which may contain redacted secrets or PII in error traces) transit through Buildkite's SaaS controller, and is a DPA available?

General:

  6. Is there a path to resolve the naming conflict between the two "CI" workflows and restore reliable branch protection on GitHub Actions today, without a runner migration? If yes, does that change the urgency of migration at all?


9. Risks and Caveats


10. Migration Timeline Sketch

This is post-launch work. Do not begin before 2026-05-23.

2026-05-13 to 2026-05-23   PRE-LAUNCH FREEZE — no CI infra changes
                            Punch-list items OK: fix ci.yml YAML parse error,
                            rename duplicate "CI" workflow, pursue GH support ticket
                            on environment protection rules

2026-05-24 to 2026-05-30   BLR financial/licensing research (per Section 8 questions)
                            Operator reviews BLR report; makes go/no-go on migration

Week of 2026-06-02          IF go: Ubicloud canary
                            Swap 1-2 non-critical workflows (e.g., ci-digest-cron) to
                            Ubicloud runners. Observe cost, latency, reliability for
                            1 week before expanding.

Week of 2026-06-09          Ubicloud expansion
                            Migrate remaining workflows runner label. Full migration
                            takes 1-2 hours of label substitution if canary passes.

Week of 2026-06-09+         IF Buildkite selected instead (M effort):
                            Port workflows one service at a time (backend_v2 first,
                            then console, velvet, queue, frontend, Terraform).
                            Update Console Ops Dispatch to call Buildkite trigger API.
                            Estimated 2-3 sprints post-launch.

Appendix: Scoring Matrix

| Candidate | Ops overhead | Vendor lock-in | Migration cost | Cron | Dispatch | Secret injection | Cost model | GH PR depth | Multi-lang | Community |
|---|---|---|---|---|---|---|---|---|---|---|
| Ubicloud | Low | Med (SaaS startup) | XS (zero YAML change) | N/A (stays GHA) | N/A | N/A (stays GHA OIDC) | ~$0.008/min est. | Native (GHA) | Native (GHA) | Growing |
| Woodpecker | Med (self-host) | Low (Apache 2.0) | L (full rewrite) | Yes | Yes (API) | Pre-step hook | VPS fixed cost | Good (status checks) | Yes | Active |
| Buildkite | Low-Med | Med (SaaS + MIT agent) | M (full rewrite, simpler syntax) | Yes | Yes (API) | Agent hooks | Free-$35/mo | Good (status checks) | Yes | Mature |
| Cirrus CI | Low | Med (SaaS) | M (full rewrite) | Yes | Yes (API) | Pre-step | Per-second | Good (status checks) | Yes | Small |