
Cost Reduction Strategies

Overview

This guide covers proven strategies to reduce GitLab CI/CD compute minute consumption. Strategies are organized by impact level and implementation complexity.

Strategy 1: Eliminate Unnecessary Pipelines

Problem

Running pipelines that provide no value wastes minutes:

Full pipelines on draft MRs during development
Pipelines when only documentation changes
Pipelines on mirror updates or automated commits
Duplicate pipelines (branch + MR) for the same commit

Solution A: Skip Pipelines on Draft MRs

Basic Implementation:

workflow:
  rules:
    # Skip pipeline for draft MRs
    - if: $CI_MERGE_REQUEST_TITLE =~ /^(Draft|WIP|draft|wip):/
      when: never
    # Run for MRs and default branch
    - if: $CI_MERGE_REQUEST_IID
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH

Advanced - Allow Limited Jobs:

workflow:
  rules:
    - if: $CI_MERGE_REQUEST_TITLE =~ /^Draft:/
      variables:
        DRAFT_MODE: "true"
    - when: always

# Only lint/fast checks in draft mode
lint:
  rules:
    - when: always
  script:
    - npm run lint

test:
  rules:
    - if: $DRAFT_MODE != "true"
  script:
    - npm run test

deploy:
  rules:
    - if: $DRAFT_MODE != "true" && $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
  script:
    - npm run deploy

Savings: 20-40% for teams that iterate on MRs

Solution B: Disable Mirror Update Pipelines

workflow:
  rules:
    - if: $CI_PIPELINE_SOURCE == "push" && $CI_COMMIT_MESSAGE =~ /mirrored from/
      when: never
    - when: always

Solution C: Reduce Scheduled Pipeline Frequency

Before:

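# Illustrative notation: schedules are defined in the project's pipeline
# schedule settings (UI or API), not in .gitlab-ci.yml
# Runs every hour = 720 pipelines/month
schedule:
  cron: "0 * * * *"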

After:

# Runs every 6 hours = 120 pipelines/month
schedule:
  cron: "0 */6 * * *"

# Or only weekdays
schedule:
  cron: "0 9 * * 1-5"  # 9 AM Mon-Fri = 20 pipelines/month

Savings: About 83% on scheduled pipeline costs when moving from hourly to every 6 hours (720 to 120 runs); roughly 97% with a single weekday run
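The monthly run counts in the comments above are simple arithmetic. A minimal sketch, assuming a 30-day month and an interval that divides 24 evenly; the helper name `runs_per_month` is hypothetical:

```shell
# runs_per_month INTERVAL_HOURS [DAYS]
# Number of scheduled pipelines per month for a schedule that fires every
# INTERVAL_HOURS hours. Assumes INTERVAL_HOURS divides 24 and a 30-day month.
runs_per_month() {
  interval_hours=$1
  days=${2:-30}
  echo $(( 24 / interval_hours * days ))
}

echo "hourly:        $(runs_per_month 1) runs"
echo "every 6 hours: $(runs_per_month 6) runs"
```

Multiplying the result by the average minutes per scheduled pipeline gives a rough monthly cost for each schedule.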

Solution D: Prevent Duplicate Pipelines

Problem: A push to a branch with an open MR triggers both a branch pipeline and an MR pipeline.

workflow:
  rules:
    # For merge requests, only run merge request pipeline
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    # For default branch, run branch pipeline
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
    # For tags
    - if: $CI_COMMIT_TAG
    # All other branch pipelines are skipped because no rule matches

Savings: Up to 50% on branches with open MRs, where every push previously ran two pipelines

Strategy 2: Auto-Cancel Redundant Pipelines

Problem

When pushing multiple commits rapidly, old pipelines continue running even though results are obsolete.

Example:

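10:00 - Push commit A → Pipeline #100 starts (20 min)
10:05 - Push commit B → Pipeline #101 starts (20 min)
10:10 - Push commit C → Pipeline #102 starts (20 min)

Without auto-cancel: All 3 run to completion = 60 minutes consumed, 40 of them on obsolete commits
With auto-cancel: #100 and #101 are canceled after about 5 minutes each, so only #102 runs in full = about 30 minutes consumed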
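The timeline above can be turned into a small calculation. This is a sketch under stated assumptions: each pipeline is treated as a single job at cost factor 1, a new push cancels the previous pipeline immediately, and a canceled pipeline is charged for the minutes it ran before cancellation. The function name `consumed_minutes` is hypothetical.

```shell
# consumed_minutes DURATION START...
# Minutes consumed when each new push cancels the previous pipeline.
# DURATION is the full pipeline run time; START values are push times in
# minutes, in ascending order.
consumed_minutes() {
  duration=$1
  shift
  total=0
  prev=""
  for start in "$@"; do
    if [ -n "$prev" ]; then
      ran=$(( start - prev ))
      # A pipeline never runs longer than its full duration
      [ "$ran" -gt "$duration" ] && ran=$duration
      total=$(( total + ran ))
    fi
    prev=$start
  done
  # The last pipeline is never canceled and runs to completion
  [ -n "$prev" ] && total=$(( total + duration ))
  echo "$total"
}

echo "with auto-cancel:    $(consumed_minutes 20 0 5 10) min"
echo "without auto-cancel: $(( 3 * 20 )) min"
```

The gap between pushes determines the savings: the closer together the pushes, the less each canceled pipeline costs.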

Solution: Workflow Auto-Cancel

Basic Configuration:

workflow:
  auto_cancel:
    on_new_commit: interruptible

Options:

conservative (default): Cancel the pipeline, but only if no job with interruptible: false has started yet
interruptible (recommended): Cancel only jobs marked interruptible: true
none: Do not auto-cancel any jobs

Job Configuration:

# Most jobs can be interrupted
build:
  interruptible: true
  script:
    - npm run build

test:
  interruptible: true
  script:
    - npm test

# Only deployment should NOT be interrupted
deploy:
  interruptible: false  # Don't cancel mid-deploy!
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
  script:
    - npm run deploy

Advanced - Cancel on Job Failure:

workflow:
  auto_cancel:
    on_new_commit: interruptible
    on_job_failure: all  # Cancel all jobs if one fails

Savings: 20-40% for active development branches

Strategy 3: Skip Jobs When Files Unchanged

Problem

Running tests for unchanged components wastes minutes.

Example: Documentation change triggers full test suite

Solution: rules:changes

Basic Pattern:

# Only run tests when code changes
test:
  rules:
    - changes:
        - "src/**/*"
        - "package*.json"
  script:
    - npm test

# Only build docs when docs change
docs:
  rules:
    - changes:
        - "docs/**/*"
        - "*.md"
  script:
    - mkdocs build

Advanced - Skip Entire Pipeline:

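workflow:
  rules:
    # rules:changes matches if ANY changed file matches, so a docs pattern
    # with when: never would also skip MRs that mix docs and code changes.
    # Instead, run MR pipelines only when code paths change
    # (adjust the paths to your project).
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      changes:
        paths:
          - "src/**/*"
          - "package*.json"
          - ".gitlab-ci.yml"
        compare_to: $CI_MERGE_REQUEST_TARGET_BRANCH_NAME
    # Skip MR pipelines that touch none of the paths above
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      when: never
    - when: always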

Monorepo Pattern:

# Only test affected services
test:agent-mesh:
  rules:
    - changes:
        - "services/agent-mesh/**/*"
        - "shared/libraries/**/*"  # Shared dependencies
  script:
    - cd services/agent-mesh && npm test

test:agent-router:
  rules:
    - changes:
        - "services/agent-router/**/*"
        - "shared/libraries/**/*"
  script:
    - cd services/agent-router && npm test

Compare Against the Target Branch:

# Compare against base branch
test:
  rules:
    - if: $CI_MERGE_REQUEST_IID
      changes:
        paths:
          - "src/**/*"
        compare_to: $CI_MERGE_REQUEST_TARGET_BRANCH_NAME
  script:
    - npm test

Savings: 15-25% in monorepos, 10-15% in standard projects
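Before relying on rules:changes, it can help to check locally which jobs a change set would select. The sketch below is an approximation, not GitLab's matcher: the regex stands in for the globs "src/**/*" and "package*.json", and the helper name and the `origin/main` ref are assumptions.

```shell
# any_path_matches PATTERN
# Reads changed file paths on stdin (one per line) and succeeds if at least
# one matches the extended regex PATTERN. This mirrors how rules:changes
# behaves: a single matching file is enough to select the job.
any_path_matches() {
  grep -Eq "$1"
}

# Rough regex equivalent of the globs "src/**/*" and "package*.json"
CODE_PATHS='^(src/|package[^/]*\.json$)'

# In a repository, feed it the diff against the target branch, for example:
#   git diff --name-only origin/main...HEAD | any_path_matches "$CODE_PATHS"
if printf 'docs/guide.md\nsrc/index.js\n' | any_path_matches "$CODE_PATHS"; then
  echo "test job would run"
fi
```

Note that a change set mixing documentation and code still selects the job, because one matching path is sufficient.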

Strategy 4: Fail Fast

Problem

Letting every job run to completion after an early failure has already made success impossible wastes minutes.

Example: Build fails but 15 test jobs continue running = wasted minutes

Solution A: Interruptible Jobs + Auto-Cancel

workflow:
  auto_cancel:
    on_job_failure: all  # Cancel all jobs when one fails

# Mark jobs as interruptible
lint:
  interruptible: true
  script:
    - npm run lint  # If this fails, all other jobs are canceled

test:unit:
  interruptible: true
  needs: [lint]  # Only run after lint passes
  script:
    - npm test

test:e2e:
  interruptible: true
  needs: [lint]
  script:
    - npm run test:e2e

Options:

all: Cancel all remaining jobs
none: Don't auto-cancel (default)

Solution B: Job Dependencies with needs

stages:
  - validate
  - test
  - deploy

# Fast validation first
lint:
  stage: validate
  script:
    - npm run lint

# Tests only run if lint passes
test:unit:
  stage: test
  needs: [lint]
  script:
    - npm test

test:integration:
  stage: test
  needs: [lint]
  script:
    - npm run test:integration

# Deploy only if all tests pass
deploy:
  stage: deploy
  needs: [test:unit, test:integration]
  script:
    - npm run deploy

Solution C: Early Exit in Scripts

test:
  script:
    # Run fast checks first
    - npm run lint || exit 1
    - npm run type-check || exit 1
    - npm run security-check || exit 1
    # Only run slow tests if fast checks pass
    - npm run test:unit
    - npm run test:integration
    - npm run test:e2e

Savings: 10-20% by stopping failed pipelines early

Strategy 5: Parallel Execution Optimization

Problem

Sequential jobs lengthen the pipeline without changing the minutes consumed: they cost developer time rather than quota.

Note: Parallelization reduces pipeline duration, not compute minutes consumed. However, it improves developer experience and can reduce costs by enabling faster failure detection.

Solution: Strategic use of needs

Before (Sequential):

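# Total duration: 40 minutes
stages:
  - build
  - unit
  - integration

build:
  stage: build
  script: sleep 600  # 10 min

test:unit:
  stage: unit  # Waits for the build stage
  script: sleep 600  # 10 min

test:integration:
  stage: integration  # Waits for the unit stage
  script: sleep 1200  # 20 min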

After (Parallel):

build:
  script: sleep 600  # 10 min

test:unit:
  needs: [build]
  script: sleep 600  # 10 min (starts after build)

test:integration:
  needs: [build]
  script: sleep 1200  # 20 min (parallel with unit)

# Total duration: 30 minutes (10 build + 20 integration)
# Compute minutes: Still 40 minutes

Best Practice: Parallelize independent jobs, sequence dependent jobs
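The difference between pipeline duration and compute minutes can be checked with the job durations from the example. A minimal sketch; the helper names `sum_of` and `max_of` are hypothetical:

```shell
# max_of A B ...: largest of the given integers
max_of() {
  max=$1
  for n in "$@"; do
    [ "$n" -gt "$max" ] && max=$n
  done
  echo "$max"
}

# sum_of A B ...: sum of the given integers
sum_of() {
  sum=0
  for n in "$@"; do
    sum=$(( sum + n ))
  done
  echo "$sum"
}

# Job durations in minutes, taken from the example
build=10
unit=10
integration=20

# Sequential: every job waits for the previous one
sequential=$(sum_of "$build" "$unit" "$integration")
# Parallel: both test jobs start after build, so the slower one decides
parallel=$(( build + $(max_of "$unit" "$integration") ))
# Compute minutes are the sum of all job durations in either layout
compute=$(sum_of "$build" "$unit" "$integration")

echo "sequential duration: $sequential min"
echo "parallel duration:   $parallel min"
echo "compute minutes:     $compute min (unchanged)"
```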

Strategy 6: Resource Class Selection

Problem

Using oversized runners or expensive runner types unnecessarily.

Cost Factor Reference

Runner Type	Cost Factor	Example Job Cost
Linux Small	1x	10 min job = 10 min
Linux Medium	2x	10 min job = 20 min
Linux Large	4x	10 min job = 40 min
Windows	2x	10 min job = 20 min
macOS	6x	10 min job = 60 min
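The "Example Job Cost" column is the job's wall-clock minutes multiplied by the runner's cost factor. The sketch below applies the same rule to a whole pipeline; the function name `billed_minutes` and the three-job pipeline are made up for illustration, and the factors are passed in as arguments, not hard-coded.

```shell
# billed_minutes PAIR...
# Each PAIR is "<wall-clock minutes>:<cost factor>" for one job.
# Prints the compute minutes charged against the quota.
billed_minutes() {
  total=0
  for pair in "$@"; do
    minutes=${pair%%:*}
    factor=${pair##*:}
    total=$(( total + minutes * factor ))
  done
  echo "$total"
}

# A 10-minute job at cost factors 1 and 6 (rows from the table above)
echo "factor 1: $(billed_minutes 10:1) min"
echo "factor 6: $(billed_minutes 10:6) min"
# A pipeline with three jobs on different runner types
echo "pipeline: $(billed_minutes 2:1 10:2 10:6) min"
```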

Solution: Right-Size Runners

Use Tags to Select Runners:

# Small jobs - use small runners
lint:
  tags:
    - saas-linux-small-amd64
  script:
    - npm run lint

# Heavy builds - use medium
build:
  tags:
    - saas-linux-medium-amd64
  script:
    - npm run build

# Only use large for truly intensive work
test:e2e:
  tags:
    - saas-linux-large-amd64
  script:
    - npm run test:e2e:parallel

Avoid Expensive Runners:

# DON'T: Use macOS unless you need it
build:
  tags:
    - saas-macos-medium-m1  # 6x cost factor!
  script:
    - npm run build

# DO: Use Linux when possible
build:
  tags:
    - saas-linux-small-amd64  # 1x cost factor
  script:
    - npm run build

Savings: 50-80% by avoiding oversized/expensive runners

Strategy 7: Self-Hosted Runners

Problem

All jobs on GitLab-hosted runners consume compute minute quota; additional minutes cost $10 per 1,000.

Solution: Move to Self-Hosted Runners

Cost Analysis:

Scenario	Monthly Pipelines	Minutes/Pipeline	Total Minutes	SaaS Cost	Self-Hosted Cost
Small	100	20	2,000	$20	$0 (uses existing infra)
Medium	500	30	15,000	$150	$0
Large	2,000	25	50,000	$500	$0

Self-Hosted Infrastructure Cost:

Small VM: $20-50/month (unlimited builds)
Medium VM: $100-200/month (highly parallel)
Large VM: $300-500/month (enterprise scale)

Break-Even: Typically 2,000-5,000 minutes/month
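The break-even range follows from dividing the monthly VM cost by the price per hosted minute. A minimal sketch, assuming the $10 per 1,000 minutes price quoted above and ignoring maintenance time; the function name `break_even_minutes` is hypothetical:

```shell
# break_even_minutes VM_COST_USD [USD_PER_1000_MINUTES]
# Monthly minute volume above which a VM costing VM_COST_USD per month is
# cheaper than buying hosted minutes.
break_even_minutes() {
  vm_cost=$1
  price_per_1000=${2:-10}
  echo $(( vm_cost * 1000 / price_per_1000 ))
}

echo "\$20/month VM: $(break_even_minutes 20) min"
echo "\$50/month VM: $(break_even_minutes 50) min"
```

The two results correspond to the low and high end of the small VM price range listed above.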

Implementation

1. Install GitLab Runner:

# On your server
curl -L https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.deb.sh | sudo bash
sudo apt-get install gitlab-runner

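# Register runner
# Note: registration tokens are deprecated. On current GitLab versions, create
# the runner in the UI first and pass its authentication token with --token;
# description and tags are then set in the UI, not on the command line.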
sudo gitlab-runner register \
  --url https://gitlab.com/ \
  --registration-token $REGISTRATION_TOKEN \
  --executor docker \
  --description "self-hosted-runner" \
  --tag-list "self-hosted,docker,linux"

2. Update .gitlab-ci.yml:

# Use self-hosted for heavy jobs
build:
  tags:
    - self-hosted
    - docker
  script:
    - npm run build

# Use SaaS runners for light jobs (better queue time)
lint:
  tags:
    - saas-linux-small-amd64
  script:
    - npm run lint

When to Use Self-Hosted:

High-volume projects (>5,000 min/month)
Long-running jobs (>30 min)
Security-sensitive (on-prem data)
GPU/specialized hardware needed

When to Use SaaS Runners:

Low-volume projects (<2,000 min/month)
Quick jobs (<5 min)
No maintenance burden
Guaranteed availability

Savings: Up to 100% of compute minutes for jobs moved to self-hosted runners (infrastructure costs still apply)

Strategy 8: Optimize Job Duration

Problem

Longer jobs consume more minutes. Every second counts.

Solution A: Remove Unnecessary Work

Before:

test:
  script:
    - apt-get update && apt-get install -y curl git
    - npm install
    - npm run lint
    - npm run type-check
    - npm test
    - npm run build
    - npm run e2e

After:

# Split into separate jobs (dependency installation and caching omitted for brevity)
lint:
  script:
    - npm run lint  # 30 seconds

test:unit:
  script:
    - npm test  # 2 minutes

test:e2e:
  script:
    - npm run e2e  # 10 minutes

Solution B: Use Smaller Images

Before: node:18 (400 MB)
After: node:18-alpine (120 MB)

Impact: 2-3 minutes saved on image pull per job

Solution C: Remove Debug Output

# Verbose logging slows down jobs
script:
  - npm test --verbose  # Slow

# Reduce output
script:
  - npm test --silent  # Fast

Savings: 5-15% per job optimized

Strategy 9: Timeout and Retry Configuration

Problem

Jobs that hang consume minutes until the project timeout is reached (1 hour by default)
Jobs that fail transiently are retried, and every attempt consumes minutes

Solution A: Set Appropriate Timeouts

# Set aggressive timeouts
lint:
  timeout: 5m  # Should complete in seconds
  script:
    - npm run lint

test:
  timeout: 15m  # Should complete in 10 minutes
  script:
    - npm test

deploy:
  timeout: 30m  # Can take longer
  script:
    - npm run deploy

Solution B: Limit Retries

# Don't retry jobs that always fail
test:
  retry: 0  # No retries
  script:
    - npm test

# Retry transient failures (network, etc)
deploy:
  retry:
    max: 2
    when:
      - runner_system_failure
      - stuck_or_timeout_failure
  script:
    - npm run deploy

Savings: 10-20% by catching hung/failing jobs faster
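To see why tight timeouts and limited retries matter, consider the upper bound on what a single hanging job can consume. A minimal sketch, assuming every attempt runs until the timeout; the function name `worst_case_minutes` is hypothetical:

```shell
# worst_case_minutes TIMEOUT_MIN RETRIES [COST_FACTOR]
# Upper bound on the minutes a hanging job can consume: every attempt
# (the first run plus each retry) runs until the timeout.
worst_case_minutes() {
  timeout_min=$1
  retries=$2
  factor=${3:-1}
  echo $(( timeout_min * (retries + 1) * factor ))
}

echo "60m timeout, 2 retries: $(worst_case_minutes 60 2) min"
echo "5m timeout, no retries: $(worst_case_minutes 5 0) min"
```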

Strategy 10: Component Reuse

Problem

Duplicating pipeline logic across 70+ projects increases maintenance and waste.

Solution: GitLab CI/CD Components

Create Shared Component:

# In gitlab-components project
# templates/node-test.yml
spec:
  inputs:
    stage:
      default: test
    node-version:
      default: "18"
---
test:
  stage: $[[ inputs.stage ]]
  image: node:$[[ inputs.node-version ]]
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - .npm/  # npm ci deletes node_modules, so cache npm's download cache
  script:
    - npm ci --cache .npm --prefer-offline
    - npm test

Use in Projects:

# In agent-mesh/.gitlab-ci.yml
include:
  - component: $CI_SERVER_HOST/blueflyio/gitlab-components/node-test@v1.0.0
    inputs:
      node-version: "20"

Benefits:

Optimize once, benefit everywhere
Enforce best practices
Reduce duplicate logic

Strategy 11: Disable Pipelines for Specific Projects

Problem

Some projects don't need CI/CD (docs-only, archived, etc).

Solution: Disable Shared Runners

Navigate to: Project Settings > CI/CD > Runners

Uncheck: "Enable shared runners for this project"

Alternative - .gitlab-ci.yml:

# In archived/docs projects
workflow:
  rules:
    - when: never  # Never run pipelines

Savings: 100% for disabled projects
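With many archived projects, clicking through the settings page for each one is tedious. The sketch below changes the same setting through the projects REST API using the `shared_runners_enabled` attribute. The function name, the `DRY_RUN` switch, and the project IDs are hypothetical; `GITLAB_URL` and `GITLAB_TOKEN` are assumed to be set by the caller.

```shell
# disable_shared_runners PROJECT_ID
# Turns off shared runners for one project through the projects API.
# With DRY_RUN=1 the request is printed instead of sent.
disable_shared_runners() {
  project_id=$1
  url="${GITLAB_URL:-https://gitlab.com}/api/v4/projects/${project_id}"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "PUT ${url} shared_runners_enabled=false"
    return 0
  fi
  curl --silent --fail --request PUT \
    --header "PRIVATE-TOKEN: ${GITLAB_TOKEN}" \
    --data "shared_runners_enabled=false" \
    "${url}"
}

# Preview the calls for a list of archived project IDs (placeholder IDs)
DRY_RUN=1
for id in 101 102; do
  disable_shared_runners "$id"
done
```

Set DRY_RUN=0 only after reviewing the previewed requests.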

Combined Strategy Example

Complete Optimized .gitlab-ci.yml:

# Workflow optimization
workflow:
  auto_cancel:
    on_new_commit: interruptible
    on_job_failure: all
  rules:
    # Skip draft MRs
    - if: $CI_MERGE_REQUEST_TITLE =~ /^Draft:/
      when: never
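    # Run MR pipelines only when code paths change. rules:changes matches if
    # ANY changed file matches, so a docs pattern with when: never would also
    # skip MRs that mix docs and code changes.
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      changes:
        paths:
          - "src/**/*"
          - "package*.json"
          - ".gitlab-ci.yml"
        compare_to: $CI_MERGE_REQUEST_TARGET_BRANCH_NAME
    # Skip MR pipelines that touch none of the paths above
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      when: never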
    # Run for MRs and main
    - if: $CI_MERGE_REQUEST_IID
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH

# Default settings
default:
  interruptible: true
  image: node:20-alpine
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - .npm/  # npm ci deletes node_modules, so cache npm's download cache
    policy: pull

stages:
  - validate
  - test
  - build
  - deploy

# Fast validation
lint:
  stage: validate
  timeout: 5m
  tags:
    - saas-linux-small-amd64
  # A job-level cache replaces the default entirely, so repeat key and paths
  cache:
    key:
      files:
        - package-lock.json
    paths:
      - .npm/
    policy: pull-push  # Only this job updates the cache
  script:
    - npm ci --cache .npm --prefer-offline
    - npm run lint

# Parallel tests
test:unit:
  stage: test
  timeout: 10m
  tags:
    - saas-linux-small-amd64
  needs: [lint]
  rules:
    - changes:
        - "src/**/*"
        - "package*.json"
  script:
    - npm ci --cache .npm --prefer-offline
    - npm run test:unit

test:integration:
  stage: test
  timeout: 15m
  tags:
    - self-hosted  # Long-running = self-hosted
  needs: [lint]
  rules:
    - changes:
        - "src/**/*"
        - "package*.json"
  script:
    - npm ci --cache .npm --prefer-offline
    - npm run test:integration

# Build
build:
  stage: build
  timeout: 10m
  tags:
    - saas-linux-medium-amd64
  # optional: true keeps the pipeline valid when rules:changes skips the tests
  needs:
    - job: test:unit
      optional: true
    - job: test:integration
      optional: true
  script:
    - npm ci --cache .npm --prefer-offline
    - npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 1 day

# Deploy (not interruptible)
deploy:
  stage: deploy
  interruptible: false
  timeout: 20m
  tags:
    - saas-linux-small-amd64
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
  needs: [build]
  script:
    - npm run deploy

Measuring Success

Track these metrics before/after optimization:

Metric	Before	After	Improvement
Monthly compute minutes	48,000	28,000	42%
Average pipeline duration	25 min	15 min	40%
Failed job minute waste	8,000	2,000	75%
Top project usage	12,000	6,000	50%
Cost per deploy	50 min	20 min	60%
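The Improvement column is the relative reduction from Before to After. A minimal sketch that reproduces it with integer arithmetic, rounding to the nearest percent; the function name `improvement_percent` is hypothetical:

```shell
# improvement_percent BEFORE AFTER
# Percentage reduction from BEFORE to AFTER, rounded to the nearest integer.
improvement_percent() {
  before=$1
  after=$2
  echo $(( ( (before - after) * 100 + before / 2 ) / before ))
}

echo "compute minutes:   $(improvement_percent 48000 28000)%"
echo "pipeline duration: $(improvement_percent 25 15)%"
```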

Next Steps

Caching Deep Dive - Master dependency caching
Pipeline Optimization - Advanced pipeline patterns
Pre-Push Validation - Catch errors before CI