GitLab CI/CD Cost Optimization Checklist

Quick Reference Guide

Use this checklist for daily development, code reviews, and periodic audits.


Daily Developer Checklist

Before Committing

  • Run lint locally - npm run lint (catch errors before CI)
  • Run unit tests locally - npm test (verify changes work)
  • Validate .gitlab-ci.yml - If changed, validate syntax locally
  • Check pre-commit hooks - Ensure they pass before committing
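
The last two items can be combined into a local hook that lints the pipeline file before each commit. The sketch below assumes the pre-commit framework and the glab CLI (both listed under Tools) are installed, and that glab is authenticated against your GitLab instance. The hook ids and names are illustrative.

```yaml
# .pre-commit-config.yaml (sketch; hook ids and names are illustrative)
repos:
  - repo: local
    hooks:
      - id: gitlab-ci-lint
        name: Validate .gitlab-ci.yml
        entry: glab ci lint          # assumes glab is installed and authenticated
        language: system
        files: ^\.gitlab-ci\.yml$    # run only when the pipeline file is staged
        pass_filenames: false
      - id: npm-lint
        name: Run npm lint
        entry: npm run lint
        language: system
        pass_filenames: false
```

Running `pre-commit install` once per clone registers the hooks. Teams that use husky instead can call the same two commands from a husky hook script.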

Before Pushing

  • Review changes - Make sure only relevant files included
  • Squash WIP commits - Push one consolidated commit rather than many small pushes, each of which starts a pipeline
  • Check draft status - Use Draft MRs for work-in-progress
  • Test locally first - Use gitlab-ci-local for pipeline changes

During MR Development

  • Mark as Draft - Until ready for review (saves pipeline runs)
  • Skip CI for docs-only changes - Add [ci skip] or [skip ci] to the commit message if appropriate
  • Review pipeline results - Don't push repeatedly without checking
  • Cancel obsolete pipelines - Manually cancel if you push again immediately

Code Review Checklist

Pipeline Configuration Review

  • Cache configuration present?

    • File-based cache keys (files: [package-lock.json])
    • Appropriate cache policy (pull for most jobs)
    • Fallback keys configured
    • Cache paths are minimal and necessary
  • Jobs marked as interruptible?

    • All non-deploy jobs: interruptible: true
    • Deploy/critical jobs: interruptible: false
  • Rules properly configured?

    • Skip jobs when files unchanged (rules:changes)
    • No duplicate pipelines (MR + branch)
    • Draft MRs skipped or limited
  • Timeouts set appropriately?

    • Fast jobs (<5m): timeout: 5m
    • Standard jobs (<15m): timeout: 15m
    • Appropriate for job type
  • Resource selection optimal?

    • Using smallest runner that works
    • Avoiding expensive runners (macOS, Windows) unless necessary
    • Self-hosted runners for heavy/long jobs
  • Fail fast patterns implemented?

    • Critical validations run first
    • Job dependencies with needs:
    • Auto-cancel on failure configured
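
As a reference point during review, the sketch below shows two jobs that satisfy most of the items above. Job names, stage names, and paths are hypothetical. So is the fallback key `deps-default`, which only helps if some other job actually pushes a cache under that name.

```yaml
stages:
  - validate
  - test

lint:
  stage: validate              # critical validation runs first
  interruptible: true
  timeout: 5m
  tags:
    - saas-linux-small-amd64   # smallest runner that works
  script:
    - npm run lint

unit-test:
  stage: test
  needs: ["lint"]              # start as soon as lint passes
  interruptible: true
  timeout: 15m
  cache:
    key:
      files:
        - package-lock.json    # key changes only when the lock file changes
    fallback_keys:
      - deps-default           # hypothetical key, tried when the primary key misses
    paths:
      - node_modules/
    policy: pull               # read-only job: never uploads the cache
  rules:
    - changes:                 # skip the job when none of these files changed
        - "src/**/*"
        - package-lock.json
  script:
    - npm test
```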

Code Changes Review

  • Dependencies optimized?

    • No unnecessary dependencies added
    • Lock files updated correctly
    • Development dependencies in devDependencies
  • Tests reasonable?

    • Not overly long-running
    • Can be parallelized if needed
    • Appropriate for pipeline vs local
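
If a suite is too slow for a single job, GitLab's `parallel` keyword can split it across several jobs. The sketch below assumes Jest 28 or later, whose `--shard` flag accepts the `index/total` form. The job name and the shard count of 4 are illustrative.

```yaml
unit-test:
  parallel: 4                  # creates jobs "unit-test 1/4" through "unit-test 4/4"
  interruptible: true
  timeout: 15m
  script:
    # CI_NODE_INDEX (1-based) and CI_NODE_TOTAL are set by GitLab for parallel jobs
    - npx jest --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL
```

Sharding shortens wall-clock time but not total compute minutes, because every shard pays its own setup cost. It is worth checking that the faster feedback justifies the extra minutes.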

New Project Setup Checklist

Initial Configuration

  • Start from template or component

    • Use proven .gitlab-ci.yml configuration
    • Don't reinvent the wheel
  • Configure caching

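    cache:
      key:
        files:
          - package-lock.json  # Or poetry.lock, Gemfile.lock, etc.
      paths:
        - node_modules/        # Or .venv/, vendor/, etc.
      # Read-only default. The job that installs dependencies must override
      # this with policy: pull-push, or the cache is never populated.
      # A per-job key prefix ($CI_JOB_NAME) is omitted so all jobs share one cache.
      policy: pull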
  • Set default interruptible

    default:
      interruptible: true
  • Configure workflow rules

    workflow:
      auto_cancel:
        on_new_commit: interruptible
        on_job_failure: all
      rules:
        - if: $CI_MERGE_REQUEST_TITLE =~ /^Draft:/
          when: never
        - if: $CI_PIPELINE_SOURCE == "merge_request_event"
        - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
  • Set job timeouts

    default:
      timeout: 15m  # Adjust based on project
  • Add pre-commit hooks

    • Install husky + lint-staged
    • Configure GitLab CI validation
    • Add to README setup instructions
  • Tag runners appropriately

    • Use saas-linux-small-amd64 for light jobs
    • Use self-hosted for heavy/long jobs
  • Enable merge request pipelines only

    • Avoid duplicate branch pipelines
  • Configure appropriate branch protection

    • Require pipeline success before merge
    • Require approvals for .gitlab-ci.yml changes

Documentation

  • Add CI/CD section to README

    • Explain pipeline stages
    • Document required environment variables
    • Link to GitLab pipeline page
  • Document local testing

    • How to run tests locally
    • How to use gitlab-ci-local
    • Pre-commit hook setup
  • Add cost considerations

    • Encourage local testing
    • Explain Draft MR usage
    • Link to cost optimization docs

Weekly Project Maintenance Checklist

Usage Review

  • Check project CI minute usage

    • Navigate to: Settings > Usage Quotas
    • Compare to previous week
    • Identify any spikes
  • Review pipeline duration trends

    • CI/CD > Pipelines > Analytics
    • Look for increasing durations
    • Investigate slowdowns
  • Check failure rate

    • CI/CD > Pipelines
    • Filter by status: failed
    • Identify repeat failures
  • Review cache effectiveness

    • Look for jobs re-downloading dependencies
    • Check cache key stability
    • Verify cache is being uploaded/downloaded

Optimization Opportunities

  • Identify long-running jobs

    • Jobs >15 minutes
    • Can they be optimized?
    • Should they move to self-hosted?
  • Check for unnecessary job executions

    • Jobs running on every commit
    • Jobs running when files unchanged
    • Jobs in draft MRs
  • Review retry configurations

    • Jobs with high retry counts
    • Flaky tests that need fixing
    • Unnecessary retries on deterministic failures
  • Audit scheduled pipelines

    • Frequency appropriate?
    • Can they run less often?
    • Are results actually used?
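
For the retry and scheduled-pipeline items, the sketch below shows one way to keep both in check. The job names and the `test:integration` script are hypothetical.

```yaml
integration-test:
  script:
    - npm run test:integration     # hypothetical script name
  retry:
    max: 1
    when:
      - runner_system_failure      # retry infrastructure problems only
      - stuck_or_timeout_failure
      # script_failure is left out on purpose: a deterministic failure
      # fails again on retry and doubles the cost

nightly-audit:
  script:
    - npm audit
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"   # run only in scheduled pipelines
```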

Monthly Organization-Wide Checklist

Usage Analysis

  • Review namespace usage

    • Navigate to: Group > Usage Quotas > Pipelines
    • Check percentage of quota used
    • Compare to previous months
  • Identify top 10 projects by usage

    • Sort by CI minute consumption
    • Review each for optimization opportunities
    • Reach out to project maintainers
  • Calculate cost trends

    • Total minutes this month vs last month
    • Cost per project
    • Cost per team (if tracked)
  • Review quota allocation

    • Is current quota sufficient?
    • Need to purchase additional minutes?
    • Consider self-hosted runners?

Policy Review

  • Update cost optimization guidelines

    • Document new learnings
    • Share success stories
    • Update examples
  • Review CI/CD components

    • Are shared components up to date?
    • Are projects using them?
    • Opportunity to consolidate?
  • Audit runner usage

    • Self-hosted runner utilization
    • Are expensive runners (macOS, Windows) necessary?
    • Opportunity to add more self-hosted?
  • Check for stale projects

    • Projects with pipelines but no activity
    • Consider archiving or disabling CI
    • Clean up old branches

Team Education

  • Share cost metrics with teams

    • Top projects by usage
    • Overall trends
    • Optimization successes
  • Conduct cost optimization training

    • New team members onboarding
    • Quarterly refreshers
    • Share best practices
  • Recognize optimization efforts

    • Teams that reduced usage
    • Individuals who improved pipelines
    • Share lessons learned

Quarterly Optimization Sprint Checklist

Deep Dive Analysis

  • Profile top 5 projects

    • Run detailed pipeline analysis
    • Identify specific bottlenecks
    • Create optimization plans
  • Benchmark against industry standards

    • Pipeline duration benchmarks
    • Cache hit rate targets
    • Failure rate expectations
  • Review architectural patterns

    • Monorepo vs multi-repo efficiency
    • Microservices pipeline strategies
    • Component reuse opportunities
  • Analyze cost by category

    • Build jobs vs test jobs vs deploy jobs
    • SaaS runners vs self-hosted runners
    • Per-language/framework patterns

Implementation

  • Execute optimization plans

    • Prioritize by impact
    • Test changes in isolation
    • Measure before/after
  • Update shared components

    • Incorporate new best practices
    • Version and release updates
    • Document changes
  • Rollout improvements across projects

    • Create migration guides
    • Assist teams with updates
    • Track adoption
  • Infrastructure improvements

    • Add/upgrade self-hosted runners
    • Optimize cache storage
    • Review network performance

Documentation

  • Update internal wiki

    • New optimization techniques
    • Updated examples
    • Lessons learned
  • Create case studies

    • Document successful optimizations
    • Show cost savings
    • Share methodology
  • Review and update checklists

    • This checklist
    • Project setup guides
    • Code review guidelines

Incident Response Checklist

High Usage Alert (>90% Quota)

  • Immediate assessment

    • Check top projects by usage TODAY
    • Identify any runaway pipelines
    • Cancel unnecessary running pipelines
  • Quick wins

    • Disable scheduled pipelines temporarily
    • Switch high-volume projects to self-hosted runners
    • Temporarily disable non-critical projects
  • Communication

    • Alert teams about usage situation
    • Request voluntary reduction measures
    • Share guidelines for immediate optimization
  • Purchase decision

    • Calculate additional minutes needed
    • Approve purchase if necessary
    • Plan for next month to avoid recurrence

Quota Exhausted (100%)

  • Immediate actions

    • Purchase additional minutes immediately
    • Communicate outage to all teams
    • Identify critical deployments blocked
  • Emergency measures

    • Enable only critical project pipelines
    • Temporarily disable all scheduled pipelines
    • Use self-hosted runners for urgent work
  • Root cause analysis

    • Identify what caused quota exhaustion
    • Review usage spike timeline
    • Document lessons learned
  • Prevention plan

    • Implement stricter monitoring
    • Set up earlier warning alerts (70%, 80%)
    • Create emergency response procedure

Pipeline Optimization Patterns

DO

  • Use file-based cache keys

    cache:
      key:
        files:
          - package-lock.json
  • Mark jobs interruptible

    test:
      interruptible: true
  • Skip jobs when files unchanged

    test:
      rules:
        - changes:
            - "src/**/*"
  • Set aggressive timeouts

    lint:
      timeout: 5m
  • Use smallest appropriate runner

    lint:
      tags:
        - saas-linux-small-amd64
  • Enable auto-cancel

    workflow:
      auto_cancel:
        on_new_commit: interruptible
  • Use pull-only cache for read-only jobs

    test:
      cache:
        policy: pull
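
A pull-only cache needs a producer. The sketch below, with hypothetical job names, pairs one job that populates the cache with a read-only consumer. Both take the key definition from the same YAML anchor so that their cache keys match.

```yaml
# Shared cache definition (anchor name is illustrative)
.npm-cache: &npm-cache
  key:
    files:
      - package-lock.json
  paths:
    - node_modules/

install:
  stage: .pre                        # built-in stage that runs before all others
  script:
    # npm ci wipes node_modules, so install only when the cache missed
    - "[ -d node_modules ] || npm ci"
  cache:
    <<: *npm-cache
    policy: pull-push                # producer: downloads, then uploads

test:
  stage: test
  script:
    - npm test
  cache:
    <<: *npm-cache
    policy: pull                     # consumer: download only
```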

DON'T

  • Don't use static cache keys

    # BAD
    cache:
      key: "my-cache"
  • Don't make deploy jobs interruptible

    # BAD
    deploy:
      interruptible: true
  • Don't run all jobs for all changes

    # BAD - No rules to skip unnecessary jobs
    test:
      script: npm test
  • Don't use default 1-hour timeout

    # BAD - No timeout specified for 2-minute job
  • Don't use large runners for small jobs

    # BAD
    lint:
      tags:
        - saas-linux-large-amd64
  • Don't run duplicate pipelines

    # BAD - Both branch and MR pipelines
    workflow:
      rules:
        - when: always
  • Don't upload cache unnecessarily

    # BAD - Read-only job with pull-push
    test:
      cache:
        policy: pull-push

Cost Optimization Maturity Levels

Level 0: Unoptimized (Baseline)

  • No caching configured
  • No job timeouts
  • All jobs run for all changes
  • No pipeline rules
  • Typical cost: Baseline

Level 1: Basic Optimization

  • Caching configured with static keys
  • Jobs marked as interruptible
  • Workflow rules to prevent duplicates
  • Expected savings: 20-30%

Level 2: Intermediate Optimization

  • File-based cache keys
  • Cache fallback chains
  • Jobs skip when files unchanged
  • Timeouts configured
  • Auto-cancel redundant pipelines
  • Expected savings: 40-50%

Level 3: Advanced Optimization

  • Docker layer caching
  • Optimized cache policies (pull/push)
  • Self-hosted runners for heavy jobs
  • Pre-push validation
  • Component reuse
  • Monitoring and alerting
  • Expected savings: 60-70%

Level 4: Expert Optimization

  • All Level 3 optimizations
  • Custom runner infrastructure
  • Advanced parallelization strategies
  • Continuous optimization process
  • Automated cost analysis
  • Team-wide best practices adoption
  • Expected savings: 70-85%
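
Docker layer caching (Level 3) is the least self-explanatory item above. The minimal sketch below assumes Docker-in-Docker on a runner that allows privileged mode, and the project's own GitLab container registry. The job name and image tags are illustrative.

```yaml
build-image:
  image: docker:24
  services:
    - docker:24-dind
  variables:
    DOCKER_TLS_CERTDIR: "/certs"
  script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    # Reuse layers from the last published image; tolerate the miss on the first run
    - docker pull "$CI_REGISTRY_IMAGE:latest" || true
    - >
      docker build
      --cache-from "$CI_REGISTRY_IMAGE:latest"
      --build-arg BUILDKIT_INLINE_CACHE=1
      -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
      -t "$CI_REGISTRY_IMAGE:latest"
      .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
    - docker push "$CI_REGISTRY_IMAGE:latest"
```

The `BUILDKIT_INLINE_CACHE=1` build argument embeds cache metadata in the pushed image, which BuildKit needs before it can reuse that image's layers through `--cache-from`.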

Quick Wins (30 Minutes)

Immediate actions for high impact:

  1. Enable auto-cancel

    workflow:
      auto_cancel:
        on_new_commit: interruptible

    Impact: 20-40% savings

  2. Add file-based cache key

    cache:
      key:
        files:
          - package-lock.json
      paths:
        - node_modules/

    Impact: 30-50% faster jobs

  3. Mark jobs interruptible

    default:
      interruptible: true

    Impact: Prerequisite for auto-cancel, which only cancels jobs marked interruptible

  4. Skip draft MR pipelines

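    workflow:
      rules:
        - if: $CI_MERGE_REQUEST_TITLE =~ /^Draft:/
          when: never
        # Without at least one matching rule below, no pipeline would ever start
        - if: $CI_PIPELINE_SOURCE == "merge_request_event"
        - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH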

    Impact: 10-20% savings

  5. Set job timeouts

    default:
      timeout: 15m

    Impact: Catch hung jobs faster


Resources

Documentation

External Resources

Tools

  • glab - GitLab CLI tool
  • gitlab-ci-local - Test pipelines locally
  • mlr (Miller) - Analyze usage data
  • husky - Git hooks
  • pre-commit - Pre-commit framework

Support

Need help?

  • Read the detailed guides above
  • Ask in #ci-cd Slack channel
  • Create issue in gitlab-components project
  • Email: devops@example.com

Found an optimization?

  • Share in team meeting
  • Update this documentation
  • Create component for reuse