GitLab CI/CD Cost Optimization Checklist
Quick Reference Guide
Use this checklist for daily development, code reviews, and periodic audits.
Daily Developer Checklist
Before Committing
- Run lint locally - npm run lint (catch errors before CI)
- Run unit tests locally - npm test (verify changes work)
- Validate .gitlab-ci.yml - If changed, validate syntax locally
- Check pre-commit hooks - Ensure they pass before committing
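The checks above can be automated with local hooks. The following is a minimal sketch, assuming the pre-commit framework and glab are installed and that the project defines npm run lint and npm test. The hook ids and names are illustrative, not part of any existing setup.

```yaml
# .pre-commit-config.yaml (sketch; hook ids and names are illustrative)
repos:
  - repo: local
    hooks:
      - id: npm-lint
        name: Lint (npm run lint)
        entry: npm run lint
        language: system
        pass_filenames: false      # run on the whole project, not per file
      - id: npm-test
        name: Unit tests (npm test)
        entry: npm test
        language: system
        pass_filenames: false
      - id: gitlab-ci-lint
        name: Validate .gitlab-ci.yml
        entry: glab ci lint        # assumes glab is authenticated against your GitLab instance
        language: system
        files: ^\.gitlab-ci\.yml$  # only runs when the pipeline file is staged
        pass_filenames: false
```

Running the unit tests on every commit may be too slow for larger suites; in that case the npm-test hook could be moved to a pre-push stage instead.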
Before Pushing
- Review changes - Make sure only relevant files are included
- Squash WIP commits - Reduce pipeline runs from force pushes
- Check draft status - Use Draft MRs for work-in-progress
- Test locally first - Use gitlab-ci-local for pipeline changes
During MR Development
- Mark as Draft - Until ready for review (saves pipeline runs)
- Skip CI for docs-only changes - Add [ci skip] if appropriate
- Review pipeline results - Don't push repeatedly without checking
- Cancel obsolete pipelines - Manually cancel if you push again immediately
Code Review Checklist
Pipeline Configuration Review
- Cache configuration present?
  - File-based cache keys (files: [package-lock.json])
  - Appropriate cache policy (pull for most jobs)
  - Fallback keys configured
  - Cache paths are minimal and necessary
- Jobs marked as interruptible?
  - All non-deploy jobs: interruptible: true
  - Deploy/critical jobs: interruptible: false
- Rules properly configured?
  - Skip jobs when files unchanged (rules:changes)
  - No duplicate pipelines (MR + branch)
  - Draft MRs skipped or limited
- Timeouts set appropriately?
  - Fast jobs (<5m): timeout: 5m
  - Standard jobs (<15m): timeout: 15m
  - Appropriate for job type
- Resource selection optimal?
  - Using smallest runner that works
  - Avoiding expensive runners (macOS, Windows) unless necessary
  - Self-hosted runners for heavy/long jobs
- Fail fast patterns implemented?
  - Critical validations run first
  - Job dependencies with needs:
  - Auto-cancel on failure configured
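As a reference point for reviewers, here is a sketch of a small pipeline that would satisfy several of the items above: a shared file-based cache that only one job uploads, interruptible non-deploy jobs, per-job timeouts, and needs: chains so later jobs never start after an early failure. Job names, stage names, and scripts are illustrative assumptions for a Node.js project.

```yaml
# .gitlab-ci.yml (sketch; job names, stages, and scripts are illustrative)
stages: [install, validate, test, deploy]

default:
  interruptible: true          # safe to cancel when a newer commit arrives
  timeout: 15m
  cache: &deps_cache
    key:
      files:
        - package-lock.json    # key changes only when dependencies change
    paths:
      - node_modules/
    policy: pull               # read-only for most jobs

install:
  stage: install
  script: npm ci
  cache:
    <<: *deps_cache
    policy: pull-push          # the only job that uploads the cache

lint:
  stage: validate
  needs: [install]
  timeout: 5m                  # fast job, so a hung run is stopped early
  script: npm run lint

unit-tests:
  stage: test
  needs: [lint]                # never starts if lint fails
  script: npm test

deploy:
  stage: deploy
  needs: [unit-tests]
  interruptible: false         # never cancel a deployment midway
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
  script: echo "deploy step goes here"   # placeholder
```

Because every job declares needs:, a lint failure stops the chain before the more expensive test and deploy jobs consume any minutes.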
Code Changes Review
- Dependencies optimized?
  - No unnecessary dependencies added
  - Lock files updated correctly
  - Development dependencies in devDependencies
- Tests reasonable?
  - Not overly long-running
  - Can be parallelized if needed
  - Appropriate for pipeline vs local
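Where a test suite is flagged as parallelizable, GitLab's parallel keyword is one option. The sketch below assumes the project uses Jest 28 or later, which supports the --shard flag; the job name is illustrative.

```yaml
unit-tests:
  parallel: 4                  # GitLab starts four copies of this job
  script:
    # CI_NODE_INDEX (1-based) and CI_NODE_TOTAL are predefined for parallel jobs.
    - npx jest --shard="$CI_NODE_INDEX/$CI_NODE_TOTAL"
```

Parallel jobs shorten wall-clock time, but each copy is billed separately and repeats its own setup, so total minutes usually stay the same or rise slightly. Treat this as a feedback-speed optimization rather than a cost saving.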
New Project Setup Checklist
Initial Configuration
- Start from template or component
  - Use proven .gitlab-ci.yml configuration
  - Don't reinvent the wheel
- Configure caching

      cache:
        key:
          files:
            - package-lock.json  # Or poetry.lock, Gemfile.lock, etc.
          prefix: $CI_JOB_NAME   # Separate cache per job; drop the prefix to share one cache
        paths:
          - node_modules/        # Or .venv/, vendor/, etc.
        policy: pull             # Read-only default; the job that installs dependencies needs pull-push

- Set default interruptible

      default:
        interruptible: true

- Configure workflow rules

      workflow:
        auto_cancel:
          on_new_commit: interruptible
          on_job_failure: all
        rules:
          - if: $CI_MERGE_REQUEST_TITLE =~ /^Draft:/
            when: never
          - if: $CI_PIPELINE_SOURCE == "merge_request_event"
          - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH

- Set job timeouts

      default:
        timeout: 15m  # Adjust based on project

- Add pre-commit hooks
  - Install husky + lint-staged
  - Configure GitLab CI validation
  - Add to README setup instructions
- Tag runners appropriately
  - Use saas-linux-small-amd64 for light jobs
  - Use self-hosted for heavy/long jobs
- Enable merge request pipelines only
  - Avoid duplicate branch pipelines
- Configure appropriate branch protection
  - Require pipeline success before merge
  - Require approvals for .gitlab-ci.yml changes
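For the "start from template or component" item, a new project's pipeline can be as short as a single include. This is a sketch only: the component path, version, and input name are hypothetical and depend on what your organization actually publishes.

```yaml
# .gitlab-ci.yml (sketch; component path and inputs are hypothetical)
include:
  - component: $CI_SERVER_FQDN/my-group/ci-components/node-pipeline@1.0.0
    inputs:
      stage: test              # hypothetical input defined by the component's spec
```

Pinning a version (here @1.0.0) keeps the project stable while the shared component evolves, at the cost of needing an explicit upgrade to pick up new optimizations.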
Documentation
- Add CI/CD section to README
  - Explain pipeline stages
  - Document required environment variables
  - Link to GitLab pipeline page
- Document local testing
  - How to run tests locally
  - How to use gitlab-ci-local
  - Pre-commit hook setup
- Add cost considerations
  - Encourage local testing
  - Explain Draft MR usage
  - Link to cost optimization docs
Weekly Project Maintenance Checklist
Usage Review
- Check project CI minute usage
  - Navigate to: Settings > Usage Quotas
  - Compare to previous week
  - Identify any spikes
- Review pipeline duration trends
  - CI/CD > Pipelines > Analytics
  - Look for increasing durations
  - Investigate slowdowns
- Check failure rate
  - CI/CD > Pipelines
  - Filter by status: failed
  - Identify repeat failures
- Review cache effectiveness
  - Look for jobs re-downloading dependencies
  - Check cache key stability
  - Verify cache is being uploaded/downloaded
Optimization Opportunities
- Identify long-running jobs
  - Jobs >15 minutes
  - Can they be optimized?
  - Should they move to self-hosted?
- Check for unnecessary job executions
  - Jobs running on every commit
  - Jobs running when files unchanged
  - Jobs in draft MRs
- Review retry configurations
  - Jobs with high retry counts
  - Flaky tests that need fixing
  - Unnecessary retries on deterministic failures
- Audit scheduled pipelines
  - Frequency appropriate?
  - Can they run less often?
  - Are results actually used?
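For the retry review, one pattern worth checking for is a retry that is limited to infrastructure failures, so that a deterministic test failure is not re-run at full cost. A minimal sketch, with an illustrative job and script name:

```yaml
integration-tests:
  script: npm run test:integration   # hypothetical script name
  retry:
    max: 1                           # at most one extra attempt
    when:                            # retry only failures unrelated to the code under test
      - runner_system_failure
      - api_failure
      - scheduler_failure
```

A bare retry: 2 retries script failures as well, which can triple the minutes spent on a genuinely broken commit.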
Monthly Organization-Wide Checklist
Usage Analysis
- Review namespace usage
  - Navigate to: Group > Usage Quotas > Pipelines
  - Check percentage of quota used
  - Compare to previous months
- Identify top 10 projects by usage
  - Sort by CI minute consumption
  - Review each for optimization opportunities
  - Reach out to project maintainers
- Calculate cost trends
  - Total minutes this month vs last month
  - Cost per project
  - Cost per team (if tracked)
- Review quota allocation
  - Is current quota sufficient?
  - Need to purchase additional minutes?
  - Consider self-hosted runners?
Policy Review
- Update cost optimization guidelines
  - Document new learnings
  - Share success stories
  - Update examples
- Review CI/CD components
  - Are shared components up to date?
  - Are projects using them?
  - Opportunity to consolidate?
- Audit runner usage
  - Self-hosted runner utilization
  - Are expensive runners (macOS, Windows) necessary?
  - Opportunity to add more self-hosted?
- Check for stale projects
  - Projects with pipelines but no activity
  - Consider archiving or disabling CI
  - Clean up old branches
Team Education
- Share cost metrics with teams
  - Top projects by usage
  - Overall trends
  - Optimization successes
- Conduct cost optimization training
  - New team members onboarding
  - Quarterly refreshers
  - Share best practices
- Recognize optimization efforts
  - Teams that reduced usage
  - Individuals who improved pipelines
  - Share lessons learned
Quarterly Optimization Sprint Checklist
Deep Dive Analysis
- Profile top 5 projects
  - Run detailed pipeline analysis
  - Identify specific bottlenecks
  - Create optimization plans
- Benchmark against industry standards
  - Pipeline duration benchmarks
  - Cache hit rate targets
  - Failure rate expectations
- Review architectural patterns
  - Monorepo vs multi-repo efficiency
  - Microservices pipeline strategies
  - Component reuse opportunities
- Analyze cost by category
  - Build jobs vs test jobs vs deploy jobs
  - SaaS runners vs self-hosted runners
  - Per-language/framework patterns
Implementation
- Execute optimization plans
  - Prioritize by impact
  - Test changes in isolation
  - Measure before/after
- Update shared components
  - Incorporate new best practices
  - Version and release updates
  - Document changes
- Roll out improvements across projects
  - Create migration guides
  - Assist teams with updates
  - Track adoption
- Infrastructure improvements
  - Add/upgrade self-hosted runners
  - Optimize cache storage
  - Review network performance
Documentation
- Update internal wiki
  - New optimization techniques
  - Updated examples
  - Lessons learned
- Create case studies
  - Document successful optimizations
  - Show cost savings
  - Share methodology
- Review and update checklists
  - This checklist
  - Project setup guides
  - Code review guidelines
Incident Response Checklist
High Usage Alert (>90% Quota)
- Immediate assessment
  - Check top projects by usage TODAY
  - Identify any runaway pipelines
  - Cancel unnecessary running pipelines
- Quick wins
  - Disable scheduled pipelines temporarily
  - Switch high-volume projects to self-hosted runners
  - Temporarily disable non-critical projects
- Communication
  - Alert teams about usage situation
  - Request voluntary reduction measures
  - Share guidelines for immediate optimization
- Purchase decision
  - Calculate additional minutes needed
  - Approve purchase if necessary
  - Plan for next month to avoid recurrence
Quota Exhausted (100%)
- Immediate actions
  - Purchase additional minutes immediately
  - Communicate outage to all teams
  - Identify critical deployments blocked
- Emergency measures
  - Enable only critical project pipelines
  - Temporarily disable all scheduled pipelines
  - Use self-hosted runners for urgent work
- Root cause analysis
  - Identify what caused quota exhaustion
  - Review usage spike timeline
  - Document lessons learned
- Prevention plan
  - Implement stricter monitoring
  - Set up earlier warning alerts (70%, 80%)
  - Create emergency response procedure
Pipeline Optimization Patterns
DO
- Use file-based cache keys

      cache:
        key:
          files:
            - package-lock.json

- Mark jobs interruptible

      test:
        interruptible: true

- Skip jobs when files unchanged

      test:
        rules:
          - changes:
              - "src/**/*"

- Set aggressive timeouts

      lint:
        timeout: 5m

- Use smallest appropriate runner

      lint:
        tags:
          - saas-linux-small-amd64

- Enable auto-cancel

      workflow:
        auto_cancel:
          on_new_commit: interruptible

- Use pull-only cache for read-only jobs

      test:
        cache:
          policy: pull
DON'T
- Don't use static cache keys

      # BAD
      cache:
        key: "my-cache"

- Don't make deploy jobs interruptible

      # BAD
      deploy:
        interruptible: true

- Don't run all jobs for all changes

      # BAD - No rules to skip unnecessary jobs
      test:
        script: npm test

- Don't use default 1-hour timeout

      # BAD - No timeout specified for 2-minute job

- Don't use large runners for small jobs

      # BAD
      lint:
        tags:
          - saas-linux-large-amd64

- Don't run duplicate pipelines

      # BAD - Both branch and MR pipelines
      workflow:
        rules:
          - when: always

- Don't upload cache unnecessarily

      # BAD - Read-only job with pull-push
      test:
        cache:
          policy: pull-push
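For the duplicate-pipeline case, a commonly used counterpart to the BAD example relies on the predefined CI_OPEN_MERGE_REQUESTS variable. This is a sketch of that pattern, not necessarily the rule set your project should adopt:

```yaml
workflow:
  rules:
    # Run merge request pipelines.
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    # Skip the branch pipeline when a merge request is already open for the branch.
    - if: $CI_COMMIT_BRANCH && $CI_OPEN_MERGE_REQUESTS
      when: never
    # Otherwise run branch pipelines (branches without an open merge request).
    - if: $CI_COMMIT_BRANCH
```

With these rules a push to a branch produces one pipeline: a merge request pipeline if an MR exists, or a branch pipeline if it does not.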
Cost Optimization Maturity Levels
Level 0: Unoptimized (Baseline)
- No caching configured
- No job timeouts
- All jobs run for all changes
- No pipeline rules
- Typical cost: Baseline
Level 1: Basic Optimization
- Caching configured with static keys
- Jobs marked as interruptible
- Workflow rules to prevent duplicates
- Expected savings: 20-30%
Level 2: Intermediate Optimization
- File-based cache keys
- Cache fallback chains
- Jobs skip when files unchanged
- Timeouts configured
- Auto-cancel redundant pipelines
- Expected savings: 40-50%
Level 3: Advanced Optimization
- Docker layer caching
- Optimized cache policies (pull/push)
- Self-hosted runners for heavy jobs
- Pre-push validation
- Component reuse
- Monitoring and alerting
- Expected savings: 60-70%
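Docker layer caching, listed under Level 3, can be approximated on shared runners by reusing a previously pushed image as the cache source. The sketch below assumes a runner that allows Docker-in-Docker (privileged mode), the project's built-in container registry, and BuildKit's inline cache. The job name and the choice of latest as the cache tag are illustrative.

```yaml
build-image:
  image: docker:24
  services:
    - docker:24-dind
  variables:
    DOCKER_TLS_CERTDIR: "/certs"
    CACHE_IMAGE: $CI_REGISTRY_IMAGE:latest   # illustrative choice of cache tag
  script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    # The first run has nothing to pull, so a failure here is tolerated.
    - docker pull "$CACHE_IMAGE" || true
    # BUILDKIT_INLINE_CACHE=1 embeds cache metadata so the pushed image can serve as a cache later.
    - >
      docker build
      --cache-from "$CACHE_IMAGE"
      --build-arg BUILDKIT_INLINE_CACHE=1
      --tag "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
      --tag "$CACHE_IMAGE"
      .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA"
    - docker push "$CACHE_IMAGE"
```

In practice you would likely restrict the push of the cache tag to the default branch so that feature branches do not overwrite it.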
Level 4: Expert Optimization
- All Level 3 optimizations
- Custom runner infrastructure
- Advanced parallelization strategies
- Continuous optimization process
- Automated cost analysis
- Team-wide best practices adoption
- Expected savings: 70-85%
Quick Wins (30 Minutes)
Immediate actions for high impact:
- Enable auto-cancel

      workflow:
        auto_cancel:
          on_new_commit: interruptible

  Impact: 20-40% savings

- Add file-based cache key

      cache:
        key:
          files:
            - package-lock.json
        paths:
          - node_modules/

  Impact: 30-50% faster jobs

- Mark jobs interruptible

      default:
        interruptible: true

  Impact: Enable auto-cancel

- Skip draft MR pipelines

      workflow:
        rules:
          - if: $CI_MERGE_REQUEST_TITLE =~ /^Draft:/
            when: never

  Impact: 10-20% savings

- Set job timeouts

      default:
        timeout: 15m

  Impact: Catch hung jobs faster
Resources
Documentation
- Overview - Cost optimization overview
- Tracking - Monitor minute usage
- Strategies - Reduction strategies
- Caching Deep Dive - Master caching
- Pipeline Optimization - Pipeline efficiency
- Pre-Push Validation - Test before CI
- Monitoring - Track optimization impact
External Resources
Tools
- glab - GitLab CLI tool
- gitlab-ci-local - Test pipelines locally
- mlr (Miller) - Analyze usage data
- husky - Git hooks
- pre-commit - Pre-commit framework
Support
Need help?
- Read the detailed guides above
- Ask in #ci-cd Slack channel
- Create issue in gitlab-components project
- Email: devops@example.com
Found an optimization?
- Share in team meeting
- Update this documentation
- Create component for reuse