GitLab CI/CD Caching Strategies
Comprehensive guide to caching in GitLab CI/CD for faster pipelines and reduced costs.
Table of Contents
- Understanding Cache vs Artifacts
- Docker Layer Caching
- Dependency Caching
- Build Artifact Caching
- Cache Configuration
- Best Practices
- Troubleshooting
Understanding Cache vs Artifacts
Cache
Purpose: Speed up jobs by preserving dependencies between pipeline runs
Lifecycle: Persists across pipelines, shared between branches
Use for: node_modules, pip packages, Maven dependencies, build caches
Artifacts
Purpose: Pass data between jobs within the same pipeline
Lifecycle: Exists only for current pipeline (unless kept for releases)
Use for: Build outputs, test reports, compiled binaries
Key Differences
| Feature | Cache | Artifacts |
|---|---|---|
| Availability | Not guaranteed (best effort) | Guaranteed |
| Scope | Cross-pipeline | Within pipeline |
| Storage | Runner (local disk or distributed cache) | GitLab instance |
| Purpose | Optimization | Required data transfer |
| Expiration | Can be evicted anytime | Configured expiration |
Rule of Thumb: If a job needs it, use artifacts. If it speeds up work, use cache.
Source: GitLab Caching Documentation
Docker Layer Caching
Why Docker Layer Caching Matters
Docker builds are often the most expensive part of CI/CD pipelines. Effective layer caching can reduce build times by 70-90%.
Cost Impact:
Without caching: 10 min build × 20 runs/day = 200 minutes/day
With caching: 2 min build × 20 runs/day = 40 minutes/day
Savings: 160 minutes/day = 80% reduction
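The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names are illustrative, and the 10-minute and 2-minute build times are the example figures from this guide, not measurements.

```python
def daily_build_minutes(build_minutes: float, runs_per_day: int) -> float:
    """Total CI minutes spent on builds per day."""
    return build_minutes * runs_per_day


def percent_reduction(before: float, after: float) -> float:
    """Relative saving, as a percentage of the uncached cost."""
    return (before - after) / before * 100


uncached = daily_build_minutes(10, 20)  # 10 min build x 20 runs/day
cached = daily_build_minutes(2, 20)     # 2 min build x 20 runs/day

print(f"Without caching: {uncached:.0f} minutes/day")
print(f"With caching:    {cached:.0f} minutes/day")
print(f"Savings:         {uncached - cached:.0f} minutes/day "
      f"({percent_reduction(uncached, cached):.0f}% reduction)")
```

Substituting your own build time and run count gives a rough estimate of what caching is worth for a given project.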
Method 1: Docker Build with Cache-From
Most common approach for GitLab CI:

    build-image:
      stage: build
      image: docker:24.0.7
      services:
        - docker:24.0.7-dind
      variables:
        DOCKER_DRIVER: overlay2
        DOCKER_TLS_CERTDIR: "/certs"
      before_script:
        - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
      script:
        # Pull previous image to use as cache
        - docker pull $CI_REGISTRY_IMAGE:latest || true
        # Build with cache-from
        - >
          docker build
          --cache-from $CI_REGISTRY_IMAGE:latest
          --tag $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
          --tag $CI_REGISTRY_IMAGE:latest
          .
        # Push both tags
        - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
        - docker push $CI_REGISTRY_IMAGE:latest

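How it works: Docker pulls the latest tag and reuses matching layers during build. With BuildKit, the default builder since Docker Engine 23.0, layers from a pulled image are reused only if that image was built with inline cache metadata (see Method 3).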
Source: GitLab Docker Layer Caching
Method 2: BuildKit with Registry Cache Backend
Most efficient method (recommended for 2026):

    build-image-buildkit:
      stage: build
      image: docker:24.0.7
      services:
        - docker:24.0.7-dind
      variables:
        DOCKER_DRIVER: overlay2
        DOCKER_BUILDKIT: 1
      before_script:
        - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
      script:
        # Use BuildKit with registry cache.
        # Note: exporting cache to a registry needs a buildx builder that supports it
        # (for example one that uses the docker-container driver).
        - >
          docker buildx build
          --cache-from type=registry,ref=$CI_REGISTRY_IMAGE:buildcache
          --cache-to type=registry,ref=$CI_REGISTRY_IMAGE:buildcache,mode=max
          --tag $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
          --tag $CI_REGISTRY_IMAGE:latest
          --push
          .

Advantages:
- mode=max: Caches ALL layers (not just final image)
- Better cache hit rates
- Faster cache retrieval
- More efficient storage
Source: Faster CI Builds with Docker Cache
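To make the effect of mode=max concrete, here is a toy model in Python. It is not how BuildKit is implemented, and the stage and layer names are invented for illustration. The only behavior it mirrors is that BuildKit's default export mode (mode=min) writes the layers of the final image to the cache, while mode=max also writes the layers of intermediate build stages.

```python
# Toy model of a two-stage build: stage name -> layers produced in that stage.
# Stage and layer names are hypothetical.
STAGES = {
    "builder": ["npm ci", "npm run build"],
    "final": ["copy dist from builder"],
}
FINAL_STAGE = "final"


def exported_layers(mode: str) -> list[str]:
    """Return the layers that would be written to the cache in this model."""
    if mode == "min":
        return list(STAGES[FINAL_STAGE])
    if mode == "max":
        return [layer for layers in STAGES.values() for layer in layers]
    raise ValueError(f"unknown cache mode: {mode}")


print(exported_layers("min"))
print(exported_layers("max"))
```

In this model the expensive dependency-install layer is cached only under mode=max, which is why multi-stage builds tend to benefit from it most.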
Method 3: Inline Cache
For simple use cases:
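
    build-inline-cache:
      script:
        - docker pull $CI_REGISTRY_IMAGE:latest || true
        - >
          docker build
          --build-arg BUILDKIT_INLINE_CACHE=1
          --cache-from $CI_REGISTRY_IMAGE:latest
          --tag $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
          --tag $CI_REGISTRY_IMAGE:latest
          .
        # Push latest so the next pipeline can use it as its cache source
        - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
        - docker push $CI_REGISTRY_IMAGE:latest
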
Advantage: Embeds cache metadata directly in image (no separate cache image)
Disadvantage: Only caches layers in final image (not multi-stage intermediate layers)
Dockerfile Optimization for Caching
Poor caching (invalidates on every code change):

    FROM node:18
    WORKDIR /app
    # Everything copied - cache always invalidated
    COPY . .
    RUN npm install
    RUN npm run build

Optimized for caching:

    # Use slim images
    FROM node:18-slim AS builder
    WORKDIR /app
    # Layer 1: Dependencies (changes rarely)
    COPY package*.json ./
    RUN npm ci --only=production
    # Layer 2: Source code (changes frequently)
    COPY . .
    RUN npm run build
    # Layer 3: Final image
    FROM node:18-slim
    WORKDIR /app
    COPY --from=builder /app/dist ./dist
    COPY --from=builder /app/node_modules ./node_modules
    CMD ["node", "dist/index.js"]

Key principles:
- Order by change frequency: Least changed → Most changed
- Separate dependencies from code: Package files first, then source
- Use multi-stage builds: Only copy final artifacts
- Minimize COPY scope: Don't copy unnecessary files
Expected improvement: 80-95% cache hit rate vs 10-30% without optimization
Source: Docker Layer Caching Best Practices
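The ordering principle can be illustrated with a simplified model of layer caching, sketched here in Python. It assumes that each layer's cache key depends on the previous layer's key, the instruction text, and (for COPY) the content of the copied files. Real builders are more involved, and the file names and contents below are made up. The model also matches COPY sources by exact name, so it uses package.json rather than the package*.json glob.

```python
import hashlib


def layer_keys(instructions: list[str], file_contents: dict[str, str]) -> list[str]:
    """Chain a key per instruction: parent key + instruction text + copied content."""
    keys = []
    parent = ""
    for instruction in instructions:
        material = parent + instruction
        if instruction.startswith("COPY"):
            # Second token is the source; "." means every file in the build context.
            source = instruction.split()[1]
            for name in sorted(file_contents):
                if source == "." or name == source:
                    material += name + file_contents[name]
        parent = hashlib.sha256(material.encode()).hexdigest()
        keys.append(parent)
    return keys


def reused_layers(instructions: list[str], before: dict[str, str], after: dict[str, str]) -> int:
    """Count leading layers whose keys are identical across two builds."""
    count = 0
    for old, new in zip(layer_keys(instructions, before), layer_keys(instructions, after)):
        if old != new:
            break
        count += 1
    return count


poor = ["FROM node:18", "COPY . .", "RUN npm install", "RUN npm run build"]
optimized = ["FROM node:18", "COPY package.json ./", "RUN npm ci", "COPY . .", "RUN npm run build"]

# Only the application source changes between the two builds.
before = {"package.json": "deps-v1", "src/index.js": "console.log(1)"}
after = {"package.json": "deps-v1", "src/index.js": "console.log(2)"}

print(reused_layers(poor, before, after))       # layers reused with the poor ordering
print(reused_layers(optimized, before, after))  # layers reused with the optimized ordering
```

With the poor ordering only the base image layer survives a source change. With the optimized ordering the dependency install layer is reused as well.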
Dependency Caching
Node.js / npm

    node-build:
      image: node:18
      cache:
        key:
          files:
            - package-lock.json  # Cache invalidates when lockfile changes
        paths:
          - node_modules/
          - .npm/  # npm cache directory
      before_script:
        - npm ci --cache .npm --prefer-offline
      script:
        - npm run build
      artifacts:
        paths:
          - dist/
        expire_in: 1 day

Key optimizations:
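- Use npm ci instead of npm install (faster, deterministic)
- Cache both node_modules and the .npm directory; npm ci deletes any existing node_modules before installing, so the .npm download cache is what speeds up the install itself
- Use --prefer-offline to use the cache first
- Key based on lockfile ensures cache invalidation on dependency changes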
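Why a lockfile-based key invalidates the cache automatically can be shown with a small Python sketch. GitLab derives the real key itself from the listed files. The content hash below is only a stand-in for that, and the lockfile contents and the in-memory cache store are hypothetical.

```python
import hashlib


def file_based_key(lockfile_text: str) -> str:
    """Derive a cache key from lockfile content (illustrative, not GitLab's algorithm)."""
    return hashlib.sha256(lockfile_text.encode()).hexdigest()[:12]


lock_v1 = '{"lodash": "4.17.20"}'
lock_v2 = '{"lodash": "4.17.21"}'  # one dependency bumped

cache = {}  # key -> cached directory contents (stand-in for the runner's cache store)

# First pipeline: cache miss, so dependencies are installed and stored.
cache[file_based_key(lock_v1)] = "node_modules for v1"

# Same lockfile on a later pipeline: the key matches, so the cache is reused.
hit_same_lockfile = file_based_key(lock_v1) in cache

# Changed lockfile: a different key is produced, so the stale cache is not used.
hit_after_bump = file_based_key(lock_v2) in cache

print(hit_same_lockfile, hit_after_bump)
```

A fixed key would keep matching after the dependency bump, which is how stale dependencies end up being restored.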
Python / pip
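
    python-test:
      image: python:3.11
      cache:
        key:
          files:
            - requirements.txt
        paths:
          - .cache/pip
          - venv/
      before_script:
        # Create and activate the virtual environment that the venv/ cache path refers to
        - python -m venv venv
        - . venv/bin/activate
        - pip install --cache-dir .cache/pip -r requirements.txt
      script:
        - python -m pytest
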
Go modules

    go-build:
      image: golang:1.21
      cache:
        key:
          files:
            - go.sum
        paths:
          - .go-cache/
          - .go-mod-cache/
      variables:
        GOCACHE: $CI_PROJECT_DIR/.go-cache
        GOMODCACHE: $CI_PROJECT_DIR/.go-mod-cache
      script:
        - go build -o app ./cmd/app

Maven / Java

    maven-build:
      image: maven:3.9-eclipse-temurin-17
      cache:
        key:
          files:
            - pom.xml
        paths:
          - .m2/repository
      variables:
        MAVEN_OPTS: "-Dmaven.repo.local=$CI_PROJECT_DIR/.m2/repository"
      script:
        - mvn clean package

Source: Speeding Up GitLab CI with Caching
Build Artifact Caching
Webpack / Build Caches
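
    frontend-build:
      cache:
        - key:
            files:
              - package-lock.json
          paths:
            - node_modules/
        - key: webpack-cache
          paths:
            - .webpack-cache/
      script:
        # Assumes the project's build script accepts --cache-directory; with webpack 5
        # the location is normally set via cache.cacheDirectory in webpack.config.js
        - npm run build -- --cache-directory .webpack-cache
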
Multiple caches: Separate dependency cache (keyed to lockfile) from build cache (static key)
Compiler Caches (ccache, sccache)

    cpp-build:
      cache:
        key: ccache-$CI_COMMIT_REF_SLUG
        paths:
          - .ccache/
      variables:
        CCACHE_DIR: $CI_PROJECT_DIR/.ccache
      before_script:
        - export PATH="/usr/lib/ccache:$PATH"
      script:
        - make

Cache Configuration
Cache Keys
1. Per-branch caching:

    cache:
      key: $CI_COMMIT_REF_SLUG  # Different cache per branch
      paths:
        - node_modules/

2. File-based caching (recommended):

    cache:
      key:
        files:
          - package-lock.json
          - yarn.lock
      paths:
        - node_modules/

3. Composite keys:

    cache:
      key: $CI_COMMIT_REF_SLUG-$CI_JOB_NAME
      paths:
        - node_modules/

4. Branch-prefixed file key combined with a shared global cache:

    cache:
      - key:
          files:
            - package-lock.json
          prefix: $CI_COMMIT_REF_SLUG
        paths:
          - node_modules/
      - key: global-cache
        paths:
          - .npm/

Cache Policies

    # Pull and push (default)
    build:
      cache:
        paths:
          - node_modules/
        policy: pull-push  # Download and upload cache

    # Pull only (read-only jobs)
    test:
      cache:
        paths:
          - node_modules/
        policy: pull  # Only download, don't upload

    # Create cache job
    prepare-cache:
      script:
        - npm ci
      cache:
        paths:
          - node_modules/
        policy: push  # Only upload cache

Cost optimization: Use pull policy on most jobs, pull-push only on jobs that modify cache.
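As a rough illustration of that cost argument, the Python sketch below counts cache downloads and uploads per pipeline under each policy. The six-job pipeline is hypothetical, and the model simply assumes one transfer per job in each direction the policy allows.

```python
# Which transfers each cache policy performs (per the policy descriptions above).
TRANSFERS = {
    "pull-push": {"download": 1, "upload": 1},
    "pull": {"download": 1, "upload": 0},
    "push": {"download": 0, "upload": 1},
}


def count_transfers(job_policies: list[str]) -> dict[str, int]:
    """Sum cache downloads and uploads for the jobs in one pipeline."""
    totals = {"download": 0, "upload": 0}
    for policy in job_policies:
        for kind, amount in TRANSFERS[policy].items():
            totals[kind] += amount
    return totals


# Hypothetical pipeline: one job that populates the cache and five that only read it.
everything_default = count_transfers(["pull-push"] * 6)
tuned = count_transfers(["push"] + ["pull"] * 5)

print(everything_default)
print(tuned)
```

In this example the tuned pipeline does one upload instead of six, and upload time grows with cache size.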
Cache Fallback

    cache:
      key: $CI_COMMIT_REF_SLUG  # Try branch-specific cache first
      # fallback_keys is available in GitLab 16.0 and later; fallback caches are read, never written
      fallback_keys:
        - main                  # Fallback to main branch cache
      paths:
        - node_modules/

Source: GitLab Caching Documentation
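The lookup order that a cache fallback implies (branch-specific key first, then main) can be sketched in Python as follows. The store is a plain dictionary standing in for the runner's cache, and the branch name feature-login is hypothetical.

```python
from typing import Optional


def restore_cache(store: dict[str, str], keys_in_order: list[str]) -> Optional[str]:
    """Return the first cached entry found, trying keys from most to least specific."""
    for key in keys_in_order:
        if key in store:
            return store[key]
    return None


# Hypothetical cache store: only the main branch has been cached so far.
store = {"main": "node_modules built on main"}

# A new feature branch has no cache of its own, so the lookup falls back to main.
print(restore_cache(store, ["feature-login", "main"]))

# Once the branch has uploaded its own cache, that entry wins.
store["feature-login"] = "node_modules built on feature-login"
print(restore_cache(store, ["feature-login", "main"]))
```

The first pipeline on a new branch starts from main's dependencies instead of an empty cache, and later pipelines use the branch's own entry.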
Best Practices
1. Cache Key Strategy
Do:

    cache:
      key:
        files:
          - package-lock.json  # Automatic invalidation on changes

Don't:

    cache:
      key: fixed-key  # Manual invalidation required

2. Cache Scope
Per-branch (isolated caching):

    cache:
      key: $CI_COMMIT_REF_SLUG-deps
      paths:
        - node_modules/

Shared across branches (faster initial builds on new branches):

    cache:
      key: global-deps
      paths:
        - node_modules/

3. Cache Size Management
Monitor cache sizes:
- Keep caches < 1GB when possible
- Exclude unnecessary files (test outputs, logs)
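- Use glob patterns in cache paths to select only what you need

    cache:
      paths:
        - node_modules/
    # cache:paths has no exclusion syntax, and a leading ! is read as a YAML tag,
    # so delete unwanted sub-caches before the cache is archived
    after_script:
      - rm -rf node_modules/.cache
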
4. Multi-Cache Strategy

    # Different caches for different purposes
    cache:
      - key:
          files:
            - package-lock.json
        paths:
          - node_modules/
      - key: build-cache-$CI_COMMIT_REF_SLUG
        paths:
          - .webpack-cache/
      - key: test-data
        paths:
          - test-data/
        policy: pull  # Test data never changes in pipeline

5. Runner Tag Consistency
Critical for cache effectiveness:

    build:
      tags:
        - docker
        - linux
      cache:
        paths:
          - node_modules/

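Unless the runners use a distributed cache, each runner keeps its own local cache, so jobs that run on different runners do not share it. Runners with different tags pick up different jobs, so keep tags consistent across jobs that use the same cache.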
6. Artifact Expiration

    build:
      script: npm run build
      artifacts:
        paths:
          - dist/
        expire_in: 1 day  # Clean up after 1 day
      cache:
        paths:
          - node_modules/

Expiration Guidelines:
- Development builds: 1-7 days
- Release builds: 30 days or never (for releases)
- Test artifacts: 1 day
- Default (if not set): 30 days
Cost impact: Shorter expiration = less storage costs
Source: GitLab Job Artifacts Documentation
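The link between expiration and storage can be estimated with simple arithmetic, sketched here in Python. The artifact size and pipeline frequency are hypothetical, and the model assumes every pipeline uploads artifacts of the same size and that expired artifacts are removed promptly.

```python
def steady_state_storage_mb(artifact_mb: float, pipelines_per_day: int, retention_days: int) -> float:
    """Approximate storage held once uploads and expirations balance out."""
    return artifact_mb * pipelines_per_day * retention_days


# Hypothetical project: 50 MB of artifacts per pipeline, 20 pipelines per day.
one_day = steady_state_storage_mb(50, 20, 1)
thirty_days = steady_state_storage_mb(50, 20, 30)

print(f"expire_in: 1 day   -> about {one_day:.0f} MB retained")
print(f"expire_in: 30 days -> about {thirty_days:.0f} MB retained")
```

Under these assumptions retained storage scales linearly with the retention period, so moving from 30 days to 1 day cuts it by a factor of 30.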
Troubleshooting
Cache Not Working
Symptom: Dependencies reinstalled on every run
Causes:
- Different runner tags between jobs
- Cache key changes unexpectedly
- Cache eviction due to size limits
- Incorrect paths (typo or wrong directory)
Debug:

    debug-cache:
      script:
        - ls -la node_modules/ || echo "Cache miss - no node_modules"
        - echo "Cache key: $CI_COMMIT_REF_SLUG"
      cache:
        key: $CI_COMMIT_REF_SLUG  # Set explicitly so it matches the key printed above
        paths:
          - node_modules/

Cache Download Slow
Symptom: Cache download takes longer than rebuilding
Solutions:
- Reduce cache size (exclude unnecessary files)
- Use more specific cache keys (avoid huge shared caches)
- Consider if cache is worth it for this job
- Use policy: pull-push only where necessary
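Whether a cache is worth keeping for a job comes down to a break-even check, sketched below in Python. All timings are hypothetical and in seconds. In practice they would be read from the job log, which reports the cache restore and save steps separately from the script.

```python
def cache_saves_time(download_s: float, upload_s: float,
                     cold_install_s: float, warm_install_s: float) -> bool:
    """True if cache transfer overhead is smaller than the install time it saves."""
    overhead = download_s + upload_s
    saved = cold_install_s - warm_install_s
    return overhead < saved


# Hypothetical timings in seconds, for illustration only.
print(cache_saves_time(download_s=15, upload_s=20, cold_install_s=120, warm_install_s=25))
print(cache_saves_time(download_s=90, upload_s=60, cold_install_s=120, warm_install_s=25))
```

For a job that uses policy: pull, pass upload_s=0, since that job never uploads the cache.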
Docker Cache Miss
Symptom: Docker always rebuilds from scratch
Causes:
- Previous image not found (check registry)
- Dockerfile changed order (invalidated layers)
- Not using BuildKit with mode=max
Fix:

    build:
      script:
        # Add error handling
        - docker pull $CI_REGISTRY_IMAGE:latest || echo "No cache image found"
        - docker build --cache-from $CI_REGISTRY_IMAGE:latest ...

Cache Conflicts
Symptom: Cache from wrong branch causes failures
Solution: Use file-based keys instead of branch-based:

    cache:
      key:
        files:
          - package-lock.json  # Same lockfile = compatible cache

Performance Benchmarks
Expected Improvements
| Optimization | Time Savings | Cost Savings |
|---|---|---|
| Docker layer caching | 70-90% | High |
| Dependency caching (npm/pip) | 50-70% | High |
| Build cache (webpack) | 40-60% | Medium |
| Compiler cache (ccache) | 60-80% | High |
| Artifact reuse | 30-50% | Medium |
Measurement
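
    benchmark-cache:
      script:
        # npm ci removes node_modules before installing, so time it against the .npm download cache
        - time npm ci --cache .npm --prefer-offline  # Measure with timing
      cache:
        paths:
          - .npm/
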
Expect:
- First run (cold cache): Full install time
- Second run (warm cache): 10-30% of original time
- Subsequent runs: < 10% of original time
Additional Resources
- Cost Optimization Guide - Reduce CI minute usage
- Pipeline Efficiency - Job orchestration patterns
- GitLab Caching Docs
- Docker Layer Caching
Last Updated: 2026-01-08
Priority: HIGH - Implement caching to reduce costs