GitLab CI/CD Caching Strategies

Comprehensive guide to caching in GitLab CI/CD for faster pipelines and reduced costs.

Understanding Cache vs Artifacts

Cache

Purpose: Speed up jobs by preserving dependencies between pipeline runs
Lifecycle: Persists across pipelines, shared between branches (depending on the cache key)
Use for: node_modules, pip packages, Maven dependencies, build caches

Artifacts

Purpose: Pass data between jobs within the same pipeline
Lifecycle: Exists only for the current pipeline (unless kept for releases)
Use for: Build outputs, test reports, compiled binaries

Key Differences

| Feature | Cache | Artifacts |
|---|---|---|
| Availability | Not guaranteed (best effort) | Guaranteed |
| Scope | Cross-pipeline | Within pipeline |
| Storage | Runner host, or object storage when distributed caching is configured | GitLab instance |
| Purpose | Optimization | Required data transfer |
| Expiration | Can be evicted anytime | Configured expiration |

Rule of Thumb: If a job needs it, use artifacts. If it speeds up work, use cache.

Source: GitLab Caching Documentation

Docker Layer Caching

Why Docker Layer Caching Matters

Docker builds are often the most expensive part of CI/CD pipelines. Effective layer caching can reduce build times by 70-90%.

Cost Impact:

Without caching: 10 min build × 20 runs/day = 200 minutes/day
With caching:     2 min build × 20 runs/day = 40 minutes/day
Savings: 160 minutes/day = 80% reduction
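The arithmetic above can be reproduced with a short sketch. The function name is illustrative, and the build times and run count are the example figures from this section, not measurements.

```python
def daily_build_minutes(minutes_per_build: float, runs_per_day: int) -> float:
    """Total CI minutes spent on builds in one day."""
    return minutes_per_build * runs_per_day


# Example figures from the text: 10 min uncached, 2 min cached, 20 runs/day
without_cache = daily_build_minutes(10, 20)
with_cache = daily_build_minutes(2, 20)

saved = without_cache - with_cache
reduction = saved / without_cache

print(f"{saved:.0f} minutes/day saved ({reduction:.0%} reduction)")
```

Substituting a project's own build times and pipeline frequency gives a rough estimate of what caching is worth before investing in it.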

Method 1: Docker Build with Cache-From

Most common approach for GitLab CI:

build-image:
  stage: build
  image: docker:24.0.7
  services:
    - docker:24.0.7-dind
  variables:
    DOCKER_DRIVER: overlay2
    DOCKER_TLS_CERTDIR: "/certs"
  before_script:
    # Pass the password on stdin so it does not appear in the process list
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
  script:
    # Pull previous image to use as cache
    - docker pull $CI_REGISTRY_IMAGE:latest || true
    # Build with cache-from
    - >
      docker build
      --cache-from $CI_REGISTRY_IMAGE:latest
      --tag $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
      --tag $CI_REGISTRY_IMAGE:latest
      .
    # Push both tags
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
    - docker push $CI_REGISTRY_IMAGE:latest

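How it works: Docker pulls the latest tag and reuses matching layers during the build. With BuildKit, the default builder since Docker Engine 23.0, the pulled image only serves as a cache source if it was built with inline cache metadata (see Method 3).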

Source: GitLab Docker Layer Caching

Method 2: BuildKit with Registry Cache Backend

Most efficient method (recommended for 2026):

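build-image-buildkit:
  stage: build
  image: docker:24.0.7
  services:
    - docker:24.0.7-dind
  variables:
    DOCKER_DRIVER: overlay2
    DOCKER_BUILDKIT: "1"
  before_script:
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    # The default docker driver cannot export a registry cache, so create a
    # docker-container builder. The context is needed when dind uses TLS.
    # ci-context and ci-builder are arbitrary names.
    - docker context create ci-context
    - docker buildx create --name ci-builder --driver docker-container --use ci-context
  script:
    # Use BuildKit with registry cache
    - >
      docker buildx build
      --cache-from type=registry,ref=$CI_REGISTRY_IMAGE:buildcache
      --cache-to type=registry,ref=$CI_REGISTRY_IMAGE:buildcache,mode=max
      --tag $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
      --tag $CI_REGISTRY_IMAGE:latest
      --push
      .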

Advantages:

  • mode=max: Caches ALL layers (not just final image)
  • Better cache hit rates
  • Faster cache retrieval
  • More efficient storage

Source: Faster CI Builds with Docker Cache

Method 3: Inline Cache

For simple use cases:

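build-inline-cache:
  script:
    - docker pull $CI_REGISTRY_IMAGE:latest || true
    - >
      docker build
      --build-arg BUILDKIT_INLINE_CACHE=1
      --cache-from $CI_REGISTRY_IMAGE:latest
      --tag $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
      --tag $CI_REGISTRY_IMAGE:latest
      .
    # Push latest so the next pipeline can pull it as the cache source
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
    - docker push $CI_REGISTRY_IMAGE:latest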

Advantage: Embeds cache metadata directly in the image (no separate cache image)
Disadvantage: Only caches layers in the final image (not multi-stage intermediate layers)
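The difference between a cache limited to the final image (inline cache) and mode=max can be pictured with a small model. This is only a sketch: the stage and layer names are hypothetical, and real BuildKit cache manifests are more involved than a set of labels.

```python
# Hypothetical two-stage build; the layer names are illustrative only.
STAGES = {
    "builder": ["base", "npm-ci", "copy-src", "npm-build"],
    "final": ["base", "copy-dist"],
}


def exported_layers(mode: str, target: str = "final") -> set[str]:
    """Layers available to later builds under each cache export mode."""
    if mode == "min":
        # Inline cache behaves like this: only layers of the resulting image
        return set(STAGES[target])
    if mode == "max":
        # mode=max also exports layers of intermediate stages
        return {layer for layers in STAGES.values() for layer in layers}
    raise ValueError(f"unknown mode: {mode}")


print(sorted(exported_layers("min")))
print(sorted(exported_layers("max")))
```

In this model the expensive dependency-install layer lives only in the builder stage, so it is reusable on the next run only when intermediate layers are exported.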

Dockerfile Optimization for Caching

Poor caching (invalidates on every code change):

FROM node:18
WORKDIR /app
# Everything copied - cache always invalidated
COPY . .
RUN npm install
RUN npm run build

Optimized for caching:

# Use slim images
FROM node:18-slim AS builder
WORKDIR /app
# Layer 1: Dependencies (changes rarely)
COPY package*.json ./
# Install devDependencies too, because the build step needs them
RUN npm ci
# Layer 2: Source code (changes frequently)
COPY . .
RUN npm run build
# Drop devDependencies before node_modules is copied to the final image
RUN npm prune --omit=dev
# Layer 3: Final image
FROM node:18-slim
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
CMD ["node", "dist/index.js"]

Key principles:

  1. Order by change frequency: Least frequently changed first, most frequently changed last
  2. Separate dependencies from code: Package files first, then source
  3. Use multi-stage builds: Only copy final artifacts
  4. Minimize COPY scope: Don't copy unnecessary files
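The principles above follow from how layer reuse works: a layer can be reused only if its instruction, its input files, and every layer before it are unchanged. The sketch below is a simplified model of that rule, not Docker's actual implementation; the step lists and file versions are made up for illustration.

```python
import hashlib


def layer_keys(steps, inputs):
    """Return one cache key per step; each key chains the previous key."""
    keys, parent = [], ""
    for instruction, files in steps:
        content = "".join(f"{name}={inputs[name]}" for name in sorted(files))
        raw = f"{parent}|{instruction}|{content}"
        parent = hashlib.sha256(raw.encode()).hexdigest()
        keys.append(parent)
    return keys


def reused_layers(steps, before, after):
    """Count layers whose key is identical before and after a change."""
    old, new = layer_keys(steps, before), layer_keys(steps, after)
    return sum(1 for a, b in zip(old, new) if a == b)


# Each step: (instruction, files the instruction reads from the build context)
poor = [
    ("COPY . .", ["package.json", "src"]),
    ("RUN npm install", []),
    ("RUN npm run build", []),
]
optimized = [
    ("COPY package*.json ./", ["package.json"]),
    ("RUN npm ci", []),
    ("COPY . .", ["package.json", "src"]),
    ("RUN npm run build", []),
]

# A source-only change: package.json is untouched
before = {"package.json": "v1", "src": "v1"}
after = {"package.json": "v1", "src": "v2"}

print(reused_layers(poor, before, after))
print(reused_layers(optimized, before, after))
```

With the poor ordering a source change invalidates the very first layer, so nothing is reused; with the optimized ordering the dependency layers survive the same change.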

Expected improvement: 80-95% cache hit rate vs 10-30% without optimization

Source: Docker Layer Caching Best Practices

Dependency Caching

Node.js / npm

node-build:
  image: node:18
  cache:
    key:
      files:
        - package-lock.json  # Cache invalidates when lockfile changes
    paths:
      - .npm/  # npm cache directory
  before_script:
    - npm ci --cache .npm --prefer-offline
  script:
    - npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 1 day

Key optimizations:

  • Use npm ci instead of npm install (faster, deterministic)
  • Cache the .npm directory rather than node_modules: npm ci deletes node_modules before installing, so a cached copy would be discarded
  • Use --prefer-offline to use cache first
  • Key based on lockfile ensures cache invalidation on dependency changes
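The idea behind a lockfile-based key can be sketched as follows. GitLab derives the real key from the most recent commits that changed the listed files; this sketch substitutes a hash of the file content, which is an assumption made purely to illustrate why the key changes exactly when dependencies change. The function name and the sample lockfile contents are hypothetical.

```python
import hashlib


def cache_key(lockfile_bytes: bytes, prefix: str = "") -> str:
    """Derive a cache key from lockfile content (illustrative, not GitLab's algorithm)."""
    digest = hashlib.sha1(lockfile_bytes).hexdigest()
    return f"{prefix}-{digest}" if prefix else digest


lock_v1 = b'{"dependencies": {"left-pad": "1.3.0"}}'
lock_v2 = b'{"dependencies": {"left-pad": "1.3.1"}}'

# Same lockfile gives the same key, so the cache is reused
print(cache_key(lock_v1) == cache_key(lock_v1))
# A changed lockfile gives a new key, so a fresh cache is built
print(cache_key(lock_v1) == cache_key(lock_v2))
```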

Python / pip

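python-test:
  image: python:3.11
  cache:
    key:
      files:
        - requirements.txt
    paths:
      - .cache/pip
      - venv/
  before_script:
    # Create the virtualenv so the cached venv/ path is actually populated
    - python -m venv venv
    - . venv/bin/activate
    - pip install --cache-dir .cache/pip -r requirements.txt
  script:
    - python -m pytest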

Go modules

go-build:
  image: golang:1.21
  cache:
    key:
      files:
        - go.sum
    paths:
      - .go-cache/
      - .go-mod-cache/
  variables:
    GOCACHE: $CI_PROJECT_DIR/.go-cache
    GOMODCACHE: $CI_PROJECT_DIR/.go-mod-cache
  script:
    - go build -o app ./cmd/app

Maven / Java

maven-build:
  image: maven:3.9-eclipse-temurin-17
  cache:
    key:
      files:
        - pom.xml
    paths:
      - .m2/repository
  variables:
    MAVEN_OPTS: "-Dmaven.repo.local=$CI_PROJECT_DIR/.m2/repository"
  script:
    - mvn clean package

Source: Speeding Up GitLab CI with Caching

Build Artifact Caching

Webpack / Build Caches

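frontend-build:
  cache:
    - key:
        files:
          - package-lock.json
      paths:
        - node_modules/
    - key: webpack-cache
      paths:
        - .webpack-cache/
  script:
    # Assumes the project's build script accepts this option; with webpack 5 the
    # cache location is normally set through cache.cacheDirectory in webpack.config.js
    - npm run build -- --cache-directory .webpack-cache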

Multiple caches: Separate dependency cache (keyed to lockfile) from build cache (static key)

Compiler Caches (ccache, sccache)

cpp-build:
  cache:
    key: ccache-$CI_COMMIT_REF_SLUG
    paths:
      - .ccache/
  variables:
    CCACHE_DIR: $CI_PROJECT_DIR/.ccache
  before_script:
    - export PATH="/usr/lib/ccache:$PATH"
  script:
    - make

Cache Configuration

Cache Keys

1. Per-branch caching:

cache:
  key: $CI_COMMIT_REF_SLUG  # Different cache per branch
  paths:
    - node_modules/

2. File-based caching (recommended):

cache:
  key:
    files:
      - package-lock.json
      - yarn.lock
  paths:
    - node_modules/

3. Composite keys:

cache:
  key: $CI_COMMIT_REF_SLUG-$CI_JOB_NAME
  paths:
    - node_modules/

4. Per-branch prefixed key combined with a global cache:

cache:
  - key:
      files:
        - package-lock.json
      prefix: $CI_COMMIT_REF_SLUG
    paths:
      - node_modules/
  - key: global-cache
    paths:
      - .npm/

Cache Policies

# Pull and push (default)
build:
  script:
    - npm run build
  cache:
    paths:
      - node_modules/
    policy: pull-push  # Download and upload cache

# Pull only (read-only jobs)
test:
  script:
    - npm test
  cache:
    paths:
      - node_modules/
    policy: pull  # Only download, don't upload

# Create cache job
prepare-cache:
  script:
    - npm ci
  cache:
    paths:
      - node_modules/
    policy: push  # Only upload cache

Cost optimization: Use pull policy on most jobs, pull-push only on jobs that modify cache.

Cache Fallback

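# Separate cache entries with the same paths are all restored rather than
# tried in turn, so the fallback is expressed with fallback_keys (GitLab 16.0+)
cache:
  key: $CI_COMMIT_REF_SLUG  # Try branch-specific cache first
  fallback_keys:
    - main  # Fall back to main branch cache
  paths:
    - node_modules/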
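The intended lookup order, branch-specific key first and a fallback key second, can be sketched in a few lines. This is a model of the behaviour rather than runner code; the `stored` mapping and the key names are hypothetical.

```python
from typing import Optional


def resolve_cache(stored: dict, keys: list) -> Optional[str]:
    """Return the first key that has a stored cache, or None on a full miss."""
    for key in keys:
        if key in stored:
            return key
    return None


# Hypothetical cache store: only the main branch has populated a cache so far
stored = {"main": "node_modules archive from main"}

# A new feature branch misses on its own key and falls back to main
print(resolve_cache(stored, ["feature-login", "main"]))

# Once the branch has pushed its own cache, that one wins
stored["feature-login"] = "node_modules archive from feature-login"
print(resolve_cache(stored, ["feature-login", "main"]))
```

The fallback mainly helps the first pipeline on a new branch, which would otherwise start with a cold cache.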

Source: GitLab Caching Documentation

Best Practices

1. Cache Key Strategy

Do:

cache:
  key:
    files:
      - package-lock.json  # Automatic invalidation on changes

Don't:

cache:
  key: fixed-key  # Manual invalidation required

2. Cache Scope

Per-branch (isolated caching):

cache:
  key: $CI_COMMIT_REF_SLUG-deps
  paths:
    - node_modules/

Shared across branches (faster initial builds on new branches):

cache:
  key: global-deps
  paths:
    - node_modules/

3. Cache Size Management

Monitor cache sizes:

  • Keep caches < 1GB when possible
  • Exclude unnecessary files (test outputs, logs)
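  • Use glob patterns in cache paths to select only what is needed

build:
  script:
    - npm ci
    - npm run build
    # cache:paths does not support ! negation, so remove sub-caches
    # before the cache is archived
    - rm -rf node_modules/.cache
  cache:
    paths:
      - node_modules/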

4. Multi-Cache Strategy

# Different caches for different purposes
cache:
  - key:
      files:
        - package-lock.json
    paths:
      - node_modules/
  - key: build-cache-$CI_COMMIT_REF_SLUG
    paths:
      - .webpack-cache/
  - key: test-data
    paths:
      - test-data/
    policy: pull  # Test data never changes in pipeline

5. Runner Tag Consistency

Critical for cache effectiveness:

build:
  tags:
    - docker
    - linux
  cache:
    paths:
      - node_modules/

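Unless distributed caching is configured, a cache lives on the runner that created it, so jobs picked up by a different runner will not find it. Use consistent tags across jobs that rely on the same cache so they are routed to runners that share it.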

6. Artifact Expiration

build:
  script: npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 1 day  # Clean up after 1 day
  cache:
    paths:
      - node_modules/

Expiration Guidelines:

  • Development builds: 1-7 days
  • Release builds: 30 days or never (for releases)
  • Test artifacts: 1 day
  • Default (if not set): 30 days

Cost impact: Shorter expiration means fewer artifacts retained at any one time, and therefore lower storage costs.
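The effect of expiration on storage can be estimated with simple arithmetic: at steady state, retained storage is roughly artifact size times pipelines per day times retention days. The sketch below assumes a hypothetical 200 MB artifact and reuses the 20 runs/day figure from earlier in this guide; neither is a measurement.

```python
def steady_state_storage_gb(artifact_mb: float, pipelines_per_day: int,
                            retention_days: int) -> float:
    """Approximate storage held once old artifacts expire as fast as new ones arrive."""
    return artifact_mb * pipelines_per_day * retention_days / 1024


# Hypothetical 200 MB artifact, 20 pipelines per day
one_day = steady_state_storage_gb(200, 20, 1)
thirty_days = steady_state_storage_gb(200, 20, 30)

print(f"expire_in 1 day:   {one_day:.1f} GB")
print(f"expire_in 30 days: {thirty_days:.1f} GB")
```

Under this model storage scales linearly with the retention period, which is why short expiration on development builds has the largest effect.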

Source: GitLab Job Artifacts Documentation

Troubleshooting

Cache Not Working

Symptom: Dependencies reinstalled on every run

Causes:

  1. Different runner tags between jobs
  2. Cache key changes unexpectedly
  3. Cache eviction due to size limits
  4. Incorrect paths (typo or wrong directory)

Debug:

debug-cache:
  script:
    - ls -la node_modules/ || echo "Cache miss - no node_modules"
    - echo "Cache key: $CI_COMMIT_REF_SLUG"
  cache:
    key: $CI_COMMIT_REF_SLUG  # Matches the key printed above
    paths:
      - node_modules/

Cache Download Slow

Symptom: Cache download takes longer than rebuilding

Solutions:

  1. Reduce cache size (exclude unnecessary files)
  2. Use more specific cache keys (avoid huge shared caches)
  3. Consider if cache is worth it for this job
  4. Use policy: pull-push only where necessary
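Whether a cache is worth keeping for a given job comes down to comparing the time spent restoring and saving it against the time it saves. The following is a rough sketch of that comparison; the function name and all timings are hypothetical and should be replaced with values read from job logs.

```python
def cache_is_worthwhile(restore_s: float, save_s: float,
                        install_warm_s: float, install_cold_s: float) -> bool:
    """True when total time with the cache beats installing from scratch."""
    return restore_s + save_s + install_warm_s < install_cold_s


# Hypothetical timings in seconds
# Small cache: quick to move, large saving
print(cache_is_worthwhile(restore_s=10, save_s=8, install_warm_s=20, install_cold_s=120))

# Huge cache: transfer costs more than a clean install
print(cache_is_worthwhile(restore_s=90, save_s=70, install_warm_s=20, install_cold_s=120))
```

Setting `save_s` to zero models a job with `policy: pull`, which is often enough to tip a borderline case in favour of keeping the cache.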

Docker Cache Miss

Symptom: Docker always rebuilds from scratch

Causes:

  1. Previous image not found (check registry)
  2. Dockerfile changed order (invalidated layers)
  3. Not using BuildKit with mode=max

Fix:

build:
  script:
    # Add error handling
    - docker pull $CI_REGISTRY_IMAGE:latest || echo "No cache image found"
    - docker build --cache-from $CI_REGISTRY_IMAGE:latest ...

Cache Conflicts

Symptom: Cache from wrong branch causes failures

Solution: Use file-based keys instead of branch-based:

cache:
  key:
    files:
      - package-lock.json  # Same lockfile = compatible cache

Performance Benchmarks

Expected Improvements

| Optimization | Time Savings | Cost Savings |
|---|---|---|
| Docker layer caching | 70-90% | High |
| Dependency caching (npm/pip) | 50-70% | High |
| Build cache (webpack) | 40-60% | Medium |
| Compiler cache (ccache) | 60-80% | High |
| Artifact reuse | 30-50% | Medium |

Measurement

benchmark-cache:
  script:
    # Measure with timing; the .npm cache is what npm ci can actually reuse
    - time npm ci --cache .npm --prefer-offline
  cache:
    paths:
      - .npm/

Expect:

  • First run (cold cache): Full install time
  • Second run (warm cache): 10-30% of original time
  • Subsequent runs: < 10% of original time

Last Updated: 2026-01-08
Priority: HIGH - Implement caching to reduce costs