Caching Deep Dive
Overview
Caching is one of the highest-impact optimizations for reducing CI/CD costs. Properly configured caches can reduce job duration by 30-70%, directly saving compute minutes.
Key Principle: Never download/install what you already have.
How GitLab Caching Works
Cache vs Artifacts
| Feature | Cache | Artifacts |
|---|---|---|
| Purpose | Speed up jobs | Pass data between jobs |
| Scope | Project-wide (by key) | Pipeline-specific |
| Guarantee | Best-effort | Guaranteed |
| Storage | Runner-local or object storage (S3, GCS, etc.) | GitLab storage |
| Size limit | Varies | 1 GB default |
| Use for | Dependencies | Build outputs |
Rule of Thumb:
- Use cache for dependencies (node_modules, pip cache, etc)
- Use artifacts for build outputs (dist/, binaries, etc)
Cache Lifecycle
Job start → Check cache key → Cache exists?
- No: Download dependencies → Run job → Upload cache
- Yes: Download cache → Run job → Upload cache (if policy allows)
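The lifecycle above can be sketched as a small simulation. This is an illustrative model only, not runner code: the `run_job_with_cache` function, the in-memory `store` dictionary, and the `install` job are all hypothetical names introduced here.

```python
from typing import Callable, Dict, List


def run_job_with_cache(
    store: Dict[str, List[str]],
    key: str,
    policy: str,
    job: Callable[[List[str]], List[str]],
) -> List[str]:
    """Simulate one job and return the cache-related steps it performed."""
    steps: List[str] = []
    files: List[str] = []

    # Download phase: only for policies that pull.
    if policy in ("pull", "pull-push"):
        if key in store:
            files = list(store[key])
            steps.append("download cache")
        else:
            # A missing cache is not an error; the job simply starts cold.
            steps.append("cache miss")

    files = job(files)
    steps.append("run job")

    # Upload phase: only for policies that push.
    if policy in ("push", "pull-push"):
        store[key] = list(files)
        steps.append("upload cache")
    return steps


def install(files: List[str]) -> List[str]:
    """Hypothetical job: installs dependencies only when nothing was restored."""
    return files if files else ["node_modules/"]


store: Dict[str, List[str]] = {}
first = run_job_with_cache(store, "lock-abc", "pull-push", install)
second = run_job_with_cache(store, "lock-abc", "pull", install)
print(first)   # cold start, then the cache is uploaded
print(second)  # warm start, nothing is uploaded under the pull policy
```

The second run shows why a missing cache never fails a job: the miss branch only changes how much work the job itself has to do.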
Cache Storage
GitLab stores caches in:
- SaaS (GitLab.com hosted runners): Google Cloud Storage
- Self-Managed: Local filesystem or configured object storage
Important: Caches are NOT guaranteed. If cache is unavailable, job proceeds without it.
Cache Key Strategies
The cache key determines when caches are shared or regenerated.
Static Keys (Simple)
Same cache for all branches/jobs:
    cache:
      key: "global-cache"
      paths:
        - node_modules/
Pros: Maximum reuse
Cons: No invalidation when dependencies change
Dynamic Keys (Recommended)
File-based keys (automatic invalidation):
    cache:
      key:
        files:
          - package-lock.json   # Regenerate when lockfile changes
      paths:
        - node_modules/
How it works:
- GitLab hashes package-lock.json
- Uses the hash as part of the cache key
- When the file changes, the hash changes and a new cache is created
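The idea behind file-based keys can be sketched as follows. This is a simplified illustration that hashes file contents directly; GitLab's real key derivation may differ in detail, and `file_based_key` is a hypothetical helper, not a GitLab API.

```python
import hashlib
from typing import List


def file_based_key(file_contents: List[bytes], prefix: str = "") -> str:
    """Derive a cache key from the contents of the listed files (illustrative)."""
    digest = hashlib.sha1()
    for content in file_contents:
        digest.update(content)
    key = digest.hexdigest()
    # An optional prefix separates caches per job or per branch.
    return f"{prefix}-{key}" if prefix else key


old_key = file_based_key([b'{"lodash": "4.17.20"}'])
same_key = file_based_key([b'{"lodash": "4.17.20"}'])
new_key = file_based_key([b'{"lodash": "4.17.21"}'])
prefixed = file_based_key([b'{"lodash": "4.17.20"}'], prefix="test")

print(old_key == same_key)  # unchanged lockfile: same key, cache is reused
print(old_key == new_key)   # changed lockfile: different key, new cache
```

Because the key is a pure function of the lockfile, invalidation needs no manual step: any dependency change produces a key that has no cache yet.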
Key with Prefix
Per-job or per-branch caching:
    cache:
      key:
        files:
          - package-lock.json
        prefix: $CI_JOB_NAME   # Different cache per job
      paths:
        - node_modules/
Per-branch:
    cache:
      key:
        files:
          - package-lock.json
        prefix: $CI_COMMIT_REF_SLUG   # Different cache per branch
      paths:
        - node_modules/
Composite Keys
Multiple files:
    cache:
      key:
        files:                   # cache:key:files accepts at most two entries
          - package-lock.json
          - .gitlab-ci.yml       # Invalidate when CI config changes
        prefix: $CI_JOB_NAME
      paths:
        - node_modules/
Cache Key Hierarchy
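A list under cache defines several independent caches for one job (at most four), ordered here from specific to general:
    cache:
      - key: "$CI_COMMIT_REF_SLUG-$CI_JOB_NAME"
        paths:
          - node_modules/
      - key: "$CI_COMMIT_REF_SLUG"
        paths:
          - node_modules/
      - key: "default"
        paths:
          - node_modules/
Note: Every cache in the list is restored; the list is not a lookup order. For a true Branch+Job → Branch → Default lookup, use fallback_keys (see Cache Fallback Keys).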
Cache Scope
Project-Scoped (Default)
Cache shared across all branches/jobs in the same project.
    cache:
      key: "shared"
      paths:
        - node_modules/
Best for: Single-project workflows
Branch-Scoped
Separate cache per branch.
    cache:
      key: "$CI_COMMIT_REF_SLUG"
      paths:
        - node_modules/
Best for: Long-lived feature branches with different dependencies
Job-Scoped
Separate cache per job.
    cache:
      key:
        prefix: "$CI_JOB_NAME"
        files:
          - package-lock.json
      paths:
        - node_modules/
Best for: Jobs with different dependency sets
Cache Fallback Keys
GitLab 16.0+: Use fallback keys for better cache reuse.
Basic Fallback
    cache:
      - key: "cache-$CI_COMMIT_REF_SLUG"
        fallback_keys:
          - "cache-$CI_DEFAULT_BRANCH"   # Try main branch
          - "cache-default"              # Last resort
        paths:
          - node_modules/
How it works:
- Try branch-specific cache
- If not found, try main branch cache
- If not found, try default cache
- If none exist, proceed without cache
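The lookup order described above can be sketched as a first-match search. This is an illustrative model, not runner code; `resolve_cache` and the in-memory `store` are hypothetical names.

```python
from typing import Dict, List, Optional


def resolve_cache(
    store: Dict[str, str], key: str, fallback_keys: List[str]
) -> Optional[str]:
    """Return the first key that has a stored cache, or None if none exist."""
    for candidate in [key, *fallback_keys]:
        if candidate in store:
            return candidate
    # No cache found anywhere: the job proceeds without one.
    return None


store = {"cache-main": "archive-main", "cache-default": "archive-default"}
fallbacks = ["cache-main", "cache-default"]

hit = resolve_cache(store, "cache-feature-x", fallbacks)
miss = resolve_cache({}, "cache-feature-x", fallbacks)
print(hit)   # a new branch falls back to the main branch cache
print(miss)  # nothing stored at all
```

Ordering the fallbacks from most to least specific matters: the first match wins, so a broad key listed early would shadow a better one listed later.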
Advanced Fallback Chain
    cache:
      - key:
          files:
            - package-lock.json
          prefix: "$CI_COMMIT_REF_SLUG"
        fallback_keys:   # Entries must be plain strings, not files: mappings
          # Any cache from main branch
          - "$CI_DEFAULT_BRANCH-default"
          # Global fallback
          - "global-cache"
        paths:
          - node_modules/
Benefits:
- New branches inherit cache from main
- Reduces initial build time on new branches
- Graceful degradation
Cache Policy
Controls when cache is downloaded/uploaded.
pull-push (Default)
Download before job, upload after job:
    cache:
      policy: pull-push   # Default
Use for: Jobs that modify dependencies (install, update)
Cost: 2x cache operations per job
pull (Recommended for Most Jobs)
Download only, don't upload:
    cache:
      policy: pull
Use for: Jobs that only read dependencies (test, lint)
Benefit: 50% cache operation reduction
push
Upload only, don't download:
    cache:
      policy: push
Use for: Initial setup jobs that create cache
Combined Strategy
    # Job that installs dependencies
    install:
      stage: .pre
      cache:
        key:
          files:
            - package-lock.json
        paths:
          - node_modules/
        policy: pull-push   # Create/update cache
      script:
        - npm ci --prefer-offline
    # Jobs that use dependencies
    test:
      cache:
        key:
          files:
            - package-lock.json
        paths:
          - node_modules/
        policy: pull   # Only download
      script:
        - npm test
    lint:
      cache:
        key:
          files:
            - package-lock.json
        paths:
          - node_modules/
        policy: pull   # Only download
      script:
        - npm run lint
Savings: 40-60% reduction in cache upload operations
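The size of the saving depends on the ratio of read-only jobs to cache-writing jobs. A minimal sketch of the arithmetic, assuming every job downloads once and only jobs that push upload once (`cache_operations` is a hypothetical helper):

```python
from typing import Dict


def cache_operations(
    install_jobs: int, read_only_jobs: int, read_only_policy: str
) -> Dict[str, int]:
    """Count cache downloads and uploads for one pipeline run."""
    downloads = install_jobs + read_only_jobs
    uploads = install_jobs
    if read_only_policy == "pull-push":
        # Read-only jobs left on the default policy also upload.
        uploads += read_only_jobs
    return {"downloads": downloads, "uploads": uploads}


# The pipeline above: one install job plus test and lint.
before = cache_operations(1, 2, "pull-push")
after = cache_operations(1, 2, "pull")
upload_reduction = 1 - after["uploads"] / before["uploads"]

print(before, after)
print(f"{upload_reduction:.0%} fewer uploads")
```

With one install job and a single read-only job the same calculation gives a 50% reduction, so the figure for a real pipeline follows from its own job mix.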
Dependency Caching by Language
Node.js / npm
Basic:
    cache:
      key:
        files:
          - package-lock.json
      paths:
        - node_modules/
Advanced (with npm cache):
    variables:
      npm_config_cache: "$CI_PROJECT_DIR/.npm"
    cache:
      key:
        files:
          - package-lock.json
      paths:
        - node_modules/
        - .npm/   # npm cache directory
    before_script:
      # npm ci removes node_modules before installing, so .npm/ is what speeds up this command
      - npm ci --prefer-offline --no-audit
Yarn:
    cache:
      key:
        files:
          - yarn.lock
      paths:
        - node_modules/
        - .yarn/cache/
    before_script:
      - yarn install --frozen-lockfile --cache-folder .yarn/cache
Python / pip
Basic:
    variables:
      PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
    cache:
      key:
        files:
          - requirements.txt
      paths:
        - .cache/pip/
        - venv/
    before_script:
      - python -m venv venv
      - source venv/bin/activate
      - pip install -r requirements.txt
Poetry:
    variables:
      POETRY_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pypoetry"
    cache:
      key:
        files:
          - poetry.lock
        prefix: "$CI_JOB_NAME"
      paths:
        - .venv/
        - .cache/pypoetry/
    before_script:
      - poetry config virtualenvs.in-project true
      - poetry install --no-root
Ruby / Bundler
    variables:
      BUNDLE_PATH: "$CI_PROJECT_DIR/vendor/bundle"
    cache:
      key:
        files:
          - Gemfile.lock
      paths:
        - vendor/bundle/
    before_script:
      # The install path comes from BUNDLE_PATH; the --path flag is deprecated
      - bundle install --jobs $(nproc)
Go
    variables:
      GOPATH: "$CI_PROJECT_DIR/.go"
    cache:
      key:
        files:
          - go.sum
      paths:
        - .go/pkg/mod/
    before_script:
      - go mod download
Rust / Cargo
    variables:
      CARGO_HOME: "$CI_PROJECT_DIR/.cargo"
    cache:
      key:
        files:
          - Cargo.lock
      paths:
        - .cargo/
        - target/
    before_script:
      - cargo fetch
Docker Layer Caching
Separate from GitLab cache - uses Docker registry.
Problem
Building Docker images from scratch every time:
    FROM node:20
    COPY package*.json ./
    RUN npm install   # Downloads packages every time
    COPY . .
    RUN npm run build
Solution 1: BuildKit Registry Cache
Enable BuildKit:
    variables:
      DOCKER_BUILDKIT: 1
    build:
      image: docker:24
      services:
        - docker:24-dind
      script:
        - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
        - |
          docker buildx create --use
          docker buildx build \
            --cache-from type=registry,ref=$CI_REGISTRY_IMAGE:cache \
            --cache-to type=registry,ref=$CI_REGISTRY_IMAGE:cache,mode=max \
            --tag $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA \
            --push \
            .
Benefits:
- Reuses all layers across builds
- mode=max stores intermediate layers
- 40-70% faster builds
Solution 2: Multi-Stage Build Caching
    build:
      script:
        - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
        # Pull previous images for cache
        - docker pull $CI_REGISTRY_IMAGE:builder || true
        - docker pull $CI_REGISTRY_IMAGE:latest || true
        # Build with cache. With BuildKit, --cache-from only reuses images that
        # embed inline cache metadata, hence BUILDKIT_INLINE_CACHE=1.
        - |
          docker build \
            --target builder \
            --build-arg BUILDKIT_INLINE_CACHE=1 \
            --cache-from $CI_REGISTRY_IMAGE:builder \
            --tag $CI_REGISTRY_IMAGE:builder \
            .
        - |
          docker build \
            --build-arg BUILDKIT_INLINE_CACHE=1 \
            --cache-from $CI_REGISTRY_IMAGE:builder \
            --cache-from $CI_REGISTRY_IMAGE:latest \
            --tag $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA \
            --tag $CI_REGISTRY_IMAGE:latest \
            .
        - docker push $CI_REGISTRY_IMAGE:builder
        - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
        - docker push $CI_REGISTRY_IMAGE:latest
Dockerfile:
    FROM node:20 AS builder
    WORKDIR /app
    COPY package*.json ./
    RUN npm ci --omit=dev
    FROM node:20-alpine
    WORKDIR /app
    # Copy installed dependencies out of the builder stage
    COPY --from=builder /app/node_modules ./node_modules
    COPY . .
    RUN npm run build
Solution 3: Kaniko
Google's Kaniko for caching without Docker daemon:
    build:
      image:
        name: gcr.io/kaniko-project/executor:debug
        entrypoint: [""]
      script:
        - |
          /kaniko/executor \
            --context $CI_PROJECT_DIR \
            --dockerfile $CI_PROJECT_DIR/Dockerfile \
            --destination $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA \
            --cache=true \
            --cache-repo $CI_REGISTRY_IMAGE/cache
Benefits:
- No Docker-in-Docker needed
- Built-in layer caching
- More efficient in GitLab
Build Artifact Caching
Incremental Builds
Cache build outputs between runs:
    build:
      cache:
        key: "$CI_COMMIT_REF_SLUG"
        paths:
          - node_modules/
          - .next/cache/   # Next.js build cache
          - dist/.cache/   # Custom build cache
      script:
        - npm run build
      artifacts:
        paths:
          - dist/
        expire_in: 1 day
Webpack/Rollup Cache
    cache:
      key:
        files:
          - package-lock.json
          - webpack.config.js
      paths:
        - node_modules/
        - .webpack-cache/
webpack.config.js:
    const path = require('path');
    module.exports = {
      cache: {
        type: 'filesystem',
        // Must match the .webpack-cache/ path cached by the CI job
        cacheDirectory: path.resolve(__dirname, '.webpack-cache'),
      },
    };
Cache Expiration and Cleanup
Automatic Expiration
GitLab automatically removes caches:
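- Not updated within the retention window (14 days on GitLab.com hosted runners; on self-managed instances this depends on the cache storage configuration)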
- Exceeding storage quota
- Manually cleared
Manual Cleanup
UI: Project → CI/CD → Pipelines → Clear runner caches
API:
    # Clear project cache
    # Confirm this endpoint in the API reference for your GitLab version before relying on it
    curl --request POST \
      --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
      "https://gitlab.com/api/v4/projects/:id/jobs/cache"
Versioned Caches
Force cache invalidation with version prefix:
    cache:
      key:
        files:
          - package-lock.json
        prefix: "v2-$CI_JOB_NAME"   # Increment when needed
      paths:
        - node_modules/
When to bump version:
- Major dependency changes
- CI configuration changes
- Cache corruption suspected
Troubleshooting Cache Issues
Cache Not Being Used
Symptoms: Jobs always download dependencies
Causes:
- Cache key changes every run
- Cache upload failed (quota, permissions)
- Cache storage unavailable
Debug:
    test:
      script:
        # The resolved cache key is printed in the job log ("Checking cache for <key>...")
        - ls -la node_modules/ || echo "Cache miss"
        - npm ci
        - ls -la node_modules/
Fix:
- Use stable cache keys (file-based)
- Check runner logs for upload errors
- Verify cache storage configuration
Cache Corruption
Symptoms: Jobs fail with "module not found" errors
Fix:
    # Clear cache and rebuild
    cache:
      key:
        files:
          - package-lock.json
        prefix: "v2"   # Increment version
      paths:
        - node_modules/
Or manually clear: Project → CI/CD → Clear runner caches
Slow Cache Download
Symptoms: 5+ minutes to download cache
Causes:
- Cache too large (>500 MB)
- Network latency to cache storage
Fix:
- Reduce cache size (cache only the directories you need)
- Delete unneeded files before the job ends; cache:paths accepts glob patterns but has no exclude option
- Consider splitting into multiple caches
    test:
      cache:
        key:
          files:
            - package-lock.json
        paths:
          - node_modules/
      script:
        - npm ci && npm test
        - rm -rf node_modules/.cache/               # Drop large subdirs
        - find node_modules -name '*.md' -delete    # Drop docs
Cache Quota Exceeded
Symptoms: Warning in job logs about cache quota
Fix:
- Delete old caches
- Reduce cache size
- Use cache expiration
- Contact admin to increase quota
Advanced Patterns
Monorepo Caching
Problem: Different services have different dependencies
    # Shared cache config
    .cache_template:
      cache:
        key:
          files:
            - $SERVICE_DIR/package-lock.json
          prefix: "$CI_JOB_NAME"
        paths:
          - $SERVICE_DIR/node_modules/
        policy: pull
    # Service-specific jobs
    test:agent-mesh:
      extends: .cache_template
      variables:
        SERVICE_DIR: "services/agent-mesh"
      script:
        - cd services/agent-mesh
        - npm test
    test:agent-router:
      extends: .cache_template
      variables:
        SERVICE_DIR: "services/agent-router"
      script:
        - cd services/agent-router
        - npm test
Matrix Builds with Caching
Different Node versions:
    test:
      parallel:
        matrix:
          - NODE_VERSION: ["18", "20", "22"]
      image: node:${NODE_VERSION}
      cache:
        key:
          files:
            - package-lock.json
          prefix: "node-${NODE_VERSION}"
        paths:
          - node_modules/   # Without paths, nothing is cached
      script:
        - npm ci
        - npm test
Conditional Caching
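    # Only update the cache on the main branch.
    # !reference cannot be indexed by a variable, so the policy is set through
    # rules:variables (needs a GitLab version that supports variables in cache:policy).
    build:
      rules:
        - if: $CI_COMMIT_BRANCH == "main"
          variables:
            CACHE_POLICY: pull-push
        - if: $CI_COMMIT_BRANCH != "main"
          variables:
            CACHE_POLICY: pull   # All other branches: pull only
      cache:
        key: "$CI_COMMIT_REF_SLUG"
        paths:
          - node_modules/
        policy: $CACHE_POLICY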
Measuring Cache Effectiveness
Key Metrics
Cache Hit Rate:
Cache Hit Rate = (Jobs with cache / Total jobs) × 100%
Target: >80%
Time Savings:
Time Saved = (Avg time without cache - Avg time with cache) × Job count
Cost Savings:
Cost Saved = Time Saved × Cost Factor × ($10 / 1000 minutes)
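The three metrics above can be worked through with a short calculation. This is a sketch with made-up inputs (1000 jobs, 850 cache hits, 8 versus 5 minutes per job, cost factor 1.0); the function names are hypothetical and the $10 per 1000 minutes price is taken from the formula above.

```python
def cache_hit_rate(jobs_with_cache: int, total_jobs: int) -> float:
    """Percentage of jobs that restored a cache."""
    return 100 * jobs_with_cache / total_jobs


def time_saved_minutes(
    avg_without_cache: float, avg_with_cache: float, job_count: int
) -> float:
    """Total minutes saved across all jobs."""
    return (avg_without_cache - avg_with_cache) * job_count


def cost_saved_dollars(
    time_saved: float, cost_factor: float, price_per_1000_minutes: float = 10.0
) -> float:
    """Convert saved minutes into dollars at the given price."""
    return time_saved * cost_factor * price_per_1000_minutes / 1000


# Illustrative inputs only, not measured data.
hit_rate = cache_hit_rate(jobs_with_cache=850, total_jobs=1000)
saved = time_saved_minutes(avg_without_cache=8.0, avg_with_cache=5.0, job_count=1000)
cost = cost_saved_dollars(saved, cost_factor=1.0)

print(f"hit rate {hit_rate:.1f}%, saved {saved:.0f} min, ${cost:.2f}")
```

Keeping the three steps as separate functions makes it easy to plug in per-runner cost factors or a different price without touching the hit-rate logic.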
Monitoring
Add cache hit detection:
    test:
      before_script:
        - |
          if [ -d "node_modules" ]; then
            echo "Cache hit"
          else
            echo "Cache miss"
          fi
        - npm ci --prefer-offline
Track in CI/CD variables:
    script:
      - |
        if [ -d "node_modules" ]; then
          export CACHE_HIT=1
        else
          export CACHE_HIT=0
        fi
      - echo "CACHE_HIT=$CACHE_HIT" >> metrics.env
    artifacts:
      reports:
        dotenv: metrics.env   # dotenv reports are declared under artifacts:reports
Best Practices Summary
- Use file-based cache keys (package-lock.json, poetry.lock)
- Add fallback keys for better reuse across branches
- Use pull-only policy for read-only jobs
- Cache both dependencies and package manager cache (.npm, .cache/pip)
- Enable Docker layer caching for build jobs
- Version your cache keys when forcing invalidation
- Monitor cache hit rate (target >80%)
- Keep cache size reasonable (<500 MB)
- Use separate caches for different dependency sets
- Clear cache when corrupted
Example: Complete Caching Setup
    variables:
      npm_config_cache: "$CI_PROJECT_DIR/.npm"
      DOCKER_BUILDKIT: 1
    # Default cache configuration
    default:
      cache:
        - key:
            files:
              - package-lock.json   # No prefix, so this matches the key the install job writes
          fallback_keys:
            - "default-cache"       # fallback_keys entries must be plain strings
          paths:
            - node_modules/
            - .npm/
          policy: pull
    # Install job creates cache
    install:
      stage: .pre
      cache:
        - key:
            files:
              - package-lock.json
          paths:
            - node_modules/
            - .npm/
          policy: pull-push
      script:
        - npm ci --prefer-offline --no-audit
      artifacts:
        paths:
          - node_modules/
        expire_in: 1 hour
    # All other jobs use cache (pull-only from default)
    lint:
      script:
        - npm run lint
    test:
      script:
        - npm test
    # Docker build with layer caching
    build:docker:
      image: docker:24
      services:
        - docker:24-dind
      cache: []   # No npm cache needed
      script:
        - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
        - |
          docker buildx create --use
          docker buildx build \
            --cache-from type=registry,ref=$CI_REGISTRY_IMAGE:cache \
            --cache-to type=registry,ref=$CI_REGISTRY_IMAGE:cache,mode=max \
            --tag $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA \
            --push \
            .
Next Steps
- Pipeline Optimization - Combine caching with other optimizations
- Monitoring - Track cache effectiveness over time
- Strategies - See caching in context of overall cost reduction