CI/CD Analytics in GitLab
Overview
GitLab CI/CD Analytics provides comprehensive insights into pipeline performance, success rates, and efficiency metrics. These analytics help teams identify bottlenecks, optimize build times, and improve overall DevOps performance.
What is CI/CD Analytics?
CI/CD Analytics enables you to:
- Track pipeline performance: Monitor execution times and success rates
- Identify bottlenecks: Find slow jobs and stages
- Optimize efficiency: Reduce pipeline duration and costs
- Monitor trends: Track improvements over time
- Correlate with deployments: Link CI/CD performance to releases
Accessing CI/CD Analytics
Project-Level Analytics
Navigate to: Analyze > CI/CD analytics
Group-Level Analytics
Navigate to: Group > Analyze > CI/CD analytics
Key Metrics
1. Pipeline Statistics
Track overall pipeline health:
Pipeline Overview (Last 30 days)
Total Pipelines: 1,234
Success Rate: 87.5%
Failure Rate: 12.5%
Median Duration: 8m 45s
P95 Duration: 15m 30s
Metrics Included:
- Total pipeline runs: Count of all pipelines
- Success rate: Percentage of successful pipelines
- Failure rate: Percentage of failed pipelines
- Duration statistics: Median and 95th percentile
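As a rough illustration of how these figures relate to raw pipeline data, the following sketch computes them from a list of (status, duration) pairs. The function names and sample values are hypothetical, it treats every non-successful pipeline as a failure, and it uses a nearest-rank percentile, which may differ slightly from the method GitLab uses internally.

```python
import math
from statistics import median


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(pipelines):
    """Summarize a list of (status, duration_seconds) tuples (hypothetical input shape)."""
    total = len(pipelines)
    succeeded = sum(1 for status, _ in pipelines if status == "success")
    durations = [duration for _, duration in pipelines]
    return {
        "total": total,
        "success_rate_pct": round(succeeded / total * 100, 1),
        # Assumption: anything that is not "success" counts as a failure.
        "failure_rate_pct": round((total - succeeded) / total * 100, 1),
        "median_sec": median(durations),
        "p95_sec": percentile(durations, 95),
    }


# Made-up sample data for illustration only.
sample = [
    ("success", 500), ("success", 525), ("failed", 930), ("success", 480),
    ("success", 610), ("success", 550), ("success", 470), ("failed", 900),
]
stats = summarize(sample)
print(stats)
```

With only a handful of runs the P95 is simply the slowest pipeline, so percentile figures become meaningful only over larger samples such as the 30-day window shown above.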
2. Pipeline Duration Trends
Visualize performance over time:
Pipeline Duration Over Time (chart: pipeline duration from 2m to 20m, by week, Week 1 to Week 4)
Median (P50): 8m 45s
P95: 15m 30s
Trend: +15% increase
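The trend figure is a relative change between two points in the period. A minimal sketch, assuming the trend compares the first and last weekly medians (the weekly values below are invented for illustration, with the last one matching the 8m 45s median):

```python
def trend_pct(first, last):
    """Relative change from first to last, in percent (positive means slower)."""
    return round((last - first) / first * 100, 1)


# Hypothetical weekly median durations in seconds, Week 1 to Week 4.
weekly_median_sec = [457, 470, 498, 525]
change = trend_pct(weekly_median_sec[0], weekly_median_sec[-1])
print(f"Trend: {change:+.1f}%")
```

A positive value means pipelines are getting slower, which is the direction worth investigating.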
3. Success/Failure Rates
Track reliability:
Pipeline Success Rate (chart: success rate from 60% to 100%, by month, Jan to May)
Success: 87.5%
Failures: 12.5%
- Build errors: 5.5%
- Test failures: 4.0%
- Deployment errors: 3.0%
Enhanced Analytics with ClickHouse
GitLab 18.0+ uses ClickHouse for improved analytics performance.
Benefits
- Faster queries: Sub-second response times
- Larger datasets: Analyze years of history
- Complex aggregations: Multi-dimensional analysis
- Real-time updates: Near-instantaneous data
Example Queries
    -- Pipeline duration by branch
    SELECT
        ref as branch,
        count() as pipeline_count,
        avg(duration) as avg_duration_sec,
        quantile(0.95)(duration) as p95_duration_sec
    FROM ci_pipelines
    WHERE created_at >= now() - INTERVAL 30 DAY
    GROUP BY branch
    ORDER BY pipeline_count DESC
    LIMIT 10;

    -- Failure rate by stage
    SELECT
        stage,
        countIf(status = 'failed') as failures,
        count() as total,
        round(failures / total * 100, 2) as failure_rate_pct
    FROM ci_jobs
    WHERE created_at >= now() - INTERVAL 7 DAY
    GROUP BY stage
    ORDER BY failure_rate_pct DESC;

    -- Most time-consuming jobs
    SELECT
        name,
        count() as executions,
        avg(duration) as avg_duration_sec,
        sum(duration) as total_duration_sec
    FROM ci_jobs
    WHERE created_at >= now() - INTERVAL 30 DAY
    GROUP BY name
    ORDER BY total_duration_sec DESC
    LIMIT 20;
Filtering and Segmentation
Filter by Pipeline Trigger
Analyze pipelines by trigger source:
- Push: Commits to branches
- Merge request: MR pipelines
- Schedule: Scheduled pipelines
- Manual: Manually triggered
- API: API-triggered pipelines
- Web: GitLab UI triggers
    -- Performance by trigger source
    SELECT
        source,
        count() as pipeline_count,
        avg(duration) as avg_duration_sec,
        countIf(status = 'success') / count() as success_rate
    FROM ci_pipelines
    WHERE created_at >= now() - INTERVAL 30 DAY
    GROUP BY source
    ORDER BY pipeline_count DESC;
Filter by Branch
Compare pipeline performance across branches:
    -- Main vs. feature branch performance
    SELECT
        CASE
            WHEN ref IN ('main', 'master', 'development') THEN 'Protected'
            ELSE 'Feature'
        END as branch_type,
        count() as pipeline_count,
        avg(duration) as avg_duration_sec,
        countIf(status = 'failed') / count() as failure_rate
    FROM ci_pipelines
    WHERE created_at >= now() - INTERVAL 30 DAY
    GROUP BY branch_type;
Filter by Date Range
Analyze specific time periods:
- Last 7 days
- Last 30 days
- Last 90 days
- Custom date range
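The same filters can be applied outside the UI. The sketch below builds a request URL for the GitLab REST endpoint that lists project pipelines, using its `updated_after`, `source`, and `ref` query parameters. The helper name, host, and project path are hypothetical, and the date filter here is based on the pipeline's update time rather than its creation time.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import quote, urlencode


def pipelines_url(base_url, project_path, days, now, source=None, ref=None):
    """Build a 'list project pipelines' URL limited to the last `days` days."""
    since = (now - timedelta(days=days)).strftime("%Y-%m-%dT%H:%M:%SZ")
    params = {"updated_after": since, "per_page": 100}
    if source:
        params["source"] = source  # for example "push", "schedule", "merge_request_event"
    if ref:
        params["ref"] = ref        # branch or tag name
    # The project path must be URL-encoded when used in place of a numeric ID.
    project = quote(project_path, safe="")
    return f"{base_url}/api/v4/projects/{project}/pipelines?{urlencode(params)}"


# Hypothetical host and project; pass the current UTC time in real use.
url = pipelines_url(
    "https://gitlab.example.com",
    "my-org/my-project",
    days=30,
    now=datetime(2024, 5, 31, tzinfo=timezone.utc),
    source="schedule",
)
print(url)
```

The request itself would need an access token header and pagination handling, both omitted here to keep the sketch short.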
Job-Level Analytics
Job Performance
Identify slow jobs:
Top 10 Slowest Jobs (Last 30 days)
Job Name             Avg Duration  Runs
build:production     12m 45s       234
test:integration     8m 30s        456
security:sast        7m 15s        234
test:e2e             6m 50s        456
deploy:production    5m 30s        89
build:docker         4m 45s        234
test:unit            3m 20s        456
lint:code            2m 10s        456
security:dependency  1m 45s        234
deploy:staging       1m 30s        178
Job Failure Analysis
Find unreliable jobs:
    -- Jobs with highest failure rates
    SELECT
        name,
        count() as total_runs,
        countIf(status = 'failed') as failures,
        round(failures / total_runs * 100, 2) as failure_rate_pct,
        -- Three most frequent failure reasons, taken from failed runs only
        topKIf(3)(failure_reason, status = 'failed') as common_reasons
    FROM ci_jobs
    WHERE created_at >= now() - INTERVAL 30 DAY
      AND status IN ('failed', 'success')
    GROUP BY name
    HAVING failure_rate_pct > 5
    ORDER BY failure_rate_pct DESC
    LIMIT 10;
Stage-Level Analytics
Stage Performance
Analyze pipeline stages:
Stage Performance (Last 30 days)
Stage     Avg Duration  Success Rate
build     5m 30s        95.2%
test      8m 45s        87.5%
security  4m 20s        92.1%
deploy    3m 15s        98.5%
Bottleneck: test stage (longest duration)
Unreliable: test stage (lowest success rate)
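The two conclusions above are simply the maximum of one column and the minimum of the other. A small sketch using the figures from the stage table (durations converted to seconds; the dictionary layout and function names are assumptions for illustration):

```python
# Stage name -> (average duration in seconds, success rate in percent)
stages = {
    "build": (330, 95.2),
    "test": (525, 87.5),
    "security": (260, 92.1),
    "deploy": (195, 98.5),
}


def slowest_stage(stage_stats):
    """Stage with the longest average duration (the bottleneck)."""
    return max(stage_stats, key=lambda name: stage_stats[name][0])


def least_reliable_stage(stage_stats):
    """Stage with the lowest success rate."""
    return min(stage_stats, key=lambda name: stage_stats[name][1])


print("Bottleneck:", slowest_stage(stages))
print("Unreliable:", least_reliable_stage(stages))
```

When both measures point at the same stage, as they do here, that stage is usually the best place to start optimizing.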
Stage Optimization
Optimize slow stages:
    # Before: Sequential tests (slow)
    test:
      stage: test
      script:
        - npm run test:unit         # 2 minutes
        - npm run test:integration  # 5 minutes
        - npm run test:e2e          # 8 minutes
      # Total: 15 minutes

    # After: Parallel tests (fast)
    test:unit:
      stage: test
      script:
        - npm run test:unit         # 2 minutes

    test:integration:
      stage: test
      script:
        - npm run test:integration  # 5 minutes

    test:e2e:
      stage: test
      script:
        - npm run test:e2e          # 8 minutes

    # Total: 8 minutes (parallel execution)
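The totals in the example follow from a simple rule: sequential steps add up, while parallel jobs take as long as the slowest one. The sketch below works through that arithmetic, assuming enough runners are free to start all three jobs at once and ignoring per-job startup overhead.

```python
# Job durations in minutes, taken from the example above.
job_minutes = {"test:unit": 2, "test:integration": 5, "test:e2e": 8}

sequential_total = sum(job_minutes.values())  # one job runs the suites back to back
parallel_total = max(job_minutes.values())    # separate jobs finish with the slowest suite
saved_pct = round((sequential_total - parallel_total) / sequential_total * 100)

print(f"Sequential: {sequential_total}m, parallel: {parallel_total}m, saved: {saved_pct}%")
```

Because the parallel total equals the longest job, further gains have to come from splitting or speeding up test:e2e itself.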
Pipeline Efficiency Metrics
1. Cycle Time
Time from commit to deployment:
Cycle Time Breakdown
Commit to Pipeline Start: 30s
Build Stage: 5m 30s
Test Stage: 8m 45s
Security Stage: 4m 20s
Deploy Stage: 3m 15s
Total Cycle Time: 22m 20s
Target: < 20 minutes
Current: 22m 20s (above target)
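To make the breakdown easy to re-check as stage durations change, the following sketch parses the duration strings, sums them, and compares the result with the target. The helper names are hypothetical and the parser only handles the `Xm Ys` format used on this page.

```python
def parse_duration(text):
    """Convert a string such as '5m 30s' or '30s' to seconds."""
    total = 0
    for part in text.split():
        if part.endswith("m"):
            total += int(part[:-1]) * 60
        elif part.endswith("s"):
            total += int(part[:-1])
        else:
            raise ValueError(f"Unrecognized duration part: {part!r}")
    return total


def format_duration(seconds):
    """Convert seconds back to the 'Xm YYs' form."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}m {secs:02d}s"


breakdown = ["30s", "5m 30s", "8m 45s", "4m 20s", "3m 15s"]
cycle_time = sum(parse_duration(item) for item in breakdown)
target = parse_duration("20m")
over_target = cycle_time - target

print("Total cycle time:", format_duration(cycle_time))
print("Over target by:", format_duration(over_target))
```

In this breakdown the test stage alone accounts for more than a third of the cycle time, so shortening it is the most direct way to close the gap to the target.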
2. Queue Time
Time waiting for runner:
    -- Jobs with longest queue times
    SELECT
        name,
        avg(queued_duration) as avg_queue_sec,
        max(queued_duration) as max_queue_sec,
        count() as job_count
    FROM ci_jobs
    WHERE created_at >= now() - INTERVAL 7 DAY
    GROUP BY name
    ORDER BY avg_queue_sec DESC
    LIMIT 10;
3. Runner Utilization
Track runner efficiency:
    -- Runner utilization
    SELECT
        runner_id,
        count() as jobs_executed,
        sum(duration) as total_busy_time_sec,
        avg(duration) as avg_job_duration_sec
    FROM ci_jobs
    WHERE created_at >= now() - INTERVAL 7 DAY
      AND runner_id IS NOT NULL
    GROUP BY runner_id
    ORDER BY jobs_executed DESC;
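The busy time returned by the query becomes a utilization percentage once it is divided by the capacity available in the same window. A minimal sketch, assuming each runner was online for the whole window and that `concurrent` is the number of jobs it is configured to run at once (the function name and sample figures are hypothetical):

```python
def utilization_pct(busy_seconds, window_days, concurrent=1):
    """Share of the runner's available job-seconds that were spent running jobs."""
    capacity_seconds = window_days * 24 * 3600 * concurrent
    return round(busy_seconds / capacity_seconds * 100, 1)


# A runner that was busy for 84 hours over a 7-day window.
busy = 84 * 3600
single_slot = utilization_pct(busy, window_days=7)
two_slots = utilization_pct(busy, window_days=7, concurrent=2)
print(single_slot, two_slots)
```

Consistently high utilization together with long queue times suggests adding capacity, while low utilization suggests the fleet could be reduced.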
Pipeline Optimization Strategies
1. Parallelize Jobs
Run independent jobs concurrently:
    # Use 'needs' for fine-grained parallelism
    build:app:
      stage: build
      script: npm run build

    test:unit:
      stage: test
      needs: ["build:app"]
      script: npm run test:unit

    test:integration:
      stage: test
      needs: ["build:app"]
      script: npm run test:integration

    # test:unit and test:integration both start as soon as build:app
    # finishes and run in parallel
2. Cache Dependencies
Reduce installation time:
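    # The anchor sits on the cache mapping itself so it can be merged
    # directly under each job's 'cache' key
    .node_cache:
      cache: &node_cache
        key:
          files:
            - package-lock.json
        paths:
          - node_modules/
        policy: pull

    install:
      stage: .pre
      script:
        - npm ci
      cache:
        <<: *node_cache
        policy: push   # only this job uploads the cache

    build:
      stage: build
      script:
        - npm run build
      cache:
        <<: *node_cache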
3. Optimize Docker Builds
Use layer caching:
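    build:docker:
      stage: build
      script:
        # Pull the previous image so its layers are available to --cache-from;
        # '|| true' keeps the job going when no image exists yet
        - docker pull $CI_REGISTRY_IMAGE:latest || true
        - >
          docker build
          --cache-from $CI_REGISTRY_IMAGE:latest
          --tag $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
          --tag $CI_REGISTRY_IMAGE:latest
          .
        - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
        - docker push $CI_REGISTRY_IMAGE:latest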
4. Skip Unnecessary Jobs
Use rules to conditionally run jobs:
    test:e2e:
      stage: test
      script:
        - npm run test:e2e
      rules:
        # Only run E2E tests for MRs and main branch
        - if: $CI_PIPELINE_SOURCE == "merge_request_event"
        - if: $CI_COMMIT_BRANCH == "main"
        # Skip for feature branches
        - when: never
5. Use Faster Runners
Optimize runner configuration:
- Use SSD storage
- Increase CPU cores
- Add more RAM
- Use spot/preemptible instances for cost savings
Integration with External Tools
GitLab CI Pipelines Exporter
Export metrics to Prometheus:
    # Install gitlab-ci-pipelines-exporter
    # https://github.com/mvisonneau/gitlab-ci-pipelines-exporter

    # Config file
    gitlab:
      url: https://gitlab.com
      token: ${GITLAB_TOKEN}

    projects:
      - name: my-org/my-project
        refs:
          - main
          - development

    metrics:
      - name: pipeline_duration_seconds
        kind: gauge
        labels:
          - project
          - ref
          - status
Grafana Dashboards
Visualize CI/CD metrics:
    {
      "dashboard": {
        "title": "CI/CD Performance",
        "panels": [
          {
            "title": "Pipeline Success Rate",
            "targets": [
              {
                "expr": "sum(rate(gitlab_ci_pipeline_status{status=\"success\"}[5m])) / sum(rate(gitlab_ci_pipeline_status[5m]))"
              }
            ]
          },
          {
            "title": "Pipeline Duration (P95)",
            "targets": [
              {
                "expr": "histogram_quantile(0.95, rate(gitlab_ci_pipeline_duration_seconds_bucket[5m]))"
              }
            ]
          }
        ]
      }
    }
CI/CD Best Practices
1. Monitor Key Metrics
Track essential KPIs:
- Build frequency: Pipelines run per day
- Build duration: Time to complete pipeline
- Build success rate: Percentage of passing pipelines
- Queue time: Time waiting for runners
- Recovery time: Time to fix broken builds
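Of these KPIs, recovery time is the least obvious to compute. The sketch below assumes it is measured on a single branch as the time from the first failed pipeline in a streak to the next successful one. That definition, the function name, and the sample timestamps are illustrative assumptions.

```python
from datetime import datetime


def recovery_times(runs):
    """Return recovery times in seconds from a chronological list of (finished_at, status)."""
    recoveries = []
    broken_since = None  # time of the first failure in the current failing streak
    for finished_at, status in runs:
        if status == "failed" and broken_since is None:
            broken_since = finished_at
        elif status == "success" and broken_since is not None:
            recoveries.append((finished_at - broken_since).total_seconds())
            broken_since = None
    return recoveries


# Hypothetical pipeline history for one branch on one day.
day = (2024, 5, 6)
runs = [
    (datetime(*day, 9, 0), "success"),
    (datetime(*day, 10, 0), "failed"),
    (datetime(*day, 10, 30), "failed"),   # same streak, does not restart the clock
    (datetime(*day, 11, 15), "success"),  # recovered after 75 minutes
    (datetime(*day, 13, 0), "failed"),
    (datetime(*day, 13, 20), "success"),  # recovered after 20 minutes
]
times = recovery_times(runs)
print([t / 60 for t in times])
```

A streak that is still failing at the end of the list is not counted, so an open incident does not distort the average.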
2. Set Performance Targets
Establish acceptable thresholds:
    performance_targets:
      pipeline_duration:
        target: 15m
        acceptable: 20m
        critical: 30m
      success_rate:
        target: 95%
        acceptable: 90%
        critical: 85%
      queue_time:
        target: 30s
        acceptable: 2m
        critical: 5m
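One way to act on thresholds like these is to map each measured value to a status. The sketch below is an assumption about how the three levels could be interpreted, not a GitLab feature: values at or better than `target` are on target, values up to `acceptable` are acceptable, values at or beyond `critical` are critical, and anything in between is degraded.

```python
def classify(value, target, acceptable, critical, higher_is_better=False):
    """Map a measurement to a status using three thresholds (hypothetical scheme)."""
    if higher_is_better:
        # Flip signs so the same "lower is better" comparisons apply.
        value, target, acceptable, critical = -value, -target, -acceptable, -critical
    if value <= target:
        return "on target"
    if value <= acceptable:
        return "acceptable"
    if value < critical:
        return "degraded"
    return "critical"


# Durations in seconds, success rate in percent.
duration_status = classify(525, target=900, acceptable=1200, critical=1800)
cycle_status = classify(1340, target=900, acceptable=1200, critical=1800)
success_status = classify(87.5, target=95, acceptable=90, critical=85, higher_is_better=True)
print(duration_status, cycle_status, success_status)
```

Applied to the sample figures on this page, the 8m 45s median duration is on target while the 87.5% success rate falls between the acceptable and critical levels.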
3. Regular Optimization
Schedule pipeline reviews:
- Weekly: Review failed pipelines
- Monthly: Analyze duration trends
- Quarterly: Major optimization initiatives
4. Cost Monitoring
Track CI/CD costs:
    -- Estimated compute cost per day, assuming a rate of $0.10 per compute hour
    SELECT
        DATE(created_at) as date,
        count() as pipeline_count,
        sum(duration) / 3600 as compute_hours,
        compute_hours * 0.10 as cost_usd  -- $0.10/hour
    FROM ci_pipelines
    WHERE created_at >= now() - INTERVAL 30 DAY
    GROUP BY date
    ORDER BY date;
Alerting on CI/CD Metrics
Alert Configuration
Set up alerts for CI/CD issues:
    # Prometheus alert rules
    groups:
      - name: cicd_alerts
        rules:
          # High failure rate
          - alert: HighPipelineFailureRate
            expr: |
              sum(rate(gitlab_ci_pipeline_status{status="failed"}[1h]))
              /
              sum(rate(gitlab_ci_pipeline_status[1h]))
              > 0.2
            for: 30m
            labels:
              severity: warning
            annotations:
              summary: "High pipeline failure rate"
              description: "Pipeline failure rate is {{ $value | humanizePercentage }}"

          # Slow pipelines
          - alert: SlowPipelines
            expr: |
              histogram_quantile(0.95,
                rate(gitlab_ci_pipeline_duration_seconds_bucket[1h])
              ) > 1800
            for: 1h
            labels:
              severity: warning
            annotations:
              summary: "Pipelines running slowly"
              description: "P95 pipeline duration is {{ $value }}s"

          # Runner queue backup
          - alert: RunnerQueueBackup
            expr: gitlab_ci_runner_jobs_queued > 10
            for: 15m
            labels:
              severity: warning
            annotations:
              summary: "CI runner queue backed up"
              description: "{{ $value }} jobs waiting in queue"
Troubleshooting Slow Pipelines
Common Issues
1. Slow Dependency Installation
Problem: npm install takes 3+ minutes
Solution:
    # Cache node_modules, keyed on the lockfile
    cache:
      key:
        files:
          - package-lock.json
      paths:
        - node_modules/
        - .npm/   # needed for the npm ci variant below

    # Or use npm ci with a local npm cache directory
    script:
      - npm ci --cache .npm --prefer-offline
2. Unnecessary Test Runs
Problem: All tests run for small changes
Solution:
    # Use changed files detection
    test:unit:
      script:
        - npm run test:changed
      rules:
        - changes:
            - src/**/*.js
            - test/**/*.js
3. Sequential Job Execution
Problem: Jobs run serially when they could be parallel
Solution:
    # Use 'needs' for parallel execution
    test:unit:
      needs: [build]

    test:integration:
      needs: [build]

    # Both run in parallel after build
4. Large Docker Images
Problem: Docker pulls take 2+ minutes
Solution:
    # Multi-stage build
    FROM node:18-alpine AS builder
    WORKDIR /app
    COPY package*.json ./
    RUN npm ci --only=production

    FROM node:18-alpine
    WORKDIR /app
    # Copy only the installed production dependencies from the builder stage
    COPY --from=builder /app/node_modules ./node_modules
    COPY . .
    # Final image only contains production dependencies
Related Documentation
- DORA Metrics - DevOps performance measurement
- Metrics - Prometheus metrics collection
- Dashboards - Visualization and reporting
- Value Stream Analytics - End-to-end workflow analysis