Qdrant Runbook

Overview

  • Purpose: Vector database for semantic search, similarity matching, and AI memory retrieval. Stores embeddings for agent memories, document chunks, and knowledge base entries.
  • Port: 6333 (HTTP/REST), 6334 (gRPC)
  • Health endpoint: GET /healthz or GET /readyz
  • Namespace: data (Kubernetes)
  • Version: Qdrant 1.7+

Dependencies

  • Persistent Volume (PVC) - Vector storage
  • Agent Brain (port 3001) - Primary consumer

Collection Layout

Collection           | Purpose                       | Dimensions
agent_memory         | Agent long-term memory        | 1536 (OpenAI) or 1024 (local)
documents            | Document embeddings for RAG   | 1536
knowledge_base       | Structured knowledge entries  | 1536
conversation_history | Chat history embeddings       | 1536
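
The layout above can be turned into a small bootstrap helper. This is a minimal sketch rather than the team's actual provisioning script: the function names `collection_payload` and `ensure_collection` and the `QDRANT_URL` variable are hypothetical, and it assumes Cosine distance for every collection (the runbook only shows that setting for agent_memory).

```shell
#!/bin/sh
# Hypothetical helper names; QDRANT_URL is an assumed variable, not part of Qdrant.
QDRANT_URL="${QDRANT_URL:-http://localhost:6333}"

# Print the JSON body for creating a collection with the given vector size.
# Assumption: Cosine distance for every collection.
collection_payload() {
  printf '{"vectors": {"size": %d, "distance": "Cosine"}}' "$1"
}

# Create the collection only if GET /collections/<name> does not succeed.
# Any GET failure (including a network error) is treated as "missing".
ensure_collection() {
  name="$1"; size="$2"
  if curl -sf "$QDRANT_URL/collections/$name" >/dev/null; then
    echo "$name already exists"
  else
    curl -sf -X PUT "$QDRANT_URL/collections/$name" \
      -H "Content-Type: application/json" \
      -d "$(collection_payload "$size")" >/dev/null && echo "$name created"
  fi
}

# Usage (needs a reachable Qdrant instance):
#   for c in agent_memory documents knowledge_base conversation_history; do
#     ensure_collection "$c" 1536
#   done
```

If agent_memory uses the local embedding model, pass 1024 for that collection instead of 1536.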

Common Issues

Issue 1: Collection Not Found

  • Symptoms:
    • 404 errors on vector operations
    • "Collection does not exist" in logs
    • Agent memory retrieval failing
  • Cause:
    • Collection not created
    • Wrong collection name in configuration
    • Collection deleted accidentally
  • Resolution:
    # List existing collections
    curl http://localhost:6333/collections

    # Create missing collection
    curl -X PUT http://localhost:6333/collections/agent_memory \
      -H "Content-Type: application/json" \
      -d '{ "vectors": { "size": 1536, "distance": "Cosine" }, "optimizers_config": { "indexing_threshold": 20000 } }'

    # Verify collection exists
    curl http://localhost:6333/collections/agent_memory

Issue 2: Slow Search

  • Symptoms:
    • Search queries taking >1 second
    • High CPU usage during search
    • Memory retrieval timeouts
  • Cause:
    • Index not optimized
    • Too many vectors in collection
    • HNSW index not built
  • Resolution:
    # Check collection status
    curl http://localhost:6333/collections/agent_memory

    # Check if indexing is in progress ("green" = ready, "yellow" = optimizing)
    curl http://localhost:6333/collections/agent_memory | jq '.result.status'

    # Re-trigger the optimizers. Qdrant has no dedicated "force optimize"
    # endpoint (PUT /collections/{name}/index creates a payload index and needs
    # a field_name), so update the optimizer configuration instead
    curl -X PATCH http://localhost:6333/collections/agent_memory \
      -H "Content-Type: application/json" \
      -d '{ "optimizers_config": { "indexing_threshold": 20000 } }'

    # Reduce search scope with filters
    # (Application-level optimization)

    # Scale up resources if collection is large
    kubectl set resources deployment/qdrant -n data \
      --limits=cpu=4000m,memory=8Gi

Issue 3: Out of Memory

  • Symptoms:
    • OOMKilled events
    • Qdrant crashing during indexing
    • "Not enough memory" errors
  • Cause:
    • Large collection exceeding RAM
    • HNSW index too large for memory
    • Memory mapped files not working
  • Resolution:
    # Check memory usage
    kubectl top pods -n data -l app=qdrant

    # Check collection size
    curl http://localhost:6333/collections/agent_memory | jq '.result.points_count'

    # Enable on-disk storage for large collections (memmap_threshold is in kilobytes)
    curl -X PATCH http://localhost:6333/collections/agent_memory \
      -H "Content-Type: application/json" \
      -d '{ "optimizers_config": { "memmap_threshold": 10000 } }'

    # Reduce HNSW memory usage
    curl -X PATCH http://localhost:6333/collections/agent_memory \
      -H "Content-Type: application/json" \
      -d '{ "hnsw_config": { "on_disk": true } }'

    # Increase memory limits
    kubectl set resources deployment/qdrant -n data \
      --limits=memory=16Gi

Issue 4: Write Failures

  • Symptoms:
    • Upsert operations failing
    • "Write ahead log full" errors
    • Vector insertions timing out
  • Cause:
    • Disk full
    • WAL corruption
    • Collection locked for optimization
  • Resolution:
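    # Check disk usage
    kubectl exec -it qdrant-0 -n data -- df -h /qdrant/storage

    # Check collection status
    curl http://localhost:6333/collections/agent_memory | jq '.result.status'
    # If status is "yellow", optimization is running; wait for it to finish

    # Clear old snapshots to free disk space
    kubectl exec -it qdrant-0 -n data -- ls -la /qdrant/storage/snapshots
    # Run rm through sh -c so the glob expands inside the container
    kubectl exec -it qdrant-0 -n data -- sh -c 'rm -rf /qdrant/storage/snapshots/old_*'

    # Force WAL flush
    # NOTE: this endpoint does not appear in the documented Qdrant REST API
    # (flushing is periodic, governed by optimizers_config.flush_interval_sec);
    # verify it against your version before relying on it
    curl -X POST http://localhost:6333/collections/agent_memory/points/flush

    # Restart if WAL corrupted
    kubectl rollout restart deployment/qdrant -n data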
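
The disk check above can be made scriptable. The sketch below parses `df -P` output and classifies it against the 80% and 95% thresholds used elsewhere in this runbook; the function names `parse_df_pct` and `disk_state` are hypothetical, and it assumes the filesystem name contains no spaces.

```shell
#!/bin/sh
# Read `df -P` output on stdin and print the used percentage as a bare integer.
# Assumes the data row is line 2 and the filesystem name has no spaces.
parse_df_pct() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Classify a percentage against this runbook's thresholds:
# above 95 is critical (DiskFull alert), above 80 is a warning.
disk_state() {
  if [ "$1" -gt 95 ]; then echo critical
  elif [ "$1" -gt 80 ]; then echo warning
  else echo ok
  fi
}

# Usage (no -it, so the output pipes cleanly):
#   pct=$(kubectl exec qdrant-0 -n data -- df -P /qdrant/storage | parse_df_pct)
#   disk_state "$pct"
```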

Issue 5: Embedding Dimension Mismatch

  • Symptoms:
    • "Vector size mismatch" errors
    • 400 Bad Request on upsert
    • Inconsistent search results
  • Cause:
    • Embedding model changed
    • Wrong collection used
    • Misconfigured client
  • Resolution:
    # Check expected dimension
    curl http://localhost:6333/collections/agent_memory | jq '.result.config.params.vectors.size'

    # Verify embedding from application
    # (Check agent-brain configuration)

    # If model changed, need to recreate collection and reindex
    # 1. Backup existing data
    curl -X POST "http://localhost:6333/collections/agent_memory/snapshots"

    # 2. Delete and recreate with new dimension
    curl -X DELETE http://localhost:6333/collections/agent_memory
    curl -X PUT http://localhost:6333/collections/agent_memory \
      -H "Content-Type: application/json" \
      -d '{ "vectors": { "size": 1024, "distance": "Cosine" } }'

    # 3. Trigger reindexing from source data
    curl -X POST http://localhost:3001/api/v1/memory/reindex

Issue 6: Cluster Partition (Distributed Mode)

  • Symptoms:
    • Some shards unavailable
    • Partial search results
    • "Shard not available" errors
  • Cause:
    • Network partition between nodes
    • Node crashed
    • Replication lag
  • Resolution:
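    # Check cluster status (includes the peer list and Raft state)
    curl http://localhost:6333/cluster

    # Check peer status (peers are reported inside the /cluster response)
    curl http://localhost:6333/cluster | jq '.result.peers'

    # Check shard state for the collection
    curl http://localhost:6333/collections/agent_memory/cluster

    # Attempt to recover the current peer's Raft state
    curl -X POST http://localhost:6333/cluster/recover

    # Remove unhealthy peer
    curl -X DELETE http://localhost:6333/cluster/peer/{peer_id}

    # Restart affected node
    kubectl delete pod qdrant-{node} -n data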

Restart Procedure

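# 1. Check for ongoing operations
#    (GET /collections lists names only, so query each collection for its status)
for c in $(curl -s http://localhost:6333/collections | jq -r '.result.collections[].name'); do
  echo "$c: $(curl -s http://localhost:6333/collections/$c | jq -r '.result.status')"
done

# 2. Create snapshot for safety
curl -X POST "http://localhost:6333/collections/agent_memory/snapshots"

# 3. Wait for snapshot completion
sleep 30

# 4. Perform rolling restart
#    (use statefulset/qdrant if Qdrant runs as a StatefulSet, as the qdrant-0 pod name suggests)
kubectl rollout restart deployment/qdrant -n data

# 5. Wait for ready
kubectl wait --for=condition=ready pod -l app=qdrant -n data --timeout=180s

# 6. Verify health
curl http://localhost:6333/healthz
curl http://localhost:6333/readyz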
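
The final verification step checks the endpoints once, but after a restart the readiness endpoint can take a while to respond. A polling loop is one way to handle that; the sketch below is illustrative, with `wait_for_ready` being a hypothetical helper name and the timeout being approximate (each attempt can take up to about three seconds).

```shell
#!/bin/sh
# Poll a URL until it returns an HTTP success status or the attempt budget runs out.
# $1 = URL, $2 = number of attempts, roughly one per second (default 180).
# Returns 0 when the URL answers successfully, 1 otherwise.
wait_for_ready() {
  url="$1"; timeout="${2:-180}"; waited=0
  while [ "$waited" -lt "$timeout" ]; do
    # -f makes curl fail on HTTP errors; --max-time bounds each attempt
    if curl -sf --max-time 2 "$url" >/dev/null 2>&1; then
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 1
}

# Usage:
#   wait_for_ready http://localhost:6333/readyz 180 && echo "Qdrant is ready"
```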

Emergency Restart

# Force restart
kubectl delete pod qdrant-0 -n data --force

# Wait for recovery
kubectl wait --for=condition=ready pod qdrant-0 -n data --timeout=180s

# Verify collection integrity
curl http://localhost:6333/collections/agent_memory | jq '.result.status'

# If corrupted, restore from snapshot (location is a URI, so local files need file://)
curl -X PUT "http://localhost:6333/collections/agent_memory/snapshots/recover" \
  -H "Content-Type: application/json" \
  -d '{"location": "file:///qdrant/storage/snapshots/agent_memory-SNAPSHOT_ID.snapshot"}'

Local Development Restart

# Docker
docker restart qdrant

# OrbStack
orb restart qdrant

# Using docker-compose
docker-compose restart qdrant

Logs Location

Kubernetes Logs

# Qdrant logs
kubectl logs -f deployment/qdrant -n data

# Filter for errors
kubectl logs deployment/qdrant -n data | grep -E "ERROR|WARN"

# Export logs
kubectl logs deployment/qdrant -n data > qdrant-logs-$(date +%Y%m%d).txt

Inside Container

# Check Qdrant storage directory
kubectl exec -it qdrant-0 -n data -- ls -la /qdrant/storage

# Check WAL files
kubectl exec -it qdrant-0 -n data -- ls -la /qdrant/storage/collections/agent_memory/0/wal/

Telemetry

# Get telemetry data
curl http://localhost:6333/telemetry

# Get detailed metrics
curl http://localhost:6333/metrics

Scaling

Vertical Scaling

# Increase resources for larger collections
kubectl set resources deployment/qdrant -n data \
  --limits=cpu=8000m,memory=32Gi \
  --requests=cpu=2000m,memory=8Gi

Horizontal Scaling (Distributed Mode)

# Scale to multiple nodes (requires cluster configuration)
kubectl scale statefulset/qdrant -n data --replicas=3

# Configure sharding
# (shard_number is set at creation time; this PUT creates the collection
# and fails if agent_memory already exists)
curl -X PUT http://localhost:6333/collections/agent_memory \
  -H "Content-Type: application/json" \
  -d '{ "vectors": { "size": 1536, "distance": "Cosine" }, "shard_number": 3, "replication_factor": 2 }'

Storage Scaling

# Expand PVC
kubectl patch pvc qdrant-storage -n data -p '{"spec":{"resources":{"requests":{"storage":"100Gi"}}}}'

Scaling Guidelines

Metric             | Threshold | Action
Memory Usage       | > 80%     | Increase memory, enable memmap
CPU Usage          | > 70%     | Add replicas, increase CPU
Search Latency P99 | > 500ms   | Optimize index, add shards
Disk Usage         | > 80%     | Expand storage
Points Count       | > 10M     | Add shards, enable on-disk
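
For scripted checks, the guideline table can be encoded as a lookup. This is only a sketch of the table, not an official tool: `scaling_action` and its metric keywords (`memory`, `cpu`, `latency`, `disk`, `points`) are hypothetical names, and values must be plain integers (percent, milliseconds, or a raw point count).

```shell
#!/bin/sh
# Map a metric reading to the action from the Scaling Guidelines table.
# Units: memory/cpu/disk in percent, latency in ms (P99), points as a raw count.
# Prints nothing when the reading is at or below its threshold.
scaling_action() {
  metric="$1"; value="$2"
  case "$metric" in
    memory)  [ "$value" -gt 80 ]       && echo "Increase memory, enable memmap" ;;
    cpu)     [ "$value" -gt 70 ]       && echo "Add replicas, increase CPU" ;;
    latency) [ "$value" -gt 500 ]      && echo "Optimize index, add shards" ;;
    disk)    [ "$value" -gt 80 ]       && echo "Expand storage" ;;
    points)  [ "$value" -gt 10000000 ] && echo "Add shards, enable on-disk" ;;
    *)       echo "unknown metric: $metric" >&2; return 2 ;;
  esac
  return 0
}

# Usage:
#   scaling_action memory 85
#   scaling_action points "$(curl -s http://localhost:6333/collections/agent_memory | jq '.result.points_count')"
```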

Alerts

Critical Alerts (PagerDuty)

Alert                 | Condition                 | Runbook Action
QdrantDown            | Cannot connect for 2 min  | Emergency Restart
CollectionUnavailable | Collection status "red"   | Recover from snapshot
DiskFull              | Disk > 95%                | Expand storage, cleanup

Warning Alerts (Slack)

Alert          | Condition             | Runbook Action
HighMemory     | Memory > 80%          | Enable memmap, increase limit
SlowSearch     | P99 > 500ms           | Optimize index
IndexingStuck  | Optimizing > 1 hr     | Check resources
ShardUnhealthy | Shard status != green | Recover shard

Prometheus Alert Rules

groups:
  - name: qdrant
    rules:
      - alert: QdrantDown
        expr: up{job="qdrant"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Qdrant is down"
          runbook_url: "https://gitlab.com/blueflyio/agent-platform/technical-docs/-/wikis/runbooks/qdrant"
      - alert: QdrantHighMemory
        expr: qdrant_memory_usage_bytes / qdrant_memory_limit_bytes > 0.8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Qdrant memory usage high"
      - alert: QdrantSlowSearch
        expr: histogram_quantile(0.99, rate(qdrant_search_duration_seconds_bucket[5m])) > 0.5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Qdrant search latency high"

Monitoring Dashboards

  • Grafana: https://grafana.local/d/qdrant
  • Qdrant Dashboard: http://localhost:6333/dashboard (built-in)
  • Prometheus: https://prometheus.local/graph?g0.expr=up{job="qdrant"}

Backup & Recovery

Create Snapshot

# Snapshot specific collection
curl -X POST "http://localhost:6333/collections/agent_memory/snapshots"

# List snapshots
curl http://localhost:6333/collections/agent_memory/snapshots

# Download snapshot
curl -O http://localhost:6333/collections/agent_memory/snapshots/{snapshot_name}

Restore from Snapshot

# Restore collection from snapshot
curl -X PUT "http://localhost:6333/collections/agent_memory/snapshots/recover" \
  -H "Content-Type: application/json" \
  -d '{ "location": "file:///qdrant/storage/snapshots/agent_memory-SNAPSHOT.snapshot" }'

# Or from URL
curl -X PUT "http://localhost:6333/collections/agent_memory/snapshots/recover" \
  -H "Content-Type: application/json" \
  -d '{ "location": "https://backup-storage/snapshots/agent_memory.snapshot" }'

Useful API Endpoints

# Health check
curl http://localhost:6333/healthz
curl http://localhost:6333/readyz

# Collection info
curl http://localhost:6333/collections
curl http://localhost:6333/collections/{name}

# Point count (this endpoint takes a POST with a JSON body)
curl -X POST http://localhost:6333/collections/{name}/points/count \
  -H "Content-Type: application/json" \
  -d '{ "exact": true }'

# Search example
curl -X POST http://localhost:6333/collections/agent_memory/points/search \
  -H "Content-Type: application/json" \
  -d '{ "vector": [0.1, 0.2, ...], "limit": 10 }'

# Cluster info (distributed mode); peers are listed under .result.peers
curl http://localhost:6333/cluster
curl http://localhost:6333/cluster | jq '.result.peers'

# Metrics (Prometheus format)
curl http://localhost:6333/metrics

Contacts

  • On-call: PagerDuty rotation
  • Slack: #platform-incidents
  • Owner: AI/ML Team