Oracle VM Disaster Recovery
Last Updated: 2026-02-15
Owner: Infrastructure Team
Review Cycle: Monthly
Overview
This document describes the disaster recovery strategy for the Oracle VM at 100.103.48.75 (oracle.tailcf98b3.ts.net).
Critical Risk - If Oracle VM disappeared:
- ✅ Services code: SAFE (git repos)
- ❌ Production data: LOST (databases)
- ❌ Production configs: LOST (.env files)
- ❌ Service orchestration: LOST (docker-compose.yml)
What Would Be Lost
Files NOT in Git
- /opt/bluefly/.env - Main secrets (CRITICAL)
- /opt/bluefly/docker-compose.yml - Service orchestration (CRITICAL)
- 11 service-specific .env files
Data Volumes
- /opt/bluefly/data/postgres - All databases
- /opt/bluefly/data/mongodb - Chat history
- /opt/bluefly/data/qdrant - Vector embeddings
- /opt/bluefly/data/{redis,grafana,loki,tempo,phoenix} - Ephemeral data
Backup Commands
    # Backup configs
    mkdir -p ~/backups/oracle-vm/{configs,databases}
    scp oracle:/opt/bluefly/.env ~/backups/oracle-vm/configs/
    scp oracle:/opt/bluefly/docker-compose.yml ~/backups/oracle-vm/configs/

    # Backup databases
    ssh oracle "docker exec postgres pg_dumpall -U bluefly" > ~/backups/oracle-vm/databases/postgres-$(date +%Y%m%d).sql
    ssh oracle "docker exec mongodb mongodump --archive" > ~/backups/oracle-vm/databases/mongodb-$(date +%Y%m%d).archive
    ssh oracle "docker exec qdrant tar czf - /qdrant/storage" > ~/backups/oracle-vm/databases/qdrant-$(date +%Y%m%d).tar.gz
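A failed `ssh` or `docker exec` in the backup commands above can still leave a zero-byte file behind, because the shell creates the redirect target before the remote command runs. The following is a minimal sketch of a post-backup check, assuming the same directory layout and file names as above. The `verify_backups` function and the `BACKUP_ROOT` variable are hypothetical names introduced here for illustration.

```shell
#!/usr/bin/env bash
# Sketch: confirm that the expected backup files exist and are non-empty.
# BACKUP_ROOT is an assumed variable; it defaults to the directory used above.
BACKUP_ROOT="${BACKUP_ROOT:-$HOME/backups/oracle-vm}"

# verify_backups [YYYYMMDD]
# Prints one line per expected file and returns non-zero if any is missing or empty.
# The date stamp defaults to today, matching $(date +%Y%m%d) in the backup commands.
verify_backups() {
  local stamp="${1:-$(date +%Y%m%d)}"
  local failed=0 f
  for f in \
    "$BACKUP_ROOT/configs/.env" \
    "$BACKUP_ROOT/configs/docker-compose.yml" \
    "$BACKUP_ROOT/databases/postgres-$stamp.sql" \
    "$BACKUP_ROOT/databases/mongodb-$stamp.archive" \
    "$BACKUP_ROOT/databases/qdrant-$stamp.tar.gz"; do
    if [ -s "$f" ]; then      # -s: file exists and has a size greater than zero
      echo "OK      $f"
    else
      echo "MISSING $f"
      failed=1
    fi
  done
  return "$failed"
}
```

Calling `verify_backups` right after the backup commands, or `verify_backups 20260215` for a specific day, reports which files need to be re-taken. It only checks presence and size, not whether a dump is restorable.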
Recovery Procedure
Total Time: ~4 hours
- Restore configs (30min)
- Restore databases (1hr)
- Deploy services via CI/CD (2hrs)
- Verify (30min)
See full procedure in wiki.
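The config and database restore steps can be sketched as the reverse of the backup commands. This is not the wiki procedure, only an illustration under stated assumptions: the backups live in `~/backups/oracle-vm`, the `postgres`, `mongodb`, and `qdrant` containers are already running on the replacement VM under the same names, and the `oracle` SSH alias points at it. `latest_backup` and `restore_all` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: restore configs and databases from the most recent date-stamped dumps.
# BACKUP_ROOT is an assumed variable; it defaults to the backup directory.
BACKUP_ROOT="${BACKUP_ROOT:-$HOME/backups/oracle-vm}"

# latest_backup PREFIX SUFFIX
# Prints the newest dump matching databases/PREFIX-*SUFFIX, or returns non-zero.
# YYYYMMDD stamps sort chronologically as text, and glob results are sorted,
# so the last match is the newest.
latest_backup() {
  local f latest=""
  for f in "$BACKUP_ROOT/databases/$1"-*"$2"; do
    [ -e "$f" ] && latest="$f"
  done
  [ -n "$latest" ] && printf '%s\n' "$latest"
}

# restore_all: copies configs back, then streams each dump into its container.
restore_all() {
  local pg mongo qdrant
  pg="$(latest_backup postgres .sql)" || { echo "no postgres dump found" >&2; return 1; }
  mongo="$(latest_backup mongodb .archive)" || { echo "no mongodb dump found" >&2; return 1; }
  qdrant="$(latest_backup qdrant .tar.gz)" || { echo "no qdrant dump found" >&2; return 1; }

  # Restore configs (assumes /opt/bluefly already exists on the VM).
  scp "$BACKUP_ROOT/configs/.env" "$BACKUP_ROOT/configs/docker-compose.yml" oracle:/opt/bluefly/

  # pg_dumpall output is plain SQL, so it is replayed with psql.
  ssh oracle "docker exec -i postgres psql -U bluefly -d postgres" < "$pg"
  # mongorestore --archive with no file name reads the archive from stdin.
  ssh oracle "docker exec -i mongodb mongorestore --archive" < "$mongo"
  # tar stored the paths without the leading slash, so extract relative to /.
  # Assumption: qdrant should be restarted afterwards to pick up the files.
  ssh oracle "docker exec -i qdrant tar xzf - -C /" < "$qdrant"
}
```

Nothing runs until `restore_all` is called explicitly. Because it relies on `docker exec`, the infrastructure containers need to be up first, for example started from the restored docker-compose.yml.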
Currently Running
26 containers, including:
- Core: agent-router, agent-mesh, agent-protocol, workflow-engine, compliance-engine, agent-tracer, dragonfly (ALL healthy)
- Infrastructure: postgres, redis, qdrant, mongodb (ALL healthy)
- Observability: grafana, loki, tempo (healthy); phoenix, otel-collector (unhealthy)
- AI: librechat, langflow, n8n
- Agents: social-research, whitepaper-writer, content-reviewer
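The health states listed above go stale quickly, so they are worth re-checking against the VM, both routinely and during the verify step of a recovery. Below is a minimal sketch, assuming the `oracle` SSH alias used in the backup commands and that the containers define Docker health checks. `unhealthy_from_ps` and `check_oracle_health` are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch: list containers that Docker currently reports as unhealthy.

# unhealthy_from_ps
# Reads "name<TAB>status" lines on stdin and prints the names whose status
# contains "(unhealthy)". Containers without a health check are not reported.
unhealthy_from_ps() {
  awk -F '\t' '$2 ~ /\(unhealthy\)/ { print $1 }'
}

# check_oracle_health
# Queries the VM over SSH; docker expands \t in --format to a tab character.
check_oracle_health() {
  ssh oracle "docker ps --format '{{.Names}}\t{{.Status}}'" | unhealthy_from_ps
}
```

Running `check_oracle_health` prints one container name per line, and prints nothing when no container is reported unhealthy. Keeping the parsing in its own function makes it possible to try the filter against saved `docker ps` output without touching the VM.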