Infrastructure Migration Summary - 2026-01-23
Purpose: Complete guide for infrastructure migration to NAS - includes Docker services, git repositories, worktrees, and wikis
Status: ✅ MIGRATION COMPLETE - 100% NAS-Based Infrastructure
WHAT GOT MOVED
1. Git Infrastructure Moved to NAS (2026-01-23)
All Repositories (67 GitLab Projects)
- Status: ✅ COMPLETE - All repos are bare repos on NAS
- Location: /Volumes/AgentPlatform/repos/bare/blueflyio/
- Size: 9.4GB total
- Organization:
  - Top-level repos: gitlab_components.git, platform-agents.git, security-policies.git
  - Agent-platform group: /Volumes/AgentPlatform/repos/bare/blueflyio/agent-platform/[project].git
  - OSSA group: /Volumes/AgentPlatform/repos/bare/blueflyio/ossa/
- Local Action: NO local clones exist or are needed - all work happens via NAS worktrees
All Worktrees (NAS-Centralized Strategy)
- Status: ✅ COMPLETE - 100% NAS-based worktree workflow
- Location: /Volumes/AgentPlatform/worktrees/[DEVICE]/[DATE]/[PROJECT]/[BRANCH]/
- Device Namespaces:
  - shared/ - Multi-device work (Mac M3, M4, code-server, phone, iPad)
  - m3/ - Mac M3 performance-specific work
  - m4/ - Mac M4 performance-specific work
- Old Local Worktrees: ✅ REMOVED
  - Cleaned up: ~/Sites/blueflyio/.worktrees/ (all old local worktrees removed)
  - Freed: ~500MB
  - All bare repos pruned to remove stale registrations
- Benefits: Work from ANY device - same files, same state, always in sync
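The path layout and the prune step above can be sketched as a few shell helpers. This is only an illustration: the function names are hypothetical, and the YYYY-MM-DD form for [DATE] is an assumption, since the guide does not state the date format.

```shell
#!/usr/bin/env bash
# Sketch: helpers for the NAS worktree layout described above.
# NAS_ROOT and the directory scheme come from this guide; the function names
# and the YYYY-MM-DD date format are assumptions.
NAS_ROOT="${NAS_ROOT:-/Volumes/AgentPlatform}"

# Print worktrees/[DEVICE]/[DATE]/[PROJECT]/[BRANCH] under NAS_ROOT.
# Args: device project branch [date]
nas_worktree_path() {
  local device="$1" project="$2" branch="$3"
  local day="${4:-$(date +%Y-%m-%d)}"
  printf '%s/worktrees/%s/%s/%s/%s\n' "$NAS_ROOT" "$device" "$day" "$project" "$branch"
}

# Check out an existing branch of a bare repo into its NAS worktree path.
# Args: bare-repo-path device project branch [date]
nas_worktree_add() {
  local bare="$1"; shift
  local target
  target="$(nas_worktree_path "$@")"
  mkdir -p "$(dirname "$target")"
  git -C "$bare" worktree add "$target" "$3"
}

# Drop stale worktree registrations from every bare repo under repos/bare.
nas_worktree_prune_all() {
  find "$NAS_ROOT/repos/bare" -type d -name '*.git' -prune -print0 |
    while IFS= read -r -d '' repo; do
      git -C "$repo" worktree prune
    done
}
```

For example, `nas_worktree_add /Volumes/AgentPlatform/repos/bare/blueflyio/platform-agents.git shared platform-agents main` would place the worktree under the shared/ namespace for today's date.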
All Wikis (Documentation)
- Status: ✅ COMPLETE - All wikis on NAS only
- Location: /Volumes/AgentPlatform/wikis/blueflyio/
- Wikis Migrated:
  - technical-docs.wiki/ - Platform documentation (already had GitLab origin)
  - api_normalization.wiki/ - API Normalization project docs
- Old Local Wikis: ✅ REMOVED
  - Cleaned up: ~/Sites/blueflyio/_WIKI/ (entire directory removed)
  - Freed: 2.7GB (including old .zip backups)
  - No symlinks or local copies remain
- Access: Direct git operations in NAS wiki directories (no worktrees needed for wikis)
Infrastructure Alignment Complete
- AGENTS.md: ✅ Updated to NAS-centralized workflow (v1.1.0)
- CLAUDE.md: ✅ Updated with NAS worktree paths and device namespaces
- Wiki Docs: ✅ Updated separation-of-duties.md, system-overview.md, platform-overview.md
- Architecture Decisions: ✅ worktree-strategy-nas-centralized.md marked APPROVED
- Local Setup Required: MINIMAL
  - ~/Sites/blueflyio/CLAUDE.md (instructions)
  - ~/Sites/blueflyio/.claude/ (config)
  - Mount NAS: /Volumes/AgentPlatform/
  - NO repos, NO worktrees, NO wikis locally
2. Services Moved to NAS (blueflynas.tailcf98b3.ts.net)
Agent-Tracer (Knowledge Graph API)
- Status: Deployed on NAS K3s cluster
- Location: development namespace, port 3007
- Pod: agent-tracer-67df67fc46-q9kzh
- Local Docker: Can be removed (postgres-tracer, redis-tracer, agent-tracer containers)
KAGENT Platform (23 Agents)
- Status: Running in K3s kagent namespace
- Agents: Assessment, Onboarding, Renewal, ROI Calculator, K8s agents, Helm agents, etc.
- API/UI: KAGENT API, UI, and controller all on NAS
- Local Docker: Can be cleaned up (old KAGENT images)
Phoenix Observability
- Status: Running in K3s monitoring namespace
- Local Docker: Old Phoenix images can be removed
3. Services Moved to Vast.ai
LLMLingua Compression Service
- Status: Deployed on Vast.ai RTX 4090
- Instance ID: 30386215
- SSH: ssh -p 26214 root@ssh9.vast.ai
- Cost: $0.29/hr ($207/month)
- Purpose: Token compression (79x reduction for large contexts)
- Local Docker: Never existed locally (deployed directly to Vast.ai)
WHAT TO DO ON YOUR OTHER MAC
Phase 0: Git Infrastructure Setup (REQUIRED FIRST)
NAS-centralized workflow means NOTHING local except instructions:
- Mount NAS (Finder: ⌘K):
    # Mount via NFS
    nfs://192.168.68.54/volume1/AgentPlatform
    # Or via Tailscale:
    nfs://blueflynas.tailcf98b3.ts.net/volume1/AgentPlatform
    # Verify mount
    ls /Volumes/AgentPlatform/
    # Should show: repos/, worktrees/, wikis/
- Create minimal local directory:
    mkdir -p ~/Sites/blueflyio/.claude
    cd ~/Sites/blueflyio/
- Copy CLAUDE.md from NAS:
    # Option A: From wiki
    cp /Volumes/AgentPlatform/wikis/blueflyio/technical-docs.wiki/getting-started/CLAUDE.md ~/Sites/blueflyio/
    # Option B: From this Mac (via Tailscale)
    scp bluefly.tailcf98b3.ts.net:~/Sites/blueflyio/CLAUDE.md ~/Sites/blueflyio/
That's it! Everything else (repos, worktrees, wikis) is on NAS.
Complete setup guide: /Volumes/AgentPlatform/wikis/blueflyio/technical-docs.wiki/getting-started/nas-setup-guide.md
Also see: /tmp/other-computer-setup.md (quick reference created for Mac M3)
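The "Verify mount" step can also be scripted so that later phases stop early when the share is missing. A minimal sketch, assuming only the three directory names listed above; the function name is hypothetical.

```shell
# Sketch: confirm the NAS share is mounted and has the expected top-level
# layout (repos/, worktrees/, wikis/). Returns non-zero if anything is missing.
check_nas_layout() {
  local root="${1:-/Volumes/AgentPlatform}" missing=0 dir
  for dir in repos worktrees wikis; do
    if [ ! -d "$root/$dir" ]; then
      echo "missing: $root/$dir" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_nas_layout && echo "NAS ready"` prints the confirmation only when all three directories are present.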
Phase 1: Docker Cleanup (Safe - Do This Second)
Run this single command to remove all unused Docker resources:
docker system prune -af --volumes
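This removes:
- Stopped containers
- All images not used by at least one container (-a removes unused tagged images too, not only dangling ones)
- Build cache
- Unused volumes (on Docker 23.0 and later, --volumes prunes only anonymous volumes)
- Unused networks
Expected recovery: 20-30 GB (varies by Mac)
What it does NOT remove:
- Running containers
- Images used by running containers
- Volumes attached to running containers
Start any stack you intend to keep (see WHAT TO KEEP LOCALLY) before running this command; otherwise its stopped containers and their images are removed as well.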
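Since the prune only spares what is attached to running containers, a quick preflight can flag stacks from WHAT TO KEEP LOCALLY that are currently stopped. The sketch below filters on the `com.docker.compose.project` label that Docker Compose sets on its containers. It assumes the Compose project names match the stack names used in this guide (agent-router, agent-mesh); the function name is hypothetical.

```shell
# Sketch: print each Compose project (given as arguments) that has no running
# containers. Assumes project names equal the stack names in this guide.
stacks_not_running() {
  local project
  for project in "$@"; do
    if [ -z "$(docker ps -q --filter "label=com.docker.compose.project=$project")" ]; then
      echo "$project"
    fi
  done
}
```

If `stacks_not_running agent-router agent-mesh` prints anything, start those stacks before running the prune.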
Phase 2: Service-Specific Cleanup (After Verification)
A. Agent-Tracer Local Services
Verify it's on NAS first:
# Confirm it's running on NAS (ask primary Mac user to verify)
kubectl get pods -n development | grep agent-tracer
If confirmed on NAS, remove local:
# Stop and remove local agent-tracer stack
docker ps | grep tracer
# Should show nothing after NAS deployment
# If any tracer containers exist locally:
docker stop $(docker ps -q --filter "name=tracer")
docker rm $(docker ps -aq --filter "name=tracer")
# Remove tracer images
docker images | grep tracer | awk '{print $3}' | xargs docker rmi -f
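The stop/rm commands above print a usage error when no tracer containers exist, because the command substitution expands to nothing. A guarded variant might look like the following sketch; the function name is hypothetical, and `docker rm -f` is used to stop and remove in one step.

```shell
# Sketch: remove containers whose name matches a pattern, but only if any
# exist, so docker rm is never called with an empty argument list.
remove_containers_matching() {
  local pattern="$1" ids
  ids="$(docker ps -aq --filter "name=$pattern")"
  if [ -z "$ids" ]; then
    echo "no containers match '$pattern'"
    return 0
  fi
  # One ID per line; remove them one at a time.
  printf '%s\n' "$ids" | while IFS= read -r id; do
    docker rm -f "$id"
  done
}
```

Calling `remove_containers_matching tracer` covers postgres-tracer, redis-tracer, and agent-tracer, since the name filter matches substrings.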
B. Old Phoenix Images
Keep only latest, remove old versions:
# List Phoenix images
docker images arizephoenix/phoenix
# Remove all but the latest (they're running on NAS now)
docker images arizephoenix/phoenix --format "{{.ID}} {{.CreatedAt}}" | tail -n +2 | awk '{print $1}' | xargs docker rmi -f
Expected recovery: ~5 GB
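Before deleting, it can help to preview which image IDs the "all but the latest" pipeline would select. The sketch below is a dry run only. Like the command above, it relies on `docker images` listing the newest image first, which is the usual behavior but is treated here as an assumption; the function name is hypothetical.

```shell
# Sketch: print the image IDs that would be removed for a repository,
# skipping the first (newest) entry. Deletes nothing.
old_image_ids() {
  docker images "$1" --format '{{.ID}}' | tail -n +2
}
```

`old_image_ids arizephoenix/phoenix` lists the candidates; an image with several tags appears once per tag, so review the output before passing it to `docker rmi`.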
C. Old LibreChat Images
# Remove old versions
docker images ghcr.io/danny-avila/librechat | grep '<none>' | awk '{print $3}' | xargs docker rmi -f
Expected recovery: ~4-5 GB
D. Old ClickHouse Images
docker images clickhouse/clickhouse-server | grep '<none>' | awk '{print $3}' | xargs docker rmi -f
Expected recovery: ~1-2 GB
E. Old Prometheus Images
docker images prom/prometheus | grep '<none>' | awk '{print $3}' | xargs docker rmi -f
Expected recovery: ~1 GB
Phase 3: GitLab Runner Cache Cleanup (If Applicable)
Check if GitLab runners are active:
docker ps | grep gitlab-runner
If NO runners active, clean up cache volumes:
docker volume ls | grep runner- | awk '{print $2}' | xargs docker volume rm
Expected recovery: 1-2 GB
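The check and the cleanup above can be combined so the cache volumes are only touched when no runner is active. A minimal sketch, assuming the same "gitlab-runner" and "runner-" name patterns as the commands above; the function name is hypothetical.

```shell
# Sketch: delete runner cache volumes only when no gitlab-runner container
# is running. Matches on both image and container name.
clean_runner_caches() {
  if docker ps --format '{{.Image}} {{.Names}}' | grep -q gitlab-runner; then
    echo "gitlab-runner is active; skipping" >&2
    return 1
  fi
  docker volume ls --format '{{.Name}}' | grep 'runner-' | while IFS= read -r vol; do
    docker volume rm "$vol"
  done
}
```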
Phase 4: Old Compose Stack Volumes (Selective)
Only remove if you've confirmed these services are on NAS/K3s:
CSMA Platform Volumes (if on K3s)
docker volume rm csma-platform_grafana-data \
  csma-platform_langflow-data \
  csma-platform_loki-data \
  csma-platform_neo4j-data \
  csma-platform_neo4j-logs \
  csma-platform_phoenix-data \
  csma-platform_postgres-data \
  csma-platform_prometheus-data \
  csma-platform_redis-data
Old Infrastructure Volumes
docker volume rm infrastructure_librechat_logs \
  infrastructure_librechat_meilisearch_data \
  infrastructure_librechat_mongo_data \
  infrastructure_librechat_storage
Expected recovery: 5-10 GB
WHAT TO KEEP LOCALLY
Active Docker Compose Stacks (DO NOT REMOVE)
1. agent-router Stack
- Status: Keep for local development/monitoring
- Services: Grafana, Prometheus, Postgres, Redis, Qdrant, Mongo, and others (10 services total)
- Purpose: LLM Platform infrastructure
2. agent-mesh Stack
- Status: Keep for local KAGENT development
- Services: postgres, redis, qdrant (3 services)
- Purpose: KAGENT local services
3. DDEV Stacks (If Active)
- ddev-drupal-demo - Keep if actively used for demos
- ddev-ssh-agent - Keep (DDEV infrastructure)
SERVICES STATUS REFERENCE
Running on NAS (K3s Cluster)
| Service | Namespace | Status | Access |
|---|---|---|---|
| agent-tracer | development | Running | http://blueflynas:3007 |
| KAGENT Platform | kagent | Running (23 pods) | Various ports |
| Phoenix | monitoring | Running | Monitoring stack |
Running on Vast.ai
| Service | Instance | Status | Access |
|---|---|---|---|
| LLMLingua | 30386215 | Running | ssh -p 26214 root@ssh9.vast.ai |
Deprecated Locally (Safe to Remove)
- Agent-tracer local containers
- Old Phoenix images
- Old LibreChat images
- Old ClickHouse images
- Old Prometheus images
- GitLab runner caches (if no active runners)
- CSMA platform volumes (if on K3s)
COMPLETE CLEANUP COMMANDS
⚠️ FILE POLICY NOTE: The commands below were previously packaged as a .sh script, but .sh scripts are restricted per project FILE POLICY. Run these commands manually or via npm scripts instead.
Run these commands manually after verification:
# Complete Docker cleanup after migration to NAS and Vast.ai
set -e

echo "=== Phase 1: Remove Unused Resources ==="
docker system prune -af --volumes
echo "✅ Phase 1 Complete"
echo ""

echo "=== Phase 2: Remove Old Service Images ==="
# Phoenix
docker images arizephoenix/phoenix --format "{{.ID}}" | tail -n +2 | xargs -r docker rmi -f
# LibreChat
docker images ghcr.io/danny-avila/librechat | grep '<none>' | awk '{print $3}' | xargs -r docker rmi -f
# ClickHouse
docker images clickhouse/clickhouse-server | grep '<none>' | awk '{print $3}' | xargs -r docker rmi -f
# Prometheus
docker images prom/prometheus | grep '<none>' | awk '{print $3}' | xargs -r docker rmi -f
echo "✅ Phase 2 Complete"
echo ""

echo "=== Phase 3: Remove GitLab Runner Caches ==="
docker volume ls --format "{{.Name}}" | grep runner- | xargs -r docker volume rm
echo "✅ Phase 3 Complete"
echo ""

echo "=== Phase 4: Remove Old Stack Volumes ==="
# CSMA Platform (now on K3s)
docker volume rm -f csma-platform_grafana-data \
  csma-platform_langflow-data \
  csma-platform_loki-data \
  csma-platform_neo4j-data \
  csma-platform_neo4j-logs \
  csma-platform_phoenix-data \
  csma-platform_postgres-data \
  csma-platform_prometheus-data \
  csma-platform_redis-data 2>/dev/null || true
# Old infrastructure
docker volume rm -f infrastructure_librechat_logs \
  infrastructure_librechat_meilisearch_data \
  infrastructure_librechat_mongo_data \
  infrastructure_librechat_storage 2>/dev/null || true
echo "✅ Phase 4 Complete"
echo ""

echo "=== Cleanup Summary ==="
docker system df
echo ""
echo "✅ All cleanup complete!"
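Usage: Copy and paste the commands above into your terminal, or package them in an npm script in package.json (preferred over .sh files). When pasting into an interactive shell, leave out set -e; with it, the first failing command ends your terminal session.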
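One way to register a cleanup command as an npm script without editing package.json by hand is `npm pkg set` (available in npm 7.24 and later). The sketch below only covers the Phase 1 command. The script name docker-prune and the wrapper function are hypothetical, and it must be run in a directory that already contains a package.json.

```shell
# Sketch: add the Phase 1 prune command to package.json as an npm script.
# "docker-prune" is an illustrative script name, not one defined by the project.
add_prune_script() {
  npm pkg set "scripts.docker-prune=docker system prune -af --volumes"
}
```

After calling `add_prune_script`, the cleanup would be run with `npm run docker-prune`.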
VERIFICATION STEPS
1. Check What's Running Locally
docker ps
2. Check Disk Usage Before/After
docker system df
3. Verify Services on NAS (From Primary Mac)
# Check K3s pods
kubectl get pods -n development
kubectl get pods -n kagent
kubectl get pods -n monitoring
# Check NAS services
curl http://blueflynas.tailcf98b3.ts.net:3007/health
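A plain curl prints the response body but does not clearly signal failure when used in a cleanup sequence. The sketch below wraps the health check so it returns a non-zero status on HTTP errors or timeouts. The function name and the 5-second timeout are assumptions; the /health endpoint is the one used above.

```shell
# Sketch: report whether a health endpoint answers successfully.
# -f fails on HTTP errors, -sS stays quiet except for errors,
# --max-time bounds the wait.
check_health() {
  local url="$1"
  if curl -fsS --max-time 5 "$url" >/dev/null; then
    echo "OK   $url"
  else
    echo "FAIL $url"
    return 1
  fi
}
```

For example, `check_health http://blueflynas.tailcf98b3.ts.net:3007/health || echo "do not remove local agent-tracer yet"`.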
4. Verify LLMLingua on Vast.ai
ssh -p 26214 root@ssh9.vast.ai 'curl -s http://localhost:8000/health'
EXPECTED DISK RECOVERY
| Phase | Recovery | Safety |
|---|---|---|
| Phase 1: Unused Resources | 20-30 GB | ✅ Safe |
| Phase 2: Old Images | 10-15 GB | ✅ Safe (after NAS verification) |
| Phase 3: Runner Caches | 1-2 GB | ✅ Safe (if no active runners) |
| Phase 4: Old Volumes | 5-10 GB | ✅ Safe (after NAS verification) |
| TOTAL | 35-55 GB | - |
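The TOTAL row appears to be a rounded figure. Summing the low and high ends of the four per-phase estimates gives a slightly wider range, as this short calculation shows (the ranges are copied from the table above).

```shell
# Sum the low and high ends of the per-phase estimates from the table.
low=0
high=0
for range in 20-30 10-15 1-2 5-10; do
  low=$(( low + ${range%-*} ))    # part before the dash
  high=$(( high + ${range#*-} ))  # part after the dash
done
echo "${low}-${high} GB"   # prints "36-57 GB"
```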
TIMELINE
- Immediate (Safe): Run Phase 1 cleanup → ~20-30 GB recovery (5 minutes)
- After NAS Verification: Run Phase 2-4 cleanup → ~15-25 GB recovery (10 minutes)
- Total Active Work: 15-20 minutes
- Total Recovery: 35-55 GB
SUPPORT
Questions? Ask the primary Mac user (thomas.scola) to verify:
- Services running on NAS K3s cluster
- Services running on Vast.ai
- What's safe to remove locally
NAS Access:
- Host: blueflynas.tailcf98b3.ts.net
- User: bluefly
- Via Tailscale VPN
Vast.ai Access:
- Instance: 30386215
- SSH: ssh -p 26214 root@ssh9.vast.ai
SUMMARY
What Changed:
Git Infrastructure (2026-01-23):
- ✅ All 67 repositories → NAS bare repos (/Volumes/AgentPlatform/repos/bare/blueflyio/)
- ✅ All worktrees → NAS device namespaces (/Volumes/AgentPlatform/worktrees/[DEVICE]/)
- ✅ All wikis → NAS only (/Volumes/AgentPlatform/wikis/blueflyio/)
- ✅ Local worktrees removed (~/Sites/blueflyio/.worktrees/ - freed 500MB)
- ✅ Local wikis removed (~/Sites/blueflyio/_WIKI/ - freed 2.7GB)
- ✅ NAS-centralized worktree strategy APPROVED and DEPLOYED
Docker Services (2026-01-22):
- ✅ Agent-tracer → NAS K3s
- ✅ KAGENT Platform → NAS K3s
- ✅ Phoenix → NAS K3s
- ✅ LLMLingua → Vast.ai RTX 4090
What to Do on Other Mac:
- Phase 0: Mount NAS, create minimal local setup (REQUIRED FIRST)
- Phase 1: Run Docker cleanup immediately (safe, 20-30GB)
- Phase 2-4: Run Docker service cleanup after NAS verification (15-25GB)
- Total Recovery: 35-55 GB of disk space
What to Keep Locally:
- ~/Sites/blueflyio/CLAUDE.md (instructions only)
- ~/Sites/blueflyio/.claude/ (config)
- Docker: agent-router stack, agent-mesh stack, active DDEV stacks
- NO repos, NO worktrees, NO wikis - all on NAS
Benefits:
- Work from ANY device (Mac M3, M4, code-server, phone, iPad)
- Same files, same state, always in sync
- Infrastructure independence achieved
Generated: 2026-01-23
Status: ✅ MIGRATION COMPLETE - 100% NAS-Based Infrastructure
For: Complete infrastructure migration guide (git + Docker)
Reference: See /Volumes/AgentPlatform/wikis/blueflyio/technical-docs.wiki/getting-started/nas-setup-guide.md