Vast.ai GPU Cluster Status
AUTHORITATIVE SOURCE: BULLETPROOF_VASTAI_PLAN.md
Complete Implementation Plan: See BULLETPROOF_VASTAI_PLAN.md for full details including Cloudflare Tunnel + Tailscale integration, agent-docker service, and CI/CD components.
Last Updated: 2026-01-04
Active Instances
| Instance ID | GPU | IP (Tailscale) | Hostname | Cost/hr | Status |
|---|---|---|---|---|---|
| 29484611 | RTX 4090 (24GB) | 100.113.211.78 | vastai-gpu-worker-1 | $0.25 | Running |
Service Discovery Registry
API Endpoint: https://mesh.bluefly.internal/api/v1/vastai/registry
Query active instances:
curl "https://mesh.bluefly.internal/api/v1/vastai/registry?environment=prod"
Register instance:
curl -X POST https://mesh.bluefly.internal/api/v1/vastai/registry/register \
  -H "Content-Type: application/json" \
  -d @instance-payload.json
OpenAPI Spec: See common_npm/agent-mesh/openapi/vastai-registry.openapi.yml
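The register call above reads its body from instance-payload.json, whose schema is not shown here. As a rough sketch only, the helper below prints a JSON payload from the values in the Active Instances table. The function name build_instance_payload and every field name (instance_id, hostname, tailscale_ip, gpu) are hypothetical placeholders; the OpenAPI spec remains the authority on the real field names.

```shell
# Print a registration payload as JSON on stdout.
# ASSUMPTION: all field names below are hypothetical placeholders; the real
# schema is defined in vastai-registry.openapi.yml.
# Arguments: instance ID, hostname, Tailscale IP, GPU description.
# No JSON escaping is done, so arguments must not contain quotes or backslashes.
build_instance_payload() {
  printf '{"instance_id": %s, "hostname": "%s", "tailscale_ip": "%s", "gpu": "%s"}\n' \
    "$1" "$2" "$3" "$4"
}
```

For example, `build_instance_payload 29484611 vastai-gpu-worker-1 100.113.211.78 "RTX 4090 (24GB)" > instance-payload.json` would write the file that the register command posts.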
Network Configuration
- Tailscale Mesh: bluefly tailnet
- SSH Access: ssh root@100.113.211.78 (via Tailscale)
- Public SSH: ssh -p 14610 root@ssh5.vast.ai
- Registry API: https://mesh.bluefly.internal/api/v1/vastai/registry
Environment Variables Required
# Vast.ai API tokens (set in CI/CD or .env)
VASTAI_CLUSTER_OP_KEY=       # Instance management
VASTAI_COST_MONITOR_KEY=     # Billing/cost access
VASTAI_TASK_DISPATCH_KEY=    # Task coordination

# Tailscale (optional - for automated joining)
TAILSCALE_AUTHKEY=           # Pre-auth key for mesh join
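A script that depends on these variables may want to fail early when one is missing. The following is a minimal sketch of such a check, not part of the plan itself: the function name missing_vastai_env is made up for illustration, and TAILSCALE_AUTHKEY is skipped because it is marked optional above.

```shell
# Report every required Vast.ai variable that is unset or empty.
# Prints "missing: NAME ..." and returns 1 if any are absent; returns 0 otherwise.
# missing_vastai_env is a hypothetical helper name, not an existing tool.
missing_vastai_env() {
  missing=""
  for name in VASTAI_CLUSTER_OP_KEY VASTAI_COST_MONITOR_KEY VASTAI_TASK_DISPATCH_KEY; do
    # Indirect lookup in POSIX sh: read the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  return 0
}
```

Calling `missing_vastai_env || exit 1` at the top of a CI job would stop the job before any Vast.ai API call is attempted with an empty key.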
Quick Commands
# List instances via registry API
curl https://mesh.bluefly.internal/api/v1/vastai/registry

# SSH via Tailscale (preferred)
ssh root@100.113.211.78

# SSH via public proxy
ssh -p 14610 root@ssh5.vast.ai

# Check GPU status
ssh root@100.113.211.78 'nvidia-smi'

# Heartbeat (keeps instance in registry)
curl -X POST https://mesh.bluefly.internal/api/v1/vastai/registry/29484611/heartbeat
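Because the heartbeat keeps an instance in the registry, it presumably has to be repeated. The sketch below wraps the heartbeat call in a loop. The function names, the REGISTRY_URL override, and the 60-second default interval are all assumptions for illustration; this document does not state the interval the registry actually expects.

```shell
# Base URL of the registry API; can be overridden from the environment.
REGISTRY_URL="${REGISTRY_URL:-https://mesh.bluefly.internal/api/v1/vastai/registry}"

# Build the heartbeat URL for one instance ID.
heartbeat_url() {
  echo "$REGISTRY_URL/$1/heartbeat"
}

# Send a single heartbeat. -f makes curl exit non-zero on HTTP 4xx/5xx,
# and --max-time bounds each request to 10 seconds.
send_heartbeat() {
  curl -fsS -X POST --max-time 10 "$(heartbeat_url "$1")" >/dev/null
}

# Send heartbeats until interrupted, sleeping between attempts.
# ASSUMPTION: the 60-second default is a guess, not a documented value.
heartbeat_loop() {
  instance_id="$1"
  interval="${2:-60}"
  while true; do
    send_heartbeat "$instance_id" || echo "heartbeat failed for $instance_id" >&2
    sleep "$interval"
  done
}
```

Running `heartbeat_loop 29484611 &` on the worker would keep posting in the background. A failed request is logged to stderr and retried on the next tick rather than ending the loop.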
Related OSSA Agents
- cluster-operator - Instance lifecycle management
- cost-intelligence-monitor - Cost tracking and optimization
- task-dispatcher - Workload distribution
Event Types
All instances emit canonical events (see agent-router/src/infrastructure/deployment/vastai/events.ts):
- vastai.instance.created
- vastai.instance.ready
- vastai.instance.terminated
- vastai.mesh.registered
- vastai.mesh.heartbeat
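Scripts that consume these events may want to reject names outside this set. The sketch below is one way to do that in shell. The function name is_vastai_event is hypothetical, and the accepted names are copied from the list above rather than read from events.ts, so it would need updating if that file changes.

```shell
# Return 0 if the argument is one of the five canonical event names, 1 otherwise.
# is_vastai_event is a hypothetical helper; the names mirror this document's list.
is_vastai_event() {
  case "$1" in
    vastai.instance.created | vastai.instance.ready | vastai.instance.terminated) return 0 ;;
    vastai.mesh.registered | vastai.mesh.heartbeat) return 0 ;;
    *) return 1 ;;
  esac
}
```

A consumer could guard its handler with `is_vastai_event "$event" || echo "unknown event: $event" >&2`.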