Hosting: Oracle vs Vast.ai for LLM Inference and Training
Related: technical-docs#172
Summary
| Role | Oracle (tunnel connector) | Vast.ai | NAS |
|---|---|---|---|
| LLM inference | No (calls out to Vast.ai or APIs) | Yes (Ollama RTX 4090, vLLM) | No (backup only) |
| Model training | No | Yes (A100/H100) | No |
| Tunnel, MCP, GKG, mesh | Yes | No | Backup only |
| Always-on services | Primary | Elastic GPU | Backup |
Oracle
- Role: Tunnel connector; public ingress (Cloudflare Tunnel); MCP, GKG, mesh, LangFlow UI.
- No GPU. All LLM inference is done by calling Vast.ai (Ollama/vLLM) or hosted APIs (OpenAI, Anthropic, Kimi) via foundation-bridge/agent-router.
- Secrets: `/root/.env.local` on the tunnel host.
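Because the tunnel host has no GPU, every inference call leaves it as an HTTP request to a remote endpoint. The sketch below shows roughly what such requests look like for the two self-hosted backends: Ollama's native `POST /api/generate` endpoint and vLLM's OpenAI-compatible `POST /v1/chat/completions` endpoint. The host name, port, model name, and the helper names `buildOllamaRequest` and `buildVllmRequest` are illustrative assumptions; this is not the foundation-bridge code.

```typescript
// Sketch only: host, port and model names are placeholders, not values from this setup.

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Strip trailing slashes so joining paths never produces "//".
function trimBase(baseUrl: string): string {
  return baseUrl.replace(/\/+$/, "");
}

// Ollama's native generate endpoint: POST {base}/api/generate
function buildOllamaRequest(baseUrl: string, model: string, prompt: string): HttpRequest {
  return {
    url: `${trimBase(baseUrl)}/api/generate`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // stream: false asks Ollama for one JSON object instead of a stream of chunks
    body: JSON.stringify({ model, prompt, stream: false }),
  };
}

// vLLM's OpenAI-compatible server: POST {base}/v1/chat/completions
function buildVllmRequest(baseUrl: string, model: string, prompt: string): HttpRequest {
  return {
    url: `${trimBase(baseUrl)}/v1/chat/completions`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
  };
}

// Hypothetical usage from the tunnel host (placeholder endpoint and model):
const req = buildOllamaRequest("http://vast-gpu.example:11434/", "llama3", "ping");
// await fetch(req.url, { method: req.method, headers: req.headers, body: req.body });
```

Keeping request construction separate from the network call makes the shape of each request easy to check without a live GPU instance.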
Vast.ai
- Role: GPU compute for inference and training.
- Ollama inference: RTX 4090; owner: `@bluefly/foundation-bridge` (src/providers/ollama/).
- Embeddings: RTX 4090; owner: `@bluefly/agent-brain`.
- Model training: A100/H100; code in `models/` repo (training/).
- Auto-scaling: `@bluefly/agent-router` (src/scaling/vastai.ts).
- CI/CD: `gitlab_components/templates/vastai-deploy/`.
NAS
- Backup only; not primary for LLM inference or training.
- MinIO, PostgreSQL, Redis, A2A hub, code-server, etc.
Provider List (Routing)
- Self-hosted GPU: Ollama/vLLM on Vast.ai.
- Hosted APIs: OpenAI, Anthropic, Kimi (when integrated); called from the Oracle/tunnel host via foundation-bridge.
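One way to picture the two provider classes is as a typed list with a selection function. The sketch below is an assumption-laden illustration, not the agent-router implementation: the `Provider` type, the `available` health flag, the provider names, and the "prefer self-hosted GPU, fall back to hosted APIs" order are all hypothetical. The source lists the providers but does not state a routing priority.

```typescript
// Sketch only: the types, names and fallback order are assumptions for illustration.

type ProviderKind = "self-hosted-gpu" | "hosted-api";

interface Provider {
  name: string;
  kind: ProviderKind;
  available: boolean; // hypothetical health flag, e.g. set by a periodic probe
}

const providers: Provider[] = [
  { name: "ollama-vastai", kind: "self-hosted-gpu", available: true },
  { name: "vllm-vastai", kind: "self-hosted-gpu", available: true },
  { name: "openai", kind: "hosted-api", available: true },
  { name: "anthropic", kind: "hosted-api", available: true },
  // Kimi is left out here because the list marks it "when integrated".
];

// Return the first available self-hosted provider, else the first available
// hosted API, else undefined when nothing is reachable.
function pickProvider(list: Provider[]): Provider | undefined {
  const up = list.filter((p) => p.available);
  return (
    up.find((p) => p.kind === "self-hosted-gpu") ??
    up.find((p) => p.kind === "hosted-api")
  );
}

// Simulate all Vast.ai instances being down: selection falls through to a hosted API.
const gpuDown = providers.map((p) =>
  p.kind === "self-hosted-gpu" ? { ...p, available: false } : p
);
const fallback = pickProvider(gpuDown);
```

If a different priority applies (for example, hosted APIs first for certain models), only the order of the two `find` calls would change.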
References
- CLAUDE.md: Infrastructure, Vast.ai, Oracle, NAS layout.
- separation-of-duties: Infrastructure ownership.