ai infrastructure services
AI Infrastructure Services
Owner: agent-docker (
blueflyio/agent-platform/infra/agent-docker) Branch:release/v0.1.xLast updated: 2026-02-27
Overview
Six open-source AI infrastructure services deployed via Docker Compose on Oracle. All secrets in 1Password vault "AgentPlatform". No custom code — open source first.
Service Architecture
┌──────────────┐
│ Open WebUI │ :3095 → webui.blueflyagents.com
│ (Chat UI) │
└──────┬───────┘
│ OpenAI-compatible API
┌──────▼───────┐
│ LiteLLM │ :4050 → litellm.bluefly.internal
│ (Gateway) │────────────┐
└──┬───┬───┬──┘ │
┌────────────┘ │ └──────────┐ │ callbacks
┌──────▼──────┐ ┌─────▼─────┐ ┌─────▼────▼──┐
│ Anthropic │ │ OpenAI │ │ Langfuse │ :3150 → langfuse.blueflyagents.com
│ (Claude) │ │ (GPT-4) │ │(Observability)│
└─────────────┘ └───────────┘ └──────────────┘
┌──────────────┐ ┌──────────────┐
│ vLLM │ │ Ollama │
│ (GPU Infer.) │ │ (Local Dev) │
│ :8000 │ │ :11434 │
└──────────────┘ └──────────────┘
┌──────────────┐
│ MCP Gateway │ :8811 (internal)
│ (MCP Servers)│
└──────────────┘
Service Details
LiteLLM — Unified LLM Gateway
- Image:
ghcr.io/berriai/litellm:main-stable - Port: 4050 (host) → 4000 (container)
- URL:
litellm.bluefly.internal - Purpose: All LLM traffic routes through LiteLLM. Provides cost tracking, budgets, rate limits, load balancing, and an OpenAI-compatible API.
- Database: Dedicated Postgres 16 (litellm-db)
- Models configured: Claude Sonnet 4.5, Claude Haiku 4.5, local vLLM, Ollama
- Callbacks: Langfuse (automatic cost/trace reporting)
Langfuse — LLM Observability
- Image:
langfuse/langfuse:latest - Port: 3150 (host) → 3000 (container)
- URL:
langfuse.blueflyagents.com - Purpose: LLM-specific observability — traces, metrics, cost tracking, prompt management, evals. Complements agent-tracer (which handles general agent spans).
- Database: Dedicated Postgres 16 (langfuse-db)
- Init: Auto-creates org "BlueFly", project "Agent Platform" on first boot
MCP Gateway — Docker MCP Orchestration
- Image: Docker MCP Gateway
- Port: 8811
- URL: Internal only (no public subdomain)
- Purpose: Runs MCP servers as isolated Docker containers. Currently configured: filesystem (read-only), git, fetch, postgres, github.
- Config:
mcp-gateway-config.yaml
Open WebUI — Chat Interface
- Image:
ghcr.io/open-webui/open-webui:main - Port: 3095 (host) → 8080 (container)
- URL:
webui.blueflyagents.com - Purpose: User-facing chat UI. Routes through LiteLLM for model access. Supports 250+ LLMs.
- Auth: Enabled (
WEBUI_SECRET_KEY)
vLLM — GPU Inference
- Image:
vllm/vllm-openai:latest - Port: 8000
- URL:
vllm.blueflyagents.com - Purpose: Production GPU inference for open models (Llama, Mistral, etc.). Requires NVIDIA GPU.
- Default model:
meta-llama/Llama-3.1-8B-Instruct - Deployment: Vast.ai or Oracle GPU (when available)
Ollama — Local Dev Inference
- Image:
ollama/ollama:latest - Port: 11434
- Purpose: Local development inference. No secrets. No GPU required.
Port Allocation (Oracle)
| Port | Service | Notes |
|---|---|---|
| 3005 | agent-mesh | Existing |
| 3006 | agent-tracer | Existing |
| 3080 | LibreChat | Existing |
| 3095 | Open WebUI | NEW |
| 3100 | Loki | Existing |
| 3150 | Langfuse | NEW (was 3100, moved to avoid Loki conflict) |
| 4000 | agent-router | Existing |
| 4005 | agent-protocol | Existing |
| 4050 | LiteLLM | NEW (was 4000, moved to avoid router conflict) |
| 8000 | vLLM | NEW (GPU required) |
| 8811 | MCP Gateway | NEW |
| 11434 | Ollama | NEW (local dev) |
Cross-Project Integration
Priority 1 (CRITICAL): foundation-bridge → LiteLLM
Currently, individual services hold ANTHROPIC_API_KEY and OPENAI_API_KEY directly. LiteLLM becomes the single gateway.
What changes:
- foundation-bridge routes through
LITELLM_URL=http://litellm:4000 - Remove direct API keys from: agent-router, agent-chat, agent-brain, agent-orchestrator
- Add
LITELLM_URL+LITELLM_API_KEY(set toLITELLM_MASTER_KEY) to those services
Benefits: Single API key management, cost tracking, rate limits, budget enforcement, load balancing.
Priority 2 (HIGH): agent-mesh → Register New Services
agent-mesh is the discovery backbone (DUADP protocol). New services must be discoverable.
What changes:
- Register LiteLLM, Langfuse, MCP Gateway, Open WebUI in mesh service registry
- Replace direct LLM endpoint env vars (
VLLM_ENDPOINT,NEMOTRON_ENDPOINT, etc.) withLITELLM_URL
Priority 3 (HIGH): agent-tracer → Langfuse
agent-tracer handles general agent traces. Langfuse handles LLM-specific observability.
What changes:
- Forward LLM-related spans from agent-tracer to Langfuse
- Add
LANGFUSE_HOST,LANGFUSE_PUBLIC_KEY,LANGFUSE_SECRET_KEYenv vars to agent-tracer
Priority 4 (MEDIUM): agent-protocol → MCP Gateway
agent-protocol serves the platform's MCP endpoint. MCP Gateway manages external MCP tool servers.
What changes:
- Add
MCP_GATEWAY_URLenv var to agent-protocol - Proxy tool discovery/calls to MCP Gateway for external tools
Priority 5 (MEDIUM): mcp_registry (Drupal) → MCP Gateway
Drupal's mcp_registry module manages MCP server entities. Should import from MCP Gateway.
What changes:
- Add MCP Gateway as a discovery source in mcp_registry
- Import servers from
http://mcp-gateway:8811/catalog
Priority 6 (MEDIUM): agent-buildkit → Deploy Commands
CLI needs deploy targets for new services.
What changes:
- Add deploy commands for litellm, langfuse, mcp-gateway, open-webui
- Update
coordination-state.jsonwith new Oracle services
Priority 7 (LOW): Drupal ai_agents_client → LiteLLM
Uses drupal/ai provider system. Configuration-only change.
What changes:
- Set
drupal/aidefault provider to LiteLLM endpoint at/admin/config/ai/settings - No source code changes needed
Priority 8 (LOW): Drupal ai_agents_ossa → Langfuse
Token cost tracking could pull data from Langfuse API.
What changes:
- Optional Langfuse API client in
UnifiedCostService - Nice-to-have for cross-platform cost aggregation
NAS Cleanup
NAS role is cold storage + admin only. These stale Dockge stacks should be cleaned up:
| Stack | Status | Action |
|---|---|---|
monitoring | Superseded by observability | Remove |
observability | References invalid /Volumes/AgentPlatform-1/ path | Fix or remove |
agents | All migrated to Oracle | Remove |
services | All migrated to Oracle | Remove |
infrastructure | cloudflared stopped (Oracle is sole tunnel) | Remove |
data | postgres/redis/qdrant migrated | Keep MinIO only |
Active NAS services (keep):
- MinIO (storage.blueflyagents.com)
- Dockge (dockge.blueflyagents.com)
- Verdaccio (npm.blueflyagents.com)
- CouchDB/Obsidian LiveSync
- Zotero WebDAV
- code-server
Cloudflare Tunnel Routes (New)
Added to k8s/cloudflared-oracle/config-configmap.yaml:
- hostname: litellm.bluefly.internal service: http://localhost:4050 - hostname: langfuse.blueflyagents.com service: http://localhost:3150 - hostname: vllm.blueflyagents.com service: http://localhost:8000 - hostname: mcp-gateway.blueflyagents.com service: http://localhost:8811 - hostname: webui.blueflyagents.com service: http://localhost:3095
Files Reference
All in agent-docker repo on release/v0.1.x:
deployments/
├── .env.example # All required env vars
├── litellm/
│ ├── docker-compose.litellm.yml # LiteLLM + Postgres
│ └── litellm-config.yaml # Model routing config
├── langfuse/
│ └── docker-compose.langfuse.yml # Langfuse + Postgres
├── mcp-gateway/
│ ├── docker-compose.mcp-gateway.yml # MCP Gateway
│ └── mcp-gateway-config.yaml # MCP server definitions
├── open-webui/
│ └── docker-compose.open-webui.yml # Open WebUI
├── vllm/
│ └── docker-compose.vllm.yml # vLLM (GPU)
└── ollama/
└── docker-compose.ollama.yml # Ollama (local dev)
k8s/
├── litellm/deployment.yaml # K8s LiteLLM
├── langfuse/deployment.yaml # K8s Langfuse
├── vllm/deployment.yaml # K8s vLLM
├── kyverno/cluster-policies.yaml # Security policies
├── kubeai/values.yaml # KubeAI Helm values
└── cloudflared-oracle/
└── config-configmap.yaml # Tunnel routes