Skip to main content

ai infrastructure services

AI Infrastructure Services

Owner: agent-docker (blueflyio/agent-platform/infra/agent-docker) Branch: release/v0.1.x Last updated: 2026-02-27

Overview

Six open-source AI infrastructure services deployed via Docker Compose on Oracle. All secrets in 1Password vault "AgentPlatform". No custom code — open source first.

Service Architecture

                    ┌──────────────┐
                    │  Open WebUI   │ :3095 → webui.blueflyagents.com
                    │  (Chat UI)    │
                    └──────┬───────┘
                           │ OpenAI-compatible API
                    ┌──────▼───────┐
                    │   LiteLLM    │ :4050 → litellm.bluefly.internal
                    │  (Gateway)   │────────────┐
                    └──┬───┬───┬──┘            │
          ┌────────────┘   │   └──────────┐    │ callbacks
   ┌──────▼──────┐  ┌─────▼─────┐  ┌─────▼────▼──┐
   │  Anthropic   │  │  OpenAI   │  │   Langfuse   │ :3150 → langfuse.blueflyagents.com
   │  (Claude)    │  │ (GPT-4)   │  │(Observability)│
   └─────────────┘  └───────────┘  └──────────────┘

   ┌──────────────┐  ┌──────────────┐
   │    vLLM      │  │   Ollama     │
   │ (GPU Infer.) │  │ (Local Dev)  │
   │ :8000        │  │ :11434       │
   └──────────────┘  └──────────────┘

   ┌──────────────┐
   │  MCP Gateway │ :8811 (internal)
   │ (MCP Servers)│
   └──────────────┘

Service Details

LiteLLM — Unified LLM Gateway

  • Image: ghcr.io/berriai/litellm:main-stable
  • Port: 4050 (host) → 4000 (container)
  • URL: litellm.bluefly.internal
  • Purpose: All LLM traffic routes through LiteLLM. Provides cost tracking, budgets, rate limits, load balancing, and an OpenAI-compatible API.
  • Database: Dedicated Postgres 16 (litellm-db)
  • Models configured: Claude Sonnet 4.5, Claude Haiku 4.5, local vLLM, Ollama
  • Callbacks: Langfuse (automatic cost/trace reporting)

Langfuse — LLM Observability

  • Image: langfuse/langfuse:latest
  • Port: 3150 (host) → 3000 (container)
  • URL: langfuse.blueflyagents.com
  • Purpose: LLM-specific observability — traces, metrics, cost tracking, prompt management, evals. Complements agent-tracer (which handles general agent spans).
  • Database: Dedicated Postgres 16 (langfuse-db)
  • Init: Auto-creates org "BlueFly", project "Agent Platform" on first boot

MCP Gateway — Docker MCP Orchestration

  • Image: Docker MCP Gateway
  • Port: 8811
  • URL: Internal only (no public subdomain)
  • Purpose: Runs MCP servers as isolated Docker containers. Currently configured: filesystem (read-only), git, fetch, postgres, github.
  • Config: mcp-gateway-config.yaml

Open WebUI — Chat Interface

  • Image: ghcr.io/open-webui/open-webui:main
  • Port: 3095 (host) → 8080 (container)
  • URL: webui.blueflyagents.com
  • Purpose: User-facing chat UI. Routes through LiteLLM for model access. Supports 250+ LLMs.
  • Auth: Enabled (WEBUI_SECRET_KEY)

vLLM — GPU Inference

  • Image: vllm/vllm-openai:latest
  • Port: 8000
  • URL: vllm.blueflyagents.com
  • Purpose: Production GPU inference for open models (Llama, Mistral, etc.). Requires NVIDIA GPU.
  • Default model: meta-llama/Llama-3.1-8B-Instruct
  • Deployment: Vast.ai or Oracle GPU (when available)

Ollama — Local Dev Inference

  • Image: ollama/ollama:latest
  • Port: 11434
  • Purpose: Local development inference. No secrets. No GPU required.

Port Allocation (Oracle)

PortServiceNotes
3005agent-meshExisting
3006agent-tracerExisting
3080LibreChatExisting
3095Open WebUINEW
3100LokiExisting
3150LangfuseNEW (was 3100, moved to avoid Loki conflict)
4000agent-routerExisting
4005agent-protocolExisting
4050LiteLLMNEW (was 4000, moved to avoid router conflict)
8000vLLMNEW (GPU required)
8811MCP GatewayNEW
11434OllamaNEW (local dev)

Cross-Project Integration

Priority 1 (CRITICAL): foundation-bridge → LiteLLM

Currently, individual services hold ANTHROPIC_API_KEY and OPENAI_API_KEY directly. LiteLLM becomes the single gateway.

What changes:

  • foundation-bridge routes through LITELLM_URL=http://litellm:4000
  • Remove direct API keys from: agent-router, agent-chat, agent-brain, agent-orchestrator
  • Add LITELLM_URL + LITELLM_API_KEY (set to LITELLM_MASTER_KEY) to those services

Benefits: Single API key management, cost tracking, rate limits, budget enforcement, load balancing.

Priority 2 (HIGH): agent-mesh → Register New Services

agent-mesh is the discovery backbone (DUADP protocol). New services must be discoverable.

What changes:

  • Register LiteLLM, Langfuse, MCP Gateway, Open WebUI in mesh service registry
  • Replace direct LLM endpoint env vars (VLLM_ENDPOINT, NEMOTRON_ENDPOINT, etc.) with LITELLM_URL

Priority 3 (HIGH): agent-tracer → Langfuse

agent-tracer handles general agent traces. Langfuse handles LLM-specific observability.

What changes:

  • Forward LLM-related spans from agent-tracer to Langfuse
  • Add LANGFUSE_HOST, LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY env vars to agent-tracer

Priority 4 (MEDIUM): agent-protocol → MCP Gateway

agent-protocol serves the platform's MCP endpoint. MCP Gateway manages external MCP tool servers.

What changes:

  • Add MCP_GATEWAY_URL env var to agent-protocol
  • Proxy tool discovery/calls to MCP Gateway for external tools

Priority 5 (MEDIUM): mcp_registry (Drupal) → MCP Gateway

Drupal's mcp_registry module manages MCP server entities. Should import from MCP Gateway.

What changes:

  • Add MCP Gateway as a discovery source in mcp_registry
  • Import servers from http://mcp-gateway:8811/catalog

Priority 6 (MEDIUM): agent-buildkit → Deploy Commands

CLI needs deploy targets for new services.

What changes:

  • Add deploy commands for litellm, langfuse, mcp-gateway, open-webui
  • Update coordination-state.json with new Oracle services

Priority 7 (LOW): Drupal ai_agents_client → LiteLLM

Uses drupal/ai provider system. Configuration-only change.

What changes:

  • Set drupal/ai default provider to LiteLLM endpoint at /admin/config/ai/settings
  • No source code changes needed

Priority 8 (LOW): Drupal ai_agents_ossa → Langfuse

Token cost tracking could pull data from Langfuse API.

What changes:

  • Optional Langfuse API client in UnifiedCostService
  • Nice-to-have for cross-platform cost aggregation

NAS Cleanup

NAS role is cold storage + admin only. These stale Dockge stacks should be cleaned up:

StackStatusAction
monitoringSuperseded by observabilityRemove
observabilityReferences invalid /Volumes/AgentPlatform-1/ pathFix or remove
agentsAll migrated to OracleRemove
servicesAll migrated to OracleRemove
infrastructurecloudflared stopped (Oracle is sole tunnel)Remove
datapostgres/redis/qdrant migratedKeep MinIO only

Active NAS services (keep):

  • MinIO (storage.blueflyagents.com)
  • Dockge (dockge.blueflyagents.com)
  • Verdaccio (npm.blueflyagents.com)
  • CouchDB/Obsidian LiveSync
  • Zotero WebDAV
  • code-server

Cloudflare Tunnel Routes (New)

Added to k8s/cloudflared-oracle/config-configmap.yaml:

- hostname: litellm.bluefly.internal service: http://localhost:4050 - hostname: langfuse.blueflyagents.com service: http://localhost:3150 - hostname: vllm.blueflyagents.com service: http://localhost:8000 - hostname: mcp-gateway.blueflyagents.com service: http://localhost:8811 - hostname: webui.blueflyagents.com service: http://localhost:3095

Files Reference

All in agent-docker repo on release/v0.1.x:

deployments/
├── .env.example                          # All required env vars
├── litellm/
│   ├── docker-compose.litellm.yml        # LiteLLM + Postgres
│   └── litellm-config.yaml               # Model routing config
├── langfuse/
│   └── docker-compose.langfuse.yml       # Langfuse + Postgres
├── mcp-gateway/
│   ├── docker-compose.mcp-gateway.yml    # MCP Gateway
│   └── mcp-gateway-config.yaml           # MCP server definitions
├── open-webui/
│   └── docker-compose.open-webui.yml     # Open WebUI
├── vllm/
│   └── docker-compose.vllm.yml           # vLLM (GPU)
└── ollama/
    └── docker-compose.ollama.yml         # Ollama (local dev)

k8s/
├── litellm/deployment.yaml               # K8s LiteLLM
├── langfuse/deployment.yaml              # K8s Langfuse
├── vllm/deployment.yaml                  # K8s vLLM
├── kyverno/cluster-policies.yaml         # Security policies
├── kubeai/values.yaml                    # KubeAI Helm values
└── cloudflared-oracle/
    └── config-configmap.yaml             # Tunnel routes