Infrastructure Reference (for AI Assistants)

Moved from CLAUDE.md to reduce context size. Consult when working on infrastructure, deployment, or ops tasks.

OWNERSHIP: See architecture/separation-of-duties.md for strict infrastructure ownership rules.

DIRECT GITLAB WORKFLOW: All repositories push directly to GitLab. No NAS bare repo sync required.

Resource Allocation

| Resource | Role | Use for | Do NOT use for |
|---|---|---|---|
| Oracle Cloud | Primary tunnel + agent runtime + runners | Public ingress (cloudflared), MCP, GKG, mesh, agent-router, tracer, workflow-engine, Langflow, LibreChat, kagent (k3s), all *.blueflyagents.com. GitLab runners run on Oracle. Secrets in /root/.env.local. | Long-term storage (use NAS). |
| Synology NAS | Config master + always-on storage + backup | Config, data, code-server (browser IDE), wikis, Research. MinIO, Postgres, Redis if run on NAS. Backup MCP/GKG when Oracle down. | Primary agent runtime (Oracle is primary). |
| Mac M4 | Operator console + primary dev | Worktrees, Cursor, BuildKit CLI, __BARE_REPOS, TESTING_DEMOS, WORKING_DEMOs. Push to GitLab; no hosted services. | Running MCP, mesh, GKG, or any always-on service. |
| OrbStack (orb.local) | Local dev + testing | DDEV, local Drupal sites, testing. | Production. |
| GitLab Ultimate + Agent | Source of truth + CI + AI | Repos, registry (@bluefly/*), issues, wiki, MRs. GitLab Agent (agentk) on Oracle k3s. Runners on Oracle. | Storing runtime config (use NAS). |
| Vast.ai | Essentials-only GPU (costs money) | Only when Oracle cannot satisfy workload (model training, burst embeddings). Prefer Oracle for all other compute. | Always-on services; storage; CI. |

Resource plan (three tiers, plus Vast.ai on demand):

  1. Oracle = production. All compute, tunnel, and platform services. Oracle is free, so it is the default.
  2. NAS = local storage and local dev server. Config master, long-term storage, code-server, wikis, Research.
  3. DDEV + Drupal = local only (Mac). Local Drupal dev and testing.
  4. Vast.ai = only when needed. GPU workloads Oracle cannot run; costs money.
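
The plan above reads as a fallback rule: default to Oracle, send storage to the NAS, keep Drupal dev local, and reach for Vast.ai only when Oracle cannot run the job. A minimal sketch of that rule follows. The `Workload` shape and the `placeWorkload` name are hypothetical, made up for illustration, and not part of any @bluefly package.

```typescript
// Hypothetical types for illustration only; not taken from any @bluefly package.
type Tier = "oracle" | "nas" | "local-ddev" | "vastai";

interface Workload {
  kind: "compute" | "storage" | "drupal-dev";
  // Assumption: true means Oracle cannot satisfy the job (e.g. model training).
  needsGpu?: boolean;
}

// Apply the tier order: storage -> NAS, Drupal dev -> local DDEV,
// compute -> Oracle unless it needs a GPU Oracle cannot provide.
function placeWorkload(w: Workload): Tier {
  if (w.kind === "storage") return "nas";
  if (w.kind === "drupal-dev") return "local-ddev";
  return w.needsGpu ? "vastai" : "oracle";
}
```

For example, `placeWorkload({ kind: "compute" })` falls through to Oracle, which matches the "Oracle is free" default.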

Infrastructure Decision Matrix

| Need | Use | Owner Package | Code Location |
|---|---|---|---|
| Primary agent runtime + tunnel | Oracle Cloud | agent-docker, platform-agents | k8s/cloudflared-oracle/ |
| Config + storage + backup | Synology NAS | @bluefly/agent-docker | config/, data/, ServiceApps |
| GPU compute (costs $) | Vast.ai | @bluefly/agent-router | src/scaling/vastai.ts |
| CI/CD + registry + Duo | GitLab Ultimate | gitlab_components, platform-agents | Runners on Oracle |
| Private network | Tailscale | @bluefly/agent-tailscale | ALL Tailscale code |
| Public ingress | Cloudflare Tunnel | gitlab_components | Tunnel on Oracle |
| Service discovery | Agent Mesh | @bluefly/agent-mesh | Oracle (primary); NAS backup |

Synology NAS (config + storage + backup)

What runs here: Config master, data/, code-server, MinIO/Postgres/Redis if deployed; backup for MCP/GKG when Oracle down.

| Service (if on NAS) | Port | Owner | Deploy Location |
|---|---|---|---|
| code-server | 8080 | ServiceApps | applications/ServiceApps/code-server |
| MinIO S3 | 9000 | Infrastructure | NAS native or Container Manager |
| PostgreSQL | 5432 | Infrastructure | NAS Container Manager |
| Redis | 6379 | Infrastructure | NAS Container Manager |
| Backup MCP/GKG | 27495, etc. | When Oracle down | Container Manager |

Vast.ai GPU Cloud (essentials only; costs money)

Policy: Oracle is free; Vast.ai costs money. Use only for essentials.

| Workload | GPU Type | Owner Package | Code Location |
|---|---|---|---|
| Nemotron 3 Nano (vLLM) | H100/GPU | @bluefly/agent-docker | src/services/vastai-deployment.service.ts |
| Ollama inference | RTX 4090 | @bluefly/foundation-bridge | src/providers/ollama/ |
| Embeddings | RTX 4090 | @bluefly/agent-brain | src/embeddings/ |
| Model training | A100/H100 | | models/training/ |
| Auto-scaling logic | - | @bluefly/agent-router | src/scaling/vastai.ts |

Tailscale Mesh Network

CRITICAL: ALL Tailscale code MUST be in @bluefly/agent-tailscale

| Capability | Code Location |
|---|---|
| Subnet routing | @bluefly/agent-tailscale/subnet |
| DNS management | @bluefly/agent-tailscale/dns |
| SSH access | @bluefly/agent-tailscale/ssh |
| Certificates | @bluefly/agent-tailscale/certs |
| Webhooks | @bluefly/agent-tailscale/webhooks |
| Device management | @bluefly/agent-tailscale/devices |

Cloudflare Integration

Uses Cloudflare Tunnel (cloudflared daemon), NOT Cloudflare WARP.

| Component | Owner | Location |
|---|---|---|
| Tunnel config | Infrastructure | ~/.cloudflared/config.yml |
| Tunnel daemon | Infrastructure | Oracle |
| DNS records | Infrastructure | Cloudflare dashboard |
| Webhook routing | gitlab_components | CI/CD templates |
| WAF rules | security-policies | Policy definitions |

Public Endpoints (Always-On via Cloudflare Tunnel)

All targets run on Oracle or the NAS, not on your own computer. Updated 2026-02-23.

| Hostname | Origin | Service |
|---|---|---|
| mesh.bluefly.internal | oracle:3005 | Agent mesh API |
| nas.blueflyagents.com | blueflynas:5001 (HTTPS) | DSM Web UI |
| storage.blueflyagents.com | blueflynas:9000 | MinIO S3 |
| npm.blueflyagents.com | blueflynas:4873 | npm registry |
| kagent.blueflyagents.com | oracle:30083 | K-agent API |
| kagent-ui.blueflyagents.com | oracle:30080 | K-agent UI |
| mcpdash.blueflyagents.com | oracle:3003 | MCP dashboard |
| router.bluefly.internal | oracle:4000 | Agent router |
| agents.blueflyagents.com | oracle:3001 | Agents API |
| studio.blueflyagents.com | oracle:3012 | Studio UI |
| tracer.bluefly.internal | oracle:3006 | Agent tracer |
| mcp.blueflyagents.com | oracle:4005 | MCP / agent-protocol |
| brain.bluefly.internal | oracle:6333 | Agent brain (Qdrant) |
| compliance.bluefly.internal | oracle:3010 | Compliance Engine API |
| workflow.bluefly.internal | oracle:3015 | Workflow engine |
| devops.blueflyagents.com | oracle:3011 | DevOps |
| a2a-collector.blueflyagents.com | oracle:9004 | A2A log collector |
| a2a-stream.blueflyagents.com | oracle:9005 | A2A stream |
| adash.blueflyagents.com | oracle:3013 | Agent dashboard |
| dockge.blueflyagents.com | blueflynas:9010 | Dockge (container mgmt) |
| obsidian.blueflyagents.com | blueflynas:5984 | Obsidian |
| gkg.bluefly.internal | oracle:27495 | GKG (Knowledge Graph) |
| ecma-agent.blueflyagents.com | oracle:3016 | ECMA Agent |
| content-guardian.blueflyagents.com | oracle:4010 | Content Guardian |
| intel.blueflyagents.com | oracle:9006 | Intel |
| langflow.blueflyagents.com | oracle:7860 | LangFlow UI |
| dragonfly.blueflyagents.com | | Dragonfly |
| grafana.blueflyagents.com | oracle:30300 | Grafana |
| n8n.blueflyagents.com | oracle:5678 | n8n (workflow automation) |
| zotero.blueflyagents.com | blueflynas:5006 (HTTPS) | Zotero |
| infra.blueflyagents.com | oracle:3030 | Infrastructure |
| chat.blueflyagents.com | oracle:3080 | Chat (LibreChat) |
| code.blueflyagents.com | blueflynas:8080 | code-server (remote IDE) |
| flowise.blueflyagents.com | nas-platform:3100 | Flowise |
| happy.blueflyagents.com | oracle:3045 | Happy |
| orchestrator.blueflyagents.com | oracle:3014 | Orchestrator |
| api.blueflyagents.com | oracle:3085 | API gateway |
| ossa-ui.blueflyagents.com | oracle:3456 | OSSA UI |
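
The origin strings listed above follow a `host:port` pattern with an optional `(HTTPS)` suffix. A rough sketch of how a script might parse them follows. `parseOrigin`, `originFor`, and the `Origin` type are hypothetical helpers, and only three rows are copied in as sample data.

```typescript
// Hypothetical helper types; not part of any existing package.
interface Origin {
  host: string;
  port: number;
  https: boolean;
}

// Parse origin strings such as "oracle:3005" or "blueflynas:5001 (HTTPS)".
function parseOrigin(origin: string): Origin {
  const match = /^([a-z0-9-]+):(\d+)(?:\s+\((HTTPS)\))?$/i.exec(origin.trim());
  if (!match) throw new Error(`Unrecognized origin: ${origin}`);
  const [, host = "", port = "", tls] = match;
  return { host, port: Number(port), https: tls !== undefined };
}

// Sample rows copied from the endpoint list; extend as needed.
const endpoints: Record<string, string> = {
  "mcp.blueflyagents.com": "oracle:4005",
  "nas.blueflyagents.com": "blueflynas:5001 (HTTPS)",
  "storage.blueflyagents.com": "blueflynas:9000",
};

// Look up a public hostname; returns undefined when it is not in the map.
function originFor(hostname: string): Origin | undefined {
  const raw = endpoints[hostname];
  return raw === undefined ? undefined : parseOrigin(raw);
}
```

Rows with no origin recorded (dragonfly.blueflyagents.com) are best left out of such a map rather than guessed.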

Private Access (Tailscale)

| Device | Tailscale hostname | IPv4 | Role |
|---|---|---|---|
| mac-m4 | mac-m4.tailcf98b3.ts.net | 100.108.129.7 | Operator console |
| blueflynas | blueflynas.tailcf98b3.ts.net | 100.104.119.76 | Synology NAS |
| iphone-t | iphone-t.tailcf98b3.ts.net | 100.67.125.25 | Phone |
| oracle-platform | oracle-platform.tailcf98b3.ts.net | 100.103.48.75 | Primary tunnel; GitLab runners |
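
Every Tailscale hostname above is the device name plus the tailnet suffix, so private URLs can be derived rather than hard-coded. A small sketch follows. `tailscaleHost` and `tailscaleUrl` are hypothetical names, and plain `http` is assumed because the Tailscale URLs elsewhere in this document use it.

```typescript
// Tailnet suffix and device names as listed in the device table.
const TAILNET = "tailcf98b3.ts.net";
const devices = ["mac-m4", "blueflynas", "iphone-t", "oracle-platform"] as const;
type Device = (typeof devices)[number];

// Build the fully qualified Tailscale hostname for a device.
function tailscaleHost(device: Device): string {
  return `${device}.${TAILNET}`;
}

// Build a private URL for a service port on a device (http assumed).
function tailscaleUrl(device: Device, port: number, path = ""): string {
  return `http://${tailscaleHost(device)}:${port}${path}`;
}
```

Typing `Device` as a union of the four names means a typo such as `"oracle"` fails at compile time instead of producing a dead hostname.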

MCP service (one URL for all clients)

  • Public: https://mcp.blueflyagents.com/api/mcp/sse
  • Tailscale Oracle (primary): http://oracle-platform.tailcf98b3.ts.net:4005/api/mcp/sse
  • Tailscale NAS (backup only): http://blueflynas.tailcf98b3.ts.net:27495/mcp/sse
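
A client that should fall back from the public URL to Tailscale and then to the NAS backup could pick its endpoint as sketched below. The preference order is assumed to match the order of the list above. `pickMcpUrl` is a hypothetical helper, and the caller supplies the health check (for example, results cached from an earlier probe).

```typescript
// The three MCP endpoints from the list above, in assumed preference order.
const MCP_URLS: readonly string[] = [
  "https://mcp.blueflyagents.com/api/mcp/sse",
  "http://oracle-platform.tailcf98b3.ts.net:4005/api/mcp/sse",
  "http://blueflynas.tailcf98b3.ts.net:27495/mcp/sse",
];

// Return the first URL the caller-supplied check reports as up,
// or undefined when none of them are reachable.
function pickMcpUrl(
  isUp: (url: string) => boolean,
  urls: readonly string[] = MCP_URLS,
): string | undefined {
  return urls.find((url) => isUp(url));
}
```

Keeping the probe outside the function leaves the selection logic free of network code, so it can be tested without touching any host.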

Config: NAS master and user ~/.agent-platform

  • Master config (NAS): Mac: /Volumes/AgentPlatform/config; NAS SSH: /volume1/AgentPlatform/config. Read config/AGENTS.md first.
  • NAS layout: Root: /Volumes/AgentPlatform/ (Mac) or /volume1/AgentPlatform/ (NAS SSH). Dirs: services/, applications/ServiceApps/, config/, data/. No repos on NAS. Config: config.json, workspace.json, coordination-state.json, nas-infrastructure-reference.json. Secrets: config/tokens/, config/ssl/, config/.ssh/, or .env.local.
  • Per-user config: ~/.agent-platform/ with one folder per project. After npm i -g @bluefly/agent-buildkit, run buildkit setup once.
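
The same config tree appears under two roots depending on where a script runs. The sketch below resolves a config file path for either vantage point. The `Vantage` type and `configPath` function are hypothetical names used only for illustration.

```typescript
// Where the script is running: on the Mac (mounted volume) or on the NAS over SSH.
type Vantage = "mac" | "nas-ssh";

// Roots as documented in the NAS layout above.
const ROOTS: Record<Vantage, string> = {
  mac: "/Volumes/AgentPlatform",
  "nas-ssh": "/volume1/AgentPlatform",
};

// Resolve a file under config/ for the given vantage point.
function configPath(vantage: Vantage, file: string): string {
  return `${ROOTS[vantage]}/config/${file}`;
}
```

For instance, `configPath("nas-ssh", "workspace.json")` yields `/volume1/AgentPlatform/config/workspace.json`.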

GKG (Knowledge Graph API + MCP SSE)

  • Public: https://gkg.bluefly.internal
  • Tailscale Oracle: http://oracle-platform.tailcf98b3.ts.net:27495
  • NAS (backup): http://blueflynas.tailcf98b3.ts.net:27495

Oracle Platform

  • Tailscale: oracle-platform.tailcf98b3.ts.net. SSH: flux423@oracle-platform.tailcf98b3.ts.net.
  • Secrets: /root/.env.local is canonical. Docker: env_file: /root/.env.local. systemd: EnvironmentFile=/root/.env.local.
  • A2A collector + OTLP bridge: Set OTEL_EXPORTER_OTLP_ENDPOINT to GitLab OTLP URL.

Code from the road (phone + Termius + code-server)

  • Remote IDE: https://code.blueflyagents.com (code-server on NAS).
  • Termius: Add hosts: mac-m4, blueflynas, oracle-platform (all .tailcf98b3.ts.net).
  • Mosh: mosh flux423@mac-m4.tailcf98b3.ts.net -- "tmux attach -t ops"
  • Flow: (1) Browser -> code-server -> code. (2) Termius -> SSH to oracle -> deploy. (3) Mosh to mac-m4 -> tmux ops.

Distributed ops console (mobile + A2A)

  • Goal: Persistent terminal (tmux + mosh) from phone; watch agents (A2A) in real time.
  • One command from phone: mosh flux423@mac-m4.tailcf98b3.ts.net -- "tmux attach -t ops || ide-remote ops"
  • Ops env: Mac /Volumes/AgentPlatform/.env.local. Keys: A2A_STREAM_URL, A2A_URL, MCP_BASE_URL, MCP_URL, MCP_TOKEN, OPS_CWD.
  • A2A stream: Tailscale http://blueflynas.tailcf98b3.ts.net:9001/a2a/stream. Public https://dashboard.mcp.blueflyagents.com/a2a/stream.
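
Before attaching to the ops session it can help to confirm that the ops env has every key listed above. A minimal check is sketched below. It assumes all six keys are required, which the source does not state, and `missingOpsKeys` is a hypothetical helper.

```typescript
// Keys named in the ops env bullet above.
const OPS_KEYS = [
  "A2A_STREAM_URL",
  "A2A_URL",
  "MCP_BASE_URL",
  "MCP_URL",
  "MCP_TOKEN",
  "OPS_CWD",
] as const;

// Return the keys that are absent or empty in the given environment map.
// Assumption: every key is required.
function missingOpsKeys(env: Record<string, string | undefined>): string[] {
  return OPS_KEYS.filter((key) => !env[key]);
}
```

In a Node script the argument would typically be `process.env` after the .env.local file has been loaded; an empty result means the console is ready to start.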

Cost control and OSS coding stack

  • Router budget: ROUTER_BUDGET_HOURLY_LIMIT (0.5), ROUTER_BUDGET_MONTHLY_LIMIT (200).
  • Ollama on NAS: OLLAMA_URL=http://blueflynas.tailcf98b3.ts.net:11434
  • Deploy from anywhere: POST https://studio.blueflyagents.com/api/v1/deploy/run
  • OSS tools: Aider, hello-halo, Continue/Cline, Ollama. Router prefers Ollama; use Kimi/DeepSeek for lower-cost cloud.
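
The two router limits can be read as a simple guard: a paid call goes ahead only if it keeps both the hourly and the monthly total within bounds. The sketch below illustrates that reading. The currency unit is not stated in the source, and how @bluefly/agent-router actually enforces the limits is an assumption; `withinBudget` and `RouterBudget` are hypothetical names.

```typescript
// Limits mirror ROUTER_BUDGET_HOURLY_LIMIT and ROUTER_BUDGET_MONTHLY_LIMIT.
interface RouterBudget {
  hourlyLimit: number;
  monthlyLimit: number;
}

// Default values as listed above (unit unspecified in the source).
const defaultBudget: RouterBudget = { hourlyLimit: 0.5, monthlyLimit: 200 };

// True when adding nextCost keeps both running totals within their limits.
function withinBudget(
  spentThisHour: number,
  spentThisMonth: number,
  nextCost: number,
  budget: RouterBudget = defaultBudget,
): boolean {
  return (
    spentThisHour + nextCost <= budget.hourlyLimit &&
    spentThisMonth + nextCost <= budget.monthlyLimit
  );
}
```

When the guard returns false, the stated preference order suggests falling back to Ollama rather than a paid provider.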