Infrastructure Reference (for AI Assistants)
Moved from CLAUDE.md to reduce context size. Consult when working on infrastructure, deployment, or ops tasks.
OWNERSHIP: See architecture/separation-of-duties.md for strict infrastructure ownership rules.
DIRECT GITLAB WORKFLOW: All repositories push directly to GitLab. No NAS bare repo sync required.
Resource Allocation
| Resource | Role | Use for | Do NOT use for |
|---|---|---|---|
| Oracle Cloud | Primary tunnel + agent runtime + runners | Public ingress (cloudflared), MCP, GKG, mesh, agent-router, tracer, workflow-engine, Langflow, LibreChat, kagent (k3s), all *.blueflyagents.com. GitLab runners run on Oracle. Secrets in /root/.env.local. | Long-term storage (use NAS). |
| Synology NAS | Config master + always-on storage + backup | Config, data, code-server (browser IDE), wikis, Research. MinIO, Postgres, Redis if run on NAS. Backup MCP/GKG when Oracle down. | Primary agent runtime (Oracle is primary). |
| Mac M4 | Operator console + primary dev | Worktrees, Cursor, BuildKit CLI, __BARE_REPOS, TESTING_DEMOS, WORKING_DEMOs. Push to GitLab; no hosted services. | Running MCP, mesh, GKG, or any always-on service. |
| OrbStack (orb.local) | Local dev + testing | DDEV, local Drupal sites, testing. | Production. |
| GitLab Ultimate + Agent | Source of truth + CI + AI | Repos, registry (@bluefly/*), issues, wiki, MRs. GitLab Agent (agentk) on Oracle k3s. Runners on Oracle. | Storing runtime config (use NAS). |
| Vast.ai | Essentials-only GPU (costs money) | Only when Oracle cannot satisfy workload (model training, burst embeddings). Prefer Oracle for all other compute. | Always-on services; storage; CI. |
Resource plan (three tiers, plus Vast.ai on demand):
- Oracle = production. All compute, tunnel, platform services. Oracle is free; default here.
- NAS = local storage and local dev server. Config master, long-term storage, code-server, wikis, Research.
- DDEV + Drupal = local only (Mac). Local Drupal dev and testing.
- Vast.ai = only when needed. GPU workloads Oracle cannot do; costs money.
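The plan above amounts to a default-to-Oracle placement rule. A minimal sketch of that rule, assuming a hypothetical `Workload` shape (the type and field names are illustrative and do not come from any @bluefly package):

```typescript
// Hypothetical types for illustration only.
type Tier = "oracle" | "nas" | "mac-ddev" | "vastai";

interface Workload {
  kind: "service" | "storage" | "drupal-dev" | "gpu";
  // True only when Oracle cannot run the job (model training, burst embeddings).
  oracleCannotSatisfy?: boolean;
}

// Encodes the resource plan: Oracle is the free default, the NAS holds config
// and long-term storage, DDEV/Drupal stays on the Mac, Vast.ai is a paid last resort.
function chooseTier(w: Workload): Tier {
  switch (w.kind) {
    case "storage":
      return "nas";
    case "drupal-dev":
      return "mac-ddev";
    case "gpu":
      return w.oracleCannotSatisfy ? "vastai" : "oracle";
    default:
      return "oracle";
  }
}
```

Keeping Vast.ai behind an explicit flag means a paid GPU is never chosen by accident; the caller has to state that Oracle cannot do the job.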
Infrastructure Decision Matrix
| Need | Use | Owner Package | Code Location |
|---|---|---|---|
| Primary agent runtime + tunnel | Oracle Cloud | agent-docker, platform-agents | k8s/cloudflared-oracle/ |
| Config + storage + backup | Synology NAS | @bluefly/agent-docker | config/, data/, ServiceApps |
| GPU compute (costs $) | Vast.ai | @bluefly/agent-router | src/scaling/vastai.ts |
| CI/CD + registry + Duo | GitLab Ultimate | gitlab_components, platform-agents | Runners on Oracle |
| Private network | Tailscale | @bluefly/agent-tailscale | ALL Tailscale code |
| Public ingress | Cloudflare Tunnel | gitlab_components | Tunnel on Oracle |
| Service discovery | Agent Mesh | @bluefly/agent-mesh | Oracle (primary); NAS backup |
Synology NAS (config + storage + backup)
What runs here: Config master, data/, code-server, MinIO/Postgres/Redis if deployed; backup for MCP/GKG when Oracle down.
| Service (if on NAS) | Port | Owner | Deploy Location |
|---|---|---|---|
| code-server | 8080 | ServiceApps | applications/ServiceApps/code-server |
| MinIO S3 | 9000 | Infrastructure | NAS native or Container Manager |
| PostgreSQL | 5432 | Infrastructure | NAS Container Manager |
| Redis | 6379 | Infrastructure | NAS Container Manager |
| Backup MCP/GKG | 27495, etc. | When Oracle down | Container Manager |
Vast.ai GPU Cloud (essentials only; costs money)
Policy: Oracle is free; Vast.ai costs money. Use only for essentials.
| Workload | GPU Type | Owner Package | Code Location |
|---|---|---|---|
| Nemotron 3 Nano (vLLM) | H100/GPU | @bluefly/agent-docker | src/services/vastai-deployment.service.ts |
| Ollama inference | RTX 4090 | @bluefly/foundation-bridge | src/providers/ollama/ |
| Embeddings | RTX 4090 | @bluefly/agent-brain | src/embeddings/ |
| Model training | A100/H100 | models/ | training/ |
| Auto-scaling logic | - | @bluefly/agent-router | src/scaling/vastai.ts |
Tailscale Mesh Network
CRITICAL: ALL Tailscale code MUST be in @bluefly/agent-tailscale
| Capability | Code Location |
|---|---|
| Subnet routing | @bluefly/agent-tailscale/subnet |
| DNS management | @bluefly/agent-tailscale/dns |
| SSH access | @bluefly/agent-tailscale/ssh |
| Certificates | @bluefly/agent-tailscale/certs |
| Webhooks | @bluefly/agent-tailscale/webhooks |
| Device management | @bluefly/agent-tailscale/devices |
Cloudflare Integration
Uses Cloudflare Tunnel (cloudflared daemon), NOT Cloudflare WARP.
| Component | Owner | Location |
|---|---|---|
| Tunnel config | Infrastructure | ~/.cloudflared/config.yml |
| Tunnel daemon | Infrastructure | Oracle (primary tunnel host) |
| DNS records | Infrastructure | Cloudflare dashboard |
| Webhook routing | gitlab_components | CI/CD templates |
| WAF rules | security-policies | Policy definitions |
Public Endpoints (Always-On via Cloudflare Tunnel)
All origins run on Oracle or the NAS, never on a local workstation. Updated 2026-02-23.
| Hostname | Origin | Service |
|---|---|---|
| mesh.bluefly.internal | oracle:3005 | Agent mesh API |
| nas.blueflyagents.com | blueflynas:5001 (HTTPS) | DSM Web UI |
| storage.blueflyagents.com | blueflynas:9000 | MinIO S3 |
| npm.blueflyagents.com | blueflynas:4873 | npm registry |
| kagent.blueflyagents.com | oracle:30083 | K-agent API |
| kagent-ui.blueflyagents.com | oracle:30080 | K-agent UI |
| mcpdash.blueflyagents.com | oracle:3003 | MCP dashboard |
| router.bluefly.internal | oracle:4000 | Agent router |
| agents.blueflyagents.com | oracle:3001 | Agents API |
| studio.blueflyagents.com | oracle:3012 | Studio UI |
| tracer.bluefly.internal | oracle:3006 | Agent tracer |
| mcp.blueflyagents.com | oracle:4005 | MCP / agent-protocol |
| brain.bluefly.internal | oracle:6333 | Agent brain (Qdrant) |
| compliance.bluefly.internal | oracle:3010 | Compliance Engine API |
| workflow.bluefly.internal | oracle:3015 | Workflow engine |
| devops.blueflyagents.com | oracle:3011 | DevOps |
| a2a-collector.blueflyagents.com | oracle:9004 | A2A log collector |
| a2a-stream.blueflyagents.com | oracle:9005 | A2A stream |
| adash.blueflyagents.com | oracle:3013 | Agent dashboard |
| dockge.blueflyagents.com | blueflynas:9010 | Dockge (container mgmt) |
| obsidian.blueflyagents.com | blueflynas:5984 | Obsidian |
| gkg.bluefly.internal | oracle:27495 | GKG (Knowledge Graph) |
| ecma-agent.blueflyagents.com | oracle:3016 | ECMA Agent |
| content-guardian.blueflyagents.com | oracle:4010 | Content Guardian |
| intel.blueflyagents.com | oracle:9006 | Intel |
| langflow.blueflyagents.com | oracle:7860 | LangFlow UI |
| dragonfly.blueflyagents.com | - | Dragonfly |
| grafana.blueflyagents.com | oracle:30300 | Grafana |
| n8n.blueflyagents.com | oracle:5678 | n8n (workflow automation) |
| zotero.blueflyagents.com | blueflynas:5006 (HTTPS) | Zotero |
| infra.blueflyagents.com | oracle:3030 | Infrastructure |
| chat.blueflyagents.com | oracle:3080 | Chat (LibreChat) |
| code.blueflyagents.com | blueflynas:8080 | code-server (remote IDE) |
| flowise.blueflyagents.com | nas-platform:3100 | Flowise |
| happy.blueflyagents.com | oracle:3045 | Happy |
| orchestrator.blueflyagents.com | oracle:3014 | Orchestrator |
| api.blueflyagents.com | oracle:3085 | API gateway |
| ossa-ui.blueflyagents.com | oracle:3456 | OSSA UI |
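Each row of the endpoint table corresponds to one hostname-to-origin mapping in the tunnel's ingress list. The sketch below shows how such rows could be turned into cloudflared-style ingress rules. The `Endpoint` and `IngressRule` types are hypothetical, and it assumes that origins not marked "(HTTPS)" are plain HTTP and that short names such as `oracle` resolve from the tunnel host; neither is confirmed by the table itself.

```typescript
// One row of the endpoint table. `https` marks origins listed as "(HTTPS)".
interface Endpoint {
  hostname: string;
  originAddr: string; // "host:port" exactly as written in the table
  https?: boolean;
}

interface IngressRule {
  hostname?: string;
  service: string;
}

// Builds ingress rules in the shape cloudflared expects. cloudflared requires
// the last rule to be a catch-all with no hostname, so one is always appended.
function toIngress(endpoints: Endpoint[]): IngressRule[] {
  const rules: IngressRule[] = endpoints.map((e) => ({
    hostname: e.hostname,
    service: `${e.https ? "https" : "http"}://${e.originAddr}`,
  }));
  rules.push({ service: "http_status:404" });
  return rules;
}

// Three rows copied from the table above.
const sampleEndpoints: Endpoint[] = [
  { hostname: "mcp.blueflyagents.com", originAddr: "oracle:4005" },
  { hostname: "nas.blueflyagents.com", originAddr: "blueflynas:5001", https: true },
  { hostname: "code.blueflyagents.com", originAddr: "blueflynas:8080" },
];
```

Calling `toIngress(sampleEndpoints)` yields three hostname rules followed by the `http_status:404` catch-all, which can then be serialized into the `ingress:` section of the tunnel config.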
Private Access (Tailscale)
| Device | Tailscale hostname | IPv4 | Role |
|---|---|---|---|
| mac-m4 | mac-m4.tailcf98b3.ts.net | 100.108.129.7 | Operator console |
| blueflynas | blueflynas.tailcf98b3.ts.net | 100.104.119.76 | Synology NAS |
| iphone-t | iphone-t.tailcf98b3.ts.net | 100.67.125.25 | Phone |
| oracle-platform | oracle-platform.tailcf98b3.ts.net | 100.103.48.75 | Primary tunnel; GitLab runners |
MCP service (one URL for all clients)
- Public: `https://mcp.blueflyagents.com/api/mcp/sse`
- Tailscale Oracle (primary): `http://oracle-platform.tailcf98b3.ts.net:4005/api/mcp/sse`
- Tailscale NAS (backup only): `http://blueflynas.tailcf98b3.ts.net:27495/mcp/sse`
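For a client that can reach the tailnet, the three URLs above form an ordered fallback chain. A minimal sketch of that selection, where `probe` is a hypothetical caller-supplied reachability check (for example a short-timeout HTTP request) and not part of any documented client:

```typescript
// URLs copied from the list above, in preference order.
const MCP_URLS: readonly string[] = [
  "https://mcp.blueflyagents.com/api/mcp/sse",
  "http://oracle-platform.tailcf98b3.ts.net:4005/api/mcp/sse",
  "http://blueflynas.tailcf98b3.ts.net:27495/mcp/sse",
];

// Returns the first URL whose probe succeeds, or undefined if none do.
async function pickMcpUrl(
  probe: (url: string) => Promise<boolean>,
  urls: readonly string[] = MCP_URLS,
): Promise<string | undefined> {
  for (const url of urls) {
    try {
      if (await probe(url)) return url;
    } catch {
      // A probe that throws is treated as "unreachable"; try the next candidate.
    }
  }
  return undefined;
}
```

Probing sequentially rather than in parallel keeps the NAS backup from being selected while the Oracle endpoints are merely slow to answer.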
Config: NAS master and user ~/.agent-platform
- Master config (NAS): Mac: `/Volumes/AgentPlatform/config`; NAS SSH: `/volume1/AgentPlatform/config`. Read `config/AGENTS.md` first.
- NAS layout: Root: `/Volumes/AgentPlatform/` (Mac) or `/volume1/AgentPlatform/` (NAS SSH). Dirs: `services/`, `applications/ServiceApps/`, `config/`, `data/`. No repos on NAS. Config: `config.json`, `workspace.json`, `coordination-state.json`, `nas-infrastructure-reference.json`. Secrets: `config/tokens/`, `config/ssl/`, `config/.ssh/`, or `.env.local`.
- Per-user config: `~/.agent-platform/` with one folder per project. After `npm i -g @bluefly/agent-buildkit`, run `buildkit setup` once.
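The same NAS volume appears under a different root depending on where it is accessed from. The sketch below resolves config paths for both cases; the helper names and the `Host` type are hypothetical, and only the paths themselves come from the layout described above.

```typescript
type Host = "mac" | "nas-ssh";

// Root of the shared AgentPlatform volume as seen from each host.
const PLATFORM_ROOT: Record<Host, string> = {
  "mac": "/Volumes/AgentPlatform",
  "nas-ssh": "/volume1/AgentPlatform",
};

// Master config lives under config/ on the NAS volume.
function masterConfigPath(host: Host, file = "config.json"): string {
  return `${PLATFORM_ROOT[host]}/config/${file}`;
}

// Per-user config: one folder per project under ~/.agent-platform/.
// `home` is passed in so the function stays pure and easy to test.
function userConfigDir(home: string, project: string): string {
  return `${home}/.agent-platform/${project}`;
}
```

For example, `masterConfigPath("nas-ssh", "AGENTS.md")` points at the file to read first when connected to the NAS over SSH.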
GKG (Knowledge Graph API + MCP SSE)
- Public: `https://gkg.bluefly.internal`
- Tailscale Oracle: `http://oracle-platform.tailcf98b3.ts.net:27495`
- NAS (backup): `http://blueflynas.tailcf98b3.ts.net:27495`
Oracle Platform
- Tailscale: `oracle-platform.tailcf98b3.ts.net`. SSH: `flux423@oracle-platform.tailcf98b3.ts.net`.
- Secrets: `/root/.env.local` is canonical. Docker: `env_file: /root/.env.local`. systemd: `EnvironmentFile=/root/.env.local`.
- A2A collector + OTLP bridge: Set `OTEL_EXPORTER_OTLP_ENDPOINT` to GitLab OTLP URL.
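Because one env file feeds both Docker and systemd, it can help to inspect it with a small reader when a service is missing a variable. The sketch below is a simplified `KEY=VALUE` parser written for illustration; it does not reproduce the exact quoting and escaping rules that Docker or systemd apply, and `parseEnvFile` is a hypothetical helper, not an existing tool.

```typescript
// Parses KEY=VALUE lines, skipping blank lines and '#' comments.
// Everything after the first '=' is kept as the value, so values may contain '='.
function parseEnvFile(text: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no '=' at all, or an empty key
    const key = line.slice(0, eq).trim();
    out[key] = line.slice(eq + 1).trim();
  }
  return out;
}
```

A quick check such as `"OTEL_EXPORTER_OTLP_ENDPOINT" in parseEnvFile(contents)` confirms the OTLP bridge variable is defined before restarting the collector.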
Code from the road (phone + Termius + code-server)
- Remote IDE: `https://code.blueflyagents.com` (code-server on NAS).
- Termius: Add hosts: mac-m4, blueflynas, oracle-platform (all .tailcf98b3.ts.net).
- Mosh: `mosh flux423@mac-m4.tailcf98b3.ts.net -- "tmux attach -t ops"`
- Flow: (1) Browser -> code-server -> code. (2) Termius -> SSH to oracle -> deploy. (3) Mosh to mac-m4 -> tmux ops.
Distributed ops console (mobile + A2A)
- Goal: Persistent terminal (tmux + mosh) from phone; watch agents (A2A) in real time.
- One command from phone: `mosh flux423@mac-m4.tailcf98b3.ts.net -- "tmux attach -t ops || ide-remote ops"`
- Ops env: Mac `/Volumes/AgentPlatform/.env.local`. Keys: `A2A_STREAM_URL`, `A2A_URL`, `MCP_BASE_URL`, `MCP_URL`, `MCP_TOKEN`, `OPS_CWD`.
- A2A stream: Tailscale `http://blueflynas.tailcf98b3.ts.net:9001/a2a/stream`. Public `https://dashboard.mcp.blueflyagents.com/a2a/stream`.
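The ops session depends on all six keys listed above being set. A minimal pre-flight check could look like the following; `missingOpsKeys` is a hypothetical helper, and treating an empty string as missing is an assumption rather than documented behavior.

```typescript
// Keys listed for the ops environment file.
const OPS_KEYS = [
  "A2A_STREAM_URL",
  "A2A_URL",
  "MCP_BASE_URL",
  "MCP_URL",
  "MCP_TOKEN",
  "OPS_CWD",
] as const;

// Returns the keys that are absent or empty in `env`.
function missingOpsKeys(env: Record<string, string | undefined>): string[] {
  return OPS_KEYS.filter((k) => !env[k]);
}
```

In Node, `missingOpsKeys(process.env)` returns an empty array when the environment is complete, so a startup script can refuse to attach the session otherwise.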
Cost control and OSS coding stack
- Router budget: `ROUTER_BUDGET_HOURLY_LIMIT` (0.5), `ROUTER_BUDGET_MONTHLY_LIMIT` (200).
- Ollama on NAS: `OLLAMA_URL=http://blueflynas.tailcf98b3.ts.net:11434`
- Deploy from anywhere: `POST https://studio.blueflyagents.com/api/v1/deploy/run`
- OSS tools: Aider, hello-halo, Continue/Cline, Ollama. Router prefers Ollama; use Kimi/DeepSeek for lower-cost cloud.