Multi-Mac Agent Coordination: Technical Planning Document
Executive Summary
This document outlines the technical architecture and implementation plan for coordinating AI agents across two Mac machines (Mac M4 - BlueFly primary, Mac M3 - GitLab secondary) to double agent processing capacity through shared infrastructure and intelligent load balancing.
Goal: Enable agents running on both Macs to coordinate through a shared Redis/PostgreSQL backend, with a coordinator service managing task distribution and resource monitoring.
Estimated Implementation Time: ~20 minutes for basic setup, ~2-4 hours for full production deployment
Architecture Overview
System Components
Shared Infrastructure
- Redis (Router)
- PostgreSQL (Router)
- Coordinator (Mac M4)
Agents
- Mac M4: Agent 1, Agent 2, Agent 5
- Mac M3: Agent 3, Agent 4, Agent 6
Key Design Decisions
- Coordinator Location: Mac M4 (primary work machine)
  - Rationale: More stable, always-on machine
  - Handles task distribution and health monitoring
- Shared Infrastructure: Router-based services
  - Redis: Task queue, agent state, coordination
  - PostgreSQL: Persistent agent registry, metrics, history
- Agent Registration: Both Macs register agents with coordinator
  - Agents announce capabilities and health status
  - Coordinator maintains registry in PostgreSQL
  - Redis used for real-time coordination
- Load Balancing: Capability-aware weighted round-robin
  - Considers agent capabilities, current load, machine resources
  - Failover to healthy agents on either machine
Infrastructure Setup
Phase 1: Shared Redis Setup (5 minutes)
Option A: Router-Based Redis (Recommended)
Requirements:
- Router with Docker support OR
- Router with ability to run Redis container OR
- Dedicated small device (Raspberry Pi, etc.)
Setup Steps:
- Install Redis on Router/Device:
    # If router supports Docker
    docker run -d \
      --name redis-coordinator \
      --restart unless-stopped \
      -p 6379:6379 \
      -v redis-data:/data \
      redis:7-alpine redis-server --appendonly yes
- Configure Network Access:
  - Ensure Redis port 6379 is accessible from both Macs
  - Configure firewall rules if needed
  - Set strong password. CONFIG SET only changes the running server, so the password is lost on restart unless --requirepass <strong-password> is also added to the redis-server arguments in the docker run command:
    redis-cli CONFIG SET requirepass <strong-password>
- Test Connectivity from Both Macs:
    # From Mac M4
    redis-cli -h <router-ip> -p 6379 -a <password> PING
    # From Mac M3
    redis-cli -h <router-ip> -p 6379 -a <password> PING
Option B: Mac M4 Hosted Redis (Fallback)
If router-based setup isn't feasible:
- Install Redis on Mac M4:
    brew install redis
    brew services start redis
- Configure for Network Access:
    # Edit /opt/homebrew/etc/redis.conf
    bind 0.0.0.0
    requirepass <strong-password>
    # Restart so the edited configuration takes effect
    brew services restart redis
- Configure Mac M4 Firewall:
  - Allow incoming connections on port 6379 from Mac M3 IP
Phase 2: Shared PostgreSQL Setup (10 minutes)
Option A: Router-Based PostgreSQL
- Install PostgreSQL on Router/Device:
    docker run -d \
      --name postgres-coordinator \
      --restart unless-stopped \
      -p 5432:5432 \
      -e POSTGRES_DB=agent_coordinator \
      -e POSTGRES_USER=agent_user \
      -e POSTGRES_PASSWORD=<strong-password> \
      -v postgres-data:/var/lib/postgresql/data \
      postgres:16-alpine
- Create Agent Registry Schema:
    CREATE TABLE agent_registry (
      id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
      agent_id VARCHAR(255) UNIQUE NOT NULL,
      agent_name VARCHAR(255) NOT NULL,
      machine_id VARCHAR(100) NOT NULL,
      capabilities JSONB NOT NULL,
      endpoint VARCHAR(500),
      status VARCHAR(50) NOT NULL,
      registered_at TIMESTAMP DEFAULT NOW(),
      last_seen TIMESTAMP DEFAULT NOW(),
      metadata JSONB
    );

    CREATE TABLE agent_metrics (
      id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
      agent_id VARCHAR(255) REFERENCES agent_registry(agent_id),
      timestamp TIMESTAMP DEFAULT NOW(),
      cpu_percent FLOAT,
      memory_mb INTEGER,
      active_tasks INTEGER,
      completed_tasks INTEGER,
      failed_tasks INTEGER
    );

    CREATE INDEX idx_agent_capabilities ON agent_registry USING GIN(capabilities);
    CREATE INDEX idx_agent_status ON agent_registry(status);
    CREATE INDEX idx_agent_machine ON agent_registry(machine_id);
Option B: Mac M4 Hosted PostgreSQL (Fallback)
- Install PostgreSQL on Mac M4:
    brew install postgresql@16
    brew services start postgresql@16
- Configure for Network Access:
    # Edit /opt/homebrew/var/postgresql@16/postgresql.conf
    listen_addresses = '*'
    # Edit pg_hba.conf: admit only the Mac M3 address rather than 0.0.0.0/0
    host all all <mac-m3-ip>/32 scram-sha-256
    # Restart so both files are re-read
    brew services restart postgresql@16
- Create Database and Schema (same as above)
Coordinator Service Implementation
Phase 3: Coordinator Service on Mac M4 (15 minutes)
Service Architecture
The coordinator service will be built using existing agent-mesh infrastructure:
Location: common_npm/agent-mesh/backend/src/services/coordinator/
Key Components:
- Agent Registry Manager
  - Maintains PostgreSQL-backed registry
  - Handles agent registration/deregistration
  - Tracks agent health and capabilities
- Task Queue Manager
  - Redis-backed task queue
  - Priority-based task distribution
  - Task assignment and tracking
- Load Balancer
  - Capability-aware routing
  - Weighted round-robin with health checks
  - Failover logic
- Resource Monitor
  - Tracks CPU, memory, active tasks per agent
  - Updates metrics in PostgreSQL
  - Triggers alerts for unhealthy agents
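The Task Queue Manager is described as priority-based. One possible way to get that ordering from Redis is a sorted set whose score combines priority and enqueue time. The sketch below only computes such a score and shows the resulting order in memory. The names `QueuedTask`, `queueScore` and `serveOrder` are hypothetical, and it assumes that priority is an integer from 0 to 100 where a higher number is more urgent, which this plan does not specify.

```typescript
// Hypothetical names throughout; not part of agent-mesh.
interface QueuedTask {
  id: string;
  priority: number;     // assumption: integer 0..100, higher = more urgent
  enqueuedAtMs: number; // milliseconds since epoch at enqueue time
}

// One priority step outweighs any timestamp: ms-since-epoch stays below 1e13 until the year 2286.
const PRIORITY_STRIDE = 1e13;

function queueScore(task: QueuedTask): number {
  const p = task.priority;
  if (Math.floor(p) !== p || p < 0 || p > 100) {
    throw new RangeError('priority must be an integer in 0..100, got ' + p);
  }
  // Lowest score is served first: a higher priority lowers the score,
  // and the enqueue time keeps equal priorities in FIFO order.
  return -p * PRIORITY_STRIDE + task.enqueuedAtMs;
}

// In-memory stand-in for reading a sorted set in ascending score order.
function serveOrder(tasks: QueuedTask[]): string[] {
  return tasks
    .slice()
    .sort((a, b) => queueScore(a) - queueScore(b))
    .map(task => task.id);
}
```

With a score like this, the queue could be written with a sorted-set add and drained from the low end, instead of pushing onto a plain list.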
Implementation Steps
- Create Coordinator Service:
    // common_npm/agent-mesh/backend/src/services/coordinator/coordinator-service.ts
    import { EventEmitter } from 'events';
    import Redis from 'ioredis';
    import { Pool } from 'pg';
    import { AgentRegistry } from '../../mesh/runtime/agent-registry';
    import { LoadBalancer } from '../../mesh/routing/load-balancer';

    export class CoordinatorService extends EventEmitter {
      private redis: Redis;
      private pgPool: Pool;
      private registry: AgentRegistry;
      private loadBalancer: LoadBalancer;

      constructor(config: { redisUrl: string; postgresUrl: string }) {
        super();
        this.redis = new Redis(config.redisUrl);
        this.pgPool = new Pool({ connectionString: config.postgresUrl });
        this.registry = new AgentRegistry();
        this.loadBalancer = new LoadBalancer({
          strategy: 'weighted-round-robin',
          healthCheckEnabled: true,
          failoverEnabled: true
        });
      }

      async registerAgent(agentInfo: {
        agentId: string;
        agentName: string;
        machineId: string;
        capabilities: string[];
        endpoint: string;
      }): Promise<void> {
        // Register in PostgreSQL
        await this.pgPool.query(
          `INSERT INTO agent_registry
             (agent_id, agent_name, machine_id, capabilities, endpoint, status)
           VALUES ($1, $2, $3, $4, $5, 'active')
           ON CONFLICT (agent_id)
           DO UPDATE SET last_seen = NOW(), status = 'active'`,
          [
            agentInfo.agentId,
            agentInfo.agentName,
            agentInfo.machineId,
            JSON.stringify(agentInfo.capabilities),
            agentInfo.endpoint
          ]
        );

        // Register in Redis for fast lookup
        await this.redis.hset(
          `agent:${agentInfo.agentId}`,
          'status', 'active',
          'machine', agentInfo.machineId,
          'capabilities', JSON.stringify(agentInfo.capabilities),
          'endpoint', agentInfo.endpoint,
          'last_seen', Date.now()
        );

        // Add to capability index
        for (const capability of agentInfo.capabilities) {
          await this.redis.sadd(`capability:${capability}`, agentInfo.agentId);
        }

        this.emit('agentRegistered', agentInfo);
      }

      async distributeTask(task: {
        id: string;
        type: string;
        requiredCapabilities: string[];
        priority: number;
        payload: unknown;
      }): Promise<string> {
        // Find eligible agents
        const eligibleAgents = await this.findAgentsByCapabilities(
          task.requiredCapabilities
        );
        if (eligibleAgents.length === 0) {
          // Fail loudly instead of dereferencing an undefined selection below
          throw new Error(
            `No active agent offers: ${task.requiredCapabilities.join(', ')}`
          );
        }

        // Select agent using load balancer
        const selectedAgent = this.loadBalancer.selectEndpoint(
          eligibleAgents.map(agent => ({
            id: agent.agentId,
            weight: this.calculateWeight(agent),
            health: agent.status === 'active'
          }))
        );

        // Assign task
        await this.redis.lpush(
          `tasks:${selectedAgent.id}`,
          JSON.stringify(task)
        );

        // Track task assignment
        await this.redis.hset(
          `task:${task.id}`,
          'assigned_to', selectedAgent.id,
          'status', 'assigned',
          'assigned_at', Date.now()
        );

        return selectedAgent.id;
      }

      private async findAgentsByCapabilities(
        requiredCapabilities: string[]
      ): Promise<any[]> {
        // SINTER needs at least one key, so an empty requirement matches nothing
        if (requiredCapabilities.length === 0) return [];

        // Get agents with all required capabilities
        const agentIds = await this.redis.sinter(
          ...requiredCapabilities.map(cap => `capability:${cap}`)
        );

        // Fetch agent details
        const agents = await Promise.all(
          agentIds.map(async (agentId: string) => {
            const agentData = await this.redis.hgetall(`agent:${agentId}`);
            return {
              agentId,
              ...agentData,
              capabilities: JSON.parse(agentData.capabilities || '[]')
            };
          })
        );

        return agents.filter(agent => agent.status === 'active');
      }

      private calculateWeight(agent: any): number {
        // Weight based on:
        // - Machine performance (M4 > M3)
        // - Current load
        let weight = 1.0;
        if (agent.machine === 'mac-m4') weight *= 1.2;
        if (agent.machine === 'mac-m3') weight *= 1.0;

        // Reduce weight if high load. active_tasks is absent from the agent
        // hash until something writes it there, so it defaults to 0.
        const load = parseFloat(agent.active_tasks || '0');
        weight *= Math.max(0.5, 1.0 - (load / 10));
        return weight;
      }
    }
- Create Coordinator CLI/Service Entry Point:
    // common_npm/agent-mesh/backend/src/services/coordinator/index.ts
    import { CoordinatorService } from './coordinator-service';
    import express from 'express';

    const app = express();
    app.use(express.json());

    const coordinator = new CoordinatorService({
      redisUrl: process.env.REDIS_URL || 'redis://localhost:6379',
      postgresUrl: process.env.POSTGRES_URL || 'postgresql://localhost:5432/agent_coordinator'
    });

    // A caught value is `unknown` in strict TypeScript, so narrow before reading .message
    const errorMessage = (error: unknown): string =>
      error instanceof Error ? error.message : String(error);

    // Agent registration endpoint
    app.post('/api/v1/agents/register', async (req, res) => {
      try {
        await coordinator.registerAgent(req.body);
        res.json({ success: true });
      } catch (error) {
        res.status(500).json({ error: errorMessage(error) });
      }
    });

    // Task distribution endpoint
    app.post('/api/v1/tasks/distribute', async (req, res) => {
      try {
        const agentId = await coordinator.distributeTask(req.body);
        res.json({ agentId, success: true });
      } catch (error) {
        res.status(500).json({ error: errorMessage(error) });
      }
    });

    // Health check endpoint
    app.get('/health', async (req, res) => {
      res.json({ status: 'healthy' });
    });

    const PORT = process.env.PORT || 8080;
    app.listen(PORT, () => {
      console.log(`Coordinator service running on port ${PORT}`);
    });
- Create LaunchAgent (launchd):
  macOS has no systemd. A plist in ~/Library/LaunchAgents is a per-user LaunchAgent, which runs only while that user is logged in. Before loading it, replace <router-ip> and <password> with real values, replace $LLM_ROOT with an absolute path (launchd does not expand shell variables), and set the node path to the output of `which node` (Homebrew on Apple Silicon installs to /opt/homebrew/bin/node).
    <?xml version="1.0" encoding="UTF-8"?>
    <!-- ~/Library/LaunchAgents/com.bluefly.agent-coordinator.plist -->
    <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
      <key>Label</key>
      <string>com.bluefly.agent-coordinator</string>
      <key>ProgramArguments</key>
      <array>
        <string>/usr/local/bin/node</string>
        <string>$LLM_ROOT/common_npm/agent-mesh/backend/dist/services/coordinator/index.js</string>
      </array>
      <key>RunAtLoad</key>
      <true/>
      <key>KeepAlive</key>
      <true/>
      <key>StandardOutPath</key>
      <string>$LLM_ROOT/logs/coordinator.log</string>
      <key>StandardErrorPath</key>
      <string>$LLM_ROOT/logs/coordinator.error.log</string>
      <key>EnvironmentVariables</key>
      <dict>
        <key>REDIS_URL</key>
        <string>redis://<router-ip>:6379</string>
        <key>POSTGRES_URL</key>
        <string>postgresql://agent_user:<password>@<router-ip>:5432/agent_coordinator</string>
        <key>NODE_ENV</key>
        <string>production</string>
      </dict>
    </dict>
    </plist>
  Load service:
    launchctl load ~/Library/LaunchAgents/com.bluefly.agent-coordinator.plist
    launchctl start com.bluefly.agent-coordinator
Agent Registration from Both Macs
Phase 4: Agent Registration System (10 minutes)
Mac M4 Agent Registration
Agents on Mac M4 will automatically register with the coordinator on startup:
    // common_npm/agent-mesh/src/runtime/agent-registry.ts (enhancement)
    import { EventEmitter } from 'events';
    import axios from 'axios';

    export class AgentRegistry extends EventEmitter {
      private coordinatorUrl: string;
      // Read from the environment so the same code runs on both Macs
      private machineId: string = process.env.MACHINE_ID || 'mac-m4';

      constructor(config?: { coordinatorUrl?: string }) {
        super();
        this.coordinatorUrl = config?.coordinatorUrl ||
          process.env.COORDINATOR_URL ||
          'http://localhost:8080';
      }

      async register(manifest: AgentManifest): Promise<AgentRegistration> {
        // Local registration (existing code)
        const registration = await this.localRegister(manifest);

        // Register with coordinator
        try {
          await axios.post(`${this.coordinatorUrl}/api/v1/agents/register`, {
            agentId: registration.id,
            agentName: manifest.name,
            machineId: this.machineId,
            capabilities: manifest.capabilities.map(c => c.name),
            endpoint: this.getAgentEndpoint(registration.id)
          });

          // Start heartbeat
          this.startHeartbeat(registration.id);
        } catch (error) {
          console.warn('Failed to register with coordinator:', error);
          // Continue with local registration only
        }

        return registration;
      }

      private startHeartbeat(agentId: string): void {
        // The coordinator must expose /api/v1/agents/heartbeat for this to succeed;
        // the entry point in Phase 3 does not define that route yet.
        setInterval(async () => {
          try {
            await axios.post(`${this.coordinatorUrl}/api/v1/agents/heartbeat`, {
              agentId,
              machineId: this.machineId,
              metrics: this.getCurrentMetrics(agentId)
            });
          } catch (error) {
            console.warn('Heartbeat failed:', error);
          }
        }, 30000); // Every 30 seconds
      }

      private getCurrentMetrics(agentId: string): any {
        const agent = this.agents.get(agentId);
        if (!agent) return {};
        return {
          cpu_percent: this.getCpuUsage(),
          memory_mb: this.getMemoryUsage(),
          active_tasks: this.getActiveTaskCount(agentId),
          status: agent.status
        };
      }
    }
Mac M3 Agent Registration
Same code, but with machineId: 'mac-m3':
    # On Mac M3, set environment variables:
    export MACHINE_ID=mac-m3
    export COORDINATOR_URL=http://<mac-m4-ip>:8080
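Because both machines are told apart only by these two variables, it may help to validate them once at startup rather than discovering a typo through failed registrations. The following is a minimal sketch of that idea. The helper `resolveAgentHostConfig` and the `AgentHostConfig` type are hypothetical; only the variable names MACHINE_ID and COORDINATOR_URL, the two machine IDs and the localhost default come from this plan.

```typescript
// Hypothetical helper; not part of agent-mesh.
type MachineId = 'mac-m4' | 'mac-m3';

interface AgentHostConfig {
  machineId: MachineId;
  coordinatorUrl: string;
}

function isMachineId(value: string): value is MachineId {
  return value === 'mac-m4' || value === 'mac-m3';
}

// Takes the environment as a parameter so it can be checked without touching process.env.
function resolveAgentHostConfig(env: Record<string, string | undefined>): AgentHostConfig {
  const machineId = env['MACHINE_ID'] ?? 'mac-m4';
  if (!isMachineId(machineId)) {
    throw new Error('Unsupported MACHINE_ID: ' + machineId);
  }
  const coordinatorUrl = env['COORDINATOR_URL'] ?? 'http://localhost:8080';
  if (!/^https?:\/\/[^\s\/]+/.test(coordinatorUrl)) {
    throw new Error('COORDINATOR_URL must be an http(s) URL, got: ' + coordinatorUrl);
  }
  // Drop trailing slashes so that appending "/api/v1/..." never yields "//".
  return { machineId, coordinatorUrl: coordinatorUrl.replace(/\/+$/, '') };
}
```

An agent process would call this once with `process.env` and pass the result to whatever constructs its registry.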
Agent Discovery and Task Execution
When an agent needs to execute a task, it queries the coordinator:
    // common_npm/agent-mesh/src/runtime/task-executor.ts
    import axios from 'axios';

    // Task, TaskResult, executeLocally and waitForRemoteResult are assumed to
    // be defined elsewhere in agent-mesh; they are not shown in this plan.
    export class TaskExecutor {
      private coordinatorUrl: string;
      private agentId: string;

      async executeTask(task: Task): Promise<TaskResult> {
        // If task requires coordination, route through coordinator
        if (task.requiresCoordination) {
          const response = await axios.post(
            `${this.coordinatorUrl}/api/v1/tasks/distribute`,
            {
              id: task.id,
              type: task.type,
              requiredCapabilities: task.requiredCapabilities,
              priority: task.priority,
              payload: task.payload
            }
          );

          // Task assigned to an agent (may be on different machine)
          // Wait for result or execute locally if assigned to this agent
          if (response.data.agentId === this.agentId) {
            return this.executeLocally(task);
          } else {
            return this.waitForRemoteResult(task.id);
          }
        } else {
          // Execute locally
          return this.executeLocally(task);
        }
      }
    }
Load Balancing Configuration
Phase 5: Load Balancing Strategy (5 minutes)
The coordinator implements capability-aware weighted round-robin:
Algorithm:
- Capability Matching: Filter agents by required capabilities
- Health Check: Only consider healthy agents (status = 'active', recent heartbeat)
- Weight Calculation:
  - Base weight: Machine type (M4 = 1.2, M3 = 1.0)
  - Load factor: multiply by max(0.5, 1 - active tasks / 10), so the multiplier falls linearly and bottoms out at 0.5x from 5 active tasks upward
  - Health factor: Unhealthy agents = 0 weight
- Selection: Weighted round-robin from eligible agents
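The steps above can be sketched end to end. The weight function follows the factors listed in the algorithm. The selection step is an assumption: this plan does not show how the agent-mesh LoadBalancer rotates between endpoints, so the sketch uses the smooth weighted round-robin scheme known from nginx as a stand-in. `AgentLoad` and `SmoothWeightedRoundRobin` are hypothetical names.

```typescript
// Hypothetical types; the real LoadBalancer in agent-mesh may work differently.
interface AgentLoad {
  id: string;
  machine: 'mac-m4' | 'mac-m3';
  activeTasks: number;
  healthy: boolean;
}

function calculateWeight(agent: AgentLoad): number {
  if (!agent.healthy) return 0;                               // health factor
  const base = agent.machine === 'mac-m4' ? 1.2 : 1.0;        // machine type
  const loadFactor = Math.max(0.5, 1 - agent.activeTasks / 10); // floor of 0.5x
  return base * loadFactor;
}

class SmoothWeightedRoundRobin {
  // Running score per agent id, carried across calls.
  private current: Record<string, number | undefined> = {};

  select(agents: AgentLoad[]): string | undefined {
    let best: string | undefined;
    let bestScore = -Infinity;
    let total = 0;
    for (const agent of agents) {
      const weight = calculateWeight(agent);
      if (weight <= 0) continue; // zero-weight agents are never picked
      total += weight;
      const score = (this.current[agent.id] ?? 0) + weight;
      this.current[agent.id] = score;
      if (score > bestScore) {
        bestScore = score;
        best = agent.id;
      }
    }
    if (best === undefined) return undefined;
    // The winner pays back the total, which spreads its picks out over time.
    this.current[best] = bestScore - total;
    return best;
  }
}
```

With one idle agent per machine (weights 1.2 and 1.0), repeated calls alternate between the two and give the M4 agent one extra pick every eleven selections, rather than sending it a burst of consecutive tasks.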
Configuration:
    // common_npm/agent-mesh/backend/src/services/coordinator/load-balancer-config.ts
    export const LoadBalancerConfig = {
      strategy: 'weighted-round-robin',
      healthCheckInterval: 30000, // 30 seconds
      healthCheckTimeout: 5000,   // 5 seconds
      maxRetries: 3,
      failoverEnabled: true,
      weights: {
        'mac-m4': 1.2,
        'mac-m3': 1.0
      },
      loadThresholds: {
        maxActiveTasks: 10,
        cpuThreshold: 80,      // %
        memoryThreshold: 2048  // MB
      }
    };
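The plan does not show where `loadThresholds` is consulted. One way it could be applied is as a filter in front of the weighted selection, returning the reasons an agent was skipped so that they can be logged. This is a sketch under assumptions: `AgentSnapshot` and `ineligibilityReasons` are hypothetical names, and treating an agent as stale after three missed 30-second heartbeats is an illustrative choice, not a value from this plan.

```typescript
// Hypothetical shapes; the field names are assumptions, not agent-mesh types.
interface LoadThresholds {
  maxActiveTasks: number;
  cpuThreshold: number;    // percent
  memoryThreshold: number; // MB
}

interface AgentSnapshot {
  id: string;
  status: string;
  lastSeenMs: number; // time of the last heartbeat, ms since epoch
  activeTasks: number;
  cpuPercent: number;
  memoryMb: number;
}

// Values copied from the configuration above.
const loadThresholds: LoadThresholds = {
  maxActiveTasks: 10,
  cpuThreshold: 80,
  memoryThreshold: 2048
};

// Assumption: three missed heartbeats at the 30-second interval.
const STALE_AFTER_MS = 3 * 30000;

function ineligibilityReasons(
  agent: AgentSnapshot,
  limits: LoadThresholds,
  nowMs: number
): string[] {
  const reasons: string[] = [];
  if (agent.status !== 'active') reasons.push('status is ' + agent.status);
  if (nowMs - agent.lastSeenMs > STALE_AFTER_MS) reasons.push('heartbeat is stale');
  if (agent.activeTasks >= limits.maxActiveTasks) reasons.push('task limit reached');
  if (agent.cpuPercent > limits.cpuThreshold) reasons.push('CPU above threshold');
  if (agent.memoryMb > limits.memoryThreshold) reasons.push('memory above threshold');
  return reasons;
}

function isEligible(agent: AgentSnapshot, limits: LoadThresholds, nowMs: number): boolean {
  return ineligibilityReasons(agent, limits, nowMs).length === 0;
}
```

Returning the reasons rather than a bare boolean keeps the "Tasks Not Distributing" case diagnosable from the coordinator log.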
Resource Monitoring
Phase 6: Resource Monitoring System (10 minutes)
Metrics Collection
Each agent reports metrics to the coordinator:
    // common_npm/agent-mesh/src/runtime/metrics-collector.ts
    import os from 'os';
    import axios from 'axios';

    export class MetricsCollector {
      constructor(
        private agentId: string,
        private coordinatorUrl: string
      ) {}

      startCollection(): void {
        setInterval(() => {
          this.collectAndReport();
        }, 30000); // Every 30 seconds
      }

      private async collectAndReport(): Promise<void> {
        const metrics = {
          agentId: this.agentId,
          timestamp: new Date().toISOString(),
          cpu: {
            percent: this.getCpuUsage(),
            cores: os.cpus().length
          },
          memory: {
            used_mb: this.getMemoryUsage(),
            total_mb: os.totalmem() / 1024 / 1024,
            // Share of memory in use; freemem / totalmem alone would be the free share
            percent: (1 - os.freemem() / os.totalmem()) * 100
          },
          tasks: {
            active: this.getActiveTaskCount(),
            completed: this.getCompletedTaskCount(),
            failed: this.getFailedTaskCount()
          },
          network: {
            latency_ms: await this.measureLatency()
          }
        };

        // Report to coordinator
        try {
          await axios.post(
            `${this.coordinatorUrl}/api/v1/metrics/report`,
            metrics
          );
        } catch (error) {
          console.warn('Failed to report metrics:', error);
        }
      }

      // os.cpus() reports tick counts accumulated since boot, so this is the
      // average usage since boot, not the usage over the last interval.
      private getCpuUsage(): number {
        const cpus = os.cpus();
        let totalIdle = 0;
        let totalTick = 0;
        for (const cpu of cpus) {
          for (const type in cpu.times) {
            totalTick += cpu.times[type as keyof typeof cpu.times];
          }
          totalIdle += cpu.times.idle;
        }
        const idle = totalIdle / cpus.length;
        const total = totalTick / cpus.length;
        // ~~ truncates the idle percentage to an integer
        const usage = 100 - ~~(100 * idle / total);
        return usage;
      }

      private getMemoryUsage(): number {
        // Rounded because agent_metrics.memory_mb is an INTEGER column
        return Math.round((os.totalmem() - os.freemem()) / 1024 / 1024);
      }

      // getActiveTaskCount, getCompletedTaskCount, getFailedTaskCount and
      // measureLatency are not shown in this plan and must be supplied.
    }
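Node's `os.cpus()` returns tick counters accumulated since boot, so a single reading gives an average since boot. If the coordinator should see how busy a machine was during the last reporting interval, one option is to keep the previous snapshot and compute usage from the difference. The sketch below shows only that arithmetic. `CpuTimes` mirrors the `times` object returned by `os.cpus()`, while `sumTimes` and `cpuPercentBetween` are hypothetical helper names.

```typescript
// Mirrors the `times` object of each entry returned by Node's os.cpus().
interface CpuTimes {
  user: number;
  nice: number;
  sys: number;
  idle: number;
  irq: number;
}

// Adds up idle and total ticks across all cores of one snapshot.
function sumTimes(snapshot: CpuTimes[]): { idle: number; total: number } {
  let idle = 0;
  let total = 0;
  for (const t of snapshot) {
    idle += t.idle;
    total += t.user + t.nice + t.sys + t.idle + t.irq;
  }
  return { idle, total };
}

// Percentage of non-idle ticks between two snapshots of the same machine.
function cpuPercentBetween(before: CpuTimes[], after: CpuTimes[]): number {
  const start = sumTimes(before);
  const end = sumTimes(after);
  const totalDelta = end.total - start.total;
  if (totalDelta <= 0) return 0; // no ticks elapsed between the snapshots
  const idleDelta = end.idle - start.idle;
  return 100 * (1 - idleDelta / totalDelta);
}
```

A collector could store `os.cpus().map(cpu => cpu.times)` on each 30-second tick and pass the previous and current arrays to `cpuPercentBetween`.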
Coordinator Metrics Storage
Coordinator stores metrics in PostgreSQL:
    // In coordinator-service.ts
    async storeMetrics(metrics: AgentMetrics): Promise<void> {
      await this.pgPool.query(
        `INSERT INTO agent_metrics
           (agent_id, timestamp, cpu_percent, memory_mb,
            active_tasks, completed_tasks, failed_tasks)
         VALUES ($1, $2, $3, $4, $5, $6, $7)`,
        [
          metrics.agentId,
          metrics.timestamp,
          metrics.cpu.percent,
          metrics.memory.used_mb,
          metrics.tasks.active,
          metrics.tasks.completed,
          metrics.tasks.failed
        ]
      );
    }
Monitoring Dashboard (Optional)
Create a simple monitoring endpoint:
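    // In the coordinator entry point (index.ts). `this` is not bound to the
    // service inside an Express route handler, so the queries run on a
    // module-level pool instead of this.pgPool.
    import { Pool } from 'pg';

    const pgPool = new Pool({ connectionString: process.env.POSTGRES_URL });

    app.get('/api/v1/monitoring/dashboard', async (req, res) => {
      try {
        const agents = await pgPool.query(
          `SELECT agent_id, agent_name, machine_id, status, last_seen, capabilities
           FROM agent_registry
           WHERE status = 'active'
           ORDER BY machine_id, agent_name`
        );

        // The SUM columns add up every sample in the window, so they are only
        // meaningful if agents report per-interval counts, not running totals.
        const metrics = await pgPool.query(
          `SELECT agent_id,
                  AVG(cpu_percent) AS avg_cpu,
                  AVG(memory_mb) AS avg_memory,
                  SUM(active_tasks) AS total_active_tasks,
                  SUM(completed_tasks) AS total_completed,
                  SUM(failed_tasks) AS total_failed
           FROM agent_metrics
           WHERE timestamp > NOW() - INTERVAL '1 hour'
           GROUP BY agent_id`
        );

        res.json({
          agents: agents.rows,
          metrics: metrics.rows,
          summary: {
            total_agents: agents.rows.length,
            agents_by_machine: {
              'mac-m4': agents.rows.filter(a => a.machine_id === 'mac-m4').length,
              'mac-m3': agents.rows.filter(a => a.machine_id === 'mac-m3').length
            }
          }
        });
      } catch (error) {
        res.status(500).json({
          error: error instanceof Error ? error.message : String(error)
        });
      }
    });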
Testing and Validation
Phase 7: Testing Task Distribution (5 minutes)
Test Script
    // test-coordination.ts
    import axios from 'axios';

    const COORDINATOR_URL = process.env.COORDINATOR_URL || 'http://localhost:8080';

    // Two test agents per machine
    const agents = [
      { id: 'test-agent-m4-1', name: 'Test Agent M4-1', machine: 'mac-m4', capabilities: ['code_generation', 'testing'] },
      { id: 'test-agent-m4-2', name: 'Test Agent M4-2', machine: 'mac-m4', capabilities: ['code_review', 'refactoring'] },
      { id: 'test-agent-m3-1', name: 'Test Agent M3-1', machine: 'mac-m3', capabilities: ['code_generation', 'testing'] },
      { id: 'test-agent-m3-2', name: 'Test Agent M3-2', machine: 'mac-m3', capabilities: ['documentation', 'analysis'] }
    ];

    const tasks = [
      { id: 'task-1', type: 'code_generation', requiredCapabilities: ['code_generation'] },
      { id: 'task-2', type: 'code_review', requiredCapabilities: ['code_review'] },
      { id: 'task-3', type: 'testing', requiredCapabilities: ['testing'] },
      { id: 'task-4', type: 'documentation', requiredCapabilities: ['documentation'] }
    ];

    async function testCoordination() {
      console.log('Testing agent coordination...');

      // 1. Register test agents from both Macs
      console.log('\n1. Registering agents...');
      for (const agent of agents) {
        await axios.post(`${COORDINATOR_URL}/api/v1/agents/register`, {
          agentId: agent.id,
          agentName: agent.name,
          machineId: agent.machine,
          capabilities: agent.capabilities,
          endpoint: `http://${agent.machine}:3000/agent/${agent.id}`
        });
      }
      console.log(' Agents registered');

      // 2. Distribute tasks
      console.log('\n2. Distributing tasks...');
      for (const task of tasks) {
        const response = await axios.post(
          `${COORDINATOR_URL}/api/v1/tasks/distribute`,
          { ...task, priority: 1, payload: { test: true } }
        );
        console.log(`Task ${task.id} assigned to: ${response.data.agentId}`);
      }

      // 3. Check monitoring dashboard
      console.log('\n3. Checking monitoring dashboard...');
      const dashboard = await axios.get(`${COORDINATOR_URL}/api/v1/monitoring/dashboard`);
      console.log('Dashboard data:', JSON.stringify(dashboard.data, null, 2));

      console.log('\n Coordination test complete!');
    }

    testCoordination().catch(console.error);
Security Considerations
Network Security
- Redis Authentication: Always use strong passwords
- PostgreSQL Authentication: Use strong passwords, limit network access
- TLS/SSL: Consider using stunnel or Redis/TLS for production
- Firewall Rules: Only allow connections from known Mac IPs
- VPN Option: Consider using VPN for secure communication
Agent Authentication
- API Keys: Each agent should authenticate with coordinator using API key
- Token Rotation: Rotate tokens regularly
- Rate Limiting: Implement rate limiting on coordinator endpoints
Implementation
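    // Add authentication middleware to coordinator.
    // Register it before the /api/v1 routes so that it runs first.
    // The keys below are placeholders: load real keys from the environment
    // or a secret store rather than committing them.
    const API_KEYS = new Map<string, string>([
      ['mac-m4-key', 'mac-m4'],
      ['mac-m3-key', 'mac-m3']
    ]);

    app.use('/api/v1', (req, res, next) => {
      // req.get() returns a single string (or undefined) for this header,
      // whereas req.headers[...] is typed string | string[] | undefined
      const apiKey = req.get('x-api-key');
      if (!apiKey || !API_KEYS.has(apiKey)) {
        return res.status(401).json({ error: 'Unauthorized' });
      }
      // res.locals is Express's per-request storage; using it avoids
      // extending the Request type with a machineId property
      res.locals.machineId = API_KEYS.get(apiKey);
      next();
    });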
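The rate limiting mentioned under Agent Authentication is not shown in this plan. A token bucket per client is one common way to implement it, and the sketch below is offered under that assumption. `TokenBucket` and `PerClientRateLimiter` are hypothetical names, and the capacity and refill rate are illustrative. The current time is passed in explicitly so the behaviour can be checked without waiting.

```typescript
// Hypothetical limiter; capacity and refill rate are illustrative values.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private capacity: number;
  private refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full so a short burst is allowed
    this.lastRefillMs = nowMs;
  }

  // Returns true and spends one token if the request is allowed.
  tryConsume(nowMs: number): boolean {
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

class PerClientRateLimiter {
  // Null-prototype object so that client ids never collide with built-in keys.
  private buckets: Record<string, TokenBucket | undefined> = Object.create(null);
  private capacity: number;
  private refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
  }

  allow(clientId: string, nowMs: number): boolean {
    let bucket = this.buckets[clientId];
    if (!bucket) {
      bucket = new TokenBucket(this.capacity, this.refillPerSecond, nowMs);
      this.buckets[clientId] = bucket;
    }
    return bucket.tryConsume(nowMs);
  }
}
```

In the coordinator, the limiter could be keyed by the machine ID that the API key maps to, calling `allow(machineId, Date.now())` and answering with HTTP 429 when it returns false.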
Deployment Checklist
Pre-Deployment
- Redis installed and accessible from both Macs
- PostgreSQL installed and accessible from both Macs
- Database schema created
- Coordinator service code implemented
- Environment variables configured
- Firewall rules configured
- API keys generated and distributed
Mac M4 Setup
- Coordinator service installed
- LaunchAgent configured
- Service running and healthy
- Agents configured to register with coordinator
- Test agent registration successful
Mac M3 Setup
- Environment variables set (COORDINATOR_URL, MACHINE_ID, API_KEY)
- Agents configured to register with coordinator
- Test agent registration successful
- Network connectivity verified
Validation
- Agents from both Macs visible in coordinator
- Task distribution working across machines
- Load balancing distributing tasks correctly
- Metrics collection working
- Health checks functioning
- Failover working (test by stopping one agent)
Troubleshooting Guide
Common Issues
1. Agents Not Registering
Symptoms: Agents don't appear in coordinator dashboard
Diagnosis:
    # Check coordinator logs
    tail -f ~/Sites/LLM/logs/coordinator.log
    # Check Redis connectivity
    redis-cli -h <router-ip> -p 6379 -a <password> PING
    # Check PostgreSQL connectivity
    psql -h <router-ip> -U agent_user -d agent_coordinator -c "SELECT COUNT(*) FROM agent_registry;"
Solutions:
- Verify network connectivity
- Check firewall rules
- Verify credentials
- Check coordinator service is running
2. Tasks Not Distributing
Symptoms: Tasks stay in queue, not assigned to agents
Diagnosis:
    # Check Redis task queues
    redis-cli -h <router-ip> -p 6379 -a <password> KEYS "tasks:*"
    # Check agent capabilities
    redis-cli -h <router-ip> -p 6379 -a <password> SMEMBERS "capability:code_generation"
Solutions:
- Verify agents have required capabilities
- Check load balancer configuration
- Verify agent health status
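When checking capabilities, it helps to remember that the coordinator intersects the `capability:<name>` sets, so a task is only assignable if a single agent offers every required capability. The sketch below reproduces that intersection in memory so the two failure modes can be told apart: a capability that nobody offers, and capabilities that are offered only by different agents. `CapabilityIndex` and `matchAgents` are hypothetical names, and the sample data is taken from the test script's agents.

```typescript
// Hypothetical in-memory mirror of the Redis capability:<name> sets.
type CapabilityIndex = Record<string, string[] | undefined>;

interface MatchReport {
  eligible: string[]; // agents offering every required capability
  missing: string[];  // required capabilities that no agent offers at all
}

function matchAgents(index: CapabilityIndex, required: string[]): MatchReport {
  const lists = required.map(cap => index[cap] ?? []);
  const missing = required.filter(cap => (index[cap] ?? []).length === 0);
  // An empty requirement matches nothing here, mirroring the fact that
  // an intersection needs at least one set to start from.
  if (lists.length === 0 || missing.length > 0) {
    return { eligible: [], missing };
  }
  // Keep only the agent ids that appear in every list.
  const eligible = lists.reduce((acc, list) => acc.filter(id => list.indexOf(id) !== -1));
  return { eligible: eligible.slice().sort(), missing };
}

// Capabilities registered by the four agents in the test script.
const sampleIndex: CapabilityIndex = {
  code_generation: ['test-agent-m4-1', 'test-agent-m3-1'],
  testing: ['test-agent-m4-1', 'test-agent-m3-1'],
  code_review: ['test-agent-m4-2'],
  refactoring: ['test-agent-m4-2'],
  documentation: ['test-agent-m3-2'],
  analysis: ['test-agent-m3-2']
};
```

For example, a task requiring both `code_review` and `documentation` yields no eligible agent even though neither capability is missing, because no single test agent offers both.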
3. High Latency
Symptoms: Tasks take a long time to distribute
Diagnosis:
- Check network latency between Macs and router
- Check Redis/PostgreSQL performance
- Review coordinator service logs for bottlenecks
Solutions:
- Optimize database queries
- Consider Redis connection pooling
- Review load balancer algorithm
Performance Optimization
Redis Optimization
- Connection Pooling: Use connection pools for Redis
- Pipelining: Batch Redis operations
- Memory Management: Configure Redis maxmemory policy
PostgreSQL Optimization
- Indexes: Ensure proper indexes on frequently queried columns
- Connection Pooling: Use pgBouncer or similar
- Query Optimization: Review slow queries
Load Balancer Tuning
- Weights: Adjust machine weights based on actual performance
- Health Check Interval: Balance between responsiveness and overhead
- Task Batching: Batch small tasks for efficiency
Future Enhancements
Phase 8: Advanced Features (Future)
- Dynamic Scaling: Automatically spawn agents based on load
- Task Prioritization: Implement priority queues
- Agent Specialization: Route tasks to specialized agents
- Cost Tracking: Track token usage per agent/machine
- Auto-Failover: Automatically restart failed agents
- Distributed Tracing: Full observability across machines
- Web Dashboard: Real-time monitoring UI
- Mobile Notifications: Alert on critical issues
Summary
This plan provides a complete technical approach to coordinating agents across two Mac machines:
- Shared Infrastructure: Redis and PostgreSQL on router for coordination
- Coordinator Service: Central service on Mac M4 managing task distribution
- Agent Registration: Both Macs register agents with coordinator
- Load Balancing: Capability-aware weighted distribution
- Resource Monitoring: Real-time metrics collection and storage
Total Implementation Time:
- Basic setup: ~20 minutes
- Full production deployment: ~2-4 hours
- Testing and validation: ~30 minutes
Expected Benefits:
- Double agent processing capacity
- Better resource utilization
- Automatic failover and load balancing
- Centralized monitoring and management