Common Pitfalls

Separation of Duties: Getting started guides document onboarding only. They do NOT own agent manifests, execution, or infrastructure configuration. See Separation of Duties for details.

Common mistakes made when working with the LLM Platform, BuildKit, and OSSA agents, and how to avoid them.

Installation & Setup
OSSA Agent Development
BuildKit CLI Usage
Drupal Development
Workflow Orchestration
Production Deployment
Performance & Optimization
Security

Installation & Setup

Pitfall: Using Docker Desktop Instead of OrbStack (macOS)

Problem:

# Slow file sync, high CPU usage, poor performance
docker ps  # Takes 5+ seconds

Solution:

# Switch to OrbStack for faster file sync and lower CPU usage
brew install orbstack

# Uninstall Docker Desktop
# macOS: Applications > Docker > Uninstall

# Verify OrbStack
docker ps  # Should be instant

Why it matters: On macOS, bind-mounted project files cross a VM boundary, and Docker Desktop's file sharing can make that slow. OrbStack uses VirtIO-FS, which keeps file access close to native speed.

Pitfall: Not Installing DDEV Addons

Problem:

ddev drush status
# Command not found: drush

Solution:

# Install platform-specific DDEV addons (one-time)
cd ~/Sites/LLM/llm-platform
./infrastructure/ddev-addons/install-addons.sh

# Now available:
ddev drush status
ddev tddai check
ddev git-safe commit

Pitfall: Wrong PHP Version

Problem:

composer install
# Your requirements could not be resolved to an installable set of packages
# drupal/core requires php >=8.3

Solution:

# Check PHP version
php --version

# macOS: Install PHP 8.3
brew install php@8.3
brew link php@8.3 --force --overwrite

# Verify
php --version  # Should show 8.3.x

Pitfall: Missing Environment Variables

Problem:

buildkit agents deploy
# Error: GITLAB_TOKEN not set

Solution:

# Store tokens in ~/.tokens/
mkdir -p ~/.tokens
chmod 700 ~/.tokens
echo "your-gitlab-token" > ~/.tokens/gitlab
chmod 600 ~/.tokens/gitlab

# Set environment variables
export GITLAB_URL="https://gitlab.com"
export GITLAB_TOKEN="$(cat ~/.tokens/gitlab)"

# Persist in shell profile
echo 'export GITLAB_URL="https://gitlab.com"' >> ~/.zshrc
echo 'export GITLAB_TOKEN="$(cat ~/.tokens/gitlab)"' >> ~/.zshrc

OSSA Agent Development

Pitfall: Invalid OSSA Manifest

Problem:

# agent.ossa.yaml
ossaVersion: "0.4.9"
agent:
  id: my-agent
  name: My Agent
  # Missing required fields!

Solution:

ossaVersion: "0.4.9"

agent:
  id: my-agent                    # Required: DNS-1123 format
  name: My Agent                  # Required: Human-readable name
  version: "1.0.0"                # Required: Semantic version
  role: worker                    # Required: worker, governor, critic, observer

  runtime:                        # Required
    type: local                   # Required
    node:
      version: "20.x"
      entrypoint: "dist/index.js"

  capabilities:                   # Required: At least one
    - name: process_data
      description: Process data
      input_schema: { type: object }
      output_schema: { type: object }

Validate:

ossa validate agent.ossa.yaml
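
The required fields listed in the manifest above can also be checked in code before a manifest reaches `ossa validate`. The sketch below is only an illustration: `findManifestProblems` is a hypothetical helper, the field list is taken from the comments in the example manifest, and it is not the schema that `ossa validate` actually applies.

```javascript
// Required fields taken from the example manifest above; NOT the real OSSA schema.
const DNS_1123 = /^[a-z0-9]([-a-z0-9]*[a-z0-9])?$/;
const ROLES = ['worker', 'governor', 'critic', 'observer'];

// Returns a list of human-readable problems; an empty list means the checks passed.
function findManifestProblems(manifest) {
  const problems = [];
  const agent = (manifest && manifest.agent) || {};

  if (!manifest || !manifest.ossaVersion) problems.push('ossaVersion is missing');
  if (typeof agent.id !== 'string' || agent.id.length > 63 || !DNS_1123.test(agent.id)) {
    problems.push('agent.id must be a DNS-1123 label');
  }
  for (const field of ['name', 'version']) {
    if (!agent[field]) problems.push(`agent.${field} is missing`);
  }
  if (!ROLES.includes(agent.role)) {
    problems.push(`agent.role must be one of ${ROLES.join(', ')}`);
  }
  if (!agent.runtime || !agent.runtime.type) problems.push('agent.runtime.type is missing');
  if (!Array.isArray(agent.capabilities) || agent.capabilities.length === 0) {
    problems.push('agent.capabilities needs at least one entry');
  }
  return problems;
}

// The incomplete manifest from the "Problem" example, as a plain object.
const incomplete = { ossaVersion: '0.4.9', agent: { id: 'my-agent', name: 'My Agent' } };
console.log(findManifestProblems(incomplete));
```

For the incomplete manifest this reports the missing version, role, runtime type, and capabilities.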

Pitfall: Missing Capability Input/Output Schemas

Problem:

capabilities:
  - name: validate_code
    description: Validate code
    # Missing input_schema and output_schema!

Impact: Agents can't communicate, workflow orchestration breaks, no type safety.

Solution:

capabilities:
  - name: validate_code
    description: Validate code quality
    input_schema:
      type: object
      properties:
        files:
          type: array
          items:
            type: string
          description: List of file paths
        standards:
          type: string
          enum: [drupal, javascript, python]
      required: [files]
    output_schema:
      type: object
      properties:
        valid:
          type: boolean
        violations:
          type: array
          items:
            type: object
        summary:
          type: string
      required: [valid]
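
To see why the schemas matter, here is a rough sketch of how a caller could check a payload against `input_schema` before invoking the capability. `checkSchema` is a hypothetical helper that understands only the keywords used above (`type`, `required`, `properties`, `items`, `enum`); a real deployment would more likely use a full JSON Schema validator.

```javascript
// Minimal checker for the subset of JSON Schema used above. Not a full validator:
// for instance, it does not know the "integer" type.
function typeOf(value) {
  if (Array.isArray(value)) return 'array';
  if (value === null) return 'null';
  return typeof value; // 'object', 'string', 'boolean', 'number', ...
}

function checkSchema(schema, value, path = 'input') {
  const actual = typeOf(value);
  if (schema.type && actual !== schema.type) {
    return [`${path}: expected ${schema.type}, got ${actual}`];
  }
  const errors = [];
  if (schema.enum && !schema.enum.includes(value)) {
    errors.push(`${path}: must be one of ${schema.enum.join(', ')}`);
  }
  if (actual === 'object') {
    for (const key of schema.required || []) {
      if (!(key in value)) errors.push(`${path}.${key}: required`);
    }
    for (const [key, sub] of Object.entries(schema.properties || {})) {
      if (key in value) errors.push(...checkSchema(sub, value[key], `${path}.${key}`));
    }
  }
  if (actual === 'array' && schema.items) {
    value.forEach((item, i) => errors.push(...checkSchema(schema.items, item, `${path}[${i}]`)));
  }
  return errors;
}

// The validate_code input schema from above, as a plain object.
const inputSchema = {
  type: 'object',
  properties: {
    files: { type: 'array', items: { type: 'string' } },
    standards: { type: 'string', enum: ['drupal', 'javascript', 'python'] },
  },
  required: ['files'],
};

console.log(checkSchema(inputSchema, { standards: 'ruby' }));
```

The call at the end reports two problems: `files` is missing and `standards` is not in the enum.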

Pitfall: Not Handling Agent Errors

Problem:

// Agent crashes on error
app.post('/capabilities/validate', async (req, res) => {
  const result = await validateCode(req.body.files);  // Might throw!
  res.json(result);
});

Solution:

app.post('/capabilities/validate', async (req, res) => {
  try {
    const { files, standards = 'javascript' } = req.body;

    if (!files || !Array.isArray(files)) {
      return res.status(400).json({
        error: 'Invalid input: files array required',
        code: 'INVALID_INPUT'
      });
    }

    const result = await validateCode(files, standards);
    res.json(result);
  } catch (error) {
    logger.error('Validation failed', { error: error.message, stack: error.stack });
    res.status(500).json({
      error: 'Validation failed',
      code: 'VALIDATION_ERROR',
      message: error.message
    });
  }
});
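
When several routes need the same protection, the try/catch can be factored into a wrapper. This is a minimal sketch, assuming Express-style `req`/`res` objects; `asyncHandler` and its `log` parameter are hypothetical names, not part of Express or of the platform.

```javascript
// Wraps an async route handler so a thrown error or rejected promise becomes
// a JSON 500 response instead of an unhandled rejection.
// `log` is a hypothetical logging callback; swap in the project's logger.
function asyncHandler(handler, log = console.error) {
  return async (req, res) => {
    try {
      await handler(req, res);
    } catch (error) {
      log('Request failed', { error: error.message });
      // Only respond if the handler has not already started a response.
      if (!res.headersSent) {
        res.status(500).json({ error: 'Internal error', code: 'INTERNAL_ERROR' });
      }
    }
  };
}
```

A route would then be registered as `app.post('/capabilities/validate', asyncHandler(async (req, res) => { ... }))`, keeping input validation inside the handler and the catch-all in one place.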

Pitfall: Ignoring Agent Health Checks

Problem:

# Kubernetes kills agent repeatedly
kubectl get pods -n agents
# agent-pod   0/1   CrashLoopBackOff

Solution:

// Add health check endpoint
app.get('/health', (req, res) => {
  const health = {
    status: 'ok',
    agent: 'my-agent',
    version: '1.0.0',
    uptime: process.uptime(),
    memory: process.memoryUsage(),
  };

  res.json(health);
});

// Add readiness check
app.get('/ready', async (req, res) => {
  try {
    // Check dependencies
    await checkDatabaseConnection();
    await checkExternalServices();

    res.json({ ready: true });
  } catch (error) {
    res.status(503).json({ ready: false, error: error.message });
  }
});

Kubernetes config:

livenessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 30
  periodSeconds: 10

readinessProbe:
  httpGet:
    path: /ready
    port: 3000
  initialDelaySeconds: 15
  periodSeconds: 5

BuildKit CLI Usage

Pitfall: Not Checking Agent Status Before Deployment

Problem:

buildkit agents deploy my-agent
# Deploys broken agent to production!

Solution:

# Always validate first
buildkit ossa validate agent.ossa.yaml

# Test locally
buildkit agents start my-agent --local

# Run health check
curl http://localhost:3000/health

# Then deploy
buildkit agents deploy my-agent --namespace agents

Pitfall: Hardcoding Configuration

Problem:

// Hardcoded values
const DATABASE_URL = 'postgresql://user:password@localhost:5432/db';
const API_KEY = 'sk-1234567890';

Solution:

// Use environment variables
const DATABASE_URL = process.env.DATABASE_URL;
const API_KEY = process.env.API_KEY;

// Validate at startup
if (!DATABASE_URL || !API_KEY) {
  console.error('Missing required environment variables');
  process.exit(1);
}

Set in Kubernetes:

env:
  - name: DATABASE_URL
    valueFrom:
      secretKeyRef:
        name: db-credentials
        key: url
  - name: API_KEY
    valueFrom:
      secretKeyRef:
        name: api-credentials
        key: key
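
The startup check can be generalized so that a single failure names every missing variable. A small sketch, where `readRequiredEnv` is a hypothetical helper and the example values are placeholders:

```javascript
// Collects every missing variable so one startup failure lists them all.
function readRequiredEnv(names, env = process.env) {
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  return Object.fromEntries(names.map((name) => [name, env[name]]));
}

// Example with an explicit object instead of process.env:
try {
  readRequiredEnv(['DATABASE_URL', 'API_KEY'], { DATABASE_URL: 'postgresql://db.example/app' });
} catch (error) {
  console.error(error.message); // Missing required environment variables: API_KEY
}
```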

Pitfall: Not Using BuildKit Golden Commands

Problem:

# Doing manual work that BuildKit automates
grep -r "TODO" src/
find . -name "*.ts" -exec eslint {} \;
git add . && git commit -m "Update"

Solution:

# Use BuildKit golden commands instead
buildkit golden audit          # Comprehensive security + quality audit
buildkit golden fix            # Auto-fix issues
buildkit golden test           # Run all tests
buildkit golden sync           # Sync GitLab (issues + wiki)
buildkit golden deploy --env dev  # Deploy with checks

Drupal Development

Pitfall: Editing Composer-Managed Modules

Problem:

# Editing files in web/modules/custom/
cd $LLM_ROOT/llm-platform/web/modules/custom/llm
nano llm.module  # Changes will be LOST on composer install!

Solution:

# Edit source files instead
cd $LLM_ROOT/all_drupal_custom/modules/llm
nano llm.module

# Sync to llm-platform
buildkit drupal sync --modules

# Or manually
cd $LLM_ROOT/llm-platform
composer update drupal/llm

Why: web/modules/custom/* is managed by Composer and will be overwritten.

Pitfall: Not Clearing Drupal Cache

Problem:

# Made changes but don't see them
# Updated routing, added service, changed config

Solution:

# Always clear cache after changes
ddev drush cr

# Restarting containers is not a substitute: it does not rebuild Drupal's caches
ddev restart && ddev drush cr

When to clear cache:

After configuration import (drush cim)
After module enable/disable
After routing changes
After service definition changes
After almost any other code change (hooks, plugins, Twig templates)

Pitfall: Skipping Configuration Export

Problem:

# Made configuration changes in UI
# Didn't export to code
# Lost on next deployment!

Solution:

# After any UI configuration changes
ddev drush cex -y

# Commit changes
git add config/
git commit -m "feat: update content type configuration"

Automate with Git hook:

#!/bin/bash
# .git/hooks/pre-commit (make it executable: chmod +x .git/hooks/pre-commit)
set -e
ddev drush cex -y
git add config/

Workflow Orchestration

Pitfall: Missing Workflow Dependencies

Problem:

stages:
  - name: deploy
    # Forgot depends_on!
    steps:
      - name: deploy_to_prod
        agent: deployment-orchestrator

Impact: Deploy runs before tests complete, deploys broken code.

Solution:

stages:
  - name: validate
    steps: [...]

  - name: test
    depends_on: [validate]
    steps: [...]

  - name: deploy
    depends_on: [test]
    condition: "{{ stages.test.status == 'passed' }}"
    steps: [...]
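
The effect of `depends_on` can be illustrated with a topological sort: each stage is scheduled only after everything it depends on. This is a sketch of the general technique, not the platform's workflow engine; `orderStages` is a hypothetical function.

```javascript
// Orders stages so each one comes after everything in its depends_on list.
// Throws on unknown dependencies and on dependency cycles.
function orderStages(stages) {
  const byName = new Map(stages.map((stage) => [stage.name, stage]));
  const state = new Map(); // name -> 'visiting' | 'done'
  const ordered = [];

  function visit(name, requiredBy) {
    const stage = byName.get(name);
    if (!stage) throw new Error(`Stage "${requiredBy}" depends on unknown stage "${name}"`);
    if (state.get(name) === 'done') return;
    if (state.get(name) === 'visiting') throw new Error(`Dependency cycle at "${name}"`);
    state.set(name, 'visiting');
    for (const dep of stage.depends_on || []) visit(dep, name);
    state.set(name, 'done');
    ordered.push(name);
  }

  for (const stage of stages) visit(stage.name, null);
  return ordered;
}

// Declaration order does not matter once dependencies are explicit.
const stages = [
  { name: 'deploy', depends_on: ['test'] },
  { name: 'test', depends_on: ['validate'] },
  { name: 'validate' },
];
console.log(orderStages(stages)); // [ 'validate', 'test', 'deploy' ]
```

Without `depends_on`, nothing in the definition ties `deploy` to `test`, so an engine is free to start it first.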

Pitfall: No Timeout Configuration

Problem:

steps:
  - name: run_tests
    agent: test-runner
    # No timeout! Hangs forever if tests freeze.

Solution:

steps:
  - name: run_tests
    agent: test-runner
    capability: run_tests
    timeout: 10m              # Fail after 10 minutes
    retry:
      max_attempts: 2
      backoff: exponential
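
Roughly what `timeout` and `retry` ask an executor to do can be sketched in a few lines. The helper names (`withTimeout`, `retryWithBackoff`) and the `baseDelayMs` option are assumptions for illustration; how the workflow engine itself implements these settings is not shown in this guide.

```javascript
// Rejects if `task` does not settle within `ms` milliseconds.
// Note: the underlying work is not cancelled, only abandoned.
function withTimeout(task, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Promise.resolve().then(task) also turns synchronous throws into rejections.
  return Promise.race([Promise.resolve().then(task), timeout]).finally(() => clearTimeout(timer));
}

// Retries `task` up to `maxAttempts` times, doubling the wait after each failure.
async function retryWithBackoff(task, options = {}) {
  const { maxAttempts = 2, baseDelayMs = 1000, timeoutMs = 10 * 60 * 1000 } = options;
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    try {
      return await withTimeout(task, timeoutMs);
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        const delay = baseDelayMs * 2 ** (attempt - 1);
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

The defaults mirror the YAML above: two attempts and a 10 minute limit per attempt.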

Pitfall: Not Handling Workflow Failures

Problem:

# No failure handling
# Leaves deployments in inconsistent state

Solution:

on_workflow_failure:
  - name: rollback
    agent: deployment-orchestrator
    capability: rollback
    input:
      environment: "{{ env.ENVIRONMENT }}"

  - name: notify_team
    agent: slack-notifier
    capability: send_message
    input:
      channel: "#incidents"
      message: "Deployment failed: {{ workflow.error }}"

Production Deployment

Pitfall: No Resource Limits

Problem:

# Kubernetes deployment without limits
spec:
  containers:
    - name: agent
      image: my-agent:latest
      # No resources! Agent can consume all cluster resources!

Solution:

spec:
  containers:
    - name: agent
      image: my-agent:latest
      resources:
        requests:
          cpu: 250m
          memory: 512Mi
        limits:
          cpu: 1000m
          memory: 2Gi

Pitfall: Missing Health Checks in Kubernetes

Problem:

# Kubernetes doesn't know if pod is healthy
kubectl get pods
# Pod shows Running but agent is crashed inside

Solution:

spec:
  containers:
    - name: agent
      livenessProbe:
        httpGet:
          path: /health
          port: 3000
        initialDelaySeconds: 30
        periodSeconds: 10
        failureThreshold: 3

      readinessProbe:
        httpGet:
          path: /ready
          port: 3000
        initialDelaySeconds: 15
        periodSeconds: 5
        failureThreshold: 3

Pitfall: Not Using Persistent Volumes

Problem:

# Pod restarts, loses all data!
kubectl delete pod my-agent-xyz
# Files, databases, logs - all gone!

Solution:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: agent-storage
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: fast-ssd

---
# Pod spec fragment (for example, spec.template.spec in a Deployment)
spec:
  containers:
    - name: agent
      volumeMounts:
        - name: storage
          mountPath: /data
  volumes:
    - name: storage
      persistentVolumeClaim:
        claimName: agent-storage

Pitfall: No SSL/TLS Configuration

Problem:

# Accessing service via HTTP
curl http://agents.yourcompany.com
# Insecure! Credentials sent in plaintext!

Solution:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: agents-ingress
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
    - hosts:
        - agents.yourcompany.com
      secretName: agents-tls
  rules:
    - host: agents.yourcompany.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: agents
                port:
                  number: 80

Performance & Optimization

Pitfall: Not Enabling Caching

Problem:

// Recalculates expensive operation on every request
app.get('/data', async (req, res) => {
  const data = await expensiveCalculation();  // Takes 5 seconds!
  res.json(data);
});

Solution:

import Redis from 'ioredis';
const redis = new Redis(process.env.REDIS_URL);

app.get('/data', async (req, res) => {
  // Check cache first
  const cached = await redis.get('expensive-data');
  if (cached) {
    return res.json(JSON.parse(cached));
  }

  // Calculate and cache
  const data = await expensiveCalculation();
  await redis.setex('expensive-data', 3600, JSON.stringify(data));
  res.json(data);
});
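
The pattern in the solution is often called cache-aside: look up, compute on a miss, store with a TTL. The sketch below shows the same flow with an in-memory `Map` standing in for Redis, so it runs without a server; `createTtlCache` and `getOrCompute` are hypothetical names, and an in-process cache is not shared between agent replicas the way Redis is.

```javascript
// In-memory stand-in for Redis that shows the cache-aside flow.
// `now` is injectable so expiry can be exercised without waiting.
function createTtlCache(now = Date.now) {
  const entries = new Map(); // key -> { value, expiresAt }
  return {
    async getOrCompute(key, ttlSeconds, compute) {
      const hit = entries.get(key);
      if (hit && hit.expiresAt > now()) return hit.value; // fresh entry: skip the work
      const value = await compute(); // miss or expired: do the expensive work once
      entries.set(key, { value, expiresAt: now() + ttlSeconds * 1000 });
      return value;
    },
  };
}
```

In the route above, the handler body would reduce to `res.json(await cache.getOrCompute('expensive-data', 3600, expensiveCalculation))`.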

Pitfall: Blocking Event Loop

Problem:

// Synchronous file operations block event loop
app.post('/process', (req, res) => {
  const files = fs.readdirSync('./large-directory');  // Blocks!
  files.forEach(file => {
    const content = fs.readFileSync(file);  // Blocks!
    processFile(content);
  });
  res.json({ done: true });
});

Solution:

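import fs from 'node:fs';
import path from 'node:path';

app.post('/process', async (req, res) => {
  // Use async operations
  const dir = './large-directory';
  const files = await fs.promises.readdir(dir);
  await Promise.all(
    files.map(async file => {
      // readdir returns bare names, so join each one with the directory
      const content = await fs.promises.readFile(path.join(dir, file));
      await processFile(content);
    })
  );
  res.json({ done: true });
});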
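
One caveat with `Promise.all` over a large directory is that every file is opened at once. A possible refinement, sketched here under the assumption that a fixed concurrency cap is acceptable, is to process items in a bounded number of lanes; `mapWithLimit` is a hypothetical helper.

```javascript
// Runs `worker` over `items` with at most `limit` calls in flight at once.
// Results keep the order of `items`.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each lane pulls the next unclaimed index until none are left.
  async function runLane() {
    while (next < items.length) {
      const index = next;
      next += 1;
      results[index] = await worker(items[index], index);
    }
  }

  const lanes = Array.from({ length: Math.min(limit, items.length) }, () => runLane());
  await Promise.all(lanes);
  return results;
}
```

In the handler above, `Promise.all(files.map(...))` could become `mapWithLimit(files, 8, async file => { ... })`, where 8 is an arbitrary starting point to tune.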

Pitfall: Not Monitoring Memory Usage

Problem:

# Agent crashes with OOM (Out of Memory)
kubectl get pods
# agent-xyz   0/1   OOMKilled

Solution:

// Monitor memory usage
setInterval(() => {
  const usage = process.memoryUsage();
  const mbUsed = Math.round(usage.heapUsed / 1024 / 1024);

  logger.info('Memory usage', { heapUsed: mbUsed });

  if (mbUsed > 1500) {  // 1.5 GB threshold
    logger.warn('High memory usage', { heapUsed: mbUsed });
    // Trigger cleanup, restart, or scale
  }
}, 60000);  // Check every minute

Security

Pitfall: Committing Secrets to Git

Problem:

# Committed .env file with secrets!
git add .env
git commit -m "Add config"
git push
# Secrets now in Git history forever!

Solution:

# Add to .gitignore
echo ".env" >> .gitignore
echo ".env.*" >> .gitignore
echo "*.pem" >> .gitignore
echo "*.key" >> .gitignore

# Remove from Git history if already committed
# (also rotate the leaked secret: anyone who already fetched the repo still has it)
git filter-branch --force --index-filter \
  "git rm --cached --ignore-unmatch .env" \
  --prune-empty --tag-name-filter cat -- --all

Store secrets securely:

# Use ~/.tokens/ directory
mkdir -p ~/.tokens
chmod 700 ~/.tokens

echo "secret-value" > ~/.tokens/service-name
chmod 600 ~/.tokens/service-name

Reference in code:

import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';

const token = fs.readFileSync(path.join(os.homedir(), '.tokens', 'gitlab'), 'utf8').trim();

Pitfall: Not Validating Input

Problem:

// Vulnerable to injection attacks
app.post('/execute', (req, res) => {
  const command = req.body.command;
  exec(command);  // Command injection!
});

Solution:

import { execFile } from 'node:child_process';

app.post('/execute', (req, res) => {
  const { command } = req.body;

  // Validate input
  if (!command || typeof command !== 'string') {
    return res.status(400).json({ error: 'Invalid command' });
  }

  // Allow only known commands
  const allowedCommands = ['test', 'build', 'deploy'];
  if (!allowedCommands.includes(command)) {
    return res.status(403).json({ error: 'Command not allowed' });
  }

  // execFile runs the allowlisted program directly, without a shell.
  // HTML escaping (validator.escape) would not make shell input safe.
  execFile(command, [], { timeout: 30000 }, (error, stdout) => {
    if (error) {
      return res.status(500).json({ error: error.message });
    }
    res.json({ output: stdout });
  });
});
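
A stricter variant of the allowlist maps each accepted name to a fixed program and argument list, so nothing from the request body ever reaches the command line. The mapping below is purely hypothetical (the `npm run` scripts are placeholders for whatever the agent really runs), and `resolveCommand` is an illustrative helper.

```javascript
// Hypothetical mapping from request names to fixed commands; adjust to the real tooling.
const COMMANDS = Object.freeze({
  test: { file: 'npm', args: ['run', 'test'] },
  build: { file: 'npm', args: ['run', 'build'] },
  deploy: { file: 'npm', args: ['run', 'deploy'] },
});

// Returns the fixed { file, args } pair for a known name, or null for anything else.
function resolveCommand(name) {
  // hasOwnProperty avoids matching inherited keys such as "constructor".
  if (typeof name !== 'string' || !Object.prototype.hasOwnProperty.call(COMMANDS, name)) {
    return null;
  }
  return COMMANDS[name];
}

console.log(resolveCommand('build'));             // { file: 'npm', args: [ 'run', 'build' ] }
console.log(resolveCommand('build; rm -rf /tmp')); // null
```

The resolved pair can be handed to `execFile(file, args, ...)` from `node:child_process`, which does not spawn a shell by default.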

Pitfall: Missing Rate Limiting

Problem:

// No rate limiting - any client can flood the endpoint
app.post('/webhook', async (req, res) => {
  await processWebhook(req.body);
  res.json({ received: true });
});

Solution:

import rateLimit from 'express-rate-limit';

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000,  // 15 minutes
  max: 100,                   // Limit each IP to 100 requests per windowMs
  message: 'Too many requests, please try again later',
});

app.post('/webhook', limiter, async (req, res) => {
  await processWebhook(req.body);
  res.json({ received: true });
});
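
To make the `windowMs` and `max` settings concrete, here is a minimal fixed-window counter. It is a teaching sketch only, not how express-rate-limit is implemented; `createFixedWindowLimiter` is a hypothetical name, and the map is never pruned, so it is unsuitable for production as written.

```javascript
// Fixed-window counter keyed by client, illustrating what windowMs and max control.
// `now` is injectable so window expiry can be exercised without waiting.
function createFixedWindowLimiter({ windowMs, max, now = Date.now }) {
  const windows = new Map(); // key -> { start, count }
  return function isAllowed(key) {
    const current = now();
    const entry = windows.get(key);
    if (!entry || current - entry.start >= windowMs) {
      // First request from this key, or the previous window has ended.
      windows.set(key, { start: current, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= max;
  };
}
```

With `windowMs: 15 * 60 * 1000` and `max: 100`, the 101st request from one key inside the same 15 minutes is rejected, and the count starts over once the window ends.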

Quick Reference: Troubleshooting Commands

# DDEV
ddev describe                    # Show DDEV project info
ddev logs                        # View container logs
ddev restart                     # Restart containers
ddev delete -O && ddev start     # Nuclear option: deletes the project database without a snapshot, then rebuilds

# BuildKit
buildkit agents status <name>    # Check agent health
buildkit agents logs <name>      # View agent logs
buildkit agents restart <name>   # Restart agent
buildkit ossa validate <file>    # Validate OSSA manifest

# Kubernetes
kubectl get pods -n agents       # List agent pods
kubectl describe pod <pod>       # Pod details
kubectl logs <pod> --follow      # Stream logs
kubectl exec -it <pod> -- /bin/sh  # Shell into pod

# Drupal
ddev drush status                # Drupal status
ddev drush cr                    # Clear cache
ddev drush cex -y                # Export config
ddev drush cim -y                # Import config
ddev drush updb -y               # Run database updates

Next Steps

Review System Requirements for optimal setup
Follow Development Setup for best practices
Learn Production Deployment patterns
