Security Architecture
Comprehensive security architecture for the Bluefly LLM Platform.
Overview
The platform implements defense-in-depth security with multiple layers:
- Authentication: JWT + OAuth 2.0 with GitLab
- Authorization: RBAC with granular permissions
- Encryption: AES-256-GCM at rest, TLS 1.3 in transit
- Compliance: FedRAMP Moderate, NIST 800-53, HIPAA, GDPR
- Audit: Complete audit trails with tamper-proof logging
- Network Security: Zero-trust architecture with mTLS
Security Layers
graph TB
    A[Edge Security] --> B[Network Security]
    B --> C[Authentication Layer]
    C --> D[Authorization Layer]
    D --> E[Application Security]
    E --> F[Data Security]
    F --> G[Audit & Monitoring]
1. Edge Security
Web Application Firewall (WAF)
Features:
- SQL injection prevention
- XSS attack mitigation
- CSRF protection
- Rate limiting
- DDoS protection
Configuration:
waf:
  enabled: true
  rulesets:
    - OWASP_CRS_3.3
    - custom-llm-rules
  blockMode: true
  logAll: true
Content Security Policy (CSP)
Headers:
Content-Security-Policy:
default-src 'self';
script-src 'self';
style-src 'self' 'unsafe-inline';
img-src 'self' data: https:;
connect-src 'self';
font-src 'self';
object-src 'none';
media-src 'self';
frame-src 'none';
Security Headers
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
Referrer-Policy: strict-origin-when-cross-origin
Permissions-Policy: geolocation=(), microphone=(), camera=()
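As a minimal sketch of how the headers above could be assembled in application code, the helper below serializes a directive map into a single Content-Security-Policy value and returns the full header set. The names `buildCsp`, `cspDirectives`, and `securityHeaders` are hypothetical and not part of the platform; the values are copied from the listings above.

```typescript
// Hypothetical helper names; the directive and header values mirror the listings above.
type CspDirectives = Record<string, string[]>;

const cspDirectives: CspDirectives = {
  "default-src": ["'self'"],
  "script-src": ["'self'"],
  "style-src": ["'self'", "'unsafe-inline'"],
  "img-src": ["'self'", "data:", "https:"],
  "connect-src": ["'self'"],
  "font-src": ["'self'"],
  "object-src": ["'none'"],
  "media-src": ["'self'"],
  "frame-src": ["'none'"],
};

// Joins each directive as "name source source" and separates directives with "; ".
function buildCsp(directives: CspDirectives): string {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(" ")}`)
    .join("; ");
}

// Returns every response header from this section as a plain name/value map.
function securityHeaders(): Record<string, string> {
  return {
    "Content-Security-Policy": buildCsp(cspDirectives),
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains; preload",
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "X-XSS-Protection": "1; mode=block",
    "Referrer-Policy": "strict-origin-when-cross-origin",
    "Permissions-Policy": "geolocation=(), microphone=(), camera=()",
  };
}
```

Keeping the directives in a map rather than one long string makes individual sources easier to review and diff.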
2. Network Security
Zero-Trust Architecture
Principles:
- Never trust, always verify
- Least privilege access
- Assume breach mindset
- Verify explicitly
Implementation:
graph LR
    A[Client] -->|mTLS| B[API Gateway]
    B -->|JWT Validation| C[Auth Service]
    C -->|RBAC Check| D[Service Mesh]
    D -->|Encrypted| E[Microservices]
mTLS (Mutual TLS)
All service-to-service communication uses mTLS:
Certificate Authority:
- Internal CA for service certificates
- 90-day certificate rotation
- Automatic renewal via cert-manager
Example Configuration:
mtls:
  enabled: true
  mode: STRICT
  certificates:
    ca: /etc/certs/ca.crt
    cert: /etc/certs/service.crt
    key: /etc/certs/service.key
  rotation:
    autoRenew: true
    renewBefore: 720h # 30 days
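The rotation settings imply a simple timing rule: a certificate issued for 90 days becomes eligible for renewal once less than 720 hours (30 days) remain before it expires. The sketch below illustrates that arithmetic only; `expiryFor` and `needsRenewal` are hypothetical names, and the actual renewal is handled by cert-manager, not by application code.

```typescript
const HOUR_MS = 60 * 60 * 1000;

// Expiry time for a certificate with the 90-day lifetime described above.
function expiryFor(issuedAt: Date, lifetimeDays = 90): Date {
  return new Date(issuedAt.getTime() + lifetimeDays * 24 * HOUR_MS);
}

// True once the remaining validity is at or below the renewBefore window (720h by default).
function needsRenewal(notAfter: Date, now: Date, renewBeforeHours = 720): boolean {
  return notAfter.getTime() - now.getTime() <= renewBeforeHours * HOUR_MS;
}
```

Under these assumptions, a certificate issued on 1 January expires on 1 April and enters its renewal window on 2 March, leaving 30 days to retry a failed renewal.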
Network Segmentation
Zones:
- DMZ: Edge services (gateway, WAF)
- Application: Services, APIs
- Data: Databases, storage
- Management: Admin tools, monitoring
Firewall Rules:
DMZ → Application: Port 443 (HTTPS), 50051 (gRPC)
Application → Data: Port 5432 (PostgreSQL), 6379 (Redis)
Management → All: Port 22 (SSH, admin only)
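The firewall rules can be modeled as a default-deny allowlist. The sketch below reads each rule as a source zone followed by a destination zone, which is an assumption about the intended direction; the `FlowRule` type and `isAllowed` function are hypothetical and only illustrate the evaluation logic, not the real firewall configuration.

```typescript
type Zone = "dmz" | "application" | "data" | "management";

interface FlowRule {
  from: Zone;
  to: Zone | "all";
  ports: number[];
}

// Rules transcribed from the list above, read as source zone then destination zone.
const flowRules: FlowRule[] = [
  { from: "dmz", to: "application", ports: [443, 50051] },
  { from: "application", to: "data", ports: [5432, 6379] },
  { from: "management", to: "all", ports: [22] },
];

// Default deny: traffic passes only if some rule matches source, destination, and port.
function isAllowed(from: Zone, to: Zone, port: number): boolean {
  return flowRules.some(
    (rule) =>
      rule.from === from &&
      (rule.to === "all" || rule.to === to) &&
      rule.ports.includes(port),
  );
}
```

With this model, `isAllowed("dmz", "data", 5432)` is false: edge services cannot reach the databases directly and must go through the application zone.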
3. Authentication
See JWT Authentication and OAuth 2.0 for details.
Methods:
- JWT: Service-to-service
- OAuth 2.0: User authentication (GitLab)
- API Keys: External integrations
- mTLS: Microservice mesh
Token Security:
- RS256 signing (RSA 2048-bit)
- 1-hour access token lifetime
- 30-day refresh token lifetime
- Automatic key rotation (90 days)
- Token revocation support
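To make the token parameters concrete, the following sketch signs and verifies an RS256 token with a 2048-bit RSA key and a 1-hour lifetime, using only Node's built-in `crypto` module. It is an illustration under stated assumptions, not the platform's implementation: the function names and claim set are hypothetical, revocation and key rotation are omitted, and a production service would normally rely on a maintained JWT library.

```typescript
import { createSign, createVerify, generateKeyPairSync } from "node:crypto";
import type { KeyObject } from "node:crypto";

const ACCESS_TOKEN_TTL_SECONDS = 60 * 60; // 1-hour access token lifetime

interface Claims {
  sub: string;
  iat: number;
  exp: number;
}

// RSA 2048-bit key pair; a real deployment would load keys from a secrets store.
function newSigningKeys(): { publicKey: KeyObject; privateKey: KeyObject } {
  return generateKeyPairSync("rsa", { modulusLength: 2048 });
}

function encodeJson(value: object): string {
  return Buffer.from(JSON.stringify(value), "utf8").toString("base64url");
}

function decodeJson(segment: string): Record<string, unknown> {
  return JSON.parse(Buffer.from(segment, "base64url").toString("utf8"));
}

// Builds header.payload.signature, signed with RSASSA-PKCS1-v1_5 and SHA-256 (RS256).
function signToken(subject: string, privateKey: KeyObject, nowSeconds: number): string {
  const header = encodeJson({ alg: "RS256", typ: "JWT" });
  const payload = encodeJson({
    sub: subject,
    iat: nowSeconds,
    exp: nowSeconds + ACCESS_TOKEN_TTL_SECONDS,
  });
  const signingInput = `${header}.${payload}`;
  const signature = createSign("RSA-SHA256")
    .update(signingInput)
    .sign(privateKey)
    .toString("base64url");
  return `${signingInput}.${signature}`;
}

// Returns the claims for a valid, unexpired token and null otherwise.
function verifyToken(token: string, publicKey: KeyObject, nowSeconds: number): Claims | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts as [string, string, string];
  try {
    // Pin the algorithm instead of trusting whatever the token header claims.
    const { alg } = decodeJson(header);
    if (alg !== "RS256") return null;
    const signatureOk = createVerify("RSA-SHA256")
      .update(`${header}.${payload}`)
      .verify(publicKey, Buffer.from(signature, "base64url"));
    if (!signatureOk) return null;
    const { sub, iat, exp } = decodeJson(payload);
    if (typeof sub !== "string" || typeof iat !== "number" || typeof exp !== "number") return null;
    if (exp <= nowSeconds) return null; // expired
    return { sub, iat, exp };
  } catch {
    return null; // malformed base64url or JSON
  }
}
```

Checking the signature before reading any claim, and pinning `alg` to RS256, keeps a forged header from downgrading the verification.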
4. Authorization
See RBAC Configuration for complete details.
Role-Based Access Control (RBAC)
Roles:
- Admin: Full system access
- Developer: Agent execution, workflow creation
- User: Read-only access
- Service: Service account permissions
Permission Model:
permission = resource:action
Examples:
- agent:execute
- workflow:create
- mesh:communicate
- admin:configure
Policy Enforcement
Open Policy Agent (OPA):
package authz

default allow = false

allow {
    input.user.roles[_] == "admin"
}

allow {
    input.user.permissions[_] == concat(":", [input.resource, input.action])
}
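For readers less familiar with Rego, the same decision logic can be expressed in TypeScript: deny by default, allow any user with the admin role, and otherwise allow only when the user holds the exact `resource:action` permission. This mirror is a sketch for reasoning about and unit-testing expected outcomes; the `AuthzInput` type is an assumption about the input shape, and enforcement itself stays in OPA.

```typescript
// Assumed shape of the OPA input document, inferred from the fields the policy reads.
interface AuthzInput {
  user: { roles: string[]; permissions: string[] };
  resource: string;
  action: string;
}

// Mirrors the policy: default deny, admin override, exact "resource:action" match.
function allow(input: AuthzInput): boolean {
  if (input.user.roles.includes("admin")) {
    return true;
  }
  return input.user.permissions.includes(`${input.resource}:${input.action}`);
}
```

For example, a developer holding `agent:execute` is allowed to execute agents but is denied `workflow:create` unless that permission is granted as well.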
Attribute-Based Access Control (ABAC)
Fine-grained access based on attributes:
{
  "user": {
    "id": "user-123",
    "roles": ["developer"],
    "groups": ["llm-platform/ml-team"],
    "clearance": "confidential"
  },
  "resource": {
    "type": "model",
    "classification": "confidential",
    "owner": "llm-platform/ml-team"
  },
  "action": "deploy",
  "context": {
    "time": "2025-01-15T10:00:00Z",
    "location": "us-west-2"
  }
}
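The document does not state which rules are evaluated against these attributes, so the sketch below uses two illustrative ones: the user's clearance must be at least the resource's classification, and the user must belong to the group that owns the resource. The ordered `LEVELS` list and the `isPermitted` function are assumptions for illustration only; "confidential" is the only level that appears in the example request.

```typescript
// Hypothetical ordering of sensitivity levels, lowest to highest.
const LEVELS = ["public", "internal", "confidential", "restricted"] as const;
type Level = (typeof LEVELS)[number];

interface AccessRequest {
  user: { id: string; roles: string[]; groups: string[]; clearance: Level };
  resource: { type: string; classification: Level; owner: string };
  action: string;
  context: { time: string; location: string };
}

// Illustrative rule set: sufficient clearance AND membership in the owning group.
function isPermitted(request: AccessRequest): boolean {
  const cleared =
    LEVELS.indexOf(request.user.clearance) >= LEVELS.indexOf(request.resource.classification);
  const inOwningGroup = request.user.groups.includes(request.resource.owner);
  return cleared && inOwningGroup;
}
```

Under these assumed rules the example request is permitted, because the clearance matches the classification and the user is a member of `llm-platform/ml-team`.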
5. Data Security
Encryption at Rest
See Encryption at Rest for details.
Algorithms:
- Database: AES-256-GCM
- Files: AES-256-GCM
- Backups: AES-256-GCM with separate keys
Key Management:
- Hardware Security Module (HSM) for key storage
- Key rotation every 90 days
- Key versioning for decryption of old data
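Key versioning is what lets rotation happen without re-encrypting everything at once: each ciphertext records the version of the key that produced it, new writes use the current key, and older records stay readable. The sketch below shows the idea with AES-256-GCM from Node's `crypto` module. The in-memory `keyring` is a stand-in for the HSM, and the `Envelope` layout and function names are assumptions rather than the platform's storage format.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Stored alongside the data so the right key version can be selected on read.
interface Envelope {
  keyVersion: number;
  iv: string;
  tag: string;
  ciphertext: string;
}

// In-memory stand-in for HSM-backed key storage (illustration only).
const keyring = new Map<number, Buffer>([
  [1, randomBytes(32)],
  [2, randomBytes(32)],
]);
let currentVersion = 2;

// Adds a fresh 256-bit key; older versions are kept for decrypting existing data.
function rotateKey(): number {
  currentVersion += 1;
  keyring.set(currentVersion, randomBytes(32));
  return currentVersion;
}

function encrypt(plaintext: string): Envelope {
  const key = keyring.get(currentVersion);
  if (!key) throw new Error(`missing key version ${currentVersion}`);
  const iv = randomBytes(12); // unique 96-bit nonce per encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return {
    keyVersion: currentVersion,
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"),
    ciphertext: ciphertext.toString("base64"),
  };
}

// Throws if the key version is unknown or the authentication tag does not match.
function decrypt(envelope: Envelope): string {
  const key = keyring.get(envelope.keyVersion);
  if (!key) throw new Error(`unknown key version ${envelope.keyVersion}`);
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(envelope.iv, "base64"));
  decipher.setAuthTag(Buffer.from(envelope.tag, "base64"));
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(envelope.ciphertext, "base64")),
    decipher.final(),
  ]);
  return plaintext.toString("utf8");
}
```

Because GCM authenticates the ciphertext, a modified record fails in `decrypt` instead of returning corrupted plaintext.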
Encryption in Transit
See Encryption in Transit for details.
Protocols:
- TLS 1.3: All HTTP(S) traffic
- gRPC with mTLS: Service mesh
- SSH: Admin access only
Cipher Suites (TLS 1.3):
TLS_AES_256_GCM_SHA384
TLS_CHACHA20_POLY1305_SHA256
TLS_AES_128_GCM_SHA256
Secrets Management
See Secrets Management for details.
Tools:
- HashiCorp Vault: Centralized secrets storage
- Kubernetes Secrets: Encrypted at rest
- Environment Variables: Injected at runtime
Secret Types:
- API keys
- Database credentials
- Encryption keys
- OAuth client secrets
- Service account tokens
6. Application Security
Input Validation
Zod Schemas:
import { z } from 'zod';

const ChatCompletionSchema = z.object({
  model: z.string().min(1).max(100),
  messages: z.array(z.object({
    role: z.enum(['system', 'user', 'assistant']),
    content: z.string().min(1).max(10000)
  })).min(1).max(50),
  temperature: z.number().min(0).max(2).optional(),
  max_tokens: z.number().min(1).max(4096).optional()
});
Output Encoding
HTML Encoding:
import DOMPurify from 'isomorphic-dompurify';

function sanitizeHTML(dirty: string): string {
  return DOMPurify.sanitize(dirty, {
    ALLOWED_TAGS: ['p', 'b', 'i', 'em', 'strong', 'a'],
    ALLOWED_ATTR: ['href']
  });
}
SQL Injection Prevention
Parameterized Queries:
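// Safe: the email value is passed as a bound parameter, separate from the SQL text
const safeResult = await db.query(
  'SELECT * FROM users WHERE email = $1',
  [email]
);

// Unsafe: the email value is interpolated directly into the SQL text
const unsafeResult = await db.query(
  `SELECT * FROM users WHERE email = '${email}'`
);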
Dependency Security
Automated Scanning:
- npm audit (daily)
- Snyk vulnerability scanning
- Dependabot security updates
Policy:
- High/Critical vulnerabilities: Fix within 7 days
- Medium vulnerabilities: Fix within 30 days
- Low vulnerabilities: Fix in next release
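The policy translates directly into a due-date calculation. The sketch below maps each severity to its remediation window and returns a deadline, or `null` for low-severity findings, which are tied to the next release instead of a fixed date. The `Severity` type and `remediationDeadline` function are hypothetical names used for illustration.

```typescript
type Severity = "critical" | "high" | "medium" | "low";

const DAY_MS = 24 * 60 * 60 * 1000;

// Remediation windows in days; null means "fix in next release" with no fixed date.
const SLA_DAYS: Record<Severity, number | null> = {
  critical: 7,
  high: 7,
  medium: 30,
  low: null,
};

// Returns the date by which a finding must be fixed, or null when no fixed deadline applies.
function remediationDeadline(severity: Severity, detectedAt: Date): Date | null {
  const days = SLA_DAYS[severity];
  if (days === null) {
    return null;
  }
  return new Date(detectedAt.getTime() + days * DAY_MS);
}
```

A check like this could run in CI to flag findings that are past their deadline, assuming the scanner output includes a severity and a detection date.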
7. Audit & Monitoring
Audit Logging
Events Logged:
- Authentication attempts (success/failure)
- Authorization decisions
- Data access (PII/PHI)
- Configuration changes
- Admin operations
- Security events
Log Format (JSON):
{
  "timestamp": "2025-01-15T10:00:00.000Z",
  "eventType": "authentication",
  "action": "login_success",
  "userId": "user-123",
  "username": "developer@bluefly.io",
  "sourceIP": "192.168.1.100",
  "userAgent": "Mozilla/5.0...",
  "metadata": {
    "mfa": false,
    "provider": "gitlab"
  }
}
Tamper-Proof Logging:
- Hash chain for log integrity
- Append-only storage
- Encrypted at rest
- Retention: 90 days (configurable)
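A hash chain supports log integrity by storing, with each entry, a hash computed over the previous entry's hash and the current event. Changing or removing any earlier entry then breaks every hash after it. The sketch below shows the mechanism with SHA-256; the entry layout and function names are assumptions, not the platform's log schema.

```typescript
import { createHash } from "node:crypto";

interface ChainedEntry {
  event: string; // serialized audit event, for example a JSON string
  prevHash: string;
  hash: string;
}

// Fixed starting value for the first entry in the chain.
const GENESIS_HASH = "0".repeat(64);

function entryHash(prevHash: string, event: string): string {
  return createHash("sha256").update(prevHash).update(event).digest("hex");
}

// Returns a new log with the event appended and linked to the previous entry.
function appendEntry(log: ChainedEntry[], event: string): ChainedEntry[] {
  const last = log[log.length - 1];
  const prevHash = last ? last.hash : GENESIS_HASH;
  return [...log, { event, prevHash, hash: entryHash(prevHash, event) }];
}

// Recomputes every link; false means an entry was altered, removed, or reordered.
function verifyChain(log: ChainedEntry[]): boolean {
  let expectedPrev = GENESIS_HASH;
  for (const entry of log) {
    if (entry.prevHash !== expectedPrev || entry.hash !== entryHash(expectedPrev, entry.event)) {
      return false;
    }
    expectedPrev = entry.hash;
  }
  return true;
}
```

On its own the chain makes tampering detectable; it does not prevent it. That is why it is paired with append-only storage, and why the latest hash is usually also recorded somewhere an attacker with log access cannot rewrite.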
Security Monitoring
Tools:
- SIEM: Splunk / ELK Stack
- IDS/IPS: Snort
- File Integrity: AIDE
- Vulnerability Scanning: Nessus
Alerts:
- Failed login attempts (5+ in 5 min)
- Privilege escalation attempts
- Unusual API usage patterns
- Data exfiltration indicators
- Certificate expiration (30 days)
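The failed-login alert is a sliding-window count: five or more failures for the same account within five minutes. The sketch below shows one way to evaluate that rule in memory. The `FailedLoginMonitor` class is hypothetical, and in practice the rule would more likely be expressed as a SIEM query than as application code.

```typescript
const WINDOW_MS = 5 * 60 * 1000; // 5-minute window
const FAILURE_THRESHOLD = 5; // alert at 5 or more failures

class FailedLoginMonitor {
  private readonly failures = new Map<string, number[]>();

  // Records one failed attempt and returns true when the alert condition is met.
  recordFailure(userId: string, timestampMs: number): boolean {
    // Keep only failures that are still inside the window, then add the new one.
    const recent = (this.failures.get(userId) ?? []).filter(
      (earlier) => timestampMs - earlier < WINDOW_MS,
    );
    recent.push(timestampMs);
    this.failures.set(userId, recent);
    return recent.length >= FAILURE_THRESHOLD;
  }
}
```

Pruning old timestamps on every call keeps memory bounded per account and means slow, spread-out failures never accumulate into an alert.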
8. Incident Response
Incident Response Plan
Phases:
- Detection: Automated alerts + manual reporting
- Analysis: Determine scope and impact
- Containment: Isolate affected systems
- Eradication: Remove threat
- Recovery: Restore services
- Post-Incident: Review and improve
Incident Severity:
- P0 (Critical): Data breach, system down
- P1 (High): Security vulnerability, degraded service
- P2 (Medium): Minor security issue
- P3 (Low): Security improvement
Disaster Recovery
RPO/RTO:
- RPO (Recovery Point Objective): 1 hour
- RTO (Recovery Time Objective): 4 hours
Backups:
- Database: Every 6 hours
- Files: Daily
- Configuration: On change
- Retention: 30 days
9. Compliance
FedRAMP Moderate
See FedRAMP Compliance for details.
Controls: 325 controls from NIST 800-53
Certification: FedRAMP Moderate baseline
Audit: Annual assessment
NIST 800-53
See NIST 800-53 Controls for details.
Families Implemented:
- AC (Access Control)
- AU (Audit and Accountability)
- SC (System and Communications Protection)
- IA (Identification and Authentication)
- IR (Incident Response)
HIPAA
Applicable Controls:
- PHI encryption at rest and in transit
- Access controls (RBAC)
- Audit logging
- Data retention policies
- Breach notification procedures
GDPR
Compliance Features:
- Data subject rights (access, deletion, portability)
- Consent management
- Data processing agreements
- Privacy by design
- Breach notification (72 hours)
Security Testing
Penetration Testing
Frequency: Quarterly
Scope: Full platform
Methods:
- Automated scanning (OWASP ZAP)
- Manual testing
- Social engineering
- Physical security
Vulnerability Management
Process:
- Scan (weekly)
- Triage (1 business day)
- Remediate (7-30 days based on severity)
- Verify fix
- Document
Security Metrics
Key Metrics:
- Mean Time to Detect (MTTD): < 5 minutes
- Mean Time to Respond (MTTR): < 30 minutes
- Vulnerability patch time: < 7 days (high/critical)
- Failed login rate: < 0.1%
- Audit log retention: 90 days