# OpenTelemetry Configuration
OpenTelemetry (OTel) is the foundation of distributed tracing across the LLM platform. It provides standardized instrumentation for traces, metrics, and logs, and exports them over the OpenTelemetry Protocol (OTLP).
## Overview
OpenTelemetry enables:
- Distributed Tracing: Track requests across services
- Metrics Collection: Standardized metric export
- Structured Logging: Correlated logs with traces
- Context Propagation: Maintain context across boundaries
- Multiple Exporters: Send data to Jaeger, Phoenix, Tempo
## Installation

```bash
# TypeScript/JavaScript
npm install @opentelemetry/api \
  @opentelemetry/sdk-node \
  @opentelemetry/auto-instrumentations-node \
  @opentelemetry/exporter-trace-otlp-http \
  @opentelemetry/exporter-metrics-otlp-http

# Python
pip install opentelemetry-api \
  opentelemetry-sdk \
  opentelemetry-instrumentation \
  opentelemetry-exporter-otlp

# PHP
composer require open-telemetry/sdk \
  open-telemetry/exporter-otlp
```
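## Configuration

### Environment Variables

```bash
# Service configuration
# Only OTEL_SERVICE_NAME is a standard SDK variable; the SDK takes the version
# and environment from OTEL_RESOURCE_ATTRIBUTES below.
OTEL_SERVICE_NAME=tdd-enforcer
OTEL_SERVICE_VERSION=1.0.0
OTEL_DEPLOYMENT_ENVIRONMENT=production

# OTLP exporters (4318 = OTLP/HTTP with /v1/... paths, 4317 = OTLP/gRPC)
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://localhost:4318/v1/traces
OTEL_EXPORTER_OTLP_METRICS_ENDPOINT=http://localhost:4318/v1/metrics

# Sampling
OTEL_TRACES_SAMPLER=always_on
# The argument is only read by ratio-based samplers such as traceidratio
OTEL_TRACES_SAMPLER_ARG=1.0

# Resource attributes
OTEL_RESOURCE_ATTRIBUTES=service.name=tdd-enforcer,service.version=1.0.0,deployment.environment=production
```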
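The `OTEL_RESOURCE_ATTRIBUTES` value is a comma-separated list of `key=value` pairs. The SDK parses it for you; the sketch below only illustrates the format. `parseResourceAttributes` is a hypothetical helper, not an SDK function, and it assumes well-formed percent-encoding in the values.

```typescript
// Hypothetical helper: illustrates the key=value,key=value format only.
// The real SDK performs its own parsing and validation.
function parseResourceAttributes(raw: string): Record<string, string> {
  const attributes: Record<string, string> = {}
  for (const pair of raw.split(',')) {
    const separator = pair.indexOf('=')
    if (separator <= 0) continue // skip empty or malformed entries
    const key = pair.slice(0, separator).trim()
    const value = pair.slice(separator + 1).trim()
    // Assumption: values are percent-encoded where they contain ',' or '='
    attributes[key] = decodeURIComponent(value)
  }
  return attributes
}

const parsed = parseResourceAttributes(
  'service.name=tdd-enforcer,service.version=1.0.0,deployment.environment=production'
)
console.log(parsed['service.name']) // prints "tdd-enforcer"
```

Because `service.name` appears both here and in `OTEL_SERVICE_NAME`, it is worth keeping the two values identical to avoid confusion about which one a backend displays.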
### TypeScript Configuration

```typescript
// tracing.ts
import { NodeSDK } from '@opentelemetry/sdk-node'
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node'
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http'
import { OTLPMetricExporter } from '@opentelemetry/exporter-metrics-otlp-http'
import { PeriodicExportingMetricReader } from '@opentelemetry/sdk-metrics'
import { Resource } from '@opentelemetry/resources'
import { SemanticResourceAttributes } from '@opentelemetry/semantic-conventions'

const sdk = new NodeSDK({
  resource: new Resource({
    [SemanticResourceAttributes.SERVICE_NAME]: 'tdd-enforcer',
    [SemanticResourceAttributes.SERVICE_VERSION]: '1.0.0',
    [SemanticResourceAttributes.DEPLOYMENT_ENVIRONMENT]: 'production'
  }),
  traceExporter: new OTLPTraceExporter({
    url: 'http://localhost:4318/v1/traces'
  }),
  // metricReader expects a MetricReader, so the exporter is wrapped in one
  metricReader: new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter({
      url: 'http://localhost:4318/v1/metrics'
    })
  }),
  instrumentations: [getNodeAutoInstrumentations()]
})

sdk.start()

// Flush pending telemetry before the process exits
process.on('SIGTERM', () => {
  sdk.shutdown()
    .then(() => console.log('Tracing terminated'))
    .catch((error) => console.error('Error terminating tracing', error))
    .finally(() => process.exit(0))
})
```
## Usage

### Manual Instrumentation

```typescript
import { trace, SpanStatusCode } from '@opentelemetry/api'

const tracer = trace.getTracer('tdd-enforcer')

async function enforceTDD(filePath: string) {
  const span = tracer.startSpan('enforce-tdd', {
    attributes: {
      'file.path': filePath,
      'agent.id': 'tdd-enforcer-001'
    }
  })

  try {
    const tests = await generateTests(filePath)
    span.setAttribute('tests.generated', tests.length)
    span.setStatus({ code: SpanStatusCode.OK })
    return tests
  } catch (error) {
    // Record the failure on the span, then let the caller handle it
    span.recordException(error as Error)
    span.setStatus({ code: SpanStatusCode.ERROR })
    throw error
  } finally {
    // Always end the span, on success and on failure
    span.end()
  }
}
```
### Automatic Instrumentation

```typescript
// Automatic HTTP, Database, Redis instrumentation
import { NodeSDK } from '@opentelemetry/sdk-node'
import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node'

const sdk = new NodeSDK({
  instrumentations: [
    getNodeAutoInstrumentations({
      '@opentelemetry/instrumentation-http': { enabled: true },
      '@opentelemetry/instrumentation-express': { enabled: true },
      '@opentelemetry/instrumentation-pg': { enabled: true },
      '@opentelemetry/instrumentation-redis': { enabled: true }
    })
  ]
})
```
## Exporters

### OTLP HTTP Exporter

```typescript
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http'

const exporter = new OTLPTraceExporter({
  url: 'http://localhost:4318/v1/traces',
  headers: {
    'Authorization': 'Bearer token'
  }
})
```
### OTLP gRPC Exporter

```typescript
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-grpc'

const exporter = new OTLPTraceExporter({
  url: 'http://localhost:4317'
})
```
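### Multiple Exporters

```typescript
import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-base'
import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node'
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http'

const provider = new NodeTracerProvider()

// Send to Phoenix
const phoenixExporter = new OTLPTraceExporter({
  url: 'http://localhost:6006/v1/traces'
})

// Send to Jaeger
const jaegerExporter = new OTLPTraceExporter({
  url: 'http://localhost:4318/v1/traces'
})

// Configure SDK with multiple exporters; each processor batches independently.
// addSpanProcessor is the SDK 1.x API; newer SDK versions may instead expect a
// spanProcessors array in the provider constructor.
provider.addSpanProcessor(new BatchSpanProcessor(phoenixExporter))
provider.addSpanProcessor(new BatchSpanProcessor(jaegerExporter))
provider.register()
```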
## Context Propagation

### W3C Trace Context

```typescript
import { propagation, context } from '@opentelemetry/api'
import { W3CTraceContextPropagator } from '@opentelemetry/core'

// Set propagator
propagation.setGlobalPropagator(new W3CTraceContextPropagator())

// Inject context into HTTP headers
const headers: Record<string, string> = {}
propagation.inject(context.active(), headers)

// Extract context from HTTP headers
const extractedContext = propagation.extract(context.active(), headers)
```
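To make the inject/extract calls above less opaque, the following sketch shows the shape of the `traceparent` header that W3C Trace Context puts on the wire: `version-traceId-spanId-flags`. The propagator handles this for you; `parseTraceParent` and `formatTraceParent` are hypothetical helpers written for illustration, and the sample header is the example value from the W3C specification.

```typescript
// Illustrative only: W3CTraceContextPropagator does this internally.
interface TraceParent {
  version: string
  traceId: string // 32 lowercase hex characters
  spanId: string  // 16 lowercase hex characters
  sampled: boolean
}

const TRACEPARENT_PATTERN = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/

function parseTraceParent(header: string): TraceParent | null {
  const match = TRACEPARENT_PATTERN.exec(header.trim())
  if (!match) return null
  const version = match[1] ?? ''
  const traceId = match[2] ?? ''
  const spanId = match[3] ?? ''
  const flags = match[4] ?? '00'
  // All-zero IDs are invalid in the W3C Trace Context specification
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return null
  // Bit 0 of the flags byte carries the "sampled" decision
  return { version, traceId, spanId, sampled: (parseInt(flags, 16) & 0x01) === 0x01 }
}

function formatTraceParent(tp: TraceParent): string {
  return `${tp.version}-${tp.traceId}-${tp.spanId}-${tp.sampled ? '01' : '00'}`
}

const incoming = parseTraceParent('00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01')
console.log(incoming?.traceId) // prints "4bf92f3577b34da6a3ce929d0e0e4736"
```

Seeing the raw header is mostly useful when debugging a proxy or gateway that drops or rewrites it.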
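### Cross-Service Tracing

```typescript
// Service A
import axios from 'axios'
import { context, propagation, trace } from '@opentelemetry/api'

const tracer = trace.getTracer('service-a')

async function callServiceB() {
  const span = tracer.startSpan('call-service-b')
  // Return the result of context.with so callers receive the response data
  return context.with(trace.setSpan(context.active(), span), async () => {
    try {
      const headers: Record<string, string> = {}
      propagation.inject(context.active(), headers)
      const response = await axios.get('http://service-b/api', { headers })
      return response.data
    } finally {
      span.end()
    }
  })
}

// Service B (Express middleware)
app.use((req, res, next) => {
  const extractedContext = propagation.extract(context.active(), req.headers)
  // Start the span as a child of the caller's span
  const span = tracer.startSpan('handle-request', undefined, extractedContext)
  // End the span once the response has been sent
  res.on('finish', () => span.end())
  // Run downstream handlers inside the extracted context
  context.with(trace.setSpan(extractedContext, span), () => next())
})
```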
### W3C Baggage for Multi-Agent Correlation

W3C Baggage is critical for OSSA multi-agent workflows: it propagates correlation data (session, workflow, and agent identifiers) across agent handoffs.

```typescript
import { context, propagation, trace, BaggageEntry } from '@opentelemetry/api'
import {
  CompositePropagator,
  W3CBaggagePropagator,
  W3CTraceContextPropagator
} from '@opentelemetry/core'

const tracer = trace.getTracer('ossa-agent')

// Set composite propagator (Trace Context + Baggage)
propagation.setGlobalPropagator(
  new CompositePropagator({
    propagators: [
      new W3CTraceContextPropagator(),
      new W3CBaggagePropagator()
    ]
  })
)

// Agent A: Set correlation baggage before handoff
function prepareAgentHandoff(targetAgent: string, sessionId: string, workflowId: string) {
  const entries: Record<string, BaggageEntry> = {
    'ossa.session.id': { value: sessionId },
    'ossa.workflow.id': { value: workflowId },
    'ossa.source.agent': { value: 'agent-a' },
    'ossa.target.agent': { value: targetAgent },
    'ossa.handoff.reason': { value: 'capability_delegation' }
  }

  // Baggage helpers live on the propagation API
  const bag = propagation.createBaggage(entries)
  const handoffContext = propagation.setBaggage(context.active(), bag)

  const headers: Record<string, string> = {}
  propagation.inject(handoffContext, headers)
  return headers
}

// Agent B: Extract correlation from incoming request
function extractCorrelation(headers: Record<string, string>) {
  const extractedContext = propagation.extract(context.active(), headers)
  const bag = propagation.getBaggage(extractedContext)

  const sessionId = bag?.getEntry('ossa.session.id')?.value
  const sourceAgent = bag?.getEntry('ossa.source.agent')?.value

  // Create linked span for visibility (only when the caller sent a trace context)
  const sourceSpanContext = trace.getSpanContext(extractedContext)
  const span = tracer.startSpan('agent-handoff-received', {
    links: sourceSpanContext
      ? [{
          context: sourceSpanContext,
          attributes: {
            'ossa.handoff.source': sourceAgent,
            'ossa.handoff.type': 'capability_delegation'
          }
        }]
      : []
  })

  return { sessionId, sourceAgent, span, context: extractedContext }
}
```
Span Links for Agent Handoffs: Use span links (not parent-child) for agent-to-agent communication. This shows the full chain in Jaeger/SigNoz while maintaining independent traces per agent.
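On the wire, the correlation data set during a handoff travels in a single `baggage` HTTP header made of comma-separated `key=value` members. The sketch below shows roughly what that header looks like; `encodeBaggage` and `decodeBaggage` are hypothetical helpers for illustration (the `W3CBaggagePropagator` does the real work), and they assume plain entries without the optional `;property` metadata.

```typescript
// Hypothetical helpers showing the shape of the W3C `baggage` header.
function encodeBaggage(entries: Record<string, string>): string {
  return Object.entries(entries)
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join(',')
}

function decodeBaggage(header: string): Record<string, string> {
  const entries: Record<string, string> = {}
  for (const member of header.split(',')) {
    // Drop optional ";property" metadata after the value
    const pair = member.split(';')[0] ?? ''
    const separator = pair.indexOf('=')
    if (separator <= 0) continue // skip empty or malformed members
    const key = pair.slice(0, separator).trim()
    entries[key] = decodeURIComponent(pair.slice(separator + 1).trim())
  }
  return entries
}

const header = encodeBaggage({
  'ossa.session.id': 'sess-123',
  'ossa.source.agent': 'agent-a'
})
console.log(header) // prints "ossa.session.id=sess-123,ossa.source.agent=agent-a"
```

Baggage is sent to every downstream service in clear text, so it is best kept to short identifiers rather than payloads or secrets.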
## Semantic Conventions

### Standard OTel Attributes

```typescript
import { SemanticAttributes } from '@opentelemetry/semantic-conventions'

span.setAttributes({
  // HTTP (a path belongs in http.target; http.url expects the full URL)
  [SemanticAttributes.HTTP_METHOD]: 'POST',
  [SemanticAttributes.HTTP_TARGET]: '/api/tests',
  [SemanticAttributes.HTTP_STATUS_CODE]: 200,

  // Database
  [SemanticAttributes.DB_SYSTEM]: 'postgresql',
  [SemanticAttributes.DB_NAME]: 'llm_platform',
  [SemanticAttributes.DB_STATEMENT]: 'SELECT * FROM tests',

  // RPC
  [SemanticAttributes.RPC_SERVICE]: 'TestGenerator',
  [SemanticAttributes.RPC_METHOD]: 'generateTests',

  // Custom
  'agent.id': 'tdd-enforcer-001',
  'task.type': 'test-generation',
  'file.path': 'src/auth/AuthService.ts'
})
```
### OSSA Semantic Conventions (v0.3.0+)

OSSA defines OpenTelemetry semantic conventions for AI agents. Use these attributes for consistent agent observability across the platform.

```typescript
// OSSA Agent Semantic Attributes
span.setAttributes({
  // GenAI Standard (OTel GenAI SIG)
  'gen_ai.system': 'ossa',
  'gen_ai.request.model': 'claude-sonnet-4-20250514',
  'gen_ai.request.max_tokens': 4096,
  'gen_ai.response.finish_reasons': ['stop'], // array-valued in the GenAI conventions
  'gen_ai.usage.input_tokens': 1523,
  'gen_ai.usage.output_tokens': 892,

  // OSSA-Specific Identity
  'ossa.agent.id': 'uuid-agent-definition',    // From manifest metadata.id
  'ossa.agent.name': 'review-agent',           // From manifest metadata.name
  'ossa.agent.version': '1.2.0',               // From manifest metadata.version
  'ossa.instance.id': 'uuid-runtime-instance', // Runtime instance UUID (P0!)

  // OSSA Session & Interaction
  'ossa.session.id': 'uuid-session',           // Conversation/workflow session
  'ossa.interaction.id': 'uuid-interaction',   // Single prompt/response (P0!)
  'ossa.turn.number': 3,                       // Turn within session

  // OSSA Capability Tracking
  'ossa.capability.name': 'code_review',
  'ossa.capability.version': '2.1',
  'ossa.tool.name': 'gitlab-api',
  'ossa.tool.type': 'http',

  // OSSA State
  'ossa.state.mode': 'session',                // stateless | session | long_running
  'ossa.state.storage_type': 'vector-db'
})
```
Why Instance ID Matters: Without ossa.instance.id, you can't distinguish agent-a on pod-1 from agent-a on pod-2. This is critical for debugging in Kubernetes.
Why Interaction ID Matters: Every prompt/response needs a unique ID for debugging failed generations and correlating logs.
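The two IDs have different lifetimes: the instance ID is created once per running process, while the interaction ID is created once per prompt/response. A minimal sketch of that distinction follows. It assumes UUIDs are acceptable identifiers and that one process corresponds to one agent instance; `interactionAttributes` is a hypothetical helper, not part of OSSA or OpenTelemetry.

```typescript
import { randomUUID } from 'node:crypto'

// Assumption: one instance ID per process (pod/container), generated at startup
const OSSA_INSTANCE_ID = randomUUID()

// Hypothetical helper: builds the identity attributes for one prompt/response
function interactionAttributes(
  sessionId: string,
  turnNumber: number
): Record<string, string | number> {
  return {
    'ossa.instance.id': OSSA_INSTANCE_ID,
    'ossa.session.id': sessionId,
    'ossa.interaction.id': randomUUID(), // fresh for every interaction
    'ossa.turn.number': turnNumber
  }
}

const first = interactionAttributes('session-1', 1)
const second = interactionAttributes('session-1', 2)
console.log(first['ossa.instance.id'] === second['ossa.instance.id'])       // true: same process
console.log(first['ossa.interaction.id'] === second['ossa.interaction.id']) // false: new ID per call
```

The returned object can be passed straight to `span.setAttributes(...)` when the span for that interaction is created.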
### GitLab Integration Configuration

For GitLab Ultimate observability integration:

```yaml
# .gitlab-ci.yml for agent observability
variables:
  # Signal-specific variable: the URL already includes /v1/traces and is used as-is
  OTEL_EXPORTER_OTLP_TRACES_ENDPOINT: "${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/observability/v1/traces"
  OTEL_EXPORTER_OTLP_HEADERS: "PRIVATE-TOKEN=${GITLAB_OBSERVABILITY_TOKEN}"
  OTEL_SERVICE_NAME: "ossa-agents"
  OTEL_RESOURCE_ATTRIBUTES: "deployment.environment=production,service.namespace=llm-platform,ossa.agent.name=${AGENT_NAME}"
```
---
## Sampling
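A sampler decides, when a span is created, whether its trace is recorded. Ratio-based samplers derive that decision from the trace ID rather than from a random draw, so every service that sees the same trace ID reaches the same verdict. The sketch below is a simplified illustration of that idea and is not the SDK's actual algorithm; `shouldSample` is a hypothetical function.

```typescript
// Simplified illustration of deterministic, trace-ID-based sampling.
// Assumption: traceId is a 32-character lowercase hex string.
function shouldSample(traceId: string, ratio: number): boolean {
  if (ratio >= 1) return true
  if (ratio <= 0) return false
  // Treat the last 8 hex characters of the trace ID as a 32-bit number
  const bucket = parseInt(traceId.slice(-8), 16)
  // Keep the trace when it falls into the lowest `ratio` share of the range
  return bucket < Math.floor(ratio * 0x100000000)
}

const traceId = '4bf92f3577b34da6a3ce929d0e0e4736'
// The same trace ID always yields the same decision for a given ratio
console.log(shouldSample(traceId, 0.1) === shouldSample(traceId, 0.1)) // true
```
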
### Always On
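```typescript
import { NodeSDK } from '@opentelemetry/sdk-node'
import { AlwaysOnSampler } from '@opentelemetry/sdk-trace-base'

const sdk = new NodeSDK({
  sampler: new AlwaysOnSampler()
})
```

### Probability-Based

```typescript
import { NodeSDK } from '@opentelemetry/sdk-node'
import { TraceIdRatioBasedSampler } from '@opentelemetry/sdk-trace-base'

const sdk = new NodeSDK({
  sampler: new TraceIdRatioBasedSampler(0.1) // 10% sampling
})
```

### Parent-Based

```typescript
import { NodeSDK } from '@opentelemetry/sdk-node'
import { ParentBasedSampler, AlwaysOnSampler } from '@opentelemetry/sdk-trace-base'

const sdk = new NodeSDK({
  // Follow the parent's sampling decision; sample every root span
  sampler: new ParentBasedSampler({
    root: new AlwaysOnSampler()
  })
})
```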