OpenTelemetry Configuration

OpenTelemetry (OTel) is the foundation of distributed tracing across the LLM platform. It provides standardized instrumentation for traces, metrics, and logs, which are exported over the OpenTelemetry Protocol (OTLP).


Overview

OpenTelemetry enables:

  • Distributed Tracing: Track requests across services
  • Metrics Collection: Standardized metric export
  • Structured Logging: Correlated logs with traces
  • Context Propagation: Maintain context across boundaries
  • Multiple Exporters: Send data to Jaeger, Phoenix, Tempo

Installation

    # TypeScript/JavaScript
    npm install @opentelemetry/api \
      @opentelemetry/sdk-node \
      @opentelemetry/sdk-metrics \
      @opentelemetry/resources \
      @opentelemetry/semantic-conventions \
      @opentelemetry/auto-instrumentations-node \
      @opentelemetry/exporter-trace-otlp-http \
      @opentelemetry/exporter-metrics-otlp-http

    # Python
    pip install opentelemetry-api \
      opentelemetry-sdk \
      opentelemetry-instrumentation \
      opentelemetry-exporter-otlp

    # PHP
    composer require open-telemetry/sdk \
      open-telemetry/exporter-otlp

Configuration

Environment Variables

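    # Service Configuration
    OTEL_SERVICE_NAME=tdd-enforcer
    # The next two are not defined by the OpenTelemetry specification; SDKs read
    # version and environment from OTEL_RESOURCE_ATTRIBUTES instead
    OTEL_SERVICE_VERSION=1.0.0
    OTEL_DEPLOYMENT_ENVIRONMENT=production

    # OTLP Exporters (OTLP/HTTP uses port 4318; 4317 is the gRPC port and takes no /v1/... path)
    OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
    OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://localhost:4318/v1/traces
    OTEL_EXPORTER_OTLP_METRICS_ENDPOINT=http://localhost:4318/v1/metrics

    # Sampling (OTEL_TRACES_SAMPLER_ARG only takes effect with ratio-based samplers such as traceidratio)
    OTEL_TRACES_SAMPLER=always_on
    OTEL_TRACES_SAMPLER_ARG=1.0

    # Resource Attributes
    OTEL_RESOURCE_ATTRIBUTES=service.name=tdd-enforcer,service.version=1.0.0,deployment.environment=production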
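The generic and signal-specific endpoint variables resolve differently under OTLP/HTTP: the generic one is treated as a base URL, while a signal-specific one is used exactly as written. The sketch below illustrates that rule for traces. `resolveTracesUrl` is a hypothetical helper written for this page, not an SDK function, and the `collector` host name is made up.

```typescript
// Hypothetical helper illustrating how OTLP/HTTP endpoint variables
// resolve to the URL that trace data is posted to.
type Env = Record<string, string | undefined>

function resolveTracesUrl(env: Env): string {
  // A signal-specific endpoint is used as-is, path included.
  const specific = env.OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
  if (specific) return specific

  // The generic endpoint is a base URL; the signal path is appended to it.
  const base = env.OTEL_EXPORTER_OTLP_ENDPOINT ?? 'http://localhost:4318'
  return base.replace(/\/+$/, '') + '/v1/traces'
}

console.log(resolveTracesUrl({ OTEL_EXPORTER_OTLP_ENDPOINT: 'http://collector:4318' }))
console.log(resolveTracesUrl({ OTEL_EXPORTER_OTLP_TRACES_ENDPOINT: 'http://collector:4318/custom/traces' }))
```

A practical consequence: putting a full `/v1/traces` URL into the generic variable makes the path appear twice in the final request URL.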

TypeScript Configuration

    // tracing.ts
    import { NodeSDK } from '@opentelemetry/sdk-node'
    import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node'
    import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http'
    import { OTLPMetricExporter } from '@opentelemetry/exporter-metrics-otlp-http'
    import { PeriodicExportingMetricReader } from '@opentelemetry/sdk-metrics'
    import { Resource } from '@opentelemetry/resources'
    import { SemanticResourceAttributes } from '@opentelemetry/semantic-conventions'

    const sdk = new NodeSDK({
      resource: new Resource({
        [SemanticResourceAttributes.SERVICE_NAME]: 'tdd-enforcer',
        [SemanticResourceAttributes.SERVICE_VERSION]: '1.0.0',
        [SemanticResourceAttributes.DEPLOYMENT_ENVIRONMENT]: 'production'
      }),
      traceExporter: new OTLPTraceExporter({
        url: 'http://localhost:4318/v1/traces'
      }),
      // metricReader expects a MetricReader, so the exporter is wrapped in a periodic reader
      metricReader: new PeriodicExportingMetricReader({
        exporter: new OTLPMetricExporter({
          url: 'http://localhost:4318/v1/metrics'
        })
      }),
      instrumentations: [getNodeAutoInstrumentations()]
    })

    sdk.start()

    // Flush pending telemetry before the process exits
    process.on('SIGTERM', () => {
      sdk.shutdown()
        .then(() => console.log('Tracing terminated'))
        .catch((error) => console.error('Error terminating tracing', error))
        .finally(() => process.exit(0))
    })

Usage

Manual Instrumentation

    import { trace, SpanStatusCode } from '@opentelemetry/api'

    const tracer = trace.getTracer('tdd-enforcer')

    // generateTests is the application's own test generator, defined elsewhere
    async function enforceTDD(filePath: string) {
      const span = tracer.startSpan('enforce-tdd', {
        attributes: {
          'file.path': filePath,
          'agent.id': 'tdd-enforcer-001'
        }
      })

      try {
        const tests = await generateTests(filePath)
        span.setAttribute('tests.generated', tests.length)
        span.setStatus({ code: SpanStatusCode.OK })
        return tests
      } catch (error) {
        // Record the failure on the span, then let the caller handle it
        span.recordException(error as Error)
        span.setStatus({ code: SpanStatusCode.ERROR })
        throw error
      } finally {
        // Always end the span, whether the call succeeded or failed
        span.end()
      }
    }
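The try/catch/finally pattern used for manual spans can be factored into a small wrapper so that no call site forgets to end a span. The following is a minimal sketch under stated assumptions: `SpanLike` and `withSpan` are hypothetical names, and `SpanLike` is a hand-written subset of the span interface so the example runs without the SDK installed. The numeric status codes mirror `SpanStatusCode` in `@opentelemetry/api` (OK = 1, ERROR = 2).

```typescript
// Hypothetical structural subset of a span, declared locally so the sketch
// has no dependencies.
interface SpanLike {
  setStatus(status: { code: number; message?: string }): void
  recordException(error: Error): void
  end(): void
}

const STATUS_OK = 1
const STATUS_ERROR = 2

// Runs `work`, marks the span OK or ERROR, and always ends the span.
async function withSpan<T>(span: SpanLike, work: () => Promise<T>): Promise<T> {
  try {
    const result = await work()
    span.setStatus({ code: STATUS_OK })
    return result
  } catch (err) {
    // Normalize non-Error throwables before recording them
    const error = err instanceof Error ? err : new Error(String(err))
    span.recordException(error)
    span.setStatus({ code: STATUS_ERROR, message: error.message })
    throw err
  } finally {
    span.end()
  }
}
```

With a helper like this, a call site reduces to `withSpan(span, () => generateTests(filePath))`, and the error still propagates to the caller.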

Automatic Instrumentation

    // Automatic HTTP, Database, Redis instrumentation
    import { NodeSDK } from '@opentelemetry/sdk-node'
    import { getNodeAutoInstrumentations } from '@opentelemetry/auto-instrumentations-node'

    const sdk = new NodeSDK({
      instrumentations: [
        getNodeAutoInstrumentations({
          '@opentelemetry/instrumentation-http': { enabled: true },
          '@opentelemetry/instrumentation-express': { enabled: true },
          '@opentelemetry/instrumentation-pg': { enabled: true },
          '@opentelemetry/instrumentation-redis': { enabled: true }
        })
      ]
    })

Exporters

OTLP HTTP Exporter

    import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http'

    const exporter = new OTLPTraceExporter({
      url: 'http://localhost:4318/v1/traces',
      headers: {
        'Authorization': 'Bearer token'
      }
    })

OTLP gRPC Exporter

    import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-grpc'

    const exporter = new OTLPTraceExporter({
      url: 'http://localhost:4317'
    })

Multiple Exporters

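    import { NodeTracerProvider } from '@opentelemetry/sdk-trace-node'
    import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-base'
    import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http'

    // Send to Phoenix
    const phoenixExporter = new OTLPTraceExporter({
      url: 'http://localhost:6006/v1/traces'
    })

    // Send to Jaeger
    const jaegerExporter = new OTLPTraceExporter({
      url: 'http://localhost:4318/v1/traces'
    })

    // Configure SDK with multiple exporters: one processor per exporter,
    // so every span is delivered to both backends.
    // addSpanProcessor is the SDK 1.x API; SDK 2.x takes a spanProcessors
    // array in the provider constructor instead.
    const provider = new NodeTracerProvider()
    provider.addSpanProcessor(new BatchSpanProcessor(phoenixExporter))
    provider.addSpanProcessor(new BatchSpanProcessor(jaegerExporter))
    provider.register()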

Context Propagation

W3C Trace Context

    import { propagation, context } from '@opentelemetry/api'
    import { W3CTraceContextPropagator } from '@opentelemetry/core'

    // Set propagator
    propagation.setGlobalPropagator(new W3CTraceContextPropagator())

    // Inject context into HTTP headers
    const headers: Record<string, string> = {}
    propagation.inject(context.active(), headers)

    // Extract context from HTTP headers
    const extractedContext = propagation.extract(context.active(), headers)
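What the propagator injects is a `traceparent` header with four dash-separated fields: version, trace ID, parent span ID, and trace flags. The sketch below formats and parses that header by hand to make the wire format concrete. It is an illustration only: `formatTraceParent` and `parseTraceParent` are hypothetical names, only version `00` is handled, and real code should rely on the propagator rather than on this.

```typescript
// Illustrative only: the fields carried by a version-00 `traceparent` header.
interface TraceParent {
  traceId: string  // 32 lowercase hex characters
  spanId: string   // 16 lowercase hex characters
  sampled: boolean // lowest bit of the trace-flags byte
}

const TRACEPARENT_RE = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/

function formatTraceParent(tp: TraceParent): string {
  return `00-${tp.traceId}-${tp.spanId}-${tp.sampled ? '01' : '00'}`
}

function parseTraceParent(header: string): TraceParent | null {
  const match = TRACEPARENT_RE.exec(header.trim())
  if (!match) return null
  const [, traceId, spanId, flags] = match
  // All-zero trace or span IDs are invalid
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return null
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 }
}

const parsed = parseTraceParent('00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01')
console.log(parsed?.traceId, parsed?.sampled)
```

A header that fails to parse yields `null`, which corresponds to starting a new trace instead of continuing the caller's.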

Cross-Service Tracing

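    // Service A
    import axios from 'axios'
    import { context, propagation, trace } from '@opentelemetry/api'

    const tracer = trace.getTracer('service-a')

    async function callServiceB() {
      const span = tracer.startSpan('call-service-b')

      // Return the promise so callers receive the response data
      return context.with(trace.setSpan(context.active(), span), async () => {
        try {
          const headers: Record<string, string> = {}
          propagation.inject(context.active(), headers)

          const response = await axios.get('http://service-b/api', { headers })
          return response.data
        } finally {
          // End the span on both success and failure
          span.end()
        }
      })
    }

    // Service B (tracer obtained the same way, e.g. trace.getTracer('service-b'))
    app.use((req, res, next) => {
      const extractedContext = propagation.extract(context.active(), req.headers)

      // Start the span from the extracted context so it becomes a child of Service A's span
      const span = tracer.startSpan('handle-request', undefined, extractedContext)

      // End the span once the response has been sent
      res.on('finish', () => span.end())

      // Run downstream handlers with the new span active
      context.with(trace.setSpan(extractedContext, span), () => next())
    })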

W3C Baggage for Multi-Agent Correlation

W3C Baggage is critical for OSSA multi-agent workflows: it propagates correlation data across agent handoffs.

    import { propagation, context, trace, BaggageEntry } from '@opentelemetry/api'
    import {
      CompositePropagator,
      W3CBaggagePropagator,
      W3CTraceContextPropagator
    } from '@opentelemetry/core'

    const tracer = trace.getTracer('ossa-agent')

    // Set composite propagator (Trace Context + Baggage)
    propagation.setGlobalPropagator(
      new CompositePropagator({
        propagators: [
          new W3CTraceContextPropagator(),
          new W3CBaggagePropagator()
        ]
      })
    )

    // Agent A: Set correlation baggage before handoff
    function prepareAgentHandoff(targetAgent: string, sessionId: string, workflowId: string) {
      const entries: Record<string, BaggageEntry> = {
        'ossa.session.id': { value: sessionId },
        'ossa.workflow.id': { value: workflowId },
        'ossa.source.agent': { value: 'agent-a' },
        'ossa.target.agent': { value: targetAgent },
        'ossa.handoff.reason': { value: 'capability_delegation' }
      }

      // Merge the new entries into any baggage that is already active
      let bag = propagation.getActiveBaggage() ?? propagation.createBaggage()
      for (const [key, entry] of Object.entries(entries)) {
        bag = bag.setEntry(key, entry)
      }

      // Inject from a context that carries the baggage
      const headers: Record<string, string> = {}
      propagation.inject(propagation.setBaggage(context.active(), bag), headers)
      return headers
    }

    // Agent B: Extract correlation from incoming request
    function extractCorrelation(headers: Record<string, string>) {
      const extractedContext = propagation.extract(context.active(), headers)
      const bag = propagation.getBaggage(extractedContext)

      const sessionId = bag?.getEntry('ossa.session.id')?.value
      const sourceAgent = bag?.getEntry('ossa.source.agent')?.value

      // Create linked span for visibility (no link if the caller sent no trace context)
      const sourceSpanContext = trace.getSpanContext(extractedContext)
      const span = tracer.startSpan('agent-handoff-received', {
        links: sourceSpanContext
          ? [{
              context: sourceSpanContext,
              attributes: {
                'ossa.handoff.source': sourceAgent,
                'ossa.handoff.type': 'capability_delegation'
              }
            }]
          : []
      })

      return { sessionId, sourceAgent, span, context: extractedContext }
    }

Span Links for Agent Handoffs: Use span links (not parent-child) for agent-to-agent communication. This shows the full chain in Jaeger/SigNoz while maintaining independent traces per agent.
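On the wire, the correlation entries set during a handoff travel in a single `baggage` HTTP header made of comma-separated `key=value` members with percent-encoded values. The sketch below encodes and decodes that header by hand to show what Agent B receives. `encodeBaggage` and `decodeBaggage` are hypothetical helpers, they ignore the optional `;property` metadata, and in practice the baggage propagator does this work.

```typescript
// Illustrative only: a simplified view of the `baggage` header format.
function encodeBaggage(entries: Record<string, string>): string {
  return Object.entries(entries)
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join(',')
}

function decodeBaggage(header: string): Record<string, string> {
  const entries: Record<string, string> = {}
  for (const member of header.split(',')) {
    // Drop optional ";property" metadata, which this sketch does not model
    const [pair] = member.split(';')
    const eq = pair.indexOf('=')
    if (eq <= 0) continue // skip members without a key
    const key = pair.slice(0, eq).trim()
    const value = pair.slice(eq + 1).trim()
    entries[key] = decodeURIComponent(value)
  }
  return entries
}

const header = encodeBaggage({
  'ossa.session.id': 'session 42',
  'ossa.source.agent': 'agent-a'
})
console.log(header)
console.log(decodeBaggage(header)['ossa.source.agent'])
```

Because baggage is sent in plain headers to every downstream service, it is best limited to small, non-sensitive identifiers like the session and workflow IDs used here.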


Semantic Conventions

Standard OTel Attributes

    import { SemanticAttributes } from '@opentelemetry/semantic-conventions'

    span.setAttributes({
      // HTTP (a path such as /api/tests is http.target; http.url holds a full URL)
      [SemanticAttributes.HTTP_METHOD]: 'POST',
      [SemanticAttributes.HTTP_TARGET]: '/api/tests',
      [SemanticAttributes.HTTP_STATUS_CODE]: 200,

      // Database
      [SemanticAttributes.DB_SYSTEM]: 'postgresql',
      [SemanticAttributes.DB_NAME]: 'llm_platform',
      [SemanticAttributes.DB_STATEMENT]: 'SELECT * FROM tests',

      // RPC
      [SemanticAttributes.RPC_SERVICE]: 'TestGenerator',
      [SemanticAttributes.RPC_METHOD]: 'generateTests',

      // Custom
      'agent.id': 'tdd-enforcer-001',
      'task.type': 'test-generation',
      'file.path': 'src/auth/AuthService.ts'
    })

OSSA Semantic Conventions (v0.3.0+)

OSSA defines OpenTelemetry semantic conventions for AI agents. Use these attributes for consistent agent observability across the platform.

    // OSSA Agent Semantic Attributes
    span.setAttributes({
      // GenAI Standard (OTel GenAI SIG)
      'gen_ai.system': 'ossa',
      'gen_ai.request.model': 'claude-sonnet-4-20250514',
      'gen_ai.request.max_tokens': 4096,
      'gen_ai.response.finish_reasons': ['stop'],     // Array-valued in the GenAI conventions
      'gen_ai.usage.input_tokens': 1523,
      'gen_ai.usage.output_tokens': 892,

      // OSSA-Specific Identity
      'ossa.agent.id': 'uuid-agent-definition',       // From manifest metadata.id
      'ossa.agent.name': 'review-agent',              // From manifest metadata.name
      'ossa.agent.version': '1.2.0',                  // From manifest metadata.version
      'ossa.instance.id': 'uuid-runtime-instance',    // Runtime instance UUID (P0!)

      // OSSA Session & Interaction
      'ossa.session.id': 'uuid-session',              // Conversation/workflow session
      'ossa.interaction.id': 'uuid-interaction',      // Single prompt/response (P0!)
      'ossa.turn.number': 3,                          // Turn within session

      // OSSA Capability Tracking
      'ossa.capability.name': 'code_review',
      'ossa.capability.version': '2.1',
      'ossa.tool.name': 'gitlab-api',
      'ossa.tool.type': 'http',

      // OSSA State
      'ossa.state.mode': 'session',                   // stateless | session | long_running
      'ossa.state.storage_type': 'vector-db'
    })

Why Instance ID Matters: Without ossa.instance.id, you can't distinguish agent-a on pod-1 from agent-a on pod-2. This is critical for debugging in Kubernetes.

Why Interaction ID Matters: Every prompt/response needs a unique ID for debugging failed generations and correlating logs.
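One way to satisfy both requirements is to generate the instance ID once per process and a fresh interaction ID per prompt/response. The sketch below shows that split. The `AgentIdentity` shape and the `interactionAttributes` helper are assumptions made for illustration; only the attribute names come from the conventions above.

```typescript
import { randomUUID } from 'node:crypto'

// Generated once at process start: distinguishes replicas of the same agent
const INSTANCE_ID = randomUUID()

// Assumed shape for values that would come from the agent manifest
interface AgentIdentity {
  id: string
  name: string
  version: string
}

// Builds the identity attributes for a single prompt/response exchange
function interactionAttributes(
  agent: AgentIdentity,
  sessionId: string,
  turn: number
): Record<string, string | number> {
  return {
    'ossa.agent.id': agent.id,
    'ossa.agent.name': agent.name,
    'ossa.agent.version': agent.version,
    'ossa.instance.id': INSTANCE_ID,
    'ossa.session.id': sessionId,
    'ossa.interaction.id': randomUUID(), // fresh on every call
    'ossa.turn.number': turn
  }
}
```

The returned record can be passed directly to `span.setAttributes(...)`, so two pods running the same agent report the same `ossa.agent.id` but different `ossa.instance.id` values.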

GitLab Integration Configuration

For GitLab Ultimate observability integration:

    # .gitlab-ci.yml for agent observability
    variables:
      # Signal-specific variable: the URL already ends in /v1/traces and is used as-is.
      # The generic OTEL_EXPORTER_OTLP_ENDPOINT would have /v1/traces appended a second time.
      OTEL_EXPORTER_OTLP_TRACES_ENDPOINT: "${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/observability/v1/traces"
      OTEL_EXPORTER_OTLP_HEADERS: "PRIVATE-TOKEN=${GITLAB_OBSERVABILITY_TOKEN}"
      OTEL_SERVICE_NAME: "ossa-agents"
      OTEL_RESOURCE_ATTRIBUTES: "deployment.environment=production,service.namespace=llm-platform,ossa.agent.name=${AGENT_NAME}"

---

## Sampling
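
The samplers in this section differ in how they reach a decision. The sketch below is a simplified model of two of them and is not the SDK's exact algorithm: `ratioDecision` and `parentBasedDecision` are hypothetical names, and comparing the first 32 bits of the trace ID is an assumption chosen for readability. It shows the two key properties: ratio sampling is deterministic per trace ID, and parent-based sampling defers to the caller's decision when one exists.

```typescript
// Simplified model of sampler decisions, for illustration only.
type Decision = 'RECORD_AND_SAMPLED' | 'NOT_RECORD'

// Deterministic per trace ID: every service seeing the same trace ID makes
// the same choice. Compares the first 32 bits against ratio * 2^32.
function ratioDecision(traceId: string, ratio: number): Decision {
  const prefix = parseInt(traceId.slice(0, 8), 16)
  return prefix < ratio * 2 ** 32 ? 'RECORD_AND_SAMPLED' : 'NOT_RECORD'
}

// Follows the parent's sampled flag when there is a parent; only root spans
// (no parent) consult the root sampler.
function parentBasedDecision(
  parentSampled: boolean | undefined,
  root: () => Decision
): Decision {
  if (parentSampled === undefined) return root()
  return parentSampled ? 'RECORD_AND_SAMPLED' : 'NOT_RECORD'
}

const traceId = '4bf92f3577b34da6a3ce929d0e0e4736'
console.log(ratioDecision(traceId, 0.1))
console.log(parentBasedDecision(true, () => ratioDecision(traceId, 0.1)))
```

Combining the two, as in the last line, keeps traces complete: a downstream service never drops spans from a trace its caller decided to keep.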

### Always On

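```typescript
import { NodeSDK } from '@opentelemetry/sdk-node'
import { AlwaysOnSampler } from '@opentelemetry/sdk-trace-base'

const sdk = new NodeSDK({
  sampler: new AlwaysOnSampler()
})
```

### Probability-Based

```typescript
import { NodeSDK } from '@opentelemetry/sdk-node'
import { TraceIdRatioBasedSampler } from '@opentelemetry/sdk-trace-base'

const sdk = new NodeSDK({
  sampler: new TraceIdRatioBasedSampler(0.1) // 10% sampling
})
```

### Parent-Based

```typescript
import { NodeSDK } from '@opentelemetry/sdk-node'
import { ParentBasedSampler, AlwaysOnSampler } from '@opentelemetry/sdk-trace-base'

const sdk = new NodeSDK({
  // Child spans follow the parent's sampling decision; root spans use the root sampler
  sampler: new ParentBasedSampler({
    root: new AlwaysOnSampler()
  })
})
```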