Runtime Specification

OSSA runtime specification defining agent execution, lifecycle, control signals, and memory management

OSSA Runtime Specification

The OSSA Runtime Specification defines how agents execute, communicate, and manage state. It provides a standardized execution model that ensures consistent behavior across different runtime implementations.

Overview

The runtime specification consists of several key components:

| Component | Description | Documentation |
| --- | --- | --- |
| Lifecycle | 5-phase agent execution model | Lifecycle |
| Control Signals | Standard communication primitives | Control Signals |
| Execution Model | Timeouts, sandboxing, resource limits | This page |
| Memory Model | Short-term and long-term storage | Memory Model |
| Execution Profiles | fast/balanced/deep/safe profiles | Execution Profiles |

Quick Start

Define runtime configuration in your agent manifest:

apiVersion: ossa/v0.4.9
kind: Agent
metadata:
  name: my-agent
  version: "1.0.0"
spec:
  role: worker
  runtime:
    lifecycle:
      max_iterations: 10
      phases:
        init:
          timeout_seconds: 30
        act:
          timeout_seconds: 300
    execution:
      sandbox:
        enabled: true
        type: container
      resource_limits:
        memory_mb: 512
        cpu_millicores: 1000
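A runtime consuming this manifest resolves each phase's timeout, falling back to a default for phases that omit one. A minimal sketch, assuming the manifest has already been parsed into a Python dict (the field names mirror the example above; `phase_timeout` and the fallback value are illustrative, not part of the spec):

```python
# Sketch: resolve per-phase lifecycle timeouts from a parsed manifest.
DEFAULT_PHASE_TIMEOUT = 300  # illustrative fallback when a phase omits a timeout

manifest = {
    "spec": {
        "runtime": {
            "lifecycle": {
                "max_iterations": 10,
                "phases": {
                    "init": {"timeout_seconds": 30},
                    "act": {"timeout_seconds": 300},
                },
            }
        }
    }
}

def phase_timeout(manifest: dict, phase: str) -> int:
    """Return the configured timeout for a lifecycle phase, or the default."""
    phases = (
        manifest.get("spec", {})
        .get("runtime", {})
        .get("lifecycle", {})
        .get("phases", {})
    )
    return phases.get(phase, {}).get("timeout_seconds", DEFAULT_PHASE_TIMEOUT)
```

The chained `.get()` calls make the lookup total: a manifest missing any level of nesting still yields the fallback rather than raising.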

Execution Model

Timeout Configuration

Multiple timeout levels protect against runaway executions:

| Timeout | Default | Range | Description |
| --- | --- | --- | --- |
| default_seconds | 300 | 1-3600 | Default operation timeout |
| llm_call_seconds | 60 | 5-300 | LLM API call timeout |
| tool_call_seconds | 60 | 1-600 | Tool invocation timeout |
| delegation_seconds | 300 | 30-3600 | Agent delegation timeout |
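A runtime can enforce these per-category deadlines by wrapping each operation in a timeout. A sketch using asyncio (the category names mirror the table; `run_with_timeout` itself is illustrative, not a spec-mandated API):

```python
import asyncio

# Default budgets per operation category, from the table above (seconds).
TIMEOUTS = {"default": 300, "llm_call": 60, "tool_call": 60, "delegation": 300}

async def run_with_timeout(coro, kind: str = "default", limits: dict = TIMEOUTS):
    """Run an operation under its category's deadline, falling back to the
    default timeout for unknown categories; raises asyncio.TimeoutError on
    overrun."""
    limit = limits.get(kind, limits["default"])
    return await asyncio.wait_for(coro, timeout=limit)
```

Because the limit is resolved per call, a delegation and the tool calls it triggers each get their own independent budget.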

Sandbox Configuration

Agents run in isolated sandboxes to prevent unauthorized access:

apiVersion: ossa/v0.4.9
kind: RuntimeSpec
execution:
  sandbox:
    enabled: true
    type: container  # options: container, vm, wasm, process, none
    filesystem:
      read_paths:
        - /app/data
      write_paths:
        - /app/output
      deny_paths:
        - /etc/passwd
        - /etc/shadow
        - ~/.ssh
        - ~/.aws
    capabilities: []  # Linux capabilities
    syscall_filter: default  # options: strict, default, permissive
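The filesystem rules compose deny-first: a path under deny_paths is refused even if it also sits under an allowed root. A minimal sketch of that check (pure path logic, no I/O; home-directory entries like `~/.ssh` would need expansion first and are omitted here):

```python
from pathlib import PurePosixPath

# Policy lists mirroring the sandbox example above (absolute paths only).
DENY_PATHS = ["/etc/passwd", "/etc/shadow"]
READ_PATHS = ["/app/data"]
WRITE_PATHS = ["/app/output"]

def _under(path: str, roots: list[str]) -> bool:
    """True if path equals or lies beneath any of the given roots."""
    p = PurePosixPath(path)
    return any(p.is_relative_to(r) for r in roots)

def allowed(path: str, mode: str) -> bool:
    """Deny rules win; otherwise the path must sit under an allowed root
    for the requested mode ('read' or 'write')."""
    if _under(path, DENY_PATHS):
        return False
    roots = WRITE_PATHS if mode == "write" else READ_PATHS
    return _under(path, roots)
```

A real sandbox would resolve symlinks before checking, since pure path comparison alone can be bypassed by a link into a denied directory.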

Resource Limits

Control agent resource consumption:

resource_limits:
  memory_mb: 512
  cpu_millicores: 1000
  gpu_memory_mb: 0
  max_open_files: 256
  max_processes: 10
  max_network_connections: 50
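For process-type sandboxes, several of these fields map directly onto POSIX rlimits. A sketch translating the config into `setrlimit()` tuples (Unix-only; the mapping is illustrative, since container and VM sandboxes enforce limits through their own mechanisms):

```python
import resource

CONFIG = {"memory_mb": 512, "max_open_files": 256, "max_processes": 10}

def rlimits(cfg: dict) -> list:
    """Translate manifest fields into (RLIMIT constant, (soft, hard)) pairs.

    Building the tuples is separated from applying them, so a supervisor
    can call resource.setrlimit(which, limits) in the child process only.
    """
    mem_bytes = cfg["memory_mb"] * 1024 * 1024
    return [
        (resource.RLIMIT_AS, (mem_bytes, mem_bytes)),          # address-space cap
        (resource.RLIMIT_NOFILE, (cfg["max_open_files"],) * 2),  # open files
        (resource.RLIMIT_NPROC, (cfg["max_processes"],) * 2),    # processes
    ]
```

Note there is no rlimit equivalent for cpu_millicores or network connection counts; those need cgroups or a network proxy respectively.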

Network Isolation

Isolation Modes

| Mode | Description |
| --- | --- |
| strict | No network access except explicitly allowed |
| namespace | Isolated network namespace per agent |
| mesh | Service mesh with controlled inter-agent communication |
| none | No network isolation (development only) |

Egress Rules

Control outbound traffic:

network_isolation:
  enabled: true
  mode: namespace
  egress:
    allowed_domains:
      - api.openai.com
      - api.anthropic.com
      - "*.googleapis.com"
    blocked_domains: []
    allowed_ports:
      - 443
      - 80
    rate_limit:
      requests_per_second: 100
      bytes_per_second: 10485760  # 10 MB/s
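An egress decision combines domain matching (including the `*.` wildcard form) with the port allowlist. A minimal sketch using shell-style pattern matching (the function name is illustrative; rate limiting is omitted):

```python
from fnmatch import fnmatch

# Egress policy mirroring the example above.
ALLOWED_DOMAINS = ["api.openai.com", "api.anthropic.com", "*.googleapis.com"]
ALLOWED_PORTS = [443, 80]

def egress_allowed(host: str, port: int) -> bool:
    """True only if the destination port is allowed and the host matches
    at least one allowed domain pattern."""
    if port not in ALLOWED_PORTS:
        return False
    return any(fnmatch(host, pattern) for pattern in ALLOWED_DOMAINS)
```

Note that `*.googleapis.com` matches subdomains such as storage.googleapis.com but not the bare apex `googleapis.com`; list the apex separately if it should be reachable.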

Ingress Rules

Control inbound traffic:

ingress:
  allowed_sources:
    - "10.0.0.0/8"
    - "agent-orchestrator"
  allowed_ports:
    - 8080
    - 8443
  require_mtls: true
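allowed_sources mixes two kinds of entries: CIDR ranges matched against the peer's IP, and named identities (such as a service name presented via mTLS). A sketch of that dual check using the standard ipaddress module:

```python
import ipaddress

# Ingress policy mirroring the example above: one CIDR range, one named peer.
ALLOWED_CIDRS = ["10.0.0.0/8"]
ALLOWED_NAMES = ["agent-orchestrator"]

def ingress_allowed(source: str) -> bool:
    """Accept the source if it parses as an IP inside an allowed CIDR,
    or otherwise matches an allowed peer name exactly."""
    try:
        ip = ipaddress.ip_address(source)
    except ValueError:
        return source in ALLOWED_NAMES  # not an IP: treat as a named identity
    return any(ip in ipaddress.ip_network(cidr) for cidr in ALLOWED_CIDRS)
```

With require_mtls enabled, the named-identity branch should be driven by the verified certificate identity, never by a client-supplied header.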

Agent Mesh

Configuration for agent-to-agent communication:

agent_mesh:
  enabled: true
  discovery: kubernetes  # options: dns, consul, static
  encryption: mtls  # options: none, tls, wireguard
  allowed_agents:
    - security-scanner
    - code-reviewer
  denied_agents: []
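As with the filesystem rules, a reasonable reading of the allow/deny pair is deny-first. A tiny sketch of that ordering (the precedence is an assumption, not stated by the spec):

```python
# Mesh peer policy mirroring the example above; deny entries take precedence.
ALLOWED_AGENTS = ["security-scanner", "code-reviewer"]
DENIED_AGENTS: list[str] = []

def may_communicate(peer: str) -> bool:
    """Deny list wins; otherwise the peer must appear on the allow list."""
    if peer in DENIED_AGENTS:
        return False
    return peer in ALLOWED_AGENTS
```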

Context Windows: Sliding Window + Long-Term Memory

Modern agents must operate under hard model context limits. OSSA requires agents to declare a context strategy that is deterministic enough to audit, but flexible enough to evolve.

Context Strategy Declaration

Every agent should declare how it manages its context window in agent.ossa.yaml:

spec:
  context:
    strategy: sliding_window  # How to manage the context window
    max_tokens: 32768  # Hard limit for the context window
    reserved_tokens:
      system_prompt: 4096  # Reserved for system instructions
      tools: 2048  # Reserved for tool definitions
      output: 4096  # Reserved for generation
    overflow: summarize  # What to do when window is full
    long_term_memory:
      provider: qdrant  # Vector store for persistent recall
      embedding_model: text-embedding-3-small
      dimensions: 1536
      retrieval:
        top_k: 10
        score_threshold: 0.7

Context Strategies

| Strategy | Behavior | Best For |
| --- | --- | --- |
| sliding_window | Remove oldest messages first (FIFO) | General-purpose agents |
| summarize | Compress old messages into summaries | Long conversations |
| importance_based | Keep high-importance messages longer | Complex multi-step tasks |
| hybrid | Sliding window + summarize on overflow | Production workloads |
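The sliding_window strategy is simple enough to sketch directly: drop the oldest messages until the history fits the token budget. A minimal illustration (the whitespace token counter is a crude stand-in for a real tokenizer):

```python
def count_tokens(message: str) -> int:
    """Crude stand-in for a real tokenizer: one token per whitespace word."""
    return len(message.split())

def sliding_window(history: list[str], budget: int) -> list[str]:
    """FIFO eviction: remove oldest messages first until the total token
    count fits within the budget. Returns a new list; input is untouched."""
    trimmed = list(history)
    while trimmed and sum(count_tokens(m) for m in trimmed) > budget:
        trimmed.pop(0)
    return trimmed
```

The summarize and hybrid strategies differ only in what happens to the evicted prefix: instead of being discarded, it is compressed into a summary message that re-enters the window.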

Token Budget Zones

The context window is divided into reserved zones:

┌──────────────────────────────────────────┐
│  System Prompt          (reserved: 4096) │
├──────────────────────────────────────────┤
│  Tool Definitions       (reserved: 2048) │
├──────────────────────────────────────────┤
│  Retrieved Memories     (dynamic)        │
├──────────────────────────────────────────┤
│  Conversation History   (sliding window) │
├──────────────────────────────────────────┤
│  Output Generation      (reserved: 4096) │
└──────────────────────────────────────────┘

The remaining tokens after reservations are split between retrieved long-term memories and the sliding conversation window. This ensures the agent never exceeds its context limit while maintaining access to both recent context and relevant historical knowledge.
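Concretely, with the example configuration the dynamic budget works out as follows (a one-line calculation; the function name is illustrative):

```python
# Reserved zones from the example configuration above.
MAX_TOKENS = 32768
RESERVED_TOKENS = {"system_prompt": 4096, "tools": 2048, "output": 4096}

def dynamic_budget(max_tokens: int = MAX_TOKENS, reserved: dict = RESERVED_TOKENS) -> int:
    """Tokens left for retrieved memories plus the sliding conversation
    window after all reserved zones are subtracted."""
    return max_tokens - sum(reserved.values())
```

Here 32768 - (4096 + 2048 + 4096) leaves 22528 tokens to split between retrieved memories and conversation history.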

Why Declare Context Strategy?

  1. Auditability: Operators can predict and verify token usage patterns
  2. Cost control: Reserved zones prevent runaway token consumption
  3. Determinism: Same inputs produce predictable context window contents
  4. Graceful degradation: Overflow strategies prevent hard failures when context fills up

Memory Model

Short-Term Memory

Volatile memory for immediate context (managed by the context strategy above):

| Component | Default Limit | Description |
| --- | --- | --- |
| Conversation Context | 100 messages / 32K tokens | Recent conversation history |
| Working Memory | 64 MB | Scratchpad for computations |
| Session State | 1 hour TTL | Session-level state |

Long-Term Memory

Persistent storage for knowledge and recall across sessions:

Supported Providers

| Category | Providers |
| --- | --- |
| Relational | postgres, sqlite |
| Document | mongodb, dynamodb |
| Vector | qdrant, pinecone, weaviate, chromadb, milvus |
| Graph | neo4j |
| Hybrid | postgres with pgvector |

Vector Store Configuration

spec:
  context:
    long_term_memory:
      provider: qdrant
      embedding_model: text-embedding-3-small
      dimensions: 1536
      distance_metric: cosine  # options: euclidean, dot_product
      index_type: hnsw  # options: flat, ivf, scann
      retrieval:
        top_k: 10
        score_threshold: 0.7
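The retrieval settings combine two filters: entries must score at least score_threshold under the distance metric, and at most top_k survivors are returned, best first. A self-contained sketch with cosine similarity and a brute-force scan (a real vector store would use its index, e.g. HNSW, instead):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(query: list[float], entries: list, top_k: int = 10,
             score_threshold: float = 0.7) -> list[str]:
    """Score every (key, vector) entry against the query, drop scores below
    the threshold, and return up to top_k keys, highest score first."""
    scored = [(cosine(query, vec), key) for key, vec in entries]
    scored = [pair for pair in scored if pair[0] >= score_threshold]
    scored.sort(reverse=True)
    return [key for _, key in scored[:top_k]]
```

The threshold guards against padding the context with weakly related memories when fewer than top_k entries are actually relevant.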

Memory Provider Interface

All memory providers MUST implement these operations:

| Operation | Description | Required |
| --- | --- | --- |
| get | Retrieve entry by key | Yes |
| set | Store entry | Yes |
| delete | Remove entry | Yes |
| exists | Check if key exists | Yes |
| list | List keys matching pattern | Yes |
| clear | Clear namespace | Yes |
| search | Semantic search (vector) | No |
| batch_get | Retrieve multiple entries | No |
| batch_set | Store multiple entries | No |
| expire | Set expiration | No |
| increment | Atomic increment | No |
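The six required operations above can be expressed as a structural interface. A sketch with a dict-backed reference implementation (the `ns:key` namespace encoding and shell-style `list` patterns are illustrative choices, not mandated by the spec):

```python
from fnmatch import fnmatch
from typing import Any, Protocol

class MemoryProvider(Protocol):
    """The six required operations; optional ones (search, batch_*, expire,
    increment) are omitted from this sketch."""
    def get(self, key: str) -> Any: ...
    def set(self, key: str, value: Any) -> None: ...
    def delete(self, key: str) -> None: ...
    def exists(self, key: str) -> bool: ...
    def list(self, pattern: str) -> list[str]: ...
    def clear(self, namespace: str) -> None: ...

class InMemoryProvider:
    """Minimal dict-backed provider; namespaces are encoded as 'ns:key'."""

    def __init__(self) -> None:
        self._data: dict[str, Any] = {}

    def get(self, key: str) -> Any:
        return self._data.get(key)

    def set(self, key: str, value: Any) -> None:
        self._data[key] = value

    def delete(self, key: str) -> None:
        self._data.pop(key, None)

    def exists(self, key: str) -> bool:
        return key in self._data

    def list(self, pattern: str) -> list[str]:
        return [k for k in self._data if fnmatch(k, pattern)]

    def clear(self, namespace: str) -> None:
        prefix = namespace + ":"
        self._data = {k: v for k, v in self._data.items()
                      if not k.startswith(prefix)}
```

Conformance tests written against the Protocol can then run unchanged against any backend, from this in-memory stub to a postgres or qdrant provider.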

Memory Lifecycle

Garbage Collection

memory:
  lifecycle:
    gc:
      strategy: hybrid  # options: lru, lfu, ttl, size_based
      interval_seconds: 3600
      trigger_threshold: 0.8  # 80% memory usage
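The core of an LRU collection pass, tied to the trigger_threshold above, can be sketched in a few lines (entry sizes and the OrderedDict bookkeeping are illustrative; a real provider tracks recency on every access):

```python
from collections import OrderedDict

def gc_lru(entries: OrderedDict, limit_bytes: int,
           trigger_threshold: float = 0.8) -> OrderedDict:
    """Evict least-recently-used entries (oldest first in the OrderedDict,
    which maps key -> size in bytes) until usage falls below the trigger
    fraction of the limit."""
    while entries and sum(entries.values()) / limit_bytes >= trigger_threshold:
        entries.popitem(last=False)  # drop the least recently used entry
    return entries
```

With interval_seconds: 3600, such a pass would run hourly, but the threshold check lets a runtime also trigger it eagerly under memory pressure.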

Backup

backup:
  interval_hours: 24
  retention_count: 7
  encryption: true

Complete Example

apiVersion: ossa/v0.4.9
kind: RuntimeSpec
lifecycle:
  phases:
    init:
      timeout_seconds: 30
    plan:
      timeout_seconds: 60
    act:
      timeout_seconds: 300
    reflect:
      timeout_seconds: 30
    terminate:
      timeout_seconds: 15
  max_iterations: 10
control_signals:
  tool_call:
    async: true
    timeout_seconds: 60
  delegation:
    async: true
    timeout_seconds: 300
execution:
  timeout:
    default_seconds: 300
    llm_call_seconds: 60
  sandbox:
    enabled: true
    type: container
  resource_limits:
    memory_mb: 512
    cpu_millicores: 1000
network_isolation:
  enabled: true
  mode: namespace
  egress:
    allowed_domains:
      - api.openai.com
      - api.anthropic.com
    allowed_ports:
      - 443
agent_mesh:
  enabled: true
  encryption: mtls

Specification Version: v0.4.5
Last Updated: 2026-02