OSSA Runtime Specification
The OSSA Runtime Specification defines how agents execute, communicate, and manage state. It provides a standardized execution model that ensures consistent behavior across different runtime implementations.
Overview
The runtime specification consists of several key components:
| Component | Description | Documentation |
|---|---|---|
| Lifecycle | 5-phase agent execution model | Lifecycle |
| Control Signals | Standard communication primitives | Control Signals |
| Execution Model | Timeouts, sandboxing, resource limits | This page |
| Memory Model | Short-term and long-term storage | Memory Model |
| Execution Profiles | fast/balanced/deep/safe profiles | Execution Profiles |
Quick Start
Define runtime configuration in your agent manifest:
    apiVersion: ossa/v0.4.9
    kind: Agent
    metadata:
      name: my-agent
      version: "1.0.0"
    spec:
      role: worker
      runtime:
        lifecycle:
          max_iterations: 10
          phases:
            init:
              timeout_seconds: 30
            act:
              timeout_seconds: 300
        execution:
          sandbox:
            enabled: true
            type: container
          resource_limits:
            memory_mb: 512
            cpu_millicores: 1000
Execution Model
Timeout Configuration
Multiple timeout levels protect against runaway executions:
| Timeout | Default | Range | Description |
|---|---|---|---|
| `default_seconds` | 300 | 1-3600 | Default operation timeout |
| `llm_call_seconds` | 60 | 5-300 | LLM API call timeout |
| `tool_call_seconds` | 60 | 1-600 | Tool invocation timeout |
| `delegation_seconds` | 300 | 30-3600 | Agent delegation timeout |
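A runtime could enforce these ranges when it loads a manifest. The following is a minimal sketch of that check, not part of the OSSA specification: `TIMEOUT_RANGES` and `resolve_timeouts` are hypothetical names, and rejecting out-of-range values (rather than clamping them) is an assumption.

```python
# Illustrative only: these names are hypothetical, not defined by OSSA.
# Each entry is (default, minimum, maximum) in seconds, copied from the table.
TIMEOUT_RANGES = {
    "default_seconds": (300, 1, 3600),
    "llm_call_seconds": (60, 5, 300),
    "tool_call_seconds": (60, 1, 600),
    "delegation_seconds": (300, 30, 3600),
}


def resolve_timeouts(configured: dict) -> dict:
    """Fill in defaults and reject unknown keys or out-of-range values."""
    unknown = set(configured) - set(TIMEOUT_RANGES)
    if unknown:
        raise ValueError(f"unknown timeout keys: {sorted(unknown)}")
    resolved = {}
    for key, (default, low, high) in TIMEOUT_RANGES.items():
        value = configured.get(key, default)
        if not low <= value <= high:
            raise ValueError(f"{key}={value} is outside {low}-{high}")
        resolved[key] = value
    return resolved
```

Calling `resolve_timeouts({"llm_call_seconds": 120})` keeps the other three timeouts at their defaults, while `resolve_timeouts({"llm_call_seconds": 2})` raises `ValueError` because 2 is below the 5-second minimum.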
Sandbox Configuration
Agents run in isolated sandboxes to prevent unauthorized access:
    apiVersion: ossa/v0.4.9
    kind: RuntimeSpec
    execution:
      sandbox:
        enabled: true
        type: container  # options: container, vm, wasm, process, none
        filesystem:
          read_paths:
            - /app/data
          write_paths:
            - /app/output
          deny_paths:
            - /etc/passwd
            - /etc/shadow
            - ~/.ssh
            - ~/.aws
        capabilities: []  # Linux capabilities
        syscall_filter: default  # options: strict, default, permissive
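To make the filesystem rules concrete, here is a rough sketch of how a runtime might evaluate them. It rests on several assumptions the specification text does not state: `deny_paths` always wins, a listed directory covers everything beneath it, write access does not imply read access, and `~` expands to a hypothetical `/home/agent`. The function names are illustrative. Symlinks are not resolved, so a real sandbox would need stronger enforcement than string checks.

```python
import posixpath

# Paths copied from the sandbox example above.
POLICY = {
    "read_paths": ["/app/data"],
    "write_paths": ["/app/output"],
    "deny_paths": ["/etc/passwd", "/etc/shadow", "~/.ssh", "~/.aws"],
}


def _normalize(path: str, home: str) -> str:
    """Expand a leading ~ and collapse '..' and duplicate slashes."""
    if path == "~" or path.startswith("~/"):
        path = home + path[1:]
    return posixpath.normpath(path)


def _covers(root: str, path: str) -> bool:
    """True if path is root itself or lies beneath it."""
    return path == root or path.startswith(root.rstrip("/") + "/")


def is_allowed(path: str, mode: str, policy: dict = POLICY,
               home: str = "/home/agent") -> bool:
    """mode is "read" or "write". Deny rules are checked first (assumption)."""
    target = _normalize(path, home)
    if any(_covers(_normalize(d, home), target) for d in policy["deny_paths"]):
        return False
    roots = policy["read_paths"] if mode == "read" else policy["write_paths"]
    return any(_covers(_normalize(r, home), target) for r in roots)
```

Under these assumptions, `is_allowed("/app/data/in.csv", "read")` passes, while `is_allowed("/app/data/../../etc/passwd", "read")` is rejected because the path normalizes to a denied file.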
Resource Limits
Control agent resource consumption:
    resource_limits:
      memory_mb: 512
      cpu_millicores: 1000
      gpu_memory_mb: 0
      max_open_files: 256
      max_processes: 10
      max_network_connections: 50
Network Isolation
Isolation Modes
| Mode | Description |
|---|---|
| `strict` | No network access except explicitly allowed |
| `namespace` | Isolated network namespace per agent |
| `mesh` | Service mesh with controlled inter-agent communication |
| `none` | No network isolation (development only) |
Egress Rules
Control outbound traffic:
    network_isolation:
      enabled: true
      mode: namespace
      egress:
        allowed_domains:
          - api.openai.com
          - api.anthropic.com
          - "*.googleapis.com"
        blocked_domains: []
        allowed_ports:
          - 443
          - 80
        rate_limit:
          requests_per_second: 100
          bytes_per_second: 10485760  # 10 MB/s
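As an illustration of how the egress rules could be evaluated, the sketch below checks a destination host and port against the example values. It assumes glob-style wildcard matching and that `blocked_domains` takes precedence over `allowed_domains`; neither is stated in the text above, and `egress_allowed` is a hypothetical name. Rate limiting is left out.

```python
from fnmatch import fnmatchcase

# Values copied from the egress example above.
EGRESS = {
    "allowed_domains": ["api.openai.com", "api.anthropic.com", "*.googleapis.com"],
    "blocked_domains": [],
    "allowed_ports": [443, 80],
}


def egress_allowed(host: str, port: int, rules: dict = EGRESS) -> bool:
    """Return True if an outbound connection to host:port passes the rules."""
    host = host.lower().rstrip(".")  # hostnames are case-insensitive
    if port not in rules["allowed_ports"]:
        return False
    # Assumption: an explicit block overrides any allow entry.
    if any(fnmatchcase(host, p.lower()) for p in rules["blocked_domains"]):
        return False
    return any(fnmatchcase(host, p.lower()) for p in rules["allowed_domains"])
```

With this matching, `"*.googleapis.com"` covers `storage.googleapis.com` but not the bare `googleapis.com`, since the pattern requires a subdomain label before the dot.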
Ingress Rules
Control inbound traffic:
    ingress:
      allowed_sources:
        - "10.0.0.0/8"
        - "agent-orchestrator"
      allowed_ports:
        - 8080
        - 8443
      require_mtls: true
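The example mixes a CIDR block with a named source. One plausible reading, sketched below, is that an entry is treated as a network if it parses as one and otherwise as an agent or service name matched against the peer's verified identity. That interpretation, the `ingress_allowed` function, and its parameters are assumptions for illustration.

```python
import ipaddress

# Values copied from the ingress example above.
INGRESS = {
    "allowed_sources": ["10.0.0.0/8", "agent-orchestrator"],
    "allowed_ports": [8080, 8443],
    "require_mtls": True,
}


def ingress_allowed(peer_ip, port, peer_identity=None,
                    mtls_verified=False, rules=INGRESS):
    """Return True if an inbound connection passes the ingress rules."""
    if port not in rules["allowed_ports"]:
        return False
    if rules["require_mtls"] and not mtls_verified:
        return False
    address = ipaddress.ip_address(peer_ip)
    for source in rules["allowed_sources"]:
        try:
            if address in ipaddress.ip_network(source):
                return True
        except ValueError:
            # Not a CIDR block, so treat the entry as a name (assumption).
            if peer_identity is not None and source == peer_identity:
                return True
    return False
```

For example, a peer at `10.1.2.3` on port 8443 is accepted only once mutual TLS has been verified, and a peer outside `10.0.0.0/8` is accepted only if its identity is `agent-orchestrator`.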
Agent Mesh
Configuration for agent-to-agent communication:
    agent_mesh:
      enabled: true
      discovery: kubernetes  # options: dns, consul, static
      encryption: mtls  # options: none, tls, wireguard
      allowed_agents:
        - security-scanner
        - code-reviewer
      denied_agents: []
Context Windows: Sliding Window + Long-Term Memory
Modern agents must operate under hard model context limits. OSSA requires agents to declare a context strategy that is deterministic enough to audit, but flexible enough to evolve.
Context Strategy Declaration
Every agent should declare how it manages its context window in agent.ossa.yaml:
    spec:
      context:
        strategy: sliding_window  # How to manage the context window
        max_tokens: 32768  # Hard limit for the context window
        reserved_tokens:
          system_prompt: 4096  # Reserved for system instructions
          tools: 2048  # Reserved for tool definitions
          output: 4096  # Reserved for generation
        overflow: summarize  # What to do when window is full
        long_term_memory:
          provider: qdrant  # Vector store for persistent recall
          embedding_model: text-embedding-3-small
          dimensions: 1536
          retrieval:
            top_k: 10
            score_threshold: 0.7
Context Strategies
| Strategy | Behavior | Best For |
|---|---|---|
| `sliding_window` | Remove oldest messages first (FIFO) | General-purpose agents |
| `summarize` | Compress old messages into summaries | Long conversations |
| `importance_based` | Keep high-importance messages longer | Complex multi-step tasks |
| `hybrid` | Sliding window + summarize on overflow | Production workloads |
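The `sliding_window` row describes FIFO eviction, which can be sketched in a few lines. This is an illustration rather than a reference implementation: `sliding_window` and `count_tokens` are hypothetical helpers, and the whitespace tokenizer is a stand-in for the model's real tokenizer.

```python
from collections import deque


def count_tokens(message: str) -> int:
    """Stand-in tokenizer (whitespace split), used only for illustration."""
    return len(message.split())


def sliding_window(messages, budget, count=count_tokens):
    """Drop the oldest messages (FIFO) until the rest fit within budget tokens."""
    window = deque(messages)
    total = sum(count(m) for m in window)
    while window and total > budget:
        total -= count(window.popleft())  # evict the oldest message first
    return list(window)
```

Given the history `["a b c", "d e", "f g h i", "j"]` and a budget of 6 tokens, the two oldest messages are dropped and `["f g h i", "j"]` remains. A `hybrid` strategy would presumably summarize the evicted messages instead of discarding them.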
Token Budget Zones
The context window is divided into reserved zones:
┌──────────────────────────────────────────┐
│ System Prompt (reserved: 4096) │
├──────────────────────────────────────────┤
│ Tool Definitions (reserved: 2048) │
├──────────────────────────────────────────┤
│ Retrieved Memories (dynamic) │
├──────────────────────────────────────────┤
│ Conversation History (sliding window) │
├──────────────────────────────────────────┤
│ Output Generation (reserved: 4096) │
└──────────────────────────────────────────┘
The remaining tokens after reservations are split between retrieved long-term memories and the sliding conversation window. This ensures the agent never exceeds its context limit while maintaining access to both recent context and relevant historical knowledge.
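As a worked example with the values from the declaration above, the shared budget is the context limit minus the three reserved zones. The helper name below is hypothetical, and how the result is divided between retrieved memories and conversation history is not specified here.

```python
def history_and_memory_budget(max_tokens: int, reserved: dict) -> int:
    """Tokens left for retrieved memories plus conversation history."""
    remaining = max_tokens - sum(reserved.values())
    if remaining <= 0:
        raise ValueError("reserved zones consume the whole context window")
    return remaining


# Reservations from the example declaration: 4096 + 2048 + 4096 = 10240.
reserved_tokens = {"system_prompt": 4096, "tools": 2048, "output": 4096}
shared = history_and_memory_budget(32768, reserved_tokens)
print(shared)  # 22528
```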
Why Declare Context Strategy?
- Auditability: Operators can predict and verify token usage patterns
- Cost control: Reserved zones prevent runaway token consumption
- Determinism: Same inputs produce predictable context window contents
- Graceful degradation: Overflow strategies prevent hard failures when context fills up
Memory Model
Short-Term Memory
Volatile memory for immediate context (managed by the context strategy above):
| Component | Default Limit | Description |
|---|---|---|
| Conversation Context | 100 messages / 32K tokens | Recent conversation history |
| Working Memory | 64 MB | Scratchpad for computations |
| Session State | 1 hour TTL | Session-level state |
Long-Term Memory
Persistent storage for knowledge and recall across sessions:
Supported Providers
| Category | Providers |
|---|---|
| Relational | postgres, sqlite |
| Document | mongodb, dynamodb |
| Vector | qdrant, pinecone, weaviate, chromadb, milvus |
| Graph | neo4j |
| Hybrid | postgres with pgvector |
Vector Store Configuration
    spec:
      context:
        long_term_memory:
          provider: qdrant
          embedding_model: text-embedding-3-small
          dimensions: 1536
          distance_metric: cosine  # options: euclidean, dot_product
          index_type: hnsw  # options: flat, ivf, scann
          retrieval:
            top_k: 10
            score_threshold: 0.7
Memory Provider Interface
All memory providers MUST implement these operations:
| Operation | Description | Required |
|---|---|---|
| `get` | Retrieve entry by key | Yes |
| `set` | Store entry | Yes |
| `delete` | Remove entry | Yes |
| `exists` | Check if key exists | Yes |
| `list` | List keys matching pattern | Yes |
| `clear` | Clear namespace | Yes |
| `search` | Semantic search (vector) | No |
| `batch_get` | Retrieve multiple entries | No |
| `batch_set` | Store multiple entries | No |
| `expire` | Set expiration | No |
| `increment` | Atomic increment | No |
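The six required operations map naturally onto an abstract base class. The sketch below shows one possible shape with a dictionary-backed implementation. The method signatures, return types, and glob-style pattern matching are assumptions, since the table only names the operations, and `MemoryProvider` and `InMemoryProvider` are illustrative class names.

```python
from abc import ABC, abstractmethod
from fnmatch import fnmatchcase
from typing import Any, Dict, List, Optional


class MemoryProvider(ABC):
    """Required operations from the table above. Signatures are assumptions."""

    @abstractmethod
    def get(self, key: str) -> Optional[Any]: ...

    @abstractmethod
    def set(self, key: str, value: Any) -> None: ...

    @abstractmethod
    def delete(self, key: str) -> bool: ...

    @abstractmethod
    def exists(self, key: str) -> bool: ...

    @abstractmethod
    def list(self, pattern: str = "*") -> List[str]: ...

    @abstractmethod
    def clear(self) -> None: ...


class InMemoryProvider(MemoryProvider):
    """Dictionary-backed provider holding a single namespace."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def get(self, key: str) -> Optional[Any]:
        return self._data.get(key)

    def set(self, key: str, value: Any) -> None:
        self._data[key] = value

    def delete(self, key: str) -> bool:
        # Report whether anything was actually removed.
        if key in self._data:
            del self._data[key]
            return True
        return False

    def exists(self, key: str) -> bool:
        return key in self._data

    def list(self, pattern: str = "*") -> List[str]:
        # Glob-style matching is an assumption; the table only says "pattern".
        return [k for k in sorted(self._data) if fnmatchcase(k, pattern)]

    def clear(self) -> None:
        self._data.clear()
```

Optional operations such as `search` or `batch_get` could be added by subclasses that support them, leaving the base class limited to what every provider must offer.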
Memory Lifecycle
Garbage Collection
    memory:
      lifecycle:
        gc:
          strategy: hybrid  # options: lru, lfu, ttl, size_based
          interval_seconds: 3600
          trigger_threshold: 0.8  # 80% memory usage
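To show how `trigger_threshold` might interact with an eviction strategy, the sketch below implements only the `lru` option on a toy store. It assumes usage is measured in entries rather than bytes and that collection stops once usage is back at or below the threshold. `LruStore` is a hypothetical class, and the `hybrid` strategy is not modeled.

```python
from collections import OrderedDict


class LruStore:
    """Toy store that evicts least-recently-used entries during collection."""

    def __init__(self, capacity: int, trigger_threshold: float = 0.8) -> None:
        self.capacity = capacity
        self.trigger_threshold = trigger_threshold
        self._entries = OrderedDict()

    def get(self, key):
        value = self._entries[key]
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def set(self, key, value) -> None:
        self._entries[key] = value
        self._entries.move_to_end(key)

    def usage(self) -> float:
        return len(self._entries) / self.capacity

    def collect(self) -> list:
        """Evict oldest entries while usage exceeds the threshold."""
        evicted = []
        while self.usage() > self.trigger_threshold:
            key, _ = self._entries.popitem(last=False)
            evicted.append(key)
        return evicted
```

A scheduler would call `collect()` every `interval_seconds`. On a full 10-entry store with the default threshold, two entries are evicted, and any key read recently is spared.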
Backup
    backup:
      interval_hours: 24
      retention_count: 7
      encryption: true
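The two numeric settings can be read as a schedule check and a pruning rule. The sketch below assumes that `retention_count` means "keep the newest N backups"; both function names are hypothetical.

```python
from datetime import datetime, timedelta


def backup_due(last_backup: datetime, now: datetime,
               interval_hours: int = 24) -> bool:
    """True once at least interval_hours have passed since the last backup."""
    return now - last_backup >= timedelta(hours=interval_hours)


def backups_to_delete(backup_times, retention_count: int = 7) -> list:
    """Return timestamps that fall outside the newest retention_count backups."""
    newest_first = sorted(backup_times, reverse=True)
    return newest_first[retention_count:]
```

With ten daily backups and the default retention of 7, the three oldest timestamps are returned for deletion.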
Complete Example
    apiVersion: ossa/v0.4.9
    kind: RuntimeSpec
    lifecycle:
      phases:
        init:
          timeout_seconds: 30
        plan:
          timeout_seconds: 60
        act:
          timeout_seconds: 300
        reflect:
          timeout_seconds: 30
        terminate:
          timeout_seconds: 15
      max_iterations: 10
    control_signals:
      tool_call:
        async: true
        timeout_seconds: 60
      delegation:
        async: true
        timeout_seconds: 300
    execution:
      timeout:
        default_seconds: 300
        llm_call_seconds: 60
      sandbox:
        enabled: true
        type: container
      resource_limits:
        memory_mb: 512
        cpu_millicores: 1000
    network_isolation:
      enabled: true
      mode: namespace
      egress:
        allowed_domains:
          - api.openai.com
          - api.anthropic.com
        allowed_ports:
          - 443
      agent_mesh:
        enabled: true
        encryption: mtls
Related Documentation
- Lifecycle Phases - Detailed lifecycle documentation
- Control Signals - Signal types and usage
- Memory Model - Short-term and long-term memory management
- Execution Profiles - fast, balanced, deep, and safe profiles
- Schema Reference - Complete schema documentation
- Execution Flow - How requests flow through agents
Specification Version: v0.4.5 Last Updated: 2026-02