Security & Sandboxing

Why Security in Harness Engineering?

AI agents execute code, read files, call APIs, and interact with external systems. A poorly secured harness is a remote code execution vulnerability with a chat interface.

Permission Models

Allowlist (Restrictive)

Only explicitly permitted tools and actions are available; anything not listed is denied by default.

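tools:
  # Anything not listed here is rejected by default.
  allowed: [read_file, web_search, exec_sandboxed]
  # Redundant under a strict allowlist; listed to make the intent explicit.
  denied: [rm, sudo, network_admin]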
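As a rough illustration of how a harness might enforce the config above, here is a minimal Python sketch. `ToolPolicy`, `is_permitted`, and `dispatch` are hypothetical names chosen for this example, not part of any particular harness, and the rule that an explicit denial overrides an allow entry is an assumption.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolPolicy:
    """Hypothetical policy object mirroring the `tools:` config."""
    allowed: frozenset = field(default_factory=frozenset)
    denied: frozenset = field(default_factory=frozenset)

    def is_permitted(self, tool_name: str) -> bool:
        # Assumption: an explicit denial wins even if the tool is also allowed.
        if tool_name in self.denied:
            return False
        # Default deny: anything not on the allowlist is rejected.
        return tool_name in self.allowed


policy = ToolPolicy(
    allowed=frozenset({"read_file", "web_search", "exec_sandboxed"}),
    denied=frozenset({"rm", "sudo", "network_admin"}),
)


def dispatch(tool_name: str) -> str:
    """Refuse to dispatch any tool the policy does not permit."""
    if not policy.is_permitted(tool_name):
        raise PermissionError(f"tool not permitted: {tool_name}")
    return f"dispatching {tool_name}"
```

Note that a tool such as `write_file`, which appears in neither list, is rejected by the same check; that default is what makes this an allowlist and not a denylist.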

Capability-Based

Agents request specific capabilities as they need them, and the harness grants or denies each request against its policy.

Agent: "I need to write to /tmp/output.txt"
Harness: ✅ Granted (within sandbox)

Agent: "I need to access ~/.ssh/id_rsa"
Harness: ❌ Denied (outside trust boundary)
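The exchange above could be backed by a check like the following sketch, which grants a write capability only when the requested path resolves inside a sandbox root. `Decision` and `request_write` are hypothetical names, and using `/tmp` as the sandbox root is an assumption taken from the example. `Path.is_relative_to` requires Python 3.9 or later.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Decision:
    granted: bool
    reason: str


def request_write(path: str, sandbox_root: Path) -> Decision:
    """Grant a write capability only for paths inside the sandbox root."""
    # expanduser() handles "~"; resolve() collapses ".." and follows
    # symlinks, so "/tmp/../etc/passwd" cannot sneak past the check.
    target = Path(path).expanduser().resolve()
    if target.is_relative_to(sandbox_root.resolve()):
        return Decision(True, "within sandbox")
    return Decision(False, "outside trust boundary")


root = Path("/tmp")  # assumed sandbox root for this example
print(request_write("/tmp/output.txt", root))
print(request_write("/tmp/../etc/passwd", root))
```

Resolving the path before comparing it is the important design choice here: a plain string-prefix check would accept traversal paths that merely start with the sandbox root.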

Human-in-the-Loop

Actions are tiered by risk, and the most sensitive ones require explicit user approval.

Low risk: auto-approve (reading files, searching)
Medium risk: notify + proceed unless stopped
High risk: pause and wait for explicit approval (sending emails, deploying)
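One way to wire up these three tiers is sketched below. The action names in `RISK_BY_ACTION`, the `authorize` function, and the `notify`/`confirm` callbacks are all hypothetical; in particular, treating unknown actions as high risk and placing `write_file` in the medium tier are assumptions of this sketch.

```python
from enum import Enum
from typing import Callable


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Hypothetical mapping from action name to risk tier.
RISK_BY_ACTION = {
    "read_file": Risk.LOW,
    "web_search": Risk.LOW,
    "write_file": Risk.MEDIUM,
    "send_email": Risk.HIGH,
    "deploy": Risk.HIGH,
}


def authorize(
    action: str,
    notify: Callable[[str], bool],   # returns True if the user stopped the action
    confirm: Callable[[str], bool],  # returns True if the user approved the action
) -> bool:
    """Return True if the action may proceed."""
    # Assumption: actions missing from the table are treated as high risk.
    risk = RISK_BY_ACTION.get(action, Risk.HIGH)
    if risk is Risk.LOW:
        return True  # auto-approve
    if risk is Risk.MEDIUM:
        stopped = notify(f"About to run: {action}")
        return not stopped  # proceed unless stopped
    return confirm(f"Approve {action}?")  # pause for explicit approval
```

In an interactive harness, `confirm` would block on user input, while `notify` would typically show a message and give the user a short window to cancel.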

Sandbox Architectures

Architecture	Isolation Level	Performance	Use Case
Process sandbox	Medium	Fast	Local development
Docker container	High	Medium	Production agents
Firecracker/microVM	Very high	Slower	Multi-tenant platforms
WASM sandbox	Medium-high	Fast	Browser-based agents
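To make the lightest row of the table concrete, the sketch below runs a code snippet in a child process with an empty environment, a throwaway working directory, and a timeout. `run_in_process_sandbox` is a hypothetical helper, and the sketch assumes a POSIX system. It does not restrict system calls, filesystem access, or networking, so on its own it is far weaker than the container and microVM rows; a real process sandbox would add OS-level mechanisms on top.

```python
import subprocess
import sys
import tempfile


def run_in_process_sandbox(code: str, timeout_s: float = 5.0) -> subprocess.CompletedProcess:
    """Run a Python snippet in a separate, lightly restricted process."""
    with tempfile.TemporaryDirectory() as workdir:
        return subprocess.run(
            # -I: isolated mode, ignores PYTHON* variables and the user site dir
            [sys.executable, "-I", "-c", code],
            cwd=workdir,          # scratch directory, deleted afterwards
            env={},               # do not inherit secrets from the parent's environment
            capture_output=True,  # collect stdout/stderr instead of sharing the terminal
            text=True,
            timeout=timeout_s,    # raises subprocess.TimeoutExpired on runaway code
        )


result = run_in_process_sandbox("print(1 + 1)")
print(result.returncode, result.stdout.strip())
```

The caller is expected to handle `subprocess.TimeoutExpired` and to treat a non-zero `returncode` as a failed tool call.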

Trust Boundaries

┌─ Fully Trusted ─────────────────────┐
│  Agent config, system prompt         │
├─ Trusted with Verification ─────────┤
│  User messages, uploaded files       │
├─ Untrusted ─────────────────────────┤
│  Web content, API responses,         │
│  other agents' output                │
├─ Never Trusted ─────────────────────┤
│  Prompt injection attempts,          │
│  unknown tool outputs                │
└─────────────────────────────────────┘
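The layers in the diagram can be represented as an ordered enum so that the harness can compare trust levels directly. The sketch below is illustrative only: the source labels in `TRUST_BY_SOURCE`, the `Content` type, and the rule that only content at the "trusted with verification" level or above may trigger tool calls are assumptions, not something the diagram prescribes.

```python
from dataclasses import dataclass
from enum import IntEnum


class Trust(IntEnum):
    NEVER = 0      # never trusted
    UNTRUSTED = 1  # untrusted
    VERIFIED = 2   # trusted with verification
    FULL = 3       # fully trusted


# Hypothetical source labels mapped onto the layers in the diagram.
TRUST_BY_SOURCE = {
    "agent_config": Trust.FULL,
    "system_prompt": Trust.FULL,
    "user_message": Trust.VERIFIED,
    "uploaded_file": Trust.VERIFIED,
    "web_content": Trust.UNTRUSTED,
    "api_response": Trust.UNTRUSTED,
    "agent_output": Trust.UNTRUSTED,
}


@dataclass(frozen=True)
class Content:
    source: str
    text: str

    @property
    def trust(self) -> Trust:
        # Unknown sources fall into the lowest layer.
        return TRUST_BY_SOURCE.get(self.source, Trust.NEVER)


def may_trigger_tools(content: Content) -> bool:
    """Assumed policy: only verified-or-better content can cause tool calls."""
    return content.trust >= Trust.VERIFIED
```

Under this policy, text fetched from a web page can be summarized or quoted, but instructions embedded in it cannot cause a tool call by themselves.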

Key Principles

Least privilege — Agents get only the permissions they need
Defense in depth — Multiple layers of protection
Fail safe — When in doubt, deny and ask the user
Audit trail — Log all sensitive actions for review
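The "fail safe" and "audit trail" principles can be combined in a single wrapper, sketched below. `guarded_call` is a hypothetical helper, and an in-memory list of JSON lines stands in for whatever log store a real harness would use.

```python
import json
import time
from typing import Callable, List


def guarded_call(
    action: str,
    is_allowed: Callable[[str], bool],
    run: Callable[[], object],
    audit_log: List[str],
) -> object:
    """Check policy, record the decision, then run the action if permitted."""
    try:
        allowed = bool(is_allowed(action))
    except Exception:
        # Fail safe: if the policy check itself breaks, deny.
        allowed = False
    # Audit trail: log the decision before acting, so denied and
    # crashed attempts are recorded too.
    audit_log.append(json.dumps({"ts": time.time(), "action": action, "allowed": allowed}))
    if not allowed:
        raise PermissionError(f"action denied: {action}")
    return run()
```

Writing the log entry before running the action means the record survives even if the action itself raises.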
