Security & Sandboxing
Why Security in Harness Engineering?
AI agents execute code, read files, call APIs, and interact with external systems. A poorly secured harness is a remote code execution vulnerability with a chat interface.
Permission Models
Allowlist (Restrictive)
Only explicitly permitted tools and actions are available; anything not listed is denied by default.
tools:
  allowed: [read_file, web_search, exec_sandboxed]
  denied: [rm, sudo, network_admin]
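One way the harness could enforce a policy like this is sketched below. It is a minimal illustration, not a reference implementation: `ToolPolicy`, `POLICY`, `dispatch`, and the `handlers` mapping are hypothetical names, and the sketch assumes that an explicit deny entry overrides an allow entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolPolicy:
    """Hypothetical in-memory form of the allowlist config."""
    allowed: frozenset
    denied: frozenset = frozenset()

    def permits(self, tool: str) -> bool:
        # Assumption: an explicit deny always wins over an allow.
        if tool in self.denied:
            return False
        # Anything not listed in `allowed` is rejected by default.
        return tool in self.allowed


POLICY = ToolPolicy(
    allowed=frozenset({"read_file", "web_search", "exec_sandboxed"}),
    denied=frozenset({"rm", "sudo", "network_admin"}),
)


def dispatch(tool: str, handlers: dict, *args):
    """Check the policy before looking up and calling a tool handler."""
    if not POLICY.permits(tool):
        raise PermissionError(f"tool not permitted: {tool}")
    return handlers[tool](*args)
```

The policy check runs before the handler lookup, so a tool that is not on the allowlist is rejected even if a handler for it happens to be registered.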
Capability-Based
Agents request each capability they need, and the harness grants or denies every request individually.
Agent: "I need to write to /tmp/output.txt"
Harness: ✅ Granted (within sandbox)
Agent: "I need to access ~/.ssh/id_rsa"
Harness: ❌ Denied (outside trust boundary)
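The exchange above could be backed by a path check along these lines. This is a sketch under stated assumptions: `SANDBOX_ROOT` and `grant_write` are hypothetical names, `/tmp` stands in for whatever directory the harness treats as its sandbox, and `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path

# Assumption: the sandbox is a single directory tree; /tmp is only an example.
SANDBOX_ROOT = Path("/tmp").resolve()


def grant_write(requested: str, root: Path = SANDBOX_ROOT) -> bool:
    """Return True only if the requested path resolves inside the sandbox root."""
    # resolve() collapses ".." segments and follows symlinks, so a request
    # such as "/tmp/../etc/passwd" is judged by where it actually points.
    target = Path(requested).expanduser().resolve()
    return target.is_relative_to(root)
```

Resolving the path before comparing is the important step: a plain string prefix check would accept `/tmp/../etc/passwd` because it starts with `/tmp`.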
Human-in-the-Loop
Sensitive actions require explicit user approval.
- Low risk: auto-approve (reading files, searching)
- Medium risk: notify the user and proceed unless they intervene
- High risk: pause and wait for explicit approval (sending emails, deploying)
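The three tiers can be sketched as a small gate function. Everything here is illustrative: `Risk`, `ACTION_RISK`, and `should_proceed` are hypothetical names, `write_file` is an assumed example of a medium-risk action, and the `approve` and `notify` callbacks stand in for whatever UI the harness provides. The sketch also omits the cancellation window that "proceed unless stopped" would need in practice.

```python
from enum import Enum
from typing import Callable


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Hypothetical classification; a real harness would define its own.
ACTION_RISK = {
    "read_file": Risk.LOW,
    "web_search": Risk.LOW,
    "write_file": Risk.MEDIUM,  # assumed example, not from the list above
    "send_email": Risk.HIGH,
    "deploy": Risk.HIGH,
}


def should_proceed(
    action: str,
    approve: Callable[[str], bool],
    notify: Callable[[str], None],
) -> bool:
    """Decide whether an action may run, asking the user when required."""
    # Unclassified actions are treated as high risk.
    risk = ACTION_RISK.get(action, Risk.HIGH)
    if risk is Risk.LOW:
        return True
    if risk is Risk.MEDIUM:
        notify(action)  # inform the user, then continue
        return True
    return approve(action)  # block until the user answers
```

Defaulting unknown actions to `Risk.HIGH` means a newly added tool requires approval until someone classifies it.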
Sandbox Architectures
| Architecture | Isolation Level | Performance | Use Case |
|---|---|---|---|
| Process sandbox | Medium | Fast | Local development |
| Docker container | High | Medium | Production agents |
| Firecracker/microVM | Very high | Slower | Multi-tenant platforms |
| WASM sandbox | Medium-high | Fast | Browser-based agents |
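To make the first row concrete, the sketch below shows only the outer shell of a process sandbox: a child process with a timeout, a reduced environment, and a throwaway working directory. `run_untrusted` and `PASSTHROUGH_ENV` are hypothetical names. This alone does not restrict file or network access; real isolation needs OS-level mechanisms such as seccomp filters or namespaces, which are not shown.

```python
import os
import subprocess
import sys
import tempfile

# Variables passed through to the child; everything else is dropped.
# The list is an assumption chosen so the interpreter can still start.
PASSTHROUGH_ENV = ("PATH", "LD_LIBRARY_PATH", "SYSTEMROOT")


def run_untrusted(code: str, timeout_s: float = 2.0) -> subprocess.CompletedProcess:
    """Run a Python snippet in a child process with a timeout and a scratch directory."""
    env = {k: os.environ[k] for k in PASSTHROUGH_ENV if k in os.environ}
    with tempfile.TemporaryDirectory() as workdir:
        return subprocess.run(
            # -I runs the interpreter in isolated mode, which ignores
            # PYTHON* environment variables and the user site directory.
            [sys.executable, "-I", "-c", code],
            cwd=workdir,
            env=env,
            capture_output=True,
            text=True,
            timeout=timeout_s,  # raises subprocess.TimeoutExpired when exceeded
        )
```

A caller would inspect `returncode`, `stdout`, and `stderr` on the result and treat `subprocess.TimeoutExpired` as a failed run.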
Trust Boundaries
┌─ Fully Trusted ─────────────────────┐
│ Agent config, system prompt │
├─ Trusted with Verification ─────────┤
│ User messages, uploaded files │
├─ Untrusted ─────────────────────────┤
│ Web content, API responses, │
│ other agents' output │
├─ Never Trusted ─────────────────────┤
│ Prompt injection attempts, │
│ unknown tool outputs │
└─────────────────────────────────────┘
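One way to carry these boundaries through a harness is to tag every piece of content with its source and derive the trust level from that tag. The sketch below assumes this design; `Trust`, `SOURCE_TRUST`, `Tagged`, `tag`, and `may_issue_instructions` are hypothetical names, and detecting prompt injection inside the text itself is out of scope here.

```python
from dataclasses import dataclass
from enum import IntEnum


class Trust(IntEnum):
    NEVER = 0
    UNTRUSTED = 1
    VERIFY = 2  # trusted with verification
    FULL = 3


# Source labels are illustrative; they mirror the tiers in the diagram.
SOURCE_TRUST = {
    "agent_config": Trust.FULL,
    "system_prompt": Trust.FULL,
    "user_message": Trust.VERIFY,
    "uploaded_file": Trust.VERIFY,
    "web_content": Trust.UNTRUSTED,
    "api_response": Trust.UNTRUSTED,
    "agent_output": Trust.UNTRUSTED,
}


@dataclass(frozen=True)
class Tagged:
    text: str
    source: str
    trust: Trust


def tag(text: str, source: str) -> Tagged:
    # Content from an unrecognized source falls into the lowest tier.
    return Tagged(text, source, SOURCE_TRUST.get(source, Trust.NEVER))


def may_issue_instructions(item: Tagged) -> bool:
    """Assumed rule: only the two upper tiers may direct the agent's behavior."""
    return item.trust >= Trust.VERIFY
```

Under this rule, text fetched from the web can be read and summarized but is never treated as a command, whatever it says.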
Key Principles
- Least privilege: agents get only the permissions their task requires
- Defense in depth: layer multiple independent protections so that a single failure does not expose the system
- Fail safe: when in doubt, deny and ask the user
- Audit trail: log all sensitive actions for review
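As an illustration of the audit-trail principle, the decorator below records every call to a sensitive action, whether it succeeds or fails. It is a sketch only: `audited`, `AUDIT_LOG`, and the entry fields are hypothetical, and the in-memory list stands in for an append-only file or a logging service.

```python
import json
import time
from functools import wraps

# Stand-in for an append-only log; each entry is one JSON line.
AUDIT_LOG = []


def audited(action: str):
    """Wrap a function so each call is recorded with its outcome."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            entry = {"ts": time.time(), "action": action, "args": repr(args)}
            try:
                result = fn(*args, **kwargs)
                entry["outcome"] = "ok"
                return result
            except Exception as exc:
                entry["outcome"] = f"error: {type(exc).__name__}"
                raise
            finally:
                # Runs on success and on failure, so no call goes unrecorded.
                AUDIT_LOG.append(json.dumps(entry))
        return wrapper
    return decorator
```

Writing the entry in `finally` means a failed or rejected action still leaves a record, which is usually the case a reviewer most wants to see.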