Meta-Harness: Automated Harness Optimization
Original: Tweet thread by Lior Alexander · March 30, 2026 Paper: by Yoonho Lee et al. — Quoted tweet Engagement: 242 likes · 390 bookmarks Category: Research / Practice
Overview
What if you could automate the process of improving an agent's harness? Meta-Harness does exactly that — an automated optimization method that tunes system prompts, tool definitions, retry logic, and context management end-to-end, achieving a 6x performance gap from harness changes alone, without changing the model.
How It Works
The core loop is simple:
1. Start with any harness
2. An optimizer agent reads the code, logs, and scores
3. It identifies what caused each failure
4. It rewrites the harness and submits a new version
5. That version gets tested, results go back into the folder
6. Repeat
What Makes It Different: Raw File Access
Previous automated optimization methods compressed everything into short summaries — the optimizer only saw ~26K tokens per step. Not enough to trace why something broke.
Meta-Harness keeps every raw file: 10 million tokens per step, 400x more data to learn from. Enough to trace a failure back to the exact line that caused it.
Results
| Benchmark | Result | Note |
|---|---|---|
| TerminalBench-2 (coding) | #1 among all Haiku agents | |
| Text classification | Beats best hand-designed harness by 7.7 points | Uses 4x fewer tokens |
| Math | Single strategy improves accuracy across 5 unseen models | Cross-model transfer |
All improvements came from optimizing the harness, not the model.
The Key Insight
"The performance delta between frontier models is narrowing. The delta between harness implementations on the same model is not. That's where the leverage is."
This is the strongest empirical evidence yet for why Harness Engineering matters: a well-designed harness can extract dramatically more value from the same model than a poorly-designed one. And now, the design process itself can be automated.
Implications
- Harness optimization is a tractable problem — you don't need humans to hand-tune every configuration
- Cross-model transfer — a good harness design for one model often improves others
- Token efficiency is a first-class objective — Meta-Harness found solutions that are both better AND cheaper
- The future of harness engineering may be harness-engineering-engineering — meta-level optimization of the optimization process itself
See also: AutoAgent: Self-Optimizing Agent · Wayne Zhang: Three Scaling Dimensions