Meta-Harness: Automated Harness Optimization

Original: Tweet thread by Lior Alexander · March 30, 2026 Paper: by Yoonho Lee et al. — Quoted tweet Engagement: 242 likes · 390 bookmarks Category: Research / Practice

Overview

What if you could automate the process of improving an agent's harness? Meta-Harness does exactly that — an automated optimization method that tunes system prompts, tool definitions, retry logic, and context management end-to-end, achieving a 6x performance gap from harness changes alone, without changing the model.

How It Works

The core loop is simple:

1. Start with any harness
2. An optimizer agent reads the code, logs, and scores
3. It identifies what caused each failure
4. It rewrites the harness and submits a new version
5. That version gets tested, results go back into the folder
6. Repeat

What Makes It Different: Raw File Access

Previous automated optimization methods compressed everything into short summaries — the optimizer only saw ~26K tokens per step. Not enough to trace why something broke.

Meta-Harness keeps every raw file: 10 million tokens per step, 400x more data to learn from. Enough to trace a failure back to the exact line that caused it.

Results

Benchmark	Result	Note
TerminalBench-2 (coding)	#1 among all Haiku agents
Text classification	Beats best hand-designed harness by 7.7 points	Uses 4x fewer tokens
Math	Single strategy improves accuracy across 5 unseen models	Cross-model transfer

All improvements came from optimizing the harness, not the model.

The Key Insight

"The performance delta between frontier models is narrowing. The delta between harness implementations on the same model is not. That's where the leverage is."

This is the strongest empirical evidence yet for why Harness Engineering matters: a well-designed harness can extract dramatically more value from the same model than a poorly-designed one. And now, the design process itself can be automated.

Implications

Harness optimization is a tractable problem — you don't need humans to hand-tune every configuration
Cross-model transfer — a good harness design for one model often improves others
Token efficiency is a first-class objective — Meta-Harness found solutions that are both better AND cheaper
The future of harness engineering may be harness-engineering-engineering — meta-level optimization of the optimization process itself