Huang Jia: The Harness Revolution — A Comprehensive Overview
Original: Datawhale WeChat · Huang Jia · 2026-04-13 Category: Overview / Foundational
Overview
Huang Jia, AI researcher at a Singapore research institute and author of "Hands-on AI Agent" and "Agent Design Patterns", delivered a deep dive into Agent Harness at Datawhale. In 2026, as model capabilities plateau, Harness is becoming the decisive factor in Agent system success.
The 2026 Turning Point: From Model Competition to Harness Competition
In 2026, model intelligence has plateaued — models are smart enough, both Chinese and international. What we compete on now is Harness.
DeepMind's Agents team demonstrated this clearly: using the same model, just swapping the Harness (the model's surrounding infrastructure), performance varies dramatically. Claude Code's rapid progress and commercial success both prove that Harness is a strategic-level asset.
The Historical Inevitability of Harness: 30 Years of Software Engineering
Harness isn't accidental — it's historically inevitable. Engineers have always fought system complexity. The constant across 20+ years of architecture evolution: how to tame complexity.
| Era | Taming Target | Landmark Work |
|---|---|---|
| 1994 | Object complexity | GOF "Design Patterns" — 23 patterns |
| 2002 | Enterprise architecture | Martin Fowler's "PoEAA", Eric Evans' "DDD" |
| 2010 | Distributed systems | Microservices, message queues, eventual consistency |
| 2017 | Data system complexity | Martin Kleppmann's "DDIA" |
| 2026 | Agent complexity | Harness Engineering |
Agents are the first non-deterministic system — they're probabilistic machines. Harness is the reins we use to control them.
Three Leaps: Prompt → Context → Harness
| Phase | Period | Core Focus |
|---|---|---|
| Prompt Engineering | 2023 | Making LLMs understand us (CoT, Few-shot) |
| Context Engineering | 2024-2025 | What you give = what you get (RAG, knowledge bases) |
| Harness Engineering | 2026 | Designing controllable Agent systems (loops, tools, quality gates, governance) |
Agent = Model + Harness
Harness is the infrastructure wrapping model execution:
Harness turns the LLM's brain into the Agent's body.
Six Core Components of a Harness
- Agentic Loop — The heart. Accept input, execute tools iteratively, return results. Directly descended from the ReAct (Reasoning + Acting) pattern.
- Tool System — Extends the LLM's action capabilities beyond language generation.
- Memory & Context Management — Claude Code leads by "several streets" in Context Engineering.
- Guardrails — Allow / Deny / Ask permission controls. The core of the "reins."
- Hooks — Guard mechanisms (e.g., preventing sensitive file leaks to GitHub).
- Session — Runtime continuity management.
Five Production Problems Harness Solves
- Infinite loop problem
- Context explosion problem
- Permission loss-of-control problem
- Quality unpredictability problem
- Cost opacity problem
The Current Harness Landscape
| Type | Representatives | Positioning |
|---|---|---|
| Deep vertical | Claude Code | Deep engineering, the undisputed #1 in Harness |
| Deep vertical | Codex (OpenAI) | Open-source, strong GPT coding, commonly used for code review |
| IDE | Cursor, Windsurf | Coding IDEs, popular before Claude Code |
| Open-source alt | OpenCode | Open-source Claude Code alternative, works with DeepSeek |
| Horizontal | OpenClaw, Hermes | Automation operations (WhatsApp, Feishu, etc.) |
Deep vertical for engineering, horizontal for operations — they complement each other.
The Engineer's Transformation
Engineers will never be unemployed, but coders might.
- Coder: someone who just writes code → replaceable by Agents
- Engineer: someone who designs and controls complex systems → irreplaceable
Core capabilities: understanding system complexity + abstract/structural thinking + taming uncertainty + deep domain knowledge
What's past is prologue.
Based on Huang Jia's talk at Datawhale. Original: Datawhale WeChat.