Fuli Luo: Anthropic Cuts Third-Party Harness Subscriptions — A Compute Economics Analysis
Original: Tweet · Fuli Luo (@_LuoFuli) · April 5, 2026
Engagement: 1,770 likes · 222 retweets · 733K views
Category: Industry Perspective
Background
In early April 2026, Anthropic cut off third-party harnesses (such as OpenClaw, OpenCode) from using Claude subscriptions. Almost simultaneously, MiMo launched Token Plan. Fuli Luo analyzed both events together, offering deep insights into compute economics in the Agent era.
Core Arguments
1. Claude Code Subscriptions Are a Carefully Designed Compute Allocation
Claude Code's subscription tier most likely doesn't turn a profit; it may even lose money outright. Once third-party harnesses plugged in, the real serving cost could run to dozens of times the subscription price.
Luo analyzed OpenClaw's context management: in a single user query, it initiates multiple rounds of low-value tool calls as independent API requests, each carrying over 100K tokens of long context. Even with cache hits, this is wasteful; in extreme cases it can drive up cache miss rates for other queries.
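To make the waste concrete, here is a back-of-the-envelope sketch of the input-side cost of one user query that replays a long context across many tool-call rounds. All numbers are placeholders for illustration (the round count, context size, and per-token prices are assumptions, not Anthropic's actual rates or OpenClaw's actual behavior):

```python
# Rough cost model: an agent turn that resends a long context with
# every tool-call round, at varying prompt-cache hit rates.
# Prices and counts below are illustrative placeholders.

CONTEXT_TOKENS = 100_000   # long context carried by each API request
ROUNDS = 8                 # tool-call rounds triggered by one user query

PRICE_INPUT = 3.00 / 1e6        # $/token for uncached input (placeholder)
PRICE_CACHE_READ = 0.30 / 1e6   # $/token for cache-read input (placeholder)

def turn_cost(cache_hit_rate: float) -> float:
    """Input-side cost of one user query, summed over all rounds."""
    cached = CONTEXT_TOKENS * cache_hit_rate
    uncached = CONTEXT_TOKENS - cached
    per_round = cached * PRICE_CACHE_READ + uncached * PRICE_INPUT
    return per_round * ROUNDS

for rate in (0.0, 0.5, 0.95):
    print(f"cache hit rate {rate:.0%}: ${turn_cost(rate):.2f} per query")
```

Even under these made-up prices, a query that misses the cache entirely costs several times what a high-hit-rate query does, and every extra low-value round multiplies the whole bill.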
2. Short-Term Pain → Long-Term Engineering Discipline
Third-party harnesses can still call Claude via API — they just can't piggyback on subscriptions anymore. Short-term user costs jump by orders of magnitude. But this pressure is exactly what drives harness improvements in context management, maximizing prompt cache hit rates, and reducing wasteful token consumption.
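One form that engineering discipline typically takes: keep the prompt prefix byte-stable and append-only, since provider-side prompt caches generally match on an exact prefix, so rewriting earlier messages invalidates the cache for every later request. A minimal sketch of that idea (my illustration, not any specific harness's code):

```python
# Append-only context management, sketched. Assumption: the provider's
# prompt cache matches on an exact, unchanged prefix of the request.

class Context:
    def __init__(self, system_prompt: str):
        # Stable prefix: written once, never mutated afterwards.
        self.messages = [{"role": "system", "content": system_prompt}]

    def append(self, role: str, content: str) -> None:
        # New tool results go at the end, preserving the cached
        # prefix for the next API request.
        self.messages.append({"role": role, "content": content})

    def compact_tail(self, keep_last: int) -> None:
        # When trimming is unavoidable, drop only from the tail
        # region rather than editing the prefix, so earlier cached
        # tokens stay valid.
        head, tail = self.messages[:1], self.messages[1:]
        self.messages = head + tail[-keep_last:]
```

The design choice is the point: a harness that edits history mid-conversation pays full input price for the whole context on every round, while an append-only harness pays it once.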
"Pain eventually converts to engineering discipline."
3. Don't Race on Money-Losing Pricing
Luo urged LLM companies not to blindly engage in price wars. Selling tokens cheaply while giving third-party harnesses free rein looks user-friendly, but it's a trap — the very trap Anthropic just escaped.
The deeper issue: if users spend their attention on low-quality agent harnesses, unstable inference services, and degraded models, nothing gets done well — creating a negative cycle for user experience and retention.
4. Co-Evolution Theory
"The Agent era doesn't belong to whoever burns the most compute. It belongs to whoever uses it wisely."
Global compute can't keep up with the token demand generated by agents. The real way out isn't cheaper tokens, but co-evolution:
More Efficient Agent Harnesses × More Powerful & Efficient Models
Anthropic's move, whether intentional or not, is pushing the entire ecosystem — open source and closed — in this direction.
Why This Matters for Harness Engineering
This analysis reveals an often-overlooked dimension of Harness Engineering: efficiency.
A harness isn't just about making agents more capable, longer-running, or better at coordination; it must also make agents more economical. Context management quality directly drives API costs, and API costs largely determine whether an agent product can be commercially viable.
When model providers start pricing based on context efficiency (or outright refuse inefficient harnesses), optimizing harness token consumption is no longer optional — it's a survival issue.
See also: OpenAI: Harness Engineering · Anthropic: Multi-Agent Harness Design