Harness-1: a 20-billion-parameter search agent trained to rival Opus-4.6

Braid Daily · 2026-06-07

A 20-billion-parameter agent claims frontier-level search at Context-1 cost. The harness, not the weights, is the story.

[Image: wireframe harness scaffold holding a small compute core and externalizing state into ordered cards, on a dark editorial background.]
The harness is becoming the unit of agent engineering.

The lead

Patrick Jiang announced Harness-1, a 20-billion-parameter search agent trained with what he calls a state-externalizing harness. The pitch: "frontier-level long-horizon search, rivaling Opus-4.6 and outperforming GPT-5.4" at "Context-1-level cost and latency." The claim worth testing is that a small model plus a harness built to push its working state outside the context window can stand in for a…
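To make the idea of externalized state concrete, here is a minimal sketch. It is not Harness-1's design, which the announcement does not detail. It assumes a hypothetical `ExternalState` store of "cards" where only short titles are fed back into the prompt and the full bodies stay outside the context window until the agent asks for one. Every name here (`Card`, `ExternalState`, `build_prompt`) is invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Card:
    """One externalized note: a short title plus the full finding."""
    title: str
    body: str


@dataclass
class ExternalState:
    """Hypothetical store that lives outside the model's context window."""
    cards: list[Card] = field(default_factory=list)

    def write(self, title: str, body: str) -> int:
        """Save a finding and return its card id."""
        self.cards.append(Card(title, body))
        return len(self.cards) - 1

    def index(self) -> str:
        """Render titles only; bodies never enter the prompt from here."""
        return "\n".join(f"[{i}] {c.title}" for i, c in enumerate(self.cards))

    def read(self, card_id: int) -> str:
        """Fetch one full body on demand."""
        return self.cards[card_id].body


def build_prompt(task: str, state: ExternalState) -> str:
    """Assemble a prompt from the task and the card index only."""
    return f"Task: {task}\nNotes so far:\n{state.index()}"


state = ExternalState()
card_id = state.write("Source A summary", "x" * 5000)
prompt = build_prompt("find the release date", state)
```

In this sketch the prompt grows with the number of cards rather than with the size of what was read, which is one plausible way a small model could sustain a long search.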

[Image: diagram showing Agent = Model + Harness, with a train-both loop producing Harness-1.]
The day's thread: tune the model and the harness together, not just the weights.

Agent = model + harness

A default recipe for tuning the whole agent

X / Viv

Viv argues an agent is a model plus a harness, and you should train both: build a v1 on a sensible base harness with task-specific scaffolding, then optimize the pair together rather than just swapping in a bigger model.

“Agent = Model + Harness”

One dollar, twenty minutes, three platforms

X / Nate

Nate reports stitching DeepSeek agents together with a few homemade tools to one-shot a full-stack web, iOS, and Android app. His number: about a dollar in roughly twenty minutes.

“I can now one-shot a full-stack web + iOS + Android app for about $1 in 20 minutes.”

An agent that shipped an app to the App Store

X / Tamaz Gadaev

Tamaz Gadaev describes a CRUX test in which an agent built and published an iOS app to the App Store with a few human interventions. He offers it as his case for why open-world evaluations show more than a pass/fail score.

Plumbing for agents

Sem: code entities on top of Git, not LSP

Hacker News

Sem proposes a primitive for code understanding built from Git dependencies rather than a language server: ask what a function depends on and what depends on it. The post drew 128 points and 49 comments on Hacker News.
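The two questions, what an entity depends on and what depends on it, amount to forward and reverse lookups over a dependency graph. The sketch below is a toy illustration of that query shape only. It is not Sem's API, and it says nothing about how Sem derives edges from Git; here the edges are added by hand, and `EntityGraph` and the function names are hypothetical.

```python
from collections import defaultdict


class EntityGraph:
    """Toy dependency graph over code entities (hypothetical, not Sem's API)."""

    def __init__(self) -> None:
        # entity -> entities it uses
        self._deps: dict[str, set[str]] = defaultdict(set)
        # entity -> entities that use it
        self._rdeps: dict[str, set[str]] = defaultdict(set)

    def add_dependency(self, entity: str, uses: str) -> None:
        """Record that `entity` depends on `uses`, in both directions."""
        self._deps[entity].add(uses)
        self._rdeps[uses].add(entity)

    def depends_on(self, entity: str) -> set[str]:
        """What does this entity depend on?"""
        return set(self._deps.get(entity, set()))

    def dependents_of(self, entity: str) -> set[str]:
        """What depends on this entity?"""
        return set(self._rdeps.get(entity, set()))


graph = EntityGraph()
graph.add_dependency("checkout", "compute_tax")
graph.add_dependency("checkout", "apply_discount")
graph.add_dependency("invoice", "compute_tax")
```

Keeping the reverse index alongside the forward one makes "who calls this?" as cheap as "what does this call?", which is the question an agent typically needs answered before editing a function.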

A proposed shared format for agent memory

Hacker News

The Universal Memory Protocol wants one portable format for agent memory across tools. The top comment names the catch directly: a protocol is only as good as its adoption, and it isn't clear who is using this yet.

pidgin.sh turns Claude Code artifacts into URLs

Reddit / r/ClaudeAI

Built with Claude Code, pidgin.sh targets a familiar friction: Claude generates an HTML mockup or a one-pager, and can now share that artifact as a hosted link, so you no longer have to save and host it by hand.

On the timeline

OpenAI plans to turn ChatGPT into a superapp

Techmeme / Financial Times

OpenAI plans to overhaul ChatGPT in the coming weeks into a superapp with coding tools and agents, framed as a gateway to higher-margin products, per Cristina Criddle at the Financial Times.

What stays scarce after AGI

Techmeme / Dwarkesh Podcast

A Q&A with Google DeepMind's Alex Imas and Epoch AI's Phil Trammell on what remains scarce after AGI and how AI-generated wealth might be redistributed, on the Dwarkesh Podcast.

Local & on-device

Companion episode

Twenty Billion Parameters, One Big Harness · 00:16:51

Three days running, the thread has been the same: capability is moving into the harness and the tooling around the model, not only the weights. Harness-1 puts a number on it, and the agent-building tweets show people wiring small models into full apps. The open question the memory-protocol thread keeps asking is who agrees on the standards once everyone's doing it.