◆ Braid Daily · 2026-06-07

Harness-1: a 20-billion-parameter search agent trained to rival Opus-4.6

7 June 2026

A 20-billion-parameter agent claims frontier search at Context-1 cost — the harness, not the weights, is the story.

The lead

Patrick Jiang announced Harness-1, a 20-billion-parameter search agent trained with what he calls a state-externalizing harness. The pitch: "frontier-level long-horizon search, rivaling Opus-4.6 and outperforming GPT-5.4" at "Context-1-level cost and latency." The claim worth testing is that a small model plus a harness built to push its working state outside the context window can stand in for a…


Agent = model + harness

A default recipe for tuning the whole agent

X / Viv

Viv argues an agent is a model plus a harness, and you should train both: build a v1 on a sensible base harness with task-specific scaffolding, then optimize the pair together rather than just swapping in a bigger model.

“Agent = Model + Harness”


One dollar, twenty minutes, three platforms

X / Nate

Nate reports stitching DeepSeek agents together with a few homemade tools to one-shot a full-stack web, iOS, and Android app. His number: about a dollar in roughly twenty minutes.

“I can now one-shot a full-stack web + iOS + Android app for about $1 in 20 minutes.”


An agent that shipped an app to the App Store

X / Tamaz Gadaev

Tamaz Gadaev describes a CRUX test in which an agent built and published an iOS app to the App Store with a few human interventions. He presents it as his case for why open-world evaluations show more than a pass/fail score.


Grok Build edits a live app from a comment

X / Jon Shulkin

Jon Shulkin shows a natural-language comment-and-edit tool built with Grok Build that lives inside the app being built; leave a comment, and Grok Build makes the change and updates the app.


Plumbing for agents

Sem: code entities on top of Git, not LSP

Hacker News

Sem proposes a primitive for code understanding built from Git dependencies rather than a language server: ask what a function depends on and what depends on it. The post drew 128 points and 49 comments on Hacker News.
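
The two questions in that pitch, what an entity depends on and what depends on it, can be sketched as a small graph index. This is an illustration only, not Sem's API: the class name EntityGraph, its methods, and the example function names are all hypothetical, and the edges are entered by hand because how Sem derives them from Git is not described here.

```python
from collections import defaultdict


class EntityGraph:
    """Toy dependency index over code entities (hypothetical, not Sem's API)."""

    def __init__(self):
        self._deps = defaultdict(set)        # entity -> entities it uses
        self._dependents = defaultdict(set)  # entity -> entities that use it

    def add_dependency(self, entity, uses):
        """Record that `entity` uses `uses`, indexing the edge in both directions."""
        self._deps[entity].add(uses)
        self._dependents[uses].add(entity)

    def depends_on(self, entity):
        """Direct dependencies: what does this entity use?"""
        return sorted(self._deps.get(entity, ()))

    def dependents_of(self, entity):
        """Direct dependents: what uses this entity?"""
        return sorted(self._dependents.get(entity, ()))

    def impacted_by(self, entity):
        """Transitive dependents: everything that could be affected by a change."""
        seen = set()
        stack = [entity]
        while stack:
            current = stack.pop()
            for parent in self._dependents.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return sorted(seen)


# Hand-entered edges standing in for whatever a real tool would extract.
graph = EntityGraph()
graph.add_dependency("checkout", "compute_total")
graph.add_dependency("compute_total", "apply_tax")
graph.add_dependency("invoice", "compute_total")

print(graph.depends_on("compute_total"))
print(graph.dependents_of("compute_total"))
print(graph.impacted_by("apply_tax"))
```

Keeping a reverse index alongside the forward one is what makes the "what depends on it" query as cheap as the forward lookup; an agent deciding what to re-test after an edit would mostly use the transitive version.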


A proposed shared format for agent memory

Hacker News

The Universal Memory Protocol wants one portable format for agent memory across tools. The top comment names the catch directly: a protocol is only as good as its adoption, and it isn't clear who is using this yet.


Databricks tuned a retriever to speed up its assistant

X / Matei Zaharia

Matei Zaharia writes up how Databricks made its Knowledge Assistant three times faster with an Instructed Retriever trained end-to-end, a sign that custom model tuning is showing up as agents reach production scale.


pidgin.sh turns Claude Code artifacts into URLs

Reddit / r/ClaudeAI

Built with Claude Code, pidgin.sh targets a familiar friction: Claude generates an HTML mockup or a one-pager, and the artifact can now be shared as a hosted link instead of being saved and hosted by hand.


On the timeline

OpenAI plans to turn ChatGPT into a superapp

Techmeme / Financial Times

OpenAI plans to overhaul ChatGPT in the coming weeks into a superapp with coding tools and agents, framed as a gateway to higher-margin products, per Cristina Criddle at the Financial Times.


SpaceX signs a $30B compute deal with Google

Indian Express

SpaceX agreed to supply Google with AI computing power under a deal reported at $30 billion, another sign of compute supply being locked up across the largest players.


UK police told to stop drafting court statements with AI

Techmeme / Financial Times

Several UK police forces have been told to stop using AI to prepare court statements, on the concern that inaccurate outputs could contaminate legal procedures, per Robert Wright at the Financial Times.


What stays scarce after AGI

Techmeme / Dwarkesh Podcast

A Q&A with Google DeepMind's Alex Imas and Epoch AI's Phil Trammell on what remains scarce after AGI and how AI-generated wealth might be redistributed, on the Dwarkesh Podcast.


Local & on-device

r/LocalLLaMA is still waiting on a runnable GLM Air

Reddit / r/LocalLLaMA

The local crowd's complaint, in one thread: GLM 5.1 is a strong coder but too big to run at home and slow on the API, and there's been no upgraded Air model since 4.5. The ask is a capable GLM that fits on local hardware.


mlx-audio ships local TTS and ASR on Apple Silicon

X / Kris Matterz

mlx-audio v0.4.4 brings new text-to-speech and speech recognition models running locally on Apple Silicon, the kind of on-device audio stack that doesn't need a server round-trip.


Companion episode

Twenty Billion Parameters, One Big Harness

2026-06-07 · 00:16:51


Three days running, the thread has been the same: capability is moving into the harness and the tooling around the model, not only the weights. Harness-1 puts a number on it, and the agent-building tweets show people wiring small models into full apps. The open question the memory-protocol thread keeps asking is who agrees on the standards once everyone's doing it.