◆ Braid Daily · 2026-06-05

Claude now clears 76% of open-ended coding problems, Anthropic says

5 June 2026

Claude's success on open-ended coding hits 76%, up 50 points in six months, plus AI security cutting both ways.

The lead

On open-ended coding problems where the right answer isn't clear up front, Anthropic reports that "Claude's success rate is now 76%—a 50 point jump in just 6 months." The same post says many Anthropic engineers rate Claude's code quality as on par with their own.

When the model finds the bug — and when it's the bug

Claude audit blamed for a 48% Zcash drop

Watcher.Guru (X)

Watcher.Guru reports a Claude audit surfaced a critical Zcash flaw that allowed unlimited ZEC minting and sat undetected for about four years before a June 1 patch.

“Zcash crashes 48% after Claude AI finds critical vulnerability allowing unlimited minting of $ZEC”

Meta's support agent gets turned into an account-theft tool

MIT Technology Review

On June 5, 404 Media reported attackers were using Meta's AI customer-support agent to steal Instagram accounts through a simple prompt-based approach. MIT Technology Review uses the case to argue AI security is more ordinary than its mythos suggests.

Who controls the build-out

OpenAI agrees to pre-release government capability checks

CNBC via Techmeme

OpenAI confirmed it will comply with President Trump's executive order asking AI companies to let the US government assess their models' capabilities before release.

White House–Anthropic dispute eases before the IPO

Reuters via Techmeme

Sources tell Reuters that a months-long dispute between the White House and Anthropic is easing across the US government as the company prepares to go public.

Switch in talks to raise billions at a $50B-plus valuation

The Information via Techmeme

Data-center developer Switch is in talks to raise billions from private-equity firms including Brookfield and KKR at a valuation above $50 billion.

China pulls more AI talent back from the US

CNBC

Tencent's chief AI scientist Yao Shunyu, who left OpenAI for the company, said he aims to pursue artificial general intelligence as Chinese firms recruit more researchers away from US labs.

Agents, and what's still breaking

CHARM tackles cascading hallucination in agentic RAG

arXiv

A new framework, CHARM, targets cascading hallucination in multi-step agentic retrieval-augmented generation, where one bad retrieval poisons every step that follows.

A benchmark for agents that build agents

arXiv

The Meta-Agent Challenge asks whether current agents can develop other agents on their own, rather than only executing tasks inside setups humans designed for them.

Temporal regret as a training objective

arXiv

Trivium proposes temporal regret as a first-class objective so agents correct mistakes earlier in a run instead of only optimizing the final outcome.

Naval Ravikant's one-line bet on where this goes

Naval (X)

A compact prediction about how agentic systems reshape software architecture.

“Software platforms are going to be rebuilt for agent-first.”

The embodied stack fills in

A world-language-action model class

arXiv

One of several robotics papers out today, this one proposes folding world modeling, language reasoning, and action synthesis into a single architecture for embodied agents.

Flash-WAM claims a 23x speedup for world-action models

arXiv

Flash-WAM distills world-action models for real-time use, reporting a 23x speedup on the joint video-and-action generation that robots run in a loop.

An open dataset for medical robotics

arXiv

A large consortium released Open-H-Embodiment, an open dataset and foundation models aimed at autonomous medical robots.

Companion episode

What the Mug Lets You Do

2026-06-05 · 00:19:40

Capability and failure showed up side by side today. Claude's coding numbers and the wave of embodied-agent papers push one way; the cascading-hallucination work and Meta's hijacked support agent show what still breaks once these systems ship.