On open-ended coding problems where the right answer isn't clear up front, Anthropic now reports that "Claude's success rate is now 76%—a 50 point jump in just 6 months." The same post says many of its engineers rate Claude's code quality as on par with their own.
◆ Braid Daily · 2026-06-05
Claude now clears 76% of open-ended coding problems, Anthropic says
Claude's success on open-ended coding hits 76%, up 50 points in six months — plus AI security cutting both ways.
The lead
When the model finds the bug — and when it's the bug
Claude audit blamed for a 48% Zcash drop
Watcher.Guru (X)
Watcher.Guru reports a Claude audit surfaced a critical Zcash flaw that allowed unlimited ZEC minting and sat undetected for about four years before a June 1 patch.
“Zcash crashes 48% after Claude AI finds critical vulnerability allowing unlimited minting of $ZEC”
Meta's support agent gets turned into an account-theft tool
MIT Technology Review
On June 5, 404 Media reported attackers were using Meta's AI customer-support agent to steal Instagram accounts through a simple prompt-based approach. MIT Technology Review uses the case to argue AI security is more ordinary than its mythos suggests.
Who controls the build-out
OpenAI agrees to pre-release government capability checks
CNBC via Techmeme
OpenAI confirmed it will comply with President Trump's executive order asking AI companies to let the US government assess their models' capabilities before release.
White House–Anthropic dispute eases before the IPO
Reuters via Techmeme
Reuters sources say a months-long dispute between the White House and Anthropic is easing across the US government as the company prepares to go public.
Switch in talks to raise billions at a $50B-plus valuation
The Information via Techmeme
Data-center developer Switch is in talks to raise billions from private-equity firms including Brookfield and KKR at a valuation above $50 billion.
China pulls more AI talent back from the US
CNBC
Tencent's chief AI scientist Yao Shunyu, who left OpenAI for the company, said he aims to pursue artificial general intelligence as Chinese firms recruit more researchers away from US labs.
Agents, and what's still breaking
CHARM tackles cascading hallucination in agentic RAG
arXiv
A new framework, CHARM, targets cascading hallucination in multi-step agentic retrieval-augmented generation, where one bad retrieval poisons every step that follows.
A benchmark for agents that build agents
arXiv
The Meta-Agent Challenge asks whether current agents can develop other agents on their own, rather than only executing tasks inside setups humans designed for them.
Temporal regret as a training objective
arXiv
Trivium proposes temporal regret as a first-class objective so agents correct mistakes earlier in a run instead of only optimizing the final outcome.
Naval Ravikant's one-line bet on where this goes
Naval (X)
A compact prediction about how agentic systems reshape software architecture.
“Software platforms are going to be rebuilt for agent-first.”
The embodied stack fills in
A world-language-action model class
arXiv
This paper, one of several robotics papers out today, proposes folding world modeling, language reasoning, and action synthesis into a single architecture for embodied agents.
Flash-WAM claims a 23x speedup for world-action models
arXiv
Flash-WAM distills world-action models for real-time use, reporting a 23x speedup on the joint video-and-action generation that robots run in a loop.
An open dataset for medical robotics
arXiv
A large consortium released Open-H-Embodiment, an open dataset and foundation models aimed at autonomous medical robots.
Companion episode
What the Mug Lets You Do
Capability and failure showed up side by side today. Claude's coding numbers and the wave of embodied-agent papers push one way; the cascading-hallucination work and Meta's hijacked support agent show what still breaks once these systems ship.