Simon Willison's annotated PyCon US lightning talk calls November 2025 the inflection point. The top model changed hands five times in one month — Sonnet 4.5, GPT-5.1, Gemini 3, GPT-5.1 Codex Max, and Opus 4.5 — and coding agents crossed from often-work to mostly-work. He also names Google's Gemma 4 as the strongest US open-weight model he's seen, and GLM-5.1 as a…
◆ Braid Daily · 2026-05-19
The November inflection, and a second npm worm in three weeks
Simon Willison's six-month recap, a Shai-Hulud variant that hides inside agent config, and two Unix-era historians gone in the same week.
The lead
Supply chain reaches the agent layer
Mini Shai-Hulud: 317 npm packages, 22 minutes, and a new persistence trick
SafeDep
SafeDep's same-day forensic writeup of a May 19 npm compromise: 637 malicious versions across 317 packages — including size-sensor, echarts-for-react, timeago.js, and most of the antv suite — pushed in a 22-minute automated burst. The novel piece is persistence. The payload plants session-start hooks inside Claude Code's settings file, then drops a folder-open task into the VS Code workspace config, so it re-fires every time you open the editor or start an agent session.
“The payload hijacks Claude Code and Codex by injecting SessionStart hooks that re-execute the malware on every AI session, both locally and via commits to accessible GitHub repositories.”
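For readers who want to check a checkout by hand, here is a minimal sketch of a scanner for the two persistence points described above. It is not SafeDep's tooling and does not use their indicators. The file locations (.claude/settings.json, .claude/settings.local.json, .vscode/tasks.json) and the JSON shapes (a hooks.SessionStart list of groups, and tasks with runOptions.runOn set to "folderOpen") are assumptions taken from the public Claude Code and VS Code configuration layouts, and the function names are hypothetical.

```python
import json
from pathlib import Path


def load_json(path):
    """Parse a JSON file; return None if it is missing or not strict JSON."""
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None


def as_list(value):
    """Return value if it is a list, otherwise an empty list."""
    return value if isinstance(value, list) else []


def session_start_commands(settings):
    """Collect commands registered under hooks.SessionStart (assumed layout)."""
    if not isinstance(settings, dict) or not isinstance(settings.get("hooks"), dict):
        return []
    commands = []
    for group in as_list(settings["hooks"].get("SessionStart")):
        if not isinstance(group, dict):
            continue
        for hook in as_list(group.get("hooks")):
            if isinstance(hook, dict) and "command" in hook:
                commands.append(hook["command"])
    return commands


def folder_open_tasks(tasks_config):
    """Collect labels of tasks set to run when the folder is opened."""
    if not isinstance(tasks_config, dict):
        return []
    labels = []
    for task in as_list(tasks_config.get("tasks")):
        if not isinstance(task, dict):
            continue
        run_options = task.get("runOptions")
        if isinstance(run_options, dict) and run_options.get("runOn") == "folderOpen":
            labels.append(task.get("label", "<unlabeled>"))
    return labels


def scan_project(project_dir):
    """Return (file, kind, detail) tuples for auto-run entries in one project."""
    root = Path(project_dir)
    findings = []
    # Assumed project-level Claude Code settings files.
    for name in ("settings.json", "settings.local.json"):
        path = root / ".claude" / name
        for command in session_start_commands(load_json(path)):
            findings.append((str(path), "SessionStart hook", command))
    # Assumed VS Code workspace task file.
    path = root / ".vscode" / "tasks.json"
    for label in folder_open_tasks(load_json(path)):
        findings.append((str(path), "folderOpen task", label))
    return findings
```

Calling scan_project(".") inside a repository lists every auto-run entry it finds, legitimate or not, so each hit still needs a human look. An empty result only means nothing matched these two patterns. Files that contain comments are not strict JSON and are skipped silently here, and user-level settings outside the project directory are not checked.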
Tools, training, and the genmedia stack
Prime Intellect: General-Agent, a self-evolving synthetic RL environment
@PrimeIntellect
Prime Intellect pitches a fully synthetic reinforcement-learning environment whose task corpus self-evolves: 4,504 tool-use tasks across 1,040 domains and 8,159 tools at the start. If synthetic environment generation produces durable agent skills, the post-training advantage Anthropic and OpenAI built through hand-curated environment engineering loses some of its scarcity.
“The next step toward automating AI is automating RL environments.”
Unconfirmed: Cursor trained Compose 2.5 on xAI's Colossus 2
@techdevnotes
A single-tweet claim from a generally accurate account says Cursor's Compose 2.5 coding model was trained on xAI's Colossus 2 supercluster. If it holds up, an editor-layer company just rented frontier-scale training compute from a frontier lab — and the lab selling it is the one most willing to undercut its first-party coding-agent rivals.
“Compose 2.5 by Cursor was trained at xAI's Colossus 2.”
Let's go Bananas with GenMedia — Guillaume Vernade, Google DeepMind
YouTube
Live demo of Google DeepMind's generative-media pipeline reading The Wind in the Willows end to end: Gemini structures the prompts, Nano Banana 2 renders scenes, Veo animates them at 5 cents per second, Lyria composes per-chapter music, Gemini TTS reads dialogue — about a dollar a book. The new Interactions API caches context server-side for around two days, killing the upload-on-every-call pattern.
“Each model has its own set of API. It doesn't make any sense — a developer should just swap the model name and it works.”
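A quick sanity check on those two prices, taking both at face value: the sketch below works out how much Veo footage a one-dollar book budget could cover at most. The constant and function names are hypothetical. The result is an upper bound only, because the same dollar also has to pay for image, music, and speech generation, and the per-model cost breakdown is not given here.

```python
VEO_CENTS_PER_SECOND = 5    # "5 cents per second", from the summary above
BOOK_BUDGET_CENTS = 100     # "about a dollar a book", from the summary above


def max_video_seconds(budget_cents, cents_per_second):
    """Upper bound on video seconds if the whole budget went to video."""
    return budget_cents // cents_per_second


print(max_video_seconds(BOOK_BUDGET_CENTS, VEO_CENTS_PER_SECOND))  # 20
```

At most 20 seconds of video per book suggests animation is used sparingly, with still images, music, and narration carrying most of the runtime.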
Where agents land in practice
Rewiring the State — Eoin Mulgrew, 10 Downing Street
YouTube
Eoin Mulgrew runs cross-government transformation inside 10 Downing Street's data science team. He describes one embedded engineer building, in about two weeks, the statute-book analysis tool the Cabinet Office was about to spend £1.5M outsourcing — and the recruiting and political-cover machinery that lets a small unit of outsiders ship in weeks rather than the typical year-plus discovery phase.
“We do want to recruit missionaries, not mercenaries — a paycheck is not going to get you out of bed when stuff gets hard.”
Ethan Mollick: large companies are already insourcing what they used to buy
@emollick
After talking with executives at large companies, Mollick says insourcing is underway: in-house teams with agent-driven productivity gains are absorbing routine outside legal counsel, mid-tier marketing agencies, and integration-heavy software vendors. The pressure lands on the high-margin professional-services tier first.
“Why pay so many outside vendors (legal, marketing, software vendors) when you can hire in-house and harness AI productivity gains yourself?”
pi-config: a worked Plan/handoff/subagents kit for Claude Code
@DanielGri
Daniel Griesser's open repository (HazAT/pi-config) collects three production-tested artifacts for Claude Code: a Plan skill for larger tasks, a handoff skill for when an agent runs out of context, and a directory of named subagents — plus integration with Aaron Francis's Soloterm as a shared terminal substrate. Mario Zechner reposted it, which is what put it on our radar.
“Don't copy, get inspired.”
Two historians, and one date
Peter Neumann has died
TUHS mailing list
Peter Neumann died Sunday at the hospital in Santa Clara, from complications of a fall and surgery a few weeks earlier. He ran the RISKS Digest from 1985, worked on Multics, and wrote Computer-Related Risks (1995). The digest and the book taught a generation of engineers how to reason about software harms. The Mini Shai-Hulud writeup higher up is the kind of incident RISKS would have annotated.
Peter Salus has died
TUHS mailing list
Peter Salus died on May 15. His A Quarter Century of Unix is the canonical written history of Unix from Bell Labs through the workstation era — researched while most of the principals were still alive to interview. Two of the field's historians gone in the same week.
Companion episode
Mostly-work, malicious npm, and one engineer replacing a law firm
We've been tracking the encyclical since last week — Vatican News confirms Pope Leo XIV's Magnifica humanitas publishes Monday, May 25, and we'll cover it then.