Musk says xAI's next foundation model has finished pre-training: 1.5 trillion parameters, fine-tuning underway, reinforcement learning starting in days, public release in two to three weeks. He credits the training mix — 'A lot of Cursor data was added in supplementary training and there is more to come.' No public evals yet, so the coding claim is his to back up.
Braid Daily · 2026-05-25
Grok V9-Medium wraps pre-training, with Cursor data in the mix
xAI's 1.5-trillion-parameter coding model is done pre-training; harness-vs-engine, local hardware, and who owns the training data.
The lead
Running agents, and the bill
Microsoft moved engineers from Claude Code to GitHub Copilot, both on Opus 4.7
Tren Griffin (X)
A rumor said Microsoft throttled Claude Code to cut a runaway AI bill. Griffin, a Microsoft employee, says it was a harness swap, not a cost cut: engineers moved to GitHub Copilot, both tools run Opus 4.7 on the same enterprise API, so 'Same Anthropic bill. Zero expense cut.' Treat the specifics as one person's claim, not a Microsoft statement.
“The wrapper is interchangeable — the engine isn't... The moat was never the UI.”
Heterogeneous intelligence: route each subtask to the smallest model that can do it
Adrian Bertagnoli, Callosum (AI Engineer)
Bertagnoli's case for routing across models and chips: on Video Web Arena, an 8-billion-parameter Qwen3 VL paired with Kimi K2.5 beat GPT-5.2 by 18% and Gemini 2.5 by 25%. Sending cheap steps like zooming and visual parsing to the small model alone ran 11x faster and 43x cheaper on those steps. The lever any multi-step agent can pull is matching each subtask to the smallest model that can do it.
“You don't need GPT to zoom for you.”
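The routing idea can be sketched in a few lines. This is a minimal illustration, not Callosum's system: the model names, relative costs, and skill labels below are all hypothetical, and a real router would need a measured capability check rather than a hand-written skill list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str             # hypothetical label, not a real endpoint
    cost_per_call: float  # illustrative relative cost, not a measured price
    skills: frozenset     # subtask kinds this model is trusted with


# Hypothetical two-tier setup: a small vision model and a large generalist.
MODELS = [
    Model("small-vision", 1.0, frozenset({"zoom", "visual_parse"})),
    Model("large-reasoner", 50.0,
          frozenset({"zoom", "visual_parse", "plan", "act"})),
]


def route(subtask_kind: str) -> Model:
    """Return the cheapest model whose skill set covers the subtask."""
    capable = [m for m in MODELS if subtask_kind in m.skills]
    if not capable:
        raise ValueError(f"no model can handle {subtask_kind!r}")
    return min(capable, key=lambda m: m.cost_per_call)


# Cheap perception steps go to the small model; planning goes to the large one.
plan = {kind: route(kind).name for kind in ["zoom", "visual_parse", "plan"]}
```

The design choice is that the large model stays eligible for every subtask, so the router only ever downgrades when a cheaper model is known to be good enough.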
'Everyone is Wrong about Tokens'
ThePrimeagen (YouTube)
Reacting to a post bragging about $1.3M and 603 billion tokens spent in a month running OpenClaw, ThePrimeagen notes the poster paid nothing for those tokens. His prediction: orgs will swap token-maxing for token efficiency and rank people by features shipped, not spend.
“It's going to be the people that are just being engineers... not the people spending Infinity.”
How Google DeepMind runs its own agents
KP Sawhney & Ian Ballantyne (AI Engineer)
A look inside day-to-day agent ops at a frontier lab. DeepMind engineers get worse rate limits than paying customers, who are prioritized; a 'Darwinian' skills library lets the org cull all but the best skills so agents inherit them for free; and an agent-trajectory store replays runs down to raw requests to find exactly when one started looping. KP is skeptical of MCP (the Model Context Protocol) and favors skills plus guardrailed CLI calls.
“We have worse limits than you do because obviously we prioritize customers and not ourselves.”
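Finding the moment a run started looping is easy to picture on a recorded trajectory. The sketch below is an assumption about how such a check could work, not a description of DeepMind's store: it treats a trajectory as a plain list of step labels and flags the first position where a short block of steps repeats back to back. The function name and parameters are hypothetical.

```python
def first_loop_start(steps, window=2, repeats=3):
    """Return the index where a block of `window` steps first occurs
    `repeats` times in a row, or None if no such repetition exists."""
    n = len(steps)
    for start in range(n - window * repeats + 1):
        block = steps[start:start + window]
        if all(
            steps[start + i * window:start + (i + 1) * window] == block
            for i in range(1, repeats)
        ):
            return start
    return None


# Hypothetical trajectory: the agent gets stuck alternating read/click.
trajectory = ["search", "open", "read", "click", "read", "click",
              "read", "click", "done"]
loop_at = first_loop_start(trajectory)  # index of the first repeated block
```

In practice the step labels would be derived from the raw requests (for example, tool name plus normalized arguments), so that near-identical calls compare as equal.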
Tools for the local builder
1,000 tokens/sec on a 27-billion-parameter model, using old V100 cards
r/LocalLLaMA
A hobbyist pushed roughly 1,000 tokens per second aggregate on Qwen3.6 27B across 128 concurrent requests, on multi-generation-old Nvidia V100 server cards. Single-user generation lands around 80 tokens per second. It's a best-case throughput demo, but the point holds: a coding-grade model runs fast on cheap, old GPUs.
“For single user the generation is around 80 t/s with 3000 t/s processing, no mtp!!”
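The two headline numbers measure different things, and a little arithmetic separates them. The sketch below uses only the figures quoted above and assumes the aggregate rate is shared evenly across the 128 requests, which the post does not state.

```python
def per_request_tps(aggregate_tps: float, concurrent: int) -> float:
    """Tokens per second each request sees, assuming an even split."""
    return aggregate_tps / concurrent


def batching_gain(aggregate_tps: float, single_user_tps: float) -> float:
    """How much more total output batching yields than one user alone."""
    return aggregate_tps / single_user_tps


AGGREGATE_TPS = 1000    # roughly, across all requests (from the post)
CONCURRENT = 128        # concurrent requests (from the post)
SINGLE_USER_TPS = 80    # single-user generation (from the post)

each = per_request_tps(AGGREGATE_TPS, CONCURRENT)    # about 7.8 tokens/sec
gain = batching_gain(AGGREGATE_TPS, SINGLE_USER_TPS)  # 12.5x
```

Under that assumption each of the 128 requests gets about 8 tokens per second, so the 1,000 figure describes total server output rather than what any one user experiences.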
A llama.cpp fix that stops local agents reprocessing the whole context
llama.cpp (GitHub PR)
Agent harnesses that rewrite conversation history to 'optimize context' were forcing llama.cpp to reprocess huge token chunks — sometimes the full 70k-token context — stalling every turn. The merged PR fixes checkpoint creation so it reprocesses only what changed. A reminder that local agent speed is as much about runtime cache plumbing as about the model.
“In the worst case, it has to reprocess the entire context and you get "forcing full prompt re-processing."”
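Why rewriting history is so expensive becomes clear with a toy model of prefix reuse. This is not llama.cpp's checkpoint logic; it only assumes that a runtime can skip work for the part of the new prompt that matches the cached one token for token from the start. The function names are hypothetical.

```python
def common_prefix_len(cached, new):
    """Count leading tokens that are identical in both sequences."""
    n = 0
    for a, b in zip(cached, new):
        if a != b:
            break
        n += 1
    return n


def tokens_to_reprocess(cached, new):
    """Tokens of the new prompt that fall outside the reusable prefix."""
    return len(new) - common_prefix_len(cached, new)


cached = list(range(70_000))  # stand-in for a 70k-token context

# Normal turn: the harness only appends, so almost everything is reused.
appended = cached + [1, 2, 3]

# "Optimized" turn: the harness edits the very first token of the history.
rewritten = [-1] + cached[1:] + [1, 2, 3]

cheap = tokens_to_reprocess(cached, appended)
costly = tokens_to_reprocess(cached, rewritten)
```

In this model an append costs 3 tokens of work while a single edit at the start costs the whole prompt, which is the pattern the PR description calls out as the worst case.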
Is NVIDIA still the default for local LLMs in 2026?
r/LocalLLaMA
A 230-comment thread on whether 'just buy Nvidia' still holds. For text inference the gap to AMD has mostly closed on llama.cpp's Vulkan backend; AMD still hurts for training and image generation. The value case people cite is an MI50 around $600 for 32 gigabytes of memory and a terabyte per second of bandwidth, with Apple's unified-memory Macs as the turnkey alternative.
“MI50 can be had for just $600... 32GB of VRAM and 1TB/s of memory bandwidth.”
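The value case rests on two quick calculations. The price, memory, and bandwidth figures come from the thread; the 16 GB weight size is a hypothetical example, and the throughput ceiling is a common rule of thumb that assumes a dense model whose weights are all read once per generated token, with no batching.

```python
def dollars_per_gb(price_usd: float, memory_gb: float) -> float:
    """Purchase price per gigabyte of on-card memory."""
    return price_usd / memory_gb


def decode_ceiling_tps(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Rough upper bound on single-stream tokens/sec when decoding is
    limited by how fast the weights can be streamed from memory."""
    return bandwidth_gb_s / weights_gb


PRICE_USD = 600          # from the thread
MEMORY_GB = 32           # from the thread
BANDWIDTH_GB_S = 1000    # 1 TB/s, from the thread
WEIGHTS_GB = 16          # hypothetical quantized model size

cost = dollars_per_gb(PRICE_USD, MEMORY_GB)                # 18.75 USD per GB
ceiling = decode_ceiling_tps(BANDWIDTH_GB_S, WEIGHTS_GB)   # 62.5 tokens/sec
```

Real throughput lands below that ceiling because of compute, KV-cache reads, and software overhead, so treat it as a bound for comparing cards rather than a prediction.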
Defeating git rigour fatigue with Jujutsu
Ike Saunders
A concrete workflow for the messy middle of feature work: build the ideal commit history first as empty labeled commits, squash all the mess into one 'everything commit,' then sort hunks into place by hand until that commit is empty. Saunders is upfront about the catch — there's no guarantee every commit compiles, which may rule it out for bisect-clean history.
“Doing Commits Like A Big Pile Of Laundry, perhaps?”
Where the training data comes from
Now individual AI researchers are being sued over training data
Ed Newton-Rex (X)
Newton-Rex flags a shift in the AI-training suits: Hobbs v. Meta names an individual researcher, not just the company and its executives. The authors allege Guillaume Lample, then at Meta, torrented 70-plus terabytes of pirated books to train Llama; he has since co-founded Mistral AI.
“It's no longer just AI companies & their founders being sued over AI training - individual researchers are now being sued, too.”
Court records: Meta staff torrented nearly 82TB of pirated books
Tom's Hardware
The records behind that suit: 81.7 terabytes pulled from shadow libraries to train Llama, plus internal messages from researchers who objected at the time. The dissent in the record is the part that lands — someone flagged the line and it was crossed anyway.
“I don't think we should use pirated material. I really need to draw a line here.”
Tech firms are paying people $20–25 an hour to film their chores
The Washington Post
The data bottleneck for humanoid robots is physical demonstration, and a labor market has formed to supply it. DoorDash launched a Tasks app in March letting US Dashers film chores; Micro1 reports about 4,000 'robotics generalists' across 71 countries sending more than 160,000 hours of video a month. The next training-data land grab is happening in living rooms, not on the open web.
“Gig workers earn $20-25 an hour to record themselves folding laundry, washing dishes, and making beds.”
Who gets to check the work
AlphaProof Nexus solved 9 open Erdős problems, and the proofs typecheck
Google DeepMind (arXiv)
AlphaProof Nexus pairs a large language model with the Lean proof assistant and runs agentic loops until a proof typechecks or it gives up. It solved 9 of 353 open Erdős problems and proved 44 of 492 OEIS conjectures, some open for decades, at a few hundred dollars each. Because Lean is the referee, there's no hallucinated-proof problem: it passes the kernel or it doesn't. This extends the autonomous-math thread from the May 21 planar-unit-distance result.
“9 of 353 open Erdős problems, at an inference cost of a few hundred dollars per problem.”
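The generate-then-verify loop is simple to sketch once the checker is treated as a black box. This is an assumption about the general shape of such a system, not AlphaProof Nexus itself: `propose` stands in for the language model, `check` stands in for the Lean kernel, and both are hypothetical callables supplied by the caller.

```python
from typing import Callable, List, Optional


def prove(
    statement: str,
    propose: Callable[[str, List[str]], str],
    check: Callable[[str, str], Optional[str]],
    max_attempts: int = 5,
) -> Optional[str]:
    """Ask for candidate proofs until one passes the checker or the
    attempt budget runs out. `check` returns None on success, otherwise
    an error message that is fed back into the next proposal."""
    feedback: List[str] = []
    for _ in range(max_attempts):
        candidate = propose(statement, feedback)
        error = check(statement, candidate)
        if error is None:
            return candidate       # the checker accepted the proof
        feedback.append(error)     # let the next attempt see the failure
    return None                    # give up: nothing typechecked


# Stub model and checker, only to show the control flow.
def stub_propose(statement, feedback):
    return "good proof" if feedback else "bad proof"


def stub_check(statement, candidate):
    return None if candidate == "good proof" else "type mismatch"


result = prove("hypothetical statement", stub_propose, stub_check)
```

The key property is that the function can only return a candidate the checker accepted, so a wrong proof shows up as a failure to return rather than as a confident wrong answer.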
Simon Willison: OpenAI should publish GPT-4's retired architecture
Simon Willison (X)
Willison's argument: much of the widely cited 'bottle of water per email' figure rested on guesses about GPT-4's architecture, so publishing the real numbers for a now-retired, three-year-old model would let people reason from facts instead of leaks.
“Given how much of the original 'bottle of water per generated email' water estimate came from guesses at the architecture of GPT-4, it would be very much in OpenAI's interest to publish the architecture of that now-retired, three year old model.”
Pope Leo XIV's encyclical takes on AI in terms of work and dignity
Vatican — Magnifica Humanitas
A major non-industry institution weighs in on AI in terms of work and dignity rather than benchmarks. The encyclical warns that power over ourselves is concentrating in private hands rather than democratic ones, and insists work has dignity independent of productivity. Its line about technology is a sharp counterpoint to the 'tools are neutral' reflex common in engineering.
“Technology is never neutral, because it takes on the characteristics of those who devise, finance, regulate and use it.”
Companion episode
A few hundred dollars a proof, and the long argument about what machines are for
Two earlier threads come back today. The autonomous-math run we covered on May 21 now has a verified-proof sibling in AlphaProof Nexus, and Pope Leo XIV's encyclical — flagged here a week ago as due May 25 — arrived on schedule. Read next to the Meta court records, they circle one problem: when a model produces a result or a corpus, who gets to check it, and on whose terms.