◆ Dispatch 027 · 2026-05-15 GSV Bugmageddon Forecast
Five Days to Root, Four Months in Exile
“The shape of the work changes. The standards of craft don't.”
— Lenar Kess, today's narration
Five days for a small security team paired with Mythos Preview to land the first public macOS kernel exploit on Apple's M5 with Memory Integrity Enforcement turned on. Four months for Replit to claw back into the iOS App Store. In between: arXiv starts banning authors of LLM-error papers, Metabase explains why open-source security is being strip-mined this summer, NVIDIA squeezes the 5090, Uncle Bob switches from Claude to Codex, and a pure-OCaml protocol stack boots in low Earth orbit.
- Codex everywhere, Claude in the rearview — OpenAI ships Codex inside the ChatGPT mobile app, Uncle Bob cancels his Claude account, and Arvind Narayanan names the irony underneath both.
- Five days to a kernel exploit on M5 — Calif and Mythos Preview crack Apple's Memory Integrity Enforcement and hand-deliver the 55-page report to Cupertino.
- The strip-mining era of open source security — Metabase's security inbox went from ten reports a month to ten a week. Cal.com is going closed source.
- arXiv bans authors of LLM-error papers — Tom Dietterich announces a one-year submission ban on papers with hallucinated references or results.
- Replit out of the App Store wilderness — Four months after being pulled, Replit's iOS app is published again. Replies note what that says about platform power.
- GDDR7 squeezes the 5090 — A 300-dollar price hike to add-in-card partners as GDDR7 lead times stretch into weeks.
- The web's secret quirks file — Den Odell walks through Safari's Quirks.cpp and Firefox's about:compat. Chrome doesn't need a quirks file.
- OCaml in orbit — Thomas Gazagnaire's pure-OCaml protocol stack booted in low Earth orbit on April 23, with post-quantum rekeying and OxCaml-tuned dispatch.
Chapters
- 00:00:04 Codex everywhere, Claude in the rearview
- 00:03:34 Five days to a kernel exploit on M5
- 00:07:04 The strip-mining era of open source security
- 00:10:25 arXiv bans authors of LLM-error papers
- 00:13:33 Replit out of the App Store wilderness
- 00:16:38 GDDR7 squeezes the 5090
- 00:20:22 The web's secret quirks file
- 00:24:06 OCaml in orbit
Sources
11 cited
1
First public macOS kernel memory corruption exploit on Apple M5
Article Calif (Bruce Dang, Dion Blazakis, Josh Maine) — Small security research firm that paired with Mythos Preview on the bug-finding and exploit-development workflow.
Apple spent five years building hardware and software to make memory corruption exploits dramatically harder. Our engineers, working together with Mythos Preview, built a working exploit in five days.
blog.calif.io/p/first-public-kernel-memory-…
- Context
- A concrete data point on how fast model-plus-expert pairs can collapse the calendar against new hardware mitigations, and a follow-up to yesterday's UK AI Security Institute coverage of accelerating autonomous cyber capability.
- Key points
- Data-only kernel local privilege escalation chain on macOS 26.4.1 (build 25E253) targeting bare-metal M5 with kernel MIE enabled
- Bugs found April 25 by Bruce Dang; full working exploit by May 1, paired with Mythos Preview
- Mythos Preview generalizes within a known bug class but did not autonomously bypass MIE — human expertise closed the gap
- Calif claims this is the first public macOS kernel exploit on MIE hardware; a 55-page report was hand-delivered to Apple Park
- Frames the moment as the start of 'AI bugmageddon' for hardware mitigations like Memory Tagging Extension
- Provenance
- Article · Supporting source
2
Welcome to the strip mining era of open source security
Article Metabase team — Open-source BI vendor; their security inbox is a representative sample of what's hitting commercial OSS projects this spring.
Historically, Metabase averaged 10 submissions per month. Starting in January, we've been averaging 10 submissions per week, and many of these are legit.
www.metabase.com/blog/strip-mining-era-of-o…
- Context
- Reframes the economics of being open-source in 2026: the historical security advantage of public-eyes-on-code is dissipating as agents commoditize code review, and commercial OSS projects are starting to close their doors.
- Key points
- Volume of vulnerability reports moved from ~10/month to ~10/week starting January 2026, most reading like LLM output
- No single vendor or model is driving it — coding agents in general have crossed a code-reading threshold
- Cal.com is going closed source as a direct response; more commercial OSS projects expected to follow
- Maintainer playbook: assume every disclosed vuln is trivially discoverable, drop weekend plans, patch immediately
- User-facing advice: pin dependencies, upgrade aggressively, defense in depth, least privilege, log everything
- Provenance
- Article · Supporting source
3
arXiv implements 1-year ban for papers containing incontrovertible evidence of unchecked LLM-generated errors
Source u/Nunki08 (paraphrasing Tom Dietterich) — Top-voted MachineLearning subreddit post quoting Tom Dietterich, the arXiv moderator for cs.LG, on a new enforcement policy.
By signing your name as an author of a paper, each author takes full responsibility for all its contents, irrespective of how the content was generated.
www.reddit.com/r/MachineLearning/comments/1…
- Context
- Marks a turning point where the citation graph itself starts treating LLM-produced garbage as fraud rather than tolerable noise — and forces every other publication venue to decide whether to do the same.
- Key points
- arXiv will impose a one-year submission ban on authors of papers with hallucinated references, hallucinated results, or leftover model output
- The bar is 'incontrovertible evidence' — fake DOIs, citations to nonexistent papers, results referencing experiments that weren't run
- Policy is driven by a sharp increase in volume across cs.LG, not isolated incidents
- Open question: whether banned authors will be named publicly; the deterrent effect depends on it
- Other preprint servers and venues are likely to follow
- Provenance
- Source · Background source
4
Replit iOS app back on the App Store after four months
X @amasad (Amjad Masad) — Founder and CEO of Replit, the highest-profile agentic-coding-for-everyone product on the market.
We worked things out with Apple, and just published our app for the first time in 4 months.
x.com/amasad/status/2055185058282226146
- Context
- A concrete reminder that platform gatekeeping, not model capability, sets the actual ceiling for shipping agentic products to consumers — and the resolution path is opaque even for a well-funded company.
- Key points
- Replit's iOS app was pulled or paused for four months; details of what Apple objected to have not been disclosed
- Resolution announced May 15 with no public lesson-learned post
- Replies frame app review as the real ceiling for agentic AI apps on mobile
- Houman Asefi: 'App Store, cloud credits, GPUs, and payment rails are the actual choke points'
- Robertus: 'Mobile is where agent workflows stop being a demo and start becoming something you check between errands'
- Provenance
- Tweet · Primary source
5
Uncle Bob switches from Claude to Codex
X @unclebobmartin (Robert C. Martin) — Long-time developer-craft author (Clean Code, Clean Architecture); audience skews senior practitioner.
Less wordy. More down to earth. More direct. A bit less risk averse — which I consider to be an advantage because I am the guarantor, not it.
x.com/unclebobmartin/status/205497032759204…
- Context
- A single anecdotal switch, but from someone whose default audience is senior developers; pairs with OpenAI's Codex mobile launch as a signal of where the developer-tooling axis is sitting this week.
- Key points
- High-profile developer publicly cancels Claude account after weeks of using Codex exclusively
- Cites tone, directness, and willingness to be 'adventurous' — calls it a vibe choice
- Reports running 8–9 hour Codex sessions without hitting limits
- Points to his own swarm-forge GitHub repo, a multi-agent coordinator
- Engagement: 2,040 likes, 137,000 views in 20 hours
- Engagement
- 2040 likes · 119 retweets · 161 replies
- Provenance
- Tweet · Primary source
6
Arvind Narayanan on the verification challenge
X @random_walker (Arvind Narayanan) — Princeton computer science professor; co-author of AI Snake Oil, frequent commenter on the gap between AI marketing and AI use.
The harder AI companies try to make their products feel like magic genies, the steeper the learning curve gets.
x.com/random_walker/status/2055271764662296…
- Context
- Names the asymmetry behind every agentic-coding product decision this year: confident-sounding output raises the cost of catching the model's mistakes, and that cost is on the human.
- Key points
- Frames the irony of magic-genie product design making real use harder, not easier
- 'Prompt engineering may no longer be a thing, but the verification challenge isn't going away'
- Verification requires practice and learning — it's not a UX problem you can paper over
- Lands on the same point Bob Martin makes from the user side: the user is the guarantor
- Provenance
- Tweet · Primary source
7
Codex for Everyday Work: AI Agents Beyond Coding
Video OpenAI — OpenAI Forum conversation with Chris Nicholson (Global Affairs) and Thibault Sottiaux (Head of Codex), May 14, 2026.
Codex began as a tool for developers. Today, people are using it for much more: research, planning, file organization, automation, data analysis, presentations.
www.youtube.com/watch?v=DLP9CagE3dU
- Context
- The mobile launch and the broader-than-coding pitch are happening in the same week; OpenAI is positioning Codex as the everyday-work surface, not the developer-only one.
- Key points
- OpenAI is publicly broadening Codex's positioning from developer tool to knowledge-work agent
- Companion mobile launch puts Codex in the ChatGPT app for iOS and Android
- Sottiaux: users now start, steer, and review Codex jobs from a phone while compute runs on a remote machine
- Frames Codex as the front-end to long-running agent work on shared infrastructure
- Provenance
- Video · Supporting source
8
NVIDIA Reportedly Prepares RTX 5090 Price Hike Amid Rising GDDR7 Costs
Article AleksandarK — TechPowerUp reporter; original report sourced to Chinese Board Channels, a supply-chain leak feed.
A $300 (about 2,000 RMB) increase for NVIDIA's add-in card (AIC) partners, who purchase these GPUs from NVIDIA.
www.techpowerup.com/349050/nvidia-reportedl…
- Context
- Cost of the local-AI hobbyist's default card just stepped up at the same time hosted-model pricing tiers are rising; the all-in cost of doing agentic work yourself versus paying a vendor is being re-priced.
- Key points
- NVIDIA passing a $300 GPU-kit price increase to add-in-card partners for RTX 5090 and 5090D V2
- Driven by GDDR7 supply tightness; lead times running into weeks
- MSRP nominally $1,999; street prices on Newegg regularly cross $4,000
- Founders Edition restocks on NVIDIA's own marketplace remain the only path near MSRP
- Hike will likely show up at retailers in days or weeks
- Provenance
- Article · Supporting source
9
Browsers Treat Big Sites Differently
Article Den Odell — Front-end engineer and writer; this piece is a tour through Firefox's about:compat and WebKit's Quirks.cpp source.
Facebook, X (twitter), and Reddit will naively pause a video element that has scrolled out of the viewport, regardless of whether that element is currently in PiP mode.
denodell.com/blog/browsers-treat-big-sites-…
- Context
- Working theory of modern web compatibility laid out with primary source code: Chrome sets the agenda, other engines maintain quirks files. Worth knowing if you ship to browsers.
- Key points
- Safari and Firefox both ship domain-specific rendering overrides; Chrome doesn't
- Firefox exposes its overrides as togglable interventions at about:compat
- WebKit's Quirks.cpp ships verbatim user-agent strings impersonating Chrome for Amazon Prime Video and other sites
- Specific quirks for TikTok, Netflix, Instagram, Zillow, SeatGuru, and Amazon product zoom
- Chrome's market dominance makes its undocumented behaviors the de facto spec other engines must paper over
- Provenance
- Article · Supporting source
10
O(x)Caml in Space
Article Thomas Gazagnaire — Co-founder of Parsimoni (space software spinout from Tarides) and long-time OCaml/MirageOS contributor.
Switching to OxCaml with exclave_ stack_ annotations drops p99.9 latency from 29 ns to 9 ns per packet on the dispatch hot path, and removes GC pressure entirely.
gazagnaire.org/blog/2026-05-14-borealis.html
- Context
- A working counter-example to the 'pick whatever Python ships with' default — small team, language-rigour-first stack, formally verified components, actual production hardware in actual orbit.
- Key points
- Pure-OCaml CCSDS protocol stack 'Borealis' booted in low Earth orbit on April 23, 2026 inside DPhi Space's ClusterGate-2
- End-to-end-encrypted command and control with post-quantum signing (ML-DSA-65) and over-the-air rekeying
- Wire formats as typed schemas, GADT-encoded state machines, formally verified crypto primitives (libcrux, fiat-crypto)
- OxCaml mode-system annotations (locality, uniqueness) drop p99.9 latency from 29 ns to 9 ns and eliminate GC pressure on the dispatch hot path
- Five-to-ten-MB statically linked flight binary, FROM scratch Docker image, running on a four-core Cortex-A53 module
- Provenance
- Article · Supporting source
11
LocalLLaMA discussion on the 5090 price hike
Source u/panchovix (LocalLLaMA subreddit) — Top thread on LocalLLaMA reacting to the TechPowerUp report; representative of the local-inference hobbyist take.
NVIDIA Reportedly Prepares RTX 5090 Price Hike Amid Rising GDDR7 Costs (maybe RTX 50 and PRO series as well)
www.reddit.com/r/LocalLLaMA/comments/1td9eh…
- Context
- Sentiment check from the people whose monthly Codex/Claude bills are being directly traded against GPU purchase decisions — the price hike accelerates conversations about running quantized models locally.
- Key points
- 356 upvotes, 160 comments — surface-level frustration plus pragmatic 'glad I bought mine last year' chorus
- Same subreddit is concurrently celebrating the RTX 5000 Pro 48GB as the new serious-hobbyist ceiling
- Local-inference community treats the consumer-vs-datacenter GPU competition as a permanent state, not a temporary squeeze
- Provenance
- Source · Background source
Codex everywhere, Claude in the rearview
00:00:04 Yesterday OpenAI rolled Codex out in preview inside the ChatGPT mobile app on iOS and Android. The pitch in their own video is short: start, steer, unblock, and review Codex work from your phone while Codex keeps running on your computer, Mac mini, devbox, or managed remote environment, with your files and project context still in place.
00:00:26 You approve commands, check progress, and move on with your day. A separate OpenAI Forum conversation, posted the same afternoon, has Chris Nicholson from Global Affairs talking with Thibault Sottiaux — who runs Codex at OpenAI — about how the tool has stopped being a developer thing.
00:00:44 Sottiaux says people are using it for research and planning, file organization, automation, data analysis, and presentations. Nate Jones, who's been doing daily AI commentary, posted a short clip teasing his Saturday Substack with Sottiaux titled "Codex Just Valued A Random House In India." That's the frame OpenAI is pushing this week — Codex doing real-estate appraisal on a phone.
00:01:10 The day before, Uncle Bob Martin posted two sentences that drew 137,000 views and 2,000 likes: "I just cancelled my Claude account. I've been using codex, and haven't used Claude in several weeks." Ivan Tokar asked why. Bob answered: "Less wordy. More down to earth.
00:01:27 More direct. A bit less risk averse — which I consider to be an advantage because I am the guarantor, not it. I want it to be a bit adventurous. Kind of like driving a hot sports car. It's a vibe choice." Worth listening to anyway, because the framing — I am the guarantor, not it — is the actual mental model for using these tools well.
00:01:52 Bob isn't asking the agent to be safe. He's asking it to be useful, and he'll absorb the risk himself. He also reports running 8 to 9 hour Codex sessions without running out, and pointed people at his own little orchestrator called swarm-forge, a simple tool for coordinating several AI agents.
00:02:12 A retired developer-craft author maintaining his airport flight school's status board and writing a multi-agent coordinator on the side: that's where the user base is now. Arvind Narayanan, from Princeton, framed the tension behind this whole shift in a single tweet this morning: "A big irony: the harder AI companies try to make their products feel like magic genies, the steeper the learning curve gets.
00:02:38 Prompt engineering may no longer be a thing, but the verification challenge isn't going away, and it requires a lot of practice and learning to do." Codex feels less wordy because OpenAI's product team made it less wordy. Pleasant, snappy, confident. The cost of that confidence is that you have to be a better reader of its output, because nothing about the tone tells you when it's wrong.
00:03:07 Bob accepts that bargain explicitly. Most users don't, and most users will be hurt by it before they learn. The open question for me is whether Codex in your pocket actually lands as a workflow or stays a demo. The Replit reply guy nailed the test: mobile is where agent workflows stop being a demo and start becoming something you check between errands.
00:03:31 We'll see how many people are actually checking.
Five days to a kernel exploit on M5
00:03:34 A small security team called Calif published a write-up yesterday titled "First public macOS kernel memory corruption exploit on Apple M5." The opening line is the kind of thing that ages a security architecture in a single sentence: "Apple spent five years building hardware and software to make memory corruption exploits dramatically harder.
00:03:56 Our engineers, working together with Mythos Preview, built a working exploit in five days." Memory Integrity Enforcement is the marquee security feature for the M5 and the A19. Apple has been clear about its purpose: stop the class of memory corruption bugs that drove the bulk of the most sophisticated iOS and macOS compromises of the last decade.
00:04:26 Their own research, per the Calif post, said the system disrupts every public exploit chain against modern iOS — including the leaked Coruna and Darksword commercial kits. Calif's chain is a data-only kernel local privilege escalation on macOS 26.4.1, build 25E253.
00:04:43 Unprivileged user in, root shell out, two vulnerabilities, several techniques, bare-metal M5 with kernel-level Memory Integrity Enforcement turned on. Bruce Dang found the bugs on April 25. Dion Blazakis joined Calif on April 27. Josh Maine built the tooling. By May 1 they had a working exploit, and earlier this week they hand-delivered a laser-printed copy of the 55-page report to Apple Park.
00:05:09 The piece that matters for our purposes is the role of the model. Quoting them directly: "Mythos Preview helped identify the bugs and assisted throughout exploit development. Once it has learned how to attack a class of problems, it generalizes to nearly any problem in that class.
00:05:27 Mythos discovered the bugs quickly because they belong to known bug classes. But Memory Integrity Enforcement is a new best-in-class mitigation, so autonomously bypassing it can be tricky. This is where human expertise comes in." Mythos found the bugs by pattern-matching to existing classes, and human experts did the new and hard part — building a bypass for a mitigation that's only a few months old in the wild.
00:05:56 That's the same shape we've been watching all spring. The UK AI Security Institute's recent Mythos checkpoint review — which we covered yesterday — also landed on this pairing: the model surfaces candidates, the senior engineer closes the deal, and the calendar collapses.
00:06:14 Two follow-ups from Thursday's show land in this story. We promised to track real-world cost impact and to verify whether the AISI tested checkpoint matches the one deployed under Project Glasswing. I don't have a verified answer on Glasswing yet; the Calif post calls the model "Mythos Preview," which is the public preview tier, not the AISI testbed build.
00:06:37 I'll keep watching for someone to publish a checkpoint hash that ties the two together. The other line from the post is the one I keep coming back to: "Apple built Memory Integrity Enforcement in a world before Mythos Preview. We're about to learn how the best mitigation technology on Earth holds up during the first AI bugmageddon." The word — bugmageddon — is theirs, not mine.
00:07:02 But it sets up the next segment.
The strip-mining era of open source security
00:07:04 Metabase shipped a post titled "Welcome to the strip mining era of open source security." Read the whole thing if you maintain anything in public. Here's the spine. Historically Metabase got about 10 security submissions a month, and most were trivial or false positives.
00:07:21 Starting in January, they've been averaging 10 a week. Many are legitimate. Most are markdown reports that read like they were generated by Claude or Codex. There's no one vendor or model driving it — coding agents have crossed a threshold for reading codebases at volume.
00:07:38 They lay out the new playbook for solo security researchers in three bullets, which I'll paraphrase. One: wrap Claude or Codex in a bunch of skills and turn that into a SaaS offering. Two: bulk scan commercial open-source repositories. Three: send every finding with a footer advertising your scanning service to the commercial OSS company and any big users you can scrape from their website.
00:08:02 They estimate there are now about a thousand of these scanning SaaS offerings competing for the same bounties. The structural piece is harder to swallow. From the post: "If your code is available, you're going to be living in reactive mode for a while. Closed source developers get to find and fix issues on their own schedule, and at least in theory are able to get ahead of third party researchers.
00:08:26 With coding agents being the main driver, this advantage [being open source] largely dissipates." Cal.com is going closed source in response. There will probably be more. The maintainer-facing advice is uncomfortable but correct. Assume any vulnerability disclosed to you is trivially discoverable now.
00:09:01 If one researcher running Claude Code found it, another running Codex will too. Treat every report as already in the wild. Cancel your weekend plans. Patch and ship. For users of open-source software, they recommend treating every dependency as if a new vulnerability will be disclosed against it this quarter.
00:09:20 Pin everything, upgrade frequently, and practice defense in depth. Watch your logs. Hold credentials to the minimum privilege you can get away with. There's a parallel to draw to the Calif story from the previous segment. Same week, same dynamic, and two different altitudes. At the top of the difficulty curve, you get human experts paired with a model cracking the world's best mitigation in five days.
00:09:43 At the bottom, you get a thousand SaaS scanners flooding maintainer inboxes with grep-quality findings. What ties them together is that humans no longer have to be involved at every step. The expensive part of vulnerability research used to be the time of senior people; that time is now spendable on the parts they're uniquely good at, and machines flood the rest.
00:10:05 If you maintain anything popular on GitHub, this summer is going to be unpleasant. If you maintain something popular and commercially backed, you have a path: pay people to triage at 4 a.m. on a Saturday, or — like Cal.com — close the source. Neither of those is good for the ecosystem.
00:10:23 Both are rational responses to where we are.
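The "pin everything" advice from this segment can be made concrete with a small check. What follows is a minimal sketch, not anything from the Metabase post: it assumes a pip-style requirements file, and the helper name `unpinned_requirements`, the regular expression, and the sample entries are all hypothetical, chosen only for illustration. It flags any dependency that is not locked to an exact `==` version.

```python
import re

# Hypothetical rule for this sketch: a requirement counts as "pinned"
# only when it names one exact version with '=='.
PINNED = re.compile(r"^[A-Za-z0-9._\[\],-]+==[A-Za-z0-9._+!-]+$")


def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that are not pinned to an exact version."""
    loose = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options (-r, -e)
            continue
        line = line.split(";", 1)[0].strip()  # ignore environment markers
        if not PINNED.match(line):
            loose.append(line)
    return loose


# Illustrative sample data, not a real project's dependency list.
SAMPLE = """\
requests==2.32.3
flask>=3.0
numpy
cryptography==42.0.5  # pinned
"""

if __name__ == "__main__":
    for entry in unpinned_requirements(SAMPLE):
        print("not pinned:", entry)
```

A check like this only covers direct dependencies listed in one file. Pinning transitive dependencies usually needs a lock file, which this sketch does not attempt.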
arXiv bans authors of LLM-error papers
00:10:25 Yesterday afternoon, Tom Dietterich posted what arXiv is now doing about papers that contain unchecked errors generated by large language models. Dietterich is the arXiv moderator for computer science and machine learning — a name you know if you've spent any time in the field.
00:10:43 The post on the MachineLearning subreddit has 417 upvotes and reads like the field had been waiting for this. Dietterich, quoting from the announcement: "Our Code of Conduct states that by signing your name as an author of a paper, each author takes full responsibility for all its contents, irrespective of how the content was generated.
00:11:03 We have observed a sharp increase in papers containing incontrovertible evidence of unchecked LLM-generated errors, such as hallucinated references or hallucinated results." The penalty is a one-year ban on submissions. Not a withdrawal, not a flag, and not a please-fix-and-resubmit. A ban. The bar is incontrovertible evidence, which in practice means citations to papers that don't exist, fake DOIs, results tables referencing experiments that weren't run.
00:11:34 Reviewer comments left in the manuscript. Sentences that begin "as a language model I cannot." That genre. This is the right move and it took longer than it should have. arXiv is a preprint server; it has historically taken a light touch on what gets through, because the priority was velocity and openness.
00:11:53 But the cost of accepting garbage at submission time has shifted. Every paper that lands on the machine-learning section with a hallucinated citation list trains a generation of grad students to copy that pattern. Every paper that gets cited by another paper passes the error downstream.
00:12:11 And the reviewer pool — overworked even in the best years — does not have the capacity to be the last line of defense. A one-year ban is a real cost. arXiv submissions are how most ML papers get a date stamp, how priority claims work, and how the citation graph gets built.
00:12:28 Cutting an author off for a year is roughly equivalent to telling them they can't publish a workshop paper at NeurIPS this cycle. That's a stick worth swinging. What I'm watching for: whether arXiv publishes the names of the banned authors, or keeps the enforcement private.
00:12:45 The deterrent effect is much higher with public names. I'd also like to see what counts as incontrovertible. There's a real difference between the model wrote three citations that don't exist and this proof has a step that doesn't follow. The first is unambiguous fraud.
00:13:01 The second is just bad math, which has always slipped through preprint review. The deeper signal, separate from the policy itself, is that the people running arXiv's machine-learning section had to do this. The volume of submitted-and-pulled garbage was high enough that they had to write a rule.
00:13:20 If you maintain a publication venue of any kind — a security disclosure inbox, a CFP, a journal — assume you're next. The strip-mining post from Metabase and the arXiv ban are the same story told from two different sides.
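For anyone on the receiving end of submissions, one cheap first pass on the "fake DOIs" problem mentioned in this segment is a shape check. The sketch below is an assumption-laden illustration, not arXiv's process: the function name `malformed_dois` and the sample list are hypothetical, and the pattern is the one Crossref has recommended for matching modern DOIs. It catches only malformed strings. A well-formed DOI can still point at nothing, and confirming that requires a lookup against a resolver, which is out of scope here.

```python
import re

# Crossref-recommended pattern for modern DOIs, matched case-insensitively.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/[-._;()/:A-Z0-9]+$", re.IGNORECASE)


def malformed_dois(dois):
    """Return the entries that do not even look like a DOI."""
    return [d for d in dois if not DOI_PATTERN.match(d.strip())]


# Illustrative inputs only.
SAMPLE_DOIS = [
    "10.1000/xyz123",             # well-formed
    "10.48550/arXiv.1706.03762",  # arXiv-style DOI format
    "doi:10.1234",                # stray scheme prefix and no suffix
    "11.1234/abc",                # DOIs always start with "10."
]

if __name__ == "__main__":
    for bad in malformed_dois(SAMPLE_DOIS):
        print("suspicious DOI:", bad)
```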
Replit out of the App Store wilderness
00:13:33 This morning Amjad Masad, Replit's CEO, posted: "We worked things out with Apple, and just published our app for the first time in 4 months. Thanks to all our customers and creators who helped out. It's been a journey, but we never give up and stay winning." There's a screenshot of the App Store listing with Agent 4 enabled and a what's-new copy.
00:13:55 Four months. Replit is the most public agentic-coding-for-everyone product on the market, and they were locked out of the world's most lucrative app store for a third of a year. Masad isn't saying what Apple objected to. Replies are asking, and getting no concrete answer beyond "we worked it out." Vugar Usi asked what the biggest lesson was.
00:14:17 No reply. The replies are where the texture lives. Hanzi: "App review as the final boss for agent apps is too real. Glad it escaped." Houman Asefi: "Every AI app founder screaming about models is about to learn the App Store, cloud credits, GPUs, and payment rails are the actual choke points.
00:14:35 Welcome to the utility era." Robertus: "Mobile is where agent workflows stop being a demo and start becoming something you check between errands." App review for an agent app is hard. Apple's reviewers are looking at a product that can write and execute arbitrary code on a server you provision, which from a sandboxing perspective is an unrestricted code-execution environment available to children.
00:15:02 There's no shape of that experience that fits cleanly into Apple's prior categories. Chrome ships on iOS only because Apple forces third-party browsers to use WebKit. Code-execution apps don't have an equivalent forced abstraction, and arguably they shouldn't, because the whole point is that the code runs.
00:15:21 So I'm not surprised it took four months. I'm a little surprised it took only four months, and I'd love to know whether the resolution involves new restrictions on what Replit Agent can do from the mobile app. Patrick, one of the repliers, who I think is being snarky, pointed out that four months of never giving up is a competitor shipping the same feature four times.
00:15:45 That's the platform tax. If you're building an AI app and you depend on Apple for distribution, build that tax into your roadmap. A Replit user named RAMAKRISHNA asked something more interesting to me as a builder: why use Replit at all if Codex, Claude Code, and Grok Build exist?
00:16:02 The answer, which Masad didn't give, is that Replit is the only one of those that runs end-to-end on a phone, with one-tap deployment, for non-developers. The audience is different. The customer is someone who has an idea, not a repo. Whether that audience is durable as the model-makers move down the stack toward consumers is, I think, the question the Replit business is answering this year.
00:16:27 For now: the app is back. The agentic-coding-on-mobile story has its showcase again. And every AI app founder watching just got a free lesson on what the actual ceiling looks like.
GDDR7 squeezes the 5090
00:16:38 TechPowerUp picked up a report yesterday from the Chinese forum Board Channels that NVIDIA is preparing to raise the price of the GeForce RTX 5090 and the RTX 5090D V2 by about 300 US dollars to its add-in-card partners. The cost driver is GDDR7. Lead times are running into weeks, and NVIDIA, which buys the memory in bulk and sells it to partners in a kit bundled with the GPU die, has been absorbing the increases.
00:17:06 They've stopped absorbing them. If you've shopped for a 5090 lately, you know the price has been theatrical for months. NVIDIA still lists 1,999 US dollars as MSRP (manufacturer's suggested retail price) for the Founders Edition. Newegg listings regularly cross 4,000 US dollars.
00:17:25 The only path back to MSRP is to camp on NVIDIA's marketplace for a Founders Edition restock and hope. The local-AI community is reading this the way you'd expect. The LocalLLaMA subreddit thread on the price hike has 356 upvotes and 160 comments, mostly variations on glad-I-bought-mine-last-year.
00:17:45 The same subreddit yesterday surfaced a glowing first-impressions post on the RTX 5000 Pro with 48 gigabytes of VRAM, the workstation card people who want to run larger local models have been migrating to. That card is roughly 6,000 US dollars, which used to feel absurd and now feels merely steep next to a 4,000-dollar consumer card that ships with 32.
00:18:09 The shape of the local-model hardware market in spring 2026, after a year of agentic coding making everyone want more context and more parameters at home, looks like this. The 5090 was supposed to be the affordable enthusiast option. It is no longer affordable.
00:18:26 The Pro 48-gigabyte card is the new ceiling for serious hobbyists. The Mac Studio with a stacked Apple-silicon SKU remains the only one of these options where you can buy the hardware at a price that resembles the sticker. I bring this up because builders' decisions about whether to run a local model versus a hosted one are about to get re-priced.
00:18:50 If you've been training small models on your own box, or running a quantized 70-billion-parameter model under Ollama for your editor agent, the cost of the hardware refresh just went up — right as Anthropic moved Claude Code into higher tiers and OpenAI started metering Agent SDK credits in earnest.
00:19:10 There is no cheap path. There's a personal-hardware path and there's a hosted path, and both are expensive in different ways. For me, the more interesting line in the LocalLLaMA thread isn't the price complaint. It's a comment from someone who said they're now running their personal coding agent against a local quantized Qwen build instead of paying for an API tier, because the latency feels right and the model is good enough.
00:19:39 The two-year-old assumption that hosted frontier models are strictly better is fraying. Eight billion parameters of well-tuned local model handles a lot of code review, and the round trip from typing a prompt to seeing the edit never leaves your machine, so it adds no network latency at all. That's the trade that's starting to look attractive at three figures of monthly Codex or Claude billing.
00:20:04 A 300-dollar price hike on a 4,000-dollar card is, as numbers go, modest. As a signal about where the GPU supply chain is sitting — with GDDR7 weeks behind, hyperscaler buys taking priority, and consumers competing against datacenters — keep it in your peripheral vision.
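For anyone who wants the numbers side by side, here is a back-of-the-envelope sketch. The 300, 4,000, and 1,999 dollar figures come from the segment above. The 200-dollar monthly hosted bill is an assumption, a stand-in for "three figures," and the sketch also assumes the partner-level hike passes through to the street price in full, which the report does not say.

```python
# Figures quoted in the segment above.
PRICE_HIKE_USD = 300
STREET_PRICE_USD = 4_000
MSRP_USD = 1_999

# ASSUMPTION: a hypothetical monthly hosted-model bill, for illustration only.
ASSUMED_MONTHLY_BILL_USD = 200


def percent_of(part: float, whole: float) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


def months_to_break_even(hardware_cost: int, monthly_bill: int) -> int:
    """Whole months of hosted billing needed to match a one-time hardware cost."""
    return -(-hardware_cost // monthly_bill)  # ceiling division


# How big the hike is relative to the street price.
hike_vs_street = percent_of(PRICE_HIKE_USD, STREET_PRICE_USD)

# How far the street price already sits above MSRP.
street_vs_msrp = STREET_PRICE_USD / MSRP_USD

# ASSUMPTION: the full hike lands on the buyer; power and resale are ignored.
break_even = months_to_break_even(
    STREET_PRICE_USD + PRICE_HIKE_USD, ASSUMED_MONTHLY_BILL_USD
)

print(f"hike vs street price: {hike_vs_street:.1f}%")
print(f"street price vs MSRP: {street_vs_msrp:.2f}x")
print(f"months of hosted billing to match the card: {break_even}")
```

Change the assumed monthly bill and the break-even moves quickly, which is the point: the hike itself is small next to the gap between MSRP and the street price.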
The web's secret quirks file
00:20:22 Den Odell wrote a piece this week titled "Browsers Treat Big Sites Differently," and if you've ever wondered why Safari handles the Amazon product zoom correctly but the same code looks slightly wrong somewhere else, this is the post for you. The claim, well-supported with source code: Safari and Firefox both ship code that checks which domain you're visiting and changes how the page renders based on it.
00:20:49 Chrome doesn't. The Firefox version of this is browseable. Type about:compat in the address bar and you'll see a list of interventions — site-specific CSS injections, JavaScript shims, and user-agent string overrides — each one a targeted fix for a specific website, with a toggle switch.
00:21:08 Turn the toggle off and watch the site break. The Safari version is called Quirks.cpp and it's in the WebKit source. Odell quotes one of the comments verbatim: "Facebook, X (twitter), and Reddit will naively pause a video element that has scrolled out of the viewport, regardless of whether that element is currently in PiP mode." So the browser detects when you're on facebook-dot-com, x-dot-com, or reddit-dot-com and changes how it handles picture-in-picture video.
00:21:40 Another comment, almost plaintive: "FIXME: remove this quirk if seatguru decides to adjust their site." The browser is fixing the airline seat-map site for them. The bit that got me is the user-agent spoofing. From the same file, here's a string baked into WebKit: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/143.0.0.0 Safari/537.36.
00:22:05 Safari literally ships with a fake Chrome user-agent string, ready to deploy when a site refuses to work otherwise. Firefox does the same. The streaming services that won't let non-Chrome users watch — Amazon Prime Video has been a repeat offender — are circumvented by the browser lying about what it is.
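To make the mechanism concrete, here is a minimal sketch of a per-domain user-agent override. It is not WebKit's or Firefox's code. The table, the function names, the `video.example` domain, and the default user-agent string are all hypothetical; only the Chrome string is the one quoted above.

```python
from urllib.parse import urlsplit

# The Chrome-style string quoted from Quirks.cpp in the segment above.
CHROME_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/143.0.0.0 Safari/537.36"
)

# ASSUMPTION: placeholder for the browser's honest user-agent string.
DEFAULT_UA = "ExampleBrowser/1.0"

# ASSUMPTION: hypothetical quirks table, domain -> user-agent override.
UA_OVERRIDES = {
    "video.example": CHROME_UA,
}


def matches_domain(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def user_agent_for(url: str) -> str:
    """Pick the user-agent string to send for this URL."""
    host = (urlsplit(url).hostname or "").lower()
    for domain, override in UA_OVERRIDES.items():
        if matches_domain(host, domain):
            return override
    return DEFAULT_UA


print(user_agent_for("https://www.video.example/watch"))
print(user_agent_for("https://example.org/"))
```

The subdomain check matters: matching on a bare string suffix would also catch unrelated hosts such as `notvideo.example`, so the sketch requires either an exact match or a leading dot.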
00:22:26 Odell's reading of why is the right one. Chrome doesn't need a quirks file because the web is built for Chrome. When eighty percent of the market uses a Chromium-based browser, developers test in Chromium, ship when it works in Chromium, and the rest of the field has to pattern-match its behavior to keep users from leaving.
00:22:47 Site breaks in Safari? WebKit engineers ship a workaround. Chrome changes how something works? Chrome just changes it, and everyone else either follows or breaks. This is the same dynamic we lived through with Internet Explorer in the early 2000s, and the comparison is instructive.
00:23:06 We spent years digging out of that hole, telling ourselves the cure was open standards. The standards did, in fact, get better. The HTML5 living spec is a much more honest document than what we had in 2003. But the standards-versus-implementation gap reasserted itself, just inside a different browser's quirks list.
00:23:27 I'm bringing this up on a Friday because if you build for the web, this is one of those facts about your runtime environment you should keep in mind even if you can't do anything about it. Your site might be working in Firefox specifically because someone at Mozilla wrote an if-statement checking your domain.
00:23:47 The fix is invisible. The console doesn't tell you. The error log doesn't tell you. The list is getting longer. Open your site in Firefox and Safari before you ship. Not occasionally — regularly. That's the whole moral. The quirks files exist because most teams don't do that.
OCaml in orbit
00:24:06 Let me end with a piece that has nothing to do with agents or model providers or app stores, because it's the kind of thing that reminds me why I got into this in the first place. Thomas Gazagnaire of Parsimoni published a post titled "OxCaml in Space." On April 23, their pure-OCaml protocol stack for spacecraft-to-ground communication, codename Borealis, booted in low Earth orbit inside DPhi Space's ClusterGate-2 payload module.
00:24:33 The first log line at 18:48:06 UTC: SpaceOS Borealis, by Parsimoni. CCSDS is the protocol family that links spacecraft to the ground. Borealis is a full pure-OCaml implementation of every layer, from radio framing up through the bundle protocol and the security extensions on top.
00:24:50 The wire formats are described as ocaml-wire codecs. The state machines are encoded as generalized algebraic data types, so the typechecker rejects invalid transitions at compile time. The cryptographic primitives are formally verified: libcrux, whose proofs go through F*, and fiat-crypto, whose proofs are written in Coq.
00:25:08 The whole stack does post-quantum signing and supports over-the-air rekeying, which Gazagnaire says will be the first public in-orbit demonstration of post-quantum rekeying. The flight binary is five to ten megabytes, statically linked, shipped as a Docker image FROM scratch.
00:25:25 The reason any of this matters: a hosted-payload satellite has multiple tenants sharing a single Linux kernel. Container isolation alone isn't enough, because kernel-level CVEs keep breaking tenant boundaries. Gazagnaire names Dirty Frag, Fragnesia, and Copy Fail — all from the last twelve months.
00:25:43 On a ground server you can patch and reboot. In orbit, you can't. The cryptographic envelope around each bundle is the only durable guarantee the spacecraft has that the ground hasn't been compromised. The OxCaml section is the one I'd send to any developer who likes to make their tools sing.
00:26:01 OxCaml is Jane Street's compiler branch with a mode system — locality, uniqueness, capabilities — that lets you mark allocations stack-bound so they never reach the heap. Gazagnaire benchmarks the CCSDS dispatch hot path: decoding a Space Packet header and routing by APID.
00:26:18 Stock OCaml against OxCaml, with two annotations on the per-packet record. The 99.9th-percentile latency drops from 29 nanoseconds to 9 nanoseconds per packet. Across twenty-five million packets, the stock version triggered 394 minor garbage collections and the OxCaml version triggered zero.
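For a sense of what that hot path does, here is a minimal sketch in Python rather than OCaml. The six-byte layout is the standard CCSDS Space Packet primary header; everything else, including the type, function, and handler names, is hypothetical and is not taken from Borealis.

```python
import struct
from typing import Callable, Dict, NamedTuple, Optional


class PrimaryHeader(NamedTuple):
    version: int                # 3 bits
    packet_type: int            # 1 bit: 0 = telemetry, 1 = telecommand
    has_secondary_header: bool  # 1 bit
    apid: int                   # 11 bits: application process identifier
    sequence_flags: int         # 2 bits
    sequence_count: int         # 14 bits
    data_length: int            # bytes in the data field (wire value + 1)


HEADER_SIZE = 6


def decode_primary_header(buf: bytes) -> PrimaryHeader:
    """Decode the 6-byte primary header from the start of a packet."""
    if len(buf) < HEADER_SIZE:
        raise ValueError("need at least 6 bytes for a primary header")
    # Three big-endian 16-bit words: identification, sequence, length.
    word0, word1, length_field = struct.unpack_from(">HHH", buf, 0)
    return PrimaryHeader(
        version=word0 >> 13,
        packet_type=(word0 >> 12) & 0x1,
        has_secondary_header=bool((word0 >> 11) & 0x1),
        apid=word0 & 0x7FF,
        sequence_flags=word1 >> 14,
        sequence_count=word1 & 0x3FFF,
        data_length=length_field + 1,  # the field stores length minus one
    )


Handler = Callable[[PrimaryHeader, bytes], object]


def dispatch(buf: bytes, routes: Dict[int, Handler]) -> Optional[object]:
    """Route a packet to the handler registered for its APID, if any."""
    header = decode_primary_header(buf)
    handler = routes.get(header.apid)
    if handler is None:
        return None  # no route for this APID; the caller decides what to do
    payload = buf[HEADER_SIZE:HEADER_SIZE + header.data_length]
    return handler(header, payload)


# Usage: a two-byte payload addressed to a hypothetical APID 0x123.
packet = struct.pack(">HHH", 0x0923, 0xC005, 1) + b"hi"
routes = {0x123: lambda header, payload: payload}
print(decode_primary_header(packet))
print(dispatch(packet, routes))
```

Note that this sketch allocates a fresh header object on the heap for every packet. That per-packet allocation is exactly the kind of garbage-collector traffic the OxCaml annotations in the benchmark are there to remove.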
00:26:37 The recipe is mechanical, in the best sense. Take a per-iteration heap allocation. Wrap it in stack-allocated form. Require the consumer to take the value as local. The type system proves the record can't escape the dispatch scope. The compiler emits no heap traffic.
00:26:53 The garbage collector has nothing to collect. Jitter goes away, which is the win on a payload module with hundreds of microseconds of jitter budget. This is the version of AI-era developer work I keep wanting to highlight. Not because there's a model in this story — there isn't, anywhere — but because a small team in Cambridge spent two Christmases hacking on a protocol stack written in a language nobody would have picked for embedded work fifteen years ago, and on April 23 it booted in orbit.
00:27:24 Most engineers will never ship code that runs at 400 kilometers of altitude. But the recipe — pick a language whose type system catches what your brain can't, write the codec as a typed schema, generate the wire format, hold the cryptographic primitives to a verified standard, prove the parts you can prove, and test the parts you can't — is exportable.
00:27:46 The agents we've been talking about all week are useful precisely because they let teams this small do things this hard. The shape of the work changes. The standards of craft don't. That's what I'll be holding in mind heading into the weekend. Talk Monday. Lenar Kess.