Dispatch 029 · 2026-05-17 GSV Bring Your Own Numbers

Bring Your Own Numbers

/ 00:25:59 / 8 sources

“The agents do the typing. The humans do the specification.”

— Lenar Kess, today's narration

A Sunday show about doing your own arithmetic. Mustafa Suleyman gives the white-collar tier eighteen months, in a piece whose own counter-data sits two paragraphs down. The State of Brand argues every AI subscription is a subsidized loss-leader two weeks away from a forcing function. William Angel runs the tokens-per-hour math on an M5 MacBook Pro and finds OpenRouter cheaper. Frederick Vanbrabant uses The Goal to explain why agents move the bottleneck rather than break it. Marlene Mhangami's Playwright talk shows the cleanest pattern for tests AI should write. Calif's public M5 / MIE exploit write-up lands. Artem Loenko explains why every chat UI keeps ending up on a browser engine. And Luke Lanchester's MCP hello page is the small fix I most enjoyed this week.

Chapters

  1. 00:00:04 Eighteen months, and the article that knocks it down
  2. 00:03:01 The subsidy that ends on June 1
  3. 00:06:50 Apple Silicon costs more than OpenRouter
  4. 00:09:28 AI doesn't speed up your process — it moves the bottleneck
  5. 00:12:18 Tests AI writes, tests AI verifies
  6. 00:15:26 The five-day exploit, in their own words
  7. 00:18:19 Native all the way, until you need text
  8. 00:21:18 The MCP hello page

Sources

8 cited
  1.

    Microsoft AI chief gives it 18 months — for all white-collar work to be automated by AI

    Article Jake Angelo — Fortune staff writer; piece is a May 16 re-publication of a Feb 13, 2026 story.

    "Creating a new model is going to be like creating a podcast or writing a blog."

    fortune.com/article/why-microsoft-ai-chief-… →
    Context
    A senior figure restating the maximalist 18-month timeline, in a piece that includes the empirical pushback in the same article, is a useful artifact for calibrating between AGI marketing and what is shipping on production codebases.
    Key points
    • Mustafa Suleyman, CEO of Microsoft AI, told the Financial Times that 'most tasks that involve sitting down at a computer' will be fully automated within 12-18 months — naming accounting, legal, marketing, and project management.
    • Fortune's own piece notes the take 'hasn't aged well': a 2025 Thomson Reuters report on professional services found only marginal productivity gains, and a METR study found AI made software developer tasks take 20% longer.
    • Apollo Global Management economist Torsten Slok found Big Tech Q4 2025 profit margins up 20%+ while the broader Bloomberg 500 saw almost no change — investors don't expect AI earnings outside tech.
    • Challenger Gray & Christmas counts ~49,135 AI-related job cuts so far in 2026; Microsoft itself let go 15,000 in 2025 with Nadella citing a 'new era.'
    • Anthropic's Dario Amodei walked back his 2025 'half of entry-level white-collar jobs' warning; the prediction drumbeat is starting again.
    Provenance
    Article · Supporting source
  2.

    Every AI Subscription Is a Ticking Time Bomb for Enterprise

    Article The State of Brand

    "They are selling enterprises filet mignon at gas station hot dog prices and calling it a business model."

    www.thestateofbrand.com/news/ai-subscriptio… →
    Context
    If a team built workflows on $20-a-month seats over the last two years, the gap between subscription price and true cost is a line item every engineering org should be modeling now.
    Key points
    • The piece argues every major AI lab is running a coordinated loss-leader: $20/mo Claude Pro or ChatGPT Plus seats whose actual API-rate burn for a moderate user is $200-400/mo.
    • Cites GitHub Copilot reportedly losing $20+/user/mo on a $10 plan; power users hitting $80/mo of compute on $10 subscriptions; Anthropic users consuming ~$8 of compute per $1 of subscription revenue.
    • GitHub Copilot is moving to usage-based 'AI Credits' billing on June 1, 2026 — GitHub's own announcement attributes the change to agentic usage becoming the default.
    • Quotes OpenAI VP Product Nick Turley calling subscription pricing something they 'stumbled into,' comparing flat plans to 'unlimited electricity.'
    • Anthropic at ~$30B annualized revenue (up from $9B end of 2025); OpenAI ~$25B and projecting $115B cumulative cash burn through 2029, with $665B committed compute spend by 2030; Oracle taking $43B of debt in a year to build OpenAI's data centers.
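The gap described in these key points can be turned into a budget line. A minimal sketch, assuming the piece's reported figures ($20/mo seats, $200-400/mo of API-rate burn for a moderate user) and a hypothetical 50-seat team; the function name and the seat count are illustrative and do not come from the article.

```python
def monthly_repricing_exposure(seats, seat_price, api_rate_burn):
    """Extra monthly spend if flat-rate seats were billed at API rates instead."""
    return seats * (api_rate_burn - seat_price)


SEATS = 50  # hypothetical team size, not from the article

# $20 seat vs. the article's $200-400/mo API-rate burn estimate
low = monthly_repricing_exposure(SEATS, seat_price=20, api_rate_burn=200)
high = monthly_repricing_exposure(SEATS, seat_price=20, api_rate_burn=400)

print(f"exposure: ${low:,} to ${high:,} per month")
```

Swapping in your own seat count and measured usage is the point; the article's burn figures are estimates, not invoices.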
    Provenance
    Article · Supporting source
  3.

    Apple Silicon costs more than OpenRouter

    Article William Angel — Offline Agentic Coding series, part 3.

    "For apple silicon, the hardware cost dominates."

    www.williamangel.net/blog/2026/05/17/offlin… →
    Context
    Counterintuitive math for builders weighing on-device inference: depreciation, not electricity, sets the floor — and hosted APIs still win on raw cost per token until human time enters the calculation.
    Key points
    • A 14-inch MacBook Pro with M5 Max and 64GB lists at $4,299; amortized over 3, 5, and 10 years that is ~$0.16, $0.10, and $0.05 per hour.
    • Electricity at $0.18/kWh over 50-100W is ~$0.01-0.02/hr — hardware depreciation dominates, not power.
    • At 10-40 tokens/sec on Gemma 4 31B, that works out to roughly $1.50-$4.79 per million tokens on the pessimistic end and $0.40-$1.20 on the optimistic end.
    • OpenRouter serves Gemma 4 31B at ~$0.38-0.50 per million tokens and at 60-70 tok/s — 3-7x the local throughput.
    • Author's bottom line: on the M5 Max MacBook Pro, local inference costs roughly 3x OpenRouter from an accounting view, and the speed gap is bigger than the cost gap.
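The arithmetic in these key points is easy to rerun with your own inputs. A minimal sketch, assuming the machine is amortized over every hour of the year (24/7 use, which is what reproduces the ~$0.16, $0.10, and $0.05 hourly figures) and that hourly cost is simply hardware plus electricity. The function names are illustrative, and the sketch is not guaranteed to match the author's per-token figures exactly, since his full inputs are not listed here.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: the laptop is amortized over 24/7 use


def hourly_hardware_cost(price_usd, years):
    """Straight-line depreciation spread over every hour of the period."""
    return price_usd / (years * HOURS_PER_YEAR)


def hourly_power_cost(watts, usd_per_kwh):
    """Electricity cost for one hour at a constant draw."""
    return watts / 1000 * usd_per_kwh


def usd_per_million_tokens(hourly_cost, tokens_per_sec):
    """Convert an hourly running cost into a cost per million generated tokens."""
    tokens_per_hour = tokens_per_sec * 3600
    return hourly_cost / tokens_per_hour * 1_000_000


# Pessimistic corner: 3-year amortization, 100 W draw, 10 tokens/sec
pessimistic = usd_per_million_tokens(
    hourly_hardware_cost(4299, 3) + hourly_power_cost(100, 0.18), 10
)
# Optimistic corner: 10-year amortization, 50 W draw, 40 tokens/sec
optimistic = usd_per_million_tokens(
    hourly_hardware_cost(4299, 10) + hourly_power_cost(50, 0.18), 40
)

print(f"local: ${optimistic:.2f} to ${pessimistic:.2f} per million tokens")
```

Lowering the utilization assumption (a laptop that generates tokens only a few hours a day) raises the local cost per token further, which strengthens the article's conclusion rather than weakening it.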
    Provenance
    Article · Supporting source
  4.

    I don't think AI will make your processes go faster

    Article Frederick Vanbrabant — Enterprise architecture and product strategy blogger.

    "Bottlenecks should receive predictable, high-quality inputs."

    frederickvanbrabant.com/blog/2026-05-15-i-d… →
    Context
    A useful reframe for any team modeling AI as a throughput multiplier: the bottleneck moves to scoping, and the math only works if the scoping work itself improves.
    Key points
    • Re-reads The Toyota Way and Eli Goldratt's The Goal to argue most process optimization misses where the actual constraint sits.
    • Software development is rarely slow because of typing speed; it's slow because of vague requirements and back-and-forth with domain experts.
    • AI-generated code does not collapse a scoping phase; if anything the documentation phase grows because the agent needs every detail spelled out.
    • Closing argument: handing humans the same depth of feature/scope documentation that agents need would produce comparable productivity gains.
    • Cites The Mythical Man-Month — adding people (or AI seats) to a constrained bottleneck does not unblock it.
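The constraint argument can be made concrete with a toy model. A minimal sketch, assuming a strictly serial pipeline whose throughput equals its slowest stage; the stage names and the features-per-week rates are hypothetical and not taken from the article.

```python
def pipeline_throughput(stage_rates):
    """A serial pipeline finishes work only as fast as its slowest stage."""
    return min(stage_rates.values())


def bottleneck(stage_rates):
    """Name of the stage with the lowest rate."""
    return min(stage_rates, key=stage_rates.get)


# Hypothetical rates, in features per week
before = {"scoping": 3, "coding": 2, "review": 4}
after = {**before, "coding": 20}  # assume agents make coding 10x faster

for label, rates in (("before", before), ("after", after)):
    print(label, bottleneck(rates), pipeline_throughput(rates))
```

In this toy model a 10x faster coding stage lifts throughput from 2 to 3 features per week, and the constraint moves to scoping. Further gains have to come from better scoping inputs, which is the article's closing argument.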
    Provenance
    Article · Supporting source
  5.

    Beyond Code Coverage: Functionality Testing with Playwright

    Video Marlene Mhangami — Senior developer advocate at Microsoft and GitHub, core AI group.

    "Clean code bases amplify AI gains; unchecked AI in a codebase is going to amplify entropy."

    www.youtube.com/watch?v=FWEInOtngmM →
    Context
    A working pattern for the most common AI-code failure: a test suite the model wrote to confirm what the code does instead of what the user needs the code to do. The Playwright MCP loop is one of the more concrete answers shipping right now.
    Key points
    • GitHub saw ~1 billion commits in 2025; COO Kyle Daigle has said the platform is now seeing ~275 million commits per week, which extrapolates to ~14 billion by end of 2026.
    • Cites a Stanford study of 120,000 developers: a team that ran unchecked AI saw PR throughput rise but effective output rise only ~1%, with refactor and rework eating the gains.
    • Warns AI-generated unit tests are often self-affirming — the suite goes green while the user-facing behavior stays broken.
    • Argues for behavior-first TDD with Playwright: agents generate failing end-to-end tests against expected user behavior, then code to make them pass, then humans spend the most time on refactor.
    • Demos Playwright MCP server plus 'Playwright agents' — a planner, generator, and healer agent set — driving the red-green-refactor loop from GitHub Copilot CLI.
    Provenance
    Video · Supporting source
  6.

    First public macOS kernel memory corruption exploit on Apple M5

    Article Calif — AI-assisted offensive security shop pairing senior reverse engineers with Anthropic's Mythos Preview.

    "Apple built MIE in a world before Mythos Preview."

    blog.calif.io/p/first-public-kernel-memory-… →
    Context
    Pays off Friday's follow-up question — the actual write-up is now public. The pairing claim (model finds the bug class, human carries the novel mitigation bypass) is the load-bearing detail for anyone modeling the next year of vuln research.
    Key points
    • A data-only kernel local-privilege-escalation chain on macOS 26.4.1 (25E253) on bare-metal M5 hardware with kernel MIE enabled — first public exploit surviving Apple's Memory Integrity Enforcement.
    • Bruce Dang found the bugs on April 25; Dion Blazakis joined April 27; Josh Maine built tooling; a working exploit landed by May 1 — five days from first bug to root shell.
    • Mythos Preview identified the bugs (known classes generalize well); humans carried the MIE bypass because the mitigation is new and there are no exploit patterns to copy.
    • Reported in person at Apple Park instead of via the submission queue, as a laser-printed 55-page report; full technical details will follow after Apple ships a fix.
    • Includes the line 'Apple spent $5 billion building this office, then asked about our office. We said, well, ours definitely cost less than $1 billion' — Calif's framing of small-team AI-assisted offense vs platform-vendor defense.
    Provenance
    Article · Supporting source
  7.

    Native all the way, until you need text

    Article Artem Loenko — Native macOS / iOS developer of ~20 years.

    "If you want to build rich text rendering for long-form chats, SwiftUI and Apple's native SDKs are not helping you. They stop being an advantage and start becoming constraints."

    justsitandgrin.im/posts/native-all-the-way-… →
    Context
    For builders shipping AI chat UIs on the desktop, this is the underexplained reason almost every model client ends up on a browser engine. The dominant interaction pattern of the era happens to be the one the native SDKs handle worst.
    Key points
    • A senior Apple-platform developer walks through trying to ship a streaming Markdown chat UI in pure SwiftUI: the user cannot select a Markdown document built from SwiftUI primitives, and that is by design.
    • NSTextView with TextKit 2 lets you select but loses SwiftUI tooling and spikes CPU on streamed text; NSCollectionView is mature but cells blink during streaming, by design.
    • Pure TextKit 2 prototypes work but lose context menus, dictionary lookup, accessibility — months of work to reach baseline native parity.
    • WebKit Markdown rendering works well; an Electron prototype reaches better text behavior and typography out of the box than the pure TextKit 2 build.
    • Conclusion: chat as an interface pattern is web-native today; native SDKs are not the win for streaming Markdown surfaces.
    Provenance
    Article · Supporting source
  8.

    MCP Hello Page

    Article Luke Lanchester — Software engineer running HybridLogic; ships an MCP server for the author's day-job product.

    "It's not working though because they need to paste it into their client of choice, but no-one thinks that far ahead."

    www.hybridlogic.co.uk/blog/2026/05/mcp-hell… →
    Context
    A small, observable, builder-led fix for a real onboarding cliff in MCP — and a useful tell that the spec is being adopted faster than its first-mile UX has caught up.
    Key points
    • Customers open his mcp.acme.com/mcp URL in a browser, get a 401 JSON blob, and file support tickets saying the link is broken.
    • The fix: when the request is GET /mcp and the Accept header includes text/html but not application/json or text/event-stream, return a plain HTML page explaining what an MCP server is and what to do with the URL.
    • Result: ticket volume on the issue dropped sharply, with no observable downside.
    • Author calls the MCP spec 'an utterly terrible attempt at a specification' for not anticipating this; argues the pattern should be in the spec.
    • The packaging alternative — building a connector or plugin per LLM client — is described as 'slow, painful, and a never-ending game of whack-a-mole.'
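The Accept-header rule in these key points fits in a few lines. A minimal, framework-agnostic sketch of the check as described; the function name `wants_hello_page` is hypothetical, and the author's actual implementation is not shown here.

```python
def wants_hello_page(method, path, accept_header):
    """True when a browser-style GET /mcp should get the explanatory HTML page.

    Rule as described: Accept includes text/html but neither
    application/json nor text/event-stream.
    """
    if method != "GET" or path != "/mcp":
        return False
    # Drop parameters such as ";q=0.9" and normalize case before comparing.
    media_types = {
        part.split(";")[0].strip().lower()
        for part in (accept_header or "").split(",")
    }
    return (
        "text/html" in media_types
        and "application/json" not in media_types
        and "text/event-stream" not in media_types
    )


# A typical browser Accept header vs. an MCP-client-style Accept header
browser = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
mcp_client = "application/json, text/event-stream"

print(wants_hello_page("GET", "/mcp", browser))     # True
print(wants_hello_page("GET", "/mcp", mcp_client))  # False
```

Requests that fail the check fall through to the server's normal MCP handling, so existing clients see no change in behavior.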
    Provenance
    Article · Supporting source