BRAID

Dispatch 035 · 2026-05-23 GCU Stay The One Who Decides

Fast models, slow developers — and the part of the job that stays yours

00:21:39 / 10 sources

“When the machine gets fast and capable and cheap, the only job that stays yours is being the one who decides.”

— Lenar Kess, today's narration

A Saturday episode about what your job becomes when the model writes the code, and writes it fast. The bottleneck moved from typing to deciding, and a surprising number of this week's stories land on the same instruction: stay the one who decides. Plus a price floor, a reclassification, a year of bold predictions, and a five-year-old gaming card that won't quit.

Chapters

  1. 00:00:04 Six months since he wrote code
  2. 00:02:05 Fast models, slow developers
  3. 00:06:40 Two ends of the same pipe
  4. 00:09:57 Jack Clark's year of predictions
  5. 00:13:46 164 tokens a second on a 3090
  6. 00:16:32 Containerizing the agent
  7. 00:18:42 How the rest of the world sees this

Sources

10 cited
  1.

    "I don't write code anymore"

    X @levelsio — Pieter Levels — indie maker behind a string of small one-person products (Nomad List, PhotoAI); known for shipping solo

    I don't write code anymore. I haven't written code in I think 6 months? I think everyone is like this no?

    x.com/levelsio/status/2058116725929828722 →
    Context
    A concrete data point on how far solo-maker workflows have moved toward agent-driven building — and a useful test of where that generalizes and where it doesn't.
    Key points
    • Levels says he hasn't written code in roughly six months, working entirely through Claude Code
    • His setup is browser tabs into his own sites on a VPS, synced to his phone, agent open to fix or build
    • Claims 'everyone is like this' — the contested part
    • Marc Andreessen amplified it to a large audience with a one-word comment: 'Interesting.'
    Engagement
    1233 likes · 102 retweets
    Provenance
    Tweet · Primary source
  2.

    Andreessen reposts the levelsio coding setup

    X @pmarca — Marc Andreessen, co-founder of Andreessen Horowitz

    Interesting.

    x.com/pmarca/status/2058144277340049588 →
    Context
    Shows how the solo-maker workflow narrative travels — and gets flattened into a slogan — once a large account amplifies it.
    Key points
    • Andreessen quote-tweeted Levels's screenshot of an all-day VPS + Claude Code setup
    • His entire comment was one word
    • The amplification is what put the 'I don't write code anymore' line in front of hundreds of thousands
    Provenance
    Tweet · Primary source
  3.

    Fast Models Need Slow Developers — Sarah Chieng, Cerebras

    Video Sarah Chieng (Cerebras), via AI Engineer — Sarah Chieng, head of developer experience at Cerebras

    A lot of these bad habits that we had before that were generating maybe 50 tokens per second of bad code — unless we fix them, they're going to start generating 1,200 tokens per second of bad code.

    www.youtube.com/watch?v=TeGsFFNqRLA →
    Context
    The sharpest articulation of how fast generation changes the engineer's job — the bottleneck moves from typing to deciding, and the discipline matters more, not less.
    Key points
    • Codex Spark (Cerebras + OpenAI) generates code at ~1,200 tokens/sec vs 40-60 for Sonnet/Opus, roughly 20x to 30x faster
    • Slow-era bad habits (giant one-shot prompts, huge commits, unverified agent swarms) now produce bad code 20x to 30x faster
    • Validation becomes 'basically free': lint, pre-commit hooks, diff reviews, browser tests at every step
    • Orchestrate by model strength: big model plans, fast model executes; capture good sessions as reusable skills
    • Cherry-pick across many generated variants to 'artificially induce taste', and stay the decision-maker
    • Externalize agent memory into plain files, because compaction now arrives in ~30 seconds
    Provenance
    Video · Supporting source
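
The speed gap in this entry's key points (~1,200 tokens/sec against 40-60) can be checked directly. The following is a minimal sketch using only the figures quoted above; the function names and the 6,000-token example size are illustrative assumptions, not something from the talk.

```python
def speedup(fast_tps: float, slow_tps: float) -> float:
    """Ratio of two generation speeds given in tokens per second."""
    return fast_tps / slow_tps


def seconds_to_generate(tokens: int, tps: float) -> float:
    """Wall-clock seconds to emit `tokens` at a steady rate of `tps`."""
    return tokens / tps


FAST_TPS = 1200.0               # rate quoted for the fast model
SLOW_TPS_RANGE = (40.0, 60.0)   # range quoted for the slower models
QUOTE_SLOW_TPS = 50.0           # "maybe 50 tokens per second" in the cited text

if __name__ == "__main__":
    for slow in SLOW_TPS_RANGE:
        ratio = speedup(FAST_TPS, slow)
        print(f"{FAST_TPS:.0f} vs {slow:.0f} tok/s: {ratio:.0f}x")

    # Hypothetical 6,000-token change, chosen only for illustration.
    change = 6000
    print(f"slow: {seconds_to_generate(change, QUOTE_SLOW_TPS):.0f} s")
    print(f"fast: {seconds_to_generate(change, FAST_TPS):.0f} s")
```

Under these assumptions a change that took two minutes to generate arrives in five seconds, which is why reading and verifying it becomes the slow step.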
  4.

    DeepSeek cuts V4-Pro prices by 75%

    Article The Next Web

    DeepSeek is making its 75% API discount permanent.

    thenextweb.com/news/deepseek-v4-pro-price-c… →
    Context
    Frontier-class inference is racing to a price floor; the cache-hit discount specifically rewards agent loops that re-send stable context every step.
    Key points
    • The 75% V4-Pro discount, framed as a promo in April, is now permanent (effective when the promo ends at the end of May)
    • New rate is roughly one quarter of the original: about $0.44 per million input tokens and $0.87 per million output tokens, per secondary reporting
    • Paired with a ~90% cut to input cache-hit costs across the API
    • Reported framing: targeting developers frustrated with Western providers' rate limits and restrictions
    Provenance
    Article · Supporting source
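
To see why a cache-hit discount favors agent loops that re-send stable context, it helps to price one loop both ways. This is a rough sketch, not DeepSeek's billing logic: the input and output rates are the secondary-reporting figures quoted above, while the cache-hit rate, the workload numbers, and all names (`Pricing`, `loop_cost`) are hypothetical placeholders, because the source does not state the actual cache-hit price.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Prices in USD per million tokens."""
    input_miss: float  # input tokens not served from cache
    input_hit: float   # input tokens served from cache
    output: float      # generated tokens


def loop_cost(p: Pricing, steps: int, stable_ctx: int,
              new_input: int, output: int) -> float:
    """USD cost of an agent loop that re-sends `stable_ctx` tokens each step.

    Idealized: the stable context misses the cache once (step 1) and hits
    it on every later step; context growth between steps is ignored.
    """
    miss_tokens = stable_ctx + steps * new_input
    hit_tokens = stable_ctx * (steps - 1)
    out_tokens = steps * output
    total = (miss_tokens * p.input_miss
             + hit_tokens * p.input_hit
             + out_tokens * p.output)
    return total / 1_000_000


# Input and output rates: the secondary-reporting figures quoted above.
# Cache-hit rate: HYPOTHETICAL placeholder (one tenth of the miss rate).
WITH_CACHE = Pricing(input_miss=0.44, input_hit=0.044, output=0.87)
NO_CACHE = Pricing(input_miss=0.44, input_hit=0.44, output=0.87)

if __name__ == "__main__":
    # Hypothetical workload: 50 steps, 100k stable tokens,
    # 2k new input tokens and 1k output tokens per step.
    for name, pricing in (("cache hits", WITH_CACHE), ("no cache", NO_CACHE)):
        cost = loop_cost(pricing, steps=50, stable_ctx=100_000,
                         new_input=2_000, output=1_000)
        print(f"{name}: ${cost:.2f}")
```

With these placeholder numbers the loop costs about $0.35 with cache hits and about $2.29 without them. The re-sent context dominates the bill, so the cache-hit rate matters more than the headline input price.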
  5.

    NVIDIA Removes Gaming Revenue Category From Financial Reports

    Article Hilbert Hagedoorn — Guru3D

    NVIDIA is signaling a broader strategic shift toward accelerated computing and AI-driven markets.

    www.guru3d.com/story/nvidia-removes-gaming-… →
    Context
    The company that built itself on GeForce now treats consumer gaming as a footnote next to data-center demand — a clean read on where the money sits.
    Key points
    • NVIDIA folded standalone 'Gaming' into a broader 'Edge Computing' category in its fiscal 2027 Q1 report
    • Edge Computing — GeForce, AI PCs, workstation, consoles, robotics, networking, automotive — was ~$6.4B for the quarter
    • Total quarterly revenue ~$81.6B, up 85% year over year, driven by data center
    • NVIDIA says it is not exiting gaming hardware; RTX cards keep shipping — but gaming is no longer the headline story
    Provenance
    Article · Supporting source
  6.

    AI will help make a Nobel prize-winning discovery within a year, says Anthropic co-founder

    Article Robert Booth (The Guardian) — Reporting on Jack Clark, Anthropic co-founder and author of the Import AI newsletter, speaking at Oxford

    If we stand by and let synthetic intelligence multiply, then we'll eventually be forced into reactivity.

    www.theguardian.com/technology/2026/may/21/… →
    Context
    A sober forecaster's calibrated bets are worth engaging — and the falsifiable ones (AI-run revenue, self-designed successors) are the ones to actually hold him to.
    Key points
    • Clark's spread of predictions: AI-assisted Nobel discovery within 12 months; bipedal robots helping tradespeople in 2 years; AI-run companies making millions within 18 months; AI designing its own successors by end of 2028
    • Still flags a 'non-zero chance of killing everyone on the planet' and says that risk hasn't gone away
    • Notes Anthropic's Mythos model proved 'alarmingly capable at exploiting cybersecurity weaknesses'
    • Says he'd prefer to slow down 'to give ourselves more time as a species' but expects competition to prevent it
    • Co-host Edward Harcourt (Oxford Institute for Ethics in AI) warned of 'cognitive atrophy' and argued for 'Socratic' AI that makes humans do more thinking
    Provenance
    Article · Supporting source
  7.

    BeeLlama v0.2.0 — major DFlash update on a single RTX 3090

    Source Anbeeld (r/LocalLLaMA)

    Squeezing that 3090 like a lemon.

    www.reddit.com/r/LocalLLaMA/comments/1tkpz2… →
    Context
    A 4-5x speedup on a five-year-old consumer card is the difference between local models being a toy and being usable in an agent loop you run on hardware you own.
    Key points
    • On a single RTX 3090 (24GB): Qwen 3.6 27B up to 164 tokens/sec (~4.4x llama.cpp baseline); Gemma 4 31B up to 177.8 tokens/sec (~4.9x)
    • Mechanism is speculative decoding (DFlash): a small draft model proposes tokens, the target verifies in parallel
    • Update adds lower draft overhead, draft K/V projection caching, stricter draft/target validation, safer fallback to full logits
    • Prompt processing stays near baseline; speedup depends on acceptance rate and is workload-dependent
    • Top comment asks the real test: does it hold for 200K-token agentic coding chats?
    Provenance
    Source · Background source
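
The draft-then-verify mechanism named in this entry's key points can be shown with a toy example. This is a sketch of the greedy variant of speculative decoding, not BeeLlama's or DFlash's code: the "models" are deterministic stand-in functions, and every name here (`speculative_step`, `toy_target`, `toy_draft`) is hypothetical.

```python
from typing import Callable, List, Tuple

# A "model" here maps a token prefix to its single greedy next token.
Model = Callable[[List[int]], int]


def speculative_step(target: Model, draft: Model,
                     prefix: List[int], k: int) -> Tuple[List[int], int]:
    """One draft-then-verify step: returns (new tokens, drafts accepted)."""
    # 1. The small draft model proposes k tokens autoregressively.
    proposed: List[int] = []
    for _ in range(k):
        proposed.append(draft(prefix + proposed))

    # 2. The target checks each proposal. A real engine scores all k
    #    positions in one batched forward pass; this toy loops instead.
    accepted: List[int] = []
    for tok in proposed:
        expected = target(prefix + accepted)
        if tok != expected:
            # First mismatch: keep the target's own token and stop.
            return accepted + [expected], len(accepted)
        accepted.append(tok)

    # 3. Every draft accepted: the target pass also yields a bonus token.
    return accepted + [target(prefix + accepted)], k


def generate(target: Model, draft: Model, prompt: List[int],
             n_tokens: int, k: int = 4) -> Tuple[List[int], int]:
    """Generate n_tokens; returns (sequence, number of target passes)."""
    out = list(prompt)
    passes = 0
    while len(out) - len(prompt) < n_tokens:
        new, _ = speculative_step(target, draft, out, k)
        out.extend(new)
        passes += 1
    return out[: len(prompt) + n_tokens], passes


def greedy(target: Model, prompt: List[int], n_tokens: int) -> List[int]:
    """Baseline: one target pass per generated token."""
    out = list(prompt)
    for _ in range(n_tokens):
        out.append(target(out))
    return out


def toy_target(prefix: List[int]) -> int:
    return (prefix[-1] + 1) % 10


def toy_draft(prefix: List[int]) -> int:
    # Agrees with the target except right after a 6, where it guesses wrong.
    return 0 if prefix[-1] == 6 else (prefix[-1] + 1) % 10


if __name__ == "__main__":
    seq, passes = generate(toy_target, toy_draft, [0], n_tokens=12, k=4)
    print(seq, passes)
```

In this toy run the output matches plain greedy decoding token for token while using 3 target passes instead of 12. The saving shrinks as the draft mispredicts more often, which is the acceptance-rate dependence noted in the key points.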
  8.

    Lobster Trap: OpenClaw in Containers from Local to K8s and Back — Sally Ann O'Malley, Red Hat

    Video Sally Ann O'Malley (Red Hat), via AI Engineer — Red Hat engineer presenting at the AI Engineer conference

    Sharing a good agent setup usually means handing someone a pile of markdown, config files, and YAML and hoping they reproduce what you have.

    www.youtube.com/watch?v=F1DYkY1BlfM →
    Context
    Signals the agent maturing from a personal dotfile into a versioned, deployable artifact — provisioned and governed like any other piece of the stack.
    Key points
    • Packages an OpenClaw agent setup as a container image so a personal config becomes a reproducible team baseline
    • Runs under Podman locally, spins up a sub-agent in ~2 seconds, and runs the same image on Kubernetes by flipping a flag
    • Secrets handled in two layers: Podman secrets for host API keys, OpenClaw secret references inside the agent
    • The constraint it solves is reproducibility — config drift is why a teammate's agent behaves differently
    Provenance
    Video · Supporting source
  9.

    Overhearing "world models and grounded video gen" in a Copenhagen park

    X @niloofar_mire — AI researcher

    I overheard the couple next to me talking about world models and grounded video gen LOL

    x.com/niloofar_mire/status/2058148404673331… →
    Context
    A light counterpoint to the bubble — even the park bench is inside it now — that sets up how differently people outside tech experience all this.
    Key points
    • Researcher, burnt out, flew to Copenhagen to detach
    • Sitting in a random park, overheard strangers discussing world models and grounded video generation
    • A small, funny marker of how pervasive the AI conversation has become
    Provenance
    Tweet · Primary source
  10.

    Is AI viewed as "evil" in non-tech communities?

    Source Due_Drummer5147 (r/singularity)

    For a lot of people, there's limited upsides to AI right now... right now it's not serving most people and in many cases it's causing harm.

    www.reddit.com/r/singularity/comments/1tl68… →
    Context
    The view from outside the bubble, stated without strawmanning — a reminder that most people experience AI as something done to them, not a tool they wield.
    Key points
    • A data engineer asks for a reality check after a hostile reaction to suggesting AI to non-tech people
    • Top reply (hundreds of upvotes): people see AI shoved into everything by billionaires 'siphoning the planet's energy and water', livelihoods lost, creatives first
    • Concedes medicine/math strides but says it isn't serving most people and often causes harm
    • Another commenter notes even r/technology skews anti-AI
    Provenance
    Source · Background source