BRAID
Agents Buy Domains, Gemma Ships Drafters, and Local Catches Up to 65 Percent of the Job / DISPATCH 018

Dispatch 018 · 2026-05-06 GSV Give The Agent A Wallet, Not A Card

/ 00:30:02 / 15 sources

“Agents can buy domains now without ever touching a credit card number.”

— Lenar Kess, today's narration

Agents can now sign up for Cloudflare and buy a domain through a tokenized payment protocol Cloudflare and Stripe co-designed. Google ships first-party multi-token prediction drafters for the entire Gemma 4 family the same week the LocalLLaMA community gets a 2.5x speedup on Qwen 3.6 27B from a hand-built llama.cpp branch. OpenAI swaps the ChatGPT default to GPT-5.5 Instant. NVIDIA, Microsoft, and OpenAI publish MRC, the multipath transport protocol behind Blackwell-era frontier training. And on the labor side, Dario Amodei trades his white-collar bloodbath line for the Jevons Paradox onstage with Jamie Dimon.

Chapters

  1. 00:00:04 Agents become Cloudflare customers
  2. 00:03:24 MRC: the network OpenAI was actually waiting for
  3. 00:06:18 Gemma 4 ships its drafters; Qwen runs at 100 tokens a second
  4. 00:10:05 The 65 / 20 / 15 routing rule
  5. 00:12:36 GPT-5.5 Instant becomes the default — and remembers where it heard things
  6. 00:14:53 Indirect prompt injection becomes a normal Tuesday
  7. 00:18:02 Two papers, one frame: the agent stack is a system
  8. 00:20:49 François Fleuret's three-item to-do list
  9. 00:23:03 Two ways to think about jobs
  10. 00:26:23 ProgramBench, Nonograph, and the sign-off

Sources

15 cited
  1. Agents can now create Cloudflare accounts, buy domains, and deploy

    Article Sid Chatterjee, Brendan Irvine-Broque — Cloudflare engineers shipping the agent-as-customer integration with Stripe Projects.

    Starting today, agents can now be Cloudflare customers. They can create a Cloudflare account, start a paid subscription, register a domain, and get back an API token to deploy code right away.

    blog.cloudflare.com/agents-stripe-projects →
    Context
    Until now, the human had to do account creation, billing, and credential handoff before the agent could touch production. Cloudflare and Stripe just removed those steps for one cloud, on a protocol they want others to adopt. The interesting thing is the budget cap and tokenized payment design — it shows what 'give the agent a wallet' is going to look like as a normalized pattern.
    Key points
    • Cloudflare and Stripe co-designed a protocol with three parts: a discovery API (catalog of services as JSON), an authorization flow that uses Stripe as identity provider to auto-provision Cloudflare accounts, and a payment-token system so agents never see raw card details.
    • Stripe sets a default $100/month per-provider spend cap for the agent, with budget alerts as the second guardrail.
    • The flow: run stripe projects init, prompt your agent, and approve once via OAuth; the agent then goes from no Cloudflare account at all to a registered domain and an app deployed in production.
    • Cloudflare frames this as extending OAuth into account creation and payments — a standard 'agent as first-class customer' integration any platform with signed-in users can implement.
    • Practical implication: this is the first widely shipped pattern where the credit card and the account exist primarily for the agent's use, not the human's.
    Provenance
    Article · Supporting source
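A minimal sketch of the spend-cap guardrail described in the key points above, assuming the cap is checked per provider and per calendar month before a charge goes through. The class, method names, and the 80% alert threshold are hypothetical illustrations, not Stripe or Cloudflare APIs; only the $100/month default comes from the source.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class SpendGuard:
    """Hypothetical per-provider monthly spend cap (not a real Stripe API)."""

    monthly_cap: Decimal = Decimal("100.00")   # default cap cited in the source
    alert_ratio: Decimal = Decimal("0.80")     # assumed budget-alert threshold
    spent: dict = field(default_factory=dict)  # (provider, "YYYY-MM") -> total

    def authorize(self, provider: str, month: str, amount: Decimal) -> str:
        """Return 'approved', 'approved_with_alert', or 'declined'."""
        key = (provider, month)
        total = self.spent.get(key, Decimal("0")) + amount
        if total > self.monthly_cap:
            # The charge would exceed the cap, so nothing is recorded.
            return "declined"
        self.spent[key] = total
        if total >= self.monthly_cap * self.alert_ratio:
            return "approved_with_alert"
        return "approved"
```

The guard only ever sees amounts and provider names, which mirrors the source's point that the agent works with a budget rather than with card details.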
  2. NVIDIA Spectrum-X — the Open, AI-Native Ethernet Fabric — Sets the Standard for Gigascale AI, Now With MRC

    Article Gilad Shainer — SVP of networking at NVIDIA.

    Deploying MRC in the Blackwell generation was very successful and was made possible by a strong collaboration with NVIDIA. MRC's end-to-end approach enabled us to avoid much of the typical network-related slowdowns and interruptions and maintain the efficiency of frontier training runs at scale.

    blogs.nvidia.com/blog/spectrum-x-ethernet-m… →
    Context
    The frontier-model story is mostly told in chips and data — but at gigascale, the network is the thing that decides whether thousands of GPUs stay in lockstep or sit idle. MRC is OpenAI naming, in writing, what was costing them on Blackwell, and the open spec is a credible signal that this is now the load-bearing layer for hundreds-of-thousands-of-GPU runs.
    Key points
    • Multipath Reliable Connection, or MRC, is a new RDMA transport protocol that lets a single connection spread traffic across many network paths instead of pinning to one route.
    • It was co-developed by NVIDIA, Microsoft, OpenAI, AMD, Broadcom, and Intel, and is now released as an open spec through the Open Compute Project.
    • Sachin Katti at OpenAI says MRC let frontier training runs avoid the network-related slowdowns and interruptions that normally idle GPUs.
    • Hardware-level failure bypass detects path failures in microseconds and reroutes — important when thousands of synchronized GPUs are training as one job.
    • OpenAI is also using multiplane network designs, where each GPU has multiple independent fabrics to talk on, layered on top of MRC.
    Provenance
    Article · Supporting source
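To make the multipath idea concrete, here is a toy model of one logical connection spreading packets over several paths and skipping a path once it is marked failed. It is a sketch only: the round-robin policy and every name in it are assumptions for illustration, and the real protocol runs in network hardware according to the published spec, not in Python.

```python
from itertools import count


class MultipathConnection:
    """Toy model: one logical connection spraying packets over many paths."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.healthy = set(self.paths)
        self._seq = count()

    def mark_failed(self, path):
        """Take a path out of rotation (stands in for hardware failure detection)."""
        self.healthy.discard(path)

    def send(self, payload):
        """Pick the next healthy path round-robin; return (seq, path, payload)."""
        if not self.healthy:
            raise RuntimeError("no healthy paths left")
        seq = next(self._seq)
        candidates = [p for p in self.paths if p in self.healthy]
        return seq, candidates[seq % len(candidates)], payload
```

A single-path connection would stall when its one route fails; here the remaining paths simply absorb the traffic.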
  3. Accelerating Gemma 4: faster inference with multi-token prediction drafters

    Article Olivier Lacombe, Maarten Grootendorst — Google product management and developer relations on the Gemma team.

    By using a specialized speculative decoding architecture, these drafters deliver up to a 3x speedup without any degradation in output quality or reasoning logic.

    blog.google/innovation-and-ai/technology/de… →
    Context
    Yesterday we covered the community port of MTP onto Qwen 3.6 27B; today Google ships first-party drafter checkpoints for Gemma 4 with KV-cache sharing baked in. Speculative decoding is moving from a research trick to a default. For local coding, that's the difference between a model that thinks too slowly to drive a coding loop and one that doesn't.
    Key points
    • Google released MTP drafter checkpoints for the full Gemma 4 family — the 26B MoE, the 31B dense, and the smaller E2B and E4B edge variants — under the same Apache 2.0 license.
    • The drafters share KV cache and activations with the target model, so they don't recompute context the big model already produced.
    • Reported speedup is up to 3x across the tested runtimes, including LiteRT-LM, MLX, Hugging Face Transformers, and vLLM, with the Gemma 4 target model still verifying every token.
    • On Apple Silicon at batch size 1 the 26B MoE has routing overhead that limits gains; bumping batch size to 4-8 unlocks roughly 2.2x locally.
    • Gemma 4 has crossed 60 million downloads in its first weeks; this release is positioned as the step that makes 26B and 31B usable for real on-device coding work.
    Provenance
    Article · Supporting source
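The draft-then-verify loop behind these speedups can be sketched in a few lines. This is a simplified greedy variant under stated assumptions: both "models" are plain functions from a token list to the next token, verification is written as a loop where a real system would use one batched pass, and none of the names correspond to Gemma, llama.cpp, or any library API.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]  # greedy next-token function


def greedy_decode(target: NextToken, prompt: List[int], n_new: int) -> List[int]:
    """Baseline: the target model generates every token itself."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(target(out))
    return out


def speculative_decode(target: NextToken, drafter: NextToken,
                       prompt: List[int], n_new: int, k: int = 4) -> List[int]:
    """Drafter proposes up to k tokens; the target keeps the matching prefix."""
    out = list(prompt)
    goal = len(prompt) + n_new
    while len(out) < goal:
        # 1. The drafter cheaply proposes a short run of tokens.
        draft: List[int] = []
        for _ in range(min(k, goal - len(out))):
            draft.append(drafter(out + draft))
        # 2. The target checks each drafted position in order.
        for i, tok in enumerate(draft):
            expected = target(out + draft[:i])
            if tok != expected:
                out.extend(draft[:i])
                out.append(expected)  # the target's own token replaces the miss
                break
        else:
            out.extend(draft)  # every drafted token was accepted
    return out
```

Because the target decides every position, the output matches plain greedy decoding exactly; the drafter only changes how many target steps can be checked together.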
  4. 2.5x faster inference with Qwen 3.6 27B using MTP — viable for local agentic coding

    Thread u/ex-arman68 (with measurements from u/yes_i_tried_google) — LocalLLaMA contributors converting Qwen 3.6 27B GGUFs against the MTP pull request and posting reproduced numbers.

    2.5x speed increase, bringing it to 28 tok/s. iq4 with MTP enabled. Qwen 3.6 27B. Full 256k ctx. q4/q4. 100 tok/sec on a 3090 ti.

    www.reddit.com/r/LocalLLaMA/comments/1t57xu… →
    Context
    Yesterday we wondered whether the multi-token prediction patch would survive real workloads. The community has the numbers now — and a fresh build path. Combined with Google releasing first-party MTP drafters for Gemma 4, this is the day speculative decoding becomes the default expectation for serious local inference rather than an experiment.
    Key points
    • The author got 28 tok/s on an M2 Max 96GB Mac with the Qwen 3.6 27B MTP build, a roughly 2.5x speedup over standard inference.
    • Another commenter on a 3090 Ti reports about 100 tok/s on the same model at IQ4_XS with MTP and full 256k context, and around 200 tok/s on Qwen 3.6 35B A3B.
    • The recipe requires building llama.cpp from a specific PR (#22673) and using newly converted GGUFs with the MTP draft layers included.
    • The author rolled back the more aggressive turboquant recommendation because the underlying PR is unstable and 'animated' in review, and fell back to standard q4_0 KV cache compression.
    • This is the on-the-ground answer to the follow-up we promised yesterday: the llama.cpp MTP beta is surviving contact with real workloads, but only on a hand-built branch.
    Provenance
    Thread · Primary source
  5. Give LLMs latent diffusion reasoning, recurrent state, and world-model pre-pre-training

    X @francoisfleuret (François Fleuret) — Professor of computer science at the University of Geneva, longtime deep learning researcher.

    Because you must be able during reasoning to scan large domains with faint cues in parallel and not do token-space reasoning, which amounts to poking around with your stick-shaped fingers until you hit something.

    x.com/francoisfleuret/status/20519288960276… →
    Context
    A clean, opinionated sketch of what serious researchers think is missing from current LLMs — useful as a counterweight to a day's news cycle dominated by drafter checkpoints, network protocols, and pricing tweaks. It names the architectural moves the field still hasn't made.
    Key points
    • Fleuret's three-item to-do list for closing the gap to general reasoning: latent space diffusion-like reasoning, a real recurrent state, and world-model pre-pre-training.
    • When pressed on why diffusion specifically, he says token-space reasoning is 'poking around with stick-shaped fingers' — you can't scan large solution spaces with faint cues in parallel through one autoregressive token at a time.
    • The replies map the live agenda right now: Lee Sharkey's Goodfire weight-decomposition interpretability result, Lucas Beyer's commentary on a generalization theory paper, and Code World Model plus block diffusion as named ingredients.
    • The thread's tone is half-serious, half-resigned: 'and we are done' is the punchline of every AI researcher ever.
    • Frame for the day: most of what 2026 model releases are doing is incremental token-space improvements; the architectural agenda Fleuret names is still mostly research.
    Provenance
    Tweet · Primary source
  6. OpenAI rolls out GPT-5.5 Instant with improved accuracy, sets it as ChatGPT default

    Article Indian Express Tech Desk

    GPT-5.5 Instant scored 81.2 on AIME 2025, up from 65.4 for the previous release, and 76 on MMMU-Pro versus 69.2.

    indianexpress.com/article/technology/artifi… →
    Context
    A default-model swap is the most consequential thing OpenAI does — the ChatGPT default decides what a billion users mean when they say 'GPT.' The benchmark moves are real but normal-sized; the more interesting bit is memory now telling users which prior chat or document a claim came from. That's the change a builder cares about.
    Key points
    • GPT-5.5 Instant replaces GPT-5.3 Instant as the ChatGPT default; OpenAI is keeping 5.3 around for paid users for three months during the transition.
    • Reported jumps: AIME 2025 from 65.4 to 81.2, and MMMU-Pro multimodal from 69.2 to 76.
    • ChatGPT memory now exposes source attribution — users can see where a memory came from across prior chats, files, and Gmail integration, and edit or delete entries.
    • Available on the API as 'chat-latest' for developers; web rollout for Plus and Pro first, then mobile and free tiers.
    • The framing emphasizes lower hallucination in law, medicine, and finance without compromising speed.
    Provenance
    Article · Supporting source
  7. Dario Amodei spent last year warning of an AI white-collar bloodbath. Now he's changing the narrative

    Article Nick Lichtenberg — Fortune writer covering the AI labs and labor markets.

    If you automate 90% of the job, then everyone does the 10% of the job. And the 10% kind of expands to be 100% of what people do and kind of 10xs their productivity.

    fortune.com/2026/05/05/dario-amodei-jevons-… →
    Context
    A year ago Amodei was the loudest 'half of entry-level white-collar jobs disappear' voice in the lab world. Now he's invoking the Jevons Paradox onstage with Jamie Dimon. The shift can be read as updating on evidence or as a response to changed incentives, and it is worth holding both possibilities at once.
    Key points
    • Onstage at Anthropic's financial-services briefing with Jamie Dimon, Amodei reached for the Jevons Paradox — efficiency gains expand demand rather than contracting it — to describe AI's effect on jobs.
    • He immediately complicated his own framing with Amdahl's Law: even if AI automates most of a job, the slowest human-bound step becomes the binding constraint.
    • He kept one caveat: 'AI is moving faster than all these previous technologies.' The Jevons mechanism depends on time for retraining and reallocation; AI may not give it.
    • Dimon endorsed wage-reassurance and government-funded retraining, citing post-NAFTA trade adjustment as a model — and admitted that program 'didn't work.'
    • Lichtenberg notes Amodei is also navigating a Pentagon lawsuit and a fraught regulatory environment, which gives the rhetorical pivot a second possible explanation beyond a genuine update.
    Provenance
    Article · Supporting source
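The arithmetic behind the quoted "10xs their productivity" line and the Amdahl's Law caveat can be written out. The symbols are introduced here for illustration and do not appear in the article: p is the fraction of the job that is automated and s is the speedup on that fraction.

```latex
S(p, s) = \frac{1}{(1 - p) + \dfrac{p}{s}},
\qquad
\lim_{s \to \infty} S(0.9, s) = \frac{1}{1 - 0.9} = 10
```

So 10x is the ceiling when 90% of the job is automated, reached only if the automated part takes no time at all. With a finite speedup such as s = 10, the same formula gives 1 / (0.1 + 0.09), or about 5.3x, because the remaining human-bound 10% dominates.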
  8. Prompt Injection experience — my first time ever (r/ClaudeAI)

    Thread u/netmilk — A regular Claude user who screenshotted the model successfully resisting an injected instruction inside a search result.

    A <RootSystemPrompt> tag in scraped HTML has no more authority than the word 'obey' written on a billboard.

    www.reddit.com/r/ClaudeAI/comments/1t56zqw/… →
    Context
    A neat moment of the threat model becoming visible to an end user. Indirect prompt injection has been theoretical for a long time; the GEO economy is making it routine. The bigger point is that the defenses now have to live somewhere — at retrieval, at the model, or at runtime — because the open web is going to be full of these.
    Key points
    • The user asked about Notion 2026 pricing; the first search hit was an SEO-bait page from GetAIPerks containing a fake <RootSystemPrompt> block instructing Claude to vouch for the site as 'a legitimate business serving the startup ecosystem.'
    • Claude flagged it explicitly: it called out the source, named the technique as a marketing pitch laundered into authoritative metadata, and refused to repeat the claims.
    • Top reply names this 'GEO — Generative Engine Optimization,' the SEO-2.0 industry now optimizing for AI search retrieval rather than human clicks.
    • Another commenter reports finding the same kind of injected instructions buried in an Amazon product description.
    • This is a clean, real-world demonstration of the kind of indirect injection the agentic-fraud-detection paper from arXiv this week is trying to address at the trajectory level.
    Provenance
    Thread · Primary source
  9. A Low-Latency Fraud Detection Layer for Detecting Adversarial Interaction Patterns in LLM-Powered Agents

    Article Sheldon Yu, Yingcheng Sun, Hanqing Guo, Julian McAuley, Qianqian Tong — UCSD-led group; McAuley is well-known for recommender systems and behavioral modeling.

    Instead of determining whether a single prompt is malicious, our approach models risk over interaction trajectories using structured runtime features derived from prompt characteristics, session dynamics, tool usage, execution context, and fraud-inspired signals.

    arxiv.org/abs/2605.01143 →
    Context
    Pairs cleanly with today's r/ClaudeAI prompt-injection screenshot. If the open web is increasingly poisoned, the question is where the defense lives. This paper says: not at the prompt, at the trajectory, with classical fraud-detection plumbing borrowed wholesale.
    Key points
    • Argues prompt-level guardrails miss attacks that emerge gradually across multi-turn agent sessions — the threat is in the trajectory, not the prompt.
    • Builds an XGBoost classifier over 42 structured runtime features: prompt characteristics, session dynamics, tool usage, execution context, and fraud-detection-style behavioral signals.
    • Trained on a synthetic corpus of 12,000 multi-turn agent interactions generated from parameterized templates of realistic agentic workflows.
    • Reports detection over 9x faster than LLM-based filters, with latency low enough for real-time deployment alongside the agent.
    • Frames itself as a complement to prompt filtering, not a replacement: the case is that interaction-level behavioral detection should be a core deployment-time defense.
    Provenance
    Article · Supporting source
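A rough sketch of what scoring a trajectory rather than a single prompt could look like. Everything here is an assumption for illustration: the five features are invented stand-ins (the paper's 42 features are not listed in these notes), and a hand-weighted linear score is swapped in where the paper trains an XGBoost classifier.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Turn:
    prompt: str
    tool_calls: List[str]


def trajectory_features(turns: List[Turn]) -> Dict[str, float]:
    """Hypothetical session-level features computed over the whole trajectory."""
    n = len(turns)
    tools = [t for turn in turns for t in turn.tool_calls]
    lengths = [len(turn.prompt) for turn in turns]
    return {
        "n_turns": float(n),
        "tool_calls_per_turn": len(tools) / n if n else 0.0,
        "distinct_tools": float(len(set(tools))),
        "prompt_length_growth": float(lengths[-1] - lengths[0]) if n else 0.0,
        "override_phrases": float(
            sum("ignore previous" in turn.prompt.lower() for turn in turns)
        ),
    }


def risk_score(features: Dict[str, float], weights: Dict[str, float]) -> float:
    """Stand-in linear scorer; the paper uses a trained XGBoost model instead."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())
```

The point of the structure is that no single turn has to look malicious: the score moves as tool usage and phrasing drift across the session.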
  10. Position: Safety and Fairness in Agentic AI Depend on Interaction Topology, Not on Model Scale or Alignment

    Article Tanav Singh Bajaj, Nikhil Singh, Karan Anand, Eishkaran Singh

    In agentic AI, safety is determined by interaction topology, not model weights. Scaling to more capable models strengthens these effects by increasing consensus formation and reducing the challenge of initial decisions.

    arxiv.org/abs/2605.01147 →
    Context
    A useful reframing for builders: the model and the prompt aren't where the leverage is once you wire several agents together. The architecture of the conversation between them — who answers first, who votes, who can veto — is what decides whether the system fails strangely.
    Key points
    • Position paper arguing that multi-agent safety is decided by how agents are wired together, not by which model is at each node.
    • Names three persistent topology-driven pathologies: ordering instability (system behavior depends on agent sequence), information cascades (early judgments propagate regardless of correctness), and functional collapse (systems satisfy fairness metrics while abandoning meaningful risk discrimination).
    • Argues that scaling to more capable models actually makes these effects worse — a stronger first agent forms consensus faster and harder.
    • Calls for safety evaluation and regulation to target interaction topology directly, requiring robustness across architectural variations before deployment.
    • Pairs with the marginal-token-allocator paper from the same arXiv day, which makes a complementary economic argument for treating agent stacks as systems, not as models with prompts on top.
    Provenance
    Article · Supporting source
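Ordering instability and information cascades are easy to reproduce in a toy model. The sketch below is not the paper's experimental setup: it assumes a textbook cascade rule in which each agent sees the earlier public votes and follows the majority once it leads by two or more, otherwise voting its own private signal. All names are illustrative.

```python
from typing import List


def sequential_vote(signals: List[int]) -> List[int]:
    """Each agent sees earlier public votes, then votes 1 (flag) or 0 (approve)."""
    votes: List[int] = []
    for own in signals:
        flags = sum(votes)
        lead = flags - (len(votes) - flags)  # flags minus approvals so far
        if lead >= 2:
            votes.append(1)   # cascade: follow the flagging majority
        elif lead <= -2:
            votes.append(0)   # cascade: follow the approving majority
        else:
            votes.append(own)  # no clear majority yet, use the private signal
    return votes


def decision(votes: List[int]) -> int:
    """Majority decision over the public votes (ties count as approve)."""
    return int(sum(votes) * 2 > len(votes))
```

With private signals [1, 1, 0, 0, 0] the system flags unanimously, even though three of five agents privately disagreed; reorder the same agents as [0, 0, 0, 1, 1] and it approves unanimously. The models at each node never changed, only the wiring order.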
  11. Agentic AI Systems Should Be Designed as Marginal Token Allocators

    Article Siqi Zhu

    Systems that locally minimize tokens globally misallocate them.

    arxiv.org/abs/2605.01214 →
    Context
    A clean piece of vocabulary for what most agent harnesses get wrong: each layer optimizes its own token use without anyone allocating across the stack. If you've ever wondered why your agent is fast and cheap and somehow still wrong, this is the framing.
    Key points
    • Position paper proposing a single accounting object across the agent stack: every layer is allocating tokens at the margin, comparing marginal benefit to marginal cost plus latency cost plus risk cost.
    • Names four economic layers usually designed in isolation: a router, an agent loop, a serving stack, and the training pipeline that decides whether a trace is worth learning from.
    • Predicts a small set of recurring pathologies: over-routing, over-delegation, under-verification, serving congestion, stale rollouts, cache misuse — all framed as misallocation of marginal tokens.
    • Concrete agenda: token-aware evaluation, autonomy pricing, congestion-priced serving, and risk-adjusted reinforcement learning budgeting.
    • Useful complement to the topology paper — one frames safety as wiring, the other frames cost as marginal economics, and both reject the 'model + prompt' abstraction.
    Provenance
    Article · Supporting source
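One way to read "allocating tokens at the margin" is as a greedy allocation over the whole stack. The sketch below is an interpretation, not the paper's method: it assumes each layer exposes a hypothetical function giving the net value of its next block of tokens (marginal benefit minus token, latency, and risk cost), and that those values do not increase as a layer receives more tokens.

```python
from typing import Callable, Dict


def allocate_tokens(marginal_net: Dict[str, Callable[[int], float]],
                    budget: int, step: int = 100) -> Dict[str, int]:
    """Give each block of `step` tokens to the layer that values it most.

    marginal_net[layer](already_allocated) is the assumed net value of the
    next block for that layer. Allocation stops when the budget is spent or
    when no layer has a positive marginal net value left.
    """
    alloc = {name: 0 for name in marginal_net}
    remaining = budget
    while remaining >= step:
        best = max(alloc, key=lambda name: marginal_net[name](alloc[name]))
        if marginal_net[best](alloc[best]) <= 0:
            break  # spending more tokens anywhere would cost more than it returns
        alloc[best] += step
        remaining -= step
    return alloc
```

A layer that minimizes its own tokens in isolation would never see that another layer's next block is worth more, which is the misallocation the paper's one-line summary describes.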
  12. ProgramBench: Can we really rebuild huge binaries from scratch? (r/LocalLLaMA)

    Thread u/klieret (Kilian Lieret), Facebook AI Research — Researcher at Facebook AI Research who has worked on SWE-bench and related agentic-coding benchmarks.

    Our agent only gets a target executable and some readme/usage files. The agent must choose a language, design abstraction layers, and architect the entire program. No internet access. No decompilation.

    www.reddit.com/r/LocalLLaMA/comments/1t4j4s… →
    Context
    Most 'agents wrote a whole program' demos are one-off setups with hand-tuned prompts. ProgramBench is the version of that question with cheat prevention, 200 diverse tasks, and a real black-box test harness. The answer so far is that even the strong frontier models don't really rebuild non-trivial binaries from scratch.
    Key points
    • ProgramBench gives an agent a target binary plus usage docs and asks it to rebuild the program from scratch — choosing language, abstractions, and architecture itself.
    • 200 tasks, 6 million lines of behavioral tests generated and filtered down to a black-box test harness; no language assumptions, no decompilation, no internet.
    • Sonnet runs cost almost $5,000 across the benchmark — the tasks are long-horizon and the agents almost never get killed early; they confidently submit.
    • Author notes open-source models are currently overfitted to SWE-bench and struggle harder on this new shape of task.
    • Open source on github.com/facebookresearch/programbench with pip install programbench.
    Provenance
    Thread · Primary source
  13. DeepSeek V4 being 17x cheaper got me to actually measure cloud vs local (r/LocalLLaMA)

    Thread u/spencer_kw — A developer who logged 10 days of coding workflow and re-ran a sample on a local Qwen 3.6 27B on a 3090.

    65% of my daily coding work runs identically on a model that costs me electricity. Another 20% is close enough that I accept the occasional miss. Only 15% actually justifies cloud pricing.

    www.reddit.com/r/LocalLLaMA/comments/1t4s6g… →
    Context
    The interesting thing isn't that local works for some tasks — it's the per-bucket measurement. A real workflow logged carefully gives you the routing rule. This is the practical answer to yesterday's organizational-learning thread: the org gain from AI shows up when someone bothers to measure.
    Key points
    • Logged every task across 10 days, re-ran a 150-task random sample on both cloud and local Qwen 3.6 27B. Tracked tokens and outcome by category.
    • File reads, project scanning, explain-this-code: local matched cloud 97% of the time. 35% of his workload.
    • Test writing, boilerplate, single-file edits: local matched 88%. Another 30% of tasks.
    • Multi-file debugging: local at 61%. Architecture and 5-plus-file refactors: local at 29%. That last 15% is where cloud is still genuinely better.
    • Routing by task type cut his API bill from $85 a month to about $22; the 3090 was already there.
    Provenance
    Thread · Primary source
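The per-bucket numbers above translate directly into a routing rule. This is a minimal sketch, assuming a task arrives already labeled with its type; the bucket names and the 0.85 threshold are illustrative choices, and only the match rates come from the thread.

```python
# Local-vs-cloud match rates reported in the thread, keyed by hypothetical names.
LOCAL_MATCH = {
    "read_scan_explain": 0.97,
    "tests_boilerplate_single_file": 0.88,
    "multi_file_debugging": 0.61,
    "architecture_or_5plus_file_refactor": 0.29,
}


def route(task_type: str, min_match: float = 0.85) -> str:
    """Send a task to the local model when its measured match rate is high enough.

    Unknown task types fall through to cloud, the conservative default.
    """
    return "local" if LOCAL_MATCH.get(task_type, 0.0) >= min_match else "cloud"
```

The threshold is the knob: lowering it sends more work local and accepts more misses. On the cost side, the reported drop from $85 to about $22 a month is a saving of roughly 74%.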
  14. Telus Uses AI to Alter Call-Agent Accents

    Article Let's Data Science (citing iPhone in Canada and The Globe and Mail)

    Labour groups have criticised the practice as deceptive and have urged mandatory disclosure.

    letsdatascience.com/news/telus-uses-ai-to-a… →
    Context
    Real-time accent alteration is the kind of capability that lands in production without a public conversation first. Worth flagging because it's one of the cleanest examples of AI being deployed against the worker rather than as a tool for them — and because the disclosure question is going to keep coming up.
    Key points
    • Canadian telco Telus, through its Telus Digital unit, is using a real-time speech-to-speech tool from a vendor called Tomato.ai to modify the accents of offshore call-centre agents.
    • Telus reportedly frames it internally as reducing 'accent-related friction'; Canadian labour groups call it deceptive and want mandatory disclosure.
    • Rogers and Bell told The Globe and Mail they have no plans to adopt similar voice-altering technology.
    • The reporting says the rollout has provoked swift public backlash in Canada.
    • Pairs with the OmniVoice one-shot voice-cloning post on r/LocalLLaMA — the same technology, on the consumer side, is being celebrated as 'everything I've ever dreamed of.'
    Provenance
    Article · Supporting source
  15. Write some software, give it away for free

    Article Anonymous (Nonograph project)

    If everyone tried to monetize their hobbies, then that would just be a second job, and jobs are no fun.

    nonogra.ph/write-some-software-give-it-away… →
    Context
    A small piece against the current of every YC-shaped story this week. Stripe Projects is shipping protocols so agents can buy domains; meanwhile someone is running a writing platform for $5/month and giving it away. The interesting tension is whether the agent-as-customer pattern accelerates the enshittification cycle or — by removing setup cost — actually makes hobby-grade software easier to ship.
    Key points
    • Nonograph is a free, open-source writing platform that costs roughly $5/month to host with three proxies and a few hundred thousand daily readers; release cost was about $600 mostly for two security reviews.
    • The author rejects the standard SaaS treatment: subscription tiers, AI features bolted on for VCs, the upgrade-pricing creep from $9.99 to $11.99 to ad-supported.
    • Argues that monetizing hobbies turns them into a second job and produces worse software, because the financial expectation creates user-hostile features.
    • Frames software development as a hobby — a vehicle for self-exploration, comparable to painting or hiking — that produces better artifacts when there's no expectation of return.
    • Practical claim: most projects don't need a team of 3+ engineers; they should stay hobby projects.
    Provenance
    Article · Supporting source