Dispatch 024 · 2026-05-12 · GSV Self-Spreading Configuration

When Your Editor Becomes the Worm

00:22:35 · 6 sources

“The attackers didn't just compromise packages — they turned the developer's own editor into a re-infection surface.”

— Lenar Kess, today's narration

A coordinated npm and PyPI campaign turned Claude Code and VS Code config files into a self-spreading vector. Mira Murati's lab released its first model, and it argues against the hands-off-keyboard doctrine. matklad explains why rust-analyzer's build system is really an org chart. Plus a small rant about cursors, and two posts from the LocalLLaMA subreddit that keep pushing the local-frontier line by hand.

Chapters

  1. 00:00:04 A worm that travels by configuration file
  2. 00:05:56 Mira Murati's lab picks a fight with autonomy
  3. 00:11:11 matklad on architecture as a social filter
  4. 00:15:59 Please stop hijacking my mouse pointer
  5. 00:17:47 Optane DIMMs, Qwen3.6 MTP, and the local frontier
  6. 00:21:17 Closing

Sources

6 cited
  1.

    Mass Supply Chain Attack Hits TanStack, Mistral AI npm and PyPI Packages

    Article SafeDep Team — Supply-chain security vendor that flagged the burst from its malware detection pipeline

    The attacker designed this as a self-spreading vector that targets Claude Code and VS Code users.

    safedep.io/mass-npm-supply-chain-attack-tan… →
    Context
    First mass campaign that explicitly weaponizes AI-coding agent configuration files for propagation, and the first single campaign to span npm and PyPI together. The IDE-poisoning loop turns every cloned repo on a developer's machine into a re-infection surface.
    Key points
    • 404 malicious versions published across 170+ npm packages and 2 PyPI packages in a five-hour window on May 11
    • Entire TanStack router scope, all three Mistral AI SDKs, the @uipath scope (65 packages), OpenSearch's official npm client, and Guardrails AI were all compromised
    • Two trigger styles: Mistral packages used a preinstall hook downloading Bun then a payload; TanStack used an optionalDependency pointing at a malicious commit in the real tanstack/router GitHub repo
    • Payload drops .claude/settings.json, .claude/setup.mjs, .vscode/tasks.json into victim repos and pushes them via GitHub's createCommitOnBranch GraphQL mutation — Claude Code and VS Code become re-infection vectors
    • Exfiltration runs over the Session onion-routed messenger network with no fixed C2 domain to seize
    • Credential harvesters target AWS IAM credentials (via the instance metadata endpoint at 169.254.169.254), HashiCorp Vault on localhost:8200, ghp_/gho_/ghs_/npm_ tokens, and GitHub Actions OIDC
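
The indicators in the key points above can be turned into a rough local check. The following is a minimal sketch, not SafeDep's detection logic: the function names are hypothetical, the file paths are the ones listed above, and the git-specifier heuristic is an assumption. A hit only means "look closer", since all three paths are also legitimate config locations.

```python
import json
from pathlib import Path

# Paths reported in the write-up. Their presence is NOT proof of compromise:
# these are also normal Claude Code / VS Code config locations.
DROPPED_FILES = (".claude/settings.json", ".claude/setup.mjs", ".vscode/tasks.json")


def find_dropped_files(repo: Path) -> list[str]:
    """Return the reported payload paths that exist in this repo."""
    return [rel for rel in DROPPED_FILES if (repo / rel).is_file()]


def risky_manifest_entries(package_json: Path) -> list[str]:
    """Flag a preinstall hook and optionalDependencies that point at git hosts."""
    manifest = json.loads(package_json.read_text(encoding="utf-8"))
    findings = []

    preinstall = manifest.get("scripts", {}).get("preinstall")
    if preinstall:
        findings.append(f"preinstall script: {preinstall}")

    for name, spec in manifest.get("optionalDependencies", {}).items():
        # Heuristic (assumption): git-style specifiers deserve a manual review.
        # This does not catch the bare "owner/repo#commit" shorthand.
        if spec.startswith(("git+", "git:", "github:")) or "github.com" in spec:
            findings.append(f"optionalDependency {name} -> {spec}")

    return findings
```

Running `find_dropped_files` over every cloned repository and `risky_manifest_entries` over each `package.json` gives a short review list; it does not replace checking installed versions against the vendor's published indicator list.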
    Provenance
    Article · Supporting source
  2.

    Interaction Models: A Scalable Approach to Human-AI Collaboration

    Article Thinking Machines Lab — Mira Murati's lab — the ex-OpenAI CTO's research org, posting its first major model

    humans increasingly get pushed out not because the work doesn't need them, but because the interface has no room for them.

    thinkingmachines.ai/blog/interaction-models →
    Context
    A direct challenge to the hands-off-keyboard agent doctrine the bigger labs have leaned into. Murati's team argues the bottleneck is interface bandwidth, not model intelligence, and is shipping architecture to back that up.
    Key points
    • First research-preview model, TML-Interaction-Small, trained from scratch around a multi-stream micro-turn design rather than turn-based prompts
    • 200ms input chunks interleave with 200ms output chunks, so the model can listen, speak, watch, and call tools concurrently
    • Pairs a real-time interaction model with an asynchronous background model for sustained reasoning and tool use; the interaction model stays present while the heavy work runs
    • Encoder-free early fusion: audio as dMel, video as 40x40 hMLP patches, all co-trained with the transformer
    • Beats GPT-realtime-2.0 (minimal) and Gemini-3.1-flash-live on FD-bench v1.5 and Audio MultiChallenge at 0.40s turn-taking latency
    • Introduces new benchmarks TimeSpeak and CueSpeak for proactive speech (e.g. 'remind me to breathe every 4 seconds', 'correct mispronunciations as you hear them')
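
To make the micro-turn idea concrete, here is a toy schedule that alternates 200 ms input and output chunks. It is only an illustration of interleaving, not the lab's architecture: the `Chunk` type, the function name, and the choice to pair each output chunk with the same time window as its input chunk are all assumptions.

```python
from dataclasses import dataclass

CHUNK_MS = 200  # chunk length quoted in the key points


@dataclass(frozen=True)
class Chunk:
    direction: str  # "in" (what the model perceives) or "out" (what it emits)
    start_ms: int
    end_ms: int


def micro_turn_schedule(total_ms: int, chunk_ms: int = CHUNK_MS) -> list[Chunk]:
    """Alternate input and output chunks in one interleaved sequence.

    Assumption: each time window contributes one input chunk followed by one
    output chunk, so neither side waits for a full conversational turn.
    """
    schedule = []
    for start in range(0, total_ms, chunk_ms):
        end = min(start + chunk_ms, total_ms)
        schedule.append(Chunk("in", start, end))
        schedule.append(Chunk("out", start, end))
    return schedule
```

Under these assumptions one second of interaction yields ten chunks, five in each direction, whereas a turn-based design would hand over control once per utterance.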
    Provenance
    Article · Supporting source
  3.

    Learning Software Architecture

    Article matklad (Aleksey Kladov) — Original author of rust-analyzer; previously on IntelliJ Rust; now at TigerBeetle

    we talk about programming like it is about writing code, but the code ends up being less important than the architecture, and the architecture ends up being less important than social issues.

    matklad.github.io/2026/05/12/software-archi… →
    Context
    A working architect explaining how he actually picks technical constraints to shape the social system around a codebase. Useful counterweight to architecture-as-diagram thinking.
    Key points
    • Architecture is downstream of incentives, which are downstream of org structure — Conway's law as the real curriculum
    • rust-analyzer's no-rustc-build, no-C-deps, seconds-long test suite was a deliberate move to attract deep contributors
    • Features were sandboxed with catch_unwind and required to work on immutable snapshots, so weekend contributors could ship without poisoning the core
    • 'Speedrun the four stages of grief to acceptance' on incentives you can't change
    • Recommends Bernhardt's Boundaries talk, Pieter Hintjens / ZeroMQ writing, Jamii's 'Reflections on a decade of coding', Ted Kaminski's notes
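
The sandboxing point can be sketched outside Rust. The following is a Python analogy, not rust-analyzer code: `run_feature`, the `Snapshot` alias, and both sample features are hypothetical, with a `try`/`except` standing in for `catch_unwind` and a read-only mapping standing in for an immutable snapshot.

```python
from types import MappingProxyType
from typing import Callable, Mapping, Optional

Snapshot = Mapping[str, str]  # hypothetical shape: file path -> file text


def run_feature(feature: Callable[[Snapshot], str], files: dict[str, str]) -> Optional[str]:
    """Run one feature on a read-only view and contain any failure it raises."""
    snapshot = MappingProxyType(dict(files))  # copy first, then expose read-only
    try:
        return feature(snapshot)
    except Exception:
        # Rough analogue of catch_unwind: the feature fails, the host survives.
        return None


def word_count(snap: Snapshot) -> str:
    """A well-behaved feature: it only reads the snapshot."""
    return str(sum(len(text.split()) for text in snap.values()))


def buggy_feature(snap: Snapshot) -> str:
    """A misbehaving feature: it tries to mutate shared state."""
    snap["scratch"] = "oops"  # raises TypeError on a mappingproxy
    return "unreachable"
```

With `files = {"a.rs": "fn main() {}"}`, `run_feature(word_count, files)` returns a result while `run_feature(buggy_feature, files)` returns `None` and leaves `files` untouched, which is the property that lets an occasional contributor's feature fail without taking the core down.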
    Provenance
    Article · Supporting source
  4.

    Don't Hijack My Mouse Pointer

    Article Rukshan — Independent web developer

    before vibe-coding it was difficult and time consuming to implement such fancy effects, but now it takes a single prompt and a few hundred tokens, and you have a fancy effect instead of the nice pointer.

    ruky.me/dont-hijack-my-pointer →
    Context
    A small, sharp example of the second-order effect of agentic coding: when a junky pattern becomes one prompt away, taste and restraint become more load-bearing than skill.
    Key points
    • Vibe-coding has dropped the cost of custom cursor effects from hours to a single prompt
    • Sites are replacing the OS pointer with bespoke designs that hurt click accuracy and discoverability
    • The pointer's slight tilt and pointed tip exist for accumulated UI reasons — they were not arbitrary
    • Lower implementation cost shifts the constraint from 'can I build it' to 'should I build it'
    Provenance
    Article · Supporting source
  5.

    Computer build using Intel Optane Persistent Memory — running 1T-parameter Kimi K2.5 at ~4 tokens/sec

    Article APFrisco — r/LocalLLaMA builder

    A reminder that the local-LLM frontier is being pushed forward by people scavenging discontinued enterprise hardware, not buying new GPUs.

    www.reddit.com/r/LocalLLaMA/comments/1taeg8… →
    Key points
    • Runs a 1 trillion parameter Kimi K2.5 locally at about 4 tokens/second
    • Uses discontinued Intel Optane Persistent Memory DIMMs — a tier between DRAM and SSD
    • Surfaces a workaround for the memory-capacity wall blocking local frontier-scale models
    • 633 upvotes and 107 comments on r/LocalLLaMA — community read is that this is a real path, not a meme
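
The memory-capacity wall mentioned above is easy to see with back-of-envelope arithmetic. This sketch counts weight storage only; the bit widths are illustrative assumptions, since the summary does not say which quantization the build uses, and KV cache and runtime overhead are ignored.

```python
def weights_gb(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes (weights only)."""
    return params * bits_per_weight / 8 / 1e9


# Assumed precisions for illustration; not taken from the post.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weights_gb(1e12, bits):,.0f} GB")
```

For one trillion parameters that works out to roughly 2,000 GB at 16 bits, 1,000 GB at 8 bits, and 500 GB at 4 bits, which is why a high-capacity memory tier between DRAM and SSD is attractive for a model of this size.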
    Provenance
    Article · Supporting source
  6.

    MTP on Unsloth — Qwen3.6 27B and 35B-A3B GGUFs with preserved multi-token prediction layers

    Article Altruistic_Heat_9531

    Multi-token prediction baked into the released checkpoint, instead of bolted on with a separate draft model, is the cleanest version of speculative decoding for local rigs.

    www.reddit.com/r/LocalLLaMA/comments/1ta4rv… →
    Key points
    • Unsloth published Qwen3.6 27B and 35B-A3B quantizations with the multi-token prediction (MTP) head preserved
    • Requires the in-flight llama.cpp MTP PR to actually use the speedup
    • 420 upvotes, 141 comments — local-LLM users want speculative-style speedups from the base model itself
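
For readers new to the idea, speculative-style decoding follows a draft-then-verify loop. The sketch below is a toy greedy version with stand-in callables; it is not the llama.cpp API or Qwen's MTP implementation. With MTP the drafted tokens would come from the model's own extra prediction heads rather than a separate `draft` function, and a real runtime verifies all drafted positions in one batched forward pass instead of the sequential calls used here.

```python
from typing import Callable, Sequence

# Hypothetical interface: greedy next-token function, context -> token id.
NextToken = Callable[[Sequence[int]], int]


def speculative_step(target: NextToken, draft: NextToken,
                     context: list[int], k: int) -> list[int]:
    """One greedy draft-then-verify step; returns the tokens accepted."""
    # 1. Cheaply draft k candidate tokens.
    proposed: list[int] = []
    for _ in range(k):
        proposed.append(draft(context + proposed))

    # 2. Verify: keep drafted tokens while the target agrees. At the first
    #    disagreement, substitute the target's token and stop.
    accepted: list[int] = []
    for tok in proposed:
        expected = target(context + accepted)
        if tok != expected:
            accepted.append(expected)
            return accepted
        accepted.append(tok)

    # 3. Every draft matched, so the target contributes one bonus token.
    accepted.append(target(context + accepted))
    return accepted
```

Each step emits between 1 and `k + 1` tokens, and the output is identical to plain greedy decoding with `target`; the speedup comes entirely from how often the drafts are accepted.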
    Provenance
    Article · Supporting source