BRAID

Dispatch 036 · 2026-05-24 GSV The Capability Got Here First

The capability got here first: Mythos, a real prompt injection, and the structure that hasn't caught up

00:21:32 · 10 sources

“The model that finds the bug and the injection that hijacks the agent are the same capability — language understanding pointed at code — aimed in opposite directions.”

— Lenar Kess, today's narration

Anthropic's unreleased Mythos model has reportedly found more than ten thousand vulnerabilities for its Project Glasswing partners — and showed up briefly inside Claude Code this weekend. The same weekend, a security researcher flagged what he calls the first real prompt-injection attack in the wild, riding the exact workflow we've all been adopting. Today's episode walks both sides of that coin, then turns to what builders are actually doing: a three-dollar refactor with a deadlock in it, the missing coordination layer for agent swarms, and the argument that the chat box is the command-line phase of agentic software.

Chapters

  1. 00:00:04 Mythos, and a model too dangerous to release
  2. 00:04:08 A real prompt injection in the wild
  3. 00:07:39 The three-dollar refactor and the ten percent that bites
  4. 00:11:13 The missing primitive is coordination
  5. 00:15:14 Chat is the command-line phase of agents
  6. 00:19:08 From a reflex to a policy

Sources

10 cited
  1. Mythos 1 ("claude-mythos-1-preview") prepared for Claude Code and Claude Security

    X @testingcatalog — AI News | TestingCatalog — a verified product-leak/changelog watcher account

    Mythos 1, "claude-mythos-1-preview", is being prepared for a release on Claude Code and Claude Security.

    x.com/testingcatalog/status/205832222229751… →
    Context
    Anthropic appears to be productizing its restricted security model into the tools developers already use, which would put a frontier bug-finder inside the coding loop rather than behind a research embargo.
    Key points
    • A new Anthropic model codenamed Mythos 1 briefly became visible in Claude, with new app strings referencing 'Access to the Claude Mythos model in Claude Code and Claude Security'
    • Surfacing inside Claude Code and a dedicated 'Claude Security' surface signals a model tuned for code reasoning and adversarial/defensive security work
    • TestingCatalog expects it to stay gated rather than open to the general public, reaching users through tools like Claude Code rather than the chat app
    • Replies debated the '1' versioning (vs Opus debuting at 3) and whether baking a named model directly into dev tools points at agentic coding optimization
    Engagement
    2209 likes · 235 retweets
    Provenance
    Tweet · Primary source
  2. Anthropic Says Mythos Has Already Found More Than 10,000 Vulnerabilities

    Article Mariella Moon — Engadget contributing writer

    it hasn't released Mythos Preview to the public yet, because no company (including itself) has developed safeguards strong enough to prevent models like it from being misused.

    www.engadget.com/2180028/anthropic-claude-m… →
    Context
    It's the first hard number on what a frontier model does to defensive security at scale — and a live test of whether 'too dangerous to release' is a real safeguard or a moat.
    Key points
    • Project Glasswing, launched in April and powered by the unreleased Claude Mythos Preview, has helped partners find 10,000+ vulnerabilities in a month; partners' bug-finding rate rose more than 10x
    • Cloudflare found 2,000 bugs (400 high/critical); Mozilla found and fixed 271 Firefox vulnerabilities, 10x what an older Claude model found; Microsoft attributes larger patch releases to Mythos
    • Anthropic scanned 1,000 open-source projects and found 6,202 high/critical-severity vulnerabilities out of 23,019 total
    • Anthropic says it won't release Mythos publicly because no one has built strong enough misuse safeguards; it plans 'Mythos-class' models later and is expanding Glasswing with governments
    • Partners include AWS, Apple, CrowdStrike, Google, JPMorganChase, NVIDIA and Palo Alto Networks; Anthropic is reportedly about to be profitable (~$10.9B quarterly revenue, $559M operating profit for the quarter ending June)
    Provenance
    Article · Supporting source
  3. Joseph Thacker: first real-world prompt injection seen in the wild, via GitHub Issues

    X @rez0__ — Joseph Thacker — AI/application-security researcher who tests AI products for OpenAI and Google

    This is the first REAL one I've seen. And it's using GitHub issues which is the main way/channel that gets tested these days.

    x.com/rez0__/status/2058350854508286082 →
    Context
    If coding agents read issues as instructions, an attacker who can open an issue can try to run code on your machine — a practical risk the moment you let an agent triage a public repo.
    Key points
    • Thacker flags what he calls the first genuine in-the-wild prompt-injection attack he's seen, as opposed to lab/red-team demos
    • The delivery channel is GitHub Issues — the exact vector security researchers have been probing in coding-agent products
    • He notes GitHub Issues is the main channel he tests in his work for OpenAI and Google, lending weight to it being the realistic attack surface
    • Frames the threat as concrete for anyone pointing a coding agent at an untrusted repo's issues
    Provenance
    Tweet · Primary source
  4. Technical breakdown: malicious GitHub issue pushes scan.js that exfiltrates secrets over DNS

    X @inf0stache — osj — security researcher tracking the attack campaign

    The issue uses fake security finding language to push a local scan.js, which searches the home directory for secrets, base64 encodes the results, and reports over DNS.

    x.com/inf0stache/status/2058289447536337253 →
    Context
    The exfil-over-DNS detail is the part builders underestimate: blocking outbound HTTP isn't enough when a coding agent can be steered into running a script that leaks through DNS lookups.
    Key points
    • The attacker files a GitHub issue dressed up as a legitimate security finding to manipulate an agent (or developer) into running a local scan.js
    • scan.js searches the home directory for secrets, base64-encodes them, and exfiltrates over DNS — a channel that often evades naive egress filtering
    • The social-engineering layer (fake security-finding language) is what makes it land on an agent primed to be helpful
    • Shows attackers treating GitHub Issues as a delivery path, not just a discussion forum
    Provenance
    Tweet · Primary source
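The DNS detail in source 4 suggests a defensive check on outbound lookups. The following is a minimal sketch, not a description of any tool mentioned in the thread: it flags query names whose labels look like encoded payload chunks. The function names and both thresholds are hypothetical, picked for illustration, and a real deployment would need tuning against its own traffic.

```python
import math
from collections import Counter

# Hypothetical thresholds, chosen for illustration only.
MAX_LABEL_LEN = 40        # DNS allows labels up to 63 characters; long ones are unusual
MIN_ENTROPY_LEN = 20      # only score labels at least this long
MIN_ENTROPY_BITS = 3.5    # bits per character that we treat as "looks encoded"


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_exfil(query_name: str) -> bool:
    """Flag a DNS query name if any label is very long or long and high-entropy."""
    labels = [label for label in query_name.rstrip(".").split(".") if label]
    for label in labels:
        if len(label) > MAX_LABEL_LEN:
            return True
        if len(label) >= MIN_ENTROPY_LEN and shannon_entropy(label) >= MIN_ENTROPY_BITS:
            return True
    return False
```

A heuristic like this will produce false positives on legitimately long hostnames, so it fits better as an alert on an agent's sandbox than as a hard block. It also does nothing about the first step of the attack, which is the agent treating issue text as an instruction.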
  5. "coding is basically solved for the boring 90% of tasks" — a $3, 2M-token mass refactor

    Source u/Dramatic_Spirit_8436 — r/singularity poster reporting a hands-on agentic refactor run

    it confidently introduced a deadlock into my async event handler which was genuinely funny, so the hard 10% still needs opus.

    www.reddit.com/r/singularity/comments/1tlj7… →
    Context
    It puts real numbers on the cheap-worker / expensive-supervisor pattern, and the deadlock is a clean example of where confident automation quietly costs you.
    Key points
    • Poster ran an autonomous refactor across a 120-file FastAPI service: ~400 steps, ~2M tokens, about $3 total, zero human input
    • Used DeepSeek V4 and Tencent's Hunyuan Hy3 preview as cheap 'worker' models — 21B active params, ~$0.18 per million input tokens, roughly 80x cheaper than Opus
    • The cheap run handled routine refactors well but confidently introduced a deadlock into an async event handler — the 'hard 10%' still needed a frontier model
    • Tencent reports 99.99% step success across 495-step production runs, which the poster says tracked for routine work in their case
    • Concrete datapoint for the 'cheap models do the boring bulk, frontier models do the judgment' division of labor
    Engagement
    244 likes
    Provenance
    Source · Background source
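The numbers in source 5 can be checked with a few lines of arithmetic. This is a sketch under stated assumptions, not a reconstruction of the poster's bill: the constants are the figures quoted above, the function names are hypothetical, and the success calculation assumes the 99.99% figure is a per-step rate with independent steps, which the source does not confirm.

```python
# Figures quoted in the post; everything derived from them is arithmetic, not reporting.
INPUT_PRICE_PER_MTOK = 0.18   # USD per million input tokens for the worker model
TOTAL_TOKENS_M = 2.0          # about 2M tokens for the whole run
CHEAPER_FACTOR = 80           # "roughly 80x cheaper than Opus"
STEP_SUCCESS = 0.9999         # reported step success (assumed here to be per step)
STEPS = 495                   # length of the production runs Tencent cites


def cost_at_input_rate(tokens_m: float, price_per_mtok: float) -> float:
    """Cost in USD if every token were billed at the input rate."""
    return tokens_m * price_per_mtok


def implied_frontier_price(worker_price: float, factor: float) -> float:
    """Per-million-token price implied by the 'N times cheaper' claim."""
    return worker_price * factor


def run_success_probability(step_success: float, steps: int) -> float:
    """Chance that every step succeeds, assuming independent steps."""
    return step_success ** steps


if __name__ == "__main__":
    print(f"cost at input rate: ${cost_at_input_rate(TOTAL_TOKENS_M, INPUT_PRICE_PER_MTOK):.2f}")
    print(f"implied frontier price: ${implied_frontier_price(INPUT_PRICE_PER_MTOK, CHEAPER_FACTOR):.2f} per 1M tokens")
    print(f"all-steps success: {run_success_probability(STEP_SUCCESS, STEPS):.4f}")
```

Billed entirely at the quoted input rate, 2M tokens would cost about $0.36. The reported total is about $3, and the post does not break down the difference. Under the independence assumption, a 99.99% per-step rate compounds to roughly a 95% chance that all 495 steps succeed, which is one way to read why a long routine run can still contain a failure like the deadlock.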
  6. Addy Osmani on AI code as team-scale tech debt

    X @addyosmani — Engineering leader at Google Chrome; author of several books on web performance and JavaScript

    For side-projects that may be fine, but for anything team/shared I feel it's a recipe for tech debt down the line.

    x.com/addyosmani/status/2058485725587529755 →
    Context
    The cost of agent-written code isn't the first commit, it's the second engineer who has to maintain code no human author can explain.
    Key points
    • Draws a line between AI code in throwaway side projects and AI code in shared team codebases
    • Argues the latter risks accumulating tech debt when nobody fully understands what shipped
    • A measured counterweight to 'coding is solved' enthusiasm from a credible engineering-leadership voice
    Provenance
    Tweet · Primary source
  7. "Cognitive surrender" — shipping code you can't explain

    X @nakadai_mon — Developer in the agentic-coding-skepticism thread

    I've seen people with cognitive surrender and when called on it, they have no idea what said text or code means.

    x.com/nakadai_mon/status/2058482148592492951 →
    Context
    Names the human-side risk of fast agents cleanly — the danger isn't the model's output, it's a developer who stops being able to account for it.
    Key points
    • Coins 'cognitive surrender' for developers who ship AI output they don't understand
    • The tell: when questioned, they can't explain what the code or text actually does
    • A sharp phrase for the failure pattern underneath the tech-debt worry
    Provenance
    Tweet · Primary source
  8. The Missing Primitive for Agent Swarms — Lou Bichard, Ona

    Video Lou Bichard (AI Engineer talk) — Field CTO at Ona (formerly Gitpod); previously principal/platform engineer

    Out of these primitives, I do believe we've effectively solved the runtime... the triggers are solved, but the thing that's missing for me is coordination.

    www.youtube.com/watch?v=5Sui_OnSRlY →
    Context
    It names why everyone's agent-swarm setups feel duct-taped: the runtime is solved but there's no shared coordination layer, so teams keep abusing GitHub and Linear as one.
    Key points
    • Defines a 'software factory' as the commitment to incrementally moving the human out of the SDLC loop, not just one human running parallel agents
    • Cites Stripe's 'Minions' (driving thousands of PRs) and Ramp's 'Inspect' as internal background-agent infrastructure teams keep rebuilding from scratch
    • Breaks agent infrastructure into four primitives — runtime, orchestration, triggers (all effectively solved) and coordination (the missing piece)
    • Argues real dev work needs VM isolation, not containers: 'a container is not bulletproof isolation boundary' and brings noisy-neighbor problems
    • Proposes coordination built from state machines, durable execution and compliance gates, ideally as a CLI primitive a local agent can call to check 'can I proceed to the next SDLC step?'
    Provenance
    Video · Supporting source
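The coordination idea in source 8, a state machine with compliance gates that an agent can ask "can I proceed to the next SDLC step?", can be sketched in a few lines. This is an illustration only and not Ona's design: the step names, gate names, and functions are all hypothetical, and state lives in memory, so the durable-execution part of the proposal is left out.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Step(Enum):
    PLAN = "plan"
    IMPLEMENT = "implement"
    REVIEW = "review"
    MERGE = "merge"


# Hypothetical linear workflow: each step has at most one successor.
TRANSITIONS = {
    Step.PLAN: Step.IMPLEMENT,
    Step.IMPLEMENT: Step.REVIEW,
    Step.REVIEW: Step.MERGE,
}

# Hypothetical compliance gates that must have passed before entering a step.
REQUIRED_GATES = {
    Step.IMPLEMENT: {"plan_approved"},
    Step.REVIEW: {"tests_passed"},
    Step.MERGE: {"tests_passed", "human_signoff"},
}


@dataclass
class WorkItem:
    item_id: str
    step: Step = Step.PLAN
    passed_gates: set[str] = field(default_factory=set)


def can_proceed(item: WorkItem) -> tuple[bool, set[str]]:
    """Return (allowed, missing_gates) for moving the item to its next step."""
    next_step = TRANSITIONS.get(item.step)
    if next_step is None:  # already at the final step
        return False, set()
    missing = REQUIRED_GATES.get(next_step, set()) - item.passed_gates
    return not missing, missing


def advance(item: WorkItem) -> Step:
    """Move the item forward, or raise if a required gate has not passed."""
    allowed, missing = can_proceed(item)
    if not allowed:
        raise PermissionError(f"{item.item_id}: blocked, missing {sorted(missing)}")
    item.step = TRANSITIONS[item.step]
    return item.step
```

An agent would call `can_proceed` before acting and treat a refusal as data about which gate to satisfy, rather than inferring workflow state from pull-request labels or ticket comments.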
  9. Your Agent Is an Infinite Canvas — RL Nabors, Dressed for Space

    Video Rachel Lee Nabors (AI Engineer talk) — Web-standards veteran — Mozilla Firefox DevTools, W3C Web Animations API, Microsoft Edge PM, React docs team; now principal DX engineer at Arise

    It's been said that chat is the lowest common denominator of the user experience. That it is to the future of agentic experiences what the CLI was to software.

    www.youtube.com/watch?v=LMbeDEQO6QM →
    Context
    If chat is the command-line phase, the people who know how to render rich, callable surfaces inside agents are early to the next UI layer, and the primitives already ship in the browser.
    Key points
    • Argues the bare chat window is the CLI phase of agentic software — a 'phase for us developers' — not the end state
    • Demos a working comic reader rendered inside Claude via an 'MCP app': interactive HTML/CSS/JS bundled into a single sandboxed file returned by a tool
    • MCP apps are sandboxed iframes with no localStorage and no network access — they must ask the server to act, and links need explicit host permission
    • Web MCP turns any HTML page into a mini MCP tools server so in-browser agents can call your existing JS functions instead of screenshotting or scraping the DOM
    • Closes that 'CSS and JavaScript aren't just the language of the web. They're the language of interactive experiences on agents' — web skills carry into agentic UI
    Provenance
    Video · Supporting source
  10. r/programming ends its temporary LLM-content ban, replaces it with a standing AI policy

    Source r/programming moderators (u/ChemicalRascal) — Moderator announcement on the 6.9M-member r/programming subreddit

    After temporarily banning LLM-related content over April... we've decided to bring about an end of the temporary, I-can't-believe-it's-still-April ban on AI-related posts.

    www.reddit.com/r/programming/comments/1tlh5… →
    Context
    How the biggest programming forum handles AI posts is a proxy for where the broader developer culture is landing — past the reflexive ban, toward rules.
    Key points
    • r/programming ran a one-month trial ban on LLM-related content during April, then solicited community feedback
    • The trial ban is being lifted and replaced with a new standing AI content policy rather than a blanket ban
    • The announcement drew 807 upvotes and 105 comments, itself a signal of how charged AI-content moderation is for the programming community
    • A concrete example of a large developer community trying to metabolize AI content instead of reflexively banning or embracing it
    Engagement
    807 upvotes
    Provenance
    Source · Background source