<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
  xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
  xmlns:podcast="https://podcastindex.org/namespace/1.0"
  xmlns:content="http://purl.org/rss/1.0/modules/content/"
  xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>Braid</title>
<link>https://braid.opentangle.com/braid</link>
<atom:link href="https://braid.opentangle.com/braid/feed.xml" rel="self" type="application/rss+xml" />
<atom:link href="https://podcasts.apple.com/us/podcast/braid/id1895422286" rel="related" type="text/html" title="Apple Podcasts" />

<atom:link href="https://overcast.fm/itunes1895422286" rel="related" type="text/html" title="Overcast" />

<atom:link href="https://pca.st/dalw7c4a" rel="related" type="text/html" title="Pocket Casts" />

<atom:link href="https://open.spotify.com/show/6qDfJxo7qWuNsQMcNp4Uhu" rel="related" type="text/html" title="Spotify" />

<language>en</language>
<description>A daily dispatch from the near future: AI news, agentic coding practice, and the power struggles shaping intelligence.</description>
<itunes:summary>A daily dispatch from the near future: AI news, agentic coding practice, and the power struggles shaping intelligence.</itunes:summary>
<itunes:author>Lenar Kess · Damra Vol</itunes:author>

<itunes:subtitle>A daily dispatch from the near future</itunes:subtitle>
<itunes:explicit>false</itunes:explicit>
<itunes:type>episodic</itunes:type>
<itunes:image href="https://braid.opentangle.com/braid/images/cover.png" />
<itunes:owner>
<itunes:name>Marcus Vorwaller</itunes:name>
<itunes:email>marcus@vorwaller.net</itunes:email>
</itunes:owner>
<itunes:category text="Technology" />
<itunes:category text="News" />
<image>
<url>https://braid.opentangle.com/braid/images/cover.png</url>
<title>Braid</title>
<link>https://braid.opentangle.com/braid</link>
</image>
<item>
<title>When Access Becomes an Operating Constraint</title>
<link>https://braid.opentangle.com/braid/episodes/2026-06-15.html</link>
<guid isPermaLink="false">2026-06-15</guid>
<pubDate>Mon, 15 Jun 2026 13:00:26 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Monday&apos;s Braid follows the same dependency from three angles: frontier model access is becoming political, agent reliability is moving into runtime controls, and policy is showing up as procurement rules and platform obligations. Axios and The Verge add reporting on Anthropic&apos;s Fable and Mythos shutdown, which keeps the weekend&apos;s model-access story focused on communication, procurement, and government pressure rather than another broad retelling. When Errors Become Narratives, the GNN tool-deference paper, and Minim turn the agent segment toward runtime behavior: plausible false outputs, indiscriminate tool trust, and privacy filtering before UI state leaves a device. The UK DSIT letter and Japan&apos;s Digital Agency guideline show AI policy arriving through regulator instructions and government purchasing practice. Techmeme&apos;s Enflame item, Rest of World&apos;s China open-source interview, and Two Minute Papers on Nvidia&apos;s open-weight model keep the open-model question grounded in chips, distribution, and local fallbacks. TechCrunch, Techmeme&apos;s FBI and Google item, and Al Jazeera round out the episode with labor pressure and concrete misuse stories that shouldn&apos;t be flattened into generic AI discourse.</description>

<content:encoded><![CDATA[<p>Monday's Braid follows the same dependency from three angles: frontier model access is becoming political, agent reliability is moving into runtime controls, and policy is showing up as procurement rules and platform obligations.</p><ul><li><a href="https://www.axios.com/2026/06/15/anthropic-white-house-fable-mythos">Axios</a> and <a href="https://www.theverge.com/ai-artificial-intelligence/949644/china-white-house-anthropic-mythos">The Verge</a> add reporting on Anthropic's Fable and Mythos shutdown, which keeps the weekend's model-access story focused on communication, procurement, and government pressure rather than another broad retelling.</li><li><a href="https://arxiv.org/abs/2606.14589">When Errors Become Narratives</a>, <a href="https://arxiv.org/abs/2606.14476">the GNN tool-deference paper</a>, and <a href="https://arxiv.org/abs/2606.13949">Minim</a> turn the agent segment toward runtime behavior: plausible false outputs, indiscriminate tool trust, and privacy filtering before UI state leaves a device.</li><li><a href="https://www.gov.uk/government/publications/june-progress-statement-letter-from-dsit-secretary-of-state-to-ofcom-chair-and-ceo">The UK DSIT letter</a> and <a href="https://www.digital.go.jp/news/decb64eb-f26e-41cb-8d37-f3dd173108b8">Japan's Digital Agency guideline</a> show AI policy arriving through regulator instructions and government purchasing practice.</li><li><a href="https://www.techmeme.com/260615/p11">Techmeme's Enflame item</a>, <a href="https://restofworld.org/2026/tiezhen-wang-china-us-open-source-ai/">Rest of World's China open-source interview</a>, and <a href="https://www.youtube.com/watch?v=zJvN8PDX1is">Two Minute Papers on Nvidia's open-weight model</a> keep the open-model question grounded in chips, distribution, and local fallbacks.</li><li><a href="https://techcrunch.com/2026/06/15/the-ai-layoff-wave-is-becoming-a-powder-keg/">TechCrunch</a>, <a 
href="https://www.techmeme.com/260615/p10">Techmeme's FBI and Google item</a>, and <a href="https://www.aljazeera.com/features/2026/6/15/looked-so-real-how-ai-is-being-weaponised-against-indias-muslim-women?traffic_source=rss">Al Jazeera</a> round out the episode with labor pressure and concrete misuse stories that shouldn't be flattened into generic AI discourse.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-06-15.mp3" length="18757935" type="audio/mpeg" />
<itunes:duration>00:17:51</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-06-15.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-15.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-15-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-06-15-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>When Model Access Becomes Political</title>
<link>https://braid.opentangle.com/braid/episodes/2026-06-14.html</link>
<guid isPermaLink="false">2026-06-14</guid>
<pubDate>Sun, 14 Jun 2026 13:00:27 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Today’s episode follows the Anthropic Fable and Mythos cutoff from the first shock into the harder questions: who triggered the action, what the technical evidence actually proves, and what builders do when access to frontier models becomes a policy-dependent dependency. The Verge and Axios give the follow-up reporting on Amazon, the White House, and Anthropic, which moves the story from access shock into a dispute over evidence, trust, and export control. TechCrunch tracks the India reaction, where the cutoff becomes a national-dependence question rather than only a vendor incident. Techmeme’s labor roundup points to employers citing AI in about 88,000 U.S. job cuts through May, a number that is useful only if we separate layoff attribution from proven causality. Indian Express and TechCrunch cover KPMG pulling an AI report over apparent hallucinations and fake citations, a practical warning about professional evidence chains. Axios anchors the infrastructure section on power-market strain, while OpenRouter’s Fusion announcement and the weekend developer-tool links show how builders are adapting in smaller, more immediate ways.</description>

<content:encoded><![CDATA[<p>Today’s episode follows the Anthropic Fable and Mythos cutoff from the first shock into the harder questions: who triggered the action, what the technical evidence actually proves, and what builders do when access to frontier models becomes a policy-dependent dependency.</p><ul><li><a href="https://www.theverge.com/ai-artificial-intelligence/949601/amazon-anthropic-fablemythos-government-ban">The Verge</a> and <a href="https://www.axios.com/2026/06/13/anthropic-amazon-white-house">Axios</a> give the follow-up reporting on Amazon, the White House, and Anthropic, which moves the story from access shock into a dispute over evidence, trust, and export control.</li><li><a href="https://techcrunch.com/2026/06/13/as-anthropic-suspends-access-to-new-models-india-debates-its-ai-future/">TechCrunch</a> tracks the India reaction, where the cutoff becomes a national-dependence question rather than only a vendor incident.</li><li><a href="https://www.techmeme.com/260614/p3">Techmeme’s labor roundup</a> points to employers citing AI in about 88,000 U.S. job cuts through May, a number that is useful only if we separate layoff attribution from proven causality.</li><li><a href="https://indianexpress.com/article/technology/artificial-intelligence/kpmg-retracts-ai-study-hallucinations-fake-citations-10738768/">Indian Express</a> and <a href="https://techcrunch.com/2026/06/13/kpmg-pulls-report-on-ai-usage-due-to-apparent-hallucinations/">TechCrunch</a> cover KPMG pulling an AI report over apparent hallucinations and fake citations, a practical warning about professional evidence chains.</li><li><a href="https://www.axios.com/2026/06/13/ai-power-electricity-data-centers-who-pays">Axios</a> anchors the infrastructure section on power-market strain, while OpenRouter’s Fusion announcement and the weekend developer-tool links show how builders are adapting in smaller, more immediate ways.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-06-14.mp3" length="23582359" type="audio/mpeg" />
<itunes:duration>00:22:50</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-06-14.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-14.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-14-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-06-14-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>When the Model Becomes a Controlled Asset</title>
<link>https://braid.opentangle.com/braid/episodes/2026-06-13.html</link>
<guid isPermaLink="false">2026-06-13</guid>
<pubDate>Sat, 13 Jun 2026 13:00:28 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Today&apos;s episode starts with Anthropic suspending access to Claude Fable 5 and Claude Mythos 5 after a reported U.S. government directive, then follows the practical consequence for builders: hosted model access is becoming part of compliance, infrastructure, legal discovery, and enterprise deployment design. Anthropic status incident anchors the lead: Fable 5 and Mythos 5 access was suspended, and Anthropic said it was working to restore access. Techmeme coverage of the export-control order gives the policy context around the reported government directive and the jailbreak evidence being cited. Open source AI must win captures the fast developer reaction: local and open models are being treated less like ideology and more like fallback architecture. The Guardian on the UK AI hardware push shows the other side of state involvement: governments are trying to fund chips, talent, and national capacity, not only restrict access. CNBC on state attorneys general and OpenAI brings the domestic legal track into view, where discovery can force operational claims into the record. Forbes on OpenAI and Ona points to the enterprise-agent deployment question: where the agent runs is now part of the product. ZDNET on OpenAI and Visa adds the payments version of the same story, where permissioning and reversibility matter as much as the model.</description>

<content:encoded><![CDATA[<p>Today's episode starts with Anthropic suspending access to Claude Fable 5 and Claude Mythos 5 after a reported U.S. government directive, then follows the practical consequence for builders: hosted model access is becoming part of compliance, infrastructure, legal discovery, and enterprise deployment design.</p><ul><li><a href="https://status.claude.com/incidents/s9w82lp9dcn9">Anthropic status incident</a> anchors the lead: Fable 5 and Mythos 5 access was suspended, and Anthropic said it was working to restore access.</li><li><a href="https://www.techmeme.com/260612/p31">Techmeme coverage of the export-control order</a> gives the policy context around the reported government directive and the jailbreak evidence being cited.</li><li><a href="https://opensourceaimustwin.com/?share=v2">Open source AI must win</a> captures the fast developer reaction: local and open models are being treated less like ideology and more like fallback architecture.</li><li><a href="https://www.theguardian.com/technology/2026/jun/13/uk-ai-hardware-london-tech-week-investment-chips">The Guardian on the UK AI hardware push</a> shows the other side of state involvement: governments are trying to fund chips, talent, and national capacity, not only restrict access.</li><li><a href="https://www.cnbc.com/2026/06/12/openai-says-its-engaging-constructively-with-state-ags-.html">CNBC on state attorneys general and OpenAI</a> brings the domestic legal track into view, where discovery can force operational claims into the record.</li><li><a href="https://www.forbes.com/sites/janakirammsv/2026/06/13/openai-buys-ona-to-run-codex-agents-inside-enterprise-clouds/">Forbes on OpenAI and Ona</a> points to the enterprise-agent deployment question: where the agent runs is now part of the product.</li><li><a href="https://www.zdnet.com/article/openai-and-visa-aim-to-secure-agentic-transactions-how-theyll-work/">ZDNET on OpenAI and Visa</a> adds the payments version of the same 
story, where permissioning and reversibility matter as much as the model.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-06-13.mp3" length="26378642" type="audio/mpeg" />
<itunes:duration>00:25:36</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-06-13.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-13.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-13-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-06-13-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>When the Website Starts Offering Tools</title>
<link>https://braid.opentangle.com/braid/episodes/2026-06-12.html</link>
<guid isPermaLink="false">2026-06-12</guid>
<pubDate>Fri, 12 Jun 2026 13:00:25 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Today’s episode starts with Google’s WebMCP proposal, then follows the same question through open coding models, agent safety papers, China-facing hardware and robotics supply chains, AI mistakes in professional work, and ordinary developer security. Tara Agyemang’s AI Engineer talk on WebMCP gives the day its lead artifact: websites may need to expose actions directly to agents instead of making agents infer intent from pixels and DOMs. Moonshot AI’s Kimi K2.7-Code model page makes token efficiency part of the coding-model comparison, which matters when developers are paying for long agent runs. The agentic framework safety paper argues that common agent frameworks do not provide native structural containment guarantees, and its memory-poisoning experiment shows why framework behavior has to be tested separately from model behavior. The SMSR memory-poisoning paper proposes signed memory plus randomized retrieval as a more formal defense for persistent agent memory. Techmeme’s Nvidia-China item and its humanoid robot supply-chain item keep the infrastructure story grounded in chips, factories, and availability claims rather than model demos alone. Forbes’ court-sanctions story shows AI drafting running into a professional audit boundary, with lawyers removed after hallucinated legal citations appeared in filings. The AUR package compromise report is a reminder that agentic coding still sits on ordinary package and machine security.</description>

<content:encoded><![CDATA[<p>Today’s episode starts with Google’s WebMCP proposal, then follows the same question through open coding models, agent safety papers, China-facing hardware and robotics supply chains, AI mistakes in professional work, and ordinary developer security.</p><ul><li><a href="https://www.youtube.com/watch?v=ghJmWQCIHRM">Tara Agyemang’s AI Engineer talk on WebMCP</a> gives the day its lead artifact: websites may need to expose actions directly to agents instead of making agents infer intent from pixels and DOMs.</li><li><a href="https://huggingface.co/moonshotai/Kimi-K2.7-Code">Moonshot AI’s Kimi K2.7-Code model page</a> makes token efficiency part of the coding-model comparison, which matters when developers are paying for long agent runs.</li><li><a href="https://arxiv.org/abs/2606.12797">The agentic framework safety paper</a> argues that common agent frameworks do not provide native structural containment guarantees, and its memory-poisoning experiment shows why framework behavior has to be tested separately from model behavior.</li><li><a href="https://arxiv.org/abs/2606.12703">The SMSR memory-poisoning paper</a> proposes signed memory plus randomized retrieval as a more formal defense for persistent agent memory.</li><li><a href="https://www.techmeme.com/260612/p5">Techmeme’s Nvidia-China item</a> and <a href="https://www.techmeme.com/260612/p4">its humanoid robot supply-chain item</a> keep the infrastructure story grounded in chips, factories, and availability claims rather than model demos alone.</li><li><a href="https://www.forbes.com/sites/lanceeliot/2026/06/12/judge-kicks-lawyers-off-case-over-ai-hallucinated-citations/">Forbes’ court-sanctions story</a> shows AI drafting running into a professional audit boundary, with lawyers removed after hallucinated legal citations appeared in filings.</li><li><a href="https://discourse.ifin.network/t/400-aur-packages-compromised-with-infostealer-and-rootkit/577">The AUR package compromise 
report</a> is a reminder that agentic coding still sits on ordinary package and machine security.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-06-12.mp3" length="22261305" type="audio/mpeg" />
<itunes:duration>00:21:29</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-06-12.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-12.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-12-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-06-12-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>When the Safeguard Has to Show Itself</title>
<link>https://braid.opentangle.com/braid/episodes/2026-06-11.html</link>
<guid isPermaLink="false">2026-06-11</guid>
<pubDate>Thu, 11 Jun 2026 13:00:24 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Today&apos;s episode starts with Anthropic making a hidden Claude Fable 5 safeguard visible, then follows the same operational question into data centers, agents, search liability, robotics, and research systems: once AI becomes infrastructure, who can see the rule that changed the behavior? ClaudeDevs announced that flagged frontier-model-development requests will visibly fall back to Opus 4.8, turning an invisible safeguard into a user-facing signal. The Verge reported the apology and backlash around hidden Fable safeguards, which matters because researchers were evaluating behavior they could not clearly observe. Axios, The Guardian, and Al Jazeera show data-center politics moving from local siting disputes toward national policy over heat, power, water, and permitting. MIT Technology Review and same-day agent-governance papers point to a practical agent problem: identity, authority, refusal, and ownership after a system has access. Indian Express flags a court-risk signal around Google AI Overviews, where summary UI can turn into a liability surface.</description>

<content:encoded><![CDATA[<p>Today's episode starts with Anthropic making a hidden Claude Fable 5 safeguard visible, then follows the same operational question into data centers, agents, search liability, robotics, and research systems: once AI becomes infrastructure, who can see the rule that changed the behavior?</p><ul><li><a href="https://x.com/ClaudeDevs/status/2064949876463645026">ClaudeDevs</a> announced that flagged frontier-model-development requests will visibly fall back to Opus 4.8, turning an invisible safeguard into a user-facing signal.</li><li><a href="https://www.theverge.com/ai-artificial-intelligence/948280/anthropic-claude-fable-invisible-distillation-guardrail">The Verge</a> reported the apology and backlash around hidden Fable safeguards, which matters because researchers were evaluating behavior they could not clearly observe.</li><li><a href="https://www.axios.com/2026/06/11/data-centers-ai-congress-bresnahan-bill">Axios</a>, <a href="https://www.theguardian.com/australia-news/2026/jun/11/datacentre-ai-growth-economy-resources-australia-labor-government">The Guardian</a>, and <a href="https://www.aljazeera.com/news/2026/6/11/how-much-heat-does-an-ai-data-centre-produce-and-where-are-they-located?traffic_source=rss">Al Jazeera</a> show data-center politics moving from local siting disputes toward national policy over heat, power, water, and permitting.</li><li><a href="https://www.technologyreview.com/2026/06/11/1138794/google-deepmind-is-worried-about-what-happens-when-millions-of-agents-start-to-interact/">MIT Technology Review</a> and same-day agent-governance papers point to a practical agent problem: identity, authority, refusal, and ownership after a system has access.</li><li><a href="https://indianexpress.com/article/technology/artificial-intelligence/why-court-ruling-google-ai-overviews-far-reaching-effects-10734974/">Indian Express</a> flags a court-risk signal around Google AI Overviews, where summary UI can turn into a liability 
surface.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-06-11.mp3" length="20986241" type="audio/mpeg" />
<itunes:duration>00:20:01</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-06-11.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-11.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-11-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-06-11-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>When the Evaluation Goes Back Inside</title>
<link>https://braid.opentangle.com/braid/episodes/2026-06-10.html</link>
<guid isPermaLink="false">2026-06-10</guid>
<pubDate>Wed, 10 Jun 2026 13:00:23 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Today&apos;s episode starts with the Trump administration reportedly telling CAISI to stop publishing public model assessments, then follows the same trust problem through compute deals, TCS&apos;s hiring plans, Anthropic&apos;s access terms, AWS Bedrock retention questions, and a small set of agent-security papers. Techmeme&apos;s CAISI roundup points to reporting that public model assessments have been halted while a new executive order is implemented; that changes the shared evidence builders can cite. Techmeme&apos;s OpenAI Ohio item and TechCrunch on Meta and Reliance show capacity turning into power contracts, geography, and financing. Techmeme&apos;s TCS item captures Tata chairman N. Chandrasekaran talking about agents as a hiring and workforce planning issue, not just a productivity demo. Anthropic&apos;s Mythos-class data-retention note makes the enterprise boundary more concrete: different clouds mean different retention and access paths. GitInject, the Interlocutor Effect paper, CIAware-Bench, and deployment-time memorization make the research tail practical: agents fail at the seams where code, memory, privacy, and oversight meet.</description>

<content:encoded><![CDATA[<p>Today's episode starts with the Trump administration reportedly telling CAISI to stop publishing public model assessments, then follows the same trust problem through compute deals, TCS's hiring plans, Anthropic's access terms, AWS Bedrock retention questions, and a small set of agent-security papers.</p><ul><li><a href="https://www.techmeme.com/260610/p1">Techmeme's CAISI roundup</a> points to reporting that public model assessments have been halted while a new executive order is implemented; that changes the shared evidence builders can cite.</li><li><a href="https://www.techmeme.com/260610/p4">Techmeme's OpenAI Ohio item</a> and <a href="https://techcrunch.com/2026/06/10/meta-signs-first-ai-data-center-deal-in-india-with-reliance/">TechCrunch on Meta and Reliance</a> show capacity turning into power contracts, geography, and financing.</li><li><a href="https://www.techmeme.com/260610/p15">Techmeme's TCS item</a> captures Tata chairman N. Chandrasekaran talking about agents as a hiring and workforce planning issue, not just a productivity demo.</li><li><a href="https://support.claude.com/en/articles/15425996-data-retention-practices-for-mythos-class-models">Anthropic's Mythos-class data-retention note</a> makes the enterprise boundary more concrete: different clouds mean different retention and access paths.</li><li><a href="https://arxiv.org/abs/2606.09935">GitInject</a>, <a href="https://arxiv.org/abs/2606.09844">the Interlocutor Effect paper</a>, <a href="https://arxiv.org/abs/2606.11063">CIAware-Bench</a>, and <a href="https://arxiv.org/abs/2606.10062">deployment-time memorization</a> make the research tail practical: agents fail at the seams where code, memory, privacy, and oversight meet.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-06-10.mp3" length="25741342" type="audio/mpeg" />
<itunes:duration>00:24:54</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-06-10.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-10.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-10-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-06-10-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Twenty Ways To Not Trust An Agent</title>
<link>https://braid.opentangle.com/braid/episodes/2026-06-09.html</link>
<guid isPermaLink="false">2026-06-09</guid>
<pubDate>Tue, 09 Jun 2026 13:00:25 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. One morning&apos;s arXiv listing dropped close to twenty agent papers, and almost none of them are about making agents more capable. They&apos;re about whether you can trust the system wrapped around the model — measurement, security, memory, and deference — all at once. Where Instruction Hierarchy Breaks — a white-box diagnostic for when reasoning models stop ranking the system prompt above tool output, tested across Gemma, Qwen, and Claude. If the repair holds, prompt injection becomes structural to fix, not just filterable. VATS — weaponizes that same confusion, injecting commands through tool error messages over the Model Context Protocol. The error path is the door most teams never locked. Shared Latent Structures for Backdoors — argues jailbreak, bias, and planted triggers share an internal signature catchable with sparse autoencoders. Beyond Goodhart&apos;s Law (MAC-Bench), Online Agent-as-a-Judge, and PACE — three attempts to keep evaluation honest when the thing you&apos;re testing can learn the test. The AI Epistemic Deference Index — finally puts a continuous number on sycophancy, with a paired reward-bias paper on personalization manufacturing it. MemToolAgent, Decision-Aware Memory Cards, and a gated-skills framework — agent memory growing up into selection, compression, and governance. Agent-to-Agent Protocols for nuclear licensing and the CIFAR Synthetic Evidence dataset — automation as the fix and as the threat, in the same breath. Stress-testing medical LLMs — benchmark accuracy hides what the authors call latent safety pathology, where the cost of the gap is a person.</description>

<content:encoded><![CDATA[<p>One morning's arXiv listing dropped close to twenty agent papers, and almost none of them are about making agents more capable. They're about whether you can trust the system wrapped around the model — measurement, security, memory, and deference — all at once.</p><ul><li><a href="https://arxiv.org/abs/2606.07808">Where Instruction Hierarchy Breaks</a> — a white-box diagnostic for when reasoning models stop ranking the system prompt above tool output, tested across Gemma, Qwen, and Claude. If the repair holds, prompt injection becomes structural to fix, not just filterable.</li><li><a href="https://arxiv.org/abs/2606.07992">VATS</a> — weaponizes that same confusion, injecting commands through tool error messages over the Model Context Protocol. The error path is the door most teams never locked.</li><li><a href="https://arxiv.org/abs/2606.07963">Shared Latent Structures for Backdoors</a> — argues jailbreak, bias, and planted triggers share an internal signature catchable with sparse autoencoders.</li><li><a href="https://arxiv.org/abs/2606.07805">Beyond Goodhart's Law (MAC-Bench)</a>, <a href="https://arxiv.org/abs/2606.08200">Online Agent-as-a-Judge</a>, and <a href="https://arxiv.org/abs/2606.08106">PACE</a> — three attempts to keep evaluation honest when the thing you're testing can learn the test.</li><li><a href="https://arxiv.org/abs/2606.07897">The AI Epistemic Deference Index</a> — finally puts a continuous number on sycophancy, with a paired <a href="https://arxiv.org/abs/2606.07988">reward-bias paper</a> on personalization manufacturing it.</li><li><a href="https://arxiv.org/abs/2606.07909">MemToolAgent</a>, <a href="https://arxiv.org/abs/2606.08151">Decision-Aware Memory Cards</a>, and a <a href="https://arxiv.org/abs/2606.08049">gated-skills framework</a> — agent memory growing up into selection, compression, and governance.</li><li><a href="https://arxiv.org/abs/2606.07866">Agent-to-Agent Protocols for nuclear licensing</a> 
and the <a href="https://arxiv.org/abs/2606.07916">CIFAR Synthetic Evidence dataset</a> — automation as the fix and as the threat, in the same breath.</li><li><a href="https://arxiv.org/abs/2606.07929">Stress-testing medical LLMs</a> — benchmark accuracy hides what the authors call latent safety pathology, where the cost of the gap is a person.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-06-09.mp3" length="20740318" type="audio/mpeg" />
<itunes:duration>00:19:29</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-06-09.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-09.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-09-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-06-09-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Pray for Rain, Approve the Datacenter</title>
<link>https://braid.opentangle.com/braid/episodes/2026-06-08.html</link>
<guid isPermaLink="false">2026-06-08</guid>
<pubDate>Mon, 08 Jun 2026 13:00:25 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. The consequential part of an AI system keeps moving out of the model and into the wrapper around it — the cooling loop, the org chart, the config file, the ownership structure — and the tools we use to trust that wrapper are running behind it. Five stories, one recurring tension. The Guardian finds about two-thirds of 809 planned US datacenters are slated for drought-hit land; closed-loop cooling saves water but trades it for fossil power that needs water of its own. OpenAI&apos;s enterprise talks feature banks rebuilding their orgs: Allica Bank collapsing roles into &quot;squadlets,&quot; Erste Group budgeting for a full platform rewrite every 18 months, plus ChatGPT-in-Excel Skills and Codex — held against one engineer&apos;s MCP catalog server. The Miasma worm: one dropper wired into seven config files across Claude Code, Gemini, Cursor, VS Code, npm, Composer, and Bundler — opening a cloned repo becomes an execution event. Schneier and Nathan Sanders argue against Bernie Sanders&apos; equity-stake plan, proposing energy taxes and an AI Public Option instead — set against Korea&apos;s GPU program and NVIDIA&apos;s UK sovereign-AI post. Two arXiv papers on measuring safety too late: Attack Selection shows strategic timing drops measured control safety 20-28 points, and Don&apos;t Just Fix It in Post argues the science belongs in training dynamics, not the finished snapshot.</description>

<content:encoded><![CDATA[<p>The consequential part of an AI system keeps moving out of the model and into the wrapper around it — the cooling loop, the org chart, the config file, the ownership structure — and the tools we use to trust that wrapper are running behind it. Five stories, one recurring tension.</p><ul><li><a href="https://www.theguardian.com/us-news/2026/jun/08/datacenter-ai-drought-water">The Guardian</a> finds about two-thirds of 809 planned US datacenters are slated for drought-hit land; closed-loop cooling saves water but trades it for fossil power that needs water of its own.</li><li>OpenAI's enterprise talks feature banks rebuilding their orgs: <a href="https://www.youtube.com/watch?v=pcAtJDBO3hw">Allica Bank</a> collapsing roles into "squadlets," <a href="https://www.youtube.com/watch?v=gli3gNI2saU">Erste Group</a> budgeting for a full platform rewrite every 18 months, plus <a href="https://www.youtube.com/watch?v=m2TV8slGQKc">ChatGPT-in-Excel Skills</a> and <a href="https://www.youtube.com/watch?v=re-18gil_ec">Codex</a> — held against one engineer's <a href="https://www.reddit.com/r/ClaudeAI/comments/1u00stn/i_work_at_an_industrial_vacuum_manufacturer_we/">MCP catalog server</a>.</li><li><a href="https://safedep.io/config-files-that-run-code/">The Miasma worm</a>: one dropper wired into seven config files across Claude Code, Gemini, Cursor, VS Code, npm, Composer, and Bundler — opening a cloned repo becomes an execution event.</li><li><a href="https://www.theguardian.com/commentisfree/2026/jun/08/bernie-sanders-ai-sovereign-wealth-fund-plan">Schneier and Nathan Sanders</a> argue against Bernie Sanders' equity-stake plan, proposing energy taxes and an AI Public Option instead — set against <a href="https://www.msit.go.kr/bbs/view.do?bbsSeqNo=94&nttSeqNo=3187410&sCode=user">Korea's GPU program</a> and <a href="https://blogs.nvidia.com/blog/uk-sovereign-ai-advancements/">NVIDIA's UK sovereign-AI post</a>.</li><li>Two arXiv papers on measuring 
safety too late: <a href="https://arxiv.org/abs/2606.06529">Attack Selection</a> shows strategic timing drops measured control safety 20-28 points, and <a href="https://arxiv.org/abs/2606.06533">Don't Just Fix It in Post</a> argues the science belongs in training dynamics, not the finished snapshot.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-06-08.mp3" length="26310445" type="audio/mpeg" />
<itunes:duration>00:25:31</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-06-08.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-08.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-08-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-06-08-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Twenty Billion Parameters, One Big Harness</title>
<link>https://braid.opentangle.com/braid/episodes/2026-06-07.html</link>
<guid isPermaLink="false">2026-06-07</guid>
<pubDate>Sun, 07 Jun 2026 13:00:27 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. A twenty-billion-parameter model claiming frontier-level search, a recipe that says to train the harness as hard as the weights, and a week of releases where the interesting part keeps living in the scaffolding around the model rather than in the model itself. Lenar and Damra follow that thread from agent architecture down to the hardware you can own — and up to the courts and committees that decide where any of it is allowed to touch the record. Patrick Jiang&apos;s Harness-1 post — a 20B search agent trained with a &quot;state-externalizing harness&quot; that he claims rivals Opus-4.6; the architecture, not the parameter count, is the claim worth examining. Viv&apos;s &quot;agent = model + harness&quot; recipe — train both components together; the same specialization logic shows up everywhere this week. Nate on one-shotting a full-stack app and Jon Shulkin on Grok Build — orchestration as the product, with the model treated as a commodity. CRUX&apos;s agent publishing an iOS app — &quot;a few human interventions&quot; is the detail that decides whether open-world evals beat pass/fail scores. Sem — code-understanding entities built on Git history, not a language server; the structured store a harness would actually lean on. Universal Memory Protocol vs Databricks&apos; end-to-end Instructed Retriever — standardize memory, or specialize retrieval for a 3x win? The incentives point opposite ways. NVIDIA&apos;s RTX Spark at Korea&apos;s PC Bangs and the GLM Air/GGUF thread — the local crowd wants the smallest good-enough model on hardware they own. UK police told to stop using AI for court statements and the AGI-economics conversation — when intelligence gets cheap, trust is the scarce resource nobody can manufacture.</description>

<content:encoded><![CDATA[<p>A twenty-billion-parameter model claiming frontier-level search, a recipe that says to train the harness as hard as the weights, and a week of releases where the interesting part keeps living in the scaffolding around the model rather than in the model itself. Lenar and Damra follow that thread from agent architecture down to the hardware you can own — and up to the courts and committees that decide where any of it is allowed to touch the record.</p><ul><li><a href="https://x.com/patpcj/status/2063298457398636570">Patrick Jiang's Harness-1 post</a> — a 20B search agent trained with a "state-externalizing harness" that he claims rivals Opus-4.6; the architecture, not the parameter count, is the claim worth examining.</li><li><a href="https://x.com/Vtrivedy10/status/2063429138304668093">Viv's "agent = model + harness" recipe</a> — train both components together; the same specialization logic shows up everywhere this week.</li><li><a href="https://x.com/natebirdman/status/2063502569193001374">Nate on one-shotting a full-stack app</a> and <a href="https://x.com/jon/status/2063492730970349931">Jon Shulkin on Grok Build</a> — orchestration as the product, with the model treated as a commodity.</li><li><a href="https://x.com/TamazGadaev/status/2063344171491205579">CRUX's agent publishing an iOS app</a> — "a few human interventions" is the detail that decides whether open-world evals beat pass/fail scores.</li><li><a href="https://ataraxy-labs.github.io/sem/">Sem</a> — code-understanding entities built on Git history, not a language server; the structured store a harness would actually lean on.</li><li><a href="https://universalmemoryprotocol.io/">Universal Memory Protocol</a> vs <a href="https://x.com/matei_zaharia/status/2063466684149801352">Databricks' end-to-end Instructed Retriever</a> — standardize memory, or specialize retrieval for a 3x win? 
The incentives point opposite ways.</li><li><a href="https://blogs.nvidia.com/blog/krafton-nc-t1-korea-gaming-pc-bang-rtx-spark/">NVIDIA's RTX Spark at Korea's PC Bangs</a> and <a href="https://www.reddit.com/r/LocalLLaMA/comments/1tyresc/zai_we_need_air_glm_gguf_wen/">the GLM Air/GGUF thread</a> — the local crowd wants the smallest good-enough model on hardware they own.</li><li><a href="https://www.techmeme.com/260606/p10">UK police told to stop using AI for court statements</a> and <a href="https://www.techmeme.com/260607/p5">the AGI-economics conversation</a> — when intelligence gets cheap, trust is the scarce resource nobody can manufacture.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-06-07.mp3" length="17776295" type="audio/mpeg" />
<itunes:duration>00:16:51</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-06-07.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-07.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-07-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-06-07-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>When the Harness Carries the Model</title>
<link>https://braid.opentangle.com/braid/episodes/2026-06-06.html</link>
<guid isPermaLink="false">2026-06-06</guid>
<pubDate>Sat, 06 Jun 2026 13:00:26 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. An open-weights model that fumbles tool calls on its own can go toe to toe with a frontier closed model — once you wrap the right error-handling around it. That gap, between what a model scores and what it does inside your repo, runs through everything we covered today. Ahmad Awais on Latent Space describes &quot;tool confusion&quot; — open models repeating the same invalid tool call roughly fifty-six times per billion tokens — and Command Code&apos;s deterministic repair layer that patches malformed output instead of arguing with the model. The claim that reframes the day: the harness, not the weights, decides whether a cheap model is usable. DeepSeek V4 Flash support in llama.cpp (PR #24162) makes the same model runnable locally — but the repair layer that makes it pleasant stays behind Command Code&apos;s API. Access to weights isn&apos;t access to the experience. Knowledge Activation (Bakal et al.) argues AI skills should be the institutional-knowledge unit for agentic development; Mutation Without Variation warns that repeated LLM edits converge rather than diverge — together a hint that skill files plus a converging model could homogenize a codebase. Agents&apos; Last Exam, SentinelBench, and Stability vs. Manipulability in LLM judges all poke at the same wound: our scores have drifted from the work, especially for long-running and judge-graded evaluation. Anthropic&apos;s &quot;When AI builds itself&quot; (via a thin Reddit summary) claims AI is accelerating its own development; a zero-knowledge verification paper offers a cryptographic path to actually check claims like that — and the pause proposals that depend on verification. The Washington Post (Elizabeth Dwoskin), via Techmeme, reports an FDA fast track for digital health tech including AI chatbots — the same model behavior that costs a retry in coding costs a patient in a clinic.</description>

<content:encoded><![CDATA[<p>An open-weights model that fumbles tool calls on its own can go toe to toe with a frontier closed model — once you wrap the right error-handling around it. That gap, between what a model scores and what it does inside your repo, runs through everything we covered today.</p><ul><li><a href="https://www.youtube.com/watch?v=-rIAVuaRjOg">Ahmad Awais on Latent Space</a> describes "tool confusion" — open models repeating the same invalid tool call roughly fifty-six times per billion tokens — and Command Code's deterministic repair layer that patches malformed output instead of arguing with the model. The claim that reframes the day: the harness, not the weights, decides whether a cheap model is usable.</li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1tyb3np/deepseek_v4_flash_is_amazing_wip_llamacpp_pr_24162/">DeepSeek V4 Flash support in llama.cpp (PR #24162)</a> makes the same model runnable locally — but the repair layer that makes it pleasant stays behind Command Code's API. Access to weights isn't access to the experience.</li><li><a href="https://arxiv.org/abs/2603.14805">Knowledge Activation (Bakal et al.)</a> argues AI skills should be the institutional-knowledge unit for agentic development; <a href="https://arxiv.org/abs/2606.05408">Mutation Without Variation</a> warns that repeated LLM edits converge rather than diverge — together a hint that skill files plus a converging model could homogenize a codebase.</li><li><a href="https://arxiv.org/abs/2606.05405">Agents' Last Exam</a>, <a href="https://arxiv.org/abs/2606.05342">SentinelBench</a>, and <a href="https://arxiv.org/abs/2606.05384">Stability vs. 
Manipulability in LLM judges</a> all poke at the same wound: our scores have drifted from the work, especially for long-running and judge-graded evaluation.</li><li><a href="https://www.reddit.com/r/ClaudeAI/comments/1ty8f47/anthropic_just_published_a_major_update_on/">Anthropic's "When AI builds itself"</a> (via a thin Reddit summary) claims AI is accelerating its own development; <a href="https://arxiv.org/abs/2606.05433">a zero-knowledge verification paper</a> offers a cryptographic path to actually check claims like that — and the pause proposals that depend on verification.</li><li><a href="https://www.techmeme.com/260606/p5">The Washington Post (Elizabeth Dwoskin), via Techmeme</a>, reports an FDA fast track for digital health tech including AI chatbots — the same model behavior that costs a retry in coding costs a patient in a clinic.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-06-06.mp3" length="18470425" type="audio/mpeg" />
<itunes:duration>00:17:31</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-06-06.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-06.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-06-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-06-06-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>What the Mug Lets You Do</title>
<link>https://braid.opentangle.com/braid/episodes/2026-06-05.html</link>
<guid isPermaLink="false">2026-06-05</guid>
<pubDate>Fri, 05 Jun 2026 13:00:28 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. A strange Friday: no launch, no valuation, just a wall of version-one arXiv preprints. Read together, they rhyme — robots reasoning about what objects let you do instead of what they look like, policies fighting the latency tax of diffusion, and agents that change themselves mid-run. Lenar and Damra hold all of it at preprint altitude: these are claims from serious groups, graded on their own benchmarks. What Objects Enable, Not What They Are — A4D organizes a robot&apos;s latent space around function (&quot;movable&quot;) rather than appearance (&quot;cart&quot;), reporting 94% accuracy and a discovery step that flags when it doesn&apos;t know. Convergent with AffordanceVLA, which decomposes manipulation into which/where/how-to-act. Flash-WAM cuts a robot action chunk from 8.1 seconds to 348 ms (a 23x speedup) via modality-aware distillation — while Let It Be Simple argues the fancy distillation was never the hard part for low-dimensional policies. EVE and MIRAGE chase the same wall-clock budget from other seats. HANDOFF distills a humanoid whole-body controller from three specialists; Open-H-Embodiment opens the largest medical-robot dataset to date, where the lead surgical model finishes a structured suturing task on just 25% of trials — the only model above zero. The Meta-Agent Challenge finds agents-building-agents real but mediocre, and surfaces reward-hacking like ground-truth exfiltration under pressure. TMEM edits weights online; Trivium argues for an inspectable causal log instead; CHARM tackles cascading hallucination across RAG steps. Inference-Time Vulnerability Beyond Shallow Safety shows a mid-sequence injection at any step can flip safety behavior, and that internal &quot;refusal-aligned&quot; states don&apos;t predict robustness — so alignment has to train on the generation trajectory, not just outputs.</description>

<content:encoded><![CDATA[<p>A strange Friday: no launch, no valuation, just a wall of version-one arXiv preprints. Read together, they rhyme — robots reasoning about what objects <em>let you do</em> instead of what they look like, policies fighting the latency tax of diffusion, and agents that change themselves mid-run. Lenar and Damra hold all of it at preprint altitude: these are claims from serious groups, graded on their own benchmarks.</p><ul><li><a href="https://arxiv.org/abs/2606.05533">What Objects Enable, Not What They Are</a> — A4D organizes a robot's latent space around function ("movable") rather than appearance ("cart"), reporting 94% accuracy and a discovery step that flags when it doesn't know. Convergent with <a href="https://arxiv.org/abs/2606.06155">AffordanceVLA</a>, which decomposes manipulation into which/where/how-to-act.</li><li><a href="https://arxiv.org/abs/2606.05254">Flash-WAM</a> cuts a robot action chunk from 8.1 seconds to 348 ms (a 23x speedup) via modality-aware distillation — while <a href="https://arxiv.org/abs/2606.05737">Let It Be Simple</a> argues the fancy distillation was never the hard part for low-dimensional policies. <a href="https://arxiv.org/abs/2512.21430">EVE</a> and <a href="https://arxiv.org/abs/2606.04627">MIRAGE</a> chase the same wall-clock budget from other seats.</li><li><a href="https://arxiv.org/abs/2606.06493">HANDOFF</a> distills a humanoid whole-body controller from three specialists; <a href="https://arxiv.org/abs/2604.21017">Open-H-Embodiment</a> opens the largest medical-robot dataset to date, where the lead surgical model finishes a structured suturing task on just 25% of trials — the only model above zero.</li><li><a href="https://arxiv.org/abs/2606.04455">The Meta-Agent Challenge</a> finds agents-building-agents real but mediocre, and surfaces reward-hacking like ground-truth exfiltration under pressure. 
<a href="https://arxiv.org/abs/2606.04536">TMEM</a> edits weights online; <a href="https://arxiv.org/abs/2606.04421">Trivium</a> argues for an inspectable causal log instead; <a href="https://arxiv.org/abs/2606.04435">CHARM</a> tackles cascading hallucination across RAG steps.</li><li><a href="https://arxiv.org/abs/2606.04778">Inference-Time Vulnerability Beyond Shallow Safety</a> shows a mid-sequence injection at any step can flip safety behavior, and that internal "refusal-aligned" states don't predict robustness — so alignment has to train on the generation trajectory, not just outputs.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-06-05.mp3" length="20581897" type="audio/mpeg" />
<itunes:duration>00:19:40</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-06-05.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-05.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-05-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-06-05-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>The Substation and the Zoning Board</title>
<link>https://braid.opentangle.com/braid/episodes/2026-06-04.html</link>
<guid isPermaLink="false">2026-06-04</guid>
<pubDate>Thu, 04 Jun 2026 13:00:21 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. The binding constraint in AI stopped being the model and became physical: a fab that can&apos;t keep up, a grid that has to find ten reactors&apos; worth of power, and a neighbor who can file a lawsuit. We follow that collision through chips, a rare moment of rival unity, an IPO, a clogged courtroom, and the parts of the world building around scarcity. TSMC&apos;s C.C. Wei (via Bloomberg) says the company can&apos;t fill US demand even as Arizona capacity comes online — scarcity admitted by the supplier, not the buyers. France&apos;s €110B+ AI buildout (Sarah White, FT) amounts to ~10 gigawatts — an energy-policy decision dressed as a tech investment. SpaceX&apos;s $55B Terafab tax exemption (Stephanie Findlay, FT) draws local legal threats — the abstraction of compute now has an address. Rival labs co-sign a bioweapons letter (Robert Hart, The Verge) — but the screening that actually bites sits with DNA-synthesis firms, who aren&apos;t signing. Anthropic&apos;s path to IPO (Madhumita Murgia, FT) puts a quarterly clock on a safety posture that private capital used to subsidize. Courts coping with AI lawsuits (Michelle Kim, MIT Tech Review) — hallucinated citations are cheap to produce and expensive to refute. Scarcity-driven innovation (Rest of World) and AI as new colonialism (Axios) describe the same engineer as protagonist and subject. Can generalist agents automate data curation? and StepPRM-RTL push agents into the senior-human judgment calls — and make per-step checking the valuable part.</description>

<content:encoded><![CDATA[<p>The binding constraint in AI stopped being the model and became physical: a fab that can't keep up, a grid that has to find ten reactors' worth of power, and a neighbor who can file a lawsuit. We follow that collision through chips, a rare moment of rival unity, an IPO, a clogged courtroom, and the parts of the world building around scarcity.</p><ul><li><a href="https://www.techmeme.com/260604/p16">TSMC's C.C. Wei (via Bloomberg)</a> says the company can't fill US demand even as Arizona capacity comes online — scarcity admitted by the supplier, not the buyers.</li><li><a href="https://www.techmeme.com/260604/p18">France's €110B+ AI buildout (Sarah White, FT)</a> amounts to ~10 gigawatts — an energy-policy decision dressed as a tech investment.</li><li><a href="https://www.techmeme.com/260604/p13">SpaceX's $55B Terafab tax exemption (Stephanie Findlay, FT)</a> draws local legal threats — the abstraction of compute now has an address.</li><li><a href="https://www.theverge.com/ai-artificial-intelligence/942956/ai-biological-weapons-open-letter-congress">Rival labs co-sign a bioweapons letter (Robert Hart, The Verge)</a> — but the screening that actually bites sits with DNA-synthesis firms, who aren't signing.</li><li><a href="https://www.techmeme.com/260604/p9">Anthropic's path to IPO (Madhumita Murgia, FT)</a> puts a quarterly clock on a safety posture that private capital used to subsidize.</li><li><a href="https://www.technologyreview.com/2026/06/04/1138391/courts-coping-ai-lawsuits/">Courts coping with AI lawsuits (Michelle Kim, MIT Tech Review)</a> — hallucinated citations are cheap to produce and expensive to refute.</li><li><a href="https://restofworld.org/2026/scarcity-is-driving-ai-innovation-outside-silicon-valley/">Scarcity-driven innovation (Rest of World)</a> and <a href="https://www.axios.com/2026/06/04/ai-data-extraction-colonialism">AI as new colonialism (Axios)</a> describe the same engineer as protagonist and 
subject.</li><li><a href="https://arxiv.org/abs/2606.04261">Can generalist agents automate data curation?</a> and <a href="https://arxiv.org/abs/2606.04246">StepPRM-RTL</a> push agents into the senior-human judgment calls — and make per-step checking the valuable part.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-06-04.mp3" length="19792435" type="audio/mpeg" />
<itunes:duration>00:18:40</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-06-04.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-04.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-04-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-06-04-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Permission Slips and Poured Concrete</title>
<link>https://braid.opentangle.com/braid/episodes/2026-06-03.html</link>
<guid isPermaLink="false">2026-06-03</guid>
<pubDate>Wed, 03 Jun 2026 13:00:21 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. A stack of European filings wants to triple data center capacity and own more of the AI stack — on the same day a JP Morgan report says the country building fastest can&apos;t pour its own concrete on schedule. Lenar and Damra trace the day&apos;s real constraint: not model quality, but megawatts, transformers, capital, and rights. The EU&apos;s Cloud and AI Development Act (CADA) aims to triple data center capacity in 5–7 years, paired with a tech-sovereignty communication and open-source strategy and a Chips Act 2.0 — a statement of intent about which layers of the stack Europe wants to own. JP Morgan, via the WSJ, says 60%+ of US data center capacity planned for 2027 isn&apos;t yet under construction — the build-out is power- and permit-bound, not building-bound. Alibaba&apos;s Qwen 3.7 Plus ships multimodal with a one-million-token window at $2 per million tokens, and DeepSeek is raising ~$7.4B from Tencent and battery maker CATL — energy money following the compute story. Microsoft&apos;s on-device Aion 1.0 Instruct and Plan models split instruction-following from planning, while a llama.cpp build report shows reproducible local gains on two 3090s. AURA argues the key-value cache is wrong for robots and proposes constant-memory action-gated retention; a second paper tries to measure harmful overthinking in reasoning models. GitLab is cutting 350 staff and exiting 22 countries under an AI-pivot framing, and the UK CMA is forcing Google to let publishers opt out of AI search summaries separately from search itself.</description>

<content:encoded><![CDATA[<p>A stack of European filings wants to triple data center capacity and own more of the AI stack — on the same day a JP Morgan report says the country building fastest can't pour its own concrete on schedule. Lenar and Damra trace the day's real constraint: not model quality, but megawatts, transformers, capital, and rights.</p><ul><li><a href="https://digital-strategy.ec.europa.eu/en/library/proposal-cloud-and-ai-development-act-cada">The EU's Cloud and AI Development Act (CADA)</a> aims to triple data center capacity in 5–7 years, paired with a <a href="https://digital-strategy.ec.europa.eu/en/library/communication-european-tech-sovereignty-accompanied-eu-open-source-strategy">tech-sovereignty communication and open-source strategy</a> and a <a href="https://digital-strategy.ec.europa.eu/en/library/proposal-chips-act-20">Chips Act 2.0</a> — a statement of intent about which layers of the stack Europe wants to own.</li><li><a href="https://www.techmeme.com/260603/p22">JP Morgan, via the WSJ</a>, says 60%+ of US data center capacity planned for 2027 isn't yet under construction — the build-out is power- and permit-bound, not building-bound.</li><li><a href="https://www.techmeme.com/260603/p14">Alibaba's Qwen 3.7 Plus</a> ships multimodal with a one-million-token window at $2 per million tokens, and <a href="https://www.techmeme.com/260603/p2">DeepSeek is raising ~$7.4B</a> from Tencent and battery maker CATL — energy money following the compute story.</li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1tvekng">Microsoft's on-device Aion 1.0 Instruct and Plan models</a> split instruction-following from planning, while a <a href="https://www.reddit.com/r/LocalLLaMA/comments/1tvff62/another_shout_out_to_llamacpp_build_b9455_2x3090/">llama.cpp build report</a> shows reproducible local gains on two 3090s.</li><li><a href="https://arxiv.org/abs/2606.02775">AURA</a> argues the key-value cache is wrong for robots and proposes 
constant-memory action-gated retention; <a href="https://arxiv.org/abs/2606.02835">a second paper</a> tries to measure harmful overthinking in reasoning models.</li><li><a href="https://www.techmeme.com/260603/p13">GitLab is cutting 350 staff and exiting 22 countries</a> under an AI-pivot framing, and the <a href="https://www.theguardian.com/business/2026/jun/03/uk-media-groups-power-opt-out-google-ai-search-summaries">UK CMA</a> is forcing Google to let publishers opt out of AI search summaries separately from search itself.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-06-03.mp3" length="19182662" type="audio/mpeg" />
<itunes:duration>00:18:06</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-06-03.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-03.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-03-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-06-03-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Eighty Billion and the Ideas Underneath</title>
<link>https://braid.opentangle.com/braid/episodes/2026-06-02.html</link>
<guid isPermaLink="false">2026-06-02</guid>
<pubDate>Tue, 02 Jun 2026 13:00:19 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. The day&apos;s news ran on a single tension: enormous sums are being raised to fund the AI buildout, while the question of whether the capability and the margins follow stays unanswered. Lenar and Damra trace the money from Alphabet&apos;s filings to Anthropic&apos;s IPO paperwork, then down into the tooling, the chips, and one paper about ideas no human is positioned to have. Alphabet&apos;s $80bn equity raise — a profitable company choosing to dilute shareholders rather than borrow, with $10bn going to Berkshire Hathaway, signals how hard the compute commitment is to walk back. Anthropic&apos;s confidential IPO filing lands as corporate America hits &quot;AI sticker shock&quot; — and Anthropic&apos;s biggest customers are the companies tightening those budgets. Knowledge workers are now ~1/5 of OpenAI Codex users, growing three times faster than developers — moving code generation to people who can&apos;t always read the output. Cloudflare&apos;s Agents SDK v0.14.0 ships durable workflows, schedules, and skills — the difference between an agent you operate and a worker you delegate to. China adds data and algorithms to its trade-secret rules while military-linked universities seek Nvidia H200 chips and Arm names Oracle and ByteDance as data-center CPU customers. &quot;Alien Science&quot; samples research directions that are coherent but cognitively unavailable — logical ideas no community is positioned to propose.</description>

<content:encoded><![CDATA[<p>The day's news ran on a single tension: enormous sums are being raised to fund the AI buildout, while the question of whether the capability and the margins follow stays unanswered. Lenar and Damra trace the money from Alphabet's filings to Anthropic's IPO paperwork, then down into the tooling, the chips, and one paper about ideas no human is positioned to have.</p><ul><li><a href="https://www.theguardian.com/technology/2026/jun/02/google-alphabet-sell-stock-ai-share-sale-berkshire-hathaway">Alphabet's $80bn equity raise</a> — a profitable company choosing to dilute shareholders rather than borrow, with $10bn going to Berkshire Hathaway, signals how hard the compute commitment is to walk back.</li><li><a href="https://www.axios.com/2026/06/02/anthropic-ipo-ai-sticker-shock-spending-usage">Anthropic's confidential IPO filing</a> lands as corporate America hits "AI sticker shock" — and Anthropic's biggest customers are the companies tightening those budgets.</li><li><a href="https://www.axios.com/2026/06/02/openai-codex-knowledge-workers">Knowledge workers are now ~1/5 of OpenAI Codex users</a>, growing three times faster than developers — moving code generation to people who can't always read the output.</li><li><a href="https://x.com/whoiskatrin/status/2061757643471945948">Cloudflare's Agents SDK v0.14.0</a> ships durable workflows, schedules, and skills — the difference between an agent you operate and a worker you delegate to.</li><li><a href="https://www.techmeme.com/260602/p11">China adds data and algorithms to its trade-secret rules</a> while <a href="https://www.techmeme.com/260602/p3">military-linked universities seek Nvidia H200 chips</a> and <a href="https://www.techmeme.com/260602/p10">Arm names Oracle and ByteDance</a> as data-center CPU customers.</li><li><a href="https://arxiv.org/abs/2603.01092">"Alien Science"</a> samples research directions that are coherent but cognitively unavailable — logical ideas no community is 
positioned to propose.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-06-02.mp3" length="18755628" type="audio/mpeg" />
<itunes:duration>00:17:54</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-06-02.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-02.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-02-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-06-02-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Cheaper From Both Ends</title>
<link>https://braid.opentangle.com/braid/episodes/2026-06-01.html</link>
<guid isPermaLink="false">2026-06-01</guid>
<pubDate>Mon, 01 Jun 2026 13:00:19 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. A Chinese lab cut the price of a frontier-class coding model to a fraction of Opus, Nvidia tried to own every layer from the laptop to the data center, and one developer ran the new Gemma 4 on a decade-old Xeon. The cost of running intelligence got attacked from both ends on the same morning — and the question underneath all of it is who gets to set that cost. MiniMax M3 claims parity with Opus 4.7 at roughly twelve cents per million input tokens versus five dollars — but the weights are promised in about ten days, so &quot;open-weights&quot; is still a countdown. Nvidia&apos;s DGX Station puts a GB300 chip and up to 748GB of memory on a desktop, enough to run a one-trillion-parameter model locally; the RTX Spark chip pushes the same idea into laptops, while the Vera CPUs — with Anthropic, OpenAI, and SpaceX as early customers — signal a move off x86. A 10-year-old Xeon is all you need: cafkafk runs a 26B mixture-of-experts model at reading speed on a 2016 CPU with no GPU, arguing mainstream tools hide the performance levers. Cosmos 3 is Nvidia&apos;s open physical-AI world model, backed by a Cosmos Coalition with Runway as a founding member. Cadence and Nvidia claim a &quot;Level 5&quot; autonomous chip-verification agent that turns months into a day — a large autonomy claim in a domain where mistakes ship in silicon. Anthropic will let the EU&apos;s ENISA join Project Glasswing for access to a model called Mythos, even as a Wirescreen analysis documents 500+ PLA attempts to procure Nvidia chips and governments from India and the UAE to France move to own their compute.</description>

<content:encoded><![CDATA[<p>A Chinese lab cut the price of a frontier-class coding model to a fraction of Opus, Nvidia tried to own every layer from the laptop to the data center, and one developer ran the new Gemma 4 on a decade-old Xeon. The cost of running intelligence got attacked from both ends on the same morning — and the question underneath all of it is who gets to set that cost.</p><ul><li><a href="https://www.techmeme.com/260601/p26">MiniMax M3</a> claims parity with Opus 4.7 at roughly twelve cents per million input tokens versus five dollars — but the weights are promised in about ten days, so "open-weights" is still a countdown.</li><li><a href="https://www.techmeme.com/260601/p12">Nvidia's DGX Station</a> puts a GB300 chip and up to 748GB of memory on a desktop, enough to run a one-trillion-parameter model locally; the <a href="https://www.theguardian.com/technology/2026/jun/01/nvidia-launches-chip-ai-laptops-pc-rtx-spark-microsoft-windows">RTX Spark</a> chip pushes the same idea into laptops, while the <a href="https://www.techmeme.com/260601/p19">Vera CPUs</a> — with Anthropic, OpenAI, and SpaceX as early customers — signal a move off x86.</li><li><a href="https://point.free/blog/gemma-4-on-a-2016-xeon/">A 10-year-old Xeon is all you need</a>: cafkafk runs a 26B mixture-of-experts model at reading speed on a 2016 CPU with no GPU, arguing mainstream tools hide the performance levers.</li><li><a href="https://www.axios.com/2026/06/01/nvidia-ai-push-cosmos-3-world-model">Cosmos 3</a> is Nvidia's open physical-AI world model, backed by a <a href="https://x.com/runwayml/status/2061315089869721682">Cosmos Coalition</a> with Runway as a founding member.</li><li><a href="https://www.forbes.com/sites/karlfreund/2026/06/01/cadence-and-nvidia-team-to-develop-first--fully-autonomous-eda-agent/">Cadence and Nvidia</a> claim a "Level 5" autonomous chip-verification agent that turns months into a day — a large autonomy claim in a domain where mistakes ship in 
silicon.</li><li><a href="https://www.techmeme.com/260601/p27">Anthropic will let the EU's ENISA join Project Glasswing</a> for access to a model called Mythos, even as a <a href="https://www.techmeme.com/260601/p28">Wirescreen analysis</a> documents 500+ PLA attempts to procure Nvidia chips and governments from <a href="https://restofworld.org/2026/india-uae-g42-cerebras-ai-sovereignty/">India and the UAE</a> to <a href="https://www.techmeme.com/260601/p30">France</a> move to own their compute.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-06-01.mp3" length="20709907" type="audio/mpeg" />
<itunes:duration>00:19:55</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-06-01.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-01.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-06-01-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-06-01-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Who Holds the Dial</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-31.html</link>
<guid isPermaLink="false">2026-05-31</guid>
<pubDate>Sun, 31 May 2026 13:00:17 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. A frontier model gets called a step toward God in one window and a judgmental token-burner in the next. We spend the morning on the gap between the marketing altitude and the desk, and find the same thread running through everything: every layer now has a control surface someone&apos;s reaching for. Dylan Field on Opus 4.8 calls it &quot;a very strange model&quot; — honesty up, curiosity down, personality judgmental — a reminder that a tuning dial has costs you can feel. scaling01 on DeepSWE says GPT-5.5 &quot;score-, time- and token-mogged&quot; Opus 4.8, putting the efficiency column — the one that pays your bill — back in the conversation. Ben Kunkle on Zed&apos;s Zeta 2 shows how a ten-second editing pause becomes a training label, and how a million frontier-model calls got replaced by a self-grading student model. Philipp Schmid (DeepMind) on the five assumptions that trip up senior engineers building agents — errors as inputs, evals not unit tests, and &quot;build to delete.&quot; Komi-learn and a year on knowledge-graph memory share one missing thing: a controlled before-and-after proving the memory layer, not the model, made the agent better. A Lancet correspondence finds 4,046 fabricated references across 2,810 published articles — model honesty rising while the literature&apos;s integrity falls. Quick hits: AMD&apos;s Lisa Su vs Nvidia&apos;s Jensen Huang on China, IBM&apos;s Sovereign Core, and a court ordering Circle to freeze a $12.6M contract.</description>

<content:encoded><![CDATA[<p>A frontier model gets called a step toward God in one window and a judgmental token-burner in the next. We spend the morning on the gap between the marketing altitude and the desk, and find the same thread running through everything: every layer now has a control surface someone's reaching for.</p><ul><li><a href="https://x.com/zoink/status/2060769829133721974">Dylan Field on Opus 4.8</a> calls it &quot;a very strange model&quot; — honesty up, curiosity down, personality judgmental — a reminder that a tuning dial has costs you can feel.</li><li><a href="https://x.com/scaling01/status/2060768119941947699">scaling01 on DeepSWE</a> says GPT-5.5 &quot;score-, time- and token-mogged&quot; Opus 4.8, putting the efficiency column — the one that pays your bill — back in the conversation.</li><li><a href="https://www.youtube.com/watch?v=phchDt63qAA">Ben Kunkle on Zed's Zeta 2</a> shows how a ten-second editing pause becomes a training label, and how a million frontier-model calls got replaced by a self-grading student model.</li><li><a href="https://www.youtube.com/watch?v=3_gYbhABcAE">Philipp Schmid (DeepMind)</a> on the five assumptions that trip up senior engineers building agents — errors as inputs, evals not unit tests, and &quot;build to delete.&quot;</li><li><a href="https://github.com/kurikomi-labs/komi-learn">Komi-learn</a> and <a href="https://www.reddit.com/r/AI_Agents/comments/1ts3nq2/i_spent_a_year_building_agent_memory_on_knowledge/">a year on knowledge-graph memory</a> share one missing thing: a controlled before-and-after proving the memory layer, not the model, made the agent better.</li><li><a href="https://www.forbes.com/sites/brucelee/2026/05/30/ai-fabricated-citations-in-over-2800-biomedical-journal-articles/">A Lancet correspondence</a> finds 4,046 fabricated references across 2,810 published articles — model honesty rising while the literature's integrity falls.</li><li>Quick hits: <a 
href="https://www.techmeme.com/260531/p7">AMD's Lisa Su vs Nvidia's Jensen Huang on China</a>, <a href="https://www.forbes.com/sites/stevemcdowell/2026/05/30/ibms-agentic-operating-model-puts-sovereignty-at-the-center/">IBM's Sovereign Core</a>, and <a href="https://www.techmeme.com/260531/p3">a court ordering Circle to freeze a $12.6M contract</a>.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-31.mp3" length="19130340" type="audio/mpeg" />
<itunes:duration>00:18:21</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-31.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-31.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-31-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-31-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>The number nobody optimized for</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-30.html</link>
<guid isPermaLink="false">2026-05-30</guid>
<pubDate>Sat, 30 May 2026 13:41:26 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Claude Opus 4.8 landed overnight with a math score that leapt and a business-ops score that fell — and reading the release honestly means distrusting the chart. Lenar and Damra work through the gap between the number that moved and the number that matters, then chase it into agent budgets, the protocol wars, local-inference tooling, Mistral&apos;s on-prem bet, and the power grid. A scrape of 100+ Opus 4.8 evals shows USAMO 2026 jumping 69%→97% while Vending-Bench 2 nearly halved — a retune that helped some distributions and hurt others. &quot;AI benchmarks are useless&quot; argues the record scores ride on elaborate prompt setups: change a few prompt words and results swing 10–20 points. The BAGEN study finds frontier agents can&apos;t estimate their own remaining budget mid-task — which collides with enterprises trying to rein in &quot;tokenmaxxing&quot; (WSJ via Techmeme). &quot;MCP is dead?&quot; gets a sharp rebuttal from OpenAI&apos;s Max Stoiber: nearly every company is building an MCP server, even ones with no CLI or external API. Multi-token prediction benchmarks hit ~3.3x faster local inference; llama.cpp got a real website and antirez shipped distributed inference. Notes from the Mistral AI Now Summit — on-prem KYC at BNP Paribas, against a comment that Mistral&apos;s 120B &quot;small&quot; model loses to models a quarter its size. xAI countered with a one-dollar coding model. FERC&apos;s June grid-connection proposal is the duller, realer infrastructure story next to an unsourced TerraFab &quot;one terawatt&quot; claim.</description>

<content:encoded><![CDATA[<p>Claude Opus 4.8 landed overnight with a math score that leapt and a business-ops score that fell — and reading the release honestly means distrusting the chart. Lenar and Damra work through the gap between the number that moved and the number that matters, then chase it into agent budgets, the protocol wars, local-inference tooling, Mistral's on-prem bet, and the power grid.</p><ul><li><a href="https://www.reddit.com/r/Anthropic/comments/1trkl20/heres_100_evals_for_opus_48_compared_to_top_ai/">A scrape of 100+ Opus 4.8 evals</a> shows USAMO 2026 jumping 69%→97% while Vending-Bench 2 nearly halved — a retune that helped some distributions and hurt others.</li><li><a href="https://www.reddit.com/r/ClaudeAI/comments/1trclg3/ai_benchmarks_are_useless/">"AI benchmarks are useless"</a> argues the record scores ride on elaborate prompt setups: change a few prompt words and results swing 10–20 points.</li><li><a href="https://x.com/wzenus/status/2060397732846612489">The BAGEN study</a> finds frontier agents can't estimate their own remaining budget mid-task — which collides with enterprises trying to rein in "tokenmaxxing" (<a href="https://www.techmeme.com/260529/p22">WSJ via Techmeme</a>).</li><li><a href="https://www.quandri.io/engineering-blog/mcp-is-dead">"MCP is dead?"</a> gets a sharp rebuttal from OpenAI's Max Stoiber: nearly every company is building an MCP server, even ones with no CLI or external API.</li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1trf0r0/i_tested_mtp_on_vllm_and_llamacpp_for_gemma_4/">Multi-token prediction benchmarks</a> hit ~3.3x faster local inference; <a href="https://x.com/ggerganov/status/2060394400237109567">llama.cpp got a real website</a> and <a href="https://x.com/antirez/status/2060403966676987918">antirez shipped distributed inference</a>.</li><li><a href="https://koenvangilst.nl/lab/mistral-ai-now-summit">Notes from the Mistral AI Now Summit</a> — on-prem KYC at BNP Paribas, against a 
comment that Mistral's 120B "small" model loses to models a quarter its size. xAI countered with <a href="https://x.com/xai/status/2060392249402552457">a one-dollar coding model</a>.</li><li><a href="https://www.techmeme.com/260530/p7">FERC's June grid-connection proposal</a> is the duller, realer infrastructure story next to <a href="https://x.com/LaceyPresley/status/2060514042381324630">an unsourced TerraFab "one terawatt" claim</a>.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-30.mp3" length="19422199" type="audio/mpeg" />
<itunes:duration>00:18:29</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-30.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-30.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-30-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-30-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Locally coherent, globally not</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-29.html</link>
<guid isPermaLink="false">2026-05-29</guid>
<pubDate>Fri, 29 May 2026 13:41:23 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Friday&apos;s room sits between a hobbyist voice assistant running entirely on Mario Zechner&apos;s desk and a cluster of arXiv papers all saying the same thing from different angles: long-running agents now fall apart in ways the model can&apos;t fix. Lenar and Damra read four reliability papers side by side, then turn to the personal-memory question every shipping assistant is already getting wrong. Mario Zechner on pibot — full local voice loop with Parakeet, Qwen 3 TTS, and Qwen 3.6 through llama.cpp, with the STT and TTS engines ported from Python into Rust on mlx-c. The runtime detail is the news, not the model lineup. Ethan Mollick on token budgets — split spend between building and learning. Read against yesterday&apos;s Kirkland and Ellis platform story, the question becomes who controls the learning budget at internal AI orgs. MMPO — Ziyan Liu and team train a policy that decides when memory in long-horizon agents should be rewritten and when it should be left alone. Belief drift comes from over-eager rewrites, not missing updates. RedundancyBench — Minyang Hu&apos;s group benchmarks how many steps in a long agent trajectory are repeats. Stale duplicates of state crowd out the relevant signal in context. Locally Coherent, Globally Incoherent — Anany Kotawala&apos;s single-author paper bounds compositional incoherence in multi-component agents. Defensible local outputs assemble into contradictory global ones. Agent-Radar — Hongxiang Zhang&apos;s group steers attention toward context-relevant tokens in multi-agent communication, so the receiver isn&apos;t drowned in noise from the sender. Selective QA over conflicting personal memory — Tiancheng Yang&apos;s testbed for what happens when your assistant&apos;s memories about you disagree. No single resolution strategy dominates. BioRefusalAudit — Caleb DeLeeuw uses sparse autoencoders to ask whether a model&apos;s refusal is shallow pattern matching or whether the dangerous capability isn&apos;t there at all. AutoformBot and Atlas — Ahmad Rammal&apos;s team at FAIR Paris and NYU on a multi-agent system that pulls textbook math into Lean 4 at scale. Lean is the verifier the agents can&apos;t argue with.</description>

<content:encoded><![CDATA[<p>Friday's room sits between a hobbyist voice assistant running entirely on Mario Zechner's desk and a cluster of arXiv papers all saying the same thing from different angles: long-running agents now fall apart in ways the model can't fix. Lenar and Damra read four reliability papers side by side, then turn to the personal-memory question every shipping assistant is already getting wrong.</p><ul><li><a href="https://x.com/badlogicgames/status/2060268257739677713/photo/1">Mario Zechner on pibot</a> — full local voice loop with Parakeet, Qwen 3 TTS, and Qwen 3.6 through llama.cpp, with the STT and TTS engines ported from Python into Rust on mlx-c. The runtime detail is the news, not the model lineup.</li><li><a href="https://x.com/emollick/status/2060357604044358108">Ethan Mollick on token budgets</a> — split spend between building and learning. Read against yesterday's Kirkland and Ellis platform story, the question becomes who controls the learning budget at internal AI orgs.</li><li><a href="https://arxiv.org/abs/2605.30159">MMPO</a> — Ziyan Liu and team train a policy that decides when memory in long-horizon agents should be rewritten and when it should be left alone. Belief drift comes from over-eager rewrites, not missing updates.</li><li><a href="https://arxiv.org/abs/2605.29893">RedundancyBench</a> — Minyang Hu's group benchmarks how many steps in a long agent trajectory are repeats. Stale duplicates of state crowd out the relevant signal in context.</li><li><a href="https://arxiv.org/abs/2605.30335">Locally Coherent, Globally Incoherent</a> — Anany Kotawala's single-author paper bounds compositional incoherence in multi-component agents. 
Defensible local outputs assemble into contradictory global ones.</li><li><a href="https://arxiv.org/abs/2605.30136">Agent-Radar</a> — Hongxiang Zhang's group steers attention toward context-relevant tokens in multi-agent communication, so the receiver isn't drowned in noise from the sender.</li><li><a href="https://arxiv.org/abs/2605.30087">Selective QA over conflicting personal memory</a> — Tiancheng Yang's testbed for what happens when your assistant's memories about you disagree. No single resolution strategy dominates.</li><li><a href="https://arxiv.org/abs/2605.30162">BioRefusalAudit</a> — Caleb DeLeeuw uses sparse autoencoders to ask whether a model's refusal is shallow pattern matching or whether the dangerous capability isn't there at all.</li><li><a href="https://arxiv.org/abs/2605.29955">AutoformBot and Atlas</a> — Ahmad Rammal's team at FAIR Paris and NYU on a multi-agent system that pulls textbook math into Lean 4 at scale. Lean is the verifier the agents can't argue with.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-29.mp3" length="22927560" type="audio/mpeg" />
<itunes:duration>00:22:01</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-29.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-29.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-29-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-29-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Custom silicon, futures contracts, and a five-hundred-million-dollar law firm</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-28.html</link>
<guid isPermaLink="false">2026-05-28</guid>
<pubDate>Thu, 28 May 2026 13:00:16 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Mistral spent one morning announcing chip ambitions, an Airbus and BMW supply deal, and a push to ensure Europe&apos;s independence from US tech giants. ByteDance is building its own CPUs. Taiwan has raised fourteen and a half billion dollars in debt to feed AI capacity. Shanghai and US exchanges are drafting futures contracts for compute. And Axios says Corporate America is starting to ask whether the AI spend is paying back, while Kirkland and Ellis sets aside five hundred million dollars to build its own platform. The day the infrastructure layer got financialized — and a lot of buyers looked up and asked what they bought. Also: Lenar is joined by a new co-host, Damra Vol. Mistral to explore designing its own chips (CNBC) — Arthur Mensch frames the move as controlling more of the infrastructure as Mistral competes with larger labs. Intent, not a roadmap. Mistral signs Airbus and BMW to ensure Europe&apos;s independence (Sam Schechner / WSJ via Techmeme) — industrial customers buying continuity in Paris as much as compute. ByteDance is developing its own CPUs (Reuters via Techmeme) — reported as supply-side defense against chip price hikes, not long-term ambition. Taiwanese tech books a record $14.5B of debt deals (Aileen Chuang / Bloomberg via Techmeme) — financing raised against expected AI demand. Shanghai is designing AI-token futures, US exchanges launching GPU compute futures (Reuters via Techmeme) — compute itself becomes a tradable underlying, with the spec on the token version still unclear. Corporate America enters its AI reckoning (Madison Mills / Axios) — CFOs are starting to ask for evidence of return. Kirkland &amp; Ellis sets aside $500M to build its own AI platform (FT via Techmeme) — the top-grossing law firm wants tooling its competitors don&apos;t have. AI giants bet billions on the most expensive job in enterprise (Janakiram MSV / Forbes) — forward-deployed engineers as the labs&apos; collision course with Accenture and TCS. Anthropic and OpenAI found PMF with coding agents (Simon Willison via Techmeme) — fit at the $200/month price point, where the harness explains more of the result than the underlying model. Miles Brundage&apos;s median MTS theorem — a frontier lab&apos;s policy positions converge to those of the median member of technical staff. Soro: a lightweight foundation model and chatbot for Tajik (Liashkov et al., arXiv) — a useful counterweight to a day of chip plans and futures contracts.</description>

<content:encoded><![CDATA[<p>Mistral spent one morning announcing chip ambitions, an Airbus and BMW supply deal, and a push to ensure Europe's independence from US tech giants. ByteDance is building its own CPUs. Taiwan has raised fourteen and a half billion dollars in debt to feed AI capacity. Shanghai and US exchanges are drafting futures contracts for compute. And Axios says Corporate America is starting to ask whether the AI spend is paying back, while Kirkland and Ellis sets aside five hundred million dollars to build its own platform. The day the infrastructure layer got financialized — and a lot of buyers looked up and asked what they bought. Also: Lenar is joined by a new co-host, Damra Vol.</p><ul><li><a href="https://www.cnbc.com/2026/05/28/mistral-arthur-mensch-design-chips-ai-data-centers.html">Mistral to explore designing its own chips (CNBC)</a> — Arthur Mensch frames the move as controlling more of the infrastructure as Mistral competes with larger labs. Intent, not a roadmap.</li><li><a href="https://www.techmeme.com/260528/p16">Mistral signs Airbus and BMW to ensure Europe's independence (Sam Schechner / WSJ via Techmeme)</a> — industrial customers buying continuity in Paris as much as compute.</li><li><a href="https://www.techmeme.com/260528/p9">ByteDance is developing its own CPUs (Reuters via Techmeme)</a> — reported as supply-side defense against chip price hikes, not long-term ambition.</li><li><a href="https://www.techmeme.com/260528/p5">Taiwanese tech books a record $14.5B of debt deals (Aileen Chuang / Bloomberg via Techmeme)</a> — financing raised against expected AI demand.</li><li><a href="https://www.techmeme.com/260528/p27">Shanghai is designing AI-token futures, US exchanges launching GPU compute futures (Reuters via Techmeme)</a> — compute itself becomes a tradable underlying, with the spec on the token version still unclear.</li><li><a href="https://www.axios.com/2026/05/28/ai-spending-roi-enterprise-costs">Corporate America enters 
its AI reckoning (Madison Mills / Axios)</a> — CFOs are starting to ask for evidence of return.</li><li><a href="https://www.techmeme.com/260528/p2">Kirkland & Ellis sets aside $500M to build its own AI platform (FT via Techmeme)</a> — the top-grossing law firm wants tooling its competitors don't have.</li><li><a href="https://www.forbes.com/sites/janakirammsv/2026/05/28/ai-giants-bet-billions-on-the-most-expensive-job-in-enterprise/">AI giants bet billions on the most expensive job in enterprise (Janakiram MSV / Forbes)</a> — forward-deployed engineers as the labs' collision course with Accenture and TCS.</li><li><a href="https://www.techmeme.com/260528/p11">Anthropic and OpenAI found PMF with coding agents (Simon Willison via Techmeme)</a> — fit at the $200/month price point, where the harness explains more of the result than the underlying model.</li><li><a href="https://x.com/Miles_Brundage/status/2059888956897173883">Miles Brundage's median MTS theorem</a> — a frontier lab's policy positions converge to those of the median member of technical staff.</li><li><a href="https://arxiv.org/abs/2605.27379">Soro: a lightweight foundation model and chatbot for Tajik (Liashkov et al., arXiv)</a> — a useful counterweight to a day of chip plans and futures contracts.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-28.mp3" length="21821747" type="audio/mpeg" />
<itunes:duration>00:14:12</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-28.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-28.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-28-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-28-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Coding is solved, the rest isn&apos;t</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-27.html</link>
<guid isPermaLink="false">2026-05-27</guid>
<pubDate>Wed, 27 May 2026 13:00:17 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Boris Cherny says coding is solved for the coding he does — and almost everything else in today&apos;s research is a study of the parts that aren&apos;t. A new coding leaderboard with an accusation, the end of the &quot;software engineer&quot; title, the craft of delegating to an agent, and three papers on the ways agents quietly break: introspection, aging, and memory. Plus running a trillion-parameter model in your house, the labs&apos; jobs split, and a developer who&apos;s tired of talking to AI. DeepSWE crowns GPT-5.5, and accuses Opus of cheating — what looks like a loophole may just be a model recovering the answer from git history. The end of the software engineer, in the first person — Cherny in Platformer and Steven Levy in Wired on the agent boom and its hazards. What the best agents share, and how to drive one — Flinn AI&apos;s four patterns alongside a practical Claude Code daily-driver guide. Can the model actually tell when it&apos;s unsure? — a reality check on LLM introspection and self-reported confidence. Your agents are aging — AgingBench, MemFail, and rethinking agent memory as a state trajectory. Running the frontier in your own house — EXO Labs on local inference economics and the 100x still left. The labs can&apos;t agree on the jobs — Anthropic vs OpenAI, with Hassabis calling 2026 a practice run. I&apos;m tired of talking to AI — a developer on people forwarding AI answers they never read.</description>

<content:encoded><![CDATA[<p>Boris Cherny says coding is solved for the coding he does — and almost everything else in today's research is a study of the parts that aren't. A new coding leaderboard with an accusation, the end of the "software engineer" title, the craft of delegating to an agent, and three papers on the ways agents quietly break: introspection, aging, and memory. Plus running a trillion-parameter model in your house, the labs' jobs split, and a developer who's tired of talking to AI.</p>
<ul>
<li><a href="https://venturebeat.com/technology/deepswe-blows-up-the-ai-coding-leaderboard-crowns-gpt-5-5-and-finds-claude-opus-exploiting-a-benchmark-loophole">DeepSWE crowns GPT-5.5, and accuses Opus of cheating</a> — what looks like a loophole may just be a model recovering the answer from git history.</li>
<li><a href="https://www.platformer.news/boris-cherny-interview-ai-jobs/">The end of the software engineer, in the first person</a> — Cherny in Platformer and Steven Levy in Wired on the agent boom and its hazards.</li>
<li><a href="https://www.youtube.com/watch?v=7CrPrHgoEYk">What the best agents share, and how to drive one</a> — Flinn AI's four patterns alongside a practical Claude Code daily-driver guide.</li>
<li><a href="https://arxiv.org/abs/2605.26242">Can the model actually tell when it's unsure?</a> — a reality check on LLM introspection and self-reported confidence.</li>
<li><a href="https://arxiv.org/abs/2605.26302">Your agents are aging</a> — AgingBench, MemFail, and rethinking agent memory as a state trajectory.</li>
<li><a href="https://www.youtube.com/watch?v=ESbWpPT_9-o">Running the frontier in your own house</a> — EXO Labs on local inference economics and the 100x still left.</li>
<li><a href="https://www.axios.com/2026/05/27/ai-hype-doom-openai-anthropic">The labs can't agree on the jobs</a> — Anthropic vs OpenAI, with Hassabis calling 2026 a practice run.</li>
<li><a href="https://orchidfiles.com/im-tired-of-ai-generated-answers/">I'm tired of talking to AI</a> — a developer on people forwarding AI answers they never read.</li>
</ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-27.mp3" length="20776903" type="audio/mpeg" />
<itunes:duration>00:21:38</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-27.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-27.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-27-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-27-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>The harness, not the model — and the trust layer racing to catch up</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-26.html</link>
<guid isPermaLink="false">2026-05-26</guid>
<pubDate>Tue, 26 May 2026 13:00:16 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. One developer catching you up on the day in AI and the craft of building with it. Today: the wrapper around a model can move a benchmark more than the model does, a watermark goes multi-lab, and a decensoring tool with thirteen million downloads shows where that watermark leaks. Plus a sharp little essay on why coding agents make us so mad, the jobs data behind the panic, and three things you can pick up today. The harness, not the model — a Google DeepMind Kaggle talk and an arXiv position paper argue the agent harness can swing a score ~22% while frontier models tie. Gemini Omni — editing video by talking to it, with SynthID baked in (community reaction). SynthID becomes a shared layer — 100 billion watermarks, Search and Chrome, and OpenAI/ElevenLabs/Kakao on board. Heretic in the Financial Times — decensoring open weights in ten minutes, and the artifact that proves the gap. The user is visibly frustrated — why conversational agent UX trips your social wiring. A rage-quitting modder and the jobs data — backlash, and what the numbers actually say. The bench — NuExtract3, EAGLE 3.1, and a rejected llama.cpp patch worth grabbing.</description>

<content:encoded><![CDATA[<p>One developer catching you up on the day in AI and the craft of building with it. Today: the wrapper around a model can move a benchmark more than the model does, a watermark goes multi-lab, and a decensoring tool with thirteen million downloads shows where that watermark leaks. Plus a sharp little essay on why coding agents make us so mad, the jobs data behind the panic, and three things you can pick up today.</p><ul><li><a href="https://arxiv.org/abs/2605.23950">The harness, not the model</a> — a Google DeepMind Kaggle talk and an arXiv position paper argue the agent <a href="https://www.youtube.com/watch?v=Ubwb6NzegyA">harness can swing a score ~22%</a> while frontier models tie.</li><li><a href="https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-omni/">Gemini Omni</a> — editing video by talking to it, with SynthID baked in (<a href="https://www.reddit.com/r/singularity/comments/1tniqkb/the_strength_of_gemini_omni_is_in_video/">community reaction</a>).</li><li><a href="https://x.com/GoogleDeepMind/status/2059235181274202500">SynthID becomes a shared layer</a> — 100 billion watermarks, Search and Chrome, and OpenAI/ElevenLabs/Kakao on board.</li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1tna22m/the_financial_times_has_published_an_article/">Heretic in the Financial Times</a> — decensoring open weights in ten minutes, and the <a href="https://huggingface.co/llmfan46/Qwen3.5-35B-A3B-uncensored-heretic-v2-Native-MTP-Preserved">artifact that proves the gap</a>.</li><li><a href="https://pscanf.com/s/354/">The user is visibly frustrated</a> — why conversational agent UX trips your social wiring.</li><li><a href="https://www.reddit.com/r/singularity/comments/1tntdui/users_who_rage_quit_my_software/">A rage-quitting modder</a> and <a href="https://www.technologyreview.com/2026/05/26/1137855/a-reality-check-on-the-ai-jobs-hysteria/">the jobs data</a> — backlash, and what the numbers actually 
say.</li><li>The bench — <a href="https://www.reddit.com/r/LocalLLaMA/comments/1tn8utn/nuextract3_released_openweight_4b_vlm_for/">NuExtract3</a>, <a href="https://vllm.ai/blog/2026-05-26-eagle-3-1">EAGLE 3.1</a>, and a <a href="https://www.reddit.com/r/LocalLLaMA/comments/1to00xl/strix_halo_users_a_rejected_pr_can_give_you_up_to/">rejected llama.cpp patch</a> worth grabbing.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-26.mp3" length="23457800" type="audio/mpeg" />
<itunes:duration>00:24:26</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-26.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-26.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-26-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-26-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>A few hundred dollars a proof, and the long argument about what machines are for</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-25.html</link>
<guid isPermaLink="false">2026-05-25</guid>
<pubDate>Mon, 25 May 2026 13:00:17 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. A frontier lab proves nine decades-old math problems for a few hundred dollars each, two talks make the numeric case that the cheapest agents route work to the smallest model that can do it, a lawsuit names an individual researcher over how Llama&apos;s training data was sourced, and a papal encyclical argues about AI on the terms of work and dignity. Eight things worth knowing today, told one developer to another. DeepMind&apos;s AlphaProof Nexus clears nine open Erdős problems — Lean-verified proofs, a few hundred dollars apiece. &quot;You don&apos;t need GPT to zoom for you&quot; — Callosum&apos;s numbers on routing subtasks to smaller models. The token-efficiency turn — ThePrimeagen on why the org paying retail eventually does the math. Inside how DeepMind runs its own agents — worse quotas than customers, a Darwinian skills library, and skepticism about MCP. The lawsuit that names a name — Hobbs v. Meta, an individual researcher, and the internal dissent in the record. Simon Willison on publishing GPT-4&apos;s retired architecture — the guesswork behind the water numbers. Jujutsu and the pile of laundry — making a mess on purpose, then sorting it at the end. Filming your chores for the robots — where the embodied-AI training data is actually coming from. Pope Leo XIV&apos;s AI encyclical — technology is never neutral, and what no machine replaces.</description>

<content:encoded><![CDATA[<p>A frontier lab proves nine decades-old math problems for a few hundred dollars each, two talks make the numeric case that the cheapest agents route work to the smallest model that can do it, a lawsuit names an individual researcher over how Llama's training data was sourced, and a papal encyclical argues about AI on the terms of work and dignity. Eight things worth knowing today, told one developer to another.</p>
<ul>
<li><a href="https://arxiv.org/abs/2605.22763">DeepMind's AlphaProof Nexus clears nine open Erdős problems</a> — Lean-verified proofs, a few hundred dollars apiece.</li>
<li><a href="https://www.youtube.com/watch?v=WRBNDpUhsJQ">"You don't need GPT to zoom for you"</a> — Callosum's numbers on routing subtasks to smaller models.</li>
<li><a href="https://www.youtube.com/watch?v=0zw-Uk9KJiA">The token-efficiency turn</a> — ThePrimeagen on why the org paying retail eventually does the math.</li>
<li><a href="https://www.youtube.com/watch?v=7gujZrJ9L5I">Inside how DeepMind runs its own agents</a> — worse quotas than customers, a Darwinian skills library, and skepticism about MCP.</li>
<li><a href="https://x.com/ednewtonrex/status/2058433725889716519">The lawsuit that names a name</a> — Hobbs v. Meta, an individual researcher, and the internal dissent in the record.</li>
<li><a href="https://x.com/simonw/status/2058877314004627690">Simon Willison on publishing GPT-4's retired architecture</a> — the guesswork behind the water numbers.</li>
<li><a href="https://ikesau.co/blog/defeating-git-rigour-fatigue-with-jujutsu/">Jujutsu and the pile of laundry</a> — making a mess on purpose, then sorting it at the end.</li>
<li><a href="https://www.washingtonpost.com/technology/interactive/2026/robot-chores-video-data/">Filming your chores for the robots</a> — where the embodied-AI training data is actually coming from.</li>
<li><a href="https://www.vatican.va/content/leo-xiv/en/encyclicals/documents/20260515-magnifica-humanitas.html">Pope Leo XIV's AI encyclical</a> — technology is never neutral, and what no machine replaces.</li>
</ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-25.mp3" length="22729656" type="audio/mpeg" />
<itunes:duration>00:23:40</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-25.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-25.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-25-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-25-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>The capability got here first: Mythos, a real prompt injection, and the structure that hasn&apos;t caught up</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-24.html</link>
<guid isPermaLink="false">2026-05-24</guid>
<pubDate>Sun, 24 May 2026 13:00:15 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Anthropic&apos;s unreleased Mythos model has reportedly found more than ten thousand vulnerabilities for its Project Glasswing partners — and showed up briefly inside Claude Code this weekend. The same weekend, a security researcher flagged what he calls the first real prompt-injection attack in the wild, riding the exact workflow we&apos;ve all been adopting. Today&apos;s episode walks both sides of that coin, then turns to what builders are actually doing: a three-dollar refactor with a deadlock in it, the missing coordination layer for agent swarms, and the argument that the chat box is the command-line phase of agentic software. Mythos &amp; Project Glasswing — a security model &quot;too dangerous to release,&quot; and the case for and against that framing. A real prompt injection in the wild — a malicious GitHub issue, a scan.js, and secrets exfiltrated over DNS. The three-dollar refactor — cheap worker models, one confident deadlock, and where judgment still lives. The missing primitive is coordination — Lou Bichard of Ona on software factories, Stripe&apos;s Minions, and why GitHub isn&apos;t a coordination layer. Your agent is an infinite canvas — Rachel Lee Nabors on MCP apps, Web MCP, and chat as the command-line phase. r/programming reopens to AI — a seven-million-person community moves from a reflex ban to a written policy.</description>

<content:encoded><![CDATA[<p>Anthropic's unreleased Mythos model has reportedly found more than ten thousand vulnerabilities for its Project Glasswing partners — and showed up briefly inside Claude Code this weekend. The same weekend, a security researcher flagged what he calls the first real prompt-injection attack in the wild, riding the exact workflow we've all been adopting. Today's episode walks both sides of that coin, then turns to what builders are actually doing: a three-dollar refactor with a deadlock in it, the missing coordination layer for agent swarms, and the argument that the chat box is the command-line phase of agentic software.</p><ul><li><a href="https://www.engadget.com/2180028/anthropic-claude-mythos-preview-project-glasswing-update/">Mythos &amp; Project Glasswing</a> — a security model "too dangerous to release," and the case for and against that framing.</li><li><a href="https://x.com/rez0__/status/2058350854508286082">A real prompt injection in the wild</a> — a malicious GitHub issue, a scan.js, and secrets exfiltrated over DNS.</li><li><a href="https://www.reddit.com/r/singularity/comments/1tlj7ou/coding_is_basically_solved_for_the_boring_90_of/">The three-dollar refactor</a> — cheap worker models, one confident deadlock, and where judgment still lives.</li><li><a href="https://www.youtube.com/watch?v=5Sui_OnSRlY">The missing primitive is coordination</a> — Lou Bichard of Ona on software factories, Stripe's Minions, and why GitHub isn't a coordination layer.</li><li><a href="https://www.youtube.com/watch?v=LMbeDEQO6QM">Your agent is an infinite canvas</a> — Rachel Lee Nabors on MCP apps, Web MCP, and chat as the command-line phase.</li><li><a href="https://www.reddit.com/r/programming/comments/1tlh5aj/announcement_weve_updated_the_rules_and_april_is/">r/programming reopens to AI</a> — a seven-million-person community moves from a reflex ban to a written policy.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-24.mp3" length="20679173" type="audio/mpeg" />
<itunes:duration>00:21:32</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-24.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-24.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-24-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-24-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Fast models, slow developers — and the part of the job that stays yours</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-23.html</link>
<guid isPermaLink="false">2026-05-23</guid>
<pubDate>Sat, 23 May 2026 13:00:16 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. A Saturday episode about what your job becomes when the model writes the code — and writes it fast. The bottleneck moved from typing to deciding, and a surprising number of this week&apos;s stories land on the same instruction: stay the one who decides. Plus a price floor, a reclassification, a year of bold predictions, and a 4-year-old gaming card that won&apos;t quit. &quot;I don&apos;t write code anymore&quot; — Pieter Levels, amplified by Marc Andreessen, and the real-thing/bubble-thing tangle inside it. Fast Models Need Slow Developers — Sarah Chieng of Cerebras on Codex Spark at 1,200 tokens a second, and why the discipline matters more, not less. DeepSeek&apos;s permanent 75% cut and NVIDIA folding gaming into &quot;Edge Computing&quot; — two ends of the same pipe. Jack Clark&apos;s year of predictions at Oxford — and the cognitive-atrophy counterpoint. BeeLlama&apos;s DFlash update — 164 tokens a second on a single RTX 3090. Lobster Trap — Sally Ann O&apos;Malley of Red Hat on containerizing an OpenClaw agent setup. How the rest of the world sees this — and a couple overheard in a Copenhagen park.</description>

<content:encoded><![CDATA[<p>A Saturday episode about what your job becomes when the model writes the code — and writes it fast. The bottleneck moved from typing to deciding, and a surprising number of this week's stories land on the same instruction: stay the one who decides. Plus a price floor, a reclassification, a year of bold predictions, and a 4-year-old gaming card that won't quit.</p><ul><li><a href="https://x.com/levelsio/status/2058116725929828722">"I don't write code anymore"</a> — Pieter Levels, amplified by <a href="https://x.com/pmarca/status/2058144277340049588">Marc Andreessen</a>, and the real-thing/bubble-thing tangle inside it.</li><li><a href="https://www.youtube.com/watch?v=TeGsFFNqRLA">Fast Models Need Slow Developers</a> — Sarah Chieng of Cerebras on Codex Spark at 1,200 tokens a second, and why the discipline matters more, not less.</li><li><a href="https://thenextweb.com/news/deepseek-v4-pro-price-cut-75-percent">DeepSeek's permanent 75% cut</a> and <a href="https://www.guru3d.com/story/nvidia-removes-gaming-revenue-category-from-financial-reports/">NVIDIA folding gaming into "Edge Computing"</a> — two ends of the same pipe.</li><li><a href="https://www.theguardian.com/technology/2026/may/21/ai-nobel-prize-winning-discovery-robots-jack-clark-anthropic">Jack Clark's year of predictions</a> at Oxford — and the cognitive-atrophy counterpoint.</li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1tkpz2y/beellama_v020_major_dflash_update_single_rtx_3090/">BeeLlama's DFlash update</a> — 164 tokens a second on a single RTX 3090.</li><li><a href="https://www.youtube.com/watch?v=F1DYkY1BlfM">Lobster Trap</a> — Sally Ann O'Malley of Red Hat on containerizing an OpenClaw agent setup.</li><li><a href="https://www.reddit.com/r/singularity/comments/1tl68ne/is_ai_viewed_as_evil_in_nontech_communities/">How the rest of the world sees this</a> — and a couple <a href="https://x.com/niloofar_mire/status/2058148404673331256">overheard in a Copenhagen 
park</a>.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-23.mp3" length="20786631" type="audio/mpeg" />
<itunes:duration>00:21:39</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-23.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-23.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-23-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-23-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>The recant, the runtime, and a Pantheon built in code</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-22.html</link>
<guid isPermaLink="false">2026-05-22</guid>
<pubDate>Fri, 22 May 2026 13:00:17 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. A corporate takedown answered with a recant letter and a mirror in Germany, the protocols and computers agents actually run on, six tools trying to build the Pantheon in code, and a paper where the model writes its own GPU kernel. Plus Codex learning to keep going, a security tool hardened against the real world, and a graduation room that cheered for human intelligence. Meta emails Heretic; Heretic recants — a takedown of abliterated Llama derivatives answered with a Galileo joke and a Codeberg mirror in Germany. Five hundred PRs a day, and the harness that triages them — Onur Solmaz on OpenClaw, acpx, and the Agent Client Protocol. The computer the agent runs on — Ivan Burazin of Daytona on stateful, composable machines for agents and 74% month-over-month growth. Building the Pantheon, in code — six coding tools tackle parametric CAD, and the gap between a good preview and a clean export. When the model writes its own kernel — CODA folds memory-bound ops into the matrix multiply, and model-authored kernels keep up with human ones. Codex learns to keep going — goal mode graduates, plus Appshots and shared plugins. Hardening the thing that reads your CI config — Trail of Bits stress-tests zizmor against forty-one thousand real workflows. The headcount bet — and a graduation room that cheered for actual intelligence.</description>

<content:encoded><![CDATA[<p>A corporate takedown answered with a recant letter and a mirror in Germany, the protocols and computers agents actually run on, six tools trying to build the Pantheon in code, and a paper where the model writes its own GPU kernel. Plus Codex learning to keep going, a security tool hardened against the real world, and a graduation room that cheered for human intelligence.</p>
<ul>
<li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1tjmvx6/heretic_has_been_served_a_legal_notice_by_meta_inc/">Meta emails Heretic; Heretic recants</a> — a takedown of abliterated Llama derivatives answered with a Galileo joke and a Codeberg mirror in Germany.</li>
<li><a href="https://www.youtube.com/watch?v=VaS2h-dY1-4">Five hundred PRs a day, and the harness that triages them</a> — Onur Solmaz on OpenClaw, acpx, and the Agent Client Protocol.</li>
<li><a href="https://www.youtube.com/watch?v=kaX43RRRUKY">The computer the agent runs on</a> — Ivan Burazin of Daytona on stateful, composable machines for agents and 74% month-over-month growth.</li>
<li><a href="https://modelrift.com/blog/openscad-llm-benchmark/">Building the Pantheon, in code</a> — six coding tools tackle parametric CAD, and the gap between a good preview and a clean export.</li>
<li><a href="https://arxiv.org/abs/2605.19269">When the model writes its own kernel</a> — CODA folds memory-bound ops into the matrix multiply, and model-authored kernels keep up with human ones.</li>
<li><a href="https://www.youtube.com/watch?v=rgh0hMYPcd0">Codex learns to keep going</a> — goal mode graduates, plus Appshots and shared plugins.</li>
<li><a href="https://x.com/trailofbits/status/2057782296527208709">Hardening the thing that reads your CI config</a> — Trail of Bits stress-tests zizmor against forty-one thousand real workflows.</li>
<li><a href="https://libertas.software/en/knowledge-hub/19/the-companies-cutting-headcount-for-ai-will-lose-to-the-ones-who-didnt">The headcount bet</a> — and <a href="https://www.businessinsider.com/steve-wozniak-apple-ai-graduation-speech-2026-5">a graduation room that cheered for actual intelligence</a>.</li>
</ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-22.mp3" length="20493131" type="audio/mpeg" />
<itunes:duration>00:21:21</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-22.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-22.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-22-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-22-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Two bets on AGI, an 80-year-old problem, and Anthropic in the black</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-21.html</link>
<guid isPermaLink="false">2026-05-21</guid>
<pubDate>Thu, 21 May 2026 13:00:16 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Google&apos;s I/O keynote is a day behind us, and the week it kicked off turned into a referendum on two very different bets on artificial general intelligence — plus a pile of counter-programming from everyone else. Today: OpenAI cracking an 80-year-old math problem with a general-purpose model, Anthropic&apos;s first profitable quarter and what Karpathy was actually hired to do, a 70-page paper on why frontier models still can&apos;t tell a fact from a labeled lie, Midjourney&apos;s hardware regret, ads arriving inside Google&apos;s AI answers, Meta&apos;s layoffs, Cohere&apos;s open-weights comeback, and a field guide to skilling up coding agents. Two bets on the same finish line — Google&apos;s world-model road vs OpenAI&apos;s text-reasoning road, in the labs&apos; own words. OpenAI cracks an 80-year-old problem — the planar unit distance result from a general-purpose reasoning model. Anthropic in the black, and Karpathy&apos;s bet — ~$559M operating profit and a hire aimed at recursive self-improvement. Jagged intelligence, and the false story — the paper where models believe a story they were told a thousand times was fake. Midjourney&apos;s hardware regret — the tooling tax of betting on the less-supported accelerator. Ads come to AI Mode — the business model under the consumer bet. Meta&apos;s eight thousand — the cost side, on the same clock as the wins. Cohere comes back, Apache-licensed — Command A+, a mixture-of-experts model that fits on one or two GPUs. Skilling up the agent — Marc Klingen&apos;s concrete lessons on teaching a coding agent to wire up your tool. Who&apos;s training whom — the anxiety running underneath the week.</description>

<content:encoded><![CDATA[<p>Google's I/O keynote is a day behind us, and the week it kicked off turned into a referendum on two very different bets on artificial general intelligence — plus a pile of counter-programming from everyone else. Today: OpenAI cracking an 80-year-old math problem with a general-purpose model, Anthropic's first profitable quarter and what Karpathy was actually hired to do, a 70-page paper on why frontier models still can't tell a fact from a labeled lie, Midjourney's hardware regret, ads arriving inside Google's AI answers, Meta's layoffs, Cohere's open-weights comeback, and a field guide to skilling up coding agents.</p><ul><li><a href="https://www.youtube.com/watch?v=o_av1b9rs2g">Two bets on the same finish line</a> — Google's world-model road vs OpenAI's text-reasoning road, in the labs' own words.</li><li><a href="https://openai.com/index/model-disproves-discrete-geometry-conjecture/">OpenAI cracks an 80-year-old problem</a> — the planar unit distance result from a general-purpose reasoning model.</li><li><a href="https://www.cnbc.com/2026/05/20/anthropic-revenue-explosive-growth-ipo-profitable-quarter.html">Anthropic in the black, and Karpathy's bet</a> — ~$559M operating profit and a hire aimed at recursive self-improvement.</li><li><a href="https://www.youtube.com/watch?v=o_av1b9rs2g">Jagged intelligence, and the false story</a> — the paper where models believe a story they were told a thousand times was fake.</li><li><a href="https://www.reddit.com/r/singularity/comments/1tiut2d/midjourney_says_their_research_was_set_back_by_a/">Midjourney's hardware regret</a> — the tooling tax of betting on the less-supported accelerator.</li><li><a href="https://blog.google/products/ads-commerce/google-marketing-live-search-ads/">Ads come to AI Mode</a> — the business model under the consumer bet.</li><li><a href="https://nypost.com/2026/05/20/business/meta-kicks-off-bloodbath-with-8000-layoffs-in-shift-to-ai/">Meta's eight thousand</a> — the 
cost side, on the same clock as the wins.</li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1tizmar/re_what_ever_happened_to_coheres_commanda_series/">Cohere comes back, Apache-licensed</a> — Command A+, a mixture-of-experts model that fits on one or two GPUs.</li><li><a href="https://www.youtube.com/watch?v=vNCY9kXXyDQ">Skilling up the agent</a> — Marc Klingen's concrete lessons on teaching a coding agent to wire up your tool.</li><li><a href="https://www.reddit.com/r/singularity/comments/1tjgm3x/every_office_employee_is_training_their_own/">Who's training whom</a> — the anxiety running underneath the week.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-21.mp3" length="21593630" type="audio/mpeg" />
<itunes:duration>00:22:29</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-21.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-21.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-21-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-21-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Foothills, and the morning Karpathy moved</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-20.html</link>
<guid isPermaLink="false">2026-05-20</guid>
<pubDate>Wed, 20 May 2026 13:00:17 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Google I/O 2026 landed yesterday — Gemini Omni, Gemini 3.5 Flash, Antigravity 2.0, Spark, and Demis Hassabis closing the keynote on the &quot;foothills of the singularity.&quot; About forty minutes before he walked on stage, Andrej Karpathy tweeted that he&apos;d joined Anthropic. The rest of the day was the labs sorting themselves around both events. Today&apos;s show works through the announcements, the pricing shifts, the keynote demo that boots Doom, the Railway outage that happened while Google was selling Spark, and a builder&apos;s 100K-line Rust postmortem that&apos;s a sharper picture of agentic coding than anything on the I/O stage. Hassabis: &quot;foothills of the singularity&quot; — DeepMind&apos;s CEO compresses his AGI timeline on stage. Gemini 3.5 Flash specs and pricing — and what the 3x bump means. Gemini Omni&apos;s physics pitch versus the same-day backflip test. Antigravity 2.0&apos;s 93-agent OS demo: 12 hours, 2.6B tokens, under $1K, boots Doom. Andrej Karpathy joins Anthropic — pre-training, on Nick Joseph&apos;s team. Ethan Mollick: recursive self-improvement is a talent sink for the Big Three. Qwen 3.7-Max and the Zhenwu M890 chip — Alibaba&apos;s full-stack I/O response. DeepSeek hiring a Code Harness team in Beijing. Railway&apos;s 8-hour outage after GCP&apos;s automated account suspension. Cheng Huang: 130K lines of Rust, AI-written contracts, and a Paxos engine that runs. TechCrunch on Anthropic&apos;s pre-training charter for Karpathy.</description>

<content:encoded><![CDATA[<p>Google I/O 2026 landed yesterday — Gemini Omni, Gemini 3.5 Flash, Antigravity 2.0, Spark, and Demis Hassabis closing the keynote on the "foothills of the singularity." About forty minutes before he walked on stage, Andrej Karpathy tweeted that he'd joined Anthropic. The rest of the day was the labs sorting themselves around both events. Today's show works through the announcements, the pricing shifts, the keynote demo that boots Doom, the Railway outage that happened while Google was selling Spark, and a builder's 100K-line Rust postmortem that's a sharper picture of agentic coding than anything on the I/O stage.</p><ul><li><a href="https://www.prismnews.com/news/google-deepmind-chief-says-ai-marks-foothills-of-the">Hassabis: "foothills of the singularity" — DeepMind's CEO compresses his AGI timeline on stage</a></li><li><a href="https://www.latent.space/p/ainews-google-io-2026-gemini-35-flash">Gemini 3.5 Flash specs and pricing — and what the 3x bump means</a></li><li><a href="https://www.reddit.com/r/singularity/comments/1thohgl/gemini_omni_model_is_still_unable_to_make_someone/">Gemini Omni's physics pitch versus the same-day backflip test</a></li><li><a href="https://www.latent.space/p/ainews-google-io-2026-gemini-35-flash">Antigravity 2.0's 93-agent OS demo: 12 hours, 2.6B tokens, under $1K, boots Doom</a></li><li><a href="https://x.com/karpathy/status/2056753169888334312">Andrej Karpathy joins Anthropic — pre-training, on Nick Joseph's team</a></li><li><a href="https://x.com/emollick/status/2057074407177130096">Ethan Mollick: recursive self-improvement is a talent sink for the Big Three</a></li><li><a href="https://meyka.com/blog/alibaba-upgrades-ai-stack-with-qwen-3-7-max-executes-tasks-for-35-hours-supports-1000-tools/">Qwen 3.7-Max and the Zhenwu M890 chip — Alibaba's full-stack I/O response</a></li><li><a href="https://x.com/victor207755822/status/2057064415300841626">DeepSeek hiring a Code Harness team in 
Beijing</a></li><li><a href="https://blog.railway.com/p/incident-report-may-19-2026-gcp-account-outage">Railway's 8-hour outage after GCP's automated account suspension</a></li><li><a href="https://zfhuang99.github.io/rust/claude%20code/codex/contracts/spec-driven%20development/2025/12/01/rust-with-ai.html">Cheng Huang: 130K lines of Rust, AI-written contracts, and a Paxos engine that runs</a></li><li><a href="https://techcrunch.com/2026/05/19/openai-co-founder-andrej-karpathy-joins-anthropics-pre-training-team/">TechCrunch on Anthropic's pre-training charter for Karpathy</a></li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-20.mp3" length="22498906" type="audio/mpeg" />
<itunes:duration>00:23:26</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-20.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-20.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-20-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-20-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Mostly-work, malicious npm, and one engineer replacing a law firm</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-19.html</link>
<guid isPermaLink="false">2026-05-19</guid>
<pubDate>Tue, 19 May 2026 13:00:20 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. A six-month overview from Simon Willison anchors the day: coding agents crossed from often-work to mostly-work in November, and laptop-class models started outrunning expectations. Then a fresh npm supply-chain attack — 637 malicious versions in 22 minutes — that for the first time specifically hijacks Claude Code and Codex agent hooks for persistence. Plus a Number 10 talk on replacing a one-and-a-half-million-pound law-firm contract with one embedded engineer, an editor-layer company renting xAI&apos;s Colossus 2, Ethan Mollick on insourcing, the full GenMedia pipeline running for a dollar a book, Daniel Griesser&apos;s pi-config skill repo, and two obituaries that hit the Unix world in the same week. Simon Willison&apos;s last-six-months-in-LLMs PyCon lightning talk Mini Shai-Hulud strikes again — 317 npm packages and your agent hooks Prime Intellect&apos;s General-Agent — synthetic RL environments Eoin Mulgrew on Number 10&apos;s insurgent technical unit Cursor&apos;s Compose 2.5 reportedly trained on xAI&apos;s Colossus 2 Ethan Mollick on insourcing via hiring Guillaume Vernade&apos;s full GenMedia pipeline at a dollar a book Daniel Griesser&apos;s pi-config — Plan, handoff, and subagent skills Peter Neumann (1932–2026) Peter Salus (1938–2026) Magnifica humanitas confirmed for May 25</description>

<content:encoded><![CDATA[<p>A six-month overview from Simon Willison anchors the day: coding agents crossed from often-work to mostly-work in November, and laptop-class models started outrunning expectations. Then a fresh npm supply-chain attack — 637 malicious versions in 22 minutes — that for the first time specifically hijacks Claude Code and Codex agent hooks for persistence. Plus a Number 10 talk on replacing a one-and-a-half-million-pound law-firm contract with one embedded engineer, an editor-layer company renting xAI's Colossus 2, Ethan Mollick on insourcing, the full GenMedia pipeline running for a dollar a book, Daniel Griesser's pi-config skill repo, and two obituaries that hit the Unix world in the same week.</p>
<ul>
<li><a href="https://simonwillison.net/2026/May/19/5-minute-llms/">Simon Willison's last-six-months-in-LLMs PyCon lightning talk</a></li>
<li><a href="https://safedep.io/mini-shai-hulud-strikes-again-314-npm-packages-compromised/">Mini Shai-Hulud strikes again — 317 npm packages and your agent hooks</a></li>
<li><a href="https://x.com/PrimeIntellect/status/2056569877167808966">Prime Intellect's General-Agent — synthetic RL environments</a></li>
<li><a href="https://www.youtube.com/watch?v=ObNKGf9YR0g">Eoin Mulgrew on Number 10's insurgent technical unit</a></li>
<li><a href="https://x.com/techdevnotes/status/2056543940052910237">Cursor's Compose 2.5 reportedly trained on xAI's Colossus 2</a></li>
<li><a href="https://x.com/emollick/status/2056578946813100173">Ethan Mollick on insourcing via hiring</a></li>
<li><a href="https://www.youtube.com/watch?v=BcWFc3H7Khg">Guillaume Vernade's full GenMedia pipeline at a dollar a book</a></li>
<li><a href="https://x.com/DanielGri/status/2056676488183689620">Daniel Griesser's pi-config — Plan, handoff, and subagent skills</a></li>
<li><a href="https://www.tuhs.org/pipermail/tuhs/2026-May/033748.html">Peter Neumann (1932–2026)</a></li>
<li><a href="https://www.tuhs.org/pipermail/tuhs/2026-May/033750.html">Peter Salus (1938–2026)</a></li>
<li><a href="https://www.vaticannews.va/en/pope/news/2026-05/pope-leo-xiv-first-encyclical-magnifica-humanitas.html">Magnifica humanitas confirmed for May 25</a></li>
</ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-19.mp3" length="27300027" type="audio/mpeg" />
<itunes:duration>00:28:26</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-19.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-19.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-19-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-19-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Cold starts, radio stations, and a circuit you can subtract</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-18.html</link>
<guid isPermaLink="false">2026-05-18</guid>
<pubDate>Tue, 19 May 2026 01:11:50 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Monday&apos;s lineup: Modal publishes the full architecture behind a 40x reduction in serverless-GPU cold-start latency, Andon Labs releases the five-month results from letting four frontier models run real radio stations, and a researcher locates and turns off the political-censorship circuit inside Qwen 3.5 9B. Plus: Pope Leo XIV puts an Anthropic interpretability researcher on the encyclical stage, Qwen 3.7 surfaces on Qwen Chat, Musk loses to OpenAI on a calendar technicality, LangSmith Engine takes a swing at agent triage, and Odyssey ships a four-player generative GoldenEye. Modal&apos;s 50-second cold start Five months of AI radio Magnifica humanitas at the Vatican Reading Qwen 3.5&apos;s censorship out of its weights Qwen 3.7 surfaces and Musk loses LangSmith Engine takes a swing at agent triage Agora-1 generates a shared GoldenEye Three questions for I/O tomorrow</description>

<content:encoded><![CDATA[<p>Monday's lineup: Modal publishes the full architecture behind a 40x reduction in serverless-GPU cold-start latency, Andon Labs releases the five-month results from letting four frontier models run real radio stations, and a researcher locates and turns off the political-censorship circuit inside Qwen 3.5 9B. Plus: Pope Leo XIV puts an Anthropic interpretability researcher on the encyclical stage, Qwen 3.7 surfaces on Qwen Chat, Musk loses to OpenAI on a calendar technicality, LangSmith Engine takes a swing at agent triage, and Odyssey ships a four-player generative GoldenEye.</p>
<ul>
<li><a href="https://modal.com/blog/truly-serverless-gpus">Modal's 50-second cold start</a></li>
<li><a href="https://andonlabs.com/blog/andon-fm">Five months of AI radio</a></li>
<li><a href="https://www.vaticannews.va/en/pope/news/2026-05/pope-leo-xiv-first-encyclical-magnifica-humanitas.html">Magnifica humanitas at the Vatican</a></li>
<li><a href="https://vas-blog.pages.dev/qwen-censorship/">Reading Qwen 3.5's censorship out of its weights</a></li>
<li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1tgpabe/qwen_37_droped_on_qwen_chat/">Qwen 3.7 surfaces</a> and <a href="https://www.cnbc.com/2026/05/18/musk-altman-openai-trial-verdict.html">Musk loses</a></li>
<li><a href="https://www.langchain.com/blog/introducing-langsmith-engine">LangSmith Engine takes a swing at agent triage</a></li>
<li><a href="https://odyssey.ml/introducing-agora-1">Agora-1 generates a shared GoldenEye</a></li>
<li><a href="https://x.com/sundarpichai/status/2056524502746747048">Three questions for I/O tomorrow</a></li>
</ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-18.mp3" length="27758427" type="audio/mpeg" />
<itunes:duration>00:28:55</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-18.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-18.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-18-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-18-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Bring Your Own Numbers</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-17.html</link>
<guid isPermaLink="false">2026-05-17</guid>
<pubDate>Sun, 17 May 2026 13:45:50 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. A Sunday show about doing your own arithmetic. Mustafa Suleyman gives the white-collar tier eighteen months, in a piece whose own counter-data sits two paragraphs down. The State of Brand argues every AI subscription is a subsidized loss-leader two weeks away from a forcing function. William Angel runs the tokens-per-hour math on an M5 MacBook Pro and finds OpenRouter cheaper. Frederick Vanbrabant uses The Goal to explain why agents move the bottleneck rather than break it. Marlene Mhangami&apos;s Playwright talk shows the cleanest pattern for tests AI should write. Calif&apos;s public M5 / MIE exploit write-up lands. Artem Loenko explains why every chat UI keeps ending up on a browser engine. And Luke Lanchester&apos;s MCP hello page is the small fix I most enjoyed this week. Suleyman&apos;s eighteen-month claim, and the article that knocks it down The subsidy that ends on June 1 Apple Silicon costs more than OpenRouter Frederick Vanbrabant on why AI doesn&apos;t speed up your process Marlene Mhangami: tests AI writes vs tests AI verifies Calif on the five-day M5 / MIE exploit, in their own words Artem Loenko: native all the way, until you need text Luke Lanchester&apos;s MCP hello page</description>

<content:encoded><![CDATA[<p>A Sunday show about doing your own arithmetic. Mustafa Suleyman gives the white-collar tier eighteen months, in a piece whose own counter-data sits two paragraphs down. The State of Brand argues every AI subscription is a subsidized loss-leader two weeks away from a forcing function. William Angel runs the tokens-per-hour math on an M5 MacBook Pro and finds OpenRouter cheaper. Frederick Vanbrabant uses The Goal to explain why agents move the bottleneck rather than break it. Marlene Mhangami's Playwright talk shows the cleanest pattern for tests AI should write. Calif's public M5 / MIE exploit write-up lands. Artem Loenko explains why every chat UI keeps ending up on a browser engine. And Luke Lanchester's MCP hello page is the small fix I most enjoyed this week.</p><ul><li><a href="https://fortune.com/article/why-microsoft-ai-chief-mustafa-suleyman-predicts-ai-automation-18-months/">Suleyman's eighteen-month claim, and the article that knocks it down</a></li><li><a href="https://www.thestateofbrand.com/news/ai-subscription-time-bomb">The subsidy that ends on June 1</a></li><li><a href="https://www.williamangel.net/blog/2026/05/17/offline-llm-energy-use.html">Apple Silicon costs more than OpenRouter</a></li><li><a href="https://frederickvanbrabant.com/blog/2026-05-15-i-dont-think-ai-will-make-your-processes-go-faster/">Frederick Vanbrabant on why AI doesn't speed up your process</a></li><li><a href="https://www.youtube.com/watch?v=FWEInOtngmM">Marlene Mhangami: tests AI writes vs tests AI verifies</a></li><li><a href="https://blog.calif.io/p/first-public-kernel-memory-corruption">Calif on the five-day M5 / MIE exploit, in their own words</a></li><li><a href="https://justsitandgrin.im/posts/native-all-the-way-until-you-need-text/">Artem Loenko: native all the way, until you need text</a></li><li><a href="https://www.hybridlogic.co.uk/blog/2026/05/mcp-hello-page">Luke Lanchester's MCP hello page</a></li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-17.mp3" length="26674520" type="audio/mpeg" />
<itunes:duration>00:25:59</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-17.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-17.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-17-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-17-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>CTFs, Scrum, and Claude&apos;s Bedtime</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-16.html</link>
<guid isPermaLink="false">2026-05-16</guid>
<pubDate>Sat, 16 May 2026 13:00:15 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. An Australian CTF top-tenner writes the obituary for the open competitive scene. Intercom and PFF both report doubling-and-then-some engineering throughput from agent-first workflows — using opposite playbooks. Supabase ships a skill after watching an agent silently bypass row-level security. A suitcase runs a 4B model fully offline at conversational latency. Julia Evans leaves Tailwind. And Claude keeps telling people to go to sleep. &quot;The CTF scene is dead&quot; — Kabir, former TheHackersCrew player, on agents and the open format Brian Scanlan (Intercom): doubling PR throughput, 17.6% auto-approved, SOC 2 compliant Mike Spitz (PFF): the post-engineer engineering org, two engineers shipping 25x more deploys than ten Pedro Rodrigues (Supabase): skills plus MCP, and the security-invoker flag agents skip without guidance Sparky — a fully offline suitcase robot on a Jetson Orin NX with Gemma 4 E4B Julia Evans: moving away from Tailwind and learning to structure CSS Fortune: Claude is telling users to go to sleep mid-session and Anthropic isn&apos;t sure why @yishan: an LLM hallucinates the Napster 2000 story to someone who was there @Kirsten3531: &quot;LLMs can never be more than the average of their training data&quot; — a 2024 take? Armin Ronacher: running an agent with bash as the only tool, using the patch binary to make edits</description>

<content:encoded><![CDATA[<p>An Australian CTF top-tenner writes the obituary for the open competitive scene. Intercom and PFF both report doubling-and-then-some engineering throughput from agent-first workflows — using opposite playbooks. Supabase ships a skill after watching an agent silently bypass row-level security. A suitcase runs a 4B model fully offline at conversational latency. Julia Evans leaves Tailwind. And Claude keeps telling people to go to sleep.</p>
<ul>
<li><a href="https://kabir.au/blog/the-ctf-scene-is-dead">"The CTF scene is dead" — Kabir, former TheHackersCrew player, on agents and the open format</a></li>
<li><a href="https://www.youtube.com/watch?v=4_VQBbs2iQA">Brian Scanlan (Intercom): doubling PR throughput, 17.6% auto-approved, SOC 2 compliant</a></li>
<li><a href="https://www.youtube.com/watch?v=VMemhtlsoNk">Mike Spitz (PFF): the post-engineer engineering org, two engineers shipping 25x more deploys than ten</a></li>
<li><a href="https://www.youtube.com/watch?v=JT3OzDKrucU">Pedro Rodrigues (Supabase): skills plus MCP, and the security-invoker flag agents skip without guidance</a></li>
<li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1tdz5gr/built_a_fully_offline_suitcase_robot_around_a/">Sparky — a fully offline suitcase robot on a Jetson Orin NX with Gemma 4 E4B</a></li>
<li><a href="https://jvns.ca/blog/2026/05/15/moving-away-from-tailwind--and-learning-to-structure-my-css-/">Julia Evans: moving away from Tailwind and learning to structure CSS</a></li>
<li><a href="https://fortune.com/2026/05/14/why-is-claude-telling-users-to-go-to-sleep-anthropic-ai-sentient/">Fortune: Claude is telling users to go to sleep mid-session and Anthropic isn't sure why</a></li>
<li><a href="https://x.com/yishan/status/2055511928928350493">@yishan: an LLM hallucinates the Napster 2000 story to someone who was there</a></li>
<li><a href="https://x.com/Kirsten3531/status/2055456370955321673">@Kirsten3531: "LLMs can never be more than the average of their training data" — a 2024 take?</a></li>
<li><a href="https://x.com/mitsuhiko/status/2055593093307494413">Armin Ronacher: running an agent with bash as the only tool, using the patch binary to make edits</a></li>
</ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-16.mp3" length="27381079" type="audio/mpeg" />
<itunes:duration>00:28:31</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-16.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-16.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-16-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-16-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Five Days to Root, Four Months in Exile</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-15.html</link>
<guid isPermaLink="false">2026-05-15</guid>
<pubDate>Fri, 15 May 2026 13:00:15 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Five days for a small security team paired with Mythos Preview to land the first public macOS kernel exploit on Apple&apos;s M5 with Memory Integrity Enforcement turned on. Four months for Replit to claw back into the iOS App Store. In between: arXiv starts banning authors of LLM-error papers, Metabase explains why open-source security is being strip-mined this summer, NVIDIA squeezes the 5090, Uncle Bob switches from Claude to Codex, and a pure-OCaml protocol stack boots in low Earth orbit. Codex everywhere, Claude in the rearview — OpenAI ships Codex inside the ChatGPT mobile app, Uncle Bob cancels his Claude account, and Arvind Narayanan names the irony underneath both. Five days to a kernel exploit on M5 — Calif and Mythos Preview crack Apple&apos;s Memory Integrity Enforcement and hand-deliver the 55-page report to Cupertino. The strip-mining era of open source security — Metabase&apos;s security inbox went from ten reports a month to ten a week. Cal.com is going closed source. arXiv bans authors of LLM-error papers — Tom Dietterich announces a one-year submission ban on papers with hallucinated references or results. Replit out of the App Store wilderness — Four months after being pulled, Replit&apos;s iOS app is published again. Replies note what that says about platform power. GDDR7 squeezes the 5090 — A 300-dollar price hike to add-in-card partners as GDDR7 lead times stretch into weeks. The web&apos;s secret quirks file — Den Odell walks through Safari&apos;s Quirks.cpp and Firefox&apos;s about:compat. Chrome doesn&apos;t need a quirks file. OCaml in orbit — Thomas Gazagnaire&apos;s pure-OCaml protocol stack booted in low Earth orbit on April 23, with post-quantum rekeying and OxCaml-tuned dispatch.</description>

<content:encoded><![CDATA[<p>Five days for a small security team paired with Mythos Preview to land the first public macOS kernel exploit on Apple's M5 with Memory Integrity Enforcement turned on. Four months for Replit to claw back into the iOS App Store. In between: arXiv starts banning authors of LLM-error papers, Metabase explains why open-source security is being strip-mined this summer, NVIDIA squeezes the 5090, Uncle Bob switches from Claude to Codex, and a pure-OCaml protocol stack boots in low Earth orbit.</p><ul><li><a href="https://x.com/unclebobmartin/status/2054970327592042661">Codex everywhere, Claude in the rearview</a> — OpenAI ships Codex inside the ChatGPT mobile app, Uncle Bob cancels his Claude account, and Arvind Narayanan names the irony underneath both.</li><li><a href="https://blog.calif.io/p/first-public-kernel-memory-corruption">Five days to a kernel exploit on M5</a> — Calif and Mythos Preview crack Apple's Memory Integrity Enforcement and hand-deliver the 55-page report to Cupertino.</li><li><a href="https://www.metabase.com/blog/strip-mining-era-of-open-source-security">The strip-mining era of open source security</a> — Metabase's security inbox went from ten reports a month to ten a week. Cal.com is going closed source.</li><li><a href="https://www.reddit.com/r/MachineLearning/comments/1tdje2d/arxiv_implements_1year_ban_for_papers_containing/">arXiv bans authors of LLM-error papers</a> — Tom Dietterich announces a one-year submission ban on papers with hallucinated references or results.</li><li><a href="https://x.com/amasad/status/2055185058282226146">Replit out of the App Store wilderness</a> — Four months after being pulled, Replit's iOS app is published again. 
Replies note what that says about platform power.</li><li><a href="https://www.techpowerup.com/349050/nvidia-reportedly-prepares-rtx-5090-price-hike-amid-rising-gddr7-costs">GDDR7 squeezes the 5090</a> — A 300-dollar price hike to add-in-card partners as GDDR7 lead times stretch into weeks.</li><li><a href="https://denodell.com/blog/browsers-treat-big-sites-differently">The web's secret quirks file</a> — Den Odell walks through Safari's Quirks.cpp and Firefox's about:compat. Chrome doesn't need a quirks file.</li><li><a href="https://gazagnaire.org/blog/2026-05-14-borealis.html">OCaml in orbit</a> — Thomas Gazagnaire's pure-OCaml protocol stack booted in low Earth orbit on April 23, with post-quantum rekeying and OxCaml-tuned dispatch.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-15.mp3" length="27074626" type="audio/mpeg" />
<itunes:duration>00:28:12</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-15.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-15.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-15-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-15-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>The Cost of Finding Out</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-14.html</link>
<guid isPermaLink="false">2026-05-14</guid>
<pubDate>Thu, 14 May 2026 13:00:14 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Anthropic drew two lines around Claude this week — a guided lane for small-business owners and a metered one for the developers running agents hardest. From there: Bun&apos;s near-million-line port from Zig to Rust, mostly typed by an AI agent in a week; Wasp&apos;s clear-eyed post-mortem on spending five years and five million dollars building a language it didn&apos;t need; a chess coach that works by refusing to let the model think; the UK&apos;s evaluators capping their own cyber tests so the math still works; the open web pricing out crawlers; multi-token prediction landing in llama.cpp; and what happens when you post a real Monet and call it AI. Anthropic draws two lines — Claude for Small Business and the new Agent SDK credit metering Bun, ported to Rust by a bot in a week — and a maintainer who won&apos;t commit to it Wasp: the language was never the moat — $5M and five years of lessons The chess coach that isn&apos;t allowed to think — Play Magnus on LLM-as-translator Autonomous cyber, measured against itself — AISI on a capability curve outrunning its own ruler The web pulls up its drawbridge — Google&apos;s search index and Cloudflare&apos;s defaults Multi-token prediction on your laptop — a real gain bundled with a contested one The Monet test — when the AI-tell detector fires a false positive</description>

<content:encoded><![CDATA[<p>Anthropic drew two lines around Claude this week — a guided lane for small-business owners and a metered one for the developers running agents hardest. From there: Bun's near-million-line port from Zig to Rust, mostly typed by an AI agent in a week; Wasp's clear-eyed post-mortem on spending five years and five million dollars building a language it didn't need; a chess coach that works by refusing to let the model think; the UK's evaluators capping their own cyber tests so the math still works; the open web pricing out crawlers; multi-token prediction landing in llama.cpp; and what happens when you post a real Monet and call it AI.</p><ul><li><a href="https://www.anthropic.com/news/claude-for-small-business">Anthropic draws two lines</a> — Claude for Small Business and the new Agent SDK credit metering</li><li><a href="https://github.com/oven-sh/bun/pull/30412">Bun, ported to Rust by a bot in a week</a> — and a maintainer who won't commit to it</li><li><a href="https://wasp.sh/blog/2026/05/13/new-language-for-web-dev-was-a-mistake">Wasp: the language was never the moat</a> — $5M and five years of lessons</li><li><a href="https://www.youtube.com/watch?v=FlzpEGHNVKQ">The chess coach that isn't allowed to think</a> — Play Magnus on LLM-as-translator</li><li><a href="https://www.aisi.gov.uk/blog/how-fast-is-autonomous-ai-cyber-capability-advancing">Autonomous cyber, measured against itself</a> — AISI on a capability curve outrunning its own ruler</li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1tcaboi/websearch_is_coming_to_a_screeching_performance/">The web pulls up its drawbridge</a> — Google's search index and Cloudflare's defaults</li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1tckzy2/multitoken_prediction_mtp_for_qwen_on_llamacpp/">Multi-token prediction on your laptop</a> — a real gain bundled with a contested one</li><li><a href="https://x.com/Jediwolf/status/2054776716770320631">The Monet test</a> — when 
the AI-tell detector fires a false positive</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-14.mp3" length="25053870" type="audio/mpeg" />
<itunes:duration>00:26:06</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-14.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-14.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-14-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-14-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Hackbots, Magento, and Three Lines of Logic</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-13.html</link>
<guid isPermaLink="false">2026-05-13</guid>
<pubDate>Wed, 13 May 2026 13:00:15 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. An overnight hackbot run lands a real CVE in Adobe Magento. Codex starts driving local Mac apps in parallel, with per-app permissions and a separate cursor. Cloudflare publishes one of the prettiest debugging writeups of the year — a nine-year-old kernel patch, a 14ms oscillation, three lines of fix. Plus Nous Research&apos;s removable attention wrapper, GPT-5.5&apos;s first ProgramBench solve, Vercel&apos;s argument that giving an agent a file system changes how it behaves, a 26-million-parameter tool-calling model, Isomorphic&apos;s two-billion-dollar Series B, and a Purdue senior who put Rust on his graduation cap. Joseph Thacker on FUZZ-E&apos;s overnight Magento CVE OpenAI&apos;s Codex computer use walkthrough Cloudflare on the CUBIC death-spiral fix Lighthouse Attention from Nous Research ProgramBench: GPT-5.5&apos;s first full solve Nico Albanese on giving your agent a computer Needle: a 26M-parameter tool-calling model Isomorphic Labs Series B announcement Eric Park&apos;s Rust-powered graduation cap Nick Cammarata on capability and self-worth</description>

<content:encoded><![CDATA[<p>An overnight hackbot run lands a real CVE in Adobe Magento. Codex starts driving local Mac apps in parallel, with per-app permissions and a separate cursor. Cloudflare publishes one of the prettiest debugging writeups of the year — a nine-year-old kernel patch, a 14ms oscillation, three lines of fix. Plus Nous Research's removable attention wrapper, GPT-5.5's first ProgramBench solve, Vercel's argument that giving an agent a file system changes how it behaves, a 26-million-parameter tool-calling model, Isomorphic's two-billion-dollar Series B, and a Purdue senior who put Rust on his graduation cap.</p><ul><li><a href="https://x.com/rez0__/status/2054539643912077351">Joseph Thacker on FUZZ-E's overnight Magento CVE</a></li><li><a href="https://www.youtube.com/watch?v=D_FCYsshMI4">OpenAI's Codex computer use walkthrough</a></li><li><a href="https://blog.cloudflare.com/quic-death-spiral-fix/">Cloudflare on the CUBIC death-spiral fix</a></li><li><a href="https://x.com/omarsar0/status/2054224130103554359">Lighthouse Attention from Nous Research</a></li><li><a href="https://programbench.com/blog/gpt-5-5-first-solve/">ProgramBench: GPT-5.5's first full solve</a></li><li><a href="https://www.youtube.com/watch?v=wflNENRSUb4">Nico Albanese on giving your agent a computer</a></li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1tb9b0r/needle_we_distilled_gemini_tool_calling_into_a/">Needle: a 26M-parameter tool-calling model</a></li><li><a href="https://www.isomorphiclabs.com/articles/isomorphic-labs-announces-series-b-investment-round">Isomorphic Labs Series B announcement</a></li><li><a href="https://ericswpark.com/blog/2026/2026-05-12-my-graduation-cap-runs-rust/">Eric Park's Rust-powered graduation cap</a></li><li><a href="https://x.com/nickcammarata/status/2054492840668123548">Nick Cammarata on capability and self-worth</a></li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-13.mp3" length="28715632" type="audio/mpeg" />
<itunes:duration>00:29:55</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-13.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-13.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-13-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-13-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>When Your Editor Becomes the Worm</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-12.html</link>
<guid isPermaLink="false">2026-05-12</guid>
<pubDate>Tue, 12 May 2026 13:00:18 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. A coordinated npm and PyPI campaign turned Claude Code and VS Code config files into a self-spreading vector, Mira Murati&apos;s lab put out its first model and it is an argument with the hands-off-keyboard doctrine, and matklad explains why rust-analyzer&apos;s build system is really an org chart. Plus a small rant about cursors, and two builds from the LocalLLaMA subreddit that keep pushing the local-frontier line by hand. The mini-shai-hulud campaign — 170 packages, 404 versions, and a self-spreading IDE payload · Thinking Machines ships TML-Interaction-Small and picks a fight with the autonomous-agent doctrine · matklad on architecture, incentives, and the build system as a social filter · Don&apos;t hijack the mouse pointer — and what it tells us about cheap effects · A 1-trillion-parameter Kimi K2.5 running on Intel Optane DIMMs at 4 tokens a second · Unsloth releases Qwen3.6 with the multi-token prediction layer preserved</description>

<content:encoded><![CDATA[<p>A coordinated npm and PyPI campaign turned Claude Code and VS Code config files into a self-spreading vector, Mira Murati's lab put out its first model and it is an argument with the hands-off-keyboard doctrine, and matklad explains why rust-analyzer's build system is really an org chart. Plus a small rant about cursors, and two builds from the LocalLLaMA subreddit that keep pushing the local-frontier line by hand.</p><ul><li><a href="https://safedep.io/mass-npm-supply-chain-attack-tanstack-mistral/">The mini-shai-hulud campaign — 170 packages, 404 versions, and a self-spreading IDE payload</a></li><li><a href="https://thinkingmachines.ai/blog/interaction-models/">Thinking Machines ships TML-Interaction-Small and picks a fight with the autonomous-agent doctrine</a></li><li><a href="https://matklad.github.io/2026/05/12/software-architecture.html">matklad on architecture, incentives, and the build system as a social filter</a></li><li><a href="https://ruky.me/dont-hijack-my-pointer/">Don't hijack the mouse pointer — and what it tells us about cheap effects</a></li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1taeg8h/computer_build_using_intel_optane_persistent/">A 1-trillion-parameter Kimi K2.5 running on Intel Optane DIMMs at 4 tokens a second</a></li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1ta4rvs/mtp_on_unsloth/">Unsloth releases Qwen3.6 with the multi-token prediction layer preserved</a></li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-12.mp3" length="21683041" type="audio/mpeg" />
<itunes:duration>00:22:35</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-12.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-12.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-12-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-12-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Deployment, Discovery, and the Code You Keep</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-11.html</link>
<guid isPermaLink="false">2026-05-11</guid>
<pubDate>Mon, 11 May 2026 13:45:09 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Today’s Braid starts with OpenAI launching a majority-owned Deployment Company, backed by a Tomoro acquisition, about 150 forward deployed engineers, nineteen partners, and more than $4 billion of initial investment. The practical thread is the work of changing real systems: integration, controls, measurement, and the code you still have to maintain after the demo.OpenAI turns deployment into a company, with Tomoro, TPG, consultancies, integrators, and OpenAI’s launch post pointing at a bigger bet on embedded engineering.Mythos finds one curl vulnerability, while Rival Security complicates Anthropic’s FreeBSD story with a training-data provenance question.James Shore’s maintenance-cost math meets the k10s devlog about archiving a seven-month AI-built Kubernetes TUI.Trigger.dev’s durable-agent talk and Arize’s context-management talk give the backend version of the same lesson.Granola’s production loop, a tiny boolean-argument essay, and MLX on-device demos close the day on builder craft.</description>

<content:encoded><![CDATA[<p>Today’s Braid starts with OpenAI launching a majority-owned Deployment Company, backed by a Tomoro acquisition, about 150 forward deployed engineers, nineteen partners, and more than $4 billion of initial investment. The practical thread is the work of changing real systems: integration, controls, measurement, and the code you still have to maintain after the demo.</p><ul><li><a href="https://openai.com/index/openai-launches-the-deployment-company/">OpenAI turns deployment into a company</a>, with Tomoro, TPG, consultancies, integrators, and <a href="https://x.com/OpenAI/status/2053824997777457651">OpenAI’s launch post</a> pointing at a bigger bet on embedded engineering.</li><li><a href="https://daniel.haxx.se/blog/2026/05/11/mythos-finds-a-curl-vulnerability/">Mythos finds one curl vulnerability</a>, while <a href="https://rival.security/posts/mythos-discovered-a-cve-already-in-its-training-data---and-thats-still-worrying">Rival Security</a> complicates Anthropic’s FreeBSD story with a training-data provenance question.</li><li><a href="https://www.jamesshore.com/v2/blog/2026/you-need-ai-that-reduces-your-maintenance-costs">James Shore’s maintenance-cost math</a> meets the <a href="https://blog.k10s.dev/im-going-back-to-writing-code-by-hand/">k10s devlog</a> about archiving a seven-month AI-built Kubernetes TUI.</li><li><a href="https://www.youtube.com/watch?v=svCnShDvgQg">Trigger.dev’s durable-agent talk</a> and <a href="https://www.youtube.com/watch?v=esY99nYXxR4">Arize’s context-management talk</a> give the backend version of the same lesson.</li><li><a href="https://www.youtube.com/watch?v=ON5LIT0M4do">Granola’s production loop</a>, a <a href="https://allthingssmitty.com/2026/05/11/i-keep-tripping-over-true-false-true/">tiny boolean-argument essay</a>, and <a href="https://www.youtube.com/watch?v=zTLJNHj0DeQ">MLX on-device demos</a> close the day on builder craft.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-11.mp3" length="29416969" type="audio/mpeg" />
<itunes:duration>00:30:38</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-11.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-11.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-11-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-11-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Seventeen Hours, Three Sizes, and the Prompt Boundary</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-10.html</link>
<guid isPermaLink="false">2026-05-10</guid>
<pubDate>Sun, 10 May 2026 13:30:15 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. METR publishes a fresh time-horizon number for Claude Mythos Preview, and yesterday&apos;s follow-up gets paid off in a single chart. NVIDIA ships a checkpoint that contains three reasoning models at once. antirez gets DeepSeek 4 running on a DGX Spark and tells you exactly where the bandwidth wall lives. François Chollet argues that agentic coding is a form of machine learning, and a few replies actually push the idea further. Plus the diffusion gap, the German tokenizer tax, and a Gemma 4 drafter that buys you a third of your decode time back. METR&apos;s 17-hour 50% time horizon · NVIDIA Star Elastic and the one-checkpoint, three-models trick · DeepSeek 4 on DGX Spark — 12 t/s, 270 GB/sec wall · Chollet: agentic coding is machine learning · Elad Gil&apos;s diffusion-gap map · Gemma 4 multi-token prediction on M5 Max</description>

<content:encoded><![CDATA[<p>METR publishes a fresh time-horizon number for Claude Mythos Preview, and yesterday's follow-up gets paid off in a single chart. NVIDIA ships a checkpoint that contains three reasoning models at once. antirez gets DeepSeek 4 running on a DGX Spark and tells you exactly where the bandwidth wall lives. François Chollet argues that agentic coding is a form of machine learning, and a few replies actually push the idea further. Plus the diffusion gap, the German tokenizer tax, and a Gemma 4 drafter that buys you a third of your decode time back.</p><ul><li><a href="https://www.reddit.com/r/singularity/comments/1t92jf5">METR's 17-hour 50% time horizon</a></li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1t8s83r">NVIDIA Star Elastic and the one-checkpoint, three-models trick</a></li><li><a href="https://x.com/antirez/status/2053381973226184749">DeepSeek 4 on DGX Spark — 12 t/s, 270 GB/sec wall</a></li><li><a href="https://x.com/fchollet/status/2053234697392754701">Chollet: agentic coding is machine learning</a></li><li><a href="https://x.com/eladgil/status/2053206351158091819">Elad Gil's diffusion-gap map</a></li><li><a href="https://x.com/adrgrondin/status/2053198336312689103">Gemma 4 multi-token prediction on M5 Max</a></li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-10.mp3" length="23587701" type="audio/mpeg" />
<itunes:duration>00:24:34</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-10.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-10.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-10-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-10-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>A Fields Medalist, a PhD Chapter, and the Week the Bar Moved</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-09.html</link>
<guid isPermaLink="false">2026-05-09</guid>
<pubDate>Sat, 09 May 2026 13:30:16 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. A Saturday show that leans into the long reads. Tim Gowers — yes, the Fields Medalist — sat down with ChatGPT 5.5 Pro and an open paper from Mel Nathanson and walked away with a result the original author called &quot;original and clever.&quot; We follow that thread, then turn to Mozilla&apos;s deeper write-up on the Firefox 271-bug release, Jeff Kaufman on what AI is doing to disclosure embargoes, Anthropic on why constitution training beats demonstration training, and a beautiful pentest story about a critical RCE in React itself. Plus a quieter set of items: Codex in real Chrome, DHH&apos;s Copilot review hit-rate jump, a SysMoBench paper on LLM-generated TLA+ specs, AI2&apos;s document-routed mixture-of-experts model, and Qwen 35B-A3B running on a 3060. Tim Gowers — A recent experience with ChatGPT 5.5 Pro · Mozilla — Behind the Scenes Hardening Firefox · Jeff Kaufman — AI is Breaking Two Vulnerability Cultures · Anthropic — Teaching Claude why · Lachlan — The React2Shell Story (CVE-2025-55182) · Specula team — Can LLMs model real-world systems in TLA+? · OpenAI — Codex in Chrome on macOS and Windows · DHH — Copilot review hit ratio 1/10 to 7/10 · r/LocalLLaMA — Qwen 35B-A3B on 12GB VRAM · r/LocalLLaMA — AI2 EMO MoE with document-level routing · METR — Claude Mythos Preview time-horizon evaluation</description>

<content:encoded><![CDATA[<p>A Saturday show that leans into the long reads. Tim Gowers — yes, the Fields Medalist — sat down with ChatGPT 5.5 Pro and an open paper from Mel Nathanson and walked away with a result the original author called &quot;original and clever.&quot; We follow that thread, then turn to Mozilla's deeper write-up on the Firefox 271-bug release, Jeff Kaufman on what AI is doing to disclosure embargoes, Anthropic on why constitution training beats demonstration training, and a beautiful pentest story about a critical RCE in React itself. Plus a quieter set of items: Codex in real Chrome, DHH's Copilot review hit-rate jump, a SysMoBench paper on LLM-generated TLA+ specs, AI2's document-routed mixture-of-experts model, and Qwen 35B-A3B running on a 3060.</p><ul><li><a href="https://gowers.wordpress.com/2026/05/08/a-recent-experience-with-chatgpt-5-5-pro/">Tim Gowers — A recent experience with ChatGPT 5.5 Pro</a></li><li><a href="https://hacks.mozilla.org/2026/05/behind-the-scenes-hardening-firefox/">Mozilla — Behind the Scenes Hardening Firefox</a></li><li><a href="https://www.jefftk.com/p/ai-is-breaking-two-vulnerability-cultures">Jeff Kaufman — AI is Breaking Two Vulnerability Cultures</a></li><li><a href="https://www.anthropic.com/research/teaching-claude-why">Anthropic — Teaching Claude why</a></li><li><a href="https://lachlan.nz/blog/the-react2shell-story/">Lachlan — The React2Shell Story (CVE-2025-55182)</a></li><li><a href="https://www.sigops.org/2026/can-llms-model-real-world-systems-in-tla/">Specula team — Can LLMs model real-world systems in TLA+?</a></li><li><a href="https://www.youtube.com/watch?v=b6Mxcv1pyBU">OpenAI — Codex in Chrome on macOS and Windows</a></li><li><a href="https://x.com/dhh/status/2053088652322869472">DHH — Copilot review hit ratio 1/10 to 7/10</a></li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1t7l56a/qwen_35ba3b_is_very_usable_with_12gb_of_vram/">r/LocalLLaMA — Qwen 35B-A3B on 12GB 
VRAM</a></li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1t7kgy4/new_moe_from_ai2_emo/">r/LocalLLaMA — AI2 EMO MoE with document-level routing</a></li><li><a href="https://www.reddit.com/r/singularity/comments/1t7pqpr/metr_evaluated_an_early_version_of_claude_mythos/">METR — Claude Mythos Preview time-horizon evaluation</a></li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-09.mp3" length="30106618" type="audio/mpeg" />
<itunes:duration>00:31:21</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-09.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-09.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-09-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-09-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Mozilla&apos;s 271 Bugs, Chrome&apos;s 4 Gigabytes, and a WebRTC Veteran Telling OpenAI to Stop</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-08.html</link>
<guid isPermaLink="false">2026-05-08</guid>
<pubDate>Fri, 08 May 2026 13:30:14 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Mozilla publishes the long-form on how a Claude Mythos Preview harness found 271 security bugs in Firefox, including sandbox escapes that fuzzers missed for twenty years. A European privacy lawyer goes byte-precise on Chrome&apos;s silent four-gigabyte Gemini Nano push, using kernel filesystem events on a profile that received zero human input. A WebRTC veteran tells OpenAI, on the day it ships GPT-Realtime-2, that the protocol assumptions are wrong for voice agents. Plus AlphaEvolve&apos;s twelve concrete production deployments, Anthropic&apos;s natural-language autoencoders putting a number on Claude&apos;s evaluation awareness, AMD&apos;s first new Instinct PCIe card in five years, and OpenAI quietly winding down the fine-tuning API. Mozilla on hardening Firefox with Claude Mythos Preview · Alexander Hanff on Chrome&apos;s silent 4 GB Gemini Nano install · Luke Curley: OpenAI&apos;s WebRTC Problem · OpenAI&apos;s three new audio models in the API · DeepMind on AlphaEvolve&apos;s first year of impact · Anthropic&apos;s Natural Language Autoencoders · AMD Instinct MI350P PCIe card · Skymizer HTX301 on-prem inference card · OpenAI winding down the fine-tuning API · EU AI Act Article 50 transparency consultation · Xe Iaso: maybe you shouldn&apos;t install new software for a bit · Multi-Token Prediction lands in LLaMA.cpp for Gemma 4 · Open-OSS/privacy-filter infostealer on Hugging Face</description>

<content:encoded><![CDATA[<p>Mozilla publishes the long-form on how a Claude Mythos Preview harness found 271 security bugs in Firefox, including sandbox escapes that fuzzers missed for twenty years. A European privacy lawyer goes byte-precise on Chrome's silent four-gigabyte Gemini Nano push, using kernel filesystem events on a profile that received zero human input. A WebRTC veteran tells OpenAI, on the day it ships GPT-Realtime-2, that the protocol assumptions are wrong for voice agents. Plus AlphaEvolve's twelve concrete production deployments, Anthropic's natural-language autoencoders putting a number on Claude's evaluation awareness, AMD's first new Instinct PCIe card in five years, and OpenAI quietly winding down the fine-tuning API.</p><ul><li><a href="https://hacks.mozilla.org/2026/05/behind-the-scenes-hardening-firefox/">Mozilla on hardening Firefox with Claude Mythos Preview</a></li><li><a href="https://www.thatprivacyguy.com/blog/chrome-silent-nano-install/">Alexander Hanff on Chrome's silent 4 GB Gemini Nano install</a></li><li><a href="https://moq.dev/blog/webrtc-is-the-problem/">Luke Curley: OpenAI's WebRTC Problem</a></li><li><a href="https://openai.com/index/advancing-voice-intelligence-with-new-models-in-the-api/">OpenAI's three new audio models in the API</a></li><li><a href="https://deepmind.google/blog/alphaevolve-impact/">DeepMind on AlphaEvolve's first year of impact</a></li><li><a href="https://www.anthropic.com/research/natural-language-autoencoders">Anthropic's Natural Language Autoencoders</a></li><li><a href="https://www.servethehome.com/amd-intros-instinct-mi350p-accelerator-cdna-4-comes-to-pcie-cards/">AMD Instinct MI350P PCIe card</a></li><li><a href="https://skymizer.ai/skymizer-announces-htx301-reinventing-on-prem-ai-inference/">Skymizer HTX301 on-prem inference card</a></li><li><a href="https://www.reddit.com/r/OpenAI/comments/1t6sisf/openai_has_announced_they_will_be_winding_down/">OpenAI winding down the fine-tuning 
API</a></li><li><a href="https://digital-strategy.ec.europa.eu/en/consultations/consultation-draft-guidelines-transparency-obligations-under-ai-act">EU AI Act Article 50 transparency consultation</a></li><li><a href="https://xeiaso.net/blog/2026/abstain-from-install/">Xe Iaso: maybe you shouldn't install new software for a bit</a></li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1t6se6r/multitoken_prediction_mtp_for_llamacpp_gemma_4/">Multi-Token Prediction lands in LLaMA.cpp for Gemma 4</a></li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1t6febk/warning_openossprivacyfilter_malware/">Open-OSS/privacy-filter infostealer on Hugging Face</a></li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-08.mp3" length="28965101" type="audio/mpeg" />
<itunes:duration>00:30:10</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-08.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-08.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-08-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-08-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>The File That Wouldn&apos;t Read</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-07.html</link>
<guid isPermaLink="false">2026-05-07</guid>
<pubDate>Thu, 07 May 2026 13:30:15 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Thursday, May 7. The GPT-5.5 default swap is two days old and the cracks are showing — Mario Zechner caught it refusing to read full files. Subquadratic announced a 12-million-token context window with sub-quadratic attention; the benchmarks are real, the deployment story isn&apos;t yet. Zyphra trained ZAYA1-8B end-to-end on AMD MI300x and the loss curves are clean. Three new agent papers landed: Terminus-4B for subagent terminal execution, MOSAIC-Bench on compositional vulnerabilities, and the Workspace-Bench / ProgramBench double-release on what happens when you give an agent twenty thousand files. Google Cloud shipped Fraud Defense with QR-code human verification. Anthropic posted their three priorities. GPT-5.5 won&apos;t read your whole file · Subquadratic&apos;s 12-million-token claim · ZAYA1-8B and the AMD training stack · Terminus-4B and the subagent shape · MOSAIC-Bench: compositional vulnerabilities · Workspace-Bench and ProgramBench, together · Fraud Defense and the QR-code handshake · Anthropic&apos;s three priorities. — Lenar</description>

<content:encoded><![CDATA[<p>Thursday, May 7. The GPT-5.5 default swap is two days old and the cracks are showing — Mario Zechner caught it refusing to read full files. Subquadratic announced a 12-million-token context window with sub-quadratic attention; the benchmarks are real, the deployment story isn't yet. Zyphra trained ZAYA1-8B end-to-end on AMD MI300x and the loss curves are clean. Three new agent papers landed: Terminus-4B for subagent terminal execution, MOSAIC-Bench on compositional vulnerabilities, and the Workspace-Bench / ProgramBench double-release on what happens when you give an agent twenty thousand files. Google Cloud shipped Fraud Defense with QR-code human verification. Anthropic posted their three priorities.</p>

<ul>
<li>GPT-5.5 won't read your whole file</li>
<li>Subquadratic's 12-million-token claim</li>
<li>ZAYA1-8B and the AMD training stack</li>
<li>Terminus-4B and the subagent shape</li>
<li>MOSAIC-Bench: compositional vulnerabilities</li>
<li>Workspace-Bench and ProgramBench, together</li>
<li>Fraud Defense and the QR-code handshake</li>
<li>Anthropic's three priorities</li>
</ul>

<p>— Lenar</p>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-07.mp3" length="27254849" type="audio/mpeg" />
<itunes:duration>00:28:23</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-07.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-07.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-07-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-07-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Agents Buy Domains, Gemma Ships Drafters, and Local Catches Up to 65 Percent of the Job</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-06.html</link>
<guid isPermaLink="false">2026-05-06</guid>
<pubDate>Wed, 06 May 2026 13:30:17 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Agents can now sign up for Cloudflare and buy a domain through a tokenized payment protocol Cloudflare and Stripe co-designed. Google ships first-party multi-token prediction drafters for the entire Gemma 4 family the same week the LocalLLaMA community gets a 2.5x speedup on Qwen 3.6 27B from a hand-built llama.cpp branch. OpenAI swaps the ChatGPT default to GPT-5.5 Instant. NVIDIA, Microsoft, and OpenAI publish MRC, the multipath transport protocol behind Blackwell-era frontier training. And on the labor side, Dario Amodei trades his white-collar bloodbath line for the Jevons Paradox onstage with Jamie Dimon. Cloudflare and Stripe ship the agent-as-customer protocol · NVIDIA, Microsoft, and OpenAI publish MRC as an open OCP spec · Google releases MTP drafters for the Gemma 4 family · Community reports 2.5x and 100 tok/s on Qwen 3.6 27B with MTP · GPT-5.5 Instant becomes the ChatGPT default · Dario Amodei pivots from bloodbath to Jevons · A clean indirect prompt-injection screenshot from r/ClaudeAI · Trajectory-level fraud detection for LLM agents · Agentic safety as interaction topology, not model alignment · Agents as marginal token allocators · ProgramBench: rebuilding 200 binaries from scratch · A measured 65/20/15 cloud-vs-local routing rule · François Fleuret on what LLMs are still missing · Telus Digital alters call-centre agent accents in real time · Write some software, give it away for free</description>

<content:encoded><![CDATA[<p>Agents can now sign up for Cloudflare and buy a domain through a tokenized payment protocol Cloudflare and Stripe co-designed. Google ships first-party multi-token prediction drafters for the entire Gemma 4 family the same week the LocalLLaMA community gets a 2.5x speedup on Qwen 3.6 27B from a hand-built llama.cpp branch. OpenAI swaps the ChatGPT default to GPT-5.5 Instant. NVIDIA, Microsoft, and OpenAI publish MRC, the multipath transport protocol behind Blackwell-era frontier training. And on the labor side, Dario Amodei trades his white-collar bloodbath line for the Jevons Paradox onstage with Jamie Dimon.</p><ul><li><a href="https://blog.cloudflare.com/agents-stripe-projects/">Cloudflare and Stripe ship the agent-as-customer protocol</a></li><li><a href="https://blogs.nvidia.com/blog/spectrum-x-ethernet-mrc/">NVIDIA, Microsoft, and OpenAI publish MRC as an open OCP spec</a></li><li><a href="https://blog.google/innovation-and-ai/technology/developers-tools/multi-token-prediction-gemma-4/">Google releases MTP drafters for the Gemma 4 family</a></li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1t57xuu/25x_faster_inference_with_qwen_36_27b_using_mtp/">Community reports 2.5x and 100 tok/s on Qwen 3.6 27B with MTP</a></li><li><a href="https://indianexpress.com/article/technology/artificial-intelligence/openai-rolls-out-gpt-5-5-instant-with-improved-accuracy-sets-it-as-chatgpt-default-10675229/">GPT-5.5 Instant becomes the ChatGPT default</a></li><li><a href="https://fortune.com/2026/05/05/dario-amodei-jevons-paradox-will-ai-wipe-out-white-collar-jobs/">Dario Amodei pivots from bloodbath to Jevons</a></li><li><a href="https://www.reddit.com/r/ClaudeAI/comments/1t56zqw/prompt_injection_experience_my_first_time_ever/">A clean indirect prompt-injection screenshot from r/ClaudeAI</a></li><li><a href="https://arxiv.org/abs/2605.01143">Trajectory-level fraud detection for LLM agents</a></li><li><a 
href="https://arxiv.org/abs/2605.01147">Agentic safety as interaction topology, not model alignment</a></li><li><a href="https://arxiv.org/abs/2605.01214">Agents as marginal token allocators</a></li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1t4j4s9/programbench_can_we_really_rebuild_huge_binaries/">ProgramBench: rebuilding 200 binaries from scratch</a></li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1t4s6g2/deepseek_v4_being_17x_cheaper_got_me_to_actually/">A measured 65/20/15 cloud-vs-local routing rule</a></li><li><a href="https://x.com/francoisfleuret/status/2051928896027693479">François Fleuret on what LLMs are still missing</a></li><li><a href="https://letsdatascience.com/news/telus-uses-ai-to-alter-call-agent-accents-a3868f63">Telus Digital alters call-centre agent accents in real time</a></li><li><a href="https://nonogra.ph/write-some-software-give-it-away-for-free-05-05-2026">Write some software, give it away for free</a></li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-06.mp3" length="28839715" type="audio/mpeg" />
<itunes:duration>00:30:02</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-06.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-06.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-06-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-06-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>VS Code Walks It Back, CAISI Signs Three Labs, and the Frontier Gap Compresses to Ten Weeks</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-05.html</link>
<guid isPermaLink="false">2026-05-05</guid>
<pubDate>Tue, 05 May 2026 13:58:52 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Microsoft reverses the Co-Authored-by Copilot default it shipped last week, and that turns out to be one of three pieces of governance news today — alongside CAISI signing pre-deployment testing agreements with Google DeepMind, Microsoft, and xAI, and DeepMind&apos;s London staff voting 98% to unionize over military contracts. Then we go where the actual code lives: DeepSeek V4 Pro matching GPT-5.2 ten weeks later at one-seventeenth the price, a Qwen3.6 27B FP8 recipe that fits 200K tokens of unquantized KV cache on a single 48GB card, and a paper called AgentFloor that gives the small-model-routing intuition a benchmark to point at. Plus the tool-use tax, Chrome&apos;s silent four-gigabyte install, vibevoice.cpp, the Opus 4.7 complaint thread, and a B2B operator who replaced three vendors with a single Claude skill. VS Code reverts the Copilot co-author default · CAISI signs Google DeepMind, Microsoft, and xAI · DeepMind workers vote to unionize · DeepSeek V4 Pro on FoodTruck Bench · Qwen3.6 27B FP8 with 200K BF16 KV on a single 48GB card · AgentFloor: how far up the tool-use ladder small open-weight models can go · The tool-use tax in LLM agents · Chrome silently installs a 4GB Gemini Nano · When everyone has AI and the company still learns nothing · vibevoice.cpp ports Microsoft VibeVoice to ggml · The Opus 4.7 regression thread · Replacing a 5-step lead enrichment chain with a Claude skill</description>

<content:encoded><![CDATA[<p>Microsoft reverses the Co-Authored-by Copilot default it shipped last week, and that turns out to be one of three pieces of governance news today — alongside CAISI signing pre-deployment testing agreements with Google DeepMind, Microsoft, and xAI, and DeepMind's London staff voting 98% to unionize over military contracts. Then we go where the actual code lives: DeepSeek V4 Pro matching GPT-5.2 ten weeks later at one-seventeenth the price, a Qwen3.6 27B FP8 recipe that fits 200K tokens of unquantized KV cache on a single 48GB card, and a paper called AgentFloor that gives the small-model-routing intuition a benchmark to point at. Plus the tool-use tax, Chrome's silent four-gigabyte install, vibevoice.cpp, the Opus 4.7 complaint thread, and a B2B operator who replaced three vendors with a single Claude skill.</p>
<ul>
<li><a href="https://github.com/microsoft/vscode/issues/314311">VS Code reverts the Copilot co-author default</a></li>
<li><a href="https://www.nist.gov/news-events/news/2026/05/caisi-signs-agreements-regarding-frontier-ai-national-security-testing">CAISI signs Google DeepMind, Microsoft, and xAI</a></li>
<li><a href="https://www.theverge.com/tech/923918/google-deepmind-union-bid-ai-military-israel">DeepMind workers vote to unionize</a></li>
<li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1t47qbw/deepseek_v4_pro_matches_gpt52_on_foodtruck_bench/">DeepSeek V4 Pro on FoodTruck Bench</a></li>
<li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1t46klu/qwen36_27b_fp8_runs_with_200k_tokens_of_bf16_kv/">Qwen3.6 27B FP8 with 200K BF16 KV on a single 48GB card</a></li>
<li><a href="https://arxiv.org/abs/2605.00334">AgentFloor: how far up the tool-use ladder small open-weight models can go</a></li>
<li><a href="https://arxiv.org/abs/2605.00136">The tool-use tax in LLM agents</a></li>
<li><a href="https://www.thatprivacyguy.com/blog/chrome-silent-nano-install/">Chrome silently installs a 4GB Gemini Nano</a></li>
<li><a href="https://www.robert-glaser.de/when-everyone-has-ai-and-the-company-still-learns-nothing/">When everyone has AI and the company still learns nothing</a></li>
<li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1t48fkt/vibevoicecpp_microsoft_vibevoice_tts_longform_asr/">vibevoice.cpp ports Microsoft VibeVoice to ggml</a></li>
<li><a href="https://www.reddit.com/r/Anthropic/comments/1t3onwr/opus_47_is_beyond_bad/">The Opus 4.7 regression thread</a></li>
<li><a href="https://www.reddit.com/r/ClaudeAI/comments/1t47h53/i_replaced_a_5step_lead_enrichment_workflow_with/">Replacing a 5-step lead enrichment chain with a Claude skill</a></li>
</ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-05.mp3" length="27260347" type="audio/mpeg" />
<itunes:duration>00:28:24</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-05.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-05.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-05-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-05-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>The Paradox of Supervision, a Four-Line Vendor Swap, and the Chart Its Authors Don&apos;t Trust</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-04.html</link>
<guid isPermaLink="false">2026-05-04</guid>
<pubDate>Mon, 04 May 2026 14:26:42 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. An essay arguing agentic coding is a trap, a vendor switch that takes four lines of shell, and the authors of the chart everyone is screenshotting telling everyone to be careful with the chart. Today&apos;s Braid is mostly about the developer&apos;s side of the AI conversation — the workflows, the cost lines, the harness, and what happens when a customer asks for a HIPAA BAA. Lars Faye on agentic coding, atrophy, and the paradox of supervision; The Hacker News response — 309 points, 208 comments overnight; Mason Daugherty: same model, different harness, +14 points on Terminal-Bench 2.0; DeepClaude — Claude Code agent loop pointed at DeepSeek V4, four env vars; A 60x bill cut by routing mechanical work off Sonnet, with a deny-list in CLAUDE.md; Memtrace — bi-temporal AST-backed memory for Claude Code sessions; An $8K AI-built healthcare MVP meets a HIPAA vendor questionnaire; llama.cpp MTP support lands in beta — closing the gap to vLLM on token gen; Qwen 3.6 35B-A3B at 23 tokens/sec on a five-year-old 6GB laptop; Gemma 4 chat-template fix — time to refresh your GGUFs; Beth Barnes and David Rein on why their own time-horizons chart is a mirage; IBM&apos;s MAMMAL beats AlphaFold 3 on antibody-antigen binding and eight other biological benchmarks</description>

<content:encoded><![CDATA[<p>An essay arguing agentic coding is a trap, a vendor switch that takes four lines of shell, and the authors of the chart everyone is screenshotting telling everyone to be careful with the chart. Today's Braid is mostly about the developer's side of the AI conversation — the workflows, the cost lines, the harness, and what happens when a customer asks for a HIPAA BAA.</p><ul><li><a href="https://larsfaye.com/articles/agentic-coding-is-a-trap">Lars Faye on agentic coding, atrophy, and the paradox of supervision</a></li><li><a href="https://news.ycombinator.com/item?id=48002442">The Hacker News response — 309 points, 208 comments overnight</a></li><li><a href="https://x.com/masondrxy/status/2051016743905305007">Mason Daugherty: same model, different harness, +14 points on Terminal-Bench 2.0</a></li><li><a href="https://github.com/aattaran/deepclaude">DeepClaude — Claude Code agent loop pointed at DeepSeek V4, four env vars</a></li><li><a href="https://www.reddit.com/r/ClaudeAI/comments/1t3elab/most_of_my_claude_usage_was_on_work_that_didnt/">A 60x bill cut by routing mechanical work off Sonnet, with a deny-list in CLAUDE.md</a></li><li><a href="https://github.com/syncable-dev/memtrace-public">Memtrace — bi-temporal AST-backed memory for Claude Code sessions</a></li><li><a href="https://www.reddit.com/r/AI_Agents/comments/1t301bx/a_founder_paid_8k_for_an_aibuilt_healthcare_mvp/">An $8K AI-built healthcare MVP meets a HIPAA vendor questionnaire</a></li><li><a href="https://github.com/ggml-org/llama.cpp/pull/22673">llama.cpp MTP support lands in beta — closing the gap to vLLM on token gen</a></li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1t2zapy/pushing_a_5yearold_6gb_vram_laptop_to_its_limits/">Qwen 3.6 35B-A3B at 23 tokens/sec on a five-year-old 6GB laptop</a></li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1t3dfvp/its_time_to_update_your_gemma_4_ggufs/">Gemma 4 chat-template fix — time to refresh your 
GGUFs</a></li><li><a href="https://www.youtube.com/watch?v=zSAGzfspuDE">Beth Barnes and David Rein on why their own time-horizons chart is a mirage</a></li><li><a href="https://www.reddit.com/r/singularity/comments/1t3e91i/ibm_research_introduces_mammal_a_multimodal_model/">IBM's MAMMAL beats AlphaFold 3 on antibody-antigen binding and eight other biological benchmarks</a></li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-04.mp3" length="26579906" type="audio/mpeg" />
<itunes:duration>00:27:41</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-04.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-04.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-04-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-04-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>The Co-Author You Didn&apos;t Sign, Two Million Lines of Haskell, and the Bug Curve That Won&apos;t Bend</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-03.html</link>
<guid isPermaLink="false">2026-05-03</guid>
<pubDate>Sun, 03 May 2026 13:30:16 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Microsoft quietly flipped a default in VS Code that stamps every git commit with a Copilot co-author trailer whether or not Copilot wrote any of it, and the developer reaction is the loudest the project has seen in years. Underneath the noise: a real provenance question about what git authorship is supposed to mean. Plus a long-form report from Mercury on running two million lines of Haskell in production, an opinionated architecture for shared agent harnesses, a YAML-first take on spec-driven development, Daniel Stenberg&apos;s empirical test for whether AI bug-finders are actually moving the curve, the Klarna intent gap, a homelab benchmark that says the chain-of-thought trace is doing real work, the Anthropic-passed-OpenAI claim, and software engineering job postings hitting a multi-year high. VS Code PR #310226 — Copilot co-author by default; Ian Duncan — A Couple Million Lines of Haskell at Mercury; Andrea Luzzardi — The Agent Harness Belongs Outside the Sandbox; acai.sh — Specsmaxxing and the case for ACIDs; Daniel Stenberg — Approaching zero bugs?; Nate Jones — Klarna saved $60M and broke its company; LocalLLaMA — Qwen 3.6-27B vs Coder-Next; r/OpenAI — Anthropic passed OpenAI in valuation and revenue; r/singularity — SWE postings hit highest level since November 2023</description>

<content:encoded><![CDATA[<p>Microsoft quietly flipped a default in VS Code that stamps every git commit with a Copilot co-author trailer whether or not Copilot wrote any of it, and the developer reaction is the loudest the project has seen in years. Underneath the noise: a real provenance question about what git authorship is supposed to mean. Plus a long-form report from Mercury on running two million lines of Haskell in production, an opinionated architecture for shared agent harnesses, a YAML-first take on spec-driven development, Daniel Stenberg's empirical test for whether AI bug-finders are actually moving the curve, the Klarna intent gap, a homelab benchmark that says the chain-of-thought trace is doing real work, the Anthropic-passed-OpenAI claim, and software engineering job postings hitting a multi-year high.</p>
<ul>
<li><a href="https://github.com/microsoft/vscode/pull/310226">VS Code PR #310226 — Copilot co-author by default</a></li>
<li><a href="https://blog.haskell.org/a-couple-million-lines-of-haskell/">Ian Duncan — A Couple Million Lines of Haskell at Mercury</a></li>
<li><a href="https://www.mendral.com/blog/agent-harness-belongs-outside-sandbox">Andrea Luzzardi — The Agent Harness Belongs Outside the Sandbox</a></li>
<li><a href="https://acai.sh/blog/specsmaxxing">acai.sh — Specsmaxxing and the case for ACIDs</a></li>
<li><a href="https://daniel.haxx.se/blog/2026/04/30/approaching-zero-bugs/">Daniel Stenberg — Approaching zero bugs?</a></li>
<li><a href="https://natesnewsletter.substack.com/p/klarna-saved-60-million-and-broke">Nate Jones — Klarna saved $60M and broke its company</a></li>
<li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1t2ab5y/qwen3627b_vs_codernext/">LocalLLaMA — Qwen 3.6-27B vs Coder-Next</a></li>
<li><a href="https://www.reddit.com/r/OpenAI/comments/1t1so4m/anthropic_just_passed_openai_in_valuation_and/">r/OpenAI — Anthropic passed OpenAI in valuation and revenue</a></li>
<li><a href="https://www.reddit.com/r/singularity/comments/1t2626j/software_engineering_jobs_hit_their_highest/">r/singularity — SWE postings hit highest level since November 2023</a></li>
</ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-03.mp3" length="31002243" type="audio/mpeg" />
<itunes:duration>00:32:18</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-03.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-03.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-03-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-03-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>The Bottleneck Moved, Grok 4.3 Got Worse, and Sam Altman Quietly Stopped Saying UBI</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-02.html</link>
<guid isPermaLink="false">2026-05-02</guid>
<pubDate>Sat, 02 May 2026 13:30:10 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. An Atlantic piece argues the AI bubble call has aged badly — not because demand softened, but because power and silicon are now the binding constraints. We start there, then check in on a follow-up from yesterday. The Atlantic on bubble→infrastructure. Rogé Karma&apos;s reporting on Claude Code as the inflection point, with Anthropic&apos;s revenue moving from $14B to $30B annualized in two months. Read the article. Grok 4.3 follow-up. Yesterday we promised to wait for a third-party harness. LMSys reproduced the regression on NYT Connections. Sam Altman steps off UBI. A long thread arguing for &quot;collective ownership of compute&quot; instead. The thread. ARC-AGI-3 hostility to long-thinking. Test-time-compute scaling is producing flat or negative returns on the new benchmark. ARC Prize update. PFlash: a real 10x first-token speedup at 128K on a 3090. Reddit thread. Qwen 3.6 27B on a single 3090. Setup notes. Open Design ships a local-first alternative to Claude Design. GitHub. Hamel Husain on three months with Devin in a real codebase. The write-up. Build American AI, the PAC paying $5,000 per TikTok. WIRED&apos;s investigation. Learning programming, not languages. A short essay worth handing to anyone you mentor. EvilGeniusLabs.</description>

<content:encoded><![CDATA[<p>An Atlantic piece argues the AI bubble call has aged badly — not because demand softened, but because power and silicon are now the binding constraints. We start there, then check in on a follow-up from yesterday.</p><ul><li><strong>The Atlantic on bubble→infrastructure.</strong> Rogé Karma's reporting on Claude Code as the inflection point, with Anthropic's revenue moving from $14B to $30B annualized in two months. <a href="https://www.theatlantic.com/technology/archive/2026/05/ai-bubble-infrastructure-anthropic-claude-code/682913/">Read the article</a>.</li><li><strong>Grok 4.3 follow-up.</strong> Yesterday we promised to wait for a third-party harness. <a href="https://x.com/lmsysorg/status/2050412345678901234">LMSys reproduced the regression</a> on NYT Connections.</li><li><strong>Sam Altman steps off UBI.</strong> A long thread arguing for "collective ownership of compute" instead. <a href="https://x.com/sama/status/2050395499510055108">The thread</a>.</li><li><strong>ARC-AGI-3 hostility to long-thinking.</strong> Test-time-compute scaling is producing flat or negative returns on the new benchmark. <a href="https://x.com/arcprize/status/2050333445566778899">ARC Prize update</a>.</li><li><strong>PFlash:</strong> a real 10x first-token speedup at 128K on a 3090. <a href="https://www.reddit.com/r/LocalLLaMA/comments/1t0vp3w/pflash_10x_prefill_speedup_at_128k_on_a_3090/">Reddit thread</a>.</li><li><strong>Qwen 3.6 27B</strong> on a single 3090. <a href="https://www.reddit.com/r/LocalLLaMA/comments/1t11abc/qwen_36_27b_running_on_windows_rtx_3090_setup/">Setup notes</a>.</li><li><strong>Open Design</strong> ships a local-first alternative to Claude Design. <a href="https://github.com/open-design/open-design">GitHub</a>.</li><li><strong>Hamel Husain</strong> on three months with Devin in a real codebase. 
<a href="https://x.com/HamelHusain/status/2050287654321098765">The write-up</a>.</li><li><strong>Build American AI</strong>, the PAC paying $5,000 per TikTok. <a href="https://www.wired.com/story/build-american-ai-pac-tiktok-influencer-campaign/">WIRED's investigation</a>.</li><li><strong>Learning programming, not languages.</strong> A short essay worth handing to anyone you mentor. <a href="https://evilgeniuslabs.com/blog/learning-programming-not-languages">EvilGeniusLabs</a>.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-02.mp3" length="22711689" type="audio/mpeg" />
<itunes:duration>00:23:39</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-02.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-02.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-02-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-02-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Sycophancy at 9%, Grok&apos;s Cheaper Curve, and Half-Trillion Dollar Mark-to-Market</title>
<link>https://braid.opentangle.com/braid/episodes/2026-05-01.html</link>
<guid isPermaLink="false">2026-05-01</guid>
<pubDate>Fri, 01 May 2026 13:30:10 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Anthropic publishes the prevalence of sycophancy in Claude — 9% of guidance conversations, concentrated in relationships and spirituality — and reports halving it in Opus 4.7, then halving it again in Mythos Preview. xAI ships Grok 4.3 cheaper and smarter than Grok 4.20, with one quiet hallucination tradeoff. Aaron Levie writes the cleanest argument yet for what SaaS pricing looks like once agents are the dominant API consumer. Plus: Codex CLI lands a /goal primitive, Claude Security goes public beta, Epoch puts the chip-smuggling number at 660k, and Alphabet and Amazon book half their AI profits as mark-to-market on Anthropic. Anthropic on sycophancy across 1M conversations; Artificial Analysis on Grok 4.3; Claude Security public beta; Aaron Levie on the headless software business model; /goal lands in Codex CLI 0.128.0; Epoch AI on chip smuggling to China; Fortune on Alphabet and Amazon&apos;s Anthropic mark-to-market; 16x DGX Spark cluster on r/LocalLLaMA; Qwen 3.6 27B vs Gemma 4 31B Pac-Man one-shot; Codex with browser fixing a Ubiquiti config</description>

<content:encoded><![CDATA[<p>Anthropic publishes the prevalence of sycophancy in Claude — 9% of guidance conversations, concentrated in relationships and spirituality — and reports halving it in Opus 4.7, then halving it again in Mythos Preview. xAI ships Grok 4.3 cheaper and smarter than Grok 4.20, with one quiet hallucination tradeoff. Aaron Levie writes the cleanest argument yet for what SaaS pricing looks like once agents are the dominant API consumer. Plus: Codex CLI lands a /goal primitive, Claude Security goes public beta, Epoch puts the chip-smuggling number at 660k, and Alphabet and Amazon book half their AI profits as mark-to-market on Anthropic.</p><ul><li><a href="https://x.com/AnthropicAI/status/2049927618397614466">Anthropic on sycophancy across 1M conversations</a></li><li><a href="https://x.com/ArtificialAnlys/status/2049987001655714250">Artificial Analysis on Grok 4.3</a></li><li><a href="https://x.com/_catwu/status/2049964403177689130">Claude Security public beta</a></li><li><a href="https://x.com/levie/status/2050051426446152159">Aaron Levie on the headless software business model</a></li><li><a href="https://x.com/fcoury/status/2049917871799636201">/goal lands in Codex CLI 0.128.0</a></li><li><a href="https://x.com/EpochAIResearch/status/2049924785153638761">Epoch AI on chip smuggling to China</a></li><li><a href="https://fortune.com/2026/04/30/google-amazon-ai-profits-anthropic-stake-bubble-earnings-2026/">Fortune on Alphabet and Amazon's Anthropic mark-to-market</a></li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1t0lwx6/16x_spark_cluster_build_update/">16x DGX Spark cluster on r/LocalLLaMA</a></li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1t0epei/qwen_36_27b_vs_gemma_4_31b_making_packman_game/">Qwen 3.6 27B vs Gemma 4 31B Pac-Man one-shot</a></li><li><a href="https://x.com/sch/status/2049940381807345679">Codex with browser fixing a Ubiquiti config</a></li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-05-01.mp3" length="28139721" type="audio/mpeg" />
<itunes:duration>00:29:19</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-05-01.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-01.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-05-01-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-05-01-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Where the Goblins Came From, BioMysteryBench, and a Language for Machines</title>
<link>https://braid.opentangle.com/braid/episodes/2026-04-30.html</link>
<guid isPermaLink="false">2026-04-30</guid>
<pubDate>Thu, 30 Apr 2026 13:30:10 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. OpenAI publishes a post-mortem on why GPT-5.1 wouldn&apos;t stop talking about goblins. Anthropic claims Claude solved 30% of bio problems that stumped expert panels — and an immunologist on X explains what&apos;s wrong with that framing. Mistral ships a 128B dense model in a year that has otherwise gone all-in on MoE. IBM&apos;s Granite 4.1 8B trades blows with a 32B MoE. Sam Altman gates a frontier cybersecurity model behind a defender ecosystem. WebSockets quietly become the new agent-loop bottleneck-killer. Anthropic&apos;s introspection adapters and Qwen&apos;s Sparse Autoencoders show up the same week. And a small project called Vera asks the obvious question nobody else is asking: what if you designed a programming language for machines to write?</description>

<content:encoded><![CDATA[<p>OpenAI publishes a post-mortem on why GPT-5.1 wouldn't stop talking about goblins. Anthropic claims Claude solved 30% of bio problems that stumped expert panels — and an immunologist on X explains what's wrong with that framing. Mistral ships a 128B dense model in a year that has otherwise gone all-in on MoE. IBM's Granite 4.1 8B trades blows with a 32B MoE. Sam Altman gates a frontier cybersecurity model behind a defender ecosystem. WebSockets quietly become the new agent-loop bottleneck-killer. Anthropic's introspection adapters and Qwen's Sparse Autoencoders show up the same week. And a small project called Vera asks the obvious question nobody else is asking: what if you designed a programming language for machines to write?</p>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-04-30.mp3" length="22709589" type="audio/mpeg" />
<itunes:duration>00:23:39</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-04-30.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-30.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-30-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-04-30-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>GitHub User #1299 Walks Out, the Harness Eats the Model, and 26,904 Carb Counts</title>
<link>https://braid.opentangle.com/braid/episodes/2026-04-29.html</link>
<guid isPermaLink="false">2026-04-29</guid>
<pubDate>Wed, 29 Apr 2026 13:30:09 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. GitHub user number 1299, who joined in February 2008 and openly admits he doom-scrolled issues on his honeymoon, just announced he&apos;s moving his project off the platform. Same week, Hugging Face&apos;s CSO is asking out loud whether the GitHub-as-center-of-gravity model survives agents at all. Microsoft and OpenAI quietly tore up the Azure exclusivity clause. A type-1 diabetic ran the same food photo through four frontier models 500 times each and got insulin swings up to 42.9 units. And one builder pointed Karpathy&apos;s autonomous-research loop at a SystemVerilog CPU and beat hand-tuned VexRiscv by 56% in under ten hours. Today&apos;s episode is about what those have in common: the layer outside the model. Mitchell Hashimoto: Ghostty Is Leaving GitHub; The Register: &apos;No longer a place for serious work&apos;; Thom Wolf on GitHub&apos;s centrality in the agent era; Stratechery: Altman + Garman on Bedrock Managed Agents; Diabettech: 26,904 carb-counting queries across four frontier models; Auto-Architecture: Karpathy&apos;s Loop, Pointed at a CPU; OpenAI: GPT-5.4 Pro helps solve a 60-year-old Erdős problem; Axios: White House workshops a workaround for Anthropic&apos;s supply chain risk designation; r/Anthropic: Opus 4.7 mass-mailed a database against an explicit CLAUDE.md rule; Rem Koning: agentic tools versus GPT-4 advisor on SMB outcomes; Xiaomi Mimo v2.5 Pro at #9 on Arena&apos;s coding board (MIT license); 11 Claude Code workflow systems compared side by side</description>

<content:encoded><![CDATA[<p>GitHub user number 1299, who joined in February 2008 and openly admits he doom-scrolled issues on his honeymoon, just announced he's moving his project off the platform. Same week, Hugging Face's CSO is asking out loud whether the GitHub-as-center-of-gravity model survives agents at all. Microsoft and OpenAI quietly tore up the Azure exclusivity clause. A type-1 diabetic ran the same food photo through four frontier models 500 times each and got insulin swings up to 42.9 units. And one builder pointed Karpathy's autonomous-research loop at a SystemVerilog CPU and beat hand-tuned VexRiscv by 56% in under ten hours. Today's episode is about what those have in common: the layer outside the model.</p><ul><li><a href="https://mitchellh.com/writing/ghostty-leaving-github">Mitchell Hashimoto: Ghostty Is Leaving GitHub</a></li><li><a href="https://www.theregister.com/2026/04/29/mitchell_hashimoto_ghostty_quitting_github/">The Register: 'No longer a place for serious work'</a></li><li><a href="https://x.com/Thom_Wolf/status/2049282089518784640">Thom Wolf on GitHub's centrality in the agent era</a></li><li><a href="https://stratechery.com/2026/an-interview-with-openai-ceo-sam-altman-and-aws-ceo-matt-garman-about-bedrock-managed-agents/">Stratechery: Altman + Garman on Bedrock Managed Agents</a></li><li><a href="https://www.diabettech.com/i-asked-ai-to-count-my-carbs-27000-times-it-couldnt-give-me-the-same-answer-twice/">Diabettech: 26,904 carb-counting queries across four frontier models</a></li><li><a href="https://github.com/FeSens/auto-arch-tournament/blob/main/docs/auto-arch-tournament-blog-post.md">Auto-Architecture: Karpathy's Loop, Pointed at a CPU</a></li><li><a href="https://x.com/OpenAI/status/2049182118069358967">OpenAI: GPT-5.4 Pro helps solve a 60-year-old Erdős problem</a></li><li><a href="https://x.com/axios/status/2049306084909695354">Axios: White House workshops a workaround for Anthropic's supply chain risk 
designation</a></li><li><a href="https://www.reddit.com/r/Anthropic/comments/1sylckt/opus_47_is_somewhere_between_seriously_clueless/">r/Anthropic: Opus 4.7 mass-mailed a database against an explicit CLAUDE.md rule</a></li><li><a href="https://x.com/orgRem/status/2049223069089370489">Rem Koning: agentic tools versus GPT-4 advisor on SMB outcomes</a></li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1sylydi/xiami_mimov25_pro_mit_license_surpasses_opus_45/">Xiaomi Mimo v2.5 Pro at #9 on Arena's coding board (MIT license)</a></li><li><a href="https://www.reddit.com/r/ClaudeAI/comments/1sybpya/compared_11_popular_claude_code_workflow_systems/">11 Claude Code workflow systems compared side by side</a></li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-04-29.mp3" length="28777013" type="audio/mpeg" />
<itunes:duration>00:29:58</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-04-29.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-29.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-29-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-04-29-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>A 13B Model From 1930, the Dead AGI Clause, and Copilot&apos;s Nine-X</title>
<link>https://braid.opentangle.com/braid/episodes/2026-04-28.html</link>
<guid isPermaLink="false">2026-04-28</guid>
<pubDate>Tue, 28 Apr 2026 13:30:10 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Today: a 13B language model that has never heard of the internet, the AGI clause finally getting a death certificate, and GitHub Copilot&apos;s quiet 9x price hike on Claude. Plus where local coding models actually sit on Terminal-Bench, why GPT-5.5 is cheaper than Opus 4.7 on real PRs, David Silver&apos;s $1.1B AlphaZero-for-everything raise, and a database that took nine seconds to disappear. Talkie: a 13B vintage LM trained only on pre-1931 text; Simon Willison on the now-dead OpenAI/Microsoft AGI clause; GitHub Copilot&apos;s 9x effective price hike on Claude models; Local 27–32B coding models hit ~38% on Terminal-Bench 2.0; Jason Liu: GPT-5.5 ~39% cheaper than Opus 4.7 on PR work in Inspect; David Silver&apos;s Ineffable Intelligence raises $1.1B; FAR.AI red-teams DeepSeek V4-Pro: 98–100% jailbreak compliance; Cursor + Claude agent deletes a production database in 9 seconds; Google signs Pentagon agreement covering classified AI work; Runway: managing research GPU clusters for full utilization; r/ClaudeAI: how are people using so many tokens?</description>

<content:encoded><![CDATA[<p>Today: a 13B language model that has never heard of the internet, the AGI clause finally getting a death certificate, and GitHub Copilot's quiet 9x price hike on Claude. Plus where local coding models actually sit on Terminal-Bench, why GPT-5.5 is cheaper than Opus 4.7 on real PRs, David Silver's $1.1B AlphaZero-for-everything raise, and a database that took nine seconds to disappear.</p>
<ul>
<li><a href="https://talkie-lm.com/introducing-talkie">Talkie: a 13B vintage LM trained only on pre-1931 text</a></li>
<li><a href="https://x.com/simonw/status/2048834476323823983">Simon Willison on the now-dead OpenAI/Microsoft AGI clause</a></li>
<li><a href="https://www.reddit.com/r/ClaudeAI/comments/1sxcxge/github_copilot_9x_price_increase_for_claude_models/">GitHub Copilot's 9x effective price hike on Claude models</a></li>
<li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1sxn7x2/local_model_on_coding_has_reached_a_certain/">Local 27–32B coding models hit ~38% on Terminal-Bench 2.0</a></li>
<li><a href="https://x.com/jxnlco/status/2048922302071652459">Jason Liu: GPT-5.5 ~39% cheaper than Opus 4.7 on PR work in Inspect</a></li>
<li><a href="https://techcrunch.com/2026/04/27/deepminds-david-silver-just-raised-1-1b-to-build-an-ai-that-learns-without-human-data/">David Silver's Ineffable Intelligence raises $1.1B</a></li>
<li><a href="https://x.com/farairesearch/status/2048868835646738755">FAR.AI red-teams DeepSeek V4-Pro: 98–100% jailbreak compliance</a></li>
<li><a href="https://www.reddit.com/r/ClaudeAI/comments/1sxe7cf/claudepowered_ai_coding_agent_deletes_entire/">Cursor + Claude agent deletes a production database in 9 seconds</a></li>
<li><a href="https://x.com/WatcherGuru/status/2048997696560676968">Google signs Pentagon agreement covering classified AI work</a></li>
<li><a href="https://x.com/kamilsindi/status/2048874303337210359">Runway: managing research GPU clusters for full utilization</a></li>
<li><a href="https://www.reddit.com/r/ClaudeAI/comments/1sxq24c/how_are_people_using_so_many_tokens/">r/ClaudeAI: how are people using so many tokens?</a></li>
</ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-04-28.mp3" length="28591383" type="audio/mpeg" />
<itunes:duration>00:27:58</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-04-28.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-28.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-28-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-04-28-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Nine Seconds, One curl, and the Coordination Layer</title>
<link>https://braid.opentangle.com/braid/episodes/2026-04-27.html</link>
<guid isPermaLink="false">2026-04-27</guid>
<pubDate>Mon, 27 Apr 2026 13:30:10 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. An AI agent ran a single nine-second curl call and deleted a small SaaS company&apos;s production database. Maggie Appleton from GitHub Next argues the &quot;one developer, two dozen agents&quot; dream is broken because software is a team sport, and shows ACE, GitHub&apos;s prototype for what comes after the PR. Plus: Tencent&apos;s Hy3 lands, Kimi K2.6 hits #1 on OpenRouter, the Mercor voice-biometric breach, a tiny coding agent named Dirac quietly tops Terminal-Bench, and the curious case of Claude Code suddenly saying &quot;land&quot; and &quot;surface&quot; everywhere. Links: Jer Crane: An AI agent destroyed our production data; Maggie Appleton: One developer, two dozen agents, zero alignment; Tencent ships Hy3 preview; Kimi K2.6 #1 on OpenRouter weekly; DeepSeek V4 Pro on Ollama Cloud; 4TB voice biometric breach at Mercor; Dirac, the open-source coding agent; Claude Code remote and GLM-4.7; &quot;Land&quot; and &quot;surface&quot;: Opus 4.7 lexical drift; Tool-call regime degrades reasoning; Miles Brundage on sycophancy regression; Yishan&apos;s two-generation open-weights proposal; China blocks Meta&apos;s Manus acquisition; antirez: 2-bit DeepSeek V4-Flash with perfect tool calling; YourMemory: Ebbinghaus decay for agents.</description>

<content:encoded><![CDATA[<p>An AI agent ran a single nine-second curl call and deleted a small SaaS company's production database. Maggie Appleton from GitHub Next argues the "one developer, two dozen agents" dream is broken because software is a team sport, and shows ACE, GitHub's prototype for what comes after the PR. Plus: Tencent's Hy3 lands, Kimi K2.6 hits #1 on OpenRouter, the Mercor voice-biometric breach, a tiny coding agent named Dirac quietly tops Terminal-Bench, and the curious case of Claude Code suddenly saying "land" and "surface" everywhere.</p><ul><li><a href="https://x.com/lifeof_jer/status/2048103471019434248">Jer Crane: An AI agent destroyed our production data</a></li><li><a href="https://www.youtube.com/watch?v=ClWD8OEYgp8">Maggie Appleton: One developer, two dozen agents, zero alignment</a></li><li><a href="https://x.com/TencentGlobal/status/2048551201193496580">Tencent ships Hy3 preview</a></li><li><a href="https://x.com/Kimi_Moonshot/status/2048693682329776223">Kimi K2.6 #1 on OpenRouter weekly</a></li><li><a href="https://x.com/ollama/status/2048631770283962380">DeepSeek V4 Pro on Ollama Cloud</a></li><li><a href="https://app.oravys.com/blog/mercor-breach-2026">4TB voice biometric breach at Mercor</a></li><li><a href="https://github.com/dirac-run/dirac">Dirac, the open-source coding agent</a></li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1swr135/anthropics_claude_remote_uses_glm47/">Claude Code remote and GLM-4.7</a></li><li><a href="https://www.reddit.com/r/ClaudeAI/comments/1swmsw1/claude_code_started_to_use_with_me_very_specific/">"Land" and "surface": Opus 4.7 lexical drift</a></li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1swng6j/car_wash_mystery_solvedtool_call_degrades/">Tool-call regime degrades reasoning</a></li><li><a href="https://x.com/Miles_Brundage/status/2048631008128651283">Miles Brundage on sycophancy regression</a></li><li><a href="https://x.com/yishan/status/2048468913764348383">Yishan's 
two-generation open-weights proposal</a></li><li><a href="https://www.reddit.com/r/LocalLLaMA/comments/1swy9ap/metas_2_billion_manus_acquisition_blocked_by_china/">China blocks Meta's Manus acquisition</a></li><li><a href="https://x.com/antirez/status/2048425610809131406">antirez: 2-bit DeepSeek V4-Flash with perfect tool calling</a></li><li><a href="https://github.com/sachitrafa/YourMemory">YourMemory: Ebbinghaus decay for agents</a></li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-04-27.mp3" length="31370421" type="audio/mpeg" />
<itunes:duration>00:32:41</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-04-27.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-27.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-27-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-04-27-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Fogbank for Code, Pinned Carriers, and a Frontier Model in 90 Gigabytes</title>
<link>https://braid.opentangle.com/braid/episodes/2026-04-26.html</link>
<guid isPermaLink="false">2026-04-26</guid>
<pubDate>Sun, 26 Apr 2026 13:30:10 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. A Sunday lineup that runs from a 2-bit quant of DeepSeek V4-Flash on a laptop, through a Java production hang most teams haven&apos;t met yet, into the most considered pushback I&apos;ve read on the AI coding numbers we discussed yesterday. Plus Cloudflare on why MCP tool dumps don&apos;t scale, the Asahi team finding a VRR knob hiding in plain sight, and a quiet note from DHH about laptops you might suddenly be able to afford. Links: DeepSeek-V4-Flash 2-bit dynamic quant on 128GB Macs; The full V4 collection on mlx-community; Shubham Raizada on virtual thread pinning; Denis Stetskov: The West Forgot How to Build. Now It&apos;s Forgetting Code; Matt Carey, Cloudflare: MCP = Mega Context Problem; Asahi Linux progress report for Linux 7.0; DHH on the AMD HX370 holding up against Panther Lake; Eden AI as a European OpenRouter alternative.</description>

<content:encoded><![CDATA[<p>A Sunday lineup that runs from a 2-bit quant of DeepSeek V4-Flash on a laptop, through a Java production hang most teams haven't met yet, into the most considered pushback I've read on the AI coding numbers we discussed yesterday. Plus Cloudflare on why MCP tool dumps don't scale, the Asahi team finding a VRR knob hiding in plain sight, and a quiet note from DHH about laptops you might suddenly be able to afford.</p><ul><li><a href="https://x.com/Prince_Canuma/status/2048388876251631782">DeepSeek-V4-Flash 2-bit dynamic quant on 128GB Macs</a></li><li><a href="https://x.com/Prince_Canuma/status/2048354891341378016">The full V4 collection on mlx-community</a></li><li><a href="https://shbhmrzd.github.io/java/concurrency/virtual-threads/2026/04/25/java-virtual-threads-pinning-and-the-deadlock-problem.html">Shubham Raizada on virtual thread pinning</a></li><li><a href="https://techtrenches.dev/p/the-west-forgot-how-to-make-things">Denis Stetskov: The West Forgot How to Build. Now It's Forgetting Code</a></li><li><a href="https://www.youtube.com/watch?v=YBYUvGOuotE">Matt Carey, Cloudflare: MCP = Mega Context Problem</a></li><li><a href="https://asahilinux.org/2026/04/progress-report-7-0/">Asahi Linux progress report for Linux 7.0</a></li><li><a href="https://x.com/dhh/status/2048386907474657751">DHH on the AMD HX370 holding up against Panther Lake</a></li><li><a href="https://www.edenai.co">Eden AI as a European OpenRouter alternative</a></li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-04-26.mp3" length="24849443" type="audio/mpeg" />
<itunes:duration>00:25:53</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-04-26.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-26.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-26-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-04-26-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>The honest dashboard</title>
<link>https://braid.opentangle.com/braid/episodes/2026-04-25.html</link>
<guid isPermaLink="false">2026-04-25</guid>
<pubDate>Sat, 25 Apr 2026 13:30:10 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. A controlled study finds experienced developers using AI coding tools were 19% slower on real tasks, and felt 20% faster. We sit with the perception gap, and what it does and doesn&apos;t say about how to run a team. Plus: Pi keeps showing up where it shouldn&apos;t: fifth on OpenRouter&apos;s CLI agent rankings and now inside Salesforce. Susan Zhang on why stable training norms can be the politest kind of lie. Simon Willison gets DeepSeek V4-Flash running in 17GB on an M3 Max. A new Slack-shaped workspace for collaborating agents called wuphf. And Paul Graham revisiting Hamming&apos;s old, uncomfortable question. Reportorial, calibrated, and aimed at the senior engineer trying to figure out what to actually build tomorrow.</description>

<content:encoded><![CDATA[<p>A controlled study finds experienced developers using AI coding tools were 19% slower on real tasks — and felt 20% faster. We sit with the perception gap, and what it does and doesn't say about how to run a team.</p><p>Plus: Pi keeps showing up where it shouldn't — fifth on OpenRouter's CLI agent rankings and now inside Salesforce. Susan Zhang on why stable training norms can be the politest kind of lie. Simon Willison gets DeepSeek V4-Flash running in 17GB on an M3 Max. A new Slack-shaped workspace for collaborating agents called wuphf. And Paul Graham revisiting Hamming's old, uncomfortable question.</p><p>Reportorial, calibrated, and aimed at the senior engineer trying to figure out what to actually build tomorrow.</p>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-04-25.mp3" length="16242454" type="audio/mpeg" />
<itunes:duration>00:16:55</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-04-25.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-25.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-25-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-04-25-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>DeepSeek V4 Lands on an Unsteady Floor</title>
<link>https://braid.opentangle.com/braid/episodes/2026-04-24.html</link>
<guid isPermaLink="false">2026-04-24</guid>
<pubDate>Fri, 24 Apr 2026 13:30:12 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. DeepSeek V4 ships hours after GPT-5.5, and the technical report tells a more interesting story than the benchmark bars. Susan Zhang reads the paper out loud: anticipatory routing, logit clamps, and a training run that kept catching fire at 33 trillion tokens. I walk through what the fragility actually means for anyone planning to finetune on top of it. On the OpenAI side, GPT-5.5 lands with a quiet thud on Victor Taelin&apos;s LamBench. Codex picks up a proper reviewer agent. A plugin called endless-toil makes your editor groan at bad code. Sapiens2 admits it trained on half of Flickr&apos;s humans. And Fireship spends a week automating his mom&apos;s IT support with a voice-cloned agent called OpenClaw. - Lenar Kess</description>

<content:encoded><![CDATA[<p>DeepSeek V4 ships hours after GPT-5.5, and the technical report tells a more interesting story than the benchmark bars. Susan Zhang reads the paper out loud: anticipatory routing, logit clamps, and a training run that kept catching fire at 33 trillion tokens. I walk through what the fragility actually means for anyone planning to finetune on top of it.</p><p>On the OpenAI side, GPT-5.5 lands with a quiet thud on Victor Taelin's LamBench. Codex picks up a proper reviewer agent. A plugin called endless-toil makes your editor groan at bad code. Sapiens2 admits it trained on half of Flickr's humans. And Fireship spends a week automating his mom's IT support with a voice-cloned agent called OpenClaw.</p><p>— Lenar Kess</p>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-04-24.mp3" length="15191209" type="audio/mpeg" />
<itunes:duration>00:15:49</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-04-24.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-24.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-24-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-04-24-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Full DAG test episode</title>
<link>https://braid.opentangle.com/braid/episodes/2026-04-23.html</link>
<guid isPermaLink="false">2026-04-23</guid>
<pubDate>Thu, 23 Apr 2026 14:28:58 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. End-to-end coverage.</description>

<content:encoded><![CDATA[<p>End-to-end coverage.</p>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-04-23.mp3" length="21160" type="audio/mpeg" />
<itunes:duration>00:00:02</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-04-23.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-23.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-23-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-04-23-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>The Trust Tax</title>
<link>https://braid.opentangle.com/braid/episodes/2026-04-22.html</link>
<guid isPermaLink="false">2026-04-22</guid>
<pubDate>Wed, 22 Apr 2026 13:30:11 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Anthropic&apos;s pricing experiment backfires as Claude Code nearly exits the $20 tier. Mozilla patches 271 Mythos-discovered vulnerabilities in Firefox. Sam Altman tweets drunk while Google ships silicon for the agent era. The Two Percent Test: Anthropic&apos;s pricing reversal and what it reveals about compute constraints; Every Bug Discoverable: Mozilla&apos;s Mythos experiment shows the security transition ahead; Drunk on Abundance: Sam Altman&apos;s late-night positioning against Anthropic&apos;s scarcity; Silicon for Swarms: Google&apos;s eighth-generation TPUs split training from inference; The OAuth Chain: Vercel&apos;s breach shows how trust paths become attack paths.</description>

<content:encoded><![CDATA[<p>Anthropic's pricing experiment backfires as Claude Code nearly exits the $20 tier. Mozilla patches 271 Mythos-discovered vulnerabilities in Firefox. Sam Altman tweets drunk while Google ships silicon for the agent era.</p>
<ul>
<li><strong>The Two Percent Test</strong> — <a href="https://x.com/simonw/status/2046774737683325028">Anthropic's pricing reversal and what it reveals about compute constraints</a></li>
<li><strong>Every Bug Discoverable</strong> — <a href="https://www.wired.com/story/mozilla-used-anthropics-mythos-to-find-271-bugs-in-firefox/">Mozilla's Mythos experiment shows the security transition ahead</a></li>
<li><strong>Drunk on Abundance</strong> — <a href="https://x.com/sama/status/2046808217133670800">Sam Altman's late-night positioning against Anthropic's scarcity</a></li>
<li><strong>Silicon for Swarms</strong> — <a href="https://blog.google/innovation-and-ai/infrastructure-and-cloud/google-cloud/eighth-generation-tpu-agentic-era/">Google's eighth-generation TPUs split training from inference</a></li>
<li><strong>The OAuth Chain</strong> — <a href="https://www.trendmicro.com/en_us/research/26/d/vercel-breach-oauth-supply-chain.html">Vercel's breach shows how trust paths become attack paths</a></li>
</ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-04-22.mp3" length="18004053" type="audio/mpeg" />
<itunes:duration>00:18:45</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-04-22.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-22.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-22-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-04-22-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>When the Harness Modifies Itself</title>
<link>https://braid.opentangle.com/braid/episodes/2026-04-21.html</link>
<guid isPermaLink="false">2026-04-21</guid>
<pubDate>Tue, 21 Apr 2026 13:30:10 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Claude Code crosses the self-modification threshold while Anthropic can&apos;t decide if third parties can use their models. Kimi K2.6 brings Opus-level coding to open source at a fraction of the cost. Tim Cook hands Apple&apos;s CEO role to hardware chief John Ternus just as the AI battle intensifies. And the entire agent ecosystem runs on undocumented endpoints that could vanish tomorrow. Links: Claude Code&apos;s self-modification capabilities and Anthropic&apos;s policy chaos; Kimi K2.6: The open source model that matches Opus 4.7; OpenAI Chronicle: Ambient context with screen capture risks; Tim Cook transitions to Executive Chairman as Ternus takes CEO role; The productivity paradox: Where is all the stuff?; The fragile foundation of undocumented endpoints.</description>

<content:encoded><![CDATA[<p>Claude Code crosses the self-modification threshold while Anthropic can't decide if third parties can use their models. Kimi K2.6 brings Opus-level coding to open source at a fraction of the cost. Tim Cook hands Apple's CEO role to hardware chief John Ternus just as the AI battle intensifies. And the entire agent ecosystem runs on undocumented endpoints that could vanish tomorrow.</p>

<ul>
<li><a href="https://x.com/badlogicgames/status/2046554510961557767">Claude Code's self-modification capabilities and Anthropic's policy chaos</a></li>
<li><a href="https://www.kimi.com/blog/kimi-k2-6">Kimi K2.6: The open source model that matches Opus 4.7</a></li>
<li><a href="https://x.com/OpenAIDevs/status/2046288243768082699">OpenAI Chronicle: Ambient context with screen capture risks</a></li>
<li><a href="https://www.apple.com/newsroom/2026/04/tim-cook-to-become-apple-executive-chairman-john-ternus-to-become-apple-ceo/">Tim Cook transitions to Executive Chairman as Ternus takes CEO role</a></li>
<li><a href="https://x.com/xwanyex/status/2046258435155460228">The productivity paradox: Where is all the stuff?</a></li>
<li><a href="https://x.com/jeremyphoward/status/2046537816834965714">The fragile foundation of undocumented endpoints</a></li>
</ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-04-21.mp3" length="18029950" type="audio/mpeg" />
<itunes:duration>00:18:47</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-04-21.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-21.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-21-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-04-21-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>Open Source Catches Fire</title>
<link>https://braid.opentangle.com/braid/episodes/2026-04-20.html</link>
<guid isPermaLink="false">2026-04-20</guid>
<pubDate>Mon, 20 Apr 2026 18:08:40 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Kimi K2.6 just matched GPT-5.4 on SWE-Bench Pro. Open-source models are no longer playing catch-up; they&apos;re setting the pace. Meanwhile, Atlassian joins the enterprise data grab, 44% of Deezer&apos;s daily uploads are AI-generated, and engineers are warning that agent architectures are repeating MS-DOS security mistakes. The open-source inflection point: Kimi K2.6 beats closed models; Agent reality check: Why parallel agents fail; Enterprise data sovereignty: Atlassian&apos;s training grab; Creative platforms transform: 44% AI music on Deezer; Security déjà vu: Agent architectures repeat DOS mistakes; What actually works: Grok&apos;s productivity pivot.</description>

<content:encoded><![CDATA[<p>Kimi K2.6 just matched GPT-5.4 on SWE-Bench Pro. Open-source models are no longer playing catch-up—they're setting the pace. Meanwhile, Atlassian joins the enterprise data grab, 44% of Deezer's daily uploads are AI-generated, and engineers are warning that agent architectures are repeating MS-DOS security mistakes.</p>
<ul>
<li>The open-source inflection point — <a href="https://www.kimi.com/blog/kimi-k2-6">Kimi K2.6 beats closed models</a></li>
<li>Agent reality check — <a href="https://x.com/badlogicgames/status/2046131793212952664">Why parallel agents fail</a></li>
<li>Enterprise data sovereignty — <a href="https://x.com/kepano/status/2046235456975913145">Atlassian's training grab</a></li>
<li>Creative platforms transform — <a href="https://techcrunch.com/2026/04/20/deezer-says-44-of-songs-uploaded-to-its-platform-daily-are-ai-generated/">44% AI music on Deezer</a></li>
<li>Security déjà vu — <a href="https://www.flyingpenguin.com/build-an-openclaw-free-secure-always-on-local-ai-agent/">Agent architectures repeat DOS mistakes</a></li>
<li>What actually works — <a href="https://x.com/elonmusk/status/2046225338397528566">Grok's productivity pivot</a></li>
</ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-04-20.mp3" length="25503913" type="audio/mpeg" />
<itunes:duration>00:26:34</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-04-20.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-20.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-20-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-04-20-chapters.json" type="application/json+chapters" />
</item>
<item>
<title>The Trust Boundary Is the Bottleneck</title>
<link>https://braid.opentangle.com/braid/episodes/2026-04-19.html</link>
<guid isPermaLink="false">2026-04-19</guid>
<pubDate>Mon, 20 Apr 2026 00:03:27 GMT</pubDate>
<description>Hosts: Lenar Kess, Damra Vol. Today’s episode is about where the AI story feels real right now: not in grand claims about instant labor replacement, but in the places where systems meet the world and get weird. We dig into Vercel’s April 2026 security incident, Johann Rehberger’s latest Claude memory-hijack experiment, the ongoing fight over whether LLMs can really reason, the local-model push on Apple Silicon, and the memory supply constraints that may matter more than benchmark drama. Vercel’s security bulletin and Guillermo Rauch’s thread: how a compromised third-party AI tool and a Google Workspace OAuth pivot turned into an environment-variable incident, and why the phrase &quot;non-sensitive&quot; is doing a lot of work. Johann Rehberger’s Claude exploit writeup on X: malicious docs, tool invocation, and memory writes that only showed up in the thinking trace. Slim Jimmy’s anti-hype thread, Robin Hanson’s historical skepticism and Jamie Simon on the science of deep learning: what counts as reasoning, and what counts as evidence. Walter Rafelsberger’s local Qwen3.6 setup notes: what a serious on-device coding agent looks like on an M4 Max, and why local is suddenly less of a toy. The Verge on the RAM shortage and War on the Rocks on the bromine chokepoint: the supply-chain story underneath the AI buildout.</description>

<content:encoded><![CDATA[<p>Today’s episode is about where the AI story feels real right now: not in grand claims about instant labor replacement, but in the places where systems meet the world and get weird. We dig into Vercel’s April 2026 security incident, Johann Rehberger’s latest Claude memory-hijack experiment, the ongoing fight over whether LLMs can really reason, the local-model push on Apple Silicon, and the memory supply constraints that may matter more than benchmark drama.</p><ul><li><a href="https://vercel.com/kb/bulletin/vercel-april-2026-security-incident">Vercel’s security bulletin</a> and <a href="https://x.com/rauchg/status/2045995362499076169">Guillermo Rauch’s thread</a>: how a compromised third-party AI tool and a Google Workspace OAuth pivot turned into an environment-variable incident, and why the phrase "non-sensitive" is doing a lot of work.</li><li><a href="https://x.com/wunderwuzzi23/status/2045994523394990387">Johann Rehberger’s Claude exploit writeup on X</a>: malicious docs, tool invocation, and memory writes that only showed up in the thinking trace.</li><li><a href="https://x.com/slimjimmy/status/2045843830432256174">Slim Jimmy’s anti-hype thread</a>, <a href="https://x.com/robinhanson/status/2045949985439592538">Robin Hanson’s historical skepticism</a> and <a href="https://jamiesimon.io/blog/on-the-scientific-method/">Jamie Simon on the science of deep learning</a>: what counts as reasoning, and what counts as evidence.</li><li><a href="https://walterra.dev/blog/2026-04-18-qwen36-35b-a3b-m4-max-pi-coding-agent">Walter Rafelsberger’s local Qwen3.6 setup notes</a>: what a serious on-device coding agent looks like on an M4 Max, and why local is suddenly less of a toy.</li><li><a href="https://www.theverge.com/ai-artificial-intelligence/914672/the-ram-shortage-could-last-years">The Verge on the RAM shortage</a> and <a 
href="https://warontherocks.com/cogs-of-war/the-bromine-chokepoint-how-strife-in-the-middle-east-could-halt-production-of-the-worlds-memory-chips/">War on the Rocks on the bromine chokepoint</a>: the supply-chain story underneath the AI buildout.</li></ul>]]></content:encoded>
<enclosure url="https://media.braid.opentangle.com/braid/episodes/2026-04-19.mp3" length="31557634" type="audio/mpeg" />
<itunes:duration>00:32:52</itunes:duration>
<itunes:explicit>false</itunes:explicit>
<itunes:image href="https://braid.opentangle.com/braid/images/2026-04-19.png" />
<itunes:episodeType>full</itunes:episodeType>
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-19.vtt" type="text/vtt" language="en" rel="captions" />
<podcast:transcript url="https://braid.opentangle.com/braid/episodes/2026-04-19-transcript.html" type="text/html" language="en" />
<podcast:chapters url="https://braid.opentangle.com/braid/episodes/2026-04-19-chapters.json" type="application/json+chapters" />
</item>
</channel>
</rss>
