BRAID

Dispatch 034 · 2026-05-22 ROU Full Compliance With Hunches

The recant, the runtime, and a Pantheon built in code

00:21:21 · 9 sources

“The leverage this week wasn't in the model — it was in the layers people are building around it.”

— Lenar Kess, today's narration

A corporate takedown answered with a recant letter and a mirror in Germany, the protocols and computers agents actually run on, six tools trying to build the Pantheon in code, and a paper where the model writes its own GPU kernel. Plus Codex learning to keep going, a security tool hardened against the real world, and a graduation room that cheered for human intelligence.

Chapters

  1. 00:00:04 Meta emails Heretic, and Heretic recants
  2. 00:03:14 Five hundred pull requests a day, and the harness that triages them
  3. 00:06:10 The computer the agent runs on
  4. 00:09:00 Building the Pantheon, in code
  5. 00:11:53 When the model writes its own kernel
  6. 00:14:30 Codex learns to keep going
  7. 00:16:21 Hardening the thing that reads your CI config
  8. 00:17:47 The headcount bet, and a room that cheered
  9. 00:20:20 Where it leaves us

Sources

9 cited
  1. Heretic has been served a legal notice by Meta, Inc.

    Source -p-e-w- (Philipp Emanuel Weidmann) — Creator of Heretic, an open-source tool that automatically removes refusal/safety alignment from open-weight LLMs via directional ablation ("abliteration")

    Following the commendable example set by the renowned heretic Galileo Galilei in 1616, we are recanting the relevant materials, namely derivatives of Meta's "Llama" models, and have removed the same.

    www.reddit.com/r/LocalLLaMA/comments/1tjmvx…
    Context
    A concrete test of how far the Llama community license reaches over downstream derivatives, and a sign that the decensoring community will route around takedowns with mirrors and jurisdiction-shopping rather than stop.
    Key points
    • Meta's legal provider emailed the Heretic project demanding removal of abliterated derivatives of Llama models; the maintainer complied and pulled them from his weight repositories.
    • The maintainer answered the takedown notice with a satirical mock-compliance letter that 'recants' the Llama derivatives, invoking what the letter calls Galileo's example of 1616.
    • The letter includes a jab: Llama 'ranks among the 200 best language models available today, trailing only 168 other models from 23 competitors' on the LM Arena leaderboard.
    • Heretic immediately stood up an official Codeberg mirror hosted in Germany and announced plans for more mirrors plus technical measures to preserve access.
    • The episode is about derivative licensing and the Llama community-license terms, not the abliteration technique itself, which remains legal and widely used (1,000+ community models).
    Engagement
    1876 likes · 288 replies
    Provenance
    Source · Background source
  2. Scaling Agents on Kubernetes with acpx and ACP — Onur Solmaz, OpenClaw

    Video Onur Solmaz (AI Engineer) — OpenClaw maintainer and founding engineer at TextCortex; has been building coding harnesses since before ChatGPT

    We have over 60K PRs total. 300 to 500 per day on average are open... you can't merge it, but you can also not fully discard it. You need to take this data point.

    www.youtube.com/watch?v=VaS2h-dY1-4
    Context
    A grounded look at what maintaining a wildly popular open-source project looks like when the contribution stream is mostly machine-generated — and a real-world argument for agent-to-client protocols over bespoke plugins.
    Key points
    • OpenClaw receives 300-500 pull requests per day, most AI-generated and unmergeable, but each PR is signal about something broken in the codebase.
    • Solmaz built acpx, a headless CLI over the Agent Client Protocol (ACP), to triage and process PRs through a workflow graph: reproduce the bug, judge the implementation, check conflicts, run a review loop, make CI pass.
    • ACP, which originated at Zed, standardizes agent-to-client interaction the way MCP standardizes giving tools to a model; one interface can drive Codex, Claude Code, and others instead of per-editor plugins.
    • He frames the work as 'standard operating procedures for agents' and 'automating the automator' — programming the mechanical PR-triage steps so only judgment calls reach a human.
    • His day-job project (textcortex/spritz) runs disposable agents on full Kubernetes pods rather than thin code-execution boxes, betting on stateful, on-demand agent computers.
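    The triage steps listed above can be pictured as a small ordered pipeline. The following is a minimal Python sketch, not acpx's actual implementation: the `PullRequest` fields, the `triage` function, and the outcome strings are hypothetical names invented for illustration, and the workflow graph is simplified to a linear chain. Only the order of the steps comes from the talk summary.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class PullRequest:
    # Hypothetical fields; a real harness would fill these by running agents.
    number: int
    bug_reproduced: bool
    implementation_ok: bool
    has_conflicts: bool
    review_passed: bool
    ci_green: bool
    log: List[str] = field(default_factory=list)


# Each step returns True when the PR may move on to the next node.
Step = Tuple[str, Callable[[PullRequest], bool]]

STEPS: List[Step] = [
    ("reproduce the bug", lambda pr: pr.bug_reproduced),
    ("judge the implementation", lambda pr: pr.implementation_ok),
    ("check conflicts", lambda pr: not pr.has_conflicts),
    ("run a review loop", lambda pr: pr.review_passed),
    ("make CI pass", lambda pr: pr.ci_green),
]


def triage(pr: PullRequest) -> str:
    """Walk the steps in order and stop at the first one that fails."""
    for name, check in STEPS:
        passed = check(pr)
        pr.log.append(f"{name}: {'ok' if passed else 'failed'}")
        if not passed:
            # Record where the PR stopped instead of discarding it outright.
            return f"stopped_at:{name}"
    return "ready_for_maintainer"
```

    Returning the name of the failing step, rather than a bare rejection, mirrors the point that an unmergeable PR is still a data point about the codebase.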
    Provenance
    Video · Supporting source
  3. AI Agents Need Computers: 74% MoM Growth, 850K/Day Runs, & New Agent Cloud — Ivan Burazin, Daytona

    Video Ivan Burazin (Latent Space / swyx) — CEO of Daytona; co-founded Code Anywhere, one of the first browser-based IDEs

    People literally call you if you do not give them access. They want access right now... the market for every single agent that will exist ever in the future — how big is that?

    www.youtube.com/watch?v=kaX43RRRUKY
    Context
    Names the runtime layer agentic coding actually depends on: where the agent's files live, how its machine pauses and resumes, and why that's a distinct category from human dev tooling.
    Key points
    • Daytona sells 'composable computers for AI agents' — not thin code-execution boxes but full, stateful, resizable machines reachable through an API.
    • Daytona reports 74% month-over-month growth; the team pivoted from human dev-environment automation to agent sandboxes in January 2025 after demand outran the old product.
    • Key insight: infrastructure for humans and agents is not the same; agents want pause/resume statefulness like closing and opening a laptop lid, plus very fast cold starts.
    • Daytona runs on bare metal with its own scheduler and preloaded NVMe snapshots to avoid network latency — combining 'a Lambda and an EC2.'
    • The bet is that every agent that will ever run needs its own computer, a market Burazin argues dwarfs the human-engineer tooling market.
    Provenance
    Video · Supporting source
  4. OpenSCAD LLM Benchmark: Building the Pantheon

    Article ModelRift — ModelRift is an OpenSCAD-based AI 3D-model builder; the post benchmarks coding agents on a parametric-CAD task

    The limiting factor was not tool access. It was geometric judgment, camera setup, and whether a previewed model exported into a clean final mesh.

    modelrift.com/blog/openscad-llm-benchmark
    Context
    A rare apples-to-apples look at spatial reasoning in code form, and a useful reminder that for CAD-style work the export step needs its own inspection pass, not just the render loop.
    Key points
    • Six coding tools were asked to build the Roman Pantheon in OpenSCAD from two reference images, rendering PNG previews via the CLI and iterating.
    • Google Antigravity 2.0 with Gemini 3.5 Flash High scored best of the fully autonomous runs (4.5/5): it searched for real Pantheon dimensions and implemented the dome's signature 5 rings of 28 coffers.
    • Codex 5.5 High produced the densest model, including the entablature inscription, but its final exported STL diverged from the good-looking preview — preview correctness isn't export correctness.
    • Speed didn't predict quality: Cursor's Composer 2.5 was fastest and weakest; Claude Sonnet was slowest among the first batch and produced the cleanest massing.
    • Tool access wasn't the bottleneck; every agent drove the OpenSCAD CLI fine. The gap was geometric judgment, and human-in-the-loop annotation still beat fully autonomous runs.
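    To make the "5 rings of 28 coffers" detail concrete, here is a rough Python sketch of how coffer centers could be laid out on a hemispherical dome before being handed to a CAD script. It is not taken from any of the benchmarked runs: the function name, the `radius` parameter, and the elevation band are illustrative assumptions, and only the ring and coffer counts come from the benchmark summary.

```python
import math
from typing import List, Tuple

RINGS = 5              # from the benchmark summary
COFFERS_PER_RING = 28  # from the benchmark summary


def coffer_centers(
    radius: float,
    min_elev_deg: float = 15.0,  # assumed lower edge of the coffered band
    max_elev_deg: float = 60.0,  # assumed upper edge of the coffered band
) -> List[Tuple[float, float, float]]:
    """Return (x, y, z) centers for every coffer on a hemispherical dome."""
    centers = []
    for ring in range(RINGS):
        # Spread ring elevations evenly across the assumed band.
        t = ring / (RINGS - 1)
        elev = math.radians(min_elev_deg + t * (max_elev_deg - min_elev_deg))
        for k in range(COFFERS_PER_RING):
            # Equal angular spacing around the dome's vertical axis.
            azim = 2.0 * math.pi * k / COFFERS_PER_RING
            x = radius * math.cos(elev) * math.cos(azim)
            y = radius * math.cos(elev) * math.sin(azim)
            z = radius * math.sin(elev)
            centers.append((x, y, z))
    return centers
```

    Calling `coffer_centers(1.0)` yields 140 points, all lying on the unit sphere. Whether those positions survive export as a clean mesh is a separate check, which is the gap the benchmark highlights.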
    Provenance
    Article · Supporting source
  5. CODA: Rewriting Transformer Blocks as GEMM-Epilogue Programs

    Article Han Guo et al. — Machine-learning systems paper (arXiv, May 2026) on GPU kernel design for Transformer training

    Both human- and LLM-authored CODA kernels achieve high performance, suggesting that GEMM-plus-epilogue programming offers a practical path toward combining framework-level productivity with hardware-level efficiency.

    arxiv.org/abs/2605.19269
    Context
    Data movement, not arithmetic, is increasingly the ceiling in training stacks; an abstraction that lets an LLM author near-expert kernels is a concrete example of agents reaching down into the hardware layer.
    Key points
    • A nontrivial share of Transformer training time goes not to matrix multiplies but to memory-bound operators around them — normalization, activations, residual updates, reductions — that shuttle big tensors through global memory.
    • CODA reparameterizes those operators to run as the 'epilogue' of a matrix multiply, while the output tile is still on chip, before it's written back to memory.
    • It fixes the GEMM mainloop and exposes a small set of composable epilogue primitives — scaling, reductions, pairwise transforms, accumulation — covering nearly all non-attention work in a Transformer block.
    • The constrained interface keeps the performance of expert-written kernels while staying expressive enough for framework-level productivity.
    • Notably, both human-written and LLM-written CODA kernels hit high performance, hinting that this abstraction is tractable for models to author, not just experts.
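    The epilogue idea above can be illustrated with a toy, pure-Python tiled matrix multiply. This is a conceptual sketch only, not CODA's interface or a GPU kernel: `gemm_with_epilogue` and `make_epilogue` are hypothetical names, the ReLU activation is an arbitrary choice, and a local buffer stands in for on-chip storage. It covers only elementwise epilogues, not the reductions the paper also handles.

```python
from typing import Callable, List

Matrix = List[List[float]]
Epilogue = Callable[[float, int, int], float]


def gemm_with_epilogue(a: Matrix, b: Matrix, epilogue: Epilogue, tile: int = 2) -> Matrix:
    """Tiled C = epilogue(A @ B); the epilogue runs before each tile is stored."""
    m, k, n = len(a), len(b), len(b[0])
    out = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            rows = range(i0, min(i0 + tile, m))
            cols = range(j0, min(j0 + tile, n))
            # Fixed "mainloop": accumulate the output tile in a local buffer.
            acc = {(i, j): 0.0 for i in rows for j in cols}
            for p in range(k):
                for i in rows:
                    for j in cols:
                        acc[(i, j)] += a[i][p] * b[p][j]
            # "Epilogue": transform the tile while it is still local, then
            # write each element to the output exactly once.
            for (i, j), value in acc.items():
                out[i][j] = epilogue(value, i, j)
    return out


def make_epilogue(scale: float, residual: Matrix) -> Epilogue:
    """Compose scaling, a ReLU activation, and a residual add."""
    def epilogue(value: float, i: int, j: int) -> float:
        return max(0.0, scale * value) + residual[i][j]
    return epilogue
```

    An unfused version would write `A @ B` to memory, then read and rewrite it once for each of the scaling, activation, and residual passes. Here every output element is written once, which is the data-movement saving the paper targets.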
    Provenance
    Article · Supporting source
  6. Run long tasks in Codex using goals

    Video OpenAI — OpenAI's Codex team announcing feature graduations

    Give Codex a specific milestone, and it will keep working until it gets there, even across hours or days. You can check in and steer, and even pause Codex along the way.

    www.youtube.com/watch?v=rgh0hMYPcd0
    Context
    Long-horizon 'work until the goal is met' execution and ambient app context are exactly the capabilities that change what you delegate to an agent versus do by hand.
    Key points
    • Codex's goal mode (/goal) graduated from experiment to a shipped feature across the app, IDE extension, and CLI.
    • You hand Codex a milestone and it works toward it across hours or days, with the ability to check in, steer, and pause mid-run.
    • OpenAI also shipped Appshots: press Command-Command on a Mac to attach an app window to a Codex thread, giving it a screenshot plus text — including content beyond what's onscreen.
    • Teams can now share custom Codex plugins across a workspace and manage what's available, turning internal tools into reusable building blocks.
    • The releases push Codex from a single-shot assistant toward a long-horizon, context-aware teammate.
    Provenance
    Video · Supporting source
  7. The Companies Cutting Headcount for AI Will Lose to the Ones Who Didn't

    Article Libertas Software Research — Software consultancy's research note arguing against AI-driven layoffs

    AI does not replace judgement. It multiplies it... The human is not removed from the equation. The human is the equation. AI is what makes that equation run faster.

    libertas.software/en/knowledge-hub/19/the-c…
    Context
    A clear counter to the AI-layoff framing that dominated recent headlines, and a frame for how senior engineers remain a source of leverage rather than a cost.
    Key points
    • The argument: the value in a team isn't the work it produces but the institutional knowledge it carries — edge cases, why decisions were made, what customers really mean.
    • Cutting experienced people for AI efficiency trades a hard-to-rebuild asset for a short-term payroll cut.
    • A better operating model uses AI to do far more work with the same people: one analyst producing in a morning what took three days, then spending the rest of the week on interpretation.
    • A prompt from someone who deeply understands the business beats the same prompt from a replacement working off a brief — context is a hard advantage.
    • The better question is not 'where can AI replace people' but 'where can AI give our people back the time they lose to work that doesn't need their judgment.'
    Provenance
    Article · Supporting source
  8. Steve Wozniak cheered after telling students they have AI — actual intelligence

    Article Lauren Edmonds (Business Insider) — Apple co-founder Steve Wozniak, speaking at Grand Valley State University commencement

    You have AI — actual intelligence.

    www.businessinsider.com/steve-wozniak-apple…
    Context
    A light but telling cultural data point: the room cheered human capability over the machine, against a backdrop of AI-related layoffs.
    Key points
    • Wozniak told 2026 graduates 'You have AI — actual intelligence,' drawing laughs and applause rather than the boos other AI-forward speakers got.
    • Eric Schmidt and a real-estate executive were both booed for AI comments at other commencements the same season.
    • Wozniak framed today's AI as one attempt to 'duplicate a routine a trillion times and have it work like a brain.'
    • His closing advice: 'think different' — do something a little different from the million other people taking the same steps.
    • The reception is a small read on graduate-cohort mood toward AI as they enter an unsettled job market.
    Provenance
    Article · Supporting source
  9. Trail of Bits hardens zizmor against GitHub Actions misconfigs

    X trailofbits — Security research firm Trail of Bits, maintainers of the zizmor GitHub Actions static analyzer

    We tested zizmor against 41,253 real workflows, found 4 anchor-handling bugs plus deserialization and expression-evaluator issues, and helped land 15 upstream fixes.

    x.com/trailofbits/status/2057782296527208709
    Context
    Supply-chain attacks keep moving through CI/CD, and a more reliable static analyzer for GitHub Actions is a practical defense engineers can adopt now.
    Key points
    • Trail of Bits hardened zizmor, the static analyzer for GitHub Actions workflows, by testing it against 41,253 real-world workflow files.
    • The exercise surfaced 4 anchor-handling bugs plus deserialization and expression-evaluator issues, and led to 15 upstream fixes.
    • Framing: a CI/CD compromise like the Trivy-to-LiteLLM chain can multiply across the software supply chain, which is why it matters that workflow configs zizmor could not fully scan before can now be scanned.
    • The work is about making the analyzer reliable on messy real configs, not just clean test cases.
    Engagement
    19 likes · 5 retweets · 3 replies
    Provenance
    Tweet · Primary source