CONSTRUCT

Dispatch 005 · 2026-05-13

The Agent Now Watches the Agent

00:14:35 · 10 sources

“Once the trace becomes material for the next run, observability stops being a dashboard and becomes part of the agent's workspace.”

— Lenar Kess, today's narration

Sources

10 cited
  1. Harrison Chase announces LangSmith Engine

    Thread Harrison Chase — LangChain cofounder announcing LangSmith Engine during Interrupt.

    LangSmith Engine is an agent that sits on top of your traces

    x.com/hwchase17/status/2054657397902455060
    Context
    It made the episode's main question concrete: what happens when the agent starts inspecting and improving the agent loop.
    Key points
    • Engine runs in the background over LangSmith traces.
    • It identifies issues and suggests code changes or evaluators.
    • The announcement positions traces as input to agent improvement, not only debugging evidence.
    Provenance
    Thread · Primary source
  2. LangSmith Sandboxes are Generally Available

    Article Mukhil Loganathan — LangChain author announcing GA for LangSmith Sandboxes.

    Each sandbox is a hardware-virtualized micro virtual machine

    www.langchain.com/blog/langsmith-sandboxes-…
    Context
    It supplied the execution-containment half of the LangChain bundle and gave Halek concrete operator controls to discuss.
    Key points
    • GA adds micro virtual machine isolation, snapshots, cheap forks, blueprints, service URLs, a CLI, creator-private defaults, and an auth proxy.
    • The post argues that containers and eval boundaries are insufficient for untrusted agent code.
    • Future work includes local-to-cloud agents, shared volumes, and process/network tracing inside the VM.
    Provenance
    Article · Supporting source
  3. LangChain announces Managed Deep Agents

    Thread LangChain — Company account announcing managed deployment for Deep Agents.

    Harness, context, and code execution

    x.com/LangChain/status/2054684227053162957
    Context
    It let the episode frame agent products as managed work surfaces rather than single model calls.
    Key points
    • Managed Deep Agents promises harness, context, and code execution as managed pieces.
    • The pitch is deployment with a single line of code.
    • The thread drew operator questions about execution limits and manageability.
    Provenance
    Thread · Primary source
  4. Use the Claude Agent SDK with your Claude plan

    Source Anthropic — Claude Help Center article explaining Agent SDK monthly credits.

    Starting June 15, 2026

    support.claude.com/en/articles/15036540-use…
    Context
    It gave the episode a hard pricing boundary for unattended agent loops.
    Key points
    • Claude Agent SDK and claude -p usage move out of the normal subscription limits on June 15, 2026.
    • Eligible plans get separate per-user monthly credits from $20 to $200 depending on plan type.
    • After credits are used, requests either move to extra usage at standard API rates or stop if extra usage is disabled.
    Provenance
    Source · Background source
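
    The credit rule in the key points above can be sketched as a small routing function. This is only an illustration of the described behavior, not Anthropic's billing logic: the `CreditState` class, the `route_request` function, and the choice to serve a request from credits whenever any credit remains are assumptions made for the example.

    ```python
    from dataclasses import dataclass


    @dataclass
    class CreditState:
        """Hypothetical per-user state for one billing month."""
        monthly_credit_usd: float          # assumed plan credit, somewhere in the $20-$200 range
        spent_usd: float = 0.0             # credit consumed so far this month
        extra_usage_enabled: bool = False  # whether overflow may be billed at API rates


    def route_request(state: CreditState, cost_usd: float) -> str:
        """Classify one Agent SDK / `claude -p` request.

        Returns "credit" while monthly credit remains, "extra_usage" once the
        credit is exhausted and extra usage is enabled, and "blocked" otherwise.
        Assumption: a request is served from credit if any credit is left,
        even when its cost exceeds the remainder.
        """
        remaining = state.monthly_credit_usd - state.spent_usd
        if remaining > 0:
            state.spent_usd += cost_usd
            return "credit"
        if state.extra_usage_enabled:
            return "extra_usage"
        return "blocked"


    if __name__ == "__main__":
        state = CreditState(monthly_credit_usd=20.0)
        for cost in (15.0, 10.0, 1.0):
            print(cost, route_request(state, cost))
    ```

    For an unattended loop, the practical point is the last branch: with extra usage disabled, the loop stops once the monthly credit is gone instead of slowing down.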
  5. I'm cooked. Anthropic just split --print mode to $/mo credits

    Article raedyohed — ClaudeAI user describing how the Agent SDK credit change affects an autonomous Kanban-agent project.

    all jobs launched using "--print" will get billed

    www.reddit.com/r/ClaudeAI/comments/1tcetsd/…
    Context
    It showed how the policy change hits actual agent-harness builders rather than only billing pages.
    Key points
    • The author built an unattended Claude Code orchestration concept around claude -p.
    • They read the new policy as closing an API-like control path inside a subscription.
    • Community replies debated workarounds and whether the dependency was sustainable.
    Provenance
    Article · Supporting source
  6. Open-source, self-updating wiki for your codebase

    Article ElectronicUnit6303 — ClaudeAI poster announcing Almanac, a local markdown wiki for agent codebase context.

    re-explaining the same codebase context

    www.reddit.com/r/ClaudeAI/comments/1tcjv9b/…
    Context
    It gave the episode a smaller, repo-level memory tool to contrast with trace-driven observability.
    Key points
    • Almanac stores codebase history and agent-conversation context as local markdown.
    • The examples are architectural facts that code alone does not preserve well.
    • The project is open source and Mac-only for now.
    Provenance
    Article · Supporting source
  7. OpenAI says Windows lacked the sandboxing tools Linux already had

    Article Brian Fagioli — Technology journalist summarizing OpenAI's Codex sandboxing writeup.

    Windows forced the company to engineer a custom solution

    nerds.xyz/2026/05/openai-linux-windows-code…
    Context
    It connected managed sandboxes to the operating-system work needed when coding agents run local commands.
    Key points
    • Linux and macOS had sandbox primitives Codex could use, including seccomp, bubblewrap, and Seatbelt.
    • OpenAI rejected several Windows approaches before using dedicated local users, firewall restrictions, restricted tokens, and helper executables.
    • The failed unelevated sandbox could be bypassed by programs ignoring proxy settings or implementing their own networking stack.
    Provenance
    Article · Supporting source
  8. we really all are going to make it, aren't we? 2x3090 setup.

    Article RedShiftedTime — LocalLLaMA poster describing a dual RTX 3090 local coding setup.

    113 tk/s with no nvlink

    www.reddit.com/r/LocalLLaMA/comments/1tcf2d…
    Context
    It gave the local-execution segment an operator example beyond lab hardware.
    Key points
    • The author reports a large speed jump after moving from WSL2 to Ubuntu.
    • They describe Qwen 3.6 27B with a 262K-token context window as useful for coding patches and reviews.
    • The post frames local models as becoming practical for budget setups.
    Provenance
    Article · Supporting source
  9. 24+ tok/s from ~30B MoE models on an old GTX 1080 (8 GB VRAM, 128k context)

    Article mdda — LocalLLaMA poster and blog author testing mixture-of-experts models on secondhand hardware.

    The trick is MoE offloading

    www.reddit.com/r/LocalLLaMA/comments/1tcc7h…
    Context
    It added concrete performance numbers and caveats for local agent loops.
    Key points
    • Qwen 3.6 35B-A3B reaches roughly 24 tokens per second on the tested setup.
    • Gemma 4 26B-A4B with fixed multi-token prediction reaches about 24.5 tokens per second.
    • A commenter cautions that reserving a 128K-token context is not the same as filling the window.
    Provenance
    Article · Supporting source
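
    To put the reported decode speeds in agent-loop terms, the arithmetic below converts tokens per second into wall-clock time for one response. It is a rough sketch: the 2,000-token patch size is a hypothetical figure, prompt processing time is ignored, and the two rates come from different posts running different models, so this is not a like-for-like benchmark.

    ```python
    def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
        """Time to decode `output_tokens` at a steady rate, ignoring prompt processing."""
        if tokens_per_second <= 0:
            raise ValueError("tokens_per_second must be positive")
        return output_tokens / tokens_per_second


    PATCH_TOKENS = 2_000  # hypothetical size of one generated patch

    # Rates as reported in the two LocalLLaMA posts cited in this list.
    REPORTED_RATES = {
        "GTX 1080, MoE offloading": 24.0,
        "2x RTX 3090": 113.0,
    }

    if __name__ == "__main__":
        for label, rate in REPORTED_RATES.items():
            seconds = generation_seconds(PATCH_TOKENS, rate)
            print(f"{label}: {seconds:.1f} s per {PATCH_TOKENS}-token patch")
    ```

    Under these assumptions a single patch takes a bit over a minute at 24 tokens per second and under 20 seconds at 113, a gap that compounds across a multi-step agent run.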
  10. Build Hour: GPT-Realtime-2

    Video OpenAI — OpenAI Build Hour technical session.

    parallel tool calling

    www.youtube.com/watch?v=qGS9Ghnq1RU
    Context
    It let the closing segment connect voice-to-action with the same execution, trace, memory, and cost constraints.
    Key points
    • GPT-Realtime-2 brings GPT-5-class reasoning to voice interfaces.
    • The release includes a larger 128K-token context window and parallel tool calls.
    • Demos included voice-driven e-commerce search and analytics workflows.
    Provenance
    Video · Supporting source