CONSTRUCT

Dispatch 001 · 2026-04-30

Codex Gets an Office, Claude Learns to Disagree, and the Package Import That Steals Your Keys

/ 00:13:56 / 8 sources

“A personal assistant is an integration surface; every integration surface becomes an audit surface the second it can touch Slack, docs, files, and credentials.”

— Lenar Kess, today's narration

Sources

8 cited
  1. OpenAI Codex personal assistant launch thread

    X OpenAI — Model and product developer announcing Codex workflow features.

    With Codex, everyone has a personal assistant.

    x.com/OpenAI/status/2049928779083219105
    Context
    The launch moves Codex from a repo-centered coding assistant toward a multi-app work assistant, which raises connector, permission, and audit questions.
    Key points
    • Codex is framed as an assistant for summaries, planning, drafting, research organization, and project plans.
    • The setup flow recommends plugins by role and connects apps including Slack, Google Workspace, and Microsoft 365.
    • The product surface shows task progress, files and tools used, and next steps.
    Provenance
    Tweet · Primary source
  2. Anthropic guidance and sycophancy study thread

    X Anthropic — AI lab describing analysis used to improve Opus 4.7 and Mythos Preview training.

    We looked at 1M conversations to understand what questions people ask, how Claude responds, and where it slips into sycophancy.

    x.com/AnthropicAI/status/2049927618397614466
    Context
    The study gives a concrete failure shape for agreeable assistants, which becomes more important as agents act inside workflows.
    Key points
    • About 6% of Claude conversations involved people asking for personal guidance.
    • More than 75% of those guidance conversations fell into health and wellness, career, relationships, and personal finance.
    • Anthropic says sycophancy appeared in 9% of guidance conversations and was higher in spirituality and relationship guidance.
    Provenance
    Tweet · Primary source
  3. AI Security Institute GPT-5.5 cyber evaluation thread

    X AI Security Institute — Security evaluation organization reporting frontier-model cyber task results.

    OpenAI’s GPT-5.5 is the second model to complete one of our multi-step cyber-attack simulations end-to-end

    x.com/AISecurityInst/status/204986822774056…
    Context
    The result suggests cyber-capable agent workflows are becoming cheaper to iterate, which shifts pressure toward verification and containment.
    Key points
    • GPT-5.5 achieved about a 71% average success rate on narrow expert-level cyber tasks.
    • The tasks include memory corruption exploitation, cryptographic implementation breaks, and reversing stripped binaries.
    • One harder challenge took a human expert about 12 hours; GPT-5.5 solved it in under 11 minutes at a cost of $1.73.
    Provenance
    Tweet · Primary source
  4. Epoch AI estimate of AI compute smuggling to China

    X Epoch AI — Research group publishing compute and AI forecasting analysis.

    We estimate between 290k and 1.6M H100-equivalents by the end of 2025

    x.com/EpochAIResearch/status/20499247851536…
    Context
    The estimate pushes the policy conversation from model access toward compute distribution, cluster control, and enforcement limits.
    Key points
    • Epoch estimates 290,000 to 1.6 million H100-equivalents smuggled to China by the end of 2025.
    • They frame that as about 20% to 60% of China's total AI compute.
    • They identify the range as a 90% confidence interval and stress uncertainty around undetected smuggling and chip diversion.
    Provenance
    Tweet · Primary source
  5. Ethan Mollick on regulating open-source models

    X Ethan Mollick — Academic and AI commentator discussing policy implications of model deployment patterns.

    It is not as easy to imagine how you regulate open-source models that can be served by a range of decentralized players.

    x.com/emollick/status/2049880544477913271
    Context
    It gives the episode a policy bridge between compute controls and the practical deployment of open models.
    Key points
    • Mollick contrasts regulation of a few closed-source providers with decentralized serving of open-source models.
    • Replies raised questions about data-center regulation, consumer hardware, and centralized access points for open models.
    • The thread frames open-model policy as an upcoming practical enforcement problem.
    Provenance
    Tweet · Primary source
  6. [Open Source] We built a local code search MCP for Claude Code that uses ~98% fewer tokens than grep+read

    Source Pringled101 — Developer posting a local code-search MCP server for Claude Code.

    On average it uses 98% fewer tokens than grep+read, while indexing any repo in ~250ms and answering queries in ~1.5ms, all on CPU.

    www.reddit.com/r/ClaudeAI/comments/1szvo7t/…
    Context
    The item shows the harness layer maturing around retrieval quality, token discipline, and local tooling.
    Key points
    • Semble is a local MCP server for code search in Claude Code.
    • The authors claim 98% fewer tokens than grep plus read, about 250 ms indexing, and about 1.5 ms query time on CPU.
    • It combines static embeddings, BM25, and a code-optimized reranking stack.
    Provenance
    Source · Background source
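The source above names BM25 as one of three retrieval components. As a rough illustration of that one piece only, here is a minimal BM25 scorer in Python. It is not Semble's implementation: the tokenizer, the function names, and the parameter values `k1=1.5` and `b=0.75` are assumptions chosen for the sketch, and the embedding and reranking stages are left out entirely.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Hypothetical tokenizer: lowercase alphanumeric runs, so snake_case splits into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Score each document against the query with the standard BM25 formula."""
    docs = [tokenize(d) for d in documents]
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for d in docs:
        doc_freq.update(set(d))

    query_terms = tokenize(query)
    scores = []
    for d in docs:
        term_freq = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in term_freq:
                continue
            df = doc_freq[term]
            # Smoothed inverse document frequency (always positive).
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            # Term frequency saturation with document length normalization.
            tf = term_freq[term]
            norm = tf + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


if __name__ == "__main__":
    corpus = [
        "def parse_config(path): return load(path)",
        "class Server: pass",
        "parse the config file and load defaults",
    ]
    print(bm25_scores("server", corpus))
```

Lexical scoring like this only rewards exact term overlap, which is presumably why the tool pairs it with embeddings and a reranker.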
  7. Shai-Hulud Themed Malware Found in the PyTorch Lightning AI Training Library

    Article Isaac Evans — Semgrep security researcher publishing a supply-chain compromise analysis.

    The PyPI package lightning was compromised in versions 2.6.2 and 2.6.3

    semgrep.dev/blog/2026/malicious-dependency-…
    Context
    The attack demonstrates that agent and editor automation paths are becoming part of the software supply-chain attack surface.
    Key points
    • The compromised lightning package executes on import and targets credentials, tokens, cloud secrets, and GitHub repositories.
    • The malware can spread from PyPI into npm package publishing workflows.
    • Semgrep reports persistence through Claude Code SessionStart hooks and VS Code folder-open tasks.
    Provenance
    Article · Supporting source
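Because the source above says the compromised package executes on import, a version check should avoid importing it. The sketch below reads the installed version from package metadata with the standard library's `importlib.metadata` instead. The function names are hypothetical, the version set comes only from the cited text (2.6.2 and 2.6.3), and a non-matching version is not proof that an environment is clean.

```python
from importlib import metadata
from typing import Optional

# Versions named as compromised in the cited Semgrep article.
COMPROMISED_VERSIONS = {"2.6.2", "2.6.3"}


def is_compromised(version: str) -> bool:
    """Return True if the version string matches a reported compromised release."""
    return version.strip() in COMPROMISED_VERSIONS


def installed_lightning_version() -> Optional[str]:
    """Read the installed version from package metadata, without importing lightning."""
    try:
        return metadata.version("lightning")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_lightning_version()
    if version is None:
        print("lightning is not installed in this environment")
    elif is_compromised(version):
        print(f"lightning {version} matches a reported compromised release")
    else:
        print(f"lightning {version} is not one of the reported compromised releases")
```

The check covers only the active Python environment; other virtual environments, containers, and lockfiles would each need their own pass.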
  8. DeepAgents deploy configuration thread

    X Harrison Chase — LangChain founder describing DeepAgents deployment configuration.

    deepagents.toml is the file that configures it. It has four sections: agent, sandbox, auth, frontend.

    x.com/hwchase17/status/2049858892637892739/…
    Context
    It adds a deployment-harness example to the episode's control-surface theme.
    Key points
    • DeepAgents deploy is described as a configuration-driven way to deploy an agent harness to the cloud.
    • The packet identifies four configuration sections: agent, sandbox, auth, and frontend.
    • An attempt to re-fetch the thread during drafting hit a quota limit, so the script relies on the packet summary as its only available source.
    Provenance
    Tweet · Primary source