An AI Engineer talk walks through Google's WebMCP proposal, which lets a website declare its own controls as explicit tools an agent can call from inside the page. It's a different model from server-side MCP, and it raises immediate questions for anyone building agent-facing web apps. Treat it as a standards watch item: the demo runs in Chrome Canary behind preview flags, not in shipping browsers.
◆ Braid Daily · 2026-06-12
Google's WebMCP turns websites into agent tool surfaces
A standards-track proposal lets a page declare its own buttons and forms as tools an agent can call.
The lead
Open coding models
Kimi K2.7-Code ships as an open coding model
Hugging Face
Moonshot's new open coding model leads with token-efficiency claims and comparisons to paid options like Claude. The benchmark numbers come from launch and community material, so read them as vendor claims until someone reproduces them.
Kimi K2.7-Code release thread, with a benchmark figure
The community sighting puts a specific number on the release, a claimed +31.5% on MLS Bench Lite. Same caveat applies: it's a launch claim, not an independent result.
Huawei previews openPangu 2.0, open weights on June 30
Huawei put up a large sparse model with a 512K context window and said the weights open at the end of the month. For now it's a preview with metrics, not a download.
The China stack, chip to robot
Nvidia's China-facing CPU availability
Techmeme
A market-and-supply-chain read on what hardware is said to be available to China. Separate the reported availability from confirmed shipments before drawing a trend line.
China's position in the robot supply chain
Techmeme
The physical-world side of the same story: who supplies the parts that go into robots. A separate signal from the chip and model items, not one grand strategy.
When the audit catches up to the AI user
A judge removes lawyers from a case over AI-hallucinated citations
Forbes
A concrete professional consequence: fabricated citations got counsel pulled off the case. The clearest sign yet that high-stakes fields are setting standards faster than the tools can verify their own output.
An auditable architecture for verifying biomedical manuscripts
arXiv
A research direction pointed at the same gap from the other side: build verification gates into large language model output for high-stakes writing. A preprint signal, not a fix for the courtroom problem.
Compliance-by-construction for agents in regulated industries
arXiv
One preprint makes the case that finance and health agents should be built so compliance is structural rather than checked after the fact. Research signal for teams in regulated environments.
Agent-deployment research worth a skim
Structural safety gaps in major agent frameworks
arXiv
A preprint arguing that frameworks like LangChain and AutoGPT don't constitute a safety case on their own. Mark it as a research signal, not operational consensus.
A certified defense against memory poisoning in persistent agents
arXiv
Targets a real failure path for agents that keep state: poisoned memory. Proposes a certified defense the authors call SMSR.
ToolSense: diagnosing what an LLM actually knows about its tools
arXiv
An open framework for probing what a model knows about its tools past standard benchmarks, aimed at the reliability gaps that show up once an agent has to use real ones.
Supply-chain security, outside the model stack
About 400 AUR packages compromised with an infostealer and rootkit
Hacker News
A reminder that the dev-tooling supply chain takes hits that have nothing to do with AI. Relevant if you maintain Arch machines or pull from community package repos.
A reported PeopleSoft flaw and associated breach claims
Techmeme
The enterprise-application side of the same week: a reported vulnerability with breach claims attached. Separate problem from the AUR packaging issue, same maintenance burden.
Companion episode
When the Website Starts Offering Tools
Two weeks of model releases have come with an asterisk attached: most benchmark numbers arrive from the labs that trained the models. Kimi K2.7-Code and Huawei's preview both lead with figures we can't yet reproduce. The WebMCP thread points the other way, at how agents reach systems in the first place, which is where the next round of arguments will land.
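For a concrete picture of the WebMCP idea, here is a rough TypeScript sketch of a page declaring one of its controls as a named tool that an agent can list and call. Every name in it (`PageToolRegistry`, `declare`, `list`, `call`, `search_catalog`) is hypothetical and chosen for illustration. It is not the WebMCP API, and the real proposal may look quite different.

```typescript
// Hypothetical sketch only: none of these names come from the WebMCP proposal.
type ToolArgs = Record<string, string>;

interface PageTool {
  name: string;
  description: string;
  required: string[]; // argument names the agent must supply
  run: (args: ToolArgs) => string;
}

class PageToolRegistry {
  private tools: Record<string, PageTool> = {};

  // The page declares one of its controls as a named tool.
  declare(tool: PageTool): void {
    if (Object.prototype.hasOwnProperty.call(this.tools, tool.name)) {
      throw new Error(`tool already declared: ${tool.name}`);
    }
    this.tools[tool.name] = tool;
  }

  // What an agent would see: names and descriptions, not DOM nodes.
  list(): { name: string; description: string; required: string[] }[] {
    return Object.keys(this.tools).map((key) => {
      const { name, description, required } = this.tools[key];
      return { name, description, required };
    });
  }

  // The agent calls a tool by name; missing arguments are rejected up front.
  call(name: string, args: ToolArgs): string {
    if (!Object.prototype.hasOwnProperty.call(this.tools, name)) {
      throw new Error(`unknown tool: ${name}`);
    }
    const tool = this.tools[name];
    const missing = tool.required.filter((key) => !(key in args));
    if (missing.length > 0) {
      throw new Error(`missing arguments: ${missing.join(", ")}`);
    }
    return tool.run(args);
  }
}

// Example: a store page exposes its search box as a tool.
const registry = new PageToolRegistry();
registry.declare({
  name: "search_catalog",
  description: "Search the store catalog by keyword",
  required: ["query"],
  run: (args) => `results for "${args.query}"`,
});

console.log(registry.list().map((tool) => tool.name));
console.log(registry.call("search_catalog", { query: "usb-c cable" }));
```

The design point is that the agent works from a declared list of names, descriptions, and required arguments instead of guessing at buttons and form fields, and that the page decides what each call actually does.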