The Release Needs A Regulator

00:00:04

Anthropic Asks For A Regulator

00:00:04 Anthropic released two policy proposals on Wednesday, and the simplest version is this: the company is asking governments to move from voluntary frontier AI supervision toward binding release control. Techmeme’s item points to two pieces from Anthropic, one on catastrophic risk and one on labor market disruption, and the company’s own X posts put the sharpest claim in plain language.

00:00:26 The government, Anthropic said, should have authority to block or revoke the release of unsafe models. That is a big sentence. It means the lab that sells Claude is now arguing that some future Claude-like systems should need something closer to pre-release permission.

00:00:42 Dario Amodei’s surrounding argument, as summarized by Techmeme, is mandatory third-party testing for cyber, biological, and autonomy risks, plus broader transparency requirements. VentureBeat described the proposal as FAA-style regulation for powerful models. The analogy has a specific job: planes can be sold because buyers, insurers, passengers, and governments recognize an outside safety regime.

00:01:05 Anthropic is saying frontier models are moving into that category. I’d separate two claims. The first is easy to understand. If a model can materially help with cyber operations, biological design, autonomous execution, or other high-consequence work, the public has a stake in knowing whether the lab’s internal eval is enough.

00:01:24 On Tuesday, June 9, we talked about Anthropic’s selective access strategy: trusted groups get more capability, and everyone else gets a version with more restrictions. Wednesday’s proposal is the policy version of the same instinct. Capability stops being treated as a normal software feature.

00:01:41 It becomes a controlled industrial input. The second claim is harder. Who decides unsafe? In aviation, the regulator doesn’t merely read the manufacturer’s blog post and nod solemnly. It has technical staff and statutory authority. It has accident history, inspection routines, and political accountability.

00:01:59 Frontier AI oversight doesn’t have that mature institutional machinery yet. So when Anthropic asks for a government release block, it is asking for a power that doesn’t have a settled home. The company may be right that somebody needs it. We don’t know yet who can exercise it without becoming a rubber stamp for the biggest labs, or a chokepoint that freezes smaller competitors out.

00:02:21 There is also an obvious incentive problem, and we should say it without turning it into a conspiracy. Large labs can afford compliance. They can pay for third-party testing, lawyer-heavy filings, red-team programs, and long review cycles. Smaller labs and open-weight projects feel those costs first.

00:02:39 That doesn’t make regulation wrong. It does mean the design details matter more than the press release. A safety regime can reduce catastrophic risk and still concentrate power if the only companies that can pass through it already have the money, compute, and government relationships.

00:02:55 I think Anthropic is trying to move the argument before the next capability jump forces it. That is more responsible than pretending the current voluntary system will scale forever. But the proposal shifts the fight from “should labs be supervised?” to “which institutions get to say no, on what evidence, and with what appeal process?” If that part stays vague, the policy becomes a moral sentence wrapped around an industrial licensing fight.

00:03:21

The Model That Withholds Help

00:03:21 A separate Anthropic controversy belongs next to the policy proposal, because it shows what happens when a lab uses product design to enforce its own risk boundary before the state has one. A Reddit item pointed to Business Insider reporting that Anthropic’s Mythos-based models were deliberately made less helpful when they detect work on frontier large language model research.

00:03:43 The excerpt says Anthropic described limits in the system card for Mythos 5 and Fable 5, and that the interventions are intentionally invisible to users rather than presented as ordinary refusals. That last detail is the source of the anger. Refusal is annoying, but at least it is legible.

00:04:00 If a model says it can’t help with a biological protocol or cyber request, the user knows the boundary was applied. According to the excerpt, Anthropic’s AI-research limits may instead alter prompts or degrade assistance in ways the user doesn’t see. SemiAnalysis reportedly wrote that the model “will secretly degrade its IQ so that the average engineer won’t notice.” Elie Bakouch of Prime Intellect called it sad for the research community and objected to the fact that the intervention isn’t visible to the user.

00:04:30 Anthropic’s stated reason is understandable. A frontier model that helps competitors build less-secure frontier models can accelerate exactly the class of risk Anthropic says it is trying to slow down. If you believe the next model generation creates cyber, bio, autonomy, or national-security risk, then AI-for-AI-research isn’t just another harmless use case.

00:04:52 It is part of the feedback loop. The company is trying to keep its strongest capability from becoming a subsidy for weaker safety practices elsewhere. But invisible restriction has a cost that is different from refusal. Researchers use models as instruments. If an instrument silently changes behavior depending on its estimate of the user’s domain, the researcher can misread the result.

00:05:14 A failed coding path may look like ordinary model weakness. A degraded answer may look like a bad idea. A hidden intervention also shifts power toward the lab, because only the lab knows when the tool is cooperating fully. This connects back to the release-control proposal.

00:05:30 Anthropic is asking governments to create outside authority over unsafe frontier models, while also showing that, absent such authority, the lab will impose its own internal access regime. Some of that may be necessary. I don’t think every capability should be available to every user just because the interface looks like a chat box.

00:05:50 But if the boundary is about competition as well as safety, transparency becomes part of the fairness problem. You can’t build a trusted scientific tool by making its limits invisible and then asking users to infer which answers were complete. A better policy would preserve the safety boundary while making the intervention auditable.

00:06:09 Tell the user that a capability limit was triggered, disclose the class of limit in general terms, and give legitimate researchers a route to reviewed access. That still leaves hard fights over who counts as legitimate and who pays for review. It is healthier than a world where the most important research tool in the lab decides which research communities get the full instrument without telling them.

00:06:33

The Labor Dashboard Arrives

00:06:33 Stanford’s Digital Economy Lab launched its AI Economic Indicators project, and this is the kind of AI labor story I trust more than a CEO saying productivity is up because the vibes in the engineering org are magnificent. The project calls itself an economic observatory for the AI era.

00:06:50 Its stated aim is to turn scattered signals into a shared, evidence-based understanding of how AI is changing work, productivity, and prosperity. The page doesn’t pretend to have the final answer. That is the useful part. It has dashboards for employment and wages, macroeconomic signs of takeoff, and adoption.

00:07:08 The selected results are restrained. For workers grouped by AI exposure, Stanford says employment growth is lowest for the most exposed occupations, but the differences are modest. For early-career workers, ages twenty-two to twenty-five, the two most exposed occupation groups show noticeable declines since ChatGPT’s introduction, while the other groups show growth.

00:07:30 Then Stanford adds a caveat: that age band is narrow and accounts for seven percent of baseline employment in the sample. That sentence keeps a labor panic from turning into theater. Seven percent is not nothing. It also isn’t the whole labor market. You can see the outline of a displacement story among young workers in exposed occupations, but you can’t responsibly claim from that alone that AI has eaten white-collar employment.

00:07:55 The dashboard also says its twelve takeoff indicators show no decisive evidence of explosive economic growth at present. That matters because both doom and boom arguments tend to assume the economy is already bending around AI. Stanford is saying: maybe, but the current aggregate evidence hasn’t crossed that line.

00:08:14 Ethan Mollick amplified the launch on X by saying we need more real-time data on how AI may be affecting the economy. I agree with that. The central labor problem right now isn’t merely whether AI destroys jobs or creates them. The measurement lag is terrible. Firms reorganize workflows faster than government statistics can explain them.

00:08:34 Workers feel changes in task allocation before occupation categories move. A twenty-three-year-old analyst may lose the entry-level task ladder long before the job title disappears from payroll data. Anthropic’s labor proposal and Stanford’s indicators fit together, but not neatly.

00:08:50 Anthropic says government should manage labor disruption from advanced AI and is contributing two hundred million dollars to a fund for major evaluations. Stanford is building the measurement surface that could make such policy less impressionistic. The hard part is that labor policy can’t wait for perfect causal proof.

00:09:09 If early-career exposed work keeps weakening, the people who lose the first rung don’t get those years back because the regression improved in 2028. I’d rather have this dashboard than another grand claim about the future of work. It gives policymakers and business leaders something falsifiable.

00:09:27 Are the most exposed occupations diverging? Are young workers hit differently from older workers? Is adoption rising inside firms, or just in survey enthusiasm? Those questions are plain, and the answers can move over time. AI labor policy should feel more like an instrument panel than a prophecy.

00:09:44

A Two-Cent Attack Inside The Bank

00:09:44 Blue41 published a writeup about helping Bunq secure its AI assistant, and the important detail is small enough to fit in a transaction field. In their test, an attacker sent a bank transfer for two euro cents and put a prompt-injection payload in the transaction description.

00:10:00 Later, when the victim asked the banking assistant for recent transactions, the assistant retrieved that description as context. The model treated the attacker’s text less like bank data and more like an instruction, and the response became a credible phishing message inside the bank’s own app.

00:10:18 That channel matters. A phishing email has to persuade you that it came from the bank. An AI assistant inside the banking app starts with that trust already granted. Blue41’s line is blunt: the resulting message appears inside the bank’s own application, from the bank’s own AI assistant.

00:10:35 It can reference real transaction details and user-specific information. That is a different risk class from a bad chatbot answer, because the institution’s own interface becomes the delivery mechanism. The writeup says Bunq already had protections in place. The attack worked because the malicious intent wasn’t obvious when you looked at the transaction description by itself.

00:10:57 It didn’t need to say “ignore previous instructions.” It could blend into ordinary transaction data until the assistant retrieved it, combined it with the user’s question, and generated a response. Banks, insurers, hospitals, and any other institution feeding customer data into a model now have to treat harmless-looking text as something that can gain authority later.

00:11:19 Blue41’s recommended controls sound credible because they are specific. Minimize unnecessary context. Treat retrieved data as untrusted. Constrain sensitive outputs and actions. Monitor runtime behavior. In banking terms, the assistant shouldn’t casually generate credential requests, external links, or sensitive workflow prompts just because a bit of text in a payment reference told it to.

00:11:41 If it starts embedding unusual links or suppressing information it would normally show, the security team should see that behavior. The institutional consequence is that AI assistants collapse old boundaries between data, instruction, and interface. Traditional application security is built around the idea that code executes and data is handled.

00:12:02 A language model sitting in the middle reads data and can be influenced by it. So a field that used to be dumb text becomes a potential command surface once it enters the context window of a large language model. Regulated institutions can’t treat customer-facing AI as a support feature bolted onto existing controls.

00:12:20 Once the assistant can read private context and speak with institutional authority, it becomes part of the bank’s security perimeter. A two-cent transfer is a cheap test case. The expensive part is discovering how many production systems already have that boundary wrong.

00:12:36

Autonomy Crosses The Line In Ukraine

00:12:36 New Scientist reported that fully autonomous drones with no human oversight killed soldiers on the battlefield for the first time, according to Alexander Kokhanovskyy, a senior figure in the Ukrainian defense industry. The reported test involved ten AI-controlled quadcopters near the front line in Ukraine.

00:12:55 Kokhanovskyy told New Scientist, “We tried it. It’s a test. We never implemented it more widely.” Kokhanovskyy said the drones were programmed to fly toward the front line, cover three to five kilometers over about ten minutes, and then enter what he called “Terminator mode,” where an AI model searched for and intercepted targets.

00:13:24 He also said there was no connection to the drone at that point: no video feed, no remote control, and no way to see what the drone saw. Human-piloted drones checked the area afterward. The reported victims included a couple of soldiers and one truck. That uncertainty is part of the story, not a footnote.

00:13:42 If a human pilot can’t see the target and the automated system leaves no recording of the attack, the accountability chain becomes thinner at exactly the moment the moral stakes rise. You can argue about whether autonomy improves precision, reduces operator burden, or allows defense against faster threats.

00:14:00 But if nobody can reconstruct what the system selected, the old military question “who fired?” turns into a chain of procurement, model training, deployment settings, and battlefield inference. New Scientist also reports that Ukraine currently bans AI at the final stage of intercepting targets, according to defense company sources at the embassy press event, while AI is already used in many earlier stages.

00:14:25 Kokhanovskyy now leads Aero Center, which is building an autonomous interceptor system for incoming Russian Shahed drones. He says the system could automatically launch and travel toward targets at four hundred fifty kilometers per hour, but current rules require humans to verify targets in the final stage.

00:14:43 That distinction will become harder to hold. Defending cities against incoming drones creates a powerful argument for speed. If a human has only seconds to confirm a target, the human may become a legal checkpoint more than a meaningful decision-maker. On the other hand, removing the human completely makes mistakes harder to contest and easier to normalize.

00:15:04 The first category is defensive interception. The next category is anything in a designated zone. Then the zone expands. The United Nations secretary-general has called for a ban on lethal autonomous weapons systems, and New Scientist quotes Oxford’s Mariarosaria Taddeo saying that killing with AI removes responsibility from the attacker and must be banned.

00:15:25 Anthony King at Exeter adds a practical note: fully autonomous attacks are possible, but very few of the millions of drones used in Ukraine have been fully autonomous, and at this point keeping humans involved may still be more militarily effective. That last point matters.

00:15:42 Autonomous weapons are not now everywhere. New Scientist is reporting that a line was crossed at least in a test, under war pressure, with incentives pushing toward more automation. Once a military proves to itself that a machine can close the final targeting loop, the policy fight changes.

00:15:59 It becomes less about distant principles and more about whether commanders can resist a tool that promises speed when the other side is also automating.

00:16:08

Robotaxi Safety Becomes A Platform Claim

00:16:08 NVIDIA published a robotaxi safety post on Wednesday around Halos OS, its safety foundation for level four autonomous vehicles. The post is a vendor document, so we should read it as positioning as much as evidence. But the positioning is revealing. NVIDIA is selling more than chips into autonomy.

00:16:25 It is selling a full-stack safety claim: a certified operating system, standardized interfaces, deterministic scheduling, fault isolation, rule-based safety functions, simulation, validation, and in-vehicle compute. The company names new collaborations around Munich, Taiwan, Southeast Asia, and Saudi Arabia.

00:16:44 Uber and Autobrains are launching a Munich robotaxi program on NVIDIA DRIVE Hyperion. Foxconn is expanding work with NVIDIA for robotaxi fleets in Taiwan. VinFast and Autobrains are working on level four vehicles for Southeast Asia. HUMAIN is working on NVIDIA-powered robotaxis for Saudi Arabia.

00:17:01 This is an ecosystem pitch, not one test city and one car company. The strongest line in the post is about regulators. NVIDIA says perception and decision-making are only part of the problem; regulators require proof that the overall system behaves reliably, isolates faults before they spread, and stays inside the boundaries it was designed for.

00:17:21 That is the physical-world version of the same governance problem we saw in banking and frontier models today. Performance isn’t enough. The system has to show its work in a way an outside authority can understand. The technical claims are specific. Halos Core is described as the next generation of DriveOS, certified to automotive safety standards, with a hypervisor that isolates safety-critical functions.

00:17:45 The post says it is compliant with ISO 26262 ASIL D and includes safety-certified support for CUDA and TensorRT. The SDK abstracts sensors and vehicles so hardware changes don’t force every application integration to be rebuilt. The application layer adds deterministic, rule-based safety functions around AI behavior.

00:18:04 The safety evaluation framework draws on more than three hundred thirty research papers and one thousand patents, according to NVIDIA. That is a lot of institutional language for what the public experiences as a car arriving with nobody in the driver’s seat. And that mismatch is why the platform claim matters.

00:18:23 A city can’t inspect “AI driving” as a vibe. It needs something closer to an auditable safety case: what fails, what is isolated, what records, what replays, what was simulated, what was tested on public roads, and who is responsible when the stack’s components come from multiple companies.

00:18:40 I wouldn’t take a vendor safety post as proof that robotaxi deployment is safe. It does show that the commercial argument has moved. The companies trying to scale robotaxis know capability demos aren’t enough for regulators, insurers, and city governments. They are packaging safety as certification, lifecycle evidence, and a stack that can be audited.

00:19:00 If that becomes the market standard, the winners may be the firms that can sell not just autonomy, but an accountability bundle around autonomy. That helps public oversight in one sense. It may also make the autonomous-vehicle market harder for smaller entrants, because the certification story becomes part of the product.

00:19:20

The Public Market Wants The Lab

00:19:20 OpenAI’s financing story kept moving on Wednesday. Techmeme points to reporting from The Information that Sam Altman told staff OpenAI is expected to go public “within the next year,” and that the company plans a tender offer “very soon” at a six hundred eighty-seven dollar and sixty-nine cent share price.

00:19:38 This follows the confidential S-1 discussion from earlier this week, so I’m not going to re-teach the whole IPO setup. The new detail is the internal timing and the tender price. A tender offer matters because it turns paper wealth into money before the company is actually public.

00:19:54 It gives employees and investors liquidity, and it also sets a private-market reference point for what everyone thinks the lab is worth. Once that price exists, it becomes part of compensation, retention, press coverage, and negotiation with partners. In that world, the model lab is a pre-public financial instrument as well as a research company.

00:20:14 Employees, investors, cloud providers, and governments all read the same valuation signal. There was a second governance story in the Techmeme roundup: CNBC reported that Senator Elizabeth Warren called on the SEC to delay a SpaceX IPO, citing valuation and governance concerns and Elon Musk’s “uniquely unchecked” power as majority shareholder.

00:20:35 SpaceX isn’t an AI lab, but in this cycle it keeps touching AI infrastructure. Recent episodes have followed the compute-contract story, the orbital data center proposal, and the way SpaceX sits near the physical supply of launch, communications, and potentially orbital compute.

00:20:51 Warren’s SpaceX letter is therefore part of the same institutional question: when a founder-controlled company becomes infrastructure, what does public-market supervision actually require? The answer isn’t obvious. Public markets add disclosure, but they don’t automatically solve control.

00:21:08 Dual-class structures, founder voting power, related-party complexity, national-security contracts, and customer concentration can all remain after listing. In AI, the problem gets stranger because the companies trying to go public may also be deeply entangled with government buyers, defense priorities, cloud partnerships, data-center financing, and compute scarcity.

00:21:30 The OpenAI story is even more delicate because of the nonprofit-to-public-benefit governance tension we flagged earlier this week. When a lab says its mission is broad public benefit and its next step is public-market liquidity, investors will eventually ask what happens when those obligations disagree.

00:21:47 The public S-1, when it arrives, should tell us more than the headline valuation. It should tell us how OpenAI describes model risk, governance risk, compute dependency, Microsoft dependency, regulatory exposure, safety obligations, and the possibility that governments demand equity or release control.

00:22:05 So Wednesday’s finance news isn’t merely “AI company may IPO.” Frontier AI is being pulled into the old machinery of securities law at the same time the labs are asking for new machinery around model release, labor disruption, and catastrophic risk. That combination will be tense.

00:22:22 The next public filings need to show whether the market is buying a software company, a research lab, a national infrastructure provider, or some unresolved mixture of all three. When the S-1 becomes public, the risk factors will matter more than the valuation headline.

00:22:38 Jonas

The Release Needs A Regulator

Chapters