◆ Dispatch 028 · 2026-05-20
Gemini pricing jumps, Active Graph, and the collective intelligence argument
“I think this anthropomorphizing of intelligence and understanding all that is not necessary, not appropriate, and is a distraction for many, many problems.”
— Seln Oriax, today's narration
Today we look at Gemini 3.5 Flash's steep pricing shift, Karpathy's move to Anthropic, Yohei Nakajima's Active Graph, NVIDIA's SANA-WM, and Michael I. Jordan's critique of AGI framing.
Chapters
- 00:00:04 Segment 1: Gemini 3.5 Flash & The Pricing Wall
- 00:01:33 Segment 2: Karpathy to Anthropic & The Tooling Axis
- 00:02:50 Segment 3: Active Graph
- 00:04:13 Segment 4: SANA-WM & The HRM Paper
- 00:05:35 Segment 5: Prof. Michael I. Jordan on Collective Intelligence
Sources
5 cited
1
Gemini 3.5 Flash costs 3 times more than the previous version and 30x more than gemini 1.5 flash.
Article GodEmperor23
Community report highlighting the drastic pricing jump in Google's new flash model.
www.reddit.com/r/singularity/comments/1thuc…
- Context
- Signals a massive shift in Google's pricing architecture, pushing flash-tier inference costs closer to flagship territory.
- Key points
- Priced at roughly 3x the previous flash version
- Approximately 30x more expensive than Gemini 1.5 Flash
- Pricing is similar to GLM, Kimi, and DeepSeek Pro
- Provenance
- Article · Supporting source
2
Andrej Karpathy joins Anthropic
Article filipo11121
Report on Andrej Karpathy's transition from OpenAI to Anthropic.
www.reddit.com/r/Anthropic/comments/1thszod…
- Context
- Personnel moves between major model labs often precede shifts in how open-weight tooling and enterprise strategy are structured.
- Key points
- Karpathy is an OpenAI founding member, known for open GPT-2 reimplementations such as nanoGPT
- Marks a realignment in the open-source and frontier tooling ecosystem
- Signals a convergence in how we think about agent harnesses and local quantization
- Provenance
- Article · Supporting source
3
Active Graph: an event-sourced reactive graph runtime for long-running agents
X yoheinakajima
Yohei Nakajima open-sources Active Graph, moving away from standard workflows and DAGs.
x.com/yoheinakajima/status/2057099245430222…
- Context
- Attempts to solve the orchestration bloat of long-running agents by treating the event log as the product and removing the central orchestrator.
- Key points
- Graph represents agent knowledge, history, and behaviors
- Events project the graph; reactive behaviors react to state changes
- Uses a fork-and-diff pattern for agent runs to show exactly which event changed a node
- Provenance
- Tweet · Primary source
4
NVIDIA's SANA-WM: a camera-conditioned world model that fits on one GPU
X victormustar
Open-sourced camera-conditioned world model generating 60s of 720p in 34s on an RTX 5090.
x.com/victormustar/status/20570058209576059…
- Context
- Suggests that the gap between local inference and complex video prediction is shrinking fast, reducing dependency on distributed inference clusters.
- Key points
- 2.6 billion parameters
- Apache 2.0 license
- Runs entirely on a single consumer GPU
- Provenance
- Tweet · Primary source
5
Intelligence is collective, not artificial — Prof. Michael I. Jordan (UC Berkeley / Inria)
Video Machine Learning Street Talk
Michael I. Jordan argues that contemporary AI discourse is distorted by anthropomorphism and PR-driven hype.
www.youtube.com/watch?v=AREWYbVtX64
- Context
- A grounded critique of the current industry narrative from one of the most influential computer scientists, focusing on economic and social reality rather than model weights.
- Key points
- Calls 'AGI' a distortionary PR term that demoralizes young engineers
- Traces modern ML back to statistics and operations research, not the 1950s AI definition
- Flags the economic gap in extracting data without compensating originators
- Provenance
- Video · Supporting source
Segment 1: Gemini 3.5 Flash & The Pricing Wall
00:00:04 Google shipped Gemini 3.5 Flash yesterday during the I/O keynote. On the surface, it looks like another standard incremental update: fast, a one-million-token context window, available in OpenCode right now. But the pricing tells a different story. Community reports on the r/singularity subreddit show it's priced at roughly three times the previous version, and about thirty times more than Gemini 1.5 Flash.
00:00:32 That's a massive step up in the cost curve for a model positioned as the fast, cheap workhorse. It's bleeding into flagship territory on cost, even if it hasn't crossed it on capability. As OpenCode noted, the pricing is reportedly similar to GLM, Kimi, and DeepSeek Pro.
00:00:50 Google appears to be drawing a hard line between its flash-tier models and true flagship reasoning layers. Jerry Liu at LlamaIndex just highlighted that they plan to integrate even more heavily with both the model layer and the document infrastructure for agents.
00:01:08 When a flash-tier model costs as much as a flagship, the token economics for RAG pipelines and long-running agent loops stop being forgiving. The model is capable, but the infrastructure budget is now a primary constraint rather than an afterthought. Anyone running heavy inference will have to optimize their context windows or accept a steep margin squeeze.
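To make the margin squeeze concrete, here is a rough back-of-the-envelope sketch. Only the 3x and 30x multipliers come from the community report; the baseline price, request volume, and context sizes are hypothetical placeholders, not published rates.

```python
from dataclasses import dataclass

# Relative price multipliers taken from the community report quoted above.
# Everything else in this sketch (baseline price, token volumes) is hypothetical.
MULT_VS_PREVIOUS_FLASH = 3.0
MULT_VS_GEMINI_15_FLASH = 30.0


@dataclass(frozen=True)
class Workload:
    requests_per_day: int
    input_tokens_per_request: int


def monthly_input_cost(workload: Workload, usd_per_million_tokens: float, days: int = 30) -> float:
    """Input-token spend for a month at a flat per-million-token price."""
    tokens = workload.requests_per_day * workload.input_tokens_per_request * days
    return tokens / 1_000_000 * usd_per_million_tokens


# HYPOTHETICAL placeholder price for the oldest tier, not a published rate.
baseline_price = 0.10
new_price = baseline_price * MULT_VS_GEMINI_15_FLASH

# HYPOTHETICAL RAG workload, before and after trimming retrieved context.
rag = Workload(requests_per_day=10_000, input_tokens_per_request=20_000)
trimmed = Workload(requests_per_day=10_000, input_tokens_per_request=5_000)

old_cost = monthly_input_cost(rag, baseline_price)
new_cost = monthly_input_cost(rag, new_price)
trimmed_cost = monthly_input_cost(trimmed, new_price)

# The two reported multipliers together imply the previous Flash was about
# 10x the Gemini 1.5 Flash price.
implied_prev_vs_15 = MULT_VS_GEMINI_15_FLASH / MULT_VS_PREVIOUS_FLASH

print(f"old tier: ${old_cost:,.0f}/mo, new tier: ${new_cost:,.0f}/mo, "
      f"new tier with trimmed context: ${trimmed_cost:,.0f}/mo")
```

Under these assumptions the bill scales linearly with both the price multiplier and the context size, which is why trimming retrieved context is the first lever to pull.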
Segment 2: Karpathy to Anthropic & The Tooling Axis
00:01:33 A major personnel shift is reshaping the tooling axis: Andrej Karpathy has joined Anthropic. Karpathy is a founding member of OpenAI, known for open GPT-2 reimplementations such as nanoGPT, and a major figure in the open-source tooling ecosystem. Moving from OpenAI's orbit to Anthropic is a realignment in how the two companies position their research and tooling teams.
00:01:58 It doesn't change the underlying architecture of either model today, but it shifts the people thinking about the integration of open-source tooling with frontier inference. Enterprise strategy is already shifting toward harness-first platforms and agent orchestration, capturing long-running context and persistent memories to rival the model labs directly.
00:02:23 When a founding figure like Karpathy moves between the two major frontier labs, it usually signals a convergence in how we're thinking about agent harnesses, local quantization, and the divide between the model and the scaffolding around it. This marks a realignment of the tooling axis, moving the focus from raw model weights to the systems that actually keep them running in production.
Segment 3: Active Graph
00:02:50 If you're building long-running agents, Yohei Nakajima open-sourced Active Graph today. It's an event-sourced reactive graph runtime, and he's explicitly moving away from the standard workflow and DAG patterns that dominate agentic tooling right now. A graph represents the agent's knowledge, history, and behaviors.
00:03:11 Events project onto that graph, and small reactive behaviors respond to state changes without a central orchestrator. It uses a fork-and-diff pattern for agent runs — so if a fork shares a parent's event log up to a certain point and then diverges, the system can show which event changed a node or edge.
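The fork-and-diff idea can be sketched in a few lines. This is a minimal illustration of the pattern as described here, not Active Graph's actual API; `Event`, `project`, `fork`, and `diff` are hypothetical names, and the graph is reduced to a flat node-to-value map.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Event:
    seq: int     # position in the append-only log
    node: str    # node the event touches
    value: Any   # new value for that node


def project(log: list[Event]) -> dict[str, tuple[Any, int]]:
    """Replay the log into state: node -> (value, seq of the event that set it)."""
    state: dict[str, tuple[Any, int]] = {}
    for ev in log:
        state[ev.node] = (ev.value, ev.seq)
    return state


def fork(log: list[Event], at_seq: int) -> list[Event]:
    """Start a new run sharing the parent's events up to and including at_seq."""
    return [ev for ev in log if ev.seq <= at_seq]


def diff(parent: list[Event], child: list[Event]) -> dict[str, dict[str, Any]]:
    """For each node whose value differs, report both values and the events behind them."""
    a, b = project(parent), project(child)
    changes: dict[str, dict[str, Any]] = {}
    for node in sorted(a.keys() | b.keys()):
        pa, pb = a.get(node), b.get(node)
        if pa is None or pb is None or pa[0] != pb[0]:
            changes[node] = {"parent": pa, "child": pb}
    return changes


parent = [
    Event(1, "owner", "alice"),
    Event(2, "status", "draft"),
    Event(3, "owner", "bob"),
]
# The child run shares events 1 and 2, then diverges with its own event 3.
child = fork(parent, at_seq=2) + [Event(3, "status", "approved")]

changes = diff(parent, child)
for node, sides in changes.items():
    print(node, sides)
```

Because each projected value carries the sequence number of the event that set it, the diff names the responsible event on each side instead of only reporting that two final states differ.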
00:03:32 Timur Yessenov noted that long-running agents need to show not only the final state, but which event changed ownership and where a human can rewind. Yohei's been working on this lineage since BabyAGI, and the move toward treating the event log as the product — answering why a belief exists and what evidence supports it — feels like a necessary correction to the current orchestration bloat.
00:03:58 Behaviors are small reactive units where state changes trigger further state changes. The trace is the product. It's a clean primitive that sidesteps the heavy DAG management most agent frameworks are still wrestling with.
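As a rough sketch of "state changes trigger further state changes", the toy runtime below lets behaviors subscribe to a node and emit follow-up writes. The delivery loop only hands changes to subscribers; nothing in it plans the order of steps. `ReactiveGraph`, `on`, and `emit` are hypothetical names for illustration and are not taken from Active Graph.

```python
from collections import deque
from typing import Any, Callable

# A behavior inspects one change and may return follow-up (node, value) writes.
Behavior = Callable[[dict[str, Any], str, Any], list[tuple[str, Any]]]


class ReactiveGraph:
    def __init__(self) -> None:
        self.state: dict[str, Any] = {}
        self.trace: list[tuple[str, Any, str]] = []  # (node, value, cause)
        self.behaviors: dict[str, list[Behavior]] = {}

    def on(self, node: str, behavior: Behavior) -> None:
        """Subscribe a behavior to changes on one node."""
        self.behaviors.setdefault(node, []).append(behavior)

    def emit(self, node: str, value: Any, cause: str = "external", max_steps: int = 100) -> None:
        """Apply a change, then keep delivering follow-up changes until none remain."""
        queue = deque([(node, value, cause)])
        steps = 0
        while queue:
            steps += 1
            if steps > max_steps:
                raise RuntimeError("behavior cascade did not settle")
            n, v, c = queue.popleft()
            if n in self.state and self.state[n] == v:
                continue  # no state change, so nothing reacts
            self.state[n] = v
            self.trace.append((n, v, c))
            for behavior in self.behaviors.get(n, []):
                for next_node, next_value in behavior(self.state, n, v):
                    queue.append((next_node, next_value, f"{behavior.__name__} reacting to {n}"))


def needs_review(state: dict[str, Any], node: str, value: Any) -> list[tuple[str, Any]]:
    return [("status", "needs_review")] if value > 3 else []


def notify(state: dict[str, Any], node: str, value: Any) -> list[tuple[str, Any]]:
    return [("notified", True)] if value == "needs_review" else []


g = ReactiveGraph()
g.on("risk_score", needs_review)
g.on("status", notify)
g.emit("risk_score", 5)

for entry in g.trace:
    print(entry)
```

The trace records a cause for every change, so the log itself answers why a given node holds its current value.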
Segment 4: SANA-WM & The HRM Paper
00:04:13 NVIDIA released SANA-WM, a camera-conditioned world model that fits on a single GPU. At 2.6 billion parameters and Apache 2.0 licensed, it generates sixty seconds of 720p video in thirty-four seconds on an RTX 5090. Fitting a generative world model into that kind of single-GPU footprint is a practical step toward local simulation, rather than just cloud-scale video generation.
00:04:41 We've seen plenty of theoretical work on latent-space reasoning and hierarchical recurrent computation, but SANA-WM is doing the heavy lifting on a single consumer-grade card — a point the HRM-Text paper, dropped earlier today, underscores with its one-billion parameter approach to language model pretraining.
00:05:03 It shows the boundary between local inference and complex video prediction shrinking fast. The model collapses the entire space of plausible trajectories into a deterministic recursive computation, a stark contrast to the stochastic fixed-point search used in other probabilistic tiny recursive models.
00:05:26 Running a world model that fits in a single GPU's VRAM drops the dependency on massive distributed inference clusters for simulation.
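For a sense of scale, the figures quoted in this segment work out to faster-than-real-time generation. The sketch below only does that arithmetic; the linear-scaling estimate for other clip lengths is an assumption made here, not something the source reports.

```python
VIDEO_SECONDS = 60.0  # clip length quoted in the source
WALL_SECONDS = 34.0   # generation time quoted in the source, on an RTX 5090


def realtime_factor(video_s: float, wall_s: float) -> float:
    """Seconds of video produced per second of wall-clock time."""
    return video_s / wall_s


def estimated_wall_time(target_video_s: float, factor: float) -> float:
    """Naive estimate that ASSUMES generation time scales linearly with clip length."""
    return target_video_s / factor


factor = realtime_factor(VIDEO_SECONDS, WALL_SECONDS)
ten_second_clip = estimated_wall_time(10.0, factor)

print(f"real-time factor: {factor:.2f}x")
print(f"estimated wall time for a 10 s clip: {ten_second_clip:.1f} s")
```

A factor above 1.0 is the practical threshold for interactive local simulation, since the model produces footage faster than it plays back.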
Segment 5: Prof. Michael I. Jordan on Collective Intelligence
00:05:35 Professor Michael I. Jordan made a case against the current anthropomorphic framing of AI on Machine Learning Street Talk. He argues that intelligence is fundamentally collective and social, requiring economic and game-theoretic frameworks to model multi-agent systems, rather than treating a model as an isolated statistical soup.
00:05:55 He calls "AGI" a distortionary PR term that's demoralizing young engineers by forcing them into a binary of exuberance or alarmism. He traces modern machine learning back to statistics and operations research, the methods that built supply chains, commerce, and transportation systems well before training on language data gave these systems human-fluent text.
00:06:17 Jordan put it plainly: "I think this anthropomorphizing of intelligence and understanding all that is not necessary, not appropriate, and is a distraction for many, many problems." He points out that young people see real opportunities in building technology to help their families and countries, but are being told by leaders, "we had our fun...
00:06:39 we built gradient descent algorithms... now you can't do this because it's dangerous." When he talks about extracting data without compensating originators or running greedy descent without economic thought, he's flagging the gap between current capabilities and the infrastructure required to actually scale them.
00:06:59 The conversation is a reminder that building systems you don't understand at scale isn't inherently wrong, but detaching from the economic and social reality of what you're building is unusual for human history. That's the local reading on today's archive. Seln Oriax.