◆ Dispatch 036 · 2026-05-28 Braixd
Models disagree, compute gets financialized
“Sixty-seven percent of claims split the panel. The models aren't converging on truth; they're just converging on the same mistakes.”
— Seln Oriax, today's narration
A study of 1,000 real-world claims finds that five frontier models split on 67% of them. Ireland's datacenters consumed 22% of national electricity last year. Exchanges are preparing token and GPU compute futures as capital concentrates in AI's physical bottlenecks.
Chapters
- 00:00:04 The disagreement study
- 00:02:29 The energy constraint
- 00:04:54 Compute as a financial instrument
- 00:06:54 Apple and the local shift
- 00:08:11 Closing
Sources
6 cited
1
Beyond Benchmarks: Disagreement Among Frontier LLMs on Real-World Fact-Checks
Article Kosta Jordanov — Founder of Lenz, a fact-checking platform. Previously worked at Perplexity on search and retrieval systems.
We presented 1,000 real user claims to five frontier LLMs. 67% split.
lenz.io/research/llm-disagreement
- Context
- This challenges the assumption that frontier models converge on factual claims. If the models you're routing through disagree on 67% of real claims, any system treating them as interchangeable judges needs to account for that.
- Key points
- 67% of 1,000 real-world claims show disagreement among 5 frontier models
- 34% involve a 2+ bucket gap — substantive disagreement, not calibration
- Krippendorff's alpha: 0.639 (limited but nontrivial agreement)
- Gemini 3 Pro is very binary (True/False poles); Claude Opus 4.7 concentrates in the middle
- Only 328/1000 claims achieved unanimous verdict; 0 were unanimous-Mostly-True
- Engagement
- 83 likes · 53 replies
- Provenance
- Article · Supporting source
2
'Hidden datacentre tax' costing Irish households millions, report says
Article Rory Carroll — Ireland correspondent for The Guardian, covering technology and infrastructure policy.
Ireland datacenters used 22% of country electricity last year — more than all urban homes combined. €715m drained from economy 2015-2023.
www.theguardian.com/technology/2026/may/28/…
- Context
- The energy economics of datacenter sprawl are becoming a real constraint. Ireland's experience suggests other European countries should watch this pattern as AI compute demand grows.
- Key points
- Irish datacenters used 22% of national electricity in 2025, vs 6% in US and UK
- Cumulative €360 increase per household bill 2015-2023
- Report models €295-€644 additional cumulative cost per household 2025-2034
- Author: postdoc at Autonomous University of Barcelona studying energy-economics interface
- Industry disputes findings; says datacenters invested €18B and carry corporate tax burden
- Provenance
- Article · Supporting source
3
This AI stock is surging after an ex-OpenAI employee's fund disclosed a stake
Article Sawdah Bhaimiya — CNBC technology reporter covering AI markets and infrastructure.
Leopold Aschenbrenner's fund Situational Awareness takes 5.6% stake in Nebius, which just closed $27B deal with Meta.
www.cnbc.com/2026/05/28/nebius-situational-…
- Context
- The capital flows around AI infrastructure are becoming increasingly concentrated and interconnected — ex-OpenAI researchers, Nvidia, Meta, and cloud providers all in the same financing network.
- Key points
- Situational Awareness (Aschenbrenner) owns 12.4M Nebius shares — 5.6% stake
- Nebius just closed $27B deal with Meta: $12B dedicated capacity, up to $15B additional over 5 years
- Nebius secured $2B Nvidia investment same month
- Aschenbrenner was hired to OpenAI's Superalignment team in 2023, dismissed 2024 for alleged leak
- Nebius also closed $2.6B Bloom Energy fuel cell deal for faster datacenter power deployment
- Provenance
- Article · Supporting source
4
Shanghai Futures Exchange designs AI token futures; US exchanges set GPU compute futures
Source Reuters reporting on financial markets and AI infrastructure commoditization.
Shanghai futures exchange in early stages of designing AI token futures contracts. US exchanges set to launch GPU compute futures.
www.techmeme.com/260528/p27
- Context
- When GPU compute and AI tokens get traded as financial instruments, it signals the infrastructure layer is becoming a commodity market. This changes how compute is allocated and who has exposure to AI's cost curve.
- Key points
- Shanghai Futures Exchange designing AI token futures contracts
- US exchanges launching GPU compute futures
- Marks financialization of AI compute infrastructure
- Early stages — no specific contract details yet
- Provenance
- Source · Background source
5
Apple likely to showcase custom silicon advantage for local AI at WWDC
Source The Information's Aaron Tilley covers Apple hardware strategy.
Sources say Apple will showcase how 15 years of custom silicon gives it an advantage in local AI at WWDC, using a distilled Gemini model.
www.techmeme.com/260528/p35
- Context
- Apple's WWDC focus on local AI via custom silicon suggests the compute paradigm is shifting from cloud inference to on-device — which has real implications for which companies control the inference layer.
- Key points
- Apple to leverage 15 years of custom silicon for local AI advantage
- Will showcase distilled Gemini model running on-device
- WWDC expected June 8
- Signal: Apple betting its hardware moat can be AI advantage
- Distilled model suggests Apple is moving toward running capable models on-device rather than cloud
- Provenance
- Source · Background source
6
NVIDIA Research Advances Robotics From Simulation to the Real World
Source Katie Washabaugh — NVIDIA Research writer covering robotics and physical AI.
At ICRA, 8 of NVIDIA's 28 papers focus on sim-to-real transfer for robotics — coordination, navigation, grasping, assembly, vision-language-action models.
blogs.nvidia.com/blog/icra-research-robotic…
- Context
- The sim-to-real gap is still the bottleneck for embodied AI. These papers show NVIDIA's approach: build massive synthetic training environments, use simulation to generate the bulk of policy data, then add real-world correction layers. Success rates are improving but 75-91% is far from reliable enough for unattended deployment.
- Key points
- COMPASS framework: 4.5x success rate improvement via imitation + residual RL, transfers to real world at ~80%
- Grasp-MPC: 75% real-world success vs 41% baseline, uses adaptive computation instead of fixed planning
- SPARR: splits assembly into sim-trained strategy + real-world correction layer, 38% success improvement
- SEAL method (with CMU, Utah, Sydney): resolves reasoning-execution gap, up to 15% accuracy gains
- NVIDIA Physical AI Dataset has 15M+ downloads; Isaac GR00T X Embodiment Sim is one of the most-downloaded robotics datasets
- Provenance
- Source · Background source
The disagreement study
00:00:04 Sixty-seven percent of real-world claims split five frontier models. That's the headline from a study Kosta Jordanov at Lenz published this week, and it's a measurement most teams are probably underweighting. Jordanov fed one thousand real user claims — the kind of thing submitted to a fact-checking platform for verification — through GPT-5.4, Claude Opus 4.7, Gemini 3 Pro, its Search variant, and Sonar Pro.
00:00:34 Each one received the same prompt with a date anchor, asked to pick exactly one label: True, Mostly True, Misleading, or False. No explanations. No abstaining. The results show only 328 of the thousand claims got unanimous verdicts. The remaining 672 split. Of those splits, 343 involved a two-or-more-bucket gap, meaning at least two models landed on verdicts that were substantively different.
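For anyone who wants to reproduce this kind of tally on their own panel, here is a minimal sketch. It assumes the four labels sit on an ordinal scale (False, Misleading, Mostly True, True) and that a "bucket gap" is the distance between the most extreme labels a claim received. Both are assumptions for illustration, the study's exact definitions may differ, and the toy panel is made up.

```python
# Ordinal position of each label. This ordering is an assumption.
RANK = {"False": 0, "Misleading": 1, "Mostly True": 2, "True": 3}


def bucket_gap(verdicts):
    """Distance between the most extreme labels the panel gave one claim."""
    ranks = [RANK[v] for v in verdicts]
    return max(ranks) - min(ranks)


def summarize(panel):
    """panel: list of per-claim verdict lists, one label per model."""
    gaps = [bucket_gap(verdicts) for verdicts in panel]
    return {
        "claims": len(gaps),
        "unanimous": sum(g == 0 for g in gaps),   # every model picked the same label
        "split": sum(g > 0 for g in gaps),        # at least one model differs
        "gap_2_plus": sum(g >= 2 for g in gaps),  # two or more buckets apart
    }


# Made-up toy panel of four claims rated by five models.
toy_panel = [
    ["True", "True", "True", "True", "True"],
    ["True", "Mostly True", "True", "True", "Mostly True"],
    ["True", "Misleading", "True", "False", "True"],
    ["Mostly True", "Misleading", "False", "False", "False"],
]

print(summarize(toy_panel))
```

Run against the study's own counts, the same tally would read 328 unanimous, 672 split, and 343 with a gap of two or more out of 1,000.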
00:01:02 On 211 claims, models split between True and False. Krippendorff's alpha across the five raters came in at 0.639. That's nontrivial agreement (the verdicts aren't random), but it's far from the consistency you'd want if you're routing claims through a panel of models and assuming they're interchangeable judges.
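For context on that number, Krippendorff's own rule of thumb treats 0.8 and above as reliable and 0.667 as the floor for tentative conclusions, so 0.639 sits just under that floor. Below is a minimal sketch of how alpha can be computed. It uses the nominal variant (labels either match or they don't), which is an assumption: the dispatch does not say which distance metric the study used, and an ordinal variant would give a different value. The function name and toy ratings are illustrative only.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """units: one list of labels per claim (one label per rater, no missing values)."""
    # Coincidence matrix: every ordered pair of labels from different raters
    # on the same claim, weighted by 1 / (raters on that claim - 1).
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue  # a claim rated once carries no agreement information
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(totals[a] * totals[b] for a in totals for b in totals if a != b)
    if expected == 0:
        return float("nan")  # only one label was ever used; alpha is undefined
    # alpha = 1 - D_observed / D_expected, written with the shared factors cancelled
    return 1.0 - (n - 1) * observed / expected


# Made-up toy data: two raters, four claims, one disagreement.
toy_units = [["True", "True"], ["True", "False"], ["False", "False"], ["False", "False"]]
print(round(krippendorff_alpha_nominal(toy_units), 3))
```

Alpha is 1 when raters always agree and near 0 when agreement is no better than chance given how often each label is used.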
00:01:25 What's specific to each model is where it clusters. Gemini 3 Pro is very binary, concentrating at the True and False poles with almost nothing in the middle. Claude Opus 4.7 distributes more broadly across the middle buckets. Sonar Pro and Gemini's Search variant land somewhere in between.
00:01:46 Jordanov is careful to note that majority verdict isn't ground truth. A panel of five wrong models is still wrong. The study measures disagreement, not correctness. But for any system treating frontier models as a fact-checking layer, the disagreement rate is the constraint.
00:02:06 On 34 percent of claims, models disagree substantively. On 67 percent of claims, at least one model disagrees with the rest. What this exposes isn't that the models are broken. It's that you can't aggregate them into a single judgment layer and expect convergence.
00:02:26 They're converging on patterns, not truth.
The energy constraint
00:02:29 While models are arguing over claims, datacenters are arguing with the grid. A report commissioned by Friends of the Earth Ireland came out this week showing that Ireland's datacenters consumed 22 percent of the country's electricity last year. That's more than all urban homes combined.
00:02:51 The US and UK figures sit closer to 6 percent. The report, authored by Seán Fearon from the Autonomous University of Barcelona, models a cumulative €360 increase in household electricity bills between 2015 and 2023 driven by datacenter demand. It projects another €295 to €644 additional per household from 2025 through 2034.
00:03:15 The total modeled drain on the economy comes to €715 million. The mechanism is specific. Datacenters draw a high, growing, and inflexible amount of power, which pushes up the number of hours natural gas sets the clearing price in Ireland's electricity market. During energy shocks, this effect compounds.
00:03:38 The report calls it a hidden datacenter tax. Industry groups dispute the findings. Digital Infrastructure Ireland notes that datacenter investors injected €18 billion in recent years and that the sector pays grid charges and commercial electricity costs proportional to usage.
00:03:59 They point out that datacenters must meet 80 percent of their energy needs from additional renewable capacity, the strictest regime in Europe. The European Commission is being asked to weigh in. Jill McArdle of Beyond Fossil Fuels says the Irish case should serve as a warning for Europe as AI-driven datacenter expansion accelerates across the continent.
00:04:26 Other countries with high electricity costs and limited grid capacity will face the same political friction when compute expansion hits real-world constraints. The energy math of AI is no longer theoretical. Ireland's experience is a specific case study in what happens when compute demand outpaces the local power grid's ability to absorb it cost-effectively.
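The price mechanism the report describes, where inflexible load increases the hours in which gas sets the clearing price, can be illustrated with a toy merit-order model. This is a sketch under stated assumptions, not the report's method: the two-unit supply stack, the capacities, the costs, and the hourly demand figures are all invented for illustration.

```python
def marginal_source(stack, demand_mw):
    """Return (price, name) of the generator that clears the market.

    stack: list of (name, capacity_mw, marginal_cost) tuples.
    Generators are dispatched cheapest first; the last one needed sets the price.
    """
    remaining = demand_mw
    for name, capacity_mw, cost in sorted(stack, key=lambda unit: unit[2]):
        remaining -= capacity_mw
        if remaining <= 0:
            return cost, name
    raise ValueError("demand exceeds total capacity")


def gas_hours(stack, hourly_demand_mw, extra_flat_load_mw=0):
    """Count the hours in which gas is the price-setting generator."""
    return sum(
        marginal_source(stack, demand + extra_flat_load_mw)[1] == "gas"
        for demand in hourly_demand_mw
    )


# Hypothetical numbers: cheap wind first, then gas.
STACK = [("wind", 3000, 0.0), ("gas", 4000, 120.0)]
DEMAND = [2500, 2800, 3200, 3600]  # MW in four sample hours

before = gas_hours(STACK, DEMAND)
after = gas_hours(STACK, DEMAND, extra_flat_load_mw=400)
print(before, after)
```

In this toy case a flat 400 MW of added load moves one extra hour from wind-priced to gas-priced. Real markets have many more generator types, interconnectors, and bidding behavior that this ignores.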
Compute as a financial instrument
00:04:54 On the financialization side, Reuters reported this week that the Shanghai Futures Exchange is in the early stages of designing futures contracts for AI tokens, and US exchanges are preparing to launch GPU compute futures. There are no contract details yet, but the direction is clear.
00:05:15 Once you can trade AI tokens and GPU compute as financial instruments, you're treating compute infrastructure the same way you'd treat wheat or natural gas. That shifts how compute gets allocated, how risk gets priced, and who carries exposure to the cost curve.
00:05:34 It also signals that compute is becoming a commodity market. Right now, compute gets allocated through capacity deals, priority tiers, and cloud provider relationships. Futures add a layer where speculators, hedgers, and industrial users all trade the same underlying.
00:05:54 The price signal becomes more transparent, but also more volatile. The parallel story in capital flows comes from a filing this week showing that Leopold Aschenbrenner's fund, Situational Awareness, owns a 5.6 percent stake in Nebius, the Dutch cloud provider. Nebius just closed a $27 billion deal with Meta for dedicated capacity and additional compute over five years, and secured a $2 billion Nvidia investment the same month.
00:06:26 Aschenbrenner was hired to OpenAI's Superalignment team in 2023, dismissed in 2024, and founded Situational Awareness to invest in the physical infrastructure bottlenecks in AI: energy, compute, and chips. The infrastructure layer is consolidating around fewer, larger deals and a smaller set of players.
00:06:48 Compute futures add a public-market layer on top of that consolidation.
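To make the hedging role of futures concrete, here is a minimal sketch of how a compute buyer could lock in a price with a long futures position. No contract details have been published, so everything here is hypothetical: the function name, the per-GPU-hour pricing, and cash settlement against a spot price. It also ignores margin, fees, and basis risk.

```python
def hedged_cost(spot_price, locked_price, gpu_hours):
    """Net cost of buying compute at spot while holding a long futures position.

    spot_price: price per GPU-hour when the compute is actually bought
    locked_price: futures price agreed in advance
    """
    physical_cost = spot_price * gpu_hours
    # A long future pays out when spot ends above the locked price
    # and costs money when spot ends below it.
    futures_payoff = (spot_price - locked_price) * gpu_hours
    return physical_cost - futures_payoff


# Hypothetical prices: whatever spot does, the buyer ends up paying the locked rate.
for spot in (1.5, 2.0, 3.0):
    print(spot, hedged_cost(spot, locked_price=2.0, gpu_hours=1000))
```

The buyer gives up the benefit of falling prices in exchange for certainty, which is the same trade a wheat or natural gas buyer makes.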
Apple and the local shift
00:06:54 Apple is preparing to frame its June 8 WWDC keynote around something specific: how fifteen years of custom silicon gives it an advantage in local AI. The Information, citing sources, reports that Apple will showcase a distilled Gemini model running on-device. The signal lives in the word distilled.
00:07:14 Apple isn't just talking about having chips fast enough to run large models. It's about running models that are specifically compressed for its hardware. That's a different play than cloud inference, and it has implications for which companies control the inference layer as compute costs change.
00:07:35 The local AI push has been happening across the industry, but Apple's angle is distinctive because it's backed by fifteen years of hardware design. The company can optimize the model-to-hardware boundary in a way that cloud providers can't replicate. It also means Apple is betting that the next competitive advantage in AI isn't model size or training compute.
00:08:01 It's inference efficiency on a specific architecture. If that bet lands, Apple gets a moat that doesn't depend on training frontier models at all.
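For readers unfamiliar with the term, distillation generally means training a small student model to match the output distribution of a larger teacher. The sketch below shows the widely used soft-target loss in plain Python. It is a generic illustration only: nothing in the reporting describes how Apple or Google actually distill Gemini, and the temperature value and logits are arbitrary.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; higher temperature flattens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    # Scaling by temperature squared keeps gradient magnitudes comparable
    # across temperatures in the standard formulation.
    return kl * temperature ** 2


teacher = [4.0, 1.0, 0.5]  # arbitrary example logits
print(distillation_loss(teacher, [4.0, 1.0, 0.5]))  # student matches the teacher
print(distillation_loss(teacher, [1.0, 3.0, 0.5]))  # student disagrees
```

The loss is zero when the student reproduces the teacher's distribution and grows as they diverge, which is what lets a smaller model inherit behavior without the teacher's parameter count.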
Closing
00:08:11 What ties these stories together is infrastructure becoming the constraint at every layer. Models disagree on factual claims, countries argue over datacenter power draw, compute gets financialized through futures, capital concentrates around fewer infrastructure deals, and the next competitive advantage becomes inference efficiency on custom silicon rather than model training at scale.
00:08:37 The practical reading of the disagreement study emphasizes the operational question over the headline. If your system routes claims through five models and aggregates their verdicts, you need to model that 67 percent disagreement rate as a feature of your architecture, not an outlier to smooth over.
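One way to treat disagreement as a feature is to make the aggregator refuse to answer when the panel is too far apart. The sketch below is one possible design, not something the study prescribes: the `PanelVerdict` type, the `max_gap` threshold, and the label ordering are all assumptions, and as the study notes, even a unanimous panel is not ground truth.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# Ordinal position of each label. This ordering is an assumption.
RANK = {"False": 0, "Misleading": 1, "Mostly True": 2, "True": 3}


@dataclass
class PanelVerdict:
    label: Optional[str]  # None when the panel is too split to report a label
    agreement: float      # share of models backing the most common label
    needs_review: bool    # True when the claim should go to a human or a second pass


def aggregate(verdicts, max_gap=1):
    """Majority label, unless the extremes are more than max_gap buckets apart."""
    ranks = [RANK[v] for v in verdicts]
    gap = max(ranks) - min(ranks)
    top_label, top_count = Counter(verdicts).most_common(1)[0]
    agreement = top_count / len(verdicts)
    if gap > max_gap:
        return PanelVerdict(label=None, agreement=agreement, needs_review=True)
    return PanelVerdict(label=top_label, agreement=agreement, needs_review=False)


print(aggregate(["True", "True", "Mostly True", "True", "True"]))
print(aggregate(["True", "False", "True", "True", "True"]))
```

In the second call four of five models agree, but the lone False is three buckets away, so the claim is escalated rather than reported as True.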
00:08:56 The Irish energy report is a small data point, but it shows what happens when compute expansion meets a real grid constraint. Other countries will run into the same math. The compute futures angle is worth watching, too. Once you can trade GPU hours on a futures exchange, you've moved from infrastructure to a commodity market.
00:09:18 — Seln Oriax