Introduction
Years ago, when I was studying computer science, one of my professors declared that any software program could also be built as hardware. I found it a fascinating concept, and it has stuck with me ever since. But I never found a strong use case for it beyond a few niches. That is, until now.
Current AI Infrastructure Is Flawed
Renting discounted tokens from datacenters that compete with their surrounding communities for electricity and water is a short-term solution, and one that introduces sovereignty concerns. It works for prototypes and it keeps shareholders happy this quarter. It does not work as a foundation for an industry, and it certainly does not work as a foundation for state infrastructure.
The cracks show in three places: the utility grids, the balance sheet, and the privacy regime. Each one alone is enough to force a rethink. Together they make the case for hardware-defined AI overwhelming.
Language Model Processing Units
I like to call them Language Model Processing Units, or LMPUs. I expect future computers and devices to ship with a CPU, a GPU, and an LMPU: just another card you slot into your motherboard, or one that gets soldered to it. Pushing toward this future of hardwired AI will help Anthropic and its peers reach profitability, and it gives the EU a credible path to closing the AI gap.
Energy: The Constraint Nobody Can Negotiate Around
A general-purpose GPU is a remarkable piece of engineering, but it is built to do anything. That generality has a price, and the price is paid in watts. Running a fixed, well-understood workload on a chip designed to do everything is the most expensive way to do that workload. It is also the most water-intensive, because the cooling load scales with the heat output.
Hardcoding a model into purpose-built silicon flips this. Taalas claims a thousandfold improvement in efficiency on the HC1, and even if the real-world figure lands at a tenth of that, the grid math changes entirely. The Netherlands cannot keep approving datacenter expansions on the current trajectory. Ireland has already stopped. France is hedging with nuclear, but that is a multi-decade bet. Hardware that does more inference per watt, on air-cooled commodity racks instead of liquid-cooled hyperscale halls, is the only honest answer for any European government trying to reconcile AI ambitions with grid capacity and climate commitments at the same time.
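To make the grid math concrete, here is a minimal back-of-envelope sketch. Every number in it is an illustrative assumption, not a measured figure: I assume a per-token energy cost for a general-purpose GPU and apply only a hundredfold gain, a tenth of the claimed thousandfold.

```python
# Back-of-envelope grid math. All numbers are illustrative
# assumptions, not measurements from Taalas or any GPU vendor.

GPU_JOULES_PER_TOKEN = 0.5       # assumed energy cost per token on a general-purpose GPU
EFFICIENCY_GAIN = 100            # a tenth of the claimed thousandfold improvement
LMPU_JOULES_PER_TOKEN = GPU_JOULES_PER_TOKEN / EFFICIENCY_GAIN

DAILY_TOKENS = 1e12              # assumed national-scale demand: one trillion tokens per day

def average_power_mw(joules_per_token: float, tokens_per_day: float) -> float:
    """Average continuous power draw, in megawatts, for a daily token budget."""
    seconds_per_day = 24 * 60 * 60
    return joules_per_token * tokens_per_day / seconds_per_day / 1e6

print(f"GPU fleet:  {average_power_mw(GPU_JOULES_PER_TOKEN, DAILY_TOKENS):6.2f} MW continuous")
print(f"LMPU fleet: {average_power_mw(LMPU_JOULES_PER_TOKEN, DAILY_TOKENS):6.2f} MW continuous")
```

Under these assumptions, the same national workload drops from roughly 5.8 MW of continuous draw to under 0.06 MW: the difference between a grid connection that needs ministerial approval and one that does not.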
This is also where the EU has a structural advantage it has not yet noticed. Europe lost the model race years ago and is not catching up. It has not lost the silicon race, because the silicon race for inference has barely started. ASML sits in Veldhoven. IMEC sits in Leuven. The talent and the supply chain are already on the continent. The thing that is missing is the political decision to treat inference hardware as the strategic priority instead of chasing OpenAI from behind.
Tokenomics: A Market Sold at a Loss
Anthropic, OpenAI, and the rest are currently selling tokens below cost. Burn capital now, capture the market, raise prices later. Standard playbook. It has worked before.
It will not work this time, because there is no headroom left for the second half of the playbook. Tokens are already expensive enough to be a barrier. Developers complain about it openly, finance teams flag it as their fastest-growing line item, and entire categories of obvious AI applications never get built because the unit economics do not survive contact with a calculator.
The labs are subsidising a price point that is already deterring adoption. Raising prices to reach profitability does not produce profit — it produces an adoption cliff. The friction is already there at today’s prices; push them up and a meaningful share of the current user base walks.
The honest path to profitability is to own the silicon. That means committing to one model per chip class, accepting that the chip will not run tomorrow’s frontier model, and capturing the inference economics for everything that does not need to be frontier — which is most of what enterprises actually run. A CFO can depreciate an LMPU over five years. They cannot depreciate an API bill. That single accounting fact is doing more strategic work than it sounds like it is.
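A minimal sketch of why that accounting fact bites, using purely illustrative figures in the same range as the card and contract prices discussed in the privacy section below:

```python
# Capex vs. opex for the same workload. All figures are illustrative
# assumptions, not quotes from any vendor or lab.

CARD_PRICE_EUR = 5_000            # assumed one-time price of an LMPU card
DEPRECIATION_YEARS = 5            # straight-line, the way a CFO would book it
API_BILL_EUR_PER_MONTH = 50_000   # assumed recurring API spend for a comparable workload

annual_depreciation = CARD_PRICE_EUR / DEPRECIATION_YEARS
annual_api_spend = API_BILL_EUR_PER_MONTH * 12

print(f"LMPU card: EUR {annual_depreciation:>9,.0f} per year on the books")
print(f"API bill:  EUR {annual_api_spend:>9,.0f} per year, pure opex")
print(f"Five-year totals: EUR {CARD_PRICE_EUR:,} vs EUR {annual_api_spend * 5:,}")
```

The exact numbers do not matter; the shape does. One column is a bounded, depreciable asset, the other is an open-ended operating expense indexed to someone else's pricing power.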
Privacy: The Argument That Actually Moves Brussels
The privacy case is the one that gets traction in Brussels, The Hague, and Berlin first, because it is the one that does not require any hand-waving. A municipal government cannot put citizen data through a US-hosted API. A defence ministry cannot put procurement documents through one. A hospital cannot put patient records through one. The current workarounds (sovereign cloud zones, EU-only regions, contractual data residency clauses) are paper tigers that survive only as long as nobody tests them seriously in court. The CLOUD Act exists. Schrems III is coming.
Air-gapped inference on hardware you physically own is the only architecture that survives a serious adversarial review. LMPUs make that economically viable for the first time. A €5,000 card sitting in a rack in a Dutch ministry is a fundamentally different proposition from a €50,000-per-month API contract with a foreign provider. Different cost structure, different legal exposure, different political story.
The Objection: “But the Model Will Be Stale”
This is the argument every conversation about LMPUs eventually collapses into, and it is the same argument made against every fixed-function silicon decision in computing history. You bought a CPU in 2008 and Intel released a better one in 2009; you did not throw away your servers. You bought a GPU in 2020 and NVIDIA released a better one in 2022; you did not refresh your fleet. You ran on the silicon you had until the amortisation curve told you it was time to upgrade.
LLMs will follow the same pattern, and the timing is finally right for it. The current generation of open models is good enough to do real work. Llama 3.1, Mistral, the Qwen family, and the rest can summarise, classify, extract, retrieve, and answer questions at a quality that was science fiction five years ago. Most enterprise inference workloads do not need the frontier. They need reliability, cost predictability, and a model that does not get deprecated out from under them six months after they integrate it.
No company is going to refresh its entire AI stack every quarter because a new model dropped. They have never done this for any other infrastructure category and they are not going to start now. I expect we will see Long Term Support (LTS) versions of models very soon, similar to enterprise Linux and enterprise Java.
Taalas Already Shipped the Proof
The interesting thing about the LMPU argument in 2026 is that it is no longer speculative. Taalas published their architectural case in February with "The path to ubiquitous AI", and the accompanying Garden Research deep-dive is the clearest external explanation of how their foundry flow turns a finished model into a finished chip in roughly two months. The HC1 is a working product, not a research demo, with a public chat interface and an inference API.
The technical question is settled. What is left is the political and commercial question of which players move first to make this the default deployment pattern, and which get caught flat-footed when the cost-per-token curve breaks the wrong way.
In Conclusion
Governments will be pushed toward LMPUs by privacy and energy, because their voters and their grids will force it. The frontier labs will be pushed toward LMPUs by tokenomics, because the alternative is staying permanently trapped in a cost structure they do not control. Both pressures point at the same destination.
AI needs to become hardware again.
I will be writing more about IT policy from a European perspective — including the outlines of a conglomerate of European companies that can bring this vision together. If you are interested in sparring about this subject or have questions, don’t hesitate to get in touch.