A Letter a Day

The Structural Case for Open-Source AI

Why Open-Source AI is overdetermined

Kevin Gee
Jun 26, 2026

In the past month, we’ve seen DeepSeek raise its first round of outside capital, Fable become the first model to be taken offline due to national security measures, Microsoft move from flat-rate to usage-based pricing, and DeepSeek start serving on Huawei Silicon. We also saw Bernie Sanders propose a one-time 50% stock tax on large AI labs to seed a public sovereign wealth fund, and reports that Sam Altman had discussed giving the government a stake in OpenAI.

Although seemingly independent, all of these circle two ideas: 1) nationalization of the labs, and 2) open-source AI being overdetermined.

The former I wrote about in Can the Market Absorb Three Trillion-Dollar IPOs?. Several readers told me it was “far-fetched,” one going so far as to say it was “unrealistic to the point I can’t take the rest of the memo seriously.” But as a friend pointed out: “There’s no world in which AI is a god-like technology and the government doesn’t try to control it.” Two months later, that friend looks right.

The latter I covered in a private memo I shared last month titled The Structural Case for Open-Source AI. In that memo, I argued that open-source AI is overdetermined and will propagate regardless of what the closed labs do. The events of the past month have sharpened that argument: the state reached for producers on both sides, and open kept shipping anyway. As such, I’ve decided to share the memo publicly. It’s reprinted below, with an addendum for paid supporters working through what each event does to the argument.

Behind the paywall is 3,000+ words on: 1) DeepSeek’s recent raise, 2) Fable’s ban, 3) GLM-5.2’s release, 4) Microsoft’s open-source moves, 5) updates on Huawei’s Silicon, 6) why these sharpen the case for open-source AI, and more.

The memo is reprinted below as initially completed.



Memo

Date: May 20, 2026
Re: The Structural Case for Open-Source AI

Introduction and Executive Summary

Introduction

In a recent memo on the upcoming AI labs’ IPOs and their impact on markets, I flagged open source as a risk to sustained premium pricing for the closed labs. The relevant passage:

The capability gap between frontier models and open source has nearly closed. Five independent open model families (DeepSeek, Qwen, Kimi, GLM, Mistral) have reached frontier quality near-simultaneously, making the trend structural rather than a one-off (Exhibit 11). GLM-5.1 even led both Claude Opus 4.6 and GPT-5.4 on SWE-Bench Pro. The release cadence is fast (~3-6 months between meaningful upgrades), and on some benchmarks, open is already ahead.

While the capability gap is closing, the cost gap is not. Self-hosted inference on open models runs 70-500x cheaper per token than equivalent API calls to frontier proprietary models. At sufficient scale, self-hosting becomes economically compelling — and the threshold falls every quarter as open-model quality improves. This has several important implications for the revenue story: 1) it caps pricing power, 2) it compresses gross margins over time, and 3) it fragments the market structurally. On-device inference amplifies the threat.

The case for sustained premium pricing rests on three pillars, each of which is narrower than it sounds: 1) frontier capability lead on hardest tasks, 2) safety alignment and human preference, and 3) managed API convenience.

This passage flagged open source as a threat to the closed labs—because it is. But the more interesting observation embedded in this passage is the overdetermination of open source independent of whether closed source grows, stagnates, or collapses. I’ll write about closed source and why it still has lots of room to run from a technical and product point of view in a future memo, but this one will focus on why open source is overdetermined.

But first, a note on the passage above. In it, I highlighted benchmark results as an indicator that the capability gap was closing. However, many labs, especially the Chinese ones, are known for “benchmaxxing,” or training models with the explicit goal of performing well on benchmarks rather than on general capability. As Goodhart’s law puts it: “When a measure becomes a target, it ceases to be a good measure.” This doesn’t mean benchmarks aren’t a useful signal of capability; they are, and they’ve driven a lot of progress. They just aren’t an absolute signal. A better signal is independent adoption by sophisticated US enterprises that are putting their reputations on the line: Cursor fine-tuned Moonshot’s Kimi as the base for Cursor Composer 2, and Airbnb chose Alibaba’s Qwen over OpenAI’s ChatGPT to run its customer service agent.

Executive Summary

Open-source AI is overdetermined because it sits at the intersection of three compounding forces—each with deep historical precedent across software, infrastructure, and hardware.

  1. Layer defense: a company with an economic castle in one layer funds open-source initiatives in an adjacent layer in order to commoditize that layer and defang whoever might extract rent from it. One example: Nvidia investing $26bn into open-weight model R&D. The layer-defender pays for both the artifact and the inference substrate because every dollar of model-layer rent extracted by a closed lab is a dollar that doesn’t flow into GPU demand. Google, Microsoft, Amazon, and the Chinese open-source labs are all running different variations of this play against different layers.

  2. Production economics of distributed development: open-source organizations have several key structural advantages over closed organizations making a single concentrated effort at the frontier. In AI, five independent Chinese open-weight labs, and Mistral, are running parallel portfolios shaped by different national contexts, compute constraints, research traditions and focuses, and data assets. Key example: DeepSeek shipped a reasoning model at o1-class performance for a fraction of the training cost while having access to far less compute than the closed US labs.

  3. Consumption economics of substitution and market expansion: closed-lab economics are structurally unstable on two fronts. First, the consumer business charges a fixed monthly fee against variable inference costs and courts power users who lose it money. Second, the API business is priced per token, which may produce positive gross margins (this is debated), but switching costs are effectively zero, meaning customers will route volume to whichever model is cheaper at the moment. Additionally, there will be a Cambrian explosion of small, vertical AI apps that may not be able to afford closed-frontier rates but can afford open inference at a fraction of the price or local inference at zero. This wouldn’t be volume stolen from the closed labs, but net new volume.
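To make the flat-fee instability in the third force concrete, here is a minimal sketch of the unit economics. Every figure in it (the $20 monthly fee, the $2 per million tokens serving cost, and the two usage levels) is a hypothetical placeholder chosen for illustration, not a number reported by any lab.

```python
def monthly_margin(fee_usd: float, tokens_used: int, cost_per_million_usd: float) -> float:
    """Gross margin on one flat-fee subscriber for one month."""
    inference_cost = tokens_used / 1_000_000 * cost_per_million_usd
    return fee_usd - inference_cost


def breakeven_tokens(fee_usd: float, cost_per_million_usd: float) -> int:
    """Monthly token usage at which the subscriber's margin hits zero."""
    return int(fee_usd / cost_per_million_usd * 1_000_000)


# Hypothetical inputs (assumptions, not sourced figures).
FEE_USD = 20.0               # flat monthly subscription price
COST_PER_MILLION_USD = 2.0   # blended serving cost per million tokens

casual_user = monthly_margin(FEE_USD, 1_000_000, COST_PER_MILLION_USD)
power_user = monthly_margin(FEE_USD, 50_000_000, COST_PER_MILLION_USD)
threshold = breakeven_tokens(FEE_USD, COST_PER_MILLION_USD)

print(f"casual user margin: ${casual_user:.2f}")
print(f"power user margin:  ${power_user:.2f}")
print(f"break-even usage:   {threshold:,} tokens/month")
```

Under these assumed inputs, the subscriber who uses the product most is the one who loses the provider money, which is the tension the flat-fee model creates.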

Any one of these three forces would grow the open ecosystem; the three operating simultaneously produce a structural outcome.

To be clear, I’m not arguing that the closed labs’ revenue streams will collapse, but I am arguing that there will be a bifurcation where open captures the vast majority of inference volume by token count while closed retains a premium segment that serves workloads with higher operational burdens such as regulated deployments, enterprise integration, or long-horizon agent reliability. However, the closed labs’ pricing power will compress over time, and value will relocate to orchestration layers, vertical applications, and managed services on open weights in a manner similar to what we saw with RDS on MySQL, Confluent on Kafka, and Databricks on Spark.

Under specific structural circumstances, with AI being the latest, open source has consistently been the outcome. At the same time, across multiple cycles, it has been nearly completely ignored, even by sophisticated investors and industry analysts. The people who were late to Linux against proprietary Unix in the late 1990s were late to Android as a competitive weapon in 2009 and late to the database-layer commoditization that ate the proprietary stack in the 2000s. But it’s not completely their fault. The companies whose castles are being threatened via commoditization of their layer attempt to structurally distort public discourse, and that distortion is compounded by four cognitive failures (expanded on later). Ignore open source at your own peril.

How Bazaar Production Works

In a 1997 essay, Eric Raymond named two core modes of production: 1) the Cathedral: closed doors, centralized development, long release cycles, and proprietary architecture, and 2) the Bazaar: development in public, contributions from anyone with the skill, frequent releases, and architecture emerging from those contributions. The bazaar wins when code is published and the production function has dispersed enough to support distributed contribution.

Cost structure and iteration advantage

The bazaar has two core advantages:

  1. Cost structure: The cost of labor is spread across multiple organizations that share the burdens of salaries (or volunteered time) and infrastructure. Any marginal costs are bounded by review capacity rather than hiring or coordination overhead. The bazaar externalizes costs that are internalized by the cathedral.

  2. Faster iteration: As the old saying goes, “Given enough eyeballs, all bugs are shallow.” When there are enough contributors with the relevant skills, open communities are able to iterate faster across bug-finding, edge case handling, hardening, and more.

To be clear, bazaar production doesn’t always win. It relies heavily on a large enough community of skilled developers and on how dispersed the production function is in its layer. If production is concentrated, cathedral economics dominate and the inputs are the moat. If it is dispersed, bazaar advantages erode the cathedral’s moat from the bottom. Any debate about the superiority of bazaar vs cathedral generally comes down to disagreement about where production has dispersed.

Open source is a stronger claim than open standards

There’s an important distinction between open standards and open source that will matter later:

  • Open standards: This is when specs are published, but implementations are proprietary. Incumbents win on quality, integration, sales, marketing, and support. They can also win by capturing committees and extending specs in directions favorable to them.

  • Open source: This is when everyone ships the same code. It forces differentiation to move above the code, where the incumbent’s specific advantages are weaker. Code is published unilaterally and cannot be recaptured, which eliminates any code-layer differentiation.

In Bill Gurley’s 1998 essay “Standards: Open for Business,” Principle 6 argues open standards favor larger companies because standards-body politics, implementation differentiation, and the production prowess required to win on surfaces the standard leaves open favor incumbents. If “open AI” is open standards in disguise, the outcome is incumbent capture, not commoditization.

But it isn’t. Principle 6’s “force a startup to capitulate” mechanism doesn’t exist for open source. When Microsoft and a 2-person startup both ship Linux, they ship the same kernel. The startup can’t be compelled to “open source it more.” The fight moves to value capture at adjacent layers.

Linus’ Law at frontier scale: portfolio variance under constraint

The standard objection to applying bazaar economics at frontier scale is that single-iteration costs have moved from “fix a kernel bug over a weekend” to “hundreds of millions of dollars per training run.” At those costs, only a handful of organizations can run a frontier-scale attempt, the argument goes—which means the dispersion that makes bazaar economics work simply doesn’t apply. But this assumes the wrong unit of analysis.

It’s not about “many small attempts” vs “one large attempt.” Closed labs run well-resourced cycles with a portfolio of hundreds of internal decisions, ablations, sweeps, and architecture searches, whereas the open ecosystem’s portfolio is structurally different: multiple independent labs each run their own internal portfolio with different starting positions and each publish their own results. The frontier-scale comparison isn’t one big, closed effort against many small open efforts—it’s one large internal portfolio against multiple large independent portfolios, with the open side accumulating across labs and the closed side accumulating only within one.

Think of this as the difference between a single manager and a pod shop. The variance across a closed lab’s internal portfolio is bounded by what that lab can imagine: a shared engineering culture, infrastructure substrate, training pipeline, research tradition, even what they consider interesting. It’s productive when correct, but limiting when not. The variance across the open ecosystem’s portfolio comes from genuinely different starting positions: different research cultures, different national contexts, different compute constraints, different commercial pressures, different data assets.

DeepSeek doing reasoning post-training under compute constraints searches a different part of architecture space than a compute-rich lab would. As Bill Gurley once said, “My favorite ‘go to market’ hack – set paid marketing to $0.00. Instant creativity.” The same principle can be applied to compute (although of course, more is better): when you don’t have resources, you get creative. And that creativity, in itself, produces an output. Mistral doing MoE under European data-sovereignty constraints and Qwen training on Chinese-language data produce multilingual capabilities the other labs can’t replicate. Each constraint forces a different search, meaning the open ecosystem’s set of searches collectively covers more architecture space than any single internal portfolio could.

This is Linus’ Law at the frontier: given enough independent attempts with structurally different constraints, the best solutions get found and propagate. Open-weight publication acts as a surfacing mechanism. Closed labs can take advantage of these published techniques, but there is a lag. Over multiple cycles, the open ecosystem’s portfolio benefits from the immediate union of everyone’s work, whereas the closed labs’ portfolios are built on their own work and whatever they pull from the open pool. They’re downstream of the ecosystem they don’t directly contribute to.
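The coverage argument above can be illustrated with a deliberately crude toy model. It assumes a one-dimensional “architecture space” of 100 points, and it assumes each lab explores a window of the same width around its own starting position. Both are simplifications invented for this sketch, as are the specific centers and radius; nothing here measures real labs.

```python
def searched_region(center: int, radius: int, space_size: int = 100) -> set:
    """Points a lab explores: a window around its starting position (toy assumption)."""
    lo = max(0, center - radius)
    hi = min(space_size - 1, center + radius)
    return set(range(lo, hi + 1))


RADIUS = 8  # same search budget for every lab (assumption)

# One lab, one starting position.
closed_lab = searched_region(center=50, radius=RADIUS)

# Five labs with structurally different starting positions, pooled by publication.
open_labs = [searched_region(center=c, radius=RADIUS) for c in (10, 30, 50, 70, 90)]
open_union = set().union(*open_labs)

# Five labs that all share the same starting position: more attempts, no new coverage.
same_start = set().union(*[searched_region(center=50, radius=RADIUS) for _ in range(5)])

print(len(closed_lab), len(open_union), len(same_start))
```

In this toy setup the pooled coverage grows because the starting positions differ, not because there are more attempts: five labs searching from the same position cover no more ground than one.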

Constraint forces reformulation. Constraint diversity is a feature, not a bug. DeepSeek is the best example: a Chinese lab that is itself a subsidiary of a quant hedge fund, operating under export controls and denied leading-edge Nvidia chips, shipped a frontier-competitive reasoning model through architectural and systems-level efficiency (MoE, FP8 training, novel attention) rather than brute-force scaling. Its constraints led to creativity that is now industry standard.

The bazaar is not a moral force, though. It wins under a specific set of conditions, and not otherwise. The next two sections walk through those conditions throughout history, then we test the AI case against them.

When Bazaar Production Wins vs Loses

Linux and Apache vs Proprietary Unix and IIS

Bill Gurley described this battle in real time in a 1999 essay where he called Linux the “invisible swordsman”—a fearsome competitor who had no head to cut off, no pricing strategy to undercut, and no acquisition target to absorb. By the early 2000s, Apache had displaced IIS in the web-server market, leaving IIS only a foothold in the Microsoft-shop enterprise segments where Active Directory integration was the differentiator. By the late 2000s, Linux dominated any workload it could handle, rendering Solaris, HP-UX, and AIX obsolete. The pattern was visible if you knew what to look for, but largely invisible if you only read the trade press.

The five conditions for bazaar production to win

  1. Production-grade reliability: Bazaar meets the workload’s operational bar.

  2. Commodity hardware substrate: Hardware is available with no incumbent capture.

  3. Substantial cost gap: An order-of-magnitude difference, not 10-20%.

  4. Capability convergence on important workloads: Substitution doesn’t arrive uniformly—it arrives workload by workload.

  5. Differentiation surface above the code: The customer’s actual choice axis has moved from the code itself to layers above (ops, integration, ecosystem). The proprietary product is no longer the crux of their decision.

Each of these conditions maps to a mechanism described earlier: dispersion of the production function, cost-structure advantage, iteration advantage, and the move from open standards to open source.

MySQL vs Oracle, and the hosted-service layer

MySQL became sufficient for read-heavy web workloads with simple consistency requirements years before it became good enough for transactional core-banking. The web workloads tolerated eventual consistency, scaled horizontally, and didn’t carry deep audit obligations. Core banking required ACID transactions, integration with decades-old enterprise systems, and regulatory compliance. Oracle retained what it did because the operational and integration thresholds for substitution lagged capability convergence in the workloads it served.

Where substitution does happen, value relocates rather than disappears. The per-CPU license revenue that used to accrue to Oracle moved up the stack to managed services sitting above open code, such as AWS RDS on MySQL and Postgres, Databricks on Spark, and Confluent on Kafka. The community charges for convenience, not access: the code is free, but running it at scale isn’t, and companies pay someone to absorb that operational burden.

Where bazaar production stalls

The bazaar doesn’t always win out though. Two famous examples: 1) Desktop Linux vs Windows, and 2) OpenOffice vs Microsoft Office. Both ran into the same issue: the differentiation surface hadn’t moved above the code. The real differentiators were around the code: integration with workflows, file formats, ecosystem apps, user habits. These all lived inside the proprietary product’s domain rather than above it. Both also ran into the deeper mechanism condition: production-function dispersion wasn’t even. The inputs to a competitive desktop OS or office suite required coordinated effort across design, ecosystem partnerships, and continued investment that distributed production didn’t consistently produce.

Gurley’s 2003 essay “Software in a Box” identified the deeper failure mode: the operational burden of running open infrastructure at scale without vendor support is heavier than the headline cost comparisons suggest. An open-source project doesn’t carry the support obligation a proprietary vendor does. Because there’s no commercial relationship, no one is paid to take the on-call pager. If the customer is unwilling (or unable) to carry it, and no managed-service layer has emerged to absorb it, the open alternative loses share even if it has a significant price advantage. The wins follow the same logic in reverse: Linux won because Red Hat existed, MySQL won because RDS existed, and Kafka won because Confluent existed. Bazaar production wins out when the differentiation surface moves above the code and a layer emerges that absorbs the operational burden a proprietary vendor used to carry. If either is missing, the cathedral stays in charge.

This has two specific implications for AI:

  1. “Open weights” by themselves don’t displace closed APIs: Displacement requires an operational layer above the weights (managed inference services, in-house infrastructure) that is substantial enough to absorb the burden the closed API used to carry.

  2. Closed retains share even at a substantial cost gap for certain workloads: Customers are willing to pay where operational burden is most significant, such as deployments with high stakes, low tolerance for error, and deep enterprise integrations.

The 2000s and 2010s revealed a second dynamic alongside bazaar-economics: deep-pocketed strategic actors who deployed open source as a competitive weapon.

Open Source as Layer Defense

If a company has a castle in one layer, it is naturally incentivized to commoditize adjacent layers that could threaten it, usually by accruing enough margin to fund an attack or by becoming concentrated enough to squeeze it. Open source is the most aggressive form of this, where published code erodes margin in the target layer to zero. The bazaar runs this playbook from below, through distributed contributor economics, but cathedrals can run the same playbook from above, sponsoring open source as a weapon against adjacent extractive layers.

How the tool sharpened across three cycles

  1. Intel and the motherboard (1995): Intel wanted to push complete-system prices down to drive processor demand, but at 85% share, cutting processor prices would have meant giving up margin on every unit it sold. Instead, it entered the motherboard market at ~15% share. From that position, it could set the price floor across the layer while paying the discount on only its own units and forcing the rest of the market to match. Gurley laid this out in his May 1995 essay “A New Intel Motherboard Theory,” and Joel Spolsky later formalized the principle as “commoditize the complement.” Intel published the interface spec and entered the adjacent layer as price-setter, collapsing the rents available there while protecting the layer its castle was in.

  2. Google and Android (2007-2011): The iPhone put Apple in between Google and the mobile user. Google had to pay Apple a hefty fee to be the default search engine—a tax that scaled with mobile’s growth, and one that Apple could raise at will. Google’s response was Android: an open-source mobile OS combined with revenue-sharing on the ads it carried, structured to incentivize the OEMs and carriers to ship it. Gurley laid this out in his 2009 essay “Less Than Free” and sharpened the framework in his 2011 essay “The Freight Train That is Android,” calling Android “the greatest legal destruction of wealth in history.” Android wasn’t simply a product; it was a moat funded by Google’s search castle and deployed to scorch the earth around the castle’s perimeter. The tool sharpened in two ways from the Intel-motherboard version: 1) open-source code drove not just standardization rents to zero, but also implementation rents to zero—not only were the specs given away, so was implementation, and 2) revenue-sharing inverted the price below zero by paying carriers to adopt; no open-standards play can do this because they have nothing to give away.

  3. The AI era (2023-Present): Each cycle publishes a different layer of the stack. Cycle 1 published interfaces and Cycle 2 published implementations. Cycle 3 is publishing the trained capability itself: model weights (and, increasingly, the compute substrate to serve them). Meta’s Llama opened the model layer underneath its social and ads castle, DeepSeek’s R1 released a frontier-grade model under a permissive MIT license that drew global adoption within weeks, and Nvidia committed $26bn to training their own open-weight models. The silicon layer-defender is attempting to become a frontier lab in order to commoditize the layer that sits on top of its chips. The tool has been sharpened again: what’s published has moved from spec to code to trained model, and the layer-defender increasingly funds the development rather than the contribution being organic.
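The Intel arithmetic in the first cycle can be worked through with round numbers. This is a sketch under stated assumptions: the 100-unit market and the $50 price cut are hypothetical, and it assumes rival motherboard makers fully match Intel’s price, so the complete-system price falls by the same amount under either option. Only the 85% and ~15% shares come from the text.

```python
MARKET_UNITS = 100        # hypothetical market size, in systems sold
PRICE_CUT_USD = 50        # hypothetical reduction in complete-system price

PROCESSOR_UNITS = 85      # Intel's processor share of the 100 systems (85%, from the text)
MOTHERBOARD_UNITS = 15    # Intel's motherboard share of the 100 systems (~15%, from the text)

# Option A: cut the processor price. Intel gives up the discount on every processor it sells.
cost_cut_processor = PROCESSOR_UNITS * PRICE_CUT_USD

# Option B: cut the motherboard price. Intel pays only on its own boards;
# rival board makers absorb the cut on the remaining units (assumed full matching).
cost_cut_motherboard = MOTHERBOARD_UNITS * PRICE_CUT_USD
cost_borne_by_rivals = (MARKET_UNITS - MOTHERBOARD_UNITS) * PRICE_CUT_USD

print(f"Option A costs Intel ${cost_cut_processor:,}")
print(f"Option B costs Intel ${cost_cut_motherboard:,}")
print(f"Option B costs rival board makers ${cost_borne_by_rivals:,}")
```

With these inputs, the same system-level price drop costs Intel several times less when it is delivered through the adjacent layer, and most of the bill lands on the layer being commoditized.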

The playbook in practice

Three conditions produce layer-defense open source: 1) a strategic actor with capital in an adjacent layer, 2) a read of the situation favoring commoditization over capture, and 3) continued commitment to the open posture as the situation evolves. When all three hold, the pattern produces open source at industrial scale. If any fails, the pattern stalls.

The pattern isn’t rare, though. Gurley named it in his 2026 essay “From Open Source Software to Open Source Strategy,” and walked through six matured examples, each with a strategic actor commoditizing an adjacent layer to defend a castle:

  1. Android (2007): Google open-sourced a mobile OS to neutralize Apple on mobile search.

  2. Open Compute Project (2011): Facebook open-sourced data center hardware designs to commoditize the supply chain feeding its social/ads infrastructure.

  3. Kubernetes (2014): Google open-sourced container orchestration to neutralize AWS lock-in on customers Google needed in its cloud.

  4. LF Networking (2018): Telecom carriers organized around open networking stacks to erode Cisco/Juniper/Nokia/Ericsson pricing on the equipment below their service businesses.

  5. RISC-V (2010): An industry coalition standardized an open CPU instruction set to commoditize ARM and x86 licensing fees.

  6. Overture Maps (2022): AWS, Meta, Microsoft, and TomTom jointly open-sourced map data to commoditize the geospatial moat underneath Google’s ads and Amazon’s logistics.

Five of the six sit under neutral foundations (Linux Foundation, CNCF, OCP Foundation, RISC-V International, Overture Maps Foundation) which exist specifically to prevent any single contributor from recapturing the project the way Google recaptured Android. The institutional infrastructure is the third condition operationalized: continued commitment locked in by governance.

Meta and Llama: both a proof point and a cautionary tale

Meta’s 2023 release of Llama was textbook layer-defense. Meta’s social and ads castle was threatened by a closed-model layer calcifying between Meta and its own products. Open-weight Llama meant Meta wouldn’t have to negotiate API terms with whoever consolidated the layer. Zuckerberg’s 2024 “Open Source AI Is the Path Forward” made the logic explicit. However, there was a caveat: Llama’s terms weren’t OSI-approved, and carved out commercial use by the largest competitors, meaning it was open enough to commoditize the model layer for the developer ecosystem, but closed enough to keep out the firms that could weaponize it.

The Llama release wasn’t a one-off for Meta. Zuckerberg had acquired Oculus and invested over $50bn into the metaverse as a preventative measure against someone taking control of the metaverse in a way similar to Apple’s control over the mobile layer. Zuckerberg was willing to spend at scale if it meant he could own the next paradigm and not be forced to pay rent. The metaverse was the direct-ownership version of it, and Llama was the commoditize-the-complement version of it. Same move, different layer. Same castle, same actor, same strategic logic across two technological waves, but different tools each time.

For two years, the playbook ran as predicted. Then in 2025, Llama underperformed at launch. Behemoth was shelved. Meta formed Superintelligence Labs and rebuilt its AI stack from scratch. In April 2026, MSL released Muse Spark, but withheld its weights. “Open Source AI Is the Path Forward” is no longer the active strategy.

The walkback deserves extra attention though, because it is often misread. Open didn’t fail; Meta’s strategic calculus changed. When Meta thought it couldn’t be at the frontier itself, commoditizing the model layer was the right move. If no one could win that layer, no one could charge them for it. But once they decided they might actually be able to compete at the frontier, the logic inverted: a layer you can win is a layer worth capturing rather than commoditizing. Open source didn’t stop working, but Meta’s incentive to subsidize it did.

But Meta’s walkback isn’t as important as what happened next: The open ecosystem didn’t collapse—Chinese labs took on Meta’s role, and by mid-2026, are leading the open-weight frontier. The five labs operate independently, but are unified under explicit national strategy support across two consecutive Five-Year Plans. The ecosystem routed around its largest Western contributor because demand for open weights is structural, not contingent on any single actor. Of the three forces driving open-source AI, layer-defense is the most contingent—it depends on specific incumbents continuing to deploy the playbook. The other two don’t. Production economics and consumption economics operate regardless of whether Meta keeps releasing Llamas. That the ecosystem absorbed the walkback without disruption is the evidence.

The pattern playing out

  • Google’s split posture: Google has multiple castles, and what it open sources tells us which one it’s defending. Gemini stays closed—it’s bundled into Search, Workspace, and Cloud, where integration is the source of premium pricing. Gemma is open—a hedge against Meta and Chinese labs owning the open-weight developer ecosystem. Closed where integration pays; open where commoditizing the competition pays.

  • The Chinese cluster: Qwen, DeepSeek, Kimi, GLM, and MiniMax benefit from corporate castles (such as Alibaba’s ecommerce and cloud businesses and High-Flyer’s hedge fund) and a sovereign-AI position the Chinese state has codified as strategic. Ion Stoica’s analysis grounds the supply side: roughly half the world’s AI researchers are Chinese, academic AI defaults to open, and industry labs inherit that default as their R&D model. The structural conditions for bazaar production are codified into university advancement policies and consecutive Five-Year Plans.

  • Nvidia’s $26bn open-model commitment: Nvidia’s castle is GPUs. Every dollar of margin captured at the model layer is margin Nvidia could capture at the silicon layer instead, and every dollar of model-layer rent that suppresses inference volume is suppressed GPU demand. Nvidia has both the strategic motive and the balance sheet to commoditize the model layer at industrial scale. It’s a more advanced version of the play Intel ran when it commoditized the motherboard.

  • Hyperscalers running different plays: Microsoft hedges across every layer with equity in OpenAI, distribution of Llama and Mistral on Azure, and Phi built in-house. Amazon is targeting silicon. The threat to AWS isn’t models—it’s Nvidia. Amazon invested $50bn in OpenAI and $33bn in Anthropic, both contractually committed to Trainium at multi-gigawatt scale. The investments are the wedge, and the silicon commitments are the prize. Same frameworks as Google and Meta and Nvidia, different target layer.

How Open Plays Out in AI

The dispersion signature and Raymond’s institutional conditions

As we’ve seen from the five Chinese open-weight labs and Mistral, portfolio variance under different constraints isn’t theoretical—it’s the empirical pattern that produced the open-closed split frontier in 2026.

The supply-side question of why distributed contributors show up at all has a well-established answer: reputation, peer recognition, recruiting, and the satisfaction of high-quality work as the scarce resource in a gift culture parallel to academia. In the Chinese ecosystem, these conditions hold with unusual precision: academic-industry porosity, publication norms, and reputation as the unit of advancement are all explicit policy. The 14th Five-Year Plan made open source a national strategy; the 15th doubled down on it. The 2026 Government Work Report called for Chinese AI models to lead the global open ecosystem. Raymond’s institutional conditions are now state policy.

Raymond’s five conditions applied to AI:

  1. Reliability and stability are non-negotiable.

  2. Architecture is not a unique rent-capture technique that depends on secrecy.

  3. Openness commoditizes a strategic basis of competition for adjacent actors.

  4. Cooperation across rivals to prevent any single actor from achieving a chokehold.

  5. Technology is infrastructure for things built on top of it.

All five apply. We see them as: 1) AI is embedded in production workflows where failure has commercial consequences, 2) DeepSeek-R1’s reasoning post-training was replicated within months, 3) layer defense, 4) neutral foundations, and 5) AI is the infrastructure for everything built on top of it.

The model layer’s structural configuration

The model layer sits in the structural position that has produced commoditization at every prior layer in every prior cycle:

  • Highest production cost in the stack: billions in costs per frontier training run.

  • Fastest capability decay: multiple labs close the capability gap within months.

  • Lowest downstream switching cost: orchestration tooling reduces it further.

The supposed model-layer switching costs are weaker than they look:

  • API integration: orchestration tooling (Cline, OpenRouter, LiteLLM) is standardizing on OpenAI-compatible schemas. The integration surface becomes a portable abstraction. Closed labs get the feature lead, but orchestration tools capture the integration moat.

  • Post-training: the major techniques that produce capability leads get published, leaked, or reverse-engineered within a cycle. Closed labs sometimes hold specific data and recipes, but the methods themselves don’t stay proprietary. They may be first, but they aren’t the only ones.

  • Evaluator preference: apps increasingly pick the best model per task rather than committing to a single provider, which means leading on aggregate benchmarks doesn’t lock in users when they’ll route around you on the tasks where you’re not best.

Training moats and inference moats are different categories. Training moats (compute scale, data, talent, infra) are real and persist, whereas inference moats (serving compute, deployment tooling, customer integration, billing) get weaker by the day as orchestration commoditizes them. The structural argument doesn’t need training moats to disappear, but it does need the capability gap they produce to close within a usable window—which has been compressing each cycle: GPT-4 to frontier-competitive open took roughly 12-18 months, whereas o1 to DeepSeek-R1 took four months. The cycle: closed lab achieves a breakthrough through superior training scale, the bazaar absorbs the technique, the gap closes, repeat. The training moat is real, but the inference advantage is not durable.

The customer-side convergence

Three intellectual traditions get us to the same conclusion despite different starting points:

  1. Raymond’s buyer-side risk argument (The Magic Cauldron): Closed source is supplier monopoly: the buyer is locked in by initial investment and training costs, the software serves the supplier’s roadmap rather than the buyer’s, and depending on closed-source code is “an unacceptable strategic business risk.”

  2. Gurley’s structural-better-for-buyers observation (1999-2025): From “The Rising Impact of Open Source” forward: open caps pricing power, prevents lock-in, and produces better outcomes on safety, security, innovation, cost performance, sovereignty, and academic participation. Open isn’t just cheaper, it’s structurally preferable for the buyer along nearly every dimension that matters.

  3. Chris Paik’s incentive-trap analysis (Cursor’s Problem; Strong Winds, Big Sails): Subscription-priced AI products set fixed revenue against variable compute cost, which is an incentives time bomb. Power users, exactly the users the seller most wants, are the users who lose the seller money. The buyer’s structural incentive is to flee toward open inference, where per-prompt cost is their own marginal compute cost rather than a tax to subsidize someone else’s death spiral.

The convergence of three independent mechanisms is stronger evidence than any one alone.
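
Paik’s fixed-revenue, variable-cost point can be made concrete with a little arithmetic. The sketch below is illustrative only: the flat price, the per-prompt compute cost, and the usage levels are hypothetical numbers chosen for the example, not figures from the memo or from any vendor.

```python
# All figures are hypothetical, in integer cents to keep the arithmetic exact.
SUBSCRIPTION_PRICE_CENTS = 2_000   # assumed flat monthly price ($20)
COST_PER_PROMPT_CENTS = 2          # assumed seller compute cost per prompt ($0.02)


def seller_margin_cents(prompts_per_month: int) -> int:
    """Seller's monthly margin on one flat-rate subscriber.

    Revenue is fixed by the subscription; cost scales with usage.
    """
    return SUBSCRIPTION_PRICE_CENTS - prompts_per_month * COST_PER_PROMPT_CENTS


def break_even_prompts() -> int:
    """Usage level above which a subscriber costs more than they pay."""
    return SUBSCRIPTION_PRICE_CENTS // COST_PER_PROMPT_CENTS


if __name__ == "__main__":
    for label, prompts in [("light user", 200), ("power user", 5_000)]:
        print(label, seller_margin_cents(prompts))
    print("break-even prompts:", break_even_prompts())
```

Under these assumed numbers, the light user leaves the seller 1,600 cents of margin, the power user costs it 8,000 cents, and the sign flips at 1,000 prompts a month. Revenue stays flat while cost scales with usage, which is the trap Paik describes.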

Substitution gradient, on-device threshold, and market expansion

Paik’s “Three Brothers” essay is the best articulation of the substitution gradient. The eldest—closed frontier (OpenAI, Anthropic)—gets the best stuff first. The middle—open-source cloud-hosted (Qwen, DeepSeek, Mistral on providers like Together and DeepInfra)—gets near-frontier capability at significantly lower cost. The youngest—local inference—gets older capability on a device where per-call cost is zero to both user and developer. The gradient is operative on every API call—a developer choosing between GPT-5.5 at full rate, Qwen-on-DeepInfra at an order of magnitude cheaper, and an 8B model running locally at zero makes a margin decision per call, not a strategic decision per company. No consortium organizes it; the hierarchy compresses each cycle; the youngest brother inherits it all.
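
The per-call margin decision can be sketched as a tiny router. This is a minimal illustration, not any orchestration tool’s actual logic: the tier names, capability scores, and prices are hypothetical, with only the rough ratios (hosted open about an order of magnitude cheaper, local at zero) taken from the paragraph above.

```python
from dataclasses import dataclass, replace
from typing import Sequence


@dataclass(frozen=True)
class Tier:
    name: str
    capability: int         # hypothetical score; higher means more capable
    price_per_mtok: float   # hypothetical USD per million tokens


# The three brothers, with assumed numbers.
TIERS = [
    Tier("closed-frontier", capability=3, price_per_mtok=10.0),
    Tier("hosted-open", capability=2, price_per_mtok=1.0),
    Tier("local-8b", capability=1, price_per_mtok=0.0),
]


def route(required_capability: int, tiers: Sequence[Tier] = TIERS) -> Tier:
    """Pick the cheapest tier that is good enough for this one call."""
    eligible = [t for t in tiers if t.capability >= required_capability]
    if not eligible:
        raise ValueError("no tier meets the required capability")
    return min(eligible, key=lambda t: t.price_per_mtok)


def next_cycle(tiers: Sequence[Tier]) -> list:
    """Model one cycle of compression: the local tier gains a capability level."""
    return [
        replace(t, capability=t.capability + 1) if t.name == "local-8b" else t
        for t in tiers
    ]


if __name__ == "__main__":
    print(route(2).name)                     # mid-difficulty call today
    print(route(2, next_cycle(TIERS)).name)  # same call one cycle later
```

In this sketch a mid-difficulty call goes to the hosted open tier today and to the local tier once the local model gains one capability level. Nothing in the calling code changes between the two cycles, which is the sense in which the decision is made per call rather than per company.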

Convenience is the counterweight—Paik’s aphorism, “in aggregate, users will always prefer convenience over owning their own data or privacy,” predicts cloud convenience keeps some users on closed even where local is technically good enough. The resolution: convenience determines the rate of substitution, not its direction.

Paik’s “Minimum Viable Infrastructure” identifies tokens-per-second on-device as the next “Why Now” gate, mapping the sequential-by-bandwidth pattern (Twitter->Instagram->Snapchat->TikTok) onto local inference. “The End of Cloud Inference” describes the technical conditions: Apple Silicon’s unified memory lets memory-hungry models fit; MLX makes on-device deployment practical; Stable Diffusion on a Mac with no network is a proof point for image; Qwen 3 8B on consumer hardware is a proof point for language. The conditions for substitution toward the edge are now present for workloads that don’t need frontier capability: free to both user and developer at the margin, with lower latency, better privacy, and connectivity independence. The threshold rises each cycle as open models improve on existing hardware. Wearables are a natural fit, as they have structural advantages on latency, privacy, bandwidth, and battery that make local the obvious deployment. The black hole inversion of inference demand happens slowly, then all at once.

Market expansion creates new volume closed cannot profitably serve. As Paik argues in “The End of Software,” LLMs are driving the cost of creating software toward zero. The parallel to media: when the internet drove content-creation costs to zero, content went from “expensive and has to make money” to “free and can exist without making money.” This didn’t mean established media disappeared, but they did lose the marginal user to a long tail of UGC creators. Vogue wasn’t replaced by another fashion media company—it lost share to ten thousand influencers. Software is on the same trajectory: Salesforce won’t be replaced by another monolithic CRM—it will lose marginal seats to a constellation of small, vertical applications tackling individual use cases. We already have evidence of this: the proliferation of AI-native software companies, prompt-to-application tooling, and startups building with order-of-magnitude smaller teams.

This is the volume that closed-frontier APIs can’t serve profitably because the median application in the Cambrian explosion cannot pay closed-API rates at scale. A two-person startup can’t afford GPT-5.5 at API rates for every single interaction, but it can afford Qwen-on-DeepInfra at 1/10th the cost, or a local 8B model at zero cost. This market is not a zero-sum one—it is net new volume. Closed labs don’t lose this segment to open because they never had it. Open captures it because it is the only economically viable substrate for customers that otherwise wouldn’t exist. Cambrian-explosion growth is additive, not substitutional. And it is additive in the segment that, in absolute terms, grows fastest.

Cursor’s and Airbnb’s decisions to go with open are evidence for one segment specifically: companies with sophisticated engineering teams and cost-sensitive production deployments, not Fortune 500 companies with IT departments that are overly concerned about accountability and optimize procurement for managed-vendor relationships. Both Airbnb and Cursor made independent margin-driven routing decisions to Chinese open-weight models in production. Their decisions are the leading indicator for that segment, whereas the Fortune 500 companies are closer to the premium tier.

Orchestration and the three forces synthesized

“Strong Winds, Big Sails” highlights Cline as an example of the orchestration layer: its architecture is both model-agnostic and provider-agnostic. At any moment, it routes calls to the best available model at the lowest price the market will clear. It charges via metered pass-through, not subscription with hidden compute spreads. Paik writes, “the open-source ecosystem never tried to monetize access to software. It conceded that software wants to be accessible and charges for convenience: hosting, reliability, upgrades, security, team workflows, and enterprise support.”

The orchestration layer standardizes the integration surface across closed and open models, which is what makes substitution unilateral on a per-call basis. Switching costs don’t disappear; they move. Cline, OpenRouter, and LiteLLM capture meaningful lock-in at the abstraction layer: switching models within an orchestration vendor is trivial, but switching orchestration vendors is not. The relationship is structurally similar to cloud regions versus cloud providers: easy movement within, but friction across. This is real value capture without recreating closed-API lock-in, and it’s also the operational-burden-absorbing layer open ecosystems have historically required to win. Linux won because Red Hat existed, MySQL won because RDS existed, Kafka won because Confluent existed. Open weights win because Cline, OpenRouter, LiteLLM, Together, DeepInfra, and the broader ecosystem exist. The historical precondition is already met.

The three forces operate simultaneously: layer-defense from above, production economics from below, consumption economics from the demand side. Any one stalling would slow the trajectory, but the other two would carry it through. The structural conditions producing each force are independent, so all three would have to stall at once for the entire open ecosystem to also stall.
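
The independence claim carries the weight here, and a short calculation shows why. A minimal sketch, assuming each force has some per-cycle chance of stalling (the 20% figure is purely hypothetical) and taking the memo’s independence premise at face value:

```python
import math
from typing import Iterable


def joint_stall_probability(stall_probabilities: Iterable[float]) -> float:
    """Chance that every force stalls in the same cycle, assuming independence."""
    return math.prod(stall_probabilities)


if __name__ == "__main__":
    # Hypothetical: each of the three forces stalls in one cycle out of five.
    per_force = [0.2, 0.2, 0.2]
    print(joint_stall_probability(per_force))
```

With these assumed inputs, any single force stalls one cycle in five, but all three stall together in fewer than one cycle in a hundred. If the forces were correlated instead, the joint probability would sit closer to the single-force figure, which is why the independence premise matters.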

Open’s trajectory depends on several conditions holding:

  • The “End of Software” dynamic continues materializing into actual application creation.

  • Orchestration tooling continues maturing and reaches beyond sophisticated developers.

  • The capability gap continues to close within each cycle.

  • Strategic actors with structural reasons to open continue to fund production.

  • Energy and capex constraints fall similarly on closed and open rather than asymmetrically advantaging closed through preferential access.

These are each real dependencies. The open ecosystem doesn’t require them to hold perfectly, only well enough that the three forces continue to compound. So far they do.

Why the pattern keeps getting missed

The open pattern has now been visible for multiple cycles, yet sophisticated observers continue to be late to it.

Two parts that compound: distortion and cognitive failure

Two components compound this miss: 1) the structural distortion of public discourse, and 2) cognitive failures that allow the distortion to succeed. Distortion alone would be visible if observers were scoring the configuration correctly. Correct scoring alone would catch the trajectory if the discourse weren’t acting against it. The two components reinforce each other.

Distortion takes two forms:

  • Hard: A specific actor runs a public messaging operation aimed at shaping what sophisticated observers think. Microsoft’s anti-Linux campaign of the late 1990s is the canonical case: Halloween Documents, FUD strategy, and coordinated analyst outreach.

  • Soft: This is when the loudest default frames in a discourse serve the interests of the firms with castles to defend—which they reliably do, because the firms with the loudest voices are the firms with the most to lose. When the dominant mental models in a market are drawn by incumbents whose castles are in the layer being commoditized, observers who defer to those models are running their analysis on a map drawn by the firms being commoditized. They reach the conclusions the map was drawn to produce.

Four cognitive failures further compound the distortion:

  1. Capability lags cost: Observers scoring on capability at any single moment underweight the trajectory of the cost variable beneath it. Capability shows up in the headline benchmark, but cost determines who is deployed at scale.

  2. Structural dispersion is seen as competition: Multiple actors achieving things near-simultaneously gets interpreted as a competitive race rather than the signature of a shared production function. Five Chinese labs reads as five competing companies, but it’s one dispersed production function operating in parallel.

  3. Cross-layer value flow is invisible: Value increasingly flows to layers different from the one any individual firm produces in, and a frame with no concept of value elsewhere in the stack will miss it entirely. When closed labs lose model-layer rent, the value relocates to silicon, orchestration, and applications—layers the closed-model lens doesn’t track.

  4. Voice asymmetry is structural: Open has no marketing organization, no quarterly earnings, no IPO, no narrative apparatus. The closed alternative’s voice is structurally louder, and analysts and journalists default to the actors with the apparatus to be talked to.

The four failures together produce a baseline perception skewed against the open trajectory before any analysis even gets started.

The 1990s: Microsoft and the Linux narrative

Microsoft vs Linux is the best example of hard distortion. Microsoft’s public message was that Linux was a hobbyist project, that the license created IP risk, that the lack of a single vendor made support impossible, and that the OS was structurally unsuitable for production. The message was consistent across press, analyst briefings, and customer materials, and it was landing. The trade press and equity research community treated Linux as a curiosity rather than a competitor.

But in 1998, internal Microsoft strategy memos were leaked. Raymond annotated and published these “Halloween Documents,” which showed Microsoft’s true internal assessment: Linux was a real, long-term threat to Windows NT in the server market and the open-source development process was a structural advantage Microsoft could not match. Microsoft considered tactics to slow Linux down such as FUD on IP risk, exploitation of the single-vendor accountability gap, and protocol extensions to break interoperability. Microsoft’s internal documents accurately assessed the situation—their public messaging did not.

The structural distortion is in the gap: Microsoft was the loudest and most-quoted public voice on Linux’s viability even though its business model depended on Linux being seen as not viable. The trade press and equity research community relied on Microsoft for sources, briefings, and corporate access, so hard-form distortion produced the public assessment. Cognitive failures ensured it landed without much resistance.

The late 2000s: Android and strategic-incoherence framing

Android was a case of soft form distortion. No single incumbent shaped the discourse the way Microsoft had with Linux, but the dominant framing among sophisticated observers was that Google was making a strategic mistake by giving away the mobile OS because the OS is where value would accrue. The framing was widespread—and wrong. It worked from a mental model that put the castle in the layer the firm produced, not the layer the firm monetized. Google’s castle was advertising—the mobile OS was a route between Google and the user that Apple was trying to monopolize. Open-sourcing Android wasn’t strategic incoherence—it was the most aggressive form of layer defense available to a firm whose castle was elsewhere. Gurley called this out in real time in “Less Than Free” (2009) and “The Freight Train That Is Android” (2011). The cognitive failure was a frame that located the castle in the layer it produced rather than the one it monetized.

2023-Present: Safety, regulation, and the lobbying record in AI

Since 2023, leading closed labs such as OpenAI and Anthropic have increasingly lobbied for compute-threshold reporting rules that fall on open-weight releases more than closed deployments, export controls calibrated to constrain open competitors more than the labs themselves, and safety frameworks calibrated to closed-API delivery: pre-release review and post-release accountability that closed APIs can satisfy and open weights can’t.

The argument isn’t that these safety concerns are disingenuous—Anthropic in particular has built much of its public identity around safety arguments. The argument is about who shapes the regulatory discourse and what it’s calibrated to produce. And sincerity at the firm level is compatible with structural distortion at the discourse level. Sincere advocates can still produce captured outcomes. This is what happened in the 1990s.

We can see this most clearly in the closed labs’ lobbying records. SB 1047 in California (2024, vetoed in September) imposed pre-release safety evaluation, kill-switch, and liability requirements on models trained above a certain compute threshold. Anthropic pledged conditional support dependent on amendments, whereas OpenAI publicly opposed it on federal-preemption grounds. SB 53 (effective January 2026), a narrower version of the previous bill, imposed safety frameworks, transparency requirements, and incident reporting on large AI model makers. Anthropic endorsed this one fully, whereas OpenAI opposed it. Both firms’ lobbying disclosures and policy publications across 2023-2026 advocated for variations of regulatory capture: compute-threshold reporting, frontier-model evaluation regimes, pre-release review, and licensing or registration for frontier models. Federal activity, such as the Biden executive order, the NIST framework, and the proposed AI Safety Institute pre-release evaluation regime, tracks the same shape. Slide 57 of Bill Gurley’s 2,851 Miles talk (2023) called this dynamic out in real time: incumbents lobbying with negative open-source messaging precisely because they recognize it as their biggest threat.

The regulatory-capture move and the payoff

Recall the earlier distinction between open standards and open source. Open weights share the decisive property of open source: they are published unilaterally, under a license the publisher chooses. There is no committee to capture, no specification to extend in incumbent-favorable directions, no implementation surface that preserves rents for the strongest player. The regulatory architecture the closed labs are lobbying for would change that—pre-release review, evaluation regimes, government licensing for deployment, and institutional checkpoints between the model and the market.

The historical lesson about open standards—committees get captured, implementation surfaces favor incumbents, Principle 6 produces incumbent-favorable outcomes—comes back into play the moment the regulatory architecture exists. Incumbents in an open-source market have a strategic card to play: lobby for institutional architecture that re-imports the open-standards properties open source had eliminated. Regulation isn’t just a response to losing the economic argument—it’s a move to convert open source back into open standards, at which point the configuration that historically advantaged incumbents applies again.

Two reads of the regulatory fight are both correct: 1) the closed labs are losing the economic argument and reaching for political tools as the economic ones stop working, and 2) the regulatory push is partially succeeding at re-importing open-standards properties, with second-order consequences for where the open-weight ecosystem anchors next. The first explains why the lobbying is happening, and the second explains what happens if it succeeds.

Market Structure and Incumbent Response

Bifurcation, not collapse

The post-displacement market structure is bifurcation, not collapse:

  • Closed retains: workloads where the closed labs’ specific advantages justify their premium pricing: alignment under adversarial pressure, long-horizon agent reliability, regulated enterprise integration.

  • Open captures: the cost-sensitive, high-volume tier, new vertical applications, and on-device workloads. The middle migrates over time as the capability gap closes.

DeepSeek’s hosted-API pricing is the clearest signal of bifurcation. They offer paid hosted access to their open weights at a fraction of what the closed labs charge for comparable inference, yet the closed APIs retain meaningful share. The cost gap, however large, has not yet pulled all share toward open. The equilibrium is bifurcation: pricing power compression on the closed high end, displacement on the cost-sensitive end where frontier capability isn’t required, and the middle migrating toward open as the capability gap closes and orchestration matures. Paik’s “Mac vs PC, iOS vs Android” put it simply: a cheaper open ecosystem with features and flexibility vs a more expensive closed ecosystem with focus and design.

Where value relocates

The layer-disaggregation logic predicts where value goes when the model layer commoditizes: orchestration, inference infrastructure, vertical applications, and managed services on open weights. The companies that build durable model-agnostic infrastructure in this window capture the value the substitution and expansion dynamics produce.

Five incumbent behaviors as expressions of one strategy

  1. Pricing compression: Closed APIs have been cutting prices repeatedly through 2024-2026, tracking the closing cost gap with hosted open alternatives. OpenAI has cut its flagship API prices multiple times and priced cheaper variants aggressively against DeepSeek’s hosted pricing. Anthropic’s Haiku and Google’s Gemini Flash occupy the same cost-tier position and have moved in the same direction. Cuts are concentrated where open is most competitive, but flagship pricing has come down too (albeit more slowly).

  2. Open-sourcing the complement: Llama, Gemma, and Nvidia’s $26bn investment are layer-defense plays. The Chinese labs reflect parallel cloud-complement and national-strategic logic. Meta walked back, but the ecosystem absorbed the loss. Their original commitment is evidence of the logic: firms most exposed to model-layer pricing power would open-source only if they had concluded the layer was commoditizing regardless.

  3. The licensing hedge: Llama’s non-OSI license with carve-outs for the largest competitors and Google’s Gemma/Gemini split were calibrations of the strategic play, not retreats. The firms were tuning where commoditization pressure should fall, not abandoning the play.

  4. The regulatory environment: The closed labs’ lobbying push for what are effectively standards is the apparatus side of hard form distortion—a move from market competition to political competition. The choice to spend political capital at this scale is evidence of how the firms read the economic position they were spending it from.

  5. Capex acceleration: The frontier labs and cloud partners are racing to lock up power, fab capacity, HBM allocation, and data-center capacity at unprecedented scale—Microsoft, Google, Amazon, and Meta announced 2026 capex at ~$700bn (~2x from 2025). There are two reads of this: 1) as layer-defense for cloud-inference, it’s what Nvidia and the hyperscalers would do to protect their castles, and 2) as doubling-down on the closed-frontier scale thesis, it’s a bet that capability-frontier inputs stay concentrated enough for labs to sustain premium pricing. These reads aren’t mutually exclusive, but they pull in different directions on whether incumbents believe the frontier will retain pricing power.

All five responses are layer-defense plays under different cost structures. It’s the same actors, with the same goal: preserve as much of the closed-model castle as possible while model-layer rents compress. The five responses are visible, but the pattern connecting them is not.

The geopolitical second-order consequences

The regulatory architecture being built will suppress American open-weight competition while leaving global open-weight progress unconstrained—the open-weight substrate has already moved offshore. The Chinese open-weight cluster operates outside the reach of American regulatory architecture and is actively backed by national strategic considerations pointing in the opposite direction. The closed American labs gain domestic insulation at the cost of the broader race.

The pattern isn’t new: The Telecommunications Act of 1996 was heralded as the most important reform in 62 years, with stated goals of promoting competition and encouraging new technology. Within 4-5 years, the top four carriers’ market share went from 48% to 85%. Within 10 years, VC investment in telecom equipment collapsed from 15% of total VC activity to below 1%, and the NVCA stopped tracking the category. The US ended up not manufacturing the equipment its own telecommunications infrastructure runs on. Regulation that protects domestic incumbents in the short term creates an offshore ecosystem that catches up and surpasses on a longer timeline.

Jensen Huang’s framing is informative because Nvidia sells into every layer of the stack, giving it the strongest cross-layer view of which layers will commoditize. He describes AI as a five-layer cake (energy, chips, infrastructure, models, applications) and notes that China has roughly half the world’s AI researchers and manufactures 60% of mainstream chips. He has argued that restricting Chinese access to American chips would force the global open ecosystem onto a Chinese compute stack: “extremely foolish” and “a horrible outcome for the United States.” The underlying read, from an operator whose castle is GPUs: the chip layer holds because the production function hasn’t dispersed; the model layer doesn’t because it has.

Without a credible Western open frontier player, the only open models capable of running entire economies are Chinese. Billions of people across Africa, Latin America, the Middle East, Southeast Asia, and India will pick the AI stack that is free, capable, self-hostable, and not subject to American export controls. The geography of the open outcome depends on whether a credible Western open frontier player emerges (none has yet: Mistral was closest but has trended more closed, and Gemma is partial) and whether U.S. policy restricts Chinese open-weight access in ways that push the global default toward Chinese models.

Open’s proliferation doesn’t depend on those choices. Where it anchors does.


If you’ve made it this far, thank you for reading. If the subject of this memo is something you’ve been thinking about, I’d love to hear from you (email; twitter).


I’ve also shared a list of titles of other private memos I may or may not share—paid supporters can reply or email me with any of the titles they’d like to read and I’ll prioritize accordingly, whether that means sending a draft, deciding to publish it, or discussing it over email/chat.
