Why Qwen — Not DeepSeek — Already Won the Open-Source AI Race
DeepSeek grabs the headlines. Qwen is quietly winning everything else.
That is the contrarian case Jon Markman makes in a sharp piece this week, and the data is harder to argue with than the narrative most American tech press has settled on. While DeepSeek was busy cutting token prices and getting fawned over by investors, Alibaba's Qwen crossed a billion cumulative downloads on Hugging Face — faster than any open-source model family in history — and then kept going. In February 2026 alone, Qwen generated 153.6 million downloads. The next eight competitors combined — Meta, DeepSeek, OpenAI, Mistral, Nvidia, Zhipu, Moonshot, and MiniMax — did not beat that number. Qwen captured over 50% of all global open-source model downloads by March 2026. That is not a trend. That is a structural position.
The number that should matter to builders is not the headline download figure. It is the derivative count. Qwen has spawned 180,000 or more checkpoint variants on Hugging Face — more than Google and Meta combined. Stanford and UC Berkeley researchers built competitive fine-tuned models on Qwen for $30 to $50. That is not a marketing claim. That is a price point at which the open-weight stack becomes a genuine substrate for domain-specific work, the way Linux became the substrate for everything from Android to cloud infrastructure. The Jevons Paradox is kicking in: as token prices collapse, usage explodes, and the real value migrates one layer up — to the application layer, to fine-tuned variants, to the tooling built on top of the model rather than the model itself.
Markman's piece uses that framing explicitly, and it is worth sitting with. The conventional wisdom treats falling inference costs as a margin threat to AI labs — a race to the bottom that punishes the providers. The Jevons reading is different: when the cost of a fundamental resource drops sharply, demand for that resource increases until it is effectively ubiquitous, and the value migrates to whatever sits above it. Silicon became cheap. The value moved to chip design, software, and the internet. Token inference is becoming cheap. The value is moving to the application layers — and to the model families that developers build on, fork, fine-tune, and stick with.
Qwen is winning that layer, not because it is the most powerful model on every benchmark, but because it is the most usable. The open-weight distribution is simultaneous — GitHub, Hugging Face, and ModelScope on the same day as any major release. The serving guidance is concrete: Transformers, vLLM, SGLang, KTransformers, with actual server commands published on day one rather than a PDF and a prayer. That sounds boring. It is actually the competitive moat that matters most for adoption, because developers do not choose model families based on benchmark screenshots. They choose them based on whether the weights are easy to run, the docs are good, and the community is active. Qwen is winning all three.
The geopolitical signal is also harder to dismiss than it sounds. Singapore's government chose Qwen over Meta's Llama to build its sovereign regional AI model. Malaysia announced its sovereign AI ecosystem will run on Chinese open-weight models. These are not fringe decisions by small players. Singapore runs one of the most technically sophisticated governments in the world, with access to every model on the planet. They picked Qwen. That tells you something about the model quality and the licensing structure that the benchmark theater coverage keeps missing.
What about the gap? The Artificial Analysis Intelligence Index puts GPT-5.5 at 60, Claude Opus 4.7 at 57, and DeepSeek V4-Pro at 52. Qwen sits below the frontier on that specific index. DeepSeek V4-Pro lists at $1.74 per million input tokens and $3.48 per million output — promotional pricing drops output to roughly $0.87 per million versus GPT-5.5 at $30.21 and Claude Opus 4.7 at $25. That price gap is real, and it matters for high-volume applications. But it also obscures a more important distinction: cheap inference and frontier capability are still different products. The bottom of the market is commoditizing fast. The top is not.
For practitioners, the useful read is this: Qwen is not going to beat Claude Opus 4.7 on complex reasoning tasks if you measure by a benchmark chart. What it will do is give you a deployable open-weight model that fine-tunes cheaply, runs on hardware you own, and is strong enough for the workflows where the frontier gap is either irrelevant or manageable. Repo repair, code review, documentation generation, UI prototyping — the tasks where local inference makes more sense than routing everything to an API, where data governance matters, where you want to run evals on your own codebase rather than a synthetic benchmark. That is a real and growing category of work, and Qwen is the model family most serious about serving it.
The DeepSeek price-war narrative got the most press this quarter. The Qwen infrastructure story is the one that will age better. When inference costs approach zero for cached inputs, the question changes from "which API is cheapest" to "what application layer becomes viable at that cost profile." That is a much more interesting problem to be thinking about — and Qwen is better positioned to benefit from it than the headlines suggest.
The contrarian bet: Qwen ends up being the Linux of this generation of AI infrastructure, while DeepSeek ends up being more like Netscape — got there first on price, but ultimately the infrastructure layer that others build on is the one that wins the long game.
Sources: Forbes, SCMP, MIT Technology Review