China's AI Token Dumping Paradox: Cheaper Prices, Collapsing Models

Over 200 LLMs Trigger Bloody Price War
'Low Margin, High Volume' Strategy to Boost Market Share
Startups Struggle to Survive Against Massive Capital
Focus Shifts to Easy, Safe AI Agent Models
Chinese Models Rank High in U.S. and Korean Corporate Usage

International | By Park Si-jin
Seoul Economic Daily, International News from South Korea

$0.10 vs $3.

These are the per-million-token input prices of Xiaomi's artificial intelligence (AI) model "MiMo" and OpenAI's latest model. As of this year, the token price gap between Chinese and American AI has widened to an extreme. While more than 200 large language models (LLMs) within China are locked in price competition, the gap with other countries has grown even larger.

Comparing the top-tier models of major U.S. Big Tech firms with leading Chinese models, the difference ranges from 30 times to as much as 170 times. Latest models from OpenAI and Anthropic charge about $3 (approximately 4,400 won) for input and $15 (approximately 22,000 won) for output per million tokens. Chinese models such as DeepSeek and Qwen, by contrast, charge around $0.14 (approximately 208 won) for input. Xiaomi's recently unveiled "MiMo" has dropped to $0.10 (approximately 148 won). Armed with vast domestic data and open-source-based R&D, Chinese models are unleashing an ultra-low-price offensive.
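The gap quoted above can be sanity-checked with simple arithmetic. The sketch below uses only the per-million-token input prices cited in the article; the workload size is a hypothetical example, and real bills also include output tokens, caching discounts, and tier differences.

```python
# Per-million-token input prices quoted in the article (USD).
PRICES_PER_M_INPUT = {
    "OpenAI/Anthropic top-tier": 3.00,
    "DeepSeek/Qwen":             0.14,
    "Xiaomi MiMo":               0.10,
}

def input_cost(tokens: int, price_per_million: float) -> float:
    """Cost in USD for a given number of input tokens."""
    return tokens / 1_000_000 * price_per_million

# Hypothetical monthly workload: an agent pipeline consuming 500M input tokens.
workload = 500_000_000
for name, price in PRICES_PER_M_INPUT.items():
    print(f"{name}: ${input_cost(workload, price):,.2f}")

# Ratio between the extremes in the lede: $3 vs $0.10 is a 30x gap on input.
print(f"gap: {3.00 / 0.10:.0f}x")
```

On this illustrative workload, the same month of input traffic costs $1,500 at the U.S. top-tier rate versus $50 at Xiaomi's rate, which is the cost pressure driving the integration decisions described below.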

Price competition among Chinese tokens began with "DeepSeek-V2" in May 2024. Priced at 1 yuan (approximately 220 won) per million tokens, DeepSeek was tens to hundreds of times cheaper than existing GPT models. Baidu, ByteDance, Alibaba, and Tencent followed with domino-style price cuts, slashing token prices by 80% to 97% or offering them for free.

These companies adopted the ultra-low-price strategy as a tool for late movers to quickly absorb the global developer ecosystem. Even at the cost of sacrificing profits, they are expanding market share by providing open-source models and low-cost application programming interfaces (APIs), a so-called "low margin, high volume" approach. As "AI agents," which consume massive amounts of tokens, become mainstream, Chinese models are explosively absorbing demand from U.S. and other global firms seeking to cut costs by integrating them.

The Chinese government's sweeping subsidies have also played a role. While U.S. firms bear the cost of expensive power and infrastructure with their own capital, China has designated AI computing power as a core national project and poured massive subsidies into the sector. Local governments in particular provide substantial support for electricity rates at AI data centers, covering half of power costs when Chinese-made AI chips such as those from Huawei or Cambricon are used, dramatically lowering the fundamental cost of AI operations.


U.S. Big Tech firms, in particular, have snapped up Nvidia's latest GPUs and poured astronomical sums of capital into building AI data centers. Because these companies must recoup their enormous infrastructure investments, it is structurally difficult to cut API token prices of premium models below a certain level. China, on the other hand, is focused on breaking prices — meaning the two sides' business models are oriented in completely different directions.

The low-price offensive from China is hitting promising AI startups directly. Firms that had been attempting to monetize are now forced to lower their prices. Startups are being pushed out of token price competition against Big Tech and large corporations backed by massive capital. Cited limitations include: infrastructure dependence and the limits of economies of scale; aggressive price cuts by Big Tech; the devaluation of models themselves due to open source; and fixed-cost burdens caused by the explosion in token usage following the adoption of agent AI.

Some point out that fixed subscription pricing, introduced to create a "lock-in effect," is instead worsening revenue structures. Most startups adopt monthly unlimited or high-capacity subscription plans to secure stable revenue. However, when a small number of "heavy users" consume hundreds of millions of tokens per month, the revenue structure collapses. Because their brand loyalty is low, these startups cannot switch to pay-as-you-go pricing, and unlike Big Tech services they cannot easily change their billing methods.
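The collapse described above is simple arithmetic: a flat fee is fixed while serving cost scales with token usage. The numbers below are hypothetical, chosen only to illustrate the mechanism, not taken from any provider's actual pricing or costs.

```python
# Hypothetical flat-rate subscription economics (illustrative numbers only).
SUBSCRIPTION_FEE = 20.0    # USD per subscriber per month, flat rate
COST_PER_M_TOKENS = 0.50   # assumed blended serving cost per 1M tokens

def monthly_margin(tokens_used: int) -> float:
    """Revenue minus serving cost for one subscriber in a month."""
    return SUBSCRIPTION_FEE - tokens_used / 1_000_000 * COST_PER_M_TOKENS

print(monthly_margin(10_000_000))    # typical user, 10M tokens: +15.0 margin
print(monthly_margin(200_000_000))   # heavy user, 200M tokens: -80.0 loss
```

Under these assumptions a typical user is profitable, but a single heavy user burning 200M tokens loses the provider four times the subscription fee, which is why a handful of heavy users can sink the whole plan.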

The problem is quality degradation. Serving more traffic on the same budget has led to declines in accuracy, consistency, speed, and stability. Response delays and disruptions are fatal for interactive products. When tokens per minute (TPM), requests per minute (RPM), and concurrent requests were capped to meet cost targets, output restrictions and timeouts increased.
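The TPM capping mentioned above can be sketched as a fixed-window budget: requests are rejected once the minute's token allowance is spent, which surfaces to users as truncated output or timeouts. This is a hypothetical minimal limiter, not any provider's actual implementation.

```python
import time

class TokenPerMinuteLimiter:
    """Rejects requests once a tokens-per-minute (TPM) budget is spent."""

    def __init__(self, tpm_limit: int):
        self.tpm_limit = tpm_limit
        self.window_start = time.monotonic()
        self.used = 0

    def allow(self, tokens: int) -> bool:
        now = time.monotonic()
        if now - self.window_start >= 60:        # start a new one-minute window
            self.window_start, self.used = now, 0
        if self.used + tokens > self.tpm_limit:  # over budget: reject the request
            return False
        self.used += tokens
        return True

limiter = TokenPerMinuteLimiter(tpm_limit=100_000)
print(limiter.allow(80_000))   # within budget
print(limiter.allow(30_000))   # would exceed 100k TPM in this window: rejected
```

Tightening `tpm_limit` to hit a cost target directly converts user traffic into rejections, which is the trade-off the paragraph describes.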

Experts view low-price token sales as an unsustainable model. Li Qiang, Tencent's vice president, said in an interview with Chinese economic media outlet Yicai, "If tokens are compared to automobile fuel, ignoring the efficiency of the 'engine (AI model architecture)' and focusing only on fuel consumption will ultimately lead to user costs becoming so large that they will be shunned." His point was that an "efficient engine (advanced model)" produces the best results with "less fuel (tokens)," and that clinging to token sales volume alone, without optimization, is the wrong direction.

Vice President Li diagnosed the token sales business itself as a "non-sticky business." Even if customers are drawn in through low prices and discounts, they immediately leave when competitors offer lower prices. "Rather than simply selling tokens, we must focus on developing AI agent solutions that are easy and safe for users to use," he stressed. "Our goal is to leap from being a simple token infrastructure provider to an AI agent and cloud solution provider."

Indeed, Tencent is offering an "intelligent agent development platform" based on its proprietary large model "Hunyuan" as a B2B cloud service. Corporate customers can integrate their own data and immediately use it for customer service, marketing, and coding support. Combined with the ecosystem of "WeChat," the national messenger used by 1.4 billion people, Tencent is transitioning toward selling "work productivity and added value" rather than tokens.

Meanwhile, the share of Chinese models in token usage by U.S. and Korean firms continues to rise. Among the top 10 AI models analyzed by OpenRouter based on this week's traffic (May 4-8), five were Chinese models. Tencent's "Hy.3" topped the list with 3.74 trillion tokens over one week, followed by Moonshot AI's Kimi K2.6 (1.78 trillion), Anthropic's Claude Sonnet 4.6 (1.38 trillion), and Anthropic's Claude Opus 4.7 (1.05 trillion).

AI-translated from Korean. Quotes from foreign sources are based on Korean-language reports and may not reflect exact original wording.