The Chinese AI labs are really trying to pop the bubble, too.
How?
Well lemme ask you this. What if models 80-90% as good as Claude, with weights just thrown out there for any provider (or homelab) to host, flood the market? What if they’re so dirt cheap to run, they’re almost free, and don’t even need Nvidia GPUs? What if they need fewer resources to run with each update, instead of more?
…What if this already happened, and Big Tech is madly lobbying to ban/censor them before people realize it, and realize that the “infinite scaling” thing is a big fat lie?
That’s the state of things.
Yep, the Chinese models are already up to 10 times cheaper, and now that Anthropic, OpenAI, and Google are all raising prices by up to 10 times for models like Opus, Chinese models will end up anywhere from 50 to 100 times cheaper.
American corps. are betting that since people have their workflow already established they won’t switch to other providers, but that’s not the case. There’s already a mass move to Chinese models.
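For anyone who wants to check the price-gap arithmetic above, here is a minimal sketch. The inputs are assumptions read off the comment's "up to" figures (a current gap of 5x to 10x and a 10x price increase on one side only), not measured prices, and `combined_price_ratio` is a hypothetical helper name.

```python
def combined_price_ratio(current_ratio: float, price_increase: float) -> float:
    """Return how many times cheaper the cheap model ends up.

    current_ratio:  how many times cheaper it is today (assumed figure)
    price_increase: factor by which only the expensive model's price rises
    """
    # If only the expensive side raises prices, the gaps multiply.
    return current_ratio * price_increase


# Assumed inputs: 5x to 10x cheaper today, 10x price increase on the other side.
low = combined_price_ratio(5, 10)
high = combined_price_ratio(10, 10)
print(f"{low:.0f}x to {high:.0f}x cheaper")
```

The point is just that the two factors multiply, so a 50x to 100x gap only follows if the price increase really is around 10x.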
People keep talking about Chinese models, where are they? How do I use them instead of Claude? Are they safe?
Agreed. I am no longer paying token fees as I am running QWEN 3.6 27B MTP on my 4090 GPU and it is as good and as fast as the frontier models for agentic coding.
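A rough way to sanity-check whether a 27B model fits on a 24 GB card like the 4090 is to estimate the weight size at different precisions. This is a sketch, not a benchmark: the 4 GB allowance for KV cache and runtime overhead is an assumption, and the comment does not say which quantization is actually in use.

```python
def weights_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate weight size in GB: params * bits / 8 bits per byte."""
    return params_billions * bits_per_param / 8


def fits_in_vram(params_billions: float, bits_per_param: int,
                 vram_gb: float = 24.0, overhead_gb: float = 4.0) -> bool:
    """overhead_gb is an assumed allowance for KV cache and runtime buffers."""
    return weights_gb(params_billions, bits_per_param) + overhead_gb <= vram_gb


for bits in (16, 8, 4):
    size = weights_gb(27, bits)
    print(f"{bits}-bit: {size:.1f} GB weights, fits in 24 GB: {fits_in_vram(27, bits)}")
```

Under these assumptions only the 4-bit case fits, so a setup like this almost certainly relies on a quantized build of the model.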
In a way it has actually.
Deepseek was big because not only did they publish the full model for everyone to use, but the MoE structure significantly brought down the hardware requirements in terms of processing power. As long as you have enough VRAM, you can run it on older hardware with no need for the latest Nvidia stuff.
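To make the MoE point concrete: memory scales with the total parameter count, while per-token compute scales with only the parameters that are active for that token. The numbers below are hypothetical round figures for illustration, not Deepseek's published specs, and the "2 FLOPs per parameter per token" rule is a common forward-pass approximation.

```python
from dataclasses import dataclass


@dataclass
class ModelSpec:
    total_params_b: float   # all parameters, in billions (must all sit in memory)
    active_params_b: float  # parameters used per token, in billions
    bytes_per_param: float  # e.g. 1.0 for FP8, 2.0 for BF16


def weight_memory_gb(spec: ModelSpec) -> float:
    # Every expert has to be resident, so memory follows the total count.
    return spec.total_params_b * spec.bytes_per_param


def forward_flops_per_token(spec: ModelSpec) -> float:
    # Rough rule of thumb: about 2 FLOPs per active parameter per token.
    return 2 * spec.active_params_b * 1e9


# Hypothetical MoE model versus a dense model of the same total size.
moe = ModelSpec(total_params_b=600, active_params_b=40, bytes_per_param=1.0)
dense = ModelSpec(total_params_b=600, active_params_b=600, bytes_per_param=1.0)

compute_ratio = forward_flops_per_token(dense) / forward_flops_per_token(moe)
print(f"memory: {weight_memory_gb(moe):.0f} GB for both")
print(f"dense needs about {compute_ratio:.0f}x the compute per token")
```

That is why "enough VRAM on older hardware" can work: the memory bill stays the same, but the compute bill per token drops by roughly the total-to-active ratio.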
Now they got v4 which many have found to be within a 10% margin of Claude and ChatGPT.
On top of that, China has cheapo VRAM GPUs available or soon to be released, like the MTT S80. Yeah, it sucks as a graphics card because the chip is behind, but you get 16GB of GDDR6 for much cheaper than anything else.
But it’s not a conspiracy to fight China. The infinite scaling was just Nvidia solidifying themselves as the monopoly because they want all AI infrastructure to be dependent on them, which is why they still illegally export to China, despite an export ban attempting to reduce their potential competition.
Moore Threads (MTT) already has their own CUDA-like system called MUSA, and I’m sure they’ll be happy to put in proper hardware support for new stuff like BF16 and FP8/FP4. It’ll take a few years, but eventually China will catch up to the point where Nvidia gets shanked by cheaper hardware.
Wasn’t there development of a Linux translation layer for CUDA workloads to run on AMD GPUs? I haven’t heard about it in a while, but I’d imagine that’d help the situation.
MTT is just a pipe dream, last I checked. But Deepseek is actively being served, in mixed FP8/FP4, on racks of Huawei accelerators.
I believe Baidu trained a model on them, too. But most training (like Deepseek’s) is still done on CUDA.
…Also, be careful equating this stuff with any kind of “consumer friendly” hardware you or I could buy. That’s less likely. The Huawei accelerators (and other local Chinese hardware experiments) are geared towards huge servers serving requests in parallel.
Hyperscaling was always about cornering the computer market. It was never about providing us some vastly new and superior service.
They should be strung up. And middle management needs to return to fucking school.
It’s like Kyle Kulinski said “I’m starting to understand re-education camps now”
Exactly, it’s a method of taking tens of billions of dollars in capital and buying a near monopoly. No other providers can compete if the hyperscalers buy all of the hardware, driving up the prices while also selling the service at a loss.
Nobody working out of their garage with a cool idea for a better service can compete if they can’t get hardware and have to charge double what the hyperscalers are charging because they can’t burn capital for years.
It’s a practice that should be considered illegal market manipulation, because that’s what it is.
e: extraneous ‘completely’
‘Dumping’ is considered anti-competitive behaviour in a lot of places. This sounds a lot like that.
The state of things is a “what if,” that’s true. It has not happened. :)
At some point, it should happen. Still not going to put a dent in the datacenter / dystopia rally though, since they will pick Nvidia and known brands.