I’m sorry if I was unclear the first two times I asked, but when I said:
care to link me to all these great models from academics and open-source institutions?
I was interested in the models you’re currently using, not the ones you’re speculating about. Hopefully it goes without saying that “open” weights are still precompiled closed-source blobs, that "Open"AI is anything but open, and so on.
I’m aware new models are trained at the speed of light and hardware is going obsolete faster than it can be put on racks, which is already a problem. So I would love to believe your theory about inexpensive AI GPUs, but those very same companies are already going into debt without selling their current stock.
Edit: we have reason not to assume GPUs will suddenly become cheap.
I literally said I’m using qwen3.5:122b for coding. I also use GLM-5, but it’s slightly slower, so I generally stick with qwen.
It’s right there in Ollama’s library: https://ollama.com/library/qwen3.5:122b
The weights and everything else for it are on Huggingface: https://huggingface.co/Qwen/Qwen3.5-122B-A10B
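If you want to try it yourself, a minimal Ollama session looks something like this (the model tag is taken from the library link above; everything else is just standard `ollama` CLI usage):

```shell
# Pull the weights locally (a very large download at 122B parameters),
# or skip this and let Ollama Cloud serve the model for you.
ollama pull qwen3.5:122b

# One-off prompt from the CLI; drop the prompt argument for an interactive session.
ollama run qwen3.5:122b "Write a Python function that reverses a linked list."
```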
This is not speculation; that’s what I’m actually using nearly every day. It’s not as good as Claude Code with Opus 4.6, but it’s about 90% of the way there (if you use it right). When GLM-5 came out, I cancelled my Claude subscription and just stuck with Ollama Cloud.
I can run gpt-oss:20b on my GPU (a 4060 Ti with 16 GB of VRAM), and it works well, but for $20/month, qwen3.5 and GLM-5 are better options.
I still use my GPU for (serious) image generation, though. ChatGPT (DALL-E) and Gemini (Nano Banana) are OK for one-offs, but they’re slow AF compared to FLUX 2 and qwen’s image models running locally. I can give a prompt, generate 32 images in no time, pick the best one, then iterate from there (using some sophisticated ComfyUI setups). The end result is a better image than anything you’d get from Big AI.
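The pick-the-best workflow above is just best-of-N sampling over seeds. Here’s a minimal sketch of the idea; `generate_image` is a stand-in for a real ComfyUI/FLUX render call (not an actual API), and the “score” would really be your own eyeballs, not a number:

```python
import random

def generate_image(prompt: str, seed: int) -> dict:
    # Stand-in for a local FLUX/ComfyUI render: in practice this would queue
    # a ComfyUI workflow with a fixed prompt and a varying seed. Here we just
    # produce a deterministic fake "quality score" per seed.
    rng = random.Random(seed)
    return {"prompt": prompt, "seed": seed, "score": rng.random()}

def best_of(prompt: str, n: int = 32) -> dict:
    # Render n candidates with different seeds and keep the best one;
    # its seed then becomes the starting point for the next iteration.
    candidates = [generate_image(prompt, seed) for seed in range(n)]
    return max(candidates, key=lambda c: c["score"])

best = best_of("a lighthouse at dusk, volumetric fog", n=32)
print(best["seed"], round(best["score"], 3))
```

Because the batch runs locally, the cost of the 31 discarded images is just GPU time, which is what makes this iterate-from-the-best loop practical compared to one-shot hosted generators.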