Sure, but it will always be a function of hardware and energy cost.
Today’s models will run cheaply in the distant future, but the distant future’s models can only be dreamed of today. Hopefully we eventually reach a point where quality is “good enough” on cheap hardware and low energy, but I can’t tell you when that will be. I’d bet at least another decade, unless you’re ok with what you get out of today’s models on dedicated consumer GPUs.
I think the current stuff that runs on CPUs and regular RAM is worse than useless, though.
I currently run Qwen 35B 3.6 A3B on a 5070 with 12 GB of VRAM, and I find it surprisingly useful. I use it for questions that may contain sensitive information or that I don’t want to feed to the data harvesters.
“unless you’re ok with what you get out of today’s models on dedicated consumer GPUs”
This is all I use, mostly for quickly putting together personal software and doing Linux stuff. It doesn’t feel limiting and is already really powerful. A lot of what those models struggle with can be overcome by giving better context and more specific instructions, and that can be automated, so they should become more useful as harness software advances, independently of advancements in the models themselves. Maybe my perspective is limited because I haven’t tried the frontier models, but developing a dependence on services run by malevolent companies that obviously intend to use that dependence as leverage is deeply unappealing, and I’m not sure what they could offer to make that seem worth it on top of what I can already do with my own computer.