'At some point you've got to make money': Goldman's top AI skeptic warns the clock is running out ahead of OpenAI and Anthropic IPOs

sanitation@lemmy.today · 2 months ago

melfie@lemmy.zip · edit-2 2 months ago

Local is potentially even cheaper than that. This guy talks about how to get 17 t/s on the Qwen 3.6 35B MoE model with a GTX 1060 that has 6GB of VRAM: https://m.youtube.com/watch?v=8F_5pdcD3HY. He’s using a fork of llama.cpp with TurboQuant, and his newest video, made after this one, uses an even more optimized 28B version of the model. I have CMake building the llama.cpp fork in a Dockerfile at the moment, and we’ll see how it performs on my $800 laptop with an RTX 4060.

I’m also impressed by how good OpenCode is compared to Claude Code. Qwen 3.6 is not quite as good as Claude, and the MoE version that doesn’t need 24GB+ of VRAM isn’t quite as good as the dense version, but it also doesn’t cost $200 a month with usage limits and a company training its models on your data. If it’s anywhere near “good enough”, I can see this being a daily driver.

'At some point you've got to make money': Goldman's top AI skeptic warns the clock is running out ahead of OpenAI and Anthropic IPOs | Fortune