• ikt@aussie.zone · 18 hours ago

    But regardless, the main point of the gap is resources

    What makes you think we won’t have the resources in the future?

    Any model that can run on 16GB or less is not going to be anywhere close, in real-world tasks, to any cloud-based model. It just cannot be.

    Well, you can compare Gemma 4 running in LM Studio on an average gaming PC to GPT-3.5 and you tell me. Or is your benchmark purely based on open-source models today vs cloud models today, right at this very moment?

    For reference, Gemma 4 is 26 billion parameters; GPT-3 was thought to be over 175 billion, and of course it had no optimisations like MoE. As a dense model it activated every parameter on every single question, so it was rather slow as well.
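The memory arithmetic behind those parameter counts can be sketched roughly. This is a back-of-envelope estimate only; the bytes-per-parameter figures are standard precision assumptions (fp16 and 4-bit quantisation), not numbers from this thread, and it ignores KV cache and activation overhead:

```python
def weight_vram_gb(params_billions: float, bytes_per_param: float) -> float:
    """Rough GB of memory needed just to hold the model weights
    (ignores KV cache, activations, and runtime overhead)."""
    return params_billions * 1e9 * bytes_per_param / 1024**3

# A 26B model at 4-bit quantisation (0.5 bytes/param):
# squeezes into a 16GB consumer GPU.
print(round(weight_vram_gb(26, 0.5), 1))   # ~12.1 GB

# A GPT-3-class 175B model at fp16 (2 bytes/param):
# far beyond any consumer card.
print(round(weight_vram_gb(175, 2), 1))    # ~326 GB
```

The gap between those two numbers is the whole argument: quantisation plus smaller models is what makes local inference possible at all.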

    We know as well that there is no slowdown in the push for optimisations; DeepSeek's initial release was the first big demonstration that you don't have to just scale up with hardware alone.

    https://research.google/blog/turboquant-redefining-ai-efficiency-with-extreme-compression/

    They're also pushing with Chinese native chips from Huawei, trying to diversify away from Nvidia holding the crown.

    The problem I've got is that you all have a god of the gaps. The conversation I was having 3 years ago was different to 2 years ago, which was different to 1 year ago. I was told AI could never do songs well enough, then suddenly people were worried they couldn't tell the difference. Then they said it could never do movies; now apparently not only is it good enough, it's hilarious:

    https://www.youtube.com/watch?v=fgHn7PI55J4

    The open-source LLMs we have today are incredible, and in the last few months we've had releases from Qwen, GLM, Nemotron/Nvidia, Mistral, Google and heaps of others. It feels like you're just looking for a reason to be dour and pessimistic, but that's just me.

    Anyway, I'm off to sleep, have a good one :)

    • mabeledo@lemmy.world · 16 hours ago

      The problem I've got is that you all have a god of the gaps. The conversation I was having 3 years ago was different to 2 years ago, which was different to 1 year ago.

      And I guess the problem I have with you is that you seem to think you can get results on 16GB competitive with models that run on a Blackwell 6000 with 96GB, while ignoring the fact that the vast majority of people in the world are running GPUs with 4 to 8 GB of VRAM, if they even have access to GPUs at all.

      That's the gap. Most people don't have the kind of money you think they do, and even those who do will never achieve the same results as with cloud models, because if a state-of-the-art optimisation makes models 10 times smaller, cloud models will use that same advantage to become 10 times bigger. It's pretty simple.