• lichtmetzger@discuss.tchncs.de
    16 hours ago

    Doesn’t have to be a Mac. My GPD Win Max 2 has 64GB as well for a much lower price, and it can somehow use 55GB of that on the integrated GPU (AMD Radeon 780M) for running models with Ollama. I can even combine that with an external GPU on the OCuLink port to increase the total memory.

    It takes anywhere from 30 seconds to 5 minutes to get a reply, but it does work, and it’s mainly useful for going over my project and asking how to improve the codebase.
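
    For anyone who wants to try the same workflow, here is a rough sketch of how a single file could be sent to a local Ollama server for review. It assumes Ollama is listening on its default port (11434) and uses its `/api/generate` endpoint; the model name `llama3`, the prompt wording, and the helper names are placeholders I made up, not the exact setup described above. The long timeout is there because replies can take minutes on an integrated GPU.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (assumption: default port).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_review_request(source_code: str, model: str = "llama3") -> dict:
    """Build the JSON payload for a one-shot code review prompt.

    The model name and prompt text are illustrative placeholders.
    """
    prompt = "Suggest small, concrete improvements to this code:\n\n" + source_code
    # stream=False asks the server for one complete JSON reply instead of chunks.
    return {"model": model, "prompt": prompt, "stream": False}


def ask_local_model(payload: dict, timeout_s: float = 600.0) -> str:
    """POST the payload to Ollama and return the generated text.

    The generous timeout allows for multi-minute replies on slow hardware.
    """
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read())["response"]
```

    Calling `ask_local_model(build_review_request(open("main.py").read()))` would then return the model's suggestions as a string, with `main.py` standing in for whatever file you want reviewed.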

    Quality-wise it’s good enough for boilerplate code and small improvements. I wouldn’t trust it to work on big features in larger projects, but I don’t trust LLMs in general for that. I don’t see a big difference compared to ChatGPT and Gemini (which is a win for local hosting and for putting the freedom of computing back into our own hands). The usual caveats still apply: all models have their problems, and people tend to overhype the capabilities of LLMs in general.