Do you host your own ML / AI / LLM? What do you use, and what do you use it for?

  • hexagonwin@lemmy.today
    7 hours ago

    a haswell xeon e5-1650 machine, i remember running llama 7b in llama.cpp in like 2023 and it was quite sluggish. guess i should try whisper at some point…

    • SuspiciousCarrot78@aussie.zone (OP)
      7 hours ago

      Ha. You were doing inference on CPU on a Haswell-era chip. Been there, done that.

      OTOH, whisper.cpp is heavily optimised for CPU inference.

      Plus, you’re doing batch transcription, not real-time, so slow doesn’t actually matter.

      Fire off Whisper small or medium overnight and wake up to searchable text.
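
      A rough sketch of what that overnight run could look like, assuming a recent whisper.cpp build where the CLI binary is called `whisper-cli` and a downloaded `ggml-small.bin` model. The paths, the thread count, and the helper names `build_command` and `pending` are placeholders of mine, not anything from whisper.cpp itself. Older builds expect 16 kHz WAV input, so you may need to convert with ffmpeg first.

```python
import subprocess
import sys
from pathlib import Path

# Assumed locations; adjust to wherever your build and model actually live.
WHISPER_BIN = "./build/bin/whisper-cli"
MODEL_PATH = "models/ggml-small.bin"


def build_command(wav: Path, threads: int = 6) -> list[str]:
    """Build the whisper.cpp call that writes a .txt transcript next to the input."""
    out_base = wav.with_suffix("")  # whisper.cpp appends ".txt" to this base name
    return [
        WHISPER_BIN,
        "-m", MODEL_PATH,      # model file
        "-f", str(wav),        # input audio
        "-t", str(threads),    # 6 is a guess: one thread per physical core
        "-otxt",               # plain-text output
        "-of", str(out_base),  # output path without extension
    ]


def pending(folder: Path) -> list[Path]:
    """WAV files in folder that have no transcript yet, sorted by name."""
    return sorted(
        w for w in folder.glob("*.wav") if not w.with_suffix(".txt").exists()
    )


def main(folder: str) -> None:
    # Files are processed one at a time; a failed run stops the batch.
    for wav in pending(Path(folder)):
        print(f"transcribing {wav.name}", flush=True)
        subprocess.run(build_command(wav), check=True)


if __name__ == "__main__":
    main(sys.argv[1] if len(sys.argv) > 1 else ".")
```

      Point it at a folder before bed. Because it skips anything that already has a `.txt` next to it, you can rerun it after an interruption, and the transcripts are plain text you can grep.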

      PS: if you want a good fast little llm, something like Qwen 3.6 2B will work well on the Xeon.