Do you host your own ML / AI / LLM? What do you use, and what do you use it for?

  • SuspiciousCarrot78@aussie.zoneOP
    8 hours ago

    Ha. You were doing inference on a Haswell-era CPU. Been there, done that.

    OTOH…whisper.cpp is heavily optimised for it.

    Plus, you’re doing batch transcription, not real-time, so slow doesn’t actually matter.

    Fire Whisper small or medium overnight and wake up to searchable text.
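
For what it's worth, a rough sketch of that overnight batch run, assuming whisper.cpp is already built and the audio has been converted to 16 kHz WAV. The binary name (`whisper-cli`, called `main` in older builds), the model path, and the `audio/` and `transcripts/` folders are all placeholders, not anything from your setup, so adjust to taste.

```python
import subprocess
from pathlib import Path

# Assumptions (placeholders, not from the thread): binary name, model path, folders.
WHISPER_BIN = "whisper-cli"            # whisper.cpp CLI; older builds name it "main"
MODEL = Path("models/ggml-small.bin")  # hypothetical path to the small model
AUDIO_DIR = Path("audio")              # hypothetical folder of 16 kHz WAV files
OUT_DIR = Path("transcripts")          # hypothetical folder for the .txt output


def build_command(wav: Path, model: Path, out_dir: Path, threads: int = 4) -> list[str]:
    """Return the whisper.cpp command line that writes <out_dir>/<stem>.txt."""
    return [
        WHISPER_BIN,
        "-m", str(model),                # model file
        "-f", str(wav),                  # input audio
        "-t", str(threads),              # CPU threads
        "-otxt",                         # write a plain-text transcript
        "-of", str(out_dir / wav.stem),  # output path without extension
    ]


def pending(audio_dir: Path, out_dir: Path) -> list[Path]:
    """WAV files with no transcript yet, so a rerun skips finished work."""
    return sorted(
        wav for wav in audio_dir.glob("*.wav")
        if not (out_dir / f"{wav.stem}.txt").exists()
    )


def main() -> None:
    OUT_DIR.mkdir(parents=True, exist_ok=True)
    for wav in pending(AUDIO_DIR, OUT_DIR):
        # check=True stops the batch on the first failed file
        subprocess.run(build_command(wav, MODEL, OUT_DIR), check=True)


if __name__ == "__main__":
    main()
```

Kick it off before bed (cron or just `nohup`), and in the morning a plain `grep -ri` over the transcripts folder does the searching. Skipping files that already have a `.txt` means an interrupted run can simply be restarted.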

    PS: if you want a good, fast little LLM, something like Qwen 3.6 2B should work well on the Xeon.