Do you host your own ML / AI / LLM? What do you use, and what do you use it for?

  • hexagonwin@lemmy.today
    9 hours ago

    i don’t use it at all, i do want some selfhosted speech to text model (whisper?) but my computer is ancient so it would be awfully slow. i have some multi hour audio recordings from presentations, would be nice to have them in text and searchable…

    • SuspiciousCarrot78@aussie.zone (OP)
      7 hours ago

      How ancient is ancient? TTS and STT models (e.g. Whisper, Piper, Kokoro, Coqui) are much lighter than LLMs… you might have more capability than you think, especially if you’re doing batch processing like that.

      • hexagonwin@lemmy.today
        7 hours ago

        a haswell xeon e5-1650 machine, i remember running llama 7b in llama.cpp in like 2023 and it was quite sluggish. guess i should try whisper at some point…

          • SuspiciousCarrot78@aussie.zone (OP)
          7 hours ago

          Ha. You were doing LLM inference on a Haswell-era CPU. Been there, done that.

          OTOH… whisper.cpp is heavily optimised for CPU inference.

          Plus, you’re doing batch transcription, not real-time, so slow doesn’t actually matter.

          Fire off Whisper small or medium overnight and wake up to searchable text.
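
For the overnight run, something like this rough Python sketch should do. It assumes you've built whisper.cpp, downloaded a ggml model, and have ffmpeg on your PATH. The binary path, model path and thread count are placeholders I made up for illustration, so adjust them to your setup. It converts each recording to 16 kHz mono WAV, runs the whisper.cpp CLI on it, and skips anything that already has a transcript, so you can re-run it after an interruption.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Placeholder paths (assumptions): point these at your own build and model.
WHISPER_BIN = "./build/bin/whisper-cli"
MODEL = "models/ggml-small.bin"
AUDIO_EXTS = {".wav", ".mp3", ".m4a", ".flac", ".ogg"}


def ffmpeg_cmd(src: Path, wav: Path) -> list[str]:
    """Build the ffmpeg call that converts src to 16 kHz mono 16-bit WAV."""
    return ["ffmpeg", "-y", "-i", str(src),
            "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", str(wav)]


def whisper_cmd(wav: Path, out_base: Path, threads: int = 4) -> list[str]:
    """Build the whisper.cpp call; -of takes the output path without extension."""
    return [WHISPER_BIN, "-m", MODEL, "-f", str(wav),
            "-t", str(threads), "-otxt", "-of", str(out_base)]


def pending(folder: Path) -> list[Path]:
    """Audio files in folder that have no .txt transcript next to them yet."""
    return sorted(
        p for p in folder.iterdir()
        if p.suffix.lower() in AUDIO_EXTS and not p.with_suffix(".txt").exists()
    )


def main(folder: Path) -> None:
    # Converted WAVs live in a temp dir so they never clutter the recordings.
    with tempfile.TemporaryDirectory() as tmp:
        for src in pending(folder):
            wav = Path(tmp) / (src.stem + ".wav")
            subprocess.run(ffmpeg_cmd(src, wav), check=True)
            # Writes e.g. talk.txt next to talk.mp3.
            subprocess.run(whisper_cmd(wav, src.with_suffix("")), check=True)
            wav.unlink()


if __name__ == "__main__" and len(sys.argv) > 1:
    main(Path(sys.argv[1]))
```

Run it with the recordings folder as the only argument before you go to bed. Once the .txt files are there, plain `grep -ri "keyword" *.txt` already gets you search.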

          PS: if you want a good, fast little LLM, something like Qwen 3.6 2B will work well on the Xeon.