Do you host your own AI?

SuspiciousCarrot78@aussie.zone · 23 hours ago

SuspiciousCarrot78@aussie.zone · edit-2 8 hours ago

Ha. You were doing inference on a Haswell-era CPU. Been there, done that.

OTOH…whisper.cpp is heavily optimised for it.

Plus, you’re doing batch transcription, not real-time, so slow doesn’t actually matter.

Fire off Whisper small or medium overnight and wake up to searchable text.
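Rough sketch of what the overnight run could look like, if it helps. Assumptions on my part, not anything from your setup: the whisper.cpp CLI is on your PATH as `whisper-cli` (older builds call it `main`), your audio is already WAV (whisper.cpp has traditionally wanted 16 kHz WAV, so convert with ffmpeg first if not), and the function names and paths are just placeholders I made up. It skips any file that already has a `.txt` next to it, so you can kill it and restart without redoing work.

```python
import subprocess
from pathlib import Path

# Assumption: whisper.cpp's CLI binary is on PATH under this name.
# Older builds ship it as "main"; adjust to match your install.
WHISPER_BIN = "whisper-cli"


def pending_files(audio_dir: Path) -> list[Path]:
    """Return WAV files that do not yet have a .txt transcript beside them."""
    return sorted(
        wav for wav in audio_dir.glob("*.wav")
        if not wav.with_suffix(".txt").exists()
    )


def build_command(wav: Path, model: Path, threads: int = 4) -> list[str]:
    """Build the whisper.cpp command line for a single file."""
    return [
        WHISPER_BIN,
        "-m", str(model),                 # ggml model file, e.g. small or medium
        "-f", str(wav),                   # input audio
        "-t", str(threads),               # CPU threads to use
        "-otxt",                          # write a plain-text transcript
        "-of", str(wav.with_suffix("")),  # output base path; ".txt" gets appended
    ]


def transcribe_all(audio_dir: Path, model: Path, threads: int = 4) -> None:
    """Transcribe every pending file, one after another."""
    for wav in pending_files(audio_dir):
        result = subprocess.run(build_command(wav, model, threads))
        if result.returncode != 0:
            # Report the failure and keep going so one bad file
            # does not sink the whole night's run.
            print(f"failed: {wav}")
```

Kick it off with something like `transcribe_all(Path("recordings"), Path("models/ggml-small.bin"))` before bed (both paths hypothetical). Once the `.txt` files are there, plain `grep -ri` over the folder is all the "search" you need.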

PS: if you want a good, fast little LLM, something like Qwen 3.6 2B will work well on the Xeon.