Let me preface by saying I despise corpo llm use and slop creation. I hate it.
However, it does seem like it could be an interesting, helpful tool if run locally in the CLI. I’ve seen quite a few people doing this. Again, it personally makes me feel like a lazy asshole when I use it, but it’s not much different from web searching commands every minute (other than that the data used to train it is obtained by pure theft).
Have any of you tried this out?


I’m running gpt-oss-120b and glm-4.5-air locally in llama.cpp.
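If anyone wants to try it, this is roughly how I start the server (the model path is just an example; tune -c and -ngl to your hardware):

```
# example GGUF path; any quant of the model works the same way
# -c sets the context size, -ngl offloads layers to the GPU
llama-server -m ./models/gpt-oss-120b-mxfp4.gguf -c 16384 -ngl 99 --port 8080
```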
It’s pretty useful for shell commands and has replaced a lot of web searching for me.
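For example, instead of web searching “how to extract a tar.zst”, I ask the OpenAI-compatible endpoint that llama-server exposes (assuming the server from above is running on port 8080):

```
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"One-line tar command to extract archive.tar.zst into /tmp"}]}' \
  | jq -r '.choices[0].message.content'
```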
The smaller models (4b, 8b, 20b) are not all that useful without giving them data to search through (e.g. via RAG), and even then they have a poor “understanding” of more complicated prompts.
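You can squeeze some use out of them by pasting the relevant data straight into the prompt instead of setting up full RAG. A minimal sketch (the model and file paths are made up):

```
# poor man's RAG: inline the file via command substitution
llama-cli -m ./models/qwen3-8b-q4_k_m.gguf \
  -p "Here is my config:
$(cat /etc/myservice.conf)

Which port does the service listen on?"
```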
The 100b+ models are much more interesting since they have a lot more knowledge in them. They are still not useful for very complicated tasks, but they can get you started quite quickly with regular shell commands and scripts.
The catch: you need about 128GB of VRAM/RAM to run these. The easiest way to get that locally is either a Strix Halo mini PC with 128GB of unified memory or a server/PC with 128GB of RAM.
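Rough math, assuming ~4-bit quants: a ~110B-parameter model at roughly 0.5 bytes per weight is already ~55-65GB for the weights alone, and you still need room for the KV cache plus the OS and everything else, so 128GB is about the comfortable minimum.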