Local LLM agents

Kkk2237pl@lemmy.world · 2 days ago


DaTingGoBrrr@lemmy.ml · 2 days ago

I am running Qwen 3.5 locally using llama.cpp on 8 GB of VRAM and 16 GB of RAM. It works well enough with a 4B to 9B parameter model, combined with quantization and MTP (multi-token prediction). More optimizations are on the way with TurboQuant and possibly other tech.
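
For anyone wondering why quantization matters on 8 GB of VRAM, here is a rough back-of-the-envelope sketch. It only counts the model weights. The 4.5 bits per weight figure and the 1 GB of headroom for the KV cache and runtime buffers are assumptions for illustration, not measured values, and real usage depends on the quantization format and context length.

```python
# Rough estimate of weight memory for a model at a given precision.
# Assumptions (illustrative, not measured): ~4.5 bits per weight for a
# 4-bit style quantization, and 1 GB reserved for KV cache and buffers.

VRAM_GB = 8.0       # from the post above
HEADROOM_GB = 1.0   # hypothetical reserve for KV cache and runtime overhead


def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def fits_in_vram(n_params: float, bits_per_weight: float) -> bool:
    """True if the weights fit in VRAM after reserving the headroom."""
    return weight_memory_gb(n_params, bits_per_weight) <= VRAM_GB - HEADROOM_GB


for n_params in (4e9, 9e9):
    for bits in (16, 4.5):
        size = weight_memory_gb(n_params, bits)
        verdict = "fits" if fits_in_vram(n_params, bits) else "does not fit"
        print(f"{n_params / 1e9:.0f}B @ {bits} bits: {size:.2f} GB ({verdict})")
```

Under these assumptions, a 9B model at 16 bits is far too big for the GPU, while the same model at roughly 4 bits leaves room to spare. Layers that do not fit can also be kept in system RAM, at the cost of speed.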

It’s just there to assist me, not do all the work, so I am happy as long as I can self-host it.

I can’t say how well my specs would work in a professional setting, but for personal use a MacBook should be sufficient, in my opinion.