• sga@piefed.social · 2 days ago

Further clarification: Ollama is a distribution of llama.cpp (and it is a bit commercial in some sense). Basically, in ye olde days of 2023-24 (decades ago in LLM space, as they say), llama.cpp was a server/CLI-only thing. It would produce output in the terminal (that is how I used to use it back then) or via an API (an OpenAI-compatible one, so if you have used OpenAI stuff before, you can easily swap over - see the small sketch below). Many people wanted a GUI (a web-based chat interface), so Ollama back then was a wrapper around llama.cpp (there were several others, but Ollama was relatively mainstream).

Then, as time progressed, Ollama "allegedly enshittified", while llama.cpp kept gaining features: a web UI, the ability to swap models at runtime (which used to require a separate llama-swap), and so on. The llama.cpp stack is also a bit "lighter" (not really - both are web tech, so only as light as JS can get) and first-party(ish - the interface was built by the community, but it lives in the same git repo). So more and more local-LLM folk switched to a llama.cpp-only setup (you could use llama.cpp with Ollama, but at that point Ollama was just a web UI, and not a great one; some people preferred ComfyUI, etc.). Some old-timers (like me) never even tried Ollama, as plain llama.cpp was sufficient for us.
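As a concrete example of that "easily swap over" point, here is a rough sketch of talking to a local llama.cpp server through its OpenAI-compatible API with the stock openai Python client. The port, API key, and model name are placeholders (llama-server listens on 8080 by default), so adjust them to however you launched the server:

```python
# minimal sketch: chat with a local llama.cpp server via its OpenAI-compatible API.
# Assumes `llama-server` is already running locally; adjust base_url to your setup.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # local llama.cpp endpoint instead of api.openai.com
    api_key="sk-no-key-needed",           # local servers generally accept any non-empty key
)

reply = client.chat.completions.create(
    model="local-model",  # placeholder; a single-model llama-server mostly ignores this
    messages=[{"role": "user", "content": "Say hi from my own hardware."}],
)
print(reply.choices[0].message.content)
```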

As the above commenter said, you can do very fancy things with llama.cpp. The best thing about it is that it works with both CPU and GPU - you can use both simultaneously, as opposed to vLLM or Transformers, where you almost always need a GPU. This simultaneous use is called offloading: some of the model's layers are kept in system memory instead of VRAM, which is how the VRAM-poor population got by on plain RAM (this also kinda led to RAM inflation, but do not blame llama.cpp for that, blame people). You can do some of this with Ollama too (since Ollama is a wrapper around llama.cpp), but that requires the Ollama folks to keep their fork up to date with the parent project, as well as to expose said features in their UI. A rough sketch of what offloading looks like in practice is below.
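For illustration, here is a hedged sketch using the llama-cpp-python bindings (the CLI exposes the same idea through its GPU-layers option). The model path and layer count are made up - pick n_gpu_layers so that roughly that many layers fit in your VRAM, and the rest will run from system RAM:

```python
# rough sketch of CPU/GPU offloading with the llama-cpp-python bindings.
# Path and numbers are placeholders, not a recommendation.
from llama_cpp import Llama

llm = Llama(
    model_path="models/example-7b-Q4_K_M.gguf",  # hypothetical GGUF file
    n_gpu_layers=20,  # roughly the first 20 layers go to the GPU; the rest stay on CPU/RAM
    n_ctx=4096,       # context window size
)

out = llm("Explain offloading in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```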