• NotMyOldRedditName@lemmy.world
    link
    fedilink
    English
    arrow-up
    3
    ·
    16 hours ago

    Models can split loads across a discrete GPU and CPU/RAM.

    Its not as fast as if you can load it all in the GPU, but it gives you more options. Its been quite common for a long time.