• Xylight@feddit.online
    17 hours ago

    Qwen3 30B A3B, for example, is brilliant for its size, and I can run it on my 8 GB VRAM + 32 GB RAM system at around 20 tokens per second. For lower-powered systems, Qwen3 4B plus a search tool is also insanely great for its size and fits in less than 3 GB of RAM or VRAM at Q5 quantization.
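
    For a rough sanity check on those memory numbers, here is a back-of-the-envelope sketch. The bits-per-weight values are assumptions (roughly 5.5 for a Q5-style quant and 4.5 for a Q4-style quant, not exact figures for any specific file), the parameter counts are rounded, and the helper name `approx_weight_gb` is made up for illustration. It only counts the weights and ignores the KV cache and runtime overhead.

```python
def approx_weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough size of quantized weights in GB (10^9 bytes).

    Ignores KV cache, activations, and runtime overhead.
    """
    return params_billion * bits_per_weight / 8


# Assumed values, not taken from the post:
# ~5.5 bits/weight for a Q5-style quant, ~4.5 bits/weight for a Q4-style quant.
small = approx_weight_gb(4.0, 5.5)    # Qwen3 4B at roughly Q5
large = approx_weight_gb(30.0, 4.5)   # Qwen3 30B A3B at roughly Q4

print(f"4B  @ ~5.5 bpw: {small:.2f} GB")
print(f"30B @ ~4.5 bpw: {large:.2f} GB")
```

    Under these assumptions the 4B model's weights come in just under 3 GB, while the 30B model's weights are well over 8 GB, so part of it has to sit in system RAM. Since it is a mixture-of-experts model with only about 3B parameters active per token (the "A3B" in the name), that split can still run at a usable speed.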