• Tja@programming.dev
        2 hours ago

        How are they running it? Doesn’t the model have to fit in (V)RAM? Do Nvidia’s H cards really have that much memory?

        • boonhet@sopuli.xyz
          23 minutes ago

          For self-hosting, it essentially needs to fit in VRAM + RAM combined, but whatever part sits in RAM will take a lot of CPU to run.

          DeepSeek probably uses those big, fancy H cards, and not just one but several together to increase the total VRAM.
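
To put rough numbers on the "VRAM + RAM" point, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration: the 70B parameter count, the 4-bit quantization, the 24 GiB card, and the 64 GiB of system RAM are made-up example values, and the helper names (`weight_bytes`, `plan_offload`, `cards_needed`) are hypothetical. It only counts the weights, so the KV cache and activations would come on top.

```python
import math

GIB = 1024 ** 3


def weight_bytes(n_params: int, bits_per_param: int) -> int:
    """Bytes needed for the weights alone (no KV cache, no activations)."""
    return n_params * bits_per_param // 8


def plan_offload(weights: int, vram: int, ram: int) -> dict:
    """Fill VRAM first, then spill whatever is left into system RAM."""
    gpu_bytes = min(weights, vram)
    cpu_bytes = weights - gpu_bytes
    return {
        "fits": cpu_bytes <= ram,              # does VRAM + RAM cover the weights?
        "gpu_bytes": gpu_bytes,
        "cpu_bytes": cpu_bytes,                # this part runs on the CPU
        "cpu_fraction": cpu_bytes / weights,
    }


def cards_needed(weights: int, vram_per_card: int) -> int:
    """How many identical cards it takes to hold the weights entirely in VRAM."""
    return math.ceil(weights / vram_per_card)


# Illustrative values only: a hypothetical 70B model quantized to 4 bits,
# one 24 GiB card, 64 GiB of system RAM.
weights = weight_bytes(70_000_000_000, 4)
plan = plan_offload(weights, vram=24 * GIB, ram=64 * GIB)
cards = cards_needed(weights, 24 * GIB)

print(f"weights: {weights / GIB:.1f} GiB")
print(f"fits in VRAM + RAM: {plan['fits']}, share on CPU: {plan['cpu_fraction']:.0%}")
print(f"cards needed to keep it all in VRAM: {cards}")
```

With these example values the model fits only by spilling part of the weights into RAM, which is the slow, CPU-bound case; pooling several cards is what removes that spill.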