With AI seemingly consuming every hardware resource right now, I’m wondering what parts of those current systems we could see trickling down into components for desktop PCs as they become outdated for AI tasks.

I know most of this hardware is pretty specific and integrated, but I do wonder if an eventual workaround to these hardware shortages is the recycling and repurposing of the very systems causing the shortage. We have seen DRAM, flash, and even motherboard chipsets pulled from server equipment and find their way into suspiciously cheap hardware on eBay and AliExpress, so how much of the current crop of hardware will turn up there in the future?

How much of that hardware could even be useful to us? Will Nvidia repossess old systems and shoot them into the sun to keep them out of the hands of gamers? Perhaps only time will tell.

  • tal@lemmy.today · 12 hours ago

    I posted in a thread a bit back about this, but I can’t find it right now, annoyingly.

    You can use the memory on GPUs as swap, though on Linux, that’s currently through FUSE — going through userspace — and probably not terribly efficient.

    https://wiki.archlinux.org/title/Swap_on_video_RAM
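
    A minimal sketch of what that recipe looks like, assuming vramfs is already installed and mounted (the wiki covers that part). The mount point, sizes, and swap priority here are made up, and the loop-device step reflects my understanding that swapon can’t use a file sitting on a FUSE filesystem directly, so treat this as illustrative rather than authoritative:

    ```python
    # Rough sketch of the Arch wiki "swap on VRAM" recipe, driven from Python
    # for illustration. Assumes vramfs is already mounted at MOUNT and that
    # this runs as root; all paths and sizes are hypothetical.
    import subprocess

    MOUNT = "/var/tmp/vram"          # hypothetical vramfs mount point
    SWAPFILE = f"{MOUNT}/swapfile"

    def run(*cmd: str) -> str:
        """Run a command, fail loudly, and return its stdout."""
        return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout.strip()

    # Swap files shouldn't be sparse, so write the file out with dd rather than truncate.
    run("dd", "if=/dev/zero", f"of={SWAPFILE}", "bs=1M", "count=3072")
    # swapon can't use a file on a FUSE filesystem directly, so attach a loop device first.
    loopdev = run("losetup", "-f", "--show", SWAPFILE)
    run("mkswap", loopdev)
    run("swapon", "-p", "100", loopdev)   # high priority, so it's preferred over disk swap
    print(f"VRAM-backed swap active on {loopdev}")
    ```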

    Linux apparently can use it via HMM: the memory will show up as system memory.

    https://www.kernel.org/doc/html/latest/mm/hmm.html

    Provide infrastructure and helpers to integrate non-conventional memory (device memory like GPU on board memory) into regular kernel path

    It will have higher latency due to going over the PCIe bus. It sounds like it basically uses main memory as a cache, and all attempts to directly access a page on the device trigger an MMU page fault:

    Note that any CPU access to a device page triggers a page fault and a migration back to main memory. For example, when a page backing a given CPU address A is migrated from a main memory page to a device page, then any CPU access to address A triggers a page fault and initiates a migration back to main memory.

    I don’t know how efficiently Linux deals with this for various workloads; if it can accurately predict the next access, it might be able to pre-request pages and keep things fairly fast. That is, it’s not that the throughput is so bad, but the latency is, so you’d want to mitigate that where possible. There are going to be some workloads for which that’s impossible: an example would be allocating a ton of memory and then accessing random pages. The kernel can’t hide the PCIe latency in that case.
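
    To make the access-pattern point concrete, here’s a toy sketch (plain anonymous memory, sizes picked arbitrarily): touch the same set of pages sequentially and then in random order. If part of that buffer has been migrated out to device memory or to ordinary swap, the sequential pass gives the kernel something it can prefetch; the random pass doesn’t.

    ```python
    # Toy illustration of sequential vs. random page access over a large buffer.
    # On a machine where some of these pages have been pushed out to swap or
    # device memory, the random pass is the one the kernel can't prefetch for.
    import random
    import time

    PAGE = 4096
    N_PAGES = 256 * 1024                 # 1 GiB of anonymous memory

    buf = bytearray(N_PAGES * PAGE)

    def touch(order) -> float:
        """Read one byte from each page in the given order; return elapsed seconds."""
        start = time.perf_counter()
        total = 0
        for i in order:
            total += buf[i * PAGE]
        return time.perf_counter() - start

    sequential = range(N_PAGES)
    shuffled = list(sequential)
    random.shuffle(shuffled)

    print(f"sequential pass: {touch(sequential):.3f} s")
    print(f"random pass:     {touch(shuffled):.3f} s")
    ```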

    Someone wrote a driver to do this for old Nvidia cards (something that starts with a “P”) that I also can’t find at the moment. I thought that was the only place it worked, but it sounds like it can also be done on newer Nvidia and AMD hardware. I haven’t dug into it, but I’m sure it’d be possible.

    A second problem with using a card as swap is power: a Blackwell card can draw extreme amounts of it, enough to overload a typical consumer desktop PSU. That peak draw presumably only happens when the compute hardware is actually busy, which it wouldn’t be if you’re just moving memory around; existing GPUs normally use much less power than they do when crunching numbers. But if you’re running a GPU on a PSU that can’t actually feed it at full blast, you have to be sure that the compute hardware never actually spins up.

    EDIT: For an H200 (141 GB memory):

    https://www.techpowerup.com/gpu-specs/h200-nvl.c4254

    TDP: 600 W

    EDIT2: Just to drive home the power issue:

    https://www.financialcontent.com/article/tokenring-2025-12-30-the-great-chill-how-nvidias-1000w-blackwell-and-rubin-chips-ended-the-era-of-air-cooled-data-centers

    NVIDIA’s Blackwell B200 GPUs, which became the industry standard earlier this year, operate at a TDP of 1,200W, while the GB200 Superchip modules—combining two Blackwell GPUs with a Grace CPU—demand a staggering 2,700W per unit. However, it is the Rubin architecture, slated for broader rollout in 2026 but already being integrated into early-access “AI Factories,” that has truly broken the thermal ceiling. Rubin chips are reaching 1,800W to 2,300W, with the “Ultra” variants projected to hit 3,600W.

    A standard 120V, 15A US household circuit tops out at 1,800W, and that’s with the circuit fully loaded (the usual rule of thumb for continuous loads is 80% of that, about 1,440W). Even if you get a PSU capable of that and dedicate an entire household circuit to it, anything beyond that means multiple PSUs on independent circuits, 240V service, or something along those lines.

    I have a space heater in my bedroom that can do either 400W or 800W.
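
    For a rough sense of the budget, here’s the arithmetic using the TDP figures above; the 80% continuous-load derating and the ~200W for the rest of the machine are my own assumptions, not from the article:

    ```python
    # Back-of-the-envelope power math for one 120 V / 15 A US branch circuit.
    # GPU TDPs are the figures quoted above; the derating and system overhead
    # are assumptions for illustration.
    CIRCUIT_W = 120 * 15                     # 1,800 W absolute ceiling
    CONTINUOUS_W = int(CIRCUIT_W * 0.8)      # ~1,440 W under the usual 80% rule
    REST_OF_SYSTEM_W = 200                   # assumed CPU, board, drives, fans

    for name, tdp_w in [("H200 NVL", 600), ("B200", 1200), ("GB200 module", 2700)]:
        total_w = tdp_w + REST_OF_SYSTEM_W
        verdict = "fits" if total_w <= CONTINUOUS_W else "does not fit"
        print(f"{name}: ~{total_w} W total -> {verdict} on one circuit")
    ```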

    So one question, if you want to use the card just for extra memory, is what ceiling on power draw you can guarantee while most of the card’s on-board hardware sits idle.