• krooklochurm@lemmy.ca · 3 points · 2 hours ago

    So… the models are all trained up, and now they need to run them, is what I’m reading.

    You need lots of VRAM to train a model.

    An LLM, once trained, can be run in much less VRAM and a lot of system RAM.
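
    As a concrete sketch (file name and layer count are placeholders, not anything from this thread): with llama-cpp-python, n_gpu_layers controls how much of a quantized model lands in VRAM, with the rest staying in system RAM.

    ```python
    # Sketch: partial GPU offload of a quantized model with llama-cpp-python.
    from llama_cpp import Llama

    llm = Llama(
        model_path="model-q4_k_m.gguf",  # hypothetical quantized GGUF file
        n_gpu_layers=20,  # only these layers go to VRAM; the rest use RAM
        n_ctx=4096,
    )

    out = llm("Q: Why does inference need less VRAM than training? A:",
              max_tokens=64)
    print(out["choices"][0]["text"])
    ```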

  • w3dd1e@lemmy.zip · 5 points · 3 hours ago

    Buying used RAM on Marketplace and hoping it isn’t broken. Hoping it was just stolen from a Best Buy. Fingers crossed, y’all!

  • tal@lemmy.today · 25 points · 11 hours ago

    GPU prices are coming to earth

    https://lemmy.today/post/42588975

    Nvidia reportedly no longer supplying VRAM to its GPU board partners in response to memory crunch — rumor claims vendors will only get the die, forced to source memory on their own

    If that’s true, I doubt they’re going to stay down to earth for long.

  • ffhein@lemmy.world · 11 points · 12 hours ago

    … I was thinking about buying a 96GB DDR5 kit from the local computer store a few weeks ago, but wasn’t sure it was actually worth €700. Checked again now, and the exact same product costs €1500. I guess that settles it; 32GB will have to be enough for the next couple of years.

    • Holytimes@sh.itjust.works · 6 points · 5 hours ago

      I’ve come to learn over the years: if you want to buy computer parts, just do it.

      You’re actively stupid if you don’t, because some bigger idiot with more money than brains will start a new grift that makes everything unaffordable.

      Fuck waiting for deals, fuck thinking twice. Just fucking buy it and ignore reality around you, because you will be pissed either way.

      Either a deal comes along and you’ve fucked yourself, or everything goes to the moon and now you have nothing AND you’re fucked.

      • lemming741@lemmy.world · 5 points · 3 hours ago

        Part of it is the thrill of the hunt. I’ve caught some great deals over the years stalking Marketplace.

        Got .iso storage after Chia crashed
        Got a 3090 after Bitcoin ASICs took over
        Got a 5900X when the X3D parts came out

        But I’ve never seen decent RAM for sale, only single sticks or slow kits.

    • panda_abyss@lemmy.ca · 13 points · 15 hours ago

      I hope this is the beginning of the end for the CUDA monopoly. I just want good GPGPU support for numerical code.
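
      For a taste of what vendor-neutral GPGPU already looks like, a trivial PyOpenCL kernel (the kernel and sizes are illustrative) runs on any GPU with an OpenCL driver, not just CUDA hardware:

      ```python
      # Sketch: elementwise square on the GPU via OpenCL instead of CUDA.
      import numpy as np
      import pyopencl as cl

      ctx = cl.create_some_context()  # picks whatever OpenCL device exists
      queue = cl.CommandQueue(ctx)

      a = np.arange(1 << 20, dtype=np.float32)
      mf = cl.mem_flags
      a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
      out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, a.nbytes)

      prg = cl.Program(ctx, """
      __kernel void square(__global const float *a, __global float *out) {
          int i = get_global_id(0);
          out[i] = a[i] * a[i];
      }
      """).build()

      prg.square(queue, a.shape, None, a_buf, out_buf)
      result = np.empty_like(a)
      cl.enqueue_copy(queue, result, out_buf)
      assert np.allclose(result, a * a)
      ```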

      • klangcola@reddthat.com · 20 points · 11 hours ago

        The article introduction is gold:

        In the unlikely case that you have very little RAM and a surplus of video RAM, you can use the latter as swap.

          • litchralee@sh.itjust.works · 0 points · 1 hour ago

            https://github.com/Overv/vramfs

            Oh, it’s a user space (FUSE) driver. I was rather hoping it was an out-of-tree Linux kernel driver, since using FUSE will: 1) always pass back to userspace, which costs performance, and 2) destroy any possibility of DMA-enabled memory operations (DPDK is a possible exception). I suppose if the only objective was to store files in VRAM, this does technically meet that, but it’s leaving quite a lot on the table, IMO.

            If this were a kernel module, the filesystem performance would presumably improve, limited by how the VRAM is exposed by OpenCL (i.e. very fast if it’s all just mapped into PCIe). And if it were basically offering VRAM as PCIe memory, then the VRAM could potentially cover certain niche RAM use cases, like hugepages: some applications need large quantities of memory, a guarantee that it won’t be evicted from RAM, and physical addresses that can be resolved from userspace (e.g. DPDK, high-performance compute). If such a driver could offer special hugepages backed by VRAM, then those applications could benefit.
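
            As a rough illustration of the hugepage mechanism itself (ordinary RAM-backed hugepages; the VRAM-backed variant above is hypothetical):

            ```python
            # Sketch: anonymous hugepage-backed memory on Linux (Python 3.10+).
            # Assumes hugepages were reserved beforehand, e.g.:
            #   echo 64 | sudo tee /proc/sys/vm/nr_hugepages
            import mmap

            HUGEPAGE = 2 * 1024 * 1024  # default 2 MiB hugepage on x86-64

            # MAP_HUGETLB pages stay pinned in RAM and are never swapped out,
            # which is the guarantee DPDK-style applications rely on.
            buf = mmap.mmap(-1, 16 * HUGEPAGE,
                            flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS
                            | mmap.MAP_HUGETLB)
            buf[:5] = b"hello"
            print(buf[:5])
            buf.close()
            ```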

            And at that point, on systems where the PCIe address space is unified with the system address space (e.g. x86), it’s entirely plausible to use VRAM as if it were hot-insertable memory, because both RAM and VRAM would occupy known regions within the system memory address space, and the existing MMU would control which processes can access which parts of PCIe-mapped VRAM.

            Is it worth re-engineering the Linux kernel memory subsystem to support RAM over PCIe? Uh, who knows. Though I’ve always liked the thought of DDR on PCIe cards. “All technologies are doomed to reinvent PCIe,” I think someone from Level1Techs said.

  • Jeena@piefed.jeena.net · 28 points · 19 hours ago

    This is very unfortunate. About a year ago I built my PC and only put in 32 GB of RAM. It was double what I had on my laptop, so I thought it would be enough for a start and I could buy more later.

    Already after 2 months I realized I could do so much more in parallel because of the fast CPU, but suddenly the amount of RAM became the bottleneck. When I looked at RAM prices it didn’t seem quite worth it, and I waited. But that backfired, because since then the prices have never gone down, only up.

    • NotSteve_@piefed.ca · 35 points · 17 hours ago

      What are you running that needs more than 32GB? I’m only just barely being bottlenecked by my 24GB when running games at 4K.

      • Jeena@piefed.jeena.net · 6 points · 16 hours ago

        Two browsers full of tabs, but that is not the problem. Once I start compiling AOSP (which I sometimes want to do for work at home instead of in the cloud, because it’s easier and faster to debug), it eats up all the RAM immediately and I have to give it 40 more GB of swap, and then the swapping becomes the bottleneck. Once that is running, the computer can’t really do anything else; even the browser struggles.

        • usernamesAreTricky@lemmy.ml · 8 points · 13 hours ago

          Have you tried compiling it with fewer threads? It would almost certainly reduce the RAM usage, and might even make the compile go faster if you’re swapping that heavily.
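
          Something like this back-of-envelope can pick a job count that fits in RAM (the 4 GiB-per-job figure below is a guess, not an AOSP-documented number):

          ```python
          # Sketch: cap build parallelism by available memory, not just cores.
          import os

          def safe_jobs(gib_per_job: float = 4.0) -> int:
              with open("/proc/meminfo") as f:
                  meminfo = dict(line.split(":", 1) for line in f)
              avail_kib = int(meminfo["MemAvailable"].split()[0])
              fit_in_ram = int(avail_kib / (gib_per_job * 1024 * 1024))
              return max(1, min(os.cpu_count() or 1, fit_in_ram))

          print(f"m -j{safe_jobs()}")  # pass to the AOSP build instead of a bare `m`
          ```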

          • Jeena@piefed.jeena.net · 2 points · 6 hours ago

            Yeah, that’s what I’m doing, but I paid for the fast CPU and don’t want to make it the bottleneck ^^

      • hoshikarakitaridia@lemmy.world · 6 points · 17 hours ago

        AI or servers, probably. I have 40GB, and that’s what I would need more RAM for.

        I’m still salty, because I had the idea of going CPU & RAM sticks for AI inference literally days before the big AI companies did. And my stupid ass didn’t buy them in time before the prices skyrocketed. Fuck me, I guess.

        • NotMyOldRedditName@lemmy.world · 6 points · edited · 17 hours ago

          It does work, but it’s not really fast. I upgraded from 32GB to 96GB of DDR4 a year or so ago, and being able to play with the bigger models was fun, but it was so slow I couldn’t do anything productive with it.

          • Possibly linux@lemmy.zip · 4 points · 16 hours ago

            You’re bottlenecked by memory bandwidth.

            You need DDR5 with lots of memory channels for it to be useful.
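
            A back-of-envelope for why (all figures illustrative, not benchmarks): generating a token streams the active weights through memory once, so peak bandwidth divided by model size gives a hard ceiling on tokens per second.

            ```python
            # Rough ceiling on tokens/s from memory bandwidth alone.
            def peak_bw_gbs(mt_per_s: int, channels: int) -> float:
                # 8 bytes per transfer per 64-bit channel
                return mt_per_s * 8 * channels / 1000

            model_gb = 40  # e.g. a ~70B-parameter model at 4-bit quantization
            configs = {
                "DDR4-3200, 2 channels": peak_bw_gbs(3200, 2),  # ~51 GB/s
                "DDR5-6000, 2 channels": peak_bw_gbs(6000, 2),  # ~96 GB/s
            }
            for name, bw in configs.items():
                print(f"{name}: at most {bw / model_gb:.1f} tokens/s")
            ```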

          • tal@lemmy.today · 3 points · edited · 17 hours ago

            You can have applications where wall-clock time is not all that critical but large model size is valuable, or where a model is very sparse and so does little computation relative to its size, but for the major applications, like today’s generative AI chatbots, I think that’s correct.

            • NotMyOldRedditName@lemmy.world · 3 points · edited · 16 hours ago

              Ya, that’s fair. If I was doing something I didn’t care about time on, it did work. And we weren’t talking hours, but it could be many minutes.

        • panda_abyss@lemmy.ca · 2 points · 15 hours ago

          I’m often using 100GB of RAM for AI.

          Earlier this year I was going to buy a bunch of used servers with 1TB of RAM, and I wish I had.

    • Jeena@piefed.jeena.net · 9 points · 16 hours ago

      I just had a look: on the 2nd of April I paid 67,000 KRW for one 16 GB stick. Now the same one (XPG DDR5 PC5-48000 CL30 LANCER BLADE White) is only sold in pairs, and a pair costs 470,000 KRW at the same shop, so 235,000 KRW per 16 GB stick. That is a price increase of 250%, god damn.

    • tal@lemmy.today · 3 points · edited · 17 hours ago

      Last I looked on Google Shopping a few days ago, you could still find some retailers that had DDR5 in stock (I was looking at 2x16GB, and you may want more than that) and hadn’t jacked their prices up. But if you’re going to buy, I would not wait any longer; if they haven’t been cleaned out by now, I expect that they will be soon.