I've realized I need to upgrade my little NUC to something bigger for faster inference on larger Llama models. I want something that can still sit on the living-room TV bench, so no monster rack please, but that has the necessary muscle for Llama when needed. Budget doesn't matter right now; I want to understand what's good and what's out there. Thanks.

EDIT: Wow, thanks for the inspiration. I guess I need to look a bit into “how to stuff a huge graphics card into a mini box”. To clarify what I want it for: I want to build a responsive personal assistant. I'm dreaming of models bigger than 8B with good tool calling for things like memory and web search. No coding, image generation, or video generation required. Image recognition would be nice but isn't a must. Regarding footprint: no monster ;) Something you can have in your living room that could be wife-approved, so no big gaming rig with exhaust pipes and stuff. It needs to be good-looking ;)

  • bazinga@discuss.tchncs.de (OP)
    13 hours ago

    Thank you for the detailed writeup. Are you aware of anything with a small footprint? I'm thinking DGX Spark size, maybe a bit bigger.

    • anamethatisnt@sopuli.xyz
      13 hours ago

      The problem with a smaller footprint is cooling and how audible it becomes.
      One idea is to use fiber-optic HDMI cables and a USB extender to hide the PC away in another room.

      If you want a smaller footprint, the keyword to search for is “unified memory”. It can be reasonably fast for 30B models and work as a slow “thinker” mode for 70B ones.

      edit: an example of a unified-memory machine, the Apple Mac Studio, can be found here at $5499 with 96GB RAM:
      https://www.apple.com/shop/buy-mac/mac-studio/m3-ultra-chip-32-core-cpu-80-core-gpu-96gb-memory-2tb-storage
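
A rough way to see why 30B feels reasonably fast and 70B feels slow on the same box: when generating text, a dense model has to read roughly all of its weights from memory for every token, so memory bandwidth divided by model size gives an optimistic ceiling on tokens per second. The sketch below is only that rule of thumb. The function name is made up for illustration, and the 800 GB/s figure is a placeholder assumption, not the spec of any particular machine.

```python
def decode_tokens_per_second_ceiling(bandwidth_gb_s: float,
                                     params_billion: float,
                                     bits_per_weight: float) -> float:
    """Optimistic upper bound on generation speed for a dense model.

    Assumes decoding is limited purely by reading every weight once per
    token; real throughput is lower (KV cache reads, compute, overhead).
    """
    model_size_gb = params_billion * bits_per_weight / 8  # weights only
    return bandwidth_gb_s / model_size_gb


# Placeholder bandwidth, chosen only to illustrate the scaling.
ASSUMED_BANDWIDTH_GB_S = 800.0

for size in (30, 70):
    ceiling = decode_tokens_per_second_ceiling(ASSUMED_BANDWIDTH_GB_S, size, 4)
    print(f"{size}B at 4-bit: at most ~{ceiling:.0f} tokens/s")
```

Plug in the real memory bandwidth of whatever machine you are comparing. The point is the ratio: at the same quantization, a 70B model is a bit more than twice as slow as a 30B one.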

      • zergtoshi@lemmy.world
        9 hours ago

        Wouldn’t an AMD RX 9060 XT with 16 GB of VRAM be nice as well if you’re hunting for good speed/cost options?

        • TheHolm@aussie.zone
          47 minutes ago

          Probably. It's just not as fast as the 9070 XT. I’m using a 9070 XT myself, and the limitation for running LLMs is memory, not speed. If the model fits in memory, it will run fast enough to be practical.
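
To make the "does it fit in memory" check concrete, here is a minimal sketch that estimates the size of a model's weights from its parameter count and quantization, then compares it against available memory. The function names and the 20% overhead allowance (for KV cache, activations, and runtime buffers) are assumptions for illustration; actual usage depends on context length and the inference software.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_memory(params_billion: float,
                   bits_per_weight: float,
                   memory_gb: float,
                   overhead_fraction: float = 0.2) -> bool:
    """Rough fit check: weights plus an assumed overhead allowance."""
    needed_gb = weight_footprint_gb(params_billion, bits_per_weight)
    needed_gb *= 1 + overhead_fraction
    return needed_gb <= memory_gb


# Compare a 16 GB card with a 96 GB unified-memory machine at 4-bit.
for memory_gb in (16, 96):
    for size in (8, 30, 70):
        verdict = "fits" if fits_in_memory(size, 4, memory_gb) else "does not fit"
        print(f"{size}B at 4-bit in {memory_gb} GB: {verdict}")
```

Under these assumptions an 8B model at 4-bit fits comfortably in 16 GB, while 30B and 70B models need the larger memory pool.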