Has anyone in their organization tried using self-hosted LLMs for agentic programming?

I’m curious whether it makes any sense. My organization spends a fortune on tokens from US companies. I want to recommend something…

  • 87Six@lemmy.zip
    2 days ago

    I mean… RAM? Don’t you need a massive amount of VRAM for this kind of thing? Or is it shared on a Mac?

    idk how to calculate how long it pay for itself…

    You don’t… Not in this industry. You guess and hope it goes in your favor.

    No calculations matter if the market can jump or drop by 300% in a few months… And that applies to programming, hardware prices, AI subscription prices, regulations between countries when Trump is in office…
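
    For a first rough guess, a break-even estimate can be sketched like this. Every figure is a made-up placeholder (not anyone's real spend), and `payback_months` is just an illustrative helper. The loop shows how far the answer moves when API prices shift, which is exactly the caveat above.

```python
def payback_months(hardware_cost, monthly_api_spend, monthly_running_cost):
    """Months until self-hosted hardware pays for itself; None if it never does."""
    monthly_saving = monthly_api_spend - monthly_running_cost
    if monthly_saving <= 0:
        return None  # running it yourself costs at least as much as the API
    return hardware_cost / monthly_saving


# Hypothetical inputs, chosen only for illustration.
HARDWARE_COST = 10_000        # one-off purchase
MONTHLY_API_SPEND = 2_000     # current token bill
MONTHLY_RUNNING_COST = 200    # power, maintenance, admin time

# Re-run the estimate with the API bill at 25%, 100% and 300% of today's value.
for factor in (0.25, 1.0, 3.0):
    months = payback_months(
        HARDWARE_COST, MONTHLY_API_SPEND * factor, MONTHLY_RUNNING_COST
    )
    label = "never" if months is None else f"{months:.1f} months"
    print(f"API spend x{factor}: payback in {label}")
```

    This ignores resale value, staff time, and any gap in model quality, so treat it as a starting point for the guess, not a forecast.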

    • SeductiveTortoise@piefed.social
      2 days ago

      Apple’s unified memory is shared across the CPU, GPU, and NPU, so you can assign a lot of memory to running local models, and the bandwidth is good, depending on the model.

      AMD has something similar with their something something AI CPUs and they go up to 128GB at the moment. Apple can be way faster though. And you were able to buy a Mac Studio with 512GB back when RAM wasn’t worth more than unicorn pee. For… I guess 10k though.
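
      A back-of-the-envelope way to check whether a model fits in 128GB or 512GB, and roughly how fast it could generate, is sketched below. It leans on two common rules of thumb that are assumptions here, not measurements: weights take about parameters × bits / 8 bytes, and single-stream decoding of a dense model is capped at roughly memory bandwidth divided by model size. The 20% overhead, the 800 GB/s bandwidth, and the model sizes are hypothetical placeholders.

```python
def weights_gb(params_billion, bits_per_weight):
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion, bits_per_weight, memory_gb, overhead_fraction=0.2):
    """True if weights plus an assumed overhead (KV cache, runtime) fit in memory."""
    needed_gb = weights_gb(params_billion, bits_per_weight) * (1 + overhead_fraction)
    return needed_gb <= memory_gb


def max_tokens_per_second(params_billion, bits_per_weight, bandwidth_gb_s):
    """Rough upper bound for a dense model: every weight is read once per token."""
    return bandwidth_gb_s / weights_gb(params_billion, bits_per_weight)


ASSUMED_BANDWIDTH_GB_S = 800  # placeholder, check the spec sheet of the actual machine

# Hypothetical dense models: (parameters in billions, bits per weight)
for params, bits in ((70, 4), (405, 8)):
    size = weights_gb(params, bits)
    speed = max_tokens_per_second(params, bits, ASSUMED_BANDWIDTH_GB_S)
    print(
        f"{params}B @ {bits}-bit: ~{size:.0f} GB, "
        f"fits 128GB: {fits(params, bits, 128)}, "
        f"fits 512GB: {fits(params, bits, 512)}, "
        f"at most ~{speed:.0f} tok/s"
    )
```

      Real throughput will land below that bound, and mixture-of-experts models only read their active parameters per token, so they can beat the dense estimate.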

      • 87Six@lemmy.zip
        1 day ago

        Apple unified memory shares

        That’s cool asf.

        Apple engineers with better leadership could change the fucking world… But instead they’re used to screw over their own user base.

        If my GPU starts falling back to RAM my game fps drops to 1 lol.