It just doesn’t resonate with voters.
I think many voters “feel” tech getting junky, but the connection to why is just way too complicated for most to dig into. It’s not a direct line like tipping waiters or getting abortions.
I’m with Shepard on this one, even if he’s being a jerk about it.
Lemmy is a filter bubble, an echo chamber. You miss information that would be personally important to you but gets excluded because it doesn’t fit the US Democratic party line, and the very specific part of it that Lemmy’s politically active base likes.
Like, I’m a raging Trump hater, but I’m kind of aghast at how many knee-jerk reactions I get (like, to me, your original reply) when I imply something vaguely critical about the Democrats.
What does that have to do with internet privacy legislation?
This is not just a partisan issue. As the article points out, it’s been like this for 30 years. The Dems failed to pass any meaningful legislation too.
It’s because it makes gobs of money that both parties are taking, and it also kind of projects US power to other countries since US tech is doing most of the data collection.
8GB or 4GB?
Yeah, you should get kobold.cpp’s ROCm fork working if you can manage it; otherwise, use their Vulkan build.
Llama 8B at shorter context is probably a good fit for your machine: it can fit on the 8GB GPU, or at least be partially offloaded if it’s a 4GB one.
I wouldn’t recommend deepseek for your machine. It’s a better fit for older CPUs: it’s not as smart as Llama 8B, and it’s bigger than Llama 8B, but it runs super fast because it’s an MoE.
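If you want to see what partial offload actually looks like, here’s a rough sketch using llama-cpp-python (same llama.cpp backend kobold.cpp builds on; the model filename and layer count below are just placeholders, not what you’d necessarily use):

```python
# Minimal sketch, assuming llama-cpp-python is installed and you have a
# Llama 3.1 8B GGUF on disk (the path below is a made-up example).
from llama_cpp import Llama

llm = Llama(
    model_path="./Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf",  # placeholder path
    n_ctx=8192,       # keep context modest so the KV cache fits in VRAM
    n_gpu_layers=20,  # partial offload for a 4GB card; try -1 (all layers) on an 8GB one
)

out = llm("Q: What is a mixture-of-experts model?\nA:", max_tokens=128)
print(out["choices"][0]["text"])
```

In kobold.cpp’s launcher the same idea should just be the “GPU layers” setting.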
Oh I got you mixed up with the other commenter, apologies.
I’m not sure when Llama 8B starts to degrade at long context, but I wanna say it’s well before 128K, which is where other “long context” models start to look much more attractive, depending on the task. Right now I am testing Amazon’s Mistral finetune, and it seems to be much better than Nemo or Llama 3.1.
4-core i7, 16GB RAM, and no GPU yet
Honestly as small as you can manage.
Again, you will get much better speeds out of “extreme” MoE models like deepseek chat lite: https://huggingface.co/YorkieOH10/DeepSeek-V2-Lite-Chat-Q4_K_M-GGUF/tree/main
Another thing I’d recommend is running kobold.cpp instead of ollama if you want to get into the nitty-gritty of LLMs. It’s more customizable and (ultimately) faster on more hardware.
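One example of that nitty-gritty: kobold.cpp runs a local HTTP server you can script against directly. A rough sketch (the default port is 5001 if I remember right, and the KoboldAI-style payload fields may vary a bit by version):

```python
# Quick sketch: talk to a running kobold.cpp instance over its local API.
# Port number and field names are assumptions based on the KoboldAI-style endpoint.
import requests

payload = {
    "prompt": "Explain what a mixture-of-experts model is in one paragraph.",
    "max_length": 120,   # tokens to generate
    "temperature": 0.7,
}
resp = requests.post("http://localhost:5001/api/v1/generate", json=payload, timeout=120)
print(resp.json()["results"][0]["text"])
```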
Can you afford an Arc A770 or an old RTX 3060?
Used P100s are another good option. Even an RTX 2060 would help a ton.
27B is just really chunky on CPU, unfortunately. There’s no way around it. But you may have better luck with MoE models like deepseek chat or Mixtral.
Here’s a tip: most software has the model’s default context size set at 512, 2048, or 4096. Part of what makes Llama 3.1 so special is that it was trained with 128K context, so bump that up to 131072 in the settings so it isn’t recalculating context every few minutes…
Some caveats: this massively increases memory usage (unless you quantize the KV cache, which requires flash attention), and it also massively slows down CPU generation once the context gets long.
TBH you just need to avoid keeping a long chat history unless you need it.
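For the scriptable route, here’s roughly what those settings look like in llama-cpp-python (assuming a recent build that exposes flash_attn; the model path is a placeholder):

```python
# Sketch only: bump the context window to Llama 3.1's full 128K and enable
# flash attention so the KV cache can be quantized; expect much higher RAM/VRAM use.
from llama_cpp import Llama

llm = Llama(
    model_path="./Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf",  # placeholder path
    n_ctx=131072,     # full 128K window instead of the tiny 2K/4K default
    flash_attn=True,  # prerequisite for quantizing the KV cache
)
```

In the kobold.cpp launcher the equivalent should just be the context-size slider plus the FlashAttention toggle.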
If this is a polite way of saying “go somewhere else to lightly criticize Democrats,” I don’t accept that. I can at least hope Lemmy can do better, and try to change it.
Of course having a good information diet is critical. But that’s beside the point? I don’t think this thread would be a thing if all our information diets were great.