Claude: papers please?

SuspciousCarrot78@lemmy.world · edit-2 12 hours ago

Claude: papers please?

maegul (he/they)@lemmy.ml · 8 hours ago

Probably a shallow response …

But I always figured AI/LLMs are basically apocalyptic for all sorts of individualistic values in computing (including privacy but also independence and diversity).

Whether they’re good or useful etc, I just struggle to see how they will ever be justifiable against these sorts of values.

Sure, local models and our hardware will get better … but better than the state of the art from the big labs and providers? Given that data and training are the big bottlenecks on quality … I struggle to see how AI isn’t a complete feudal capturing of information computing and processing. Not to mention what happens to the pipeline that produces information content if everyone is only consuming it through the models that train on it.

So for me the big question is, what’s our call on a possible (likely even?) future where we are forever stuck using cloud provided AI along with all of its negatives, in the same way that basically all of us has been and still is stuck using MS windows, Google and the big-social-media hellscape?

For me, I baulk at this.

SuspciousCarrot78@lemmy.world · edit-2 7 hours ago

I’m right there with you…but may I offer an alternative narrative in two parts and then address the pipeline issue you raise.

The first part:

There’s a small (but real) subset of people turning their back on big corpo. Retro-tech, dumb-phones, self hosting, linux, right-to-repair advocates, OSS and FOSS, privacy groups … everyone can smell the enshittification and are (in their own ways) pushing back. That’s not nothing.

I think the way forward is not to play the game. Big corpo will do what big corpo always does. But we can use the tools we have to make the things we want.

Will it compete with SOTA? No. But…does it need to? At an individual level, I’d argue “probably not”. It just needs to work for the individual.

More to the point, there’s something to be said about doing more with less. Constraints can bring about real innovation. If the answer cannot be “Throw more X at it” (where X is $$$, compute, whatever)…then how can you leverage the tools and intelligence you have to build what you want? I think that’s the real question.

Now for the second part:

So for me the big question is, what’s our call on a possible (likely even?) future where we are forever stuck using cloud provided AI along with all of its negatives, in the same way > that basically all of us has been and still is stuck using MS windows, Google and the big-social-media hellscape?

I’m more sanguine about it because I think this is down to the individual. Look at where you are now - it’s not Reddit or Facebook :). You and I choose to be here because…reasons. We can choose to run Linux, LibreOffice, Mullivad, llama.cpp, SearXNG, Syncthing, Immich etc for the same reasons.

I think the trick will be figuring out how to navigate from your home ecosystem into the wider world, without getting f’d in the a.

The one thing I don’t have a clean answer for is your pipeline point. If the content web collapses into AI slop - and it’s already going that way - then the human-generated signal that makes these models worth using starts to degrade. You may need to hold onto your “Good Old LLMs” for a while yet (or start training your own from scratch. There are ways and means but that’s beyond the scope of this conversation I think).

In any case, individual sovereignty doesn’t fix that. You can opt out personally and still live in a world where the epistemic commons has been strip-mined.

That…probably what WILL happen, come to think of it. Ok, fine. But partial answers already exist - cryptographic provenance of human content, federated communities being structurally harder to slop-flood (maybe).

Honestly? Nobody has solved that problem just yet. The people building the biggest models know it’s a problem and don’t have a clean answer either. Anyone who says they do is selling something.

All I can say is the only way to win is not to play the game. Which WORP would no doubt meep-morp at.

maegul (he/they)@lemmy.ml · 6 hours ago

Oh I hear you (and appreciate the response).

For me, I can’t help but think of another alternative, which I’m surprised I haven’t heard of yet …

stripping down one’s personal technological cognitive load to a stack of systems that can fit into one’s brain (like the Python mantra), focusing on learning that stack well building sustainable and stable systems, and then just detoxing from the increasingly polluted digital information stream (protected commons, traditional formats such as books and in person engagement … dunno).

Depends on what the end goal is, but AIs seem to be about using tech more or just opting out of sovereignty. Something like the above seems to me to be about using tech less (in the end) and pushing toward being a secondary tool rather than an end of its own.

SuspciousCarrot78@lemmy.world · edit-2 6 hours ago

I agree.

God help me, I’m actually reading books again.

Books.

It’s…harder than it use to be. A lot harder, actually.

But there’s something to be said about marginalia etc.

maegul (he/they)@lemmy.ml · 6 hours ago

Ha yes … on the other hand, it was easy to forget how good damn expansive non-internet information is: the whole world ran on that shit for millennia.

Eager Eagle@lemmy.world · 12 hours ago

I hope people in the us wake the fuck up and these companies start losing users to foreign ones by the millions

unitedwithme@lemmy.today · 11 hours ago

We spread word to migrate to federated and non US tech platforms to start taking their financial leg, then ad revenue drops and slowly they weaken. The weaker they are, the less political influence (aka purchasing power) they have!

SuspciousCarrot78@lemmy.world · 11 hours ago

I’m not sure that solves the issue or just changes the actors. Still, I’m all for “fight the power”.

I’m just a silly man with a box of scraps. But I hope enough silly men with boxes can come together to form some sort of co-op. Maybe. I don’t know. But…I hope, people smarter and better resourced than I can find a way forward.

The writing is on the wall here.

utopiah@lemmy.ml · edit-2 10 hours ago

IMHO LLM usage isn’t coherent with independence. That being said I wrote quite a bit on self-hosting LLMs. There are quite a few tools available, like ollama itself relying on llama.cpp that can both work locally and provide an API compatible replacement to cloud services. As you suggested though typically at home one doesn’t have the hardware, GPUs with 100+GB of VRAM, to run the state of the art. There is a middle ground though between full cloud, API key, closed source vs open source at home on low-end hardware : running STOA open models on cloud. It can be done on any cloud but it’s much easier to start with dedicated hardware and tooling, for that HuggingFace is great but there are multiples.

TL;DR: closed cloud -> models on clouds -> self-hosted provide a better path to independence, including training.

SuspciousCarrot78@lemmy.world · edit-2 9 hours ago

Yeah, me too :)

https://bobbyllm.github.io/llama-conductor/

https://codeberg.org/BobbyLLM/llama-conductor

I’m thinking about coding a >>cloud side car at the moment, with the exact feature you mentioned…but…that’s scope creep for what I have in mind.

Irrespective of all that, I agree: an open cloud co-op could be a good way to have SOTA (or near SOTA - GLM 5.1 is about as close as we have right now) access for when needed.

(Not teaching you to suck eggs, so this comment is for the lay-reader):

For coding, you can do some interesting stuff where the cloud model is the “general” and the locally hosted LLM is the “soldier” that does the grunt work. We have some pretty decent, consumer-level-hardware runnable “soldiers” now (I still like Qwen 3 coder)…they just don’t quite have the brains to see the full/big picture for coding.

ropatrick@lemmy.world · 8 hours ago

I love the sound of this but can I ask, if the net goes down and you hardly notice, where do you get your ‘net’ from? Or is it that your intranet doesn’t need internet as such and everything is just local?

I might have answered my own question there but I’m interested to understand it a bit more.

Thanks!

SuspciousCarrot78@lemmy.world · edit-2 8 hours ago

The intranet becomes the internet :) Everything is local, accessible from multiple devices within my wLAN. The main box plugs into the router and serves everything over Wifi to trusted devices - my documents, media, books, games etc.

I wrote (flippantly) about the bones of the system here, 3 or 4 months ago. It’s more complex now, but the endgame has always been “what if cloud, but you are your own cloud?”

https://lemmy.world/post/41315607/21438607

It may not be fresh (if the net goes down), but it would be local. The only real question I have to grok for myself is if I want to mirror curated section of Wikipedia, books etc.

https://en.wikipedia.org/wiki/Kiwix

Probably I should. May as well go full data-horder. Good excuse to get a few more TB of storage. What I’ve done so far is all within 4TB, using clever tricks and black magic but there’s a limit to 4TB. Fortunately, hard drives are still not too $$$. +4TB is about $200 here locally. So the entire set up is still around $600-700 AUD (around $350-400 USD)

All the other stuff I have more or less tee-ed up (barring the UPS + solar kit I am building later in the year).

https://www.youtube.com/watch?v=1q4dUt1yK0g

Anyhow, 8TB local should just about cover what I have in mind (he said, fully aware of dragon horder sickness). Then I’ll grab something for offsite storage for critical docs - I have an old raspberry pi with a 256GB NVMe ssd I can use for that.

I’m semi tempted (because fuck it, why not have fun) to look into LoRA after that.

https://en.wikipedia.org/wiki/LoRa

What I really NEED to do is finish the LLM stack (I’m on it and nearly done) and then do a curated Youtube replacement with yl-dlp feeding into Nova-player or Jellyfin, once/if SmartTube etc gets shit-canned. The youtube thing I’m kinda excited about because I’ve figured out how to squeeze ~5000 videos in around 250ish GB of storage, with TTL (time to live) mechanics, download replacement schedule etc. The kids watch too much random shit on YT, so daddy will make YT at home (ha!).

I have some other wild ideas too…it’s a whole other thing…don’t get me started :)

Once I’m finished, I will open-source the entire thing, post about it here, and let others replicate / improve on it. And so it goes. Once you begin walking down the dark path, you are forever doomed. Be careful :)

ropatrick@lemmy.world · 7 hours ago

OK I have you. You dont need the internet because you have the internet in your terabyte farm. Pretty cool.

Thanks for the detailed reply.

One final question, I’m sure its dark at the bottom of the deep rabbit hole you are in, what do you do for batteries for your head torch?! 😀

SuspciousCarrot78@lemmy.world · edit-2 4 hours ago

Exactly so. Mom - can we get the internet? Mom: we have the internet at home.

Batteries? I don’t need batteries. I have the never-ending warm glow of weaponized autism. And that’s not even a joke.

I tend to hyper-fixate on something until either it breaks or I do. It’s usually 70/30 in my favour :)

trailee@sh.itjust.works · 11 hours ago

Even if the tools are not yet there, “they” want to know exactly who asks for code to things like a DIY radar station or autonomous drone control. We’re well into “first they came” territory.

SuspciousCarrot78@lemmy.world · edit-2 11 hours ago

I hope you’re wrong. I’m worried that you’re probably not.

Still time. Just barely.

trailee@sh.itjust.works · 9 hours ago

I hope I’m wrong too. But I’m pretty pessimistic.

steel_for_humans@piefed.social · 10 hours ago

Say I have a GPU with 32GB VRAM and I am on Linux, what local LLM would be good for coding?

Currently I just have an iGPU ;) but that’s always an option, albeit a very expensive one.

andrew0@lemmy.dbzer0.com · 9 hours ago

Get llama.cpp and try Qwen3.6-35B-A3B. Just came out and looks good. You’ll have to look into optimal settings, as it’s a Mixture of Experts (MoE) model with only 3B parameters active. That means that the rest can stay in RAM for quick inference.

You could also try the dense model (Qwen3.5-27B), but that will be significantly slower. Put these in a coding harness like Oh-My-Pi, OpenCode, etc. and see how it fares for your tasks. Should be ok for small tasks, but don’t expect Opus / Sonnet 4.6 quality, more like better than Haiku.

SuspciousCarrot78@lemmy.world · edit-2 9 hours ago

Sadly…none. Well, I mean…it depends what you mean by “coding”. If you mean “replace Claude with local?”. Then…none. Sorry.

If you mean “actually, if I use ECA to call a cloud model from OpenRouter for planning, then have it direct a local LLM to do the scutt work”, then the Qwen series of models (like Qwen 3 Next) are pretty awesome.

The iGPU will make you want to kill yourself though. Get a GPU :) Even a 4-16GB one can make a difference.

PS: You said GPU and iGPU, so I’m not sure which one has the 32GB or what rig your running. I have suspicion though you’re running on a i5 or i7 with something like a intel 630 igpu inbuilt? In which case, the iGPU is pretty slow and depending on the exact chip, you likely won’t be able to use CUDA or Vulkan acceleration.

So, the “get a GPU” thing still holds :)

steel_for_humans@piefed.social · 9 hours ago

I meant that I can buy one of those Radeons dedicated to AI work, like the ASRock Radeon AI PRO R9700 Creator 32GB GDDR6. If I need to.

Currently my Ryzen iGPU is all I need, because all I need is to see the graphical desktop environment on my screen ;) It does the job well.

I use Claude Code as well and I am slightly concerned with that ID verification news, even more so because of the technology partner that they chose.

SuspciousCarrot78@lemmy.world · 8 hours ago

Hmm. The R9700 is RDNA4 - ROCm support for that architecture may be patchy in linux? Dunno. Check that before you commit your hard earned dollary-doos.

If all good

Qwen2.5-Coder-32B fits comfortably and is genuinely capable.
Qwen3.5-27B (dense)
Qwen3.5-35B-A3B (MoE, only 3B active parameters)
Qwen3.6-35B-A3B just dropped

Qwen 3.6 is the latest hotness. I’d start from there and work backwards

https://inv.nadeko.net/embed/YKNvkBbRJIE?

https://www.youtube.com/watch?v=YKNvkBbRJIE

Sims@lemmy.ml · 11 hours ago

Yucky… The identity partner looks nasty. I bet all these corps are linked to the Epstein class and their think tanks. The 1% are taking control of their cattle…

lsjw96kxs@sh.itjust.works · 9 hours ago

Personally, I would like to use AI, but I don’t due to it being non local. I know there are local AI that could do things, but I don’t know which models are the good one for each task. If someone can give me pointers for it, I’d be grateful, for exemple a good model for local coding :)

lime!@feddit.nu · 9 hours ago

depends on your hardware and your preferred language. i think wizardcoder is a pretty common choice but the smallest useful version is around 14GB so you need the vram to accommodate it.

SuspciousCarrot78@lemmy.world · edit-2 9 hours ago

How much VRAM do you have?
Which GPU?
What sort of coding do you want to do?

No point in telling you “yo, dude, just grab MinMax 2.7 or GLM5.1”…unless you happen to have several GPUs running concurrently with a total combined VRAM pool of 500GB or more.

There are strong local contenders… (Like Qwen3-Coder-Next but as you can see, the table ante is probably in the 45GB vram range just to load them up. Actually running them with a decent context length is likely to mean you need to be in the 80-100GB range.

Do-able…but maybe pay $10 on OpenRouter first to test drive them before committing to $2000+ worth of hardware upgrades.

There are other, more reasonable, less hardware dependent uses for local LLMs, but if you want fully local coders, it’s the same old story: pay to play (and that’s even if you don’t mind slow speed / overnight batch jobs).

Right now, cloud-based providers are hemorrhaging money because they know it will lead to lock-in (ie: people will get use to what can be achieved with SOTA models, forgetting the multi-million dollar infrastructure required to run them). Then, when they realize you can’t quite do the same with local gear (at least, without spending $$$), they can ratchet the prices up.

Codex pro-plan just went to $300/month.

We’ve seen this playbook before, right?