• ikt@aussie.zone · 9 points · 19 hours ago

    The useful ones are still provided by big companies because the rest of us can’t afford the hardware to train them.

    We have computing power in our pockets a million times more powerful than what we used to send men to the moon, so why do you think we’ll never have enough power?

    I have already pointed out https://eurollm.io/

    The EuroLLM project includes Instituto Superior Técnico, the University of Edinburgh, Instituto de Telecomunicações, Université Paris-Saclay, Unbabel, Sorbonne University, Naver Labs, and the University of Amsterdam. Together they created EuroLLM-22B, a multilingual AI model supporting all 24 official EU languages. Developed with support from Horizon Europe, the European Research Council, and EuroHPC, this open-source LLM aims to enhance Europe’s digital sovereignty and foster AI innovation. Trained on the MareNostrum 5 supercomputer, EuroLLM outperforms similar-sized models. It is fully open source and available via Hugging Face.

    So long as some people don’t want to rely on big tech, there will be people pushing for independence, just like Linux users such as myself.

    • boonhet@sopuli.xyz · 5 points · 15 hours ago

      22B

      There are 700B+ parameter open weight models now. Frontier models are in the trillions.

      And even that model apparently took a supercomputer to train. I don’t have a supercomputer, so I can’t train my own models the way I can compile my own software. This is not comparable to running Linux, where you can just compile your own kernel or even a whole operating system (former Gentoo user here).

      I’ve tried running the models my 8 GB card can handle. They’re OK for a quick question, but they won’t be doing anything useful for me.
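      The gap between those parameter counts and an 8 GB card can be sketched with back-of-envelope arithmetic: weight memory is roughly parameters × bytes per parameter. This is a rough lower bound only (the model names are taken from the thread; real inference adds KV cache and activation overhead on top):

      ```python
      def weight_gb(params_billion: float, bits_per_param: float) -> float:
          """Approximate memory needed just to hold the weights, in GB."""
          return params_billion * 1e9 * bits_per_param / 8 / 1e9

      # Compare the models mentioned above at fp16 and aggressive 4-bit quantization.
      for name, params in [("EuroLLM-22B", 22), ("700B open-weight model", 700)]:
          for bits in (16, 4):
              print(f"{name} at {bits}-bit: ~{weight_gb(params, bits):.0f} GB")
      # EuroLLM-22B needs ~44 GB at fp16 and still ~11 GB even at 4-bit,
      # so neither fits in 8 GB of VRAM without offloading.
      ```
      
      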

    • zalgotext@sh.itjust.works · 9 points · 17 hours ago

      We have computing power in our pockets a million times more powerful than what we used to send men to the moon, so why do you think we’ll never have enough power?

      Not the person you replied to, but I have thoughts on this point in particular:

      1. Consumer devices have started to slow down their performance improvements because we’re bumping up against the limits of physics
      2. People and corporations with far more money than the average consumer will always be able to run something orders of magnitude more powerful. Any advance that improves things for the average consumer improves things for rich people and corporations even more.
      3. Training an LLM isn’t really even about compute speed, it’s about access to good training material. The average consumer can’t afford to buy (or pirate) every book in existence like a rich person/corporation can. An average person doesn’t have the ability or time to curate their own training data, but rich people/corporations do.
    • Auli@lemmy.ca · 2 points · edited · 15 hours ago

      Because companies are using so much computing power that it requires as much electricity as a city. Or you can take your pocket computing resources and see how long it takes to train an LLM.