Stop telling me AI is the future [Still Vreni]

cannedtuna@lemmy.world · 1 month ago

SystemDisc@feddit.org · edit-2 1 month ago

Such a fallacy. Anything that falls under the umbrella of machine learning will contribute to future AI. We certainly won’t improve LLMs such that they become AGI, but all of it contributes.

And whether or not future AI even uses traditional silicon computing is irrelevant.

What matters is improved understanding of mathematics, neurons, chemistry, electronics, etc. That all happens each step of the way, even if the next technology is completely different.

JcbAzPx@lemmy.world · 1 month ago

“What matters is improved understanding of mathematics, neurons, chemistry, electronics, etc.”

All of which have absolutely nothing to do with what we are currently calling AI.

SystemDisc@feddit.org · edit-2 1 month ago

Doing with it, sure, but the creation of LLMs, and the algorithms behind them, especially the training, are what I’m talking about. It’s a lot of very impressive, complicated math.

I think it’s pretty pathetic that “fuck AI” has become the trendy, cool thing. It really misses the mark. It should be fuck capitalism and the sociopathic CEOs abusing AI and shoving it down our throats. AI is not the problem.

AnyOldName3@lemmy.world · 1 month ago

It’s actually just a lot of pretty simple maths from decades ago, but it’s a lot of it. The big changes in those decades have been the feasibility of doing enough of that simple maths to achieve anything useful, and domain-specific network architecture stuff that’s rarely transferable. For example, LLMs are possible because of the invention of the transformer architecture in 2017, and that’s also turned out to be useful for a few things like image generation and protein folding simulation, but not for all neural-network-based techniques. Most of the things that have made successive LLMs better haven’t also been useful for the few other transformer-based neural networks. Most non-LLM AI isn’t going to be meaningfully easier to create than it would have been had the world got bored after GPT-2 and we’d only focussed on doing image and video generation.

pfried@reddthat.com · 1 month ago

The transformer is useful for damn near anything. At the end of the day, what we consider intelligence is the ability to predict what comes next, whether that is what our senses will tell us next or what the next hypothesis to test should be based on the data we have seen so far.

AnyOldName3@lemmy.world · 1 month ago

It’s not damn near anything. There’s loads of stuff that computers can do much more quickly and more accurately without it just by virtue of computers already being fast and effective at maths and obeying logic. With or without the transformer architecture, a neural network is never going to be as fast or reliable at, for example, summing a collection of numbers as just adding them would be, and loads of real-world tasks are like this, hence why we’ve built billions of computers even before the transformer architecture was invented.
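To make the summing example concrete, here is a minimal Python sketch of what "just adding them" looks like. The function name `total` and the sample inputs are made up for illustration; the point is only that ordinary arithmetic is deterministic and needs no trained model.

```python
import math

def total(numbers):
    """Sum a collection of numbers with ordinary arithmetic."""
    return sum(numbers)

# Integer addition is exact, every time, for any input size.
print(total([3, 5, 7]))

# For floats, math.fsum tracks the rounding error and returns
# a correctly rounded result.
print(math.fsum([0.1] * 10))
```

No training data, no GPU, and the same answer on every run.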

Also, in particular, I didn’t say that the transformer architecture wasn’t useful for things that aren’t LLMs, I said that most of the work done specifically to improve LLMs has no applications outside LLMs, so the next big leap towards making computers intelligent isn’t helped more by working on LLMs than it would be by working on any other kind of AI.

pfried@reddthat.com · 29 days ago

I’m saying there is no “big leap” necessary. As the paper that introduced the transformer said, attention is all you need.

AnyOldName3@lemmy.world · 29 days ago

If we’re going to pull up other people’s pithy phrases that aren’t intended to be taken entirely literally, then the relevant one here is “machine learning is the second best solution to any problem”. In the (approximately, depending on how you define it) century people have been thinking about computers, we’ve already found better solutions to lots of problems. If a transformer-based neural network can get 99% accuracy in sixty seconds on 92 billion transistors of GPU and billions more for its VRAM, that’s pretty useless if we can also do it with 100% accuracy in sixty microseconds on a $1 microcontroller, or even faster on a less constrained device.
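Putting rough numbers on that comparison, using only the figures in the paragraph above (sixty seconds at 99% versus sixty microseconds at 100%). The workload of one million inputs is an assumed value, chosen purely for illustration.

```python
# Figures taken from the comparison above.
NN_TIME_US = 60 * 1_000_000   # sixty seconds, expressed in microseconds
MCU_TIME_US = 60              # sixty microseconds
NN_ACCURACY_PERCENT = 99

# How many times faster the conventional solution is.
speedup = NN_TIME_US // MCU_TIME_US

def expected_errors(n_inputs, accuracy_percent):
    """Number of wrong answers expected at a given accuracy."""
    return n_inputs * (100 - accuracy_percent) // 100

# Assumed workload: one million inputs (hypothetical).
errors = expected_errors(1_000_000, NN_ACCURACY_PERCENT)

print(speedup)
print(errors)
```

So in this hypothetical the conventional approach is a million times faster, and the 99% solution gets ten thousand of the million inputs wrong.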

The “attention is all you need” phrase is specifically in the context of sequence transduction models, and specifically referring to the discovery that they don’t need a combination of attention, recurrence and convolution, but actually only need attention if it’s used in the novel way introduced by the paper. If you don’t need to transduce any sequences, then this isn’t necessarily relevant, and it’s critically not a claim that you can do everything by transducing sequences. It was a surprise that applying it to generating new text instead of just converting it worked as well as it did, and a surprise that it kept getting better with larger models instead of plateauing around the GPT-1 and GPT-2 era, and a surprise that the text generation could be used to do other things, even ones as basic as addition. These things weren’t predicted by the Attention Is All You Need paper.
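For anyone who hasn't seen it, the core operation from that paper is scaled dot-product attention, softmax(QKᵀ/√d_k)V. Below is a minimal pure-Python sketch of just that formula. It leaves out everything else a real transformer has (learned projections, multiple heads, masking, positional encoding), and the function names and toy inputs are invented for illustration.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors.

    Each query scores every key, the scores are turned into weights
    with softmax, and the output is the weighted sum of the values.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs

# Toy input: two identical keys, so the query weights both values equally
# and the output is their average.
result = attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 0.0], [1.0, 0.0]],
    values=[[0.0, 2.0], [4.0, 6.0]],
)
print(result)
```

Note that this is a weighted lookup over a sequence and nothing more, which is why the paper's claim is about what sequence transduction models need.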

pfried@reddthat.com · 19 days ago

“machine learning is the second best solution to any problem”

In much the same way as human thinking is the second best (and soon third best) solution to any problem. The point is that an LLM can come up with the best solution and use it.

“These things weren’t predicted by the Attention Is All You Need paper.”

Obviously not — they’re not going to make claims beyond the results they achieved in the paper. It was, however, obvious to everyone who read the paper that all of what we consider thinking could be derived by clever application of a sequence model, and all those papers that came after were results achieved by teams doing the obvious thing.