The best use of machine learning is to do “human-like” pattern recognition at large scale. As with any statistical analysis, its error rates can be measured and accounted for. The key is to give it narrowly defined tasks and then follow up with manual review.
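As a rough illustration of what “accounting for” an error rate could look like: manually review a random sample of the model's outputs, count the mistakes, and put a confidence interval around the observed rate. This is only a minimal sketch. The function name and the sample numbers are hypothetical, and the Wilson score interval is just one common choice for this kind of estimate.

```python
import math


def wilson_interval(errors: int, sample_size: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for an error rate estimated from a reviewed sample.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= errors <= sample_size:
        raise ValueError("errors must be between 0 and sample_size")

    p = errors / sample_size  # observed error rate in the reviewed sample
    denom = 1 + z**2 / sample_size
    center = (p + z**2 / (2 * sample_size)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / sample_size + z**2 / (4 * sample_size**2)) / denom
    )
    return max(0.0, center - half_width), min(1.0, center + half_width)


if __name__ == "__main__":
    # Hypothetical numbers: a reviewer checked 200 outputs and found 12 mistakes.
    low, high = wilson_interval(errors=12, sample_size=200)
    print(f"estimated error rate: between {low:.1%} and {high:.1%}")
```

The narrower the task, the more meaningful a number like this is, since a sample of reviewed outputs only tells you about outputs of the same kind.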
A big part of the problem with using it to summarize information is that different topics require different perspectives to analyze properly. A scientist might excel at analyzing and understanding research papers written in their field, but be unable to parse the technical jargon and format of articles from other fields. The same goes for understanding fandoms or cultural subgroups. A specific LLM might be fantastic at analyzing certain types of articles but suck at others, and because you aren’t able to tweak its parameters, you can’t really train it to specialize.
In a sense, the major LLM companies are not only selling a questionable product, but actively holding back more productive uses of AI. They buy up all the computer hardware and waste it trying to build an all-purpose tool. Meanwhile, the most fruitful uses of AI get more expensive as a result. Training and running specialized models can theoretically be done on local hardware, but even most prosumers have been priced out. More ambitious scientific research projects end up spending more to rent supercomputer time, all because techbros want a monopoly without offering a better product.
And on top of all of that, how much time are you actually saving by asking it to summarize for you? If you need to double-check it anyway because it keeps getting stuff wrong, wouldn’t it have been better to do it yourself from the start? You aren’t doing something that would be impossible otherwise; you’re just trying to save time. Machine learning can help scientists run simulations that would take millions of years to run otherwise. Compare that to the time you spend using and then double-checking the LLM: what would your gains be? Would it save orders of magnitude, or would you only shave a few minutes off a half-hour task? Is the risk of being misinformed worth even that?
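To put rough numbers on that question, here is a back-of-the-envelope sketch. Every figure in it is made up for illustration (the half-hour task, the review time, the chance of having to redo the work), and the function and parameter names are hypothetical, so treat it as a way to frame the comparison rather than a measurement.

```python
def summarize_tradeoff(
    manual_minutes: float,
    prompt_minutes: float,
    review_minutes: float,
    redo_probability: float,
    redo_minutes: float,
) -> tuple[float, float]:
    """Return (expected minutes saved, speedup factor) for one task.

    The assisted path costs prompting time plus review time, plus the
    expected cost of redoing the work when the summary turns out wrong.
    """
    assisted = prompt_minutes + review_minutes + redo_probability * redo_minutes
    return manual_minutes - assisted, manual_minutes / assisted


if __name__ == "__main__":
    # Assumed values: a 30-minute reading task, 2 minutes to prompt,
    # 15 minutes to double-check, and a 25% chance the summary is bad
    # enough that you end up reading the whole thing anyway.
    saved, speedup = summarize_tradeoff(
        manual_minutes=30,
        prompt_minutes=2,
        review_minutes=15,
        redo_probability=0.25,
        redo_minutes=30,
    )
    print(f"expected minutes saved: {saved}")
    print(f"speedup factor: {speedup:.2f}x")
```

With those assumed inputs the expected saving is a few minutes and the speedup is barely above 1x, nowhere near the orders of magnitude you get when a simulation goes from infeasible to feasible. Plug in your own numbers, but note that the review time never goes to zero as long as you can't trust the output.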