  • Language models are not databases and they are not markov bots (similar function but work directly using statistical word association maps).

    Except that it’s been demonstrated multiple times that original training data can be extracted from a language model, so it is completely valid to talk about the model as a database, because the training data is stored within it.

    Here’s a broad survey of more than 100 research papers demonstrating this: Training Data Extraction From Pre-trained Language Models: A Survey

    There is much more uncertainty about what is going on under the hood.

    So, this is a good analogy in this case.

    See, I know how an internal combustion engine works. I don’t know, by looking at the hood of a particular vehicle, how exactly a specific car’s engine operates (maybe it has 4 cylinders, or 6 or 8, maybe it has fuel injectors, maybe it has a carburetor, etc). However, I do know that the principles are the same for all internal combustion engines, and that just because I don’t know the details of how a particular engine operates, that does not mean that its operation is beyond my understanding.

    The same is true for machine learning models. There may be uncertainty as to how a particular model operates “under the hood”, but the principles of operation are the same for all, and are not incomprehensible.

    The main thing we can really know is that ultimately a human mind is a computer

    We actually don’t know this. The claim that the mind is a computer is called computationalism. It is speculative: there are several alternative theories, and little experimental evidence supporting any particular one.

    The idea that they are “just statistical models” and this knowledge can be used to say what is impossible for them from philosophical first principles keeps getting repeated but has never worked in practice. The reality is that no one knows enough to say for sure where the line is.

    You have to understand, the current branch of machine learning models grew out of algorithms whose purpose was processing large data sets with thousands or millions of variables and optimizing for areas in the data set where many of those variables were maximized (or minimized). Here’s a better explanation:

    Hill Climbing Algorithm & Artificial Intelligence - Computerphile

    How these tools perform their optimization, and what they optimize for, has been recombined in different ways to produce different types of models, and the search space of variables has been expanded with increased computing power, but the underlying operating principles are still the same. This is not a tool that can comprehend what it is doing; it can’t be self-aware. It can only process large amounts of input data and attempt to maximize along particular dimensions. This seems vague to humans because the number of variables being handled at any given time is far more than a human mind can focus on, but that doesn’t make the optimization routine intelligent or conscious. It’s just doing a lot of number crunching really fast, optimizing for specific aspects as directed by its developers.
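
    To make the optimization idea concrete, here’s a rough sketch of hill climbing in Python. The objective `example_score` and its peak at (3, -2) are made up purely for illustration, as are the function and parameter names. Real models work over vastly more variables and typically use gradient-based methods rather than random nudges, so treat this as a toy, not as how any actual model is trained.

```python
import random


def hill_climb(score, start, step=0.1, iterations=10_000, seed=0):
    """Greedy hill climbing: try a random nearby point, keep it only if it scores higher."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    current = list(start)
    best = score(current)
    for _ in range(iterations):
        # Nudge every variable by a small random amount.
        candidate = [x + rng.uniform(-step, step) for x in current]
        value = score(candidate)
        if value > best:  # accept only uphill moves
            current, best = candidate, value
    return current, best


def example_score(point):
    # Hypothetical objective (an assumption for this sketch): one peak at (3, -2).
    x, y = point
    return -((x - 3) ** 2 + (y + 2) ** 2)


point, value = hill_climb(example_score, start=[0.0, 0.0])
print(point, value)
```

    The loop never “knows” what the two variables mean. It just keeps whichever candidate scores higher according to the function it was handed, and with two variables you can watch it walk toward the peak.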