• psycho_driver@lemmy.world · 3 points · edited · 14 hours ago

    Bruh, a couple of months ago I asked it (Gemini) to check the number of characters, including spaces, in a potential game character name, because I was working at the time and couldn’t stop to verify my in-head count. It told me 21; I had counted 20. I figured I must have gotten distracted and miscounted. Later, when I had time to actually focus on the issue, it turned out the AI had miscounted a 20-character string (maybe it counted the null terminating character?).
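    The check in question is a one-liner in most languages. A minimal Python sketch, using a made-up name since the original one isn't given:

```python
# Hypothetical character name, just to show the check: len() counts every
# character, spaces included (Python strings have no null terminator).
name = "Grimgor Ironhide"
print(len(name))  # 16
```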

    • boonhet@sopuli.xyz · 21 points · 14 hours ago

      AI doesn’t see individual characters; it sees tokens, and most tokens are a whole word or part of a word. That’s why per-character questions have such a high failure rate.
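      A toy sketch of why. The vocabulary and greedy longest-match split below are made up for illustration; real BPE tokenizers are more involved, but the effect is the same — the model receives chunk IDs, not characters:

```python
# Toy tokenizer (invented vocabulary): greedy longest-match split into
# multi-character chunks, loosely mimicking how BPE merges common substrings.
VOCAB = ["Shadow", "heart", "blade", " "]

def tokenize(text):
    """Split text into vocabulary chunks; unknown characters become 1-char tokens."""
    tokens = []
    while text:
        for piece in sorted(VOCAB, key=len, reverse=True):
            if text.startswith(piece):
                tokens.append(piece)
                text = text[len(piece):]
                break
        else:
            tokens.append(text[0])  # fall back to a single character
            text = text[1:]
    return tokens

name = "Shadowheart blade"
print(tokenize(name))  # ['Shadow', 'heart', ' ', 'blade'] -- 4 tokens
print(len(name))       # 17 characters
```

      A model that sees 4 token IDs has no direct view of the 17 characters inside them; any character count it gives is an inference, not a measurement.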

      • PunnyName@lemmy.world · 3 points · edited · 31 minutes ago

        If it doesn’t understand the simple concept of the number of letters and spaces, it needs to be reprogrammed.

        ETA: sorry folks, not gonna change my view and simp for shit A.I., continue with the downvotes.

        • boonhet@sopuli.xyz · 19 points · 13 hours ago

          It doesn’t understand anything, though, and it never will. It’s a probability machine. If you choose to believe its output, that’s on you. I use it as a coding assistant to get boring things done faster: fire a prompt at Claude Code, grab a coffee, check out the diff. That last step is crucial — you can’t trust AI output blindly.

          • dream_weasel@sh.itjust.works · 4 points · 12 hours ago

            The embedding layer after tokenization is not just a probability machine in the way you’re suggesting. You can argue that its output is probabilistic with respect to inferred meaning, but too many people think it works the way text prediction on your phone does, and that is just factually inaccurate.

            Verify output, of course, but saying “it doesn’t understand anything” and “probability machine” is a borderline erroneous short sell. At the level of tokens it “understands” relationships, and those relationships are not themselves probabilistic, though they are approximations learned from a training corpus.
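            A toy illustration of “relationships at the token level”. The 3-d vectors below are invented; real embedding layers learn thousands of dimensions, but the idea is the same — related tokens point in similar directions:

```python
import math

# Invented toy "embeddings"; the words and coordinates are made up.
EMB = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.85, 0.82, 0.15],
    "sofa":  [0.1, 0.2, 0.9],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, near 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

print(cosine(EMB["king"], EMB["queen"]))  # close to 1: strongly related
print(cosine(EMB["king"], EMB["sofa"]))   # much smaller: weakly related
```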

            • hesh@quokk.au · 4 points · 12 hours ago

              Can you explain how it’s more than probability? It’s using a neural network to guess the most likely next token, isn’t it?

              • SlimePirate@lemmy.dbzer0.com · 1 point · 1 hour ago

                The fact that it uses a non-trivial neural network. If it were simply a frequency count, taken over a corpus, of how often each word follows each other word, it would be no stronger than keyboard word prediction. Making accurate suggestions requires the emergence of primitive reasoning about the semantics of the tokens; LLM neural networks (transformers) can be analyzed to find subnetworks dedicated to modeling reality. It is still probability, but saying it’s just probability is not faithful.

                • hesh@quokk.au · 1 point · 34 minutes ago

                  It’s still just predicting the next token; it’s just using far more past data points than your keyboard. The rest of the phenomena are emergent from that. I think that’s important to keep in mind, given how much they can imitate human reasoning.
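                  The “predicting the next token” step can be sketched mechanically. The logits below are made up; a real model scores its entire vocabulary at every step:

```python
import math

# Made-up scores (logits) for three candidate next tokens.
logits = {"cat": 2.0, "dog": 1.0, "car": 0.1}

def softmax(scores):
    """Turn raw scores into a probability distribution."""
    m = max(scores.values())  # subtract the max for numerical stability
    exps = {t: math.exp(s - m) for t, s in scores.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}

probs = softmax(logits)
next_token = max(probs, key=probs.get)  # greedy decoding: take the argmax
print(next_token)  # cat
```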

              • Canigou@jlai.lu · 3 points · edited · 12 hours ago

                You could also say that it chooses the next word it will say to you. It has a few words to choose from, selected in relation to the previously spoken words, your question, and previous interactions (the context). The probability you’re talking about (a number) could also be seen as its preference among those words. I’m not sure the probability vocabulary/analogy is the best one; the best might be not to employ any analogy at all, but then you have to dig deeper into the subject to form an informed opinion. This series of videos explains it better than I do: https://www.youtube.com/watch?v=aircAruvnKk&list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi
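                The “a few words to choose from, ranked by preference” view corresponds to top-k sampling. The distribution below is invented for illustration:

```python
import random

# Made-up probabilities over candidate next words. Top-k sampling keeps
# only the model's k "preferred" candidates, then samples among them.
PROBS = {"the": 0.30, "a": 0.25, "of": 0.24, "its": 0.20, "zebra": 0.01}

def top_k_sample(probs, k, rng):
    """Keep the k highest-probability words and sample one of them."""
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    words = [w for w, _ in top]
    weights = [p for _, p in top]  # random.choices treats these as relative weights
    return rng.choices(words, weights=weights)[0]

word = top_k_sample(PROBS, k=3, rng=random.Random(0))
print(word)  # one of: "the", "a", "of" -- "zebra" can never be chosen
```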

        • Womble@piefed.world · 5 points · 11 hours ago

          How many letters are there in 令牌? It’s a simple question, right? You wouldn’t need to search for it to find out, would you?
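          The question lands because even “count the characters” is ambiguous outside ASCII — the answer depends on what you count:

```python
# The same two-character string is six bytes in UTF-8: "letters",
# code points, and bytes are three different counts.
s = "\u4ee4\u724c"  # 令牌 ("token" in Chinese)
print(len(s))                   # 2 code points
print(len(s.encode("utf-8")))   # 6 bytes
```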