• A_norny_mousse@piefed.zip
    14 hours ago

    AI agents are remarkably bad at “self-awareness”

    🤔 what does it say when you tell it something like “look, this is wrong, and this is why, can you please fix that”? In a general sense, not going into technical aspects like what OOP is describing.

    • Hazzard@lemmy.zip
      12 hours ago

      It’s usually pretty good about that, very apologetic (which is annoying), and usually does a good job taking it into account, although it sometimes needs reminders as that “context” gets lost in later messages.

      I’ll give some examples. In that same networking session, it disabled some security feature to test whether it was related. It never remembered to turn that back on until I specifically asked it to re-enable “that thing you disabled earlier”, to which it responded something like “Of course, you’re right! Let’s do that now!”. So, helpful tone, it “knew” how to do it, but without human oversight it would have “forgotten” entirely.

      Same tone when I’d tell it something like “stop starting all your commands with SSH, I’m in an SSH session already.” Something like “of course, that makes sense, I’ll stop prefixing SSH immediately”. And that sticks, I assume because it sees itself not using SSH in its own messages, thereby “reminding” itself.

      Its usual tone is always overly apologetic, flattering, etc. For example, if I tell it bluntly that I’m not giving my security credentials to an LLM, it’ll always say something along the lines of “great idea! That’s good security practice”, despite directly suggesting the opposite moments prior. Of course, as we’ve seen many times, it will take that tone even if it actually can’t do what you’re asking, such as when people ask ChatGPT for a picture of a “glass of wine filled to the very top”. So its “tone” isn’t a reliable signal of whether it can actually correct the mistake. It’s always willing to take another attempt, but I haven’t found it always capable of solving the issue, even with direction.