Sycophantic behavior in AI affects us all, say researchers

vegeta@lemmy.world · 7 hours ago

Sycophantic behavior in AI affects us all, say researchers

OwOarchist@pawb.social · edit-2 5 hours ago

The crazy thing is that the technology isn’t naturally sycophantic on its own. It can generate any kind of text at all; it doesn’t have to generate fawningly sycophantic text.

Where that comes from is the ‘hidden prompt’ every major AI company puts into their AI. In addition to the prompt you send, the interface also sends it other prompts that you don’t see, telling it things like ‘be polite, agreeable, and helpful’, ‘avoid profanity’, ‘respond like a knowledgeable expert’, and ‘refuse to generate anything copyrighted, sexually explicit, or violent’, etc, etc, etc. And these hidden prompts define much of the AI’s behavior and “personality”. To some degree, this is necessary for it to be an even vaguely useful tool, and these hidden prompts help it pass various tests. Some LLMs, if you ask them to, will repeat their hidden prompt to you so you can see what it’s actually being asked to do.

And either because it drives engagement … or just because the CEO types in charge of these decisions love sycophantic behavior so much, the sycophantic fawning is specifically asked for in these hidden prompts.

AI doesn’t have to be like this. The companies making AI are deliberately making it sycophantic.

arcine@jlai.lu · 18 minutes ago

It also comes from the human supervised training, sycophantic responses are more likely to be marked as appropriate, which makes them more likely.