Likely just hallucinations. For example, there is no way they would store a confidence score as a string.
If it were hallucinations, which it very well could be, it still means the model has picked up this bias somewhere. That indicates Grok has either been programmed to derank Palestine content, or has learned it on its own (less likely).
It's difficult to conceive of the AI just making this up for no reason, and doing it so consistently across multiple accounts when asked the same question.
Yeah, the only way I can see confidence being stored as a string is if the key was meant for a GUI management interface that didn't hardcode possible values (think something built for private investors or untrained engineers, for cosmetic reasons). In an actual system this would almost always be a number or a boolean, not a string.
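For illustration, here's roughly what that split usually looks like. This is a minimal sketch, not anything confirmed about X's stack; every name in it is made up:

```python
from dataclasses import dataclass

# Hypothetical sketch: a real pipeline stores confidence as a number (or a
# thresholded boolean) and only turns it into a string at the display layer.
# None of these names come from Grok/X; they're purely illustrative.

@dataclass
class ModerationResult:
    account_id: str
    confidence: float  # internal representation: a score in [0, 1]
    flagged: bool      # or just a boolean decision past some threshold

def display_label(confidence: float) -> str:
    """Map the numeric score to a human-readable label for a GUI."""
    if confidence >= 0.9:
        return "Very High"
    if confidence >= 0.7:
        return "High"
    if confidence >= 0.4:
        return "Medium"
    return "Low"

result = ModerationResult(account_id="example_user", confidence=0.83, flagged=True)
print(display_label(result.confidence))  # "High" -- the string exists only for humans
```

If something like that is going on, the model echoing the string form would just mean it saw (or hallucinated) the GUI-facing representation, not the underlying store.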
That being said, it's entirely possible they're also using an LLM to process the result, which would mean they could have something like an "if it's rated X or higher, do Y" rule, where the LLM reads the string and answers whether it meets the threshold. But that would be so inefficient; I would hope they wouldn't layer it like that.
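To make the inefficiency concrete, here's a sketch of that kind of layering next to the plain comparison it replaces. `call_llm` is a made-up stand-in, not a real API, and the rating labels are invented:

```python
RATING_ORDER = ["Low", "Medium", "High", "Very High"]

def call_llm(prompt: str) -> str:
    # Stand-in for an actual model request; purely hypothetical.
    raise NotImplementedError("replace with a real LLM call")

def is_rated_at_least_llm(rating: str, threshold: str) -> bool:
    """The layered version: ask an LLM a yes/no question about two strings."""
    prompt = (
        f'Is a rating of "{rating}" equal to or higher than "{threshold}"? '
        "Answer yes or no."
    )
    return call_llm(prompt).strip().lower().startswith("yes")

def is_rated_at_least(rating: str, threshold: str) -> bool:
    """What you'd actually do: a one-line index comparison."""
    return RATING_ORDER.index(rating) >= RATING_ORDER.index(threshold)

print(is_rated_at_least("High", "Medium"))  # True, no model call needed
```

A network round trip and a full model inference to answer something a list lookup handles, which is exactly why I'd hope nobody ships it that way.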