MHard@lemmy.world to Technology@lemmy.world · 9 points · 3 days ago · re: "How Much Do LLMs Hallucinate in Document Q&A Scenarios? A 172-Billion-Token Study Across Temperatures, Context Lengths, and Hardware Platforms [TLDR: 25%]"
MHard@lemmy.world to memes@lemmy.world · 1 year ago · "My portfolio looks much cleaner though" (image) · 12 points
MHard@lemmy.world to memes@lemmy.world (English) · 1 year ago · "I give it my best though" (image) · 11 points
The task described in this article is answering questions about a document that was provided to the LLM in its context.
I would hope that if you gave a human a text and asked them to cite facts from it, they would do better than 99% correct.
Also, once the context exceeded 200k tokens, the LLM's error rate rose above 10%.