Hallucinations

This is part 10 of “101 Ways AI Can Go Wrong” - a series exploring the interaction of AI and human endeavor through the lens of the Crossfactors framework.

View all posts in this series or explore the Crossfactors Framework

You’re likely familiar with today’s concept in my series of 101 Ways to Screw Things Up with AI.

And if you’re not, you could just confidently pretend you are.

That’s because concept #10 is Hallucinations.

What Is It?

Hallucination refers to the generation of false information, particularly by generative AI models. We've all seen ChatGPT confidently make things up on the fly, and we've seen realistic AI-generated images containing nonsensical elements that were never part of the prompt.

In some ways this type of output would be better described as confabulation, and many prefer that term, but hallucination is by far the more popular label.

Why It Matters

Large language models are used for many tasks, including research, and many suggest that querying these models will replace traditional web search. But hallucinations present a major flaw. A large language model will output false information in the same confident language and tone it uses for correct information. The user has no way of knowing the information is incorrect unless they were already aware of the ground truth.

Real-World Examples

In 2023, a lawyer filed a legal brief in U.S. federal court that cited fictitious cases generated by ChatGPT as if they were real. The incident became fairly high profile, was embarrassing for the legal profession, and was covered widely in the news. The lawyer in question had even double-checked the veracity of the cases - with ChatGPT, which assured him of the integrity of the information it had provided.

Key Dimensions

Trust - Users place trust in their tools, but that trust must be calibrated to a tool's reliability for the task at hand. In the case of LLM hallucinations, calibrating trust is very difficult.

Mitigation Measures - The field has made rapid progress on technical mitigation measures, but all approaches remain imperfect. Less frequent hallucinations may cause fewer problems, yet they can also raise user trust to the point where users let their guard down.
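
To make this concrete, here is a minimal sketch of one simple mitigation idea, self-consistency checking: ask the model the same question several times and treat disagreement between the samples as a warning sign. The ask_model function is a hypothetical placeholder for whatever LLM API is actually in use; this is an illustration of the general technique, not a description of any specific vendor's safeguards.

```python
from collections import Counter

def ask_model(question: str) -> str:
    """Hypothetical placeholder: wrap your actual LLM API call here,
    with sampling enabled (temperature > 0) so repeated calls can differ."""
    raise NotImplementedError

def self_consistency_check(question: str, n_samples: int = 5, threshold: float = 0.8) -> dict:
    """Ask the same question several times; low agreement across samples
    suggests the model may be confabulating rather than recalling a stable fact."""
    answers = [ask_model(question).strip().lower() for _ in range(n_samples)]
    top_answer, count = Counter(answers).most_common(1)[0]
    agreement = count / n_samples
    return {
        "answer": top_answer,
        "agreement": agreement,
        "needs_human_review": agreement < threshold,
    }
```

Checks like this add cost and only catch unstable answers, which is part of why no mitigation is a complete fix.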

System Uncertainty - Measures of the model's own confidence, such as per-token probabilities, are computed as part of the text-generation process. However, none of this information is communicated to the user or attached to the output.
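
As an illustration of that last point, the sketch below uses the Hugging Face transformers library (with GPT-2 as a small stand-in model and an arbitrary prompt, both placeholders) to surface the per-token probabilities a causal language model already computes during generation. They are available programmatically but are normally discarded before the text reaches the reader.

```python
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # small stand-in model; any causal LM works the same way
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

prompt = "The first person to walk on the Moon was"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=8,
        do_sample=False,                 # greedy decoding for a deterministic example
        output_scores=True,              # keep the per-step logits
        return_dict_in_generate=True,
    )

# Log-probabilities of the tokens the model actually chose at each step.
scores = model.compute_transition_scores(out.sequences, out.scores, normalize_logits=True)
new_tokens = out.sequences[0, inputs["input_ids"].shape[1]:]

for tok_id, logp in zip(new_tokens.tolist(), scores[0].tolist()):
    print(f"{tokenizer.decode([tok_id])!r:>12}  p = {math.exp(logp):.3f}")
```

A low probability on a specific name or date is not proof of a hallucination, but it is exactly the kind of signal an interface could surface instead of throwing away.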

Take-away

Do you think about the level of trust you place in an LLM when working with one? What about when you read material you suspect may have been generated with its help?