Recently, there has been considerable interest in large language models: machine learning systems which produce human-like text and dialogue. Applications of these systems have been plagued by persistent inaccuracies in their output; these are often called “AI hallucinations”. We argue that these falsehoods, and the overall activity of large language models, is better understood as bullshit in the sense explored by Frankfurt (On Bullshit, Princeton, 2005): the models are in an important way indifferent to the truth of their outputs. We distinguish two ways in which the models can be said to be bullshitters, and argue that they clearly meet at least one of these definitions. We further argue that describing AI misrepresentations as bullshit is both a more useful and more accurate way of predicting and discussing the behaviour of these systems.
Control the language and you control the thought. I pitched a fit when "hallucinate" was put forward by the tech giants to describe their LLMs' falsehoods, and it mostly fell on deaf ears in my circles. Hallucinating isn't what these things do. They bullshit.
Hallucination also hid that literally everything they produce is a 'hallucination' because that's how they work. "Bullshit" is much more apt, as a bullshitter is sometimes and even often right.
The use of anthropomorphic language to describe LLMs is infuriating. I don't even think bullshit is a good term, because among other things it implies intent or agency. Maybe the LLM produces something that you could call bullshit, but to bullshit is a human thing, and I'd argue that the only reason what the LLM is producing can be called bullshit is because there's a person involved in the process.
Probably better to think about it in terms of lossy compression. Even if that's not quite right, it's less inaccurate and it doesn't obfuscate the difference between what the person brings to the table and what the LLM is actually doing.