How should we think about large language models (LLMs)? People commonly think and talk about them in terms of human intelligence. To the extent this metaphor does not accurately reflect the properties of the technology, this may lead to misguided diagnoses and prescriptions. It seems to me an LLM is not like a human or a human brain in so many ways. One crucial distinction for me is that LLMs lack individuality and subjectivity.
What are organisms that similarly lack these qualities? Coral polyps and Portuguese man o’ war come to mind, or slime mold colonies. Or maybe a single bacterium, like an E. coli. Each is essentially identical to its clones, responds automatically to chemical gradients (bringing to mind how LLMs respond to prompts), and doesn’t accumulate unique experiences in any meaningful way.
Considering all these examples, the meme about LLMs being like a shoggoth (an amorphous blob-like monster originating from the speculative fiction of Howard Philips Lovecraft) is surprisingly accurate. The thing about these metaphors though is that it’s about as hard to reason about such organisms as it is to reason about LLMs. So to use them as a metaphor for thinking about LLMs won’t work. A shoggoth is even less helpful because the reference will only be familiar to those who know their H.P. Lovecraft.
So perhaps we should abandon metaphorical thinking and think historically instead. LLMs are a new language technology. As with previous technologies, such as the printing press, when they are introduced, our relationship to language changes. How does this change occur?
I think the change is dialectical. First, we have a relationship to language that we recognize as our own. Then, a new technology destabilizes this relationship, alienating us from the language practice. We no longer see our own hand in it. And we experience a lack of control over language practice. Finally, we reappropriate this language use in our practices. In this process of reappropriation, language practice as a whole is transformed. And the cycle begins again.
For an example of this dialectical transformation of language practice under the influence of new technology, we can take Eisenstein’s classic account of the history of the printing press (1980). Following its introduction many things changed about how we relate to language. Our engagement with language shifted from a primarily oral one to a visual and deliberative one. Libraries became more abundantly stocked, leading to the practice of categorization and classification of works. Preservation and analysis of stable texts became a possibility. The solitary reading experience gained prominence, producing a more private and personal relationship between readers and texts. Concerns about information overload first reared its head. All of these things were once new and alien to humans. Now we consider them part of the natural order of things. They weren’t predetermined by the technology, they emerged through this active tug of war between groups in society about what the technology would be used for, mediated by the affordances of the technology itself.
In concrete material terms, what does an LLM consist of? An LLM is just numerical values stored in computer memory. It is a neural network architecture consisting of billions of parameters in weights and biases, organized in matrices. The storage is distributed across multiple devices. System software loads these parameters and enables the calculation of inferences. This all runs in physical data centers housing computing infrastructure, power, cooling, and networking infrastructure. Whenever people start talking about LLMs having agency or being able to reason, I remind myself of these basic facts.
A printing press, although a cleverly designed, engineered, and manufactured device, is similarly banal when you break it down to its essential components. Still, the ultimate changes to how we relate to language have been profound. From these first few years of living with LLMs, I think it is not unreasonable to think they will cause similar upheavals. What is important for me is to recognize how we become alienated from language, and to see ourselves as having agency in reappropriating LLM-mediated language practice as our own.