On how to think about large language models

How should we think about large language models (LLMs)? People commonly think and talk about them in terms of human intelligence. To the extent this metaphor does not accurately reflect the properties of the technology, this may lead to misguided diagnoses and prescriptions. It seems to me an LLM is not like a human or a human brain in so many ways. One crucial distinction for me is that LLMs lack individuality and subjectivity.

What are organisms that similarly lack these qualities? Coral polyps and Portuguese man o’ war come to mind, or slime mold colonies. Or maybe a single bacterium, like an E. coli. Each is essentially identical to its clones, responds automatically to chemical gradients (bringing to mind how LLMs respond to prompts), and doesn’t accumulate unique experiences in any meaningful way.

Considering all these examples, the meme about LLMs being like a shoggoth (an amorphous blob-like monster originating from the speculative fiction of Howard Philips Lovecraft) is surprisingly accurate. The thing about these metaphors though is that it’s about as hard to reason about such organisms as it is to reason about LLMs. So to use them as a metaphor for thinking about LLMs won’t work. A shoggoth is even less helpful because the reference will only be familiar to those who know their H.P. Lovecraft.

So perhaps we should abandon metaphorical thinking and think historically instead. LLMs are a new language technology. As with previous technologies, such as the printing press, when they are introduced, our relationship to language changes. How does this change occur?

I think the change is dialectical. First, we have a relationship to language that we recognize as our own. Then, a new technology destabilizes this relationship, alienating us from the language practice. We no longer see our own hand in it. And we experience a lack of control over language practice. Finally, we reappropriate this language use in our practices. In this process of reappropriation, language practice as a whole is transformed. And the cycle begins again.

For an example of this dialectical transformation of language practice under the influence of new technology, we can take Eisenstein’s classic account of the history of the printing press (1980). Following its introduction many things changed about how we relate to language. Our engagement with language shifted from a primarily oral one to a visual and deliberative one. Libraries became more abundantly stocked, leading to the practice of categorization and classification of works. Preservation and analysis of stable texts became a possibility. The solitary reading experience gained prominence, producing a more private and personal relationship between readers and texts. Concerns about information overload first reared its head.

All of these things were once new and alien to humans. Now we consider them part of the natural order of things. They weren’t predetermined by the technology, they emerged through this active tug of war between groups in society about what the technology would be used for, mediated by the affordances of the technology itself.

In concrete material terms, what does an LLM consist of? An LLM is just numerical values stored in computer memory. It is a neural network architecture consisting of billions of parameters in weights and biases, organized in matrices. The storage is distributed across multiple devices. System software loads these parameters and enables the calculation of inferences. This all runs in physical data centers housing computing infrastructure, power, cooling, and networking infrastructure. Whenever people start talking about LLMs having agency or being able to reason, I remind myself of these basic facts.

A printing press, although a cleverly designed, engineered, and manufactured device, is similarly banal when you break it down to its essential components. Still, the ultimate changes to how we relate to language have been profound. From these first few years of living with LLMs, I think it is not unreasonable to think they will cause similar upheavals. What is important for me is to recognize how we become alienated from language, and to see ourselves as having agency in reappropriating LLM-mediated language practice as our own.

Waiting for the smart city

Nowadays when we talk about the smart city we don’t necessarily talk about smartness or cities.

I feel like when the term is used it often obscures more than it reveals.

Here a few reasons why.

To begin with, the term suggests something that is yet to arrive. Some kind of tech-enabled utopia. But actually, current day cities are already smart to a greater or lesser degree depending on where and how you look.

This is important because too often we postpone action as we wait for the smart city to arrive. We don’t have to wait. We can act to improve things right now.

Furthermore, ‘smart city’ suggests something monolithic that can be designed as a whole. But a smart city, like any city, is a huge mess of interconnected things. It resists topdown design.

History is littered with failed attempts at authoritarian high-modernist city design. Just stop it.

Smartness should not be an end but a means.

I read ‘smart’ as a shorthand for ‘technologically augmented’. A smart city is a city eaten by software. All cities are being eaten (or have been eaten) by software to a greater or lesser extent. Uber and Airbnb are obvious examples. Smaller more subtle ones abound.

The question is, smart to what end? Efficiency? Legibility? Controllability? Anti-fragility? Playability? Liveability? Sustainability? The answer depends on your outlook.

These are ways in which the smart city label obscures. It obscures agency. It obscures networks. It obscures intent.

I’m not saying don’t ever use it. But in many cases you can get by without it. You can talk about specific parts that make up the whole of a city, specific technologies and specific aims.


Postscript 1

We can do the same exercise with the ‘city’ part of the meme.

The same process that is making cities smart (software eating the world) is also making everything else smart. Smart towns. Smart countrysides. The ends are different. The networks are different. The processes play out in different ways.

It’s okay to think about cities but don’t think they have a monopoly on ‘disruption’.

Postscript 2

Some of this inspired by clever things I heard Sebastian Quack say at Playful Design for Smart Cities and Usman Haque at ThingsCon Amsterdam.

Reboot 9.0 day 1

So here’s a short wrap up of the first day. I must say I’m not disappointed so far. The overall level of the talks is quite high again. Here’s what I attended:

Opening keynote – Nice and conceptual/theoretical. Not sure I agree with all the claims made but it was a good way to kick off the day on a gee whizz way.

Jeremy Keith – Good talk, nice slides, didn’t really deliver on the promise of his proposal though. I would’ve really liked to see him go into the whole idea of life streams further. The hack day challenge sounded cool though.

Stephanie Booth – Very topical for me, being a bilingual blogger and designer often confronted with localisation/multilingual issues.

My own talk – Went reasonably well. I guess half of the room enjoyed and the other half wondered what the f*** I was talking about. Oh well, I had fun.

Ross Mayfield – Could have been much better if it hadn’t been for technical screw-ups and perhaps some tighter pacing by Ross. Still the work he’s doing with social software is great.

Matt Jones – Very pretty presentation, nice topic and Dopplr looks cool. I’m not a frequent flyer but I can see the value in it. Still not quite sure it will improve the consequences of air-travel though.

Nicolas Nova – Came across as the high concept, theoretical twin to my talk. Lots of cool pervasive game examples. Nicolas always boggles my mind.

Jyri Engeström – Cool to see how he’s developed his talks throughout the past Reboots. I guess he delivered on his promise and stayed on the right side of the ‘I’m pushing my product’ line.

The evening program – No micro-presentations (which to be honest was fine by me, being quite exhausted). Good food, nice conversations and plenty of weird generative art, live cinema etc. All good.

On to day 2!