On how to think about large language models

How should we think about large language models (LLMs)? People commonly think and talk about them in terms of human intelligence. To the extent this metaphor does not accurately reflect the properties of the technology, it may lead to misguided diagnoses and prescriptions. It seems to me that an LLM is unlike a human or a human brain in many ways. One crucial distinction for me is that LLMs lack individuality and subjectivity.

What are organisms that similarly lack these qualities? Coral polyps and Portuguese man o’ war come to mind, or slime mold colonies. Or maybe a single bacterium, like an E. coli. Each is essentially identical to its clones, responds automatically to chemical gradients (bringing to mind how LLMs respond to prompts), and doesn’t accumulate unique experiences in any meaningful way.

Considering all these examples, the meme about LLMs being like a shoggoth (an amorphous blob-like monster originating in the speculative fiction of Howard Phillips Lovecraft) is surprisingly accurate. The trouble with these metaphors, though, is that it is about as hard to reason about such organisms as it is to reason about LLMs, so using them as a lens for thinking about LLMs won’t work. The shoggoth is even less helpful, because the reference will only be familiar to those who know their H.P. Lovecraft.

So perhaps we should abandon metaphorical thinking and think historically instead. LLMs are a new language technology. As with previous technologies, such as the printing press, their introduction changes our relationship to language. How does this change occur?

I think the change is dialectical. First, we have a relationship to language that we recognize as our own. Then a new technology destabilizes this relationship, alienating us from our language practice. We no longer see our own hand in it. And we experience a lack of control over language practice. Finally, we reappropriate this language use in our practices. In this process of reappropriation, language practice as a whole is transformed. And the cycle begins again.

For an example of this dialectical transformation of language practice under the influence of new technology, we can take Eisenstein’s classic account of the history of the printing press (1980). Following its introduction, many things changed about how we relate to language. Our engagement with language shifted from a primarily oral one to a visual and deliberative one. Libraries became more abundantly stocked, leading to the practice of categorizing and classifying works. Preservation and analysis of stable texts became possible. The solitary reading experience gained prominence, producing a more private and personal relationship between readers and texts. Concerns about information overload first reared their head. All of these things were once new and alien to humans; now we consider them part of the natural order of things. They weren’t predetermined by the technology; they emerged through an active tug of war between groups in society over what the technology would be used for, mediated by the affordances of the technology itself.

In concrete material terms, what does an LLM consist of? An LLM is just numerical values stored in computer memory. It is a neural network architecture consisting of billions of parameters, weights and biases, organized in matrices. The storage is distributed across multiple devices. System software loads these parameters and computes inferences with them. This all runs in physical data centers housing computing, power, cooling, and networking infrastructure. Whenever people start talking about LLMs having agency or being able to reason, I remind myself of these basic facts.
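To make that concreteness vivid, here is a toy sketch in Python with NumPy. It is not any real model’s architecture; the sizes and names are made up for illustration. The point is only that “the model” is arrays of numbers and “inference” is arithmetic performed on them.

```python
import numpy as np

# A toy "model": nothing but floating-point numbers organized in matrices.
# Sizes and names are illustrative, not any real LLM's architecture.
rng = np.random.default_rng(0)
VOCAB, DIM = 100, 16  # real models: vastly larger, billions of parameters

params = {
    "embed": rng.normal(size=(VOCAB, DIM)),    # token embedding matrix
    "w": rng.normal(size=(DIM, DIM)),          # one weight matrix
    "b": np.zeros(DIM),                        # one bias vector
    "unembed": rng.normal(size=(DIM, VOCAB)),  # output projection
}

def next_token_logits(token_id: int) -> np.ndarray:
    """'Inference': look up stored numbers, multiply, add. Nothing more."""
    h = params["embed"][token_id]               # the token's vector
    h = np.tanh(h @ params["w"] + params["b"])  # one toy "layer"
    return h @ params["unembed"]                # scores over the vocabulary

print(int(next_token_logits(42).argmax()))  # index of the highest-scoring next token
```

Everything an LLM does is, at bottom, this kind of arithmetic repeated at enormous scale.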

A printing press, although a cleverly designed, engineered, and manufactured device, is similarly banal when you break it down to its essential components. Still, the ultimate changes to how we relate to language have been profound. From these first few years of living with LLMs, I think it is not unreasonable to expect similar upheavals. What is important for me is to recognize how we become alienated from language, and to see ourselves as having agency in reappropriating LLM-mediated language practice as our own.