


Allison Parrish



Machine learning typically models language as a sequence of tokens. Such models assign a probability to a given sequence of tokens: "Let's take a walk in the park" has a higher probability than "Let's take a walk in the ocean," which in turn has a higher probability than "Walk park let's a take the in." This is how autocomplete works, whether in the Google search box or in QuickType suggestions on your iPhone. The model that drives those technologies considers the words you've typed, then shows the words most likely to come next in that sequence. By repeatedly chaining such predictions together, you can generate a new text from scratch.





This technique operationalizes an intuition that we have about language: that it can be predictable, or it can be weird. A sequence of words with high probability draws no attention to itself; a sequence of words with low probability—say, a poetic juxtaposition—requires effort to interpret. "Juxtaposition makes the reader an accomplice," writes Charles Hartman in Virtual Muse. "[W]e supply a lot of energy, and that involves us in the poem." [1]


Energy, it turns out, is at the heart of machine learning models of language. It takes energy (in the form of computation) to train the models, of course, and energy to assign probability to a sequence of tokens. But energy—in the form of heat—is also an important metaphor for how programmers use language models to generate text. The quality of text generated with a language model often improves if words are chosen by sampling from the model's probability distribution, instead of simply picking the most likely word. In machine learning, this is called softmax sampling, and the sampling process can be adjusted with a parameter called temperature. When the temperature parameter is low, the model is more likely to choose tokens that were already likely; when the parameter is high, the model attenuates the probabilities of the likeliest tokens and boosts those of less likely tokens.


Here's an example of how it works. Consider the phrase "Let's take a walk in the ____." Imagine the model has determined that the word filling in the blank will be "park" 50% of the time, "woods" 30% of the time, and "forest" 15% of the time, with the probabilities of several thousand other words combining to make up the remaining 5%. A model that uses softmax sampling will choose randomly among these words, weighted by their respective probabilities. When the softmax temperature parameter is low, the probability of "park" will be further boosted, and the probability of the remaining words diminished ("park" might be selected 90% of the time, "woods" 5%, "forest" 1%, etc.). As the softmax temperature parameter increases, however, the probabilities of the tokens even out, until no one token is more likely to be chosen than any other ("park" has no more chance of being chosen than "woods" or "alabaster" or "amongst" or "pikachu").
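The rescaling described above can be sketched in a few lines of Python. This is a minimal illustration, not the program behind the poems; the toy distribution below is the one from the example, with the long tail of several thousand other words lumped into a single "other" outcome.

```python
import math
import random

def temperature_weights(probs, temperature):
    """Rescale a probability distribution by a softmax temperature.
    Low temperatures sharpen the distribution around its likeliest
    tokens; high temperatures flatten it toward uniform."""
    logits = [math.log(p) / temperature for p in probs.values()]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return {tok: e / total for tok, e in zip(probs, exps)}

def sample_with_temperature(probs, temperature):
    """Pick one token at random, weighted by the rescaled distribution."""
    weights = temperature_weights(probs, temperature)
    return random.choices(list(weights), weights=list(weights.values()))[0]

# The distribution from the example above ("other" lumps the long tail).
probs = {"park": 0.50, "woods": 0.30, "forest": 0.15, "other": 0.05}
```

At a temperature of 0.25, "park" is chosen the vast majority of the time; at a temperature of 10, each of the four outcomes comes up close to a quarter of the time; at a temperature of 1, the original probabilities are unchanged.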


A low temperature will tend to produce text with predictable, repetitive language, while a high temperature will produce text that careens wildly from one non-sequitur juxtaposition to the next. A programmer making use of the model can adjust the softmax temperature parameter to suit their own aesthetic purposes.


The use of the word temperature comes from thermodynamics and statistical mechanics, where it is closely tied to a system's entropy. As the temperature increases, the number of possible states the system can occupy also increases; as the temperature decreases, that number decreases. A simple example is the phase changes of matter: ice, a solid, has a predictable, crystalline structure; as heat is applied, it becomes first liquid water, and then water vapor, both of which take on a variety of shapes and arrangements less predictable than their solid counterpart.


The Epilith Poems attempt to recapitulate in language how crystalline structures come to life, using the processes of softmax sampling in machine learning language models as a method. I was inspired by lichens, which are among a handful of microbial organisms that live on rocks and are "believed to be involved in the weathering of rocks by promoting mineral diagenesis and dissolution" [2]; through this process, according to Merlin Sheldrake, "[l]ichens are how the inanimate mineral mass within rocks is able to cross over into the metabolic cycles of the living."[3] 

Lichens introduce entropy to the structure of rock, breaking down its crystalline structure and making the nutrients contained therein available to other forms of life.


The opening stanzas of the poems are generated with a computer program that selects words from a language model based on their frequency in a corpus of Wikipedia pages related to detritus. In each stanza, the sampling temperature increases, producing text with more and more unusual juxtapositions.
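A process like the one just described might be sketched as follows, assuming a simple unigram frequency model and a fixed temperature schedule. The corpus, schedule, and stanza shape here are hypothetical stand-ins; the actual program and its Wikipedia corpus are not reproduced here.

```python
import collections
import math
import random

def generate_stanzas(corpus_text, n_stanzas=5, words_per_stanza=12):
    """Generate stanzas of words sampled from a unigram frequency
    model of the corpus, raising the softmax temperature with each
    stanza so later stanzas favor rarer and rarer words."""
    counts = collections.Counter(corpus_text.lower().split())
    words = list(counts)
    total = sum(counts.values())
    logits = [math.log(counts[w] / total) for w in words]
    stanzas = []
    for i in range(n_stanzas):
        temperature = 0.25 * (i + 1)  # hypothetical schedule: 0.25, 0.5, ...
        scaled = [x / temperature for x in logits]
        m = max(scaled)  # subtract the max for numerical stability
        weights = [math.exp(x - m) for x in scaled]
        stanzas.append(" ".join(
            random.choices(words, weights=weights, k=words_per_stanza)))
    return stanzas
```

At the lowest temperature, the sampler all but collapses onto the corpus's most frequent word, which is why the opening stanzas above read "The the the the"; as the temperature rises, rarer words like "tributyrin" start to surface.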


The second half of each poem is an attempt to impose equilibrium on this fertile mess, returning it to a state of lower entropy. Through several repetitions, words in the stanza are replaced at random with words that a neural network language model considers to be more likely in that context. This continues until no word has a more likely alternative according to the model.
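The settling process in the second half might be sketched like this. It is a simplified deterministic sweep (the poems replace words at random), and `best_word` is a toy stand-in for the masked language model that does the real scoring:

```python
def settle(words, best_word, max_rounds=100):
    """Repeatedly replace each word with the alternative a language
    model considers more likely in that context, until no word
    changes (a fixed point of lower entropy) or a round limit is hit.
    `best_word(words, i)` stands in for a masked language model's
    top prediction at position i."""
    words = list(words)
    for _ in range(max_rounds):
        changed = False
        for i in range(len(words)):
            candidate = best_word(words, i)
            if candidate != words[i]:
                words[i] = candidate
                changed = True
        if not changed:
            break  # equilibrium: no word has a likelier alternative
    return words

def toy_best(words, i):
    # Toy stand-in for the model: pretend "the" is always the
    # likeliest word in every context.
    return "the"
```

In practice, `best_word` would mask out position i and ask a masked language model (such as DistilBERT, via a fill-mask interface) for its top prediction, rather than use the toy function above.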

Whereas the statistical probabilities in the first half of the poem are drawn from a corpus that I selected by hand, the probabilities in the second half are drawn from a large commercial language model (Hugging Face's DistilBERT [4]) that incorporates statistical information from many gigabytes of uncurated text. This model is similar to those used in search engine autocomplete algorithms. The intent here is to show how such commercial language models tend to smooth over high-energy poetic juxtapositions, resulting in predictable structures that perhaps invite future applications of algorithmic fungal entropy.


[1] Hartman, Charles O. “Start with Poetry.” Virtual Muse: Experiments in Computer Poetry, Wesleyan University Press, 1996, pp. 16–27.

[2] Burford, Euan P., et al. “Geomycology: Fungi in Mineral Substrata.” Mycologist, vol. 17, no. 3, Aug. 2003, pp. 98–107.

[3] Sheldrake, Merlin. Entangled Life: How Fungi Make Our Worlds, Change Our Minds and Shape Our Futures. Random House, 2020.

[4] Sanh, Victor, et al. “DistilBERT, a Distilled Version of BERT: Smaller, Faster, Cheaper and Lighter.” ArXiv:1910.01108, 2019.

Epilith #1 (“Moths”)


The the the the

the the the the

the the the in.


A and of a

of the the to

to the the the.


Moths and or more

and a be the

of often that few.


Moths and or litter

to and exploited a

its applied it or.


Moths and or litter

to peterson a its

applied it hydrogen a.


Moths due gives resource

sources lead towards bursting

other specify seeds nis.


Moths due gives resource

manhattan tributyrin varro hands

lacked usability carpeted odors.


Moths due to resource

manhattan tributyrin finnish hands

lacked usability carpeted carpets.


Moths belonging to the

manhattan tributyrin finnish hands

lacked suitability carpeted carpeting.


Mos belonging to the

manhattan turbutyrite finnish hands

lacked suitably carpeted carpeting.


Carpets belonging to the

manhattan timbuthwaite finnish hands

lacked suitably carpeted carpets.


Carpets belonging to the

manhattan timbutton & finnish firm

lacked suitably patterned carpets.


Carpets belonging to the

manhattan wimbleton & danish firm

produced suitably patterned carpets.


Carpet manufacturers belonging to the

manhattan wimblett & davis

company produced suitably patterned carpets.




Epilith #2 (“Bergstrom cows”)


The the the the

the the the the

the the the the.


The in of to

ecosystem and the the

of of use and.


The in of to

rapidly in of species

which of of be.


That or is them

in of species process

of the or infect.


Have escape from ireland

species as the on

the most growing and.


Have approved tomatoes decade

broths berryman drives invade

moves aware harmed senecio.


Have approved tomatoes decade

broths berryman cows aware

harmed senecio shifted resemble.


Federally approved tomatoes decade

brooks berryman cows aware

harmed senegal shifted resemble.


Federally approved james decade

brooks bergman cows aware

harmonized senegal shifted resemble.


Federally protected james decade

brooks bergstrom cows aware

harmonized senegal shifted wildlife.


Federally protected jamaic decade

brooks bergstrom cows

aware harmonized senegalese wildlife.


Federally protected jamaica

brooks bergstrom cows from

harming senegalese wildlife.


Federally protected jamaica brook

prevents beaverstrom trout

from harming senegalese wildlife.


Federally protected jamaica brook

prevents beaver rainbow trout from

harming senegal salmon populations.

Allison Parrish is a computer programmer, poet, educator and game designer whose teaching and practice address the unusual phenomena that blossom when language and computers meet. She is an Assistant Arts Professor at NYU's Interactive Telecommunications Program, where she earned her master's degree in 2008. Read more about her work at
