

Epilith Poems
Allison Parrish
Introduction
Machine learning models language as a sequence of tokens. Such models of language assign a probability to a given sequence of tokens: "Let's take a walk in the park" has a higher probability than "Let's take a walk in the ocean" which in turn has a higher probability than "Walk park let's a take the in." This is how autocomplete works, whether in the Google search box or QuickType suggestions on your iPhone. The model that drives those technologies considers the words you've typed in, then shows the words that are most likely to come next in that sequence. By repeatedly chaining such predictions together, you can generate a new text from scratch.
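To make that chaining concrete, here is a minimal sketch in Python of the prediction loop, assuming a toy bigram model built from a two-sentence corpus; the models behind real autocomplete are vastly larger, but the loop is the same.

from collections import Counter, defaultdict

# A toy corpus; real autocomplete systems learn from vastly more text.
corpus = "let's take a walk in the park . let's take a walk in the woods ."
tokens = corpus.split()

# Count which word follows which: the simplest possible model of
# "the probability of the next token" (a bigram model).
successors = defaultdict(Counter)
for current, following in zip(tokens, tokens[1:]):
    successors[current][following] += 1

def autocomplete(start, length=8):
    out = [start]
    for _ in range(length):
        options = successors[out[-1]]
        if not options:
            break
        out.append(options.most_common(1)[0][0])  # take the likeliest next word
    return " ".join(out)

print(autocomplete("let's"))  # chains predictions together into a new text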
This technique operationalizes an intuition that we have about language: that it can be predictable, or it can be weird. A sequence of words with high probability draws no attention to itself; a sequence of words with low probability—say, a poetic juxtaposition—requires effort to interpret. "Juxtaposition makes the reader an accomplice," writes Charles Hartman in Virtual Muse. "[W]e supply a lot of energy, and that involves us in the poem." [1]
Energy, it turns out, is at the heart of machine learning models of language. It takes energy (in the form of computation) to train the models, of course, and energy to assign probability to a sequence of tokens. But energy—in the form of heat—is also an important metaphor for how programmers use language models to generate text. The quality of text generated with a language model often improves if words are chosen by sampling from the model's probability distribution, instead of simply picking the most likely word. In machine learning, this is called softmax sampling, and the softmax sampling process can be adjusted with a parameter called temperature. When the temperature parameter is low, the model is more likely to choose tokens that were already likely; when the parameter is high, the model attenuates the probabilities of the likeliest tokens and boosts those of less likely tokens.
Here's an example of how it works. Consider the phrase "Let's take a walk in the ____." Imagine the model has determined that the word filling in the blank will be "park" 50% of the time, "woods" 30% of the time, and "forest" 15% of the time, with the probabilities of several thousand other words combining to make up the remaining 5%. A model that uses softmax sampling will choose randomly among these words, weighted by their respective probabilities. When the softmax temperature parameter is low, the probability of "park" will be further boosted, and the probability of the remaining words diminished ("park" might be selected 90% of the time, "woods" 5%, "forest" 1%, etc.). As the softmax temperature parameter increases, however, the probabilities of the tokens even out, until no one token is more likely to be chosen than any other ("park" has no more chance of being chosen than "woods" or "alabaster" or "amongst" or "pikachu").
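Here is a minimal sketch of that arithmetic in Python, assuming the several thousand other words are collapsed into a single bucket for brevity (the boosted percentages in the paragraph above are illustrative, so the printed numbers won't match them exactly):

import numpy as np

def apply_temperature(probs, temperature):
    # Equivalent to softmax(log(p) / T): low temperatures sharpen the
    # distribution, high temperatures flatten it toward uniform.
    logits = np.log(np.asarray(probs)) / temperature
    exp = np.exp(logits - logits.max())  # subtract the max for numerical stability
    return exp / exp.sum()

words = ["park", "woods", "forest", "(everything else)"]
base = [0.50, 0.30, 0.15, 0.05]

for t in (0.5, 1.0, 5.0):
    adjusted = apply_temperature(base, t)
    print(f"T={t}:", ", ".join(f"{w} {p:.0%}" for w, p in zip(words, adjusted)))

# To actually generate a word, sample from the adjusted distribution:
# np.random.choice(words, p=adjusted)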
A low temperature will tend to produce text with predictable, repetitive language, while a high temperature will produce text that careens wildly from one non-sequitur juxtaposition to the next. A programmer making use of the model can adjust the softmax temperature parameter to suit their own aesthetic purposes.
The use of the word temperature comes from thermodynamics and statistical mechanics, where it is closely tied to entropy in systems of constant composition. As the temperature increases, the number of possible states the system can occupy also increases; as the temperature decreases, the number of possible states shrinks. A simple example of this is the phase changes of matter: ice, a solid, has a predictable, crystalline structure; as heat is applied, it becomes first liquid water and then water vapor, both of which take on a variety of shapes and arrangements less predictable than those of their solid counterpart.
The Epilith Poems attempt to recapitulate in language how crystalline structures come to life, using the processes of softmax sampling in machine learning language models as a method. I was inspired by lichens, which are among a handful of microbial organisms that live on rocks and are "believed to be involved in the weathering of rocks by promoting mineral diagenesis and dissolution" [2]; through this process, according to Merlin Sheldrake, "[l]ichens are how the inanimate mineral mass within rocks is able to cross over into the metabolic cycles of the living."[3]
Lichens introduce entropy to the structure of rock, breaking down its crystalline structure and making the nutrients contained therein available to other forms of life.
The opening stanzas of the poems are generated with a computer program that selects individual words from a language model based on their frequency in a corpus of Wikipedia pages related to detritus. With each stanza, the temperature of the sampling increases, leading to text with more and more unusual juxtapositions.
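A hypothetical reconstruction of that first-half procedure, assuming a simple unigram (word-frequency) model; the corpus file name, the stanza length (twelve words, matching the three four-word lines of the poems), and the temperature schedule here are stand-ins, not Parrish's actual code:

import numpy as np

# "detritus_corpus.txt" is an assumed stand-in for the Wikipedia pages;
# the temperature schedule below is likewise assumed for illustration.
words = open("detritus_corpus.txt").read().lower().split()
vocab, counts = np.unique(words, return_counts=True)
base_probs = counts / counts.sum()

def stanza(temperature, rng, n_words=12):
    p = base_probs ** (1.0 / temperature)  # apply the softmax temperature
    p /= p.sum()
    return " ".join(rng.choice(vocab, size=n_words, p=p))

rng = np.random.default_rng()
# Each successive stanza is sampled at a higher temperature.
for i, t in enumerate((0.2, 0.4, 0.6, 0.8, 1.0, 1.3), start=1):
    print(f"stanza {i} (T={t}):", stanza(t, rng))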
The second half of each poem is an attempt to impose equilibrium on this fertile mess, returning it to a state of lower entropy. Through several repetitions, words in the stanza are replaced at random with words that a neural network language model considers to be more likely in that context. This continues until no word has a more likely alternative according to the model.
Whereas the statistical probabilities in the first half of the poem are drawn from a corpus that I selected by hand, the probabilities in the second half are drawn from a large commercial language model (Huggingface's DistilBERT [4]) that incorporates statistical information from many gigabytes of uncurated text. This model is similar to those used in search engine autocomplete algorithms. The intent here is to show how such commercial language models tend to smooth over high-energy poetic juxtapositions, resulting in predictable structures that perhaps invite future applications of algorithmic fungal entropy.
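A rough sketch of one way that replacement loop might look, assuming Hugging Face's fill-mask pipeline over the distilbert-base-uncased checkpoint and a stanza whose words all appear as single tokens in the model's vocabulary; the poems' actual program may tokenize and score differently:

import numpy as np
from transformers import pipeline

fill = pipeline("fill-mask", model="distilbert-base-uncased")
MASK = fill.tokenizer.mask_token

def settle(words, rng=None):
    # Swap words for likelier alternatives until no position has one,
    # assuming every word is a single token in the model's vocabulary.
    rng = rng or np.random.default_rng()
    words = list(words)
    changed = True
    while changed:  # convergence to a fixed point is assumed, as in the essay
        changed = False
        for i in rng.permutation(len(words)):  # visit positions in random order
            masked = " ".join(words[:i] + [MASK] + words[i + 1:])
            best = fill(masked, top_k=1)[0]             # model's likeliest filler
            here = fill(masked, targets=[words[i]])[0]  # score of the current word
            if best["token_str"] != words[i] and best["score"] > here["score"]:
                words[i] = best["token_str"]
                changed = True
    return " ".join(words)

print(settle("moths due gives resource sources lead towards bursting".split()))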


[1] Hartman, Charles O. “Start with Poetry.” Virtual Muse: Experiments in Computer Poetry, Wesleyan University Press, 1996, pp. 16–27.
[2] Burford, Euan P., et al. “Geomycology: Fungi in Mineral Substrata.” Mycologist, vol. 17, no. 3, Aug. 2003, pp. 98–107.
[3] Sheldrake, Merlin. Entangled Life: How Fungi Make Our Worlds, Change Our Minds and Shape Our Futures. Random House, 2020.
[4] Sanh, Victor, et al. “DistilBERT, a Distilled Version of BERT: Smaller, Faster, Cheaper and Lighter.” arXiv, arXiv:1910.01108, 2019.
Epilith #1 (“Moths”)
The the the the
the the the the
the the the in.
A and of a
of the the to
to the the the.
Moths and or more
and a be the
of often that few.
Moths and or litter
to and exploited a
its applied it or.
Moths and or litter
to peterson a its
applied it hydrogen a.
Moths due gives resource
sources lead towards bursting
other specify seeds nis.
Moths due gives resource
manhattan tributyrin varro hands
lacked usability carpeted odors.
Moths due to resource
manhattan tributyrin finnish hands
lacked usability carpeted carpets.
Moths belonging to the
manhattan tributyrin finnish hands
lacked suitability carpeted carpeting.
Mos belonging to the
manhattan turbutyrite finnish hands
lacked suitably carpeted carpeting.
Carpets belonging to the
manhattan timbuthwaite finnish hands
lacked suitably carpeted carpets.
Carpets belonging to the
manhattan timbutton & finnish firm
lacked suitably patterned carpets.
Carpets belonging to the
manhattan wimbleton & danish firm
produced suitably patterned carpets.
Carpet manufacturers belonging to the
manhattan wimblett & davis
company produced suitably patterned carpets.
Epilith #2 (“Bergstrom cows”)
The the the the
the the the the
the the the the.
The in of to
ecosystem and the the
of of use and.
The in of to
rapidly in of species
which of of be.
That or is them
in of species process
of the or infect.
Have escape from ireland
species as the on
the most growing and.
Have approved tomatoes decade
broths berryman drives invade
moves aware harmed senecio.
Have approved tomatoes decade
broths berryman cows aware
harmed senecio shifted resemble.
Federally approved tomatoes decade
brooks berryman cows aware
harmed senegal shifted resemble.
Federally approved james decade
brooks bergman cows aware
harmonized senegal shifted resemble.
Federally protected james decade
brooks bergstrom cows aware
harmonized senegal shifted wildlife.
Federally protected jamaic decade
brooks bergstrom cows
aware harmonized senegalese wildlife.
Federally protected jamaica
brooks bergstrom cows from
harming senegalese wildlife.
Federally protected jamaica brook
prevents beaverstrom trout
from harming senegalese wildlife.
Federally protected jamaica brook
prevents beaver rainbow trout from
harming senegal salmon populations.
Allison Parrish is a computer programmer, poet, educator and game designer whose teaching and practice address the unusual phenomena that blossom when language and computers meet. She is an Assistant Arts Professor at NYU's Interactive Telecommunications Program, where she earned her master's degree in 2008. Read more about her work at www.decontextualize.com.