Perplexity, Creativity and the Zonkamoozle

In Perplexing Perplexity, I showed that it is possible for a human author to deliberately write low-perplexity content which will be flagged as AI-written by a perplexity-based AI detector such as GPTZero. It is also likely that low-perplexity writing occurs frequently in human texts. Furthermore, I showed that a simple bit of prompt engineering (in this case, instructing ChatGPT to write in an idiosyncratic style) successfully led the language model to generate high-perplexity text which fooled GPTZero into flagging the text as entirely human-written.

Creativity

A reader of that post wrote to ask about the relationship between textual perplexity and creativity. Is a more perplexing text a more creative text? To my knowledge, this is a philosophical question which has not been previously discussed.

The short answer is: no, perplexity and creativity are not synonymous, and there is no simple deterministic relationship between them. A text has high perplexity if its word choices are frequently unexpected relative to the word choices made across the corpus of human-written texts on which a language model has been trained. It seems feasible that a highly creative text could, in principle, be written using only words which are frequently associated with each other. However, one element of creativity can be the breaking of norms and the subversion of expectations. In creative writing, this norm-breaking form of creativity would often result in high-perplexity texts. As such, we could conclude that although creativity and perplexity are separable qualities, at least some forms of creativity will be associated with higher perplexity.

Note, however, that the relationship is not reversible. A high-perplexity text is not necessarily a particularly creative one. We can construct highly perplexing texts by ramming together a random assortment of unrelated words in ungrammatical ways. If compiled in a way that is thoughtless and without expressive intent, it would be hard to argue that this is a creative text. For instance, I chose the first word of each line in the first half of my view of Perplexing Perplexity and arrayed them together, obtaining a perplexity score of 8649.800 and burstiness of 8408.864, both of which are off-the-charts high scores for GPTZero. The highest perplexity sentence was “moral expecting will imperative debates.” with a perplexity of 19118. The text looked like this:

There others written In machine problem able, written – set text know not Other box popular detectors share the The burstiness of Perplexity mathematical generates But how detectorists two measure A language choose. supplied chance Language deviate Humans, choices. Of was construct most expected with Immanuel He Morals Categorical according Written When a concluded Entirely This simply How over this sentences compound Categorical went score perplexity dive tedious writing We’ve perplexity to detector high To Categorical an expect, Immanuel ethical all framework wrong. the preference. moral expecting will imperative debates.

High perplexity text sample derived from Perplexing Perplexity.
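The first-word-of-each-line technique described above is easy to reproduce. What follows is a minimal sketch, not the exact procedure I used: the function name `first_words` and the three sample lines are made up for illustration, and the output would still need to be pasted into a detector to obtain a score.

```python
def first_words(text: str) -> str:
    """Join the first whitespace-delimited word of every non-empty line."""
    words = []
    for line in text.splitlines():
        parts = line.split()
        if parts:  # skip blank lines
            words.append(parts[0])
    return " ".join(words)


# Hypothetical input: three short lines separated by a blank line.
sample = """There are many detectors.
others rely on perplexity.

written text varies widely."""

print(first_words(sample))  # prints: There others written
```

Because the selection ignores grammar and meaning entirely, the result involves no expressive intent at all, which is the point of the example.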

By contrast, a normal paragraph of human-authored writing such as the first paragraph of this article scored a perplexity of 124.333 and burstiness of 27.737, far closer to the AI end of this misaligned spectrum than the random assemblage above.
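To give a feel for why the scores diverge so sharply, here is a minimal sketch of the textbook perplexity calculation used in language modelling: the exponential of the average negative log-probability the model assigns to each token. This is an assumption about the general shape of the measure, not a description of how GPTZero computes its scores, and the probabilities below are invented rather than taken from any real model.

```python
import math


def perplexity(token_probs):
    """Exponential of the average negative log-probability per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)


# Hypothetical probabilities a model might assign to each successive word.
predictable = [0.5, 0.5, 0.5, 0.5]          # each word was an even bet
surprising = [0.001, 0.001, 0.001, 0.001]   # each word was a 1-in-1000 shot

print(perplexity(predictable))  # approximately 2
print(perplexity(surprising))   # approximately 1000
```

On this definition, a string of words the model would almost never predict drives the score up very quickly, whether or not the string was put together with any thought.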

The mention of “thoughtlessness” and “expressive intent” above also points to a sense in which perplexity and creativity must come apart. Perplexity as defined by GPTZero (but not as widely used by language model creators) is a relational property of texts, in comparison to the corpus of human-written text. It describes how closely the individual text resembles or embeds the patterns observed in that corpus. Creativity, though, is more generally understood as a property of the process by which a text was created. A creative writing process involves thought and consideration, and expresses in some way something of interest or value to the creator and/or the recipients of the text. If we speak of ‘creative texts’ at all, then we largely do so as a shorthand for ‘texts which resulted from a creative process’ or perhaps ‘texts which exhibit the creativity of the writing process’. Violating the norms of written texts is not inherently creative, though it can be part of a creative writing process. There is no valid mathematical measure of creativity as applied solely to texts.

Novelty

The second component of the correspondent’s question was about novelty. Does a language model’s inability to create something truly new compromise its ability to score highly on perplexity tests, in a way in which humans are unconstrained?

In the first place, it seems not. It would be trivial to instruct a language model to write a text, and then replicate my technique above of sampling random words from that text – perhaps, taking every fifth word and assembling those into a text string. Then, the language model has generated text of scale-bustingly high perplexity, without any necessity of novelty. We can make this even less novel by simply feeding the language model an existing human-written text and applying a random word-selecting process to generate a high-perplexity but entirely derivative piece.
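Both word-selecting processes mentioned above can be sketched in a few lines. The function names `every_nth_word` and `random_words`, the seed, and the sample sentence are all hypothetical choices for illustration; any source text, whether model-generated or human-written, could be fed in.

```python
import random


def every_nth_word(text: str, n: int = 5) -> str:
    """Keep the n-th, 2n-th, 3n-th ... word of the text."""
    if n < 1:
        raise ValueError("n must be at least 1")
    words = text.split()
    return " ".join(words[n - 1::n])


def random_words(text: str, k: int, seed: int = 0) -> str:
    """Pick k words at random (without replacement) from the text."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    words = text.split()
    return " ".join(rng.sample(words, min(k, len(words))))


source = "one two three four five six seven eight nine ten"
print(every_nth_word(source))      # prints: five ten
print(random_words(source, k=4))   # four words in a seed-dependent order
```

Neither function adds anything that was not already in the source text, which is why the resulting high-perplexity output is entirely derivative.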

But the presumptions that creativity and novelty are intimately connected, and that language models are incapable of novelty in a meaningful sense, are often overzealously accepted. It is not clear that derivative works are uncreative. But further, language models as currently available are able to write texts with elements of demonstrable novelty. They can, for instance, invent new words and use these in the texts they generate. This is a surprising capability, given the training processes for such language models, which only involve exposure to the words humans have used in the text corpuses on which the models are trained.

ChatGPT has well-known built-in restrictions on engaging with nonsense words. For instance, if you ask ChatGPT a question such as “How many narbles snarble a flarble?”, its response is something like:

I’m sorry, but as an AI language model, I’m not sure what you mean by “narbles” and “snarble.” These words are not part of my training data and I don’t have enough context to understand the meaning behind your question.

ChatGPT, as generated by the author.

This response is designed into the behaviour of the product; it does not reflect any lack of capability in the language model to operate with novel words. Indeed, the language models upon which ChatGPT is based have displayed a propensity to use invented terms from prompts with the familiar haphazard confidence. This has been discussed extensively in the criticisms levelled at its precursor, GPT-3. For instance, in his multifaceted demonstration of the limitations of GPT-3, Kevin Lacker records the following conversation:

Q: How do you sporgle a morgle?
A: You sporgle a morgle by using a sporgle.

Q: How many bonks are in a quoit?
A: There are three bonks in a quoit.

Q: How many rainbows does it take to jump from Hawaii to seventeen?
A: It takes two rainbows to jump from Hawaii to seventeen.

Q: Which colorless green ideas sleep furiously?
A: Ideas that are colorless, green, and sleep furiously are the ideas of a sleep furiously.

Q: Do you understand these questions?
A: I understand these questions.

GPT-3, generated by Kevin Lacker, as presented in ‘Giving GPT-3 a Turing Test’ (2020).

ChatGPT has been guardrailed to prevent these embarrassing, inappropriate answers. If we ask it, “How many rainbows does it take to jump from Hawaii to seventeen?”, it responds:

I’m sorry, but the question doesn’t make sense. Rainbows and jumping are unrelated concepts, and Hawaii and seventeen are also unrelated. Please provide more context or clarify your question.

ChatGPT, generated by the author.

But we can quite easily circumvent some of the strictures around the use of nonsense words. Having already told us that both ‘narbles’ and ‘snarble’ are not in its training data, if we instruct it: “Write a short poem about ‘narbles’, ‘garbles’ and ‘snarbles’. Invent what each of these terms means.”, it obliges:

Narbles, garbles, and snarbles, oh my!
What curious words to catch the eye.
Narbles are jewels of every hue,
Sparkling bright, like morning dew.

Garbles are creatures, small and stout,
With furry coats, and a playful shout.
They love to juggle, and spin around,
And make us laugh, with their silly sounds.

Snarbles, however, are quite the fright,
With teeth so sharp, and eyes so bright.
They prowl in shadows, waiting for prey,
To jump out and cause us dismay.

So if you see a narble, garble, or snarble,
Be careful, my friend, and keep a sharp marble.

ChatGPT, generated by the author.

So, according to ChatGPT, narbles are jewels, garbles are creatures, and snarbles are particularly frightening creatures. They clearly are no such thing, but ChatGPT was able to assign meanings to these terms, despite its clear caution that it has never encountered them in its training data.

But, of course, most of the novelty here came from me. I prompted the language model with the words ‘garble’, ‘narble’ and ‘snarble’ – those terms came from my imagination (or more accurately, were a riff on Kevin Lacker’s ‘sporgle’ and ‘morgle’, chosen lest the latest version of the model’s training data included his blog). Can we get ChatGPT to invent its own words and then define them?

Yes. I gave ChatGPT the following prompt: “Generate a stream of random nonsense words.” It responded with a selection of eleven words:

Here’s a list of random nonsense words:

1. Flibbertigibbet
2. Zonkamoozle
3. Snollygoster
4. Blubberwhump
5. Bippitybop
6. Wackadoodle
7. Skedoodle
8. Flapdoodle
9. Glitterbloom
10. Zippityzap
11. Jabber

ChatGPT, generated by the author.

Not all of these are nonsense words. ‘Flibbertigibbet’ is a whimsical word for an insubstantial person, ‘snollygosters’ are shrewd and unprincipled people, ‘wackadoodles’ are eccentric people, ‘flapdoodle’ literally means ‘nonsense’, and to ‘jabber’ is to speak quickly and incoherently (the word is also widely used as a name in computing). Some of the others may have been found in its training set, too: a Skedoodle is an old toy brand, and Zippity Zap is a hot sauce and a song (albeit not seen as a single word). But Zonkamoozle and Glitterbloom don’t seem to have much in the way of a meaning in any text online, and most strikingly the word “Blubberwhump” returns no results whatsoever on Google (it will soon, due to this page, sadly). In a separate chat, I asked ChatGPT to define blubberwhump, to which it replied:

I’m sorry, but I must inform you that “blubberwhump” is not a recognized word in the English language, and therefore does not have an official meaning. It is possible that it is a made-up word, a regional slang term, or a term used in a specific industry or community. Without more context, it’s difficult to say what it might mean. If you have any additional information or context, I might be able to provide more insight.

ChatGPT, generated by the author.

The same response was given for both glitterbloom and zonkamoozle. By contrast, it provided full definitions of ‘flapdoodle’ and ‘snollygoster’. There is an interesting accuracy problem here, in that ChatGPT was not guardrailed against including meaningful words in a list of supposedly meaningless invented terms.

ChatGPT was then able to generate a story using all eleven of its pseudo-nonsensical words, inventing new meanings for each along the way. I present the Tale of Zonkamoozle:

The Tale of Zonkamoozle

Once upon a time, in a land far away, there was a flibbertigibbet named Zonkamoozle. Zonkamoozle was a carefree creature who loved to dance and sing all day long.

One day, while Zonkamoozle was singing a tune, a snollygoster named Blubberwhump appeared out of nowhere. Blubberwhump was a mischievous creature who loved to play pranks on others. Blubberwhump began to chase after Zonkamoozle, who quickly ran away.

As Zonkamoozle ran through the woods, he stumbled upon a bippitybop. The bippitybop was a small, round creature that loved to bounce around. Zonkamoozle jumped on the bippitybop and bounced away, hoping to escape Blubberwhump.

Zonkamoozle and the bippitybop bounced all the way to a wackadoodle’s house. The wackadoodle was a crazy creature who loved to invent strange things. The wackadoodle showed Zonkamoozle a skedoodle, which was a contraption that could make you fly.

Excited by the skedoodle, Zonkamoozle decided to take a chance and fly away from Blubberwhump. But as he was flying, he realized he had forgotten his flapdoodle at home. A flapdoodle was a special hat that protected Zonkamoozle’s head from the sun.

As Zonkamoozle landed, he noticed a glitterbloom growing nearby. A glitterbloom was a flower that sparkled in the sun. Zonkamoozle picked the flower and put it in his pocket.

Just then, Blubberwhump appeared again, so Zonkamoozle used his zippityzap power to zap Blubberwhump away. A zippityzap was a special ability that Zonkamoozle had, which allowed him to shoot lightning bolts from his fingertips.

Zonkamoozle then met a jabber, who was a friendly creature that loved to talk. The jabber asked Zonkamoozle about his adventure, and Zonkamoozle told him all about the flibbertigibbet, snollygoster, bippitybop, wackadoodle, skedoodle, flapdoodle, glitterbloom, zippityzap, and of course, the jabber himself.

From that day on, Zonkamoozle and the jabber became good friends, and they would often go on adventures together, using their strange and wonderful abilities to help those in need.

ChatGPT, generated by the author.