The Stochastic Masquerade and the Streisand Effect

A few weeks ago, Google fired one of their most credible senior AI ethics researchers, Margaret Mitchell, founder and former co-lead of their ethical AI team. This follows their firing of Timnit Gebru, another highly respected AI ethics researcher, in December. Both Mitchell and Gebru have been leading voices in the critique of the biases which AI can embed and exacerbate, as well as the ways in which the overwhelmingly white-male demographics of AI research can marginalise those who could anticipate and address such biases.

Google’s actions show a baffling ignorance of the Streisand Effect. In 2003, the California Coastal Records Project were attempting to document coastal erosion. They inadvertently photographed Barbra Streisand’s residence in Malibu. Streisand’s legal team’s heavy-handed attempts to suppress the image (including suing the photographer for $50 million) led to a massive upswing in traffic, driving people to view, share and archive the photo. When an attempt to suppress information becomes public, it drives greater attention to that information: that is the Streisand Effect. Firing Timnit Gebru, and now Margaret Mitchell, has driven attention both to Google’s ethical failures, and to the biases, environmental harms and dangers which Gebru and Mitchell have worked to highlight.

This is where my work in designing and teaching LSE100 comes in. LSE100 is a course taken by every LSE undergraduate in the first year of their degree, regardless of programme. It is in moments like this that I can delight in being able to say that every undergraduate enrolled at LSE in 2018 and 2019 took a module entitled “Can we control AI?”. That module foregrounded biases in AI, the importance of the demographics and structure of the tech industry, and the potential dangers of AI in sectors from criminal justice and medicine to surveillance and warfare. In particular, we focused on the use of giant language models such as GPT-2. Every student was able to experiment with this language model, see the text it can produce, and use social scientific frameworks to analyse the impact that such language models could have on truth, trust and society.

But what we did not have, in our courses for students in 2018-2020, was the paper On the dangers of stochastic parrots: can language models be too big? written by Gebru and Mitchell, alongside co-authors Emily Bender and Angelina McMillan-Major. This paper, as yet unpublished but widely available online, is the eye of the storm for the firing of both Gebru and Mitchell (the latter of whom uses the wonderful pseudonym ‘Shmargaret Shmitchell’ on the circulated version). Despite some blundering attempts at smear and disinformation which approaches gaslighting, it became clear to the AI ethics community that Gebru was fired in part over the internal ire roused by this particular paper. It appears that, at least to some at Google, the answer to ‘Can language models be too big?’ is to silence anyone who dares to ask the question.

In 2021, LSE100 will again provide every new LSE undergraduate with a course on AI which foregrounds ethical, social and political challenges. But now, we’ll have the opportunity to have every student read On the dangers of stochastic parrots. It’s a perfect match: the authors’ clear explanation of the technical challenges involved in creating large language models, along with both environmental and bias concerns, will be of great value to our students. A quick unearned thank-you, then, to Google for bringing this important paper to everyone’s attention. Two thousand students might have missed out. It’s Malibu all over again – and for a sense of how that turns out, be aware that Streisand lost the lawsuit, and the photograph in question is plastered everywhere from Wikipedia to the top of this very page.

Stochastic Parrots raises at least two of the most important issues with giant language models. These language models are increasingly good at generating human-like text. But this comes with a significant environmental cost in the energy drain of training massive models, and with the very real risk that as these models get better at emulating human writing, they will at the same time learn to mimic, replicate and reinforce our biases. Researchers, including those at OpenAI who created GPT-2 and GPT-3, have also expressed fears about the potential malicious use of the tool, a dual use which I’ve discussed before.

But there is another significant danger to which medical scientists particularly must attend. As language models improve, we are closing in on the ability to train a model on a corpus of existing research to generate research papers which are near-indistinguishable from real research. Over the past few years, at conferences and workshops, I have touted a new list of challenges which form the core of a field of meta-medical science, my personal analogue for David Hilbert’s famed 23 problems of mathematics from 1900. While this list is not quite ready for daylight, it merits saying that the impact of giant language models will be felt, in particular, on one of these problems. I call this the Masquerading Problem.

We know that some studies in the medical literature are fraudulent, using invented data, merely masquerading as genuine reports of medical trials. It would be hubris to assume that we catch all of these. Moreover, many more studies will have some element of evidential masquerade, whether in fudging data, manipulating measures, or misrepresenting the methodological features of a study. The Masquerading Problem asks how good our existing systems (peer review as well as algorithms and tests to detect fake data) are at detecting masquerading of various kinds, and whether we can make our systems more sensitive to fraud and misrepresentation. The proliferation of giant language models – now and in the future – makes this task much more challenging. We must ask how robust our systems are at detecting masquerade, whether it arises from human deception or from automation.

AI can function as an amplifier. Patterns are detected, replicated, and ultimately those replications are often fed back into the algorithm or into the training data of the next generation of algorithms. Existing biases and harms get amplified along with the rest. Like the observer who wishes to know how the magic trick is done, we’d best pay close attention to exactly what the magician is trying to divert us away from. For that alone, Stochastic Parrots is perhaps 2021’s most required reading.

Featured image is copyright, 2002, of Kenneth & Gabrielle Adelman, California Coastal Records Project