Google’s NotebookLM is a language-model-driven AI tool which can analyse documents and webpages, and rapidly produce useful summaries, answer questions about the content, create study guides, and – the option that has captivated much online attention – automatically generate podcast-style “deep dives”. These deep dives are generally 5–15 minutes in length, creating a dialogue between two AI-generated speakers. The lifelike speech patterns and ability to discuss the content of a page or article in an engaging way have made these both interesting technological curios and useful study tools.
I share here three such “deep dive” versions of my own academic articles, in the hopes that they might prove a more digestible and listener-friendly alternative to the papers themselves (and even inspire an additional reader to take the plunge with the real deal):
However, we must remain cautious when faced with these compelling GenAI podcasts. Even in my own experiments with my papers, above, a few hallucinatory elements are evident. Interestingly, these followed a pattern: the “Deep Dive” podcast format has a tendency towards romanticising the subject material. There seems to be a format-driven preference for a happy ending or a life lesson to be learned, or at least a practical suggestion to follow. Not for NotebookLM the relentless doom and irredeemable despair of Behind the Bastards. It would rather end on an upbeat note.
Perhaps for that formatting reason, the Deep Dives above diverge from the subject matter of my papers later in the runtime. After discussing the tendency to oversimplify the complexities of medical evidence appraisal that is central to the argument of The Pyramid Schema, for instance, the deep dive takes the mention that the GRADE system allows for a bit more nuance in evidence ranking as a full-throated endorsement of that system as an alternative. It wraps up by saying that we should take evidence pyramids as a valuable starting point but apply our critical thinking faculties and embrace complexity. In reality, while the paper makes it clear that more sophisticated hierarchies like GRADE don’t suffer from the same level of reductionist thinking as evidence pyramids, it stops a long way short of endorsing those (still deeply flawed) models.
Similarly, in The Dismal Disease, the deep dive reverses the conclusion of the paper. That paper showed how evidence from a post-trial subgroup analysis by Monika Hegi and colleagues had been marginalised and ignored in favour of over-reliance on the results of the trial’s main analysis, largely due to the flawed way that Evidence-Based Medicine ranks and appraises evidence. But the deep dive tells a different story with a happier ending: that Hegi et al.’s daring re-analysis provided the medical community with the evidence it needed to reform the standard of care. That’s far from the reality, and certainly not the lesson I’d draw from the case of Temozolomide at the centre of the paper. When the deep dive remarks that this episode has been a “heavy one”, perhaps there’s a modicum of recognition that the paper is a bleak case study. But that bleakness has been toned down for the popular-audience stylings of the deep dive.
In conclusion, then, take the Deep Dive hosts’ advice from their rendition of The Pyramid Schema: treat these artefacts with critical thinking faculties on full beam, and check carefully that any information is accurately represented before making decisions or assumptions based on a deep dive. They might go as far as to invert a crucial claim for the sake of a satisfying conclusion.