The True Causal Effect

Do medical scientists need philosophers of medicine like birds need ornithologists? A quotation often attributed to Richard Feynman claims that “philosophy of science is as useful to scientists as ornithology is to birds”. Feynman, never the intellectual slouch, acutely discounts the value of philosophy of science to him, without actually claiming that philosophers are useless, valueless, or eliminable. After all, we do have ornithologists. The only scientists explicitly referenced in his statement are the analogues of the philosophers of science! Nor is an ornithologist’s work solely for the birds.

But the comparison raises a challenge which philosophers have been anxiously answering ever since: does philosophy offer something which scientists need? What value can the ornithologist offer the bird? Of course, the impassable gulf for an ornithologist who seeks to improve the lives of her avian subjects is that the birds are unable to listen to ornithological insights. They don’t speak the same language.

Philosophers of medicine have tried to speak the language of their subjects, the medical scientists and practitioners. Medics, too, speak in philosophical vernacular as standard. It’s when we speak alike but not akin that we see why the medical scientists are often bemired in perplexities that their armchair-bound colleagues could drain away. In their 2019 paper, Target Validity and the Hierarchy of Study Designs, Daniel Westreich et al. offer an intriguing concept of ‘target validity’ to replace the false but ubiquitous dichotomy of internal and external validity. Their discussion is nuanced and their proposal is intricate. The paper deserves a much more detailed discussion than I will give it here, and should be read on its own merits. But I’m afraid I simply cannot get past sentence five of the abstract, which reads:

In this work, we introduce and formally define target bias as the total difference between the true causal effect in the target population and the estimated causal effect in the study sample, and target validity as target bias = 0.

The phrase that triggers every philosophical instinct towards dismantlement is “the true causal effect”. It is horror-show phrases like this one, liberally spattering the medical literature, which expose shaky assumptions implicit in much meta-medical theorising which goes on in the absence of philosophical scrutiny. The true causal effect is one component of their measure of target bias, which in turn defines target validity, which is the topic of the paper. Its importance to their work cannot be overstated. But it is almost a philosophical nonsense.
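To fix ideas, here is a minimal sketch in Python of what their definition might amount to. Everything in it is my own illustrative assumption – the subgroup effects, the population mixes, and the reading of the ‘true causal effect’ as an average effect in a target population – none of it comes from the paper:

```python
# Hypothetical subgroup-specific treatment effects (pure invention):
# the effect is larger in older patients than in younger ones.
effect_old, effect_young = 2.0, 0.5

p_old_target = 0.7  # target population: mostly older patients
p_old_sample = 0.2  # study sample: skews younger (non-representative)

# One reading of "the true causal effect in the target population":
# the average effect across that population's subgroup mix.
true_effect_target = p_old_target * effect_old + (1 - p_old_target) * effect_young

# Even a perfectly internally valid study estimates the effect
# operative in its own sample, not in the target population.
estimated_effect_sample = p_old_sample * effect_old + (1 - p_old_sample) * effect_young

# Target bias, per the quoted definition: true minus estimated.
target_bias = true_effect_target - estimated_effect_sample
print(round(true_effect_target, 2),
      round(estimated_effect_sample, 2),
      round(target_bias, 2))  # -> 1.55 0.8 0.75
```

Note what the toy case quietly concedes: once effects vary across subgroups, ‘the true causal effect’ is already an average relative to a chosen population, not a single fact about the intervention.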

So what is the true causal effect? The authors presume that we know what a true causal effect is, and do not attempt a definition. But I am sorry to say I hate every word of this phrase, so let’s take it from the top.

“The”

Definite articles are the subtle knife of fallacious reasoning. All manner of grief can be concealed by these three letters. The definite article often conveys uniqueness. Consider the difference between the claim that ‘Randomised trials are a source of high-quality clinical evidence’ and that ‘Randomised trials are the source of high-quality clinical evidence’: a difference in uniqueness that might pass beneath scrutiny. In the former, there is scope for other work to provide high-quality evidence. In the latter, there is no such scope and RCTs become the sole providers of high-quality evidence. The definite morphs into the definitive. This transition was particularly notable in the late 1990s in the Evidence-Based Medicine movement, a period in which its literature moved away from offering ‘a hierarchy of evidence’ to assess the quality of evidence for specific claims, towards ‘the hierarchy of evidence’, an unassailable unified ranking purporting to govern all medical evidence.

Here, we have the true causal effect. Implicitly, they presume that there is only one unique true causal effect. An intervention cannot have multiple true causal effects. Disguised beneath is an unwarranted (and possibly unwanted) presumption of homogeneity. For, if there is but one true causal effect, there cannot be different true causal effects in different people, populations, or circumstances. Westreich et al. are trying to contribute to debates around precision and “personalised” medicine, and here the issue of whether we can speak meaningfully of individual-level effects is precisely the contentious point. Yet the debate is assumed away by the subtle ‘the’. One is left to wonder whether side-effects can be true causal effects, and how to differentiate the true causal effect from an intervention’s many other effects. After all, it is rare that any effect worth causing comes without other effects into the bargain.

“True”

It would be remiss of a philosopher to omit a discussion of truth. “True” here is an adjective to modify the “causal effect”. Do Westreich et al. suggest that the causal effect can come in at least two varieties: true causal effects and false causal effects? If so, what is a false causal effect?

One possibility is that Westreich et al. don’t actually think there are false causal effects at all, rather that they are concerned about the false impression that an effect is causal. In other words, rather than thinking about a “false causal effect”, we must worry about non-causal effects masquerading as causal effects. What on earth are these deceptive non-causal effects? I leave that to the discussion of “causal”, below. But if we go this route, which might seem the charitable option, then the word “true” in this phrase is either redundant or misleading, and should be eliminated or replaced with something more precise.

Alternatively, it may be that Westreich et al. think there are some false causal effects. These are probably “false” in a sense which diverges from our usual assumptions about veracity. They are not false in the sense that they are incorrect, deceptive or non-existent. They perhaps mean something more like a falsely-attributed causal effect. A true causal effect, then, would be conceptualised as one which is attributed correctly to the intervention, while a false causal effect is attributed to something else. What else? Perhaps it is a causal effect brought about by something other than the active intervention but which was done alongside the intervention. Maybe it is caused by the patient’s own reaction to receiving the intervention (a placebo or nocebo effect, in a sense). Or perhaps it is the result of regression to the mean or the natural course of the patient’s underlying disease. In a similar way, Collins & Pinch, in Dr. Golem, once distinguished (rather hair-raisingly) between true and false placebo effects. Even so, it is not really the ‘causal effect’ which is ‘true’ or ‘false’ here, so much as our own decision to attribute it to a given cause, rightly or wrongly, accurately or inaccurately, justifiably or unjustifiably.

Another way in which Westreich et al. might employ “true” here is to help us discern the reality from the scientific model – the estimate. Their work focuses on methods which attempt to estimate causal effects, and conceptualises bias and validity in terms of the accuracy of the estimate relative to ‘the’ real effect. In this sense, ‘true’ is again redundant, unnecessarily inviting a nasty encounter with some violently unsolved philosophical problems. What they are doing, rather, is making a quite common but still controversial claim: that there are underlying phenomena in reality which we call ‘causal effects’, which have directions and magnitudes, which we cannot measure directly, but which we can at least attempt to estimate. But in that case, they do not need to specify that it is the ‘true causal effects’ (or perhaps more properly the ‘real causal effects’) that we are estimating, nor to speak of true causal effects and estimated causal effects as though these were two species of effect. Estimated causal effects are not causal effects at all. They are estimates of causal effects. We do not need to expand our medical ontology by introducing the true causal effect and the estimated causal effect; we only need causal effects plus the concept of estimating something.

The incompatible and diverse ways in which we could understand claims about ‘true causal effects’ suggest that the term is underdeveloped here and likely only to obfuscate. Obfuscation breeds confusion in readers while allowing strong unjustified assumptions to persist unchallenged, potentially undermining the entire project if they prove unfounded. If philosophers could help clarify this muddy appeal to truth, they would offer potentially project-saving value.

“Causal”

In pairing ‘true’ and ‘causal’, this benighted phrase combines two words which have spawned more philosophical work than almost any other pairing. But even sidelining the voluminous debates over the nature of causation, principles for causal inference, and the roles of statistical regularities and mechanisms in inferring causation, the use of ‘causal’ here raises a plethora of issues.

Most strikingly, the authors chose the phrase “causal effects” rather than merely “effects”. Assuming this is not just redundancy, this suggests that there are non-causal effects from which the causal ones must be distinguished. What is a non-causal effect?

Aristotelian musings about the “unmoved mover” notwithstanding, I presume that no one involved in this debate believes that there are uncaused effects. A non-causal effect, then, is nevertheless caused by something. So, presumably, the distinction between causal and non-causal is in terms of whether the effect has the right cause, the cause we are interested in. (Beware, for this is another opportunity for the pernicious ‘the’ to surface: in lulling us into mistakenly thinking that an effect must have a singular and uniform cause – ‘the cause’ of ‘the effect’. When we look at a difference between two average treatment effects on some variable of interest in a study, it is so tempting to reduce the problem to ‘What caused this difference?’, rather than ‘What are the causes of this difference?’) Non-causal effects are presumably effects which were not caused by the intervention of interest in the study: placebo effects, exogenous events, and so on. Really, though, we should be talking about which effects – and more acutely, how much of those effects – is justifiably attributable to the cause(s) of interest.

It becomes clear how much philosophical and conceptual load the word ‘causal’ disguises here. Really, this word is deployed to skirt the entire problem of determining the causal structure behind the various effects in a study. The proposal that we might distinguish the effects which do and do not result from the cause of interest is a bold oversimplification. As we know, there are usually complex causal interactions involved in any case of serious medical interest. These make it very difficult (if not impossible) to discern the specific causal contribution of each element within a causal structure to a difference in a single variable.

To this point, little suggests that “the true causal effect” means anything more than “the effect”.

“Effect”

The final word in the phrase brings together the multifarious issues heaped upon this turn of phrase. First, there is an implicit reductive view underpinning the use of ‘effect’ here. The ‘effect’ being discussed is best understood as a differential average treatment effect – that is, the difference in average outcome (on some measure of interest) between two or more groups which received different treatment protocols. These are bare averages: we do not have information about the actual average treatment effect in either group, and are only able to discuss the difference because we are interested in attributing (at least some of) the difference in average outcome to the difference in treatment. Nor can we drill down to any individual-level effect (although we may discuss individual outcomes). These are certainly not the only effects we might be interested in. Nor is this the only way to gear up medical science to analyse the relationships between treatment and patient outcomes. We can take other approaches to causal modelling, many of which are more sophisticated (if less straightforward and easy to explain to practitioners and endorse in the abstract through idealised philosophico-statistical justifications). All of this is to say: the process of finding true (or at least, evidence-based) claims about causal relationships between treatments and outcomes in medicine should not be reduced to the process of determining how much of the differential average treatment effect in a comparative study can be attributed to the difference in treatment protocols.
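The reductive picture I have in mind can be put in a few lines of Python. The outcome numbers are invented purely for illustration; the point is that the reported ‘effect’ is a bare difference of group averages, and that the very same difference is compatible with wildly different individual-level stories:

```python
# Invented outcome values for two trial arms (illustration only).
treatment_outcomes = [7.1, 6.4, 8.0, 5.9, 7.5]
control_outcomes   = [5.2, 6.0, 4.8, 5.5, 6.1]

def mean(xs):
    return sum(xs) / len(xs)

# The reported "effect" is just a difference of bare group averages.
differential_ate = mean(treatment_outcomes) - mean(control_outcomes)
print(round(differential_ate, 2))  # -> 1.46

# Two incompatible individual-level stories yield that same average:
uniform_effects = [1.46] * 5                 # everyone benefits modestly
mixed_effects   = [7.3, 0.0, 0.0, 0.0, 0.0]  # one responder, four non-responders
assert abs(mean(uniform_effects) - mean(mixed_effects)) < 1e-9
```

The bare average cannot adjudicate between these stories, which is why drilling down to individual-level effects is not something this notion of ‘effect’ licenses.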

We would do well not to speak of “causal effects”. There are no uncaused effects in our understanding of causation in medicine. Our interest is in understanding causes of different outcomes of interest, with the hope of intervening to promote certain outcomes over others. The terminology of a “causal effect” suggests a simplistic one-to-one mapping of cause and effect and an undue neatness in these relationships, which is unlikely to reflect the messy realities of clinical medicine.

We would do better still to avoid calling causes and effects themselves ‘true’. We might want to distinguish between real and imagined causes and effects in unusual cases. But more often we want to talk not about the reality of causes and effects but of how confident we can be in attributing an effect (or some set portion of that effect) to a given cause, or the truth of the claim that intervention A was causally relevant to effect B to degree C.

Finally, we should be wary of definitive articles when it comes to cause and effect. ‘The cause’ and ‘the effect’ is language we use when discussing highly simplified toy cases. It is the medical scientific analogue of the frictionless plane in a vacuum. It is not appropriate to the morass of medicine.

Is it excessive to devote such scrutiny to four words, albeit words which are undefined and central to a new framework being introduced in a medical paper? Perhaps, but only inasmuch as it is excessive for ornithologists to produce detailed volumes on the intricacies of a starling’s flight. A concept of target validity that founders before it gets through its very definition is far less useful to the medical scientists than one which can accommodate the challenges of clinical medicine, rather than creaking under the weight of a great and unstable implicit pseudo-philosophical load.



References:

Collins, H. & Pinch, T. (2005) Dr. Golem: How to think about medicine (University of Chicago Press)

Westreich, D. et al. (2019) ‘Target validity and the hierarchy of study designs’, American Journal of Epidemiology, 188(2): 438-443.

Feynman’s quote is widely attributed to him but, to my knowledge, without discernible evidence; it is, however, consistent with his other statements on philosophy of science, see e.g.:
Feynman, R.P. (1963) The Feynman Lectures on Physics (Addison-Wesley)

For someone with a well-recorded disdain for philosophy of science, Feynman published a lot of it, e.g.:

Feynman, R.P. (1974) Cargo Cult Science, available at: https://calteches.library.caltech.edu/51/2/CargoCult.htm
Feynman, R.P. (2001) ‘What is Science?’, in The Pleasure of Finding Things Out: The Best Short Works of Richard P. Feynman (London: Penguin), 171-88