Causality and Correlation


One of the most common misunderstandings about the relationship between two observations is the difference between causality and correlation. Two things may seem to be correlated (associated) when there is actually no real connection between them. This is especially true of changes over time. For example, over the last two centuries bread prices and sea levels in Venice increased in tandem, but they are obviously not connected other than statistically. Another good example is the strong association (correlation) between the reading skills of children and their shoe size. Despite the association, no one would seriously suggest that growing feet have anything directly to do with reading skills. There is a simpler explanation, which is that as children get older both their shoe size and their reading skills increase. In this case we refer to age as the confounder.

Another example is the association between a city’s ice cream sales and the number of drownings in its swimming pools. It turns out that sales are highest when deaths due to drowning in the city swimming pools are also at their highest. No one seriously thinks that eating ice cream increases the risk of drowning. In this case the confounder is likely to be a heat wave, which sends more people both to the ice cream stand and into the water.

In other cases the explanation for an association is not so obvious. Consider the familiar old folk legend which says that storks bring babies into the world. Even today, it’s common to see storks on greetings cards celebrating births. Whilst no one seriously thinks that storks have anything to do with human reproduction, studies in European countries show a consistent statistical correlation between the density of nesting storks and birth rates. How can this be? It turns out that there could be a number of explanations, the most plausible of which is that the countries with the highest birth rates also have the largest land mass, and so more space for the storks to nest. Again, this additional factor (confounder) explains the apparent relationship between the other two factors.

These examples illustrate a very important point:  correlation is simply an association between two or more things, whereas causation is a relationship in which a particular action or event can be shown to be the direct consequence of another.  
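
The shoe size and reading example can be made concrete with a small simulation. The sketch below uses invented numbers, not real measurements: the function name `simulate_children`, the age range and the noise levels are all assumptions chosen for illustration. Age is made to drive both shoe size and reading score, and neither of these affects the other. The correlation is then calculated twice, once across all children and once among children of nearly the same age.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simulate_children(n=5000, seed=0):
    """Hypothetical data: age drives shoe size and reading score separately."""
    rng = random.Random(seed)
    ages = [rng.uniform(5, 12) for _ in range(n)]
    # Made-up coefficients; shoe size and reading never influence each other.
    shoe = [20 + 1.5 * age + rng.gauss(0, 1.0) for age in ages]
    reading = [10 * age + rng.gauss(0, 8.0) for age in ages]
    return ages, shoe, reading


ages, shoe, reading = simulate_children()

# Correlation across children of all ages.
overall = pearson(shoe, reading)

# Hold the confounder roughly fixed: only children aged 8 to 8.5.
band = [(s, r) for a, s, r in zip(ages, shoe, reading) if 8.0 <= a < 8.5]
within_band = pearson([s for s, _ in band], [r for _, r in band])

print(f"all ages:      r = {overall:.2f}")
print(f"ages 8 to 8.5: r = {within_band:.2f}")
```

Under these assumptions the correlation across all ages should come out strong, while the correlation within the narrow age band should be much closer to zero. Comparing like with like in this way is one simple method of checking whether a suspected confounder accounts for an association.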

In practice, looking for a correlation between two things is just the first step – albeit an important one. After all, if there is no correlation, there cannot be any causal association. If there is a correlation, then this might be causal or it might not be. Quite often, as in the examples above, it’s obvious that there is no plausible causal mechanism which might explain the association. On other occasions, it requires much more investigation to determine why things are associated or correlated. A good example is the well-known correlation between cigarette smoking and lung cancer, which was first published in 1950. It was not known at that time how smoking might cause lung cancer, and smoking was listed as only one of a number of causes. Since then we have learned that cigarette smoke contains a large number of cancer-causing agents, and the mechanisms by which smoke may alter normal cell division to produce cancer cells have been described. Today, the weight of epidemiological and experimental evidence is such that we now regard the relationship between smoking and lung cancer as causal. Hence the public health messages which say that smoking causes lung cancer or that smoking kills.

In general it’s easier to spot a simple correlation (as in the case of nesting storks) than to confirm a true causal relationship. As was made clear earlier, true causality can only be confirmed when a particular action or event can be shown to be the direct consequence of another. In fact, in medicine, it can be extremely difficult to show that a relationship is genuinely one of cause and effect. For example, there is a strong association between obesity and various forms of cancer, but it is quite difficult to show that this is causal. Not everyone who is obese will develop cancer and not everyone who has cancer is obese. So the best we can say is that whilst the association is a strong one, it falls short of being shown to be causal - and this is true of most observations that support public health policy and advice.

As always, things are usually more complex than they seem!
