Blog 3: In which I try to explain correlation and causation (Caution: some blogs may contain rambling).

Opening a stats textbook is, for me, very similar to opening the Ark of the Covenant (as in ‘Indiana Jones: Raiders of the Lost Ark’, gotta love that movie!). The cover is lifted and immediately my face begins to melt right off at the arcane horrors within. This happens every time I look at a page of a stats textbook. Can I therefore conclude that statistical textbooks cause face-melting? Note: if I can show causation then people in dark suits will take all the stats books away to be hidden in Area 51 and examined by Top Men.

Suppose I conducted an observational study and found that the frequency of face-melting incidents increases with exposure to stats textbooks. I have found a strong positive correlation, but I cannot infer causation from it because I have not controlled for any possible confounding variables. The face-melting effects may be common to all textbooks, my face may be hypersensitive, I might be allergic to paper, or the melting may be caused by the stress or boredom I feel when opening the textbook and not by the textbook itself. What this study can tell me, however, is that there is a strong covariation between exposure to stats textbooks and melting faces, which hints at the possibility of a causal relationship between the two.

To demonstrate causation I would have to conduct an experiment in which I manipulated exposure to stats textbooks myself, all other possible causes of face-melting were controlled for, and possible sources of bias were removed (random assignment, independent observations... the usual). If at the end of this I found that the frequency of face-melting was significantly higher in the stats textbook condition than in the control conditions, then I could infer that there is likely to be a causal relationship between the two variables (Causation, at last! Send in the Top Men!).
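
For anyone who prefers code to melting faces, here is a rough sketch of the difference between the two designs. Everything in it is made up for illustration: the hidden "stress" variable, the effect sizes and the function names are my assumptions, and by construction textbooks have no effect on melting at all. In the observational version stressed people open the book more often; in the randomised version exposure is handed out independently of stress.

```python
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate(n, randomised, seed=0):
    """Return the exposure/melting correlation in a toy world.

    Assumption: 'stress' is a hidden confounder that drives face-melting.
    Textbook exposure never causes melting in this simulation.
    """
    rng = random.Random(seed)
    exposure, melting = [], []
    for _ in range(n):
        stress = rng.gauss(0, 1)  # hidden confounder
        if randomised:
            # Experiment: exposure is assigned independently of stress.
            hours = rng.gauss(0, 1)
        else:
            # Observational study: stressed people open the book more.
            hours = stress + rng.gauss(0, 1)
        melt = stress + rng.gauss(0, 1)  # depends on stress only
        exposure.append(hours)
        melting.append(melt)
    return pearson_r(exposure, melting)


r_observational = simulate(5000, randomised=False)
r_randomised = simulate(5000, randomised=True)
print(f"observational r = {r_observational:.2f}")
print(f"randomised r    = {r_randomised:.2f}")
```

In this toy world the observational study shows a clear positive correlation even though the books are innocent, while random assignment breaks the link between stress and exposure and the correlation drops to roughly zero. It is only a sketch of the logic, not a substitute for a real design.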

It is worth noting, however, that both of these techniques were looking for a relationship between the variables, and that how we analyse the data does not determine whether we can infer causation; it is how the data was collected that determines this (for a better explanation of this see here: http://core.ecu.edu/psyc/wuenschk/stathelp/Correlation-Causation.htm). What this means is that, although we are taught dogmatically that correlation never ever implies causation, it is possible for a correlation to imply causation if the data was gathered by true experimental methods, because it is the control of possible biases and confounds within the experimental method that allows us to infer causation. In other words, if I had collected my face-melting data experimentally but analysed it using a correlation or regression technique, rather than an ANOVA or t-test, then I could still make a causal inference, provided the analysis was appropriate and I had been rigorous enough in collecting my data.
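
A small sketch of why the choice of test is not the thing that matters. The melting scores below are invented for illustration and the function names are mine. The sketch compares a pooled-variance two-sample t statistic with the t statistic obtained from the correlation between group membership (coded 1 or 0) and the score, using the standard conversion t = r * sqrt(n - 2) / sqrt(1 - r^2).

```python
import math
import statistics


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def pooled_t(a, b):
    """Two-sample t statistic assuming equal variances (pooled)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    diff = statistics.fmean(a) - statistics.fmean(b)
    return diff / math.sqrt(pooled_var * (1 / na + 1 / nb))


def correlation_t(xs, ys):
    """t statistic for testing that a Pearson correlation is zero."""
    r = pearson_r(xs, ys)
    n = len(xs)
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Made-up melting scores, purely for illustration.
stats_book = [7, 9, 6, 8, 10]
control_book = [3, 5, 4, 6, 2]

# Code the condition as a number: 1 = stats textbook, 0 = control.
group = [1] * len(stats_book) + [0] * len(control_book)
scores = stats_book + control_book

t_from_ttest = pooled_t(stats_book, control_book)
t_from_correlation = correlation_t(group, scores)
print(t_from_ttest, t_from_correlation)
```

On this data the two statistics agree up to floating-point rounding, so the "correlational" analysis and the t-test reach the same verdict. Whether that verdict supports a causal claim depends on how the scores were collected, which the code knows nothing about.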

The dogma of “Correlation does not imply Causation” is potentially extremely misleading for any student of statistics. What it really means is that we cannot guarantee or even infer a causal relationship from a correlational research design, because no variables have been manipulated or controlled for. The word ‘imply’ is also problematic in this context. Here it is intended to mean ‘guarantee’, but in everyday use it usually means ‘hint at’, which is in fact what any good correlational study should do: explore connections between variables to uncover possible relationships that can be examined experimentally later on.

 

For your entertainment:

http://www.obereed.net/hh/correlation.html

http://xkcd.com/552/

For your education:

http://www.cambridge2000.com/memos/correlation.html

http://psych.csufresno.edu/psy144/Content/Design/Types/correlational.html

http://www.webster.edu/~woolflm/methods/devresearchmethods.html


1 Comment

  1. I enjoyed reading the silly example showing that correlation is not causation. Whilst researching this topic I found that, although correlation does not imply cause, it is necessary for detecting relationships between variables without manipulation or control. Correlation is an important measure in psychology, and without it many relationships between variables would not have been discovered. Correlation is also valuable for assessing the independence of concepts, that is, whether or not the ideas one is examining are distinct and separate. Finally, we can make predictions using correlations; a good example of this is height and weight.
