Fooled by correlation

by Stijn Vanden Bossche

What has always interested me is the degree to which one variable is related to another, and how often such a correlation is present even though you wouldn’t expect it at all. However, it seems that not every modeler shares my interest. We find that poor results from risk analysis models are often caused by missing (or wrongly used) dependency modeling techniques.

For me, the first step is to look at the available data and fit it to a correlation structure to see whether a pattern shows up and, if so, which one. The next step is harder: you have to think about where the correlation could come from. Through surveys and research I found some funny/weird correlations between things that seem completely unrelated, but that must have an explanation or a common driving factor. I’ll give it my best shot to find an explanation, but please feel free to help me out and send other possible explanations and thoughts directly to me. We are organizing a free webcast on correlation, but more on that at the bottom of this email.

Correlation 1: 74 percent of people who can type without looking at the keyboard prefer a restaurant to a fast food chain, compared with 56 percent of people in general.
My first thought is: the next time I’m in McDonald’s I will tell this piece of trivia to the female cashier in the hope she wants to prove herself and mistypes the amount I have to pay.

My second thought is: people who can type without looking are most likely office workers. They are more used to going to restaurants for lunch meetings and dinners with clients. Also, office workers tend to earn more than non-office workers, and we all know how much you need to pay for a juicy steak (or a cashew-mushroom pâté for the vegetarians among us) in a good restaurant.

Correlation 2: 82 percent of people with tattoos prefer hot weather over cold, compared with 63 percent of people in general.
Contrary to popular belief, I think people with tattoos are happy with their bodies and want to decorate them, much as women (and sometimes men) put on lipstick, wear earrings and so on. Although some tattoos are very personal, a lot of tattoos are a way to express your identity and beliefs to the outside world. The hotter it is, the fewer clothes you have to wear and therefore the more you can show your beliefs and identity.

Correlation 3: A UK study showed there was a strong correlation between wearing coats and car accidents.
A first study claimed that wearing a coat obstructs the driver’s movements and therefore makes driving more dangerous. That explanation was wrong: a second study showed that coats don’t actually obstruct free movement. Most people wear coats when it rains, and rain also makes accidents more likely. In this example rain is the hidden, common “driving” factor.

By the same flawed logic, more accidents occur when the windshield wipers are on, so we should consider turning off the windshield wipers.
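The coat example can be sketched as a small simulation. The probabilities below are invented purely for illustration (they do not come from the UK study), and the function names are hypothetical. Rain drives both coat wearing and accidents, while the coat itself has no effect on accidents at all. Over all drivers the two still appear positively correlated, but once you look at rainy and dry days separately the correlation should all but vanish.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate(n=50_000, seed=1):
    """Return (rain, coat, accident) triples; all probabilities are assumptions."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        rain = rng.random() < 0.30
        # Coat wearing depends on rain only.
        coat = rng.random() < (0.80 if rain else 0.10)
        # Accidents depend on rain only, never on the coat.
        accident = rng.random() < (0.10 if rain else 0.02)
        rows.append((int(rain), int(coat), int(accident)))
    return rows

def coat_accident_corr(rows):
    return pearson([r[1] for r in rows], [r[2] for r in rows])

rows = simulate()
overall = coat_accident_corr(rows)
within_rain = coat_accident_corr([r for r in rows if r[0] == 1])
within_dry = coat_accident_corr([r for r in rows if r[0] == 0])

print(f"all days:   {overall:.3f}")
print(f"rainy days: {within_rain:.3f}")
print(f"dry days:   {within_dry:.3f}")
```

Splitting the data by the suspected driving factor is a quick first check: if the correlation disappears within each group, the factor you split on is a good candidate for the real explanation.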

Other funny/weird correlations:

  • 26 percent of people who never took a ride on a motorcycle are multilingual, compared with 40 percent of people in general.
  • 15 percent of people who don’t like mayonnaise are good dancers, compared with 29 percent of people in general.
  • In general, 33 percent of people say their friends are mostly of the opposite sex. But among those with oily hair, 46 percent say their friends are mostly of the opposite sex.

Other dangers in the world of correlation

A big danger in correlation modeling is the assumption that correlation is always symmetrical. Rank order correlation, for example, implies a symmetrical relationship: for a positive correlation, when one variable is at its low end the other is too, and when one is at its high end so is the other. In practice we often see variables that are strongly correlated at one end but only weakly at the other. In that case we can’t use a symmetrical dependency modeling technique and have to turn to an asymmetrical one, such as the Clayton copula.
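To make the asymmetry concrete, here is a minimal sketch that samples from a Clayton copula with the standard conditional inversion method and compares how often both variables land in the same tail. The parameter value theta = 2, the 10 percent tail cut-off and the function names are assumptions chosen for illustration, not a recommendation for any particular model.

```python
import random

def sample_clayton(n, theta, seed=0):
    """Draw n pairs (u, v) from a Clayton copula by conditional inversion."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        # 1 - random() lies in (0, 1], which avoids raising 0 to a negative power.
        u = 1.0 - rng.random()
        w = 1.0 - rng.random()
        # Invert the conditional distribution of v given u at probability w.
        v = (u ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
        pairs.append((u, v))
    return pairs

def tail_agreement(pairs, q):
    """Share of pairs where v is in the same tail as u, for the low and high ends."""
    low = [v for u, v in pairs if u < q]
    high = [v for u, v in pairs if u > 1.0 - q]
    lower = sum(v < q for v in low) / len(low)
    upper = sum(v > 1.0 - q for v in high) / len(high)
    return lower, upper

pairs = sample_clayton(100_000, theta=2.0)
lower, upper = tail_agreement(pairs, q=0.1)

print(f"both in bottom 10% given u is: {lower:.2f}")
print(f"both in top 10% given u is:    {upper:.2f}")
```

With these settings the low end should agree far more often than the high end, which is exactly the one-sided behaviour a symmetrical technique cannot reproduce. Feeding 1 - u and 1 - v into the model instead flips the strong dependence to the high end.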