Correlation, Association and Causality


Correlation shows a relationship between two variables, whereas causality implies that one variable directly affects the other.

Correlation vs Causality

Correlation and causality are two concepts used to describe the relationship between two variables. Correlation is a statistical measure that shows how closely two variables are related to each other. Correlation can be positive, negative, or zero: the sign indicates the direction of the relationship, and the magnitude indicates its strength.

Causality, on the other hand, refers to a relationship where one variable directly affects the other and is the cause of a particular outcome.

The key difference between correlation and causality is that correlation does not imply causation. Just because two variables are correlated does not mean that one variable causes the other. It is possible that the correlation is coincidental or that a third variable is causing the observed relationship. Mistaking correlation for causation can cause problems in business operations, in understanding how to effectively market your project, and in communicating with your users.

For example, there is a strong positive correlation between ice cream sales and crime rates. However, it would be incorrect to conclude that ice cream consumption causes crime. Instead, the correlation is likely due to the fact that both ice cream sales and crime rates increase during the summer months when the weather is warmer.
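The third-variable explanation can be illustrated with a small simulation. The sketch below uses made-up data in which temperature drives both ice cream sales and crime reports, and neither affects the other. All numbers, variable names and helper functions are assumptions chosen for illustration, not real measurements. It compares the raw correlation with the correlation that remains after the linear effect of temperature is removed from both variables.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def residuals(ys, xs):
    """Residuals of ys after a least-squares straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [y - (my + slope * (x - mx)) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical daily data: temperature influences both quantities,
# but sales and crime have no direct effect on each other.
temperature = [rng.uniform(0, 35) for _ in range(2000)]
ice_cream_sales = [20 * t + rng.gauss(0, 100) for t in temperature]
crime_reports = [0.5 * t + rng.gauss(0, 3) for t in temperature]

# Raw correlation between the two outcome variables.
raw_r = pearson(ice_cream_sales, crime_reports)

# Correlation after removing the linear effect of temperature from both.
partial_r = pearson(residuals(ice_cream_sales, temperature),
                    residuals(crime_reports, temperature))

print(f"raw correlation:               {raw_r:.2f}")
print(f"after adjusting for temperature: {partial_r:.2f}")
```

In this simulated setting the raw correlation is strongly positive, while the adjusted correlation is close to zero, which is what you would expect when a shared cause explains the relationship. Real data is rarely this clean, and adjusting for one variable does not rule out other confounders.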

Correlation vs Association

You might also come across the term "association". Correlation and association both refer to the relationship between two variables, but association is the more general term and covers a broader range of relationships.

Correlation is a specific type of association that, in its most common usage, refers to a linear relationship between two variables: it measures the degree to which the variables move together in a straight line. Association, on the other hand, refers to any relationship between two variables, whether linear, curvilinear, or otherwise non-linear.

Therefore, all correlations are associations, but not all associations are correlations. Consider the type of relationship between variables when interpreting statistical results, so that your conclusions are accurate.

Figure: Types of association and correlation between two variables, including no association.

The figure above shows how a scatter plot of two variables might look when the variables are correlated, associated, or not associated. Note that association is a more general relationship than correlation. This figure may be helpful when interpreting scatter plots.
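To make the distinction concrete, here is a minimal sketch using synthetic data (the values and the `pearson` helper are assumptions for illustration). One variable is a straight-line function of x, the other is x squared. Both are fully determined by x, so both are associated with it, but only the first shows up as a linear correlation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Evenly spaced x values, symmetric around zero.
x = [i / 100 for i in range(-100, 101)]

y_linear = [2 * v + 1 for v in x]   # straight-line relationship
y_curved = [v ** 2 for v in x]      # curvilinear (U-shaped) relationship

r_linear = pearson(x, y_linear)
r_curved = pearson(x, y_curved)

print(f"linear relationship: r = {r_linear:.3f}")
print(f"curved relationship: r = {r_curved:.3f}")
```

The U-shaped relationship produces a correlation of essentially zero even though y is completely determined by x. This is why a scatter plot is worth checking before concluding from a low correlation coefficient that two variables are unrelated.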

Independence - the absence of association and causality

You might also want to check out our article on Independence. Independence typically refers to the absence of an association or causal relationship between variables.

Measuring correlation

Numerical measures of correlation quantify the degree of correlation between two variables. The two most popular measures are the Pearson Correlation Coefficient (which is only suitable for linear relationships) and the Spearman Rank Correlation Coefficient (suitable for any monotonic relationship between two variables, including nonlinear ones). Both metrics are provided in Causal Wizard in the Exploratory Data Analysis tools.
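The difference between the two coefficients can be sketched on a monotonic but nonlinear relationship. The example below is an illustration only and is not how Causal Wizard computes these metrics. The `pearson`, `ranks` and `spearman` helpers are hypothetical names, and the simple ranking used here assumes there are no tied values. Spearman is computed as the Pearson coefficient of the ranks.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """Rank each value from 1..n (assumes no ties, for simplicity)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# A monotonic but strongly nonlinear relationship: y grows exponentially.
x = [i / 10 for i in range(51)]
y = [math.exp(v) for v in x]

pearson_r = pearson(x, y)
spearman_r = spearman(x, y)

print(f"Pearson:  {pearson_r:.3f}")
print(f"Spearman: {spearman_r:.3f}")
```

Because y always increases when x increases, the ranks line up perfectly and the Spearman coefficient is 1, while the Pearson coefficient is noticeably lower because the points do not lie on a straight line. For real data with tied values, a library implementation that averages tied ranks would be a better choice than this simplified ranking.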

Summary

While correlation, association and causality are related concepts, they are not interchangeable. Correlation can help identify relationships between variables, but causality requires additional evidence to establish a cause-and-effect relationship.

Correlation is a specific type of association. Association may reflect a causal relationship, but not always.
