Confounding / Confounder (Variable)

Categories: Causal Inference, Statistics, Variables

Confounding is a situation where an extraneous variable is related to both the treatment and the outcome, making it difficult to determine the true causal effect of the treatment on the outcome.

Confounding arises when a variable is associated with both the treatment variable and the outcome variable, but is not itself caused by the treatment. Such variables are known as Confounders and can lead to biased (misleading or inaccurate) estimates of the causal effect.

Confounding is often referred to as a "mixing of effects", which means the true effect of the treatment is "mixed" or confused with the effects of other, "confounding" variables.

Possible confounders should be drawn as ordinary variables in your Causal Diagram. Causal Wizard will work out if they are confounders, or not, and colour them as appropriate, during the Check process.

Example of Confounding

As an example, let's say we are interested in studying the effect of smoking on lung cancer risk. We collect data on a group of people who smoke and a group of people who do not smoke, and we find that the smoking group has a higher rate of lung cancer. However, if we do not account for other factors that may be associated with both smoking and lung cancer risk, such as age, gender, and exposure to environmental toxins, then we cannot be sure that the observed association between smoking and lung cancer is causal. These other factors could be confounders, and if we do not control for them, they could lead to biased estimates of the true effect of smoking on lung cancer risk.

To address this issue, we need to include these variables in our models, and use appropriate statistical methods that can adjust for the confounding variables, such as stratification or regression analysis. By doing so, we can obtain more accurate estimates of the causal effect of smoking on lung cancer risk while accounting for the potential confounding factors.
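As a rough illustration of stratification, the sketch below simulates a single binary confounder and compares a naive estimate with a stratified one. Everything in it is an assumption made for the example: the variable names, the probabilities, and the true effect of +0.10 are invented, and it is not how Causal Wizard itself computes estimates.

```python
import random

def simulate(n=200_000, seed=0):
    """Simulate (confounder, treatment, outcome) rows with made-up probabilities."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5                   # hypothetical confounder, e.g. older age group
        t = rng.random() < (0.7 if z else 0.2)   # treatment is more likely when z is true
        p = 0.05 + 0.10 * t + 0.20 * z           # outcome risk; assumed true effect of t is +0.10
        y = rng.random() < p
        rows.append((z, t, y))
    return rows

def risk_difference(rows):
    """Outcome rate among treated minus outcome rate among untreated."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return sum(treated) / len(treated) - sum(control) / len(control)

def stratified_risk_difference(rows):
    """Average the within-stratum risk differences, weighted by stratum size."""
    total = 0.0
    for level in (False, True):
        stratum = [r for r in rows if r[0] == level]
        total += risk_difference(stratum) * len(stratum) / len(rows)
    return total

rows = simulate()
naive = risk_difference(rows)
adjusted = stratified_risk_difference(rows)
print(f"naive estimate:      {naive:.3f}")
print(f"stratified estimate: {adjusted:.3f}")
```

Under these assumed parameters the naive estimate should land near 0.20, about double the simulated effect, because the treated group contains more people with the confounder. The stratified estimate should land near 0.10, because treated and untreated people are only compared within the same level of the confounder.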

For further examples, see this article.

Importance of not over-controlling

One of the discoveries of Causal Inference is that controlling for variables unnecessarily (known as over-controlling) can also bias the result, so it's important to control for only the right set of variables. These are automatically identified during the Check process in Causal Wizard, and displayed in your Causal Diagram.
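One way over-controlling can introduce bias is by adjusting for a variable that is caused by both the treatment and the outcome, often called a collider. The sketch below is a minimal simulation of that case, assuming a treatment with no effect at all on the outcome. The variable names and probabilities are invented for illustration, and this is only one of several ways over-controlling can go wrong.

```python
import random

def simulate(n=200_000, seed=1):
    """Simulate (collider, treatment, outcome) rows; treatment has NO effect on outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        t = rng.random() < 0.5                   # treatment assigned at random
        y = rng.random() < 0.3                   # outcome is independent of treatment
        p_c = 0.1 + 0.4 * t + 0.4 * y            # hypothetical collider, caused by both t and y
        c = rng.random() < p_c
        rows.append((c, t, y))
    return rows

def risk_difference(rows):
    """Outcome rate among treated minus outcome rate among untreated."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return sum(treated) / len(treated) - sum(control) / len(control)

def stratified_risk_difference(rows):
    """Average the within-stratum risk differences, weighted by stratum size."""
    total = 0.0
    for level in (False, True):
        stratum = [r for r in rows if r[0] == level]
        total += risk_difference(stratum) * len(stratum) / len(rows)
    return total

rows = simulate()
unadjusted = risk_difference(rows)
over_controlled = stratified_risk_difference(rows)
print(f"unadjusted estimate:          {unadjusted:.3f}")
print(f"estimate after over-control:  {over_controlled:.3f}")
```

With these assumed numbers the unadjusted estimate should sit close to the true value of zero, while the estimate that stratifies on the collider should come out clearly negative, around -0.17. The adjustment creates an association that does not exist in the simulated data.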
