Double ML


Using Double Machine Learning to accurately estimate treatment effects

Double Machine Learning

Double Machine Learning (DML), also known as de-biased machine learning, was first proposed by MIT statistician and economist Victor Chernozhukov (Chernozhukov et al., 2018). However, rather than the paper, we recommend this easier explanation by Matheus Facure. As there are many good articles on the topic, we won't attempt to review the theory here. Instead, we will focus on the practical implications and capabilities of the model.

Double ML estimates a causal effect size (e.g. the average treatment effect, ATE) while controlling for confounding variables. It can therefore be used whenever a backdoor estimand is generated, instead of (or in addition to!) techniques such as linear regression. Its advantage over linear regression is that it can provide better estimates of the causal effect when the relationships between variables in the data are non-linear. This is its primary use case in Causal Wizard.

EconML implementation

Causal Wizard uses the Double ML implementation from EconML, via the DoWhy binding. This means we can provide:

The package does not currently support counterfactual or individual treatment effect predictions.

Configuration options

Double ML has many configuration options. It is often described as a meta-learner - a model composed of other models. This means that we need to configure each of the three models used in Double ML.

Somewhat confusingly, Double ML is applied in two stages but uses three models.

The three models are:

  • Model Y: Predicts the outcome, i.e. the dependent variable (y), from the features (covariates).
  • Model T: Predicts treatment assignment (t) from the features (covariates). This step helps control for selection bias by modeling the treatment assignment process.
  • Final model: This model is used in the second stage of Double ML (hence the name) to estimate the causal treatment effect. It relates the residuals of model Y to the residuals of model T, which isolates the effect of t on y. Provided the relevant confounders are included among the features, this removes the confounding bias from the estimate of the treatment effect.

Models Y and T are used in the first stage of Double ML to fit "nuisance" models for outcome and treatment assignment respectively. The final model is used in the second stage to estimate the treatment effect after controlling for these nuisance components.
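The two stages can be sketched by hand with scikit-learn. This is only an illustration of the residual-on-residual idea, not the EconML code that Causal Wizard runs. The synthetic data, the binary treatment, the function names (`simulate`, `double_ml_effect`) and the choice of a plain linear final model are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict


def simulate(n=2000, true_effect=2.0, seed=0):
    """Hypothetical data: the features drive both a binary treatment and the outcome."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 3))
    confounding = 1.5 * np.sin(X[:, 0]) + X[:, 1] ** 2  # non-linear in X
    propensity = 1.0 / (1.0 + np.exp(-0.5 * (confounding - 1.0)))
    t = rng.binomial(1, propensity)
    y = true_effect * t + confounding + rng.normal(size=n)
    return X, t, y


def double_ml_effect(X, t, y, cv=3, seed=0):
    """Estimate a constant treatment effect by regressing residuals on residuals."""
    model_y = GradientBoostingRegressor(random_state=seed)
    model_t = GradientBoostingClassifier(random_state=seed)

    # Stage 1: nuisance models. Out-of-fold predictions (cross-fitting) keep
    # each prediction independent of the row it is made for.
    y_hat = cross_val_predict(model_y, X, y, cv=cv)
    t_hat = cross_val_predict(model_t, X, t, cv=cv, method="predict_proba")[:, 1]
    y_res = y - y_hat
    t_res = t - t_hat

    # Stage 2: final model. The slope of outcome residuals on treatment
    # residuals is the effect estimate.
    final = LinearRegression(fit_intercept=False)
    final.fit(t_res.reshape(-1, 1), y_res)
    return float(final.coef_[0])


X, t, y = simulate()
dml_effect = double_ml_effect(X, t, y)

# For comparison: a regression of y on t alone, ignoring the confounders.
naive_effect = float(LinearRegression().fit(t.reshape(-1, 1), y).coef_[0])

print(f"naive estimate:     {naive_effect:.2f}")
print(f"Double ML estimate: {dml_effect:.2f} (simulated true effect: 2.00)")
```

In this simulation the naive estimate absorbs the confounding, while the residual-based estimate should land much closer to the simulated effect of 2.0.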

Note that all the models in Double ML can be quite simple, and a variety of models can be used.

Causal Wizard is configured to use the following options:

We chose the Gradient Boosting models because they reliably perform well on nonlinear interactions and generally don't require tuning of additional hyperparameters.

These options mean that Causal Wizard will only offer Double ML when the outcome variable is numerical, not categorical. We did test a binary categorical outcome with the GradientBoostingClassifier for model Y and discrete_outcome=True, but found the results to be unstable.