Bootstrap validation is a statistical technique used to assess the accuracy and variability of a statistical model's estimates by resampling the original dataset with replacement, meaning the same sample can be drawn more than once. The technique involves creating multiple resampled datasets and fitting the model to each one. By comparing the resulting model fits to the fit on the original dataset, one can assess how accurate and how variable the model's estimates are.
A related method, cross-validation, is also widely used in machine learning but partitions the data into evenly sized chunks rather than sampling randomly.
Both methods have pros and cons. If the data are shuffled first, cross-validation is conceptually similar to sampling without replacement, because each sample appears in exactly one subset. A disadvantage of cross-validation is that with smaller datasets, each subset may be too small to fit the model reliably. A disadvantage of bootstrap sampling, on the other hand, is that the influence of "rare" samples may be over-inflated (especially when stratified), since the same sample can be drawn multiple times.
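The difference between the two sampling schemes is easy to see in code. The following sketch (illustrative only, using a toy dataset of ten items) shows that a bootstrap resample typically repeats some items and omits others, while cross-validation folds are disjoint and cover every item exactly once:

```python
import numpy as np

rng = np.random.default_rng(0)
data = np.arange(10)  # toy "dataset" of ten items

# Bootstrap resample: same size as the data, drawn WITH replacement,
# so some items usually repeat and others are left out entirely.
boot = rng.choice(data, size=len(data), replace=True)

# Cross-validation (here 5-fold): shuffle, then split into disjoint,
# evenly sized chunks -- each item appears in exactly one fold.
folds = np.array_split(rng.permutation(data), 5)

print("bootstrap sample:", sorted(boot.tolist()))
print("folds:", [sorted(f.tolist()) for f in folds])
```

Because the folds partition the data, every item is used for validation exactly once; the bootstrap instead asks how estimates vary across many overlapping resamples.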
An example of how bootstrap validation is used in causal inference to estimate the effect of a treatment on an outcome variable is as follows:
Suppose you are conducting a study to determine the effect of a new drug on blood pressure. You have collected data on 100 patients, with 50 receiving the new drug and 50 receiving a placebo. You want to estimate the causal effect of the drug on blood pressure.
To estimate this effect, you could use a regression model that includes the treatment as a predictor variable, controlling for any confounding variables that may be related to blood pressure. However, it is important to assess the accuracy and variability of the estimates obtained from this model.
To do this, you can use bootstrap validation. This involves randomly selecting, with replacement, a subset of the original data (e.g., 80% of the data) and fitting the regression model to this sample. You then repeat this process many times (e.g., 1000 times), each time using a different randomly selected subset of the data. This creates multiple estimates of the effect of the drug on blood pressure.
You can then examine the distribution of these estimates to assess the accuracy and variability (or stability) of the model's estimates. For example, you could calculate the mean estimate and the standard deviation of the estimates. You could also construct a confidence interval around the estimate to determine the range within which the true effect of the drug is likely to lie. This mechanism is used to generate confidence intervals in Causal Wizard.
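The steps above can be sketched as follows. The data here are synthetic (the true drug effect is set to -5 blood-pressure units, with "age" as a stand-in confounder), and the regression is plain least squares via NumPy rather than any particular library; it is a minimal illustration of the resample-refit-summarise loop, not Causal Wizard's implementation. Note that this sketch resamples the full dataset size with replacement, which is the standard bootstrap; subsampling variants (such as the 80% example above) work the same way.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic study: 100 patients, 50 treated, 50 placebo.
# True causal effect of the drug is -5 units of blood pressure.
n = 100
treatment = np.repeat([1, 0], 50)
age = rng.normal(50, 10, n)                       # a confounder we control for
blood_pressure = 120 + 0.5 * age - 5 * treatment + rng.normal(0, 5, n)

# Design matrix: intercept, treatment indicator, confounder.
X = np.column_stack([np.ones(n), treatment, age])

def fit_effect(X, y):
    """OLS fit; return the coefficient on the treatment column."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

# Bootstrap: resample rows with replacement, re-fit, collect the estimates.
n_boot = 1000
effects = np.empty(n_boot)
for b in range(n_boot):
    idx = rng.integers(0, n, size=n)              # sample WITH replacement
    effects[b] = fit_effect(X[idx], blood_pressure[idx])

point = fit_effect(X, blood_pressure)
lo, hi = np.percentile(effects, [2.5, 97.5])      # 95% percentile interval
print(f"estimate: {point:.2f}, 95% CI: [{lo:.2f}, {hi:.2f}]")
```

The spread of `effects` is the variability referred to above: its standard deviation estimates the standard error, and its percentiles give the confidence interval.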
The bootstrap method is also used in Causal Wizard and the underlying DoWhy library for hypothesis significance testing.
If our experimental hypothesis is that there is a significant causal effect of Treatment on Outcome, the null hypothesis is that the observed effect is due to chance and there is really no causal effect.
The plausibility of the null hypothesis can be assessed by randomly permuting the outcomes, drawing samples, re-fitting the model, and observing how often the estimated effect is as large as the original estimate; this fraction approximates the p-value. This is the method performed by the significance testing in Causal Wizard, one of the refutation tests performed as part of the validation process.
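A minimal sketch of this permutation idea is shown below, again on synthetic data and using a simple difference in means as the effect estimator. Shuffling the outcomes breaks any real treatment-outcome link, so effects estimated on permuted data show what "chance alone" produces; this illustrates the general technique rather than DoWhy's exact implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data with a genuine treatment effect of -5 units.
n = 100
treatment = np.repeat([1, 0], 50)
outcome = 120 - 5 * treatment + rng.normal(0, 5, n)

def effect(t, y):
    """Difference in mean outcome between treated and control groups."""
    return y[t == 1].mean() - y[t == 0].mean()

observed = effect(treatment, outcome)

# Permutation test: shuffle the outcomes (severing any causal link),
# re-estimate, and count how often chance matches the observed effect.
n_perm = 1000
more_extreme = 0
for _ in range(n_perm):
    permuted = rng.permutation(outcome)
    if abs(effect(treatment, permuted)) >= abs(observed):
        more_extreme += 1

p_value = more_extreme / n_perm
print(f"observed effect: {observed:.2f}, p-value: {p_value:.3f}")
```

A small p-value means effects as large as the observed one rarely arise by chance, giving evidence against the null hypothesis.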
Bootstrap validation is a useful technique for assessing the stability of causal inference results when the data are noisy or when there are unobserved confounding variables that are difficult to control for.