Categories → Validation , Statistics , Causal Effect
Randomizing the outcomes lets us quantify how likely the observed effect is to arise by chance alone.
NOTE: Please read this article for a detailed guide to refutation and statistical significance testing in DoWhy, a Python library used in Causal Wizard.
Refutation is a key concept in causal inference: the process of testing a hypothesis by attempting to prove it false. One way to do this is to use a randomized outcome.
Falsifying or refuting an outcome should not be seen as a disappointment:
Strict refutation helps to ensure - but does not guarantee - that results are sound and trustworthy.
Randomizing the outcomes should destroy any causal effect, because the outcome is no longer affected by the treatment at all.
Causal Wizard applies a non-parametric statistical significance test (one that needs no assumptions about the data distribution) to the core causal result by repeatedly permuting the outcomes and re-fitting the model - this is known as a permutation test. Permutation is used (rather than generating "random" outcomes) because it is an easy way to produce a set of statistically realistic outcome values.
We then count how often a causal effect as strong as the original estimate arises from models fitted to these permuted datasets, in which no causal effect exists. If an equally strong effect is rare under permutation, this suggests the original data contained a real causal effect. The proportion of permuted datasets yielding an effect at least as strong as the original gives the p-value.
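The procedure above can be sketched in a few lines of Python. This is an illustrative permutation test, not Causal Wizard's actual implementation: it uses a simple difference in group means as a stand-in for the fitted causal model's estimate, and the function name and parameters are hypothetical.

```python
import numpy as np

def permutation_p_value(treatment, outcome, n_permutations=1000, seed=0):
    """Estimate a p-value for a treatment effect by permuting outcomes.

    Permuting the outcome column destroys any treatment-outcome link,
    so effects at least as strong as the observed one should be rare
    under permutation if a real causal effect exists.
    """
    rng = np.random.default_rng(seed)
    treatment = np.asarray(treatment, dtype=bool)
    outcome = np.asarray(outcome, dtype=float)

    # Observed effect: difference in mean outcome between groups
    # (a stand-in for re-fitting the full causal model).
    observed = outcome[treatment].mean() - outcome[~treatment].mean()

    count = 0
    for _ in range(n_permutations):
        permuted = rng.permutation(outcome)
        effect = permuted[treatment].mean() - permuted[~treatment].mean()
        if abs(effect) >= abs(observed):
            count += 1

    # Add-one smoothing so the estimated p-value is never exactly zero.
    return (count + 1) / (n_permutations + 1)
```

With data containing a genuine effect, the returned p-value is small, indicating that an effect of the observed strength is rarely reproduced once the outcomes are shuffled.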