Class imbalance means that your classes (also called cohorts or groups) are not balanced: one group is much larger than the other.
In machine learning and statistics, a class is a group of samples that share the same category, label, or group identity. Similarly, in scientific experiments, a group of subjects is sometimes called a cohort.
In Causal Wizard, we are interested in comparing two groups or cohorts: Controls and Treated.
The treatment is used to determine which group each sample belongs to. We will then compare the outcomes of the two groups.
Causal Wizard will refuse to proceed if one of these groups is empty (i.e. no samples are classified as Controls or Treated).
If the ratio of Controls to Treated exceeds 5:1 in either direction (more than five Controls per Treated sample, or more than five Treated per Control), Causal Wizard will warn you that the classes are imbalanced.
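The two checks described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text, not Causal Wizard's actual implementation: the function name `check_class_balance`, its parameters, and the assumption that the treatment is a list of 0/1 (or False/True) values are all hypothetical.

```python
from collections import Counter

# Threshold taken from the text: a ratio above 5:1 (either way) triggers a warning.
IMBALANCE_RATIO = 5.0


def check_class_balance(treatment, max_ratio=IMBALANCE_RATIO):
    """Count Controls and Treated and flag class imbalance.

    `treatment` is assumed to be an iterable of binary values
    (0/False = Control, 1/True = Treated). Hypothetical helper.

    Returns (n_controls, n_treated, imbalanced).
    Raises ValueError if either group is empty.
    """
    counts = Counter(bool(t) for t in treatment)
    n_controls = counts[False]
    n_treated = counts[True]

    # An empty group makes any comparison impossible, so stop here.
    if n_controls == 0 or n_treated == 0:
        raise ValueError("Both Controls and Treated need at least one sample")

    # Larger group divided by smaller group covers both 5:1 and 1:5.
    ratio = max(n_controls, n_treated) / min(n_controls, n_treated)
    return n_controls, n_treated, ratio > max_ratio


# Example: 60 Controls and 10 Treated gives a 6:1 ratio, which exceeds 5:1.
n_controls, n_treated, imbalanced = check_class_balance([0] * 60 + [1] * 10)
print(n_controls, n_treated, imbalanced)
```

Dividing the larger count by the smaller one means a single comparison handles both directions of imbalance. Note that a ratio of exactly 5:1 does not count as exceeding the threshold in this sketch.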
If both classes have thousands of samples, this may be less of a concern. But if your dataset is small and one of the classes is extremely small, the problem may be severe. How severe the impact is in your case is left to your judgement.
Possible remedies include: