Panel Data format

Categories: Study Design, Causal Wizard Concept, Data, Causal Inference

Panel data format tracks the attributes and outcomes of a group of entities over time.

Datasets

Causal Wizard supports two tabular Dataset formats. This article covers one of them, called Panel Data. For the generic dataset format, and for general dataset-design terminology and concepts such as the definition of a Sample or Sample Unit, see the article on the generic dataset format.

Panel Data format

Each row represents an observation of a specific Entity at one moment in Time. The concepts of Entities and Time are specific to Panel Data format and are not required to be present in other Causal Wizard Datasets. 

An Entity can be an individual participant, or a group of participants who are related in a way that justifies modelling them together. It may help to think of an Entity as a "Unit", "Group" or "Cohort".

The Time column in your data can hold date-time-like values, but it can also hold anything with an inherent order, such as integers representing discrete periods.
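As a rough illustration of what "inherent order" means, the sketch below sorts observations whose Time values are either integer periods or ISO-8601 date strings. The helper `time_key` and the column name `Time` are assumptions made for this example; they are not part of Causal Wizard.

```python
from datetime import date

def time_key(value):
    """Return a sortable key for a Time value (hypothetical helper).

    Integers are used as-is; strings are parsed as ISO-8601 dates.
    """
    if isinstance(value, int):
        return value
    return date.fromisoformat(value).toordinal()

# Two toy datasets: one with discrete periods, one with calendar dates.
rows_by_period = [{"Time": 2}, {"Time": 1}, {"Time": 3}]
rows_by_date = [{"Time": "2023-02-01"}, {"Time": "2022-12-15"}]

sorted_periods = sorted(rows_by_period, key=lambda r: time_key(r["Time"]))
sorted_dates = sorted(rows_by_date, key=lambda r: time_key(r["Time"]))
```

Either representation works as long as every value in the column can be placed before or after every other value.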

Both the Entity and Time columns are optional: you can include neither, one, or both. Fixed-Effects models are available in every case, but the analysis and results differ slightly depending on which columns are present. To use Fixed-Effects models, change the Study Method setting.

Columns

A Panel Data file should have the following columns:

  • Treatment
  • Outcome
  • Entity Identifier (optional)
  • Time Identifier (optional; must have natural order)
  • Additional covariates (optional)
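The column list above could be checked before upload with a small script like the one below. This is a minimal sketch: the header names (`Entity`, `Time`, `Treatment`, `Outcome`), the example covariate `Age`, and the helper `missing_required` are all assumptions for illustration, since the column names in your own file may differ.

```python
import csv
import io

# Assumed header names; only Treatment and Outcome are mandatory.
REQUIRED = {"Treatment", "Outcome"}

def missing_required(header):
    """Return the required columns that are absent from a header row."""
    return REQUIRED - set(header)

# A tiny in-memory file with both optional identifiers and one
# hypothetical covariate (Age).
csv_text = """Entity,Time,Treatment,Outcome,Age
A,1,0,5,34
A,2,1,7,34
"""

reader = csv.DictReader(io.StringIO(csv_text))
missing = missing_required(reader.fieldnames)
rows = list(reader)
```

An empty `missing` set means both mandatory columns are present; the Entity, Time and covariate columns can be dropped without failing the check.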

Treatment variable & interaction term

Set the Treatment variable to the value that indicates Treated only in rows that represent times during or after Treatment. Do not set Treatment to a "true" value in every row belonging to a treated Entity. This lets the Treatment variable act as an "interaction term", which is true only when the Entity is in the Treated group and Treatment has been applied.

For example, if you have two Entities, A and B, which are observed before and after A is treated, the data would look like this:

Entity  Time  Treatment  Outcome
A       1     0          5
A       2     1          7
B       1     0          3
B       2     0          4

In the data above, Entity A is Treated and Entity B is a Control.
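If your raw data only records which Entities belong to the Treated group and when Treatment started, the Treatment column can be derived from those two facts. The sketch below reproduces the example rows; the names `treated_entities`, `treatment_start` and `treatment_flag` are hypothetical and chosen only for this illustration.

```python
# Hypothetical inputs: the Entities in the Treated group, and the first
# period in which Treatment applies.
treated_entities = {"A"}
treatment_start = 2

# (Entity, Time, Outcome) observations from the example.
observations = [
    ("A", 1, 5),
    ("A", 2, 7),
    ("B", 1, 3),
    ("B", 2, 4),
]

def treatment_flag(entity, time):
    """Return 1 only if the Entity is in the Treated group AND the
    observation falls during or after Treatment; otherwise 0."""
    return int(entity in treated_entities and time >= treatment_start)

panel = [
    {"Entity": e, "Time": t, "Treatment": treatment_flag(e, t), "Outcome": y}
    for e, t, y in observations
]
```

Only the row for Entity A at Time 2 receives a Treatment value of 1, which is the interaction of group membership and timing.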

 
