Propensity Score Matching for Diff-in-diff with panel data

Question

I am trying to use MatchIt to perform propensity score matching (PSM) on panel data that follows a group of participants (participant_uuid) from 12 months before treatment to 12 months after treatment, i.e. we have a complete set of 24 observations per participant. The matching is meant to prepare a dataset for diff-in-diff models that I will estimate later. Since I want the diff-in-diff to show how the reaction to treatment varies between groups, I am matching on the 12 months prior to treatment. My code currently looks like this:

match.nearestneighbour <- matchit(grouping_variable ~ characteristic1 + characteristic2 + characteristic3, data = dataset_12months_pre_treatment, distance = "glm", method = "nearest", m.order = "largest", replace = TRUE, exact = c("month_relative_to_treatment"))

I realized this code matches at the level of individual observations (i.e. within each month_relative_to_treatment, it selects the best-matching control observation for each treated observation). How can I change the R code so that, instead of matching per month/observation, it finds for each participant_uuid in the treatment group the participant_uuid in the control group with the nearest distance, aggregated across the 12 months considered? Any hints are much appreciated.

score 0 · Answer 1 · answered Apr 12 '23 at 19:43


You need to transform your dataset so that it is wide, i.e., so there is a single row for each participant, and each column contains the value of one variable in one given month. Then you include all the month-specific variables in the matching formula to estimate the propensity score. This will attempt to create pairs of participants that are similar across all 12 months.
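A minimal sketch of the wide transformation described above, on simulated data. The simulated values, the single covariate, and the `month_label` helper column are assumptions made for illustration; only the variable names from the question are reused. The MatchIt call is guarded so the reshaping part runs even without the package installed.

```r
# Hypothetical long-format pre-treatment panel: one row per participant-month.
set.seed(1)
n_participants <- 200
long <- expand.grid(participant_uuid = seq_len(n_participants),
                    month_relative_to_treatment = -12:-1)

# Treatment indicator (0/1), constant within each participant.
group <- rbinom(n_participants, 1, 0.3)
long$grouping_variable <- group[long$participant_uuid]
long$characteristic1 <- rnorm(nrow(long), mean = 0.3 * long$grouping_variable)

# Helper label so the widened column names are syntactic
# (m12 = 12 months before treatment, m1 = 1 month before).
long$month_label <- paste0("m", abs(long$month_relative_to_treatment))

# One row per participant, one column per variable-month combination.
wide <- reshape(
  long[, c("participant_uuid", "grouping_variable",
           "month_label", "characteristic1")],
  idvar = c("participant_uuid", "grouping_variable"),
  timevar = "month_label",
  direction = "wide",
  sep = "_"
)

# Build grouping_variable ~ characteristic1_m12 + ... + characteristic1_m1.
month_cols <- grep("^characteristic1_m", names(wide), value = TRUE)
match_formula <- reformulate(month_cols, response = "grouping_variable")

if (requireNamespace("MatchIt", quietly = TRUE)) {
  # No exact = "month_relative_to_treatment" here: each row is a participant.
  match_wide <- MatchIt::matchit(match_formula, data = wide,
                                 distance = "glm", method = "nearest",
                                 m.order = "largest", replace = TRUE)
  # get_matches() returns one row per matched participant, with pair membership.
  matched_participants <- MatchIt::get_matches(match_wide)
}
```

The participant_uuid values in `matched_participants` could then be joined back to the full 24-month long panel for the diff-in-diff step.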

answered Apr 12 '23 at 19:43

Noah


Hi Noah, many thanks for your reply. Transforming to wide results in two challenges: 1) Including the variables I wanted to include across all 24 months I am considering seems to overstrain my model (no result can be obtained). 2) I need to weight the variables so that time-independent variables are given sufficient consideration. What would you recommend? Are you aware of prior papers and/or methodological discussions of PSM for panel data? Thanks in advance! – Alexanderg Apr 13 '23 at 20:43
Maybe use a model that penalizes the coefficients, like ridge regression, or avoid a model altogether and use the scaled Euclidean or Mahalanobis distance. I think you only need to match in the pre-period, so don't seek matches using all 24 months (if I understand your analysis correctly). I don't know of any specific papers but I do think there are some out there. You might look into generalized synthetic control. – Noah Apr 13 '23 at 21:07
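The two alternatives suggested in the comment above might look roughly like this in MatchIt. This is a sketch on simulated wide data restricted to the 12 pre-period months; the data and column names are hypothetical. It assumes a MatchIt version that accepts `distance = "mahalanobis"` and `distance = "ridge"` (the latter relies on the glmnet package), so both calls are guarded.

```r
# Hypothetical wide pre-period data: one row per participant,
# twelve columns characteristic1_m12 ... characteristic1_m1.
set.seed(2)
n <- 150
pre <- matrix(rnorm(n * 12), nrow = n,
              dimnames = list(NULL, paste0("characteristic1_m", 12:1)))
wide <- cbind(
  data.frame(participant_uuid = seq_len(n),
             grouping_variable = rbinom(n, 1, 0.3)),
  pre
)

pre_cols <- colnames(pre)
match_formula <- reformulate(pre_cols, response = "grouping_variable")

if (requireNamespace("MatchIt", quietly = TRUE)) {
  # Option A: no propensity score model; nearest neighbours are found
  # directly on the Mahalanobis distance between covariate vectors.
  match_maha <- MatchIt::matchit(match_formula, data = wide,
                                 method = "nearest",
                                 distance = "mahalanobis",
                                 replace = TRUE)

  # Option B: propensity score from a ridge-penalized logistic regression.
  if (requireNamespace("glmnet", quietly = TRUE)) {
    match_ridge <- MatchIt::matchit(match_formula, data = wide,
                                    method = "nearest",
                                    distance = "ridge",
                                    replace = TRUE)
  }
}
```

Replacing `"mahalanobis"` with `"scaled_euclidean"` would give the other model-free distance mentioned in the comment.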
