Questions tagged [panel-data]

A multidimensional dataset usually describing measurements over time for a specific cohort.

Panel data is multivariate longitudinal data for a set of cross-sectional units, such as families or individuals, observed over several time periods. Many statistical analysis libraries require the data to be in a specific layout, typically long (one row per unit and period) or wide (one row per unit).

854 questions
-1
votes
1 answer

In Python, is there a way to impute average values (or interpolate linear values) for entities in a panel, but only when not all values are missing?

I have a question concerning imputation for panel data. In short, I wish to impute a value in years that have missing values based on the other years of the relevant entity. I thus do not want to impute values when I do not have any non-missing data…
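A minimal sketch of the idea in the excerpt above, using only the standard library. The rows, entity labels, and the function name `impute_entity_means` are hypothetical and serve only as illustration. A missing value is filled with the mean of the same entity's observed years, and an entity with no observed values at all is left untouched.

```python
from statistics import mean

# Hypothetical panel rows: (entity, year, value); None marks a missing value.
rows = [
    ("A", 2019, 10.0),
    ("A", 2020, None),
    ("A", 2021, 14.0),
    ("B", 2019, None),
    ("B", 2020, None),
]

def impute_entity_means(rows):
    """Fill missing values with the entity mean; skip entities with no data."""
    observed = {}
    for entity, _, value in rows:
        if value is not None:
            observed.setdefault(entity, []).append(value)
    # Only entities with at least one observed value get a mean.
    means = {entity: mean(values) for entity, values in observed.items()}
    return [
        (entity, year, means.get(entity) if value is None else value)
        for entity, year, value in rows
    ]

imputed = impute_entity_means(rows)
```

Here entity A's 2020 value becomes the mean of 10.0 and 14.0, while both of entity B's values stay missing because there is nothing to average.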
-1
votes
1 answer

Issue with pivot_longer command

I am trying to convert a long dataset to a wide set using pivot_longer. The column headers are "Program ID" and ‘Participant_Count_22’, ‘Participant_Count_21’, ‘Participant_Count_20’, ‘Participant_Count_19’ for four years 2019-2022. …
sili
  • 9
  • 2
-1
votes
1 answer

How to replace missing values in panel data?

I am looking into weekly earnings data, which I have split into pre-pandemic earnings data and post-pandemic earnings data. For some individuals, I have some missing values for the post-pandemic period which I want to replace with their…
-1
votes
2 answers

Attrition in panel data - Stata

I am constructing a panel dataset based on the survey data for the years 2010-2013 (four consecutive years). As is usually the case with household survey data, there is an issue of attrition, i.e. some households drop out from the survey from year…
Joker312
  • 59
  • 5
-1
votes
1 answer

pivot_wider not collapsing rows

Got a pretty basic question to ask, unfortunately. I am trying to use pivot_wider to make my data into a panel. The variable id reports gp & ge every year; the column t denotes the year. I want a separate variable gp_t and ge_t for every year in the…
Gilrob
  • 93
  • 7
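The reshape described in this excerpt can be sketched in plain Python. The sample rows and the helper `to_wide` are hypothetical; only the column names id, t, gp and ge come from the question. The point of the sketch is that each output row is keyed by id alone. A common cause of rows not collapsing in a wide reshape is that some other column still distinguishes the rows.

```python
# Hypothetical long data: one row per (id, t) carrying gp and ge.
long_rows = [
    {"id": 1, "t": 2019, "gp": 0.1, "ge": 1.0},
    {"id": 1, "t": 2020, "gp": 0.2, "ge": 2.0},
    {"id": 2, "t": 2019, "gp": 0.3, "ge": 3.0},
]

def to_wide(rows, id_col="id", time_col="t", value_cols=("gp", "ge")):
    """Build one record per id with columns named like gp_2019, ge_2019."""
    wide = {}
    for row in rows:
        # Key by id only, so every year of a unit lands in the same record.
        record = wide.setdefault(row[id_col], {id_col: row[id_col]})
        for col in value_cols:
            record[f"{col}_{row[time_col]}"] = row[col]
    return list(wide.values())

wide_rows = to_wide(long_rows)
```

Units that lack a given year simply have no key for it here; a real reshape tool would usually fill such cells with a missing-value marker instead.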
-1
votes
1 answer

How to convert event dates to longitudinal data in R?

I have a dataset of individual subject records with birth, diagnosis, and death dates. I would like to turn this into longitudinal data that shows whether or not subjects have been born, have been diagnosed (diagnosis can happen before or after…
AMG
  • 33
  • 4
-1
votes
1 answer

How to compute the growth rate of a variable for a varying time horizon in a panel?

I'm trying to calculate the growth rate of GDP per capita (pib_pc) for 32 states over a time horizon of 40 years, but the time horizon changes. For example, in the first row I would like to get the growth rate of pib_pc between 1980 and 2017 for…
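One way to read the excerpt above is that each row names its own start and end year. The sketch below assumes that reading and computes total growth, end / start - 1, for each requested horizon. The state labels, the pib_pc values, and the function name `growth_rate` are made up for illustration.

```python
# Hypothetical pib_pc values keyed by (state, year).
pib_pc = {
    ("state_1", 1980): 100.0,
    ("state_1", 1990): 150.0,
    ("state_1", 2017): 200.0,
    ("state_2", 1990): 80.0,
    ("state_2", 2017): 120.0,
}

def growth_rate(series, state, start_year, end_year):
    """Total growth between two years, or None if an endpoint is unusable."""
    start = series.get((state, start_year))
    end = series.get((state, end_year))
    if start is None or end is None or start == 0:
        return None
    return end / start - 1.0

# Each tuple is one row's horizon: (state, start year, end year).
horizons = [
    ("state_1", 1980, 2017),
    ("state_1", 1990, 2017),
    ("state_2", 1990, 2017),
    ("state_2", 1980, 2017),  # start year not observed for state_2
]
rates = {h: growth_rate(pib_pc, *h) for h in horizons}
```

If an annualized rate is wanted instead, the same lookup could feed (end / start) ** (1 / (end_year - start_year)) - 1.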
-1
votes
1 answer

R: Continuous return for panel data

I have the following data: structure(list(`Product Name` = c("A", "A", "A", "B", "B", "B", "C", "C", "C"), Year = c(2018L, 2019L, 2020L, 2018L, 2019L, 2020L, 2018L, 2019L, 2020L), Price = c(200L, 300L, 250L, 304L, 320L, 103L, 203L, 203L, 402L)),…
nima
  • 37
  • 6
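A short sketch for this question, assuming "continuous return" means the continuously compounded (log) return ln(P_t / P_{t-1}) computed within each product. The three columns are copied from the excerpt; the function name `log_returns` is hypothetical.

```python
import math

# Columns copied from the question's structure(...) excerpt.
products = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
years = [2018, 2019, 2020, 2018, 2019, 2020, 2018, 2019, 2020]
prices = [200, 300, 250, 304, 320, 103, 203, 203, 402]

def log_returns(products, years, prices):
    """Return {(product, year): ln(P_t / P_{t-1})}; first years have no return."""
    by_product = {}
    # Sorting the tuples orders rows by product, then by year.
    for product, year, price in sorted(zip(products, years, prices)):
        by_product.setdefault(product, []).append((year, price))
    returns = {}
    for product, series in by_product.items():
        # Pair each observation with the previous one of the same product.
        for (_, prev_price), (year, price) in zip(series, series[1:]):
            returns[(product, year)] = math.log(price / prev_price)
    return returns

returns = log_returns(products, years, prices)
```

Grouping before differencing keeps the first year of product B from being compared with the last year of product A.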
-1
votes
1 answer

Fit a fixed effect model with lm() in R without individual intercepts

I am working on panel regression and decided to switch to lm(), because plm() does not have a good predict() function for test data (as well as linearmodels in Python) and lme4 syntax is not intuitive for me as a newbie to econometrics. I want to…
Anakin Skywalker
  • 2,400
  • 5
  • 35
  • 63
-1
votes
1 answer

Trying to analyze panel data but feel like I am mixing up commands - could anybody review and check?

I have the following data structure: 186 unique firm acquisitions; observations for 5 years per firm (2 years before the acquisition year, the acquisition year, and 2 years after); total number of observations is thus 186 * 5 = 930; two dependent variables,…
-1
votes
1 answer

Transforming long form panel data to wide form in R

I have the following long format data that I would like to transform into wide format using R: structure(list(survey_unique_id = c(2816790L, 2816790L, 2816790L, 2585861L, 2585861L, 214733L, 214733L, 214733L, 224481L, 224481L, 224481L), user_id =…
-1
votes
1 answer

How to deal with fractional response on panel data?

I have panel data in which the response variable is fractional, between 0 and 1. Initially I modeled my data using a simple fractional logit, model=sm.Logit(y,x), results=model.fit(cov_type='HC0'), in Python, but since my data is a panel I needed a model to…
Amir
  • 11
  • 2
-1
votes
1 answer

How to include variable defining total observations in a panel data set?

Here is an example of the panel dataset I'm working with:
library(data.table)
data <- data.table(ID = c(1,1,1,1,1,2,2,2,2), crop = c(1,2,3,4,5,1,2,3,4))
ID, crop
1, 1
1, 2
1, 3
1, 4
1, 5
2, 1
2, 2
2, 3
2, 4
There are several ID…
codemachino
  • 103
  • 9
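A minimal sketch of adding a per-ID observation count, using the ID and crop values from the excerpt. The new column name n_obs is an assumption, since the excerpt is cut off before the desired output is shown.

```python
from collections import Counter

# ID and crop columns copied from the question's example.
ids = [1, 1, 1, 1, 1, 2, 2, 2, 2]
crops = [1, 2, 3, 4, 5, 1, 2, 3, 4]

# Count how many rows each ID contributes to the panel.
counts = Counter(ids)

# Attach the count to every row; "n_obs" is a hypothetical column name.
panel = [
    {"ID": i, "crop": c, "n_obs": counts[i]}
    for i, c in zip(ids, crops)
]
```

Every row of ID 1 then carries n_obs equal to 5 and every row of ID 2 carries 4.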
-1
votes
1 answer

Plotting panel regression

The image shows the head of my data. What I want to know is whether it is possible to plot these two regressions with a spline curve, together with the raw data of each hospital. I mean, each plot should have the spline regression together with…
-1
votes
1 answer

How to create unbalanced panel data based on start date and end date

My data looks something like this:
leader  startday    enddate
P       28/12/2000  15/12/2004
C       11/11/1966  19/10/1969
H       21/10/1993  1/07/1994
And I would like to obtain the following data: leader year P …
Pir
  • 1
  • 1
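A sketch of the expansion asked about here, under two assumptions that the truncated excerpt does not confirm: dates are day/month/year, and a leader gets one row for every calendar year touched by the spell, including partial first and last years. The function name `expand_to_years` is hypothetical.

```python
from datetime import datetime

# Spells copied from the question: (leader, startday, enddate).
spells = [
    ("P", "28/12/2000", "15/12/2004"),
    ("C", "11/11/1966", "19/10/1969"),
    ("H", "21/10/1993", "1/07/1994"),
]

def expand_to_years(spells, fmt="%d/%m/%Y"):
    """Return one (leader, year) row per calendar year covered by each spell."""
    rows = []
    for leader, start, end in spells:
        first = datetime.strptime(start, fmt).year
        last = datetime.strptime(end, fmt).year
        # range() excludes its stop value, so add 1 to keep the final year.
        rows.extend((leader, year) for year in range(first, last + 1))
    return rows

leader_years = expand_to_years(spells)
```

The result is unbalanced by construction: P contributes five rows (2000 to 2004), C four, and H two.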