Questions tagged [panel-data]

A multidimensional dataset usually describing measurements over time for a specific cohort.

Panel data is multivariate longitudinal data: repeated observations over time on the same set of cross-sectional units, such as families or individuals. Many statistical analysis libraries require the data to be arranged in a particular layout before a model can be fitted.
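As a rough illustration of what such a layout can look like, here is a minimal pandas sketch. The column names (firm, year, sales) and the values are made up for the example, and the entity-then-time index is only one common convention, not a requirement of any particular library.

```python
import pandas as pd

# Hypothetical long-format panel: one row per (unit, period) pair.
df = pd.DataFrame({
    "firm": ["A", "A", "A", "B", "B"],
    "year": [2019, 2020, 2021, 2019, 2020],
    "sales": [10.0, 12.0, 15.0, 7.0, 8.0],
})

# Index by entity first, then time, so each row is uniquely identified.
panel = df.set_index(["firm", "year"]).sort_index()

# Count observations per unit; unequal counts mean the panel is unbalanced.
obs_per_unit = panel.groupby(level="firm").size()
is_balanced = obs_per_unit.nunique() == 1

# Wide view: one row per unit, one column per period (missing cells become NaN).
wide = panel["sales"].unstack("year")
```

Keeping the data in long format with an explicit entity/time index makes it easy to check balance and to reshape to wide format only when a tool needs it.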

854 questions
2
votes
0 answers

Python panel data regression with more than two fixed effects

I have a panel dataset and would like to run a regression with fixed effects. When using PanelOLS, two fixed effects work without problems. My code looks like this: df['countyCode'] = pd.Categorical(df['countyCode']) df['state'] =…
2
votes
1 answer

lag() from pglm seems to bug lag() from stats

As the title says: after I load pglm, lag stops working properly. library(pglm) c(1,2,3,4) %>% lag() The object is converted into a time series and is no longer compatible with tibbles. Even unloading pglm, the dependency for lag is still…
GiulioGCantone
  • 195
  • 1
  • 10
2
votes
2 answers

Combine long-format data frames with different lengths and convert to wide format

I want to combine data frames in long format that have different lengths because of the time variable (unbalanced panel data): set.seed(63) #function to create a data frame that includes id, time and x func1 <- function (size=5) { …
cliu
  • 933
  • 6
  • 13
2
votes
1 answer

Is there a way to derive the intercept of the firm fixed effect from the Python PanelOLS model?

I am estimating fixed effects for panel data using the Python statsmodels package. First, the data used in the analysis include X and Y observed over time for several companies. Below are some examples from the actual data, but…
2
votes
2 answers

Constraint on panel data to remove subjects using data.table

I have a panel dataset: data <- data.table(ID = c(1,1,1,1,2,2,3,3,3), year = c(1,2,3,4,1,2,1,2,3), score1 = c(90,78,92,69,86,73,82,85,91)) > data ID year score1 1: 1 1 90 2: 1 2 78 3: 1 3…
codemachino
  • 103
  • 9
2
votes
1 answer

Adding fixed effects regression line to ggplot

I am plotting panel data using ggplot and I want to add the regression line for my fixed effects model "fixed" to the plot. This is the current code: # Fixed Effects Model in plm fixed <- plm(progenyMean ~ damMean, data=finalDT, model= "within",…
codemachino
  • 103
  • 9
2
votes
1 answer

How to cluster by entity and year, in IV2SLS with linearmodels?

I am working on a panel model of African countries, with their democracy scores, log(gdp per capita) with three lags, and log(rain) amounts, also with three lags. I am trying to use IV2SLS to find the economic shocks in log(gdp per capita) (and its lags)…
Demosthenes
  • 111
  • 1
  • 9
2
votes
1 answer

How can I run a diff-in-diff with fixed effects in Python?

I tried searching everywhere, but couldn't find this: how can I run a diff-in-diff with fixed effects in Python? I already know how to run a diff-in-diff. For instance, let's consider the njmin dataset. This dataset considers the minimum wage…
dekio
  • 810
  • 3
  • 16
  • 33
2
votes
2 answers

R package alternative to plm for panel data

I have been googling quite a lot and only found the plm package as a comprehensive set of tools for handling and analyzing panel data in R. I am a novice in the field but will have to perform some analyses for my Master's thesis. Do you have any…
corkinabottle
  • 141
  • 1
  • 7
2
votes
1 answer

Loop over data.frame columns to generate dummy variable in R

I'm struggling with generating a variable for my current project. I'm using R version 4.0.1 on Windows. Data description I have unbalanced panel data in a data.table containing 243 variables (before running the commands) and 8,278 observations. The…
ilka
  • 59
  • 7
2
votes
1 answer

R data into panel data

I have several datasets which look roughly like these. I would like to transform them into a proper panel dataset to run regressions and random forests. However, I am struggling to put the years into a column. Thank you very much in advance First…
2
votes
1 answer

How to detect a change in a variable over time in panel-data using dplyr?

I am using panel data and have some discrepancies in the age variable. For some respondents, their age increases or decreases by more than 1 from one year to another, as we can see for respondents with ID numbers 2 and 3 below. This could be due to…
Jack
  • 813
  • 4
  • 17
2
votes
2 answers

Warning message in R plm package regarding indexes that have the same length but not the same content

I get a warning message from the plm package in R when I call summary() on a model: 1: In Ops.pseries(y, bX) : indexes of pseries have same length but not same content: result was assigned first operand's index 2: In Ops.pseries(y, bX) : …
noslomo
  • 58
  • 7
2
votes
1 answer

How to delete variables in panel data if all observations for a given year are NAs?

I have a dataframe like this, scores <-structure(list(student = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), .Label = c("adam", "mike", "rose"), class = "factor"), year = c(2001L, 2002L, 2003L, 2001L, 2002L, 2003L, 2001L, 2002L,…
2
votes
1 answer

R Panel data: Create new variable based on ifelse() statement and previous row

My question refers to the following (simplified) panel data, for which I would like to create some sort of xrd_stock. #Setup data library(tidyverse) firm_id <- c(rep(1, 5), rep(2, 3), rep(3, 4)) firm_name <- c(rep("Cosco", 5), rep("Apple", 3),…
chuesker
  • 25
  • 2