R, finding the worst predicted cases in a panel regression

Question

I am working on a model explaining country participation in an international institution. After fitting panel regressions with both a random-effects and a within (fixed-effects) model, I want to select the cases (i.e. countries) that the model predicts worst, to use them in qualitative research.

One idea was to predict the country values in each year, then compare those results with the actual participation and average the overall mismatch within one country to find the maximum deviation of real vs. predicted value across all countries. Can I use the normal predict() function for a plm Model, or is there a different approach?

score 0 · Accepted Answer · answered Jan 02 '20 at 03:01

"One idea was to predict the country values in each year, then compare those results with the actual participation and average the overall mismatch within one country to find the maximum deviation of real vs. predicted value across all countries. Can I use the normal predict() function for a plm Model?"

Yes, you can do that. Or you can just investigate the residuals directly. Here is a related example using the "Produc" dataset that comes with the plm package:

library(dplyr)
library(plm)
data("Produc", package = "plm")
# Within (fixed-effects) model; "within" is plm's default, stated here for clarity
zz <- plm(
  log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp,
  data  = Produc,
  index = c("state", "year"),
  model = "within")
# Attach residuals; rows line up because Produc is sorted by state and year with no NAs
Produc$resids <- as.numeric(residuals(zz))
Produc %>%
  group_by(state) %>%
  summarize(resids.meanAbs = mean(abs(resids))) %>%  # mean absolute residual per state
  arrange(desc(resids.meanAbs))                      # worst-predicted states first
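
Since the question mentions a random-effects model as well, here is a rough sketch of the same ranking for that case, using only plm and base R. The names re, res and worst_re are made up for this illustration. One caveat, as far as I know: for model = "random", residuals() returns the residuals of the transformed (quasi-demeaned) regression, so the ranking reflects fit on that scale rather than raw actual-minus-predicted values.

```r
library(plm)
data("Produc", package = "plm")

# Same specification as above, fitted as a random-effects model.
re <- plm(
  log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp,
  data  = Produc,
  index = c("state", "year"),
  model = "random")

# Residuals as a plain numeric vector. This assumes the rows of Produc are
# already sorted by state and year with no missing values, so they line up
# with the model's observations.
res <- as.numeric(residuals(re))

# Mean absolute residual per state, largest (worst fitted) first.
worst_re <- sort(tapply(abs(res), Produc$state, mean), decreasing = TRUE)
head(worst_re, 5)
```

Comparing the top of this list with the within-model ranking gives a quick check on whether the same units stand out under both specifications.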
