I've Googled around quite a bit and can't find documentation on this. I'm trying to estimate a feasible generalized least squares (FGLS) model on cross-sectional time series data in R. For example:

library(nlme)
foo <- gls(Y ~ factor(panel_ID) + X1 + X2, data = myData,
           correlation=corARMA(p=1), method='ML', na.action=na.pass)

When I run this (my data frame is quite large, which is why I don't include it here), I get the following error:

Error in array(c(X, y), c(N, ncol(X) + 1), list(row.names(dataMod), c(colnames(X), : length of 'dimnames' [1] not equal to array extent

Is anyone familiar enough with the internal workings of gls or the nlme package in general to tell me what I'm doing wrong here? Or suggest another way to go about this (I've also tried the plm package)?

Artem
Matt
  • I would try it on a subset of your data that doesn't contain `NA` values ... – Ben Bolker Mar 07 '12 at 13:52
  • @BenBolker Thanks, it's running now. Looks like it'll take a long time to converge, but at least it started. – Matt Mar 07 '12 at 18:01
  • If that turns out to work you are encouraged to post an answer to your own question, to help future readers find the answer to the problem ... – Ben Bolker Mar 07 '12 at 18:44

1 Answer

Credit for this answer goes to Ben Bolker.

The error occurs because your data contains NA values, and na.action=na.pass passes them through to the fitting code unchanged. The simulation below reproduces it:

library(nlme)

# Simulate a small panel data set
n <- 100
myData <- data.frame(panel_ID = sample(letters[1:3], n, replace = TRUE), X1 = rnorm(n), X2 = rnorm(n), Y = rnorm(n))
# Introduce a single NA into X1 at row 10
myData$X1[10] <- NA


foo <- gls(Y ~ factor(panel_ID) + X1 + X2, data = myData,
           correlation=corARMA(p=1), method='ML', na.action=na.pass)

This throws the same error:

Error in array(c(X, y), c(N, ncol(X) + 1L), list(row.names(dataMod), c(colnames(X), : length of 'dimnames' [1] not equal to array extent
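
Before refitting, it can help to confirm where the missing values are. The following is a minimal sketch that rebuilds the simulated data frame and counts NAs per column; the seed is an arbitrary choice for reproducibility and is not part of the original answer:

```r
# Rebuild the simulated data (seed is an assumption, chosen arbitrarily)
set.seed(42)
n <- 100
myData <- data.frame(panel_ID = sample(letters[1:3], n, replace = TRUE),
                     X1 = rnorm(n), X2 = rnorm(n), Y = rnorm(n))
myData$X1[10] <- NA

# Number of missing values in each column
na_counts <- colSums(is.na(myData))
print(na_counts)

# Indices of rows that contain at least one NA
na_rows <- which(!complete.cases(myData))
print(na_rows)
```

On real data, any nonzero entry in na_counts for a variable used in the model formula points to rows that need handling before calling gls.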

To eliminate the problem, remove the rows that contain NAs before fitting; the model then runs without error.

# remove NAs 
myData <- myData[!is.na(myData$X1), ]

foo <- gls(Y ~ factor(panel_ID) + X1 + X2, data = myData,
           correlation=corARMA(p=1), method='ML', na.action=na.pass)

summary(foo)

Output:

Generalized least squares fit by maximum likelihood
  Model: Y ~ factor(panel_ID) + X1 + X2 
  Data: myData 
       AIC      BIC    logLik
  280.8763 299.0421 -133.4382

Correlation Structure: AR(1)
 Formula: ~1 
 Parameter estimate(s):
       Phi 
-0.3496918 

Coefficients:
                        Value  Std.Error    t-value p-value
(Intercept)        0.21510948 0.14041692  1.5319343  0.1289
factor(panel_ID)b -0.27337750 0.25997687 -1.0515455  0.2957
factor(panel_ID)c -0.21930200 0.19704831 -1.1129352  0.2686
X1                -0.00604318 0.09469452 -0.0638177  0.9493
X2                 0.23870397 0.09754513  2.4471130  0.0163

 Correlation: 
                  (Intr) fctr(pnl_ID)b fctr(pnl_ID)c X1    
factor(panel_ID)b -0.649                                   
factor(panel_ID)c -0.787  0.443                            
X1                -0.065  0.148         0.044              
X2                -0.094  0.021        -0.011         0.117

Standardized residuals:
        Min          Q1         Med          Q3         Max 
-2.07929137 -0.77670150 -0.01062337  0.52685034  2.43978797 

Residual standard error: 0.9935003 
Degrees of freedom: 99 total; 94 residual
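
The filter above checks only X1. When NAs may appear in several model variables, a more general approach is to drop every row that is incomplete in any variable the formula uses. This is a sketch under assumptions: the second NA (in X2), the seed, and the names model_formula, model_vars and cleanData are illustrative and not from the original answer.

```r
# Simulated data with NAs in two different columns (illustrative)
set.seed(1)
n <- 100
myData <- data.frame(panel_ID = sample(letters[1:3], n, replace = TRUE),
                     X1 = rnorm(n), X2 = rnorm(n), Y = rnorm(n))
myData$X1[10] <- NA
myData$X2[25] <- NA

model_formula <- Y ~ factor(panel_ID) + X1 + X2

# Names of the data columns the formula refers to
model_vars <- all.vars(model_formula)

# Keep only rows that are complete in every model variable
cleanData <- myData[complete.cases(myData[, model_vars]), ]
nrow(cleanData)
```

cleanData can then be passed to gls in place of myData. Note that with a serial correlation structure such as corARMA(p=1), dropping rows makes the observations on either side of a gap adjacent, so it is worth checking whether that is acceptable for your time series.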
Artem