I'm trying to run plm to see effects of classes positive, negative and neutral on stock prices.

DATE <- c("1","2","3","4","5","6","7","1","2","3","4","5","6","7")
COMP <- c("A", "A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B", "B")
RET <- c(-2.0,1.1,3,1.4,-0.2, 0.6, 0.1, -0.21, -1.2, 0.9, 0.3, -0.1,0.3,-0.12)
CLASS <- c("positive", "negative", "neutral", "positive", "positive", "negative", "neutral", "positive", "negative", "negative", "positive", "neutral", "neutral", "neutral")
df <- data.frame(DATE, COMP, RET, CLASS, stringsAsFactors = FALSE)

df

#    DATE COMP   RET    CLASS
# 1     1    A -2.00 positive
# 2     2    A  1.10 negative
# 3     3    A  3.00  neutral
# 4     4    A  1.40 positive
# 5     5    A -0.20 positive
# 6     6    A  0.60 negative
# 7     7    A  0.10  neutral
# 8     1    B -0.21 positive
# 9     2    B -1.20 negative
# 10    3    B  0.90 negative
# 11    4    B  0.30 positive
# 12    5    B -0.10  neutral
# 13    6    B  0.30  neutral
# 14    7    B -0.12  neutral

If I run the model, the output shows only two of the estimates (neutral and positive). How can I see the estimate for the negative class? I think it has something to do with the dummy variables. But still, shouldn't there at least be an "Intercept" line for the negative class?

library(plm)

mymodel <- plm(RET ~ CLASS, data=df,
              index = c("DATE", "COMP"), 
              model="within", 
              effect="time")

summary(mymodel)

# Oneway (time) effect Within Model

# Call:
# plm(formula = RET ~ CLASS, data = df, effect = "time", model = "within", 
#     index = c("DATE", "COMP"))

# Balanced Panel: n=7, T=2, N=14

# Residuals :
#    Min. 1st Qu.  Median 3rd Qu.    Max. 
# -2.1500 -0.4620 -0.0791  0.7540  1.9300 

# Coefficients :
#               Estimate Std. Error t-value Pr(>|t|)
# CLASSneutral   0.35818    0.81581  0.4390    0.670
# CLASSpositive -0.56418    0.81581 -0.6916    0.505

# Total Sum of Squares:    16.79
# Residual Sum of Squares: 14.694
# R-Squared      :  0.12486 
#       Adj. R-Squared :  0.089183 
# F-statistic: 0.713347 on 2 and 10 DF, p-value: 0.5133

Thank You!

landroni
cptn

1 Answer

As with most models with categorical covariates, the first level is used as the reference level. Here "negative" is the reference category because, by default, R sorts the levels of a factor alphabetically. With categorical data you can't separate the group-specific mean from the mean for the reference category; they are combined into the intercept term. So the coefficient for CLASSneutral isn't the effect of the neutral class, it's the difference between the effect of neutral and the effect of negative. The same goes for CLASSpositive: it's the difference between the effect of positive and the effect of negative.

Because a within model gives each group its own intercept rather than one overall intercept, I'm assuming that's why no intercept is printed in the summary.
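To see where the reference level comes from, here is a minimal base R sketch using the CLASS values from the question. The names `cls` and `cls_neutral` are made up for illustration; `relevel()` is one way to pick a different baseline if you would rather compare against "neutral".

```r
# CLASS values copied from the question
CLASS <- c("positive", "negative", "neutral", "positive", "positive",
           "negative", "neutral", "positive", "negative", "negative",
           "positive", "neutral", "neutral", "neutral")

# Converting to a factor sorts the levels alphabetically,
# so "negative" comes first and becomes the reference level
cls <- factor(CLASS)
levels(cls)

# The design matrix has an intercept plus one dummy per non-reference level
colnames(model.matrix(~ cls))

# relevel() changes the baseline; here "neutral" becomes the reference
cls_neutral <- relevel(cls, ref = "neutral")
colnames(model.matrix(~ cls_neutral))
```

Changing the reference level only changes which differences are reported; the fitted values stay the same.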

This is not unique to plm. The same thing would happen with a standard lm.
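A small sketch of the `lm` case, using the RET and CLASS values from the question and ignoring the panel structure entirely (so the numbers will not match the plm output). With a single factor, the intercept should be the mean return of the "negative" class and the other coefficients the differences from it. Dropping the intercept with `0 +` is one way to get a coefficient for every class, but that trick applies to this plain `lm` only.

```r
# Data copied from the question
RET <- c(-2.0, 1.1, 3, 1.4, -0.2, 0.6, 0.1,
         -0.21, -1.2, 0.9, 0.3, -0.1, 0.3, -0.12)
CLASS <- c("positive", "negative", "neutral", "positive", "positive",
           "negative", "neutral", "positive", "negative", "negative",
           "positive", "neutral", "neutral", "neutral")

# Default coding: intercept plus CLASSneutral and CLASSpositive
fit <- lm(RET ~ CLASS)
coef(fit)

# Plain per-class means for comparison
class_means <- tapply(RET, CLASS, mean)
class_means

# Intercept matches the "negative" mean; the other two are differences from it
all.equal(as.numeric(coef(fit)),
          as.numeric(c(class_means["negative"],
                       class_means["neutral"] - class_means["negative"],
                       class_means["positive"] - class_means["negative"])))

# Without an intercept, each class gets its own coefficient (its mean)
fit_means <- lm(RET ~ 0 + CLASS)
coef(fit_means)
```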

MrFlick
  • @MrFlick: Do you know how to build in an intercept, so I can see what the return is in the "negative" case? – cptn May 22 '14 at 09:59
  • @cptn As far as I can tell, your model calculates a separate intercept term for every individual. There is no single intercept term to estimate as in the case of simple linear regression. What you are asking for doesn't make sense. You can't get effects for every level of a factor. – MrFlick May 22 '14 at 14:28
  • use fixef(mymodel) to see the individual intercepts – Helix123 May 11 '15 at 10:23
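
For anyone who wants to see those intercepts without plm, here is a rough base R sketch of the equivalent dummy-variable (LSDV) regression. One thing to keep in mind: plm reads the first entry of `index` as the individual and the second as time, so with `index = c("DATE", "COMP")` and `effect = "time"` the model in the question has one intercept per COMP (this matches the "n=7, T=2" line in the summary). The name `lsdv` is made up for illustration, and the claim that the COMP coefficients correspond to what `fixef(mymodel)` reports is my assumption; I have not run it against plm here.

```r
# Data copied from the question
COMP <- rep(c("A", "B"), each = 7)
RET <- c(-2.0, 1.1, 3, 1.4, -0.2, 0.6, 0.1,
         -0.21, -1.2, 0.9, 0.3, -0.1, 0.3, -0.12)
CLASS <- c("positive", "negative", "neutral", "positive", "positive",
           "negative", "neutral", "positive", "negative", "negative",
           "positive", "neutral", "neutral", "neutral")
df <- data.frame(COMP, RET, CLASS, stringsAsFactors = FALSE)

# One dummy per COMP (no overall intercept) plus the CLASS dummies
lsdv <- lm(RET ~ 0 + COMP + CLASS, data = df)

# CLASSneutral and CLASSpositive should match the plm summary in the question;
# COMPA and COMPB are the group intercepts for the reference class "negative"
coef(lsdv)
```

So the "negative" class does not get one number in this model; it gets one baseline per group, and the CLASS coefficients are shifts relative to that baseline.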