-1

My question is very similar to the one posted here. I am trying to use reshape data to turn CRSP data from long format to wide. I have the tickers of S&P companies stacked in one column and I want to make them the columns instead. In my dataframe I have three columns: "DATE", "TICKER", and "PRC". (PRC is price). I would like instead to have the number of columns equal my unique tickers. I use:

dcast(df, DATE~TICKER, value.var="PRC"), fun.aggregate=length, fill=NaN, drop=FALSE)

This returns the perfect dataframe sorting the time series date on the left and tickers on the right, but just like the question I linked to, I get 1's and NaN's as my values. I like the NaN's, but I would want the 1's to be replaced with the prices of my securities aka the PRC.

I get 1's and 0's because some tickers are missing data, which makes sense as some stocks get into and out of the S&P 500. I just have no idea why the value.var command is being blatantly ignored.

how to format a column with duplicate dates in r

ah bon
  • 9,293
  • 12
  • 65
  • 148
Bobe Kryant
  • 2,050
  • 4
  • 19
  • 32

2 Answers2

1

Without a working dataset, here is all I can really provide.

data.melt <- melt(df, id.vars=c('TICKER','DATE'), measure.vars='PRC')
data.cast <- dcast(data.melt, TICKER~DATE, sum)
branch.lizard
  • 595
  • 1
  • 5
  • 15
1

The value.var isn't being ignored. The problem is that you're specifying fun.aggregate=length. That means that it isn't going to give you the actual values but the result of an aggregate function, in this case length so you're getting 1s and 0s because you either have missing data (a 0) or 1 data point.

Try taking out fun.aggregate. Sometimes it's fussy when there's missing data so you might get an error so then bring back fun.aggregate but replace length with sum or mean

Dean MacGregor
  • 11,847
  • 9
  • 34
  • 72