Since you're using the quantmod
package, you can use getSplits() function to determine which splits a stock has had within a certain timeframe. Since you're dealing with a large number of stocks, you can use a custom function to get what you want.
getSymbolSplit <- function(symbol,xts,date) {
splitCheck <- getSplits(symbol,from = date)
if(anyNA(splitCheck, recursive = FALSE)){
xts$Split <- 0
} else {
xts$Split <- 1
}
return(xts)
}
Once you have that function, you can quickly add a split to the existing stocks data. Example:
getSymbols('GOOG')
GOOG <- getSymbolSplit('GOOG',GOOG,'1993-01-01')
The getSymbols() function creates the xts named GOOG, so our function checks if any splits have happened since 1993-01-01 (yes) and adds a column Split
with the value of 1.
getSymbols('REGN')
REGN <- getSymbolSplit('REGN',REGN,'1993-01-01')
Same deal, but REGN has had no splits since 1993 and the column has a value of 0.
Now you have a clear binary variable for grouping between firms that have had a split in your given timeframe.
As a warning, I encountered a problem with BRK-A. R does not normally permit names that include a '-', and the function breaks when trying to pass an xts named BRK-A. If you have stocks that use a - in their symbol, I recommend you rename them before using them. This function is not the only place where the dash could cause problems.