I am very new to R so apologies if I get any of the terminology wrong when I explain this problem.

I have a set of daily returns data in a csv file that I have managed to convert to an xts object. The data is in the format:

           HighYield..EUR. MSCI.World..EUR.
2002-01-31          0.0144           0.0031    
2002-02-01          0.0056          -0.0132       
2002-02-02          0.0373           0.0356       
2002-02-03         -0.0167          -0.0644      
2002-02-04         -0.0062          -0.0332      
2002-02-05         -0.0874          -0.1112 
...

I want to create a script that will find the first business day of the month (from the range of values in the index) and then create a new xts object with these returns in it.

For example, after the script has run I would have an xts object in the format:

           HighYield..EUR. MSCI.World..EUR.
2002-01-31          0.0144           0.0031    
2002-02-28          0.0011          -0.0112       
2002-03-31          0.0222           0.0224       
2002-04-30         -0.0333          -0.0223      
2002-05-30         -0.0011          -0.0012      
2002-06-30         -0.0888          -0.0967 
...

Can someone help me, please? If possible, please also explain what each part of the script is doing.

Joshua Ulrich
GreenyMcDuff
  • Your example shows the last day of each month, but no matter. There are lots of ways to pull specific dates, up to such kludges as (pseudocode) `if (month(dateval[i]) > month(dateval[i-1])) then { copy the i-th row to output }`. Start by taking a look at the package `lubridate` for useful date-related functions. – Carl Witthoft Jun 12 '12 at 11:19
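
The row-by-row comparison in the comment above can be written without a loop. The following is a minimal sketch, not the commenter's code: the toy data and the names `ym`, `is_first`, and `first_of_month` are made up for illustration. It compares year-month labels rather than bare month numbers, so a December-to-January rollover is also detected.

```r
library(xts)

# Hypothetical toy data: ten consecutive calendar days spanning a month boundary.
dates <- seq(as.Date("2002-01-27"), by = "day", length.out = 10)
x <- xts(seq_along(dates) / 100, order.by = dates)

# Year-month label for every row, e.g. "2002-01".
ym <- format(index(x), "%Y-%m")

# A row starts a new month when its label differs from the previous row's.
# The first row has no predecessor, so it is always kept.
is_first <- c(TRUE, ym[-1] != ym[-length(ym)])

# Keep only the rows flagged as the first observation of a month.
first_of_month <- x[is_first]
```

Because the first row is always kept, the first entry of the result is only the first observation of its month if the data actually start at the beginning of a month.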

1 Answer

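With the xts package loaded (it supplies the `split` method for xts objects and the `first` function), you can do this in one line using base R's `do.call`, `rbind`, and `lapply`: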

 library(xts)
 data(sample_matrix)
 x <- as.xts(sample_matrix)
 do.call(rbind, lapply(split(x, "months"), first))

To explain what each step is doing:

 # Split the xts object into a list with an element for each month.
 x1 <- split(x, "months")
 # Loop over the list (x1) and call the first() function on each element.
 # This returns a new list where each element only contains the first observation
 # from each respective element in x1.
 x2 <- lapply(x1, first)
 # Call rbind() with all the elements of x2 as arguments to rbind()
 # Same as rbind(x2[[1]], x2[[2]], ..., x2[[N]])
 x3 <- do.call(rbind, x2)
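
As an alternative to the code above, here is a hedged sketch that uses `endpoints()` from xts to index the rows directly instead of splitting and recombining. The variable names `ep`, `first_rows`, and `last_rows` are illustrative. It should select the same rows as the split/lapply version on this sample data, and it also shows how to pull the last observation of each month, which is what the example output in the question appears to contain.

```r
library(xts)
data(sample_matrix)
x <- as.xts(sample_matrix)

# endpoints() returns 0 followed by the row number of the last
# observation in each month.
ep <- endpoints(x, on = "months")

# Dropping the final endpoint and adding 1 gives the row number of the
# first observation in each month.
first_rows <- x[head(ep, -1) + 1]

# The last observation in each month, for comparison.
last_rows <- x[ep[-1]]
```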
Joshua Ulrich
  • Joshua, you are a scholar and a gentleman. I am in your debt. – GreenyMcDuff Jun 12 '12 at 13:41
  • If we assume the "first business day" to be literally exclusive of Saturdays and Sundays, shouldn't we use `do.call(rbind, lapply(split(x[.indexwday(x) %in% 1:5], "months"), first))`? Or is there an even better way with "xts" to do this? – A5C1D2H2I1M1N2O1R2T1 Oct 22 '12 at 07:06
  • @mrdwab: yes, that's a good point. My answer assumes the object only contains business days. Yours is better, but still doesn't exclude any potential holidays. The [timeDate](http://cran.r-project.org/web/packages/timeDate/index.html) package has good functions for that. – Joshua Ulrich Oct 22 '12 at 11:47
  • @JoshuaUlrich, cool. I was just curious. I'll check out the package you suggested, though I'm going to go out on a limb and guess that Indian holidays are not part of that package ;) – A5C1D2H2I1M1N2O1R2T1 Oct 22 '12 at 11:55
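
To make the weekday and holiday points from the comments concrete, here is a minimal sketch. The weekday filter is the one suggested in the comments. The `holidays` vector is a made-up placeholder, not a real calendar, and would need to be replaced with dates appropriate to your market, for example ones derived from the timeDate package mentioned above.

```r
library(xts)
data(sample_matrix)
x <- as.xts(sample_matrix)

# Keep Monday (1) through Friday (5); .indexwday() uses 0 for Sunday.
weekdays_only <- x[.indexwday(x) %in% 1:5]

# ASSUMPTION: hypothetical holiday dates, for illustration only.
holidays <- c("2007-02-01", "2007-05-01")

# Drop any remaining rows whose date appears in the holiday list.
row_dates <- format(index(weekdays_only), "%Y-%m-%d")
business_days <- weekdays_only[!row_dates %in% holidays]

# First remaining observation in each month.
first_business_day <- do.call(rbind, lapply(split(business_days, "months"), first))
```

Filtering before splitting matters: if a month starts on a weekend or a listed holiday, `first` then picks the next remaining row instead of the excluded one.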