
My question is closely related to the one asked here: Pull Return from first business day of the month from XTS object using R.

Instead of extracting the first day of each month, I want to extract, say, the 10th data point of each month. How can I do this?

A5C1D2H2I1M1N2O1R2T1
mchangun

2 Answers


Using the same example data from the question you've linked to, you can do some basic subsetting.

Here's the sample data:

library(xts)
data(sample_matrix)
x <- as.xts(sample_matrix)

Here's the subsetting:

x[format(index(x), "%d") == "10"]
#                Open     High      Low    Close
# 2007-01-10 49.91228 50.13053 49.91228 49.97246
# 2007-02-10 50.68923 50.72696 50.60707 50.69562
# 2007-03-10 49.79370 49.88984 49.70385 49.88698
# 2007-04-10 49.55704 49.78776 49.55704 49.76984
# 2007-05-10 48.83479 48.84549 48.38001 48.38001
# 2007-06-10 47.74899 47.74899 47.28685 47.28685

Is this what you were looking for?


Using %in% would give you some more flexibility. For instance, if you wanted the tenth, eleventh, and twelfth days of each month, you could use x[format(index(x), "%d") %in% c("10", "11", "12")] instead.


Update

If, as in your updated question, you want the tenth data point of each month, split the series by month and take the tenth row of each piece with an anonymous function:

do.call(rbind, lapply(split(x, "months"), function(x) x[10]))
#                Open     High      Low    Close
# 2007-01-11 49.88529 50.23910 49.88529 50.23910
# 2007-02-10 50.68923 50.72696 50.60707 50.69562
# 2007-03-10 49.79370 49.88984 49.70385 49.88698
# 2007-04-10 49.55704 49.78776 49.55704 49.76984
# 2007-05-10 48.83479 48.84549 48.38001 48.38001
# 2007-06-10 47.74899 47.74899 47.28685 47.28685

Note that the first row is the eleventh day of the month, because the data actually starts on January 2, 2007.

x[1, ]
#                Open     High      Low    Close
# 2007-01-02 50.03978 50.11778 49.95041 50.11778
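
If some months in your own series might contain fewer than ten observations, it may be safer to drop those months explicitly rather than rely on how out-of-range indexing behaves. The following is only a sketch under that assumption; `nth_per_month` is a hypothetical helper name, not part of xts.

```r
library(xts)
data(sample_matrix)
x <- as.xts(sample_matrix)

# Hypothetical helper (not an xts function): return the n-th observation
# of each month, skipping months that have fewer than n observations.
nth_per_month <- function(x, n) {
  pieces <- split(x, f = "months")                        # one xts object per month
  long_enough <- Filter(function(m) nrow(m) >= n, pieces) # keep months with >= n rows
  do.call(rbind, lapply(long_enough, function(m) m[n, ])) # n-th row of each month
}

tenth <- nth_per_month(x, 10)
thirty_first <- nth_per_month(x, 31)  # only months with at least 31 observations
```

With the sample data, `tenth` should match the output shown above. Note that if no month has at least `n` observations, `do.call(rbind, list())` returns `NULL` rather than an empty xts object.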
A5C1D2H2I1M1N2O1R2T1
  • Note that this doesn't take into account things like business days and so on, just the tenth day itself. – A5C1D2H2I1M1N2O1R2T1 Oct 22 '12 at 06:20
  • I've re-edited my question to refer to the 10th data point. The problem with using the tenth day of the month is that, for some months, that day doesn't exist in my time series. – mchangun Oct 22 '12 at 06:42

xts has some built-in functions for these types of subsets.

> library(xts)
> data(sample_matrix)
> x <- as.xts(sample_matrix)
> x[.indexmday(x) == 10]
               Open     High      Low    Close
2007-01-10 49.91228 50.13053 49.91228 49.97246
2007-02-10 50.68923 50.72696 50.60707 50.69562
2007-03-10 49.79370 49.88984 49.70385 49.88698
2007-04-10 49.55704 49.78776 49.55704 49.76984
2007-05-10 48.83479 48.84549 48.38001 48.38001
2007-06-10 47.74899 47.74899 47.28685 47.28685

See the help page ?indexClass for a list of all of them.
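
As a rough illustration of how these accessors combine with ordinary logical subsetting, the sketch below selects several days of the month at once. It assumes the same sample data as above; the variable names are only illustrative.

```r
library(xts)
data(sample_matrix)
x <- as.xts(sample_matrix)

# .indexmday() returns the day of the month (1-31) for each observation,
# so %in% can select several calendar days in one step.
mid_month <- x[.indexmday(x) %in% 10:12]

# The same selection written with format(), for comparison.
mid_month_fmt <- x[format(index(x), "%d") %in% c("10", "11", "12")]
```

Both forms select by calendar day, so neither returns the tenth observation of a month when some days are missing from the series.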

GSee
  • I thought about this after I had added my update, but it doesn't actually address their updated question, where they ask for the tenth data point (assuming some missing data), not the tenth day of each month. – A5C1D2H2I1M1N2O1R2T1 Oct 23 '12 at 09:06