I'm using R's ff package with an ffdf object named MyData (dim = c(10819740, 16)). I'm trying to split the date variable into Day, Month and Year and add these three variables to the existing ffdf MyData.

For instance, my date column is named SalesReportDate and has VirtualVmode and PhysicalVmode = double after I converted it with as.Date(, format = "%m/%d/%Y").

Example values of SalesReportDate are as follows:

> B
  SalesReportDate
1       2013-02-01
2       2013-05-02
3       2013-05-04
4       2013-10-06
5       2013-15-10
6       2013-11-01
7       2013-11-03
8       2013-30-02
9       2013-12-12
10      2014-01-01

I've referred to "Split date into different columns for year, month and day" and tried to apply it, but I keep getting error messages.

So, is there any way for me to do this? Thanks in advance.

MizaUnic
  • require(ffbase); MyData$SalesReportDateYear <- with(MyData["SalesReportDate"], format(SalesReportDate, "%Y"), by = 250000) –  Feb 06 '14 at 11:36
  • It works, thank you so much @jwijffels, I really appreciate it. But may I ask why we need to add by=250000 and what it means? I'm new to this. – MizaUnic Feb 07 '14 at 01:57
  • You don't need to add the by argument, but if you specify it, it will loop over the data in chunks of 250000 records in this example. If you don't specify it, it will look at the RAM limit you set in getOption("ffbatchbytes") to see how much can be processed in one chunk without exhausting your RAM. –  Feb 07 '14 at 08:28
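
To make the idea of chunking concrete, here is a minimal base R sketch of processing a date vector a few rows at a time. It is only an illustration of the concept, not how ffbase implements `by` internally; the sample vector and the name `chunk_size` are made up for this example.

```r
# Base R sketch of chunk-wise processing (no ff or ffbase required).
# Hypothetical sample data standing in for the SalesReportDate column.
dates <- as.Date(c("2013-02-01", "2013-05-02", "2013-05-04",
                   "2013-10-06", "2013-11-01", "2013-12-12", "2014-01-01"))

chunk_size <- 3L  # plays the role of by = 250000, scaled down

# First row index of every chunk: 1, 4, 7
starts <- seq(1L, length(dates), by = chunk_size)

years <- character(length(dates))
for (s in starts) {
  # Row indices of the current chunk; the last chunk may be shorter
  idx <- s:min(s + chunk_size - 1L, length(dates))
  # Only this slice of the data is formatted at a time
  years[idx] <- format(dates[idx], "%Y")
}

print(years)
```

A smaller chunk size means less memory is needed at any one moment, at the cost of more loop iterations.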

1 Answer

Credit to @jwijffels for this great solution:

# ffbase provides with() for ffdf objects
require(ffbase)

# Each call extracts one date component; by = 250000 processes the rows in chunks of 250000
MyData$SalesReportDateYear <- with(MyData["SalesReportDate"], format(SalesReportDate, "%Y"), by = 250000)

MyData$SalesReportDateMonth <- with(MyData["SalesReportDate"], format(SalesReportDate, "%m"), by = 250000)

MyData$SalesReportDateDay <- with(MyData["SalesReportDate"], format(SalesReportDate, "%d"), by = 250000)
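
Note that format() returns character strings such as "02", not numbers. As a small sketch of what the three calls produce, and of how the parts could be turned into integers, here is the same idea on an ordinary in-memory data frame. The data frame `B` and its three rows are hypothetical stand-ins for the real ffdf.

```r
# Small in-memory stand-in for the SalesReportDate column (hypothetical data).
B <- data.frame(
  SalesReportDate = as.Date(c("2013-02-01", "2013-05-02", "2014-01-01"))
)

# format() yields character strings ("2013", "02", "01");
# as.integer() converts them to numbers (2013, 2, 1).
B$Year  <- as.integer(format(B$SalesReportDate, "%Y"))
B$Month <- as.integer(format(B$SalesReportDate, "%m"))
B$Day   <- as.integer(format(B$SalesReportDate, "%d"))

print(B)
```

The same as.integer() wrapper should presumably also work inside the with() calls on the ffdf, though that is an assumption and was not part of the original solution.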
MizaUnic