I have a large file with about 200,000 lines, and I'd like to read it in such a way that I can use the zoo package to plot it and to subset it by date, month, and time. The first column is the modified Julian date and the second is temperature.

I'd appreciate any help. The file looks like:

4812663507.000000,1.76438
4812663512.000000,1.65121
4812663517.000000,1.60362
4812663522.000000,1.51509
Joshua Ulrich
user1513007
    According to http://en.wikipedia.org/wiki/Julian_day , the current Modified Julian Date is 56148.74726. The current Julian Date is 2456149.24726. It looks to me like those numbers in the first column must be in some completely different system. – Jim Lewis Aug 09 '12 at 18:47
  • As Jim has pointed out, it's not clear what you have, but once that is determined, `library(zoo); z <- read.zoo("myfile.dat", sep = ",", FUN = f); plot(z)` will do it, where `f` is a function you specify to convert whatever it is you have to one of R's date or date/time classes. You can just omit `FUN = f` and it will use the numbers in column 1 as your date/time. Maybe the astroFuns R package has something relevant. – G. Grothendieck Aug 11 '12 at 11:27
  • Hello, thanks for the comments. The first column is MJD-TAI in seconds, so I can divide by 86400 to get MJD in days. Once I have it in days, how can I use zoo or other packages to plot or analyse the files, truncating by hours, minutes, days, etc.? – user1513007 Aug 13 '12 at 17:18

1 Answer

From the comments, the time is in some sort of MJD seconds, so you can convert it into a time index with Gabor's hint:

library(zoo)
z <- read.zoo("myfile.dat", sep = ",",
              FUN = function(x){as.POSIXct(x,origin='1858-11-17',tz='UTC')})

Where 1858-11-17 is the MJD epoch per http://en.wikipedia.org/wiki/Julian_day

Alternatively, you could specify the origin and add the seconds since then:

z <- read.zoo("myfile.dat", sep = ",",
              FUN = function(x){as.POSIXct('1858-11-17',tz='UTC')+x})
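
To sanity-check the conversion without the file, here is a minimal base R sketch applied to the four sample values from the question. The variable names are illustrative only, and it assumes the TAI seconds can be treated as if they were UTC, so the leap-second offset between the two scales is ignored:

```r
# Sample first-column values from the question (MJD expressed in seconds).
mjd_seconds <- c(4812663507, 4812663512, 4812663517, 4812663522)

# MJD day 0 starts at 1858-11-17 00:00.
# Assumption: leap seconds (TAI vs UTC) are ignored here.
mjd_epoch <- as.POSIXct("1858-11-17", tz = "UTC")
timestamps <- mjd_epoch + mjd_seconds

# Dividing the seconds by 86400 gives the MJD in days.
mjd_days <- mjd_seconds / 86400

print(format(timestamps, "%Y-%m-%d %H:%M:%S", tz = "UTC"))
print(floor(mjd_days))
```

If the printed dates fall in the period when the measurements were taken, the epoch and the unit are probably right.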

Then it seems you want the data aggregated at various time granularities:

plot(aggregate(z,cut(time(z),breaks='year'   ),mean)) 
plot(aggregate(z,cut(time(z),breaks='quarter'),mean)) 
plot(aggregate(z,cut(time(z),breaks='month'  ),mean))
plot(aggregate(z,cut(time(z),breaks='day'    ),mean)) 
plot(aggregate(z,cut(time(z),breaks='hour'   ),mean))
plot(aggregate(z,cut(time(z),breaks='6 min'  ),mean)) 
plot(aggregate(z,cut(time(z),breaks='min'    ),mean)) 
plot(aggregate(z,cut(time(z),breaks='10 sec' ),mean)) 
plot(aggregate(z,cut(time(z),breaks='sec'    ),mean)) 
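
For the "truncate by date and time" part of the question, a hedged sketch follows. It uses synthetic data in place of myfile.dat, and it swaps `cut()` for `trunc()` so that the aggregated index stays a date-time rather than a factor. That is a different choice from the calls above, not a correction of them. The names `hour_of` and `one_hour` are made up for illustration:

```r
library(zoo)  # assumes the zoo package is installed

# Synthetic stand-in for myfile.dat: 1000 readings, 5 seconds apart.
start_time <- as.POSIXct("2011-05-21 02:58:27", tz = "UTC")
times <- start_time + seq(0, by = 5, length.out = 1000)
temps <- 1.5 + 0.2 * sin(seq_along(times) / 50)
z <- zoo(temps, order.by = times)

# Hourly means; trunc() keeps the new index as POSIXct instead of a factor.
hour_of <- function(t) as.POSIXct(trunc(t, units = "hours"))
hourly <- aggregate(z, by = hour_of, FUN = mean)

# Keep only the readings that fall between 03:00:00 and 03:59:59 UTC.
one_hour <- window(z,
                   start = as.POSIXct("2011-05-21 03:00:00", tz = "UTC"),
                   end   = as.POSIXct("2011-05-21 03:59:59", tz = "UTC"))

print(hourly)
```

`window()` accepts any `start` and `end` of the same class as the index, so the same pattern should work for selecting a day or a month.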
Dave X