I have a set of NetCDF files organised by date in my directory (each file holds one day of data). I list all the files in R using:

library(RNetCDF)
## pattern is a regular expression, not a glob: match names ending in ".nc"
files <- list.files(pattern = '\\.nc$', full.names = TRUE)

When I run the code, R reads 2014 and 2013, and then parts of 2010 end up at the end (see the sample R output below):

"./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820223.SUB.nc"
"./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820224.SUB.nc"
"./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820225.SUB.nc"

"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130829.SUB.nc"
"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130830.SUB.nc"
"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130831.SUB.nc"

"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100626.SUB.nc"
"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100827.SUB.nc"
"./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100828.SUB.nc"

I am trying to generate a daily time series from these files using a loop, so when I apply the rest of my code, the data from June to August 2010 end up at the end of the daily time series. I suspect this has to do with how the files are listed in R.

Is there any way to list files in R and ensure they are ordered by date?

nee

1 Answer

Here are your files, unsorted:

paths <- c("./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820223.SUB.nc",
           "./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820224.SUB.nc",
           "./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820225.SUB.nc",
           "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130829.SUB.nc",
           "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130830.SUB.nc",
           "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130831.SUB.nc",
           "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100626.SUB.nc",
           "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100827.SUB.nc",
           "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100828.SUB.nc")

I'm using a regular expression to extract the 8 digits of the date, YYYYMMDD. You should be able to sort by that string of digits directly, but you can also convert them into dates:

## matches ...Nx.<number of digits = 8>... and captures the stuff in <>
## and saves this match to the first capture group, \\1
pattern <- '.*Nx\\.(\\d{8}).*'

gsub(pattern, '\\1', paths)
# [1] "19820223" "19820224" "19820225" "20130829" "20130830" "20130831"
# [7] "20100626" "20100827" "20100828"

sort(gsub(pattern, '\\1', paths))
# [1] "19820223" "19820224" "19820225" "20100626" "20100827" "20100828"
# [7] "20130829" "20130830" "20130831"

## not necessary to convert that into dates but you can
as.Date(sort(gsub(pattern, '\\1', paths)), '%Y%m%d')
# [1] "1982-02-23" "1982-02-24" "1982-02-25" "2010-06-26" "2010-08-27"
# [6] "2010-08-28" "2013-08-29" "2013-08-30" "2013-08-31"
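Sorting the strings directly works because the stamps are fixed-width and zero-padded, so alphabetical order and chronological order coincide. A quick check, using a made-up vector `stamps` that holds four of the dates above in shuffled order:

```r
## fixed-width, zero-padded YYYYMMDD strings sort the same way as real dates
stamps <- c("20130829", "19820223", "20100827", "20100626")

by_string <- order(stamps)                               # order as text
by_date   <- order(as.Date(stamps, format = "%Y%m%d"))   # order as dates

identical(by_string, by_date)
# [1] TRUE
```

This would not hold for stamps written without leading zeros (for example "2010626"), in which case converting to dates first is the safer route.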

Then order the original paths:

## so you can use the above to order the paths
paths[order(gsub(pattern, '\\1', paths))]
# [1] "./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820223.SUB.nc"
# [2] "./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820224.SUB.nc"
# [3] "./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820225.SUB.nc"
# [4] "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100626.SUB.nc"
# [5] "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100827.SUB.nc"
# [6] "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100828.SUB.nc"
# [7] "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130829.SUB.nc"
# [8] "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130830.SUB.nc"
# [9] "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130831.SUB.nc"
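
For use inside a loop, the steps above could be wrapped in a small helper that returns the paths together with their dates. This is only a sketch: `sort_by_date` and the column names `path` and `date` are made up here, and it assumes every file name contains `Nx.` followed by an 8-digit date. It also guards against one pitfall: `gsub()` returns its input unchanged when the pattern does not match, which would otherwise quietly break the ordering.

```r
## hypothetical helper: order paths by the YYYYMMDD stamp that follows "Nx."
sort_by_date <- function(paths, pattern = '.*Nx\\.(\\d{8}).*') {
  ok <- grepl(pattern, paths)
  if (!all(ok)) {
    ## fail loudly rather than sort on an unmodified file name
    stop("no date found in: ", paste(paths[!ok], collapse = ", "))
  }
  dates <- as.Date(gsub(pattern, '\\1', paths), format = '%Y%m%d')
  ord <- order(dates)
  data.frame(path = paths[ord], date = dates[ord], stringsAsFactors = FALSE)
}

## three of the paths above, deliberately out of order
files <- sort_by_date(c("./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20130829.SUB.nc",
                        "./MERRA301.prod.assim.tavg1_2d_lnd_Nx.20100626.SUB.nc",
                        "./MERRA100.prod.assim.tavg1_2d_lnd_Nx.19820223.SUB.nc"))
files$date
# [1] "1982-02-23" "2010-06-26" "2013-08-29"
```

With real data, the argument would be the output of `list.files(pattern = '\\.nc$', full.names = TRUE)`, and the loop can then walk over `files$path` in date order while `files$date` supplies the time axis.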
rawr