I am working with a very large netCDF file in three dimensions (lat/lon/time). The resolution is 300 meters and the time variable has 25 steps, which leads to 64800x129600x25 cells.

The file contains a single integer variable (ranging from -36 to 120), but each integer is really a level of an underlying factor, which is the problem. It is a land cover data set, so, for example, -20 means the cell has the land type Forest and 10 means the cell is covered by water.

I want to reshape the netCDF file such that there is an additional dimension which represents every factor level of the original variable. And the variable would then be just a 1 or 0 per cell indicating the presence of every factor level at a certain lat/lon/time. The dimensions would then be lat/lon/time/land type.

Here is an example data set that does not concern land type but is small enough to be used for testing, along with some code to read it in:

library(ncdf4)
# Download the data
download.file("http://schubert.atmos.colostate.edu/~cslocum/code/air.sig995.2012.nc", 
          mode="wb", destfile = "test.nc")
test.ncdf <- nc_open("test.nc", write=TRUE)
# See the lon,lat,time dimensions
print(test.ncdf)
tmp.array <- ncvar_get(test.ncdf, varid="air")

I'm not sure whether the raster package is better suited to this task. For very small netCDF files I have achieved the intended result to some extent by extracting the data and then stacking it as a data.frame.

Any help or pointing in the right direction would be greatly appreciated. Thanks in advance.

Community

1 Answer

If I understand correctly, you want a set of fields, one per type, that are 1 or 0 as a function of lat/lon/time. For example, if you are looking at forest, you want an array that is 1 where the factor equals -20 and 0 otherwise.

I know you want this as a 4-dimensional array, and for that I expect you will need R, since you tagged the question with it. But if you don't mind having a series of 3D arrays, one per type, a quick and easy way to do this is to use CDO to process the integer array:

cdo eqc,-20 air.sig995.2012.nc test.nc

The issue with this is that the output variable keeps the name of the input variable (you don't say what it is called, so I refer to it as sfctype), so you would need to change the metadata with nco.
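For illustration, the two-step version (mask, then rename) could look roughly like the sketch below. The input file name landcover.nc and the variable name sfctype are placeholders, since the real names are not given in the question. The commands are only printed by default so they can be checked before anything is run.

```shell
#!/bin/sh
# Sketch only: landcover.nc and sfctype are assumed placeholder names.
infile="landcover.nc"
old_name="sfctype"   # variable name in the input file (assumed)
new_name="forest"    # name wanted for the 0/1 mask

# Dry run by default: commands are printed, not executed.
# Run the script with RUN= (empty) to execute them for real.
RUN="${RUN-echo}"

# Step 1: 1 where the value equals -20, 0 elsewhere (name stays sfctype).
$RUN cdo eqc,-20 "$infile" "${new_name}.nc"

# Step 2: rename the variable in place with nco's ncrename.
$RUN ncrename -v "${old_name},${new_name}" "${new_name}.nc"
```

The rename happens in place here, so forest.nc ends up holding a variable called forest.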

Therefore a better way would be to use expr in cdo.

cdo expr,"forest=sfctype==-20" air.sig995.2012.nc forest.nc

This makes a new variable called forest which is 1 or 0. You could now process all the types you want, and then merge them into one file:

 cdo expr,"forest=(sfctype==-20)" air.sig995.2012.nc type_forest.nc
 cdo expr,"water=(sfctype==10)" air.sig995.2012.nc type_water.nc
 ...etc...
 cdo merge type_*.nc combined_file.nc

(I don't think you need the parentheses, but they make the syntax clearer.)
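With many land types, the per-type commands can be generated in a loop instead of typed out. This is a minimal sketch, not a tested recipe: the input file name, the variable name sfctype, and the name:code pairs are placeholders to be replaced with the real ones. As written it only prints the cdo commands.

```shell
#!/bin/sh
# Sketch: one 0/1 mask file per land-cover code, then a merge.
# Assumptions (not from the question): the input file name, the variable
# name sfctype, and the name:code pairs in the loop are placeholders.
infile="landcover.nc"
var="sfctype"

# Dry run by default: commands are printed, not executed.
# Run the script with RUN= (empty) to execute them for real.
RUN="${RUN-echo}"

masks=""
for pair in forest:-20 water:10; do
    name="${pair%%:*}"   # text before the first colon
    code="${pair#*:}"    # text after the first colon
    $RUN cdo expr,"${name}=(${var}==${code})" "$infile" "type_${name}.nc"
    masks="$masks type_${name}.nc"
done

# Merge the per-type files into one file with one variable per type.
# $masks is deliberately unquoted so it splits into separate file arguments.
$RUN cdo merge $masks combined_file.nc
```

Listing the files explicitly in the merge, rather than using type_*.nc, keeps stray files that happen to match the pattern out of the result.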

This gets you almost what you wanted in a few lines, but not quite: I am not sure how to "stack" these new variables into a 4D array if you really need that, but perhaps nco can do it.

ClimateUnboxed