
I am trying to load data from a NetCDF file into R and keep running into issues, even after following a couple of tutorials and SO posts. This is the first time I have worked with NetCDF, so I am unsure what I am doing wrong. I have tried both the ncdf4 and raster packages. I still don't fully understand the code or how these files get converted and used, so I apologize for the long code example.

The NetCDF file I am trying to load contains sea surface temperatures (SST); I accessed it here. I followed this tutorial and did the following:

# load the ncdf4 package
library(ncdf4)
dname <- "sst"
ncin <- nc_open("raw/SST/sst.mon.anom.nc")
print(ncin)
# get longitude and latitude
lon <- ncvar_get(ncin,"lon")
nlon <- dim(lon)
head(lon)
#[1]  2.5  7.5 12.5 17.5 22.5 27.5
lat <- ncvar_get(ncin,"lat")
nlat <- dim(lat)
head(lat)
#[1] -87.5 -82.5 -77.5 -72.5 -67.5 -62.5

print(c(nlon,nlat))
#[1] 72 36

# get time
time <- ncvar_get(ncin,"time")
head(time)
#[1] 20453 20484 20513 20544 20574 20605

tunits <- ncatt_get(ncin,"time","units")
nt <- dim(time)
nt
#[1] 2005

tunits
#$hasatt
#[1] TRUE

#$value
#[1] "days since 1800-1-1 00:00:00"

# get temperature
tmp_array <- ncvar_get(ncin,dname)
dlname <- ncatt_get(ncin,dname,"long_name")
dunits <- ncatt_get(ncin,dname,"units")
fillvalue <- ncatt_get(ncin,dname,"_FillValue")
dim(tmp_array)
#[1]   72   36 2005

# get global attributes
title <- ncatt_get(ncin,0,"title")
institution <- ncatt_get(ncin,0,"institution")
datasource <- ncatt_get(ncin,0,"source")
references <- ncatt_get(ncin,0,"references")
history <- ncatt_get(ncin,0,"history")
Conventions <- ncatt_get(ncin,0,"Conventions")

nc_close("raw/SST/sst.mon.anom.nc") #Error in nc$safemode : $ operator is invalid for atomic vectors
ls()

library(chron)

# convert time -- split the time units string into fields
tustr <- strsplit(tunits$value, " ")
tdstr <- strsplit(unlist(tustr)[3], "-")
tmonth <- as.integer(unlist(tdstr)[2])
tday <- as.integer(unlist(tdstr)[3])
tyear <- as.integer(unlist(tdstr)[1])
chron(time,origin=c(tmonth, tday, tyear))

# replace netCDF fill values with NA's
tmp_array[tmp_array==fillvalue$value] <- NA

length(na.omit(as.vector(tmp_array[,,1])))
#[1] 1203

# get a single slice or layer (January)
# I'm assuming this is where things start going wrong,
# but I'm not sure what to change or do differently
m <- 1
tmp_slice <- tmp_array[,,m]
dim(tmp_slice)
#[1] 72 36

lonlat <- as.matrix(expand.grid(lon,lat))
dim(lonlat)
#[1] 2592    2

tmp_vec <- as.vector(tmp_slice)
length(tmp_vec)
#[1] 2592

# create dataframe and add names
tmp_df01 <- data.frame(cbind(lonlat,tmp_vec))
names(tmp_df01) <- c("lon","lat",paste(dname,as.character(m), sep="_"))
head(na.omit(tmp_df01), 10)

write.table(na.omit(tmp_df01),"cru_tmp_1a.csv", row.names=FALSE, sep=",")

This creates a csv with lon, lat, and sst_1 columns, but the longitude values seem wrong and the file contains no dates, which I assume should be included.
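Both symptoms have a likely explanation that can be checked without the file. The slice above holds a single time step (m <- 1) and the result of chron() is never stored, so no date ever reaches the csv. The longitudes probably follow the 0 to 360 convention rather than -180 to 180. The sketch below uses base R only: the time values are copied from the output above, while the full longitude vector (2.5 to 357.5 in 5-degree steps) and the name lon180 are assumptions, not values read from the file.

```r
# First six time values, copied from the head(time) output above
time_raw <- c(20453, 20484, 20513, 20544, 20574, 20605)

# The units are "days since 1800-1-1 00:00:00", so 1800-01-01 is the origin
dates <- as.Date(time_raw, origin = "1800-01-01")
print(dates)
# [1] "1856-01-01" "1856-02-01" "1856-03-01" "1856-04-01" "1856-05-01" "1856-06-01"

# Assumption: 72 longitudes from 2.5 to 357.5 (0 to 360 convention)
lon <- seq(2.5, 357.5, by = 5)

# Shift everything east of 180 degrees into the -180 to 180 convention
lon180 <- ifelse(lon > 180, lon - 360, lon)
print(range(lon180))
# [1] -177.5  177.5
```

With the real data, as.Date(time, origin = "1800-01-01")[m] would give the date that belongs to slice m, which can then be added as a column before writing the csv.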

I tried to continue following the tutorial linked above, but the output of part 3.4.1 (reshape whole array) does not look right.

# reshape the array into vector
tmp_vec_long <- as.vector(tmp_array)
length(tmp_vec_long)

# reshape the vector into a matrix
tmp_mat <- matrix(tmp_vec_long, nrow=nlon*nlat, ncol=nt)
dim(tmp_mat)

head(na.omit(tmp_mat))

This seems to result in a blank array.
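One possible cause, not confirmed by the output shown here: na.omit() on a matrix drops every row that contains at least one NA, so if each grid cell is missing in at least one month, nothing is left. A small sketch with a made-up 2 x 2 x 3 array (all values and the names arr, mat, and long are hypothetical) shows the effect and a long-format alternative that drops only the missing values.

```r
# Made-up stand-in for tmp_array: 2 lon x 2 lat x 3 time steps,
# where every grid cell is missing in at least one time step
arr <- array(
  c(NA, 2, 3, NA,    # time step 1
    5, NA, 7, 8,     # time step 2
    9, 10, NA, 12),  # time step 3
  dim = c(2, 2, 3)
)

# Same reshape as above: one row per grid cell, one column per time step
mat <- matrix(as.vector(arr), nrow = 2 * 2, ncol = 3)

# na.omit() removes whole rows, so no row survives here
print(nrow(na.omit(mat)))
# [1] 0

# Count the rows that are complete across all time steps
print(sum(rowSums(is.na(mat)) == 0))
# [1] 0

# Long format keeps every non-missing value instead of whole rows
long <- data.frame(
  cell  = rep(seq_len(nrow(mat)), times = ncol(mat)),
  step  = rep(seq_len(ncol(mat)), each = nrow(mat)),
  value = as.vector(mat)
)
long <- long[!is.na(long$value), ]
print(nrow(long))
# [1] 8
```

Running sum(rowSums(is.na(tmp_mat)) == 0) on the real matrix would show whether this explanation applies.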

Because this is not working and I feel stuck, I also tried following this other tutorial, which also uses the ncdf4 package but takes a slightly different approach. I get an error at Step 4 while trying to create "the coordinate matrix". I then tried this other tutorial, with no luck. I also tried what was suggested in this SO post.

Additionally, I tried the same methods using a different NetCDF dataset just in case (AMO

I will most likely be working with more NetCDF files, so I am trying to find the best workflow to convert them into csv files and use them in R.

To clarify, why is this not working? How can I access data in this format (netcdf)?

Thank you!

FishyFishies

1 Answer


You can use the terra package for this:

library(terra)

#Read the nc file
sst <- rast("sst.mean.anom.nc")
time(sst)

# Convert it into a data frame (one row per cell, one column per time step)
df <- as.data.frame(sst, xy = TRUE)
df1 <- subset(df, select = -c(x, y))
names(df1) <- time(sst)

# Write the data to a csv file
write.csv(cbind.data.frame(subset(df, select = c(x, y)), df1), "Try.csv")
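If one row per cell and date is easier to work with than one column per date, the wide table can be stacked into long format. A minimal sketch in base R, assuming the layout produced above (x, y, then one column named by date); the data frame wide, its values, and the output name long are made up for illustration.

```r
# Hypothetical stand-in for the wide table: x, y, then one column per date
wide <- data.frame(
  x = c(2.5, 7.5),
  y = c(-87.5, -87.5),
  `1856-01-01` = c(0.1, NA),
  `1856-02-01` = c(0.3, 0.2),
  check.names = FALSE  # keep the date strings as column names
)

# Every column except the coordinates is treated as a date column
date_cols <- setdiff(names(wide), c("x", "y"))

# unlist() stacks the date columns one after another, so the coordinates
# repeat once per date and each date repeats once per row
long <- data.frame(
  lon  = rep(wide$x, times = length(date_cols)),
  lat  = rep(wide$y, times = length(date_cols)),
  date = as.Date(rep(date_cols, each = nrow(wide))),
  sst  = unlist(wide[date_cols], use.names = FALSE)
)

# Drop cells with no value for a given date
long <- long[!is.na(long$sst), ]
print(long)
```

The long table has one date column of class Date, which is usually easier to filter and plot than several hundred date-named columns.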
UseR10085
    Thank you, this seems to have worked for sst.mean.anom.nc (NC_FORMAT_NETCDF4_CLASSIC). It did not work for another NetCDF file (AMO) that is NC_FORMAT_CLASSIC. Does the terra approach not work with older formats? Thanks! – FishyFishies Jul 06 '23 at 12:03