In essence I'm trying to do a relatively simple set of operations on a collection of netcdf4 files I've downloaded. They're sourced from ESA's Lakes Climate Change Initiative database of satellite-derived limnological data, and each netcdf4 file represents one day in a time series going back to the 2000s or earlier. Each netcdf4 file has a number of dimensions representing different variables of interest (surface temperature, chlorophyll-a concentration, etc).
Using the stars package I was hoping to geographically subset the dataset to only the lake I'm interested in and create a monthly aggregated time series for the lake for a number of those variables.
Unfortunately it seems as though there might be something wrong with the netcdf4 files ESA provided, as I'm running into odd errors while working with them in the 'stars' package in R.
For reproducibility's sake here's a link to the file in question - it should be the first file in the directory. If you download it to a directory of your choice and setwd() to it you should be able to repeat what I've managed here:
library(stars)
setwd()
lake_2015_01_01 <- read_stars('ESACCI-LAKES-L3S-LK_PRODUCTS-MERGED-20150101-fv2.0.2.nc')
## Compare to using the read_ncdf4() function also from stars:
lake_2015_01_01_nc <- read_ncdf('ESACCI-LAKES-L3S-LK_PRODUCTS-MERGED-20150101-fv2.0.2.nc')
Running the file through read_stars() produces the error:
Error in attr(x, "dimensions")[[along]] : subscript out of bounds In addition: There were 50 or more warnings (use warnings() to see the first 50)
While running it through read_ncdf() produces the following:
Warning messages: 1: In CPL_crs_from_input(x) : GDAL Error 1: PROJ: proj_create: Error 1027 (Invalid value for an argument): longlat: Invalid value for units 2: In value[3L] : failed to create crs based on grid mapping and coordinate variable units. Will return NULL crs. Original error: Error in st_crs.character(base_gm): invalid crs: +proj=longlat +a=6378137 +f=0.00335281066474748 +pm=0 +no_defs +units=degrees
But it does successfully complete, just with a broken coordinate system that can be fixed by approximating the coordinate system originally set by the creators:
lake_2015_01_01_nc <- st_set_crs(lake_2015_01_01_nc, 4979)
However the functions stars uses to manipulate data don't work on it, as it's a proxy file that points to the original netcdf4. The architecture of the stars package would suggest that if I want to manipulate the file I need to use read_stars().
I've tried to reverse-engineer the problem a little bit by opening the .nc file in QGIS, where it seems to perform as expected. I'm able to display the data and it seems to be georeferenced correctly, which makes me suspect the data is not being read into 'stars' correctly in the read_ncdf() function, and likely in the read_stars() function as well.
I'm not sure how to go about fixing this however, and I'd like to use both this dataset and stars in the analysis. Does anyone have any insight as to what might be going on here and if it's something I can fix?