0

My goal is to reproject a raster called pftnc to have the same CRS and cell size of another raster sp, so they overlap perfectly and can be stacked. Files available here pftnc and sp

I thought it was an issue with software, see the other question https://gis.stackexchange.com/questions/299935/extract-values-from-raster-r-and-arcgis-different-results, but I am having inconsistencies just within R. So I would like to find an R solution first.

I have tried to reproject directly

pft1 <- projectRaster(pftnc, sp)

And also tried to first aggregate and then reproject

pftagg <- aggregate(pftnc, fact=4)
pftproj1 <- projectRaster(from=pftagg, to=sp)

I have also tried to just project and then crop and aggregate or the other way around but I can never get to the result I get when I use to=sp, so I abandoned this route. For example

cea=crs(sp)
pftproj2 <- projectRaster(pftnc, crs=cea)

The problem is that pftproj1 and pft1 have the same rows,columns,extent, and resolution but the values are different.

> cellStats(pftproj1,mean)
[1] 0.1955
> cellStats(pft1,mean)
[1] 0.2028

Is there a correct order of operations to achieve this goal? Should I let projectRaster deal with the aggregation of values? I imagine

Herman Toothrot
  • 1,463
  • 3
  • 23
  • 53

1 Answers1

1

I do not know if there is a "best" method. I think the main reason for the difference that you get is that if you aggregate first, you get more cells (if aggregating with na.rm=TRUE). The difference is much smaller if you use mask to compare strictly the same cells.

library(raster)
sp <- raster("sp.tif")
pf <- raster("pftnc.tif")

pf1 <- projectRaster(pf, sp)
pfa <- aggregate(pf, fact=4)
pf2 <- projectRaster(from=pfa, to=sp)
pf2m <- mask(pf2, pf1)

# an alternative that I would not recommend
pfr <- projectRaster(pf, crs=crs(sp))
pf3 <- resample(pfr, sp)
pf3m <- mask(pf3, pf1)

cellStats(pf, mean)
#[1] 0.2202417
cellStats(pf1, mean)
#[1] 0.2027533
cellStats(pf2, mean)
#[1] 0.1954503
cellStats(pf2m, mean)
#[1] 0.2024068
cellStats(pf3, mean)
#[1] 0.2016164
cellStats(pf3m, mean)
#[1] 0.2021608

In a question it is better to use example data generated by code. In your case, you could use

sp <- raster(nrow=142, ncol=360, ext=extent(-17367529, 17367529, -6356742, 7348382), crs="+proj=cea +lon_0=0 +lat_ts=30 +x_0=0 +y_0=0 +datum=WGS84 +units=m")

pf <- raster(ncol=1440, nrow=720, xmn=-180, xmx=180, ymn=-90, ymx=90, crs='+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0')
values(pf) = rep(c(1:4,NA), ncell(pf)/5)

But ... the differences go away, except for the "bad" method. Not sure if this is due to the distribution of your values, or to the role of NA values, but I guess you could experiment to find out.

cellStats(pf, mean)
#[1] 2.5
cellStats(pf1, mean)
#[1] 2.5
cellStats(pf2, mean)
#[1] 2.5
cellStats(pf2m, mean)
#[1] 2.5
cellStats(pf3, mean)
#[1] 2.547783
cellStats(pf3m, mean)
[1] 2.547783
Robert Hijmans
  • 40,301
  • 4
  • 55
  • 63