0

I'm trying to study the Sea Surface Temperature (SST) correlation with tropical-cyclone activity in a certain range of months. The data I'm using is from the Hadley Centre (in NetCDF format) using the get_anual_ssts() function from the hadsstR package.

get_annual_ssts <- function(hadsst_raster, years = 1969:2011) {
    mean_rasts <-
        apply(matrix(years), 1, function(x) {
            yearIDx <- which(chron::years(hadsst_raster@z$Date) == x)
            subset_x <- raster::subset(hadsst_raster, yearIDx)
            means <- raster::calc(subset_x, mean, na.rm = TRUE)
            names(means) <- as.character(x)
            return(means)
        })
    mean_brick <- raster::brick(mean_rasts)
    mean_brick <- raster::setZ(mean_brick, as.Date(paste0(years, '-01-01')), 'Date')
    return(mean_brick)
}

What I need is to have an additional parameter that allows me to filter by months of hurricane activity instead of calculating the whole year mean SST.

For instance, for the Southwest Pacific Ocean, I should be able to call get_annual_ssts(hadsst_raster, 12:04, 1966:2007), being December-April the months of activity of hurricane activity. Setting a range of months that comprise two different years would be crucial (maybe stating the initial month and the range length to ease the structure of mean_brick, saving the mean on the initial year?).

Looking at chron's documentation, it doesn't seem possible to assign a subset of mm-yy or something similar. What would be the best way to accomplish this?

Here's how the input raster data (hadsst_raster) looks like, for reference:

class       : RasterBrick 
dimensions  : 180, 360, 64800, 1766  (nrow, ncol, ncell, nlayers)
resolution  : 1, 1  (x, y)
extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 
data source : ~/Downloads/Hadley/HadISST_sst.nc 
names       : X1870.01.16, X1870.02.14, X1870.03.16, X1870.04.15, X1870.05.16, X1870.06.16, X1870.07.16, X1870.08.16, X1870.09.16, X1870.10.16, X1870.11.16, X1870.12.16, X1871.01.16, X1871.02.15, X1871.03.16, ... 
Date        : 1870-01-16, 2017-02-16 (min, max)
varname     : sst 

And how the output (get_annual_ssts(hadsst_raster, 1966:2007)) looks like:

class       : RasterBrick 
dimensions  : 180, 360, 64800, 42  (nrow, ncol, ncell, nlayers)
resolution  : 1, 1  (x, y)
extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 
data source : in memory
names       :      X1966,      X1967,      X1968,      X1969,      X1970,      X1971,      X1972,      X1973,      X1974,      X1975,      X1976,      X1977,      X1978,      X1979,      X1980, ... 
min values  :  -916.8167,  -916.8167,  -916.8167,  -916.8167,  -916.8167,  -916.8167,  -916.8167,  -916.8167,  -916.8167,  -916.8167,  -916.8167,  -916.8167,  -916.8167, -1000.0000, -1000.0000, ... 
max values  :   29.94996,   29.66276,   29.70941,   30.22522,   29.61913,   29.43723,   29.65050,   29.73929,   29.59117,   29.48381,   29.36425,   29.72932,   29.70908,   29.84216,   29.84868, ... 
Date        : 1966-01-01, 2007-01-01 (min, max)
  • The raster data above is what you return with `return(mean_brick)`? – Val May 05 '17 at 12:36
  • That's the input. I'll make the distinction, sorry. – Alfredo Hernández May 05 '17 at 12:40
  • 1
    Maybe I'm misunderstanding you, but it seems to me you have the info about the months in both `names` and `Date` in the input brick. Have you tried indexing it based on that? – Val May 05 '17 at 12:47
  • @Val yeah, that's true. I was worried about filtering between months in different years. But thinking about it the other night I realised that maybe the way to do it instead is to filter out the months that I don't want. I'll post the answer if I come up with a reasonable method. – Alfredo Hernández May 05 '17 at 12:54
  • Am I assuming correctly that you want to filter specific months for all years? Let's say all layers from April for each year? – Val May 05 '17 at 12:56
  • @Val that is correct. But given that I need to compare the data to hurricane activity, it would be ideal to calculate the mean from the same month range of activity. In the example of the Southwest Pacific Ocean, doing the mean between Jan:Apr & Dec from each year would introduce a systematic error compared to doing the mean from Dec(year_i):Apr(year_i+1). – Alfredo Hernández May 05 '17 at 13:03
  • Ok, I understand. So it's not only filtering the months of interest but also averaging them the proper way. – Val May 05 '17 at 13:09
  • 1
    @Val yeah. If hurricanes were nice and occurred all within the same year all would be easier :P – Alfredo Hernández May 05 '17 at 13:10

1 Answers1

1

Ok, I've got a little something. Maybe you an use it to modify your function:

## Generate your layer names (used for indexing later)

nms <- expand.grid(paste0('X',1969:2011),c("01","02","03","04","05","06","07","08","09","10","11","12"),'16')

nms <- apply(nms,1,function(x) paste0(x,collapse = '.'))

nms <- sort(nms)

## Generating fake raster brick

r <- raster()
r[] <- runif(ncell(r))

rst <- lapply(1:length(nms),function(x) r)

rst <- do.call(brick,rst)

names(rst) <- nms

And now you can index the brick with the layer names. Loop through the Hurricane Seasons (starting with Year1 -1):

for (ix in 1970:2011){


  sel <- rst[[c(grep(paste0(ix-1,'.12'),nms),sapply(paste0(0,1:4),function(x) grep(paste0(ix,'.',x),nms)))]]


  break ## in case you don't want to go through all iterations

  }

For the first iteration, I'm getting this output:

> sel
class       : RasterStack 
dimensions  : 180, 360, 64800, 5  (nrow, ncol, ncell, nlayers)
resolution  : 1, 1  (x, y)
extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 
names       :  X1969.12.16,  X1970.01.16,  X1970.02.16,  X1970.03.16,  X1970.04.16 
min values  : 5.988637e-06, 5.988637e-06, 5.988637e-06, 5.988637e-06, 5.988637e-06 
max values  :    0.9999771,    0.9999771,    0.9999771,    0.9999771,    0.9999771 

Let me know if that's helpful.


Edit:

So maybe a more applicable example:

(the function assumes your layer names of your input brick x have the format Xyyyy.mm.dd)

hadSSTmean <- function(x, years, first.range = 11:12, second.range = 1:4){

  nms <- names(x)

  mts <- c("01","02","03","04","05","06","07","08","09","10","11","12")

  xMeans <- vector(length = length(years)-1,mode='list')

  for (ix in 2:length(years){

    xMeans[[ix-1]] <- mean(x[[c(sapply(first.range,function(x) grep(paste0(years[ix-1],'.',mts[x]),nms)),sapply(1:4,function(x) grep(paste0(years[ix],'.',mts[x]),nms)))]])

  }

  return(do.call(brick,xMeans))
  # you could also return the list instead of a single brick
}
Val
  • 6,585
  • 5
  • 22
  • 52
  • I assume `dts` would be my original raster data (`hadsst_raster`), right? – Alfredo Hernández May 05 '17 at 13:51
  • Sorry, that would come from an intermediate product of mine ... I'll correct it. It should be the layer names `nms` – Val May 05 '17 at 13:52
  • `sel <-` seems to be rewritten for each iteration (commenting the `break`). Correct me if I'm wrong, but defining `sel <- raster()` before the loop and then stacking into it (`sel <- stack(sel, ... )`) would do the trick, right? – Alfredo Hernández May 05 '17 at 14:10
  • Maybe I should have been a bit more clear: `sel` is just the current selection given the year you're in. This was merely an example (a proof of concept so to speak), hence the `break` statement. You would need to implement the indexing into your function. I'll see if I have some pointers. You want to select these rasters and the perform a simple mean? – Val May 05 '17 at 14:12
  • So `first.range` means Nov and Dec of the previous year and `second.range` Jan to Apr of the ensuing year? – Val May 05 '17 at 14:46
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/143514/discussion-between-alfredo-hernandez-and-val). – Alfredo Hernández May 05 '17 at 14:46