
I am working on spatial datasets using R.

Data Description

My master dataset is a SpatialPointsDataFrame with monthly surface temperature data (columns "ruralLSTday" and "ruralLSTnight"). A snippet is shown below:

Master Data - (in SpatialPointsDataFrame format)

           TOWN_ID ruralLSTday ruralLSTnight year month
2920006.11 2920006    303.6800      289.6400 2001     0
2920019.11 2920019    302.6071      289.0357 2001     0
2920015.11 2920015    303.4167      290.2083 2001     0
3214002.11 3214002    274.9762      293.5325 2001     0
3214003.11 3214003    216.0267      293.8704 2001     0
3207010.11 3207010    232.6923      295.5429 2001     0

Coordinates:

           longitude latitude
2802003.11  78.10401 18.66295
2802001.11  77.89019 18.66485
2803003.11  79.14883 18.42483
2809002.11  79.55173 18.00016
2820004.11  78.86179 14.47118

I want to add rainfall and air temperature columns to the data above. These values are stored, for every month, in a SpatialGridDataFrame called "secondary_data". A snippet of "secondary_data" is shown below:

Secondary Data - (in SpatialGridDataFrame format)

  month meant.69_73 rainfall.69_73
1     1    25.40968      0.6283871
2     2    26.19570      0.4580542
3     3    27.48942      1.0800000
4     4    28.21407      4.9440000
5     5    27.98987      9.3780645

Coordinates:

    longitude latitude
[1,]      76.5      8.5
[2,]      76.5      8.5
[3,]      76.5      8.5
[4,]      76.5      8.5
[5,]      76.5      8.5

Question

How do I add the columns from the secondary data to my master data by matching on latitude, longitude and month? The latitude/longitude values in the two tables above will not match exactly, because the master data is a set of points and the secondary data is a grid.

Is there a way to find the square of the grid on the "Secondary Data" that the lat/long of my master data falls into, and interpolate?

andyteucher
sv_noname
  • How will you match them up when the longitudes do not match? Is the idea (say) to find the square of the grid on your "Secondary Data" that the lat/long falls into, and interpolate? – mathematical.coffee Jul 16 '15 at 04:52
  • Yes, that is exactly what I want to do. What is the best way to do that? – sv_noname Jul 16 '15 at 05:00
  • You will have to give us a small reproducible example (we can't get the lat/lon of the data you've provided, and the month in your master data doesn't match any of the months in your secondary data) – mathematical.coffee Jul 16 '15 at 05:03
  • It's just a snippet; the months in both datasets vary from 0 to 11. My dataset is pretty huge – sv_noname Jul 16 '15 at 05:06
  • I have added the coordinates in the question above. Does that help? – sv_noname Jul 16 '15 at 05:15
  • Take a look at `sp::over`. One of the parameter combos it supports is `x = "SpatialPoints", y = "SpatialGridDataFrame"` – hrbrmstr Jul 16 '15 at 05:56

1 Answer

If your SpatialPointsDataFrame object is called x, and your SpatialGridDataFrame is called y, then

library(sp)
# over(x, y) returns one row of grid cell values per point in x
x <- cbind(x, over(x, y))

will add the attributes of y (the grid cell values) at the locations of x to the attributes of x. The match is done by point-in-grid-cell: each point gets the values of the cell it falls in, and points outside the grid get NA.
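As a minimal sketch of the point-in-cell lookup, here is a toy example. The 3 x 3 grid, the column name `meant` and the two points are made up for illustration and are not the asker's data.

```r
library(sp)

# Hypothetical 3 x 3 grid: cell centres at 0.5, 1.5, 2.5 in both directions
grd <- GridTopology(cellcentre.offset = c(0.5, 0.5),
                    cellsize = c(1, 1),
                    cells.dim = c(3, 3))
y <- SpatialGridDataFrame(grd, data = data.frame(meant = 10:18))

# Two hypothetical points that fall inside the grid
x <- SpatialPointsDataFrame(
  cbind(longitude = c(0.7, 2.2), latitude = c(1.4, 1.8)),
  data = data.frame(TOWN_ID = c(1L, 2L))
)

# One row per point, holding the attributes of the cell the point falls in
hits <- over(x, y)

# Attach the looked-up column to the points
x$meant <- hits$meant
print(x@data)
```

Assigning the column with `$<-` is an alternative to `cbind` that makes the new column name explicit.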

Interpolation is a different question; a simple way would be inverse distance with the four nearest neighbours, e.g. by

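library(gstat)
# idw() returns a new object at the locations of x; predictions are in var1.pred
pred <- idw(meant.69_73 ~ 1, y, x, nmax = 4)
x$meant.69_73 <- pred$var1.pred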
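A self-contained sketch of the inverse distance approach on toy data follows. The grid, the column name `meant` and the points are hypothetical. Converting the grid to points at the cell centres first is my assumption; it only makes explicit that the values are treated as belonging to the cell centres.

```r
library(sp)
library(gstat)

# Hypothetical 3 x 3 grid with made-up values
grd <- GridTopology(cellcentre.offset = c(0.5, 0.5),
                    cellsize = c(1, 1),
                    cells.dim = c(3, 3))
y <- SpatialGridDataFrame(grd, data = data.frame(meant = 10:18))

# Treat each grid value as an observation at the cell centre
y_pts <- as(y, "SpatialPointsDataFrame")
coordnames(y_pts) <- c("longitude", "latitude")

# Two hypothetical target points
x <- SpatialPointsDataFrame(
  cbind(longitude = c(0.7, 2.2), latitude = c(1.4, 1.8)),
  data = data.frame(TOWN_ID = c(1L, 2L))
)

# Inverse distance weighting using the 4 nearest cell centres
pred <- idw(meant ~ 1, y_pts, newdata = x, nmax = 4)

# Keep the original attributes of x and add the prediction as a new column
x$meant_idw <- pred$var1.pred
print(x@data)
```

Storing the result in `pred` rather than overwriting `x` keeps the existing attributes such as `TOWN_ID`.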

Whether you want one or the other really depends on what your grid cell values mean: do they refer to (i) the point value at the grid cell center, (ii) a value that is constant throughout the grid cell, or (iii) an average value over the whole grid cell? In the first case, interpolate; in the second, use over; in the third, use area-to-point interpolation (not explained here).

The R package raster offers similar functionality, but under different function names.
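For comparison, a rough sketch of what this might look like with raster, again on hypothetical toy data. Here `raster::extract` with the default method does the cell lookup, and `method = "bilinear"` interpolates from the four nearest cell centres, which is not the same weighting as the idw call above.

```r
library(sp)
library(raster)

# Hypothetical 3 x 3 grid with made-up values
grd <- GridTopology(cellcentre.offset = c(0.5, 0.5),
                    cellsize = c(1, 1),
                    cells.dim = c(3, 3))
y <- SpatialGridDataFrame(grd, data = data.frame(meant = 10:18))

# Two hypothetical target points
x <- SpatialPointsDataFrame(
  cbind(longitude = c(0.7, 2.2), latitude = c(1.4, 1.8)),
  data = data.frame(TOWN_ID = c(1L, 2L))
)

# Convert the first attribute of the grid to a RasterLayer
r <- raster(y, layer = 1)

# Value of the cell each point falls in (comparable to over)
x$meant_cell <- raster::extract(r, x)

# Bilinear interpolation from the four nearest cell centres
x$meant_bilinear <- raster::extract(r, x, method = "bilinear")
print(x@data)
```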

Edzer Pebesma