I have a dataset with 132 observations, 1 response variable (Lopend_gemiddelde
), and 9 predictors on which I perform three methods: 1) multiple linear regression (MLR), 2) kriging on the residuals (based on a multiple linear regression), and 3) universal kriging. The results for the methods differ slightly in terms of performance (R2, RMSE, MAE). As an example, the performances for the three methods, based on a 20-fold cross validation are:
Performance criterion | Multiple linear regression | Kriging on residuals | Universal kriging |
---|---|---|---|
R2 | 0.337 | 0.323 | 0.333 |
RMSE | 7.585 | 7.718 | 7.615 |
MAE | 6.118 | 6.170 | 6.084 |
My two questions related to these results:
- Why does the addition of kriged residuals to the MLR results in poorer performance (in terms of R2, RMSE, and MAE), compared to only the MLR?
- What causes the difference in model performance (in terms of R2, RMSE, and MAE) between the kriging on residuals on the one hand, and the universal kriging on the other hand?
My related code for the first method (MLR) is displayed below. test$predicted
is used to calculate the R2, RMSE, and MAE.
## == 1) multiple linear regression == ##
#multiple linear regression, input data is trained dataset (75% of original)
model_train = lm(Lopend_gemiddelde ~ 1 + nightlight_450 + nightlight_4950 + population_3000 + road_class_1_5000 + road_class_2_1000 + road_class_2_5000 + road_class_3_100 + road_class_3_300 + trafBuf50 , data=train)
#predict NO2 by the trained model (linear)
test$predicted = predict(model_train, test)
The related code for the second method (kriging on residuals) is displayed below (inspired by this post). The kriged residuals are added to the prediction values that derive from the multiple linear regression method (i.e. method 1). I project my kriging on a 100m resolution grid (grid_sp
). predicted_model$PredAddedKrigedResi
is used to calculate the R2, RMSE, and MAE.
## == 2) kriging on the residuals == ##
train_df = data.frame(train)
data_model_train <- data.frame(x = train_df$coords.x1, y = train_df$coords.x2,resid=resid(model_train))
#fitting the variogram (autofit)
coordinates(data_model_train) = ~x+y
variogram_train = autofitVariogram(resid ~ 1, data_model_train)
#create dataset including variogram parameters
autofit_params <- variogram_train$var_model
#use variogram with autofitted parameters
lz.ok_train_resid <- krige(resid ~ 1, data_model_train, grid_sp, autofit_params)
#convert to raster
raster_train_resid <- raster(lz.ok_train_resid['var1.pred'])
#predict NO2 by the trained model (linear)
test$predicted = predict(model_train, test)
#spatially join predicted values by trained model with corresponding kriged residuals.
predicted_model = raster::extract(raster_train_resid, test, sp=T) #sp = T: keep all data
#add kriged residuals to the predicted values by the trained model
predicted_model$PredAddedKrigedResi <- predicted_model$predicted + predicted_model$var1.pred
The related code for the universal kriging method is displayed below. I project my kriging on a 100m resolution grid (grid100
) that, this time, includes information on the 9 predictions to account for trends. predicted_model$predictedUK is used to calculate the R2, RMSE, and MAE.
## == 3) universal kriging == ##
variogram_uk = autofitVariogram(Lopend_gemiddelde ~ 1 + nightlight_450 + nightlight_4950 + population_3000 + road_class_1_5000 + road_class_2_1000 + road_class_2_5000 + road_class_3_100 + road_class_3_300 + trafBuf50, input_data = train)
autofit_params_uk <- variogram_uk$var_model
system.time(parallelX <- parLapply(cl = cl, X = 1:no_cores, fun = function(x) krige(formula = Lopend_gemiddelde ~ 1 + nightlight_450 + nightlight_4950 + population_3000 + road_class_1_5000 + road_class_2_1000 + road_class_2_5000 + road_class_3_100 + road_class_3_300 + trafBuf50, locations = train, newdata = grid100[parts[[x]],], model = autofit_params_uk)))
# Create SpatialPixelsDataFrame from mergeParallelX
mergeParallelX <- SpatialPixelsDataFrame(points = mergeParallelX, data = mergeParallelX@data)
#convert to raster
raster_uk <- raster(mergeParallelX["var1.pred"])
#spatially join predicted values by trained model with corresponding kriged residuals.
predicted_model = raster::extract(raster_uk, test, sp=T) #sp = T: keep all data
#rename
predicted_model <- predicted_model %>% rename(predictedUK = "var1.pred")