I have a data frame (datos) with eight columns and 2006 observations.
From these columns I want to calculate the MSE between Pcp_Estacion and Pcp_Chirps using a packaged MSE function (the calls below use mse() from the Metrics library), but I want to calculate it per station and per month, so that I end up with a data frame holding one MSE value for each month and each weather station. In the example below I calculate the MSE for the month of July for each of the five weather stations I have.
# Load libraries
library(tidyverse)
library(dplyr)
library(Metrics)
library(MLmetrics)
# Show the first 10 rows
head(datos, 10)
    X Mes Year Estacion variable  n Pcp_Chirps Pcp_Estacion
1   1   1 1982    11024      Pcp 30      0.262        0.000
2   2   1 1982    11033      Pcp 31      0.190        0.045
3   3   1 1982    11141      Pcp 31      0.265        0.000
4   4   2 1982    11024      Pcp 28      0.317        0.286
5   5   2 1982    11033      Pcp 28      0.242        0.629
6   6   2 1982    11141      Pcp 28      0.351        0.500
7   7   3 1982    11024      Pcp 31      0.000        2.903
8   8   3 1982    11033      Pcp 31      0.148        0.000
9   9   3 1982    11141      Pcp 31      0.000        0.000
10 10   4 1982    11024      Pcp 30      0.543        0.800
# Subset the July rows (Mes == 7) for each weather station
mse_11024_7 <- filter(datos, Mes == 7, Estacion %in% c("11024"))
mse_11033_7 <- filter(datos, Mes == 7, Estacion %in% c("11033"))
mse_11060_7 <- filter(datos, Mes == 7, Estacion %in% c("11060"))
mse_11096_7 <- filter(datos, Mes == 7, Estacion %in% c("11096"))
mse_11141_7 <- filter(datos, Mes == 7, Estacion %in% c("11141"))
# Compute the July MSE for each station (observed vs. CHIRPS)
mse(mse_11024_7$Pcp_Estacion, mse_11024_7$Pcp_Chirps)
mse(mse_11033_7$Pcp_Estacion, mse_11033_7$Pcp_Chirps)
mse(mse_11060_7$Pcp_Estacion, mse_11060_7$Pcp_Chirps)
mse(mse_11096_7$Pcp_Estacion, mse_11096_7$Pcp_Chirps)
mse(mse_11141_7$Pcp_Estacion, mse_11141_7$Pcp_Chirps)
Is there a faster way to do all of this at once, for all months and all weather stations?
The example data are available here: https://drive.google.com/drive/folders/19h7u0GzGO1okjhO3RLREY0QKY8DOoTy-?usp=sharing
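For reference, the result I am after has one row per station and month. A minimal sketch of that shape, assuming dplyr's group_by()/summarise() together with Metrics::mse(), might look like the following. The data frame datos_demo and its values are made up purely for illustration (they are not taken from datos), and mse_por_grupo is a hypothetical name.

```r
library(dplyr)
library(Metrics)

# Toy stand-in for `datos`: made-up values, only the relevant columns
datos_demo <- data.frame(
  Mes          = c(1, 1, 1, 1, 2, 2),
  Estacion     = c(11024, 11024, 11033, 11033, 11024, 11024),
  Pcp_Chirps   = c(0, 2, 1, 1, 3, 5),
  Pcp_Estacion = c(0, 0, 2, 4, 3, 4)
)

# One MSE per station/month combination:
# Pcp_Estacion is treated as the observed value, Pcp_Chirps as the estimate
mse_por_grupo <- datos_demo %>%
  group_by(Estacion, Mes) %>%
  summarise(MSE = Metrics::mse(Pcp_Estacion, Pcp_Chirps), .groups = "drop")

print(mse_por_grupo)
```

If this is the right idea, swapping datos_demo for datos should give one row for every Estacion and Mes pair present in the data, instead of filtering each station and month by hand.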