0

I am not very experienced with loops so I am not sure where I went wrong here... I have a dataframe that looks like:

 month year day mean.temp mean.temp.year.month
     1 1961   1      4.85             4.090323
     1 1961   2      4.90             4.090323
     1 1961   3      2.95             4.090323
     1 1961   4      3.40             4.090323
     1 1961   5      2.90             4.090323

dataset showing 3 months for 2 years can be found here: https://drive.google.com/file/d/1w7NVeoEh8b7cAkU3cu1sXx6yCh75Inqg/view?usp=sharing

and I want to subset this dataframe by year and month so that I can run one nls model per year and month. Since my dataset contains 56 years (and each year has 12 months), that will give 672 models. Then I want to store the parameter estimates in a separate table.

I've created this code, but I can't work out why it is only giving me the parameter estimates for month 12 (all 56 years, but just month 12):

table <- matrix(99999, nrow=672, ncol=4)
YEARMONTHsel <- unique(df_weather[c("year", "month")])
YEARsel <- unique(df_weather$year)
MONTHsel <- unique(df_weather$month)
for (i in 1:length(YEARsel))  {
  for (j in 1:length(MONTHsel)) {
  temp2 <-  df_weather[df_weather$year==YEARsel[i] & df_weather$month==MONTHsel[j],]
   mn <- nls(mean.temp~mean.temp.year.month+alpha*sin(day*pi*2/30+phi), 
          data = temp2, control=nlc,
          start=list(alpha=-6.07043, phi = -10))
   cr <- as.vector(coef(mn))
   nv <-length(coef(mn))
   table[i,1:nv] <- cr 
   table[i,nv+1]<- YEARsel[i]
   table[i,nv+2]<- MONTHsel[j]
  }
 }

I've tried several options (i.e. without using nested loop) but I'm not getting anywhere. Any help would be greatly appreciated!Thanks.

  • 2
    Maybe this post can help: https://stackoverflow.com/questions/51796317/r-predict-glm-fit-on-each-column-in-data-frame-using-column-index-number/51810814#51810814 . If you are still having trouble, post some data (maybe 3 months) and it would be easier to help. – AndS. Aug 19 '18 at 02:00
  • Thanks for your help! Yes, I can do the subsetting and the modelling job with tidyverse, but the output is just too long and I would have to copy each 672 parameters by hand to another table. I can't work out how to automate it with tidyverse so I thought the loop would do everything? I'll edit my question to post more data, cheers! – ElisaPBadas16 Aug 19 '18 at 13:36
  • When you update the post with some more data, I'll take a look and see if I can help. – AndS. Aug 19 '18 at 14:03

1 Answers1

0

Based on your loop, it looks like you want to run the regression grouped by year and month and then extract the coefficients in a new dataframe (correct me if thats wrong)

library(readxl)
library(tidyverse)

df <- read_excel("~/Downloads/df_weather.xlsx")

df %>% nest(-month, -year) %>% 
  mutate(model = map(data, ~nls(mean.temp~mean.temp.year.month+alpha*sin(day*pi*2/30+phi), 
                                data = .x, control= "nlc",
                                start=list(alpha=-6.07043, phi = -10))),
         coeff = map(model, ~coefficients(.x))) %>%
  unnest(coeff %>% map(broom::tidy)) %>%
  spread(names, x) %>%
  arrange(year)

#> # A tibble: 6 x 4
#>   month  year  alpha    phi
#>   <dbl> <dbl>  <dbl>  <dbl>
#> 1     1  1961  0.561 -10.8 
#> 2     2  1961 -1.50  -10.5 
#> 3     3  1961 -2.06   -9.77
#> 4     1  1962 -3.35   -5.48
#> 5     2  1962 -2.27   -9.97
#> 6     3  1962  0.959 -10.8

First we nest the data based on your groups (in this case year and month), then we map the model for each group, then we map the coefficients for each group, lastly we unnest the coefficients and spread the data from long to wide.

AndS.
  • 7,748
  • 2
  • 12
  • 17