I have many statistical models to run, and I am using `pmap` to loop over the variables needed for each model. I want each iteration to return the model summary together with some extra information about that model. The models run and both pieces are produced, but they are not kept together: the output shows all of the extra information first, followed by all of the model summaries. Is it possible for each iteration to return its extra information and its model summary together? A simplified example is below.


    #Import library
    library(purrr)
    
    #Set up vars
    y1 <- c(runif(20, 0, 1))
    y2 <- c(runif(20, 0, 1))
    x1 <- c(rnorm(20, 0, 1))
    x2 <- c(rnorm(20, 0, 1))
    
    #Collect vars in lists
    ys <- list(y1, y2)
    xs <- list(x1, x2)
    
    #Write function with a model and "extra information"
    regressor <- function(y, x){
      #extra information
      mean_y <- mean(y) 
      cat("data:", mean_y, "\n\n")
      
      #model
      model <- lm(y ~ x)
      summary(model)
    }
    
    #Use pmap to run model over the vars
    pmap(list(ys, xs), regressor)

When I run the code above my output looks like this:

>data: 0.5281057 
>
>data: 0.5522678 
>
>[[1]]
>
>Call:
>lm(formula = y ~ x)
>
>Residuals:
>     Min       1Q   Median       3Q      Max 
>-0.57284 -0.24802  0.03689  0.26913  0.47428 
>
>Coefficients:
>            Estimate Std. Error t value Pr(>|t|)    
>(Intercept)  0.54894    0.07203   7.621 4.86e-07 ***
>x           -0.12909    0.07718  -1.673    0.112    
>---
>Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
>Residual standard error: 0.3173 on 18 degrees of freedom
>Multiple R-squared:  0.1345,   Adjusted R-squared:  0.08643 
>F-statistic: 2.797 on 1 and 18 DF,  p-value: 0.1117
>
>
>[[2]]
>
>Call:
>lm(formula = y ~ x)
>
>Residuals:
>     Min       1Q   Median       3Q      Max 
>-0.49107 -0.18591 -0.05057  0.27710  0.50679 
>
>Coefficients:
>            Estimate Std. Error t value Pr(>|t|)    
>(Intercept)  0.54197    0.06764   8.013  2.4e-07 ***
>x           -0.05526    0.06441  -0.858    0.402    
>---
>Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
>Residual standard error: 0.2977 on 18 degrees of freedom
>Multiple R-squared:  0.03928,  Adjusted R-squared:  -0.01409 
>F-statistic: 0.736 on 1 and 18 DF,  p-value: 0.4022

I want the results to look something like this:

>[[1]]
>
>data: 0.5281057 
>
>Call:
>lm(formula = y ~ x)
>
>Residuals:
>     Min       1Q   Median       3Q      Max 
>-0.57284 -0.24802  0.03689  0.26913  0.47428 
>
>Coefficients:
>            Estimate Std. Error t value Pr(>|t|)    
>(Intercept)  0.54894    0.07203   7.621 4.86e-07 ***
>x           -0.12909    0.07718  -1.673    0.112    
>---
>Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
>Residual standard error: 0.3173 on 18 degrees of freedom
>Multiple R-squared:  0.1345,   Adjusted R-squared:  0.08643 
>F-statistic: 2.797 on 1 and 18 DF,  p-value: 0.1117
>
>
>[[2]]
>
>data: 0.5522678 
>
>Call:
>lm(formula = y ~ x)
>
>Residuals:
>     Min       1Q   Median       3Q      Max 
>-0.49107 -0.18591 -0.05057  0.27710  0.50679 
>
>Coefficients:
>            Estimate Std. Error t value Pr(>|t|)    
>(Intercept)  0.54197    0.06764   8.013  2.4e-07 ***
>x           -0.05526    0.06441  -0.858    0.402    
>---
>Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
>Residual standard error: 0.2977 on 18 degrees of freedom
>Multiple R-squared:  0.03928,  Adjusted R-squared:  -0.01409 
>F-statistic: 0.736 on 1 and 18 DF,  p-value: 0.4022

Note that the summaries are returned by the function, whereas the data line is printed to the console by `cat` as a side effect while the function runs. Unless you return the data as well, the two cannot be kept together. Change your function to return `list(data = mean_y, summary = summary(model))` – Onyambu Sep 16 '21 at 00:24
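
The ordering described in the comment can be reproduced in isolation. The following is a minimal sketch, not code from the question: `noisy` is a hypothetical function, and base R `lapply` stands in for `pmap` to show that the behaviour is not specific to purrr. The `cat` lines are emitted while the calls run, even though the returned values are only assigned and never auto-printed.

```r
# noisy() is a hypothetical stand-in for the question's regressor():
# it prints one line as a side effect and returns a value.
noisy <- function(y) {
  cat("data:", mean(y), "\n")
  mean(y)
}

# capture.output() collects whatever is written to the console during the call.
# The result is assigned to res, so nothing is auto-printed.
printed <- capture.output(res <- lapply(list(1:3, 4:6), noisy))

printed  # the two "data:" lines, produced during the calls
res      # the returned values, available only after all calls have finished
```

Because the printed text is never part of the return value, it cannot end up inside the `[[1]]`, `[[2]]` elements unless the function returns it.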

1 Answer

Return the summary and the extra information together, so that each iteration produces its output in one chunk:

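    regressor <- function(y, x){
      #extra information
      mean_y <- mean(y)
      #cat("data:", mean_y, "\n\n")

      #model
      model <- lm(y ~ x)
      smmod <- summary(model)
      #store the extra information as text instead of printing it
      smmod$information <- paste("data:", mean_y)
      #return both pieces together for each iteration
      list(summary = smmod, information = smmod$information)
    }

    #ys and xs as defined in the question
    pmap(list(ys, xs), regressor)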

>[[1]]
>[[1]]$summary
>
>Call:
>lm(formula = y ~ x)
>
>Residuals:
>     Min       1Q   Median       3Q      Max 
>-0.36805 -0.22448  0.00118  0.20575  0.45271 
>
>Coefficients:
>            Estimate Std. Error t value Pr(>|t|)    
>(Intercept) 0.538509   0.059874   8.994 4.45e-08 ***
>x           0.004642   0.058710   0.079    0.938    
>---
>Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
>Residual standard error: 0.2676 on 18 degrees of freedom
>Multiple R-squared:  0.0003473, Adjusted R-squared:  -0.05519 
>F-statistic: 0.006253 on 1 and 18 DF,  p-value: 0.9378
>
>
>[[1]]$information
>[1] "data: 0.538674127601553"
>
>
>[[2]]
>[[2]]$summary
>
>Call:
>lm(formula = y ~ x)
>
>Residuals:
>     Min       1Q   Median       3Q      Max 
>-0.50855 -0.25834  0.02311  0.28248  0.40919 
>
>Coefficients:
>            Estimate Std. Error t value Pr(>|t|)    
>(Intercept) 0.583976   0.071464   8.172 1.81e-07 ***
>x           0.003352   0.072773   0.046    0.964    
>---
>Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
>Residual standard error: 0.3177 on 18 degrees of freedom
>Multiple R-squared:  0.0001178, Adjusted R-squared:  -0.05543 
>F-statistic: 0.002121 on 1 and 18 DF,  p-value: 0.9638
>
>
>[[2]]$information
>[1] "data: 0.583617433917243"
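
If the exact layout from the question is wanted (the `data:` line directly above each summary), one possible approach is to keep the values in a list and print each element explicitly afterwards. This is only a sketch under a few assumptions: `print_result` is a hypothetical helper, the seed is arbitrary and only makes the example reproducible, and the data are regenerated here so the block runs on its own. `iwalk` is purrr's indexed, side-effect-only counterpart of `map`.

```r
library(purrr)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
ys <- list(runif(20, 0, 1), runif(20, 0, 1))
xs <- list(rnorm(20, 0, 1), rnorm(20, 0, 1))

# Return both pieces instead of printing one of them as a side effect
regressor <- function(y, x) {
  list(mean_y = mean(y), summary = summary(lm(y ~ x)))
}

results <- pmap(list(ys, xs), regressor)

# print_result() is a hypothetical helper: it prints one element in the
# layout requested in the question (index, data line, then the summary).
print_result <- function(res, i) {
  cat("[[", i, "]]\n\n", sep = "")
  cat("data:", res$mean_y, "\n")
  print(res$summary)
}

# iwalk() calls print_result(element, index) for each element, for its side effects
iwalk(results, print_result)
```

Keeping `results` as a plain list also means the means and summaries remain available for later computation, rather than existing only as console text.
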
Park