Apply a function to multiple subjects and retain the results in a table in R

Question

I am going to use the ChainLadder package and apply its function MackChainLadder().

I have data with many cumulative payment triangles in Excel, formatted like the following:


ID  A    DY1    DY2     DY3     DY4     DY5     DY6     DY7     DY8     DY9     DY10
27  1   1526    2860    3575    3544    3700    3714    3714    3681    3669    3595
27  2   1717    3619    3706    3687    3547    3511    4306    4235    4231    
27  3   5042    9957    12912   14249   15801   15659   15766   15811       
27  4   8399    15946   19155   20120   20645   20357   20517           
27  5   6494    11274   13220   13907   14383   14435               
27  6   7340    13426   16916   18420   18605                   
27  7   11004   20474   26251   28964                       
27  8   13107   23065   26150                           
27  9   17250   23922                               
27  10  18833                                   
35  1   90751   130298  147618  159509  165517  169048  170988  171730  171854  172295
35  2   105370  152668  174635  189384  196637  200010  202436  203752  204294  
35  3   123692  181088  207566  225943  235993  243255  245759  247297      
35  4   142717  206614  236415  255642  265007  269450  272050          
35  5   167997  235772  267384  286297  294015  297560              
35  6   169744  237253  270713  288991  299322                  
35  7   204752  279957  317459  342275                      
35  8   231249  318056  359736                          
35  9   257927  358768                              
35  10  300881

You can find data and examples for the ChainLadder package here:

http://code.google.com/p/chainladder/wiki/Examples

Sample ChainLadder output:

library(ChainLadder)
M <- MackChainLadder(RAA, est.sigma="Mack")
M

MackChainLadder(Triangle = RAA, est.sigma = "Mack")


     Latest Dev.To.Date Ultimate   IBNR Mack.S.E    CV
1981 18,834       1.000   18,834      0        0   NaN
1982 16,704       0.991   16,858    154      206 1.339
1983 23,466       0.974   24,083    617      623 1.010
1984 27,067       0.943   28,703  1,636      747 0.457
1985 26,180       0.905   28,927  2,747    1,469 0.535
1986 15,852       0.813   19,501  3,649    2,002 0.549
1987 12,314       0.694   17,749  5,435    2,209 0.406
1988 13,112       0.546   24,019 10,907    5,358 0.491
1989  5,395       0.336   16,045 10,650    6,333 0.595
1990  2,063       0.112   18,402 16,339   24,566 1.503

               Totals
Latest:    160,987.00
Ultimate:  213,122.23
IBNR:       52,135.23
Mack S.E.:  26,909.01
CV:              0.52

I have thousands of IDs (companies).

I want to apply the Mack chain-ladder method (MackChainLadder) to each triangle and obtain the IBNR and the SD of the IBNR (the Mack S.E.). However, I don't know how to apply the function to each of those companies, or how to get the results into a table that looks like this:


ID  IBNR  Mack_S.E.
11  11111  2222
24  33333  4444 (I made up those numbers)

Can I use gapply()? If so, how can I get the table for my results? I am new to R; any comments are welcome.

score 0 · Accepted Answer · answered Jul 07 '14 at 02:02

So here's your data in a copy/paste-able form, which I've named dd:

dd<-structure(list(ID = c(27L, 27L, 27L, 27L, 27L, 27L, 27L, 27L, 
27L, 27L, 35L, 35L, 35L, 35L, 35L, 35L, 35L, 35L, 35L, 35L), 
    A = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L, 
    4L, 5L, 6L, 7L, 8L, 9L, 10L), DY1 = c(1526L, 1717L, 5042L, 
    8399L, 6494L, 7340L, 11004L, 13107L, 17250L, 18833L, 90751L, 
    105370L, 123692L, 142717L, 167997L, 169744L, 204752L, 231249L, 
    257927L, 300881L), DY2 = c(2860L, 3619L, 9957L, 15946L, 11274L, 
    13426L, 20474L, 23065L, 23922L, NA, 130298L, 152668L, 181088L, 
    206614L, 235772L, 237253L, 279957L, 318056L, 358768L, NA), 
    DY3 = c(3575L, 3706L, 12912L, 19155L, 13220L, 16916L, 26251L, 
    26150L, NA, NA, 147618L, 174635L, 207566L, 236415L, 267384L, 
    270713L, 317459L, 359736L, NA, NA), DY4 = c(3544L, 3687L, 
    14249L, 20120L, 13907L, 18420L, 28964L, NA, NA, NA, 159509L, 
    189384L, 225943L, 255642L, 286297L, 288991L, 342275L, NA, 
    NA, NA), DY5 = c(3700L, 3547L, 15801L, 20645L, 14383L, 18605L, 
    NA, NA, NA, NA, 165517L, 196637L, 235993L, 265007L, 294015L, 
    299322L, NA, NA, NA, NA), DY6 = c(3714L, 3511L, 15659L, 20357L, 
    14435L, NA, NA, NA, NA, NA, 169048L, 200010L, 243255L, 269450L, 
    297560L, NA, NA, NA, NA, NA), DY7 = c(3714L, 4306L, 15766L, 
    20517L, NA, NA, NA, NA, NA, NA, 170988L, 202436L, 245759L, 
    272050L, NA, NA, NA, NA, NA, NA), DY8 = c(3681L, 4235L, 15811L, 
    NA, NA, NA, NA, NA, NA, NA, 171730L, 203752L, 247297L, NA, 
    NA, NA, NA, NA, NA, NA), DY9 = c(3669L, 4231L, NA, NA, NA, 
    NA, NA, NA, NA, NA, 171854L, 204294L, NA, NA, NA, NA, NA, 
    NA, NA, NA), DY10 = c(3595L, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, 172295L, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("ID", 
"A", "DY1", "DY2", "DY3", "DY4", "DY5", "DY6", "DY7", "DY8", 
"DY9", "DY10"), class = "data.frame", row.names = c(NA, -20L))

Then we can simply split the data.frame into chunks based on ID and apply the function to each subset.

library('ChainLadder')

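# Helper: stack a named list of named vectors into one data.frame,
# adding each list name as an `id` column
dstack<-function(x) do.call(rbind, Map(function(id, dd) 
    cbind.data.frame(id=id, data.frame(as.list(dd))), names(x), x))

# Split by ID (dropping the ID and A columns), fit each triangle,
# summarise each fit, then keep only the total IBNR and Mack S.E.
dstack(lapply(lapply(lapply(split(dd[,-(1:2)], dd$ID), MackChainLadder), summary), 
    function(x) setNames(x$Totals[c("IBNR:","Mack S.E.:"),],c("IBNR","Mack S.E."))))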

Here we split the data (leaving off the first two columns) and run MackChainLadder on each subset. Then we call summary on each of those objects (to calculate the statistics) and finally extract the statistics. I then use dstack to stack the values back into a data.frame and merge in the ID for each of the companies. With this sample data, I got the result

   id      IBNR Mack.S.E.
27 27  35588.59  8839.536
35 35 481647.16 17130.185
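
If the nested lapply calls are hard to follow, the same split/fit/summarise/stack steps can be written out one at a time. This is only a sketch, not the one true way to do it: mack_totals and ibnr_by_id are hypothetical helper names (not part of ChainLadder), and the tryCatch is my own assumption that, with thousands of IDs, you would rather get NA for a triangle that fails to fit than have the whole run stop. It also assumes the first two columns are ID and A, as in dd.

```r
library(ChainLadder)

# Hypothetical helper: fit one triangle and return its total IBNR and
# Mack S.E. as a one-row data.frame.
mack_totals <- function(tri, id) {
  tot <- tryCatch(
    summary(MackChainLadder(tri))$Totals,
    error = function(e) NULL  # assumption: a failed fit becomes NA
  )
  if (is.null(tot)) {
    return(data.frame(ID = id, IBNR = NA_real_, Mack.S.E = NA_real_,
                      row.names = NULL))
  }
  # Rows of Totals are labelled "IBNR:" and "Mack S.E.:" (same labels
  # as used in the answer above); the values sit in the first column.
  data.frame(ID = id,
             IBNR = tot["IBNR:", 1],
             Mack.S.E = tot["Mack S.E.:", 1],
             row.names = NULL)
}

# Hypothetical helper: one triangle per ID in, one row per ID out.
# Assumes column 1 is the ID and column 2 is the accident year.
ibnr_by_id <- function(data) {
  triangles <- split(data[, -(1:2)], data[[1]])
  rows <- Map(mack_totals, triangles, names(triangles))
  do.call(rbind, rows)
}
```

With the sample data above you would call ibnr_by_id(dd). Note that the ID column comes back as character, because it is taken from the names of the split list.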
