
I would like to calculate the weighted average life (WAL) of a loan over time in R. The formula to calculate the WAL is given here.

I have the following sample data created in R.

Sample data

library(data.table)
DT <- data.table(date = c(rep(seq(from = 2015, to = 2016.25, by = .25), 2),
                          seq(from = 2015, to = 2017.5, by = .5)),
                 value = c(rep(100, 5), 0, 100, 80, 60, 40, 20, 0,
                           100, 70, 40, 30, 20, 0),
                 id = rep(c("a", "b", "c"), each = 6))

DT

       date value id
 1: 2015.00   100  a
 2: 2015.25   100  a
 3: 2015.50   100  a
 4: 2015.75   100  a
 5: 2016.00   100  a
 6: 2016.25     0  a
 7: 2015.00   100  b
 8: 2015.25    80  b
 9: 2015.50    60  b
 10: 2015.75    40  b
 11: 2016.00    20  b
 12: 2016.25     0  b
 13: 2015.00   100  c
 14: 2015.50    70  c
 15: 2016.00    40  c
 16: 2016.50    30  c
 17: 2017.00    20  c
 18: 2017.50     0  c

Thus every loan in this example amortizes over five periods and is completely repaid at its maturity date. Note: the dates are not always incremented by one quarter or one half year, but may differ (see the sample data).
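As a concrete check of the definition, the WAL of loan "b" at its first date can be computed by hand from the sample data:

```r
# WAL of loan "b" at date 2015.00, computed directly from the definition:
# principal repayments of 20 occur 0.25, 0.50, 0.75, 1.00 and 1.25 years
# after the valuation date, and the outstanding principal is 100.
payments <- rep(20, 5)
times <- c(0.25, 0.50, 0.75, 1.00, 1.25)
sum(payments * times) / 100
# [1] 0.75
```

This matches the value of 0.750 for loan "b" at 2015.00 in the output further below.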

To calculate the WAL I have created the following R code:

Counter <- unique(DT$id)

# LOOP OVER ID
for (i in 1:length(Counter)) {

  # SUBSET ONE ID
  DTSub <- DT[id == Counter[i], ]

  # LOOP OVER THE AMORTIZATION DATES
  CounterSub <- unique(DTSub$date)

  for (j in 1:length(CounterSub)) {

    # SUBSET RANGE OF DATES IN COUNTERSUB
    DTSub_Date <- DTSub[date >= CounterSub[j], ]
    DTSub_Date[, t := abs(min(date) - date)]
    DT[id == Counter[i] & date == CounterSub[j],
       wal_calc := round(sum(abs(diff(DTSub_Date$value))
                             / max(DTSub_Date$value) * DTSub_Date$t[2:nrow(DTSub_Date)]), 3)]
  }
}

The output of the code

DT

       date value id wal_calc
 1: 2015.00   100  a    1.250
 2: 2015.25   100  a    1.000
 3: 2015.50   100  a    0.750
 4: 2015.75   100  a    0.500
 5: 2016.00   100  a    0.250
 6: 2016.25     0  a    0.000
 7: 2015.00   100  b    0.750
 8: 2015.25    80  b    0.625
 9: 2015.50    60  b    0.500
 10: 2015.75    40  b    0.375
 11: 2016.00    20  b    0.250
 12: 2016.25     0  b    0.000
 13: 2015.00   100  c    1.300
 14: 2015.50    70  c    1.143
 15: 2016.00    40  c    1.125
 16: 2016.50    30  c    0.833
 17: 2017.00    20  c    0.500
 18: 2017.50     0  c    0.000

The output of the code is correct (wal_calc), but it uses a double for loop and is therefore slow on relatively large datasets (mine has 77k rows and 200 columns).

The first for loop subsets the IDs and the second subsets future dates (by id, based on the first subset).

Request

I would like to generate the WALs for this sample data in a faster, more efficient manner that avoids the double for loop. There might be a very simple solution to this problem.

If anything is unclear please let me know.

Dave van Brecht

2 Answers


This will do it without for loops.

DT[order(date), WAL := {
  # payment matrix: each of the n2 columns holds the principal
  # repayments between consecutive dates
  pmts <- matrix(value[-.N] - value[-1L],
                 nrow = n2 <- .N - 1L, ncol = n2)
  # time-gap matrix: after zeroing the upper triangle and taking column-wise
  # cumulative sums, column j holds the time from date j to each later payment
  ts <- matrix(date[-1L] - date[-.N], nrow = n2, ncol = n2)
  ts[upper.tri(ts)] <- 0
  ts <- apply(ts, 2, cumsum)
  # WAL at each date = sum(payment * time) / outstanding principal;
  # the final (fully amortized) date gets 0
  c(colSums(pmts * ts) / value[-.N], 0)
}, by = id]
DT
#      date value id       WAL
# 1: 2015.00   100  a 1.2500000
# 2: 2015.25   100  a 1.0000000
# 3: 2015.50   100  a 0.7500000
# 4: 2015.75   100  a 0.5000000
# 5: 2016.00   100  a 0.2500000
# 6: 2016.25     0  a 0.0000000
# 7: 2015.00   100  b 0.7500000
# 8: 2015.25    80  b 0.6250000
# 9: 2015.50    60  b 0.5000000
# 10: 2015.75    40  b 0.3750000
# 11: 2016.00    20  b 0.2500000
# 12: 2016.25     0  b 0.0000000
# 13: 2015.00   100  c 1.3000000
# 14: 2015.50    70  c 1.1428571
# 15: 2016.00    40  c 1.1250000
# 16: 2016.50    30  c 0.8333333
# 17: 2017.00    20  c 0.5000000
# 18: 2017.50     0  c 0.0000000
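For completeness, here is a self-contained run of this expression on the question's sample data, with two spot checks against the wal_calc values computed above:

```r
library(data.table)

# recreate the question's sample data
DT <- data.table(date = c(rep(seq(2015, 2016.25, by = .25), 2),
                          seq(2015, 2017.5, by = .5)),
                 value = c(rep(100, 5), 0, 100, 80, 60, 40, 20, 0,
                           100, 70, 40, 30, 20, 0),
                 id = rep(c("a", "b", "c"), each = 6))

# the vectorized WAL computation from the answer
DT[order(date), WAL := {
  pmts <- matrix(value[-.N] - value[-1L], nrow = n2 <- .N - 1L, ncol = n2)
  ts <- matrix(date[-1L] - date[-.N], nrow = n2, ncol = n2)
  ts[upper.tri(ts)] <- 0
  ts <- apply(ts, 2, cumsum)
  c(colSums(pmts * ts) / value[-.N], 0)
}, by = id]

# spot checks against the loop result in the question
DT[id == "b" & date == 2015, WAL]             # 0.75
DT[id == "c" & date == 2015.5, round(WAL, 3)] # 1.143
```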
MichaelChirico

You could use an apply-family function instead of the first subset; then you would need only one for loop.

ids <- unique(DT$id)

# build the per-id subsets in one step, replacing the outer for loop
subsets <- lapply(ids, function(i) DT[id == i])

# the inner loop then runs over unique(subsets[[k]]$date) for each subset
Seekheart
  • Thanks Seekheart. However, ideally I would like a fast data.table solution, since the function is used in a Shiny application and should calculate the WALs on the fly (i.e. as fast and efficiently as possible). There must be other ways of doing this. I already searched for specific packages but could not find any. – Dave van Brecht Feb 23 '16 at 16:20