0

I'm trying to create a bootstrap using arrays.

I have a dataset of two stocks with data in columns B and C:

16/08/2016  79.84   70.87
15/08/2016  80.26   71.79
12/08/2016  80.22   71.7
11/08/2016  80.56   71.98
10/08/2016  80.55   71.21
09/08/2016  81.5    73.05
08/08/2016  81.6    72.25
05/11/1990  17.5625 6.4011
02/11/1990  17.0938 6.4358
01/11/1990  17      6.5137
31/10/1990  16.8438 6.583
30/10/1990  17.3438 6.4444
29/10/1990  17.7813 6.3232

I need to pick randomly 10,000 times from column C, take an average and display in cell D1. Next, generate random 10,000 times and display in D2, etc. Similar for E1, E2, ..., En, but based on the data in column C.

Because data sets are large with two or more stocks and the methodology requires many runs, simple printing of the average every time to each cell, takes a long time. So I need to use arrays. I can do it "manually", but I am sure there is a better way.

The code takes a long time to run because it prints to the spreadsheet every time. I am storing simulated values in the simval() array, but it is deleted upon Redim.

Ideally, I want to lift data into an array, simulate a new distribution and print results next to the original data.

Sub bstrap()
Dim miRange As Range
Dim avgsimvals() As Double, simval() As Double
Dim start As Double, secs As Double
start = Timer
r = Range("A1").CurrentRegion.Rows.Count
c = 4
Set miRange = Range(Cells(1, 2), Cells(r, 2))
For j = 1 To 100 '100 runs takes over 90 secs, approx 2.5 hrs for 10,000
    ReDim simval(1 To r)
    For i = 1 To r
        simval(i) = WorksheetFunction.Index(miRange, r * Rnd() + 1)
        Cells(i, c).Value = simval(i)
    Next i
    c = c + 1
Next j

secs = Round(Timer - start, 6)
MsgBox "run in " & secs, vbInformation

End Sub
Community
  • 1
  • 1

2 Answers2

0

use arrays

your narrative is not so clear to me as to your exact goal, but you can start from this:

Option Explicit

Sub bstrap2()
    Dim start As Double
    Dim dataSet As Variant

    start = Timer
    With Worksheets("Boostrap")
        dataSet = Application.Transpose(.Range("B1:B" & .Cells(.Rows.Count, 1).End(xlUp).Row).Value)
        Range("D1") = GetRandomizedAverage(dataSet, 10000)

        dataSet = Application.Transpose(.Range("C1:C" & .Cells(.Rows.Count, 1).End(xlUp).Row).Value)
        Range("E1") = GetRandomizedAverage(dataSet, 10000)
    End With

    MsgBox "run in " & Round(Timer - start, 6), vbInformation
End Sub

Function GetRandomizedAverage(dataSet As Variant, repetitions As Long) As Double
    Dim i As Long, j As Long, r As Long

    r = UBound(dataSet)
    ReDim simval(1 To repetitions, 1 To r)

    For j = 1 To repetitions
        For i = 1 To r
            simval(j, i) = dataSet(Int(r * Rnd() + 1))
        Next i
    Next j
    GetRandomizedAverage = WorksheetFunction.Average(simval)
End Function
user3598756
  • 28,893
  • 4
  • 18
  • 28
  • Hi, thanks for quick response. I understand that it is not clear, apologies for that. However, you understood what I want. I have over 6,500 rows of data in columns B and C, stock prices. I want to generate a similar data set, ie 6,500 rows, using re-sampling technique. Your code is not running on my PC, throwing "out of memory" error and after me amending 10000 down to 1000, it took it 7 seconds to generate two cells of data. I want to generate data using re-sampling to populate Range("D1:E6500"). Wondering, if this code will be efficient for this purpose? Thanks again. – VictorD Oct 02 '16 at 17:42
  • Please be more descriptive about needed calculations. And when reporting errors always specify which line throws them. – user3598756 Oct 02 '16 at 18:29
0

Let me start with some comments. First, you might want to consider using any other language as soon as you start doing simulations. VBA is not known for its speedy calculations. Second (although off topic), why do you want to bootstrap timeseries data?
Third, are you aware of the fact that your own code is not performing the desired computations? It is simply writing the randomly drawn observation to the worksheet. No averaging is done anywhere.

Now to my answer. Here is a piece of code that runs reasonably quickly. Running 10,000 simulations for 6500 rows in 2 columns takes about 5 minutes. Not bad, no world record either.

The most important element, that you can see in @user3598756's code as well, is that you stash every simulated value into an array first. The final step is writing the arrays to the worksheet. The less separate parts where data is written to a worksheet the better, as this is a major source of slowness.

*P.S. this code is simplified and should be made more robust if used.

Sub Bootstrap()

Const NSIM As Long = 100#
Const COL_START As Long = 4
Const COL_END As Long = 5

Dim rng_output As Range, rng_input As Range, ws As Worksheet
Dim i As Long, j As Long, nRows As Long, dCol As Long
Dim sim_vals() As Variant, sim_avg() As Variant
Dim start As Double

start = Timer

Set ws = ActiveSheet                'change to appriopriate sheet if needed

nRows = ws.UsedRange.Rows.Count
dCol = COL_START

Set rng_input = ws.Range(ws.Cells(1, dCol - 2), ws.Cells(nRows, dCol - 2))

Do While dCol < COL_END + 1
    ReDim sim_vals(1 To NSIM)
    ReDim sim_avg(1 To nRows)
    
    For j = 1 To nRows
        For i = 1 To NSIM
            sim_vals(i) = rng_input(Int(nRows * rnd() + 1))
        Next i
        sim_avg(j) = Application.WorksheetFunction.Sum(sim_vals) / NSIM
    Next j
    
    ws.Range(ws.Cells(1, dCol), ws.Cells(nRows, dCol)) = Application.WorksheetFunction.Transpose(sim_avg)
    dCol = dCol + 1
    Set rng_input = ws.Range(ws.Cells(1, dCol - 2), ws.Cells(nRows, dCol - 2))
Loop

MsgBox "Code took " & Round(Timer - start, 2) & " seconds to run"

End Sub
Christov
  • 178
  • 5