11

Background

I have an extremely large data table that takes up to 12 hours to run for around 1 million input scenarios on a high-end 64bit machine. The scenarios are based on a number of discrete Excel models, that are then fed into a financial model for detailed calculations

To improve the process, I am looking to test and compare the speeds of:

  1. The current manual process
  2. Using VBA to refresh the Data Table (with Calculation, ScreenUpdating etc off)
  3. Running a VBS to refresh the Data Table in a invisible Excel instance

So, I am looking for the best approach to programmatically manage a Data Table

Update: using code in (2) and (3) did not provide a benefit on testing a simple example with a workbook with a single large data table

Rather surprisingly there seems to be very little - possibly no - direct support in VBA for Data Tables

My current knowledge and literature search

  • QueryTable BeforeRefresh and AfterRefresh Events can be added with this class module code. Intellisense doesn't provide this as an option for Data Tables
  • Individual PivotTables and QuertyTables can be accessed like so ActiveWorkbookk.Sheets(1).QueryTables(1). Not so Data Tables
  • Eliminating all other Data Tables and then running a RefreshAll was suggested in this MrExcel thread as a workaround.

The workaround is certainly do-able as I only have a single Data Table, but I'd prefer a direct approach if one exists.

Yes, I'm sticking to Excel :)

Please do not suggest other tools for this approach, both the input models and the overarching model that uses the data table are

  • part of a well established ongoing process that will stay Excel based,
  • have been professionally audited,
  • have been streamlined and optimised by some experience Excel designers

I was simply curious if there was a way to tweak the process by refreshing a specific data table with code, which my initial test results above have concluded no to.

ashleedawg
  • 20,365
  • 9
  • 72
  • 105
brettdj
  • 54,857
  • 16
  • 114
  • 177
  • I have limited experience in this area, so this is just a suggestion. For scenario testing my colleagues use, variously, R, https://www.palisade.com/risk/, http://www.towerswatson.com/en/Services/Tools/igloo. That said, 1M scenarios is a lot, and I honestly don't know how any of these tools would perform under those conditions. – andy holaday Jan 14 '14 at 03:47
  • Andy, Thanks. I actually have one million discrete scenarios to run, as opposed to a monte carlo using continuous and/or discrete probabality distributions (@risk or crystall ball). These scenarios feed a financial model, which in turn outputs a number of metrics for each run – brettdj Jan 14 '14 at 09:08
  • brettdj, what exactly do you want help with? How to only refresh the data table calculation without the rest of the model? Else, why not simply use `.Calculate` on the worksheet with the specific data table to initiate and time the full calculation? Or am I missing something here? – Peter Albert Mar 04 '14 at 00:07
  • @PeterAlbert I wanted to see if it was possible to do this directly. Thx Philip for the bounty :) – brettdj Mar 04 '14 at 10:27
  • `directly` meaning what? Only recalculating the data table? – Peter Albert Mar 04 '14 at 10:50
  • 2
    @brettdj - I am curious... why are you running such a massive simulation in Excel? Wouldn't this be better handled in a more direct programming language? – Roberto Mar 05 '14 at 04:14
  • @roberto the scenarios are constructed from various existing Excel building blocks in a template form, which in turn are fed to an overall financial model – brettdj Mar 05 '14 at 07:29
  • 1
    Just wanted to suggest that you look into PowerPivot to get data into Excel. It should be much faster. – Excel Developers Mar 04 '14 at 14:14
  • I know excel is perfect to do montecarlo scenarios, but for a huge model like that I dont think excel is your best option. I would go to c# or vb.net then just print your report in excel. you can also read the data from excel. or you can build your model in vba wich will be harder – bto.rdz Mar 07 '14 at 19:04

1 Answers1

1

So, you are looking for the best approach to programmatically manage a Data Table.

Well, Excel 2013 does record a macro for me when I manually create a data table, it goes

Selection.Table ColumnInput:=Range("G4")

The signature is

Range.Table(RowInput as Range, ColumnInput as Range) as Boolean

which is documented in Range.Table Method. The Range.Table() function seems to always return true.

This is the only way to create data tables using VBA. But that's all there is to data tables anyway.

AFAIK there is no class or object for data tables, so there is no dt.refresh() or similar method. And there is no collection of data tables you could query. You have to refresh the sheet or recreate the table with Range.Table().

There is a DataTable Interface, but it is related to charts and has nothing to do with Range.Table().

As you mention, you should turn off the usual suspects, i.e.

Application.ScreenUpdating = False
Application.DisplayStatusBar = False
Application.Calculation = xlCalculationManual
Application.EnableEvents = False

Try to have as little formulas in your workbook. Remove all formulas not related to the cells you base the data table on. Remove any intermediate results. Best have one cell with one, possibly big, formula.

Example: G4 is your ColumnInput, and it contains =2*G3, with G3 containing =G1+G2, then better put =2*(G1+G2) into G4.

You may have 6 cores in your high end machine. Divide your scenarios into 6 chunks and have 6 Excel instances calculate them in parallel.

Wolfgang Kuehn
  • 12,206
  • 2
  • 33
  • 46
  • 1
    Thx for your comments. I am well versed in optimising Excel (check my profile) and the models themselves have already been stripped right down and thoroughly optimised using Charles William's [fast excel](https://www.google.com.au/search?q=charkes+willaim+fastexcel&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-beta&channel=sb&gfe_rd=cr&ei=uYgaU9TmGcKN8QfflICACw#channel=sb&q=charles+william+fastexcel&rls=org.mozilla:en-US:official&spell=1). My query was about direct exposure to refreshing data tables via code, not about creating them, or general optimisation. – brettdj Mar 08 '14 at 03:05
  • I have added further clarity to my initial question. – brettdj Mar 08 '14 at 03:13
  • You specifically asked for a direct api. So there is one, not much, but it may be useful to you. So there is no refresh, then create. Create will refresh. There is no penalty performancewise. – Wolfgang Kuehn Mar 08 '14 at 22:25