4

I am trying to deal with an official .xlsm file provided officially by a earthquake research center. You can download the file from theXLSM. What I am trying to do is to code R that change specific parts of the xlsm and then re-evaluate the .xlsm and extract part of the excel file.

Using the freshly downloaded .xlsm file, I extract things without changing the file first:

library(XLConnect)
wb <- loadWorkbook("NGAW2_GMPE_Spreadsheets_v5.7_041415_Protected.xlsm", 
                   create = TRUE)
setStyleAction(wb,XLC$"STYLE_ACTION.NONE")
library(readxl)
read_excel("NGAW2_GMPE_Spreadsheets_v5.7_041415_Protected.xlsm", range = "E23:G46",col_names=FALSE)

The results are:

    # A tibble: 24 x 3
     X__1  X__2  X__3
    <dbl> <dbl> <dbl>
 1 0.0100 0.443 0.812
 2 0.0200 0.456 0.838
 3 0.0300 0.506 0.938
 4 0.0500 0.632 1.19 
 5 0.0750 0.798 1.53 
 6 0.100  0.902 1.74 
 7 0.150  1.01  1.95 
 8 0.200  0.978 1.88 
 9 0.250  0.893 1.71 
10 0.300  0.800 1.55 
# ... with 14 more rows

Then I changed some part of the .xlsm and wanted to extract the updated things via the following code.

weight=t(as.matrix(c(1,0,0,0,0)));
writeWorksheet(wb, weight,
               "Main",startRow = 14, 
               startCol = 3, header = FALSE)
saveWorkbook(wb)
read_excel("NGAW2_GMPE_Spreadsheets_v5.7_041415_Protected.xlsm", range = "E23:G46",col_names=FALSE)

However, the results didn't change as I expected:

# A tibble: 24 x 3
     X__1  X__2  X__3
    <dbl> <dbl> <dbl>
 1 0.0100 0.443 0.812
 2 0.0200 0.456 0.838
 3 0.0300 0.506 0.938
 4 0.0500 0.632 1.19 
 5 0.0750 0.798 1.53 
 6 0.100  0.902 1.74 
 7 0.150  1.01  1.95 
 8 0.200  0.978 1.88 
 9 0.250  0.893 1.71 
10 0.300  0.800 1.55 
# ... with 14 more rows

When I then opened the .xlsm file, it first pops up something like (please click and see):

EXCEL WARNING

EXCEL LOG

After this, in excel the value is changed as expected. I saved the excel file and then I went back to R and run the code:

read_excel("NGAW2_GMPE_Spreadsheets_v5.7_041415_Protected.xlsm", range = "E23:G46",col_names=FALSE)

And finally it is changed:

    # A tibble: 24 x 3
     X__1  X__2  X__3
    <dbl> <dbl> <dbl>
 1 0.0100 0.411 0.779
 2 0.0200 0.421 0.798
 3 0.0300 0.447 0.845
 4 0.0500 0.508 0.952
 5 0.0750 0.655 1.23 
 6 0.100  0.792 1.50 
 7 0.150  0.986 1.93 
 8 0.200  0.978 1.92 
 9 0.250  0.870 1.72 
10 0.300  0.751 1.50 
# ... with 14 more rows

Does any one know how to avoid the openExcel -> clickYes -> clickClose -> Save stuff?

Thank you all very much!

An maybe not relevant update:

This problem won't happen in MATLAB though... However, it makes me feel really bad that I have to either write the whole code again in MATLAB or I have to explore some R-MATLAB transition stuff...

Still looking for a better solution...

This is a good reference if you guys are interested: https://mandymejia.wordpress.com/2014/08/18/three-ways-to-use-matlab-from-r/

outputRange='E23:I43';
filename = 'NGAW2_GMPE_Spreadsheets_v5.7_041415_Protected.xlsm';
sheet = 1;
xlsread(filename,sheet,outputRange)

ans =

    0.0100    0.4111    0.7787    0.2170    0.0010
    0.0200    0.4214    0.7981    0.2225    0.0042
    0.0300    0.4466    0.8449    0.2361    0.0100
    0.0500    0.5082    0.9521    0.2713    0.0315
    0.0750    0.6551    1.2259    0.3500    0.0915
    0.1000    0.7922    1.4991    0.4186    0.1967
    0.1500    0.9858    1.9298    0.5035    0.5506
    0.2000    0.9778    1.9221    0.4974    0.9709
    0.2500    0.8704    1.7183    0.4409    1.3504
    0.3000    0.7510    1.4951    0.3773    1.6779
    0.4000    0.5787    1.1618    0.2883    2.2985
    0.5000    0.4801    0.9721    0.2371    2.9796
    0.7500    0.3263    0.6692    0.1591    4.5566
    1.0000    0.2419    0.5025    0.1164    6.0048
    1.5000    0.1528    0.3202    0.0729    8.5354
    2.0000    0.1094    0.2313    0.0518   10.8676
    3.0000    0.0666    0.1396    0.0318   14.8858
    4.0000    0.0471    0.0979    0.0227   18.7140
    5.0000    0.0351    0.0730    0.0169   21.7952
    7.5000    0.0187    0.0389    0.0090   26.1710
   10.0000    0.0112    0.0233    0.0054   27.8543

M=7;
F=1;
VS30=500;
R=30;
xlswrite(filename,M,sheet,'B24');
xlswrite(filename,R,sheet,'B27');
xlswrite(filename,R,sheet,'B30');
xlswrite(filename,R,sheet,'B33');
xlswrite(filename,999,sheet,'B36');
xlswrite(filename,VS30,sheet,'B39');

xlsread(filename,sheet,outputRange)

ans =

    0.0100    0.1230    0.2293    0.0659    0.0003
    0.0200    0.1252    0.2333    0.0672    0.0012
    0.0300    0.1304    0.2416    0.0704    0.0029
    0.0500    0.1504    0.2780    0.0813    0.0093
    0.0750    0.1904    0.3516    0.1031    0.0266
    0.1000    0.2292    0.4255    0.1234    0.0569
    0.1500    0.2925    0.5520    0.1550    0.1634
    0.2000    0.3194    0.6154    0.1658    0.3172
    0.2500    0.3077    0.6038    0.1568    0.4774
    0.3000    0.2663    0.5270    0.1346    0.5949
    0.4000    0.2126    0.4244    0.1065    0.8444
    0.5000    0.1747    0.3517    0.0868    1.0841
    0.7500    0.1163    0.2371    0.0570    1.6235
    1.0000    0.0840    0.1736    0.0407    2.0864
    1.5000    0.0530    0.1105    0.0254    2.9604
    2.0000    0.0371    0.0779    0.0176    3.6807
    3.0000    0.0225    0.0468    0.0108    5.0209
    4.0000    0.0163    0.0336    0.0079    6.4612
    5.0000    0.0127    0.0263    0.0062    7.9098
    7.5000    0.0085    0.0176    0.0041   11.8785
   10.0000    0.0059    0.0123    0.0029   14.7570
olk
  • 61
  • 1
  • 5
  • Saving the worksheet object to a file does just that: saves the data. The Excel "engine" is not instantiated, so you should not expect "save to a file" to be the same as "save and recalculate how Excel would do". To do that, you will need to actually *use Excel*; perhaps you may consider [`RDCOMClient`](http://www.omegahat.net/RDCOMClient/) by omegahat. – r2evans May 19 '18 at 21:08
  • Yes, when saving the file, we should not expect that the recalculation. However, should "read_excel()" have recalculated all the things? Maybe the excel error as shown in the pictures have hindered the "read_excel()" function from the recalculation? – olk May 19 '18 at 21:16
  • No. The recalculation is done *by Excel*, not by reading the file. Cells within the worksheets have dependencies on other cells (i.e., formulae), and those are not magic: Excel looks for them, does something based on them, and then updates the apparent values. Only when Excel has done this does reading it (with not-Excel) produce the values you want. A formula in a worksheet cell is relevant *only to Excel*, not to the file format. (Half-jokingly: you are welcome to extract the formulae and implement an Excel-like parsing engine that does things in R instead of needing Excel. Good luck :-) – r2evans May 19 '18 at 21:32
  • Thanks a lot e2evans. I agree that whatever we do we need excel somehow. It seems the MATLAB is doing better than the readxl package though... See the Answer I posted below if you are interested. – olk May 19 '18 at 22:01
  • I cannot speak to the matlab implementation of reading/writing Excel files, but it is at least a suggestion that a solution might exist. I can't off-hand find an easy DCOM method, but I suggest that that would be the most robust method: have Excel handle recalculating Excel formulae within the xlsx/xlsm file. There seem to be some good connections using things like `RExcel` and/or `rcom`, though the latter appears to be gone from CRAN and not updated in many years. This is an interesting problem, sorry I don't have much to offer yet. – r2evans May 19 '18 at 22:10

1 Answers1

0

This problem won't happen in MATLAB though... However, it makes me feel really bad that I have to either write the whole code again in MATLAB or I have to explore some R-MATLAB transition stuff...

Still looking for a better solution...

This is a good reference if you guys are interested: https://mandymejia.wordpress.com/2014/08/18/three-ways-to-use-matlab-from-r/

outputRange='E23:I43';
filename = 'NGAW2_GMPE_Spreadsheets_v5.7_041415_Protected.xlsm';
sheet = 1;
xlsread(filename,sheet,outputRange)

ans =

    0.0100    0.4111    0.7787    0.2170    0.0010
    0.0200    0.4214    0.7981    0.2225    0.0042
    0.0300    0.4466    0.8449    0.2361    0.0100
    0.0500    0.5082    0.9521    0.2713    0.0315
    0.0750    0.6551    1.2259    0.3500    0.0915
    0.1000    0.7922    1.4991    0.4186    0.1967
    0.1500    0.9858    1.9298    0.5035    0.5506
    0.2000    0.9778    1.9221    0.4974    0.9709
    0.2500    0.8704    1.7183    0.4409    1.3504
    0.3000    0.7510    1.4951    0.3773    1.6779
    0.4000    0.5787    1.1618    0.2883    2.2985
    0.5000    0.4801    0.9721    0.2371    2.9796
    0.7500    0.3263    0.6692    0.1591    4.5566
    1.0000    0.2419    0.5025    0.1164    6.0048
    1.5000    0.1528    0.3202    0.0729    8.5354
    2.0000    0.1094    0.2313    0.0518   10.8676
    3.0000    0.0666    0.1396    0.0318   14.8858
    4.0000    0.0471    0.0979    0.0227   18.7140
    5.0000    0.0351    0.0730    0.0169   21.7952
    7.5000    0.0187    0.0389    0.0090   26.1710
   10.0000    0.0112    0.0233    0.0054   27.8543

M=7;
F=1;
VS30=500;
R=30;
xlswrite(filename,M,sheet,'B24');
xlswrite(filename,R,sheet,'B27');
xlswrite(filename,R,sheet,'B30');
xlswrite(filename,R,sheet,'B33');
xlswrite(filename,999,sheet,'B36');
xlswrite(filename,VS30,sheet,'B39');

xlsread(filename,sheet,outputRange)

ans =

    0.0100    0.1230    0.2293    0.0659    0.0003
    0.0200    0.1252    0.2333    0.0672    0.0012
    0.0300    0.1304    0.2416    0.0704    0.0029
    0.0500    0.1504    0.2780    0.0813    0.0093
    0.0750    0.1904    0.3516    0.1031    0.0266
    0.1000    0.2292    0.4255    0.1234    0.0569
    0.1500    0.2925    0.5520    0.1550    0.1634
    0.2000    0.3194    0.6154    0.1658    0.3172
    0.2500    0.3077    0.6038    0.1568    0.4774
    0.3000    0.2663    0.5270    0.1346    0.5949
    0.4000    0.2126    0.4244    0.1065    0.8444
    0.5000    0.1747    0.3517    0.0868    1.0841
    0.7500    0.1163    0.2371    0.0570    1.6235
    1.0000    0.0840    0.1736    0.0407    2.0864
    1.5000    0.0530    0.1105    0.0254    2.9604
    2.0000    0.0371    0.0779    0.0176    3.6807
    3.0000    0.0225    0.0468    0.0108    5.0209
    4.0000    0.0163    0.0336    0.0079    6.4612
    5.0000    0.0127    0.0263    0.0062    7.9098
    7.5000    0.0085    0.0176    0.0041   11.8785
   10.0000    0.0059    0.0123    0.0029   14.7570
olk
  • 61
  • 1
  • 5
  • 2
    If you're asking for somebody to answer with an R solution, then this does not fit as an *answer*. If that is the case, I suggest you include this information in your question and delete this answer. Rationale: (1) people with similar questions will not be able to use a matlab answer for an R question; (2) answerers looking at questions often skip over those with 1 or more answers already, so this will possibly discourage more help. – r2evans May 19 '18 at 22:04