
After some experiments I ended up with the following code to perform search and replace in MS Word. The code also works in headers and footers, including the cases in which the header and/or footer differ for the first page or for odd/even pages.

The problem is that I need to call MSWordSearchAndReplaceInAllDocumentParts for every string I replace, and the performance is unacceptable (2 minutes for about 50 strings in a 4-page Word document). Ideally it should be "instantaneous", of course.

Before handling headers and footers I was doing search and replace only in the main document (using wdSeekMainDocument). In that case the performance was acceptable (even if quite slow). I wonder why it is so slow now: does switching the view take time? Headers and footers typically contain only a few words, so I expected that searching and replacing in them would not make the overall performance much worse. But this is not what I observed.

This is the code; the profiler results are at the bottom:

// global variable (just for convenience of posting to Stack Overflow)   
var
 aWordApp: OLEVariant; // global

// This is the function that is executed once per every  string I replace
function MSWordSearchAndReplaceInAllDocumentParts;
begin
    try
      iseekValue := aWordApp.ActiveWindow.ActivePane.View.SeekView;
      iViewType := aWordApp.ActiveWindow.ActivePane.View.Type;
      if iViewType <> wdPrintView then
        aWordApp.ActiveWindow.ActivePane.View.Type := wdPrintView;
      if aWordApp.ActiveDocument.PageSetup.OddAndEvenPagesHeaderFooter then
      begin
        Try
          aWordApp.ActiveWindow.ActivePane.View.SeekView := wdSeekEvenPagesFooter;
          SearchAndReplaceInADocumentPart;
        Except
            // do nothing ..it was not able to set above view
        end;
        Try
          aWordApp.ActiveWindow.ActivePane.View.SeekView := wdSeekEvenPagesHeader;
          SearchAndReplaceInADocumentPart;
        Except
          // do nothing ..it was not able to set above view
        end;
      end;
      if aWordApp.ActiveDocument.PageSetup.DifferentFirstPageHeaderFooter then
      begin
        Try
          aWordApp.ActiveWindow.ActivePane.View.SeekView := wdSeekFirstPageFooter;
          SearchAndReplaceInADocumentPart;
        Except
          // do nothing ..it was not able to set above view
        end;
        Try
          aWordApp.ActiveWindow.ActivePane.View.SeekView := wdSeekFirstPageHeader;
          SearchAndReplaceInADocumentPart;
        Except
          // do nothing ..it was not able to set above view
        end;
      end;
      //Replace in Main Docpart
      Try
        aWordApp.ActiveWindow.ActivePane.View.SeekView := wdSeekMainDocument;
        SearchAndReplaceInADocumentPart;
      Except
          // do nothing ..it was not able to set above view
      end;
      //Replace in Header
      Try
        aWordApp.ActiveWindow.ActivePane.View.SeekView := wdSeekCurrentPageHeader;
        SearchAndReplaceInADocumentPart;
      Except
          // do nothing ..it was not able to set above view
      end;
      //Replace in Footer
      Try
        aWordApp.ActiveWindow.ActivePane.View.SeekView := wdSeekCurrentPageFooter;
        SearchAndReplaceInADocumentPart;
      Except
          // do nothing ..it was not able to set above view
      end;
      //Replace in Primary Header
      Try
        aWordApp.ActiveWindow.ActivePane.View.SeekView := wdSeekPrimaryHeader;
        SearchAndReplaceInADocumentPart;
      Except
        // do nothing ..it was not able to set above view
      end;
      //Replace in Primary Footer
      Try
        aWordApp.ActiveWindow.ActivePane.View.SeekView := wdSeekPrimaryFooter;
        SearchAndReplaceInADocumentPart;
      Except
        // do nothing ..it was not able to set above view
      end;
    finally
      aWordApp.ActiveWindow.ActivePane.View.SeekView := iseekValue;
      if iViewType <> wdPrintView then
        aWordApp.ActiveWindow.ActivePane.View.Type := iViewType;
    end;
end;

// This is the function that performs Search And Replace in the selected View
 // it is called once per view

function SearchAndReplaceInADocumentPart;
begin
    aWordApp.Selection.Find.ClearFormatting;
    aWordApp.Selection.Find.Text := aSearchString;
    aWordApp.Selection.Find.Replacement.Text := aReplaceString;
    aWordApp.Selection.Find.Forward := True;
    aWordApp.Selection.Find.MatchAllWordForms := False;
    aWordApp.Selection.Find.MatchCase := True;
    aWordApp.Selection.Find.MatchWildcards := False;
    aWordApp.Selection.Find.MatchSoundsLike := False;
    aWordApp.Selection.Find.MatchWholeWord := False;
    aWordApp.Selection.Find.MatchFuzzy := False;
    aWordApp.Selection.Find.Wrap := wdFindContinue;
    aWordApp.Selection.Find.Format := False;
    { Perform the search}
    aWordApp.Selection.Find.Execute(Replace := wdReplaceAll);
end;

Here are the profiling results (I have AQTime Pro): [image: profiler results]

Can you please help me in pinpointing the problem?

Zeina
UnDiUdin
  • IF you really need performance THEN using Word via OLE/ActiveX is basically not going to cut it... is using a library (without Word dependency) for handling the Word documents an option ? – Yahia Mar 09 '12 at 17:07
  • It would be better if you could provide an appropriate sample document for benchmarking's sake. – menjaraz Mar 11 '12 at 05:16
  • Can you elaborate on the profiling results: is the time in seconds or milliseconds, is the time per hit or the cumulative of all hits? – The_Fox Mar 12 '12 at 11:00
  • @The_Fox it is in seconds; it is what happens after a substitution. In that case I in fact call the function 153 times, so I am substituting 150 different words (it is 50 in a normal db, then if the user adds custom fields the number increases). – UnDiUdin Mar 12 '12 at 16:34
  • @Yahia Yes, I agree. The fact is that this method was fast enough until I added the substitution in header and footer. My main concern is that with header and footer substitution added it seems to have become much slower, as if the full document were parsed when doing search and replace even though the active view is the header only. – UnDiUdin Mar 12 '12 at 16:36
  • @menjaraz it is quite terrible with any document – UnDiUdin Mar 12 '12 at 16:37
  • @user193655 I understand that... header and footer are treated in a very "special way" which is partially due to how they are handled inside the file format... – Yahia Mar 12 '12 at 16:49
  • Sorry, but there is still one thing unclear about the profiling results. The results, are they per hit or is it the total time of all hits? So the `Time with Children`, is it 21,78 seconds for 1 hit, or 21,78 seconds total with 153 hits (so one replace only takes 0,14 seconds). If the latter is true, there is nothing wrong with your performance and I'm afraid you can't speed it up. Office automation is rather slow. Even when using late time binding, it will still be slow. – The_Fox Mar 12 '12 at 23:11
  • @The_Fox yes, it is 21 seconds total. Anyway, the main problem is that the performance was acceptable when I was not replacing in header and footer. This is the main idea of my question. Maybe OLE automation loops through the full document even when the view is set to the footer? – UnDiUdin Mar 13 '12 at 08:11
  • Well, 21 seconds total isn't that bad for executing `MSWordSearchAndReplaceInallDocumentParts` 153 times. I have updated my answer with another possible improvement. – The_Fox Mar 13 '12 at 10:13
  • @The_Fox yes, this is probably going to be more effective. Anyway, I don't need to search and replace all 153 fields: those are all the available fields, but in the real world a document will contain 5 to 15 fields, so I should get a good improvement if I do search & replace only on the "found" fields. But for this I should be able to read all the content of the Word document as plain text, including header and footer (and all the variations: different for first page, ...). Then I can use the Delphi Pos function to locate which strings are used and then replace those. (CONTINUES) – UnDiUdin Mar 13 '12 at 11:32
  • @The_Fox (CONTINUES) But how to get the text from the word document? – UnDiUdin Mar 13 '12 at 11:32
  • You may also consider using OpenOffice OLE (http://stackoverflow.com/questions/7806041/how-to-search-and-replace-in-odt-open-office-document) for it. Sometimes (not always) it is more efficient than the MS Word one for the same docs. – philnext Mar 13 '12 at 17:16

1 Answer


I didn't see such terrible performance when testing on my machine, but still, there are ways to improve performance.

The biggest improvement comes from setting aWordApp.ActiveWindow.Visible to False before calling MSWordSearchAndReplaceInAllDocumentParts.

The second improvement is setting aWordApp.ScreenUpdating to False.

When you call MSWordSearchAndReplaceInAllDocumentParts multiple times in a row, apply the above settings once. Also, set ActiveWindow.ActivePane.View.Type to wdPrintView once, before the series of calls to MSWordSearchAndReplaceInAllDocumentParts.
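A minimal sketch of applying these settings once around the whole batch. The wrapper `ReplaceAllFields` and the `TStrings` of `search=replace` pairs are hypothetical (not part of the original code), and the sketch assumes MSWordSearchAndReplaceInAllDocumentParts takes the search and replace strings as two parameters, which the question does not show:

```pascal
// uses Classes (TStrings); aWordApp is the global OleVariant from the question
const
  wdPrintView = 3; // numeric value of Word's wdPrintView constant

// Hypothetical wrapper: aFields holds 'search=replace' pairs.
procedure ReplaceAllFields(aFields: TStrings);
var
  i: Integer;
  lOldScreenUpdating: Boolean;
begin
  // Apply the settings once, not once per replaced string
  lOldScreenUpdating := aWordApp.ScreenUpdating;
  aWordApp.ScreenUpdating := False;
  aWordApp.ActiveWindow.Visible := False;
  aWordApp.ActiveWindow.ActivePane.View.Type := wdPrintView;
  try
    for i := 0 to aFields.Count - 1 do
      // Assumed two-parameter signature (search string, replace string)
      MSWordSearchAndReplaceInAllDocumentParts(aFields.Names[i],
        aFields.ValueFromIndex[i]);
  finally
    // Restore screen updating even if a replace raises an exception
    aWordApp.ScreenUpdating := lOldScreenUpdating;
  end;
end;
```

The window is left hidden on purpose; the caller decides when to show it again.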

Edit:

I got another improvement by changing the way you do find/replace: instead of changing the SeekView, iterate through all the sections, get the ranges of the document, headers and footers yourself, and do a Find/Replace over those ranges.

procedure TForm1.MSWordSearchAndReplaceInAllDocumentParts(const aDoc: OleVariant);
var
  i: Integer;
  lSection: OleVariant;
  lHeaders: OleVariant;
  lFooters: OleVariant;
  lSections: OleVariant;
begin
  lSections := aDoc.Sections;
  for i := 1 to lSections.Count do
  begin
    lSection := lSections.Item(i);
    lHeaders := lSection.Headers;
    lFooters := lSection.Footers;
    if lSection.PageSetup.OddAndEvenPagesHeaderFooter then
    begin
      SearchAndReplaceInADocumentPart(lHeaders.Item(wdHeaderFooterEvenPages).Range);
      SearchAndReplaceInADocumentPart(lFooters.Item(wdHeaderFooterEvenPages).Range);
    end;
    if lSection.PageSetup.DifferentFirstPageHeaderFooter then
    begin
      SearchAndReplaceInADocumentPart(lHeaders.Item(wdHeaderFooterFirstPage).Range);
      SearchAndReplaceInADocumentPart(lFooters.Item(wdHeaderFooterFirstPage).Range);
    end;
    SearchAndReplaceInADocumentPart(lHeaders.Item(wdHeaderFooterPrimary).Range);
    SearchAndReplaceInADocumentPart(lFooters.Item(wdHeaderFooterPrimary).Range);

    SearchAndReplaceInADocumentPart(lSection.Range);
  end;
end;

procedure TForm1.SearchAndReplaceInADocumentPart(const aRange: OleVariant);
begin
  aRange.Find.ClearFormatting;
  aRange.Find.Text := aSearchString;
  aRange.Find.Replacement.Text := aReplaceString;
  aRange.Find.Forward := True;
  aRange.Find.MatchAllWordForms := False;
  aRange.Find.MatchCase := True;
  aRange.Find.MatchWildcards := False;
  aRange.Find.MatchSoundsLike := False;
  aRange.Find.MatchWholeWord := False;
  aRange.Find.MatchFuzzy := False;
  aRange.Find.Wrap := wdFindContinue;
  aRange.Find.Format := False;

  { Perform the search}
  aRange.Find.Execute(Replace := wdReplaceAll);
end;

You will see an even bigger improvement if you open the document you want to modify while the application is invisible, or if you open the document with Visible := False; (setting the application visible again will also set the document visible).
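As a rough sketch of that last point, the document could be opened like this. The function name `OpenWordDocumentHidden` is hypothetical; `Visible` is a named parameter of Word's `Documents.Open`, passed here through late binding:

```pascal
// uses ComObj (CreateOleObject); aWordApp is the global OleVariant from the question

// Hypothetical helper: starts Word hidden and opens the document hidden.
function OpenWordDocumentHidden(const aFileName: string): OleVariant;
begin
  aWordApp := CreateOleObject('Word.Application');
  aWordApp.Visible := False; // hide the application before opening anything
  // aFileName should be a full path
  Result := aWordApp.Documents.Open(FileName := aFileName, Visible := False);
end;
```

The returned document can then be passed as `aDoc` to the section-based MSWordSearchAndReplaceInAllDocumentParts above.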

The_Fox
  • Thanks for the suggestions, I will try them out, they make sense. The only one I don't understand is wdPrintView: what is the advantage of doing that? – UnDiUdin Mar 12 '12 at 16:28
  • Another comment: performance in my case is terrible since I replace about 150 strings (according to profiler results). – UnDiUdin Mar 12 '12 at 16:37
  • Moreover, I am already setting aWordApp.ActiveWindow.Visible to False. – UnDiUdin Mar 12 '12 at 16:38
  • I tried the suggestion in the "Edit" section, but I didn't notice any improvement. Maybe a very small one. You said you improved it, but by how much? – UnDiUdin Mar 13 '12 at 13:36
  • I correct myself. By doing some tricks I was able to use your suggestion to move from 21 seconds to 13. Now the final optimization is to call search and replace only for the needed fields. Do you know a way to get the whole document as RTF, including header and footer, so that I can use the Delphi Pos function to locate the strings? – UnDiUdin Mar 13 '12 at 13:59
  • @user193655: I went from 11 to 6,5 seconds. But when delving a bit deeper, it seems to be caused by making the Word application itself invisible before opening the document. When I open the Word document while the Word application is not visible, the old method only takes 5,3 seconds. So that is even faster. – The_Fox Mar 13 '12 at 14:17
  • @user193655: I don't know how to check if a string is present or not. The only way I can think of is using the Find dialog, but you are already doing that. – The_Fox Mar 13 '12 at 14:28
  • @user193655: managed to get another improvement by checking for header/footer options. – The_Fox Mar 13 '12 at 14:31
  • I thank you a lot. With this last fine-tuning I guess we hit the limit of what we can achieve. Now what I will do is save the document to text and then use Delphi to locate the strings I need to replace, now I did this with OLE. I am sorry the bounty was only +200, you deserved at least +400 for your great help. – UnDiUdin Mar 14 '12 at 09:31
  • Final remark: I succeeded in saving the doc as txt with OLE (saving to txt also saves header and footer, they are appended at the end of the text file); with this further optimization the final time was about 3 seconds. More than acceptable! – UnDiUdin Mar 14 '12 at 15:44
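
The "replace only the fields that are present" idea from these comments could be sketched as follows. Note that this is a variation, not what the asker did: instead of saving to a text file, it reads `Range.Text` from each section and its headers and footers. Both routine names are hypothetical.

```pascal
// uses Classes (TStrings), Variants (VarToStr)
const
  wdHeaderFooterPrimary   = 1; // numeric values of Word's
  wdHeaderFooterFirstPage = 2; // WdHeaderFooterIndex constants
  wdHeaderFooterEvenPages = 3;

// Hypothetical helper: concatenates the body, header and footer text
// of every section into one plain string.
function CollectDocumentText(const aDoc: OleVariant): string;
var
  i, k: Integer;
  lSections, lSection: OleVariant;
begin
  Result := '';
  lSections := aDoc.Sections;
  for i := 1 to lSections.Count do
  begin
    lSection := lSections.Item(i);
    Result := Result + VarToStr(lSection.Range.Text);
    // Reads all three header/footer kinds, whether or not they are in use
    for k := wdHeaderFooterPrimary to wdHeaderFooterEvenPages do
    begin
      Result := Result + VarToStr(lSection.Headers.Item(k).Range.Text);
      Result := Result + VarToStr(lSection.Footers.Item(k).Range.Text);
    end;
  end;
end;

// Hypothetical helper: copies into aPresent only those entries of
// aAllFields that occur in aText. Pos is case-sensitive, which matches
// the MatchCase := True setting used for the replace itself.
procedure FilterPresentFields(const aText: string;
  aAllFields, aPresent: TStrings);
var
  i: Integer;
begin
  aPresent.Clear;
  for i := 0 to aAllFields.Count - 1 do
    if Pos(aAllFields[i], aText) > 0 then
      aPresent.Add(aAllFields[i]);
end;
```

Text from an unused header or footer may be included, which at worst causes a few unnecessary replace calls; only the fields left in `aPresent` then need the slow OLE find/replace.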