I have a protected document (doc.ProtectionType == wdAllowOnlyFormFields). It has areas that can be edited; everything else is protected, even from copying. I'm using the NetOffice.Word library and trying to programmatically find text and create a bookmark on the found range. The problem is that calling wordDoc.Content.Duplicate.Find.Execute(smthParams) throws "COMException: This method or property is not available because the object refers to a protected area of the document." However, I can get any range of text manually without problems:

var range = doc.Content.Duplicate;
range.SetRange(start, end);

I can create a bookmark in a range obtained this way without any problem, but this approach does not tell me which range corresponds to the text I'm looking for. This is how I am currently trying to create the bookmark:

public void CreateBookmarkTest()
{
    Document doc = Context.WordDocument;

    var searchText = "smth text";
    var bookmarkName = "newBookmark";
    
    using Range docRange = doc.Content.Duplicate;

    foreach (var paragraph in docRange.Paragraphs)
    {
        using Range paragraphRange = paragraph.Range;
        var text = paragraphRange.Text;
        var startParagraph = paragraphRange.Start;
        var endParagraph = paragraphRange.End;

        var startIndex = text.IndexOf(searchText);
        if (startIndex >= 0)
        {
            text = GetParagraphTextWithHiddenSymbols(paragraphRange, text);
            startIndex = text.IndexOf(searchText);
            var startFoundRange = startParagraph + startIndex;
            var end = startFoundRange + searchText.Length;

            paragraphRange.SetRange(startFoundRange, end);

            var foundText = paragraphRange.Text;
            if (foundText == searchText)
            {
                doc.Bookmarks.Add(bookmarkName, paragraphRange);
                break;
            }
        }
    }
}

private string GetParagraphTextWithHiddenSymbols(Range paragraphRange, string initialText)
{
    var text = initialText;
    foreach (Field field in paragraphRange.Fields)
    {
        int index = text.IndexOf(field.Result.Text);
        if (index >= 0)
        {
            text = text.Replace(field.Result.Text, $"{{{field.Code.Text}}} {field.Result.Text}{(char)21}");
        }
    }
    return text;
}

The problem is that foundText == searchText does not always hold. Sometimes foundText is offset, and I can't figure out how to fix that yet. This approach also seems slow and suboptimal to me. Is there a way to implement text search and replacement correctly (ideally through Find.Execute)? I'm also wondering whether there is a way to get the areas that are allowed for editing (or just to find out whether the current Range is allowed for editing)?

I reworked my code using Oscar's idea from the answer below. It works much better, but it still fails on large paragraphs with lots of unprotected input fields.

Thanks a lot for your help, friend!

Alldman
  • `fullText = range.Text;` What do you need to do to get fullText? – Oscar Sun Jul 10 '23 at 14:51
  • This is just a helper variable for debugging to compare the original text in the paragraph with what is presented in ``range.Text`` – Alldman Jul 10 '23 at 14:58
  • *1.* Then I'll comment that line with `just for test (or debug watching)` for the convenience of readers. *2.* Why don't you reference `Microsoft.Office.Interop.Word`? – Oscar Sun Jul 10 '23 at 15:13
  • 1. I removed this unnecessary variable because it duplicates the value of the ``text`` variable. 2. If you mean why I don't use the Microsoft.Office.Interop.Word library - my company uses NetOffice.Word to work with Microsoft Office documents and I'm desperate to find ways to get the range I need in protected documents without throwing an exception. Also, tried unsuccessfully to find a way to get the areas available for editing. – Alldman Jul 10 '23 at 16:05
  • I didn’t know about NetOffice and haven’t used it before, so I’m still getting familiar with it. I created a separate project for testing that references NetOffice instead of Microsoft.Office.Interop.Word, to avoid discrepancies caused by different object models, even though the differences may be small. With Microsoft.Office.Interop.Word or VBA I could probably get started right away. Fortunately, NetOffice is similar to Interop, not as different as Python or Java would be. – Oscar Sun Jul 10 '23 at 16:21
  • I'll try to focus on this question until my abilities can't go any further. I also hope that you can meet someone who is capable of helping you solve it quickly earlier. – Oscar Sun Jul 10 '23 at 16:21
  • Thank you so much for your eagerness to help, Oscar! It's very nice – Alldman Jul 10 '23 at 18:12
  • I have the same problem when using Microsoft.Office.Interop.Word (if ``doc.ProtectionType == wdAllowOnlyFormFields``): the ``Find.Execute`` method throws the same exception, and my suboptimal code correctly finds only the first few lines of a paragraph. – Alldman Jul 10 '23 at 19:55
  • `GetParagraphTextWithHiddenSymbols` Yes, Good! It is good to isolate this function method – Oscar Sun Jul 12 '23 at 13:21

1 Answer

The problem is most likely caused by hidden text, such as a Field's code text, inside the paragraph. This applies whether you use NetOffice, Microsoft.Office.Interop.Word, VBA, etc. You can try my code first. It's not a perfect solution so far, but notice this snippet:

if (range.Text != searchText)
{
    Console.WriteLine(range.Text);
    System.Diagnostics.Debugger.Break();
}

At least it points the way for debugging, so you can see what the problem is. You can follow this direction for further refinement.
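To make the offset problem concrete, here is a minimal sketch that works on plain strings only, with no Word involved. `ExpandFields` is a hypothetical helper, and the layout it models ("{" + code + "}" + result + (char)21) is simply the layout my code assumes, not something taken from the Word documentation. It shows how far the text after a field shifts once the hidden field code is counted.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model: rebuilds the text with hidden field content included,
// using the layout assumed in this answer: "{" + code + "}" + result + (char)21.
static string ExpandFields(string visibleText, IEnumerable<(string Code, string Result)> fields)
{
    var text = visibleText;
    var searchFrom = 0;
    foreach (var (code, result) in fields)
    {
        var index = text.IndexOf(result, searchFrom, StringComparison.Ordinal);
        if (index < 0) continue; // field result not found in this text, skip it
        var expanded = "{" + code + "}" + result + (char)21;
        text = text.Substring(0, index) + expanded + text.Substring(index + result.Length);
        searchFrom = index + expanded.Length; // continue after the expanded field
    }
    return text;
}

var visible = "Total: 42 items";
var full = ExpandFields(visible, new[] { (Code: " =6*7 ", Result: "42") });

// How far the word after the field moves once the hidden characters are counted.
var shift = full.IndexOf("items", StringComparison.Ordinal)
          - visible.IndexOf("items", StringComparison.Ordinal);
Console.WriteLine(shift); // prints 9
```

An index computed on the visible text is therefore too small by the number of hidden characters that precede the match, which is why a range built from it can land on the wrong text.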

using NetOffice.WordApi.Enums;
using Word = NetOffice.WordApi;

Test();

//The following code applies only to the content (main body) of the document itself and does not include footnotes, comments, headers, footers, or other parts of the document.
void Test()
{
    //just test file for me
    //const string fFullnameStr = @"C:\Users\oscar\Dropbox\VS\VBA\stackoverflow.docm";
    const string fFullnameStr = @"C:\Users\oscar\Dropbox\VS\stackoverflow\VBA\Naive Bayes classifier.docx";
    Word.Application wordApplication = new Word.Application();
    wordApplication.DisplayAlerts = WdAlertLevel.wdAlertsNone;
    wordApplication.Visible = true; //just for test to watch
    Word.Document doc = wordApplication.Documents.Open(fFullnameStr);//Context.WordDocument;

    /* for test
    if(doc.ProtectionType!= WdProtectionType.wdAllowOnlyFormFields)
        Console.WriteLine(doc.ProtectionType);
    doc.Close();
    doc.Protect(WdProtectionType.wdAllowOnlyFormFields);
    just for test */
    int i = 0;

    //var searchText = "smth text";
    // https://github.com/Aldman/ProtectedRangeSearch/blob/main/FindTextTests.cs#L15
    var searchText = "based on a common";//"diameter features";//"based on a common";//"assume that the value";
    var bookmarkName = "newBookmark";

    Word.Range rng = doc.Content;//doc.Content.Duplicate;

    if (doc.ProtectionType != WdProtectionType.wdAllowOnlyFormFields)
    {
        if (doc.ActiveWindow.View.ShowFieldCodes)
            doc.ActiveWindow.View.ShowFieldCodes = false;
        while (rng.Find.Execute(findText: searchText, matchCase: true, matchWholeWord: true, matchWildcards: false,
                matchSoundsLike: false, matchAllWordForms: false, forward: true, wrap: WdFindWrap.wdFindStop))
        {
            rng.Bookmarks.Add(bookmarkName + i++.ToString()); //rng.Select();//just for test
        }

    }
    else
    {
        foreach (var paragraph in rng.Paragraphs)//http://msdn.microsoft.com/en-us/en-us/Iibrary/office/ff837006.aspx redirects to: https://learn.microsoft.com/en-us/office/vba/api/Word.Range.Paragraphs
        {
            Word.Range range = paragraph.Range;
            var text = range.Text;
            var index = text.IndexOf(searchText); int indexPre = index;
            var start = 0;


            #region GetParagraphTextWithHiddenSymbols
            foreach (Word.Field item in range.Fields)
            {

                index = text.IndexOf(item.Result.Text, start);
                if (index >= 0)
                {
                    text = text.Substring(0, index) + "{" + item.Code.Text + "}" + item.Result.Text + ((char)21).ToString()
                        + text.Substring(index + item.Result.Text.Length);
                    start = (text.Substring(0, index) + "{" + item.Code.Text + "}" + item.Result.Text + ((char)21).ToString()).Length;
                }
                //text = text.Replace(item.Result.Text, 
                //"{" +item.Code.Text+"}"+ item.Result.Text + (char)21);
                //fieldsResultLength += item.Result.Text.Length + 2 + 1;//2="{}" of field code,1=chr(21) placehold of the fields
            }

            start = 0;
            //there will be "" both the start and end of a ContentControl object, so have to plus 2 for the two placeholders
            foreach (Word.ContentControl item in range.ContentControls)
            {
                text = text.Substring(start, item.Range.Start - 1) + " " + item.Range.Text + " " + text.Substring(item.Range.End - 1);
            }
            #endregion


            while (index >= 0)
            {

                index = text.IndexOf(searchText);

                start = range.Start;
                var end = range.End;

                start += index; //+ fieldsResultLength;
                end = start + searchText.Length;
                range.SetRange(start, end);

                while (range.Text != searchText && end <= range.End)
                {
                    range.SetRange(++start, ++end);
                    if (range.Text == searchText) break;
                }

                if (range.Text != searchText)
                {
                    Console.WriteLine(range.Text);
                    System.Diagnostics.Debugger.Break();
                }

                range.Bookmarks.Add(bookmarkName + i++.ToString());

                text = paragraph.Range.Text; start = 0;
                index = text.IndexOf(searchText, indexPre + 1);
                indexPre = index;
            }
        }
    }


    wordApplication.Visible = true; //just for test to watch
    doc.ActiveWindow.View.ReadingLayout = false;//just for test to watch
    if (doc.ProtectionType != WdProtectionType.wdNoProtection)
        doc.Unprotect(123.ToString());//just for test

}

It follows logically that the Find object cannot execute a search when the protection type is wdAllowOnlyFormFields. I think this is because the Find class is not just a search facility but also includes a replace (edit) facility. You need to either unprotect the document, change the way it is protected, or use the alternative shown here; the code above has conditional branches for both the unprotected and the protected case. Besides locating the text paragraph by paragraph with foreach, you can also consider using a regular expression. Whichever method you use, you have to process hidden text such as Fields' code text properly in order to get accurate results.
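As a rough sketch of the regular-expression idea, the helper below collects the offsets of every occurrence in a paragraph's text in one pass. `FindAllOffsets` is a hypothetical name, and the sketch operates on a plain string only: the offsets it returns are relative to the visible text, so they would still need the hidden-text correction before being passed to `SetRange`.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: zero-based offsets of every case-sensitive, whole-word
// occurrence of searchText in text. The \b anchors assume that searchText
// starts and ends with a word character.
static List<int> FindAllOffsets(string text, string searchText)
{
    // Regex.Escape keeps characters such as '.' or '(' literal.
    var pattern = @"\b" + Regex.Escape(searchText) + @"\b";
    var offsets = new List<int>();
    foreach (Match match in Regex.Matches(text, pattern))
        offsets.Add(match.Index);
    return offsets;
}

var offsets = FindAllOffsets("ab cd ab", "ab");
Console.WriteLine(string.Join(", ", offsets)); // prints "0, 6"
```

Collecting all offsets up front avoids the bookkeeping with `index` and `indexPre` that the paragraph loop needs in order to move from one occurrence to the next.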

  • .csproj file:
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net6.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="NetOfficeFw.Core" Version="1.9.3" />
    <PackageReference Include="NetOfficeFw.Word" Version="1.9.3" />
  </ItemGroup>

  <ItemGroup>
    <FrameworkReference Include="Microsoft.WindowsDesktop.App.WindowsForms" />
  </ItemGroup>

</Project>

void Test_ShowFieldCodes()
{
    //just test file for me
    const string fFullnameStr = @"C:\Users\oscar\Dropbox\VS\VBA\stackoverflow.docm";
    Word.Application wordApplication = new Word.Application();
    wordApplication.DisplayAlerts = WdAlertLevel.wdAlertsNone;
    //wordApplication.Visible = true; //just for test to watch
    Word.Document doc = wordApplication.Documents.Open(fFullnameStr);//Context.WordDocument;


    int i = 0;
    var searchText = "smth text";
    var bookmarkName = "newBookmark";

    Word.Range rng = doc.Content;//doc.Content.Duplicate;

    if (doc.ProtectionType != WdProtectionType.wdAllowOnlyFormFields)
    {

        while (rng.Find.Execute(findText: searchText, matchCase: true, matchWholeWord: true, matchWildcards: false,
                matchSoundsLike: false, matchAllWordForms: false, forward: true, wrap: WdFindWrap.wdFindStop))
        {

            if ((bool)rng.Information(WdInformation.wdInContentControl))
                rng.SetRange(rng.Paragraphs[1].Range.ContentControls[1].Range.End + 1,
                    rng.Paragraphs[1].Range.ContentControls[1].Range.End + 1);
            rng.Bookmarks.Add(bookmarkName + i++.ToString());
        }

    }
    else
    {        //rng = doc.Content.Duplicate;
        foreach (var paragraph in rng.Paragraphs)//http://msdn.microsoft.com/en-us/en-us/Iibrary/office/ff837006.aspx redirects to: https://learn.microsoft.com/en-us/office/vba/api/Word.Range.Paragraphs
        {
            Word.Range range = paragraph.Range;
            var text = range.Text;
            var index = text.IndexOf(searchText); int indexPre = 0;
            var start = 0;

            while (index >= 0)
            {

                if (paragraph.Range.Fields.Count > 0)
                {

                    doc.ActiveWindow.View.ShowFieldCodes = true;
                    text = paragraph.Range.Text;
                    //if there are fields this will be the index of ShowFieldCodes=false + index of ShowFieldCodes=true and plus 1
                    index = index + text.IndexOf(searchText, indexPre) + 1;
                    doc.ActiveWindow.View.ShowFieldCodes = false;
                }

                start = paragraph.Range.Start; //index is relative to the paragraph; range may have been narrowed to the previous match
                var end = range.End;

                start += index;
                end = start + searchText.Length;
                range.SetRange(start, end);

                while (range.Text != searchText && end <= range.End && range.End < doc.Content.End - 1)
                {
                    //range.Select();//just for test
                    range.SetRange(++start, ++end);
                    if (range.Text == searchText) break;
                }

                if (range.Text != searchText && range.End < doc.Content.End - 1)
                {
                    Console.WriteLine(range.Text);
                    System.Diagnostics.Debugger.Break();
                }

                if (range.Text == searchText)
                {
                    if ((bool)range.Information(WdInformation.wdInContentControl))
                        range.SetRange(range.Paragraphs[1].Range.ContentControls[1].Range.End + 1,
                            range.Paragraphs[1].Range.ContentControls[1].Range.End + 1);
                    range.Bookmarks.Add(bookmarkName + i++.ToString());
                }
                text = paragraph.Range.Text; start = 0;
                index = text.IndexOf(searchText, indexPre + 1);
                indexPre = index;
            }
        }
    }


    wordApplication.Visible = true; //just for test to watch
    //doc.Unprotect(1.ToString());//just for test

}

Update 2023-07-12: ContentControls as well

So the answer is that your file contains no Fields at all; what it has are ContentControls, not Fields. ActiveDocument.ContentControls.Count is 3 and ActiveDocument.Fields.Count is 0. The code above has been updated accordingly.

Oscar Sun
  • Thank you very much for the presented solution! I really like the idea of adding the input field code and because of that I almost found a way to find the correct range. But it also gets wrong on large paragraphs with many unhidden fields. Can you please drop me a link to an article that has information about hidden codes! – Alldman Jul 12 '23 at 11:48
  • I modified the question a bit by adding your idea in there. – Alldman Jul 12 '23 at 12:02
  • @Alldman `but it also bugs out on large paragraphs with lots of unprotected input fields` Can you share a sample to test? Or you can figure it out by yourself. https://learn.microsoft.com/en-us/office/vba/api/word.field.code https://learn.microsoft.com/en-us/office/vba/api/word.fields.toggleshowcodes – Oscar Sun Jul 12 '23 at 12:25
  • My repo with doc https://github.com/Aldman/ProtectedRangeSearch – Alldman Jul 12 '23 at 13:14
  • @Alldman Is this one? [Naive Bayes classifier.docx](https://github.com/Aldman/ProtectedRangeSearch/blob/main/Resources/Naive%20Bayes%20classifier.docx) Could you give me the password to unprotect the file to debugging? In this file, there is no string `smth text`, what string did you set to test. – Oscar Sun Jul 12 '23 at 13:53
  • Password: 123. I wrote a test (in the FindTextTests.cs file) where I added some test values. – Alldman Jul 12 '23 at 14:00
  • I apologize, my test document didn't have fields like the working document (which I can't use). I added the fields and uploaded the corrected version – Alldman Jul 12 '23 at 14:29
  • @Alldman So the answer is in your file there is no field in it, and all of the file it has is plenty of **ContentControl** not **Fields** ! `ActiveDocument.ContentControls.Count ` is 3. `ActiveDocument.Fields.Count` is 0. Check my updated code plz. It should be all set. – Oscar Sun Jul 12 '23 at 15:16
  • I changed the file in the repository. Now I will try to test the updated code – Alldman Jul 12 '23 at 15:22
  • @Alldman Use the one I've updated 1 min ago. Thanks. – Oscar Sun Jul 12 '23 at 15:36
  • @Alldman [new file](https://github.com/Aldman/ProtectedRangeSearch/blob/4b469f1525ea842b452c89b05e268d19e7c3e6ab/Resources/Naive%20Bayes%20classifier.docx) is ok too, with my newest code. – Oscar Sun Jul 12 '23 at 15:44
  • Thank you so much. You have helped me out a lot! Everything works! – Alldman Jul 12 '23 at 15:47
  • @Alldman That's fine! Congratulations! – Oscar Sun Jul 12 '23 at 15:52
  • @Alldman Just fixed some bugs and trimmed the code a moment ago. Good luck. – Oscar Sun Jul 12 '23 at 16:08