
I am developing a Word add-in using the Office.js library. I need to get every sentence in the document as an individual range. For this I called getTextRanges() on the document body with "." as the delimiter. However, it also splits on paragraph marks, which is not ideal for my use case. All I want is for the document to be divided into ranges where the only delimiter is ".", even if a range then spans multiple paragraphs.

Is there a way to ignore paragraph marks with getTextRanges(), or is there another method entirely that I seem to have overlooked?

Thanks.

I have been unable to resolve it.

Eugene Astafiev

1 Answer

  Word.run(async (context) => {
    // Load the text of the whole document body.
    const body = context.document.body;
    body.load("text");
    await context.sync();

    // Treat paragraph marks as ordinary whitespace so they do not end a sentence.
    const text = body.text.replace(/[\r\n]+/g, " ");

    // Split on sentence-ending punctuation (use /\./ to split on periods only),
    // then trim each piece and drop the empty ones.
    const sentenceList = text
      .split(/[.!?]/)
      .map((sentence) => sentence.trim())
      .filter((sentence) => sentence.length > 0);

    sentenceList.forEach((sentence, i) => console.log(i + " " + sentence));
  });

This seems to meet your requirements.

Paul H.
This does get the text pieces that I want. However, it does not extract the ranges, so I cannot perform actions on the underlying pieces of text. – HighPriestPete Mar 04 '23 at 06:30
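
One possible way to end up with ranges rather than strings is to keep calling getTextRanges() with "." and then merge the pieces that were cut only by a paragraph mark. The sketch below is untested inside Word and rests on assumptions: the names groupPiecesIntoSentences and getSentenceRanges are hypothetical, and it assumes a piece that was cut at a paragraph mark does not end with "." once trailing whitespace is trimmed. The grouping logic is plain JavaScript. The wiring function uses Range.expandTo() to build one range from the first piece of a sentence to the last.

```javascript
// Group the pieces returned by getTextRanges(["."]) into whole sentences.
// `pieceTexts` holds the text of each piece in document order.
// Returns [firstIndex, lastIndex] pairs; a sentence ends at the first piece
// whose text, ignoring trailing whitespace, ends with ".".
function groupPiecesIntoSentences(pieceTexts) {
  const groups = [];
  let first = -1;
  pieceTexts.forEach((text, i) => {
    if (first === -1) first = i;
    if (text.trimEnd().endsWith(".")) {
      groups.push([first, i]);
      first = -1;
    }
  });
  // Trailing text without a closing "." still forms a final group.
  if (first !== -1) groups.push([first, pieceTexts.length - 1]);
  return groups;
}

// Hypothetical wiring for an add-in; call it inside Word.run(async (context) => ...).
async function getSentenceRanges(context) {
  const pieces = context.document.body.getTextRanges(["."], false);
  pieces.load("text");
  await context.sync();

  const texts = pieces.items.map((range) => range.text);
  return groupPiecesIntoSentences(texts).map(([a, b]) =>
    // A single piece is already a full sentence; otherwise span first to last.
    a === b ? pieces.items[a] : pieces.items[a].expandTo(pieces.items[b])
  );
}
```

The returned ranges are ordinary Word.Range objects, so they can be formatted, selected, or searched like any other range after a further context.sync(). Empty paragraphs and sentences ending in "!" or "?" are not handled specially here.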