I am trying to parse a docx folder and take specific elements base on wether or not a certain word is bolded. If this is the text in the document:
Foo: Hello
Boo: Blah Blah
•Blah
•Blah
Choo: Hello
I would want to scan, line by line, and take all the text after the bolded word until the next bolded word.
As of right now I am using using an XML parser that parses based on newline charactrs. I cannot find anything in the Zipfile or the individual lines that would give me metadata like that.
Is it possible to do this?