
I want to highlight the text or elements that were inserted or deleted after combining two versions of a Docx file.

Here they are just returning the values. I tried the following code, but it highlights the full paragraph.

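from docx.enum.text import WD_COLOR_INDEX

def get_accepted_text(p):
    xml = p._p.xml  # raw XML of the paragraph element
    if "w:del" in xml or "w:ins" in xml:
        # this loop touches every run, so the whole paragraph is highlighted
        for run in p.runs:
            run.font.highlight_color = WD_COLOR_INDEX.PINK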

But I need to highlight only the changed text.

Note: Here they are only returning the values.

KarSho

1 Answer


At the line "for run in p.runs:" you are setting the highlight on every run in the paragraph, which is not what you want. The snippet below finds all runs (including the tracked ones) and checks whether each one sits directly inside a tracking container, w:ins or w:del. Once you have those, it is easy to apply custom formatting to the list of changed runs.

import docx
from docx.text.run import Run
from docx.enum.text import WD_COLOR_INDEX

doc = docx.Document('t1.docx')
ns = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def iter_changed_runs(doc):
    for p in doc.paragraphs:
        # './/' keeps the search inside this paragraph; a bare '//' scans the whole document
        for r in p._p.xpath('.//w:r'):
            parent = r.getparent()
            if parent.tag in (f'{ns}ins', f'{ns}del'):
                yield Run(r, p)

delta_runs = list(iter_changed_runs(doc))

# highlight only the changed runs
for r in delta_runs:
    r.font.highlight_color = WD_COLOR_INDEX.YELLOW
doc.save('t2.docx')

This is a screenshot of the generated t2.docx. The source document, t1.docx, was first written with track changes off and then modified with track changes turned on.

highlighting of the modified runs in docx file
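
To see why checking a run's parent tag works, and how insertions could be told apart from deletions, here is a minimal sketch that uses only the standard library on a hand-written paragraph fragment. The sample XML is a simplified assumption of what Word writes for tracked changes, and the names SAMPLE and classify_runs are hypothetical, not part of python-docx.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hand-written, simplified paragraph: one plain run, one inserted, one deleted.
SAMPLE = f'''<w:p xmlns:w="{W}">
  <w:r><w:t xml:space="preserve">unchanged </w:t></w:r>
  <w:ins w:id="1" w:author="A"><w:r><w:t>added</w:t></w:r></w:ins>
  <w:del w:id="2" w:author="A"><w:r><w:delText>removed</w:delText></w:r></w:del>
</w:p>'''

def classify_runs(paragraph_xml):
    """Return (kind, text) pairs for runs that are direct children of w:ins or w:del."""
    p = ET.fromstring(paragraph_xml)
    text_tags = (f"{{{W}}}t", f"{{{W}}}delText")  # deleted text lives in w:delText
    result = []
    for kind in ("ins", "del"):
        for container in p.iter(f"{{{W}}}{kind}"):
            for run in container.findall("w:r", NS):
                text = "".join(t.text or "" for t in run if t.tag in text_tags)
                result.append((kind, text))
    return result

print(classify_runs(SAMPLE))  # [('ins', 'added'), ('del', 'removed')]
```

The plain run is skipped because its parent is the paragraph itself. In the python-docx snippet above, the same distinction is available from parent.tag, so one could pick a different highlight color for each kind of change.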

aleksandarbos