
I am trying to get a Python script to redact a Word document based on a list of words. I found an article with the code but can't get it to work.

Link: https://arccoder.medium.com/redact-word-documents-using-python-7a676fd84d5e

I don't think it's too hard to make it work, but due to my limited knowledge I can't figure out how or where to put my paths, outputs, etc.

Can you help me see where to fill in the needed inputs and outputs?

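import re
from docx import Document  # provided by the python-docx package

# redact_colors, process_matches and split_run_by are not shown here;
# they presumably have to be copied from the linked article as well.

def redact_document(input_path: str, output_path: str, pattern: list, color: str = None):

    # Get the text color and text-background color for redaction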
    txt_color, background_color = redact_colors(color)

    # Open the input document
    doc = Document(input_path)
    # Loop through paragraphs
    for para in doc.paragraphs:
        # Loop through the runs in the paragraph in the reverse order
        run_index = len(para.runs) - 1
        while run_index > -1:
            run = para.runs[run_index]
            # Find the start and end indices of the patterns in the run-text
            match_pairs = [(match.start(), match.end()) for match in re.finditer('|'.join(pattern), run.text)]
            # Get the locations in the format required for `split_run_by` function
            highlights, matches = process_matches(match_pairs, run.text)
            # Go to redact only if patterns are found in the text
            if len(highlights) > 0 and len(matches) > 0:
                if len(highlights) != len(matches) - 1:
                    raise ValueError('Calculation error within matches and highlights')
                else:
                    if len(matches) == 2:  # When a pattern is the only text in the run
                        # Highlight the background color
                        run.font.highlight_color = background_color
                        # Match the text color to the background color
                        run.font.color.rgb = txt_color
                    else:
                        # Split the runs using the matches
                        new_runs = split_run_by(para, run, matches[1:-1])
                        # Highlight the run if it matches a pattern
                        for highlight, run in zip(highlights, new_runs):
                            if highlight:
                                # Highlight the background color
                                run.font.highlight_color = background_color
                                # Match the text color to the background color
                                run.font.color.rgb = txt_color
            # Decrement the index to process the previous run
            run_index -= 1
    # Save the redacted document to the output path
    doc.save(output_path)
macropod
Dion
  • This is the `definition` of the function. You have to `execute` the function with your values: `redact_document(your_parameters)`. That's all. – furas Sep 17 '22 at 11:49
  • So I should only give my path, list, pattern etc. in the first line? How would I give a path there? Just between the annotations? – Dion Sep 17 '22 at 15:50
  • Could you give me an example of how to fill in the input parameters? Then I will be able to understand how to give my inputs. – Dion Sep 17 '22 at 16:17
  • I don't understand you. It is a normal function, and you have to execute it with YOUR values for the parameters you see in the definition `def redact_document(input_path: str, output_path: str, pattern: list, color: str = None):`, normally as `redact_document(your_input, your_output, your_pattern, your_color)` – furas Sep 17 '22 at 17:30
  • Thanks for your comment. I know where to fill in my values, but not HOW. `def redact_document(input_path: 'C:\\Documents\\Python\\Dit is een test.docx', output_path: 'C:\\Documents\\Python\\Dit is een test.docx', pattern: ['test'], color: '0, 0, 0'):` This does not seem to work. I think I am giving all my input wrong. – Dion Sep 18 '22 at 07:29
  • You don't understand: DON'T change the code in `def`, but run the function like any other code, without `def`. You don't write `def print(arg1: "text")`, you write `print("text")` – furas Sep 18 '22 at 11:47

1 Answer


I don't understand what the problem is.

It seems you are confusing two things: definition and execution.

Don't change `def redact_document(...)`. You have to execute the function with your own parameters.

# --- define function - copy code without any changes ---

def redact_document(input_path: str, output_path: str, pattern: list, color: str = None):
    # ... code from your question ... 

# --- execute function with your parameters ---

redact_document('C:\\Documents\\Python\\Dit is een test.docx', 'C:\\Documents\\Python\\Dit is een test.docx', ['test'], '0, 0, 0')

#redact_document('other.docx', 'result.docx', ['Hello', 'World'], '255, 0, 0')
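
To make the difference visible without needing a Word file, here is a minimal sketch using a hypothetical stand-in function, `redact_text` (not part of the article's code). It only masks matches in a plain string, but it is defined and called in the same way as `redact_document`, and it joins the patterns with `'|'` like the code in the question does.

```python
import re

# --- definition: this runs nothing by itself, it only creates the function ---
def redact_text(text: str, pattern: list, mask: str = "#") -> str:
    """Hypothetical stand-in for redact_document: masks every match in a string."""
    # Same idea as in the question: join all patterns into one regular expression
    regex = "|".join(pattern)
    # Replace each match with mask characters of the same length
    return re.sub(regex, lambda m: mask * len(m.group()), text)

# --- execution: no `def`, no type hints, just your values ---
result = redact_text("Dit is een test", ["test"])
print(result)  # Dit is een ####
```

The type hints (`: str`, `: list`) stay exactly as they are in the `def` line; your actual values only appear in the call at the bottom.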
furas
  • My apologies for not understanding, my knowledge is very limited. My def looks like this now: `def redact_document(input_path='C:\\Users\\Documents\\Python\\Dit is een test.docx', output_path='C:\\Users\\Documents\\Python\\Dit is een test.docx', pattern=list['test'], color='255, 0, 0'):` The code seems to be running but nothing happens. Am I missing something? – Dion Sep 18 '22 at 14:26
  • You are still doing it wrong: DON'T change `def`, but execute/run the function. You only redefine the function, you never run it. See my answer: I don't change `def redact_document(...)` and I don't use the keyword `def` to execute/run the function. You are confusing two things: `definition` and `execution` – furas Sep 18 '22 at 14:29
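
A rough sketch of why "nothing happens" in that last attempt: a `def` statement only stores the function, even when values are written into it as defaults. The body below is a placeholder that just records its arguments (an assumption for illustration, the real body from the question would go there), and the output filename is hypothetical. The call uses keyword arguments and raw strings (`r"..."`), so the backslashes in the Windows paths do not need doubling, and `pattern` is a plain list, `["test"]`, not `list['test']`.

```python
calls = []

def redact_document(input_path: str, output_path: str, pattern: list, color: str = None):
    # Placeholder body (assumption): the real code from the question goes here
    calls.append((input_path, output_path, pattern, color))

# Nothing has run yet: `def` only created the function
print(len(calls))  # 0

# This line is what actually executes the function
redact_document(
    input_path=r"C:\Users\Documents\Python\Dit is een test.docx",
    output_path=r"C:\Users\Documents\Python\Dit is een test - redacted.docx",  # hypothetical name
    pattern=["test"],
    color="255, 0, 0",
)
print(len(calls))  # 1
```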