
I want to use the python-pptx module to change the proofing language of every text-containing shape in a given PowerPoint presentation. Unfortunately, I haven't managed to get it working. :(

I'm using Python 3.6.3 and python-pptx 0.6.7.

My code looks like this:

from pptx import Presentation
from pptx.enum.lang import MSO_LANGUAGE_ID

# In this example code, all proofing language is set to ENGLISH_UK
# all languages can be found in the docs for python-pptx
new_language = MSO_LANGUAGE_ID.ENGLISH_UK

input_file = 'test_pptx.pptx'
output_file = input_file[:-5] + '_modified.pptx'

# Open the presentation
prs = Presentation(input_file)

# iterate through all slides
for slide_no, slide in enumerate(prs.slides):
    # iterate through all shapes/objects on one slide
    for shape in slide.shapes:
        # check if the shape/object has text (pictures e.g. don't have text)
        if shape.has_text_frame:
            # print some output to the console for now
            print('SLIDE NO# ', slide_no + 1)
            print('Object-Name: ', shape.name)
            print('Text -->', shape.text)
            # check for each paragraph of text for the actual shape/object
            for paragraph in shape.text_frame.paragraphs:
                for run in paragraph.runs:
                    # display the current language
                    print('Actual set language: ', run.font.language_id)
                    # set the 'new_language'
                    run.font.language_id = new_language
        else:
            print('SLIDE NO# ', slide_no + 1, ': This object "', shape.name, '" has no text.')
        print(' +++++ next element +++++ ')
    print('--------- next slide ---------')

# save pptx with new filename
prs.save(output_file)

Update: this code now works! (Thanks again to Steve!)

Please help! Thanks in advance!

ChristianH
  • thanks a lot! Your code works straight out of the box. Tested today with python-pptx-0.6.18, python 3.8.5 and language MSO_LANGUAGE_ID.FRENCH – BrunoO Nov 10 '20 at 15:36

1 Answer

I'm actually not entirely sure of all the rules by which the proofing functionality decides what dictionary to use, but language is set at the run level and I'm thinking that's a good place to start.

This makes a certain amount of sense, because you could have a foreign phrase in the midst of a paragraph of text, and having only a shape-level language setting wouldn't support that.

So you'd need some additional code once you got past the .has_text_frame test:

for paragraph in shape.text_frame.paragraphs:
    for run in paragraph.runs:
        font = run.font
        print(font.language_id)

This should give you something like:

TURKISH (1055)
ENGLISH_UK (2057)
...

Note that the language id value also carries the standard language code, available on its .xml_value property, so you could elaborate the output to something like:

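for paragraph in shape.text_frame.paragraphs:
    for run in paragraph.runs:
        font = run.font
        language_id = font.language_id
        # print the run text in quotes, then the enum member and its language code
        print('\'%s\'' % run.text, language_id, language_id.xml_value)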

to get something like:

'the rain in ' ENGLISH_US (1033) en-US
'España' SPANISH (1034) es-ES_tradnl
...
scanny
  • Hey Steve! This was exactly the missing piece in my puzzle. Now it works. I will brush up my code and I will post it here! Thanks a lot! – ChristianH Nov 02 '17 at 17:22
  • This also works for tables inside a pptx; just had to dig deep; `for cell in shape.table.iter_cells()`, `for paragraph in cell.text_frame.paragraphs` and so on... loop nesting madness. – FObersteiner Oct 15 '19 at 14:45
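One way to tame the nesting for text shapes and table cells alike is to flatten the traversal into a generator. This is only a rough sketch, not tested against every python-pptx version: the helper names `iter_runs` and `set_proofing_language` are made up for illustration, and it assumes a shape exposes `has_text_frame` / `has_table`, that `shape.table.iter_cells()` is available, and that each cell has a `text_frame`, as in the comment above. Grouped shapes are not handled here.

```python
def iter_runs(shapes):
    """Yield every text run found in plain text shapes and in table cells."""
    for shape in shapes:
        # getattr with a default keeps this tolerant of shape types that
        # lack one of the two flags (an assumption, not a documented need)
        if getattr(shape, "has_text_frame", False):
            text_frames = [shape.text_frame]
        elif getattr(shape, "has_table", False):
            text_frames = [cell.text_frame for cell in shape.table.iter_cells()]
        else:
            text_frames = []  # pictures, charts, etc. carry no runs here
        for text_frame in text_frames:
            for paragraph in text_frame.paragraphs:
                yield from paragraph.runs


def set_proofing_language(shapes, language_id):
    """Set language_id on every run and return how many runs were touched."""
    count = 0
    for run in iter_runs(shapes):
        run.font.language_id = language_id
        count += 1
    return count
```

With the question's script, the inner loops would then shrink to something like `set_proofing_language(slide.shapes, new_language)` inside `for slide in prs.slides:`, followed by `prs.save(output_file)` as before.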