I am new to the python-pptx library and my question is: how can I define the list of shapes, the shape numbers/indexes (shape tree) and the shape types of each slide within an existing presentation using python-pptx? I would like to update an existing PowerPoint presentation, and it seems that the first step would be to locate the exact shape identifiers on each slide so I can access them for the updates. Could you point me to an existing solution or possibly some examples?
2 Answers
7
I assume by "define" you mean something like "discover", since there's not usually a good reason to change the existing values.
A good way to start is by looping through and printing some attributes:
from pptx import Presentation

prs = Presentation("my-deck.pptx")
for slide in prs.slides:
    for shape in slide.shapes:
        print("id: %s, type: %s" % (shape.shape_id, shape.shape_type))
You can get as elaborate as you want with this, using any of the slide and/or shape attributes listed in the API documentation here:
https://python-pptx.readthedocs.io/en/latest/api/shapes.html#shape-objects-in-general
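As one way of getting more elaborate, here is a minimal sketch that also records each shape's index within its slide's shape collection, which the question asks about. The helper name `describe_shapes` and the dict keys are my own choices for illustration; `shape_id`, `name`, `shape_type` and `has_text_frame` are shape attributes from the documentation linked above.

```python
def describe_shapes(prs):
    """Yield one dict of basic attributes per shape in the presentation."""
    # Slide numbers start at 1 to match what PowerPoint shows.
    for slide_no, slide in enumerate(prs.slides, start=1):
        # `index` is the shape's position in slide.shapes (document order).
        for index, shape in enumerate(slide.shapes):
            yield {
                "slide_no": slide_no,
                "index": index,
                "shape_id": shape.shape_id,
                "name": shape.name,
                "shape_type": shape.shape_type,
                "has_text_frame": shape.has_text_frame,
            }


# Usage (assumes python-pptx is installed and "my-deck.pptx" exists):
# from pptx import Presentation
# for row in describe_shapes(Presentation("my-deck.pptx")):
#     print(row)
```

The index is positional, so it can shift if shapes are added, removed or reordered; the shape id is the steadier handle.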
To look up a shape by id (or name) you need code like this:
def find_shape_by_id(shapes, shape_id):
    """Return shape by shape_id."""
    for shape in shapes:
        if shape.shape_id == shape_id:
            return shape
    return None
Or, if you're doing a lot of lookups, you can use a dict for that job:
shapes_by_id = dict((s.shape_id, s) for s in shapes)
This then gives you all the handy dict methods, like:
>>> 7 in shapes_by_id
True
>>> shapes_by_id[7]
<pptx.shapes.Shape object at 0x...>
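The lookup by name mentioned above follows the same pattern. A minimal sketch, where `find_shape_by_name` is a name I made up for illustration; note that shape names are not guaranteed to be unique on a slide, so this returns only the first match:

```python
def find_shape_by_name(shapes, name):
    """Return the first shape whose name equals `name`, or None if absent."""
    # next() with a default avoids StopIteration when nothing matches.
    return next((shape for shape in shapes if shape.name == name), None)


# Usage (assumes `prs` is an open python-pptx Presentation):
# shape = find_shape_by_name(prs.slides[0].shapes, "Chart 1")
```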

Mahaveer Jangir

scanny
- A follow-up question: which attributes of the shape are best to use to reference it? For example, if I am trying to define a chart that is on slide[0] and has the name "Chart 1", what other attributes should I pull to define it properly so I can replace the data in it? And what is the best way to define the object, such as chart = prs.slides[0].shapes... – Sveta Oct 23 '19 at 19:15
- Here is what I am trying. I know that slide[0] has a chart with shape_id 3. To define this as an object so I can replace its data, I am using chart = prs.slides[0].shape_id[3] and I am getting the error: AttributeError: 'SlideShapes' object has no attribute 'shape_id' – Sveta Oct 23 '19 at 19:57
- The shape-id is guaranteed not to change, so that makes a reliable identifier. On the other hand, depending on your purposes, the name is easier to change (a shape can be renamed using the PowerPoint UI), so that's often the more practical choice. The initial names are generic, like "Rectangle 7", so you're not losing much by changing those to something more descriptive that you later use to look it up by name. You'll have to judge for yourself based on your purposes, but I would say it's a choice between those two, the former being more reliable and the latter more understandable. – scanny Oct 24 '19 at 00:17
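On the AttributeError in the comments: `shape_id` is an attribute of each individual shape, not of the slide's shape collection, so the collection cannot be subscripted by id. A rough sketch of one way to get at the chart instead, assuming the chart really does sit in the shape with that id. The helper name `find_chart` is hypothetical; `has_chart`, `chart`, `replace_data` and `CategoryChartData` are python-pptx names, and the category and series values in the commented usage are placeholders only:

```python
def find_chart(slide, shape_id):
    """Return the chart held by the shape with `shape_id`, or None."""
    for shape in slide.shapes:
        if shape.shape_id == shape_id:
            # Only a graphic-frame shape can contain a chart, so check first.
            return shape.chart if getattr(shape, "has_chart", False) else None
    return None


# Usage (assumes python-pptx is installed; the data values are placeholders):
# from pptx import Presentation
# from pptx.chart.data import CategoryChartData
#
# prs = Presentation("my-deck.pptx")
# chart = find_chart(prs.slides[0], 3)
# if chart is not None:
#     chart_data = CategoryChartData()
#     chart_data.categories = ["A", "B", "C"]
#     chart_data.add_series("Series 1", (1.0, 2.0, 3.0))
#     chart.replace_data(chart_data)
#     prs.save("my-deck-updated.pptx")
```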
0
Here is my own function to get information on the key elements of a pptx presentation.
import pandas as pd
from pptx import Presentation


def presentation_elements(ppt_path):
    """Get all elements in a PowerPoint presentation.

    Parameters
    ----------
    ppt_path : str / Path
        full path to PowerPoint file

    Returns
    -------
    elements : pd.DataFrame
        information on all elements in dataframe

    Notes
    -----
    Slide No follows the PowerPoint convention (starts at 1); use Slide Id
    to address a slide.
    To verify the correct shapes in PowerPoint, use:
    Home > Arrange > Selection Pane ...
    """
    ppt = Presentation(ppt_path)
    elements = []
    for num, slide in enumerate(ppt.slides):
        for shape in slide.shapes:
            shape_info = pd.Series(
                [num + 1, slide.name, slide.slide_id,
                 shape.name, shape.shape_id, shape.shape_type],
                index=['Slide No', 'Slide Name', 'Slide Id',
                       'Shape Name', 'Shape Id', 'Shape Type'])
            elements.append(shape_info)
    # concat gives one column per shape; transpose to get one row per shape.
    elements = pd.concat(elements, axis=1).T
    return elements

UCCH