
I have PowerPoint files with many dozens of links to different sheets in an Excel document. I need to change the Excel documents to which these links point in a programmatic way.

I'm pretty sure I could do this with VBA, but since I'm generating the Excel documents in python anyway I'd prefer to update the links there as well.

I dug into the underlying XML files for a test .pptx file and found that the link references live in the ppt/slides/_rels/ folder (after unzipping the .pptx file).

For example, slide1.xml.rels contains several relationships, one having TargetMode="External" and Target="FULL_PATH_OMITTED\test.xlsx!Sheet1!R3C5:R20C14"
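For reference, the manual inspection above can be sketched with only the standard library. This is a minimal sketch, not a definitive implementation: the function name `list_external_targets` is made up for illustration, and it assumes the slide relationship parts sit in ppt/slides/_rels/ as described and use the standard OPC relationships namespace.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by OPC relationship (.rels) parts.
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def list_external_targets(pptx_file):
    """Return (rels_part_name, rId, target) for each external slide relationship.

    `pptx_file` may be a filesystem path or a binary file-like object.
    """
    found = []
    with zipfile.ZipFile(pptx_file) as zf:
        for name in zf.namelist():
            # Only look at the per-slide relationship parts.
            if not (name.startswith("ppt/slides/_rels/") and name.endswith(".rels")):
                continue
            root = ET.fromstring(zf.read(name))
            for rel in root.iter(RELS_NS + "Relationship"):
                # Internal relationships have no TargetMode attribute.
                if rel.get("TargetMode") == "External":
                    found.append((name, rel.get("Id"), rel.get("Target")))
    return found
```

Calling `list_external_targets("deck.pptx")` (where `deck.pptx` is a placeholder file name) returns one tuple per linked range, which is a quick way to check what a file points at before changing anything.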

Using the python-pptx package I found that this same reference lives under slide.part.rels

E.g.:

for rel in slides[0].part.rels.values():
    if rel.is_external:
        print(rel.target_ref)

This finds the same path for the link (i.e. "FULL_PATH_OMITTED\test.xlsx!Sheet1!R3C5:R20C14").

What I don't know is how to change this value, if it can be changed at all. Simply assigning to rel.target_ref with python-pptx raises an AttributeError.

Is there a way to modify the underlying XML for a PowerPoint file using python-pptx? Or some alternative strategy would be fine.

dan_g
  • Silly question, but how did you load the XML into Python? Do you unzip the pptx and then just parse the XML? – Umar.H Jan 08 '20 at 17:53
  • The `python-pptx` library handles the actual unzipping / modifying / re-zipping of the underlying XML files through its API. – dan_g Jan 09 '20 at 02:50
  • Awesome. I ended up writing a program that changes the extension, unzips, breaks out the XML, then re-zips and changes the extension back after modification, but this module looks promising since my functions require prior knowledge of the PPT/XML. – Umar.H Jan 09 '20 at 11:02

1 Answer


Try setting the ._target attribute of the rel (Relationship) object
https://github.com/scanny/python-pptx/blob/master/pptx/opc/package.py#L555

# Raw string, so the backslash before "test" is not read as a tab escape.
rel._target = r'FULL_PATH_OMITTED\test.xlsx!Sheet1!R3C5:R20C14'

This will only work when the relationship type is External (as opposed to a relationship to another part in the same package).

This is hacking internals, of course, so use at your own risk. That said, this part of the code base has been very stable for a long time.
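To apply this to every link in a deck, something like the following might work. It is a sketch under stated assumptions rather than a supported API: both function names are hypothetical, it relies on the same private `_target` attribute, and it assumes each stored target begins with exactly the old workbook path you pass in (check with `print(rel.target_ref)` first, since the stored form may differ).

```python
def retarget_external_rels(rels, old_workbook, new_workbook):
    """Point external relationships at `new_workbook` instead of `old_workbook`.

    `rels` is any iterable of relationship objects exposing `is_external`,
    `target_ref` and a writable private `_target`, as python-pptx's do.
    Returns the number of relationships changed.
    """
    changed = 0
    for rel in rels:
        if not rel.is_external:
            continue  # internal parts (layouts, images, ...) are left alone
        target = rel.target_ref
        if target.startswith(old_workbook):
            # Keep the "!Sheet!Range" suffix and swap only the workbook path.
            rel._target = new_workbook + target[len(old_workbook):]
            changed += 1
    return changed


def relink_presentation(src_pptx, dst_pptx, old_workbook, new_workbook):
    """Hypothetical wrapper: rewrite links on every slide and save a copy."""
    from pptx import Presentation  # third-party: pip install python-pptx

    prs = Presentation(src_pptx)
    total = 0
    for slide in prs.slides:
        total += retarget_external_rels(
            list(slide.part.rels.values()), old_workbook, new_workbook
        )
    prs.save(dst_pptx)
    return total
```

Saving to a separate output file keeps the original deck intact in case the internals behave differently in your python-pptx version.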

scanny
  • Perfect, thank you. I found that I could also replace that `rel` with a new one created via `add_relationship`, but this is obviously a shorter path to the same result. – dan_g Mar 29 '18 at 17:02
  • Shorter, yes, but also better since it doesn't leave dangling relationships hanging around :) Also, you then wouldn't need to change the hyperlink reference or whatever. If you *do* want to "replace" a relationship, it's better to drop the old one before creating the new one, e.g. `slide.part.drop_rel('rId3')`. – scanny Mar 29 '18 at 17:29
  • @scanny How can I access ppt/slides/slide1.xml? In that I want to access . Can you help with this? – Naren Babu R Apr 29 '21 at 11:39
  • @NarenBabuR please post that as a separate question and use the `python-pptx` tag. – scanny Apr 29 '21 at 16:32