I can remove or overwrite some of the metadata (the part stored in core.xml) with the following code:

from datetime import datetime

import pptx


def remove_metadata(prs):
    """Overwrite the metadata stored in core.xml; metadata stored in app.xml is left untouched."""
    prs.core_properties.title = 'PowerPoint Presentation'
    prs.core_properties.last_modified_by = 'python-pptx'
    prs.core_properties.revision = 1
    prs.core_properties.modified = datetime.utcnow()
    prs.core_properties.subject = ''
    prs.core_properties.author = 'python-pptx'
    prs.core_properties.keywords = ''
    prs.core_properties.comments = ''
    prs.core_properties.created = datetime.utcnow()
    prs.core_properties.category = ''


prs = pptx.Presentation('my_pres.pptx')
remove_metadata(prs)

This is useful, but other metadata, such as Company and Manager, is stored in app.xml, and I need to clear those properties as well. How can I edit the app.xml file using python-pptx?

Sam Redway

1 Answer

I found a solution. It is not necessarily the ideal way to handle this, but it seems to work:

import re


def remove_metadata_from_app_xml(prs):
    """There is currently no functionality for handling app.xml, so
    find the part and alter its blob manually.
    """
    app_xml_part = None
    for part in prs.part.package.parts:
        if part.partname.endswith('app.xml'):
            app_xml_part = part
            break
    if app_xml_part is None:
        return  # the package has no app.xml part, so there is nothing to clear
    app_xml = app_xml_part.blob.decode('utf-8')
    tags_to_remove = ('Company', 'Manager', 'HyperlinkBase')
    for tag in tags_to_remove:
        # Raw f-string avoids the invalid '\/' escape; '.*?' is non-greedy so
        # the match stops at the first closing tag.
        pattern = rf'<{tag}>.*?</{tag}>'
        app_xml = re.sub(pattern, '', app_xml)
    app_xml_part.blob = app_xml.encode('utf-8')
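
To check the tag-stripping step on its own, without opening a presentation, something like the following may help. It is only a sketch: SAMPLE_APP_XML is a made-up, trimmed-down stand-in for a real app.xml, and strip_tags is a hypothetical helper name, not part of python-pptx. It uses a non-greedy match and also drops the self-closing form (e.g. <Company/>); both are choices of this sketch rather than anything python-pptx requires.

```python
import re

# Hypothetical, trimmed-down stand-in for the contents of docProps/app.xml.
SAMPLE_APP_XML = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Properties xmlns="http://schemas.openxmlformats.org/officeDocument/2006/extended-properties">'
    '<Application>Microsoft Office PowerPoint</Application>'
    '<Manager>Jane Doe</Manager>'
    '<Company>Example Corp</Company>'
    '<HyperlinkBase>http://example.com/</HyperlinkBase>'
    '<AppVersion>16.0000</AppVersion>'
    '</Properties>'
)


def strip_tags(xml_text, tags=('Company', 'Manager', 'HyperlinkBase')):
    """Return xml_text with each listed element (and its text) removed."""
    for tag in tags:
        # First alternative: <Tag>value</Tag>, matched non-greedily so only
        # one element is consumed per hit (DOTALL lets the value span lines).
        # Second alternative: the self-closing form, e.g. <Company/>.
        pattern = rf'<{tag}>.*?</{tag}>|<{tag}\s*/>'
        xml_text = re.sub(pattern, '', xml_text, flags=re.DOTALL)
    return xml_text


cleaned = strip_tags(SAMPLE_APP_XML)
print(cleaned)
```

Properties that are not listed, such as Application and AppVersion in the sample, are left in place, so it is easy to confirm by eye that only the intended tags were removed.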
Sam Redway