
I am reading data from a pptx file using Python, and I need to access the hyperlinks/URLs present in it.

    from pptx import Presentation
    from pptx.enum.action import PP_ACTION

    ppt2 = Presentation('../sample dataset/' + file_name)
    for slide in ppt2.slides:
        for shape in slide.shapes:
            click_action = shape.click_action
            if click_action.action == PP_ACTION.HYPERLINK:
                print(click_action.hyperlink.address)

I tried this and it did not work: it raised no error, but it printed no output at all.

I need the URLs behind the hyperlinks as output. Here is a sample of what the ppt slide looks like.

The hyperlink with the text 'sample text' has a URL, and that URL is what I need to access (please see the ppt slide image).

  • _I have tried this and it did not work._ --> What does that mean? Did you get any errors (if so, please [edit] your question and add the complete error message)? Or just no output? Are you sure that those links are not images? – Ocaso Protal Oct 14 '19 at 08:19
  • There is just no output. Yes, I am sure the links are not images. – Ganesh Raj K Oct 14 '19 at 08:52
  • Ah, ok, thanks for the clarification. So you are creating the pptx? Is it possible for you to add a sample pptx instead of that screenshot? – Ocaso Protal Oct 14 '19 at 08:55
  • No I wasn't creating the pptx. I was accessing an already created pptx. I just created a sample slide because I didn't want to share the ppt I was working on. – Ganesh Raj K Oct 14 '19 at 09:00
  • @GaneshRajK, have you solved the problem? – Larytet May 13 '20 at 15:06

1 Answer


Try this (`prs` is the `Presentation` object you have already opened):

    for slide in prs.slides:
        for shape in slide.shapes:
            if hasattr(shape, "hyperlink"):
                # Hyperlink attached to the shape itself
                hyperlink_address = shape.hyperlink.address
                hyperlink_text = shape.text if hasattr(shape, "text") else ""
                print("hyperlink_text", hyperlink_text, "hyperlink_address", hyperlink_address)
            elif shape.has_text_frame:
                # Hyperlinks attached to individual text runs
                for paragraph in shape.text_frame.paragraphs:
                    for run in paragraph.runs:
                        if not hasattr(run, "hyperlink"):
                            continue
                        hyperlink_address = run.hyperlink.address
                        if hyperlink_address is None:
                            # This run carries no link
                            continue
                        hyperlink_text = run.text if hasattr(run, "text") else ""
                        print("hyperlink_text", hyperlink_text, "hyperlink_address", hyperlink_address)
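
If the library walk still prints nothing, a useful cross-check is to look at the file itself. A .pptx is a zip package, and each slide stores the targets of its external links in a relationships part next to it. The following is only a sketch using the standard library, not python-pptx; the function name `extract_hyperlinks` is made up for this example, and it assumes the links are ordinary external hyperlinks stored in the slide relationship parts.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by relationship parts inside an Office Open XML package.
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def extract_hyperlinks(pptx_file):
    """Return a sorted list of (relationship_part, url) pairs.

    `pptx_file` may be a path or a binary file-like object.
    Hypothetical helper for illustration; not part of python-pptx.
    """
    links = []
    with zipfile.ZipFile(pptx_file) as package:
        for name in package.namelist():
            # Slide relationships live in ppt/slides/_rels/slideN.xml.rels
            if not (name.startswith("ppt/slides/_rels/") and name.endswith(".rels")):
                continue
            root = ET.fromstring(package.read(name))
            for rel in root.iter(REL_NS + "Relationship"):
                is_link = rel.get("Type", "").endswith("/hyperlink")
                is_external = rel.get("TargetMode") == "External"
                if is_link and is_external:
                    links.append((name, rel.get("Target")))
    return sorted(links)
```

Calling `extract_hyperlinks("deck.pptx")` (the file name is a placeholder) lists every external URL per slide relationship part. It does not tell you which text or shape carries the link, so it is a way to confirm the URLs exist rather than a replacement for the loop above.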
Alain Pannetier
Larytet