I’ve had success web scraping with Microsoft Access, using MSXML2.XMLHTTP objects and regex. I’ve been exploring the web scraping possibilities of Power Automate and see that it doesn’t have built-in regex, but it can execute regex scripts from Excel. The problem is accessing the relevant data in the first place.

Take a look at this page: https://letterboxd.com/tiff_net/list/2022-toronto-international-film-festival/ When I try to extract data from one of the entries, nothing useful is available.

[screenshot]

And yet all the information I want is contained behind it:

[screenshot]

You can display the source of a web page in Edge by adding “view-source:” to the beginning of the web address. But then what? How do you get the HTML source into a variable where you can work on it? With MSXML2.XMLHTTP, you just read the responseText property. Can something like this be done in Power Automate, or are you limited to scraping sites with extraction-compatible objects?
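To make the goal concrete, here is a rough sketch in Python (not Power Automate) of the responseText-style approach I have in mind: pull the raw HTML into a string, then run a regex over it. The helper names `fetch_html` and `extract_attribute`, the `data-film-name` attribute, and the sample markup are all made up for illustration and are not taken from the real page.

```python
import re
import urllib.request
from typing import List


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Return the raw HTML of a page as one string (similar to responseText)."""
    # Assumption: a browser-like User-Agent may be needed for some sites.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def extract_attribute(html: str, attribute: str) -> List[str]:
    """Collect every double-quoted value of the given attribute in the HTML."""
    pattern = re.compile(re.escape(attribute) + r'="([^"]*)"')
    return pattern.findall(html)


# Hypothetical markup for illustration only, not copied from the real page.
SAMPLE_HTML = '<li data-film-name="Film A"></li><li data-film-name="Film B"></li>'
titles = extract_attribute(SAMPLE_HTML, "data-film-name")
```

On a live page the same extraction would be `extract_attribute(fetch_html(url), "data-film-name")`, with the attribute name swapped for whatever the page source actually uses.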

Skin
trevbet
