Can Power Automate extract data from the HTML source of a web page?

Asked Jan 29 '23 at 23:44

Active Jan 29 '23 at 23:53

Viewed 406 times

I’ve had success doing web scraping with Microsoft Access, using MSXML2.XMLHTTP objects and regex. I’ve been exploring the web scraping possibilities of Power Automate, and I see that it doesn’t have regex itself but can execute regex scripts from Excel. The problem is accessing the relevant data in the first place.

Take a look at this page: https://letterboxd.com/tiff_net/list/2022-toronto-international-film-festival/

When you try to extract data from one of its entries with Power Automate, nothing useful is available.

And yet all the information I want is contained in the HTML source behind the page.

You can display the source behind a web page by using Edge as your browser and adding “view-source:” to the beginning of the web address you want to go to. But then what? How do you get the HTML source into a variable where you can work on it? With MSXML2.XMLHTTP, you just access the responseText property. Can something like this be done with Power Automate or are you limited to scraping sites with extraction-compatible objects?
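To make the goal concrete, here is a minimal Python sketch of the general technique I mean: fetch the raw HTML over HTTP (the equivalent of reading responseText) and then run a regex over it. This is not Power Automate code. The `data-film-name` attribute, the sample HTML, and the function names are hypothetical placeholders for illustration, not the actual markup of the page linked above.

```python
import re
import urllib.request


def fetch_html(url: str) -> str:
    """Download a page and return its HTML source as text (like responseText)."""
    # Some sites reject requests that lack a browser-like User-Agent header.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=30) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Hypothetical pattern: assumes each entry carries a data-film-name="..." attribute.
FILM_NAME = re.compile(r'data-film-name="([^"]*)"')


def extract_film_names(html: str) -> list[str]:
    """Return every captured attribute value found in the HTML source."""
    return FILM_NAME.findall(html)


# Made-up sample standing in for a downloaded page source.
SAMPLE_HTML = """
<ul>
  <li><div data-film-name="First Example Film"></div></li>
  <li><div data-film-name="Second Example Film"></div></li>
</ul>
"""

if __name__ == "__main__":
    print(extract_film_names(SAMPLE_HTML))
```

Against a live page, the idea would be `extract_film_names(fetch_html(url))`, with the pattern adjusted to whatever the real source contains. What I am looking for is a way to do the same two steps in Power Automate.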

edited Jan 29 '23 at 23:53

Skin


asked Jan 29 '23 at 23:44

trevbet


0 Answers