Search HTML Data Retrieved from Invoke-WebRequest with Regex

Question

I am trying to scrape data from https://www.reuters.com/finance/stocks/lookup?searchType=any&comSortBy=marketcap&sortBy=&dateRange=&search=Accor.

The end goal is to pull the table down that contains the Company, Symbol and Exchange.

I have successfully retrieved the HTML, but I can't extract the data I need from it.

I've tested the pattern with some online RegEx 'helpers' and it selects the data I need there, but when I use it in the commands below it doesn't work.

$web = Invoke-WebRequest -uri 'https://www.reuters.com/finance/stocks/lookup?searchType=any&comSortBy=marketcap&sortBy=&dateRange=&search=Accor' -UseBasicParsing
$str = ($web.Content).ToString()
[regex]$regex = '<table[\s\S]*?</table>'
$str | Select-String -Pattern $regex -AllMatches

$str > raw.txt; Select-String -Pattern $regex -Path ./raw.txt -AllMatches

I'm expecting it to return the whole table element, but the piped command returns the full string and the -Path command returns nothing.

I've also tried doing this using an IE COM object.

score 0 · Answer 1 · answered Jul 15 '19 at 00:26

Rubber ducky effect. As soon as I asked I figured it out...

$url = 'https://www.reuters.com/finance/stocks/lookup?searchType=any&comSortBy=marketcap&sortBy=&dateRange=&search=Accor'

$content = (New-Object System.Net.WebClient).DownloadString($url)

$content -match '<table[\s\S]*?</table>'

$matches

Name                           Value                                                                                                                                                                 
----                           -----                                                                                                                                                                 
0                              <table width="100%" cellspacing="0" cellpadding="1" class="search-table-data">...
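For what it's worth, the original Select-String attempts probably failed for two separate reasons. With -Path, Select-String reads the file line by line, so a pattern that has to span several lines never matches. When the string is piped in, the whole content is a single input object, so the pattern does match, but the default output prints the entire input rather than just the matched text; the matched text is in the .Matches property. Below is a minimal sketch of both ways to get only the table. The $html value is a made-up stand-in so it runs offline (the company name and symbol are placeholders, not real data), and the variable names are my own.

```powershell
# Made-up stand-in for the page content; in practice this would be $web.Content.
$html = @'
<html><body>
<table class="search-table-data">
<tr><td>Example Co</td><td>EXMP</td></tr>
</table>
</body></html>
'@

$pattern = '<table[\s\S]*?</table>'

# Select-String displays the whole input it matched in;
# the matched text itself is in the .Matches property.
$viaSelectString = ($html | Select-String -Pattern $pattern -AllMatches).Matches.Value

# The .NET regex class returns the match directly.
$viaRegexClass = [regex]::Match($html, $pattern).Value

$viaSelectString
```

Both variables should end up holding only the table element. If the content has to come from a file, reading it as a single string first (for example with Get-Content -Raw) and piping that to Select-String avoids the line-by-line problem.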

Better off with this regex `']?)+>([\S\s]*?)'` – Jul 15 '19 at 00:48
