1

I working on scrapping microsoft website where i want to fetch sql product name such as "SQL Version 2012" using their version "11.0.6607.3".

Basically I want to search for their product name using their Version. Help me with powershell using invoke-webrequest.

Thanks in Advance

This is what I have tried

$url = "https://learn.microsoft.com/en-us/troubleshoot/sql/releases/download-and-install-latest-updates"
$html.content = Invoke-WebRequest  -Uri $url

# Find the table rows (tr) in the HTML
$rows = ($html.content).ParsedHtml.getElementsByTagName('tr')

# Define the version number to search for
$searchVersion = '11.0.5058.0'

# Loop through the rows and find the matching version number
foreach ($row in $rows) {
    $cells = $row.getElementsByTagName('td')
    if ($cells.Count -gt 1 -and $cells[1].innerText -match $searchVersion) {
        # The version number is found in the second cell of the row
        # Output the SQL Server version from the first cell of the same row
        Write-Output $cells[0].innerText
        break  # Exit the loop after the first match is found
    }
}
scode123
  • 49
  • 4
  • This site works best when you post code you've already written and describe what isn't working - e.g. an error or unexpected results. Open ended questions like "help me write this functionality" tend to get downvoted and closed - have a read of https://stackoverflow.com/help/how-to-ask to see how you can improve your question and make it more likely to get a helpful response... – mclayton Mar 31 '23 at 10:43
  • What website are you scrapping? – jdweng Mar 31 '23 at 10:49
  • @jdweng. "https://learn.microsoft.com/en-us/troubleshoot/sql/releases/download-and-install-latest-updates" this is website what i am trying – scode123 Mar 31 '23 at 10:56

1 Answers1

1

Here's a quick and dirty way to do it (with no error handling) that works with that url today, but it's brittle and might break if the page layout changes...

$url = "https://learn.microsoft.com/en-us/troubleshoot/sql/releases/download-and-install-latest-updates";

$html = (Invoke-WebRequest -Uri $url -UseBasicParsing).Content;

$searchVersion = "11.0.5058.0";

# find the version text
$index = $html.IndexOf($searchVersion);

# find the start of the containing "<tr>"
$tr = $html.LastIndexOf("<tr>", $index);

# find the text inside the following "<strong>...</strong>"
$start = $html.IndexOf("<strong>", $tr) + "<strong>".Length;
$end = $html.IndexOf("</strong>", $tr);
$name = $html.Substring($start, $end - $start);

$name
# SQL Server 2012

It would probably be better to use a proper HTML parser library like the HTML Agility Pack, but for a simple use case like this it's probably enough to just do it with basic string searches.

Note that the ParsedHtml property isn't available in PowerShell "Core" so if you're writing new code it's more future-proof if you avoid using it and add the -UseBasicParsing switch, even if you're targeting Windows PowerShell at the moment.

And since SQL Server doesn't really get new versions that frequently, it even might be better to simply have a hashtable hardcoded in your script to do the lookups with...

mclayton
  • 8,025
  • 2
  • 21
  • 26
  • Thanks . but its giving me wrong information. basically it is giving previous version of what is suppose to give. – scode123 Mar 31 '23 at 11:11
  • I get "SQL Server 2012" for ""11.0.5058.0" - what do you get? – mclayton Mar 31 '23 at 11:12
  • try for this "14.0.3456.2" – scode123 Mar 31 '23 at 11:16
  • Ok, updated to look for the start of the containing "" instead of the "" since you're looking for version numbers in any of the "Latest service pack", "Latest GDR" or "Latest cumulative update" columns to identify the row. – mclayton Mar 31 '23 at 11:16
  • hi @mclayton , Tried your code, for some versions I am getting correct Table name. but for some like - "15.0.4188.2" or "15.0.2000.5" getting error. Let me know if any solution is there. Thanks – scode123 Apr 24 '23 at 12:04