There are several download sites with the same or a very similar layout to this GIMP page, including many Apache projects such as Tomcat and ActiveMQ. I wrote a small function a while ago to parse these and other pages, and it happens to work for this GIMP page as well, so I thought it was worth sharing.
Function Extract-FilenameFromWebsite {
    [CmdletBinding()]
    Param(
        [Parameter(Position=0,ValueFromPipeline)]
        $Url
    )
    begin{
        # Matches one listing row: link text (file name), last-modified date, optional size
        $pattern = '<a href.+">(?<FileName>.+?\..+?)</a>\s+(?<Date>\d+-.+?)\s{2,}(?<Size>\d+\w)?'
    }
    process{
        $website = Invoke-WebRequest $Url -UseBasicParsing
        # Test every line of the page against the pattern; non-matching lines are skipped
        switch -Regex ($website.Content -split '\r?\n'){
            $pattern {
                # Emit one object per matching row
                [PSCustomObject]@{
                    FileName     = $Matches.FileName
                    URL          = '{0}{1}' -f $Url,$Matches.FileName
                    LastModified = [datetime]$Matches.Date
                    Size         = $Matches.Size
                }
            }
        }
    }
}
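To see what the pattern captures without requesting a page, here is a minimal offline sketch. The listing line is a hypothetical example written in the usual directory-index layout, not copied from the real page, so treat the values as illustrative only.

```powershell
# Same pattern as in the function above.
$pattern = '<a href.+">(?<FileName>.+?\..+?)</a>\s+(?<Date>\d+-.+?)\s{2,}(?<Size>\d+\w)?'

# Hypothetical listing row (assumed layout: link, date and time, size).
$line = '<a href="gimp-2.10.30-setup.exe">gimp-2.10.30-setup.exe</a>      2021-12-21 18:04  250M'

if ($line -match $pattern) {
    # $Matches holds the named capture groups of the last successful -match
    $parsed = [PSCustomObject]@{
        FileName     = $Matches.FileName
        LastModified = [datetime]$Matches.Date
        Size         = $Matches.Size
    }
    $parsed
}
```

If a site formats its dates or sizes differently, the Date and Size groups are the parts of the pattern most likely to need adjusting.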
The function assumes the URL passed in ends with a trailing slash. To handle URLs both with and without one, add this line at the start of the process block.
if($Url -notmatch '/$'){$Url = "$Url/"}
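Another way to handle the slash is to normalize at the point where the URL and file name are joined. The sketch below assumes a hypothetical helper named Join-UrlPart (not part of the original function) and a made-up file name; it only illustrates the idea.

```powershell
# Hypothetical helper: joins a base URL and a file name with exactly one slash
# between them, whether or not the base URL already ends with one.
function Join-UrlPart {
    param(
        [Parameter(Mandatory)][string]$BaseUrl,
        [Parameter(Mandatory)][string]$FileName
    )
    '{0}/{1}' -f $BaseUrl.TrimEnd('/'), $FileName.TrimStart('/')
}

# Both calls produce the same URL ('gimp-setup.exe' is an illustrative name).
$withoutSlash = Join-UrlPart -BaseUrl 'https://download.gimp.org/pub/gimp/v2.10/windows'  -FileName 'gimp-setup.exe'
$withSlash    = Join-UrlPart -BaseUrl 'https://download.gimp.org/pub/gimp/v2.10/windows/' -FileName 'gimp-setup.exe'
$withoutSlash
$withSlash
```

Inside the function, the same expression could stand in for the '{0}{1}' format string if you would rather not modify $Url itself.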
To get the latest version, call the function like this:
$url = 'https://download.gimp.org/pub/gimp/v2.10/windows/'
$latest = Extract-FilenameFromWebsite -Url $url | Where-Object FileName -like '*exe' |
    Sort-Object LastModified | Select-Object -Last 1
$latest.url
Or you could expand the URL property directly in the pipeline:
$url = 'https://download.gimp.org/pub/gimp/v2.10/windows/'
$latesturl = Extract-FilenameFromWebsite -Url $url | Where-Object FileName -like '*exe' |
    Sort-Object LastModified | Select-Object -Last 1 -ExpandProperty URL
$latesturl
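The calls above pick the most recently modified file. If you would rather pick by version number, one possible variation is to sort on a [version] parsed from the file name. This is only a sketch: the sample objects are hypothetical stand-ins for the function's output, and it assumes every file name contains a dotted version such as 2.10.30.

```powershell
# Hypothetical sample data standing in for Extract-FilenameFromWebsite output.
$sample = @(
    [PSCustomObject]@{ FileName = 'gimp-2.10.4-setup.exe';          LastModified = [datetime]'2018-07-04 10:00' }
    [PSCustomObject]@{ FileName = 'gimp-2.10.30-setup.exe';         LastModified = [datetime]'2021-12-21 18:04' }
    [PSCustomObject]@{ FileName = 'gimp-2.10.30-setup.exe.torrent'; LastModified = [datetime]'2021-12-21 18:05' }
)

$latestByVersion = $sample |
    Where-Object FileName -like '*.exe' |
    # Pull the first dotted number out of the name and compare it as a [version],
    # so that 2.10.30 sorts after 2.10.4 (a plain string sort would not).
    Sort-Object { [version]($_.FileName -replace '^.*?(\d+(\.\d+)+).*$', '$1') } |
    Select-Object -Last 1

$latestByVersion.FileName
```

A file name without a dotted version would make the [version] cast fail, so filter such entries out first if the listing contains any.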