Hi, I have a script that tries to extract the price from an HTML file. The regex works when I assign it in the script, but when I read the same regex from a CSV file, it no longer matches. Could someone help with this?

$htmlcontent = Get-Content ".\Temp.html" -Raw
$priceregex = "(?<=<span class=""a-offscreen"">\$)[\d\.]+"
Write-Host "Regex value is: " $priceregex
IF ($htmlcontent -match $priceregex){$Matches[0]}else{"Not found"}
$csvdata=Import-Csv .\WebMonitor-A.csv
$priceregex=$csvdata[0].Regex
Write-Host "Regex from CSV file is: " $priceregex
IF ($htmlcontent -match $priceregex){$Matches[0]}else{"Not found"}

The html file content looks like this:

<div class="a-section a-spacing-micro">                <span class="a-price aok-align-center" data-a-size="xl" data-a-color="base"><span class="a-offscreen">$10.95</span><span aria-hidden="true"><span class="a-price-symbol">$</span><span class="a-price-whole">10<span class="a-price-decimal">.</span></span><span class="a-price-fraction">95</span></span></span> 

The CSV file has a column named Regex that contains:

(?<=<span class=""a-offscreen"">\$)[\d\.]+
Cliff
  • https://stackoverflow.com/a/1732454/431172 – mfinni Mar 23 '23 at 22:29
  • @mfinni Interesting reading, but a bit of overkill. I have never seen an expert give a solution other than a corrected regex, which suggests the alternative (an XML parser?) is either not understood by most people or not good enough in some way (ease, speed, etc.). – Cliff Mar 24 '23 at 17:04

1 Answer

Issue:
When the regex is written inside a double-quoted string in the script, certain characters (such as quotes) must be escaped, so "" there stands for a single ". That is why the version in the script works.

When the regex is stored in a text file such as a CSV, those characters do not need that escaping.
Your CSV value still contains the doubled quotes, which are read literally, so the pattern does not match.
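
To make the difference concrete, here is a minimal sketch comparing the pattern as PowerShell parses it from a double-quoted literal with the same characters taken verbatim. The variable names `$fromScript` and `$verbatim` are illustrative only; the single-quoted string stands in for text read unchanged from a file.

```powershell
# In a double-quoted PowerShell string, "" is an escape for one literal quote.
$fromScript = "(?<=<span class=""a-offscreen"">\$)[\d\.]+"

# A single-quoted string keeps "" as two characters, like text read verbatim.
$verbatim = '(?<=<span class=""a-offscreen"">\$)[\d\.]+'

$fromScript                  # the pattern the regex engine actually receives
$verbatim                    # still contains the doubled quotes
$fromScript -eq $verbatim    # False: the two patterns are different strings
```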

Solution:
In the CSV file, use this regex:

(?<=<span class="a-offscreen">\$)[\d\.]+

Here, the quotes around the class name are not escaped.

That should match.
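
One way to check this without touching the real files is the sketch below. It assumes a hypothetical one-column CSV held in a here-string (the actual WebMonitor-A.csv may have more columns) and a shortened copy of the HTML from the question. The first row keeps the doubled quotes, the second uses the form suggested above.

```powershell
# Shortened HTML fragment from the question.
$html = '<span class="a-offscreen">$10.95</span>'

# Hypothetical CSV content: row 1 has doubled quotes, row 2 has plain quotes.
$csvText = @'
Regex
(?<=<span class=""a-offscreen"">\$)[\d\.]+
(?<=<span class="a-offscreen">\$)[\d\.]+
'@

# Parse the text the same way Import-Csv would parse a file.
$rows = @($csvText | ConvertFrom-Csv)

# Record, for each row, the pattern as parsed and whether it matches.
$results = foreach ($row in $rows) {
    [pscustomobject]@{
        Regex   = $row.Regex
        Matched = $html -match $row.Regex
    }
}

$results | Format-List
```

If the answer's reasoning holds, only the second row should report a match. Printing the parsed `Regex` value next to the result makes it easy to see exactly what the CSV parser handed to `-match`.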

Prem