1

Using this webpage for testing, and this Bloomberg page as a source for the XML string, I wat to get the NAV value that is in big numbers. If I inspect the element with Google Chrome and right-click to copy the XPath I get:

//*[@id="root"]/div/div/section[2]/div[1]/div[2]/section[1]/section/section[2]/section/div[1]/span[1]

If I copy the full XPath, then I get:

/html/body/div[6]/div/div/section[2]/div[1]/div[2]/section[1]/section/section[2]/section/div[1]/span[1]

It would be even better if something like this worked:

//span[@class="priceText__1853e8a5"]

But all these expressions, despite being marked valid by the tool, also give an empty result in the testing tool and in my Google Spreadsheet, where I use:

=importxml("https://www.bloomberg.com/quote/ZARATHU:BZ";"//span[@class='priceText__1853e8a5']")

Though I note all the queries above don't work on the spreadsheet as well.

Mefitico
  • 816
  • 1
  • 12
  • 41
  • The element with that class is not a `div` but a `span`. You probably mean `//span[@class="priceText__1853e8a5"]`. I confirm this span is in the source code of the page. If you still can't retrieve it, please include a link your Google sheet. – derloopkat Oct 03 '20 at 22:33
  • @derloopkat well pointed, thank you. But even then, your XPath query returns empty both on my spreadsheet and on https://extendsclass.com/xpath-tester.html (the testing tool) – Mefitico Oct 04 '20 at 00:05
  • 1
    The tool you mention is for parsing xml, not html. Web pages are not standard xml, they could contain self closing tags or symbols like `<`, `>` into `script` tags that would break an xml parser and not even report the error correctly. – derloopkat Oct 04 '20 at 01:50
  • 1
    As I've said, if you show what you have in your Google spreadsheet may be someone else would be willing to help later on. – derloopkat Oct 04 '20 at 02:21
  • @derloopkat : Fair point, thanks. I've shown the code used on the google sheet, it is set to use comma as the decimal separator, so `;` separates arguments in a function call in my spreadsheet. – Mefitico Oct 04 '20 at 20:57
  • I just tried again this formula in a googlesheet and worked fine `=importxml("https://www.bloomberg.com/quote/ZARATHU:BZ", "//span[@class='priceText__1853e8a5']")`, returned 1.5653. Yesterday it didn't work. – derloopkat Oct 05 '20 at 20:02
  • @derloopkat tested right now for the sake of "why not", and it didn't work for me. But I think Iamblichus' answer is at least partially correct, I access normally the website because I block javascript on it with a chrome plug-in (hence why I didn't suspect this). – Mefitico Oct 05 '20 at 21:17

1 Answers1

1

Looks like this website is protected against automated requests, and so it's blocking the requests from IMPORTXML. You can check that if you try to retrieve the full HTML:

=importxml("https://www.bloomberg.com/quote/ZARATHU:BZ","html")

The returned HTML content includes the following:

We've detected unusual activity from your computer network
To continue, please click the box below to let us know you're not a robot.

Related:

Edit:

The OP solved the issue by using this web scraping add-on:

=IMPORTFROMWEB("https://www.bloomberg.com/quote/ZARATHU:BZ";"//span[@class='priceText__1853e8a5']")
Iamblichus
  • 18,540
  • 2
  • 11
  • 27
  • Worth noticing that IMPORTFROMWEB also accepts CSS selectors, which can help to simplify the path: ````.priceText__1853e8a5```` Should work – Matt Pi Oct 08 '20 at 21:35