I am trying to return three things from these two urls: title, price and the details. On the first link, there is no sale or promotion, so the xpath is

//*[@id="priceblock_ourprice"]

and in the second one there is a sale and the xpath is

//*[@id="priceblock_dealprice"]

I only want this script to return a price for URLs that have the ourprice xpath, not the dealprice xpath. If the ourprice xpath is not present, I would like "N/A" to be returned. What am I missing here?

from requests_html import HTMLSession
import pandas as pd

urls = [
    'http://amazon.com/dp/B01KZ6V00W',
    'http://amazon.com/dp/B089FBPFHS',
]

def getPrice(url):
    s = HTMLSession()
    r = s.get(url)
    r.html.render(sleep=1,timeout=20)
    product = {
        'title': str(r.html.xpath('//*[@id="productTitle"]', first=True).text),
        'price': str(r.html.xpath('//*[@id="priceblock_ourprice"]', first=True).text),
        'details': str(r.html.xpath('//*[@id="detailBulletsWrapper_feature_div"]', first=True).text)
    }
    res = {}
    for key in list(product):
        res[key] = product[key].replace('\n',' ')

    print(res)
    return res

prices = []
for url in urls:
    prices.append(getPrice(url))


df = pd.DataFrame(prices)
print(df.head(15))
df.to_csv("testfile.csv",index=False)
print(len(prices))

Traceback for the second URL (the first URL completes successfully):

line 14, in getPrice
    'price': str(r.html.xpath('//*[@id="priceblock_ourprice"]', first=True).text),
AttributeError: 'NoneType' object has no attribute 'text'
mjbaybay7
  • By calling the `.text` you are triggering the `AttributeError`. I'd recommend assigning your xpath call to a variable, then assigning it to your price key as something like `{"price": xpath_var if xpath_var else "N/A"}` – gallen Dec 16 '20 at 03:32
  • @gallen Where would I place this statement? – mjbaybay7 Dec 16 '20 at 03:42
  • The variable would be defined before the definition of the `product` dictionary. Then you'd update the definition of the value for the `price` key with my suggested snippet. – gallen Dec 16 '20 at 03:46
  • Does this answer your question? [How can I check if either xpath exists and then return the value if text is present?](https://stackoverflow.com/questions/65313719/how-can-i-check-if-either-xpath-exists-and-then-return-the-value-if-text-is-pres) – Charalamm Dec 16 '20 at 03:53

1 Answer

Try this code:

title = r.html.xpath('//*[@id="productTitle"]', first=True)
price = r.html.xpath('//*[@id="priceblock_ourprice"]', first=True)
details = r.html.xpath('//*[@id="detailBulletsWrapper_feature_div"]', first=True)

product = {
    'title': str(title.text) if title else 'N/A',
    'price': str(price.text) if price else 'N/A',
    'details': str(details.text) if details else 'N/A'
}

P.S. I believe `.text` already returns a string, so `str(title.text)` is unnecessary and `title.text` alone should be enough.
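If the same guard is needed for several fields, it could be pulled into a small helper. This is only a rough sketch: `text_or_default` is a hypothetical name, and the `SimpleNamespace` objects are made-up stand-ins for whatever `r.html.xpath(..., first=True)` returns, so the snippet runs without hitting Amazon. It checks `is None` explicitly instead of relying on truthiness, and it also folds in the newline cleanup from the question.

```python
from types import SimpleNamespace


def text_or_default(element, default="N/A"):
    """Return element.text with newlines flattened, or `default` if there is no element."""
    # xpath(..., first=True) gives None when nothing matches,
    # so check for None before touching .text.
    if element is None:
        return default
    text = element.text
    # Treat empty text the same as a missing element.
    return text.replace("\n", " ") if text else default


# Hypothetical stand-ins for xpath results (assumed data, not real page content).
found = SimpleNamespace(text="Brand: Acme\nColor: Red")
missing = None

product = {
    "price": text_or_default(missing),
    "details": text_or_default(found),
}
print(product)
```

In the original script you would call it as `text_or_default(r.html.xpath('//*[@id="priceblock_ourprice"]', first=True))` for each key, which would also make the separate `res` loop unnecessary.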

JaSON