I am trying to return three things from these two URLs: the title, the price, and the details. On the first link there is no sale or promotion, so the price XPath is
//*[@id="priceblock_ourprice"]
and on the second one there is a sale, so the price XPath is
//*[@id="priceblock_dealprice"]
I only want this script to return a price for URLs that have the ourprice XPath, not the dealprice XPath. If the ourprice XPath is not present, I would like "N/A" to be returned instead. What am I missing here?
from requests_html import HTMLSession
import pandas as pd

urls = ['http://amazon.com/dp/B01KZ6V00W',
        'http://amazon.com/dp/B089FBPFHS'
        ]

def getPrice(url):
    s = HTMLSession()
    r = s.get(url)
    r.html.render(sleep=1, timeout=20)
    product = {
        'title': str(r.html.xpath('//*[@id="productTitle"]', first=True).text),
        'price': str(r.html.xpath('//*[@id="priceblock_ourprice"]', first=True).text),
        'details': str(r.html.xpath('//*[@id="detailBulletsWrapper_feature_div"]', first=True).text)
    }
    res = {}
    for key in list(product):
        res[key] = product[key].replace('\n', ' ')
    print(res)
    return res

prices = []
for url in urls:
    prices.append(getPrice(url))
df = pd.DataFrame(prices)
print(df.head(15))
df.to_csv("testfile.csv", index=False)
print(len(prices))
Traceback for the second URL (the first URL completes successfully):
line 14, in getPrice
'price': str(r.html.xpath('//*[@id="priceblock_ourprice"]', first=True).text),
AttributeError: 'NoneType' object has no attribute 'text'
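
The traceback suggests that `xpath(..., first=True)` gives back `None` when nothing matches, so calling `.text` on it fails for the page that only has the dealprice element. Below is a minimal sketch of one way to get "N/A" in that case. The helper names `text_or_na` and `build_product` are made up for illustration and are not part of requests_html; `build_product` assumes it is handed the rendered `r.html` object from the script above.

```python
def text_or_na(element, default="N/A"):
    """Return the element's text with newlines flattened.

    `element` is whatever xpath(..., first=True) returned; when no node
    matched it is None, and `default` is returned instead of raising.
    """
    if element is None:
        return default
    return str(element.text).replace("\n", " ")


def build_product(html):
    """Collect title, price and details from a rendered page.

    `html` is assumed to be `r.html` after `r.html.render(...)`.
    Only the ourprice XPath is queried, so a page that exposes just
    the dealprice element ends up with "N/A" as its price.
    """
    return {
        "title": text_or_na(html.xpath('//*[@id="productTitle"]', first=True)),
        "price": text_or_na(html.xpath('//*[@id="priceblock_ourprice"]', first=True)),
        "details": text_or_na(
            html.xpath('//*[@id="detailBulletsWrapper_feature_div"]', first=True)
        ),
    }
```

Inside `getPrice`, the `product = {...}` dictionary and the newline-replacing loop could then be swapped for `return build_product(r.html)`. A missing title or details block would also come back as "N/A" rather than raising.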