
I am web scraping with Python and BeautifulSoup.

I have to scrape this page.

http://www.starwoodhotels.com//sheraton/property/reviews/index.html?language=en_US&propertyID=115

From this page, I have successfully scraped the hotel's address, but I am unable to scrape the User Reviews section.

Here is my code:

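import requests
from bs4 import BeautifulSoup

hotel_link = "http://www.starwoodhotels.com//sheraton/property/reviews/index.html?language=en_US&propertyID=115"
header = {"User-Agent": "Mozilla/5.0"}  # assumed value; the original post did not show this dict

hotel_page_html = requests.get(hotel_link, headers=header).text
hotel_page_soup = BeautifulSoup(hotel_page_html, "html.parser")  # name the parser explicitly

# The address is the first <li> inside ul#propertyAddress
for hotel_address in hotel_page_soup.select("div#propertyAddressContainer ul#propertyAddress"):
    print("Address: " + hotel_address.select("li")[0].text)

# This selector is the one that does not give me the reviews
print(hotel_page_soup.select("div.BVRRRatingNormalOutOf"))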

As you can see, the CSS selector div#propertyAddressContainer ul#propertyAddress gives me the address, but the div.BVRRRatingNormalOutOf selector does not give me the User Reviews section.

I checked the browser console while the page loads, but I see nothing indicating that the User Reviews are loaded by an AJAX call.

So how do I scrape the Reviews section?
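
One quick way to test the AJAX hypothesis is to search the raw response text for the class name before parsing it. If the string is absent from the text, no CSS selector can match it, which suggests the markup is added later by a script. The following is only a minimal sketch: `markers_in_html` is a hypothetical helper, and `sample_html` is invented markup standing in for `requests.get(hotel_link, headers=header).text`, not the real page.

```python
def markers_in_html(html, markers):
    """Map each marker string to whether it occurs in the raw HTML text."""
    return {marker: marker in html for marker in markers}


# Hypothetical response body; the real page's markup is not reproduced here.
sample_html = """
<div id="propertyAddressContainer">
  <ul id="propertyAddress"><li>123 Example Street</li></ul>
</div>
<div id="reviewsPlaceholder"></div>
<script src="reviews-widget.js"></script>
"""

found = markers_in_html(sample_html, ["propertyAddress", "BVRRRatingNormalOutOf"])
for marker, present in found.items():
    print(marker, "present" if present else "missing from static HTML")
```

If a marker is missing from the text that requests returns, the Network tab of the browser's developer tools is usually a better place than the Console to look for the request that delivers it.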

Umair Ayub

2 Answers


Why are you making this so complicated?

Just do:

soup.find("span", {"itemprop": "aggregateRating"}).text.encode('ascii', 'ignore').decode('ascii').replace('\n', ' ')

Out[]:
Rated 3.4 out of 5by 625 reviewers.

Isn't that what you need?
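
A note on Python versions: in Python 3, `str.encode` returns `bytes`, and `bytes.replace` only accepts bytes-like arguments, so the result has to be decoded back to `str` before calling `.replace('\n', ' ')`. A small sketch of the cleanup step on its own follows. The sample string is invented for illustration; whether the real page contains a non-ASCII character at that position is not known from this thread.

```python
# Hypothetical text with a newline and a non-breaking space (\u00a0).
raw_text = "Rated 3.4\nout of 5\u00a0by 625 reviewers."

# 'ignore' silently drops characters that cannot be encoded as ASCII.
ascii_bytes = raw_text.encode("ascii", "ignore")

# Decode back to str so that str methods such as replace() can be used.
cleaned = ascii_bytes.decode("ascii").replace("\n", " ")
print(cleaned)
```

Because `'ignore'` deletes characters rather than substituting them, words separated only by a non-ASCII space end up joined together.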

Md. Mohsin

Working code

rev = (
    hotel_page_soup.find("span", {"itemprop": "aggregateRating"})
    .text.encode('ascii', 'ignore')
    .replace('\n', ' ')
)

for total_rating_score in rev.select("span"):
    print(total_rating_score.string)
user3666197
Umair Ayub
  • Your answer won't even work. `rev` is a string, so `rev.select` will never work. This is ridiculous. You ask questions, and when someone answers you just make a minor modification to something that is not even right and post it yourself? – Md. Mohsin Nov 17 '14 at 19:31
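
The objection in the comment can be illustrated with a short sketch: `.select` is a method of BeautifulSoup `Tag` objects, so the result of `find` has to be kept as a `Tag` for as long as nested elements are needed, and only flattened to text at the end. The markup below is a hypothetical stand-in for the real page, and the nested `itemprop` names are assumptions chosen for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page's structure may differ.
SAMPLE_HTML = """
<span itemprop="aggregateRating">
  Rated <span itemprop="ratingValue">3.4</span> out of
  <span itemprop="bestRating">5</span> by
  <span itemprop="reviewCount">625</span> reviewers.
</span>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Keep the Tag object; do not convert it to text yet.
rating_tag = soup.find("span", {"itemprop": "aggregateRating"})

# Nested spans are reachable because rating_tag is a Tag, not a str.
parts = {
    span["itemprop"]: span.get_text(strip=True)
    for span in rating_tag.select("span[itemprop]")
}

# Flatten to one line only at the end; split/join collapses all whitespace.
summary = " ".join(rating_tag.get_text().split())

print(parts)
print(summary)
```

On a real page, `find` returns `None` when nothing matches, so a check such as `if rating_tag is not None:` is worth adding before calling `.select` or `.get_text`.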