3

We are trying to get product URLs from this page of Forever 21's site (http://www.forever21.com/Product/Category.aspx?br=f21&category=dress&pagesize=100&page=1). For some reason, BeautifulSoup does not find the elements with class "item_pic", even though they are in the site's HTML. We have tried requests, mechanize, and Selenium without success. The commented-out lines are from previous attempts to fetch the HTML, none of which worked. Here is our code:

from bs4 import BeautifulSoup
import urllib
import urllib2
import requests

#driver = webdriver.Firefox()
url = "http://www.forever21.com/Product/Category.aspx?br=f21&category=dress&pagesize=100&page=1"
#r = driver.get(url)
#html = r.read()
#headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
#html = requests.get(url, headers=headers)
#response = opener.open(url)
#html = response.read()
html = urllib.urlopen(url).read()
soup = BeautifulSoup(html, "html.parser")
print soup

Any ideas what's going wrong here?

  • Those classes are only on the webpage after the JavaScript is run. You can't just inspect element and expect that's the output you'll get from urllib. Go view the direct HTML response in your developer tools. You'll have to run the JavaScript if you expect to scrape the site. – Falmarri Oct 24 '16 at 22:32
  • Using Selenium, how were you not able to access the exact HTML of the page you were viewing? – TheoretiCAL Oct 24 '16 at 23:00
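A quick way to confirm what the first comment describes is to check whether the class appears anywhere in the raw response before blaming the parser. The sketch below is only an illustration: `has_class` is a hypothetical helper, the regex is a rough diagnostic rather than a real HTML parser, and the two HTML strings are made-up stand-ins for the server response and the rendered page, not the site's actual markup.

```python
import re


def has_class(html, class_name):
    """Return True if any class attribute in `html` lists `class_name`."""
    # Collect the value of every class="..." attribute in the text.
    pattern = r'class\s*=\s*["\']([^"\']*)["\']'
    values = re.findall(pattern, html)
    # A class attribute may hold several space-separated names.
    return any(class_name in value.split() for value in values)


# Made-up stand-in for what urllib receives: an empty container plus a script.
raw_html = '<div id="products"></div><script src="category.js"></script>'

# Made-up stand-in for the DOM after the script has filled the container.
rendered_html = (
    '<div id="products"><div class="item_pic">'
    '<a href="/p/1">x</a></div></div>'
)

print(has_class(raw_html, "item_pic"))       # False
print(has_class(rendered_html, "item_pic"))  # True
```

If the check returns False on the text returned by urllib or requests, the elements are being added later by a script, and no choice of parser will find them.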

2 Answers

0

To scrape the product URLs with BeautifulSoup you need the rendered page source, which Selenium can provide. The following code should give you the product links. It first gets the dynamically generated source through Selenium and then reads the link from the first <a> inside each "item_pic" div you specified.

from bs4 import BeautifulSoup
from selenium import webdriver

# Load the page in a real browser so that its JavaScript runs
driver = webdriver.Firefox()
url = "http://www.forever21.com/Product/Category.aspx?br=f21&category=dress&pagesize=100&page=1"
driver.get(url)
html = driver.page_source  # HTML after the scripts have run

driver.close()

soup = BeautifulSoup(html, "lxml")

itemList = soup.find_all('div', {'class' : 'item_pic'})

# Iterate over the divs directly; enumerate() would yield (index, div) tuples
for element in itemList:
    print element.a['href']

Padraic Cunningham
davloi
0

Most of the content is added dynamically, so you just need to mimic the Ajax request that retrieves it:

import requests

params = {"action": "getcategory",
        "br": "f21",
        "category": "dress",
        "pageno": "",
        "pagesize": "",
        "sort": "",
        "fsize": "",
        "fcolor": "",
        "fprice": "",
        "fattr": ""}

url = "http://www.forever21.com/Ajax/Ajax_Category.aspx"

js = requests.get(url,params=params).json()
print(js)

That gives you pretty much all the dynamic content, a snippet of which looks like:

{u'CategoryHTML': u'<div class="product_item gtm_prod" data-name="Twelve Lace V-Neck Mini Dress" data-sku="2000229555" data-brand="F21" data-product-list="category dress pagesize 120" data-price="58.00" data-retail="58.00">\r\n<div class="item_pic">\r\n<div class="m_qv" alt="quick view" onclick="fnShowProductPopup(\'f21\',\'dress\',\'2000229555\',\'\');" ><span class="quick_view">quick view</span></div>\r\n<a href="http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000229555&VariantID=">\r\n<div id="imgDiv_20

So what you want is under js[u'CategoryHTML']:

In [3]: import requests
   ...: from bs4 import BeautifulSoup
   ...: params = {"action": "getcategory",
   ...:         "br": "f21",
   ...:         "category": "dress",
   ...:         "pageno": "",
   ...:         "pagesize": "",
   ...:         "sort": "",
   ...:         "fsize": "",
   ...:         "fcolor": "",
   ...:         "fprice": "",
   ...:         "fattr": ""}
   ...: url = "http://www.forever21.com/Ajax/Ajax_Category.aspx"
   ...: js = requests.get(url, params=params).json()
   ...: soup = BeautifulSoup(js[u'CategoryHTML'], "html.parser")
   ...: [a["href"] for a in soup.select("div.item_pic a")]
   ...: 

Out[3]: 
[u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000229555&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000235044&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000225681&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000250594&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000231693&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000194240&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000192742&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000191102&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000214728&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000195373&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000213366&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000190888&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000231562&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000195713&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000207425&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000213751&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000229255&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000229243&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000229254&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000215480&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000250589&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000208752&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000195206&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000193780&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000199117&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000192754&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000192732&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000199660&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000207415&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000207430&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000193799&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000194207&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000229598&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000193794&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000233798&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000193784&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000193758&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000194949&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000215792&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000194308&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000194232&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000192739&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000193801&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000194208&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000237450&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000229676&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000195483&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000215685&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000231583&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000213912&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000191263&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000234792&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000195271&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000197171&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000250281&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000208855&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000215076&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000216738&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000194194&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000194302&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000194303&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000213216&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000213495&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000233096&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000192273&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000212922&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000217399&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000209239&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000250603&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000195754&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000197042&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000194183&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000194281&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000217421&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000233947&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000194295&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000230752&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000215044&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000191569&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000191576&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000215150&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000250593&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000188763&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000215566&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000234952&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000214224&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000220848&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000214184&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000213990&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000232029&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000212710&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000230949&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000231443&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000192879&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000192588&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000235216&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000192281&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000212697&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000213386&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000208787&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000193657&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000208320&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000231811&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000196529&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000208541&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000229980&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000195375&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000229866&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000234442&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000194607&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000191105&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000196404&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000199193&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000216479&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000198558&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000193739&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000231532&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000229938&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000229912&VariantID=',
 u'http://www.forever21.com/Product/Product.aspx?BR=f21&Category=dress&ProductID=2000191678&VariantID=']

In [4]: 

You can alter the params (for example pageno, pagesize, or sort) to influence what you get back.
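As one illustration of altering the params, the sketch below builds the query for a given category and page. The helper name `build_params` and the example values (page 2, 60 items per page) are assumptions made here; which values the endpoint actually accepts was not verified.

```python
def build_params(category, pageno="", pagesize="", sort=""):
    """Return the query parameters for the Ajax category request."""
    return {
        "action": "getcategory",
        "br": "f21",
        "category": category,
        # The request above sends every value as a string, so convert numbers.
        "pageno": str(pageno),
        "pagesize": str(pagesize),
        "sort": sort,
        "fsize": "",
        "fcolor": "",
        "fprice": "",
        "fattr": "",
    }


# Hypothetical values: second page of dresses, 60 items per page.
params = build_params("dress", pageno=2, pagesize=60)

# Pass the result to requests.get(url, params=params) exactly as shown above.
```

The same helper could be called once per category name, though the list of valid category values would have to be collected from the site first.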

Padraic Cunningham
  • Thanks! This was really helpful. Now we're hoping to search through every category on the Forever 21 site and get the links for each. Do you know of a way we could do this? – Terry Rossi Nov 03 '16 at 20:36