I'm trying to scrape a product list with BeautifulSoup. There are 80 products listed on the website. The script works, but it stops after the 32nd product. How can I scrape all of the products?

import requests
from bs4 import BeautifulSoup

from pymongo import MongoClient
client = MongoClient('localhost', 27017)
db = client.dbsparta

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36'}
data = requests.get('https://www.stories.com/kr_krw/top-sellers/top-sellers.html', headers=headers)

soup = BeautifulSoup(data.text, 'html.parser')
#image = #category-list > div:nth-child(1) > a > div.product-image > div > img.a-image.default-image -> src attr.
#name = #category-list > div:nth-child(1) > a > div.description > div.product-title > label -> text
#price = #category-list > div:nth-child(1) > a > div.description > div.m-product-price > label -> text

products = soup.select('#category-list > div.o-product')

for product in products:
    image = product.select_one('div.product-image > div > img.a-image.default-image')['src']
    name = product.select_one('div.description > div.product-title > label').text
    price = product.select_one('div.description > div.m-product-price > label').text
    print(image,name,price)
Min
    The HTML that you fetch with requests represents the initial state of the web page, which contains only 32 listed items. As you scroll down, the HTML is updated via JavaScript. You can use Selenium, or requests with a session. This question might help: https://stackoverflow.com/questions/34546766/scraping-hidden-elements-using-beautifulsoup – Yati Raj Jun 07 '20 at 03:59

1 Answer

The data is loaded dynamically via JavaScript, but you can simulate those requests with the requests module.

For example:

import requests
from bs4 import BeautifulSoup

url = 'https://www.stories.com/kr_krw/top-sellers/top-sellers.html'
ajax_url = 'https://www.stories.com/kr_krw/dpa/aosCtgrItemAddList.html'

soup = BeautifulSoup(requests.get(url).content, 'html.parser')

dispLcatCd = soup.select_one('#dispLcatCd')['value']
dispMcatCd = soup.select_one('#dispMcatCd')['value']

data = {
    'sect_id': dispMcatCd,
    'dispLcatCd': dispLcatCd,
    'dispMcatCd': dispMcatCd,
    'pageNum': 1,
    'viewCnt': 32,
    }

while True:
    print('Processing page {}...'.format(data['pageNum']))
    soup = BeautifulSoup(requests.post(ajax_url, data=data).content, 'html.parser')

    if not soup.select('.o-product'):
        break

    for title, img, price in zip(soup.select('.product-title'),
                                 soup.select('.default-image'),
                                 soup.select('.price')):
        print('{:<50} {:<10} {}'.format(title.get_text(strip=True), price.get_text(strip=True), img['src']))

    data['pageNum'] += 1

Prints:

Processing page 1...
버튼 맥시 스트랩 드레스                                      129,000    https://image.thehyundai.com/static/4/4/1/14/A1/hnm40A1141441_01_0864704_001_002_568.jpg
하프 문 스트로 크로스바디 백                                   69,000     https://image.thehyundai.com/static/9/8/0/07/A1/hnm40A1070896_02_0838559_001_001_568.jpg
러플 코튼 도비 미디 드레스                                    119,000    https://image.thehyundai.com/static/4/6/7/87/A0/hnm40A0877646_01_0727841_001_001_568.jpg
스트라이프 스트랩 레더 샌들                                    89,000     https://image.thehyundai.com/static/9/1/3/14/A1/hnm40A1143195_03_0852209_001_001_568.jpg
플로럴 미디 랩 드레스                                       110,000    https://image.thehyundai.com/static/4/6/7/04/A1/hnm40A1047640_01_0680108_003_001_568.jpg
오가닉 펄 오픈 후프 이어링                                    35,000     https://image.thehyundai.com/static/3/3/0/99/A0/hnm40A0990337_02_0846451_001_001_568.jpg
플리츠 미디 스커트                                         89,000     https://image.thehyundai.com/static/6/5/7/13/A1/hnm40A1137566_01_0883732_001_002_568.jpg
플로럴 프린트 맥시 드레스                                     129,000    https://image.thehyundai.com/static/7/7/6/10/A1/hnm40A1106773_01_0493476_006_001_568.jpg
깅엄 시어서커 스윔수트                                       79,000     https://image.thehyundai.com/static/4/5/3/14/A1/hnm40A1143548_02_0882631_001_001_568.jpg
리넨 쇼츠                                              79,000     https://image.thehyundai.com/static/9/3/6/13/A1/hnm40A1136396_01_0883866_001_001_568.jpg
패디드 레더 샌들                                          110,000    https://image.thehyundai.com/static/7/4/4/13/A1/hnm40A1134475_02_0851451_001_001_568.jpg
리본 브림 우븐 스트로 햇                                     39,000     https://image.thehyundai.com/static/4/2/8/02/A1/hnm40A1028243_02_0848837_001_001_568.jpg
프릴 퍼프 슬리브 니트 탑                                     69,000     https://image.thehyundai.com/static/6/5/3/14/A1/hnm40A1143560_01_0886586_002_001_568.jpg
레더 스트래피 레이스 업 힐 샌들                                 119,000    https://image.thehyundai.com/static/0/3/8/88/A0/hnm40A0888301_02_0731706_003_001_568.jpg
프릴 크레이프 시폰 미디 드레스                                  129,000    https://image.thehyundai.com/static/1/7/2/12/A1/hnm40A1122717_01_0866370_001_001_568.jpg
슬리브리스 프릴 블라우스                                      59,000     https://image.thehyundai.com/static/9/7/8/13/A1/hnm40A1138796_0864574001_202001_LB_0020_Q8_L_1120x868_srgb_568.jpg
캔버스 토프 블러셔                                         15,000     https://image.thehyundai.com/static/9/3/2/41/A0/hnm40A0412390_02_0148486_010_001_568.jpg
스캘럽 헴 리넨 쇼츠                                        89,000     https://image.thehyundai.com/static/7/5/7/13/A1/hnm40A1137570_01_0902708_001_001_568.jpg
트윌 슬링백 버클 샌들                                       79,000     https://image.thehyundai.com/static/9/1/3/14/A1/hnm40A1143199_02_0888324_001_001_568.jpg
에이시메트릭 랩 미디 드레스                                    110,000    https://image.thehyundai.com/static/6/0/5/08/A1/hnm40A1085069_0853055001_202001_LB_0982_Q8_L_1120x868_srgb_568.jpg
피티드 스모크 스커트                                        79,000     https://image.thehyundai.com/static/9/3/6/13/A1/hnm40A1136395_01_0784826_011_001_568.jpg
캔버스 토트 백                                           110,000    https://image.thehyundai.com/static/9/4/7/09/A1/hnm40A1097499_02_0838566_002_001_568.jpg
마이크로 플로럴 랩 미니 드레스                                  79,000     https://image.thehyundai.com/static/1/7/2/12/A1/hnm40A1122713_01_0751883_004_001_568.jpg
피티드 스모크 셔츠                                         110,000    https://image.thehyundai.com/static/9/3/6/13/A1/hnm40A1136394_01_0859052_001_002_568.jpg
오픈 백 점프수트                                          110,000    https://image.thehyundai.com/static/1/6/0/13/A1/hnm40A1130610_01_0917714_001_001_568.jpg
벨티드 퍼프 슬리브 미디 드레스                                  119,000    https://image.thehyundai.com/static/9/7/8/13/A1/hnm40A1138797_0874015002_202001_LB_0175_Q8_L_1120x868_srgb_568.jpg
리넨 퍼프 슬리브 미니 드레스                                   119,000    https://image.thehyundai.com/static/5/5/3/14/A1/hnm40A1143553_01_0900126_001_001_568.jpg
자카드 랩 맥시 드레스                                       129,000    https://image.thehyundai.com/static/4/9/6/12/A1/hnm40A1126946_0887633001_202001_LB_0862_Q8_L_1120x868_srgb_568.jpg
오버사이즈 벨티드 리넨 점프수트                                  119,000    https://image.thehyundai.com/static/6/5/7/13/A1/hnm40A1137564_01_0871142_001_001_568.jpg
오버사이즈 버튼 셔츠 드레스                                    79,000     https://image.thehyundai.com/static/0/4/6/13/A1/hnm40A1136405_0880475001_202002_LB_1106_Q8_L_1120x868_srgb_568.jpg
패디드 레더 슬링백 샌들                                      119,000    https://image.thehyundai.com/static/7/4/4/13/A1/hnm40A1134476_02_0876166_002_001_568.jpg
듀오 톤 레더 크로스바디 백                                    225,000    https://image.thehyundai.com/static/5/7/3/99/A0/hnm40A0993757_02_0775965_001_001_568.jpg
Processing page 2...
리넨 블렌드 블레이저                                        119,000    https://image.thehyundai.com/static/1/3/5/12/A1/hnm40A1125312_01_0852710_001_001_568.jpg
A라인 러플 미니 드레스                                      89,000     https://image.thehyundai.com/static/0/6/2/11/A1/hnm40A1112606_01_0887613_001_001_568.jpg
릴렉스드 버튼 미디 드레스                                     79,000     https://image.thehyundai.com/static/6/5/7/13/A1/hnm40A1137565_01_0864561_001_001_568.jpg
리넨 퍼프 슬리브 미디 드레스                                   129,000    https://image.thehyundai.com/static/8/7/0/13/A1/hnm40A1130783_0881161003_202002_LB_0491_Q8_L_1120x868_srgb_568.jpg

... and so on.
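
One caveat with the loop above: zipping three separate select() results assumes every product has a title, an image and a price. If one product lacked a price, the remaining rows would shift. A minimal sketch of a per-product variant follows, assuming the same class names as above (.o-product, .product-title, .default-image, .price). The helper names parse_products and scrape_all, the fetch_page callback and the max_pages cap are hypothetical and not part of the site or of any library.

```python
from bs4 import BeautifulSoup


def parse_products(html):
    """Return one dict per .o-product block found in the given HTML."""
    soup = BeautifulSoup(html, 'html.parser')
    items = []
    for product in soup.select('.o-product'):
        # Look up each field inside this product only, so a missing
        # field becomes None instead of shifting the other rows.
        title = product.select_one('.product-title')
        img = product.select_one('.default-image')
        price = product.select_one('.price')
        items.append({
            'name': title.get_text(strip=True) if title else None,
            'price': price.get_text(strip=True) if price else None,
            'image': img.get('src') if img else None,
        })
    return items


def scrape_all(fetch_page, max_pages=50):
    """Collect products page by page until a page comes back empty.

    fetch_page(page_num) must return that page's HTML as a string.
    max_pages is an assumed safety cap so the loop always terminates.
    """
    results = []
    for page_num in range(1, max_pages + 1):
        items = parse_products(fetch_page(page_num))
        if not items:
            break
        results.extend(items)
    return results
```

With the variables from the answer, fetch_page could be something like lambda n: requests.post(ajax_url, data={**data, 'pageNum': n}).text. The returned list of dicts is also in a shape that pymongo's insert_many accepts, if the results should end up in the MongoDB database from the question.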
Andrej Kesely