
I am trying to scrape the product count from this page: https://www.digikey.com/en/products/filter/dc-dc-converters/922?s=N4IgjCBcpgnAHLKoDGUBmBDANgZwKYA0IA9lANogAMIAusQA4AuUIAykwE4CWAdgOYgAvkOIBWZCAZQwjaZDBUqIoA


My current code is:

import requests
from bs4 import BeautifulSoup
import time
url = 'https://www.digikey.com/en/products/filter/dc-dc-converters/922?s=N4IgjCBcpgnAHLKoDGUBmBDANgZwKYA0IA9lANogAMIAusQA4AuUIAykwE4CWAdgOYgAvkOIBWZCAZQwjaZDBUqIoA'
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
# This sleep has no effect on the result: the response was already downloaded and parsed above
time.sleep(10)
content = soup.find('span', attrs={'data-testid': 'static-product-count'})
print(content.text)

The html part is

<span data-testid="static-product-count" class="jss68">248,154 </span>

My code now returns a null output, although the same code worked fine a few weeks ago, when the HTML was the span below. I don't see why that change makes a difference. Can someone help explain? Thanks!

<span class="jss82" data-testid="product-count">1604</span>
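
One possible explanation, which nothing in the two snippets confirms, is that the span is no longer present in the HTML that requests downloads (for example because it is now filled in by JavaScript). In that case soup.find returns None and calling .text on it fails. A minimal sketch of a defensive lookup, run against the two snippets quoted above rather than the live page; the names find_count_text and KNOWN_TEST_IDS are made up for illustration:

```python
from bs4 import BeautifulSoup

# The two data-testid values quoted in the question; the site may rename them again.
KNOWN_TEST_IDS = ("static-product-count", "product-count")


def find_count_text(html):
    """Return the stripped product-count text, or None if no known span exists."""
    soup = BeautifulSoup(html, "html.parser")
    for test_id in KNOWN_TEST_IDS:
        span = soup.find("span", attrs={"data-testid": test_id})
        if span is not None:
            return span.get_text(strip=True)
    # Nothing matched: the element is not in this HTML at all.
    return None


new_markup = '<span data-testid="static-product-count" class="jss68">248,154 </span>'
old_markup = '<span class="jss82" data-testid="product-count">1604</span>'
empty_markup = "<html><body></body></html>"

print(find_count_text(new_markup))
print(find_count_text(old_markup))
print(find_count_text(empty_markup))
```

If the same function returns None for page.content, the count is not in the downloaded HTML, and no change to the BeautifulSoup selector will find it.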
sujhand

1 Answer


To extract the text 248,154 using Selenium, you need to induce WebDriverWait for visibility_of_element_located(). You can use either of the following locator strategies:

  • Using the text Results:

    driver.get('https://www.digikey.com/en/products/filter/dc-dc-converters/922?s=N4IgjCBcpgnAHLKoDGUBmBDANgZwKYA0IA9lANogAMIAusQA4AuUIAykwE4CWAdgOYgAvkOIBWZCAZQwjaZDBUqIoA')
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//span[@class='button-desktop' and @track-data='ref_page_event=Consent or View Privacy']"))).click()
    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[contains(., 'Results')]//span"))).text)
    
  • Using the data-testid attribute:

    driver.get('https://www.digikey.com/en/products/filter/dc-dc-converters/922?s=N4IgjCBcpgnAHLKoDGUBmBDANgZwKYA0IA9lANogAMIAusQA4AuUIAykwE4CWAdgOYgAvkOIBWZCAZQwjaZDBUqIoA')
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//span[@class='button-desktop' and @track-data='ref_page_event=Consent or View Privacy']"))).click()
    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[@data-testid='static-product-count']"))).text)
    
  • Console output:

    248,154
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    

You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
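
Both locators return the count as text with a thousands separator and, in the HTML from the question, a trailing space ("248,154 "). If a number is needed, a small helper along these lines may be useful. This is a sketch that assumes the comma is always a thousands separator, as on the English page; parse_count is a hypothetical name:

```python
def parse_count(text):
    """Convert count text such as '248,154 ' to an int.

    Assumes commas are thousands separators (English-locale page).
    """
    cleaned = text.strip().replace(",", "")
    if not cleaned.isdigit():
        raise ValueError(f"not a product count: {text!r}")
    return int(cleaned)


print(parse_count("248,154 "))
print(parse_count("1604"))
```

Raising on unexpected text, instead of returning 0, makes a future change to the page markup fail loudly.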

undetected Selenium