I am trying to scrape a hashtag feed on LinkedIn, and with my limited knowledge and a lot of googling I managed to cobble together the code below.
The idea is to open a hashtag feed, scroll through it, and click "show more" until enough posts have loaded for me to scrape about 1000 of them.
The problem is that I keep getting the stale element error. Having reviewed some of the other posts on this problem, I have tried implementing WebDriverWait,
but that keeps breaking something else.
import time

from selenium.webdriver.common.by import By

# `browser` is an existing Selenium WebDriver instance (created earlier, not shown)
browser.get('https://www.linkedin.com/feed/hashtag/conversationalai/')
start = time.time()
# will be used in the while loop
initialScroll = 0
finalScroll = 1000
while True:
    button = browser.find_element(By.XPATH, "//*[@class='artdeco-button artdeco-button--muted artdeco-button--1 artdeco-button--full artdeco-button--secondary ember-view scaffold-finite-scroll__load-button']")
    browser.execute_script(f"window.scrollTo({initialScroll}, {finalScroll})")
    # window.scrollTo(x, y) takes target coordinates, so this scrolls to
    # x = initialScroll, y = finalScroll; the vertical target moves
    # 1000 px further down on each iteration
    initialScroll = finalScroll
    finalScroll += 1000
    if button:
        browser.execute_script("arguments[0].click();", button)
    # pause for 7 seconds so that
    # the data can load
    time.sleep(7)
    end = time.time()
    # stop scrolling once more than 20 seconds have elapsed
    if round(end - start) > 20:
        break
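
One direction that might address the stale element error, offered as a rough sketch rather than a confirmed fix: instead of holding a reference to the button across the scroll, look it up again immediately before each click and retry if the reference has gone stale in the meantime. The helper names `click_with_retry` and `click_show_more`, the shortened CSS selector (using only the `scaffold-finite-scroll__load-button` class from the XPath above), and the timeout and retry values are all assumptions made for illustration.

```python
import time


def click_with_retry(find, click, retry_on, attempts=3, pause=0.5):
    """Re-locate an element and click it, retrying on stale references.

    find:     zero-argument callable returning a fresh element reference
    click:    one-argument callable that clicks the given element
    retry_on: exception class (or tuple of classes) that triggers a retry
    Returns True on success, False if every attempt raised `retry_on`.
    """
    for _ in range(attempts):
        try:
            click(find())  # look the element up again on every attempt
            return True
        except retry_on:
            time.sleep(pause)  # give the page time to finish re-rendering
    return False


def click_show_more(browser, timeout=10):
    """Hypothetical wrapper: click the feed's load button if it is present."""
    # Selenium is imported here so the helper above stays usable without it.
    from selenium.common.exceptions import (
        StaleElementReferenceException,
        TimeoutException,
    )
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Assumption: this single class is enough to identify the button.
    locator = (By.CSS_SELECTOR, ".scaffold-finite-scroll__load-button")
    wait = WebDriverWait(browser, timeout)
    try:
        return click_with_retry(
            find=lambda: wait.until(EC.element_to_be_clickable(locator)),
            click=lambda el: browser.execute_script("arguments[0].click();", el),
            retry_on=StaleElementReferenceException,
        )
    except TimeoutException:
        # The button never became clickable within `timeout` seconds.
        return False
```

In the loop above, a call such as `click_show_more(browser)` placed after the scroll would take the place of the `find_element` lookup and the `if button:` click. Because `find_element` raises an exception rather than returning a falsy value when nothing matches, the boolean return value here may also be a more reliable signal that there is nothing left to load.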