Selenium Problem extracting Google business description

Question

I have been struggling with this issue for a couple of days and could really use some help. I am trying to scrape Google business information with Python, BeautifulSoup and Selenium, and I want to extract the business description that is available for some of the businesses:

As you can see, not all of the text is shown, so I need to click “More”. That is where the problem arises: no matter what I do, I can’t seem to click it. I tried:

Waiting after I get url with Selenium so elements load
Getting element by class
Getting element by xpath
Clicking element via js executed code
Checking if element is in an iframe (seems like it is not)
Setting browser to max size, switching the browser's headless option on and off
Switching between Firefox and Chrome

EDIT: Code I tried using:

from urllib.parse import quote

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# company and location are defined elsewhere in my script
url = 'https://www.google.com/search?q=' + quote(''.join(company) + ' ' + ''.join(location))
options = webdriver.FirefoxOptions()  # Firefox options (previously misnamed chrome_options)
options.headless = True
options.add_argument("--lang=en-GB")
options.add_argument("--window-size=1100,1000")
options.add_argument('--user-agent="Mozilla/5.0 (Windows Phone 10.0; Android 4.2.1; Microsoft; Lumia 640 XL LTE) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Mobile Safari/537.36 Edge/12.10166"')
browser = webdriver.Firefox(executable_path='C:/geckodriver.exe', options=options)
browser.maximize_window()
wait = WebDriverWait(browser, 10)
browser.get(url)  # load the search results page
# wait up to 10 seconds for the "More" link, then click it
wait.until(EC.element_to_be_clickable((By.XPATH, "//a[@href and .='More']"))).click()
# browser.find_element_by_class_name('bJpcZ').click()
html = browser.page_source
browser.close()
soup = BeautifulSoup(html, 'lxml')

If anyone knows a solution, please share it :)

Please include the code of what you've tried so far – msenior_ Feb 18 '22 at 02:54
Added the code to the question – Thresh Bot Feb 18 '22 at 18:26

score 0 · Answer 1 · answered Feb 18 '22 at 05:31

driver.maximize_window()
wait = WebDriverWait(driver, 10)
driver.get("https://www.google.com/search?rlz=1C1NDCM_enCA792CA792&sxsrf=APq-WBsY3Q1E1ge_7PuFaovaxQ_Orvk8-w:1645162032562&q=dungeness+pest+control&spell=1&sa=X&ved=2ahUKEwjfuLGUwoj2AhUaHDQIHUtpCOkQBSgAegQIARAy&biw=1366&bih=663&dpr=1")  # load the search results page
wait.until(EC.element_to_be_clickable((By.XPATH, "//a[@href and .='More']"))).click()

Simply click the a tag whose text is "More".

Alternatively, there is a //div[@data-long-text] element, so you could skip the click and read the full text with .get_attribute("data-long-text") instead.

Import:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC
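
The data-long-text idea can also be applied to the page source the question already collects, without clicking anything. The following is only a rough sketch: it assumes the attribute is present in the HTML that Selenium returns, it uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra packages, and the names LongTextParser, extract_long_text and sample_html are made up for illustration.

```python
from html.parser import HTMLParser


class LongTextParser(HTMLParser):
    """Collects the value of every data-long-text attribute found on a <div>."""

    def __init__(self):
        super().__init__()
        self.long_texts = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs with lowercased names
        if tag != "div":
            return
        for name, value in attrs:
            if name == "data-long-text" and value:
                self.long_texts.append(value)


def extract_long_text(html):
    """Return the first data-long-text value in the HTML, or None if there is none."""
    parser = LongTextParser()
    parser.feed(html)
    parser.close()
    return parser.long_texts[0] if parser.long_texts else None


# Hypothetical markup standing in for browser.page_source
sample_html = (
    '<div data-long-text="Full business description goes here.">'
    'Full business... <a href="#">More</a></div>'
)
print(extract_long_text(sample_html))
```

In the question's script this would be called as extract_long_text(html) after html = browser.page_source. A None result would suggest the attribute is missing from that particular page.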

Arundeep Chohan

9,779
5
15
32

I added the code to the question. When I try to use this, I get selenium.common.exceptions.TimeoutException: Message: Stacktrace: WebDriverError@chrome://remote/content/shared/webdriver/Errors.jsm:183:5 NoSuchElementError@chrome://remote/content/shared/webdriver/Errors.jsm:395:5 element.find/<@chrome://remote/content/marionette/element.js:300:16 – Thresh Bot Feb 18 '22 at 18:26
It's most likely the headless setting. – Arundeep Chohan Feb 19 '22 at 23:08
Ahhh, I figured it out: there was a cookie consent popup that was appearing and causing trouble! All I had to do was agree to the terms – Thresh Bot Feb 20 '22 at 00:27
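
For anyone hitting the same cookie popup, here is a minimal sketch of dismissing it before looking for the "More" link. The thread does not show the popup's markup, so the button labels in CONSENT_XPATHS are guesses that will likely need adjusting per region and over time, and dismiss_consent is a hypothetical helper name. It only relies on the driver's find_elements method and polls for a few seconds in case the dialog loads late.

```python
import time

# Assumed button labels; inspect the actual dialog and adjust these.
CONSENT_XPATHS = [
    "//button[normalize-space()='Accept all']",
    "//button[normalize-space()='I agree']",
]


def dismiss_consent(driver, xpaths=CONSENT_XPATHS, timeout=5.0, poll=0.5):
    """Click the first consent button that shows up.

    Returns the XPath that matched, or None if no button appeared in time.
    """
    deadline = time.monotonic() + timeout
    while True:
        for xpath in xpaths:
            # "xpath" is the string value of selenium's By.XPATH
            buttons = driver.find_elements("xpath", xpath)
            if buttons:
                buttons[0].click()
                return xpath
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)
```

In the question's script this would go right after browser.get(url), as dismiss_consent(browser), and before the wait.until(...) call for the "More" link. A None return simply means no matching button was found, which is expected when the popup is not shown.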
