Previously I had some issues writing data from a list to a CSV file. I got some help here on SO and managed to do what I wanted with the following code:
import csv
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException
from webdriver_manager.chrome import ChromeDriverManager
# open target
driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get('https://www.exampleurl.com/facts')
# define xpath
def get_elements_by_xpath(driver, xpath):
    return [entry.text for entry in driver.find_elements_by_xpath(xpath)]
# Xpaths
text_entries = [
    ("Fact 1", "//div[@class='fact' and contains(span, '')][1]"),
    ("Fact 2", "//div[@class='fact' and contains(span, '')][2]"),
]
# Print facts to csv
with open('facts.csv', 'a', newline='') as f:
    csv_output = csv.writer(f)
    entries = []
    for name, xpath in text_entries:
        entries.append(get_elements_by_xpath(driver, xpath))
    csv_output.writerows(zip(*entries))
Now I'd like to use the same code but write the output to Google Sheets (Gsheets) instead. I tried to do this with pandas and pygsheets, using the following code:
import pygsheets
import pandas as pd
import csv
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException
from webdriver_manager.chrome import ChromeDriverManager
# Open target
driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get('https://www.exampleurl.com/facts')
# Define xpath
def get_elements_by_xpath(driver, xpath):
    return [entry.text for entry in driver.find_elements_by_xpath(xpath)]
# Xpaths
text_entries = [
    ("Fact 1", "//div[@class='fact' and contains(span, '')][1]"),
    ("Fact 2", "//div[@class='fact' and contains(span, '')][2]"),
]
# Print facts to Gsheets
gc = pygsheets.authorize(service_file='/users/Username/desktop/map/filekey.json')
entries = []
for name, xpath in text_entries:
    entries.append(get_elements_by_xpath(driver, xpath))
df = pd.DataFrame(entries)
sh = gc.open('apts')
wks = sh[0]
wks.set_dataframe(df,(1,1))
This code writes only one fact for one div, although there are two facts and hundreds of divs using the same class. Writing to CSV gives me all the facts, correctly laid out. The Google Sheets version also overwrites the sheet on every scrape instead of appending new rows.
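To make the goal concrete, here is a rough sketch of the two pieces I think are missing, written without Selenium or pygsheets so it runs on its own. The helper names `columns_to_rows` and `next_free_row` and the sample data are made up for illustration. The first helper does the same transposition that `zip(*entries)` does in the CSV version (one row per div), and the second finds the first empty row so that new data could be placed below what is already there. This is only my guess at the shape of a solution, not a tested fix.

```python
from itertools import zip_longest


def columns_to_rows(columns):
    """Turn one list per XPath into one row per div, padding short columns with ''."""
    return [list(row) for row in zip_longest(*columns, fillvalue="")]


def next_free_row(existing_rows):
    """Return the 1-based index of the first row after the last non-empty row."""
    last_used = 0
    for index, row in enumerate(existing_rows, start=1):
        if any(cell != "" for cell in row):
            last_used = index
    return last_used + 1


# Made-up data standing in for the two scraped columns (hypothetical values).
entries = [["a1", "a2", "a3"], ["b1", "b2"]]
rows = columns_to_rows(entries)

# Made-up sheet contents: two used rows followed by a blank one.
start_row = next_free_row([["old1", "old2"], ["old3", ""], ["", ""]])

print(rows)
print(start_row)
```

Presumably `rows` would then be written to the worksheet starting at `start_row` instead of always at `(1,1)`, for example by building the DataFrame from `rows` rather than from `entries`, but I have not been able to confirm the right pygsheets call for that.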
Any help or thoughts would be appreciated!