I previously had trouble writing the data I scrape from a list to a CSV file. I got some help here on SO and got it working with the following code:

import csv
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException
from webdriver_manager.chrome import ChromeDriverManager

# open target
driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get('https://www.exampleurl.com/facts')

# define xpath
def get_elements_by_xpath(driver, xpath):
    return [entry.text for entry in driver.find_elements_by_xpath(xpath)]

# Xpaths
text_entries = [
    ("Fact 1", "//div[@class='fact' and contains(span, '')][1]"),
    ("Fact 2", "//div[@class='fact' and contains(span, '')][2]"),
    ]

# Print facts to csv
with open('facts.csv', 'a') as f:
    csv_output = csv.writer(f)
    entries = []
    for name, xpath in text_entries:
        entries.append(get_elements_by_xpath(driver, xpath))
    csv_output.writerows(zip(*entries))

Now I'd like to use the same approach but write the output to Google Sheets. I tried to do that with pandas and pygsheets, using the following code:

import pygsheets
import pandas as pd
import csv
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.common.exceptions import NoSuchElementException
from webdriver_manager.chrome import ChromeDriverManager

# Open target
driver = webdriver.Chrome(ChromeDriverManager().install())
driver.get('https://www.exampleurl.com/facts')

# Define xpath
def get_elements_by_xpath(driver, xpath):
    return [entry.text for entry in driver.find_elements_by_xpath(xpath)]

# Xpaths
text_entries = [
    ("Fact 1", "//div[@class='fact' and contains(span, '')][1]"),
    ("Fact 2", "//div[@class='fact' and contains(span, '')][2]"),
    ]

# Print facts to Gsheets
gc = pygsheets.authorize(service_file='/users/Username/desktop/map/filekey.json')
df = pd.DataFrame(entries)
sh = gc.open('apts')
wks = sh[0]
wks.set_dataframe(entries,(1,1))

This code writes only one fact for one div, although there are two facts and hundreds of divs using the same class. Writing to CSV gives me all of the facts, correctly. The Google Sheets version also overwrites the sheet each time I scrape instead of adding new rows.

Any help or thoughts would be appreciated!

Neverend
  • Is your data-frame getting created correctly? – Nithin Oct 21 '19 at 02:53
  • @Nithin Well, it is at least when it gets written to CSV, but not when written to Gsheets. Perhaps the logic of the script, which was originally designed for CSV, doesn't carry over to pygsheets and pandas :/ – Neverend Oct 21 '19 at 08:00
  • Try printing your df and see if all the values are as they need to be. Also create a dummy df and try writing it to Gsheets. – Nithin Oct 21 '19 at 08:17
  • @Nithin It works when I print the df (for the first div...), but when I try to write it to Gsheets I get the following error: '-> 1256 df = df.replace(pd.np.nan, nan)' followed by 'AttributeError: 'list' object has no attribute 'replace'' – Neverend Oct 21 '19 at 10:02
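
One possible reading of that traceback, offered as a sketch and not a confirmed fix: the Gsheets script hands the plain list `entries` to `set_dataframe` instead of `df`, which would explain the `'list' object has no attribute 'replace'` error. The script as posted also never runs the loop that fills `entries`, and writing at `(1,1)` starts from cell A1 on every run, which matches the overwriting. The snippet below rebuilds the rows the same way the CSV version did and appends them with `append_table`, which to my knowledge is the pygsheets `Worksheet` method for adding rows below existing data. The helper names `collect_rows` and `append_rows` and the sample strings are made up for illustration.

```python
def collect_rows(columns):
    """Turn one list per XPath into one row per div, like zip(*entries) in the CSV script."""
    return [list(row) for row in zip(*columns)]


def append_rows(wks, rows):
    """Append rows under the existing data instead of writing from A1 again.

    Assumption: `wks` is a pygsheets Worksheet, e.g. gc.open('apts')[0].
    """
    if rows:  # skip the API call when the scrape found nothing
        wks.append_table(values=rows, start='A1', dimension='ROWS', overwrite=False)


# Stand-in data (hypothetical). In the real script each inner list would be
# get_elements_by_xpath(driver, xpath) collected in the loop over text_entries.
entries = [
    ["fact 1 of div A", "fact 1 of div B"],
    ["fact 2 of div A", "fact 2 of div B"],
]
rows = collect_rows(entries)
print(rows)
```

In the original script this would be used as `append_rows(wks, rows)` after `wks = sh[0]`. If a DataFrame is still wanted, passing `df` (not `entries`) to `set_dataframe` should at least avoid the AttributeError, though it would still write from the start cell each time.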