0

I have a dataframe of which I want to append the non-header records to a google sheet.

import gspread
import csv

latest = pd.read_csv("/Users/sam/Downloads/transactions.csv")

updatesheet = sheet.worksheet("testpy")

updatesheet.append_row([update], table_range='A1')

This gives an error of Object of type DataFrame is not JSON serializable

I have tried other options from various stack threads, most would result in a non JSON type error, but when I would convert df.to_json() it puts it all into a single cell.

Edit: Updating table_range='A1:A' did not fix.

stubbsy
  • 33
  • 7
  • Try changing the range. You are currently passing a single cell as the range. A1 is a cell. Try without a range... If that doesn't work, try `A:ZZ` instead. That should be wide enough, unless you have a lot of columns in the CSV file – Edo Akse Jul 18 '21 at 14:48
  • Also, check [this answer](https://stackoverflow.com/a/52974936/9267296) on how to load a CSV directly using gspread. – Edo Akse Jul 18 '21 at 14:57
  • Thinking some more, you want to add a single column to the sheet, change the table range to `A1:A`. That's a range instead of a cell... – Edo Akse Jul 18 '21 at 15:18
  • @EdoAkse that didn't fix it. The code I put in the post gives an error of Object of type DataFrame is not JSON serializable. If I turn that into to_json then that is when it insert everything into a single cell. Updating the range as suggested did not fix this. – stubbsy Jul 18 '21 at 16:24
  • Can you print out a sanitized line of the CSV file? And a screenshot of what you would expect to see in GSheets? Also, any specific reason why you're using pandas? – Edo Akse Jul 18 '21 at 16:28
  • It would also help if you could print out some of the df entries – Edo Akse Jul 18 '21 at 16:30
  • @EdoAkse Sorry for lack of formatting Columns are Date Time Amount Type Description Category Example data is 4/1/2019 21:49:02 -$11 Withdrawal Intuit *QuickBooks Work - Organization I tried just using another library from gspread_dataframe import set_with_dataframe but it overrides exists data, so also trying to figure out how to identify the first empty row update = latest[latest['Date'] > max_date] # latest data in a df to insert updatesheet = sheet.worksheet("testpy") # test sheet to inset into set_with_dataframe(updatesheet, update, include_column_header=False) – stubbsy Jul 18 '21 at 23:12

1 Answers1

0

I can think of 2 approaches here, I've included both below.

The JSON type error you're experiencing can be solved by casting a df row as a list, as shown in method2.

The issue where everything gets put into cell A1 can be solved by simply ommitting the table_range.

import gspread
import pandas as pd


# set variables
docid = 'yourdocidhere'
worksheetid = 'transactions'
credfile = 'path/to/file/credentials.json'
csvfile = 'path/to/file/transactions.csv'


# load GSheet
gc = gspread.service_account(filename=credfile)
sheet = gc.open_by_key(docid)
worksheet = sheet.worksheet(worksheetid)


# load CSV file into df
df_csv = pd.read_csv(csvfile)


def method1(worksheet: gspread.Worksheet, df: pd.DataFrame):
    """This is the 'cheaty' way. Just load the existing worksheet
    into df, append other df and overwrite existing worksheet.
    Note that this doesn't check if the values in df are already
    in the worksheet...

    Args:
        worksheet (gspread.Worksheet): the worksheet with which to work
        df (pd.DataFrame): the df to append
    """
    # just load existing GSheet into a df
    df_cheat = pd.DataFrame(worksheet.get_all_records())
    # perform append. You might want to play with how you merge/append
    # This method has the advantage of easy sorting.
    df_cheat = df_cheat.append(df).sort_values(by='Date')
    # overwrite worksheet
    # you could easily create a new worksheet with this method as you have
    # the df_cheat to work with
    worksheet.update([df_cheat.columns.values.tolist()] + df_cheat.values.tolist())


def method2(worksheet: gspread.Worksheet, df: pd.DataFrame):
    """This method goes through the df row by row and adds
    it to the worksheet by using casting.
    https://www.w3schools.com/python/python_casting.asp

    Args:
        worksheet (gspread.Worksheet): the worksheet with which to work
        df (pd.DataFrame): the df to append
    """
    # go over the values in the df by row
    for row in df.values:
        # cast as list
        row = list(row)
        # if you leave out the table_range it will append at the end
        worksheet.append_row(row)


method1(worksheet, df_csv)
method2(worksheet, df_csv)

path/to/file/transactions.csv

Date,Time,Amount,Type,Description,Category
4/1/2019,21:49:02,-$11,Withdrawal,Intuit,*QuickBooks Work - Organization

BEFORE before AFTER after

Edo Akse
  • 4,051
  • 2
  • 10
  • 21
  • Amazing yes method2 is what I wanted, that worked great! FYI, I first ingest the existing sheet data, and then I compare to the latest csv, to identify the latest data I need to insert. To do that comparison I cast the Date columns to datetime. Your method gives a json error for the datetime column, so I convert it back to a string before calling the method. – stubbsy Jul 19 '21 at 13:47