0

I'm currently using PyDrive to automatically modify existing Google Sheets I have uploaded to my Google Drive. I then have Google Data Studio connecting to these Google Sheets so that I can then visualize them.

The only problem is that when I modify the contents of the existing Google Sheet, it then completely breaks the data connection (Data Set Configuration Error).

The ID of the file and the name stays completely the same, so I'm a bit confused as to why this happens. Any thoughts?

Here's my code for modifying existing Google Drive files:

import glob
import os

from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive

gauth = GoogleAuth()
gauth.LocalWebserverAuth()
drive = GoogleDrive(gauth)

all_csv = glob.glob(os.path.join(r'/file/', '*.csv'))

def upload_files():

    def update_files(file_id, file_dir):
        update_file = drive.CreateFile({'id': file_id})
        update_file.SetContentFile(file_dir)
        update_file.Upload({'convert': True})


    for file_list in drive.ListFile({'q': "'*folder_ID*' in parents and trashed=false"}):
        for file1 in file_list:
            for x in all_csv:
                if file1['title'] == x.split('\\')[-1].split('.')[0]:
                    update_files(file1['id'], x)


upload_files()
  • I've had a quick look through the file meta-data, and it looks like the main things that change are the etag and the version number - could this possibly be the reason? – Moon Nectar Feb 15 '22 at 12:33

1 Answers1

1

Realised there must be some underlying meta-data issues when using PyDrive for Google Sheets.

Instead opted for gspread and oauth2client libarys to just access the Google Sheet files, clear them and then insert my dataframe with the most up-to-date data.

Here's what I used to get what I want:

import pandas as pd
import gspread
from oauth2client.service_account import ServiceAccountCredentials
    
scope = ['https://spreadsheets.google.com/feeds','https://www.googleapis.com/auth/drive']
creds = ServiceAccountCredentials.from_json_keyfile_name('api_key.json', scope)
client = gspread.authorize(creds)
    
def update_gsheet(dataframe, file_id):
    df = dataframe.fillna('')
    sheet = client.open_by_key(file_id).sheet1
    sheet.clear()
    sheet.update([df.columns.values.tolist()] + df.values.tolist())

Have to account for NaN values as otherwise it flags an error