6

I'm trying to write a pandas.DataFrame directly to Google Drive, without first writing the file out locally. I can't find a solution and not sure if it is even possible. I've tried the code below, but I get an AttributeError.

import pandas as pd
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive

gauth = GoogleAuth()
gauth.LoadCredentialsFile(mycreds)
drive = GoogleDrive(gauth)

df = pd.DataFrame({'a':[1,2],'b':[2,3]})

f = drive.CreateFile({'id': '0B_6_uVX9biFuX0FJWFkt'}) #test.xlsx file
f.SetContentString(df) 
f.Upload()

AttributeError: 'DataFrame' object has no attribute 'encode'

Stephen Rauch
  • 47,830
  • 31
  • 106
  • 135
cyril
  • 2,976
  • 5
  • 19
  • 26
  • Consider creating a temp file, uploading it, and then deleting it? – cs95 Jun 08 '17 at 20:39
  • A dataframe does not have a native storage format. What format do you want it to be stored as? – Stephen Rauch Jun 09 '17 at 01:43
  • @StephenRauch I'd like to save it as an Excel file. And specifically overwrite an existing file daily using the file id from Google Drive. – cyril Jun 09 '17 at 02:52

1 Answers1

1

It is possible to get pandas to write an Excel file directly to a string like:

Code:

wb = Workbook()
writer = pd.ExcelWriter('', engine='openpyxl')
writer.book = wb
df.to_excel(writer)
file_as_string = save_virtual_workbook(wb)

Full Code:

Here is the above code combined with your example. Please note that this part is untested.

import pandas as pd
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive

gauth = GoogleAuth()
gauth.LoadCredentialsFile(mycreds)
drive = GoogleDrive(gauth)

df = pd.DataFrame({'a': [1, 2], 'b': [2, 3]})

from openpyxl.workbook import Workbook
from openpyxl.writer.excel import save_virtual_workbook

wb = Workbook()
writer = pd.ExcelWriter('', engine='openpyxl')
writer.book = wb
df.to_excel(writer)

f = drive.CreateFile({'id': '0B_6_uVX9biFuX0FJWFkt'}) #test.xlsx file
f.SetContentString(save_virtual_workbook(wb))
f.Upload()
Stephen Rauch
  • 47,830
  • 31
  • 106
  • 135
  • I suggest avoiding `ExcelWriter` altogether as openpyxl has pretty good support for Dataframes http://openpyxl.readthedocs.io/en/latest/pandas.html – Charlie Clark Jun 09 '17 at 07:54
  • I get an error message: `UnicodeDecodeError: 'ascii' codec can't decode byte 0xc9 in position 12: ordinal not in range(128)` when I run this. – cyril Jun 09 '17 at 14:20
  • Python 2 or 3? Which line is complaining? Is that with the sample dataframe? – Stephen Rauch Jun 09 '17 at 15:06
  • Python 2. `UnicodeDecodeError ... ---> 12 f.SetContentString(save_virtual_workbook(wb))` – cyril Jun 09 '17 at 15:26
  • Is that the sample dataframe?, Also seperate the `SetContentString` from the `save_virtual_workbook` to see which is having the problem – Stephen Rauch Jun 09 '17 at 15:28
  • Okay, did this: `vw = save_virtual_workbook(wb)` `f.SetContentString(vw)` SetContentString is giving me the error. The dataframe is the same the example. – cyril Jun 09 '17 at 15:40
  • I do not have access to the `SetContentString` stuff, So I won't be much help there. – Stephen Rauch Jun 09 '17 at 16:11