I am using PyDrive to list the files in the root folder of my Google Drive:

file_list = drive.ListFile({'q': "'root' in parents and trashed=false"}).GetList()
print(file_list)
took = False
for file1 in file_list:
  print('title: %s, id: %s' % (file1['title'], file1['id']))

This prints the following list of files:

title: wikipedia.txt, id: 1emuNuhM0nMkKQEiABW4lcSn7CznTDR-w
title: wiki_2000_rows.csv, id: 1Kjs84pwQVXKyZKXPfbiQIy6QLtwIldcN

I want to read the file

wiki_2000_rows.csv

and put its contents into a dataframe, but I couldn't find out how to do it. Can it be done? How can I read the contents of the files?

Cranjis

3 Answers

To read wikipedia.txt, you just need to access the file's content like this:

from pydrive.drive import GoogleDrive
from pydrive.auth import GoogleAuth

def login() :

    google_auth = GoogleAuth()
    google_auth.LocalWebserverAuth()
    google_auth.Authorize()
    google_drive = GoogleDrive( auth = google_auth )

    return google_drive


def read_file( id_file ) :

    metadata = dict( id = id_file )
    
    google_file = google_drive.CreateFile( metadata = metadata )

    google_file.GetContentFile( filename = id_file )

    content_bytes = google_file.content  # BytesIO

    string_data = content_bytes.read().decode( 'utf-8' )

    return string_data


if __name__ == '__main__' :

    google_drive = login()

    ID_FILE = 'your_file_id'

    string_data = read_file( id_file = ID_FILE )

    print( string_data )

Essentially, you need to call CreateFile with that particular id. The following snippet should download your file; you can then use pandas to read it.

def download_file(drive_obj, file_id, output_fname=None):
    gfile = drive_obj.CreateFile({'id': file_id})
    if output_fname is None:
        output_fname = file_id
    gfile.GetContentFile(output_fname)

    return output_fname
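
As for the pandas step, here is a minimal sketch. It assumes the file is a plain comma-separated CSV with a header row and that `drive_obj` is an already authenticated GoogleDrive instance. The name `read_drive_csv` and the temporary-directory handling are illustrative choices, not part of PyDrive:

```python
import os
import tempfile

import pandas as pd


def read_drive_csv(drive_obj, file_id):
    """Download a Drive file to a temporary path and load it with pandas.

    Assumption: the file is a regular CSV with a header row.
    """
    gfile = drive_obj.CreateFile({'id': file_id})
    with tempfile.TemporaryDirectory() as tmp_dir:
        local_path = os.path.join(tmp_dir, 'download.csv')
        # GetContentFile writes the file's bytes to the given local path
        gfile.GetContentFile(local_path)
        # Read the CSV while the temporary directory still exists
        return pd.read_csv(local_path)
```

For the file in the question, this would be called as `df = read_drive_csv(drive, '1Kjs84pwQVXKyZKXPfbiQIy6QLtwIldcN')`.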
    For the benefit of the asker, you could indicate in more detail how you would use `pandas` to read the file. – sophros Apr 17 '19 at 15:22

Based on Ashesh's answer, here is an implementation with pandas. Note that I am using PyDrive2, since the original PyDrive is no longer maintained:

import io

import pandas as pd

def download_file(drive_obj, file_id):
    # Create a file object that refers to the existing file id
    gfile = drive_obj.CreateFile({'id': file_id})
    # Fetch the content as a string
    content = gfile.GetContentString()
    # read_csv expects a path or file-like object, so wrap the string
    df = pd.read_csv(io.StringIO(content))

    return df
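
The snippets above all take a file id. If only the title is known, one option is to look the id up with a search query first. This is a rough sketch, assuming the same Drive query syntax that the `ListFile` call in the question already uses; `find_file_id` is a hypothetical helper name:

```python
def find_file_id(drive_obj, title):
    """Return the id of the first non-trashed file with this exact title.

    Returns None when nothing matches. Titles are not unique on Drive,
    so this simply takes the first match.
    """
    # Escape backslashes and single quotes so the title is a valid query literal
    escaped = title.replace("\\", "\\\\").replace("'", "\\'")
    query = "title = '%s' and trashed=false" % escaped
    matches = drive_obj.ListFile({'q': query}).GetList()
    return matches[0]['id'] if matches else None
```

With the `drive` object from the question, `find_file_id(drive, 'wiki_2000_rows.csv')` would return the id to pass to `download_file`.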