I need to access a Google Docs text document stored in Google Drive and download its content in HTML format. I would like to do this with a Python script launched from my laptop's terminal. The document is not public, so the script has to authenticate with Google's servers.

This question has been asked before, but the existing answer targets JavaScript users and is five years old. The Google APIs have changed since then.

I tried the PyDrive module, but its API does not offer any "HTML format" option. The information available is quite ambiguous: many examples refer to old Google APIs, and I have not found a specific reference for downloading text documents in HTML format with Python.

I use the gspread module very often. Is there anything like it for Google text documents?

Can someone point me in the right direction to achieve this?

1 Answer

Finally, my solution:

from __future__ import print_function
import httplib2
import os

import io
# `googleapiclient` is the current package name; `apiclient` is a legacy alias.
from googleapiclient import discovery
from googleapiclient.http import MediaIoBaseDownload
from oauth2client import client
from oauth2client import tools
from oauth2client.file import Storage

try:
    import argparse
    flags = argparse.ArgumentParser(parents=[tools.argparser]).parse_args()
except ImportError:
    flags = None

# If modifying these scopes, delete your previously saved credentials
# at ~/.credentials/drive-python-quickstart.json
SCOPES = 'https://www.googleapis.com/auth/drive'
CLIENT_SECRET_FILE = 'client_secret.json'
APPLICATION_NAME = 'Drive API Python Quickstart'

def get_credentials():
    """Gets valid user credentials from storage.

    If nothing has been stored, or if the stored credentials are invalid,
    the OAuth2 flow is completed to obtain the new credentials.

    Returns:
        Credentials, the obtained credential.
    """
    home_dir = os.path.expanduser('~')
    credential_dir = os.path.join(home_dir, '.credentials')
    if not os.path.exists(credential_dir):
        os.makedirs(credential_dir)
    credential_path = os.path.join(credential_dir,
                                   'drive-python-quickstart.json')

    store = Storage(credential_path)
    credentials = store.get()
    if not credentials or credentials.invalid:
        flow = client.flow_from_clientsecrets(CLIENT_SECRET_FILE, SCOPES)
        flow.user_agent = APPLICATION_NAME
        # tools.run is no longer available in current oauth2client releases.
        # run_flow accepts flags=None and stores the credentials in `store` itself.
        credentials = tools.run_flow(flow, store, flags)
        print('Storing credentials to ' + credential_path)
    return credentials

def main(docID, myDocPath):
    credentials = get_credentials()
    http = credentials.authorize(httplib2.Http())
    service = discovery.build('drive', 'v3', http=http)
    request = service.files().export_media(fileId=docID,
                                           mimeType='text/html')

    fh = io.BytesIO()
    downloader = MediaIoBaseDownload(fh, request)
    done = False
    while not done:
        status, done = downloader.next_chunk()
        print("Download %d%%." % int(status.progress() * 100))

    with open(myDocPath, "wb") as f:
        f.write(fh.getvalue())

if __name__ == '__main__':
    myDocID = 'PUT_HERE_YOUR_DOC_ID'
    main(myDocID, 'some.html')
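
The document ID that replaces the placeholder above is the long token in the document's URL. Here is a minimal sketch for extracting it, assuming the usual `https://docs.google.com/document/d/<ID>/edit` URL shape. `extract_doc_id` is a hypothetical helper, not part of any Google library.

```python
import re

# Assumed URL shape: https://docs.google.com/document/d/<ID>/edit...
_DOC_URL_RE = re.compile(r"/document/d/([A-Za-z0-9_-]+)")


def extract_doc_id(url_or_id):
    """Return the document ID found in a Google Docs URL.

    If the input does not look like a Docs URL, it is assumed to be
    a bare ID already and is returned with surrounding spaces removed.
    (Hypothetical helper, for illustration only.)
    """
    match = _DOC_URL_RE.search(url_or_id)
    if match:
        return match.group(1)
    return url_or_id.strip()
```

The result can be passed straight to `main`, for example `main(extract_doc_id(url), 'some.html')`.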
  • Looks like `store.put(credentials)` is missing after `Storing credentials`. Also, https://pypi.org/project/oauth2client/ has this note: "oauth2client is now deprecated. No more features will be added to the libraries and the core team is turning down support. We recommend you use google-auth and oauthlib." – Denis Ryzhkov Jun 26 '20 at 13:39
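
Since the comment above notes that oauth2client is deprecated, here is a rough sketch of the same HTML export using google-auth and google-auth-oauthlib. It is not the answer's original method. It assumes Python 3 and `pip install google-auth google-auth-oauthlib requests`. The names `token.json`, `export_url` and `download_html` are illustrative choices, as is the narrower read-only scope.

```python
import os
from urllib.parse import quote

# Assumption: read-only access is enough for exporting a document.
SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]


def export_url(doc_id):
    """Build the Drive v3 export endpoint URL for a file ID."""
    return ("https://www.googleapis.com/drive/v3/files/%s/export"
            % quote(doc_id, safe=""))


def get_credentials(token_path="token.json",
                    secret_path="client_secret.json"):
    """Load cached user credentials, refreshing or re-authorising if needed."""
    # Third-party imports are kept local so the module loads without them.
    from google.auth.transport.requests import Request
    from google.oauth2.credentials import Credentials
    from google_auth_oauthlib.flow import InstalledAppFlow

    creds = None
    if os.path.exists(token_path):
        creds = Credentials.from_authorized_user_file(token_path, SCOPES)
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            # Opens a browser window for the OAuth consent screen.
            flow = InstalledAppFlow.from_client_secrets_file(secret_path,
                                                             SCOPES)
            creds = flow.run_local_server(port=0)
        # Cache the credentials explicitly for the next run.
        with open(token_path, "w") as fh:
            fh.write(creds.to_json())
    return creds


def download_html(doc_id, out_path):
    """Export the document as HTML and write it to out_path."""
    from google.auth.transport.requests import AuthorizedSession

    session = AuthorizedSession(get_credentials())
    response = session.get(export_url(doc_id),
                           params={"mimeType": "text/html"})
    response.raise_for_status()
    with open(out_path, "wb") as fh:
        fh.write(response.content)
```

Calling `download_html('PUT_HERE_YOUR_DOC_ID', 'some.html')` would then play the role of `main` in the answer. The token cache is written explicitly here, so no separate storage object is involved.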