How to download a Drive "attachment"
The "attachment" referred to is actually just a link to a Drive file, so confusingly it is not an attachment at all, but just text or HTML.
The issue here is that since it's not an attachment as such, you won't be able to fetch this with the Gmail API by itself. You'll need to use the Drive API.
To use the Drive API you'll need the file ID, which appears in the link inside the message's HTML content part (among others).
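You can use the re module to run findall on the HTML content. I used the following regex pattern to recognize Drive links:
(?<=https://drive\.google\.com/file/d/)[^/]+(?=/view\?usp=drive_web)
The [^/]+ in the middle keeps each match to a single ID, even when several links sit on the same line.
Here is a sample Python function to get the file IDs. It returns a list, which is empty if no Drive links are found.
import base64
import re

# Matches the ID in links like https://drive.google.com/file/d/<ID>/view?usp=drive_web
DRIVE_LINK = re.compile(
    r'(?<=https://drive\.google\.com/file/d/)[^/]+(?=/view\?usp=drive_web)'
)

def get_file_ids(service, user_id, msg_id):
    # service is a Gmail API client
    message = service.users().messages().get(userId=user_id, id=msg_id).execute()
    results = []
    for part in message['payload']['parts']:
        if part["mimeType"] == "text/html":
            # The body is URL-safe base64; decode it to text rather than calling str() on bytes
            html = base64.urlsafe_b64decode(part["body"]["data"]).decode('utf-8')
            results.extend(DRIVE_LINK.findall(html))
    return results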
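Note that the function above only inspects the top-level parts. When a message also carries regular attachments, the text/html part is often nested inside a multipart/alternative part. The following is a minimal sketch (not part of the original answer) of walking the parts recursively. `iter_parts` and `extract_drive_ids` are hypothetical helper names, and the message dict is hand-built sample data shaped like a `messages.get` response, so it runs without any API call.

```python
import base64
import re

# Assumed link shape: https://drive.google.com/file/d/<ID>/view?usp=drive_web
DRIVE_LINK = re.compile(
    r"(?<=https://drive\.google\.com/file/d/)[^/]+(?=/view\?usp=drive_web)"
)


def iter_parts(payload):
    """Yield the payload itself and every nested part, depth first."""
    yield payload
    for part in payload.get("parts", []):
        yield from iter_parts(part)


def extract_drive_ids(message):
    """Return the unique Drive file IDs linked from any text/html part."""
    ids = []
    for part in iter_parts(message["payload"]):
        data = part.get("body", {}).get("data")
        if part.get("mimeType") == "text/html" and data:
            html = base64.urlsafe_b64decode(data).decode("utf-8")
            for file_id in DRIVE_LINK.findall(html):
                if file_id not in ids:
                    ids.append(file_id)
    return ids


# Hand-built sample (hypothetical data, not a real message).
sample_html = (
    '<a href="https://drive.google.com/file/d/abc123/view?usp=drive_web">report</a>'
)
sample_message = {
    "payload": {
        "mimeType": "multipart/mixed",
        "parts": [
            {
                "mimeType": "multipart/alternative",
                "parts": [
                    {
                        "mimeType": "text/html",
                        "body": {
                            "data": base64.urlsafe_b64encode(
                                sample_html.encode("utf-8")
                            ).decode("ascii")
                        },
                    }
                ],
            }
        ],
    }
}

print(extract_drive_ids(sample_message))
```

In real use you would pass the dict returned by `service.users().messages().get(...).execute()` instead of `sample_message`.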
Once you have the IDs then you will need to make a call to the Drive API.
You could follow the example in the docs:
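import io
from googleapiclient.http import MediaIoBaseDownload

# gmail_service and drive_service are clients built separately for the Gmail and Drive APIs
file_ids = get_file_ids(gmail_service, "me", "[YOUR_MSG_ID]")
for file_id in file_ids:
    request = drive_service.files().get_media(fileId=file_id)
    fh = io.BytesIO()
    downloader = MediaIoBaseDownload(fh, request)
    done = False
    while not done:
        status, done = downloader.next_chunk()
        print("Download %d%%." % int(status.progress() * 100))
    # fh now holds the file's bytes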
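The loop above leaves each file's bytes in the in-memory buffer `fh`. A small sketch of writing such a buffer to disk follows. `save_buffer` is a hypothetical helper, the temporary directory and the `abc123.bin` file name are placeholders, and the buffer is filled by hand here to stand in for a finished download.

```python
import io
import tempfile
from pathlib import Path


def save_buffer(fh, out_dir, filename):
    """Write the bytes collected in an io.BytesIO buffer to out_dir/filename."""
    target = Path(out_dir) / filename
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(fh.getvalue())
    return target


# Stand-in for a buffer that the download loop has filled.
demo_buffer = io.BytesIO(b"example bytes")
demo_dir = tempfile.mkdtemp()
saved_path = save_buffer(demo_buffer, demo_dir, "abc123.bin")
print(saved_path)
```

Using `getvalue()` returns the whole buffer regardless of where the stream position ended up after the last chunk.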
Remember, since you will now be using the Drive API as well as the Gmail API, you'll need to add a Drive scope to the scopes in your project. Also remember to enable the Drive API in the developers console, update your OAuth consent screen and credentials, and delete the local token.pickle file so that the new scopes are requested on the next run.
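As a rough sketch of the scope change, assuming read-only access is enough for both APIs and that the token is cached as token.pickle in the working directory (as in the Google quickstart samples); `drop_cached_token` is a hypothetical helper name:

```python
import os

# Read-only scopes for both APIs; choose broader ones only if you need them.
SCOPES = [
    "https://www.googleapis.com/auth/gmail.readonly",
    "https://www.googleapis.com/auth/drive.readonly",
]


def drop_cached_token(path="token.pickle"):
    """Delete the cached token so the next run asks for consent again.

    Returns True if a file was removed, False if there was nothing to delete.
    """
    if os.path.exists(path):
        os.remove(path)
        return True
    return False
```

Call `drop_cached_token()` once after editing `SCOPES`; the next authorization flow should then prompt for the combined Gmail and Drive permissions.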