
I am downloading files from OneDrive with the following piece of code:


if response.status_code == 200:
    print('\n> Response Success')

    with open('Book2.xlsx', 'wb') as File:
        File.write(response.content)
        
        print('\n> File Downloaded')
else:
    print('\n> Failed:', response.status_code)
    print(response.content)

The code is taken from another post.

The file is fetched from OneDrive with the following code, which runs before the snippet above:


import sys, os, time, requests
import pandas as pd
import urllib.parse

OneDrive_FilePath = 'Book2.xlsx'

OneDrive_FileURL = 'https://graph.microsoft.com/v1.0/me/drive/root:/' + OneDrive_FilePath + ':/content'
OneDrive_FileURL = urllib.parse.quote(OneDrive_FileURL, safe=':/')
print(OneDrive_FileURL)

Client_Id = 'XXXX'
Tenant_Id = 'YYYYY'
Refresh_Token_First = 'ZZZZZ'

PostStr = {'grant_type': 'refresh_token', 'client_id': Client_Id, 'refresh_token': Refresh_Token_First}

Token_Response = requests.post('https://login.microsoftonline.com/' + Tenant_Id + '/oauth2/v2.0/token', data=PostStr)

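# Use .get() so a missing key yields None (checked below) instead of raising KeyError
Token_Data = Token_Response.json()
Access_Token = Token_Data.get('access_token')
New_Refresh_Token = Token_Data.get('refresh_token')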

if Access_Token is None or New_Refresh_Token is None:
    print('\n> Failed: Access_Token NOT Retrieved')
    sys.exit()

# Named 'response' (lowercase) to match the download snippet above
response = requests.get(OneDrive_FileURL, headers={'Authorization': 'Bearer ' + Access_Token})

The "File" object I end up with is of type "io.BufferedWriter". I want to load the downloaded file as a data frame instead, so that I can run some operations on it and then upload it to AWS.

How can I do that?

Thanks

zsh_18
  • Could you please provide a URL that contains this kind of file, so that one can experiment with it? – Roy2012 Jul 06 '20 at 07:28
  • Also, please include the code that actually creates 'File'. – Roy2012 Jul 06 '20 at 07:29
  • @Roy2012 - I have edited the post and added the code used for accessing the "File" from One drive. Hope this helps and this is the only code I am using for downloading the "File". – zsh_18 Jul 06 '20 at 23:47
  • I don't have the access tokens, so I can't run your code. If I understand correctly, 'File' is just an Excel file that you'd like to read into a dataframe? Is that what the question is about? – Roy2012 Jul 07 '20 at 06:30
  • @Roy2012 - Yes it is an Excel file in One Drive, But when the file is accessed the way it is accessed right now, it changes to a type - "io.BufferedWriter". I would like to change that to the dataframe or at the first place read it as dataframe. Thanks – zsh_18 Jul 08 '20 at 04:48
  • You're saving it to disk eventually. Why can't you just open it as an excel with pandas? – Roy2012 Jul 08 '20 at 05:03
  • @Roy2012- I am trying to establish One Drive to AWS S3 sync. I am not supposed to actually load things locally on my desktop and then create a data frame and upload it to S3. I want to do it straight. – zsh_18 Jul 08 '20 at 05:05
  • So the type of File is probably irrelevant. What's the type of "response.content"? I guess it's bytes or string? – Roy2012 Jul 08 '20 at 05:13
  • @Roy2012 - yes its Bytes – zsh_18 Jul 08 '20 at 05:29
  • See my answer below. – Roy2012 Jul 08 '20 at 05:33
  • Many Thanks. Sorted ! – zsh_18 Jul 08 '20 at 05:45

1 Answer


As explained in the comments, the type of File is mostly irrelevant. All you need to do is read the Excel file directly from the response content, for example:

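import io, requests
import pandas as pd

url = "https://go.microsoft.com/fwlink/?LinkID=521962"
res = requests.get(url)
# Wrap the bytes in BytesIO; newer pandas versions deprecate passing raw bytes
pd.read_excel(io.BytesIO(res.content))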

The last statement produces a regular pandas dataframe. You can use that however you want.
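To see why the io.BufferedWriter type does not matter, here is a minimal sketch using only the standard library. The `content` bytes are a made-up stand-in for `response.content` (not a real workbook), and the temporary directory is only there so the example leaves nothing behind. It shows that the object returned by `open(..., 'wb')` is a write-only handle to the file on disk, while the downloaded data itself is plain bytes that can be wrapped in an in-memory, readable buffer.

```python
import io
import os
import tempfile

# Hypothetical stand-in for response.content; any bytes object behaves the same.
content = b"example bytes standing in for response.content"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "Book2.xlsx")
    # open(..., 'wb') returns the io.BufferedWriter seen in the question.
    # It is a handle for writing to disk, not the downloaded data.
    with open(path, "wb") as file_handle:
        file_handle.write(content)
        handle_is_writer = isinstance(file_handle, io.BufferedWriter)
        handle_readable = file_handle.readable()

# The data lives in the bytes object; BytesIO makes it a readable,
# file-like object without touching the disk.
buffer = io.BytesIO(content)
buffer_readable = buffer.readable()
round_trip = buffer.read()

print("handle is BufferedWriter:", handle_is_writer)
print("handle readable:", handle_readable)
print("buffer readable:", buffer_readable)
print("buffer returns the original bytes:", round_trip == content)
```

A buffer like this is the kind of file-like object that `pd.read_excel` accepts, so for a OneDrive to S3 sync the file never has to be written locally.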

Roy2012