
I need to get multiple .csv files from SharePoint.

If I make this request via Postman

https://mycompany.sharepoint.com/teams/a/g/_api/web/GetFolderByServerRelativeUrl('Data%20Sources\')/Files('sharepoint_test.csv')/$value

With headers

Authorization: Bearer eyJ...
Accept: application/json;odata=verbose

I get the contents of "sharepoint_test.csv":

column a,column b,column c
32,523,88
46,34,659
25,767,78

I need to download multiple files at once, and SharePoint doesn't seem to provide an endpoint for that. Using Python and grequests, I get a response, but I can't get at the binary data:

>>> base_url = "https://mycompany.sharepoint.com/teams/a/g/_api/web/GetFolderByServerRelativeUrl('Data%20Sources\')/"
>>> url_1 = "Files('sharepoint_test.csv')/$value"
>>> url_2 = "Files('sharepoint_test_2.csv')/$value"
>>> allurls = [base_url + url_1, base_url + url_2]
>>> headers = {"Authorization": authtoken, "Content-Type": "application/json;odata=verbose", "Accept": "application/json;odata=verbose"}
>>> rs = (grequests.get(u, headers=headers, stream=True) for u in allurls)
>>> s = grequests.map(rs)
>>> s

[<Response [200]>, <Response [200]>]

>>> data = open(s[0], "rb").read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: expected str, bytes or os.PathLike object, not Response

How can I actually get the binary data via grequests?

user3871
  • First, did you notice that the first thing the documentation for `grequests` says is "Note: You should probably use requests-threads or requests-futures instead"? – abarnert Apr 06 '18 at 21:16

1 Answer


grequests.get, like requests.get, returns a Response object.

The very first example in the Requests documentation shows how to use this object:

>>> r.status_code
200
>>> r.headers['content-type']
'application/json; charset=utf8'
>>> r.encoding
'utf-8'
>>> r.text
u'{"type":"User"...'
>>> r.json()
{u'private_gists': 419, u'total_private_repos': 77, ...}

The Binary Response Content section says:

You can also access the response body as bytes, for non-text requests:

>>> r.content
b'[{"repository":{"open_issues":0,"url":"https://github.com/...

So, what you're looking for is:

>>> data = open(s[0].content, "rb").read()

Although I'm not sure what good you expect this to do (is the HTTP response content really going to be a path to a file on your local filesystem, encoded in your default filesystem encoding?), it is what you asked for. If what you actually want is the file's bytes, s[0].content already is that data, and there is nothing left to open.
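
Assuming the goal is the downloaded bytes themselves rather than a local file path, a minimal sketch of writing each response body to disk might look like the following. `save_responses` and its parameters are hypothetical names for illustration, not part of grequests; the only things it relies on are the `status_code` and `content` attributes of a Response.

```python
from pathlib import Path


def save_responses(responses, filenames, out_dir="."):
    """Write each response body to out_dir/filename and return the paths written.

    Hypothetical helper: `responses` is a list like the one grequests.map()
    returns, and `filenames` gives the local name to use for each entry.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for resp, name in zip(responses, filenames):
        # grequests.map() puts None in the list for requests that failed
        # outright, so skip those along with non-200 responses.
        if resp is None or resp.status_code != 200:
            continue
        path = out / name
        path.write_bytes(resp.content)  # .content is the raw body as bytes
        written.append(path)
    return written
```

With the list `s` from the question, this could be called as `save_responses(s, ["sharepoint_test.csv", "sharepoint_test_2.csv"])`.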

Also, it's worth noting that the first thing the documentation for GRequests that you linked to says is:

Note: You should probably use requests-threads or requests-futures instead.

GRequests is barely maintained nowadays, and will probably break with Requests 3.0, while the newer alternatives are among the main drivers behind 3.0's redesign.
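
If you'd rather not depend on GRequests at all, one option is a plain thread pool from the standard library. Note that this is a swapped-in technique, not the requests-threads or requests-futures API; `fetch_bytes` and `fetch_all` are hypothetical names, and the sketch assumes the `requests` package is installed when `fetch_bytes` is actually called.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_bytes(url, headers=None):
    """Download one URL and return its body as bytes."""
    import requests  # imported lazily so fetch_all can be used without it

    resp = requests.get(url, headers=headers, timeout=30)
    resp.raise_for_status()  # turn 4xx/5xx responses into exceptions
    return resp.content


def fetch_all(urls, fetch=fetch_bytes, max_workers=4):
    """Call fetch(url) for every URL concurrently.

    Results come back in the same order as `urls`. If any call raises,
    the exception propagates when its result is collected.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))
```

To send the same headers with every request, you could bind them first, for example `fetch_all(allurls, fetch=functools.partial(fetch_bytes, headers=headers))` after `import functools`.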

abarnert
  • I ended up using Pandas and io.BytesIO to read the stream and convert to dataframe: `pd.read_csv(io.BytesIO(res.content), encoding='utf8', sep=",")` – user3871 Apr 06 '18 at 21:51
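
The same wrap-the-bytes-in-a-file-object idea also works without pandas. Here is a rough sketch using only the standard library's `csv` and `io` modules; `parse_csv_bytes` is a hypothetical helper name, and the sample bytes simply mirror the file contents shown in the question.

```python
import csv
import io


def parse_csv_bytes(body, encoding="utf-8"):
    """Decode CSV bytes (e.g. a response's .content) into a list of row dicts."""
    # BytesIO makes the bytes look like a binary file; TextIOWrapper decodes
    # it to text, which is what the csv module expects.
    text = io.TextIOWrapper(io.BytesIO(body), encoding=encoding, newline="")
    return list(csv.DictReader(text))


# Stand-in for res.content, using the data from the question.
sample = b"column a,column b,column c\n32,523,88\n46,34,659\n25,767,78\n"
rows = parse_csv_bytes(sample)
print(rows[0]["column b"])  # prints 523
```

Values come back as strings here, whereas `pd.read_csv` would infer numeric types, so the pandas route is probably more convenient if you need a dataframe anyway.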