
I am trying to upload data to a REST API, namely the Azure Data Lake Storage Gen2 Path Update operation: https://learn.microsoft.com/en-us/rest/api/storageservices/datalakestoragegen2/path/update

Since PATCH is the only method available for this, what are my options for an optimized data load? I have been able to upload files by passing the result of read() to the data parameter, but I don't think that is optimal because, as far as I can tell, the entire file is read into memory. I have also tried the files parameter (multipart form encoding) and looked at the requests-toolbelt package, but neither seems to work with PATCH.

This is the sample code that works but is not optimal:

import os
import requests

# auth_t (the bearer token) and proxies are defined elsewhere
files={'file':('Sample',open('D:/FilePath/Demo.txt','rb'))}
length=os.stat('D:/FilePath/Demo.txt')
filesize=str(length.st_size)

with open('D:/FilePath/Demo.txt','rb') as f:
    file_data = f.read()

leng=len(file_data)

header = {
'Authorization': "Bearer " + auth_t
}

header_append = {
'Content-Length': filesize,
'Authorization': "Bearer " + auth_t
#'If-None-Match': "*" #Conditional HTTP Header
}

header_flush = {
'Content-Length': '0',
'Authorization': "Bearer " + auth_t
}


header_read = {
'Authorization': "Bearer " + auth_t
}

try:
    init_put=requests.put('https://adlstorageacc.dfs.core.windows.net/adobe/2019/02/DemoStreamFile4.txt?resource=file&recursive=True', headers=header_flush, proxies=proxies,verify=False)
    init_write=requests.patch('https://adlstorageacc.dfs.core.windows.net/adobe/2019/02/DemoStreamFile4.txt?action=append&position=0', headers=header_append, proxies=proxies,verify=False,data=file_data)

    flush_url='https://adlstorageacc.dfs.core.windows.net/adobe/2019/02/DemoStreamFile4.txt?action=flush&position=' + str(leng)
    init_flush=requests.patch(flush_url, headers=header_flush, proxies=proxies,verify=False)
except Exception as e:
    print("In Error")
    print(e)

The problem is this line:

init_write=requests.patch('https://adlstorageacc.dfs.core.windows.net/adobe/2019/02/DemoStreamFile4.txt?action=append&position=0', headers=header_append, proxies=proxies,verify=False,data=file_data)

It only seems to accept the data parameter. If I change it to

init_write=requests.patch('https://adlstorageacc.dfs.core.windows.net/adobe/2019/02/DemoStreamFile4.txt?action=append&position=0', headers=header_append, proxies=proxies,verify=False,file=files)

I get an empty file.

The same happens when I use the requests-toolbelt package.

Does patch not recognize the files parameter? Nothing in the requests documentation says so.

Also, if the data parameter is the only way, what is the best way to load a file without calling f.read() on the whole file or repeatedly reading a fixed number of characters with f.read(n)? Isn't there a better way?

Saugat Mukherjee

1 Answer


After also testing the calls in Postman, I was able to find the problem; the working code is further below. The issue was the with open statement and, in particular, the position parameter of the flush call: the Content-Length header was overridden automatically when the files parameter was used, so I had to take the actual Content-Length from the request attached to the response (init_write.request.headers) and use that as the flush position.
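Before the full listing, the Content-Length effect can be checked offline. The snippet below is only a sketch: the URL and the payload are placeholders, nothing is sent over the network, and it relies on requests.Request(...).prepare() to build the request the same way requests.patch would.

```python
import io

import requests

payload = b"hello, data lake\n"  # placeholder standing in for the file contents
url = "https://example.invalid/fs/demo.txt?action=append&position=0"  # placeholder URL

# Raw body via data=: the Content-Length header equals the payload size.
raw = requests.Request("PATCH", url, data=payload).prepare()

# Multipart body via files=: requests wraps the payload in a boundary and
# part headers, and computes Content-Length for the whole envelope.
multipart = requests.Request(
    "PATCH", url, files={"file": ("Sample", io.BytesIO(payload))}
).prepare()

raw_length = int(raw.headers["Content-Length"])
multipart_length = int(multipart.headers["Content-Length"])

print("data=  Content-Length:", raw_length)
print("files= Content-Length:", multipart_length)
```

The second number is larger than the file size, which is why a flush position based on os.stat does not match what was appended. Since the multipart envelope is part of the request body, the appended bytes probably include the boundary lines as well, so it is worth downloading the file afterwards and comparing it with the original.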

import os
import requests

# auth_t (the bearer token) and proxies are defined elsewhere
files={'file':('Sample',open('D:/FilePath/Demo.txt','rb'))}
length=os.stat('D:/FilePath/Demo.txt')
filesize=str(length.st_size)
header = {
# 'Content-Type': 'text/plain',
'Authorization': "Bearer " + auth_t
#'If-None-Match': "*" #Conditional HTTP Header
}

header_append = {
'Content-Length': filesize,
'Authorization': "Bearer " + auth_t
#'If-None-Match': "*" #Conditional HTTP Header
}

header_flush = {
'Content-Type': "application/x-www-form-urlencoded",
'Content-Length': '0',
'Authorization': "Bearer " + auth_t,
#'If-None-Match': "*" #Conditional HTTP Header
}


header_read = {
# 'Content-Type': 'text/plain',
'Authorization': "Bearer " + auth_t,
#'Range': 'bytes=300000-302591'
#'If-None-Match': "*" #Conditional HTTP Header
}

try:
    init_put=requests.put('https://adlstorageacc.dfs.core.windows.net/adobe/2019/02/DemoStreamFile4.txt?resource=file&recursive=True', headers=header_flush, proxies=proxies,verify=False)
    init_write=requests.patch('https://adlstorageacc.dfs.core.windows.net/adobe/2019/02/DemoStreamFile4.txt?action=append&position=0', headers=header_append, proxies=proxies,verify=False,files=files)
    flush_length=init_write.request.headers['Content-Length']
    flush_url='https://adlstorageacc.dfs.core.windows.net/adobe/2019/02/DemoStreamFile4.txt?action=flush&position=' + str(flush_length)
    init_flush=requests.patch(flush_url, headers=header_flush, proxies=proxies,verify=False)
except Exception as e:
    print("In Error")
    print(e)
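On the original question of avoiding a single f.read(): one possible alternative, sketched here under assumptions rather than tested against the service, is to append the file in fixed-size chunks through the data parameter and flush once at the total length. The helper names iter_chunks and append_and_flush, the 4 MiB default chunk size, and the session argument (anything with a requests.Session-like patch method) are all hypothetical choices for illustration.

```python
def iter_chunks(fileobj, chunk_size=4 * 1024 * 1024):
    """Yield (position, chunk) pairs, where position is the chunk's byte offset."""
    position = 0
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield position, chunk
        position += len(chunk)


def append_and_flush(session, file_url, fileobj, headers, chunk_size=4 * 1024 * 1024):
    """Append fileobj to file_url chunk by chunk, then flush at the total length.

    `session` is assumed to expose patch() like requests.Session does.
    Only one chunk is held in memory at a time.
    """
    total = 0
    for position, chunk in iter_chunks(fileobj, chunk_size):
        # Each append starts at the number of bytes already appended.
        resp = session.patch(
            file_url,
            params={"action": "append", "position": str(position)},
            headers=headers,
            data=chunk,
        )
        resp.raise_for_status()
        total = position + len(chunk)
    # The flush position is the file length after all appends.
    resp = session.patch(
        file_url,
        params={"action": "flush", "position": str(total)},
        headers=headers,
    )
    resp.raise_for_status()
    return total
```

With a real requests.Session() and a file opened in 'rb' mode this would be called as append_and_flush(session, url, f, {'Authorization': "Bearer " + auth_t}). requests also accepts an open file object directly as data= and streams it, which may be enough when the whole file fits in a single append call.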
Saugat Mukherjee