I have the file below, saved from an HTTP POST, that I want to parse in Python with werkzeug's parse_form_data(). Note that I am reading it from a file, not from a request; for other reasons I cannot get it directly from a Flask request. Since I have been using Flask, I tried werkzeug for this. I first suspected a problem with the boundary and its extra leading hyphens ('--'), so I trimmed everything down to the very simple format in the following test file:
Here's the file on the file system (myinputfile):
--806243354728155036129379
Content-Disposition: form-data; name="myfile"; filename="text.py"
Content-Type: application/octet-stream
some text in a file
--806243354728155036129379
Content-Disposition: form-data; name="field1"
abcde
--806243354728155036129379
Content-Disposition: form-data; name="field2"
123456678
--806243354728155036129379--
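For reference, the multipart format expects CRLF line endings, an empty line between each part's headers and its content, and a closing delimiter that ends in '--'. Below is a minimal sketch of a layout check for a saved body. The helper name check_multipart_layout is made up for illustration, it is written for Python 3, and it assumes the file content itself holds no bare LF bytes (a binary upload could, which would make the CRLF check report False).

```python
def check_multipart_layout(data, boundary):
    """Report basic layout properties of a saved multipart/form-data body.

    Hypothetical helper for illustration; `data` is the raw bytes of the file.
    """
    delimiter = b'--' + boundary.encode('ascii')
    return {
        # The body must open with "--" followed by the boundary.
        'starts_with_delimiter': data.startswith(delimiter),
        # True only if every LF in the data is part of a CRLF pair.
        'all_newlines_are_crlf': (
            b'\n' in data and data.count(b'\n') == data.count(b'\r\n')
        ),
        # An empty line separates a part's headers from its content.
        'has_blank_line_after_headers': b'\r\n\r\n' in data,
        # The final delimiter carries a trailing "--".
        'ends_with_closing_delimiter': (
            data.rstrip(b'\r\n').endswith(delimiter + b'--')
        ),
    }
```

Reading myinputfile in 'rb' mode and passing its bytes together with the boundary string gives a dict of booleans; any False entry points at a part of the layout worth inspecting.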
Here's the code I use:
from werkzeug import parse_form_data
import io
inputfile = 'myinputfile'
content_type = 'Content-Type: multipart/form-data; boundary=806243354728155036129379'
environ = {
    'wsgi.input': io.open(inputfile, 'rb'),
    'CONTENT_LENGTH': '',
    'CONTENT_TYPE': content_type,
    'REQUEST_METHOD': 'POST'}
stream, form, files = parse_form_data(environ, silent=False)
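One variation of the environ that may be worth trying is sketched below. It relies on two WSGI conventions: CONTENT_TYPE holds only the header value (without the 'Content-Type: ' field name), and CONTENT_LENGTH holds the byte length of the body as a string. The function name build_environ is hypothetical, and I have not confirmed that this change alone makes the error go away.

```python
import io
import os


def build_environ(path, boundary):
    """Build a minimal WSGI-style environ for a multipart body saved on disk.

    Hypothetical helper for illustration; the caller should close
    environ['wsgi.input'] when done.
    """
    return {
        'wsgi.input': io.open(path, 'rb'),
        # Byte length of the saved body, as a decimal string.
        'CONTENT_LENGTH': str(os.path.getsize(path)),
        # Header value only, without the "Content-Type: " field name.
        'CONTENT_TYPE': 'multipart/form-data; boundary=' + boundary,
        'REQUEST_METHOD': 'POST',
    }
```

The returned dict would be passed to parse_form_data(environ, silent=False) in the same way as the environ above.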
I keep getting this error:
  File "/usr/local/lib/python2.7/dist-packages/werkzeug/formparser.py", line 92, in parse_form_data
    cls, silent).parse_from_environ(environ)
  File "/usr/local/lib/python2.7/dist-packages/werkzeug/formparser.py", line 171, in parse_from_environ
    content_length, options)
  File "/usr/local/lib/python2.7/dist-packages/werkzeug/formparser.py", line 195, in parse
    content_length, options)
  File "/usr/local/lib/python2.7/dist-packages/werkzeug/formparser.py", line 100, in wrapper
    return f(self, stream, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/werkzeug/formparser.py", line 212, in _parse_multipart
    form, files = parser.parse(stream, boundary, content_length)
  File "/usr/local/lib/python2.7/dist-packages/werkzeug/formparser.py", line 519, in parse
    return self.cls(form), self.cls(files)
  File "/usr/local/lib/python2.7/dist-packages/werkzeug/datastructures.py", line 406, in __init__
    for key, value in mapping or ():
  File "/usr/local/lib/python2.7/dist-packages/werkzeug/formparser.py", line 517, in <genexpr>
    form = (p[1] for p in formstream if p[0] == 'form')
  File "/usr/local/lib/python2.7/dist-packages/werkzeug/formparser.py", line 476, in parse_parts
    for ellt, ell in self.parse_lines(file, boundary, content_length):
  File "/usr/local/lib/python2.7/dist-packages/werkzeug/formparser.py", line 395, in parse_lines
    self.fail('Expected boundary at start of multipart data')
  File "/usr/local/lib/python2.7/dist-packages/werkzeug/formparser.py", line 327, in fail
    raise ValueError(message)
ValueError: Expected boundary at start of multipart data
Ultimately I want to be able to save the uploaded file (which can be binary) and get the form fields 'field1' and 'field2' from the dict. Any ideas? I am open to using other methods too.
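As one possible "other method", here is a minimal sketch that uses only the standard library's email package instead of werkzeug. It assumes Python 3 (not the Python 2.7 shown in the traceback), a body with CRLF line endings and a blank line after each part's headers, and UTF-8 text in the plain form fields. The function name parse_multipart_bytes is made up for illustration, and binary uploads would deserve a byte-for-byte comparison before relying on it.

```python
from email.parser import BytesParser
from email.policy import default


def parse_multipart_bytes(body, boundary):
    """Split a saved multipart/form-data body into form fields and files.

    Hypothetical helper for illustration. Returns (form, files), where form
    maps field name -> text and files maps field name -> (filename, bytes).
    """
    # The saved body has no top-level headers, so prepend the Content-Type
    # header that the email parser needs in order to find the boundary.
    header = (
        'Content-Type: multipart/form-data; boundary="%s"\r\n\r\n' % boundary
    ).encode('ascii')
    message = BytesParser(policy=default).parsebytes(header + body)

    form, files = {}, {}
    for part in message.iter_parts():
        name = part.get_param('name', header='content-disposition')
        filename = part.get_filename()
        payload = part.get_payload(decode=True)  # raw bytes of the part
        if filename is not None:
            files[name] = (filename, payload)
        else:
            # Assumption: plain form fields are UTF-8 text.
            form[name] = payload.decode('utf-8')
    return form, files
```

With the bytes of myinputfile and the boundary string as arguments, form would hold the field1 and field2 values, and the bytes in files['myfile'] could be written to disk in 'wb' mode.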