What is the standard practice for handling data in text fields and file upload fields?

The question is similar to one I asked previously, but this one is slightly more general.

If we borrow the example of a user registering an account, which includes a name, email, and several file upload fields, the actions taken after form submission amount to:

(1) Validate all text fields (name, email).

(2) If validation succeeds, create and save a User instance to the DB.

(3) Save images to disk

(4) Update User instance to include filepaths of saved images.

The uploaded files aren't very big, roughly 5 MB or less, so the problems associated with uploading large 1 GB+ files aren't really an issue for this question.

From what I've read, there are two ways of handling this.

  • Submit everything all together.

    There are several unanswered threads about this: https://softwareengineering.stackexchange.com/questions/239170/how-to-parse-multipart-field-file-data-separately

    Node.js Busboy parse fields and files seperatly

    I know that the text fields should come before the file fields when submitting the form thanks to mscdex's comment in my other question.

    But there are other problems I can see:

    (a) If validation fails for the text fields, everything has to be resent in another form submission. A malicious user could exploit this for a DoS attack, or simply burn bandwidth, by repeatedly submitting a form with bad text fields and lots of files.

  • Submit files when first selected, then when form submits, upload only file hash.

    (a) A DoS attack is possible if a malicious user uploads a ton of images that just sit on the server. Even with an independent bash script that cleans up the /tmp folder after X minutes, a user could still fill the disk in the X minutes before cleanup by continually sending files.

    (b) Having an independent cleanup script creates timing issues. What if a legitimate user keeps submitting a form that fails validation and only gets it right after X minutes? By then the images would already have been wiped, even though the final submission passed validation.

  • Some other way that I don't know
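One possible way around the timing problem in (b) is to track each pending upload's expiry in the application and push it forward whenever a form submission references that upload, instead of sweeping by file age alone. A minimal in-memory sketch follows; `PendingUploads`, `touch`, and `sweep` are made-up names, and a real version would also delete the files on disk and cap how much each client may store:

```javascript
// Hypothetical in-memory registry of uploaded-but-unclaimed files.
class PendingUploads {
  constructor(ttlMs) {
    this.ttlMs = ttlMs;
    this.expiry = new Map(); // file hash -> time (ms) after which it may be deleted
  }

  // Record a new upload, or extend the life of an existing one.
  touch(hash, now = Date.now()) {
    this.expiry.set(hash, now + this.ttlMs);
  }

  has(hash) {
    return this.expiry.has(hash);
  }

  // Drop expired entries and return their hashes so the caller can delete the files.
  sweep(now = Date.now()) {
    const removed = [];
    for (const [hash, deadline] of this.expiry) {
      if (deadline <= now) {
        this.expiry.delete(hash);
        removed.push(hash);
      }
    }
    return removed;
  }
}
```

Calling `touch(hash)` on every submission, including ones that fail validation, keeps the files of a user who is still working on the form alive, while abandoned uploads still expire.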

I feel the first way may be easier since I could potentially rate-limit the connections using nginx. Since the files are never hitting disk until validations are complete, I won't have any cleanup issues with files in /tmp. But I've searched the net and can't find anybody really doing this, which leads me to believe that file uploading is not really done this way.

What's the best way to handle file uploads with form data?

rublex
  • send the data first and if it's ok, then upload the files. only do the files if you need to, anything else is wasteful. – dandavis Apr 14 '15 at 18:01
  • Check out [formidable](https://github.com/felixge/node-formidable). It's magic. – Nocturno Apr 15 '15 at 05:06
  • You haven't mentioned if you're using [sails.js](http://sailsjs.org) or just raw express, but [**skipper**](https://github.com/balderdashy/skipper) is the best way to do this. – Travis Webb Apr 17 '15 at 05:26

2 Answers

Submitting everything together is easiest. If the validation fails, just abort the connection and/or immediately send back a response of some kind. This will prevent the rest of the form from being processed.
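A rough sketch of that idea, assuming the text parts arrive before the file parts and that a streaming parser hands over one part at a time. `processParts` and the validation rules are illustrative only; they are not the API of Busboy or any other library:

```javascript
// Illustrative validators; the real rules depend on the application.
const validators = {
  name: (v) => v.trim().length > 0,
  email: (v) => /^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(v),
};

// Walk the parts in arrival order and stop at the first invalid text field.
// Each part is { name, value } for text or { name, filename } for a file.
function processParts(parts) {
  const accepted = [];
  for (const part of parts) {
    if (part.filename === undefined) {
      const check = validators[part.name];
      if (check && !check(part.value)) {
        // In a real server this is where the error response is sent and the
        // connection closed, so the remaining parts are never read.
        return { ok: false, failedField: part.name, accepted };
      }
    }
    accepted.push(part.name);
  }
  return { ok: true, accepted };
}

const result = processParts([
  { name: 'name', value: 'DanielSunami' },
  { name: 'email', value: 'not-an-email' },
  { name: 'img', filename: 'myImg.png' },
]);
console.log(result.ok, result.failedField);
```

Because the loop returns before reaching the `img` part, the file is never processed when a text field that precedes it is invalid.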

mscdex

Submit everything together, but I wouldn't use a multipart form to send ordinary form data.

I would submit two forms: one for all the text fields and another for the file. Use AJAX to submit the first form, limit how many attempts the user can make, and upload the file only once the user passes validation.
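Limiting the number of attempts could look something like the sketch below. It is a simple fixed-window counter kept in memory; `AttemptLimiter` and the choice of key (for example, the client's IP address) are assumptions for illustration:

```javascript
// Hypothetical fixed-window attempt counter, keyed by client.
class AttemptLimiter {
  constructor(maxAttempts, windowMs) {
    this.maxAttempts = maxAttempts;
    this.windowMs = windowMs;
    this.windows = new Map(); // key -> { start, count }
  }

  // Returns true if this attempt is allowed, false if the client is over the limit.
  allow(key, now = Date.now()) {
    const entry = this.windows.get(key);
    if (!entry || now - entry.start >= this.windowMs) {
      // First attempt, or the previous window has ended: start a new window.
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= this.maxAttempts;
  }
}

const limiter = new AttemptLimiter(5, 60 * 1000); // 5 attempts per minute
console.log(limiter.allow('203.0.113.7'));
```

The text-field endpoint would call `allow` before validating and reject the request outright when it returns false.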

An ordinary (URL-encoded) form is sent to the server like this:
name=DanielSunami&email=daniel@gmail.com
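For comparison, a body in this format can be parsed in Node.js with the built-in `URLSearchParams`; `parseUrlEncoded` is just a helper name chosen for this sketch:

```javascript
// Parse an application/x-www-form-urlencoded body with the built-in URLSearchParams.
// If a key is repeated, Object.fromEntries keeps only the last value.
function parseUrlEncoded(body) {
  return Object.fromEntries(new URLSearchParams(body));
}

const fields = parseUrlEncoded('name=DanielSunami&email=daniel@gmail.com');
console.log(fields.name, fields.email);
```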

Multipart data is quite different:

-----------boundary0
content-disposition: form-data; name="name"

DanielSunami
-----------boundary0
content-disposition: form-data; name="email"

daniel@gmail.com
-----------boundary0--

Just parsing it is already more complex and uses more time and memory. The parts also arrive in whatever order the client sends them, which normally follows the order of the fields in the form, so a file can come before a text field. Imagine you got this:

-----------boundary0
content-disposition: form-data; name="name"

DanielSunami
-----------boundary0
Content-Disposition: form-data; name="img"; filename="myImg.png"

(binary PNG data) [...]
-----------boundary0
content-disposition: form-data; name="email"

daniel@gmail.com
-----------boundary0--
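To make the ordering problem concrete, here is a deliberately simplified parser. It is only a sketch: it works on a string, assumes CRLF line endings, and ignores the edge cases a real parser has to handle. `parseMultipart` and `sampleBody` are names invented for the example:

```javascript
// Simplified multipart/form-data parser, for illustration only.
function parseMultipart(body, boundary) {
  const delimiter = '--' + boundary;
  const parts = [];
  for (const chunk of body.split(delimiter)) {
    // Skip the empty preamble and the closing "--" marker.
    if (chunk === '' || chunk.startsWith('--')) continue;
    const [rawHeaders, ...rest] = chunk.replace(/^\r\n/, '').split('\r\n\r\n');
    const content = rest.join('\r\n\r\n').replace(/\r\n$/, '');
    const name = /\bname="([^"]*)"/i.exec(rawHeaders);
    const filename = /\bfilename="([^"]*)"/i.exec(rawHeaders);
    parts.push({
      name: name ? name[1] : null,
      filename: filename ? filename[1] : null,
      content,
    });
  }
  return parts;
}

const sampleBody = [
  '--boundary0',
  'Content-Disposition: form-data; name="name"',
  '',
  'DanielSunami',
  '--boundary0',
  'Content-Disposition: form-data; name="img"; filename="myImg.png"',
  '',
  '<binary data>',
  '--boundary0',
  'Content-Disposition: form-data; name="email"',
  '',
  'daniel@gmail.com',
  '--boundary0--',
  '',
].join('\r\n');

// The parts come back in arrival order: name, img, email.
console.log(parseMultipart(sampleBody, 'boundary0').map((p) => p.name));
```

With this ordering the server has already received the whole file by the time it sees `email`, so it cannot reject a bad email before the upload.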

To avoid this, we must separate the file fields from the text fields:

(1) Submit the text fields
name=DanielSunami&email=daniel@gmail.com

(2) Validate all text fields (name, email).

(3) If validation succeeds, create and save a User instance to the DB and return some kind of 'OK+USERID' answer.

(4) Submit file

-----------boundary0
Content-Disposition: form-data; name="imgUSERID"; filename="myImg.png"

(binary PNG data) [...]
-----------boundary0--

(5) Save images to disk

(6) Update User instance to include filepaths of saved images.
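On the server, the two requests could map onto two small functions along these lines. This is only a sketch under assumptions: the `users` map stands in for the database, and `registerUser`, `attachImage`, and the upload directory are invented for the example:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// In-memory stand-in for the database; a real app would use its own models.
const users = new Map();
let nextId = 1;

// Steps (2)-(3): validate the text fields and create the user.
function registerUser({ name, email }) {
  const errors = [];
  if (!name || !name.trim()) errors.push('name');
  if (!/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(email || '')) errors.push('email');
  if (errors.length > 0) return { ok: false, errors };
  const userId = nextId++;
  users.set(userId, { name, email, images: [] });
  return { ok: true, userId };
}

// Steps (5)-(6): save the image for an existing user and record its path.
function attachImage(userId, filename, bytes, uploadDir) {
  const user = users.get(userId);
  if (!user) throw new Error('unknown user ' + userId);
  // path.basename drops directory components from the client-supplied name.
  const target = path.join(uploadDir, userId + '-' + path.basename(filename));
  fs.writeFileSync(target, bytes);
  user.images.push(target);
  return target;
}

// Example run against a throwaway directory.
const uploadDir = fs.mkdtempSync(path.join(os.tmpdir(), 'uploads-'));
const reply = registerUser({ name: 'DanielSunami', email: 'daniel@gmail.com' });
if (reply.ok) {
  attachImage(reply.userId, 'myImg.png', Buffer.from([0x89, 0x50, 0x4e, 0x47]), uploadDir);
}
console.log(users.get(reply.userId));
```

Since `attachImage` refuses unknown user IDs, a file is only ever written after the text fields have passed validation. A real endpoint would also need to check that the uploader is the user who owns that ID.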

More about multipart forms: https://www.ietf.org/rfc/rfc1867.txt (the multipart/form-data format itself is now specified in RFC 7578).