Do I need to filter the $_FILES['file'] before dealing with it?
Short answer: NO. It's a bunch of string values, nothing more.
Long Answer:
I used to protect web-apps by checking the file's type and its extension to know if those were in the whitelist or not.
This is a good approach, if applied and carried out correctly.
The $_FILES array is simply a carrier. It can't itself be abused, but you do have to decide whether to trust what it carries, i.e. the file that is being passed to the server.
As I write the answer below, it seems the OP is confused about what they're actually protecting against and why:
The OP states as "best practise" (which it is absolutely not):
If you want to use $_FILES['file']['tmp_name'] to be stored into your database or to display in your UI, you should use addslashes or PDO prepare statement to be protected against SQL-Injection attacks.
This is a misunderstanding of how the $_FILES array is populated. The $_FILES['file']['tmp_name'] value is set by the server, not by the user or the client.
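The user-influenced values are:
$_FILES['file']['name']
$_FILES['file']['type']
$_FILES['file']['size']
These are the values that need to be vetted. The name and type are sent verbatim by the client; the size is measured by PHP, but it still only describes whatever the client chose to upload. As long as you do not trust these values, you have nothing to worry about.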
Storing files inside your database is not usually a good idea and has its own pitfalls; dhnwebpro has their own answer on this question regarding database safety.
$_FILES['file']['tmp_name'] is the server location of the file in temporary storage space.
The PHP Manual clearly states:
Files will, by default be stored in the server's default temporary directory, unless another location has been given with the upload_tmp_dir directive in php.ini. The server's default directory can be changed by setting the environment variable TMPDIR in the environment in which PHP runs.
The file will be deleted from the temporary directory at the end of the request if it has not been moved away or renamed.
If you think that your $_FILES['file']['tmp_name'] value is being abused, then this is a sign of server compromise and you've got a whole lorry load of trouble on your plate, well beyond a nefarious file upload.
So, how to vet the file that is being carried?
There are numerous types of file attack, and the topic goes far beyond the scope of what you are asking. For example, a genuine JPEG image can carry XSS script in its metadata, and that script is only triggered when the JPEG is loaded and viewed. To an outside observer who doesn't specifically check for this vulnerability, the JPEG is to all intents and purposes not a "bad file" or an XSS file.
So, do you block this file.jpg, or do you block all JPEG files? It's a tough call to make, but in PHP there are some very good workarounds (which I believe are also out of scope for this question). In short, your question could do with some editing and clarity as to what exactly you're trying to protect against and how far you're willing to go to reach that level of protection.
I can give you a rough catch-all guide to preventing certain MIME file-types from being accepted onto your server. This looks and feels like what you want, something to stop sneaky MP4 videos being uploaded as document files (or vice versa).
1:
Ignore the filename ($_FILES['file']['name']). NEVER trust user data.
Edit: As pointed out by meagar, you may need to retain the original filename, in which case you should check it with a REGEX or similar to remove unwanted characters...
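If the original name does have to be kept, the clean-up mentioned in the edit above might look something like the following minimal sketch. The function name `sanitise_filename()`, the allowed character set and the 100-character cap are all assumptions made for illustration, not something the answer prescribes; tighten them to suit your application.

```php
<?php
// Hypothetical helper: reduce a client-supplied filename to a conservative character set.
function sanitise_filename(string $name): string
{
    // Strip any directory components, normalising Windows separators first.
    $name = basename(str_replace('\\', '/', $name));
    // Replace every run of characters outside the assumed whitelist with one underscore.
    $name = preg_replace('/[^A-Za-z0-9._-]+/', '_', $name) ?? '';
    // Drop leading dots so the result can never be a dotfile or "..".
    $name = ltrim($name, '.');
    // Cap the length (assumed limit) and fall back to a fixed name if nothing is left.
    $name = substr($name, 0, 100);
    return $name === '' ? 'upload' : $name;
}
```

Even a cleaned name like this is best kept only as a display label (for example in a database column), not used as the path the file is stored under.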
2:
Ignore the declared filetype ($_FILES['file']['type']). Any MIME type implied by the filename (such as .pdf) should simply be ignored too. NEVER trust user data.
3:
Use the PHP finfo function set (the Fileinfo extension) as a preliminary indicator. It is not perfect but will catch most things.
$finfo = finfo_open(FILEINFO_MIME_TYPE); // report the MIME type detected from the file's contents
$mimeType = finfo_file($finfo, $_FILES['file']['tmp_name']); // false if detection fails
finfo_close($finfo);
$whitelist = ['text/html', 'image/gif', 'application/vnd.ms-excel'];
if (in_array($mimeType, $whitelist, true)) { // strict comparison, so false never matches
    // File type is acceptable.
}
4: Images:
If you are checking an uploaded image, the best approach is to check the finfo filetype as per step 3 and then have PHP load the image into a blank canvas and resave it, thereby stripping out all the excess metadata and other potentially undesirable data that is not image data.
Like this method: Remove exif data from jpg using php.
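A minimal sketch of that load-and-resave step, assuming the GD extension is installed. The helper name `reencode_as_jpeg()` and the default quality of 85 are assumptions for illustration, and this is not necessarily the approach taken in the linked answer.

```php
<?php
// Hypothetical helper: re-encode an image with GD so that only pixel data survives.
// Assumes the GD extension is installed; returns false if the input cannot be decoded.
function reencode_as_jpeg(string $sourcePath, string $destPath, int $quality = 85): bool
{
    $data = file_get_contents($sourcePath);
    if ($data === false) {
        return false;
    }
    // imagecreatefromstring() detects the format itself; @ hides the warning for bad input.
    $image = @imagecreatefromstring($data);
    if ($image === false) {
        return false;
    }
    // Writing a fresh JPEG leaves the original file's EXIF and comment segments behind.
    return imagejpeg($image, $destPath, $quality);
}
```

Calling it with $_FILES['file']['tmp_name'] as the source and a path of your own choosing as the destination gives you a file built only from decoded pixels. Note that transparency is lost when a PNG or GIF is saved as JPEG, and that decoding very large images can use a lot of memory.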
5:
It is also advisable always to give your uploaded files randomised names, never using the $_FILES['file']['name'] value.
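One way to generate such a name is sketched below. The helper name `random_upload_name()` and the MIME-to-extension map are assumptions for illustration; the map should mirror whatever whitelist you actually use, and the MIME type passed in is meant to be the one detected by finfo in step 3, not the client-declared one.

```php
<?php
// Hypothetical helper: build a storage name that contains nothing supplied by the client.
// The extension map is an assumption; it should mirror your own whitelist.
function random_upload_name(string $detectedMime): ?string
{
    $extensions = [
        'image/gif'  => 'gif',
        'image/jpeg' => 'jpg',
        'image/png'  => 'png',
    ];
    if (!isset($extensions[$detectedMime])) {
        return null; // not a type we accept
    }
    // 16 bytes from the CSPRNG gives 32 hex characters.
    return bin2hex(random_bytes(16)) . '.' . $extensions[$detectedMime];
}
```

The returned name can then be appended to your upload directory and used as the destination for move_uploaded_file(), with $_FILES['file']['tmp_name'] as the source.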
6:
Depending on the sort of threat you're trying to avoid and/or neutralise, you can open the uploaded file, read its first few bytes and compare them with the known leading bytes of whitelisted files of that type. This is quite nuanced and again beyond the scope of this answer, which is quite long enough.
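For a rough idea of what that byte comparison looks like, here is a minimal sketch. The helper name `sniff_signature()` is an assumption, and the table holds only three well-known signatures (JPEG, PNG, PDF) as examples. A matching signature shows only that the file starts like that type, not that the rest of it is harmless.

```php
<?php
// Hypothetical helper: compare the leading bytes of a file with known signatures.
// Only three well-known signatures are listed here; extend the table for your own whitelist.
function sniff_signature(string $path): ?string
{
    $signatures = [
        'image/jpeg'      => "\xFF\xD8\xFF",
        'image/png'       => "\x89PNG\r\n\x1A\n",
        'application/pdf' => '%PDF-',
    ];
    // Read just enough bytes to cover the longest signature above (8 bytes).
    $head = @file_get_contents($path, false, null, 0, 8);
    if ($head === false) {
        return null;
    }
    foreach ($signatures as $mime => $magic) {
        if (strncmp($head, $magic, strlen($magic)) === 0) {
            return $mime;
        }
    }
    return null; // no known signature matched
}
```

Used on $_FILES['file']['tmp_name'], the result can be cross-checked against the finfo result from step 3, rejecting the upload when the two disagree.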