1

I am working on a website that lets users upload files in various formats. We need to prevent users from uploading password-protected files.

Is there a way to determine whether a Microsoft Office file (Word, PowerPoint, or Excel) is password protected before it is uploaded? Following http://social.msdn.microsoft.com/Forums/en/oxmlsdk/thread/34701a34-f1d4-4802-9ce4-133f15039c69, I implemented the code below, but it throws a "File contains corrupted data" error when it tries to open a password-protected file.

 using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(mem, false))
 {
     DocumentProtection dp =
         wordDoc.MainDocumentPart.DocumentSettingsPart.Settings.GetFirstChild<DocumentProtection>();
     if (dp != null && dp.Enforcement == DocumentFormat.OpenXml.OnOffValue.FromBoolean(true))
     {
         return true;
     }
 }

Are there any other ways to determine this?

John Saunders
Sunita

5 Answers

5

Give this code a try:

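using System;
using System.IO;
using System.Text;

public static Boolean IsProtected(String file)
{
    Byte[] bytes = File.ReadAllBytes(file);

    // ISO-8859-1 maps each byte to the character with the same code, so the
    // comparisons below do not depend on the machine's default code page.
    Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
    String prefix = latin1.GetString(bytes, 0, Math.Min(2, bytes.Length));

    // Zip and not password protected.
    if (prefix == "PK")
        return false;

    // Office format (compound file; the signature begins with bytes 0xD0 0xCF).
    if (prefix == "ÐÏ")
    {
        // XLS 2003 (length checks guard against files shorter than the offset)
        if (bytes.Length > 0x208 && bytes[0x208] == 0xFE)
            return true;

        // XLS 2005
        if (bytes.Length > 0x214 && bytes[0x214] == 0x2F)
            return true;

        // DOC 2005
        if (bytes.Length > 0x20B && bytes[0x20B] == 0x13)
            return true;

        // Guessing
        if (bytes.Length < 2000)
            return false;

        // DOC/XLS 2007+ (the name is stored as UTF-16, hence the interleaved zero bytes)
        String start = latin1.GetString(bytes, 0, 2000).Replace("\0", " ");

        if (start.Contains("E n c r y p t e d P a c k a g e"))
            return true;

        return false;
    }

    // Unknown format.
    return false;
}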
Tommaso Belluzzo
  • Won't I need to upload the file before calling the above method, since it uses the file path? – Sunita Jan 15 '13 at 22:41
  • I think it's the same for the first one, isn't it? WordprocessingDocument.Open() requires a path or a Stream... – Tommaso Belluzzo Jan 15 '13 at 22:48
  • Yes, it does take a stream. – Sunita Jan 15 '13 at 22:51
  • I tried the above code and it doesn't work: it returns false for a password-protected file. – Sunita Jan 15 '13 at 22:53
  • What is the version of this file? – Tommaso Belluzzo Jan 15 '13 at 22:59
  • Can you explain the above code? Do encrypted files have those specific bytes? It works fine for ppt, pptx, doc, docx, and xlsx; I am only having issues with xls files. – Sunita Jan 16 '13 at 19:48
  • Yes, they normally do, and with a little effort you can find their pattern. Further documentation: http://www.openoffice.org/sc/excelfileformat.pdf – Tommaso Belluzzo Jan 16 '13 at 19:48
  • I understand that you are looking for the FILEPASS (0x2F) record within the byte array, but I'm not sure how you determined the locations (0x20C & 0x214) to look at. – Sunita Jan 16 '13 at 21:19
  • Yes, I'm looking for that byte. Normally it's just a question of calculating the size of the header up to FILEPASS and then reproducing that in the code. I don't have XLS 2003 files to test with right now, but you can create two identical XLS 2003 files, one protected and one not, then search both for 0x2F bytes and compare the differences. That should give you the offset, but maybe I'm wrong. – Tommaso Belluzzo Jan 16 '13 at 21:24
  • I tried your suggestion: I created two identical files, Sample1 (without a password) and Sample2 (with a password). Comparing all the bytes in both files, I could not find any difference. – Sunita Jan 16 '13 at 22:19
  • I compared the hex dumps of the XLS files and modified your check for XLS 2003 to the following: `if (file.RawData.Skip(0x208).Take(1).ToArray()[0] == 0xFE) return true;` – Sunita Jan 17 '13 at 19:42
1

Sorry I'm a bit late to the party here. I don't yet have 50 reputation, so I can't comment on Tommaso Belluzzo's answer, but when I implemented it I found the following:

  1. To get the prefix I use Encoding.UTF7.GetString
  2. To check for "EncryptedPackage" I use Encoding.Unicode.GetString which obviates the need to remove all the \0s
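
A minimal sketch of the second point, reading from a stream rather than a file path. It assumes the stream name sits at an even byte offset within the first 2000 bytes (the same window the answer above scans); the class name `EncryptedPackageProbe` and the method name `LooksEncrypted` are made up for illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncryptedPackageProbe
{
    // Hypothetical helper: returns true when the first 2000 bytes of the
    // stream contain the UTF-16LE text "EncryptedPackage".
    public static bool LooksEncrypted(Stream input)
    {
        byte[] head = new byte[2000];
        int read = 0;

        // Stream.Read may return fewer bytes than requested, so loop until
        // the buffer is full or the stream ends.
        while (read < head.Length)
        {
            int n = input.Read(head, read, head.Length - read);
            if (n == 0)
                break;
            read += n;
        }

        // Decode as UTF-16LE, ignoring a trailing odd byte. No "\0" removal
        // is needed because each two-byte unit becomes a single character.
        string text = Encoding.Unicode.GetString(head, 0, read - (read % 2));
        return text.IndexOf("EncryptedPackage", StringComparison.Ordinal) >= 0;
    }
}
```

Because it takes a Stream, this can be given an uploaded file's stream (for example `PostedFile.InputStream` of a FileUpload control) without saving the file first. Reading advances the stream, so reset `Position` to 0 afterwards if the stream is seekable and will be read again.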
0

The Following Is In The .aspx Source File

 Page Language="C#" AutoEventWireup="true" CodeBehind="TestForm.aspx.cs" Inherits="TestApp.TestForm"

 !DOCTYPE html PUBLIC
 Reference Page ="~/TestForm.aspx" // Note: Removed all HTML tags
    protected void Upload_Click(object sender, EventArgs e)
    {
        String noPW = "C:\\Users\\David\\Desktop\\Doc1.docx";
        String pwProtected = "C:\\Users\\David\\Desktop\\Test.docx"; 
    //         if (isProtected(pwProtected))
    //             outcome.Text = ("Document Is Password Protected");
    //         else
    //             outcome.Text = ("Document Is NOT Password Protected");

        if (isProtected(noPW))
            outcome.Text = ("Document Is Password Protected");
        else
            outcome.Text = ("Document Is NOT Password Protected");
    }

The Following Is In The .aspx.cs Code Behind File


    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Runtime.InteropServices;
    using System.Web;
    using System.Web.UI;
    using System.Web.UI.WebControls;
    using Microsoft.Office.Interop.Word;


    namespace TestApp
    {
        public partial class TestForm : System.Web.UI.Page
        {

            protected void Page_Load(object sender, EventArgs e)
            {

            }
            public static bool isProtected(object filePath)
            {
                Application myapp = new Application();

                // Deliberately wrong password, used only to provoke the error below.
                object pw = "thispassword";
                try
                {

                    // Trying this for Word document
                    Document doc = myapp.Documents.Open(ref filePath, PasswordDocument: ref pw); // try open
                    doc.Close();  // close it if it does open
                }
                catch (COMException ex)
                {
                    if (ex.HResult == -2146822880) // Can't Open Doc Caused By Invalid Password
                        return true;
                    else
                        Console.WriteLine(ex.Message + "  " + ex.HResult);  // For debugging, have only tested this one document.
                }
                finally
                {
                    myapp.Quit();  // otherwise the Word process started above keeps running
                }
                return false;
            }
        }

    }

At least on my computer, I get the expected output for both files, but this is hardly an exhaustive test of the code. In addition, when I tried to upload a file using a FileUpload control, I got the COM error "Cannot Find C:\Windows\System\fileName.docx", which confused me because the uploaded file came from my desktop. You probably know why that occurs, as you are more familiar with ASP.NET than I am. Either way, this code is just something to try; I hope it helps.

David Venegoni
  • Password-protected files are considered encrypted, though I do not know whether that extends to Word documents; they do require a password upon opening... MSDN's definition: "Encrypts a file so that only the account used to encrypt the file can decrypt it." – David Venegoni Jan 16 '13 at 02:33
  • Yeah, but do you have reason to believe that such files cause the `FileAttributes.Encrypted` to be set? – John Saunders Jan 16 '13 at 03:16
  • To be completely honest, I am unsure; I will test it out in a second. Sorry if I came off as rude in the previous comment, I was in the middle of writing it when my gf called to nag me about something, lol. – David Venegoni Jan 16 '13 at 03:23
  • The original code didn't work, so I updated my answer after trying a few things. – David Venegoni Jan 16 '13 at 04:30
  • I could not compile the above code in an ASP.NET web application. Is that the Windows Application class? – Sunita Jan 16 '13 at 19:49
  • I made a quick console application to test it; I only added the references to Microsoft.Office.Interop.Word and System.Runtime.InteropServices. After my class, I will work with it using an ASP.NET Web Application template and update my response for you. – David Venegoni Jan 16 '13 at 21:35
0

To answer the question:

In order to tell whether a file is password protected before it is uploaded, you would need to open and process that file in the browser. Currently the only mechanism for reading files client side is the HTML5 File API, which isn't universally supported, so there is no reliable way of doing this.

Now, you can test the file on the server to determine if it is password protected or not and either throw it away or save it depending upon your rules.

Incidentally, the code you provided is server side code. Just modify it to catch the corrupted exception and display a message to the user about how the file is either corrupt or password protected with a note on how you don't allow password protected files to be uploaded.
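
A rough sketch of that suggestion. It assumes the "File contains corrupted data" error from the question surfaces as System.IO.FileFormatException; other versions of the Open XML SDK may throw a different exception type, so check what your version actually raises. The class name `UploadValidator` and the method name `GetRejectionMessage` are hypothetical.

```csharp
using System.IO;
using DocumentFormat.OpenXml.Packaging;

public static class UploadValidator
{
    // Hypothetical helper: returns null when the document opens normally,
    // otherwise a message that can be shown to the user.
    public static string GetRejectionMessage(Stream upload)
    {
        try
        {
            // Open read-only; the document is disposed as soon as the check is done.
            using (WordprocessingDocument.Open(upload, false))
            {
                return null;
            }
        }
        catch (FileFormatException)
        {
            // Assumption: this is the exception behind "File contains corrupted data".
            // It cannot distinguish a damaged file from a password-protected one.
            return "The file is either corrupt or password protected. "
                 + "Password-protected files cannot be uploaded.";
        }
    }
}
```

The same pattern would apply to SpreadsheetDocument and PresentationDocument for Excel and PowerPoint files.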

NotMe
-1

Like NotMe, I am unaware of a way to tell whether a file is password protected before at least part of it has been uploaded. However, the accepted answer to this question, while technically great, is a bit of overkill.

See "Detect password protected word file" for a much simpler and faster way to test whether a file is password protected.

Also, for those who find this question while looking for a VBA or VBScript solution, the following is a VBA version that is easily adapted to VBScript: www.ozgrid.com/forum/showthread.php?t=148962. I would suggest checking for err.number = 5408 (raised when the wrong password is supplied for a protected file) rather than treating any err.number as a sign that the file is password protected.

Community
user66001
  • Not sure why file header analysis is overkill to you. To me, introducing VBA or NetOffice on the server's _backend_ just to learn that a file is protected looks like a guaranteed way to shoot yourself in both feet in the future. Even the guy in the post you referenced shot himself in the foot that way, with NetOffice alerts being opened for protected files. The power of Tommaso's answer is that it can be ported even to client-side JS, so the fact that the file is protected will be known prior to uploading. – grafgenerator Apr 24 '20 at 11:27