I have a C# project connected to a SQL Server database, and I'm implementing format recognition for uploaded files. More specifically, I upload files as varbinary and store them in a column in the database. When I download them, I want to recognize their format. I found on the web some keywords (or metadata) by which I could recognize whether they are docx, xlsx, or something else. My problem is that if I do this in SQL:

select * from table where convert(varchar(8000),objectFile) like '%docprop%'

it works and the database returns only Word files. But how can I do the same in C# after fetching the varbinary? I tried this, but it doesn't work:

var item = context.tObject_M.SingleOrDefault(x => x.objectId == objectId);
var files = item.objectFile;
string filess = Convert.ToString(files);
byte[] itemi = item.objectFile;
string ciao = System.Text.Encoding.UTF8.GetString(itemi);
if (ciao.Contains("DOCPROPS"))
{
    filess = "ciao";
}

1 Answer

http://www.garykessler.net/library/file_sigs.html

Check this link; you should look into magic numbers. This is the only reliable way to know the file type. In your case you can use the content type, as I told you in the comments!

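If you search for a sequence starting with [03] [04] [00]:

// Signature bytes as given above; take the real values for each format from the linked table.
byte[] fileSign = { 0x03, 0x04, 0x00 };

// fileByteArray holds the file's bytes (e.g. item.objectFile)
bool sameSignature = fileByteArray.Length >= fileSign.Length;

for (int i = 0; sameSignature && i < fileSign.Length; i++)
{
    if (fileByteArray[i] != fileSign[i])
    {
        // not the same file type
        sameSignature = false;
    }
}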
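
As a rough sketch of the signature comparison above, the helper below checks whether a byte array begins with a given signature. The names `FileSignature`, `Zip` and `StartsWith` are made up for illustration. The bytes 50 4B 03 04 are the ZIP local file header signature; docx and xlsx files are ZIP containers and share it, so this check alone can only tell you "ZIP-based", not which Office format.

```csharp
using System;

public static class FileSignature
{
    // ZIP local file header signature ("PK" followed by 0x03 0x04).
    public static readonly byte[] Zip = { 0x50, 0x4B, 0x03, 0x04 };

    // Returns true when 'data' begins with every byte of 'signature'.
    public static bool StartsWith(byte[] data, byte[] signature)
    {
        if (data == null || signature == null || data.Length < signature.Length)
        {
            return false;
        }

        for (int i = 0; i < signature.Length; i++)
        {
            if (data[i] != signature[i])
            {
                return false; // first mismatch decides
            }
        }

        return true;
    }
}
```

Usage would look something like `bool isZip = FileSignature.StartsWith(item.objectFile, FileSignature.Zip);`.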
mybirthname
  • Thank you for the answer! But it doesn't solve my problem: Office files (xlsx, docx, etc.) share the same file header for the first 8 bytes! So I've found some keywords for every type of file; if I put them in the query as above, it gives me the correct extension! This is why I want to use this method! – kirkfrusciante Sep 19 '14 at 13:07
  • @kirkfrusciante Did you read the link I sent you about docx and xlsx files? Microsoft Office Open XML Format (OOXML) Document NOTE: There is no subheader for MS OOXML files as there is with DOC, PPT, and XLS files. To better understand the format of these files, rename any OOXML file to have a .ZIP extension and then unZIP the file; look at the resultant file named [Content_Types].xml to see the content types. In particular, look for the – mybirthname Sep 19 '14 at 13:10
  • I read this yesterday during my research; however, I have found a subheader or something similar that helps me, as I wrote above. But it helps me only in SQL. I don't need other methods to reach my goal; I want to succeed with my method, and I repeat the question: how can I replicate the SQL query written above in C# with the same result? Thank you for your interest! – kirkfrusciante Sep 19 '14 at 13:28
  • I'm not aware of this functionality in C#! – mybirthname Sep 19 '14 at 15:45
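
For what it's worth, one plausible reason the C# attempt behaves differently from the SQL query is case sensitivity. `LIKE` follows the column's collation, which is commonly case-insensitive on SQL Server, whereas `string.Contains("DOCPROPS")` is case-sensitive and will not match `docProps`. The sketch below assumes that is the cause. `BinaryKeyword` and `ContainsKeyword` are hypothetical names, and the 8000-byte cap only mirrors `convert(varchar(8000), ...)` from the query. The bytes are decoded as ISO-8859-1 so that every byte maps to exactly one character and ASCII keywords survive intact.

```csharp
using System;
using System.Text;

public static class BinaryKeyword
{
    // Case-insensitive search for an ASCII keyword in the first 'maxBytes' bytes of 'data'.
    public static bool ContainsKeyword(byte[] data, string keyword, int maxBytes = 8000)
    {
        if (data == null || string.IsNullOrEmpty(keyword))
        {
            return false;
        }

        // Only look at the leading bytes, like varchar(8000) does in the SQL query.
        int count = Math.Min(data.Length, maxBytes);

        // ISO-8859-1 maps each byte to one char, so binary data is not mangled.
        string text = Encoding.GetEncoding("ISO-8859-1").GetString(data, 0, count);

        // OrdinalIgnoreCase approximates a case-insensitive LIKE '%keyword%'.
        return text.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

With the entity from the question this would be called as `bool looksLikeWord = BinaryKeyword.ContainsKeyword(item.objectFile, "docprop");`. If the database uses a case-sensitive collation, `StringComparison.Ordinal` would be the closer match.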