5

How can I find a file's original extension if the file has been renamed? Are there any tools available for this?

Example: I have a file "1.doc", which everybody knows is a Word document, and I renamed it to "1.txt". The file is still a Word document; how can I get the original file extension back?

Thiru G
  • "What's in a name." Extension is just a name and does not have a lot to do with the actual type of file contents. – user562374 Jan 05 '11 at 13:08
  • Another candidate for [the `confusion-of-ideas` tag](http://stackoverflow.com/questions/58640/great-programming-quotes/59001#59001) - what's in a name? –  Jan 05 '11 at 13:13
  • @delnan: Not really. Various file formats do have characteristic signatures by which they can be recognized (see e.g. the PNG or GIF formats). OTOH, Windows used to (still does? not sure) recognize a file type *only* by the file's extension - so if you rename the file, Windows is clueless what to do with it. – Piskvor left the building Jan 05 '11 at 13:30
  • Okay, say there is a file, for example 1.fgf, that someone renamed to "fgf". How can I use that file? It may be a Word document, an MP3, etc. – Thiru G Jan 05 '11 at 13:33

3 Answers

4

Of course you can :)

This is the C# code for you. I guess you can build your own tool ;)

using System;
using System.IO;
using System.Runtime.InteropServices;

public static class MimeSniffer
{
    // urlmon.dll is Windows-only. The string parameters are wide strings and
    // the result is a pointer, so it is declared as IntPtr (a UInt32 would
    // truncate the pointer in a 64-bit process).
    [DllImport("urlmon.dll", CharSet = CharSet.Unicode, ExactSpelling = true)]
    private static extern int FindMimeFromData(
        IntPtr pBC,
        [MarshalAs(UnmanagedType.LPWStr)] string pwzUrl,
        [MarshalAs(UnmanagedType.LPArray)] byte[] pBuffer,
        int cbSize,
        [MarshalAs(UnmanagedType.LPWStr)] string pwzMimeProposed,
        int dwMimeFlags,
        out IntPtr ppwzMimeOut,
        int dwReserved
    );

    public static string getMimeFromFile(string filename)
    {
        if (!File.Exists(filename))
            throw new FileNotFoundException(filename + " not found");

        // Read up to the first 256 bytes, opening the file read-only.
        byte[] buffer = new byte[256];
        int bytesRead;
        using (FileStream fs = File.OpenRead(filename))
        {
            bytesRead = fs.Read(buffer, 0, buffer.Length);
        }

        IntPtr mimeTypePtr = IntPtr.Zero;
        try
        {
            // Pass the number of bytes actually read, not the buffer size.
            int hr = FindMimeFromData(IntPtr.Zero, null, buffer, bytesRead, null, 0, out mimeTypePtr, 0);
            if (hr != 0 || mimeTypePtr == IntPtr.Zero)
                return "unknown/unknown";
            return Marshal.PtrToStringUni(mimeTypePtr);
        }
        catch (Exception)
        {
            return "unknown/unknown";
        }
        finally
        {
            // Release the string allocated by urlmon, if any.
            if (mimeTypePtr != IntPtr.Zero)
                Marshal.FreeCoTaskMem(mimeTypePtr);
        }
    }
}

This code gives you the MIME type. To map the MIME type to an extension, just do a little Google search.
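Once you have a MIME type, the last step is a plain lookup. Here is a minimal sketch of that step, assuming a small hand-written table: the class name `MimeExtensions`, the method `GuessExtension`, and the entries themselves are illustrative and not part of any library, and there is no guarantee that `FindMimeFromData` reports every type listed here.

```csharp
using System;
using System.Collections.Generic;

public static class MimeExtensions
{
    // Hypothetical, deliberately tiny table; extend it with the types you care about.
    private static readonly Dictionary<string, string> Map =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "application/msword", ".doc" },
            { "application/pdf", ".pdf" },
            { "application/x-zip-compressed", ".zip" },
            { "image/png", ".png" },
            { "image/gif", ".gif" },
            { "image/jpeg", ".jpg" },   // the original could equally have been .jpeg
            { "text/plain", ".txt" },
        };

    // Returns a likely extension for the MIME type, or null if it is not in the table.
    public static string GuessExtension(string mimeType)
    {
        string ext;
        return mimeType != null && Map.TryGetValue(mimeType, out ext) ? ext : null;
    }
}
```

For example, `MimeExtensions.GuessExtension(getMimeFromFile(path))` gives a candidate extension. Because several extensions can share one MIME type, treat the result as a likely name rather than the original one.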

honibis
  • Nice & thanks honibis, extraction of mime-type is not an issue :-) – Thiru G Jan 05 '11 at 13:25
  • While this would work, you're not getting the *original* extension back, you're getting a best-effort guess what it *might* have been. Note that this will probably not work for some types of files (such as ZIP archives, which have the "headers" at the end of file; as such, the start of file can contain anything). – Piskvor left the building Jan 05 '11 at 13:27
  • @Piskvor For zip files it returns application/x-zip-compressed. Actually, it's more than a guess, since the original file extension is strictly linked to the MIME type – honibis Jan 05 '11 at 14:00
  • @honibis: It's an educated guess (based on format artifacts and statistics) that's mostly correct, but a guess nonetheless. There are filetypes that have multiple extensions: "was it image.jpg, or image.jpeg?" As I said, your code will work, possibly 90+% of the time. Just be aware that `FindMimeFromData` runs on pattern recognition and statistics, not magic :) – Piskvor left the building Jan 05 '11 at 14:03
  • @Piskvor There is something called a 'file header', and all standard formats have headers. The first 256 bytes are where the headers are kept. So NO pattern recognition or magic. – honibis Jan 05 '11 at 14:13
  • 1
    @honibis: I have this feeling that we're violently in agreement :D – Piskvor left the building Jan 05 '11 at 14:24
3

Impossible. If you're on a *nix type system, use the file command to determine file format.

If you're really paranoid about stuff like this happening (and messing up your workflow), you can do 3 things:

  1. make a hash of your file, for example an MD5 hash, so you know your file hasn't been tinkered with
  2. take note of your file's timestamp so you can see when it was last changed
  3. take note of your file's extension at that timestamp

This will guard you in a few ways:

The hash will make sure your file's contents haven't been changed.

The timestamp will tell you the last time it was modified.

The recorded extension will tell you the original extension.

Since just renaming a file won't modify its timestamp, you need step 3.

Using techniques like this will tell you in 99.99999999999% of cases whether your file has been modified by something or somebody.
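The three notes above can be captured in one small record. This is only a sketch under assumptions: the `FileRecord` class and its members are hypothetical names made up for illustration, and where you store the record (a text file, a database) is left open.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public sealed class FileRecord
{
    public string Md5Hex;          // step 1: content hash
    public DateTime LastWriteUtc;  // step 2: last modification time
    public string Extension;       // step 3: extension at capture time

    // Takes a snapshot of the file's hash, timestamp and extension.
    public static FileRecord Capture(string path)
    {
        using (MD5 md5 = MD5.Create())
        using (FileStream fs = File.OpenRead(path))
        {
            byte[] hash = md5.ComputeHash(fs);
            return new FileRecord
            {
                Md5Hex = BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant(),
                LastWriteUtc = File.GetLastWriteTimeUtc(path),
                Extension = Path.GetExtension(path),
            };
        }
    }

    // True when both snapshots have the same content hash, i.e. the bytes are unchanged.
    public bool SameContent(FileRecord other)
    {
        return other != null && Md5Hex == other.Md5Hex;
    }
}
```

If you capture a record for "1.doc" and later find a "1.txt" whose new record satisfies `SameContent`, the `Extension` field of the earlier record tells you what the file used to be called.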

darioo
1

You can't. You'd have to use a tool like file to try and detect the file's format.
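To give a rough idea of what such a tool does, here is a minimal sketch that compares the first bytes of a file against a few well-known signatures ("magic numbers"). It is not how `file` itself is implemented; the `MagicSniffer` class and its method names are hypothetical, and the table covers only a handful of formats.

```csharp
using System;
using System.IO;

public static class MagicSniffer
{
    // Returns a likely extension for the given leading bytes, or null if no signature matches.
    public static string GuessExtension(byte[] head)
    {
        if (StartsWith(head, 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A)) return ".png";
        if (StartsWith(head, 0x47, 0x49, 0x46, 0x38)) return ".gif"; // "GIF8"
        if (StartsWith(head, 0x25, 0x50, 0x44, 0x46)) return ".pdf"; // "%PDF"
        if (StartsWith(head, 0x50, 0x4B, 0x03, 0x04)) return ".zip"; // also used by .docx, .jar, ...
        // OLE2 compound file: old Word documents, but also .xls, .ppt and others
        if (StartsWith(head, 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1)) return ".doc";
        return null;
    }

    // Reads the first 8 bytes of the file and guesses from those.
    public static string GuessExtensionOfFile(string path)
    {
        byte[] head = new byte[8];
        int n;
        using (FileStream fs = File.OpenRead(path))
        {
            n = fs.Read(head, 0, head.Length);
        }
        Array.Resize(ref head, n); // keep only the bytes that were actually read
        return GuessExtension(head);
    }

    private static bool StartsWith(byte[] data, params byte[] signature)
    {
        if (data == null || data.Length < signature.Length) return false;
        for (int i = 0; i < signature.Length; i++)
        {
            if (data[i] != signature[i]) return false;
        }
        return true;
    }
}
```

A Word document renamed to "1.txt" would match the OLE2 signature here, but so would an old Excel or PowerPoint file, so the answer is a guess at the format, not a recovery of the original name.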

Pekka