1

Is it possible to look at object code and tell which language was originally used to produce it? Does the language leave a trace or a stamp on the object code? Do the compilers of various languages use a fixed format for a given ISA when generating object code?

BoltClock
KawaiKx

2 Answers

2

There is no general algorithm, but in practice it is often possible. Usually you can just look at the libraries the application depends on: if a Windows application depends on msvcrt.dll, for example, there is a good chance that it is a C or C++ program compiled with Visual C++. Sometimes a compiler also leaves traces in the .data section. Here is what I see when I open a "Hello, World!"-like Haskell binary (compiled with GHC) in a hex editor:

[Screenshot: hex editor view of the GHC-compiled binary]

Here's what GCC's "copyright notice" looks like:

[Screenshot: GCC's notice in a hex editor]

A trained eye can even recognize the compiler version by looking at the disassembly, since every compiler optimizes code slightly differently and has its own implementation quirks. If you need to automate this, I suggest looking at machine learning techniques.
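To illustrate the simpler string-based checks described above, here is a minimal Python sketch that scans a binary's raw bytes for known marker strings. The names `MARKERS`, `guess_toolchains` and `guess_toolchains_in_file` are hypothetical, and the byte patterns are assumptions chosen for illustration (only msvcrt.dll comes from the discussion above), not an authoritative or complete list. A hit is only a hint, and a stripped or packed binary may contain none of these strings.

```python
# Heuristic marker strings per toolchain. These patterns are illustrative
# assumptions; a real tool would need a much larger, verified list.
MARKERS = {
    "GCC": [b"GCC: ("],                    # assumed form of GCC's version string
    "GHC (Haskell)": [b"ghc-prim"],        # assumed GHC runtime package name
    "Visual C++ runtime": [b"msvcrt.dll"], # dependency mentioned in the answer
}


def guess_toolchains(data):
    """Return the names of all toolchains whose markers occur in `data`."""
    lowered = data.lower()  # DLL names in import tables vary in case
    hits = []
    for name, patterns in MARKERS.items():
        if any(pattern.lower() in lowered for pattern in patterns):
            hits.append(name)
    return hits


def guess_toolchains_in_file(path):
    """Read a binary from disk and apply the same heuristic."""
    with open(path, "rb") as f:
        return guess_toolchains(f.read())
```

Calling `guess_toolchains_in_file` on an executable returns a (possibly empty) list of candidate toolchains. Because this is plain substring matching, it can produce false positives, for example when a program merely embeds one of these strings as data.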

Mikhail Glushenkov
1

Nope. x86 is x86: once it's in that format, there's no trace of the original language.

Puppy