How is heuristic-based virus detection possible?

Question

The Halting Problem states that it is impossible for one program to predict the output of another, or whether it will terminate.

That got me thinking... how do heuristics-based scanners decide whether a given executable program's instructions are "virus-like", seeing as that would entirely involve predicting what the program is going to do?

I believe the first step is to see whether any code matches code from known viruses. Sort of like how the first step to guessing passwords is to try the most commonly-used passwords. — William Gaul, Oct 17 '13 at 01:45
@William: That's the traditional approach of virus scanners, yes, but I thought the whole point of heuristics-based virus scanning was to be able to detect *unknown* viruses, albeit with some false positives and negatives. — icktoofay, Oct 17 '13 at 01:47
This question appears to be off-topic because it is not specific enough for [SO]. Maybe on [programmers.se]? — John Saunders, Oct 17 '13 at 01:48
That's not quite what the halting problem says. Turing proved that a *general* solution for *all possible* programs cannot exist. In theory, the output of *some* code could be predicted. No idea if any malware scanners attempt that. — Michael Petrotta, Oct 17 '13 at 01:48
There’s a level of analysis you can do on the file based on pattern matching, but the real power of heuristic-based virus detection comes in when the malware actually runs. The AV hooks into suspicious things and blocks them. Sure, you have to run it to completely get around the Halting problem, but it can work rather well. — Ry-, Oct 17 '13 at 01:49
@icktoofay That's what makes it a heuristic! To use my analogy, a password-guessing *algorithm* would try every possible combination, in order, based on the algorithm. But a password-guessing *heuristic* employs some guesswork and shortcuts, like a table of known passwords. — William Gaul, Oct 17 '13 at 01:51
Detection is not 100% accurate; there are some false positives as well. — Grijesh Chauhan, Oct 22 '13 at 17:26

score 2 · Accepted Answer · answered Oct 17 '13 at 01:47

Viruses usually exhibit recognizable patterns in their code, such as opening particular registry keys, calling rarely used system functions, or modifying their own code. An analyzer can "see" these actions and flag such a program as a potential virus. Of course, this approach produces some percentage of false alarms.
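
To make the idea concrete, here is a minimal sketch of how such pattern-based flagging might work, assuming the analyzer has already extracted a list of observed traits from the program. The trait names, weights, and threshold are all hypothetical and chosen only for illustration; real scanners use far richer and carefully tuned signals.

```python
# Hypothetical traits and weights: invented for illustration, not taken
# from any real antivirus product.
SUSPICIOUS_WEIGHTS = {
    "opens_special_registry_key": 3,
    "calls_rare_system_function": 2,
    "modifies_own_code": 4,
}

# Assumed cut-off: lowering it catches more malware but raises false alarms.
THRESHOLD = 5


def heuristic_score(observed_traits):
    """Sum the weights of the distinct suspicious traits that were observed."""
    return sum(SUSPICIOUS_WEIGHTS.get(trait, 0) for trait in set(observed_traits))


def looks_like_virus(observed_traits):
    """Flag the program when its combined score reaches the threshold."""
    return heuristic_score(observed_traits) >= THRESHOLD
```

Note that nothing here predicts what the program will ultimately do. The scanner only counts evidence it can see, which is why a harmless program that happens to touch the same registry keys can be flagged, and a virus that avoids every listed trait can slip through.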

Iłya Bursov

Furthermore, the algorithm can "learn", just as the human immune system can learn how to recognize pathogens. – Hot Licks Oct 17 '13 at 02:49
