How to defeat deobfuscation of obfuscated javascript code?

Question

This is a generic question.

I've seen JavaScript on some websites which is obfuscated.

When you try to deobfuscate the code using standard deobfuscators (deobfuscatejavascript.com, jsnice.org and jsbeautifier.org), the code is not easily deobfuscated.

I know it's practically impossible to prevent deobfuscation. I want to make it really tough for an attacker to deobfuscate my code.

Please suggest some ways I can achieve this.

Should I write my own obfuscator, then obfuscate the output with another online obfuscator? Will this beat the deobfuscators?

Thanks in advance

P.S.: I tried Google Closure Compiler, UglifyJS, js-obfuscator and a bunch of other tools. None of them (used individually or in combination) is able to beat the deobfuscators.

What do you mean by attacker? Because if your code is open to attack (are you storing passwords client-side, for example) then obfuscating the JS is not going to help. — Andy, May 13 '16 at 18:23
What are you trying to defend against? Obfuscators are only good for preventing code stealing... — Eugene Sh., May 13 '16 at 18:25
Is this really necessary? IMO minifying is the only valid use for obfuscating JS code. As you said yourself, you can't avoid deobfuscation. If you're trying to prevent people from "stealing" your source code, then stop developing for the web or make your client site static with all the important logic on the server. Lastly, as web developers or, really, as programmers in general, we have to realize that none of us are special snowflakes. Many programmers and companies (including Microsoft now!), which are far more successful than we will ever be, intentionally publish even their compiled code. — Michael L., May 13 '16 at 18:29
You may find [The case for code obfuscation?](http://programmers.stackexchange.com/questions/129296/the-case-for-code-obfuscation) over at Programmers SE interesting. The answers there pretty much say the same thing as folks here: You can slow down folks trying to read your JavaScript, but you can't make it impossible. — BSMP, May 13 '16 at 18:37
Now the question: Is your code really that interesting that people will even try to deobfuscate it if it is somehow obfuscated? — Eugene Sh., May 13 '16 at 18:42

Ira Baxter · Answer 1 · 2016-05-14T04:23:48.900

Obfuscation can be accomplished at several levels of sophistication.

Most available obfuscators scramble (shrink?) identifiers and remove whitespace. Pretty-printing the code restores nice indentation; with enough sweat and guesswork, sensible identifier names can be restored too. So people say this is weak obfuscation. They're right, but sometimes it is enough. [Encryption is not obfuscation; it is trivially reversed.]
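As a rough illustration of this weak level, here is a hand-written before/after. The function `totalPrice` and the short names are made up for the example, and the "minified" line is only an approximation of what a renaming tool might emit, not the output of any particular product.

```javascript
// Readable original (hypothetical example function).
function totalPrice(unitPrice, quantity, taxRate) {
  const subtotal = unitPrice * quantity;
  return subtotal + subtotal * taxRate;
}

// Roughly what identifier scrambling plus whitespace removal produces.
function a(b,c,d){const e=b*c;return e+e*d}

// After pretty-printing: the layout is back, the names are not.
function aPretty(b, c, d) {
  const e = b * c;
  return e + e * d;
}
```

All three functions compute the same value; the only thing lost is the meaning carried by the names, which a reader has to guess back from how each variable is used.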

But one can obfuscate code in more complex ways. In particular, one can take advantage of the Turing Tarpit and the fact that reasoning about the obfuscated program can be hard or impossible in practice. One can do this by scrambling the control flow and injecting opaque control-flow predicates that are Turing-hard to reason about; you can construct such predicates in a variety of ways. For example, one can include tests based on artificially constructed pointer-aliasing (or array-subscripting, which is equivalent) problems of the form "*p==*q", where p and q are pointers computed from messy, complicated graph data structures.

Such obfuscated programs are much harder to reverse engineer because they build on problems that are Turing-hard to solve.
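A minimal sketch of the idea in JavaScript, using object references in place of pointers. Everything here (`buildRing`, `opaqueTrue`, `opaqueFalse`, `isAdultScrambled`) is a hypothetical name invented for the illustration, and the structure is deliberately tiny; a real tool would generate far messier graphs so that the aliasing question is not obvious by inspection.

```javascript
// Build a tiny cyclic structure: a -> b -> c -> a.
function buildRing() {
  const a = { id: 0 }, b = { id: 1 }, c = { id: 2 };
  a.next = b; b.next = c; c.next = a;
  return [a, b, c];
}
const ring = buildRing();

// Opaquely true: three hops around a 3-cycle always alias the start node.
function opaqueTrue(seed) {
  const p = ring[(seed >>> 0) % 3];
  const q = p.next.next.next;
  return p === q;
}

// Opaquely false: one hop never aliases the start node.
function opaqueFalse(seed) {
  const p = ring[(seed >>> 0) % 3];
  return p === p.next;
}

// Plain version of the logic.
function isAdult(age) {
  return age >= 18;
}

// Same logic, flattened into a state machine guarded by opaque predicates.
function isAdultScrambled(age) {
  let state = opaqueTrue(age) ? 2 : 7;
  let result = false;
  while (state !== 0) {
    switch (state) {
      case 2: // real path
        result = age >= 18;
        state = opaqueFalse(age + 1) ? 7 : 0;
        break;
      case 7: // decoy path, never taken at run time
        result = age < 18;
        state = 2;
        break;
    }
  }
  return result;
}
```

The author of the obfuscated code knows by construction which branch is live; someone analyzing it statically has to prove facts about the shape of the graph before the decoy branch can be discarded.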

Here's an example paper that talks about scrambling control flow. Here's a survey on control flow scrambling, including opaque predicates.

What OP wants is an obfuscator that operates at this more complex level. These are available for Java and C#, I believe, because building program analyzers to determine (and harness) control flow is relatively easy once you have a byte code representation of the program rather than just its text. They are not so available for other languages. Probably just a matter of time.

(Full disclosure: my company builds the simpler kind of obfuscators. We think about the fancier ones occasionally but get distracted by shiny objects a lot).

FYI, Jscrambler's new release already implements control-flow scrambling and opaque predicates for JavaScript. As far as I know, they are the only ones doing it for JavaScript. — rmribeiro, Jun 03 '16 at 08:45

score 1 · Accepted Answer · answered May 13 '16 at 19:53

The public de-obfuscators you list do not do much more than run a simple eval() followed by a beautifier. This might take several runs. It works because the majority of obfuscators do their thing and then append a function that de-obfuscates the code enough for the engine to run it. In most cases that is a simple character replacement (a kind of Caesar cipher), so an eval() is enough to get some code back, which a beautifier then makes more or less readable.
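To make the mechanism concrete, here is a small sketch of a toy packer and of the trick that undoes it. The functions `pack`, `unpack` and `run` are hypothetical and written only for this illustration; this is not the code of any of the sites mentioned, just the general shape the paragraph above describes.

```javascript
// Toy packer: shift every character code by a fixed amount (Caesar-style).
function pack(source, shift) {
  const codes = [];
  for (let i = 0; i < source.length; i++) {
    codes.push(source.charCodeAt(i) + shift);
  }
  return { codes, shift };
}

// The decoder that the packed script has to carry along with the payload.
function unpack(packed) {
  return String.fromCharCode(...packed.codes.map(c => c - packed.shift));
}

// What the shipped script effectively does: decode, then evaluate the text.
function run(packed, evaluate) {
  return evaluate(unpack(packed));
}

const original = "(function (name) { return 'hello ' + name; })";
const packed = pack(original, 3);

// Normal execution: the engine receives the decoded text via eval.
const greet = run(packed, eval);

// De-obfuscation: replace eval with a function that just records the text.
let captured = null;
run(packed, text => { captured = text; });
```

Because the decoder ships with the payload, whoever controls the evaluation step gets the clear text for free. When the decoded text is itself another packed script, the same substitution is simply repeated, which is why several runs may be needed.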

To answer your question: you can make it tougher ("tougher" in the sense that just c&p'ing it into a de-obfuscator doesn't work anymore) by using some kind of "encryption" with a password that the code gets from the server after the first round of de-obfuscation, requested via a relative path that the browser completes instead of a full path. That would require manual intervention. Build that path in a complicated and non-obvious way and you have a deterrent for the average script kiddie.

In general: you need something to de-obfuscate the script that is not in the script itself.
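A hedged sketch of that principle follows. The XOR scheme, the key string and the names `protect` and `reveal` are all assumptions made up for the example; XOR with a repeating key is not real encryption and stands in for the quoted "encryption" above.

```javascript
// XOR each character code with a repeating key (a toy scheme, not real crypto).
function xorWithKey(codes, key) {
  return codes.map((c, i) => c ^ key.charCodeAt(i % key.length));
}

// Done once, ahead of time, by the site owner.
function protect(source, key) {
  const codes = [];
  for (let i = 0; i < source.length; i++) {
    codes.push(source.charCodeAt(i));
  }
  return xorWithKey(codes, key);
}

// Done in the browser once the key has arrived from the server.
function reveal(payload, key) {
  return String.fromCharCode(...xorWithKey(payload, key));
}

const secretSource = "(function () { return 42; })";
const serverKey = "k3y-from-server"; // hypothetical; never shipped in the script
const payload = protect(secretSource, serverKey);

// In a page, the key would be requested at run time, for example with
// fetch() against a relative URL assembled in some non-obvious way.
// Here it is passed in directly so the sketch runs without a network.
const recovered = reveal(payload, serverKey);
const garbage = reveal(payload, "wrong-key");
```

Pasting only `payload` and `reveal` into an offline tool yields nothing useful, because the key is missing. Anyone who loads the page with the browser's developer tools open can still see the key request, so this only raises the effort needed, as the answer says.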

But beware: this only answers your question as asked. It makes it impossible to de-obfuscate the code by simply c&p'ing it into one of those public de-obfuscators, and nothing more. See Ira's answer for the more complex stuff.

Please be aware of the reasons to obfuscate code:

hide malicious intent/content
hide stolen code
hide bad code
a pointy haired boss/investor
other (I know what that is, but I am too polite to say)

Now, what will people think when they see your obfuscated code? That your investor insisted on it before giving you money to write that little browser game everyone loves so much?

"Reasons to obfuscate" are funny. Agreed. – Stack May 14 '16 at 00:22

score 0 · Answer 3 · answered May 13 '16 at 18:36

JavaScript is interpreted from clear text by your browser. If a browser can do it, so can you. It's the nature of the beast. There are plenty of other programming languages out there that allow you to compile/black box before distribution. If you are hell-bent on protecting your intellectual property, compile the server side data providers that your JavaScript uses.

answered by Stack

I think you confuse "execute" with "understand". Your CPU executes your binary code. It has no understanding of what the program is supposed to do. – Ira Baxter May 13 '16 at 21:49
@Ira Baxter I am confused by your comment. I mention neither "execute" nor "understand", there is no mention of a CPU... Explain please. – Stack May 14 '16 at 00:14
"If your browser can do it, so can you". You are implicitly treating the browser as an execution engine, e.g., some kind of CPU. You imply the ability to execute a program is the ability to understand it; that's just nonsense. The ability to *reason* about a program's content might lead you eventually to understand it. Execution is not "reasoning". – Ira Baxter May 14 '16 at 00:26
I'm explicit in stating that your browser has to INTERPRET JavaScript first. Are you sure we are talking about the same language? https://en.wikipedia.org/wiki/JavaScript – Stack May 14 '16 at 00:34
I found the product you are referring to https://www.semanticdesigns.com/Products/Obfuscators/JavaObfuscator.html?Home=JavaTools - `Java != JavaScript` – Stack May 14 '16 at 00:46
a) I didn't refer to any product (other than a browser) in discussing your answer b) somehow you missed the JavaScript obfuscator at that same site. c) You complain that a Java obfuscator isn't relevant. Abstractly the issues are the same. d) "INTERPRET" means execute; it does not mean "understand". Somehow you seem not to understand this. – Ira Baxter May 14 '16 at 02:19
Interpret does not mean execute, modern browsers compile before they execute. `V8 compiles JavaScript to native machine code before executing it, instead of more traditional techniques such as interpreting bytecode or compiling the whole program to machine code and executing it from a filesystem.` - from here https://en.wikipedia.org/wiki/V8_%28JavaScript_engine%29. Source obfuscation techniques are pretty much the same, you are on point here. – Stack May 14 '16 at 03:19
In most discussions about programming languages, "interpret" without qualifiers does indeed mean "execute". You can build fancier "interpreters" that JIT or whole-body compile the code before handing off interpretation to a CPU. But none of that improves "your" understanding of it. – Ira Baxter May 14 '16 at 03:45
In case of JavaScript, you are handing the source code to the client. `...the term "interpreted" is practically reserved for "software processed" languages (by virtual machine or emulator) on top of the native (i.e. hardware) processor` from here https://en.wikipedia.org/wiki/Interpreted_language – Stack May 14 '16 at 09:06

score 0 · Answer 4 · answered May 16 '16 at 11:27

No JavaScript obfuscation or protection tool can claim to make it impossible to reverse a piece of code. That said, some tools offer very simple obfuscation that is easy to reverse, while others turn your JavaScript into something that is extremely hard, to the point of being infeasible, to reverse. The most advanced product I know of that actually protects your code is Jscrambler. They have the strongest obfuscation techniques, and they add code locks and anti-debugging features that turn the process of retrieving your code into complete hell. I've used it to protect my apps and it works; it's worth checking out, IMO.

His Dudeness hath spoken. – Dominic Cerisano Jun 02 '16 at 23:05
