Track Data Input Through Application Code and System Libraries

Question

I am a security dude, and I have done extensive research on this one, and at this point I am looking for guidance on where to go next.

Also, sorry for the long post, I bolded the important parts.

What I am trying to do at a high level is simple: I am trying to input some data into a program, and "follow" this data, and track how it's processed, and where it ends up.

For example, if I input my login credentials to FileZilla, I want to track every memory reference that accesses them, and initiate traces to follow where that data goes, which libraries it is sent to, and bonus points if I can even correlate it down to the network packet.

Right now I am focusing on the Windows platform, and I think my main question comes down to this: Are there any good APIs to remote control a debugger that understand Windows forms and system libraries?

Here are the key attributes I have found so far:

The name of this analysis technique is "Dynamic Taint Analysis"
It's going to require a debugger or a profiler
Inspect.exe is a useful tool to find Windows UI elements that take input
The Windows automation framework in general may be useful
Automating debuggers seems to be a pain. The IDebugClient interface allows for richer data, but debuggers like IDA Pro or even CheatEngine have better memory analysis utilities
I am going to need to place memory breakpoints, and track the references and registers that are associated with the input.

Here is a collection of tools I have tried:

I have played with all the following tools: WinDBG (awesome tool), IDA Pro, CheatEngine, x64dbg, vdb (python debugger), Intel's PIN, Valgrind, etc...

Next, I tried a few Dynamic Taint Analysis tools, but they don't support detecting .NET components or the other conveniences that the Windows debugging framework provides natively through utilities like Inspect.exe.

I then tried writing my own C# program using the IDebugClient interface, but it's poorly documented, and the best project I could find was from this fellow, and is 3 years old: C# app to act like WINDBG's "step into" feature

I am willing to contribute code to an existing project that fits this use case, but at this point I don't even know where to start.

I feel like dynamic program analysis and debugging tools as a whole could use some love... I feel kind of stuck, and don't know where to go from here. There are so many different tools and approaches to solving this problem, and all of them are lacking in one way or another.

Anyway, I appreciate any direction or guidance. If you made it this far thanks!!

-Dave

I am answering my own question in case anyone else comes across this post in the future. Essentially the concept here isn't done by any software utility that I know of today, but the following tools can lead any other people looking for similar functionality to fruitful results. Omniscient Debuggers: https://omniscientdebugger.github.io/ — Dave, Aug 07 '17 at 00:14
Combining data from the omniscient debugger (wrapping your function calls for tracing), correlating those calls with system calls (API Monitor), and then correlating them with the specific socket used for network communication could create a full-featured path trace for software, from the live running program all the way through the Windows system and down to the network wire. It might come in handy to automate this, but these tools taken separately provide the data sources. — Dave, Aug 07 '17 at 00:21
Links for API Monitor and ProcMon got cut off: APIMonitor: http://www.rohitab.com/apimonitor ProcMon: https://learn.microsoft.com/en-us/sysinternals/downloads/procmon — Dave, Aug 07 '17 at 00:36

Ira Baxter · Answer 1 · 2019-01-20T11:26:15.557

If you insist on doing this at runtime, Valgrind or Pin might be your best bet. As I understand it (having never used them), you can configure these tools to interpret each machine instruction in an arbitrary way. You want to trace dataflows through machine instructions to track tainted data (reads of such data, followed by writes to registers or condition code bits). A complication will likely be tracing the origin of an offending instruction back to a program element (DLL? Link module? Named subroutine?) so that you can complain appropriately.

This is a task you might succeed at as an individual, in terms of effort.

This should work for applications.
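To make the per-instruction idea concrete, here is a minimal sketch of taint propagation over a made-up instruction trace. It is not how Valgrind or Pin expose instructions; the trace format, the location names such as "mem:0x1000", and the module name crypt.dll are all assumptions invented for illustration. A real tool would get this information from its instrumentation callbacks.

```python
from dataclasses import dataclass, field


@dataclass
class TaintTracker:
    """Tracks which registers and memory locations currently hold tainted data."""
    tainted: set = field(default_factory=set)
    log: list = field(default_factory=list)

    def mark_source(self, location):
        # Called when input (e.g. a typed password) lands in a location.
        self.tainted.add(location)

    def step(self, module, op, dest, sources):
        """Model one toy instruction: dest = op(sources)."""
        if any(src in self.tainted for src in sources):
            # A read of tainted data followed by a write taints the destination.
            self.tainted.add(dest)
            # Record the owning module so the flow can be attributed later.
            self.log.append((module, op, dest))
        else:
            # Overwriting a location with clean data removes its taint.
            self.tainted.discard(dest)


def run_trace(trace, sources):
    """Replay a list of (module, op, dest, sources) tuples from given taint sources."""
    tracker = TaintTracker()
    for location in sources:
        tracker.mark_source(location)
    for module, op, dest, srcs in trace:
        tracker.step(module, op, dest, srcs)
    return tracker


# Hypothetical trace: every name below is illustrative only.
TRACE = [
    ("app.exe", "mov", "eax", ["mem:0x1000"]),            # load input
    ("app.exe", "xor", "ebx", ["ebx"]),                   # ebx is clean
    ("app.exe", "add", "ebx", ["ebx", "eax"]),            # ebx becomes tainted
    ("crypt.dll", "mov", "mem:0x2000", ["ebx"]),          # stored by a library
    ("app.exe", "mov", "eax", ["imm"]),                   # eax overwritten, now clean
    ("ws2_32.dll", "send", "net:socket", ["mem:0x2000"]), # reaches the network
]

if __name__ == "__main__":
    result = run_trace(TRACE, ["mem:0x1000"])
    for module, op, dest in result.log:
        print(f"{module}: {op} wrote tainted data to {dest}")
```

The log is where the attribution problem mentioned above shows up: the sketch assumes each instruction already carries its module name, which in practice you would have to recover from the address of the instruction.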

I suspect one of your problems will be tracing where the data goes in the OS. That's a lot harder, although the same principle applies; your difficulty will be getting the OS supplier to let you track instructions executed in the OS.

Doing this as runtime analysis has the downside that if a malicious application doesn't do anything bad on your particular execution, you won't find any problems. That's the classic shortcoming of dynamic analysis.

You could consider tracking the data at the source code level using classic compiler techniques. This requires that you have access to all the source code that might be involved (that's actually really hard if your application depends on a wide variety of libraries), that you have tools that can parse and track dataflows through source modules, and that these tools talk to each other for different languages (assembler, C, Java, SQL, HTML, even CSS...).

As static analysis, this has the chance of detecting an undesired dataflow no matter which execution occurs. Turing limitations mean that you likely cannot detect all such issues. That's the shortcoming of static analysis.
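As a rough illustration of the source-level approach, the sketch below uses Python's standard ast module to find variables that may hold data derived from a taint source. It is a toy under heavy assumptions: it only looks at plain assignments in a single Python module, it ignores control flow entirely, and the function names in the sample (read_password, hash_it, build) are hypothetical. A real tool would need to handle calls, aliasing, and multiple languages.

```python
import ast


def _expr_is_tainted(expr, tainted, taint_sources):
    """True if expr mentions a tainted variable or calls a taint source."""
    for sub in ast.walk(expr):
        if isinstance(sub, ast.Name) and sub.id in tainted:
            return True
        if (isinstance(sub, ast.Call)
                and isinstance(sub.func, ast.Name)
                and sub.func.id in taint_sources):
            return True
    return False


def tainted_variables(source, taint_sources):
    """Return names that may hold tainted data on some execution path."""
    tree = ast.parse(source)
    tainted = set()
    changed = True
    # Repeat until no new names are added (a simple fixed point).
    while changed:
        changed = False
        for node in ast.walk(tree):
            if not isinstance(node, ast.Assign):
                continue
            if not _expr_is_tainted(node.value, tainted, taint_sources):
                continue
            for target in node.targets:
                for name in ast.walk(target):
                    if isinstance(name, ast.Name) and name.id not in tainted:
                        tainted.add(name.id)
                        changed = True
    return tainted


# Hypothetical program under analysis; it is parsed, never executed.
SAMPLE = '''
password = read_password()
digest = hash_it(password)
banner = "hello"
if debug:
    packet = build(digest)
else:
    packet = build(banner)
'''

if __name__ == "__main__":
    print(sorted(tainted_variables(SAMPLE, {"read_password"})))
```

Note that packet is reported regardless of which branch would run, which is the advantage over a single dynamic trace. The price is over-approximation: the sketch would still flag packet even if the debug branch could never execute.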

Building your own tools, or even integrating individual ones, to do this is likely outside what you can reasonably do as an individual. You'll need to find a uniform framework for building such tools. [Check my bio for one].
