Add module inside cuckoo sandbox

Question

For dynamic malware analysis, I am using Cuckoo Sandbox (Automated Malware Analysis). Now I want to add new analysis modules. I have studied Cuckoo Sandbox's development documentation, but I am still unable to add my custom script for static analysis of malware samples. The Python scripts are available here.

Can anyone explain how I can add more modules or analysis scripts to Cuckoo Sandbox's processing stage? If there is an article available online, please share it.

Thanks

please, DO NOT just refer to official cuckoo docs, anyone who's concerned with the described problem has most likely already searched through them and haven't found the answer :) — nicks, Jul 14 '15 at 12:33
So you need to add [Analysis Package](http://docs.cuckoosandbox.org/en/latest/customization/packages/)? Please point what exactly documentation missing or where you're experiencing difficulties. — Serafim Suhenky, Jul 19 '15 at 21:29
1) Should I add or rather modify the existing "exe" package? 2) Where will the script be performed, client or host? If it'll be performed on the host, how can i fully access guest system in order to perform my logic? If it's performed on guest, where can I store gathered information for further extracting into report? — nicks, Jul 20 '15 at 08:51
Questions asking us to recommend or find a book, tool, software library, tutorial or other off-site resource are off-topic for Stack Overflow. Instead, describe the problem and **what has been done so far to solve it**. Questions seeking debugging help ("why isn't this code working?") *must include the desired behavior, a specific problem or error and the shortest code necessary to reproduce it* **in the question itself**. Questions without a clear problem statement are not useful to other readers. See: How to create a Minimal, Complete, and Verifiable example. — SherylHohman, Sep 03 '18 at 20:41

Raydel Miranda · Accepted Answer · 2015-08-03T19:07:02.560

First some words about concepts.

According to the documentation:

The analysis packages are a core component of Cuckoo Sandbox. They consist in structured Python classes which, when executed in the guest machines, describe how Cuckoo’s analyzer component should conduct the analysis.

So, an analysis package is responsible for performing the actions needed to process the file.

Examples (on Windows guests):

An exe needs to be executed.
A *.doc needs to be opened with Microsoft Word.
A dll needs to be executed with "C:\\WINDOWS\\system32\\rundll32.exe".
An html file is loaded using Internet Explorer.
A PDF is opened with some version of Acrobat Reader.
Etc ...

So, you write an Analysis Package to tell Cuckoo how to open or execute a file, and a Processing Module to process the file and extract the information that ends up in the report (see Reporting Modules).

If you want to perform static analysis, you don't need to write an Analysis Package, only a Processing Module. If you want to add new behavioural analysis, you need to implement both.

This answer is about writing processing modules, since your question is about static analysis.

Implementing a Cuckoo processing module

I used the latest version of the documentation. Many things in the docs were helpful; other things (like how to display the module's report in the HTML interface) I discovered myself through trial and error and by reading the code.

Implementing the module

To be a processing module, your script has to meet some requirements. Below you will see what these requirements are and how to put them together to get a processing module.

After an analysis is completed, Cuckoo will invoke all the processing modules available in the modules/processing/ directory. Every module will then be initialized and executed and the data returned will be appended in a data structure that we’ll call global container. This container is simply just a big Python dictionary that contains all the abstracted results produced by all the modules sorted by their defined keys.

The data produced by your processing module is added to the global container, so other modules (e.g. reporting modules) can access that information.

A basic processing module (let's call it simple_module) could look like this:
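To make the idea of the global container concrete, here is a minimal standalone sketch of how such a dictionary could be assembled. It does not use Cuckoo's real code: the `Processing` stand-in, the two example modules and `build_global_container` are hypothetical names invented for illustration only.

```python
# Hypothetical stand-in for Cuckoo's Processing base class (illustration only).
class Processing(object):
    key = None

    def run(self):
        raise NotImplementedError


class SimpleInfo(Processing):
    def run(self):
        self.key = "simple_info"
        return "data from simple_info"


class FileSize(Processing):
    def run(self):
        self.key = "file_size"
        return {"bytes": 1024}


def build_global_container(module_classes):
    """Run every module and store its data under the key the module chose."""
    results = {}
    for module_class in module_classes:
        module = module_class()
        data = module.run()          # run() sets module.key and returns the data
        results[module.key] = data
    return results


results = build_global_container([SimpleInfo, FileSize])
print(results["simple_info"])
```

Each module only decides its own key and return value; everything that runs later reads from the same dictionary.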

# simple_module.py
from lib.cuckoo.common.abstracts import Processing

class SimpleModule(Processing):     # A class inheriting from Processing.

    def run(self):                  # A run() method.
        self.key = "simple_info"    # The key under which the returned data is stored in the global container.
        data = "This is the data returned by simple_module."
        return data                 # The data (list, dictionary, string, etc.) that will be added to the
                                    # global container under the key specified in `self.key`.
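Since the question is about static analysis, a slightly more realistic module might inspect the sample itself. The sketch below reports the sample's size and SHA-256 digest. It assumes that the `Processing` base class exposes the path of the submitted sample as `self.file_path` (check this against your Cuckoo version); the class name `SampleDigest` and the key `sample_digest` are made up for this example. The `ImportError` fallback is only there so the sketch can be tried outside a Cuckoo checkout.

```python
import hashlib
import os

try:
    from lib.cuckoo.common.abstracts import Processing
except ImportError:
    # Fallback stub so the sketch also runs outside a Cuckoo checkout.
    class Processing(object):
        file_path = None


class SampleDigest(Processing):
    """Report the size and SHA-256 digest of the analyzed sample."""

    def run(self):
        self.key = "sample_digest"
        # Assumption: self.file_path points at the submitted sample.
        if not self.file_path or not os.path.exists(self.file_path):
            return {}
        sha256 = hashlib.sha256()
        with open(self.file_path, "rb") as sample:
            # Read in chunks so large samples are not loaded into memory at once.
            for chunk in iter(lambda: sample.read(65536), b""):
                sha256.update(chunk)
        return {
            "size": os.path.getsize(self.file_path),
            "sha256": sha256.hexdigest(),
        }
```

Your own static-analysis script would take the place of the hashing code; the only contract is to set `self.key` and return the data.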

Where to put the new module

There are several module categories. If you look at Cuckoo's directory hierarchy, you will find a directory called modules containing these subdirectories:

auxiliary -- Self-explanatory
machinery -- Modules for handling hardware and virtualization.
processing -- Modules for processing files (those you want to add)
reporting -- Modules needed for reporting the results obtained by processing modules
signatures -- It is not clear to me (I might have a different idea of what signature means).

The directory you care about is processing. That is where you put your new module.

Enabling the new module

Add a section like the following to the conf/processing.conf file:

[simple_module]
enabled = yes

How to view the result?

After the analysis raw results have been processed and abstracted by the processing modules and the global container is generated (ref. Processing Modules), it is passed over by Cuckoo to all the reporting modules available, which will make some use of it and will make it accessible and consumable in different formats.

Yes!! We need another module in order to see the output of the new processing module. The easiest way is to write the result to a file.

The Reporting Modules documentation has an example similar to the one below.

Let's implement a report for our processing module, simple_module:

# simple_report.py
import os

from lib.cuckoo.common.abstracts import Report
from lib.cuckoo.common.exceptions import CuckooReportError

class SimpleReport(Report):

    def run(self, results):                         # IMPORTANT!! The `results` parameter is the global container we saw before.
        try:
            # The processing module stored its data in the global container under the key
            # "simple_info"; here we read it back and write it to a file.
            with open(os.path.join(self.reports_path, "simple_report.txt"), "w") as report:
                report.write(results["simple_info"])
        except (KeyError, TypeError, IOError) as e:
            raise CuckooReportError("Failed to make a simple report: %s" % e)

Also you will need to enable this reporting module:

Every module should also have a dedicated section in the file conf/reporting.conf, for example if you create a module module/reporting/foobar.py you will have to append the following section to conf/reporting.conf

[simple_report]
enabled = on

Now when you run a new analysis, you will find a file called "simple_report.txt" in the storage/analyses/<analysis-number>/reports folder.

Output to the report to the file

What about HTML? I want to see the result in the browser!!

Well... that's a little more complex. If you take a look at the file modules/reporting/reporthtml.py, you will find a class ReportHtml that at some point has code like this:

try:
    tpl = env.get_template("report.html")       # Ahhhh, so cuckoo is using a template for this.
    html = tpl.render({"results": results})     # Look, the template receives the Global Container (this dude again!!!, it must be a VIP).
except Exception as e:
    raise CuckooReportError("Failed to generate HTML report: %s" % e)

try:
    with codecs.open(os.path.join(self.reports_path, "report.html"), "w", encoding="utf-8") as report:
        report.write(html)
except (TypeError, IOError) as e:
    raise CuckooReportError("Failed to write HTML report: %s" % e)

The templates are in web/templates/analysis, where you can find report.html. Reading that file, you will notice two important code blocks:

Code for tabs:

<ul class="nav nav-tabs">
    <li class="active"><a href="#overview" data-toggle="tab">Quick Overview</a></li>
    <li><a href="#static" data-toggle="tab">Static Analysis</a></li>
    {% if analysis.behavior.processes %}<li><a href="#behavior" data-toggle="tab" id="graph_hook">Behavioral Analysis</a></li>{% endif %}
    <li><a href="#network" data-toggle="tab">Network Analysis</a></li>
    <li><a href="#dropped" data-toggle="tab">Dropped Files</a></li>
    {% if analysis.procmemory %}<li><a href="#procmemory" data-toggle="tab">Process Memory</a></li>{% endif %}
    {% if analysis.memory %}<li><a href="#memory" data-toggle="tab">Memory Analysis</a></li>{% endif %}
    <li><a href="#admin" data-toggle="tab">Admin</a></li>
</ul>

And code for content (some code was omitted for brevity):

<div class="tab-content">
    <div class="tab-pane fade in active" id="overview">
        {% include "analysis/overview/index.html" %}
    </div>
    <div class="tab-pane fade" id="static">
        {% include "analysis/static/index.html" %}
    </div>
    {% if analysis.behavior.processes %}
    <div class="tab-pane fade" id="behavior">
        {% include "analysis/behavior/index.html" %}
    </div>
    {% endif %}
    ...
    ...
</div>

OK, the next step is clear: we need to add our own template. Let's proceed:

1- Create a file, web/templates/analysis/simple_module/index.html

 {{analysis.simple_info}}

In the line above, analysis points to the root of the global container dictionary, and simple_info is the key added to that dictionary by our processing module, simple_module.

The template engine replaces {{analysis.simple_info}} with the value we stored under that key in the global container. See also The Django template language: for Python programmers.

2- Update web/templates/analysis/report.html to include your templates

Add the line

<li><a href="#simple_module" data-toggle="tab">Simple Module</a></li>

to the tabs section. And the following lines to the content section:

<div class="tab-pane fade" id="simple_module">
    {% include "analysis/simple_module/index.html" %}
</div>

And... Hocus Pocus ...


It is important to note that if you only want to display the result in HTML, you don't have to implement a reporting module; just create the corresponding templates and use the respective variables.

what i needed was `analysis package`, which you didn't even add to your tutorial. sorry, but your post is completely irrelevant. — nicks, Jul 22 '15 at 05:09
@NikaGamkrelidze I don't think so. If you take a look at files inside cuckoo. you will find inside the folder `modules/processing/`, files with the name: **static.py**, **analysisinfo.py**, **network.py**, **memory.py**, etc... Processing Modules are those modules that perform the analysis. The OP wants to perform a static analysis on OLE files, so, the tutorial is correct. To analyze a file, you have to process it. — Raydel Miranda, Jul 22 '15 at 12:09
Static analysis - yes, but I created bounty on that question and specified my problem. — nicks, Jul 23 '15 at 12:35
Sorry, I'm had no idea who created the bounty on the question, that is something you can't see at first view. I will add your questions in to my answer in short. — Raydel Miranda, Jul 23 '15 at 12:45
I wondered, how should I extend dynamic analysis functionality, which involves editing `analysis` package. — nicks, Jul 23 '15 at 13:15

Raydel Miranda · Answer 2 · 2017-06-27T18:11:42.647

I wrote a separate answer for this, in order to answer Nika's questions (he/she created the bounty on the question).

How should I extend dynamic analysis functionality, which involves editing analysis package?

To answer your main question, I will first answer the questions you posted as comments:

Nika: Should I add or rather modify the existing "exe" package?

You should add a new one rather than modify the existing "exe" package; you can specify which analysis package to use on submission.

# analyzer.py
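As a rough idea of what a new analysis package could look like, here is a minimal sketch modelled on the "exe" example in the Cuckoo documentation. It assumes the guest-side base class is importable as `lib.common.abstracts.Package` and provides `self.options` and an `execute(path, args)` helper; older releases use a different interface, so verify this against your version. The class name `MyTool` is hypothetical, and the fallback stub exists only so the sketch can be run outside the guest analyzer.

```python
try:
    from lib.common.abstracts import Package
except ImportError:
    # Fallback stub so the sketch can be run outside the guest analyzer.
    class Package(object):
        def __init__(self, options=None):
            self.options = options or {}

        def execute(self, path, args):
            # The real helper would launch the process; the stub echoes the call.
            return (path, args)


class MyTool(Package):
    """Hypothetical package that runs the sample with optional arguments."""

    def start(self, path):
        # "arguments" is the option name used by the documented exe package.
        args = self.options.get("arguments", "")
        return self.execute(path, args)
```

In the releases I have seen, Windows packages live under analyzer/windows/modules/packages/, and the package is selected by its file name at submission time; treat both details as assumptions to confirm in your installation.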

...
# If no analysis package was specified at submission, we try to select
# one automatically.
        if not self.config.package:
            log.debug("No analysis package specified, trying to detect "
                      "it automagically.")
...

Nika: Where will the script be performed, client or host?

I think by "client" you meant "guest".

The scripts are executed in the guest. If you look at the code of agent.py, you will see something like this:

from SimpleXMLRPCServer import SimpleXMLRPCServer

and also:

def add_analyzer(self, data):
        """Add analyzer.
        @param data: analyzer data.
        @return: operation status.
        """
        data = data.data

        if not self._initialize():
            return False

        try:
            zip_data = StringIO()
            zip_data.write(data)

            with ZipFile(zip_data, "r") as archive:
                archive.extractall(ANALYZER_FOLDER)
        finally:
            zip_data.close()

        self.analyzer_path = os.path.join(ANALYZER_FOLDER, "analyzer.py")

        return True

These two snippets show that, first, the agent uses RPC and, second, the analyzer is copied to the target virtual machine. Together, this suggests the scripts are executed in the guest system.

In fact, there is a function that shows how it is executed:

def execute(self):
        """Execute analysis.
        @return: analyzer PID.
        """
        global ERROR_MESSAGE
        global CURRENT_STATUS

        if not self.analyzer_path or not os.path.exists(self.analyzer_path):
            return False

        try:
            proc = subprocess.Popen([sys.executable, self.analyzer_path],
                                    cwd=os.path.dirname(self.analyzer_path))
            self.analyzer_pid = proc.pid
        except OSError as e:
            ERROR_MESSAGE = str(e)
            return False

Nika: If it's performed on guest, where can I store gathered information for further extracting into report?

See my other answer: the information is always gathered by processing modules and added to the global container. From there you can access it using a reporting module.

Your analysis package should use some tool to produce the result or information you want. Then, in the processing module, you have the member self.dropped_path; you can look for the file there, process it, and add the information to the global container.

I hope this helps you get closer to what you want to achieve.
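A processing module along those lines could be sketched as follows. It walks `self.dropped_path` and collects the text of files that the tool left behind. The `.log` naming convention, the class name `ToolOutput` and the key `tool_output` are hypothetical, and the way dropped files are named and laid out on disk differs between Cuckoo versions, so adapt the filter accordingly. The fallback stub only makes the sketch runnable outside Cuckoo.

```python
import os

try:
    from lib.cuckoo.common.abstracts import Processing
except ImportError:
    # Fallback stub so the sketch also runs outside a Cuckoo checkout.
    class Processing(object):
        dropped_path = None


class ToolOutput(Processing):
    """Collect text files that the analysis package left among the dropped files."""

    def run(self):
        self.key = "tool_output"
        collected = {}
        if not self.dropped_path or not os.path.isdir(self.dropped_path):
            return collected
        for root, _dirs, files in os.walk(self.dropped_path):
            for name in sorted(files):
                # Hypothetical convention: the tool writes its output to *.log files.
                if not name.endswith(".log"):
                    continue
                with open(os.path.join(root, name), "rb") as handle:
                    collected[name] = handle.read().decode("utf-8", "replace")
        return collected
```

The returned dictionary lands in the global container under `tool_output`, where a reporting module or template can pick it up.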