
I'm working on a project that detects malware using machine learning techniques. My primary targets are Linux devices. My questions are:

  1. How can I extract data about running processes from the Linux kernel using a kernel driver? As a first proof of concept, I'd like to extract the data manually. Later on, I'd like to write a kernel driver that does it automatically and in real time.
  2. Are there any other ways to extract data about running processes, such as ProcessName, PID, UID, IS_ROOT, etc.?
akyayik
  • Your question is too broad and unclear. "Extract data"? What data? Extracted by whom (a human, a C program, a web page...)? For the second point, you can get a lot of that info even at user level via the [/proc filesystem](http://tldp.org/LDP/Linux-Filesystem-Hierarchy/html/proc.html) – kaylum Nov 14 '16 at 00:32
  • Thank you for the correction. I edited the post. For the proc file system, is it possible to export the results into a CSV or any other type of file? I think I can write a Python script to do that, but I'm wondering if there is a shortcut in bash for it. @kaylum – akyayik Nov 14 '16 at 01:35
  • Sorry, but Stack Overflow is about helping people fix their existing code, not researching, spec(k)ing it, developing and testing. Given your Qs, you need to spend more time defining your end goal and writing some code that tries to achieve that. When you get stuck after that, post a Q following the guidelines for an [MCVE](http://stackoverflow.com/help/mcve). Good luck. – shellter Nov 14 '16 at 03:27
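
For the /proc route mentioned in the comments, a rough shell sketch might look like the following. It assumes the usual tab-separated layout of /proc/<pid>/status (Name:, Pid:, Uid: lines, with the real UID as the first Uid value). The function name proc_to_csv and its optional directory argument are made up for illustration, and process names containing commas are not escaped.

```shell
# Hypothetical helper: print "pid,name,uid,is_root" for every process found
# under a proc-style directory (defaults to /proc; the argument only exists
# so the function can be pointed at a test directory).
proc_to_csv() {
    root="${1:-/proc}"
    echo "pid,name,uid,is_root"
    for status in "$root"/[0-9]*/status; do
        [ -r "$status" ] || continue    # skip processes that already exited
        awk -F'\t' '
            $1 == "Name:" { name = $2 }
            $1 == "Pid:"  { pid = $2 }
            $1 == "Uid:"  { uid = $2 }  # first value is the real UID
            END { if (pid != "") print pid "," name "," uid "," (uid == 0 ? 1 : 0) }
        ' "$status" 2>/dev/null
    done
}
```

Something like `proc_to_csv > processes.csv` would then write one row per process that is visible at the time of the call.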

1 Answer


To do this from user space:

ps -U <username/UID> | tr -s ' '| tr ' ' ','| cut -d ',' -f2,5 > out.csv
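
To see why fields 2 and 5 end up being the PID and the command, the same three stages can be run on a single sample line. This sketch assumes the default `ps` layout (PID TTY TIME CMD) with padding spaces in front of the PID. The sample line is made up, and ps_to_csv is just an illustrative name.

```shell
# Hypothetical helper: the same three stages as the one-liner, reading stdin.
ps_to_csv() {
    tr -s ' ' |        # squeeze each run of spaces into a single space
    tr ' ' ',' |       # turn every remaining space into a comma
    cut -d ',' -f2,5   # split on commas and keep fields 2 (PID) and 5 (CMD)
}

# Made-up line in default `ps` format; note the leading space before the PID.
sample=' 1234 pts/0    00:00:01 bash'
printf '%s\n' "$sample" | ps_to_csv
```

Because of the leading space, field 1 is empty, so the PID lands in field 2. If a PID is wide enough to leave no padding, or a command name contains spaces, the field numbers shift, so this is only suitable for a quick proof of concept.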

From kernel space, as a module:

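#include <linux/cred.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/rcupdate.h>
#include <linux/sched.h>
#include <linux/sched/signal.h> /* for_each_process() on kernels >= 4.11; drop on older ones */

/* UID passed at load time; not used for filtering yet. */
static int uid = 0;

static int __init procx_init(void)
{
    struct task_struct *task;

    /* The task list must be walked under RCU protection. */
    rcu_read_lock();
    for_each_process(task)
        /* cred->uid is a kuid_t on current kernels, so convert it before printing. */
        pr_info("uid=%u, pid=%d, command=%s\n",
                from_kuid(&init_user_ns, task_uid(task)),
                task->pid, task->comm);
    rcu_read_unlock();
    return 0;
}

static void __exit procx_exit(void)
{
    pr_info("procx destructor\n");
}

module_init(procx_init);
module_exit(procx_exit);
module_param(uid, int, 0);

MODULE_AUTHOR("sundeep471@gmail.com");
MODULE_DESCRIPTION("Print process Info");
MODULE_LICENSE("GPL");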

I didn't filter by UID here, but you can pass it as a module parameter, or pass it at runtime to trigger a kthread.

Sun
  • I think this is the answer that I'm looking for. I'm trying to get lstart,cmd,pid,ppid,uid,pgrp,pcpu,%mem,vsize,share,cmin_flt,time,size,ruser from user space. However, since I'm new to this, could you please explain the first approach that you mentioned? For example, how are you printing the PID and cmd? Is it -s and -d? – akyayik Nov 14 '16 at 04:56
  • I think I made it. Here is how :) ps -eo cmd,pid,ppid,uid,pgrp,pcpu,%mem,vsize,share,cmin_flt,size,ruser | tr -s ' '| tr ' ' ',' > cleandata.csv – akyayik Nov 14 '16 at 05:40
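
One caveat with replacing every space by a comma: a command line that contains arguments gets split across several CSV columns. A possible workaround, sketched here with a hypothetical helper named ps_cols_to_csv, is to put the command last and treat only the first n columns as separate fields. It assumes that none of those first n columns contains whitespace.

```shell
# Hypothetical helper: convert whitespace-aligned `ps` output to CSV, treating
# the first n columns as single fields and the rest of the line as one quoted field.
ps_cols_to_csv() {
    awk -v n="$1" '{
        line = ""
        for (i = 1; i <= n; i++) line = line $i ","
        rest = $0
        # strip the first n fields and their surrounding whitespace from the raw line
        for (i = 1; i <= n; i++) sub(/^[ \t]*[^ \t]+[ \t]*/, "", rest)
        gsub(/"/, "\"\"", rest)    # CSV-escape embedded double quotes
        print line "\"" rest "\""
    }'
}

# Made-up sample line: pid, ppid, uid, then a command with arguments.
printf '%s\n' '  812     1  1000 /usr/bin/python3 app.py --port 8080' | ps_cols_to_csv 3
```

With procps `ps`, something like `ps -eo pid=,ppid=,uid=,cmd= | ps_cols_to_csv 3 > cleandata.csv` should then keep each command line in a single column. The trailing `=` on each keyword suppresses the header row.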