
I have a Python script that performs a very useful task on a large network. It runs lsof -iTCP -F with a few other options to dump all listening TCP sockets. Getting argv[0] this way is not a problem. To get the full argv, though, I then have to run ps, match up the PIDs, and merge the full argv from ps into the record created by lsof. This feels needlessly complex, and for 10000+ hosts it is very slow to merge this data in Python.

Is there any way to show the full argv value with lsof alone? I have read the manual and couldn't find anything, so I am not too hopeful. Sure, I could write a patch for lsof, but then I'd have to deploy it to 10000+ systems, and that's a non-starter at this point.

Also, if anyone has a clever way to do the processing in Python so that merging the data doesn't take 10 minutes, I'd love to know. Currently, I load all the lsof and ps data into a dict keyed by (ip, pid) and merge them there. I then build a second dict from the merged one, keyed by (ip, port). This is really slow because the first two steps require iterating over all the lsof data. This is probably not a question in its own right, but I figured I'd throw it in here. My only idea at this point is to count the processors and spawn N subprocesses, each parsing a chunk of the data and returning its result to the parent.
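As a rough illustration of the merge described above, here is a minimal sketch that indexes the ps data once and then builds the (ip, port) dict in a single pass over the lsof data. The function name `merge_records` and the tuple layouts are assumptions made for the example, not the actual script's data model.

```python
def merge_records(lsof_rows, ps_rows):
    """Build an (ip, port) -> record dict in one pass over the lsof data.

    Assumed (hypothetical) input layouts:
      lsof_rows: iterable of (ip, pid, port, argv0) tuples
      ps_rows:   iterable of (ip, pid, full_argv) tuples
    """
    # Index the ps data once so every later lookup is a dict access.
    argv_by_proc = {(ip, pid): argv for ip, pid, argv in ps_rows}

    by_port = {}
    for ip, pid, port, argv0 in lsof_rows:
        # Fall back to argv[0] if ps had no entry (e.g. the process exited).
        full_argv = argv_by_proc.get((ip, pid), argv0)
        by_port[(ip, port)] = {"pid": pid, "argv": full_argv}
    return by_port
```

Whether this helps depends on where the time actually goes; if parsing the raw lsof output dominates, splitting the hosts across worker processes may still be worth trying.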

adam

1 Answer


If you know the PID of the process (e.g. 12345), you can get the entire argv array by reading the special file /proc/12345/cmdline. It contains the argv entries separated by NUL (\0) characters.
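A minimal Python sketch of reading that file, assuming a Linux host with /proc mounted. The helper name `read_argv` and its `proc_root` parameter are made up for illustration (the parameter only makes the function easy to test).

```python
def read_argv(pid, proc_root="/proc"):
    """Return the argv list for pid, or None if cmdline cannot be read."""
    try:
        with open(f"{proc_root}/{pid}/cmdline", "rb") as fh:
            raw = fh.read()
    except OSError:
        # The process may have exited, or we may lack permission.
        return None
    if not raw:
        # Empty cmdline, as seen for kernel threads.
        return []
    # Drop the single trailing NUL so the split has no empty last element.
    if raw.endswith(b"\0"):
        raw = raw[:-1]
    return [arg.decode(errors="replace") for arg in raw.split(b"\0")]
```

For example, `read_argv(12345)` would return the argument list of PID 12345 if that process exists and is readable.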

BingsF
  • Right, or I can run ps. I meant using just one invocation of lsof. Looks like the answer is no. – adam Feb 23 '16 at 08:06
  • Ah, I see. I don't think that's supported, but reading the cmdline file at least ought to be faster than invoking n separate `ps` processes. – BingsF Feb 23 '16 at 20:21