Since I never found a good answer to this on the web, I spent many hours working on this problem, and I hope I can spare someone the same pain. By default, lsof prints horizontal, column-aligned output in which some values are missing on some rows, which makes it impossible to parse reliably.
To get lsof output in a format that can be parsed, use the command:
lsof -F pcuftDsin
Adding -F prints the results vertically, one field per line. Let me explain each part.
lsof
: gets a list of all open files by process
-F
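: switches to field output meant for other programs to parse: vertical instead of horizontal, one field per line, each prefixed by the letter that identifies it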
p
: will prefix the PID or (Process ID) column
c
: will prefix the COMMAND or (Process Name) column
u
: will prefix the user ID (UID) that the process is running under
f
: will prefix the File Descriptor column
t
: will prefix the type column
D
: will prefix the Device column
s
: will prefix the file size from the SIZE/OFF column (the offset is a separate field, o)
i
: will prefix the Node column (the inode number)
n
: will prefix the Name or (File Path)
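The mapping above can be kept in one place in code, so the -F argument and the meaning of each prefix stay in sync. The following is only a minimal sketch: the class name LsofFields, its methods, and the column labels are made up for illustration and are not part of lsof.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; names and labels are illustrative only.
class LsofFields {
    // Field letters in the order used in this answer, mapped to a column label.
    static final Map<Character, String> FIELDS = new LinkedHashMap<>();
    static {
        FIELDS.put('p', "PID");
        FIELDS.put('c', "COMMAND");
        FIELDS.put('u', "USER");
        FIELDS.put('f', "FD");
        FIELDS.put('t', "TYPE");
        FIELDS.put('D', "DEVICE");
        FIELDS.put('s', "SIZE");
        FIELDS.put('i', "NODE");
        FIELDS.put('n', "NAME");
    }

    // Joins the letters into the single argument that follows -F.
    static String fieldArgument() {
        StringBuilder sb = new StringBuilder();
        for (char letter : FIELDS.keySet()) {
            sb.append(letter);
        }
        return sb.toString();
    }

    // Turns one output line into "LABEL=value"; the first character is the field letter.
    static String describe(String line) {
        if (line == null || line.isEmpty()) {
            return "";
        }
        String label = FIELDS.getOrDefault(line.charAt(0), "UNKNOWN");
        return label + "=" + line.substring(1);
    }

    public static void main(String[] args) {
        System.out.println("lsof -F " + fieldArgument());
        System.out.println(describe("tREG"));
        System.out.println(describe("n/usr/share/icu/icudt64l.dat"));
    }
}
```

Because the map is a LinkedHashMap, the letters come out in insertion order, so fieldArgument() yields the same pcuftDsin string used in the command.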
Sample output of lsof -F pcuftDsin:
p3026
ccom.apple.appkit.xpc.openAndSavePanelService
u501
fcwd
tDIR
D0x1000004
s704
i2
n/
ftxt
tREG
D0x1000004
s94592
i1152921500312434319
n/System/Library/Frameworks/AppKit.framework/Versions/C/XPCServices/com.apple.appkit.xpc.openAndSavePanelService.xpc/Contents/MacOS/com.apple.appkit.xpc.openAndSavePanelService
ftxt
tREG
D0x1000004
s27876
i45156619
n/Library/Preferences/Logging/.plist-cache.usI0gbvW
ftxt
tREG
D0x1000004
s28515184
i1152921500312399135
n/usr/share/icu/icudt64l.dat
ftxt
tREG
D0x1000004
s239648
i31225967
n/private/var/db/timezone/tz/2019c.1.0/icutz/icutz44l.dat
ftxt
tREG
D0x1000004
s3695464
i1152921500312406201
n/System/Library/CoreServices/SystemAppearance.bundle/Contents/Resources/SystemAppearance.car
ftxt
tREG
D0x1000004
s136100
i38828241
n/System/Library/Caches/com.apple.IntlDataCache.le.kbdx
As you can see, each line is prefixed with the letter assigned above. Another important thing to note is that the process ID, process name and user are printed only once per process, followed by the set of files that process has open. For database storage, I needed these fields on every row that was printed. I was working on a Java project, so the code I used to parse it is shown below:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class LsofParser {
    public static void main(String[] args) throws IOException, InterruptedException {
        // Start lsof in field-output mode; each letter after -F requests one field.
        // Failing to start now throws instead of leaving proc null for a later NullPointerException.
        Process proc = new ProcessBuilder("lsof", "-F", "pcuftDsin")
                // Pass lsof's warnings through so an unread stderr pipe cannot fill up and block.
                .redirectError(ProcessBuilder.Redirect.INHERIT)
                .start();

        // Process-level fields: printed once per process, reused for each of its files.
        String processId = "";
        String processName = "";
        String user = "";
        // File-level fields: reset to the placeholder "null" after each file is printed.
        String fd = "null";
        String type = "null";
        String device = "null";
        String sizeOff = "null";
        String node = "null";
        String file;

        try (BufferedReader reader = new BufferedReader(new InputStreamReader(proc.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // The first character of every line identifies the field.
                if (line.startsWith("p")) {
                    processId = line;
                } else if (line.startsWith("c")) {
                    processName = line;
                } else if (line.startsWith("u")) {
                    user = line;
                } else if (line.startsWith("f")) {
                    fd = line;
                } else if (line.startsWith("t")) {
                    type = line;
                } else if (line.startsWith("D")) {
                    device = line;
                } else if (line.startsWith("s")) {
                    sizeOff = line;
                } else if (line.startsWith("i")) {
                    node = line;
                } else if (line.startsWith("n")) {
                    // The name is treated as the last field of a file, so print the row here.
                    file = line;
                    System.out.println(processId + "," + processName + "," + user + "," + fd + ","
                            + type + "," + device + "," + sizeOff + "," + node + "," + file);
                    fd = "null";
                    type = "null";
                    device = "null";
                    sizeOff = "null";
                    node = "null";
                }
            }
        }
        proc.waitFor();
    }
}
Output:
p94484,ccom.apple.CoreSimulator.CoreSim,u501,ftxt,tREG,D0x1000004,s239648,i31225967,n/private/var/db/timezone/tz/2019c.1.0/icutz/icutz44l.dat
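One caveat with joining the fields by commas: a file name can itself contain a comma, which would shift the columns when the line is read back. A possible guard, sketched here with a made-up CsvQuote class and the usual CSV convention of doubling embedded quotes (an assumption about the target format, not something lsof requires), is to quote such values:

```java
// Hypothetical helper; not part of the original code or of lsof.
class CsvQuote {
    // Wraps a value in double quotes if it contains a comma, a quote or a line break.
    static String quote(String value) {
        boolean needsQuotes = value.contains(",") || value.contains("\"")
                || value.contains("\n") || value.contains("\r");
        if (!needsQuotes) {
            return value;
        }
        // Embedded quotes are doubled, as in common CSV dialects.
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Builds one comma-separated row from already-parsed fields.
    static String row(String... values) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(quote(values[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(row("p94484", "tREG", "n/tmp/report,final.txt"));
    }
}
```

Values without special characters pass through unchanged, so rows like the one above keep their current form.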
Because I was storing the output, I needed the empty fields to show something. I used null, but you can use any default text, or just an empty string, for the missing fields; not all fields will be populated for every file. If anyone has suggestions on how I could improve the code's performance, I am all ears.
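One possible refinement, offered as a sketch rather than a measured improvement, is to separate parsing from running the process so the logic can be tested on a fixed list of lines. The version below strips the prefix letter, stores each open file as a map keyed by that letter, and leaves the choice of placeholder to the caller. The class name LsofFieldParser is hypothetical, and it relies on the same assumption as the code above: that the n field is the last one printed for each file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; not part of lsof or of the original code.
class LsofFieldParser {
    // Parses lsof -F lines into one map per open file, keyed by field letter.
    static List<Map<Character, String>> parse(Iterable<String> lines) {
        List<Map<Character, String>> rows = new ArrayList<>();
        Map<Character, String> process = new HashMap<>(); // p, c, u: shared by the files that follow
        Map<Character, String> file = new HashMap<>();    // f, t, D, s, i, n: one open file
        for (String line : lines) {
            if (line.isEmpty()) {
                continue;
            }
            char key = line.charAt(0);
            String value = line.substring(1);
            switch (key) {
                case 'p':
                    // A new process starts: forget the previous command and user.
                    process.clear();
                    file.clear();
                    process.put(key, value);
                    break;
                case 'c':
                case 'u':
                    process.put(key, value);
                    break;
                case 'n': {
                    // Assumed to be the last field of a file: emit the combined row.
                    file.put(key, value);
                    Map<Character, String> row = new HashMap<>(process);
                    row.putAll(file);
                    rows.add(row);
                    file.clear();
                    break;
                }
                default:
                    file.put(key, value);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "p3026", "cexample", "u501",
                "fcwd", "tDIR", "n/",
                "ftxt", "tREG", "s94592", "n/usr/share/icu/icudt64l.dat");
        for (Map<Character, String> row : parse(sample)) {
            // Missing fields get whatever default the caller prefers.
            System.out.println(row.get('p') + "," + row.getOrDefault('s', "null") + "," + row.get('n'));
        }
    }
}
```

With a real process, the lines could come from the BufferedReader that wraps the process output, for example by collecting reader.lines() into a list first.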