Situation: I am scanning a directory using NtQueryDirectoryFile(..., FileBothDirectoryInformation, ...). In addition to the data returned by this call, I need the security data (of the kind typically returned by GetKernelObjectSecurity) and the list of alternate streams (NtQueryInformationFile(..., FileStreamInformation)).
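For reference, a minimal sketch of the enumeration side described above. It assumes the full NT declarations (NtQueryDirectoryFile, FILE_BOTH_DIR_INFORMATION, STATUS_NO_MORE_FILES, ...) come from the WDK's ntifs.h or the phnt headers, and the helper name EnumerateDirectory is made up for illustration:

```cpp
// Sketch only: enumerate one directory with NtQueryDirectoryFile.
// Assumes NT declarations (NtQueryDirectoryFile, FILE_BOTH_DIR_INFORMATION,
// STATUS_NO_MORE_FILES, ...) from the WDK's ntifs.h or the phnt headers.
#include <windows.h>
#include <string>

void EnumerateDirectory(HANDLE hDir)   // hDir: handle opened with FILE_LIST_DIRECTORY
{
    BYTE buffer[64 * 1024];
    IO_STATUS_BLOCK iosb;
    BOOLEAN restart = TRUE;

    for (;;)
    {
        NTSTATUS status = NtQueryDirectoryFile(
            hDir, nullptr, nullptr, nullptr, &iosb,
            buffer, sizeof(buffer),
            FileBothDirectoryInformation,
            FALSE,      // fill the buffer with as many entries as fit
            nullptr,    // no name filter
            restart);
        if (status == STATUS_NO_MORE_FILES)
            break;
        if (!NT_SUCCESS(status))
            return;     // real code: report the error
        restart = FALSE;

        // Walk the variable-length FILE_BOTH_DIR_INFORMATION entries.
        auto* info = reinterpret_cast<FILE_BOTH_DIR_INFORMATION*>(buffer);
        for (;;)
        {
            std::wstring name(info->FileName, info->FileNameLength / sizeof(WCHAR));
            // ... record attributes/sizes here, queue 'name' for the extra queries ...
            if (info->NextEntryOffset == 0)
                break;
            info = reinterpret_cast<FILE_BOTH_DIR_INFORMATION*>(
                reinterpret_cast<BYTE*>(info) + info->NextEntryOffset);
        }
    }
}
```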

Problem: To retrieve the security and alternate-stream info I need to open (and close) each file. In my tests this alone slows the operation down by a factor of 3. Adding the GetKernelObjectSecurity and NtQueryInformationFile calls slows it down by another factor of 4 (making it 12x overall).
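To make the cost concrete, this is roughly the per-file pattern being measured: one open, two queries, one close, all synchronous. A sketch under the same header assumptions as above; the helper name QueryFileMetadata and the fixed buffer sizes are illustrative, not tuned:

```cpp
// Sketch only: the per-file work that causes the slowdown (same header
// assumptions as the enumeration sketch). 'name' is a directory entry,
// opened relative to the directory handle via OBJECT_ATTRIBUTES.RootDirectory.
NTSTATUS QueryFileMetadata(HANDLE hDir, const UNICODE_STRING& name)
{
    OBJECT_ATTRIBUTES oa;
    InitializeObjectAttributes(&oa, const_cast<UNICODE_STRING*>(&name),
                               OBJ_CASE_INSENSITIVE, hDir, nullptr);

    IO_STATUS_BLOCK iosb;
    HANDLE hFile;
    // READ_CONTROL for the security descriptor, FILE_READ_ATTRIBUTES for
    // FileStreamInformation; no data access is requested.
    NTSTATUS status = NtOpenFile(&hFile,
        READ_CONTROL | FILE_READ_ATTRIBUTES | SYNCHRONIZE,
        &oa, &iosb,
        FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
        FILE_SYNCHRONOUS_IO_NONALERT | FILE_OPEN_FOR_BACKUP_INTENT);
    if (!NT_SUCCESS(status))
        return status;

    BYTE sd[4096];
    ULONG sdLen;
    status = NtQuerySecurityObject(hFile,
        OWNER_SECURITY_INFORMATION | GROUP_SECURITY_INFORMATION |
        DACL_SECURITY_INFORMATION,
        sd, sizeof(sd), &sdLen);
    // real code: retry with a buffer of sdLen bytes on STATUS_BUFFER_TOO_SMALL

    BYTE streams[8192];
    status = NtQueryInformationFile(hFile, &iosb, streams, sizeof(streams),
                                    FileStreamInformation);
    // real code: walk the FILE_STREAM_INFORMATION entries, grow the buffer
    // on STATUS_BUFFER_OVERFLOW

    NtClose(hFile);
    return status;
}
```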

Question: Is there a better/faster way to get this information (either by opening files faster or by avoiding the per-file open altogether)?

Ideas: If the target file system is local, I could access the volume directly and (knowing the NTFS/FAT/etc. on-disk details) extract the info from the raw data. But that isn't going to work for remote file systems.

C.M.
  • What about `GetNamedSecurityInfo()`, does it still open the file internally? – zett42 Jan 26 '18 at 22:54
  • @zett42 - of course. `GetNamedSecurityInfo` is just a shell around `GetKernelObjectSecurity`, which is a shell around `NtQuerySecurityObject`. Opening a handle to the object is mandatory for querying its security from user mode; there is no other way. – RbMm Jan 27 '18 at 01:05
  • Maybe there is a way to get this info in bulk, i.e. for a bunch of files in one request? – C.M. Jan 27 '18 at 01:06
  • No, you need to open a separate file handle for every file; there is no other way and no bulk open. The same goes for scanning a folder - you also need to open its handle (directly or indirectly). However, you can do an asynchronous scan (open files in asynchronous mode): after querying some folder, don't wait for the results (handle them in a callback) but continue scanning other folders/files. On an SSD this can speed the process up a lot. – RbMm Jan 27 '18 at 01:28
  • @RbMm So the only way is to overlap these requests. But the problem is that `NtOpenFile`/`NtQueryInformationFile`/`NtQuerySecurityObject` are all synchronous (unlike `NtQueryDirectoryFile`). I suspect it is possible to issue the underlying IRPs on my own, but I have no idea how to do it (my code runs in user mode). – C.M. Jan 27 '18 at 04:01
  • @RbMm Something isn't right here... I use about 70 threads to perform this scan; each file is processed in a separate thread. I.e. I already overlap the related operations (just not in the most efficient way) -- opening a file (if it is a simple operation) shouldn't slow me down by a factor of 3. Which means it probably does something expensive (like reading data from disk) -- I need to make this operation as cheap as possible. Maybe there is some combination of flags that will tell the OS I'll read only metadata? – C.M. Jan 27 '18 at 18:32
  • I think there is some problem in your implementation which is invisible here. I don't think it should slow things down by 3-4x if there are no errors in the logic. 70 threads also seems like too many; the thread count should equal the CPU count in the processor group. Some time ago I implemented a tool for fast searching of strings/bytes in files. It uses asynchronous I/O both for the folder-enumeration context and for reading files, but opening files is of course a synchronous operation. On NVMe disks I got the best speed when processing ~32 files/folders concurrently - but that is not the thread count, it is the I/O count. – RbMm Jan 27 '18 at 19:14
  • @RbMm I understand that having too many threads is bad, but I have no choice here -- all the calls (with the exception of `NtQueryDirectoryFile`) are synchronous. The logic is very simple (the storage device is a NetApp on a local 1Gb network): get the list of files in a given directory, and for each file create a work item (to be picked up by a thread from a thread pool). Simple enumeration runs at ~20k files/second. – C.M. Jan 27 '18 at 20:07
  • ... if I add open-file + close-file (and nothing else) to the work-item processing logic, overall performance drops ~3x, to 6.5k files/second. – C.M. Jan 27 '18 at 20:08
  • If the files are opened over the network, that really can slow down every operation seriously. But there is still no point in having more threads than the CPU count (in the processor group); extra threads just execute sequentially. So create a thread pool with a fixed thread count (= CPU core count) and queue tasks (file names) to that pool (see the sketch after these comments). – RbMm Jan 27 '18 at 20:33
  • Alas, because I am forced to use synchronous functions, a thread that calls them ends up simply sleeping (waiting for a response) most of the time. If I reduce the thread count to 8 (the number of cores), then instead of 70 simultaneously pending requests I end up with only 8. I checked, and overall performance drops pretty badly. – C.M. Jan 27 '18 at 20:48
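A minimal sketch of the "fixed pool + queued file tasks" approach discussed in the comments above, using the Vista+ Win32 thread-pool API. Because the per-file calls block on the network, the pool maximum here is deliberately set above the core count so enough requests stay in flight; the limits and the QueryFileMetadata helper are assumptions for illustration, not measured values:

```cpp
// Sketch only: fixed pool + queued per-file tasks, Vista+ thread-pool API.
// QueryFileMetadata is the hypothetical helper from the sketch above; the
// pool limits are illustrative (what matters is in-flight I/O, not cores).
#include <windows.h>
#include <winternl.h>

NTSTATUS QueryFileMetadata(HANDLE hDir, const UNICODE_STRING& name); // see above

struct FileTask
{
    HANDLE hDir;            // directory handle the name is relative to
    UNICODE_STRING name;    // owned copy of one directory entry's name
};

static VOID CALLBACK FileWorkCallback(PTP_CALLBACK_INSTANCE, PVOID context)
{
    FileTask* task = static_cast<FileTask*>(context);
    QueryFileMetadata(task->hDir, task->name);
    delete task;            // real code: also free the name buffer
}

int main()
{
    PTP_POOL pool = CreateThreadpool(nullptr);
    SetThreadpoolThreadMinimum(pool, 4);
    SetThreadpoolThreadMaximum(pool, 64);   // blocking network calls: allow > core count

    TP_CALLBACK_ENVIRON env;
    InitializeThreadpoolEnvironment(&env);
    SetThreadpoolCallbackPool(&env, pool);

    // For every entry produced by the directory scan:
    //     TrySubmitThreadpoolCallback(FileWorkCallback, new FileTask{ ... }, &env);

    // Real code: track outstanding callbacks (e.g. with a cleanup group or a
    // counter + event) before tearing the pool down.
    DestroyThreadpoolEnvironment(&env);
    CloseThreadpool(pool);
    return 0;
}
```

An alternative knob is a semaphore that caps the number of simultaneously open files; as RbMm notes above, the thing to tune is the in-flight I/O count, not the thread count.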

1 Answer


A custom SMB client is the answer, it seems. Skipping the Windows/NT API layer opens all doors: the client can keep many requests in flight and batch the per-file create/query/close round trips itself, instead of paying for a blocking open on every file.

C.M.